Elasticsearch – timing out queries (REST and Java API)

A few days ago I was playing around with Elasticsearch and the possibilities of early query termination. This was one of the requirements of a user I talked to, and we wanted to know all the options available in the newest version of Elasticsearch at that moment (1.5.0). This post quickly goes through those options.

Let’s look at the following REST query sent to Elasticsearch:

curl -XGET 'localhost:9200/_search?pretty&timeout=100ms'

In general, the results are returned to us either once each of the partial queries (executed at the shard level) has finished or once the 100 milliseconds have passed, whichever comes first.

Of course, that doesn’t mean that a query running for longer than 100 milliseconds at the shard level will be terminated. Elasticsearch will just return the partial results and that’s all. So if we hope to save resources once Elasticsearch has returned the partial results, we won’t be happy. But if we want to get partial results after a certain, limited amount of time, this is a good solution (of course, the time is measured at the shard level, not at the level of the main query).
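To make the "partial results" idea more concrete, here is a toy model in plain Java. It is not how Elasticsearch is implemented: the class name, the shard sizes and the per-document costs are all made up, and time is simulated rather than measured. It only shows what the caller sees: every shard contributes whatever it collected within the budget, and the response is flagged as timed out if any shard needed longer.

```java
// Toy model only: the shard sizes and per-document costs below are made up.
final class ShardTimeoutModel {

    // What the caller gets back in this model.
    static final class Response {
        final boolean timedOut;
        final int totalCollected;

        Response(boolean timedOut, int totalCollected) {
            this.timedOut = timedOut;
            this.totalCollected = totalCollected;
        }
    }

    // shards[i] = {matching documents, virtual milliseconds needed per document}
    static Response search(int[][] shards, int timeoutMs) {
        boolean timedOut = false;
        int total = 0;
        for (int[] shard : shards) {
            int docs = shard[0];
            int costPerDocMs = shard[1];
            // Documents this shard managed to collect before the budget ran out.
            total += Math.min(docs, timeoutMs / costPerDocMs);
            // A shard that needed more time than the budget flags the whole response.
            if ((long) docs * costPerDocMs > timeoutMs) {
                timedOut = true;
            }
        }
        return new Response(timedOut, total);
    }

    public static void main(String[] args) {
        int[][] shards = { {50, 1}, {500, 1} };

        Response partial = search(shards, 100);
        System.out.println(partial.timedOut + " " + partial.totalCollected); // true 150

        Response full = search(shards, 1000);
        System.out.println(full.timedOut + " " + full.totalCollected);       // false 550
    }
}
```

Note that the model says nothing about what the slow shard does after the budget has passed, which is exactly the resource-saving caveat described above.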

The results of the query with timeout will look more or less like this:

{
  "took" : 229,
  "timed_out" : true,
  "_shards" : {
    "total" : 1966,
    "successful" : 1966,
    "failed" : 0
  },
  "hits" : {
    "total" : 81771855,
    "max_score" : 0.0,
    "hits" : [ ]
  }
}

As you can see, Elasticsearch informed us that the query timed out (the timed_out property is set to true), and even though all the shards successfully returned query results, those results are probably partial.

What about the Java API?

During a recent discussion I was asked how to achieve the same using the Elasticsearch Java API. It is very simple:

client.prepareSearch().setTimeout("1ms").execute().actionGet();

We can also use the TimeValue class:

TimeValue timeout = TimeValue.timeValueMillis(1);
client.prepareSearch().setTimeout(timeout).execute().actionGet();

Terminating after N documents

In addition to time-based termination, Elasticsearch provides an experimental feature that lets us stop gathering results after a certain number of documents has been collected, for example:

curl -XGET 'localhost:9200/_search?pretty&terminate_after=100'

The above request tells Elasticsearch to gather no more than 100 documents from each shard during the query. The same functionality using the Elasticsearch Java API would look as follows:

client.prepareSearch().setTerminateAfter(100).execute().actionGet();
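Because the limit applies to each shard separately, the total number of collected documents can be higher than the value we pass. The following plain-Java sketch illustrates that arithmetic. It is a simplified model with hypothetical names and made-up shard sizes, not the Elasticsearch collector itself.

```java
// Simplified model of a per-shard document cap; the shard sizes are made up.
final class TerminateAfterModel {

    // Collects matching documents on one shard, stopping early once the cap is reached.
    static int collectOnShard(int matchingDocs, int terminateAfter) {
        int collected = 0;
        for (int doc = 0; doc < matchingDocs; doc++) {
            if (collected >= terminateAfter) {
                break; // early termination: remaining documents are never visited
            }
            collected++;
        }
        return collected;
    }

    // Sums what every shard collected under the same per-shard cap.
    static int collectAcrossShards(int[] matchingDocsPerShard, int terminateAfter) {
        int total = 0;
        for (int matchingDocs : matchingDocsPerShard) {
            total += collectOnShard(matchingDocs, terminateAfter);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] matchingDocsPerShard = {250, 40, 1000};
        // 100 + 40 + 100: each shard is capped on its own, so the total exceeds 100.
        System.out.println(collectAcrossShards(matchingDocsPerShard, 100)); // prints 240
    }
}
```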

Client side timeout using Java API

There is one more timeout we can configure when using the Elasticsearch Java API: the client-side timeout. It lets us specify how long we are willing to wait for the results to be returned to the client. We use it by passing the time as the argument of the actionGet method, for example:

client.prepareSearch().setTimeout(timeout).execute().actionGet(100);

The above request is abandoned on the client side once 100 milliseconds have passed since the request started. The only thing to remember is that this kind of termination does not produce partial results; an exception is thrown instead. You can expect something like this:

org.elasticsearch.ElasticsearchTimeoutException: Timeout waiting for task.
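Since there are no partial results in this case, the calling code has to decide what to do when the wait runs out. The sketch below shows the same pattern using only java.util.concurrent, so that it runs without a cluster: the class name, the method getOrFallback and the fallback value are hypothetical, and CompletableFuture stands in for the pending search. With the real client, the analogous step would presumably be catching ElasticsearchTimeoutException around actionGet(100).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Analogy built on the JDK only; it does not use the Elasticsearch client.
final class ClientSideTimeoutSketch {

    // Waits up to timeoutMillis for the result and returns the fallback if the wait runs out.
    static String getOrFallback(CompletableFuture<String> pending, long timeoutMillis, String fallback) {
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Nothing partial is available here: we only learn that the wait expired.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is never completed stands in for a search that takes too long.
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        System.out.println(getOrFallback(neverCompletes, 100, "timed out")); // prints "timed out"

        // A future that is already completed stands in for a fast search.
        CompletableFuture<String> done = CompletableFuture.completedFuture("hits");
        System.out.println(getOrFallback(done, 100, "timed out")); // prints "hits"
    }
}
```

As in the REST case, giving up on the client side only stops the waiting; nothing in this pattern cancels the work that was already started.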