Firstly, thank you very much for the module; it works and seems to do what is advertised.
I'm trying to use it as a Search API query backend, but it seems very slow indeed.
I've surrounded the code in the Search API service.inc with the following:
// Do search.
timer_start('elasticsearch');
$response = $this->elasticsearchClient->search($params);
$timer = timer_stop('elasticsearch');
dpm(format_string('HTTP request took: @countms', array('@count' => $timer['time'])));
if (isset($response['took'])) {
  dpm(format_string('Elasticsearch query took: @countms', array('@count' => $response['took'])));
}
So I can capture the timings of the query itself, and the HTTP request.
I get query times of around 8ms, and HTTP request times of around 1600ms. Ouch!
I think this module eventually uses Guzzle to send the request. Do you know the best place to insert some more timing code, to see whether it's the actual request that's taking the time or everything else around it?
Unfortunately I don't have xhprof available on this site, otherwise I'd get to the bottom of this in a snap.
Comments
Comment #1
skek commented
Hi @Steven Jones,
Thank you for using the module; I hope it will become a really nice one in future.
You are completely right, the library uses Guzzle to do the HTTP requests.
I've experienced a similar problem with the module, but it was somehow connected to DNS resolution.
Are you using a domain name in the cluster settings and vhosts? If so, try using the IP address instead and let me know the result.
If the problem persists, you could put the timing code inside this function:
Elasticsearch\Connections\GuzzleConnection::performRequest(string $method, string $uri, null|string $params, null|string $body, array $options)
If you are using the easy install module, the path to the file is:
elasticsearch_connector/modules/elasticsearch_connector_easy_install/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/GuzzleConnection.php
Otherwise it depends on where Composer Manager has been set up to store the libraries.
Please check and let me know.
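A minimal sketch of such timing code, assuming plain PHP so that it does not rely on Drupal's timer functions inside the library. `timed_call()` and the label are hypothetical names, not part of elasticsearch-php or Guzzle, and the closure below only sleeps as a stand-in for a real request:

```php
<?php
/**
 * Runs $fn, records its wall-clock duration in milliseconds under $label,
 * and returns whatever $fn returned.
 */
function timed_call($label, $fn, array &$timings) {
  $start = microtime(TRUE);
  $result = call_user_func($fn);
  // Several calls with the same label are kept side by side.
  $timings[$label][] = (microtime(TRUE) - $start) * 1000;
  return $result;
}

// Stand-in for the real work: sleeps 20 ms instead of sending a request.
$timings = array();
$value = timed_call('fake_request', function () {
  usleep(20000);
  return 'ok';
}, $timings);

printf("fake_request: %.1f ms\n", $timings['fake_request'][0]);
```

Inside performRequest() the closure would wrap whichever call actually sends the request, so the time spent in Guzzle can be compared with the total measured in service.inc.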
Best Regards,
Nikolay Ignatov
Comment #2
steven jones commented
Oh interesting, there are two Guzzle requests:
one to just: {indexname}/elasticsearch
the next one to: {indexname}/elasticsearch/_search
The first request takes 300ms, and the second 1100ms.
This matches up with it taking around 130ms to ping the Elasticsearch server from my webserver.
I'll have a look on another machine with xhprof and report back about where the time is being spent.
Comment #3
skek commented
How many documents do you have indexed in your index?
Comment #4
steven jones commented
Only 14,000
Comment #5
steven jones commented
Oh, maybe the Elasticsearch endpoint is mis-reporting the query time?
Comment #6
skek commented
Hmmm, I don't think the problem is with Elasticsearch; most probably it is something in my code :). You can easily check this by executing the raw request against Elasticsearch, for example from your browser or using a REST client such as "Advanced Rest Client" in Chrome.
Can you paste me the body of the slow request?
And also did you check the configuration of the cluster in Elasticsearch Connector?
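One way to run the raw request outside the module, sketched in PHP with the cURL extension rather than a browser. The function names are hypothetical; the timing keys are the ones `curl_getinfo()` reports, and splitting out the DNS lookup also shows whether name resolution is the slow part:

```php
<?php
/**
 * Splits cURL's cumulative timings (seconds since the start of the request)
 * into per-phase durations in milliseconds.
 */
function es_timing_breakdown(array $info) {
  $dns = $info['namelookup_time'];
  $connect = $info['connect_time'];
  $first_byte = $info['starttransfer_time'];
  $total = $info['total_time'];
  return array(
    'dns_ms' => round($dns * 1000, 1),
    'connect_ms' => round(($connect - $dns) * 1000, 1),
    // Request sent, waiting for the first response byte (server time plus
    // one round trip; on HTTPS this also includes the TLS handshake).
    'wait_ms' => round(($first_byte - $connect) * 1000, 1),
    'transfer_ms' => round(($total - $first_byte) * 1000, 1),
    'total_ms' => round($total * 1000, 1),
  );
}

/**
 * Sends one request straight to Elasticsearch and returns the raw response
 * together with the timing breakdown. A non-NULL $body is sent as a POST.
 */
function es_time_raw_request($url, $body = NULL) {
  $ch = curl_init($url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  if ($body !== NULL) {
    curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
  }
  $response = curl_exec($ch);
  $info = curl_getinfo($ch);
  curl_close($ch);
  return array($response, es_timing_breakdown($info));
}
```

Calling `es_time_raw_request()` with the `_search` URL of the index and the JSON body of the slow request returns the response plus the breakdown; a large `transfer_ms` points at response size, a large `dns_ms` at name resolution.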
Comment #7
steven jones commented
Right, got to the bottom of this.
There are two things making my request particularly slow:
The search method: this causes an extra HEAD request per search.
The search request returns all of the fields in the response. My documents are quite large, and the network link to the Elasticsearch server is quite slow, making this more obvious.
Basically it spends around 1000ms transferring the data!
This was 'solved' for me by not removing an empty 'field' array from the search query, making the query look like this:
With these two issues sorted my search request is back down to 350ms total, which is basically the latency of the HTTP connection. For various reasons this is a connection to a datacenter on a different continent :)
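For illustration only, a hypothetical search body that keeps an empty field list, assuming the Elasticsearch 1.x `fields` parameter is what is meant (there, an empty list returns only hit metadata such as `_id` and `_type`). The index name, type name and query are placeholders, and `array_filter()` is just one example of how an empty array can get stripped, not necessarily what the module does:

```php
<?php
// Hypothetical index and type names; the query itself is only a placeholder.
$params = array(
  'index' => 'my_index',
  'type' => 'my_type',
  'body' => array(
    'query' => array('match_all' => new stdClass()),
    // Kept on purpose: an empty list means no document fields come back.
    'fields' => array(),
  ),
);

// A blanket cleanup such as array_filter() silently drops the empty array,
// which turns the request back into "return everything".
$filtered = array_filter($params['body']);

echo json_encode($params['body']), "\n";
echo json_encode($filtered), "\n";
```

The first line printed still contains `"fields":[]`; the second does not, which is the difference between a small response and one carrying every field of every hit.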
Anyway, I'm going to mark this issue as fixed, but feel free to use this information to create some other issues.
Comment #8
skek commented
Thank you for the info.
I will try to eliminate the additional query to Elasticsearch.
I remember there was an issue without this check when the type is missing. However, I will try to handle this in a better way.
Feel free to share any feature requests or bugs you find. I will try to develop or fix them ASAP.
Best Regards.