Because Search Lucene API is a fully integrated solution, it scales to roughly 5,000 to 10,000 nodes (depending on enabled facets, word stemmers, etc.) before the "Result set limit:" setting must be enabled and tuned. Although this number of nodes is not a problem for the audience that Search Lucene API targets, adding support for Java Lucene may provide more scalability.
Comments
Comment #1
cpliakas commented
There are a number of reasons why the Zend Framework's Zend_Search_Lucene component is not as scalable as Java Lucene. One is that it is a fully integrated solution, meaning the web request itself must handle the memory and processing requirements of the Lucene search. Another reason is still being debated by Zend developers, but it may be related to the lack of native UTF-8 support in PHP. Despite these drawbacks, Search Lucene API still makes a lot of sense in many different applications.
However, early experimentation with the Zend Server's Java Bridge has yielded extremely promising results. Because PHP can call Java directly, there would be no need to manage external services, so the module could be used exactly as it is now. If Search Lucene API implements "adapters", administrators could conceivably choose between PHP Lucene and Java Lucene depending on the site's requirements. Modules that implement the API would be able to use the API functions without regard for the actual back-end.
The Zend Server runs a maintenance-free daemon that handles the JVM and PHP integration, so memory consumption and processing are offloaded from the web request, allowing the Java Lucene adapter to scale to a much higher level. Although I can't wait to start development on this, it is out of scope for 2.0 and will have to wait until 3.0.
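To make the "adapters" idea above a bit more concrete, here is a minimal sketch of how a module could code against one interface while the administrator picks the back-end. It is written in Python purely for illustration, and every name in it (`LuceneAdapter`, `InProcessAdapter`, `BridgedAdapter`, `FakeRemoteEngine`, `get_adapter`) is hypothetical; none of this is Search Lucene API code or the Java Bridge API, and the "remote engine" is just an in-memory stand-in.

```python
from abc import ABC, abstractmethod


class LuceneAdapter(ABC):
    """Hypothetical back-end interface that calling modules would use."""

    @abstractmethod
    def add_document(self, doc_id, text):
        """Index one document."""

    @abstractmethod
    def search(self, query):
        """Return a sorted list of matching document ids."""


class InProcessAdapter(LuceneAdapter):
    """Stand-in for the PHP Lucene case: all work happens in the web request."""

    def __init__(self):
        self._docs = {}

    def add_document(self, doc_id, text):
        self._docs[doc_id] = text.lower()

    def search(self, query):
        needle = query.lower()
        return sorted(i for i, text in self._docs.items() if needle in text)


class FakeRemoteEngine:
    """Assumption: pretends to be an engine living outside the web request."""

    def __init__(self):
        self._store = {}

    def index(self, doc_id, text):
        self._store[doc_id] = text.lower()

    def query(self, needle):
        return [i for i, text in self._store.items() if needle.lower() in text]


class BridgedAdapter(LuceneAdapter):
    """Stand-in for the Java Lucene case: delegates the work to an engine."""

    def __init__(self, engine):
        self._engine = engine

    def add_document(self, doc_id, text):
        self._engine.index(doc_id, text)

    def search(self, query):
        return sorted(self._engine.query(query))


def get_adapter(name):
    """Pick a back-end by name, as a site setting might."""
    factories = {
        "php": InProcessAdapter,
        "java": lambda: BridgedAdapter(FakeRemoteEngine()),
    }
    return factories[name]()


def index_and_search(adapter, docs, query):
    """Calling code only sees the LuceneAdapter interface."""
    for doc_id, text in docs.items():
        adapter.add_document(doc_id, text)
    return adapter.search(query)
```

The point of the sketch is only that `index_and_search` never needs to know which back-end it was handed, which is what would let existing modules keep working unchanged.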
Comment #2
cpliakas commented
Tagged for 3.0 release.
Comment #3
cpliakas commented
Re-tagging for consistency.
Comment #4
cpliakas commented
Creating Lucene adapters, such as one that uses the Zend Server's Java Bridge, will be a cornerstone of the 3.0 branch. Switching to a task.
Comment #5
cpliakas commented
I don't see the need for this anymore. Elastic Search integration and Apache Solr Search Integration are probably better options anyway.