Closed (fixed)
Project:
Drupal.org security advisory coverage applications
Component:
module
Priority:
Normal
Category:
Task
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
23 Feb 2012 at 15:37 UTC
Updated:
4 Jan 2014 at 01:39 UTC
Jump to comment: Most recent
Comments
Comment #1
klausiRemoving review bonus tag, there are not manual reviews listed in the issue summary. Please read #1410826: [META] Review bonus again.
Comment #2
orakili commentedAh sorry about that. I misread and thought it was related to the PAReview only.
Comment #3
orakili commentedChanging priority as it's been more than 4 weeks.
Comment #4
zenlan commentedCan imagine why its taking so long to get a review, any module that requires the installation of third-party services will probably put off some reviewers. Problem with the review process not your module though. Am never sure if we are conducting code reviews or user testing. Anyway, as I write Search API service classes myself and use Solr a lot I thought I'd give Sphinx a go. The usual kerfuffle using a service for the first time but not the worst I've encountered (Sphinx I mean, not your module).
PAREVIEW:
Returns no errors
http://ventral.org/pareview/httpgitdrupalorgsandboxorakili1443484git-7x-1x
INSTALLATION:
Smooth, no errors.
USAGE / WORKFLOW:
Created a server, no problems. The correct server path was created.
Created an index, no problems but could use a little bit of helpful text to guide user through the 'workflow'. You have some information on the create server page and in the README.txt that would be useful on the index pages instead, imo, if it is possible to get that info across there (haven't checked myself).
'Index Now' threw an exception in db_query:
The table "sphinx_index_mysite_sphinx_test_index_rt" did not exist in my db and I couldn't find any code that created such a table. My db user has permissions to create tables, I debugged through creating an index and didn't see a table get created.
The configuration file was created and contents appeared correct (using all defaults) but the specified index path was not created on the filesystem: (yes I am using Windows here btw)
MINOR/COSMETIC:
Missing default dir on Search API create/edit server config page:
"Base directory in which the necessary files and directories will be created. The directory must exist and the application user The Boss must have full access. Default is"
Typo on Search API create/edit index config page:
'MORE LIKE THIS' SETTINGS Minimim word length
GENERAL COMMENTS:
I found the code well-structured, easy to read and understand. Having never used Sphinx before but plenty of experience with Search API I didn't have any trouble figuring out what to do. Perhaps less savvy users might like a bit more help but you will find out what they need from the issue queue no doubt. :)
Perhaps defining all of those configuration defaults every time the module/service class is loaded is a bit unnecessary, they are only used by the configuration form and could probably be simple local variables or even just hard-coded assignments.
I would add 'write some tests' to your ToDo list, even though testing in Drupal Simpletest is less fun that watching paint dry. Useful nonetheless.
Haven't yet tried the facets or mlt features.
Am happy to continue testing this if you can help me get over the indexing bug. It looks like a very useful module. :)
Comment #5
klausiSetting priority according to http://drupal.org/node/894256
Comment #6
orakili commentedThank you very much Zenlan for taking time to review this module.
I'm setting up a windows environment to solve the issues you encountered. They seem to be related to the way the sphinx search service is started.
I will update the code once the testing is finished. Thanks for you patience.
Comment #7
orakili commentedHow to start/stop the sphinx service under windows has been changed (commit 28f5549ed0bcf02b2ac26c4e02b82fbcc5abc4fa).
It should work properly now.
Comment #8
zenlan commentedHello,
I can index now! No more exception. :)
I have created a view based on a search_api_sphinx index:
The fulltext search query string generated by this index...
SELECT *, weight() AS score FROM sphinx_index_adlibtools_sphinx_test_index_rt WHERE MATCH('@(sphinx_title,sphinx_body___value) (\"Two\")') ORDER BY sphinx_author ASC LIMIT 0,10... is throwing an error: "Non recognized field type list" and no results are being returned. My term is in the node body text. Terms in the title are returning correct results in fulltext search despite the error.
Sorting on 'Relevance' returns an error: "Invalid sort criteria on unknown or unindexed field 'search_api_relevance'"
Using 'Tags' as a filter returns an error: "Invalid filter on unknown or unindexed field 'field_tags:name'"
Using 'Title' as a filter returns error: "Invalid filter operator '='. Field: 'title'."
I will leave it at that for now, still needs some polishing up with regard to Views integration but it is rather good. :)
Comment #9
zenlan commentedI should point out that this is a minimal Drupal 7 installation with only content types Article and Page and one simple 'Tags' vocabulary.
Comment #10
andyposttryed module, Default node index from search_api does not work
#1536866: Error - Couldn't index items. Check the logs for details.
Also minor code bug #1536836: Unknown variable $items in search() implementation
Comment #11
orakili commentedThanks both of you for helping with the review.
#1536866 has been fixed and was due the use of the tokenizer preprocessor which was not handled as Sphinx has its own preprocessors.
Most of the errors encountered with views are due to unsupported features.
It's not possible to sort using a field indexed as fulltext because Sphinx doesn't store the string (ex: tags:name). If the field has only a sorting or filtering purpose it should be set as string instead of fulltext.
I will add that to the documentation and make the error messages more explicit.
Comment #12
zenlan commentedHi,
The #1536866 issue harks back to my comment earlier:
I recalled that you wrote on the server info page...
...so I knew not to enable any processors. I realise that Search API doesn't really consider that not all index types support or require all of the search_api workflow elements such as processors, it would be great if there was a hook_default_search_api_index_info() method where you could add some info or the user as per the server page. I wonder if you can do something with hook_default_search_api_index_alter() to remove unsupported processors?
'relevance' is a special field, its not fulltext, you should be able to sort on it. If you take a look at the service class for search_api_solr you can see how it maps the Solr 'score' to the search_api_relevance field. I am working on a service class that does not return any kind of 'score' but I still handle that special field by assigning it with 0 always. The field is rendered effectively useless but at least it doesn't simply throw an error.
I know what you mean about the datatype dictating the function. Just an idea, but perhaps hook_views_data_alter() might be useful, possibly assigning your own filter handler to special fields? Is there any way of tweaking the query to allow a fulltext field to act as a filter? (I do this in one of my service classes)
Anyway, I don't think that fine-grained views integration or search_api api constraints should stop your module getting through, adding documentation is imo sufficient at this point. Going forward, these things might get resolved in the issue queue if users actually require them and the search_api maintainers will no doubt help along the way.
I consider this module to be in good enough shape to be accepted.
Comment #13
klausiYou have not listed any reviews of other project applications in your issue summary as strongly recommended in the application documentation.
manual review:
But otherwise this looks nearly ready to me.
Comment #14
orakili commentedThank you for pointing out those problems.
Issues (1), (2) and (5) have been fixed. For (4), file_unmanaged_delete_recursive() can not do the work as I'm deleting several files with different extensions for which only the basename is known so it would be more complex than GlobIterator.
Still working on (3). I'm using db_query to communicate with Sphinx which emulates the mysql protocol and has its own syntax (SphinxQL) which is close to standard SQL but has some differences especially regarding escaped values.
Comment #15
orakili commentedI've reviewed the security issues and committed some changes. Note that I could not use db_query and placeholders for several reasons related to the fact the syntax used is not true SQL but SphinxQL which has its own specificity.
- db_query calls DatabaseConnection::query() which use PDOStatement::execute(). As stated in the documentation page, all arguments passed through execute() are treated as strings. Thus everything is quoted or SphinxQL doesn't not accept quoted values other than strings.
- I couldn't use either PDOStatement bind parameters or values (see PDOStatement::bindValue()) because PDO doesn't handle unsigned integers (see bug https://bugs.php.net/bug.php?id=44639). Unsigned integers are an important part of the SphinxQL language especially for string comparison using CRC32 function.
So values are sanitized using the function SearchApiSphinxService::convertFieldValueToSphinxValue() and some others, which check the type of the value and prepare it accordingly so it's safe to use in queries.
Comment #15.0
orakili commentedNow supports Search API MLT.
Comment #16
andypostI'd like to mension about SphinxQL - it does not support table-aliases so @orakili is right that PDO and dbAPI has a lot issues when communicationg with Sphinx.
My +1 to promote this project to full!
Comment #17
avpadernoComment #18
orakili commentedRemoved "PAReview: security". See #15.
Comment #18.0
orakili commentedAdded reviews section
Comment #19
orakili commentedAdded reviews of modules.
Comment #20
klausiPlease don't remove the security tag, we keep that for statistics and to show examples of security problems.
manual review:
most sprintf() %s calls are useless as PHP casts variables to strings anyway when concatenating.
Looks RTBC to me. Removing review bonus tag, you can add it again if you have done another 3 reviews of other projects.
Comment #21
klausiNo objections for more than a week, so ...
Thanks for your contribution, orakili! Welcome to the community of project contributors on drupal.org.
I've granted you the git vetted user role which will let you promote this to a full project and also create new projects as either sandbox or "full" projects depending on which you feel is best.
Thanks, also, for your patience with the review process. Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.
As you continue to work on your module, keep in mind: Commit messages - providing history and credit and Release naming conventions.
Thanks to the dedicated reviewer(s) as well.
Comment #22.0
(not verified) commentedAdditional reviews