Closed (fixed)
Project:
Apache Solr Search
Version:
6.x-1.x-dev
Component:
Code
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
2 Sep 2008 at 13:19 UTC
Updated:
15 May 2009 at 00:19 UTC
Comments
Comment #1
brainski commented
I misunderstood the schema.xml file.
While indexing the nodes, the path aliases are added to the text as well. This might not be useful, as described above. Maybe one could add a flag in the settings that controls whether the path aliases are indexed.
Comment #2
brainski commented
After fiddling around with Eclipse for more than an hour, I was finally able to create my first PATCH! Hurray! :-)
And here is the description.
- New flag in settings: Exclude path aliases from the index. Indexing them created a lot of problems in my index. I also saw a lot of duplicate entries.
- New flag in settings: Include only the body field of the node in the index. This option is useful if you have nodes that contain views with a lot of redundant information. Because I want only the relevant information of the node in the index, I added this option.
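As a rough illustration of how two such flags might affect the indexed text, here is a minimal sketch. It is not the code from the patch: the function name `example_build_index_text()` and the setting keys `exclude_path_alias` and `body_only` are hypothetical, and `htmlspecialchars()` stands in for Drupal's `check_plain()` so the sketch runs outside Drupal.

```php
<?php
// Hypothetical helper: builds the text sent to the index from a node-like object.
// The setting keys below are assumed names, not the ones used by the actual patch.
function example_build_index_text($node, array $settings) {
  // "Body only" flag: ignore the title and the path alias entirely.
  if (!empty($settings['body_only'])) {
    return $node->body;
  }
  // htmlspecialchars() is a stand-in for Drupal's check_plain().
  $parts = array(htmlspecialchars($node->title, ENT_QUOTES, 'UTF-8'), $node->body);
  // Append the path alias unless the "exclude path alias" flag is set.
  if (empty($settings['exclude_path_alias']) && !empty($node->path)) {
    $parts[] = $node->path;
  }
  return implode(' ', $parts);
}

$node = new stdClass();
$node->title = 'About';
$node->body = 'Company history';
$node->path = 'about-us';

$default_text = example_build_index_text($node, array());
$no_alias_text = example_build_index_text($node, array('exclude_path_alias' => 1));
$body_only_text = example_build_index_text($node, array('body_only' => 1));
```

With both flags off the alias ends up in the indexed text; each flag removes one source of unwanted terms.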
I corrected a typo in the code:
Old: $text = check_plain($node->title) . $node->body;
New: $text = check_plain($node->title) .' '. $node->body;
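To show why the missing space matters, here is a small standalone sketch. It assumes a simple whitespace tokenizer, which is only an approximation of what Solr does, and uses `htmlspecialchars()` as a stand-in for Drupal's `check_plain()`; the sample title and body are made up.

```php
<?php
// Stand-in for Drupal's check_plain(), so the sketch runs outside Drupal.
function example_check_plain($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$node = new stdClass();
$node->title = 'Solr search';
$node->body = 'Indexing nodes';

// Old behaviour: title and body are glued together without a separator.
$old = example_check_plain($node->title) . $node->body;
// New behaviour: a space keeps the last title word and the first body word apart.
$new = example_check_plain($node->title) . ' ' . $node->body;

// Approximate tokenization by splitting on whitespace.
$old_tokens = preg_split('/\s+/', $old);
$new_tokens = preg_split('/\s+/', $new);
```

In the old form the words "search" and "Indexing" fuse into the single token "searchIndexing", so neither word can be matched on its own.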
I tested this functionality very carefully. It would be great if someone could review this and then commit it to the dev version.
Comment #3
brainski commented
Changed title.
Comment #4
JacobSingh commented
Hi Brainski,
This is indeed an interesting, although probably larger, issue. I don't think the settings page should be cluttered with options like this, because there are probably dozens more in the offing, like:
- Only index these CCK fields
- Only index these node types
- Only index these vocabularies
etc. etc...
Some of these options may be covered by the module; many will need to be customized by administrators in their Solr instance and/or in Drupal. My feeling is that we really need to split the module up into a more plugin-based architecture. The apachesolr_node module would provide the basics, while apachesolr_path might provide the options for indexing the path and would be a LOCAL_TASK of the main settings page.
What do you think of this? I'm not saying your issue isn't relevant, just that we could go on tacking options related to how content gets indexed onto that main page, and it would become quite a complicated page to look at, and the module would be full of bloat.
Comment #5
brainski commented
I don't see it the way you do. These are only two options, and they can solve a lot of problems. If you compare the Apache Solr settings page with the one from Pathauto, you will agree with me that the Apache Solr settings page is almost empty.
For me it's better to have everything in one place than in 20 different modules. And I expect every user who sets up Apache Solr to be able to handle two or more additional checkboxes, since they were able to manage the complexity of Solr.
What do you think?
Comment #6
robertdouglass commented
@brainski: thanks for the patch - and congratulations on rolling your own =)
@JacobSingh: More modular is fine. I can see a lot of things being refactored out into plugins or separate modules. However, I think that there could be a section on the configuration page that has checkboxes for all the things that there are to be indexed. If someone doesn't want paths, they can uncheck it. If they don't want taxonomy, they uncheck it. This could be implemented by a hook.
In any case, I see this as post version 1.0 work.
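The hook-based checkbox idea above could look roughly like the following sketch. Everything here is hypothetical: the `*_index_components` naming, the two example modules, and the helper functions are illustrations only, and `example_collect_components()` merely emulates what Drupal's `module_invoke_all()` would do so that the code runs outside Drupal.

```php
<?php
// Each "module" declares what it can add to the index (emulating a Drupal hook).
function example_path_index_components() {
  return array('path' => 'Path aliases');
}
function example_taxonomy_index_components() {
  return array('taxonomy' => 'Taxonomy terms');
}

// Emulates module_invoke_all(): call every implementation and merge the results.
function example_collect_components(array $modules) {
  $components = array();
  foreach ($modules as $module) {
    $function = $module . '_index_components';
    if (function_exists($function)) {
      $components = array_merge($components, $function());
    }
  }
  return $components;
}

// Keep only the components whose checkbox is ticked on the settings page.
function example_enabled_components(array $components, array $checked) {
  return array_intersect_key($components, array_filter($checked));
}

// The merged list would drive the checkboxes; the saved values drive indexing.
$all = example_collect_components(array('example_path', 'example_taxonomy'));
$enabled = example_enabled_components($all, array('path' => 0, 'taxonomy' => 1));
```

A design like this keeps the settings page to a single list of checkboxes while letting separate modules contribute the entries.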
Comment #7
brainski commented
What do you mean by post version 1.0 work? Has someone tested this patch? Was it already committed to the dev version?
Comment #8
robertdouglass commented
@brainski: we did a prioritization exercise on the issue queue and marked as "critical" all issues that we want to close in order to be feature complete for a 1.0 release of this module. There are a lot of issues in the queue. Some were left out based on a gut feeling about priority. I decided to address the settings issue in your patch after the 1.0 release.
The space between title and body has been committed in another issue, thanks for pointing it out to us.
Comment #9
brainski commented
OK, thanks for the feedback. If I can support you with this issue, please send me an email. I have some capacity for development work on the Solr module.
Comment #10
pwolanin commented
Seems to be no longer relevant.