Posted by webchick on December 24, 2010 at 8:54pm
30 followers
| Project: | Apache Solr Multisite Search |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed (fixed) |
| Issue tags: | d7 ports, Issue summary initiative |
Issue Summary
To support Drupal 6.x-3.x and Drupal 7.x-1.x
What is needed:
- Use the site and hash information in Solr to facet on, and also to filter queries (deletions, and selects if needed)
- Add metadata for D6 and D7
  - Content types
  - Bias information
  - Site Hash (already included from the module)
  - Site Url (already included from the module)
  - Vocabulary names
- Add a facet on the hash that outputs the name of the site
- Modify the bias pages to include bias for content from the other sites.
I propose that we open up a new branch for D7 and for D6 and we start developing.
Let's do this in the same line as apachesolr. 7.x-1.x for the Drupal 7 version and 6.x-3.x for the Drupal 6 version.
Comments
#1
Actually, this title will make my community initiatives page make more sense. ;)
#2
One more. I'm really done now. :P
#3
subscribe
#4
cool, subscribe
and drupal.org already uses it in d7core, how come?
#5
Upgrade path will be 6.x-1.x -> 6.x-2.x -> 7.x-1.x
#6
#7
Noting that apachesolr 7.x now has significant schema changes.
#8
Hi,
Is anyone working on this? Any progress?
#9
sub
#10
sub
#11
Subscribe
#12
Sub
#13
subscribe
#14
I'm taking this on, expect an initial patch soon.
#15
Here's the first basic version. Everything works except for the blocks/facets. Any help getting this last bit fixed with the new Facet API would be very much appreciated.
(Go to admin/config/search/settings and make sure the checkbox for Multisite is checked. Possibly clear cache afterwards.)
#16
I'm actually curious if we need to add any other filter than "Filter by site". Filters provided by apachesolr (like "Filter by content type") also work in this D7 multisite version.
#17
Here's a new patch. All it needs now is the "Filter by site" facet, and probably cleaning up some legacy code afterwards.
#18
Thanks for the patches.
re:
$document->entity_id = 1; seems like it should be the hash instead? Or maybe we should make that a non-required field?
#19
We should discuss the architecture - I had thought that we might actually merge this into the main module, depending on what's left after we remove the facet code.
#20
Making entity_id a non-required field seems like a good idea. Also, entity_id is type long, so it can't be the hash.
Making the multisite functionality part of the main module makes sense to me, we're still using a lot of semi-duplicate code anyway. What's the best way to discuss this?
#21
adding d7 ports tag
#22
@wmostrey - I hope to have a better handle on this architecture by late next week. Maybe we can have a call on Sept 23? I'll look at changing the schema in advance of that.
#23
Let's do that, great!
#24
An updated version, with all d6 facet code removed and a clean settings page. Tested with both Drupal sites using the apachesolr module and non-Drupal sites crawled with Nutch.
#25
#26
Here are the instructions: http://drupal.org/node/666606/git-instructions/6.x-1.x
In short:
1. Setting up repository for the first time
git clone --branch 6.x-1.x http://git.drupal.org/project/apachesolr_multisitesearch.git
cd apachesolr_multisitesearch
2. Applying a patch
Download the patch to your working directory. Apply the patch with the following command:
git apply -v [patchname.patch]
#27
#28
Can you create the 7 branch with this patch?
That way it is easier to get the module.
#29
The patch will most likely need to pass review first before Peter Wolanin creates the branch. So if you want to help move this forward: review the patch and get the status to RTBC. Thanks!
#30
I now also added a site/hash facet so you can now again filter the search results per site.
#31
Looks like a good start, especially if it's moving toward Facet API integration.
We'll need to figure out how to expose the appropriate multi-site facets there, however.
#32
I had an issue with the site metadata: when I go to the "Multisite settings" section, under "Delete data from sites using this index" I can see only two sites (there are more than 10 subsites in total). How do I get the correct information for all subsites into the list?
Thanks a lot!
#33
Since the schema has changed a bit to include an entity_id (instead of entity), you need to index each individual subsite again using this module. That should fix your problem.
#34
Note that we don't yet have a 6 version of apachesolr compatible with the 7 version. That will be the 6.x-3.x branch.
#35
Subscribe.. As always reminding folks about using http://drupal.org/project/coder to look over patches and help review them before release.
#36
Subscribe
#37
Maybe I will create the branch if this is a generally working basis for progress.
#38
For 7.x I would like to figure out how to meld the multisite search functionality with the search environments concept we added in 7.x apachesolr module, as well as with the custom search pages.
I feel like we should be able to make this module even smaller, since I always work to have the support for multi-site search pretty well baked into the main module.
#39
I agree. I'll see what I can do to integrate the concepts of apachesolr_multisite into the apachesolr modules.
#40
Ok, I may take a crack at it this weekend myself.
#41
note, patch above I committed to a new branch, so setting back to active
#42
see: #1340552: Make facet generation more flexible/overrideable
#43
Also, I think we should potentially remove the use of the core search hooks (especially for the search page), and just leverage the user defined search pages.
#44
I think we need this patch: #1341854: Pass the query object into hooks altering search results
#45
Starting to reduce this down to the essence.
#46
Hi Guys,
I have a question here and I'd really appreciate your help! I have a Drupal 7 multisite setup that uses the Solr multisite search module for searching across the Drupal sites. I also have a non-Drupal (simple HTML) site running on another web server. How can I search across the many Drupal sites and the non-Drupal site and get results from ALL sites?
Thanks a lot!
#47
@synbaxp - off topic. This issue is about the code update for 7.x.
Open a separate support request or try IRC.
#48
I've been talking through this with Nick Veenhof. I did some testing with the latest Apache Solr dev module, and since it now also takes the hash into account, every page is actually ready to support multisites. I believe we need the following functions in Apache Solr to get it working:
We might need to work out the details as to what configuration goes where, but this will bring us a long way.
Your patch is good to go, except that the function should be apachesolr_multisitesearch_facetapi_facet_info() and not apachesolr_multisitesearch_facet_info().
#49
If the complete module is replaced with this code it is already working between different Drupal 6 and 7 sites. What is left is to make node access integration work between Drupal 6 and 7.
<?php
/**
 * @file
 * Extends Apache Solr Search module to provide multisite support.
 *
 * This includes:
 * 1) A facet that allows filtering per site.
 * 2) Changes to the result links so they redirect to the appropriate site.
 */

/**
 * Implements hook_facetapi_facet_info().
 *
 * @param array $searcher_info
 *   The searcher definition passed in by Facet API.
 *
 * @return array
 *   Facet definitions keyed by facet name.
 */
function apachesolr_multisitesearch_facetapi_facet_info($searcher_info) {
  $facets = array();
  $facets['site'] = array(
    'field' => 'site',
    'label' => t('Site Name'),
    'description' => t('Filter by Site Name'),
  );
  return $facets;
}

/**
 * Implements hook_apachesolr_process_results().
 *
 * Makes sure that the links in our search results link to the website of
 * origin.
 */
function apachesolr_multisitesearch_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query) {
  foreach ($results as $id => $result) {
    // Replace the local link with the URL stored in the index.
    $results[$id]['link'] = $results[$id]['fields']['url'];
  }
}
?>
#50
I propose a much bigger change and make it easier for all of us to build it from scratch again
#51
Fixing the right package so it shows up in search toolkit now
#52
Some namespace issues. I think this one should be good to go in and let's follow up with other functionality later on? What do you think?
#53
The patch in #52 is good to go. I would already prefer to see a dev release based on this patch to continue working on.
#54
I pinged pwolanin to take a look at this issue. Afaik he will do that asap.
#55
I reckon this might benefit from a good issue summary too...
#56
Trying to figure out all the deletions
function apachesolr_multisitesearch_map_hash() becomes a no-op? You removed hook_facetapi_facet_info()?
We certainly still need the hook_apachesolr_query_alter(), but it should be looking to a per-environment setting.
Also, all the metadata functionality seems to be removed. I'm not sure what's going on - is this the right patch?
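To make the per-environment idea concrete, here is a minimal sketch of the decision such a query alter hook would make. Everything in it is an assumption for illustration: example_site_filters() is a hypothetical helper, the 'multisitesearch' setting key is made up, and the environment is a plain array rather than whatever the module actually stores.

```php
<?php
/**
 * Hypothetical helper (not part of apachesolr): returns the filter queries
 * to add for one search environment.
 *
 * @param array $environment
 *   Assumed shape: array('multisitesearch' => TRUE|FALSE).
 * @param string $site_hash
 *   The hash identifying the current site in the shared index.
 *
 * @return array
 *   Solr filter query strings.
 */
function example_site_filters(array $environment, $site_hash) {
  // Multisite enabled for this environment: do not restrict by site.
  if (!empty($environment['multisitesearch'])) {
    return array();
  }
  // Otherwise only select documents indexed by this site.
  return array('hash:' . $site_hash);
}
```

A real implementation would read the flag from the environment's stored settings inside the hook and add the filter to the query object instead of returning strings.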
#57
To support Drupal 6.x-3.x and Drupal 7.x-1.x
What is needed :
I propose that we open up a new branch for D7 and for D6 and we start developing.
Let's do this in the same line as apachesolr. 7.x-1.x for the Drupal 7 version and 6.x-3.x for the Drupal 6 version.
(added this to the opening post)
#58
These patches should include the metadata + the corrected hash to sitename mapping that comes from the metadata.
As I mentioned before, I would prefer that those be added to the 7.x-1.x branch, and that the 6.x patch be added to a new 6.x-3.x branch.
I've tested these patches on a 6.x site and on a 7.x site, and multisite between 6.x and 7.x is working perfectly. The regular module takes care of indexing fields with their machine name, so a D6 and a D7 site can easily create a facet that is using content from both.
I also added the content types/bundles to the meta information, but I'd like to have some more input on how we could handle bias information for content types/bundles that are not part of the site where the search was executed.
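As a rough illustration of the hash-to-site-name mapping, the sketch below turns site hash facet values into display labels using metadata keyed by hash. The function name and the metadata layout (a 'name' key per hash) are assumptions for the example, not the structure the patches actually store.

```php
<?php
/**
 * Hypothetical helper: maps site hashes to human-readable site names.
 *
 * @param array $hashes
 *   Site hash values as returned by the facet.
 * @param array $site_metadata
 *   Assumed shape: array('<hash>' => array('name' => 'Site name')).
 *
 * @return array
 *   Labels keyed by hash; unknown hashes fall back to the hash itself.
 */
function example_map_hash_to_site_name(array $hashes, array $site_metadata) {
  $labels = array();
  foreach ($hashes as $hash) {
    $labels[$hash] = isset($site_metadata[$hash]['name'])
      ? $site_metadata[$hash]['name']
      : $hash;
  }
  return $labels;
}
```

Falling back to the raw hash keeps the facet usable for sites whose metadata has not been indexed yet.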
#59
The patch for 6 was diffed with the current 6, the patch for 7 was diffed with the current 7
Would this be a good starting point for all?
#60
Forgot to remove a dsm...
#61
Let's change this:
$document->entity_type = 'multisite_meta';
and use a string that cannot be a valid Drupal entity type.
e.g. 'multisite.meta' or 'multisite/meta' or 'multisite-meta'
You moved a bunch of functionality like apachesolr_multisitesearch_generate_metadata() into the .module instead of leaving it in the admin.inc. If it's not used on most page loads, I think better to keep in the .inc file?
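A quick way to see why the suggested strings are safe: assuming entity types follow the usual Drupal machine-name convention (lowercase letters, digits and underscores), a name containing a dot, slash or hyphen can never collide with one. The helper below is hypothetical and only encodes that assumed convention.

```php
<?php
/**
 * Hypothetical check: does the string look like a Drupal machine name?
 *
 * Assumes machine names are limited to lowercase letters, digits and
 * underscores.
 */
function example_is_machine_name($name) {
  return preg_match('/^[a-z0-9_]+$/', $name) === 1;
}
```

Under that assumption 'multisite_meta' could in principle be claimed by a real entity type, while 'multisite.meta', 'multisite/meta' and 'multisite-meta' could not.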
#62
How should we go about testing this? I ran into several issues so I might be doing something wrong.
I tried this with apachesolr 7.x HEAD and 3 sites sharing the same solr core. I enabled multisite support for the solr server and cleared the index and reindexed every site. I ran into these issues:
$results[$id]['fields']['hash'] doesn't exist (line 77 in apachesolr_multisitesearch.module)
#63
@pwolanin : I moved it because I felt some of this code did not belong in an admin.inc. The code that is used is not only for the admin pages but could be used as an API (crud of the metadata) for those that need it.
I only included functions in the admin file that are directly related to the admin configuration. Which one do you want to move to the admin.inc?
Patch attached with multisite.meta as entity type
@klaasvw
This is still very much a work in progress. I suggest that you try to find the broken part, correct it and upload the patch. Does that work for you?
#64
Although I'm not that familiar with the module, here's a quick Dreditor scan! This is off drupal7.patch. I really like the LOC I/D ratio :-) .
+++ b/apachesolr_multisitesearch.info
@@ -2,5 +2,5 @@ name = Apache Solr Multisite Search
-package = Apache Solr
+package = Search Toolkit
Should it be "Apache Solr Search Toolkit" instead? It's usually good to namespace module names.
+++ b/apachesolr_multisitesearch.module
@@ -23,133 +23,89 @@ function apachesolr_multisitesearch_menu() {
+ return $data;
+}
+
+function apachesolr_multisitesearch_apachesolr_process_results(&$results, DrupalSolrQueryInterface $query) {
+ $env_id = $query->solr('getId');
Might like some doc block for apachesolr_multisitesearch_apachesolr_process_results().
+++ b/apachesolr_multisitesearch.module
@@ -23,133 +23,89 @@ function apachesolr_multisitesearch_menu() {
+ *
+ * @param string $query
+ * Defaults to *:*
*/
-function apachesolr_multisitesearch_cron() {
- apachesolr_multisitesearch_refresh_metadata();
+function hook_apachesolr_delete_by_query_alter($query) {
+ // use the site hash so that you only delete this site's content
+ if ($query == '*:*') {
+ $query = 'hash:' . apachesolr_site_hash();
Is this supposed to be "hook_apachesolr_delete_by_query_alter()"? Maybe we should move that to apachesolr_multisitesearch.api.php instead?
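For reference, the intent of that delete hook can be sketched as a pure function: scope any delete query to the current site so that other sites sharing the index keep their content. The function name is hypothetical, the hash is passed in rather than read from apachesolr_site_hash(), and the AND branch for non-wildcard queries is an assumption that is not in the quoted diff.

```php
<?php
/**
 * Hypothetical helper: restricts a Solr delete query to one site.
 *
 * @param string $query
 *   The delete query; '*:*' means "delete everything".
 * @param string $site_hash
 *   The hash identifying the current site in the shared index.
 *
 * @return string
 *   The query limited to documents carrying this site's hash.
 */
function example_scope_delete_query($query, $site_hash) {
  if ($query == '*:*') {
    // Delete-all becomes delete-all-for-this-site.
    return 'hash:' . $site_hash;
  }
  // Assumption: other queries are narrowed with an extra hash clause.
  return '(' . $query . ') AND hash:' . $site_hash;
}
```

Note that an alter hook written like the one in the diff would need to take $query by reference for the change to reach the caller; the sketch sidesteps that by returning the new query.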
#65
@Nick - I was using admin.inc as a generic include file, despite the name.
#66
@rob loach - That hook is clearly wrong, should be fixed indeed. The search toolkit is a general package name so this module will appear in the same list as apachesolr and its derivatives.
@pwolanin, Are you ok with moving them to an apachesolr_multisite.index.inc (similar to apachesolr?). We could even call it meta.inc or something similar.
#67
@Nick - having an index.inc file is fine if you like it - I was just lazy when I wrote it and found it easier to have just one .inc file to look in.
#68
This patch should have an index.inc + the fix with the delete hook. Tested out most of the functionality with a D6 and D7 site. Also the D6 and the D7 module are now very similar when compared to each other so I'll include a small diff of that also.
Ignore this one.
#69
This patch should have an apachesolr_multisitesearch.index.inc + the fix with the delete hook. Tested out most of the functionality with a D6 and D7 site. Also the D6 and the D7 module are now very similar when compared to each other so I'll include a small diff of that to show the differences.
#70
Looks better. I think we still need to e.g. alter the author facet for a multisite environment, but that can be a follow-up.
@Nick - I added your commit access if you want to get these patches into git.
#71
Committed to 7.x-1.x
#72
Created a branch 6.x-3.x and applied the patch for the 6.x-3.x branch
#73
Oh, rock!!! Thanks so much guys! :D
#74
The cool thing is that you can now do a multisite between D6 and D7 sites ;-) Still a work in progress though!
#75
Automatically closed -- issue fixed for 2 weeks with no activity.