Project: Apache Solr Search Integration
Version: 6.x-3.x-dev
Component: Code
Category: feature request
Priority: normal
Assigned: Unassigned
Status: closed (won't fix)

Issue Summary

I am currently (and successfully) using apachesolr for a project with a custom facet on a CCK date field.
Since I made my changes inside the apachesolr module itself, I want to move them into a separate module (so I can update apachesolr without patching the code again).

I am using the hook for (standard) CCK fields, hook_apachesolr_cck_fields_alter(&$mappings), and this is enough to get a facet.
But to make it work like the date facets (for the node created and node changed fields) I had to change
function apachesolr_search_add_facet_params(&$params, $query)
and
function apachesolr_search_block($op = 'list', $delta = 0, $edit = array())

There is one more change that I've made. It is in function apachesolr_search_date_range($query, $facet_field), and it handles dates that are stored in the database in ISO format (as strings).
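
A minimal sketch of the kind of normalization described above. This is not the code from the attached diff: the helper name example_to_solr_date() is hypothetical, and it assumes the stored value is either a Unix timestamp or an extended-format ISO 8601 string that is already in UTC unless it carries its own offset.

```php
<?php
/**
 * Hypothetical helper: normalize a stored date value to Solr's UTC format.
 *
 * Accepts a Unix timestamp (int or numeric string) or an ISO 8601 string
 * such as "2010-02-08T04:00:00". Returns NULL if the string cannot be parsed.
 */
function example_to_solr_date($value) {
  if (is_numeric($value)) {
    // Timestamps are timezone independent, so gmdate() is enough.
    return gmdate('Y-m-d\TH:i:s\Z', (int) $value);
  }
  try {
    // The timezone argument only applies when the string has no offset.
    $date = new DateTime($value, new DateTimeZone('UTC'));
  }
  catch (Exception $e) {
    return NULL;
  }
  // Convert values that carried their own offset to UTC.
  $date->setTimezone(new DateTimeZone('UTC'));
  return $date->format('Y-m-d\TH:i:s\Z');
}

echo example_to_solr_date('2010-02-08T04:00:00'), "\n";
```

Handling both representations in one place means the range code does not need to know whether a given CCK field stores timestamps or ISO strings.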

I am attaching a diff.

Is support planned for additional CCK fields that act like date facets (and if so, when)? Is there a plan I should take into account when trying to add support for date facets from CCK fields?

I was thinking of maybe automatically using the date facet functions if the CCK field is of type date, datetime or timestamp...

Mihajlo
Kontrola

Attachment | Size | Status | Test result | Operations
recdate.diff | 2.88 KB | Ignored: Check issue status. | None | None

Comments

#1

Any chance we can get basic CCK date field support in core apachesolr? If not, let's try to clear these blockers. I have not tried the code, but will probably need this soon.

#2

Version: 6.x-1.0-rc2 » 6.x-1.x-dev
Category: support request » feature request

I'm very interested in this. Would this be possible for 1.x-rc4, or will this have to go into 2.x? The patch seems simple enough.

#3

Status: active » needs work

The patch hardcodes the name of the CCK field. There needs to be more abstraction.

#4

I'm working on this. It's a big patch and a custom module. Will post soon.

#5

Subscribing.

#6

Just want to confirm ongoing work and the imminent propinquity of the first patch + new module to try.

#7

Subscribing

#8

Version: 6.x-1.x-dev » 6.x-2.x-dev

Here's a patch against 6.x-2.x that adds a new module, apachesolr_date.

It also refactors much of the CCK handling code altogether. Notably, CCK facet definitions must always define an indexing callback.

I'm eager to have people try it but the code needs more cleaning up.

Attachment | Size | Status | Test result | Operations
apachesolr_date.patch | 41.84 KB | Ignored: Check issue status. | None | None

#9

Subscribing...

#10

I have tested the patch on my system.

When rebuilding the index only the 'tdate' types are being indexed.

When rebuilding the index using a patched version of nd_search, everything works just fine. The only difference is that those 'tdate' types are now 'date' types.

Will have a deeper look into it, to get things working without nd_search.

#11

What is nd_search?

Thanks for testing. I have an updated patch for later today.

#12

nd_search is a contrib module for display suite.

Good, because I'm unable to get those facets as you showed them at FOSDEM.

#13

Status: needs work » needs review

Here's a new version. I want to commit this soon to get wider testing.

Attachment | Size | Status | Test result | Operations
date_facets.patch | 25.73 KB | Ignored: Check issue status. | None | None

#14

Status: needs review » needs work

This new patch doesn't contain any of the apachesolr_date module. I still have it from the older patch.

Without nd_search, only the tdate fields are being indexed.

The problem in building the facets is that $response->facet_counts->facet_dates is empty.

#15

This issue is related to #664896: Automatic CCK introspection, especially the changes that the patch in comment 8 makes to indexing.

I do think that indexing_callback shouldn't be required for fields that can be indexed with a default indexer, such as text and number fields. A lot of CCK fields have a single value attribute which stores the value of the field.

I'm working on merging these two patch sets (the ones from #664896: Automatic CCK introspection and the one here) together in a Git repository at http://github.com/haxney/apachesolr . I'll be pushing all of my updates there, and when I have something presentable, I'll submit a patch here.

#16

@haxney - that sounds great. @DenRaf - here's the patch I tried to submit before.

Attachment | Size | Status | Test result | Operations
apachesolr_dates.patch | 43.31 KB | Ignored: Check issue status. | None | None

#17

Still the same issue: $response->facet_counts->facet_dates is empty. (apachesolr_date.module:347)

Are changes to the solrconfig or the schema perhaps required?

#18

Well, I'm using the new tdates:
schema.xml:

    <!-- A Trie based date field for faster date range queries and date faceting. -->
    <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>

Did you launch Solr with the schema.xml from the latest 2.x-dev version? If so it should be fine.

#19

Yes, and I checked that immediately when I saw you were using the tdate type.

Any idea why facet_dates is empty?

#20

@robertDouglass I've updated #664896: Automatic CCK introspection, and have new code at my GitHub project. I've changed some of how hook_apachesolr_cck_fields_alter() works (not dramatically; it's pretty much exclusively redundancy elimination), so you could probably save yourself some time by not having to write a bunch of identical callbacks (if it can handle them automatically).

I'm also definitely planning on using your system of having parallel Solr fields for complex CCK fields (like 'date' and 'date_end'). Hopefully, I'll be able to make your life a bit easier, too. :)

#21

@haxney - excellent. Keep up the work. I'll be able to review this coming weekend.

#22

Thanks haxney .. I will test this weekend too!

#23

Looks like you didn't add date fields to the introspection code? I tried adding it explicitly myself but they don't seem to be working. Can you give me an example hook_apachesolr_cck_fields_alter for a date field to test this?

#24

Progress. The facet blocks are *generally* not working correctly, but the date stuff seems to be in place. Any help debugging greatly appreciated.

Attachment | Size | Status | Test result | Operations
apachesolr_dates.patch | 45.2 KB | Ignored: Check issue status. | None | None

#25

I'd like to help debugging but I want to make sure I'm configuring things correctly before I do.

I applied your patch and uninstalled and then re-enabled the search and date solr modules on a sandbox with a nodereference to page, a select text field, and a date field on the story type. The noderef and text field were autodiscovered, but only the noderef filter is appearing. As for the date field, I added this hook to a custom module:

function customsolr_apachesolr_cck_fields_alter(&$mappings) {
  $mappings['per-field']['field_date'] = array('callback' => '', 'index_type' => 'string');
}

It then appeared as a filter, but didn't appear as a facet block when I enabled it. (The blocks are enabled, and, as I said, the noderef one is appearing correctly.) Is that not the right way to map the date field?

#26

mcarbone - It was my intention that you wouldn't need to map the date field at all - that the module would make facets for starting and ending dates without you doing anything special.

It's been quite frustrating though, because the behavior that other people see when they test it often diverges from what I was seeing during development.

For the patch I attached, I was seeing these things working:
1. Date fields of all types get recognized automatically and make separate facet blocks for starting and ending dates.
2. You can drill down into any date facet down to the hour.

What wasn't working was the interplay between other facet blocks. Clicking a facet link was making all other facet blocks disappear, despite the fact that the right search results were being fetched.

#27

Aha. It appeared when I switched the date widget type from popup to select or text. Seems like you need to add 'date_popup' in addition to 'date_select' and 'date_text.'

The date facet block is now appearing (the text select one still isn't), although it's timing out when I click one of its options. The noderef facet block works fine.

#28

Huh, the date facet drilldown worked for me briefly. And it worked in conjunction with the noderef facet. Now it's timing out again -- very inconsistent. (The noderef facet works consistently.)

#29

mcarbone: it's exactly these reports of inconsistent behavior that have been haunting me on this patch. Thanks for your testing. Let me know if you find anything. Did you look at the Solr logs for possible errors?

#30

After doing some digging:

First, it seems like the reason my field_options (text with select widget) facet isn't working is that it's not being indexed, and it's not being indexed, I think, because it doesn't have an indexing_callback. E.g.,

Array ( [field_name] => field_page [indexing_callback] => apachesolr_cck_nodereference_indexing_callback )
Array ( [field_name] => field_date [indexing_callback] => apachesolr_date_date_field_indexing_callback )
Array ( [field_name] => field_options [indexing_callback] => )

As such, the following conditional on line 119 of apachesolr.index.inc is false for field_options:

        if ($cck_info['indexing_callback'] && function_exists($function)) {

Without the patch, it gets indexed fine and the facet works.
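
One way to keep fields like field_options indexed is to fall back to a default indexer when no indexing_callback is set, as suggested in #15. The sketch below only illustrates that idea: both function names are hypothetical, and the callbacks take a plain list of CCK items rather than whatever arguments the module's real callbacks receive.

```php
<?php
/**
 * Hypothetical default indexer: collects the 'value' column of each item.
 */
function example_default_indexing_callback(array $items) {
  $values = array();
  foreach ($items as $item) {
    if (isset($item['value']) && $item['value'] !== '') {
      $values[] = $item['value'];
    }
  }
  return $values;
}

/**
 * Hypothetical resolver: uses the field's own callback when it names an
 * existing function, and the default indexer otherwise.
 */
function example_resolve_indexing_callback(array $cck_info) {
  $function = isset($cck_info['indexing_callback']) ? $cck_info['indexing_callback'] : '';
  if ($function && function_exists($function)) {
    return $function;
  }
  return 'example_default_indexing_callback';
}

// field_options has an empty callback, so the default indexer is used.
$cck_info = array('field_name' => 'field_options', 'indexing_callback' => '');
$callback = example_resolve_indexing_callback($cck_info);
$values = $callback(array(array('value' => 'red'), array('value' => 'blue')));
print_r($values);
```

With a resolver like this, the conditional never skips a field just because its definition omits the callback.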

Now, as to why the date facets are timing out, I'm still not entirely sure. I'm seeing this in watchdog: "0" Status: Request failed. I didn't set up the Solr instance, so I'm not sure if I'm looking at the right logs, but my sysadmins told me to look in /var/log/daemon.log. Is that the right log to look in? If so, I saw some Solr-related lines in there, but nothing that indicated an error.

The freezing seems to be happening on line 377 of apachesolr_search.module:

$response = $solr->search(htmlspecialchars($query->get_query_basic(), ENT_NOQUOTES, 'UTF-8'), $params['start'], $params['rows'], $params);

It's an empty query, and here is what's in the $params variable:

Array
(
    [fl] => id,nid,title,comment_count,type,created,changed,score,path,url,uid,name,teaser
    [rows] => 10
    [facet] => true
    [facet.mincount] => 1
    [facet.sort] => true
    [facet.date] => Array
        (
            [0] => tds_cck_field_date
        )

    [f.tds_cck_field_date.facet.date.start] => 2010-02-08T04:00:00Z/HOUR
    [f.tds_cck_field_date.facet.date.end] => 2011-02-14T04:00:00Z+1HOUR/HOUR
    [f.tds_cck_field_date.facet.date.gap] => +1HOUR
    [facet.field] => Array
        (
            [0] => is_cck_field_page
            [1] => ss_cck_field_options
        )

    [facet.limit] => 20
    [bf] => Array
        (
            [0] => recip(rord(created),4,30,30)^200.0
        )

    [start] => 0
    [q.alt] => tds_cck_field_date:[2010-02-08T04:00:00Z TO 2011-02-14T04:00:00Z]
)

Not sure if that's helpful, but I'm fairly new to this and am not sure how to debug the actual search call.
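
For readers unfamiliar with the date-related keys in that dump, here is a rough sketch of how such parameters can be assembled for one field. The helper name example_add_date_facet_params() is hypothetical and this is not the module's own code; the key names and value shapes are copied from the array above.

```php
<?php
/**
 * Hypothetical helper: adds Solr date-facet parameters for a single field.
 *
 * $start and $end are Solr UTC dates (e.g. "2010-02-08T04:00:00Z") and
 * $unit is a Solr date-math unit such as YEAR, MONTH, DAY or HOUR.
 */
function example_add_date_facet_params(array $params, $field, $start, $end, $unit) {
  $units = array('YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND');
  if (!in_array($unit, $units, TRUE)) {
    throw new InvalidArgumentException("Unsupported gap unit: $unit");
  }
  $params['facet'] = 'true';
  $params['facet.date'][] = $field;
  // Round the start down to the unit; extend the end by one unit so the
  // last bucket is included, then round it down as well.
  $params["f.$field.facet.date.start"] = "$start/$unit";
  $params["f.$field.facet.date.end"] = "$end+1$unit/$unit";
  $params["f.$field.facet.date.gap"] = "+1$unit";
  return $params;
}

$params = example_add_date_facet_params(array(), 'tds_cck_field_date',
  '2010-02-08T04:00:00Z', '2011-02-14T04:00:00Z', 'HOUR');
print_r($params);
```

Passing a coarser unit such as YEAR produces the top-level buckets, and the same call with MONTH, DAY or HOUR produces the narrower buckets used when drilling down.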

#31

Very helpful. Tell your sysadmin that the log of interest is catalina.out in Tomcat (if that's what you're using).

#32

#33

@mcarbone can you look and see if you are seeing queries like these take exceptionally long to execute?
SELECT MIN(cck.field_mt_dates_facet2_value2) FROM content_field_mt_dates_facet2 cck INNER JOIN node n WHERE n.status = 1;

#34

So on the problem server which is hanging, the SELECT MIN() query with the INNER JOIN takes minutes to return, but this returns instantly:
SELECT MIN(cck.field_mt_dates_facet2_value) FROM content_field_mt_dates_facet2 cck;

Anyone know how to optimize the original query to get better performance?

#35

AHH! I'm missing the ON part of the join and it is therefore doing massively stupid things at the database level.
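
To illustrate why the missing ON clause hurts so much, here is a small self-contained sketch. It uses an in-memory SQLite database through PDO (so it needs the pdo_sqlite extension) instead of the real Drupal tables; the table name content_field_date and the choice of vid as the join column are assumptions made for the example, not taken from the patch.

```php
<?php
// In-memory stand-ins for the node table and a CCK field table.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE node (nid INTEGER, vid INTEGER, status INTEGER)');
$db->exec('CREATE TABLE content_field_date (nid INTEGER, vid INTEGER, field_date_value TEXT)');

for ($i = 1; $i <= 50; $i++) {
  $day = sprintf('%02d', ($i % 28) + 1);
  $db->exec("INSERT INTO node VALUES ($i, $i, 1)");
  $db->exec("INSERT INTO content_field_date VALUES ($i, $i, '2010-01-{$day}T00:00:00')");
}

// Without ON, every field row is paired with every node row.
$cross = (int) $db->query('SELECT COUNT(*) FROM content_field_date cck
  INNER JOIN node n WHERE n.status = 1')->fetchColumn();

// With ON, each field row is paired only with its own node revision.
$joined = (int) $db->query('SELECT COUNT(*) FROM content_field_date cck
  INNER JOIN node n ON n.vid = cck.vid WHERE n.status = 1')->fetchColumn();

$min = $db->query('SELECT MIN(cck.field_date_value) FROM content_field_date cck
  INNER JOIN node n ON n.vid = cck.vid WHERE n.status = 1')->fetchColumn();

echo "rows without ON: $cross, rows with ON: $joined, earliest: $min\n";
```

With 50 rows per table the unconstrained join already examines 2,500 row pairs, and the count grows with the product of the table sizes, which fits the minutes-long query reported in #34.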

#36

Status: needs work » needs review
Attachment | Size | Status | Test result | Operations
apachesolr_date.patch | 47.81 KB | Ignored: Check issue status. | None | None

#37

Attachment | Size | Status | Test result | Operations
apachesolr_date.patch | 47.51 KB | Ignored: Check issue status. | None | None

#38

There's packaging script crud in all the .info files.

#39

Status: needs review » needs work

A bunch of cruft in your patch, like:

Index: apachesolr.info
===================================================================
RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr.info,v
retrieving revision 1.1.2.1.2.8
diff -u -p -r1.1.2.1.2.8 apachesolr.info
--- apachesolr.info 4 Jun 2009 13:33:36 -0000 1.1.2.1.2.8
+++ apachesolr.info 21 Mar 2010 19:07:23 -0000
@@ -5,3 +5,10 @@ dependencies[] = search
package = Apache Solr
core = "6.x"
php = 5.1.4
+
+; Information added by drupal.org packaging script on 2010-03-10
+version = "6.x-2.x-dev"
+core = "6.x"
+project = "apachesolr"
+datestamp = "1268179277"
+

#40

Status: needs work » fixed

I cleaned the info files and committed. Further testing and review welcome - but there's great value in getting the code out there.

#558160 by robertDouglass, mihha | DenRaf, mcarbone, haxney: Added date facet for cck fields.

#41

Followup to fix broken text indexing.

Attachment | Size | Status | Test result | Operations
date_facet_followup.patch | 2.43 KB | Ignored: Check issue status. | None | None

#42

That patch works for me and the text facet now appears. However, and this was happening before but I was waiting to report it until the other missing facet issue was resolved, the date facet block disappears once I filter using either one of my other non-date facets.

#43

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

#44

Hi,

Is this patch now committed to the current dev or do we still have to apply it?

Also, there is another patch in this thread http://drupal.org/node/920880

Which is the right one to use to get CCK date facets working correctly?

Thanks

#45

#41 has been committed, yes.

#46

Hi - I'm still getting problems after using the latest dev as per below. Do I need to reindex before it's fixed? MC

Filter by Date

2009 (15)
2009 (15)
2009 (15)
2009 (11)
2009 (11)
2009 (11)
2009 (11)
2009 (10)
2009 (10)
2009 (9)

#47

The same issue here: the output I get for a CCK date field (latest -dev) isn't usable (the sorting is very odd!)

2009 (2)
2002 (1)
2005 (1)
2009 (1)
2009 (1)

Afterwards, when I click on a link, I go straight to hour-level entries:

(-) 8:52 AM
8:52 AM (1)
2:33 PM (1)

I've re-indexed without solving the issue.

#48

See the thread: #920880: facet_block_callback not propagated
There is a patch in http://drupal.org/node/920880#comment-3543678

One part is in DEV, the other part is not. My sites behave the same way without the second part of the patch... Give me a few hours to make the patch again and I'll post it.

#49

It's been more than a few hours...

And maybe we should continue in the other issue...

Attachment | Size | Status | Test result | Operations
fix-date-facet.patch | 962 bytes | Ignored: Check issue status. | None | None

#50

Status: closed (fixed) » patch (to be ported)

@mihha: thank you very much! Your patch is the only one needed to be applied to the latest 2.x-dev version to solve this issue.

#51

@mihha: brilliant! Well done - the last patch seems to have fixed the CCK date formatting, now showing years at the top level, drilling down into months and days.

I now need to figure out how to control the index so it only offers upcoming dates, and optionally past dates. Any clues on how to do that? One thought was to have a separate, maybe calculated, field with values such as 'Today', 'Tomorrow', 'This Week', 'Next Week', etc.
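
A rough sketch of what such a calculated value could look like, purely as an illustration of the idea. The function name example_relative_date_label() and the extra 'Past' and 'Later' labels are made up for the example; it works in UTC and treats weeks as running Monday to Sunday.

```php
<?php
/**
 * Hypothetical helper: maps a timestamp to a relative label, given "now".
 *
 * Both arguments are Unix timestamps; days are compared in UTC.
 */
function example_relative_date_label($timestamp, $now) {
  // Whole-day difference between the event date and today.
  $days = (int) floor($timestamp / 86400) - (int) floor($now / 86400);
  if ($days < 0) {
    return 'Past';
  }
  if ($days === 0) {
    return 'Today';
  }
  if ($days === 1) {
    return 'Tomorrow';
  }
  // ISO weekday of "now": 1 (Monday) through 7 (Sunday).
  $weekday = (int) gmdate('N', $now);
  if ($days <= 7 - $weekday) {
    return 'This Week';
  }
  if ($days <= 14 - $weekday) {
    return 'Next Week';
  }
  return 'Later';
}

$now = gmmktime(12, 0, 0, 2, 10, 2010);
echo example_relative_date_label(gmmktime(9, 0, 0, 2, 15, 2010), $now), "\n";
```

Note that a label stored in the index is only correct for the day it was computed, so a field like this would have to be re-indexed regularly or computed at query time instead.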

Thanks, MC

#52

Version: 6.x-2.x-dev » 6.x-3.x-dev

Moving to 6.x-3.x. Will close once facetapi has this working for Drupal 6

#53

Status: patch (to be ported) » closed (won't fix)

Closing, 6.x-3.x is working and facetapi is also semi working.