hi there,

i have a question about how best to integrate with the views module for a use case that i explain in more detail below. basically: is it feasible to implement aggregation similar to how views does it, but in php instead of mysql?

background story

i'm currently working on a research project that deals with server-side clustering of geospatial data in drupal: http://drupal.org/project/geocluster
the clustering will work similarly to http://www.crunchpanorama.com/ and will therefore combine overlapping points by comparing the distances between them.
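
to make the idea concrete, here is a minimal, language-agnostic sketch in python of distance-based clustering. it is an assumption for illustration only, not the geocluster implementation: the function names, the greedy "join the first cluster that is close enough" strategy, the haversine distance and the plain averaging of coordinates are all hypothetical choices.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cluster_points(points, threshold_km):
    """Greedy clustering (hypothetical strategy).

    Each point joins the first cluster whose centre lies within
    threshold_km; otherwise it starts a new cluster. The centre is the
    running average of member coordinates, which ignores edge cases
    such as the antimeridian.
    """
    clusters = []
    for lat, lon in points:
        for c in clusters:
            if haversine_km((c["lat"], c["lon"]), (lat, lon)) <= threshold_km:
                n = c["count"]
                c["lat"] = (c["lat"] * n + lat) / (n + 1)
                c["lon"] = (c["lon"] * n + lon) / (n + 1)
                c["count"] = n + 1
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append({"lat": lat, "lon": lon, "count": 1})
    return clusters


# Two points in Vienna collapse into one cluster, Paris stays separate.
clusters = cluster_points(
    [(48.2082, 16.3738), (48.2100, 16.3700), (48.8566, 2.3522)], 5.0)
```

the result depends on the order of the input points, which is acceptable for a sketch but is one reason a real implementation might prefer a grid-based approach.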

displaying points on a map is yet another great use case for the views module, and has already been implemented by modules like openlayers, geofield and others. i want my solution to integrate as cleanly as possible with drupal's best-practice modules.

the two main approaches that i want to implement are

  1. cluster views results in php (average users will like that for simplicity)
  2. cluster in apache solr with search_api (advanced users will like that for performance)

i have some draft code that already combines points in a views result, but i don't know how to manipulate the views result in order to replace two points with a cluster of them: http://drupalcode.org/project/geocluster.git/blob/refs/heads/7.x-1.x:/vi...

you might also think of this as replacing the original result items with fake entities.

thanks for any input that might point me in the right direction!

Comment | File | Size | Author
#2 | views_post_execute_query_hook.patch | 736 bytes | dasjo

Comments

dasjo’s picture

had some good discussions with dawehner.

pasting irc log for reference:

20:13:07 dawehner: dasjo: i'm wondering whether you could work with an imaginary hook_views_query_result_alter?
20:13:24 dasjo: dawehner: another option would be implement a custom views_plugin_query
20:14:34 dawehner: dasjo: ... you wrote that you want to support both search_api and normal sql, do you think you can abstract that to work with both at the same time?
20:14:37 dasjo: dawehner: yep, that sounds interesting. the interesting part i guess is how to alter the result data in a way that it will still be valid for views
20:15:26 dawehner: dasjo: this sadly depends on the different kinds of style plugins the user chooses at the end
20:15:35 dasjo: dawehner: supporting both approaches is definitely on the agenda, but i'm not sure if for both i will be able to integrate in the same way
20:16:33 dasjo: dawehner: i would like to work on the data itself as much as possible, so different style plugins can use the clustered data
20:16:44 dawehner: dasjo: your query plugin plan will run some custom logic after the query has executed?
20:17:06 dasjo: dawehner: and clustered data has to be exposed to views in some way
20:17:08 dawehner: dasjo: one problem for you is probably that fieldapi data is not part of the actual resultset?
20:17:52 dasjo: dawehner: yes, i have custom logic that will iterate over the result set and merge some result items into clustered items
20:17:54 dawehner: dasjo: is your plan to change the result/entity data to contain fake data?
20:18:22 dasjo: dawehner: for the search_api approach, the partly-clustered data should already be provided via apache solr
20:18:54 dasjo: dawehner: about changing the data - that's a key question here
20:19:31 dasjo: dawehner: i can either modify existing data (cluster 3 points, point #1 will be the clustered point, update its lat/lng etc…)
20:19:33 dawehner: dasjo: i like the separation of load_entities!
20:20:21 dasjo: dawehner: or add new fake entities for the clustered points and remove all of original ones (maybe that's better)
20:20:26 dawehner: dasjo: can you assume that every geo data is stored in a fieldapi field?
20:20:38 dasjo: dawehner: yep, did you see that in the geocoder module draft?
20:20:50 dawehner: dasjo: yes i'm looking at it right now
20:21:11 dasjo: dawehner: hmmm, that might be the main use case. what other sources come to your mind?
20:21:25 dasjo: dawehner: regarding fieldapi geodata
20:22:12 dawehner: maybe you want to support some external data, which you can't control how they are stored
20:22:26 dawehner: but i would guess this is too much abstraction at that point
20:22:51 dasjo: dawehner: could be, but i'm ok with depending on geofield for now and abstract that further once i have a working prototype
20:25:08 dawehner: dasjo: so the current solution of extending the fieldapi handler works for sql, but what about search_api, do you need geolocation specific functionality on the query level?
20:26:43 dasjo: dawehner: i like the idea of fake entities, but don't really know how to approach it. the potential views plugin (whichever kind it is) knows about the fields that the view has been configured for. in the plugin settings, the user assigns for every field what to do with it (geofield: cluster, id & title: something like concat, any number: sum) … this leads us to the same use case that aggregation has
20:27:27 dasjo: dawehner: so the plugin knows what to do with all the fields, iterates over the results and creates fake entities that have the same fields but with aggregated values
20:28:08 dawehner: dasjo: so kind of how views implements sql aggregation at the moment as well
20:28:37 dasjo: dawehner: yes exactly - could that work? in the end the result data has to be consistent (schema, or whatever you call it in views, drupal)
20:28:38 dawehner: dasjo: creating fake entities is probably more complicated than updating existing ones?
20:29:19 dasjo: dawehner: yes, maybe. that's the part i really don't know about well and i would love to find advice for
20:30:19 dasjo: dawehner: updating existing results (they aren't == entities i guess) is still tricky as i have to create the same data structure that views creates for its results 
20:30:29 dawehner: dasjo: do you have to not only change the actual data but also change the amount of items in the result?
20:31:30 dasjo: dawehner: yes, aggregation or in my case clustering combines multiple results into one (places that overlap / are very close will be a single cluster)
20:32:47 dasjo: dawehner: i outlined this behavior roughly at the end of the addCluster function: http://drupalcode.org/project/geocluster.git/blob/refs/heads/7.x-1.x:/views/handlers/geocluster_handler_field_geofield.inc#l108
20:32:49 dawehner: dasjo: yeah i don't see some critical problems with that approach
20:33:35 dawehner: dasjo: you might run into additional problems like pagers, but i'm not sure how this is actually useful for maps
20:34:26 dasjo: dawehner: good point, added it to my todo list :)
20:35:19 dawehner: dasjo: for the entity altering of views you could have a look at http://drupalcode.org/project/views.git/blob/refs/heads/7.x-3.x:/modules/field/views_handler_field_field.inc#l727
20:36:24 dasjo: dawehner: looks interesting
20:37:21 dasjo: dawehner: what do you call that grouping in views? that's not aggregation, right?
20:37:51 dawehner: dasjo: in the ui we call it aggregation
20:38:30 dawehner: dasjo: and in code it is called groupby, but there is also the string based groupby of styles
20:38:32 dasjo: dawehner: so there is sql group-by that's called aggregation, and the other one?
20:39:09 dasjo: dawehner: yeah, the string based groupby is what i was thinking of. does the code refer to sql group-by or the other?
20:39:39 dawehner: dasjo: it's sql groupby
20:40:05 dasjo: dawehner: alright, anyways i think i have something to work with here, thanks!
20:40:16 dawehner: to be able to fake the entity (to render the field) it adds MIN(entity_id) to the query
20:40:32 dawehner: which doesn't change the result if you have a "GROUP BY" in sql
20:42:05 dasjo: alright, that's neat. so it somehow stores the aggregated information in a fake entity with the minimum entity_id of the grouped results?
20:43:08 dasjo: dawehner: maybe i can just reuse that code snippet and we can think later of splitting it like load_entities 
20:45:16 dawehner: dasjo: exactly!
20:45:39 dasjo: dawehner: should i care about the rest here? http://drupalcode.org/project/views.git/blob/refs/heads/7.x-3.x:/modules/field/views_handler_field_field.inc#l760
20:45:52 dasjo: to be honest, i don't really know about deltas :) 
20:47:15 dasjo: ah well, that seems to explain it http://drupal.stackexchange.com/questions/13902/what-is-the-meaning-of-the-fielddelta-content-type-offered-in-contextual-filter
20:49:46 dasjo: dawehner: well thanks so far, i think i will give it some trial and error now :)
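
to illustrate the per-field aggregation idea from the log (geofield: cluster, title: concat, number: sum, with the minimum entity_id standing in for the merged row), here is a rough python sketch. everything in it is hypothetical: the field names `entity_id`, `title`, `geofield` and `visitors`, the dict-based rows and the helper functions are assumptions for illustration and do not reflect the actual views result structure.

```python
def centroid(coords):
    """Average a list of (lat, lon) pairs; stands in for 'cluster'."""
    lats, lons = zip(*coords)
    return (sum(lats) / len(lats), sum(lons) / len(lons))


def aggregate_rows(rows, aggregators, id_field="entity_id"):
    """Merge several result rows into one row with the same fields.

    The merged row keeps the smallest id of the group as its
    representative, mirroring the MIN(entity_id) trick mentioned above.
    """
    merged = {id_field: min(row[id_field] for row in rows)}
    for field, combine in aggregators.items():
        merged[field] = combine([row[field] for row in rows])
    return merged


# Hypothetical per-field settings, as a user might configure them.
aggregators = {
    "geofield": centroid,   # cluster the coordinates
    "title": ", ".join,     # concatenate strings
    "visitors": sum,        # sum numbers
}

rows = [
    {"entity_id": 7, "title": "cafe", "geofield": (48.0, 16.0), "visitors": 10},
    {"entity_id": 3, "title": "museum", "geofield": (48.2, 16.4), "visitors": 5},
]

merged = aggregate_rows(rows, aggregators)
# merged["entity_id"] is 3, merged["title"] is "cafe, museum",
# merged["visitors"] is 15
```

because the merged row has the same keys as the input rows, downstream code that only reads those fields would not need to know whether it is looking at a single point or a cluster.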
dasjo’s picture

Title: How to inject a custom aggregation implementation? » Allow to inject a custom aggregation implementation
Category: support » feature
Status: Active » Needs review
Status: new | File size: 736 bytes

i have further developed my prototype for clustering points using php. the current implementation builds upon a patch, but that might not be the final solution, as explained below.

the geocluster_handler_field_geofield doesn't contain any logic; it only serves to configure the clustering.

the Geocluster class currently depends on the field_handler. i think the custom field_handler won't be necessary; instead it should be possible to create some kind of views plugin that takes over the configuration options and hooks in where i currently use my implementation of a new hook_views_post_execute_query (see attached patch).

also related: we have discussed splitting up views_handler_field_field::post_execute a bit.
i have an equivalent implementation of entities_by_type.
in addition to that, the load_entity_fields method is similar to the load entities part, but avoids unnecessary entity loads for result items that will be clustered anyway.
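
as a rough illustration of skipping entity loads for items that end up in a cluster, here is a small python sketch. it is only an assumption about the general idea, not the code in the module: the cluster structure with an `ids` list, the function name `load_unclustered_entities` and the `load_multiple` callable (a stand-in for whatever bulk loader is available) are all hypothetical.

```python
def load_unclustered_entities(clusters, load_multiple):
    """Load entities only for clusters that contain a single row.

    Rows merged into a multi-item cluster are rendered from aggregated
    values, so loading their entities would be wasted work.
    load_multiple is a hypothetical callable: list of ids -> {id: entity}.
    """
    wanted = [c["ids"][0] for c in clusters if len(c["ids"]) == 1]
    # Skip the loader entirely when nothing needs loading.
    return load_multiple(wanted) if wanted else {}


# Usage with a fake loader that records which ids were requested.
requested = []


def fake_load_multiple(ids):
    requested.append(list(ids))
    return {i: {"id": i} for i in ids}


clusters = [{"ids": [1, 2, 3]}, {"ids": [4]}, {"ids": [5]}]
entities = load_unclustered_entities(clusters, fake_load_multiple)
# Only ids 4 and 5 are requested; 1, 2 and 3 are never loaded.
```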

dawehner’s picture

It would be cool to document this hook in views.api.php to tell people when exactly this hook should be used.

dasjo’s picture

> It would be cool to document this hook in views.api.php to tell people when exactly this hook should be used.

yes, of course. but i'm still not convinced that this hook is the right approach to solving my problem:

> the Geocluster class currently depends on the field_handler. i think the custom field_handler won't be necessary; instead it should be possible to create some kind of views plugin that takes over the configuration options and hooks in where i currently use my implementation of a new hook_views_post_execute_query (see attached patch).

so instead of storing configuration in a custom field_handler, i would like to be able to create a plugin/handler that alters views results after query execution. i was looking into altering views_object_types() but i can't see a straightforward approach to doing this, as this type of plugin/handler doesn't exist in the views api yet.

dasjo’s picture

after some good conversations with dawehner, who even drafted code for me (thanks!!), i was able to make some progress here.

i have moved from a custom field handler to using a custom views_plugin_display_extender to store clustering configuration: GeoclusterViewsDisplayExtender will capture all settings required for clustering.

the actual clustering is still invoked using the above patch for adding hook_views_post_execute_query, see #2.

that approach seems ok to me for now.

mpgeek’s picture

We need to evaluate whether this is already covered in D8 in general. This is of immediate use for #2578785: Large-scale location mapping in Drupal 8 (Views). Adding it as a related issue.

chris matthews’s picture

The 3-year-old patch in #2 for views_plugin_query_default.inc applied cleanly to the latest views 7.x-3.x-dev and, if still relevant, needs to be reviewed.