The block has been removed due to low usage and its dependency on an external database.
Remaining tasks:
- Remove the DB grants and credentials in settings.local.php
Original issue summary:
We've been working on the d.o. "Related Modules" block for 2 years now and have evaluated 9 alternative algorithms. The results show that improving the algorithms increases the click-through rate from 0.1% to 1.5%. We summarized the results in a paper for the ACM Recommender Systems Conference '09. An early-access copy of the paper was at http://mrzhou.cms.si.umich.edu/node/139 [Edit: link broken, no archive available].
One conclusion from our study was that automatic algorithms, no matter how we refine them, have certain limitations. To further improve the quality of the "Related Modules" block, perhaps we have to use Web 2.0 techniques. That is, provide a textbox (w/ auto-complete) on the module pages where users can suggest related modules. We would then aggregate the suggestions using some smart algorithm and display the result in the "Related Modules" block.
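As a rough illustration of the aggregation step, here is a minimal sketch that counts one vote per user for each suggested pair and keeps the most-suggested modules. The function name, the `(user, module, suggested)` tuple shape, and the `min_votes` / `top_n` thresholds are all assumptions made for illustration; the issue does not specify which aggregation algorithm would be used.

```python
from collections import Counter, defaultdict


def aggregate_suggestions(suggestions, min_votes=2, top_n=5):
    """Turn raw user suggestions into a related-modules list per module.

    suggestions: iterable of (user, module, suggested_module) tuples.
    This tuple shape is hypothetical, not taken from the issue.
    """
    seen = set()
    votes = defaultdict(Counter)
    for user, module, suggested in suggestions:
        key = (user, module, suggested)
        # Ignore self-suggestions and repeat votes from the same user.
        if module == suggested or key in seen:
            continue
        seen.add(key)
        votes[module][suggested] += 1

    related = {}
    for module, counter in votes.items():
        # Most votes first; ties broken alphabetically for stable output.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        related[module] = [name for name, n in ranked if n >= min_votes][:top_n]
    return related
```

Requiring a minimum number of distinct voters is one simple way to keep a single user from filling the block; a real deployment would likely need more spam protection than this.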
An alternative might be to have the module authors add related links on their module page. However, this has 2 problems: 1) the authors don't necessarily know all the related modules. 2) they might be reluctant to add links to substitute modules.
If the community generally thinks this is a good idea, I can work on a prototype.
Your comments are appreciated!
| Comment | File | Size | Author |
|---|---|---|---|
| #31 | pivots-drupal-devsite-backup.patch | 5.58 KB | tvn |
| #23 | pivots_block.patch | 7.74 KB | danithaca |
| #22 | pivots_block.js_.txt | 1.3 KB | drumm |
| #21 | pivots_block.patch | 7.17 KB | danithaca |
| #18 | pivots_block.patch | 6.44 KB | danithaca |
Comments
Comment #1
killes@www.drop.org commented
I think this needs to go into the redesign queue.
Comment #2
Amazon commented
Daniel, I think you should go ahead with this. I don't think adding another block for adding module recommendations is that cumbersome. We can test on drupal.org and see if it adds a lot of value.
Comment #3
danithaca commented
Thanks! I'll go ahead and try to implement it :)
Comment #4
Amazon commented
This is now deployed on staging8.drupal.org. Let's get the module recommendation blocks, and the ability to recommend related modules, added to that site.
Comment #5
danithaca commented
Explanation of the algorithm from my advisor Paul Resnick:
"...[This is] an integrated version that automatically adjusts which recommendations it makes based on user clickthroughs, through a technique called a Multi-Armed Bandit learning algorithm. All of the matching algorithms (based on text similarity, co-mention in conversations, or co-installation) generate initial candidate sets of items to recommend, but the actual items to recommend will adjust over time based on clickthroughs. The adjustments can happen offline, in batches, so they don't affect page load performance."
Comment #6
danithaca commented
The code is checked in at https://svn.drupal.org/drupal/redesign_modules_sandbox/staging8/custom/
The same copy is posted here for easier review:
Comment #7
danithaca commented
A working demo can be found at http://staging8.drupal.org/project/cck
Comment #8
gerhard killesreiter commented
Please provide a patch so I can see the changes.
Comment #9
danithaca commented
Patch attached. Thanks!
Comment #10
fgm
I'd have liked to read the paper, but the PDF does not seem to be downloadable from the URL given:
http://mrzhou.cms.si.umich.edu/files/sites/mrzhou.cms.si.umich.edu/files...
(interesting spam you have over there, BTW: some of it looks actually manual)
Comment #11
michelle
Any movement on this? Having the broken block on there is just taking up space...
Michelle
Comment #12
danithaca commented
@Michelle: Thanks for asking. May I ask a favor from you to review the code for me? I'll reciprocate if ever you'll need a code review. thanks!! --daniel
Comment #13
michelle
I don't really know anything about reviewing code... I looked at the patch and didn't see anything that jumped out at me. Not sure what to say about it...
Michelle
Comment #14
drumm
This looks like a patch to drupalorg.
Comment #15
drumm
This only sends data to Google Analytics. They do provide an API to get data back out, but it would be good to have a copy recorded directly in our local database, in a new table.
Comment #16
danithaca commented
@drumm: I have uploaded a new patch with minor changes to the Google Analytics code. Yes, you are right: the patch only sends data to GA. But we also have a Python script (mab4do.py, attached) that retrieves the GA data, saves it in a local database hosted on d.o., and then updates the "related projects" block based on the GA clicks/suggestions data. The Python script would run on scratchvm.drupal.org once a day. The algorithm we use is based on a particular version of the multi-armed bandit algorithm, documented in the paper "Optimal learning and experimentation in bandit problems".
I have fully tested the patch on our local d.o. development environment. If the code review is fine, I can do "git apply patch" to the drupalorg module myself. Thanks.
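The "save to a local database, then update the block" part of the daily job described in this comment could look roughly like the sketch below. It is not mab4do.py: the GA retrieval step is left out entirely (the rows are assumed to be already fetched), `sqlite3` stands in for the d.o. database, and the table name `pivots_clicks` and its columns are hypothetical.

```python
import sqlite3


def store_clicks(conn, rows):
    """Save daily click counts; rows are (day, source_nid, dest_nid, clicks).

    The row shape and the pivots_clicks table are assumptions for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pivots_clicks ("
        " day TEXT, source_nid INTEGER, dest_nid INTEGER, clicks INTEGER,"
        " PRIMARY KEY (day, source_nid, dest_nid))"
    )
    # INSERT OR REPLACE keeps a re-run of the same day from double counting.
    conn.executemany(
        "INSERT OR REPLACE INTO pivots_clicks VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def top_related(conn, source_nid, limit=5):
    """Return destination node ids for one project, most-clicked first."""
    cur = conn.execute(
        "SELECT dest_nid, SUM(clicks) AS total FROM pivots_clicks"
        " WHERE source_nid = ? GROUP BY dest_nid"
        " ORDER BY total DESC, dest_nid ASC LIMIT ?",
        (source_nid, limit),
    )
    return [nid for nid, _total in cur.fetchall()]
```

Keeping the raw counts in a local table, as requested in comment #15, means the block can be rebuilt without calling GA again.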
Comment #17
drumm
- The JS should be in an external file, rather than being even more inline. This should allow `\/` rather than `[/]` for escaping the `/`.
- The URL patterns should check for https
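To illustrate the https point, here is a small sketch of a project-URL pattern that accepts both schemes. The patch's actual patterns are not shown in this thread, so the regular expression, the optional `www.` prefix, and the assumed character set for project short names are all illustrative. It is written in Python, where `/` needs no escaping; in a JavaScript regex literal the same slashes would be written `\/`.

```python
import re

# "https?" matches both http:// and https:// links.
# The [a-z0-9_]+ short-name character set is an assumption.
PROJECT_URL = re.compile(
    r"^https?://(?:www\.)?drupal\.org/project/([a-z0-9_]+)/?$"
)


def project_name(url):
    """Return the project short name from a d.o. project URL, or None."""
    match = PROJECT_URL.match(url)
    return match.group(1) if match else None
```

For example, `project_name("https://drupal.org/project/pathauto")` returns `"pathauto"`, while a URL on another host returns `None`.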
Comment #18
danithaca commented
Thanks drumm! New patch attached:
- made JS into an external file; used `\/` instead of `[/]`
- URL patterns check for https
- made some code refactoring.
Comment #19
drumm
Looking better.
- In JS, use a Drupal behavior, http://drupal.org/node/304258. This will make sure the JS runs only once.
- The `ul[id^="pivots-block-list"]` selector might be slow. Can an exact class or id be used?
- onclick events should be defined in the .js file and attached with jQuery.
Comment #20
drumm
Comment #21
danithaca commented
Thanks for the feedback drumm. New patch attached, and is tested successfully on my local d.o. dev.
-- Used Drupal.behaviors in JS
-- Instead of using `[id^="pivots-block-list"]`, now it uses exact class/id match.
-- onclick events are now defined in JS and attached with jQuery.
If the patch looks fine, can you let me check in the patch myself? I'm waiting for #658218: get access to stagingvm in order to run the python update script that goes together with the patch. Thanks.
Comment #22
drumm
The PHP looks like an improvement. `$ga_data = "$node_id";` seems a bit redundant and I'd like to see `'variables' . $concatenated . 'like_this'`, but it is still improved enough.
I did run the JS through JSLint, and got to the attached JS, which I have not tested at all.
Comment #23
danithaca commented
Thanks drumm for the suggestion and JSLint code. New patch attached:
-- changed the redundant `$ga_data = "$node_id";` to be `$ga_data = $node_id;`
-- replaced original JS file with the JSLint output.
I have fully tested the patch on my local d.o. dev environment.
Comment #24
drumm
Committed and will probably deploy this afternoon.
Comment #25
drumm
Deployed.
Comment #26
drumm
This got rid of the existing suggestions, and threw a JS error on splitting undefined. I had to roll it back.
I'll need to see this working on a dev site, http://drupal.org/node/1018084, with the block cache turned on, before considering again.
Comment #27
danithaca commented
Got it. Sorry drumm for the trouble. I've tested it on my local d.o. dev server; don't know why it throws JS error on d.o....
I'm travelling now. Will resume in 10 days.
Comment #28
drumm
I queued a dev site named pivots for this.
Comment #29
gerhard killesreiter commented
I am questioning the approach taken here. I'd like to see _less_ integration with GA rather than more. Is there an opt-in button?
Comment #30
tvn commented
Is anyone working on pivots-drupal.redesign.devdrupal.org site? Last access was on April 17. We have limited space on the server, so if this development site is not needed right now, we'd love to use that space for another project.
Comment #31
tvn commented
There was no reply since August and last access date is still April 17. The dev site will be destroyed tomorrow. You can request it again, when it's needed. Attached is the patch with all current changes on the site.
Comment #32
mgifford
That patch no longer applies (obviously) - 2 out of 3 hunks FAILED - although the module's moved to sites/all/modules/drupalorg/pivots_block
Comment #33
mgifford
Comment #35
mgifford
I've been surprised lately when I've been looking at this block how often it doesn't include modules that I'd assume would have to be there. You look at the related projects block, vs what is done in the main description here:
https://drupal.org/project/apachesolr
How are the related projects here actually related? Metatags & Pathauto aren't all that related:
https://drupal.org/project/pathauto
Maybe it would be easier if there was a More link that gave you the option to see others. Right now it almost seems random.
Comment #40
Bojhan commented
Just wondering, are there any stats on how often this is used?
Comment #43
drumm
This issue is about adding a way to suggest related modules to the algorithm used. Since we haven't heard from danithaca in 3 years, I don't think it is happening.
Comment #44
ar-jan commented
I'd like to hear whether we don't actually want this feature, or if it's something that is just not a priority at the moment.
Because I really recognize what mgifford noticed in #35, and it seems to me like crowdsourcing this could improve the situation.
I often notice the absence of related modules that are in my mind clearly related. Since better recommendations are especially helpful to new Drupal users, I'd say we do want this. Could we set this to postponed (waiting on volunteer effort or budget) instead?
Comment #45
drumm
The algorithm that runs to recommend related modules actually runs independently of our infrastructure, on something danithaca has set up. This issue is about improving that system. We can't deploy anything without his participation.
Replacing the whole system would be a separate issue, and an opportunity to decide what factors into the lists, automated or not.
Comment #46
mgifford
How can this be a Won't Fix? If we don't have control of this block we should just remove it. But before we get to that stage, let's try to bring this functionality in-house: #2414533: Make a new "Related Modules" block that works well and sits on our infrastructure
Anyways, this is still a problem on the d.o site and certainly isn't something that should be considered beyond our capacity as a community.
Comment #47
yesct commented
Comment #48
mgifford
This really shouldn't be seen as a feature request. This functionality is very flawed right now.
Just looking at the top modules it is pretty clear how useless this block is at the moment.
Views - Content Construction Kit (CCK), Panels, Advanced help, Views Bonus Pack, ImageCache
CTools - Image Resize Filter, Background, Toggle WWW, Admin role, Token
Token - Pathauto, Automatic Nodetitles, ImageAPI, Transliteration, Global Redirect
Libraries - Fetchgals, Goto, Library, API, YUI Rich Text Editor
Pathauto - Handy alias, Extended Path Aliases, View Alias, Sub-path URL Aliases, NodeSymlinks
Entity API - RESTful Web Services, Entity cache, Remote entities, Relation, Dynamic properties
Admin Menu - ImageAPI, Token, Admin Menu Hider, Toolbar, Backup and Migrate
Date - Calendar, Javascript Aggregator, Media: Archive, Case Tracker Due Date, GCal Events
Webform - Clientside Validation, Salesforce Webform Integration, Webform Validation, Webform Conditional (Same Page Conditionals), Webform Views Integration
This would be a great space to introduce folks strategically to modules that would either be competitors (say Webform & Entityform) - but really, webform stands out as providing at least some useful information.
Date & pathauto do pretty well too, but this block really isn't useful on the majority of project pages.
Why the heck do we even have a Fetchgals module on Drupal.org let alone, why are we promoting it on one of the top project pages of d.o?
Comment #49
tvn commented
Luckily we do have event tracking set up for this block in Google Analytics. For the last 12 months, users clicked on any link in the 'Related projects' block 1.75% of the times it was displayed. Considering that the suggestions the block currently provides are not too accurate, and that it relies on external infrastructure, I think we should remove it, with the possibility of bringing it back in some form that ensures accurate information and involves no external infrastructure.
Comment #50
drumm
The block is now disabled. Leaving the issue open for code cleanup.
Comment #52
drumm
The DB grants and credentials in settings.local.php can now be removed.
Comment #53
drumm
Comment #54
Bojhan commented
@tvn Of that 1.75%, how much led to a successful download?
Comment #55
danithaca commented
Sorry guys for not following up as the creator of the issue. For some reason I didn't receive updates on this issue until now.
The patch I submitted earlier used the "Multi-armed bandit" algorithm, which automatically moves modules that got more clicks to the front of the "related projects" list. Overall, the goal is to increase the click rate on the "related projects" block, which is an indicator of whether people find the recommendations useful.
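The idea of moving more-clicked modules to the front while still trying other candidates can be sketched with an epsilon-greedy bandit updated in batches. This is a swapped-in, simpler variant chosen for illustration: the thread does not say which bandit variant the patch or mab4do.py used, and the class name, the smoothing, and the `epsilon` value are assumptions.

```python
import random


class BatchBandit:
    """Epsilon-greedy ranking over a fixed candidate set (illustrative only)."""

    def __init__(self, candidates, epsilon=0.1, seed=None):
        self.stats = {c: {"shown": 0, "clicked": 0} for c in candidates}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def click_rate(self, item):
        s = self.stats[item]
        # Add-one smoothing so unseen items start at 0.5 rather than 0.
        return (s["clicked"] + 1) / (s["shown"] + 2)

    def update(self, batch):
        """Apply an offline batch of (item, shown, clicked) totals."""
        for item, shown, clicked in batch:
            self.stats[item]["shown"] += shown
            self.stats[item]["clicked"] += clicked

    def choose(self, k):
        """Pick k items: mostly the best-performing, sometimes a random other."""
        ranked = sorted(self.stats, key=self.click_rate, reverse=True)
        chosen, rest = ranked[:k], ranked[k:]
        for i in range(len(chosen)):
            # With probability epsilon, swap in a lower-ranked candidate.
            if rest and self.rng.random() < self.epsilon:
                j = self.rng.randrange(len(rest))
                chosen[i], rest[j] = rest[j], chosen[i]
        return chosen
```

Because `update` takes aggregated totals, it can run once a day outside the page request, and only the stored result of `choose` needs to be read when the block is rendered.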
I'm not sure whether there is still an interest to deploy this feature. If so, I can follow up and contribute more time to it.
An alternative is to use ApacheSolr's "More Like This" feature. Since d.o. already runs on ApacheSolr, it is the easiest approach to get deployed. Our research back in 2009 on d.o. used A/B testing (actually multivariate testing) and showed that "More Like This" is the second best approach (in terms of click rate) compared to a few other alternatives. It is not ideal but still a decent option.
Comment #56
tvn commented
@Bojhan, I don't think the event tracking that was set up for this gives us an answer to that, sorry.
@danithaca, thanks for your suggestion. A 'Related projects' block on project pages is valuable; however, we are not going to re-deploy this particular implementation of it unless it is refactored to not depend on external infra and to show more relevant results. It might indeed make sense to look into alternative implementations, such as ApacheSolr. Another alternative might be #1323826: Make it easier to identify and discover ecosystem modules.
Comment #57
mlhess commented
Closing issue.
@danithaca, if you want to refactor this to work on d.o infra, please create another issue and email me about it.