
Upgrade Memetracker to work with Feeds instead of FeedAPI

Project: Memetracker
Version: 6.x-1.1-alpha5
Component: Code
Category: task
Priority: normal
Assigned: Unassigned
Status: active

Issue Summary

Memetracker is great, but it depends on FeedAPI, which has been deprecated in favor of the Feeds project. Feeds is better in all respects: it's easier to map RSS items to CCK fields, it performs better, and it tends not to cause cron to crap its pants quite so much.

I am not an experienced developer. I will try to write out my attempts at making this happen here. If there are good coders who'd like to help, please pitch in.

Part 1: Find the parts of the Memetracker code that plug into FeedAPI

Intellectually, my understanding is that the memetracker module is a bunch of code that does various things. One of those things consists of grabbing content to run through the memetracker process and build memes. So it's a safe bet there are places in the code where functions have been written that grab content. As we can assume that Kyle wrote decent code, it's a safe bet that there are just a few isolated functions that control this behavior. Also, it's a safe bet that within those isolated functions, there are isolated lines that explicitly refer to FeedAPI stuff.

Assertion: In order to update memetracker to work with Feeds, we need to find the bits of the code that reference FeedAPI, understand what they're doing, figure out the equivalent code for Feeds, and then replace the lines in question with new code.

So let's find the places in the code that handle grabbing content:

The memetracker module consists of a folder full of files.

A few of these files are .tpl files. They handle how stuff gets displayed, so we don't need to look inside of them.

Similarly, the .info and .txt files don't have code in them, and the .install file only holds the install and database schema code, so we can rule them out as places to look for the content-grabbing functions.

That leaves .inc files and the memetracker.module file.

As I understand it, the memetracker.module file is the "topmost" file in the module. It explains how Drupal interacts with all the core code functions that make memetracker do its thing. Those core code functions live in the .inc files. Let's take a look at them:

There are 7 .inc files in the memetracker module. I will go inside each of those files and Ctrl+F for the word "feed." Assuming that the lines that need to be replaced and updated are lines that mention "FeedAPI", this should give us an idea of which files will need to be patched to get Memetracker working with Feeds. The number of matches in each file is in parentheses:

-click.data.inc (0)
-content.inc (20)
-content_source.inc (22)
-machine_learning_api.inc (0)
-meme.inc (0)
-memetracker.admin.inc (47)
-memetracker.inc (0)

Based on this, it's clear that we will only need to worry about 3 files: content.inc, content_source.inc, and memetracker.admin.inc.
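
The same search can be done with a short script instead of opening each file by hand. This is only a sketch: the function names are made up for illustration, and the module path in the usage comment is an assumption about where the module is installed. It ignores case when counting, so the numbers may differ from a case-sensitive search.

```php
<?php
// Count case-insensitive occurrences of a search term in a string of source code.
function count_term($source, $term) {
  return substr_count(strtolower($source), strtolower($term));
}

// Count matches in every .inc file directly inside a module directory.
// Returns an array of filename => match count, sorted by filename.
function count_term_in_inc_files($module_dir, $term) {
  $counts = array();
  // glob() can return FALSE on error, so fall back to an empty list.
  $paths = glob($module_dir . '/*.inc');
  if (!$paths) {
    $paths = array();
  }
  foreach ($paths as $path) {
    $counts[basename($path)] = count_term(file_get_contents($path), $term);
  }
  ksort($counts);
  return $counts;
}

// Usage (the path is hypothetical):
// print_r(count_term_in_inc_files('sites/all/modules/memetracker', 'feed'));
```

Any file with a count of zero can be skipped, which is how the list above gets narrowed down to three files.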

That's great news. We've gone from 7 scary files full of code that we have to understand down to just 3 scary files full of code that we have to understand. Progress!

Now we need to figure out what each of those three files does. Kyle gave them names so we can assume they are not arbitrary collections of code. Here's how I'd describe each of them:

- content.inc - This file appears to contain the code that defines all the get_whatever theme functions used in memetracker to display content. These are the functions you call from the theme layer to show the title, timestamp, content source for items in the $meme array.

(Digression to explain how memetracker works: In memetracker, you're displaying a list of "memes" listed in reverse order of "interestingness." Each meme is a $meme object. And within that object you've basically got a list of feed items that have been clustered together based on keyword relevance by the memetracker code (thank goodness we're not trying to update that stuff!). Each item is assigned an interestingness value based on criteria defined somewhere in the code (I don't know where yet). The most interesting item in the object is designated as the lead headline, the top story. All the other items are then displayed below it in the meme as related content. As far as I can tell, these items are listed in random order. That's another thing I hope to fix in another task here in the issue queue.)
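
To make the digression concrete, here is a rough sketch of the headline-picking idea. It is not memetracker's actual code: the function name, the array layout, and the 'interestingness' key are all made up for illustration, and I'm assuming a higher score means more interesting. Unlike what I see on my site, this sketch also sorts the related items by score instead of leaving them in random order.

```php
<?php
// Hypothetical stand-in for a meme: a plain array of feed items, each with
// a title and an interestingness score. The real $meme object will differ.
function example_build_meme_display(array $items) {
  // Sort items from most to least interesting (assumes higher = better).
  usort($items, function ($a, $b) {
    if ($a['interestingness'] == $b['interestingness']) {
      return 0;
    }
    return ($a['interestingness'] > $b['interestingness']) ? -1 : 1;
  });
  // The most interesting item becomes the lead headline.
  $headline = array_shift($items);
  // Everything left is shown below it as related content.
  return array('headline' => $headline, 'related' => $items);
}

$example_meme = example_build_meme_display(array(
  array('title' => 'Story B', 'interestingness' => 0.4),
  array('title' => 'Story A', 'interestingness' => 0.9),
  array('title' => 'Story C', 'interestingness' => 0.1),
));
// $example_meme['headline']['title'] is 'Story A'; the related items are
// 'Story B' and then 'Story C'.
```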

Looking through the file, it appears that Kyle originally built memetracker so it could work with FeedAPI OR the core aggregator module. This is very interesting. Did not know that.

So we'll need to patch this file so that the functions grabbing content grab from the Feeds tables and references and not the FeedAPI ones. Ok.

- content_source.inc - This file appears to contain the functions that actually go grab up all the feed items you need when constructing your memetracker. It appears to contain a bunch of MySQL queries that go grab feed items from the database. So to update memetracker to work with Feeds, we'll need to rewrite these queries to match the Feeds conventions. Ok.

- memetracker.admin.inc - This file has all the functions that create the admin UI for memetrackers. Each memetracker has two admin pages, a page where you can add and remove feeds from the tracker, and a page where you can tweak the display settings for the feed. This file contains the functions that generate what shows up on those pages, so it too has functions that plug into the FeedAPI stuff. We'll need to update these.

Ok, so we've found all the bits that plug into FeedAPI and have a sense of what they do.

Next steps:

1. Figure out confusing functions in content.inc - The relevant-looking functions in content.inc are on lines 452 - 557. These appear to be a "class" full of "private functions." I took an OOP class a few years ago so I know these are 101 CompSci concepts, but I will need to read up on what they mean so I understand what's happening here. Also, the biggest function inside this class is called __construct and contains stuff I'm not familiar with. I'll go ask over on Stack Overflow and see if people can explain.

Here's the function that's confusing to me. Not sure what it's doing.

public function __construct($mid, $cid = Null, $int_id = Null, $content_type = Null,
    $timestamp = Null, $source = Null) {
    // call parent constructor in content_drupal
    parent::__construct($mid, $cid, $int_id, 'content_drupal_feedapi', $timestamp, $source);
   
    // If the content object is already saved to database, when a content object
    // is created, we will know the content id and not the internal id
    if (is_null($int_id) AND !is_null($cid)) {
      // Query for nid int_id = internal id so the node id or aggregator id or whatever else
      $iid = db_result(db_query("SELECT int_id FROM {memetracker_content} WHERE
      cid = %d", $cid));
      $this->int_id = $iid;
    }
  }

I'm not sure what this parent::__construct nonsense is. What does :: mean? Parent? Is it calling itself from inside the function? I don't get it. Will need to understand this before I can patch it correctly.
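
For reference, here is a minimal sketch of how parent::__construct works in PHP in general. The class names are made up and only stand in for content_drupal and the FeedAPI content class; the real classes have more properties and arguments. In PHP, "::" calls a method by naming the class it is defined on, and "parent" means the class named after "extends". So the call is not the function calling itself: it runs the parent class's constructor on the same object first, then carries on with the child's own setup.

```php
<?php
// Hypothetical parent class, standing in for content_drupal.
class example_content {
  public $mid;
  public $content_type;

  public function __construct($mid, $content_type = NULL) {
    $this->mid = $mid;
    $this->content_type = $content_type;
  }
}

// Hypothetical child class, standing in for the FeedAPI content class.
class example_content_feedapi extends example_content {
  public $int_id;

  public function __construct($mid, $int_id = NULL) {
    // Run example_content::__construct() on this same object, passing a
    // hard-coded content type, the way the memetracker code passes
    // 'content_drupal_feedapi'.
    parent::__construct($mid, 'example_content_feedapi');
    // Then do the setup that only the child class needs.
    $this->int_id = $int_id;
  }
}

$example_item = new example_content_feedapi(3, 42);
// $example_item->mid is 3, $example_item->content_type is
// 'example_content_feedapi', and $example_item->int_id is 42.
```

If that reading is right, a Feeds version of the class would presumably pass its own content type string to the parent constructor and keep the rest of the pattern.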

2. Figure out how FeedAPI stores and organizes feed items - I'm going to need to look at FeedAPI to get a good sense for where it sticks stories and how it offers them up to modules like memetracker. This will help me understand the memetracker code that references FeedAPI and also understand how I need to update those references to point to the right tables to work with Feeds.

3. Figure out how Feeds stores and organizes feed items - I'll need to familiarize myself with the same stuff for Feeds so I can update memetracker to work with it.

4. Patch memetracker's latest alpha release so that it works with Feeds instead of FeedAPI - Once I've figured out the memetracker code that confuses me and familiarized myself with FeedAPI and Feeds, I should have the tools to update memetracker so that it works with Feeds instead of FeedAPI.

5. Figure out the rules and procedures for sharing my code on Drupal.org so other people can test it - I think I can post a tar file of the module in this issue queue that people can mess with.

6. Recruit long-suffering friends to look at my code, beg people on Drupal.org to give it a go - Code review seems like it'll be important for this as I suspect I'll make a lot of mistakes as I go.

Ok. So I'm going to go work through the relevant code bits that are confusing to me in the next post.
