I'm having some fun with FeedAPI and would like to figure out how to write custom processors for specific feeds.

Right now I've got a bunch of PHP that scrapes images, on the fly, off one of the sites I'm aggregating. I've been looking into caching those images and just saw that it's possible to write a rule for the feed itself that would grab and cache the image at the time the item gets pulled into Drupal.

Do I understand that right?

And if so, where can I read up on how to write something like that?

Here's the basic approach I'm using for one of the feeds. Again, this is code in my template that pulls images from the feed item's source page on the fly. Hotlinking stinks, so I want to grab these images at the time the feed item is captured, if possible:


// XPath of the image I'm trying to grab
$xpath = '/html/body/div[2]/div[2]/div/div[2]/table/tbody/tr[5]/td/div/table/tbody/tr/td/img';

// create a new document
$html = new DOMDocument();

// fetch the item's source page and parse it (@ suppresses warnings from messy HTML)
@$html->loadHTMLFile($content->get_link());

// convert the DOM to SimpleXML (comes back empty if the page failed to load)
$xml = simplexml_import_dom($html);

// run the XPath query on the path above
$matches = $xml ? $xml->xpath($xpath) : array();

// pull the src attribute off the first match, or fall back to an empty string
$source_image = !empty($matches) ? (string) $matches[0]['src'] : '';
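
For the caching half, this is roughly what I picture once I know where to hook it in. It's only a sketch: cache_remote_image() and its $dir argument are names I made up, not anything from FeedAPI or Drupal, and it assumes the src is an absolute URL and that allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper (not part of FeedAPI or Drupal): copy a remote image
// into a local directory once and return the local path, or FALSE on failure.
function cache_remote_image($url, $dir) {
  // Make sure the target directory exists.
  if (!is_dir($dir) && !mkdir($dir, 0775, TRUE)) {
    return FALSE;
  }

  // Name the local copy after a hash of the URL, keeping the extension if any.
  $ext = pathinfo((string) parse_url($url, PHP_URL_PATH), PATHINFO_EXTENSION);
  $local = $dir . '/' . md5($url) . ($ext ? '.' . $ext : '');

  // Already cached on an earlier run, so skip the download.
  if (file_exists($local)) {
    return $local;
  }

  // Fetch the image (@ suppresses warnings); FALSE means the fetch failed.
  $data = @file_get_contents($url);
  if ($data === FALSE || file_put_contents($local, $data) === FALSE) {
    return FALSE;
  }
  return $local;
}
```

The idea would be to pass $source_image and some writable directory to cache_remote_image() when the item is saved, then print the returned local path in the template instead of the hotlinked src.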

 

So what might be my approach for grabbing this data from the feed item's source page at the moment Drupal pulls it in? Thanks!