Is the Affiliate Products Shop module still being maintained?
TallDavid - March 9, 2009 - 21:36
| Project: | Affiliate Products Shop |
| Version: | 5.x-1.x-dev |
| Component: | Miscellaneous |
| Category: | support request |
| Priority: | normal |
| Assigned: | StephenGWills |
| Status: | active |
| Issue tags: | affilate, comission junction, linkshare |
Description
Is the Affiliate Products Shop module still being maintained?

#1
I have received permission from apurv@cmswebsiteservices.com to start maintaining this project.
My first step will be to assess the current landscape, i.e. the Ubercart module and other projects that
have sprung up while this one was lying fallow. Any help is appreciated, but I would like to focus on
solutions that provide affiliate links from the major brokers within Drupal.
Steve
----------------------------------------------------
From: Apurv Bordia (CMS)
Date: Monday, July 06, 2009 06:03 am
Subject: RE: [drupal.org] 6.x version of affiliate_shop
Hi Steve,
Thanks for contacting CMS,
You are more than welcome to use the shell,
Let us know if we can help you in any way.
Thanks
Apurv
#2
An additional 5 months have passed without activity. I'll ask again, is the Affiliate Products Shop module still being maintained?
ref: http://drupal.org/node/251466
#3
Hi David,
At the end of October I was given access to the project, as suggested in the Abandoned Projects document you referenced.
4 days ago, I posted some notes on the Linkshare web services that replace the ones APS originally interfaced with.
It's not much after 5 months, but it is a start. Is there an aspect of the APS that you are interested in using or, better yet,
working on?
@StephenGWills
#4
Stephen,
I'm interested in the general functionality of integrating affiliate programs like Commission Junction and Linkshare into Drupal. Unfortunately, the Affiliate Products Shop module has never been in a state (Drupal 6) that would warrant further investigation. Since 2007, a lot of progress has been made on feed processing in Drupal. At this time I'm not sure if the Affiliate Products Shop module is viable or if a combination of Feeds, CCK & Views would be a more productive approach.
#5
I would agree that creating modules for the sake of having another module is senseless. The original APS was offered to solve the SOAP-to-links interface issues with CJ and Linkshare. If you are right and there is nothing a self-contained module can offer that a deft combination of other techniques cannot also deliver, let's get that technique defined and turn APS into a documentation piece.
Before doing further work on APS as a standalone module, therefore, let's research:
Feeds: http://groups.drupal.org/node/3418
CCK&Views vs. New Module discussion: http://groups.drupal.org/node/29790
TallDavid, do you already have a Feeds-based solution, or are you in the research phase on all this? If you have one, which feeds module do you use to interface with Linkshare?
#6
I'm very interested in this as well, however I have not yet had exposure to either Linkshare or CJ and their various interfaces and APIs. Once I get up and going with them I will post some thoughts. Might take a few weeks but I wanted you guys to know someone else is out there with this interest. And this is better than a "subscribing" post, right? :)
My ultimate goal is to import items from an affiliate (or link to them, as long as they show up on my site), link them to taxonomy, and create numerous different views based on the content. I'm not sure how to get there yet, but this is a starting point.
#7
Worthwhile reading: http://drupal.org/node/53986
Looks like two other options are to:
1) Use Node Import to pull in content that was exported in CSV format
2) Use the Feeds module and a feed aggregator to pull them in via RSS
From what I have read, #2 is more likely with Linkshare as they are recognized as having better RSS feeds.
I did notice that Linkshare has web services available, which is probably where this module comes into play. Here's what they list as available:
Coupon Web Service: The Coupon Web Service provides you with coupon and promotional link data for your advertisers
LinkGenerator Service: Dynamically create links to any page on an Advertiser's site
Merchandiser Query API: Run queries against Advertisers' Merchandiser data
Targeted Merchandiser API: Access product information that will be chosen based on the page on your Web site
I'll be testing shortly and will post results. I'm focusing on the RSS angle as my desire is to import all of the items and create one node per item on my local Drupal site.
#8
I'm going to post a few more thoughts, starting with my personal requirements for my affiliate site. I'm not sure how to get there yet.
1) "Set it and forget it" whereby everything happens automatically. There are no manual update/import steps.
2) The content is automatically updated on a schedule. This includes pricing changes, stock changes, new items, etc. Real-time works as well but minimum requirements for me are once per day.
3) The resulting content, including the actual photo (not a link), is output to a node. Specifically, one node per item. The fields would be CCK fields of various types. The images would be ImageFields so they can take advantage of all of the supplementary goodness from ImageCache. The other CCK field types would be text, links, date, content taxonomy, etc.
4) Any updates or new products do not result in duplicate nodes being created (i.e. what might happen if you imported the same dataset twice.)
5) When items are no longer available for sale, they are NOT removed. Once the Drupal nodes are created, they live forever. The items that are not for sale are marked as such on the Drupal site but we don't want to lose any comments or other created data.
6) I'm not planning to query them for search as I will index and facet locally with Apache Solr search integration and views. I just need to get the items imported and managed appropriately.
7) I want the entire product catalog in most cases.
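Requirements 4 and 5 amount to an "upsert" keyed on a stable product identifier. A minimal sketch follows, assuming each catalog row carries a unique `sku` field; the function name `sync_catalog`, the `for_sale` flag, and the in-memory array standing in for Drupal nodes are all hypothetical and only illustrate the idea.

```php
<?php
/**
 * Merge one day's catalog rows into the existing items, keyed by SKU.
 * - Known SKUs are updated in place, so no duplicates are created.
 * - New SKUs are added.
 * - SKUs missing from the new catalog are kept but flagged as not for sale.
 * The arrays are a hypothetical stand-in for nodes on the Drupal site.
 */
function sync_catalog(array $existing, array $rows) {
  $seen = array();
  foreach ($rows as $row) {
    $sku = $row['sku'];
    $seen[$sku] = TRUE;
    $previous = isset($existing[$sku]) ? $existing[$sku] : array();
    // New values overwrite old ones; fields the feed does not carry survive.
    $existing[$sku] = array_merge($previous, $row, array('for_sale' => TRUE));
  }
  foreach ($existing as $sku => $item) {
    if (!isset($seen[$sku])) {
      // Never delete: comments and other local data stay attached.
      $existing[$sku]['for_sale'] = FALSE;
    }
  }
  return $existing;
}
```

Running the same dataset through twice leaves the result unchanged, which is the property that rules out duplicate nodes.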
I'm signed up with CJ now, and the export formats they offer seem to be Tab, Pipe, XML, Quoted CSV, and Standard CSV. From what I can tell, they republish the files at a URL that changes from time to time AND (at least for XML) the file is gzipped. The XML is not an RSS feed as far as I can tell. So this would seemingly make it difficult to use for this purpose.
These are links worth reading and the author illustrates the downside of using XML/RSS/feeds as an approach.
http://www.cumbrowski.com/CarstenC/affiliatemarketing_datafeeds.asp
http://www.revenews.com/carstencumbrowski/options-for-using-affiliate-pr...
If that's the case, that leaves us with Web Services as the only good option for CJ. I believe this link is public but it's worth knowing about: http://webservices.cj.com/
It looks like they are in the middle of converting their APIs from SOAP to REST. Not everything is converted yet, including the one that seems most interesting: Item-based Details Report.
Do any of you have any thoughts on the various approaches? Is this looking like:
A) Web services approach, perhaps using:
-- http://drupal.org/project/soapclient
-- http://drupal.org/project/webservicesapi
B) FeedAPI/Feed Element Mapper/FeedAPI Image Grabber
C) Node Import (http://drupal.org/project/node_import)
#9
I spent some time today looking at Node Import and a few other import options. This is a page worth reading:
http://groups.drupal.org/node/21338
Based on that page and what I understand, Node Import, Migrate, and the other options are a good choice when moving content in one big batch. If you are migrating from another CMS into Drupal, they are great for that purpose. If you are importing your own product set, that is again a great use for them. The key use cases are either a one-time mass import OR an import of new/unique data that does not conflict with existing data on your Drupal site from a previous import.
When you want to maintain an ongoing relationship with an affiliate, who could update their inventory and prices at any time, the import modules may not be a good choice. My belief (not yet backed up by fact) is that a re-import of a new XML file from CJ or Linkshare may create duplicate nodes for items that were previously imported. The other risk is that the import avoids duplicates but fails to update a changed value on an existing node (i.e. a node was previously imported and has a CCK field for cost, the cost changed in the new dataset, and the change is not picked up.)
Is anyone else out there or am I talking to myself? :)
#10
After further research on Node Import, I'm pretty confident that is not a good solution for this. It is missing two important features.
Updating existing content that was previously imported:
#150397: Allow Mapping of NID to Update/Overwrite Existing Nodes
Automatic or scheduled updates:
#184308: automatic import
Now I'm back to FeedAPI. The reason for FeedAPI and not Feeds is maturity. Feeds only came about in September and does not have the robust support for additional add-on modules that FeedAPI does. Perhaps at some point in the future it will be worthwhile, but not yet. I need something now, and I don't want to write or roll my own code if I can avoid it.
Commission Junction provides feed exports in Tab, Pipe, XML, Quoted CSV, and Standard CSV formats. They offer transport via FTP or HTTP (from them), FTP or HTTP (push to you), or e-mail. There is no charge for exports once you are signed up.
Linkshare requires you to sign up for an additional service after you are established as a normal user. Search for "Merchandiser Data Feed Program" or visit this link. They charge $250 if you are not an established/seasoned user of their site.
http://helpcenter.linkshare.com/publisher/questions.php?questionid=62
Linkshare offers the feed exports in Pipe or XML. The only transport method is FTP.
So if we are looking to normalize a solution that works for both, it needs to be Pipe or XML, transported by FTP.
The XML schemas for both are here:
http://www.cj.com/downloads/tech/scheduled_pub_transdata.pdf
http://helpcenter.linkshare.com/publisher/images/Publisher%20Merchandise...
My current interest is in the following module:
http://drupal.org/project/feedapi_eparser
This module appears to allow you to set up a custom XML parser that would also enable the use of the other FeedAPI goodness like the field mapping to CCK with Feed Element Mapper. I'll start to test with this and see what I can come up with in terms of aligning a correct XML parser for the two different formats. I'm still not signed up for Linkshare's extra feed service so I have no test files yet, but I have real live CJ data to play around with.
#11
Heya Robert,
on 11/25 you asked: "Is anyone else out there or am I talking to myself? :)"
You are NOT speaking to yourself. Thank you so much for all the thought you are putting into this. I've been drawn and quartered by some library apps I've been hacking on, but I wanted to make sure you know you are not shouting into the wind here. More when I have something as intelligent to post as what you have been posting!
Happy Holidays!
@StephenGWills
=~-*-~=
added later thanksgiving day while the jocks are watching the games *g*:
1. If you are right and FeedAPI is the way to go, wouldn't one want to add Feeds to the roadmap since the devs are moving that way?
2. If we are going to use XML and FTP, Feeds probably is the way to go since FeedAPI is HTTP only. Did I read that correctly?
#12
Thanks Stephen. I'll keep going :)
Important tip
Whenever you start messing with product imports from retailers, download the import file and REMOVE ALL BUT 5 ITEMS. Otherwise you may find it starting to work and importing hundreds of new items and products. It's a lot of cleanup. Trust me - I populated my taxonomy with a few hundred bad items :) What I did now is kill all but 5 items in the XML file and then place the file on an unauthenticated web URL that I control. That takes web authentication out of it and makes it easy to clean up any resulting broken nodes from bad imports.
End Important Tip
The FeedAPI Extensible Parser module sounded promising, but unfortunately after installing it your only two options are feed types of RSS 2.0 and Atom - neither of which is useful to parse the custom XML from CJ & Linkshare. There is what looks like an XML parser under the hood of this module, but it's a helper module for the RSS parser and can't be used on its own. If we wanted to roll our own 'custom' affiliate processing module this is probably a good place to start, but again my goal at the moment is to do this without writing a bunch of new code.
And yes, Feeds is almost certainly the right direction. It just does not have any supporting modules that have been ported to work with it yet. I started with Feeds and then moved back to FeedAPI when I discovered I could not do things like auto-import to a filefield/imagefield (which works but requires manual application of an uncommitted patch for FeedAPI, but I'll get to that later.)
FeedAPI does seem to be HTTP only, however a few different supporting modules extend this by doing things like using the Curl PHP module. I'm less worried about the transport at the moment and more interested in trying to get the XML document imported in such a way that I can start to process it and map fields. If I had to, I could always roll a shell script+cron to curl them down via FTP and serve them up from a local URL via HTTP. That might be a required step anyway as at least CJ serves the extracts only via filename.gz and I don't think FeedAPI is smart enough to uncompress the file.
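The uncompress step could be handled before FeedAPI ever sees the file. A minimal sketch using PHP's zlib functions follows; the function name `gunzip_file` and the file paths are made up for illustration, and the download itself (cURL over FTP or HTTP) is assumed to have already happened.

```php
<?php
/**
 * Decompress a .gz file into $dest, streaming in chunks so a large
 * catalog does not have to fit in memory. Returns TRUE on success.
 * Requires PHP's zlib extension.
 */
function gunzip_file($source, $dest) {
  $in = gzopen($source, 'rb');
  if ($in === FALSE) {
    return FALSE;
  }
  $out = fopen($dest, 'wb');
  if ($out === FALSE) {
    gzclose($in);
    return FALSE;
  }
  while (!gzeof($in)) {
    fwrite($out, gzread($in, 8192));
  }
  gzclose($in);
  fclose($out);
  return TRUE;
}
```

Writing the output into a directory served over HTTP would then give FeedAPI a plain URL to point at.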
I still have not achieved a successful XML import into a feed content node.
What I did get working was the FeedAPI CSV parser module. This perfectly read the structure for a CJ product catalog that was exported via Standard CSV format. I was able to map the fields and do all of the other fun stuff with it. This helped me line up the correct CCK destination fields and fix a number of things with them. I'll have more to post about this later if I have to use it, but at any rate it does work. Unfortunately Linkshare does not support CSV from everything I read.
The CSV import led me down a few other paths in relation to the data format and CCK fields. To import the photos into your site via a filefield/imagefield, you need this patch:
http://drupal.org/node/319538#comment-1942462
The above patch works and it's pretty cool. The remote images are downloaded and saved into a filefield, where you can now display them on a node, resize them with Imagecache, etc. Neat.
In terms of the import, I was able to get text fields, CCK number fields, CCK link fields, and date fields imported properly. The one thing that does not work for import is content taxonomy fields. For example, I wanted to pull the Manufacturer field into a content taxonomy CCK field. Apparently FeedAPI does not support content taxonomy, but there is a patch here:
http://drupal.org/node/463670#comment-1806926
Unfortunately that patch does not work so it probably needs a few small changes. The core FeedAPI module does allow import to the core taxonomy system and that does work, but I really want it in a CCK field.
The big thing that I ran into was the actual import file. Each line of that CSV file is for a different item, like this:
MANUFACTURER Pullover T-Shirt In Red
MANUFACTURER Pullover T-Shirt In Blue
... but that is represented on two different lines, which is read as two different import targets, which lands in two different nodes. So now I have the same product in different colors in two nodes. What I really want is the product in ONE node with a list of available colors - possibly via content taxonomy tags. Hopefully this makes some sense. This is further complicated by the fact that there is no field called color - the color is listed in the item title.
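One way to collapse the color variants is to split each title on its last " In " and group by what remains. The sketch below assumes titles follow the "NAME In COLOR" pattern shown above; the function name `group_by_color` is made up for illustration, and real catalogs will likely need per-retailer rules.

```php
<?php
/**
 * Group item titles of the form "NAME In COLOR" into one entry per NAME,
 * each with a list of its colors. Titles without " In " are kept as-is
 * with an empty color list.
 */
function group_by_color(array $titles) {
  $products = array();
  foreach ($titles as $title) {
    // The greedy (.+) makes the match split on the LAST " in " in the title.
    if (preg_match('/^(.+)\s+in\s+(.+)$/i', $title, $m)) {
      $name = $m[1];
      $color = $m[2];
    }
    else {
      $name = $title;
      $color = NULL;
    }
    if (!isset($products[$name])) {
      $products[$name] = array();
    }
    if ($color !== NULL && !in_array($color, $products[$name], TRUE)) {
      $products[$name][] = $color;
    }
  }
  return $products;
}
```

Each key of the result would become one node, and the color list could feed a taxonomy field.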
This is a big long post, but I'll summarize where I am.
-- I have not yet gotten XML imports to work.
-- I have been successful with a CSV import, but only for CJ.
-- Import targets work for CCK date, text, filefield/imagefield, link, and numeric types.
-- Import targets to core taxonomy work.
-- Import targets to CCK content taxonomy do not work and/or need to be fixed
-- The same item in different colors shows up in different nodes because of the format of the export file
My current line of thinking is to write an external program/script to download the product catalog files in XML format from CJ/Linkshare, process the file to fix the case (a lot of stuff in the export file is ALL IN UPPERCASE) and fix the color problem, and then move them to my own http directory. I could then point the FeedAPI URL at that post-processed file and have the items imported correctly. All of this works under the assumption that I can solve the problem of importing the XML files in the first place.
If things in this post did not make sense, please let me know. I'll add more detail.
#13
Based on the following node it looks like a small custom XML parser may be needed.
http://drupal.org/node/220785
#14
I have decided to go in a slightly different direction. I'm going with FeedAPI + the CSV importer.
I determined it was a critical requirement to pre-process the files before import. I wrote a fairly lengthy php script to do that. Here's what I am doing in my preprocess step. This is only for CJ at the moment.
1) Load the CSV file as exported from CJ in standard CSV format
2) Loop through each value and make changes to a number of fields. This includes:
-- Running mb_convert_case to translate ALL UPPERCASE to Title Case
-- Copying the "price" field to "retail price" field in the event it is empty (which it is sometimes for different retailers)
-- Modifying the advertisercategory field to match my taxonomy. For example, they have "Belt" but I have "Belts" in my taxonomy, so I need to normalize this before importing
-- Parsing out specific things from the title string, such as sex. For example, there is no field to tell if an item of clothing is for men or women but they have one of those words in the item title. I'm pulling this out and adding it as a CSV field so FeedAPI can properly import it
-- I'm also parsing out color into a CSV field similar to sex as they have no field for this normally
-- Excluding specific product categories from the import that I am not interested in
3) Output the new/updated CSV file to a directory visible to my web site
4) Use that web location as the FeedAPI URL for import
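The per-row part of step 2 might look roughly like the sketch below. It is not the actual script: the column names (`name`, `price`, `retailprice`, `advertisercategory`), the added `sex` column, and both function names are assumptions for illustration. Reading and writing the CSV file (steps 1 and 3) is left out.

```php
<?php
// Title-case helper; falls back to ucwords() when mbstring is not installed.
function title_case($text) {
  if (function_exists('mb_convert_case')) {
    return mb_convert_case($text, MB_CASE_TITLE, 'UTF-8');
  }
  return ucwords(strtolower($text));
}

/**
 * Clean up one catalog row. Returns NULL for rows in an excluded category.
 * $category_map maps retailer categories to local taxonomy terms.
 */
function preprocess_row(array $row, array $category_map, array $excluded) {
  if (in_array($row['advertisercategory'], $excluded, TRUE)) {
    return NULL;
  }
  $row['name'] = title_case($row['name']);
  // Some retailers leave the retail price empty; copy the price over.
  if (trim($row['retailprice']) === '') {
    $row['retailprice'] = $row['price'];
  }
  if (isset($category_map[$row['advertisercategory']])) {
    $row['advertisercategory'] = $category_map[$row['advertisercategory']];
  }
  // Pull "Men" or "Women" out of the title into its own column.
  $row['sex'] = '';
  if (preg_match('/\b(women|men)\b/i', $row['name'], $m)) {
    $row['sex'] = ucfirst(strtolower($m[1]));
  }
  return $row;
}
```

Keeping the category map and the exclusion list as plain arrays makes them easy to move into configuration later.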
When I get to working with Linkshare, which does not support CSV, my plan is to use PHP+cURL to pull down the XML file via FTP, and then run a converter to change the XML to CSV. That way I can process the files from both in a similar manner.
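An XML-to-CSV converter of this kind can be quite small with SimpleXML. The sketch below assumes a flat layout with repeated `<product>` elements under the root; that element name, the field names, and the function name `xml_to_csv` are hypothetical, since the real Linkshare schema has not been tested here.

```php
<?php
/**
 * Convert a flat product XML document to CSV text.
 * $fields lists the child elements to export, in column order.
 * Returns FALSE when the XML cannot be parsed.
 */
function xml_to_csv($xml_string, array $fields) {
  $xml = simplexml_load_string($xml_string);
  if ($xml === FALSE) {
    return FALSE;
  }
  $out = fopen('php://temp', 'w+');
  fputcsv($out, $fields, ',', '"', '\\');
  // Assumes one <product> element per item directly under the root.
  foreach ($xml->product as $product) {
    $line = array();
    foreach ($fields as $field) {
      // A missing child element becomes an empty column.
      $line[] = (string) $product->{$field};
    }
    fputcsv($out, $line, ',', '"', '\\');
  }
  rewind($out);
  $csv = stream_get_contents($out);
  fclose($out);
  return $csv;
}
```

The resulting text can go through the same CSV preprocessing as the CJ export.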
I'm also creating a table showing the field names from CJ and Linkshare as well as the CCK field types I am pulling them in to. I'll post that after I have a real export file from Linkshare to test with.
I still have the FeedAPI + Content Taxonomy problem but I may take a crack at fixing the code. I need content taxonomy for this process to work like I want it to.
#15
So much for not writing any code now that I am ~600 or so lines into it :)
I have it all working the way I want it to. My previous post is accurate, but now it works like this:
1) PHP cURL function downloads the CJ product catalogs to a temp/staging area.
2) Today's file is compared to yesterday's file. If the files have not changed, exit since we have nothing to do.
3) If the files have changed, open up the new file and apply the retailer-specific postprocessing routines. This includes things like #2 in my previous post to change the case, parse out specific categories, etc.
4) Write out a post-processed CSV file to a web directory
5) Import the CSV via FeedAPI
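The comparison in step 2 can be done without reading either file into memory by comparing sizes and then hashes. A minimal sketch, with a made-up function name and file paths left to the caller:

```php
<?php
/**
 * Return TRUE when today's catalog differs from yesterday's,
 * or when there is no file from yesterday to compare against.
 */
function catalog_changed($today, $yesterday) {
  if (!file_exists($yesterday)) {
    return TRUE;
  }
  // Cheap check first: different sizes mean different content.
  if (filesize($today) !== filesize($yesterday)) {
    return TRUE;
  }
  return hash_file('sha256', $today) !== hash_file('sha256', $yesterday);
}
```

When it returns FALSE, the script can exit before any postprocessing or import is started.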
I also fixed the FeedAPI mapper for content taxonomy so that it now works. I'm now getting a clean import of data to all of the CCK fields.
My code at the moment is in a giant command-line PHP script. I have started the process of creating a Drupal module to facilitate this process and store the various configuration variables. It would require PHP with the cURL library installed, although I think the FeedAPI CSV parser also requires cURL so either way it comes into play. One thing I am thinking of is some kind of pluggable parser interface where you could write your own routines for each retailer. This would allow for mapping/renaming of fields like changing colors from "Auburn Sunshine" to "Brown" so you could normalize the data before it is imported. This is important for me as I want Apache Solr + Faceted search by taxonomy and 500 different names for the same colors would break that model.
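A pluggable parser interface of this kind could be as small as one method per retailer. The sketch below is only one possible shape, not the module's design: the interface name, the example class, and the `color` column are all hypothetical.

```php
<?php
/**
 * Hypothetical plugin contract: each retailer supplies one class that
 * knows how to normalize a single catalog row before import.
 */
interface RetailerPostprocessor {
  public function process(array $row);
}

/**
 * Example plugin for a made-up retailer that maps marketing color
 * names onto a small, facet-friendly vocabulary.
 */
class ExampleRetailerPostprocessor implements RetailerPostprocessor {
  private $colors = array(
    'Auburn Sunshine' => 'Brown',
  );

  public function process(array $row) {
    if (isset($row['color']) && isset($this->colors[$row['color']])) {
      $row['color'] = $this->colors[$row['color']];
    }
    return $row;
  }
}

// Run every row of a catalog through the plugin for its retailer.
function postprocess_rows(array $rows, RetailerPostprocessor $plugin) {
  return array_map(array($plugin, 'process'), $rows);
}
```

Because the download and import steps only see the interface, a contributed class for a new retailer would not require changes anywhere else.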
I have looked at the current Affiliate Product Shop module and it has some interesting features, but it is also doing things like db_query and manually crafting HTML output for tags and what not. The web services aspect seems to be able to query for coupons and other realtime information. Based on looking at the code, it seems like it was designed for a coupon-oriented web site that would pull down coupon feeds from across the retailers and show items that were on sale or had a coupon. This is different from what I need to do, which is to import ALL of the retailer's items that I want, create permanent nodes, and keep them updated with changes.
The next questions are:
1) What do the readers of this giant thread think of this approach, especially as compared to the existing Affiliate Products Shop module?
2) What are your requirements for affiliate sites? Do you only want coupons? Is timing important for downloading them? Do you want full items and product dumps? If so, all of them or only some of them?
3) Do you like my cURL -> custom postprocess plugin per retailer -> FeedAPI import approach? (It will require zero effort to convert to Feeds - we just need the mappers and CSV support to be there and most of them are in 'needs review' in their issue queue already.) Note that postprocessing would not be a requirement for everyone if you wanted to just take the imports as-is.
4) Is anyone else interested enough to help develop this? Develop is the key word here - I'm far from a master PHP programmer but I really need to do this so I'm going to do it and figure it out as I go. So far so good.
5) If yes to most of the above, should this be forked into a new module?
#16
Robert, you are a wild man! I took a different approach this weekend.
After setting up CJ XML feeds, I set up the Drupal FTP module. After playing with it, I have to say that there is no compelling reason to have Drupal manage the download of the CJ data at this point.
I agree that your approach will better maintain a full advertiser's product catalog and keep it updated.
I suspect that there is a place for both approaches, but I am not ready to offer both case studies in this posting.
I guess that relegates me back to cheerleader, again! :)
As far as forking goes, I think we need to examine what Drupal adds to the process.
Do we need to have Drupal manage the file acquisition, including the diff of the last file against the current one? If so, why?
Once the files are in temp holding, what does Drupal add to the "retailer-specific postprocessing routines" of the CSV files?
Are these retailer-specific postprocessing routines specific to your account or can other affiliates use them?
I am curious if your CJ preprocessor might be a sub project whose output could be read by the affiliate_shop_module?
I will think on this more.
#17
Thanks for the reply.
I did not go into this part trying to determine if I wanted to manage it with Drupal or not. I just simply was trying to solve the preprocess problem, which is just the download and processing step. I solved it via PHP and it works, and I started to think of Drupal to enable easier managing of the configuration variables, especially in terms of mapping tables. Right now my php script has preprocess routines that look like this:
$colorreplace = array(
  'Abyss' => 'Black',
  'Abyss Pigment' => 'Grey',
);
and for the CSV field that is being processed:
if (array_key_exists($color, $colorreplace)) {
  $color = $colorreplace[$color];
}
So this isn't really Drupal-specific as much as I could envision a nice interface that would allow the import and management of CSV fields with replacement names and what not. I'm basically thinking of using Drupal for the UI. I'm dealing with a few retailers now but this is going to grow quite a bit and the script would quickly become difficult to manage.
My retailer postprocessing routines are not specific to me as much as they are specific to the retailer. For example, I'm pulling in an item called "Retailer T-Shirt in Red", pulling out the word "Red", and adding it as a new CSV field that is imported to a taxonomy called Color. This processing is specific to that retailer's import file and how they format their items/titles, but anyone who is an affiliate of the retailer could use the same processing routine. That's where I was going with the preprocessing step per-retailer. Each one seems to do things slightly differently and needs a small bit of PHP manipulation. Anyone who wrote a new preprocess routine could contribute it to the community.
I don't think Drupal needs to manage the download and acquisition process, but FeedAPI wants you to point at either a URL or to have an acquisition process built in to your FeedAPI processor. I did not want to go down the path of writing any type of custom FeedAPI processor given that it will move to Feeds. So part of my goal was to keep the acquisition and preprocess out of the FeedAPI/Feeds transition and have a solution that would work with either one going forward. In terms of managing the diff, what I am simply trying to do is not run through a re-import process if it isn't needed. My code currently keeps the acquisition and preprocess steps in separate functions, so it's easy to just get rid of acquisition if something else would work better.
My timing is terrible with this as Feeds probably will do much of this in a few months, and if not it would be extensible to support some of these processes.
Another question: the current module seems to focus heavily on web services, and I am not focusing on them at all. Is there any advantage or difference that could be gained with that approach? Or even a partial approach in each area with CSV+Feed/FeedAPI for product catalog import and web services for something else?
Last but not least, here's the fix for the FeedAPI Mapper content taxonomy field.
#454420: Mapper for content taxonomy