
Allow Kiva module to obtain and process large amounts of data from the API

Project: Kiva
Version: 6.x-1.x-dev
Component: Code
Category: task
Priority: normal
Assigned: coderintherye
Status: needs review

Issue Summary

The goal is to keep the module from choking on a huge task (e.g., trying to fetch 5000+ loans).

Comments

#1

I think I will work on this. Is it one of the things needed for a stable release?

#2

Working on this now.

#3

Assigned to: CrookedNumber » coderintherye

#4

I wrote a batch process using the Batch API to do this, but it doesn't really make sense to run it on every page load. It seems like it should run when the user submits the administrative form for the Kiva loan settings. But then what about getting the updated list of loans? That seems more like a job for job_queue, run on cron. Perhaps a combination of both.

#5

Actually, by default the API only returns 20 loans or 50 lender profiles per request anyway, and pages the rest.

So a request to http://api.kivaws.org/v1/loans/newest.json will return the 20 newest loans,
then a request to http://api.kivaws.org/v1/loans/newest.json?page=2 will return the next 20 (loans 21-40).

So, while batching is good on the Drupal side, what matters first is batching the requests to the API, so that we get all the data we want when a large number of loans is requested.
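To make the paging idea concrete, here is a minimal sketch in Python (an illustration only, not the module's PHP code). It assumes the API accepts a `page` query parameter and that each response carries a `paging` object with a `pages` count next to the `loans` list; the function names `fetch_all_loans` and `http_fetch_page` are hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional
from urllib.request import urlopen


def http_fetch_page(page: int) -> Dict:
    """Fetch one page of newest loans (assumes a `page` query parameter)."""
    url = f"http://api.kivaws.org/v1/loans/newest.json?page={page}"
    with urlopen(url) as resp:
        return json.load(resp)


def fetch_all_loans(
    fetch_page: Callable[[int], Dict],
    max_pages: Optional[int] = None,
) -> List[Dict]:
    """Request page after page until the reported last page (or max_pages)."""
    loans: List[Dict] = []
    page = 1
    while True:
        data = fetch_page(page)
        loans.extend(data.get("loans", []))
        # Assumed response shape: {"paging": {"pages": N, ...}, "loans": [...]}.
        # If the page count is missing, stop after the current page.
        total_pages = data.get("paging", {}).get("pages", page)
        if page >= total_pages:
            break
        if max_pages is not None and page >= max_pages:
            break
        page += 1
    return loans
```

Passing the page fetcher in as an argument keeps the loop easy to try without network access, and `max_pages` puts an upper bound on how many requests a single run can make: `fetch_all_loans(http_fetch_page, max_pages=5)` would stop after at most five requests.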

#6

Title: Work batch api (or something similar) into kiva_parser » Allow Kiva module to obtain and process large amounts of data from the API

By the way, I kind of hijacked this issue before realizing you meant working the Batch API into the defunct kivaparser sub-module, whereas I was thinking of linking it into the main Kiva module.

So I'm changing this issue to better reflect what I was thinking. Please change it back if this doesn't make sense.

#7

Status: active » needs review

OK, here is the patch. It enables getting many pages of loans from Kiva's API, and it caches the result set, reading from the cache unless the entry is expired or missing (as opposed to the current implementation, which draws from the cache only when the API is down or returns an error).
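The cache-first behaviour described here can be sketched roughly as follows, in Python for illustration rather than the patch's actual PHP. The in-memory dictionary, the `get_cached` name, and the `ttl` and `now` parameters are all assumptions made for the example; the real patch presumably goes through Drupal's own cache layer.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Hypothetical in-memory store: key -> (expiry timestamp, cached value).
_cache: Dict[str, Tuple[float, Any]] = {}


def get_cached(
    key: str,
    fetch: Callable[[], Any],
    ttl: float = 3600.0,
    now: Callable[[], float] = time.time,
) -> Any:
    """Return the cached value unless it is missing or expired; else refetch."""
    entry = _cache.get(key)
    if entry is not None and entry[0] > now():
        return entry[1]  # still fresh: no API request is made
    value = fetch()  # missing or expired: go to the API
    _cache[key] = (now() + ttl, value)
    return value
```

With this shape, the API is only contacted once per `ttl` window for a given key, which is the opposite of falling back to the cache only when a request fails.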

I've got the batch code pretty much ready in a separate patch, but I first want to make sure this looks good and commit it, then commit the batch code separately.

Attachment                          Size
pagesofloansandcache-504942.patch   4.83 KB