Closed (works as designed)
Project:
Feeds
Version:
7.x-2.0-alpha8
Component:
Code
Priority:
Major
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
6 Dec 2011 at 22:30 UTC
Updated:
17 Jun 2016 at 00:46 UTC
Comments
Comment #1
twistor commented
There is always the feeds_process_limit variable, which you can set to the appropriate level. It defaults to 50, which I've seen to be a decent amount. It's debatable whether Feeds should handle one item at a time or several, but I'd hardly call Feeds' batch processing pointless.
Downloading the enclosure of each item is certainly going to take longer than normal.
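Since feeds_process_limit is a plain Drupal 7 variable, it can be overridden in settings.php (the value 100 below is only an example, not a recommendation):

```php
<?php
// settings.php override (Drupal 7): number of items Feeds handles
// per batch pass. 100 is an example value only.
$conf['feeds_process_limit'] = 100;
```

Equivalently, `drush vset feeds_process_limit 100` sets it in the variables table.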
Comment #2
pwaterz commented
feeds_process_limit does not work in D7, as per my other issue: http://drupal.org/node/1363094.
It still does not make sense to use batch; you could just add a page callback and process it in one page request. Are you just using it to show a loading bar?
Comment #3
dooug commented
It appears that in Feeds 7.x-2.x-dev, feeds_process_limit is only used in the clear() function of plugins/FeedsProcessor.inc. This doesn't seem to be the case for the import processing functions... (correct me if I have missed something.)
Our use case of Feeds requires better process limiting on import. We are trying to import a large quantity (multiple thousands) of nodes/users, but are getting timeout errors on the import. The import needs to be processed in smaller batches, but this does not appear to happen in Feeds.
Also, found a similar issue for Large Feeds imports: #1302034: Large Feeds import exhausts RAM & corrupts DB
Comment #4
manu manu commented
Using feeds_process_limit may not meet all use cases, as the Feeds import process is divided into 3 parts: fetching, parsing, and processing.
Even if Feeds could process smaller chunks of items in step 3, steps 1 and 2 may time out before.
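The step-3 chunking being discussed can be sketched in plain PHP. This is an illustrative, Feeds-agnostic sketch (the function and its shape are not the Feeds API): parsed items are handed to a processor callback in bounded slices, so no single pass handles the whole result set.

```php
<?php
// Illustrative sketch, not the Feeds API: process parsed items in
// bounded chunks so one request never handles the entire result set.
function process_in_chunks(array $items, $limit, callable $handler) {
    $passes = 0;
    foreach (array_chunk($items, $limit) as $chunk) {
        // In Feeds this would correspond to one batch pass; here we
        // simply invoke the handler once per bounded slice.
        $handler($chunk);
        $passes++;
    }
    return $passes;
}

// Example: 130 items with a limit of 50 take 3 passes (50 + 50 + 30).
$passes = process_in_chunks(range(1, 130), 50, function (array $chunk) {
    // A real processor would save nodes/users here.
});
echo $passes; // 3
```

As the comment above notes, this only bounds the processing step; a fetcher or parser that must materialize everything first has already done the expensive work before chunking can help.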
In my case I was importing some 300MB xml files...
I ended up using the "Process in background" option and hacking the import & clear form handlers to not process the first chunk, which was causing the timeout.
Hope it helps
Comment #5
twistor commented
Holy crap, I just noticed what #4 is pointing out. Apparently when #744660: Expand batch support to fetchers and parsers was added, support for batching on processors was removed. Oh joy!
Theoretically, this would be fine. Fetchers and parsers would respect the processor's limit, and only give it what it could handle. In practice, this doesn't work at all. Most fetchers fetch 1 item, and ignore batching altogether. LOTS of parsers HAVE to parse everything at once. There is a limited number of parsers that can batch. CSV is one, because it can read from a file. Most XML based parsers use the DOM, and parse everything at once.
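The DOM-versus-streaming distinction described above can be seen with plain PHP: DOMDocument must build the entire tree before the first item is available, while XMLReader walks the document node by node and could hand a processor bounded chunks. A minimal illustration (not Feeds code):

```php
<?php
$xml = '<items><item>a</item><item>b</item><item>c</item></items>';

// DOM approach: the whole tree is in memory before any item can be used.
$dom = new DOMDocument();
$dom->loadXML($xml);
$domCount = $dom->getElementsByTagName('item')->length;

// Streaming approach: nodes are visited one at a time, so items could
// be handed to a processor in limited chunks instead of all at once.
$reader = new XMLReader();
$reader->XML($xml);
$streamed = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::TEXT) {
        $streamed[] = $reader->value;
    }
}
$reader->close();

echo $domCount . "\n";         // 3
echo implode(',', $streamed);  // a,b,c
```

For a 300MB file like the one in #4, the DOM approach alone can exhaust memory or time out, which is why only file- or stream-based parsers can meaningfully batch.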
arg.
Comment #6
byronveale commented
There was discussion related to this over on issue #1470530, and some code that could perhaps move this toward completion.
See comments #100 and #121.
Thanks for everyone's efforts!
Comment #7
liquidcms commented
Possibly related: #1778972: batch import does not start - no errors. I have been battling with this all day; it seems as though only 50 records get imported and then the "batch" hangs. I am sure this used to work months ago.
Comment #8
osopolar commented
For large XML files, see also Feeds XPath Parser in combination with Steven Jones's sandbox, Feeds XPath Parser + XMLReader, and #2052081: Update FeedsXpathParserXMLReaderParser to current version of feeds_xpathparser.
Comment #9
jenlampton commented
@liquidcms I don't think you're seeing any batching at all; the variable feeds_process_limit is set to 50 by default, so that's why you're seeing that many records imported.
Is the batching being added back over in this issue?
#1470530: Unpublish/Delete nodes not included in feed
If not, can any of the code from this D6 patch be of use here?
#1139376: Batch processing fails on large feeds
Comment #10
jenlampton commented
Well, it looks like #1470530: Unpublish/Delete nodes not included in feed landed, but I'm not entirely sure the batching that was added on delete will solve my problem of needing batching on import, so I'm commenting here too.
I have a site I'm supporting that needs its feeds_process_limit increased by about 200 each quarter so that the import will complete on cron as more data is added. I'd love to see a suggestion/recommendation on how to solve this problem if anyone has one.
Comment #11
twistor commented
At this point, there's nothing we can (or should) do.
The way it works is thus:
Feeds XPath Parser was fixed a long time ago.