Hi,

Problem nº 1:

I have about 200 Feeds split into around 40 Categories.

There are 20 main categories to which the Feeds belong, and the Feeds are also split into minor categories.

Some of the main Categories are not being updated at all. They are still resting at 0 items!!!

I've tried with Linux and Windows versions of Drupal 7 with the same results!!!

Problem nº 2:

When Cron runs in Linux or Windows it just updates some Feeds, not all 200 Feeds at the same runtime.

This means that besides Aggregator not updating the categories, I'm left with a vast percentage of Feeds left to be updated.

It seems that each Cron run processes only about 20-30 Feeds, which means that the next run updates the same Feeds and most of the Feeds are never updated at all.

This happens even though the Feeds are set to update every 30 minutes and Cron is scheduled to run every hour.

These problems are very urgent!

Guys wake up!

We need better products!!!!

Regards,

njardim

Comments

StevenPatz’s picture

Priority: Major » Normal

Readjusted Priority to more closely match what is in the Issue queue handbook.

njardim’s picture

Hi spatz4000,

Could you please concentrate urgently on Problem nº1 for updating the Categories?

This is URGENT! I need to go LIVE with a site and can't with the current problem!!!

I will be delivering this to government Ministers (comparable to Senators in the US) in a couple of weeks' time.

When can I get a solution for this issue?!!!

Thanks,

njardim

bfroehle’s picture

njardim: Try configuring cron to run more often.

njardim’s picture

bfroehle: That's the workaround for problem nº2.

It doesn't solve problem nº1.

bfroehle’s picture

"When Cron runs in Linux or Windows it just updates some Feeds, not all 200 Feeds at the same runtime."

This is the desired behavior. aggregator_cron() populates the queue of feeds to be refreshed, and aggregator_cron_queue_info() attempts to drain the queue for a minute. This means that it might take a few cron runs before the queue is completely emptied.
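
The time-boxed draining described above can be illustrated with a small simulation. This is only a sketch in Python, not Drupal's actual code: the function names, the fake clock, and the 2.5 seconds per feed are all invented for illustration, and real refresh times will vary from feed to feed.

```python
from collections import deque


class FakeClock:
    """Simulated clock so the example runs instantly (hypothetical helper)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def drain_queue(queue, worker, time_budget, clock):
    """Process queued items until the queue is empty or the budget is spent.

    The deadline is checked before each item, so the last item started
    inside the budget is allowed to finish.
    """
    end = clock() + time_budget
    processed = 0
    while queue and clock() < end:
        worker(queue.popleft())
        processed += 1
    return processed


def simulate(feed_count, seconds_per_feed, time_budget):
    """Return how many feeds each successive cron run refreshes."""
    clock = FakeClock()
    queue = deque(range(feed_count))
    runs = []
    while queue:
        runs.append(
            drain_queue(
                queue,
                lambda feed: clock.advance(seconds_per_feed),  # assumed cost
                time_budget,
                clock,
            )
        )
    return runs


if __name__ == "__main__":
    # 200 feeds, an assumed 2.5 s per feed, a 60 s budget per cron run.
    print(simulate(200, 2.5, 60))  # [24, 24, 24, 24, 24, 24, 24, 24, 8]
```

Under these assumed numbers a single run refreshes 24 feeds, so it takes nine runs to work through 200 feeds. A larger budget or faster feeds reduces the number of runs needed.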

Do your logs (Admin > Reports > Logs) show anything?

njardim’s picture

I see. Is there a problem with changing this expected behavior and increasing the time to 180 in an attempt to drain all Feeds at once? I need all 200 Feeds updated each time cron runs, instead of hitting cron 3 times per hour. Is this safe?

Logs are fine, problems remain.

What about problem nº 1?!

StevenPatz’s picture

Please be assured that we are all volunteers here and work on issues as time permits.

njardim’s picture

Regarding Problem nº2, I was able to increase the time in aggregator_cron_queue_info() to 240, and this seems to solve the problem: all the Feeds are updated each time cron runs. But I don't think the solution should be based on time; it should be based on the number of Feeds that need updating. If the Feeds themselves already have an update interval configured, why shouldn't Cron just run and grab all of these Feeds at once?

Regarding Problem nº1 I realize you guys are all volunteers.

Thanks,

njardim

oskar_calvo’s picture

Hi,

Maybe the solution is not the best one, but it should work.

You could use Yahoo Pipes to join all the feeds into a single feed that Drupal can handle; that may be the best option.

Take a look at http://pipes.yahoo.com/pipes/ and drink a tea ;)

Oskar

njardim’s picture

Hi Oskar,

I'll drink my tea at the end :) when the problem is solved.

I trust the Drupal team will fix it.

Regards,

njardim

StevenPatz’s picture

What's to fix?

njardim’s picture

Priority: Normal » Major

Problem nº 1.

I have about 200 Feeds split into around 40 Categories.

There are 20 main categories to which the Feeds belong, and the Feeds are also split into minor categories.

Some of the main Categories are not being updated at all. They are still resting at 0 items!!!

I've tried with Linux and Windows versions of Drupal 7 with the same results!!!

Problem 1 is a MAJOR problem for this project. I told you already in this thread that this will have to be in production next week!!! So this is URGENT!!!

We will not add Yahoo Pipes because it adds an additional external dependency, and we don't want that. It is also not the right way to correct the problem.

The problem should be fixed at its core.

bfroehle’s picture

njardim: I hope you understand that it is nearly impossible to debug your number 1 issue given what you have currently provided.

How did you import these feeds? Can you refresh them manually in the admin interface? What happens if you manually run cron? Manually run cron repeatedly?

njardim’s picture

bfroehle: there were 2 major problems.

Problem nº 1 is about Categories not being updated even though the Feeds belonging to those categories are being updated. We have 20 main Categories to which all the Feeds belong, and some of these Categories are still resting at zero items. This is our main issue at the moment. Until this problem is fixed we can't go LIVE with this project.

Problem nº2 was about not all the Feeds being updated at once when cron runs. There's a time variable in aggregator_cron_queue_info() that was increased to 240, and this seems to solve Problem nº2, since 60 seconds is not enough to update more than 200 Feeds at once.

Q: How did you import these feeds? A: Directly from source.
Q: Can you refresh them manually in the admin interface? A: Yes. But Categories are not being updated.
Q: What happens if you manually run cron? A: Feeds get updated; Categories don't.
Q: Manually run cron repeatedly? A: No need to anymore, Problem nº 2 has a workaround.

Do you need additional information? Logs are clean which means source Feeds are OK.

Maybe you could try and simulate the issue since it happens in Windows and Linux.

There are other issues with the aggregator.module but first we really need Problem nº 1 to be fixed and then I will tell you about the other consistency problems with aggregator.

We appreciate your help.

droplet’s picture

"that was increased to 240 and this seems to solve Problem nº2, since 60 seconds is not enought to update more then 200 Feeds at once."

We absolutely need to dig inside to see what is happening. I had thought it adds each feed to the job queue and runs them one by one (which I think is the best approach).

njardim’s picture

droplet: please read post #8. I agree with you. It should be based on the total number of Feeds and categories, in precise order, one by one, not based on time.

This is a technical issue, but maybe the time factor has other implications and also affects the correct update of all categories.

The answer has to be precise, based on the number and order of the Feeds and Categories and not on random external factors. For example, consider the CPU usage when cron runs: if the CPU is already at 100% and disk queues are high, not even 240 seconds will be enough for 200 feeds and categories. Consider this an indication of what the solution should be.

Also, Problem nº 1 is critical for us at the moment. Please consider solving this issue first.

Thanks.

Best regards,

njardim

njardim’s picture

Priority: Major » Critical

Is there a time variable for updating the Categories or something similar?

C'mon guys we're on a tight schedule here!!!

Thanks

bfroehle’s picture

Category: bug » support
Priority: Critical » Major

StevenPatz’s picture

Priority: Major » Normal

njardim’s picture

Status: Active » Closed (fixed)

Problem nº 1:

I have more than 200 Feeds split into around 40 Categories. There are 20 main categories to which the Feeds belong, and the Feeds are also split into minor categories. All Feeds are being updated but some of the Categories are not being updated at all. They are still resting at 0 items.

I've tried with Linux and Windows versions of Drupal 7 with the same results.

The fix for Problem nº1 can be found here:

http://drupal.org/node/1023190

Thank you.
njardim

StevenPatz’s picture

Status: Closed (fixed) » Closed (duplicate)

bfroehle’s picture

Category: support » bug

njardim, thanks for your detective work!