Great module!
I have been using it heavily for several months: more than 70 feeds, and more than 1500 feed items loaded every day.

But there is one small problem: competition between feeds.
I cannot load all the feeds on every cron run, so I download only a few each time, currently about 5. The feeds publish at different frequencies, so I set each feed's refresh interval according to its publication frequency. I assumed that every download takes that interval into account.

A simplified example:
1. In a long download queue I have two different feeds: A, with a refresh interval of one cron run, and B, with a refresh interval of 5 cron runs.
2. After three cron runs, Feed A has a weight of 3 and Feed B has a weight of 1.
3. After two more cron runs, Feed A has a weight of 5 and Feed B has a weight of 2.
.....
4. Feeds with more weight are loaded first.
5. If Feed A has the maximum weight in the whole of a very long queue, it is loaded first.

Is that not how it works now?

Sincerely,
Gans_S and Google Translator

Comments

Ashraf Amayreh

The intervals are indeed taken into account. When we pick the feeds to refresh, we make sure we pick the feeds that are ripe (meaning, whose waiting/refresh intervals have elapsed). So you can be sure the five feeds that are refreshed on the next cron will be feeds that are not within the waiting/refresh interval.

Furthermore, when we retrieve the feeds, we retrieve them ordered by last refresh (the last time items were pulled from them). So the feed that has gone longest without a refresh will be the first one processed on the next cron run. That way, if you have 100 candidate feeds, every time five of them are refreshed, those five move to the end of the queue.
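
The selection described above can be sketched roughly as follows. This is only a minimal Python illustration, not the module's actual code: the `Feed` fields and the `pick_feeds` and `run_cron` functions are hypothetical names, and times are plain numbers in minutes.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    name: str
    refresh_interval: float  # minutes that must pass between refreshes
    last_refresh: float      # time of the last refresh, in minutes


def pick_feeds(feeds, now, limit=5):
    """Return up to `limit` ripe feeds, least recently refreshed first."""
    ripe = [f for f in feeds if now - f.last_refresh >= f.refresh_interval]
    ripe.sort(key=lambda f: f.last_refresh)
    return ripe[:limit]


def run_cron(feeds, now, limit=5):
    """Refresh the picked feeds; the new last_refresh sends them to the back."""
    picked = pick_feeds(feeds, now, limit)
    for feed in picked:
        feed.last_refresh = now
    return [f.name for f in picked]
```

Note that in this sketch the order depends only on `last_refresh`; the refresh interval decides whether a feed is ripe, but not where it sits in the queue.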

Does it cover all your troublesome cases?

Gans-S

That's right, but I think it is not enough.
What is very important is how many times a feed in the queue should already have been retrieved.

Gans-S

Please look at some real data:

I have two RSS feeds: A produces 50 items per hour, B produces 100 items per hour. I retrieve 5 feeds on every cron run, with a 10-minute cron interval.

I prefer not to load more than 25 items at a time, and loading 50 items is bad.
OK. I set A's refresh time to 30 min and B's refresh time to 15 min.
A and B have ripened and are in a queue of 15 feeds. B (25 items) is first. A (25 items) is second.

Feed B gains around 100/60*10*2 ≈ 33 more items over the two cron intervals it spends at the end of the queue.
It then has 25+33=58 items, which is bad. The longer the queue, the bigger the problem.

Feed A gains around 50/60*10*2 ≈ 17 more items over the two cron intervals it spends at the end of the queue.
It then has 25+17=42 items, which is not as bad.
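
The arithmetic above can be checked with a small Python sketch. The `backlog` function and its parameter names are hypothetical, and it assumes each feed publishes at a constant rate.

```python
def backlog(items_per_hour, refresh_min, cron_min, crons_waited):
    """Estimate how many items a feed holds when it is finally loaded.

    refresh_min is the feed's refresh interval; crons_waited is how many
    extra cron runs the ripe feed spends waiting in the queue.
    """
    per_minute = items_per_hour / 60
    expected = per_minute * refresh_min            # items by the time it ripens
    extra = per_minute * cron_min * crons_waited   # items added while queued
    return expected + extra


# Feed B from the post: 100 items/hour, 15-minute refresh,
# 10-minute cron interval, two cron runs spent waiting.
print(round(backlog(100, 15, 10, 2)))
```

The extra backlog grows with the publication rate, so for the same wait in the queue a faster feed overshoots the 25-item target by more.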

I think the feeds in the queue have to be sorted by weight. The weight should be calculated in proportion to the number of items added to the RSS feed during the time it spends in the queue:
W = C / F * S, where

C - cron interval
F - feed refresh interval
S - how many steps (cron runs) the feed has spent in the queue.

Or, in short, W = S / F, since C is the same for all feeds.

Feeds with a big weight will be moved to the front of the queue, so they are loaded first.
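
As a rough illustration of this proposal, here is a minimal Python sketch. It assumes that the heaviest feeds are picked first, as in step 4 of the opening post; the `weight` and `order_queue` functions and the tuple layout are hypothetical and not part of the module.

```python
def weight(cron_interval, refresh_interval, steps):
    """W = C / F * S: grows faster for feeds with a short refresh interval."""
    return cron_interval / refresh_interval * steps


def order_queue(queue, cron_interval):
    """Sort (name, refresh_interval, steps) tuples, heaviest first."""
    return sorted(
        queue,
        key=lambda feed: weight(cron_interval, feed[1], feed[2]),
        reverse=True,
    )


# Feeds A (30-minute refresh) and B (15-minute refresh),
# each after two cron runs in the queue.
queue = [("A", 30, 2), ("B", 15, 2)]
print([name for name, _, _ in order_queue(queue, cron_interval=10)])
```

Because C is the same for every feed, sorting by S / F would give the same order.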

Would that be difficult to do?