Comments

FiReaNGeL’s picture

Duplicate of:

http://drupal.org/node/174617

http://drupal.org/node/193383

But that doesn't mean it couldn't get a quick fix in this issue, as the other one has been active for quite a while.

dries’s picture

Status: Needs review » Needs work

The code comment needs some work then.

We should mark the other issues as duplicate (after researching them).

aron novak’s picture

Status: Needs work » Needs review
Attachment: patch (new), 817 bytes

Better comment text.

FiReaNGeL’s picture

Status: Needs review » Needs work

$allowed_time doesn't feel very descriptive, especially since it's a hard constant - maybe $minimum_cron_time or something like that? Aside from that, it's a very simple patch fixing a hard-to-track bug (I should know!). I would RTBC, but I'll leave it at needs work just for that 'issue', which is really a nitpick. Thanks Aron!

alex_b’s picture

Category: task » bug
Status: Needs work » Needs review
Attachment: patch (new), 769 bytes

#4 - I'd even say it's not necessary to introduce a variable here.

#2:

- #174617: Environment::setTimeLimit() ignores current value, can reduce timeout value has been marked as duplicate
- #193383: set_time_limit: Centralize calls and prevent warnings and errors does not address this issue at all - it only deals with the problems with the call to set_time_limit() itself.

This patch:

- removes $allowed_time variable
- cleans up comment

Once this goes into 7.x I'd love to see this go into 6.x, too. This behavior is causing confusion in cases where scripts go over 240 seconds: #370318: Keep exceeding cron time with FeedAPI.

FiReaNGeL’s picture

Status: Needs review » Reviewed & tested by the community

Thanks alex_b, I think this is ready to go.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

alex_b’s picture

Status: Needs work » Needs review
Attachment: patch (new), 769 bytes

Tests pass locally. Patch in #5 applies with an offset of 10 lines. Rerolling and nudging the testbot.

Status: Needs review » Needs work

The last submitted patch failed testing.

alex_b’s picture

Status: Needs work » Needs review
FiReaNGeL’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC, as it was before the test bot decided to cause problems for no reason.

webchick’s picture

Version: 7.x-dev » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

I tweaked the comment a touch to just " // Try to increase the maximum execution time if it is too low." -- no sense in regurgitating the code back word for word, esp. if we decide to change 240 to something else in the future.

Committed to HEAD. Bumping down to 6.x.

alex_b’s picture

Status: Patch (to be ported) » Needs review
Attachment: patch (new), 539 bytes

- Backported patch
- Replaced the existing comment with the D7 line: "Try to increase the maximum execution time if it is too low." (removed the "if in safe mode" part, because it is obvious from the code).

Not tested yet.

alex_b’s picture

Attachment: patch (new), 538 bytes

A space had snuck in.

TheRec’s picture

Hello,

Your check does not ensure that the time limit will not be decreased. Just read my last comment in #193383: set_time_limit: Centralize calls and prevent warnings and errors. Addressing this in a single place when set_time_limit is called in many other places does not sound like the best option either.

As quoted in the previously mentioned issue (which needs review, by the way):

When called, set_time_limit() restarts the timeout counter from zero. In other words, if the timeout is the default 30 seconds, and 25 seconds into script execution a call such as set_time_limit(20) is made, the script will run for a total of 45 seconds before timing out.

Source: PHP Manual -> http://www.php.net/set_time_limit

So, just so we are clear: set_time_limit() can only do what it advertises, set a new time limit, but the moment when it is called matters, and the previously set limit does NOT really matter. If that limit was already reached, your set_time_limit() call never happens; if it was not reached, the counter gets reset by your call to set_time_limit(). If you want to know whether the new time limit gives less time than what was set before, you have to check WHEN the call is made, i.e. how much time has already been spent running the script. That would be a lot of processing just to set a time limit, especially since this setting is up to the user in the first place (he configures his server; it is not Drupal's job), which is why I also removed this part in the patch of #193383: set_time_limit: Centralize calls and prevent warnings and errors.
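To make the arithmetic from the manual quote concrete, here is a minimal sketch (the numbers are the ones from the quote; the sleep() merely stands in for earlier work, and on some platforms only CPU time counts toward the limit, so treat this purely as an illustration):

```php
<?php
// Default limit: the script may run for 30 seconds in total.
ini_set('max_execution_time', 30);

sleep(25); // Stands in for 25 seconds of work already done.

// This does NOT cap the script at 20 seconds total. It restarts the
// timeout counter from zero, so 20 more seconds are granted from now on.
set_time_limit(20);

// Effective total runtime before a fatal timeout: 25 + 20 = 45 seconds,
// more than either configured limit (30 or 20) taken on its own.
```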

I really feel this issue is a duplicate, now that you even try to check against safe_mode.

TheRec’s picture

Status: Needs review » Closed (duplicate)
alex_b’s picture

Status: Closed (duplicate) » Needs review

This is a 6.x issue. The same fix went into 7.x (#12); the check against safe_mode is part of D6 and a different issue altogether.

I see that the proposed solution does not heed any previous calls to set_time_limit() but given that this happens very early in the Drupal bootstrapping process I think this is fine.

What's important is that it DOES heed timeout limits set in php.ini.

[edit]

Given the fact that this fix is already in 7.x I think we should commit the same approach to 6.x. The fix could subsequently be improved by #193383: set_time_limit: Centralize calls and prevent warnings and errors .

gábor hojtsy’s picture

Good catch. I think this would be a good improvement. Did not actually test.

gpk’s picture

Version: 6.x-dev » 7.x-dev
Status: Needs review » Needs work

As I understand it, if max_execution_time is set to 0 (= unlimited) then the code

  // Try to increase the maximum execution time if it is too low.
  if (ini_get('max_execution_time') < 240) {
    @set_time_limit(240);
  }

per http://api.drupal.org/api/function/drupal_cron_run/7 will still reduce the time limit.

The patch at http://drupal.org/node/193383#comment-1176097 had pretty exhaustive checking to make sure this didn't happen (too exhaustive for some people's liking :-P ).

TheRec’s picture

Version: 7.x-dev » 6.x-dev
Attachment: patch (new), 2.52 KB

gpk has a valid point: the only case where you can be sure that you are decreasing the effective total time allowed to the script (without tracking how much time the script has already spent running) is when max_execution_time is already set to 0. It is not possible to add a new function to D6 (correct me if I'm wrong), so I propose this patch, which regroups all the remarks made in #193383: set_time_limit: Centralize calls and prevent warnings and errors (no safe_mode check anymore, etc.; read the issue if you want more info) but does not create a wrapper function.
This patch also addresses the two other occurrences of set_time_limit in D6.

gpk’s picture

@20: the patch doesn't fix it for http://api.drupal.org/api/function/drupal_cron_run/6. Also if the current limit is 300 then the new limit will be lower.

So I still suggest something of the form

if (function_exists('set_time_limit')) {
  $desired_time_limit = 240;
  $current_time_limit = ini_get('max_execution_time');
  // Only raise the limit; a value of 0 means unlimited, so leave it alone.
  if (is_numeric($current_time_limit) && $current_time_limit > 0 && $desired_time_limit > $current_time_limit) {
    @set_time_limit($desired_time_limit);
  }
}

TheRec’s picture

gpk: That is not the way set_time_limit works... I think I must have explained that five times in #193383: set_time_limit: Centralize calls and prevent warnings and errors, and Damien Tournoud did too in that same issue (see #38)... and once in here (see #15, on this very page). You could also read my answer from today, where I explain it briefly... and if you still can't believe me, then read the PHP documentation ;)

**EDIT**:
So unless you code a timer to know the time already spent running at the moment you are about to call set_time_limit (and I doubt that would get committed just to make a call to set_time_limit), my last patch is the best you can do to check that you are not decreasing the effective total time allowed for the script.

The verification you offer might in many cases PREVENT the script from getting more time... It is not reliable.

gpk’s picture

For the case in point for this original issue, i.e. cron, the time already consumed by PHP at the time the code in drupal_cron_run() is executed will generally be well under a second. For our purposes here I think we can therefore assume it is near enough to zero. (If a full bootstrap is taking many seconds then you're going to have problems running Drupal on your server in any case).

>my last patch is the best you can do
So given that the time already consumed is for all intents and purposes zero, my patch (which combines yours with #8, plus an added check to make sure that ini_get('max_execution_time') returns something that can be interpreted as a number) is possibly a teeny bit better than yours :-P !!! I tested this code pretty thoroughly in the other issue.

gpk’s picture

>The verification you offer might in many case PREVENT the script to get more time... It is not reliable.
Ah, I don't understand, can you explain?

TheRec’s picture

It would prevent adding more time because you do not know how the code will evolve, and whether at some point longer tasks will run before the call to set_time_limit... I do not deny that right now it's called early in the script, but I usually try to code in a scalable way (we're already taking the user by the hand to avoid warnings that are only caused by their server configuration; whether they have control over it or not is not really relevant).

About checking for bad values of max_execution_time: I think this was also dropped in that other issue, so, not being stubborn, I figured I would not add it again. And if you quoted me honestly you would have used: "the best you can do to check that you are not decreasing the effective total time allowed for the script"... I check what I am sure I can check reliably; if you like to check unreliably, that's up to you, see below.

>I tested this code pretty thoroughly in the other issue.

Good :) then post a patch, get it reviewed.

gpk’s picture

Thanks for the clarification. As far as cron is concerned, I think it is perfectly safe to assume that the time already consumed is negligible, which I think is the point alex_b makes at #17. The process for cron is just

- full bootstrap
- drupal_cron_run()

(per http://api.drupal.org/api/file/cron.php/6/source).

As I understand it, a core Drupal development philosophy is to make the bootstrap as quick as possible, which is why, for example, Drupal 7 has a code registry. So I don't think we should allow for the possibility of longer tasks being added before the call to set_time_limit() in drupal_cron_run(). We have to make *some* assumptions, and IMO this is a reasonable one. (And if things change in the future, they can always be addressed at that time.)

>the best you can do to check that you are not decreasing the effective total time allowed for the script
Absolutely, something we can both agree on I think!!! :-D

Thanks for your patience on this ...

TheRec’s picture

Well, if we were talking about D7, I could point out that set_time_limit is used in a SimpleTest, for example... and typically a test could take a long time to complete... which is why I say "you never know how the code will evolve".

gpk’s picture

Yes, but I'm just talking about drupal_cron_run() here, not a drupal_set_time_limit() wrapper function...

TheRec’s picture

Good, then... post a patch, get it reviewed, and we'll all live happily ever after :) I have no more time to debate this... if you get enough positive reviews of this unreliable, assumption-based verification, then it will get committed... it's not like it will keep me up at night; it just won't come from me. I do not know what else I could do about it. Thanks for the discussion.

damien tournoud’s picture

Status: Needs work » Closed (won't fix)

The fact is, set_time_limit() means "give me more time", and is designed to be called several times in the process of a request, not once and for all. The correct workflow should be:

- full bootstrap
- drupal_cron_run()
- a hook_cron() implementation that takes a long time calls set_time_limit() a few times during its processing (for example, after handling a batch of 1000 items)

Trying to call set_time_limit() only once breaks the model completely, because it basically means that you could have a PHP process running away for 240 seconds. You only want to give the process more time if it is legitimate to do so.

What we could do is call set_time_limit() just before invoking each hook_cron() implementation, with a standard time limit (say 30s). If an implementation wants more time, it will have to call set_time_limit() itself regularly. That would ensure that each implementation has, at minimum, access to the same "share" of time.
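Damien's workflow could be sketched roughly like this (a hypothetical fragment, not committed code; drupal_set_time_limit() is the wrapper later introduced by #193383, and 30 is the standard per-hook share suggested above):

```php
<?php
// Hypothetical sketch: give each hook_cron() implementation the same share.
foreach (module_implements('cron') as $module) {
  // Restart the timeout counter before each implementation, so every
  // module starts with the same 30-second slice of execution time.
  drupal_set_time_limit(30);

  // An implementation that legitimately needs more time is expected to
  // call drupal_set_time_limit() itself regularly while it works.
  module_invoke($module, 'cron');
}
```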

Anyway, this particular issue is won't fix.

gábor hojtsy’s picture

@Damien Tournoud: do you have an issue for your suggestion? If not, can we please repurpose this issue for it?

damien tournoud’s picture

Title: Cron: do not lower execution time » Cron: guarantee each cron implementation the same share of execution time
Version: 6.x-dev » 7.x-dev
Category: bug » feature
Status: Closed (won't fix) » Active

Ok, let's repurpose this issue.

axyjo’s picture

Issue tags: +cron

Tagging for easier access.

alex_b’s picture

I just talked to Damien on IRC. For me, the missing piece in understanding the approach he's advocating in #30 was that modules themselves should be responsible for extending the max execution time on cron.

I think this makes sense, especially in conjunction with a mechanism that extends the execution time for every invocation of hook_cron. I'm hoping we can backport this approach to D6 once it's fit for D7.

TheRec’s picture

Sounds like a much better path to follow. While we're at it, the way set_time_limit is called should be centralized, because of the repetitive checks (function_exists, etc.) we are doing before calling it (to avoid warnings and even fatal errors, which can happen in many server environments).
So I am wondering: should we fix #193383: set_time_limit: Centralize calls and prevent warnings and errors first, and once we have the right tool for set_time_limit continue with the current issue without filling the log with warnings? Or do we take on the current issue first, with each call to set_time_limit made with the error control operator (@) until the other issue is fixed? At some point those patches overlap, so I think it would be wise to decide this now.

damien tournoud’s picture

Status: Active » Postponed

I would like #193383: set_time_limit: Centralize calls and prevent warnings and errors to get in first. Let's postpone this one.

TheRec’s picture

Assigned: Unassigned » TheRec
Status: Postponed » Active

We finally have drupal_set_time_limit; I'll try to post a patch tonight to improve the way we use it when running cron, as suggested previously :)

TheRec’s picture

Status: Active » Needs review
Attachment: patch (new), 2.55 KB

In this patch, I am assuming that module_implements() returns an array of available functions (functions that exist in the current context of the script); right now there are discussions about this in #412808: Handling of missing files and functions inside the registry (which is linked to #528896: Small optimization to module_invoke_all()). After talking with Berdir on IRC, he suggested dealing with it this way. This is why I am not checking drupal_function_exists($module . '_cron') (or even function_exists($module . '_cron') || drupal_function_exists($module . '_cron'), as catch suggests in the previously mentioned issues) in the foreach loop.

In drupal_set_time_limit(30), 30 is an arbitrary time limit for EACH hook_cron implementation, which we came up with while talking on IRC with Damien Tournoud.

I added a few periods at the end of inline comments that were missing them in the same function, just to clean it up a bit.

TheRec’s picture

Attachment: patch (new), 2.42 KB

Removed a "dot" that I had added by mistake, and modified the way the cron implementation is called, on Damien Tournoud's advice (the use of call_user_func_array was not necessary).

Status: Needs review » Needs work

The last submitted patch failed testing.

TheRec’s picture

Status: Needs work » Needs review

Bad test bot, shush, don't mess with that old patch :P

hass’s picture

I'm not yet sure I'm a fan of cron processes taking longer than 240 seconds, or of module developers now having a hand in processing time on my machines. This patch also does not account for environments that are not able/allowed to use set_time_limit(). In such a case cron will fail earlier than expected. Can we come up with a solution for hard limits we are not able to override?

TheRec’s picture

>I'm not yet sure if I'm a fan of cron processes taking longer than 240 seconds and that module developers now have the hand on processing time on my machines.

Then disable set_time_limit() in your PHP configuration. If you don't want module developers to have control over processing time on "your" machines, then don't use their modules, because with or without this patch they can do whatever they like with your processing time when you use their code.

>This patch also does not account for environments that are not able/allowed to use set_time_limit()

Yes, it does account for that: we use drupal_set_time_limit(), which does.

>In such a case cron will fail earlier than expected. Can we come up with a solution for hard limits we are not able to override?

In my opinion this is not in the scope of this issue. If cron fails because the time limit has expired, it will do so with an error message... then it's the task of the server administrator (or similar) to check what the problem is and whether to allow more resources or whether the script is buggy. If you mean handling the error when max_execution_time is reached, that is also out of the scope of this issue: it does not concern cron only, any task could raise this error, and as it is a fatal error you cannot prevent it from happening (otherwise it wouldn't be fatal).

hass’s picture

>Yes it does care about it, we use drupal_set_time_limit(), which does.

How is this possible? In a safe_mode environment it does nothing and the default timeout is used. The code says "give me 60 seconds", but a higher instance does not allow setting the 60 seconds, and then your code that needs ~50 seconds will fail.

>If you mean handle the error when max_execution_time is reached, then it is also out of the scope of this issue, it does not concern cron only, any task could raise this error and as it is a fatal error you could not prevent it from happening (otherwise it wouldn't be fatal).

No, I do not mean catching the error. Code may be intelligent, like the example below... You may have seen the code at http://drupal.org/node/193383#comment-2061014, too. This may not work for a task taking 300s if the limit is 240, but that could be left up to the developer to optimize in his code...

while ($link = db_fetch_object($result)) {
  // Do some time-consuming work here. Each iteration takes only a short
  // time, but the loop may execute up to 1000 times.

  // Stop once we have used over half of the maximum execution time.
  if ((timer_read('page') / 1000) > (ini_get('max_execution_time') / 2)) {
    break;
  }
}

TheRec’s picture

It accounts for it in the sense that it will not raise an error if the function cannot be used because of environment restrictions, thanks to function_exists() and error suppression (@... which I still think is not a great idea... but most people wanted it, so we use it). Yes, it will fail in the case you described, but that is the fault of the administrator who did not allow enough execution time for what the developer THINKS is required... the keyword is "thinks"... set_time_limit() is by nature not guaranteed to succeed when you call it; it depends on the environment... the choice was made not to report those errors to users, given the previously mentioned verifications.

Your suggestion is that we should avoid reaching this limit by breaking off task execution (with break;) once a certain amount of the maximum execution time has been used... fine, but as Damien Tournoud explained, and as has been explained a few times in #193383: set_time_limit: Centralize calls and prevent warnings and errors, set_time_limit() resets the execution time counter when it is called and sets max_execution_time to the value passed as its parameter (provided use of the function is not prohibited by the environment). So the condition you've written in your last message could fire when way more than half of the execution time is left, which makes it inefficient. Your condition is right in two cases: when set_time_limit() was called exactly once, or when it was never called (or could not be called). To fix this, you should maybe reset the timer each time you call set_time_limit() (but you'd have to know whether the set_time_limit() call succeeded, which is pretty hard... tracking max_execution_time would be useless... imagine max_execution_time is 60... you call set_time_limit(60) when the script has already run for 29s... if you're allowed to use the function, you get 60 more seconds to run, 89s in total... but max_execution_time will not have changed; it will still be 60, so your condition fires a second later, when your timer passes 30, even though almost a full minute is left).
In conclusion, there is no easy way to prevent the max-execution-time fatal error efficiently... sure, your solution will work in cases where set_time_limit() is called once, but as has been explained, that's not the way set_time_limit() should be used. In a time-critical set of tasks it is meant to be used every time you know at most how long a task will take... and max_execution_time only tells you how much time was allowed by the last successful call to set_time_limit(); it does not track the total allowed time.

hass’s picture

Thank you for your detailed explanation.

I'm with you that calculating and breaking the while loop when the condition exceeds half of the cron time looks inefficient, as it may not use the full time, but in the linkchecker example the last drupal_http_request() could have been fired at second 120 and may take longer than half of the cron time... plus processing time for the status and so on... so the process eats a bit more time :-).

Have you taken a look at the debugging at http://drupal.org/node/580248#comment-2057860? It shows that ini_get('max_execution_time') seems to return what set_time_limit() sets. I have not verified this myself yet, but it looks that way. If this weren't the case, users would never have reported the core bug #193383: set_time_limit: Centralize calls and prevent warnings and errors as a linkchecker bug. So I expect that ini_get('max_execution_time') will always return the set_time_limit() value.

TheRec’s picture

Yes, that's what I've said: set_time_limit() sets max_execution_time to the passed argument (provided you are allowed to use set_time_limit() in your environment). And that is the problem with the verification you suggest: max_execution_time is not the total time the script will be allowed to run... it is the time that was granted since the last time the counter was reset (either because the script started or because there was a successful call to set_time_limit()).

I know it's complicated, but the example I gave you shows the limit of your verification... it will work if max_execution_time is untouched from the beginning of the script to the end... if there is a successful set_time_limit() call in between, it won't work... so either you restart your timer at the same time you call set_time_limit(), and then you'll be able to do it... or you compute the real time allowed for the script and use it instead of max_execution_time.

hass’s picture

Ok - thank you for the clarification. I think it's not that complicated, but I had missed the point that a script may now run for more than 240s in total. That being so, hook_cron implementations should try to use set_time_limit() as suggested, as a general approach.

But if the script runs in a limited environment, ini_get('max_execution_time') gives me something helpful: if we try set_time_limit(240) and nevertheless get 60s back from ini_get('max_execution_time'), then we know set_time_limit() isn't working. If the cron hook compares the value of ini_get('max_execution_time') with timer_read('page'), the module knows how much time is left until cron is killed.
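The detection idea described above could be sketched like this (a hypothetical fragment; whether ini_get() reliably reflects set_time_limit() on every platform is exactly what is being debated in this thread):

```php
<?php
// Try to raise the limit, then read the setting back to see whether
// set_time_limit() actually took effect in this environment.
if (function_exists('set_time_limit')) {
  @set_time_limit(240);
}
$limit = (int) ini_get('max_execution_time');

// In a restricted environment $limit may still be the original value
// (e.g. 60), which tells us set_time_limit() is not working.
// timer_read() returns milliseconds since the 'page' timer started.
$elapsed = timer_read('page') / 1000;
$time_left = ($limit == 0) ? INF : $limit - $elapsed;
```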

So for now, the current linkchecker and job_queue check, if ((timer_read('page') / 1000) > (ini_get('max_execution_time') / 2)) {, is perfectly valid to prevent a kill while running the linkchecker code in limited environments. In addition, set_time_limit() simply works as expected, and the same check also works in unlimited environments, though there it may never fire the break :-). I believe I need to calculate this for one reason only: I cannot guess how long a fixed number of HTTP requests will take to complete. Sometimes the code is able to complete ~120 requests and sometimes only 3. My hands are tied, as the logic relies on the speed of unknown external servers, which is not guessable at all.

Damien's statement is, more or less, that it's wrong to use ini_get('max_execution_time'), but this does not seem correct.

TheRec’s picture

It is wrong to use max_execution_time because it does not reflect the total time the script is allowed to run; it only holds the last limit set. You are comparing two things: the page-rendering timer and the time limit that was set last... the timer always increases, even when you call set_time_limit(), which changes the time limit... I know your current implementation will work in environments where set_time_limit() does not work (because max_execution_time will never change during execution), but if set_time_limit() works, max_execution_time can change, and it will no longer reflect the total allowed time, which renders your condition ineffective.

Your condition would be valid in every case if you used the total effective time allowed instead of max_execution_time. As far as I know it is not stored anywhere, so you would have to calculate it after each call to set_time_limit(), which makes it pretty hard to compute. It would be something like: max_execution_time (before set_time_limit) - timer('page') + max_execution_time (after set_time_limit).
The other solution would be to use, instead of timer('page'), a timer that is reset every time set_time_limit() is called...
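That second idea (a timer reset alongside every successful call) could be sketched like this (hypothetical helper names invented for illustration; it assumes set_time_limit() returns false on failure, which is what the PHP manual documents):

```php
<?php
// Hypothetical helpers: remember WHEN the limit was last set, so the time
// actually left can be computed even after repeated set_time_limit() calls.
global $my_limit, $my_limit_set_at;
$my_limit = (int) ini_get('max_execution_time');
$my_limit_set_at = microtime(TRUE);

function my_set_time_limit($seconds) {
  global $my_limit, $my_limit_set_at;
  // set_time_limit() restarts PHP's timeout counter on success, so
  // restart our own timer at the same moment.
  if (function_exists('set_time_limit') && @set_time_limit($seconds)) {
    $my_limit = $seconds;
    $my_limit_set_at = microtime(TRUE);
  }
}

function my_time_left() {
  global $my_limit, $my_limit_set_at;
  // A limit of 0 means unlimited execution time.
  return $my_limit == 0 ? INF : $my_limit - (microtime(TRUE) - $my_limit_set_at);
}
```

A loop could then break when my_time_left() falls below a threshold, instead of comparing timer_read('page') against max_execution_time.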

All the other solutions I could think of are not efficient in every case... sure, they work in a few cases, but I don't find that acceptable, and I'd prefer my script to fail with a fatal error rather than have it work halfway.

Status: Needs review » Needs work

The last submitted patch failed testing.

TheRec’s picture

Status: Needs work » Needs review
Attachment: patch (new), 2.56 KB

It needs a reroll now that #578676: Use queue for cron is in core. The usefulness might be a bit limited now that we can use queues, but still, each task that uses hook_cron() only (and not the Queue API) should be given (or at least we should try to give it) enough time to run... maybe the value could be lowered; that should be discussed, I guess.

dave reid’s picture

Component: base system » cron system

Moving to new cron system component.

cosmicdreams’s picture

Issue tags: -cron

#51: drupal-375578_3.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, drupal-375578_3.patch, failed testing.

  • webchick committed 6ac0154 on 8.3.x
    #375578 by Aron Novak and alex_b: Only raise max execution time during...

  • webchick committed 6ac0154 on 8.4.x
    #375578 by Aron Novak and alex_b: Only raise max execution time during...

  • webchick committed 6ac0154 on 9.1.x
    #375578 by Aron Novak and alex_b: Only raise max execution time during...

Status: Needs work » Closed (outdated)

Automatically closed because Drupal 7 security and bugfix support has ended as of 5 January 2025. If the issue verifiably applies to later versions, please reopen with details and update the version.