Hi there,

I'm using the 6.x-1.0-beta3 version of this module with the 6.x-1.3 version of Simplenews. I have set up a schedule to go out on a monthly basis, for the 10th of every month. However each month it goes out on a slightly different date. This month it went on the 7th and last month it went on the 11th. I've checked the site timezone settings and I have disabled the user timezone. I have cron running on my server every half an hour. Is this a bug? And is there anything I can do to ensure it goes out on the same date each month? This poses a difficulty for me as I need to give a deadline to contributors to the newsletter each month.

Would upgrading to 6.x-2.0 solve the problem?

Many thanks,

Ben

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim’s picture

Version: 6.x-1.0-beta3 » 6.x-2.0

I'm seeing this in 2.0.

I tested this in a virtualbox by changing the system time and running cron.

My newsletter was set up to send on the 1st of the month at 4am. I ran cron on the 1st and last day of December, January, and February at 12.10pm, and got these newsletter editions sent:

2011-12-01 12:10 Newsletter for December 2011
2012-01-01 12:10 Newsletter for January 2012
2012-02-01 12:11 Newsletter for February 2012
2012-02-28 12:11 Newsletter for February 2012

joachim’s picture

I'm going to write some unit tests for this....

bmango’s picture

Many thanks joachim, I will probably switch to sending manually for the time-being to be on the safe side.

dgtlmoon’s picture

related to #984770: Use a better backend for triggering events

The logic for sending is really primitive

// Set the default intervals for scheduling.
$intervals['hour'] = 3600;
$intervals['day'] = 86400;
$intervals['week'] = $intervals['day'] * 7;
$intervals['month'] = $intervals['day'] * date_days_in_month(date('Y'), date('m'));

But adding a use-case to just specifically handle the month would fix most peoples problems, maybe see if the selection is "monthly" then just compare the day-of-month

bmango’s picture

Many thanks dgtlmoon for your helpful comment. That sounds like a good way to deal with it.

joachim’s picture

Version: 6.x-2.0 » 7.x-1.x-dev
Category: support » bug
Priority: Normal » Critical
Status: Active » Needs review
Issue tags: +Needs backport to D6
FileSize
1.49 KB

The problem is with date_days_in_month() taking the number of days in the current month, ie when cron runs.

Simple example this is wrong: suppose we last sent on 5 Feb. Next send should be 5 March, which is 28 days later. But by the time we get to 5 March, the current month is March and so date_days_in_month() returns 31. The check for whether the newsletter is due will return FALSE, and won't say TRUE until 3 days later when it's been 31 days.

The fix is to take the number of days in the month that the last sending occurred, so we correctly get 28 days when the last sent newsletter was in February.

Longer explanation here: #1463228: add tests for scheduled times, where I'm working on tests for this issue.

Here's a patch.

joachim’s picture

Urgh. Tests are now failing locally. No idea what I broke -- will look at it tomorrow.

joachim’s picture

It's probably this:

      // Add the number of seconds the newsletter's interval represents.
      $newsletter_parent_data->seconds = $seconds;

Oh and obviously the lack of an $intervals['month'] to iterate to.. DUH.

Berdir’s picture

+++ b/simplenews_scheduler.moduleundefined
@@ -377,14 +377,26 @@ function simplenews_scheduler_get_newsletters_due($timestamp) {
+      // @see http://drupal.org/node/1463228 for more details.
+      $sql .= "AND :now - last_run >= (DATE_FORMAT(LAST_DAY(FROM_UNIXTIME(last_run)), '%e') * 24 * 60 * 60) ";

Wondering how compatible this query is, no idea if this will work on PostgreSQL/SQLite/...

It's probably also time to convert this query to db_select().

We can do both of that (testing with other dbms and the db_select() conversion) in a follow-up, though.

joachim’s picture

> PostgreSQL/SQLite

Almost certainly not :D
The way to handle it would be with a query extender -- same pattern as this patch for Views: #1362524: add GROUP_CONCAT aggregate function; and move aggregate function helpers to a query extender?. But yes, definitely one for a follow-up!

I'll fix the patch tomorrow.

> It's probably also time to convert this query to db_select().

Probably, yes. I'll perhaps keep working on it as strings for the time being so backporting it is easier, and then convert it at the end.

joachim’s picture

Rats.
This isn't compatible with interval frequency: when you're sending every two months, the number of seconds between 5 jan and 5 march depends on two different months.

> But adding a use-case to just specifically handle the month would fix most peoples problems, maybe see if the selection is "monthly" then just compare the day-of-month

Quite likely the way we're going to have to do it.

miro_dietiker’s picture

Alternatively you could cycle n times through recalculation...

Hint 1:
On a different project, we did time interval calculations in sql...

Hint 2:
Why not use this method?
http://www.php.net/manual/en/datetime.add.php

Berdir’s picture

Because we *are* doing it in SQL ;). The start time is in the database, we don't want to load all schedulers and then calculate the next send time in PHP.

miro_dietiker’s picture

I think that's still wrong way.

If in sql, use sql native delta date support, not timestamp math addition..

We can also calculate the next sending time in php and persist it on the job.

Then you only need to compare with NOW() and it's a way less complex query.

Berdir’s picture

Calculating the next send time in advance in PHP might actually be the best idea.

Then all we'd need to do is set up the correct DateInterval object (we could even add support for http://ch2.php.net/manual/en/dateinterval.createfromdatestring.php), add that to $edition_time and update the scheduler row.

Edit: Maybe we can get something in that basically works including joachim's tests and then refactor it?

joachim’s picture

> Calculating the next send time in advance in PHP might actually be the best idea.

Yup, but you need to do that for each row -- because it's based on each row's start time or last send.

Though...

> we don't want to load all schedulers and then calculate the next send time in PHP.

How many monthly newsletters is a site going to have, feasibly? Most will have one. I'd really doubt it if a site had more than half a dozen.

> On a different project, we did time interval calculations in sql...

The thing is that we need to add an arbitrary number of months to the last_run, to cover the case where an edition was missed. Over at #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters I've done it in PHP like this:

+  if ($newsletter_parent_data->send_interval == 'month') {
+    // Start from the last run or the base time if we've never run.
+    $last_run = intval($newsletter_parent_data->last_run);
+    $base_time = $last_run ? $last_run : $first_run;
+
+    // Set a pointer we'll advance and the increment.
+    $pointer_time = $base_time;
+    $offset_string = "+{$newsletter_parent_data->interval_frequency} month";
+
+    while ($pointer_time <= $now_time) {
+      // Look one increment ahead.
+      $timestamp = strtotime($offset_string, $pointer_time);
+      if ($timestamp > $now_time) {
+        // It's in the future: hence take the current pointer time as the most
+        // recent possible edition time.
+        return $pointer_time;
+      }
+      else {
+        // We're not far enough: increment the pointer.
+        // Note that there is a special case where now and the pointer time
+        // are identical. We could handle this separately but it's just as
+        // well covered by going round for another iteration.
+        $pointer_time = $timestamp;
+      }
+    }
+  }

Is that doable in SQL?

> Edit: Maybe we can get something in that basically works including joachim's tests and then refactor it?

We could say we don't support the frequency parameter for monthly intervals... my previous patch covers the 1-month case.

miro_dietiker’s picture

> Yup, but you need to do that for each row -- because it's based on each row's start time or last send.

You just need to do that on an _update_xxxx function. And in all future subsequent writes.
That's nothing that happens when reading / checking.

It's quite important that you design cron functions that lightweight. Even it you think now it's not that important: It would be a waste of resources... sometimes less, sometimes more... :-)

joachim’s picture

> You just need to do that on an _update_xxxx function. And in all future subsequent writes.
That's nothing that happens when reading / checking.

Sorry you've lost me there... can you explain a bit more?

What I mean is that the question 'find all monthly newsletters that are due' requires us to calculate for each row based on each row's start_date and last_run.

Berdir’s picture

We're introducing a new row, called next_run. That is calculcated *in advance*. Say you have a monthly (or every two month, doesn't matter at all) newsletter, starting today.

So cron runs, deciced that a newsletter edition needs to be created and we do so. Now we are able to define exactly when the next newsletter is due. edition_time + 1 (or two) month. That's April 14th, 21:33 and this is not going to change, no matter what happens (except if you stop creating scheduled newsletters). This means that the cron query gets extremely simple. Select all schedules which are enabled and next_run is < $now_time.

The only thing that's left is creating a one-off update function that goes through all existing scheduled times and calculcates next_run. For newly enabled schedules, we set it to start_date.

The advantage of this is that the next_run calculation can be as complex as it needs to be. For example, we can move the stop-edition calculation and all that stuff into it, so that a possible result of the next_run calculation is FALSE which means disabling the schedule. We can add a hook and rules integration and whatever you could possibly think of ;) It's easy to add support for sending on weekdays, and so on...

joachim’s picture

Ah right, that makes perfect sense and sounds like a very good plan.

Berdir’s picture

I guess the question now is if we want to move in that direction immediatly or if we want to get your tests in first. Having tests before doing the refactoring would help but putting in much time into something that we know we're going to replace completely is well.. waste of time ;)

What do you think?

miro_dietiker’s picture

Berdir, any quick estimate about how much time it takes to do the quick and clean refactoring? :-)

joachim’s picture

Well the tests fail right now :)
I'm well on the way to getting them cleaned up; I can post a patch tomorrow.

My own use case is getting the 1-month interval fixed -- so I for one would be ok with saying that frequency interval doesn't work with months for now :)

I presume the new system would impact the tests somewhat, as simplenews_scheduler_calculate_edition_time() would be replaced with something that calculates the *next( due date, and the edition date for a created node would just be taken from the DB row value.

joachim’s picture

One other thing I thought of.

Currently, we handle the scenario where an edition is skipped. The proposed database schema won't do that. Therefore, we will still need to keep simplenews_scheduler_calculate_edition_time().

Here is a use case that illustrates this:

Suppose Alice has a newsletter that runs daily at noon.

1. on Monday it runs as normal.
-- old system: The last_run is set to Monday 1200
-- new system: the next_run is set to Tuesday 1200
2. on Tuesday at 1000 her site goes down
3. her site is not back up until Wednesday at 1000
4. on Wednesday at 1200, cron runs and finds correctly that a newsletter is due.
-- old system: simplenews_scheduler_calculate_edition_time() will correctly find that the edition time should be Wednesday 1200
-- new system: if we just take the database's next_run time as the edition time, everyone will receive a newsletter on Wednesday that is marked Tuesday.

Hence we need to work as follows:

- when sending a newsletter, mark the next_run in the database as (edition time + (period * frequency))
- when creating the next edition, try adding (period * frequency) to (edition time) to get the most recent possible time, using the same logic as my patch at #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters

miro_dietiker’s picture

How about an option to choose if missed intervals are post-executed or not?

For some implementations it makes sense or is even important to send them out.

joachim’s picture

Yup, but as part of a follow-up issue I'd say.

joachim’s picture

Status: Needs review » Needs work
FileSize
9.57 KB

Here's my work in progress so far:

- added an API function for calculating next_run
- added tests for it
- added the column to the schema
- started work on an update function

I'm concerned that on existing sites, last_run contains bad data that we can't easily know how to fix. That's because existing sites that have a monthly newsletter, each run is off by a day or two. So if the start date was 5 Jan 2010, I've no idea at all what the last_run might be now.

If we try to fix it to the nearest schedule date we could end up on the wrong one if the drift has got very bad: eg if the last run was 20 Feb, do we correct it to 5 Feb or 5 Mar? Hence I think people will have to be told to fix that by hand.

This is important because when we populate next_run in our update function we need to base it on last_run.
Though I guess we could do a one-off where we calculate next_run right from the start time -- that's expensive, but in a hook_update it's ok.

Also, I should probably commit #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters before this, so I can make use of it in the update function.

If I'm working more on this I'll be on IRC -- if anyone wants to help please ping me :)

miro_dietiker’s picture

+++ b/simplenews_scheduler.installundefined
@@ -24,6 +24,12 @@ function simplenews_scheduler_schema() {
+        'description' => 'The future timestamp the next scheduled newsletter is due to be sent.',

+++ b/simplenews_scheduler.moduleundefined
@@ -359,6 +359,46 @@ function simplenews_scheduler_calculate_edition_time($newsletter_parent_data, $n
+    $pointer_time = $newsletter_parent_data->last_run;
...
+    $pointer_time = strtotime($offset_string, $pointer_time);

newsletter edition, right?

+++ b/simplenews_scheduler.installundefined
@@ -131,3 +137,35 @@ function simplenews_scheduler_update_7000() {
+      'description' => 'The future timestamp the next scheduled newsletter is due to be sent.',

again edition.

+++ b/simplenews_scheduler.installundefined
@@ -131,3 +137,35 @@ function simplenews_scheduler_update_7000() {
+      //$schedule->next_run = ...;

@todo :-)

There's some risk that PHP function doesn't understand our interval names (for whatever reason). So we should test all possible interval options (weeks, days, ...) and check if the resulting time is different to the previous. We can use a foreach over the select options for that.

Rest looks good. Not tested.

joachim’s picture

The problem I found with writing the unit test for that is that to test whether the code for adding 1 month has worked properly I need to... add 1 month and compare! There seems to be a real risk of just writing the same code twice and checking the outputs are equal, which isn't very useful. I tried the approach of putting in several simple tests instead which together should check the sanity of the output.

> There's some risk that PHP function doesn't understand our interval names (for whatever reason).

I think all the ones we're using should be fine. strrtotime (or however that's spelled) seems very flexible in what it accepts!

miro_dietiker’s picture

Yeah, and the flexibility is frightening me. That's why i suggested to check that it does something...
We could also check growing from smaller to bigger interval sizes and always check if the next interval result size is bigger than the previous was.

Just quickly thought about what contains risk of breach.

dgtlmoon’s picture

there must be a better way to refactor this using a different module or method

dgtlmoon’s picture

you need some kind of crontab managed by PHP perhaps

theres some interesting ideas in 8x http://drupal.org/node/154043

dgtlmoon’s picture

miro_dietiker’s picture

Yes, there's a lot of redundancy for all modules that to periodic processes. Core cron is simply too limited.

However that's a different story. At the moment we need just something that works reliably for us... (and then we can think of building something more general long term... in contrib or even core 8 .. 9 ... 10 ;-) )

joachim’s picture

My idea for a roadmap is this:

- implement the 'next_run' fix in 7.x-1.x and 6.x-2.x
- look at using a better backend for a 7.x-2.x branch. The job_scheduler module looks like the best bet to me, but something to discuss over there
- when better cron lands in core, use that ;)

If we don't like using strrtotime, we can achieve the same effect using PHP DateInterval objects. We'd need to define for each of our time periods the corresponding string, eg:

'month' => 'M'

and then:

$interval_string = "P{$frequency}{$period_letter}"

Berdir’s picture

Roadmap sounds fine to me. Another possibility to base on would be rules scheduler, but that's certainly something for later.

- The approach in here looks fine to me.
- About the update function, as we discussed, let's just base it on last_run and if there are any monthly schedules, display a warning messages for users to verify. After all, the patch fixes the date on whatever it currently is, there is absolutely nothing we can do if the current date is wrong.
- Note that there is something tricky about the DateInterval construct, there is an additional fixed "T" before the period designators for hour and smaller. So it's "PT2H" for two hours. There is however also a method that accepts the same format as strtotime() does. That might be the easiest thing to do, initially create a DateInterval and then just $date->add($interval) (which can easily be repeated).

I vote for getting this in ASAP and then improve it. For example, this change will make it trivial to add an additional schedule period "Custom" which will display a textfield that allows to specify the interval format directly and use stuff like years or 27 days and 7 hours, if you want to ;)

joachim’s picture

> - About the update function, as we discussed, let's just base it on last_run and if there are any monthly schedules, display a warning messages for users to verify. After all, the patch fixes the date on whatever it currently is, there is absolutely nothing we can do if the current date is wrong.

I came up with a better solution to this: go right back to the start_date and work from that to get the first possible edition date that's in the future. That may cause a too short interval between the last edition before the code update and the first one after, so users should still be warned about it.

In the update hook it would work like this:

      // Remove last_run from the object to force the next_run calculation to
      // work from the start_date. This ensures that any drift in edition dates
      // due to month drift is ignored.
      // @see http://drupal.org/node/1364784
      unset($schedule_copy->last_run)

      // Get the next run time relative to the request time.
      $schedule->next_run = simplenews_scheduler_calculate_next_run_time($schedule_copy, REQUEST_TIME);
joachim’s picture

Status: Needs work » Needs review
FileSize
13.74 KB

Here's a patch. I *think* it covers everything that needs to change!

Still todo:

- add more tests
- consider how best to calculate simplenews_scheduler_calculate_next_run_time

>Note that there is something tricky about the DateInterval construct, there is an additional fixed "T" before the period designators for hour and smaller. So it's "PT2H" for two hours. There is however also a method that accepts the same format as strtotime() does.

That's true, I'd forgotten about that. But then if we use DateIntervals with the same string format as strtotime(), what do we gain apart from the complexity of working with the objects rather than timestamps?

(As an aside, I wouldn't mind eventually migrating to using native SQL datetime values in the database -- it would make debugging a whole lot easier!)

joachim’s picture

I've had an idea for making the test more generic:

      if ($next_run_time != $last_run_time) {
        $diff = $next_run_time - $last_run_time;
        $offset = _simplenews_scheduler_make_time_offset($newsletter_parent_data->send_interval, $newsletter_parent_data->interval_frequency);
        
        $next_run_date = date_add(date_create("@$last_run_time"), date_interval_create_from_date_string($offset));
        $this->assertEqual($next_run_date->getTimestamp(), $next_run_time, t('Next run timestamp blah vlag.'));
        
        $last_run_time = $next_run_time;
      }

This basically says: 'whenever a loop iteration gives you a new time for the next edition, check it's advanced by the newsletter's offset'.

joachim’s picture

And here's that added to the patch.

What I'm thinking we do for the tests is farm out the loop at the heart of testNextRunTimeOneMonth() to a helper function, and then define lots of simpler testNextRunTimeIntervalFoo() functions that define a $newsletter_parent_data and pass it to the helper.

Berdir’s picture

Status: Needs review » Needs work

The advantage of DateInterval() would IMHO be that we can move the parsing of the interval out of the loop and only add it repeatedly. That's not possible with strtotime() because it's does parsing and adding in the same step.

+++ b/simplenews_scheduler.moduleundefined
@@ -373,52 +417,37 @@ function simplenews_scheduler_calculate_edition_time($newsletter_parent_data, $n
-  // Set the default intervals for scheduling.
-  $intervals['hour'] = 3600;
-  $intervals['day'] = 86400;
-  $intervals['week'] = $intervals['day'] * 7;
-  $intervals['month'] = $intervals['day'] * date_days_in_month(date('Y'), date('m'));
-
-  $newsletters = array();
-  foreach ($intervals as $interval => $seconds) {
-    // Check daily items that need to be sent.
-    $sql = "SELECT * FROM {simplenews_scheduler} ";
-    $sql .= "WHERE activated = :active ";
-    $sql .= "AND :now - last_run > :interval ";
-    $sql .= "AND send_interval = :frequency ";
-    $sql .= "AND start_date <= :now ";
-    $sql .= "AND (stop_date > :now OR stop_date = 0)";
-
-    $result = db_query($sql, array(':active' => 1, ':now' => $timestamp, ':interval' => $seconds, ':frequency' => $interval));

Yay! ;)

+++ b/simplenews_scheduler.moduleundefined
@@ -373,52 +417,37 @@ function simplenews_scheduler_calculate_edition_time($newsletter_parent_data, $n
+  foreach ($result as $newsletter_parent_data) {
+    // The node id of the parent node.
+    $pid = $newsletter_parent_data->nid;
 
-      // Add the number of seconds the newsletter's interval represents.
-      $newsletter_parent_data->seconds = $seconds;
+    // Check upon if sending should stop with a given edition number.
+    $stop = $newsletter_parent_data->stop_type;
+    $stop_edition = $newsletter_parent_data->stop_edition;
 
-      $newsletters[$pid] = $newsletter_parent_data;
+    $edition_count = db_query('SELECT COUNT(*) FROM {simplenews_scheduler_editions} WHERE pid = :pid', array(':pid' => $pid))->fetchField();
+    // Don't create new edition if the edition number would exceed the given maximum value.
+    if ($stop == 2 && $edition_count >=  $stop_edition) {
+      continue;

Do I see it correctly that if you reach the stop_edition value, we're re-evaluating this *on every cron run* till the end of time? That's kinda bad.

We should really move all this stop stuff to next_run() function then we can return FALSE which means activated = 0 and be done with it.

I'm totally fine with a follow-up for that, though.

+++ b/simplenews_scheduler.moduleundefined
@@ -373,52 +417,37 @@ function simplenews_scheduler_calculate_edition_time($newsletter_parent_data, $n
+    // does this newsletter have something to evaluate to check running condition?
+    if (strlen($newsletter_parent_data->php_eval)) {
+      $eval_result = eval($newsletter_parent_data->php_eval);
+      if (!$eval_result) {
+        continue;
+      }

I also wouldn't be opposed to drop this php_eval stuff in favor of a hook. php_eval is so D6'ish ;) Again, something for later.

Berdir’s picture

Status: Needs work » Needs review

Didn't mean to set to needs work.

joachim’s picture

> We should really move all this stop stuff to next_run() function then we can return FALSE which means activated = 0 and be done with it.

Let's do that in this issue -- it affects the new version of the query in simplenews_scheduler_get_newsletters_due() and it's much more sensible!

> The advantage of DateInterval() would IMHO be that we can move the parsing of the interval out of the loop and only add it repeatedly. That's not possible with strtotime() because it's does parsing and adding in the same step.

Oh that's a very good point.

We should change simplenews_scheduler_calculate_edition_time() to use the same sort of logic -- and thus make #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters obsolete.

I also realized that we need to set next_run when a {simplenews_scheduler} record is saved. The tests probably fail because of that with this patch ;)

joachim’s picture

Status: Needs review » Needs work
joachim’s picture

> We should really move all this stop stuff to next_run() function then we can return FALSE which means activated = 0 and be done with it.

Let's do that in this issue -- it affects the new version of the query in simplenews_scheduler_get_newsletters_due() and it's much more sensible!

I've got working code for this in the bit that updates next_run.
However, it's going to make the hook_update_N() a whole load more complicated, as all existing schedulers need to be checked and deactivated, as the logic to test their stop_dates and edition_counts will be gone from simplenews_scheduler_get_newsletters_due().

Hence yes let's leave that for a follow-up: #1492394: check stop_date and edition_count when setting next_run and deactivate.

joachim’s picture

Status: Needs work » Needs review
FileSize
30.8 KB

Big patch with everything in it (I hope!)

- changed to use DateInterval in next run calculation
- changed to use DateInterval in edition time calculation -- makes #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters obsolete
- include tests from #1482128: simplenews_scheduler_calculate_edition_time() gives wrong times for monthly newsletters
- set next_run when saving a new schedule (a bit hacky as it's in the form submit handler, but that's another matter)
- includes tests from #1463228: add tests for scheduled times

All tests pass and the update function seems to do sane things :D

Berdir’s picture

Status: Needs review » Needs work

Awesome.

Both of use are currently in Denver, so I don't have much time to do a detailed review.

+++ b/simplenews_scheduler.module
@@ -76,6 +76,8 @@ function simplenews_scheduler_form_simplenews_node_tab_send_form_alter(&$form, &
+    $form['#simplenews_scheduler'] = $scheduler;
+

The common way to do this in Drupal 7 now is using something like $form_state['simplenews_scheduler']. And yes, we should really do that in Simplenews too ;)

Otherwise, this looks looks nice, really haven't looked at the tests at all though.

joachim’s picture

> The common way to do this in Drupal 7 now is using something like $form_state['simplenews_scheduler'].

*sigh* I get really confused about the changing ways to store things in forms. Loads of Drupal core still has $form['#node'] and the like!

The other thing that needs work is the comment here which is not true yet:

+++ b/simplenews_scheduler.module
@@ -330,18 +338,42 @@ function simplenews_scheduler_cron() {
+ * This should be called once a new edition has been created. This sets the
+ * next_run time and checks whether to disable a scheduler whose stop_time
+ * has passed.
Berdir’s picture

Drupal 7 only has $form['#node'] for backwards compatibility (no idea why). There is absolutely no need to introduce it for new code.

http://api.drupal.org/api/drupal/modules!node!node.pages.inc/function/no... :
// @todo D8: Remove. Modules should access the node using $form_state['node'].
$form['#node'] = $node;

joachim’s picture

Status: Needs work » Needs review
FileSize
30.76 KB

Updated patch with the two fixes mentioned.

joachim’s picture

Argh:

> The DateTime class
> (PHP 5 >= 5.2.0)

> The DateInterval class
> (PHP 5 >= 5.3.0)

Compare with http://drupal.org/requirements.

I think depending on 5.2 is fair enough, but 5.3 for D6 is still perceived as being an iffy combination due to patchy support in contrib (though one of my live sites runs on it fine).

So it looks like we'll need to roll back to strtotime for at least the D6 version of this patch, if not both.

Berdir’s picture

Ah yes, that's probably right.

PHP 5.2 is still used a lot, even though it's officially unsupported now.

dgtlmoon’s picture

Yes, funnily enough Acquia is still on 5.2 :)

joachim’s picture

Spotted a couple of typos in test messages and some lines of code that served no purpose, left over from my debugging.

joachim’s picture

Turns out that for PHP 5.2 support we need only a minimal change: we can still use a DateTime object as the pointer, and use http://www.php.net/manual/en/datetime.modify.php which takes a strtotime-type string.

  $offset_string = _simplenews_scheduler_make_time_offset($newsletter_parent_data->send_interval, 1);

  // And in the loop:
    // Add interval to the pointer time.
    $pointer_date->modify($offset_string);

So there's only two lines of difference: we don't make a DateInterval and we use the modify() instead of the add() method on the DateTime pointer.

joachim’s picture

Here's the patch for D6. All tests pass here too! :D

joachim’s picture

...and both those latest patches again.

Because you can't reliably use drupal_write_record() inside a hook_update_n(), as I found when I tried updating my site.

joachim’s picture

And a fix for #1496384: last_run gets clobbered when re-saving scheduler form (only affects D6), which is tangled up with the changes here so best to just fix here too as it was affecting next_run getting clobbered.

joachim’s picture

Another thing: if you go back to an existing scheduler and change the start date, what should happen to next_run?

Berdir’s picture

Same for changing anything on there, I guess (e.g. switching from daily to weekly). We need to recalculate next_run. Question is, when to recalculate and when not. We could simply define that saving the scheduler leads to a recalculation based on the start date.

Berdir’s picture

There is by the way also a second thing, that might be fixed by this patch: Summer/Winter time change. Not 100% if the patch fixes it, and adding test coverage for that is probably tricky as well. Or, actually not, we just nee to make sure that we force the timezone, then Drupal 7/PHP *should* respect, no?

joachim’s picture

Hmm... as far as I can tell, the current system is completely ignorant of timezones: everything is in timestamps. Follow-on, perhaps? (I'm getting really sick of this patch and I'd really like it to get in... :/)

miro_dietiker’s picture

Normally, internal times are calculated in UTC (which is timezone AND summer/winter time neutral).

Indeed in case of a summer/winter switch i'd expect that all hours shift by one...
Are you sure, berdir, that this is something that is / (might be) solved?

Now that you bring up this topic i think it should clearly defined and tested with a timezone fixed calculation sample...

Berdir’s picture

Yeah, but PHP's DateTime class is timezone *and* summer/winter time aware (given that it is *not* initialized with a unix timestamp, which is what we are currently doing, see http://ch2.php.net/manual/en/datetime.construct.php).

Have a look at the following code:

php > $date = new DateTime('2012-03-24 12:00');
php > echo $date->format('Y-m-d H:i');
2012-03-24 12:00
php > echo $date->getTimestamp();
1332586800
php > $date->modify('+1 day');
php > echo $date->getTimestamp();
1332669600
php > echo $date->format('Y-m-d H:i');
2012-03-25 12:00
php > print (1332669600 - 1332586800) / 3600;
23

So, adding one day on saturday before the time shift correctly shifts the date to the same time on the next day, even though the hour difference is 23, not 24.

So no, the current patch will not fix this yet, but if we initialize the DateTime() object with a datetime string and the configured site timezone, it should. Not 100% sure how exactly that should look like, though.

Berdir’s picture

Another note: The problem with updating an existing scheduler also exists with the current code because last_run is not reset and start_date is a separate condition. Example: I manually updated the scheduler start time yesterday to fix the time shift but that didn't help because it still used now - last_run (which was not cleared/updated when I re-saved the scheduler.

And I agree that this should go in asap. I'll test the patch including the update tomorrow.

Berdir’s picture

Ok, updated the patch and the tests to always work with Y-d-m H:i:s date strings to initalize DateTime() so that it correctly uses the timezone and associated things like summer/winter times.

The tests now depend on the currently configured timezone, but should pass always (e.g. the existing tests passed for joachim as he had set it to UTC but failed horribly for me).

It would be nice to additionally have a few tests, for example check that summer/winter time switch works correctly when using the german time zone. But we can do that later on and finally proceed here. Also attaching the interdiff to the previous patch. Note that this is only the 7.x version of the patch. Drupal 6 does not support time zones AFAIK, so I'm not sure how it will work there exactly.

joachim’s picture

> Drupal 6 does not support time zones AFAIK, so I'm not sure how it will work there exactly.

Installing Date API on D6 asks you to set up a site timezone, so it should work roughly the same.

joachim’s picture

joachim’s picture

Status: Needs review » Needs work

Sadly we still have a problem.

I backported your interdiff to D6, which works fine with just a change in the name of function.

However, tests for simplenews_scheduler_calculate_edition_time() then fail because once we pass April and summer time begins, the edition time is given as 13:00 rather than 12:00.

I did a lot of headscratching to try and figure out why this was failing on D6 but not on D7, which included testing whether D6's use of DateTime->modify() vs D7's DateTime->add() was the problem (it isn't).

And it turns out that I can get the tests on D7 to fail too -- just set the site timezone to something other than UTC on the real, actual site you're running tests on. (Which is wacky in itself -- tests are meant to run in a clean sandbox!!)

So we have a situation where during the summer, newsletters will be created an hour out. And for some reason I can't fathom, this is at 11:00 on D7 and 13:00 on D6.

At this point, I am tempted to commit these patches (I'll upload the modified D6 one shortly) and say that we have at least fixed the bug of monthly newsletters drifting by several days. Being one hour off during the summer (it goes back to normal in the autumn, I checked) is a pretty big improvement compared to what we had.

However, we should check what happens with daily and weekly newsletters. If these patches change those from being OK to being an hour out, then that's not ideal.

joachim’s picture

Here's the D6 patch brought up to date with the latest D7 one.

joachim’s picture

Oh I nearly forgot. Here's some code to run in the Exec PHP block on either D6 or D7 which showcases the problem:

$last_run = strtotime('01-Mar-12 12:00:00');
print $last_run . "\n";
print format_date($last_run) . "\n";
$now_time = strtotime('05-Apr-12 12:00:00');

$newsletter_parent_data = (object) array(
  'last_run' => $last_run,
  'send_interval' => 'month',
  'interval_frequency' => 1,
);
$edition_time = simplenews_scheduler_calculate_edition_time($newsletter_parent_data, $now_time);
print $edition_time . "\n";
print format_date($edition_time) . "\n";

We fake up newsletter data for a previous run on 1 March, and ask for the next edition time on 5 April, after the summer time change.

If the site timezone is UTC is works as expected. Set the timezone to Europe/London and it fails.

What's weird is the results are differently wrong on D6 and D7:

D7:
1330603200
Thu, 01/03/2012 - 12:00
1333278000
Sun, 01/04/2012 - 11:00

D6:
1330603200
Thu, 2012-03-01 13:00
1333278000
Sun, 2012-04-01 12:00

I wonder whether this is related to something I found with dates in Views ages ago -- something along the lines of times are interpreted according to the timezone's DST offset *now* rather than for the given time. I think this is it: #465148: Daylight saving time affecting the correct display of future events.

joachim’s picture

I've modified the above debug code to print the site timezone out first.

Here's the result on D6 with both UTC and London:

UTC
1330603200
Thu, 2012-03-01 12:00
1333281600
Sun, 2012-04-01 12:00

Europe/London
1330603200
Thu, 2012-03-01 13:00
1333278000
Sun, 2012-04-01 12:00

There is obviously some sort of a bug in either Drupal core or Date module here. The same timestamp, 1330603200, can't be two different times on 3 March in UTC and London, as on 3 March UTC and London are in sync!

on D7:

Europe/London
1330603200
Thu, 01/03/2012 - 12:00
1333278000
Sun, 01/04/2012 - 11:00

UTC
1330603200
Thu, 01/03/2012 - 12:00
1333281600
Sun, 01/04/2012 - 12:00

Here the problem looks more like one of output.
UTC 11:00 is London 12:00. So the second timestamp in the London output is actually correct, just that the printed human-readable date is being printed as UTC, not London.

So on D7 the problem is in fact with how we're using format_date: changing the calls to this works:

print format_date($edition_time, 'medium', '', variable_get('date_default_timezone', NULL)) . "\n";

At least, I *think*...

joachim’s picture

The culprit, at least on D7 is our use of format_date(). Using a timezone that's further away from UTC helps to see this.

    $start_datetime = new DateTime('01-Mar-2012 12:00:00', date_default_timezone_object(FALSE));   
    debug($start_datetime->format(DATE_ATOM));

// '2012-03-01T12:00:00+02:00'
// It's 12 noon in our timezone.

Adding one month to this and passing that to format_date() gives: 2012-04-01T09:00:00+00:00
Which is the right time, BUT in UTC.

If we use DateTime::format(), we get: 2012-04-01T12:00:00+03:00. Which is 12 noon in the current timezone, which is now on DST.

So I tentatively think the problem is in some of the assertions.

(Though of course I'm still not sure how format_date() picks up on the site timezone!)

Oh rather -- the problem is that format_date() *DOESN'T* pick up on the site timezone, but uses the Simpletest site timezone it's given. Meanwhile, the DateTime objects ARE using the real site timezone, even though they shouldn't.

joachim’s picture

ok the problem is that date_default_timezone() and friends rely on variable_get(), which within a DrupalUnitTestCase is grabbing settings from the live site (!!).

So switching to DrupalWebTestCase will make our tests run a whole lot slower, but should fix the problems. I think! It's Friday afternoon and I think I'll leave this next week :)

Berdir’s picture

Strange. Your example snippet works perfectly fine here on my laptop, with Europe/Zurich, Europe/London and UTC. Are you configuring the timezone in Drupal?

Re the timezone function. Yeah, you're right, the function exists in D6 as well. However, from what I'm seeing now is that using it wouldn't be necessary at all in D7. Drupal 7 forces the default timezone by calling http://ch2.php.net/manual/en/function.date-default-timezone-set.php during bootstrap. Maybe your configured time zones aren't in sync or something?

Berdir’s picture

Interestingly, the tests seem to completely fail on my PC at home, I'll try to investigate once I'm back home.

Overall, I do agree that this is already much better than what we had before and we should commit it asap. Maybe do so and open a follow-up for the timezone stuff? What we could do is force the timezone for the monthly tests to UTC (or anything else that is fixed and known to work) and then add some specific tests with a given timezone, basically something like your code snippet?

No idea about D6 but I think at least the D7 tests can be made to work consistently if we add a fixed time zone configuration in setUp().

Berdir’s picture

Status: Needs work » Needs review
FileSize
5.08 KB
30.19 KB

Yeah, removing the explicit timezone definition actually fixes it for me here on my PC. I guess the reason is that, whatever time zone it uses, it uses the same all the time when we don't specify it explicitly, which means that the calculation and tests are consistent.

I have no idea why but for some reason, my PHP timezone here was set to Europe/Kalinigrad, even though the one in Drupal was set to something different and then it got confused.

Again with interdiff but the actual changes are trivial, I just removed some commented out debug calls and dead code that was not used while on it.

Berdir’s picture

Suggestion: If you can confirm this fixes things for you as well, maybe commit the 7.x-1.x patch and then move the issue to 6.x-1.x to check if we can figure it out there as well? Or commit the 6.x-1.x patch as well and open a follow-up to add better test coverage for it including bugs we find while doing that?

joachim’s picture

> However, from what I'm seeing now is that using it wouldn't be necessary at all in D7. Drupal 7 forces the default timezone by calling http://ch2.php.net/manual/en/function.date-default-timezone-set.php during bootstrap. Maybe your configured time zones aren't in sync or something?

The call is in drupal_session_initialize(); in bootstrap it's for cached pages.
And yes, it looks like what happened is that my user account timezone was UTC, because it wasn't until last week that I set the site up to have a different timezone.

Because cron runs as anonymous, we should be ok to just count on the effects of date_default_timezone_set(). But if there any instances where something that counts on the time is run as UID 1 then we may have trouble, and I think it's very easy with the default settings to have UID 1 on a different timezone.

Tests all pass on D7 with your patch. They also pass with my changes, which are:

- change all test classes to subclass DrupalWebTestCase, as we rely on variable_get() indirectly in all the tests
- set the site timezone to Europe/Kiev (enough hours ahead of UTC to not be mistaken for it).

However... I can make the SimpleNewsSchedulerNodeCreationTest test fail if I set my user timezone to UTC while the site is set to Kiev (and the tests run as Kiev). The verbose output shows that instead of the expected:

"Submitted by t9FgxIyW on Wed, 04/11/2012 - 10:49" found

the node shows:

Submitted by t9FgxIyW on Wed, 04/11/2012 - 13:49

So that's some timezone fun -- which I think we should leave to another patch.

Patch of my changes coming once I've cleaned up the tests I've added.

joachim’s picture

Here are my patches for D7 and D6.

I've not applied all of the interdiff from #77 as I really want to keep in the debug() statements, even if commented out. At least until we finish nailing down the various timezone and time problems in general.

In addition:

- Added common test subclass. I've moved dependent modules and setting a dummy timezone to a common base class.
- Added DST test class. This tests the edition time is the same before and after Kiev changes to DST. At least I think it does -- after the DST change the edition time is given as 12:00 at UTC+3.

joachim’s picture

Working my way down all the follow-ups we've mentioned need doing so we don't forget any:

- (db_select stuff and PostGres concerns are now obsolete)
- How about an option to choose if missed intervals are post-executed or not?
- don't check $stop_edition till the end of time. Newsletters about to go past their stop count should be disabled in simplenews_scheduler_scheduler_update(): http://drupal.org/node/1492394.
- drop this php_eval stuff in favor of a hook. php_eval is so D6'ish ;)
- SimpleNewsSchedulerNodeCreationTest test fails if I set my user timezone to UTC: this is because of a strtotime() in the test I think.
- generally expand tests to cover cases other than monthly interval.

joachim’s picture

*sigh*

All tests pas on D7, but on D6 we seem to have more of this bleed from the real site:

    $timezone_name = date_default_timezone_name();
    debug($timezone_name); <--- says Kiev

    // Create a last run time before DST begins, and a now time after.
    // Use date_create() rather than strtotime so that we create a date at the
    // given time *in the current timezone* rather than UTC.
    $last_run_date = new DateTime("01-Mar-12 12:00:00");
    $now_date = date_create("05-Apr-12 12:00:00");

    debug('last run date TZ: ' . $last_run_date->getTimezone()->getName()); <-- says the timezone the real site is set to
joachim’s picture

.... which seems to be due to D6 core not using date_default_timezone_set() to set the default timezone for PHP to use. The ramifications of either babysitting all DateTime-type calls with a timezone object, or calling date_default_timezone_set() ourselves in hook_init() really warrant their own issue.

Hence here's the patch for D6 with a couple of changes: it turns out setUp() on D6 only takes a list, not an array. More details in the code comments.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Yes, let's get this in!

It's possible that we introduced a bug somewhere, but it's way better than what we currently have and there's a lot of test coverage that we can extend if we find bugs.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 1364784.D6.83.simplenews_scheduler.next-run.patch, failed testing.

Berdir’s picture

Hi testbot! Nice to see you :)

Looks like we have a single test failure in the existing tests, isn't this something you mentioned above as well?

If we can't easily fix this, I'd be fine with commenting out that line in the test and open a follow-up to deal with it. But there might be a reason for that error and it might be relatively easy to fix.

Berdir’s picture

Status: Needs work » Reviewed & tested by the community

Looks like we already have a follow-up issue for that.

So, as said in IRC, comment out the broken assertion, add a @todo and commit that thing :)

joachim’s picture

Status: Reviewed & tested by the community » Fixed

Yup, that issue is #1528586: node creation time test fails due to timezone.

Committed the patch at #80 with the failing test commented out and marked for follow-up:

- #1364784 by joachim, Berdir: Added a 'next_run' timestamp to scheduler data; fixed monthly intervals varying.

joachim’s picture

Version: 7.x-1.x-dev » 6.x-2.x-dev
Status: Fixed » Needs review

Now let's get the testbot on the D6 patch.

Berdir’s picture

joachim’s picture

Berdir -- you mentioned on IRC that the D6 tests fail because of PHP errors that only come up if Drupal core is running off a git checkout? I've switched my local environment to that and I'm not seeing any extra errors.

joachim’s picture

Figured out the timezone problem -- explanation over at #1528598: timezone problems in tests.

Here's a D6 patch. Tests all pass!!!!!!!! :D

joachim’s picture

Status: Needs review » Fixed

Phew.

joachim’s picture

Automatically closed -- issue fixed for 2 weeks with no activity.