Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
To effectively send out digests we should be using queue workers. This patch should aim to add the base for this queue and worker. In an effort to keep patches small and contained, this should not include much if any of rendering / mailing the digests. Just queuing digests at the correct times.
Proposed resolution
Add a digest queue and worker. In the cron hook check which digests need to be sent and queue up these items to be processed.
Comment | File | Size | Author |
---|---|---|---|
#17 | interdiff-13-16.txt | 14.22 KB | DerekCresswell |
#17 | 3162041-16.patch | 22.79 KB | DerekCresswell |
Comments
Comment #2
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedI need to write out my thoughts here as jumping into this I feel there is going to end up being too much going on at once. Here is the plan I hope to follow and I feel it will need a few separate issues to break this into in order to keep patches small and code testable.
* On a cron run we need to check each active digest to see if it is due and add it to the queue if it is.
* To check if it is due we need to know the last time we ran it.
* Since the digests are config entities we could store the last run with
\Drupal::state()
.I set out with TDD in mind and I ran into a few spots where I began to foresee creating too much code for just one issue.
To minimise the amount of code within the
.module
and make sure everything is split into testable chunks I created a class calledDigests
very similar to theViews
class.This contained a few static helper functions such as
getEnabledDigests
. This kept that code small and testable. It also kept code out of the.module
file.Then I wanted to take the checking of if a digest should run out of the cron function so I added a
shouldSend
function to the digests. This is where I began to feel the need for splitting this issue.Sadly, the default functionality of the cron expression library from what I could see runs on too small of a granularity for digests to be sent out effectively.
To see if a digest should be sent I needed to compare the last time the digest should have been sent (according to cron-expression) and the actual time it was sent out. This way I could see if the digest could be sent like this :
This works fine, but in order to have this work properly I needed to know the last time a digest was sent out. So I went to set up the state for holding the last run time. This would be fine, simply sent it when the digest is being sent.
The next problem came here, what to do if the digest has just been created and hasn't been sent yet? A naive solution would be if the last run state hadn't been set yet just assume it is 'now'. The problem with that is if you made a digest that should send out on monday and created it on a tuesday, the next cron would send this digest.
The idea I came up with would be to set this last run date to the NEXT scheduled run date when creating a digest. This would mean a new digest doesn't send until a reasonable time after it should be, rather than randomly the first time.
To achieve that we would need to create a custom digest storage to set up this state variable. This definitely feels like it should be separated to keep everything small, review able, and testable. This would have to be done in a way to avoid breaking other tests (where digests are created without a send_at) which would likely boil down to use the current time if no send_at is set. Or it could mean creating a
CreateTestDigestTrait
which would spit out a valid / random digest. I might lean to this second option since in reality, a digest should not be created without a send_at.TLDR:
Split this into two / three issues
* Create
Digests
static class for digest helper functions. Moves code out of .module and improves dev experience.* Create
DigestStorage
to help with storing last time digests were sent out* Perhaps this requires a test base or test trait be made to help with creating digests. ( this would be the / three)
* Finally, come back here and wire this all up to add digests to a queue.
This might sound like a big pile of gobbly guck, I will discuss it in person next week as I think it will come out better in person.
Comment #3
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedHere is a progress report more or less. This patch does not include the queue worker, just set up for it.
This is utilising a static
Digests
class to provide some helper functions for use in places like the .module. The Digest entity was also expanded to include some functions for detecting when to send it out.The option was brought up to move the functionality of the Digests class into a
DigestStorage
class. Accomplishes the same, just depends what we want to use.Unless this gets much too large for a single issue, it will stay here to keep context.
Comment #4
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedDigests::queueDigests
function (and tests).Digest->getSubscribedUsers
function (and tests).This should be ready for scrutiny now. The functionality of actually sending the digest will come in the next issue, this is just for queuing.
Comment #5
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedWhoops, the previous patch snuck itself in.
Comment #6
joshmiller<3
I wonder if our config of the field value can simply call this function?
This is more than a wrapper. It provides a single point to understand all digests in the system. I think you could add a sentence or two that would help others know more about how important this class has become.
Why is this static? If it wasn't static, you could do dependency injection.
Also, every other function with one line has extra lines above and below. It doesn't matter, as long as you stay consistent.
There is technical debt here. Consider adding a @todo to consider optimization strategies for when the number of digests passes thousands.
This static thing feels hacky.
:thinking: Hmmm ... if we store a state value on the config entity ... this would potentially mean exporting config would nearly always be changing. I believe this is not a good place to store this value.
This value might be able to be calculated by looking at the queue for the latest sent.
This is known as an implicit feature. Be wary of implicit, these undocumented and highly presumptive features can be a real headache for end-users. It's often far better to make things explicit by making certain features boolean with a checkbox. In this case, I think a note in the form about what a blank value does would be "good enough."
This could be a problem with 10k+ records (not unheard of). I would, at the very least, note in the README.md and on the module page that "This module was designed with a reasonably small number of users and digests in mind. Consider sponsoring batching operation additions if you need to scale past thousands."
Comment #7
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedI will create a patch that adds this functionality to a "DigestStorage" class instead as that was a possibility and we can judge them side by side. I can foresee that making the module functions less like hooks to a slight degree, but not much to worry about.
Comment #8
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedComparison between Digests and DigestStorage
Note, these are minimums. Some functions needed for the storage method will simply be added later for the static method. Also I identified some slight improvements so the storage has that benefit. Most suggestions from #6 were left out until a method is chosen.
Digests :
DigestStorage :
Pretty comparable by this metric.
I do feel that something like queuing the digests does not belong on the storage class. It simply feels out of place.
Comment #9
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedComment #11
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedFixed .module file for the storage patch.
Comment #12
joshmillerComment #13
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedWe are giving this a go with the Digest class but instead as the DigestManager. This avoids naming confusions and will allow us to use dependency injection.
Comment #14
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedI think there's two problems:
Drupal already has a static cache of entities. I don't think you should also cache them, since your references won't be cleared when the normal entity cache is reset.
I think you should also have it do something, even just adjusting the datetime in state so that the tests can have some basic coverage of processing the items, not just checking that they're in the queue.
And a few nitpicks:
The current time is now; you can't change it. Maybe call the parameter `$time` instead.
getLastSentAt returns something in the future if it hasn't ever run, but that's
a lie. Why not return null? Is that so new digests wait for their schedule to come around instead of sending
immediately?
Would a digest ever send? Ie, wouldn't getNextRunDate() always give something in the future?
What do you think about a digest's next run date being in state, so anything at or past its next run date runs? Then a newly created/updated digest could have its next run in state and you won't have this situation with a null run date.
Thousands of digests might be unlikely, but thousands of users isn't. I won't complain about leaving it as is since I don't think we have a use case for that yet, but an easy fix might be to return user IDs instead of loaded users, since this is just used to record IDs of users to send to anyway.
The time to set what to?
Comment #15
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedFor processing the digest, I am unsure where the best spot to update the run date would be. If we consider the queue reliable, we can sent the next date after queuing up the digest. The queueworker does not check if things should be sent, it merely sends brainlessly.
Perhaps this could mean shifting the name from last send to last queued.
Or we could update it each time the queueworker runs. This would mean it is constantly updating to match the last individual email that was sent out.
The reason I had the last send value be set to the next send date was to eliminate the problem of creating a digest on Tuesday that is scheduled for Monday and having it send out on Tuesday the first time. With the date in the future it allowed for the digest to wait until the next scheduled date instead of just running ASAP the first time.
I think the next run date might be the smoother option, that way we do not need to lie.
Comment #16
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedAhh, that makes sense. Just saving the time stuff was queued seems reasonable for that. In that case, we can probably wait for the actual email functionality to test that stuff worked correctly because then you can test that emails were sent.
Comment #17
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commentedI've updated this to use the next run date rather than the last. I do believe this has simplified the logic, but there still needs to be methods in place to catch things like the first time the state is accessed. I think what is here is better than something like setting one when a digest is saved as new.
I have not furthered the Queueworker tests here as that will come in the next major issue. I have added that the queuing of digests updates their send dates. I believe this makes sense as I've also made sure the digest queue is reliable. It would make sense to base this time off the queuing rather than the final sent email.
For setting the send date, I've made it so the time given is relative rather then what to set it to. This keeps the schedule on rails and makes it easy to update.
Other mentions in #14 were addressed.
Comment #18
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedLooks good to me.
Comment #20
DerekCresswell CreditAttribution: DerekCresswell as a volunteer and at Acro Commerce commented