distinguish between "not subscribed" and "unsubscribed"

joachim - July 23, 2009 - 16:13
Project:Simplenews
Version:HEAD
Component:Code
Category:feature request
Priority:normal
Assigned:Unassigned
Status:active
Description

Simplenews_roles module has a problem: http://drupal.org/node/331080#comment-1844330
Briefly, users can unsubscribe from a newsletter, but cron runs and other events just add their subscription right back!

I am not sure this can be fixed without a change to the architecture of simplenews: there is currently no way that I can tell for the system to know if a user has never subscribed, or has chosen to opt out.
Several approaches spring to mind:
- add a column to simplenews_snid_tid with a 0 for normal and -1 for 'has chosen to unsubscribe'
- add a new table for opt-outs -- basically the same as simplenews_snid_tid but storing user-initiated unsubscriptions.

However we store this, this could then be checked against before any non-user initiated subscription:
eg, if an admin goes to subscribe a user, ask "This user chose to not receive this newsletter -- are you sure you want to re-subscribe them?".

#1

Sutharsan - July 24, 2009 - 09:31
Version:6.x-1.0-rc6» HEAD

I have investigated this same situation for a legal case. My proposal is to store an subscription and unsubscription time-stamp aditionally a short desrciption of the source (the source of (un)subscription i.e. mass subscribe, website, webservice) can be stored. The most important change to the current api is that a newsletter subscription (simplenews_snid_tid) is no longer deleted upon unsubscription. The solution should be in simplenews core, not in a new module and the simplenews_snid_tid table is the right place to store the data.

I'm coding for this and will post some soon, but help is appreciated.

#2

joachim - July 24, 2009 - 10:22

Cool :)

Do you have a schema yet for the expanded simplenews_snid_tid table?
Also, would it be a list of events, or status?
In other words, when a user unsubscribes, do you keep the old record and add a new one, or update the record for that snid/tid combination?

How about
snid
tid
1/0 for subscribe/unsubscribe
source
timestamp

I need simplenew_roles to behave properly for a project with a fairly short deadline, so let me know how I can help -- what code needs to be written or tested :)

#3

Sutharsan - July 24, 2009 - 13:05

I have this schema:

$schema['simplenews_snid_tid'] = array(
    'description' => t('Newsletter series subscription data.'),
    'fields' => array(
       'snid' => array(
        'description' => 'The {simplenews_subscriptions}.snid who is subscribed.',
        'type' => 'int',
        'not null' => TRUE,
        'default' => 0,
      ),
      'tid' => array(
        'description' => 'The newsletter series ({term_data}.tid) the subscriber is subscribed to.',
        'type' => 'int',
        'not null' => TRUE,
        'default' => 0,
      ),
      'subscribe_timestamp' => array(
        'description' => 'The time the email is subscribed.',
        'type' => 'int',
        'unsigned' => TRUE,
        'not null' => TRUE,
        'default' => 0,
      ),
      'subscribe_source' => array(
        'description' => 'The source of subscription.',
        'type' => 'varchar',
        'length' => 24,
        'not null' => TRUE,
        'default' => '',
      ),
      'unsubscribe_timestamp' => array(
        'description' => 'The time the email is unsubscribed.',
        'type' => 'int',
        'unsigned' => TRUE,
        'not null' => TRUE,
        'default' => 0,
      ),
      'unsubscribe_source' => array(
        'description' => 'The source of unsubscription.',
        'type' => 'varchar',
        'length' => 24,
        'not null' => TRUE,
        'default' => '',
      ),
    ),
    'primary key' => array('snid', 'tid'),
  );

It is stored as status, not event. I don't see a use case for a subscription history.

You idea of a subscription state (1/0) is interesting. Let me think of it (or convince me ;) ).

#4

joachim - July 24, 2009 - 13:35

I agree that we don't need subscription history. At least I don't see a case yet.

As for subscription state: it saves on columns. You currently have two source and timestamp columns. If this is a state, then you're never populating both -- when you unsub, you clear the sub details, I presume?
Also, it gives a quick way to find what we need. Otherwise, subs are 'those with empty unsub timestamps' and unsubs are 'those with filled unsub timestamps' or 'those with empty sub timestamps' (the latter only if we're sure that gets cleared!)

#5

Sutharsan - July 27, 2009 - 13:36
Status:active» needs review

The attached patch add the ability to distinguish. The API should work, but can use some testing. The user interfaces still has some TODO items.

AttachmentSize
simplenews.528808-0.patch 25.26 KB

#6

joachim - July 27, 2009 - 16:24

- Subscribing via 'My account' produced 'unknown' source.

You're missing the language parameter in most of the calls to this I can see, so the source param is going to the wrong place.
I'm rearranging the parameters so source (which is always used) comes before language. That way it matches simplenews_unsubscribe_user().

- Unsubscribing via 'My account' produced 'Website' source -- I'd use all lowercase for this column

+            simplenews_subscribe_user($account->mail, $tid, FALSE, t('Website'));

It doesn't make sense to store translate stored data: things might break if site language changes?! -- better to translate it on output (if there is any UI output).

I'm attaching a new patch with the above changes.

Some subsequent things to consider:

I would define the values the core simplenews module will use in the function comment, and be quite disciplined about what values we use.
Other modules can of course send their own values -- up to them.
So far we have:
- website: something done through the website UI.
- mass subscribe: mass admin UI
- mass unsubscribe: mass admin UI
- action: Drupal actions

Shouldn't we have an 'email' source for unsubscribing?
I can't tell if simplenews_subscription_manager_form_submit() only covers users who have arrived from email.

Do we want to distinguish between an action the user takes themselves and the admin changing this? -- ie, if an admin edits another user's account. Is that possible and desirable?
I'm thinking of the following use cases:
- simplenews_roles needs to know that the user chose to remove themselves
- an admin doing a mass subscribe or user account edit should be warned that some emails are currently unsubscribed.

AttachmentSize
simplenews.528808-1.patch 25.26 KB

#7

Sutharsan - July 30, 2009 - 19:54

Thanks for your patch, your remarks and changes are very usefull. In the attached patch I added the source description to the comment.

I see no need to add an 'email' as source of (un) subscription. The email is just a means to confirm subscription via the website. I also don't want to distinguish between the different website forms for (un)subscription. For actions and mass (un)subscription we can consider to make the source configurable. Since actions can be used in may different ways, the source 'action' is probably not very usefull. But for now it is a start. Also Mass (un)subscription can be extended with a configuration field for the admin to enter a short message like 'conversion' or 'conference 20090506' or 'bounced'.

Mass subscription should ignore existing unsubscribed users. Perhaps they should be listed in a message.

Attached patch is committed to aid your Simplenews Roles progress.

AttachmentSize
simplenews.528808-2.patch 24.19 KB

#8

Sutharsan - July 30, 2009 - 19:58
Status:needs review» fixed

#9

joachim - July 31, 2009 - 15:45

Awesome :)
I'll get to work on how simplenews roles can best use this.

#10

System Message - August 14, 2009 - 15:50
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

#11

joachim - August 16, 2009 - 10:04

I see this is only in HEAD -- is your plan to have this in a 2.0 release?
I'm about to make SN roles dependent on this.

#12

Sutharsan - August 17, 2009 - 06:24

Yes, planned for 2.0

#13

joachim - August 17, 2009 - 07:33

No chance of it going into 1.0 then? The whole of simplenews roles and possibly other modules too in the pipeline too (see http://groups.drupal.org/eventrelated-drupal-code) will need this.

Also, I wonder whether we should rethink our data storage here.
I am changing simplenews roles to use the current system, and getting a query of "people who either are status 1 or status NULL" is a bit ugly. Seems you can't say "status != 0" without also excluding the NULLs.

#14

joachim - August 17, 2009 - 14:31
Status:closed» active

Hmm yeah, this is proving really tricky to work with. Some horrible queries.
See #331080: unsubscribe reverts back after cron run
I'm afraid I think we need to rethink this.

#15

chris.cohen - August 17, 2009 - 14:47
Status:active» closed

I don't believe status should be allowed to be NULL, due to joachim's reasoning that "WHERE status IS NULL OR status = 1" is inefficient. It would perhaps be better to set 0 as a default for status, and use values 1 and 2 for other statuses, meaning you could use "WHERE status <= 1" or similar.

Joachim's original post also mentions an explicit optout table whereby if a user opts out of receiving a newsletter, a row would be added to the table. I believe this is structurally the simplest solution, but raises additional requirements at code level, and possibly an extra join in the database to encompass this new table.

The optout table could have just two columns: the snid and the tid (unless timestamps were required or something). It would also be possible to consider tid == 0 as a 'global' optout (ie never send me ANYTHING).

In this situation, the module would send to a user if the user was in the simplenews_snid_tid table, and NOT in the simplenews_optout (or whatever name) table.

#16

chris.cohen - August 17, 2009 - 14:47
Status:closed» active

Ah damn somehow I managed to close this with my last comment. Sorry, setting it back to active.

#17

joachim - August 17, 2009 - 14:50

Ah, chris, I don't think I explained properly on IRC:

status can be NULL simply if SN knows nothing about the user -- you get a NULL from the LEFT JOIN.
the actual data is always 0/1.

#18

joachim - August 18, 2009 - 15:39

I've got something that seems to work, but it has to use PHP rather than only queries -- get the mails to affect with a query, then subscribe/unsub them with the SN API function.
I am concerned about its scalability.

Called by simplenews_roles_cron when required:

  /**
   * step 1:
   * remove users who are not in the role(s)
  */
  $query_1 = "
    SELECT sns.mail
    FROM
      {simplenews_snid_tid} snst
    JOIN
      {simplenews_subscriptions} sns
      ON snst.snid = sns.snid
    JOIN
      {users} u
      ON sns.uid = u.uid
    LEFT JOIN /* LEFT JOIN because we want the NULLs */
      {users_roles} ur
      ON
        u.uid = ur.uid
      AND
        ur.rid IN (%s) /* $rids_string */
    WHERE
      snst.tid = %d   /* $tid */
    AND   
      ur.uid IS NULL
    ";
     
  $mails_1 = db_result(db_query($query_1, $rids_string, $tid));
  foreach ((array)$mails_1 as $mail) {
    // Call SN API to unsub users.
    simplenews_unsubscribe_user($mail, $tid, FALSE, 'simplenews_roles');
  }
 
  /**
   * step 2:
   * add users who are in the role(s), and not subscribed
   * but only if the prior source of unsubscription is ourselves
   * In other words, if the source of their unsub is 'website', say
   * we don't touch it.
   */
  $query_2 = "
    SELECT u.mail
    FROM
      {users} u
    JOIN
      {users_roles} ur
      ON
        u.uid = ur.uid
      AND
        ur.rid IN (%s) /* $rids_string */
    LEFT JOIN /* LEFT JOIN because we want the NULLs */
    (
      {simplenews_snid_tid} snst
    JOIN
      {simplenews_subscriptions} sns
      ON
        snst.snid = sns.snid
      AND
        snst.tid = %d   /* $tid */
    )
    ON
      u.uid = sns.uid
    WHERE
      sns.snid IS NULL /* those that don't exist at all for this tid or for SN as a whole */
      OR
      snst.source = 'simplenews_roles' /* those we unsubbed ourselves previously */
  ";

  $mails_2 = db_result(db_query($query_2, $rids_string, $tid));
  foreach ((array)$mails_2 as $mail) {
    // Call SN API to sub users.
    simplenews_subscribe_user($mail, $tid, FALSE, 'simplenews_roles');
  }

#19

Sutharsan - October 24, 2009 - 15:42
Status:active» postponed (maintainer needs more info)

I don't see how a different data structure can make live easier for other modules and keep it simple for Simplenews. Please explain.

#20

joachim - October 24, 2009 - 17:02

The thing is, I'm not sure what data structure would be better.
I'm fairly convinced by those queries that the current one could be improved.

The more I think about things that could build around simplenews, the more I see the possibility of this kind of problem.

Example: suppose we want a module that targets customers in an online store. We could have a glue module between Simplenews and Ubercart, which sends a particular newsletter to people who have bought a particular product. But this involves the same sort of syncing as simplenews_roles, with the same complex queries and scalability.

So my feeling is that the data storage we thrashed out earlier in this issue isn't working out in actual application :(

So I'm thinking:
1. work towards something more like #536620: Subscriber API: separate subscribers from newsletter management and allow other modules to define subscribers.
2. the data storage should perhaps be two tables, one of subscriptions, one of unsubscriptions.
A hypotherical Ubercart module would cause no data to be put in subscriptions, only in unsubscriptions -- as users choose to opt out.
To get users to send to, you'd do a join with your list of users with orders and the unsubscriptions list and consider only the ones with a NULL on the right hand side of the join.
3. there's also the matter of global opt-outs, which I know is a feature on the todo list. How best should this be factored in to have sane and manageable queries? I'm thinking use unsubscriptions with a tid of 0 but I'll have to go and think about what this would mean in practice.

#21

Sutharsan - October 24, 2009 - 19:53
Status:postponed (maintainer needs more info)» active

The problems I see with two table for subscription and for unsubscription is that the data is contradicting (the same record in both subscription and unsubscription table). An API should be able to handle this. And if the API is easy to use there is no reason to bypass it.

However building the API takes more time than I have and the time I have I like to dedicate to moving Simplenews forward. Therefor I would like us to decide on a data structure. Simplenews and subscription API development can than be two parallel tracks. This discussion is very usefull for this route.

About global opt out: If build in a separate table this requires an extra join for what I expect a small percentage of the use cases. Using a tid = 0 could be usefull option, this only requires an additional OR in the query.

 
 

Drupal is a registered trademark of Dries Buytaert.