Just wanna paste my findings from IRC for later reference:

webchick: related, but not similar, but maybe worth a look: http://drupal.org/project/kiosk, http://drupal.org/project/roadblock

webchick: also related: http://drupal.org/project/phpids, http://drupal.org/project/track_host, http://drupal.org/project/statistics_advanced

webchick: anyway, 1) take the cookie management from roadblock or whatever 2) set up a 'visitor' table with vid, cookieid (sessid) 3) attach that info to the $user object in _boot() or _init(), and 4) provide hook_visitor_load() to let modules like BrowsCap attach additional data, e.g. browser information

http://drupal.org/project/session_restore

Comment  File            Size      Author
#15      visitor.zip     14.87 KB  jvandyk
#15      324743jv.patch  5.59 KB   jvandyk

Comments

sun commented:

Note: $user->vid was mentioned somewhere for storing a unique visitor id, "vid" meaning "visitorid". However, if we'll ever have proper user profiles in core, which can be updated, translated, and enhanced with fields like we do with nodes today, $user->vid would refer to a user's or user profile's revision id, just like $node->vid.

@webchick: Call for another namespace action! ;)

webchick commented:

I think the 'visitor' namespace is fine, just call it visitor_id, rather than vid. :)

sun commented:

Assigned: Unassigned » sun

I'll try to scan those modules and come up with a feature list (proposal) for Visitor.

jhedstrom commented:

Assigned: sun » Unassigned

subscribing

jhedstrom commented:

Also, I'd be happy to abandon session api or merge it in. The direction I envisioned for that module was to provide a hook so that, for example, the session favorites module could offload all that data to views_bookmarks or flags when a 'visitor' became a user.

eaton commented:

OK, here's how I was thinking about it:

  • Visitor module ALWAYS adds $user->visitor_data to the global $user. Period. Doesn't matter whether they're anon or what.
  • Visitor module, in its visitors table, stores the visitor_id, the user_id (optional), and cookie.
  • When a page is requested, Drupal handles its user account loading the way it usually does. If the user is anonymous, it checks for a browser cookie that matches a visitor record. If the user is logged in, it checks for a visitor record for the current user.
  • I don't know what else, but those seem important.
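The lookup described in the list above could be sketched roughly as follows. This is plain PHP with an in-memory array standing in for the visitors table; the function name sketch_visitor_lookup() and the sample records are hypothetical, not code from any of the modules mentioned in this issue.

```php
<?php
// Hypothetical in-memory stand-in for the visitors table.
$visitors = [
  ['visitor_id' => 1, 'uid' => 0,  'cookie' => 'abc123'],
  ['visitor_id' => 2, 'uid' => 42, 'cookie' => 'def456'],
];

/**
 * Finds the visitor record for the current request, or NULL if none exists.
 *
 * Assumption: uid 0 means "anonymous", as in Drupal's users table.
 */
function sketch_visitor_lookup(array $visitors, int $uid, ?string $cookie): ?array {
  foreach ($visitors as $record) {
    // Logged-in users are matched on their account.
    if ($uid > 0 && $record['uid'] === $uid) {
      return $record;
    }
    // Anonymous users are matched on the browser cookie.
    if ($uid === 0 && $cookie !== null && $record['cookie'] === $cookie) {
      return $record;
    }
  }
  return null;
}
```

A caller would attach the returned record to $user->visitor_data, or create a new record when the lookup returns NULL.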

We might also want some sort of sessions-table-esque logging of the most recent access for the visitor. That would let us do auto-purging of, say, anything older than 6 months in the visitors table to avoid hellacious bloat.

The key payoff is to allow modules that want anonymous voting/rating/commenting/whatever to store things keyed on visitor id, and let the visitor api keep track of the complicated bits. There's no "transferring" of preferences to a user account when someone registers -- the stuff would continue to be associated with a single 'visitor', but that visitor would then be tied to an explicit account. Make sense?

nonsie commented:

subscribing

sun commented:

Visitor

Synopsis

Identify unique visitors of a Drupal site based on a browser cookie to allow third party modules to associate data with them, regardless of whether the visitor is authenticated or not. Map visitors to user accounts, and account for user account changes. Maintain visitor storage.

Use-cases

  • Track or limit actions for anonymous users, such as voting or referral tracking
  • Detect false/duplicate user accounts
  • Move or restore data between anonymous/authenticated user sessions
  • Advertisements
  • "Remember me"/persistent logins
  • Switch a site's theme

Visitor object

Each unique visitor record has the following properties:

  • vid | visitor_id: Internal, serial id; assigned by Visitor.
  • cookie | visitor_cookie: Value of the cookie set by Visitor; generated and assigned by Visitor.
  • uid: (optional) The corresponding user id of a visitor; maintained by Visitor.
  • timestamp: A UNIX timestamp holding the creation/modification date; updated by Visitor.
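As a rough illustration, the properties above might map onto a Drupal Schema API definition like the one below. The function name sketch_visitor_schema(), the table name, the column types, and the index choices are all assumptions made for this sketch; a real module would return this array from its hook_schema() implementation.

```php
<?php
/**
 * Hypothetical schema for the visitor table, in Drupal Schema API format.
 */
function sketch_visitor_schema(): array {
  $schema['visitor'] = [
    'description' => 'Maps browser cookies to unique visitor ids.',
    'fields' => [
      // Internal serial id, assigned by Visitor.
      'visitor_id' => ['type' => 'serial', 'unsigned' => true, 'not null' => true],
      // Cookie value; 32 characters assumes an md5() hash as in the proposal.
      'cookie' => ['type' => 'varchar', 'length' => 32, 'not null' => true, 'default' => ''],
      // Optional user id; 0 is assumed to mean "not mapped to an account".
      'uid' => ['type' => 'int', 'unsigned' => true, 'not null' => true, 'default' => 0],
      // UNIX timestamp of creation/modification.
      'timestamp' => ['type' => 'int', 'not null' => true, 'default' => 0],
    ],
    'primary key' => ['visitor_id'],
    'unique keys' => ['cookie' => ['cookie']],
    'indexes' => ['uid' => ['uid'], 'timestamp' => ['timestamp']],
  ];
  return $schema;
}
```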

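To ensure a unique id for each visitor, Visitor uses the same method as drupal_get_token() and drupal_get_private_key(), which is based on uniqid():

// Random key, built the same way as Drupal's private key.
$key = md5(uniqid(mt_rand(), true)) . md5(uniqid(mt_rand(), true));
// Mix in the session id, as drupal_get_token() does.
$unique_cookie = md5(session_id() . $key);
setcookie('visitor', $unique_cookie, ...);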

Invocation

To avoid flooding Visitor's storage with data that is never used, third party modules fetch the unique visitor id on demand by invoking

visitor_get_id();

which returns a unique visitor id and also adds a visitor_id property to the global $user object.
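A minimal sketch of that on-demand behavior, in plain PHP: nothing is resolved or stored until a module actually asks for the id, and repeated calls in the same request reuse the value already placed on the user object. The name sketch_visitor_get_id() and the $resolve callback are hypothetical; the callback stands in for whatever does the cookie lookup and record creation.

```php
<?php
/**
 * Hypothetical stand-in for visitor_get_id(): resolves the id on first use only.
 *
 * $resolve is an assumed callback that does the expensive work (cookie lookup,
 * record creation); it is not part of any real API.
 */
function sketch_visitor_get_id(stdClass $user, callable $resolve): int {
  if (!isset($user->visitor_id)) {
    $user->visitor_id = $resolve();
  }
  return $user->visitor_id;
}

// Usage: count how often the expensive lookup actually runs.
$user = new stdClass();
$lookups = 0;
$resolve = function () use (&$lookups): int {
  $lookups++;
  return 7;
};
$first = sketch_visitor_get_id($user, $resolve);
$second = sketch_visitor_get_id($user, $resolve);
```

With this shape, requests in which no module calls the function never touch Visitor's storage at all.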

Identification

Visitors are identified by a direct lookup and match of the cookie value. If no matching cookie is found in the database, a new visitor id and cookie are created.
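The match-or-create step could look roughly like this, again with an array standing in for the database table. Note the swap: the sketch generates the cookie value with random_bytes() rather than the md5(uniqid()) approach proposed above, and sketch_visitor_identify() is a hypothetical name.

```php
<?php
/**
 * Returns the record matching $cookie, creating a new record when none matches.
 *
 * $table is a hypothetical in-memory stand-in for the visitor table.
 */
function sketch_visitor_identify(array &$table, ?string $cookie): array {
  if ($cookie !== null) {
    foreach ($table as $record) {
      if ($record['cookie'] === $cookie) {
        return $record;
      }
    }
  }
  // No match: issue a new id and a new 32-character cookie value.
  $record = [
    'visitor_id' => count($table) + 1,
    'cookie' => bin2hex(random_bytes(16)),
    'uid' => 0,
    'timestamp' => time(),
  ];
  $table[] = $record;
  return $record;
}
```

The caller would then send the returned cookie value back to the browser with setcookie().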

Namespace / Lifetime / Error handling issues

Different applications/use-cases may require different cookie lifetimes. Two examples:

  1. Roadblock displays a page once for each new visitor. Now your marketing gurus require that a new page should be displayed, regardless of whether a user has already seen the previous one. We know we don't want to do that, but we are forced to do it, and Visitor does not support it without dropping all existing data that might also be used by other modules.
  2. Fivestar/VotingAPI allows anonymous users to vote only once. A serious bug somehow slipped into the 6.x-1.1 release and your community users start to whine they are not allowed to vote anymore. The cookie is set on the client-side, so all you can do is truncate and, again, break the (still working) functionality of other modules.

To circumvent this, allow modules to use namespaces for cookies (using the array syntax for setcookie), as in:

visitor_get_id('roadblock');

resulting in Visitor doing:

setcookie("visitor[roadblock]", $unique_cookie, ...);
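For reference, a small sketch of what the array syntax gives you on the receiving side. PHP turns bracketed cookie names into nested arrays, so a cookie named visitor[roadblock] arrives as $_COOKIE['visitor']['roadblock']. The sketch uses parse_str() only to imitate that bracket handling outside of a real request; sketch_visitor_cookie_name() is a hypothetical helper.

```php
<?php
/**
 * Builds the cookie name for an optional module namespace (hypothetical helper).
 */
function sketch_visitor_cookie_name(?string $namespace = null): string {
  return $namespace === null ? 'visitor' : 'visitor[' . $namespace . ']';
}

// parse_str() is used here only to imitate how PHP turns bracketed names
// into nested arrays; in a real request the values would arrive in $_COOKIE.
$header = sketch_visitor_cookie_name('roadblock') . '=abc&'
  . sketch_visitor_cookie_name('fivestar') . '=def';
parse_str($header, $cookies);
```

Each namespace ends up as its own key under 'visitor', which is what would allow one module's cookie to be reset without touching the others.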

This needs further consideration. Also, prove me wrong, please.

Mapping

Whenever an anonymous user becomes an authenticated user (i.e. registers), Visitor updates the uid column in its storage. When a user is deleted, Visitor removes the uid only, not the whole record.
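Sketched in plain PHP, the two mapping operations might look like this. Both function names are hypothetical, the array stands in for the visitor table, and uid 0 is assumed to mean "no account".

```php
<?php
/**
 * Ties a visitor record to an account once the visitor registers or logs in.
 */
function sketch_visitor_map_account(array &$table, int $visitor_id, int $uid): void {
  foreach ($table as &$record) {
    if ($record['visitor_id'] === $visitor_id) {
      $record['uid'] = $uid;
    }
  }
  unset($record);
}

/**
 * Clears the uid when an account is deleted, keeping the visitor record itself.
 */
function sketch_visitor_unmap_account(array &$table, int $uid): void {
  foreach ($table as &$record) {
    if ($record['uid'] === $uid) {
      $record['uid'] = 0;
    }
  }
  unset($record);
}
```

In a Drupal module these would presumably be called from the user hook's login/insert and delete operations.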

Maintenance

Based on the last updated/modified timestamp, Visitor checks for obsolete records upon cron runs and deletes them. The expiry time must be configurable and always match the interval that is used for setcookie(). Since the configured expiry time can change at any point, Visitor should probably store the expiry date instead of the last updated timestamp, and update that date (once) in each request in which visitor_get_id() is invoked.
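A minimal sketch of the "store the expiry date" variant, assuming a hypothetical 'expires' column in place of 'timestamp'. The function names and the in-memory table are illustrative only; in a module the purge would run from a cron hook.

```php
<?php
/**
 * Pushes a record's expiry forward; meant to run once per request that
 * asks for the visitor id. $lifetime should equal the cookie lifetime.
 */
function sketch_visitor_touch(array $record, int $now, int $lifetime): array {
  $record['expires'] = $now + $lifetime;
  return $record;
}

/**
 * Drops every record whose expiry date has passed; meant to run on cron.
 */
function sketch_visitor_purge(array $table, int $now): array {
  $keep = array_filter($table, function (array $record) use ($now): bool {
    return $record['expires'] > $now;
  });
  return array_values($keep);
}
```

Because each row carries its own expiry date, changing the configured lifetime later does not retroactively shorten or extend records that were written under the old setting.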

Integration

Visitor provides hook_visitor_load() for contrib modules to attach further information to Visitor-enhanced $user objects. For example, instead of having third party modules implement (most often optional) BrowsCap support all over again, they can test whether $user->visitor_browser is defined.
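The hook pattern could be sketched as below. In Drupal this dispatch would normally go through the module system; the plain-PHP loop here only imitates the MODULE_hook naming convention. The module name "sketchbrowser" and the value it sets are made up for illustration.

```php
<?php
/**
 * Calls MODULE_visitor_load() for every listed module that implements it.
 */
function sketch_invoke_visitor_load(array $modules, stdClass $visitor): void {
  foreach ($modules as $module) {
    $function = $module . '_visitor_load';
    if (function_exists($function)) {
      $function($visitor);
    }
  }
}

/**
 * Hypothetical implementation by a browser-detection module.
 */
function sketchbrowser_visitor_load(stdClass $visitor): void {
  $visitor->visitor_browser = 'hypothetical-browser-name';
}

// Usage: other modules only need to test whether the property is set.
$visitor = new stdClass();
sketch_invoke_visitor_load(['sketchbrowser', 'not_installed'], $visitor);
$has_browser = isset($visitor->visitor_browser);
```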

Possible enhancements

Implement fallback mechanisms/calculations for browsers not supporting cookies, anonymous proxies, and other hijacking attempts, such as clearing cookies or blocking cookies. Analyze Drupal core's flood protection for this.

sun commented:

Meh. That should have read:

Visitor should probably store the expiry date instead of the last updated timestamp, and update that date (once) in each request, in which visitor_get_id() is invoked and hence, the client-side cookie is updated.

sun commented:

That talk about namespaces and cookie-lifetime was much blah blah and pretty nonsense.

Fact is, we want and have to "recall" a visitor for a configurable time-frame. This effectively means that we need to validate the cookie a client sends us, and associate it with a known visitor id if we already have a record, or issue a new cookie.

In the end and from a high-level view, Visitor is a module merger of http://drupal.org/project/persistent_login and http://drupal.org/project/session_api

Unlike Session API, Persistent Login needs a coder_format clean-up though.

webchick commented:

Ok. I did an initial check-in as a result of me trying to follow your notes. It's still pretty messy, and there are lots of TODOs, but some basics are there. Let me know how close I got to what you were thinking. ;)

eaton commented:

"This needs further consideration. Also, prove me wrong, please."

I don't think it should be the responsibility of the Visitor module to play dispatcher for module-specific data. Rather than Visitor storing a flag on the client side, I think its job should be JUST to maintain a unique visitor id, persisted via cookies. If modules need to maintain separate data, they can maintain their own tables with a visitor_id key, the same way modules do with the User object now.

IMO, at least. ;-)

Follow the same pattern, I say -- let Visitor focus on the problem of 'maintaining a unique ID' and other modules figure out what to do with it.

sun commented:

Thought about this, and I think you're right, Eaton. If Visitor just takes over the job of identifying visitors, modules like Persistent Login, Fivestar/VotingAPI can use vid/visitor_id in their own tables as reference (probably as replacement for uid). If they need to reset, they can truncate their tables, not Visitor's.
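To make that division of labor concrete, here is a sketch of a voting module that keeps its own storage keyed on visitor_id. Everything in it (the function name, the array used as the module's table, the key format) is hypothetical and not taken from Fivestar or VotingAPI.

```php
<?php
/**
 * Records a vote unless this visitor has already voted on this content.
 *
 * $votes stands in for the voting module's own table; Visitor's storage
 * is never touched here.
 */
function sketch_vote_cast(array &$votes, int $visitor_id, int $content_id, int $value): bool {
  $key = $visitor_id . ':' . $content_id;
  if (isset($votes[$key])) {
    return false;
  }
  $votes[$key] = $value;
  return true;
}

$votes = [];
$accepted = sketch_vote_cast($votes, 7, 100, 5);
$repeated = sketch_vote_cast($votes, 7, 100, 1);
```

Resetting the module after a bug then amounts to emptying $votes (truncating its own table), which leaves visitor ids and every other module's data intact.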

Webchick has committed some bare-bone code in the meantime, closely following the code of session_api. However, I think the primary logic for identifying visitors must be based on Persistent Login module. Because of this, I took the time to perform a heavy #326279: Code clean-up for Persistent Login module.

IMO, the next step is to fork the required parts from Persistent Login into Visitor, possibly replacing large parts of the current code.

@jhedstrom: AFAICS, all we want to fork from session_api is probably the "visitor_get_id()" callback and your intelligent idea of checking whether cookies are enabled? What could we do about the session_api_settings hook?

@all: To test the resulting behavior, we need some integration code somewhere. At first I thought about doing this in any of the already mentioned modules, but in the meantime I'd say that an example integration module, along with tests, would be better for development and other developers. Anyone having a neat idea for a simple example integration module? For instance, visitor_count, providing a simple greeting message telling a visitor how many times she visited the site?

jvandyk commented:

Status  File            Size
new     324743jv.patch  5.59 KB
new     visitor.zip     14.87 KB

Here's an updated visitor module that just uses the same per-install cookie naming as PL and has a visitorcount module as a small test. Visitorcount started with hook implementations but since visitor sets $user->visitor_data it just uses that now.

webchick commented:

Welcome to co-maintainership, jvandyk!! :)

jvandyk commented:

Thanks...I guess! ;) Committed #15 to HEAD for easier testing.

sun commented:

Probably completely offtopic, but wanted to share this very interesting finding with you: http://www.thomasfrank.se/sessionvars.html - /me thinks about possible use-cases related to Visitor... However, if at all, we should discuss this in a separate issue.

chilledoutbeardedman commented:

Is this project still under discussion?

joachim commented:

subscribing.

What's missing to make a release?

avpaderno commented:

Status: Active » Closed (outdated)

I am closing this issue, as it was created for a release that is no longer supported.