This is my first time using purl, so maybe there's something that I completely missed.

Anyways, for my current project, I'm making a site for a fitness center chain, where each fitness center is a node. Each center will have a lot of similar subpages, with stuff for classes, events etc. So I wanted to use purl to set the active fitness center for this content.

What I did was.

1. Add a slug field to the node form. When the node is saved I save the slug, nid and my custom provider using purl_save. This works fine.
2. Add my custom provider using hook_purl_provider.
3. In the interface I setup the provider to use path.

When I started to test, things got a bit weird.

I have 3 nodes, with slugs A, B and C

if I go to A/B/C the callback function is called 5 times the nids recieved was:

A, B, C, A, A

and I end up getting redirected to A. This struck me as strange, but this might be as it should.

However, if I go any other url: A/[any-drupal-path] instead of ending at the drupal path, I end up in a redirect loop with the browser timing out.

After reading through the docs a few times I decided to step through the code to figure out what I did wrong. After some time looking through the code I ended up at purl_url_outbound_alter It seems what happened is that the url correctly gets changed to [any-drupal-path] but then later is changed back to the original url.

In purl_get_normal_path line 237
purl_active()->add($method, $element)->set($element);

This gives problems in purl_url_outbound_alter, as it somehow gets called for $_GET['q']. This results in the correctly changed path A/[any-drupal-path] -> [any-drupal-path], is changed back again when the provider from the cache is added, resulting in a redirect loop:

A/[any-drupal-path] -> [any-drupal-path] -> A/[any-drupal-path] …

I found that in my callback function I could call purl_disable(TRUE); which fixed the redirect loop, but I don't know if this can/will give problems elsewhere.

Anyways like I said, I'm new to purl, so maybe I missed something in the docs, or the way I setup my module to integrate with purl. But it seems there is something buggy going on, with $_GET['q'] getting changed by purl_url_outbound_alter which I believe only should be used for generating urls/links.

CommentFileSizeAuthor
#9 purl_877696.patch421 bytesgoogletorp

Comments

googletorp’s picture

Another thing that puzzles me a bit and might be related.

Purl is making a redirect instead of internally changing the url. So with the above example:

A/node will result in a redirect to node. This removes the context from the url, not like what's done in spaces, where the active space will be shown in the url. Also since a redirect is used, all static variables is deleted (a full new request is made).

If I have understood purl correctly, the url should be preserved and there shouldn't be made any redirect, so the above problem with the redirect loop, might really be a bug that purl shouldn't be redirecting at all.

I won't mind spending some time on this my self helping to find the cause of this and fix it. However since I'm still very new to purl, I would like some feedback to make sure that I have understood the module correctly and isn't going out on a wild goose chase. It might be my implementation that's completely wrong. I have tried to mimic what's done in spaces_og so it shouldn't be that far off.

yhahn’s picture

Status: Active » Postponed (maintainer needs more info)

Hm, it's quite hard to debug solely based on the symptoms you've described. Would it be possible for you to post some of your code? It does sound like you have the right concepts in place for working with PURL.

googletorp’s picture

Status: Postponed (maintainer needs more info) » Active

Sure, I'll post my code.

This first part is snippets containing the code to save the slug, to purl. I left out all the code that didn't relate to purl. I also left out the update function, which only save the slug value if there is changes to it, or it wasn't set. The node type's name is branch

/**
 * Implementation of hook_form().
 */
function module_form(&$node) {
...
$form['slug_settings']['slug'] = array(
    '#type' => 'textfield',
    '#title' => t('Slug'),
    '#description' => t('Short branch name, used in URLs. Allowed characters are a-z, 0-9, and .-_'),
    '#default_value' => $node->slug,
    '#required' => TRUE,
  );
...
}


/**
 * Implementation of hook_insert().
 */
function module_insert($node) {
  module_save_slug($node->nid, $node->slug);
  ...
}

/**
 * Save/update the slug for a branch node.
 */
function module_save_slug($nid, $slug) {
  return purl_save(array(
    'id' => $nid,
    'provider' => 'module',
    'value' => $slug,
  ));
}

This next part is the actual integration with purl. A note: atm I'm storing the active branch in the $_SESSION. This is a quick way to fix the redirect problem, but I would probably prefer to just store it as a static variable. I'm not sure yet, if I want to save the active branch for several requests. Users when logged in, will be associated with a center, so I probably want to use that as a fall back instead of using a session variable.

/**
 * Implementation of hook_purl_provider().
 */
function module_purl_provider() {
  return array(
    'module' => array(
      'name' => t('Fitness center branch'),
      'description' => t('Determines the related fitness center branch.'),
      'callback' => 'module_set_active_branch',
      'example' => 'my-branch',
    ),
  );
}

/**
 * Set the active branch, stored in the user's session.
 *
 * @param (int) $nid
 *    The node ID for the branch to activate
 *
 * @return node ID for the active branch
 */
function module_set_active_branch($nid=NULL) {
  // I don't know why this is needed, but if purl_disable is not set to true we get a redirect loop.
  purl_disable(TRUE);

  // Store active center in the user's session. This makes it possible to save
  // The users choice between sessions.
  if (!empty($nid)) {
    $_SESSION['module_active_branch'] = $nid;
  }
  return $_SESSION['module_active_branch'] ? $_SESSION['module_active_branch'] : FALSE;
}

/**
 * Get the active branch.
 *
 * @return node ID for the active branch, FALSE if no branch in active.
 */
function module_get_active_branch() {
  return module_set_active_branch();
}

Different places in the code I use module_get_active_branch to get the active branch. Atm I use it for blocks and page callbacks. Also for now I don't set the method used like it's done in spaces_og, but instead I've set it to path in the admin interface. At some point this stuff will probably be added in the install hook or we might just set it up in the interface, since it should only be done once anyways.

I think all of this should be the relevant code, if something is missing or you need clarification, let me know.

yhahn’s picture

This all makes sense to me. Are you using CVS/dev or beta13? I'm wondering if you're running into a bug in beta12 and prior where the saving of a numeric modifier could throw PURL into the kind of infinite loop you are experiencing #847786: Numbers in purl aliases causing problems

googletorp’s picture

I'm using beta12. At the time it was the most recent release.

However, I'm not using numbers in my purl aliases. It's possible, but for the test data I've used, only a-z values have been used.

so in the purl table:

value = [a-z]
provider = 'module'
id = [0-9]

googletorp’s picture

Status: Active » Fixed

I updated to the CSV version, and it seems to have fixed the issue, so must have been related to #847786: Numbers in purl aliases causing problems.

No redirects is made now, so url don't change and I can use static variables and I don't need to use purl_disable(TRUE)

Thanx for the help.

googletorp’s picture

Version: 6.x-1.0-beta12 » 6.x-1.0-beta13

Setting version to beta-13 to indicate that the fix applies for this version.

googletorp’s picture

Status: Fixed » Active

I'm sorry, I was a bit to quick to resolve this issue.

I downloaded the CVS-HEAD version, I thought that would be the most resent version, but when I was about to commit the changes I noticed that HEAD is from 2009/03/08.

It seems that the old version from 2009 actually works as it should, while the new version DRUPAL-6--1 does not work. The symptoms are exactly the same. So it seems somewhere from the old version and the current CVS version, the bug must have been introduced.

googletorp’s picture

Title: Redirect loop using custom provider. » PURL doesn't integrate well with all modules
Status: Active » Needs review
StatusFileSize
new421 bytes

So having a version that worked vs one that didn't work, helped quite a lot to debug the problem.

I tracked down the issue to drupal_init_path()

In the old code you did

$_REQUEST['q'] = $_GET['q'] = _purl_get_normal_path($q);

In the new code you did

drupal_init_path();

function drupal_init_path() {
  if (!empty($_GET['q'])) {
    $_GET['q'] = drupal_get_normal_path(trim($_GET['q'], '/'));
  }
  else {
    $_GET['q'] = drupal_get_normal_path(variable_get('site_frontpage', 'node'));
  }
}

So the only difference is that $_REQUEST is not changed in the new version. I did a search on my Drupal setup, and it turns out that globalredirect was causing the some of the problems. In it's init function it uses $_REQUEST instead of $_GET. Since modules can use either, the best thing to do, is to set both to the new value.

I added a patch that solves the issue.

yhahn’s picture

Status: Needs review » Needs work

This is not an acceptable change as $_REQUEST['q'] should never be altered as it is the only record of the actual request path made by the client. The previous version of the code was making an aggressive alteration that was removed.

I'd suggest doing a little more research into how globalredirect is using $_REQUEST['q'] and why the unaltered version is not working properly.

mikl’s picture

Well, $_REQUEST is supposed to be shorthand for $_GET + $_POST, so it makes very little sense that q in one is different for q in the other.

From the PHP docs on $_REQUEST:

An associative array that by default contains the contents of $_GET, $_POST and $_COOKIE.

If we want to store the original-unrewritten URL somewhere, I think it should be named differently.

googletorp’s picture

The global redirect modules vs purl setup is pretty simple.

Purl with weight -20 will have first op, and change $_GET['q']

globalredirect will compare $_GET['q'] and $_REQUEST['q'], if they are different perform redirect to $_REQUEST['q'] (the un altered request)

This is essentially the loop that I ran in to.

If you truely don't want to change $_REQUEST['q'] but want purl working with globalredirect redirect, an alternative option would be to alter the weight of globalredirect to have it perform it's init before purl does it's alteration.

I don't like this option, as it only will be a fix vs one module, and can be broken by changing db values. Also, as mikl writes, $_REQUEST should be related to $_GET, according to PHP docs, so the best option would be to save the original request with a different name, so it still is accessible if needed.

yhahn’s picture

@mikl: This may be the case in the PHP docs but in Drupal core the convention has been to leave $_REQUEST['q'] alone. For example, all of the language-based path handling rewrites $_GET['q'] stripping out language prefixes (e.g. es/node/5 becomes node/5) while $_REQUEST['q'] is left alone. The same methodology is used for path aliases. This convention has been relied upon across contrib (in i18n and/or path alias related modules particularly) to determine whether the current path in $_GET['q'] has been modified as a result of something like a prefix being stripped or the current path being aliased.

@googletorp: If this is the case globalredirect should have issues across the board with other path rewriters like Drupal core's language prefixing. I would have to look into it more carefully but it sounds like the module is expecting that $_GET['q'] and $_REQUEST['q'] are different only when a path alias is involved which is a bad assumption.

googletorp’s picture

Status: Needs work » Fixed

@yhahn I've taken a bit closer look to globalredirect. It seems for the language rewrites, it makes a special case for that, by getting the language prefix and appending it to the url alias. I think you are right that the issue should be raised at globalredirect instead of here thus I'm closing this issue.

Thanx for the assistance on debugging this.

googletorp’s picture

Status: Fixed » Closed (works as designed)

Updated close reason.

mikl’s picture

Status: Closed (works as designed) » Needs work
mikl’s picture

Status: Needs work » Closed (works as designed)

Ah, Firefox. Please quit remembering form state on reloads…