When a page request is protected by Shield, there's no way for other code/modules to be aware of that state (short of replicating most of the logic in shield_boot() which is inefficient).

Shield could set a static variable in shield_boot() if the current request is protected, e.g.:

$shielded = &drupal_static(__FUNCTION__);
$shielded = TRUE;

My specific use case: Setting an HTTP header so that responses are not cached by Akamai for Shielded paths, for example in a custom hook_init:

$shielded = &drupal_static('shield_boot');
if ($shielded) {
  // Set a header()
}
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

azinck’s picture

Here's a patch that rearchitects some things and adds some hooks.

I've separated out the logic that determines whether or not a page is to be shielded. This is in a new function shield_get_status(). Additionally, shield_get_status() now gives other modules the opportunity to override shield's logic to determine if a page is to be shielded by implementing hook_shield_status_alter().

I've also changed the way headers get output. When a page is shielded, the shield headers are added by hook_shield_headers. hook_shield_headers_alter has also been added to give modules the chance to act on each others' headers.

I've also switched to using drupal_send_headers to deliver the headers which makes sense to me and works but maybe someone has a use-case for which that's a bad idea?

azinck’s picture

Inspired by the work on #1898216: Choose shield method = protect all, exclude or include I'd rolled in the ability to specify protection by including/excluding specific paths into this patch.

azinck’s picture

Status: Active » Needs review
pbull’s picture

I like this approach as it seems to offer more flexibility. Re-rolling to fix a couple minor issues:

* Existing users would expect to retain the existing functionality (to protect all pages) when they update or patch. Changing the default value of shield_method to 0 to retain that behavior.

* A couple minor coding style tweaks (spaces between closing parenthesis and open brackets/control keyword).

In shield_boot() we can alter the headers that are sent when authentication fails. But to set specific headers for pages that are shielded AND the authentication succeeds (for example to set a no-cache header for the CDN, per my original comment), we would just call shield_get_status() in a custom module. (@azinck: This seems like an appropriate way to do this, just want to confirm it makes sense to you too.)

azinck’s picture

Thanks for the review, pbull.

Existing users would expect to retain the existing functionality (to protect all pages) when they update or patch. Changing the default value of shield_method to 0 to retain that behavior.

Actually, having shield_method set to 1 as default will retain existing Shield configuration. By default, Shield protects all paths except those that are excluded, which is what shield_method == 1 does, too.

A couple minor coding style tweaks (spaces between closing parenthesis and open brackets/control keyword).

Thanks!

In shield_boot() we can alter the headers that are sent when authentication fails. But to set specific headers for pages that are shielded AND the authentication succeeds (for example to set a no-cache header for the CDN, per my original comment), we would just call shield_get_status() in a custom module.

Yes, that's what I was thinking.

pbull’s picture

Actually, having shield_method set to 1 as default will retain existing Shield configuration. By default, Shield protects all paths except those that are excluded, which is what shield_method == 1 does, too.

Yes, this makes sense now that I think about it. Thanks for the clarification.

azinck’s picture

I'm getting some issues with this. It appears that calling drupal_alter in hook_boot() might be a bad idea. It appears the hooks (even hooks unrelated to shield) for some modules are overlooked when it's called this early and the result of that is cached which prevents such hooks from ever firing throughout the page request. Will have to see if I can rearchitect this.

azinck’s picture

Here's an update. Different approach. I've removed the calls to drupal_alter and module_invoke. If a module wants to affect whether or not a page is shielded it needs to implement hook_boot, have a low enough weight to be run before shield, and call shield_set_status() to set the status.

kalman.hosszu’s picture

Status: Needs review » Needs work

Please make clearable the patch because it contains other patch-es. So please re-roll it...and I think a static variable is not a hook...

Thanks

azinck’s picture

I can reroll the patch without the code from #1898216: Choose shield method = protect all, exclude or include.

Yes, a static variable is not a hook. As I mentioned in #7 I ran into problems when calling drupal_alter so early in the page request. You can't call any non-bootstrap hooks (https://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/boot...) at that point since modules not implementing any bootstrap hooks would not have been loaded and, even if the relevant modules were, a call to module_implements() at that point in the page request would cause modules that don't implement a bootstrap hook to have *all* their hooks overlooked for the duration of the page request (due to caching in module_implements()). So I had to take a different approach.

I created 2 new functions: shield_set_status() and shield_get_status(). They behave as you'd expect. If another module wants to take over the shield logic then it can simply implement hook_boot() and set its weight so that it gets executed before shield_boot(). It can then call shield_set_status() to set the shield status according to its own logic.

Additionally, being able to call shield_get_status() lets modules react to the shield status however they want. In my use-case that means I can set custom headers based upon the Shield status. My custom module implments hook_boot() as above, calls shield_get_status() to determine the shield status, and calls drupal_add_http_header() to deliver an arbitrary header.

I'm open to suggestions if you think there's a better way to accomplish this. I rather liked the initial architecture I had but there's no good way to make it work that I can see.

kalman.hosszu’s picture

Yes I understand the problem and the motivation too...the problem is that I don't really like the idea to rewrite a static variable as a hook. Apart from this I don't have better idea so I have to think about it.

The other stuff: I can't apply the patch:

▶ git apply -v 1980354-8-shield-add_shield_get_and_set_status.patch
Checking patch shield.admin.inc...
error: while searching for:
    '#default_value' => variable_get('shield_allow_cli', 1),
  );

  $form['general']['shield_excluded_paths'] = array(
    '#type' => 'textarea',
    '#title' => t('Exclude these paths from Shield protection'),
    '#description' => t('Enter a list of paths that should be accessible without Shield credentials. Leave blank to protect all paths.'),
    '#default_value' => variable_get('shield_excluded_paths', ''),
  );


error: patch failed: shield.admin.inc:26
error: shield.admin.inc: patch does not apply
Checking patch shield.install...
error: while searching for:
  variable_del('shield_user');
  variable_del('shield_pass');
  variable_del('shield_print');
  variable_del('shield_excluded_paths');
}

error: patch failed: shield.install:12
error: shield.install: patch does not apply
Checking patch shield.module...
error: while searching for:
}

/**
 * Implements hook_boot().
 */
function shield_boot() {
  $user = variable_get('shield_user', '');
  if (!$user) {
    return;
  }

  // allow Drush to bypass Shield
  if (drupal_is_cli() && variable_get('shield_allow_cli', 1) == 1) {
    return;
  }

  // Allow excluded paths to bypass Shield protection.
  $excluded_paths = variable_get('shield_excluded_paths', '');
  if (!empty($excluded_paths)) {
    require_once './includes/unicode.inc';
    require_once './includes/path.inc';
    require_once './includes/locale.inc';
    require_once './includes/language.inc';
    drupal_language_initialize();
    $pages = drupal_strtolower($excluded_paths);
    $path = drupal_strtolower(drupal_get_path_alias($_GET['q']));
    // Compare the lowercase internal and lowercase path alias (if any).
    $page_match = drupal_match_path($path, $pages);

    if ($path != $_GET['q']) {
      $page_match = $page_match || drupal_match_path($_GET['q'], $pages);
    }

    // We're on a path that is excluded from Shield protection.
    if ($page_match) {
      return;
    }
  }

  $pass = variable_get('shield_pass', '');
  if (substr(php_sapi_name(), 0, 3) == 'cgi') {
    // We have (some sort of) CGI.

error: patch failed: shield.module:34
error: shield.module: patch does not apply
azinck’s picture

Yeah, the original patch was rolled against dev before you committed #1898216: Choose shield method = protect all, exclude or include. Re-rolling now.

I'm not sure why you're saying we're "rewriting a static variable as a hook". We're not creating a hook. What we are doing is offering an API to other modules that might want to affect Shield's behavior. The design pattern I took is similar to what core does with drupal_set_title() / drupal_get_title() and drupal_set_breadcrumb() / drupal_get_breadcrumb(), for instance.

kalman.hosszu’s picture

If you re-roll the code I will test and think about it. I did a 16 hours/day work week so I can't figure out the best solution to that problem. Maybe this will be the best but I am too tired to decide because I loused up another trivial issue and I think this indicates that I should finish the work :)

azinck’s picture

Ok, I've rerolled this. It requires that you first apply the patch here: https://drupal.org/node/1898216#comment-7592813

azinck’s picture

Re-rolled against latest 7.x-1.x

kalman.hosszu’s picture

+++ b/shield.moduleundefined
@@ -34,90 +34,134 @@ function shield_menu() {
-    $paths = variable_get('shield_paths', '');
-    $page_match = FALSE;
-    // Compare paths, if any have been set.
-    if (!empty($paths)) {
-      require_once DRUPAL_ROOT . '/includes/unicode.inc';
-      require_once DRUPAL_ROOT . '/includes/path.inc';
-      require_once DRUPAL_ROOT . '/includes/locale.inc';
-      require_once DRUPAL_ROOT . '/includes/language.inc';
-      drupal_language_initialize();
-      $pages = drupal_strtolower($paths);
-      $path = drupal_strtolower(drupal_get_path_alias($_GET['q']));
-      // Compare the lowercase internal and lowercase path alias (if any).
-      $page_match = drupal_match_path($path, $pages);
-      if ($path != $_GET['q']) {
-        $page_match = $page_match || drupal_match_path($_GET['q'], $pages);
-      }
-    }
-    // Enable shield or not, depending on shield_method.

You removed these lines and replaced it with the old lines (I think it must be a copy-paste issue :) ). I think using DRUPAL_ROOT is better and I also changed the code comments (dot at the end of the sentences). I would prefer if to live the current lines. Or do you see any problem with it?

azinck’s picture

Status: Needs work » Needs review
FileSize
6.96 KB

Ah, my mistake. Re-rolled.

azinck’s picture

I screwed up the paths in #17. Let's try again.

kalman.hosszu’s picture

Status: Needs review » Fixed

Thanks @azinck I have just committed your patch. I created some minor modification but I marked you as the author of the commit: http://drupalcode.org/project/shield.git/commit/62335ee

kalman.hosszu’s picture

Status: Fixed » Needs work

Hmm I tested the whole functionality and it seems that this function kills #2030017: Allow cron to run

@azinck could you check this? We have a checkbox to enable cron running, but now it disables always because the page url check doesn't match to cron.

azinck’s picture

I'm not somewhere I can test this right now, but looking at what's in the codebase I don't see the problem. Lines 70-78 should take care of the use-case you're talking about: http://drupalcode.org/project/shield.git/blob/refs/heads/7.x-1.x:/shield...

What do you mean by "now it disables always"? Are you suggesting that the checkbox literally cannot be checked? Or that the value isn't being honored?

kalman.hosszu’s picture

The problem is, that if I don't check the checkbox to enable cron running (so cron.php should be protected by shield) but the cron.php doesn't protected, so the shield's popup doesn't displayed. Is it clear?

azinck’s picture

Ah, I see the problem. Are you running cron at admin/reports/status/run-cron while logged in? It appears the module doesn't account for that particular use-case.

Line 70 needs to be changed from:

$cron = (basename($_SERVER['PHP_SELF']) == 'cron.php' && variable_get('shield_allow_cron', 1));

to:

$cron = ((basename($_SERVER['PHP_SELF']) == 'cron.php' || $_GET['q'] == 'admin/reports/status/run-cron') && variable_get('shield_allow_cron', 1));
azinck’s picture

We cross-posted. My comment in #23 shouldn't have anything to do with what you're seeing.

I know this sounds dumb, but are other paths being appropriately protected? You definitely have the Shield-enabled checkbox checked and Shield credentials set?

kalman.hosszu’s picture

Status: Needs work » Needs review
FileSize
503 bytes

No I run the cron with the secret key as an anonymous user.

I think the problem is if you select "Shield no paths except the following (include)" the regex won't apply to the current GET['q'] and it will disable shield.

A patch attach for this problem, please review it

azinck’s picture

Ok, now we're getting somewhere. The specific use-case you seem to be suggesting is:

  • You want to shield only a select few paths, so you've chosen "Shield no paths except the following (include)"
  • One of the paths you want to shield is cron.php, so you've added cron.php to the list in shield_paths
  • You're now expecting cron.php to be shielded

And you're right, cron.php is not shielded in this case. But I hasten to add: the problem has nothing to do with my specific patch. The bug is the fault of the interaction of #2030017: Allow cron to run and #1898216: Choose shield method = protect all, exclude or include together.

The problem is wider than cron, it affects update.php, install.php, and anything that doesn't hit Drupal's index.php. This is sort of a broader design question: is the list of shield_paths meant to be a list of "Drupal" paths, or can they actually be system paths to any file on the filesystem? The problem with suggesting that users can shield any file in the filesystem is that we can't actually enforce that; Drupal won't get bootstrapped for every possible file that could be requested so suggesting that a user can protect any path would be misleading. That said, there are cases (such as listed above) where Drupal *is* bootstrapped but $_GET['q'] isn't populated and for those I can see some merit in allowing them to be selectively Shielded.

I still have some reservations about how we word the text on the shield_paths input form, but I think we can get the functionality you want by replacing line 93 in shield.module with:

$path = drupal_strtolower(drupal_get_path_alias($_GET['q']));
if(empty($path) && isset($_SERVER['REQUEST_URI'])) {
  // Code stolen from core's request_path().
  // Extract the path from REQUEST_URI.
  $request_path = strtok($_SERVER['REQUEST_URI'], '?');
  $base_path_len = strlen(rtrim(dirname($_SERVER['SCRIPT_NAME']), '\/'));
  // Unescape and strip $base_path prefix, leaving path without a leading slash.
  $path = substr(urldecode($request_path), $base_path_len + 1);
  
  // Under certain conditions Apache's RewriteRule directive prepends the value
  // assigned to $_GET['q'] with a slash. Moreover we can always have a trailing
  // slash in place, hence we need to normalize $path.
  $path = trim($path, '/');
}
kalman.hosszu’s picture

Status: Needs review » Fixed

Yes you are right, this is not related to your patch. I will refactor the shield path handling to fix it.

kalman.hosszu’s picture

@azinck I create a separated issue for it #2063945: Refactor exclude paths if you want to review feel free to do it.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.