EDIT: Funny how sometimes asking for help will suddenly lead you down a path of finding your answer, even when no one offers any advice. I guess it just comes from typing out the question in long form. Just updating with my solution in case someone else can benefit from it.
ORIGINAL QUESTION:
I have inherited a 95% complete website (in Drupal 7.21) that I've been working on over the last year. Before working on this site I had never used Drupal. Since then I've edited templates, created brand new modules, worked with translation, regionally locked content, etc. I'm learning to harness Drupal's power, but this particular issue is eluding me.
A bit about how the site currently works:
If you come to the front page of the site and don't have a cookie set, the site asks what part of the world you are from. This sets a cookie so that people are shown the products available in their area of the world. They can also pick the language they want to see the site in, and most of our content is available in other languages.
So what's the problem? When you follow a direct link into the site (not the front page), the site doesn't ask you what region you are from. When search engines crawl the site, they seem to pick a region and crawl the site (that's okay). However, when they catalog a page and put it in their database, the URL carries no record of which region was selected. For example, let's say that North America has a product called "ABC" ( example.com/en/products/ABC ). Now let's say that Europe doesn't have that same product. So if someone has Europe set in their cookie, that product would not be shown.
Google crawls the site and picks North America. They list the URL in their database. Someone in Europe searches Google for something and comes across our link. The direct link won't ask what region they are from, and with no cookie the site doesn't know what region they should be in, which can cause mixed results.
It seems to me the solution would be one of two things:
1. When someone comes to ANY page of the site without the cookie set, it could ask them what region they are in, then forward them to whatever content they were attempting to access. However, this has some additional problems: the Google entry could be pointing to North American content, and a visitor who picks Europe would only be met with "this content isn't for you". Additionally, someone who has visited the site before and clicks a direct link wouldn't be asked their region and would still be shown a "this isn't for you".
2. Encode the region selection into the URLs, much like language currently is (for example: example.com/en/na/ or example.com/fr/eu/). However, I can't for the life of me figure out how I would accomplish this. This way, if someone sees content in Google that is for North America and they click on it, it will assign them North America and let them see the content.
I'm open to absolutely any suggestions. Has anyone done anything similar to this and have any ideas?
ANSWER: Looks like "hook_url_inbound_alter" is my solution. Note I couldn't get this function hooked in the template file, so it appears to need to be done from a module.
    //REPLACE "MODULE" WITH YOUR MODULE'S MACHINE NAME
    function MODULE_url_inbound_alter(&$path, $original_path, $path_language)
    {
        //SAVE THE SITE FROM LOADING THIS TWICE
        //A BAD PATH WILL CALL THIS FUNCTION A SECOND TIME
        //WHILE LOADING THE 404 PAGE.
        if (!isset($GLOBALS['aRegionKeys']))
        {
            //INITIALISE FIRST SO THE LOOKUP RUNS ONLY ONCE,
            //EVEN IF NO TERM HAS A REGION CODE
            $GLOBALS['aRegionKeys'] = array();
            //FIND THE 'REGIONS' TAXONOMY
            $taxRegion = taxonomy_vocabulary_machine_name_load('regions');
            //LOAD THE TAXONOMY TO AN ARRAY (EMPTY IF THE VOCABULARY IS MISSING)
            $aRegion = $taxRegion ? taxonomy_get_tree($taxRegion->vid) : array();
            //LOOP EACH 'REGION' TERM
            foreach ($aRegion as $aRegionTerm)
            {
                //LOAD TERM
                $termObject = taxonomy_term_load($aRegionTerm->tid);
                //IF TERM HAS A REGION CODE (LANGUAGE_NONE IS 'und')
                if (isset($termObject->field_region_code[LANGUAGE_NONE][0]['value']))
                {
                    //SET REGIONAL CODE
                    $cRegionCode = $termObject->field_region_code[LANGUAGE_NONE][0]['value'];
                    //SET REGION ID
                    $tempRegion = $termObject->tid;
                    $tempSubRegion = 0;
                    //IF REGION IS A SUBREGION, SET PARENT
                    if ($aRegionTerm->parents[0] > 0)
                    {
                        $tempSubRegion = $termObject->tid;
                        $tempRegion = $aRegionTerm->parents[0];
                    }
                    $GLOBALS['aRegionKeys'][$cRegionCode]['region'] = $tempRegion;
                    $GLOBALS['aRegionKeys'][$cRegionCode]['subregion'] = $tempSubRegion;
                }
            }
        }
        //EXPLODE THE PARTS OF THE PATH
        $aParts = explode("/", $path);
        //IS FIRST TERM FROM PATH ONE OF OUR REGION CODES?
        if (isset($GLOBALS['aRegionKeys'][$aParts[0]]))
        {
            $aMatch = $GLOBALS['aRegionKeys'][$aParts[0]];
            //SET COOKIE (EXPIRES IN ONE YEAR)
            $cookie = json_encode(array(
                'region_id' => $aMatch['region'],
                'subregion_id' => $aMatch['subregion']
            ));
            setcookie('region', $cookie, time() + 31536000, '/', $GLOBALS['cookie_domain']);
            //SET GLOBALS
            $GLOBALS['region_id'] = $aMatch['region'];
            $GLOBALS['subregion_id'] = $aMatch['subregion'];
            //SET SESSION
            $_SESSION['region_id'] = $aMatch['region'];
            $_SESSION['subregion_id'] = $aMatch['subregion'];
            //STRIP REGION CODE FROM PATH. array_slice() IS USED BECAUSE
            //substr() RETURNS FALSE ON PHP < 8 WHEN THE PATH IS ONLY THE CODE
            $path = implode('/', array_slice($aParts, 1));
        }
        //CONVERT ALIAS TO STD PATH (RETURNS FALSE IF PATH IS ALREADY OKAY)
        $mResult = drupal_lookup_path('source', $path);
        if ($mResult !== FALSE)
        {
            //NOT FALSE? USE THE PATH
            $path = $mResult;
        }
    }
This hook only handles stripping the region code from the incoming path and setting the cookie/session/global variables. Depending on how the links on your site are constructed, you'll need to make changes there as well so that the region code is preserved in outgoing URLs. We're using language in the URL (example.com/en/region/page/) and this works with it. I would expect it to work without the language prefix as well.
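For the link side, one possible approach is Drupal 7's hook_url_outbound_alter(), which runs inside url() before the language prefix is added. The following is only a sketch and has not been tested against a live site. It assumes a hypothetical module name (mymodule), a hypothetical helper (mymodule_prefix_region), and that the inbound hook also stores the matched code in $GLOBALS['region_code'], which the code above does not do. Because url() looks up aliases after the alter hooks run, the sketch resolves the alias itself and sets $options['alias'] so the prefixed path is not looked up again.

```php
<?php
/**
 * Prepend a region code to an internal path.
 * Hypothetical helper; not part of the original module.
 */
function mymodule_prefix_region($path, $region_code, array $known_codes) {
  // Nothing to do without a region code we recognise.
  if ($region_code === '' || !in_array($region_code, $known_codes, TRUE)) {
    return $path;
  }
  // Avoid double-prefixing a path that already starts with a region code.
  $parts = explode('/', $path);
  if (in_array($parts[0], $known_codes, TRUE)) {
    return $path;
  }
  return $path === '' ? $region_code : $region_code . '/' . $path;
}

/**
 * Implements hook_url_outbound_alter().
 */
function mymodule_url_outbound_alter(&$path, &$options, $original_path) {
  // Leave external links alone.
  if (!empty($options['external'])) {
    return;
  }
  // ASSUMPTION: the inbound hook saved the matched code in
  // $GLOBALS['region_code'] and built $GLOBALS['aRegionKeys'].
  if (!isset($GLOBALS['region_code'], $GLOBALS['aRegionKeys'])) {
    return;
  }
  $codes = array_map('strval', array_keys($GLOBALS['aRegionKeys']));
  $langcode = isset($options['language']->language) ? $options['language']->language : NULL;
  if ($path === '<front>') {
    // The front page is the empty path.
    $alias = '';
  }
  elseif (!empty($options['alias'])) {
    // The caller already passed an alias.
    $alias = $path;
  }
  else {
    $alias = drupal_get_path_alias($path, $langcode);
  }
  $path = mymodule_prefix_region($alias, $GLOBALS['region_code'], $codes);
  // Tell url() not to look the path up as an alias again.
  $options['alias'] = TRUE;
}
```

With 'na' as the current region, mymodule_prefix_region('products/ABC', 'na', array('na', 'eu')) gives 'na/products/ABC', and the language prefix is then added in front of it by the locale module.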
The hook above could probably be improved, but it does what I need for now. As the user continues to browse the site, every page load rewrites their cookie. So you may want to wrap that in an "if" so the values are only reset when they have changed. In fact, I think I'll do that now.
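A minimal sketch of that check, assuming the cookie keeps the JSON format used in the hook. The function name mymodule_region_cookie_changed is hypothetical and not part of the original code. IDs are compared as integers because taxonomy term IDs usually come back from Drupal as strings.

```php
<?php
/**
 * Decide whether the 'region' cookie needs to be (re)written.
 * Hypothetical helper; $existing is the raw cookie value or NULL.
 */
function mymodule_region_cookie_changed($existing, $region_id, $subregion_id) {
  // No cookie yet: it must be set.
  if ($existing === NULL || $existing === '') {
    return TRUE;
  }
  $decoded = json_decode($existing, TRUE);
  // Unreadable or incomplete cookie: overwrite it.
  if (!is_array($decoded) || !isset($decoded['region_id'], $decoded['subregion_id'])) {
    return TRUE;
  }
  return (int) $decoded['region_id'] !== (int) $region_id
    || (int) $decoded['subregion_id'] !== (int) $subregion_id;
}

// Possible use inside the inbound hook, around the existing setcookie() call:
//
//   $existing = isset($_COOKIE['region']) ? $_COOKIE['region'] : NULL;
//   if (mymodule_region_cookie_changed($existing, $aMatch['region'], $aMatch['subregion'])) {
//     setcookie('region', $cookie, time() + 31536000, '/', $GLOBALS['cookie_domain']);
//   }
```

One trade-off to be aware of: skipping the rewrite also means the one-year expiry is no longer refreshed on every visit.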