When migrating a website from another system to Drupal, take stock of the existing inbound links to your site and of its search engine indexing and ranking. To maintain your search engine ranking and avoid breaking inbound links, plan to redirect requests for old URIs to your new Drupal nodes.

Instead of serving 404 errors, you can direct users to the content they are looking for. In some cases the path_redirect module may be sufficient. In other cases you may want to write redirect rules in your .htaccess file, and in still others the method described below may work for you. Another step that helps with search engine indexing is to install and configure the xmlsitemap module and submit your sitemap to the major search engines.
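If you only have a handful of old URIs, the redirect idea can be sketched in plain PHP as a lookup table. This is an illustration only, not how path_redirect works internally: the function name legacy_redirect_target() and the paths in $legacy_map are made up for the example.

```php
<?php
// Hypothetical map of old site paths to new Drupal paths (example values only).
$legacy_map = array(
  '/articles/about.html' => 'node/12',
  '/news/2006/launch.html' => 'node/34',
);

/**
 * Returns the new path for an old request URI, or NULL if none is known.
 */
function legacy_redirect_target($request_uri, array $map) {
  // Look up the path only: drop the query string and any trailing slash.
  $path = parse_url($request_uri, PHP_URL_PATH);
  if (!is_string($path)) {
    return NULL;
  }
  $path = rtrim($path, '/');
  return isset($map[$path]) ? $map[$path] : NULL;
}

$target = legacy_redirect_target('/articles/about.html?ref=feed', $legacy_map);
// When $target is not NULL, a permanent redirect could be sent with:
//   header('Location: /' . $target, TRUE, 301);
```

A 301 (permanent) status is the one that tells search engines the content has moved for good, which is why it is the usual choice for migrations.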

The rest of this article describes an approach that parses the search engine query from the HTTP_REFERER header and searches the Drupal site for what the user was actually looking for.

Create a page node with PHP code enabled in the input format, and add the following code.

  <?php
  // Referrer patterns for common search engines, mapped to the query string
  // parameter (or parameters) that hold the user's search terms.
  $searchengines = array(
    '^http://www\.google.*$' => 'q',
    '^http://www.google.fi.*$' => 'q',
    '^http://.*search.msn.co.*results.*$' => 'q',
    '^http://.*\.mysearch.com/jsp/GGmain.jsp\?searchfor=.*$' => 'searchfor',
    '^http://search.freeserve.com/.*$' => 'q',
    '^http://aolsearch.aol.co.*$' => 'query',
    // Listed once with both parameters: a repeated array key would silently
    // overwrite the earlier entry.
    '^http://search.yahoo.com.*$' => array( 'va', 'p' ),
    '^http://www.bbc.co.uk/cgi-bin/search/.*' => 'q',
    '^http://www.tiscali.co.uk/search/results.php.*$' => 'query',
    '^http://www.altavista.com/web/results.*$' => 'q',
    '^http://search.hotbot.co.uk/cgi-bin/pursuit.*$' => 'query',
    '^http://www.excite.co.uk/search/web/results.*$' => 'q',
    '^http://uk.search.yahoo.com/search.*$' => 'p',
    '^http://search.wanadoo.*$' => 'q'
  );

  $referer = getenv( "HTTP_REFERER" );

  foreach( $searchengines as $regexp => $qsitems )
  {
    // Case-insensitive match. "#" is the delimiter because the patterns contain "/".
    if( preg_match( '#' . $regexp . '#i', $referer ) )
    {
      echo( t( "<br/><h2>Search Engine Detected</h2>It would appear you arrived here on the wings of a search engine, so, I will search my local database and show you anything that matches what you were looking for:<br/>" ) );
      $url = parse_url( $referer );
      $querystring = isset( $url['query'] ) ? explode( "&", $url['query'] ) : array();
      foreach( $querystring as $value )
      {
        // Split "name=value" once, so a value containing "=" stays intact.
        $item = explode( "=", $value, 2 );
        if( in_array( $item[0], (array) $qsitems ) && isset( $item[1] ) && trim( $item[1] ) != '' )
        {
          echo( search_data( urldecode( $item[1] ) ) );
        }
      }
      // Stop after the first matching engine so results are not printed twice.
      break;
    }
  }
  ?>

This provides a (partial) list of regular expressions for common search engines, together with the name of the query string parameter that holds the query the user entered. The HTTP_REFERER value (the URL of the page on which the user clicked a link to reach your site) is then checked against this list. When a match is found, a search is run using the standard Drupal search call (search_data). This locates potential matches and, hopefully, keeps the user on your site.
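The matching and extraction steps can also be sketched as a standalone function, which is easier to try out away from a Drupal site. This is a minimal sketch rather than the code used on the page: referer_search_terms() and the two sample patterns are hypothetical, and it uses PHP's parse_str() to split and decode the query string instead of doing it by hand.

```php
<?php
/**
 * Returns the search terms found in a referrer URL, or '' if there are none.
 *
 * $engines maps a regular expression (without delimiters) to the name of the
 * query string parameter that holds the search terms.
 */
function referer_search_terms($referer, array $engines) {
  foreach ($engines as $regexp => $param) {
    // "#" is the delimiter so that "/" in the patterns needs no escaping.
    if (!preg_match('#' . $regexp . '#i', $referer)) {
      continue;
    }
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($query)) {
      return '';
    }
    // parse_str() splits the query string and URL-decodes each value.
    parse_str($query, $params);
    if (isset($params[$param]) && is_string($params[$param])) {
      return trim($params[$param]);
    }
    return '';
  }
  return '';
}

// Sample patterns for illustration only.
$engines = array(
  '^http://www\.google\.' => 'q',
  '^http://search\.yahoo\.com/' => 'p',
);

// Prints: drupal migration
echo referer_search_terms('http://www.google.com/search?hl=en&q=drupal+migration', $engines);
```

On a Drupal site the returned string would then be passed to search_data(), as in the code above.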

To use this, create a new node that allows PHP code. You can call it what you want, put whatever explanatory text you like on it, and give it whatever path you like. Just drop the above code snippet into place. Then, in Administration -> Settings, set the 404 handler to the path of the new node, and voila: if the user arrives from a recognized search engine, their search is performed on your site. It's working nicely for me.

Comments

hermit_kid

genius - i'm gonna take this page out of your book and report back on it.

aufumy

Have a look at path_redirect module. It is an alternative method to help maintain ranking, by providing 301 redirects to the new content created.