The USASearch team announced a new feature in late December 2011 called Document Tag or Discovery Tag. Below is some info about the feature.

We should discuss adding this as a configurable feature. The inline JavaScript needs to be added at the bottom of the page, just before the closing </body> tag. Since it uses the affiliate name (text, not numeric), which we already require, it should be relatively easy to do. The included stats.js could be added to the normal header script area if the defer option is set, as we've already done for the other two external JavaScript includes. Here is an example of the code:

<script type="text/javascript">
  //<![CDATA[
    var aid = "commerce.gov";
  //]]>
</script>
<script src="http://search.usa.gov/javascripts/stats.js" type="text/javascript"></script>
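
As a rough sketch of how the snippet above could be generated from the affiliate name, here is a small JavaScript helper. The function name buildDiscoveryTag and the defer option are assumptions made for illustration; they are not part of the USASearch code or of the module. Only the markup and the stats.js URL are taken from the example above.

```javascript
// Hypothetical helper: builds the Discovery Tag markup for one affiliate.
// The function name and the `defer` option are illustrative assumptions.
function buildDiscoveryTag(affiliateName, options = {}) {
  if (typeof affiliateName !== "string" || affiliateName.trim() === "") {
    throw new Error("affiliateName must be a non-empty string");
  }
  // JSON.stringify yields a quoted, escaped JavaScript string literal.
  // Escaping "<" also keeps a stray "</script>" from ending the block early.
  const literal = JSON.stringify(affiliateName).replace(/</g, "\\u003c");
  const deferAttr = options.defer ? ' defer="defer"' : "";
  return [
    '<script type="text/javascript">',
    "  //<![CDATA[",
    "    var aid = " + literal + ";",
    "  //]]>",
    "</script>",
    '<script src="http://search.usa.gov/javascripts/stats.js" type="text/javascript"' +
      deferAttr + "></script>",
  ].join("\n");
}

// Usage: print the markup for the affiliate used in the example above.
console.log(buildDiscoveryTag("commerce.gov"));
```

Building the string literal with JSON.stringify rather than plain concatenation is a defensive choice, in case an affiliate name ever contains quotes or angle brackets.
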

From the USASearch announcement:

Yesterday we rolled out a new document tag feature. The tag helps us automatically index your content and display it in the search results, without your having to take extra steps each time you publish a new or updated page.

To add the document tag to your web pages:

  1. Visit the Affiliate Center at http://search.usa.gov/affiliates.
  2. Select your site and click on Get Code in the left-hand menu.
  3. Follow the instructions under Code for Content Discovery and Indexing to add the tag to your template or specific pages.

It's that easy. Once you've placed the tag on your site, you can monitor the pages we've indexed by clicking on URLs in the left-hand menu.

Comments

barrett commented:

Version: » 6.x-2.x-dev

What would the interface for setting this look like? I'm imagining an admin configuration screen that would let the admin choose which content types can have the tag enabled and whether it should default to on or off, and then a section in the node creation/editing form that would turn it on or off for the specific node. Is that kind of what you're thinking?
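
To make the proposed interface concrete, the decision it implies could be pictured roughly as below. This is only a sketch of the logic, written in JavaScript for brevity: the settings shape, the content type names, and the tagEnabled property are all hypothetical and do not come from the module.

```javascript
// Hypothetical admin settings: which content types may carry the tag,
// and whether the tag defaults to on or off for each of them.
const settings = {
  contentTypes: {
    page: { allowed: true, defaultOn: true },
    story: { allowed: true, defaultOn: false },
    webform: { allowed: false, defaultOn: false },
  },
};

// Decides whether the tag should be output for a given node.
function shouldOutputTag(settings, node) {
  const types = settings.contentTypes;
  if (!Object.prototype.hasOwnProperty.call(types, node.type)) return false;
  const typeSettings = types[node.type];
  if (!typeSettings.allowed) return false;
  // An explicit per-node choice wins; otherwise use the type's default.
  if (typeof node.tagEnabled === "boolean") return node.tagEnabled;
  return typeSettings.defaultOn;
}

// Usage: a story node where the editor switched the tag on.
console.log(shouldOutputTag(settings, { type: "story", tagEnabled: true }));
```
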

timwood commented:

I was just thinking we would blindly output the inline JS and the external JS file on all pages, as we do for all the other JS. My assumption was that USASearch would keep a record of all pages where this JS runs and somehow identify changed or maybe just new pages. We can ask the USASearch team to confirm.

I do like the idea of optionally adding this tag to nodes by content type and also per node, but it might just add complexity without much value. If the Document Tag feature is only meant to get new content indexed faster (by USASearch's indexer as opposed to Bing) and without user intervention, then adding a feature to include or exclude certain content might be overkill, especially if Bing is just going to come along and index the pages anyway (assuming they are linked on the site or from another site).

One use case might be if you want to publish "hidden" pages (not linked) and NOT have USASearch index them. Are there any other use cases you can think of?

I've pointed USASearch to this thread for their input.

dg_search commented:

We currently index all new and existing HTML pages with our JS tag, and PDF files linked from these pages. We follow robots.txt so if there are any "hidden" pages that you don't want indexed, you could add them to your robots.txt file.
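
As an illustration of the robots.txt approach, the sketch below checks whether a path falls under a Disallow rule. It is deliberately simplified and is not a description of how the USASearch indexer actually parses robots.txt: it only reads "User-agent: *" lines, treats Disallow values as plain path prefixes, and ignores Allow rules, wildcards, and groups that list several user agents. The /internal/ path is a made-up example.

```javascript
// Simplified sketch: returns true if `path` starts with any Disallow prefix
// that follows a "User-agent: *" line in the given robots.txt text.
function isDisallowed(robotsTxt, path) {
  let appliesToAll = false;
  const prefixes = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue; // blank or malformed line
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      appliesToAll = value === "*";
    } else if (field === "disallow" && appliesToAll && value !== "") {
      prefixes.push(value);
    }
  }
  return prefixes.some((prefix) => path.startsWith(prefix));
}

// Hypothetical robots.txt that hides everything under /internal/.
const robotsTxt = [
  "User-agent: *",
  "Disallow: /internal/",
].join("\n");

// Usage: check a hidden page and a public page.
console.log(isDisallowed(robotsTxt, "/internal/draft.html"));
console.log(isDisallowed(robotsTxt, "/news/"));
```
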

For more information on how our tag works, read our post at http://usasearch.howto.gov/post/18904783060/how-to-add-your-urls-to-our-...

barrett commented:

One use case might be if you want to publish "hidden" pages (not linked) and NOT have USASearch index them.

That's exactly the case I was thinking of. One of the sites I work on has very specific guidelines on what can and cannot be indexed. With Bing, Google, etc., we handle that by excluding the path in robots.txt. If, as @usasearch said, they'll honor robots.txt, then I guess there's no need for content-specific inclusion or exclusion of the tag.

timwood commented:

Assigned: Unassigned » timwood
Status: Active » Needs review

I've implemented this feature in the 2.x-dev version. The development release should be updated shortly. If you are using Git you can pull the changes. You can also look at the commit here: http://drupalcode.org/project/USASearch.git/commit/b6688521c8e760bafcffb...

I've reviewed the code on a test site, but wasn't able to test whether the new Discovery Tag JavaScript was communicating with USASearch. I will try to do this, but it would be helpful if someone else could review the changes as well.

Thanks!

timwood commented:

Any chance someone else can review this? I'd like to get a new release out soon and would like a second opinion/tester.

Thanks!

timwood commented:

Status: Needs review » Fixed

Included in 6.x-2.1 and 7.x-2.0.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

  • Commit b668852 on 6.x-2.x, 7.x-3.x, 7.x-4.x by timwood:
    Updated links to USASearch blog. Added Discovery Tag feature and admin...