Plagiarism Checker - Authentication Framework

liquidcms - April 20, 2008 - 17:12

The Authenticate module provides a mechanism to verify if site content has been plagiarized.

The module is a framework that supports various search APIs (plugins) to scour the net for possibly plagiarized content. The framework supports two types of API: Standard APIs (Google and Yahoo plugins are included) and Custom APIs (such as the third-party paid authentication service from iThenticate, www.ithenticate.com).

The Standard API process is basically:

  • split the BODY of the node to be checked into "chunks" of a configurable number of consecutive words
  • use the API's search engine (Google or Yahoo, for example) to search for any URLs that match each chunk
  • load the full page content for each matching URL and compare it in detail against the entire body of the node
  • compute a comparison score based on how closely the scraped URL's content matches the body of the node
  • provide a report of all matching URLs whose comparison score exceeds a configurable threshold
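
The chunking, scoring and threshold steps above can be sketched roughly as follows. This is only an illustration in Python, not the module's own PHP code: the function names, the default chunk size and threshold, and the use of Python's difflib in place of the PEAR diff class are all assumptions. Chunks are assumed not to overlap, and the search step is left out because it needs an API account, so the pages are passed in already fetched.

```python
from difflib import SequenceMatcher


def split_into_chunks(body, chunk_size=8):
    """Split text into consecutive, non-overlapping chunks of chunk_size words.

    chunk_size stands in for the module's configurable chunk length
    (the default of 8 is an assumption made for this sketch).
    """
    words = body.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def comparison_score(body, page_text):
    """Return a similarity ratio between 0.0 and 1.0, compared word by word.

    difflib is a stand-in here; the module itself relies on a PEAR diff class.
    """
    a = body.lower().split()
    b = page_text.lower().split()
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def report(body, pages, threshold=0.5):
    """List (url, score) pairs whose score exceeds the threshold, best first.

    pages is a hypothetical dict mapping each matching URL to its
    already-fetched page text.
    """
    scored = [(url, comparison_score(body, text)) for url, text in pages.items()]
    hits = [(url, score) for url, score in scored if score > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Each chunk returned by split_into_chunks() would be sent to the search engine as a query, and the text of every URL that comes back would be collected into the pages argument of report().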

Custom APIs like iThenticate's do their own search offsite from the user's Drupal site and return a report in whatever form they prefer (for example, embedded in an iframe within the Drupal site).

The APIs (Google, Yahoo, iThenticate or others which may be added) require accounts with the respective companies.

This module is expected to be popular amongst schools, colleges and publishing companies.

This project was designed by LiquidCMS and funded by LifeWire, a NY Times Company.

NOTE: As per the README, this module requires the PEAR Text_Diff class to be installed.

NOTE: The iThenticate module was removed from the release as it contained code which was not allowed in Drupal CVS due to licensing issues. If you need this API, feel free to contact me or iThenticate directly.

Drupal 6 version now available

I have tested the Google API extensively, but not the Yahoo API. I was getting connection errors with the Yahoo API, which may be due to changes on their side, since the API code is relatively untouched between the Drupal 5 and Drupal 6 versions.

Funding for Drupal 6 version provided by ConsumerSearch.com, a NY Times Company.

Releases

Official releases   Date          Size       Status
6.x-1.0             2009-Oct-19   26.04 KB   Recommended for 6.x
5.x-1.4             2009-Aug-24   25.5 KB    Recommended for 5.x

Drupal is a registered trademark of Dries Buytaert.