Posted by AndreU on May 10, 2009 at 11:27am
| Project: | Diff |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | yhahn |
| Status: | needs work |
Issue Summary
Suggestion
Daisy Diff can produce a real diff of HTML code instead of just stripping out the HTML tags. During GSoC 2008 the author wrote a PHP version for Wikipedia: http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3/includes/diff/
Maybe it could also be used for the Drupal Diff module?
Daisy Diff Description
Daisy Diff is a Java library that diffs (compares) HTML files. It highlights added and removed words and annotates changes to the styling.
- Works with badly formed HTML that can be found "in the wild".
- The diffing is more specialized in HTML than XML tree differs. Changing part of a text node will not cause the entire node to be changed.
- In addition to the default visual diff, HTML source can be diffed coherently.
- Provides easy to understand descriptions of the changes.
- The default GUI allows easy browsing of the modifications through keyboard shortcuts and links.
Demo of HTML-Diff: http://code.google.com/p/daisydiff/wiki/Examples
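To make the difference between a tag-stripping diff and a markup-aware, word-level diff concrete, here is a minimal Python sketch. It is not Daisy Diff's algorithm: it only tokenizes the HTML into tags and words and compares the token lists with the standard library's difflib. The names tokenize and word_diff are hypothetical, and the output is not guaranteed to be well-formed HTML when tags themselves change.

```python
import difflib
import re

# A token is either a whole tag ("<p>", "</em>") or a run of
# characters containing no whitespace and no "<".
TOKEN_RE = re.compile(r"<[^>]+>|[^<\s]+")


def tokenize(html):
    """Split HTML into a flat list of tag tokens and word tokens."""
    return TOKEN_RE.findall(html)


def word_diff(old_html, new_html):
    """Return the merged token stream with changes wrapped in <del>/<ins>.

    Illustration only: changed tags are wrapped like words, so the
    result may not be valid HTML if the markup itself differs.
    """
    old, new = tokenize(old_html), tokenize(new_html)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(old[i1:i2])
        if op in ("delete", "replace"):
            out.append("<del>" + " ".join(old[i1:i2]) + "</del>")
        if op in ("insert", "replace"):
            out.append("<ins>" + " ".join(new[j1:j2]) + "</ins>")
    return " ".join(out)


if __name__ == "__main__":
    print(word_diff("<p>The quick fox</p>", "<p>The slow fox</p>"))
```

Only the changed word is marked; the surrounding paragraph tags and unchanged words pass through untouched, which is the behaviour the description above attributes to a text-node-aware differ.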
Comments
#1
I had a detailed look at daisydiff and started integrating it into the diff module. The problem is that the PHP implementation of daisydiff is very slow when documents are longer (>4 screen pages). I talked to Guy, the developer of daisydiff, and he confirmed this. The comparison also does not deliver good results for longer documents. Therefore, I cannot recommend using daisydiff for the diff module.
#2
Thanks for this evaluation - please feel free to make active again if/when the performance of this library is improved.
#3
PHP API http://code.google.com/p/daisydiff/source/browse/trunk/daisydiff-php/HTM...
PHP version developed for Wikipedia : http://www.mediawiki.org/wiki/Visual_Diff
#4
Daisydiff's output does look nice, however including this might be a non-starter due to license issues.
Drupal is released under GPL 2 and Daisydiff under Apache 2, and the GNU Project lists the Apache License 2.0 as incompatible with GPL version 2.
From Various Licenses and Comments about Them - GNU Project:
http://www.gnu.org/licenses/license-list.html#GPLCompatibleLicenses
#5
Like many other Drupal modules, it can be used as an optional third-party library. It does not have to be bundled with this module. It might also be possible to use the Java version.
#6
subscribing
#7
I've been playing around with a couple of diff packages. I'm hoping to have a test patch of one of them up and running soon, and I expect to post back within a few days.
#8
subscribe
#10
I've created a modified version that uses PEAR's Text_Diff package, which must already be installed on your system.
It's still quite kludgy at the moment, but it appears to work decently enough (but only for nodes in the current version).
I've attached a screenshot along with the patch files.
Also, it might be worthwhile to combine this with http://drupal.org/node/372957 to create a simple configuration screen to toggle between the diff engines and whether or not to include markup.
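As a rough illustration of what such a toggle could look like, here is a small Python sketch. It is not the attached patch and not Drupal code: the settings keys engine and include_markup, the ENGINES registry, and the function names are all hypothetical, and both engines are built on Python's difflib rather than on PEAR's Text_Diff.

```python
import difflib
import re

TAG_RE = re.compile(r"<[^>]+>")


def line_engine(old, new):
    """Line-based diff: unified diff lines with no context."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm="", n=0))


def word_engine(old, new):
    """Word-based diff: only the removed ("- ") and added ("+ ") words."""
    return [d for d in difflib.ndiff(old.split(), new.split())
            if d[:2] in ("- ", "+ ")]


# Hypothetical registry that a settings screen could choose from.
ENGINES = {"line": line_engine, "word": word_engine}


def run_diff(old, new, settings):
    """Apply the configured engine, stripping tags unless markup is wanted."""
    if not settings.get("include_markup", False):
        old, new = TAG_RE.sub("", old), TAG_RE.sub("", new)
    return ENGINES[settings.get("engine", "line")](old, new)
```

With include_markup off, a change that only touches tags produces an empty diff; with it on, the same change shows up, so the two settings are independent of which engine is selected.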
#11
If you're interested you might also want to look at:
http://drupal.org/project/lifewire_diff
It's only available for Drupal 5, but it also uses PEAR Text_Diff, and I used it as a jumping-off point for my experiment. It also allows users to select between the two-column and single-column diff views.