div tags joined together on input
nirbhasa - February 3, 2009 - 02:14
| Project: | Import HTML |
| Version: | 5.x-1.2 |
| Component: | Code |
| Category: | support request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | postponed (maintainer needs more info) |
Description
My input contains something like this
<div class="classA"><div class="classB">Content</div></div>
which upon being processed gets turned into
<div class="classA classB">Content</div>
The only issue is that my XSL template, which is set up to put anything in classA into the node body, no longer works. So I am wondering how to either:
- Change my xsl to cater for this
or (seems easier)
- try and change the HTML Tidy config so it doesn't join the classes (I assume HTML Tidy is the culprit). The config option 'join-classes' is off by default, and I can't see anywhere in the code where it is enabled.
Any ideas? Thanks very much in advance :)

#1
Sorted it out - I added
merge-divs: no
to the xhtml_tidy.conf
#2
#3
Wow. Never seen that before.
I'm pretty sure that none of my XSL would have tried to do that. It may have accidentally, but ??
Turn up the debug volume (somewhere in advanced settings - in the -dev branch at least) and confirm it's htmltidy.
If so, the way to pass config options is different depending on library or commandline versions. But no, it shouldn't be getting changed (never heard of the option before, myself)
#4
I turned up the debug to 4 in the .module file (I'm using 1.2 as .dev has unicode issues), and it was definitely at the HTML Tidy stage. For some reason I get switched to commandline. But everything is ok now
I think merge-divs: yes is automatically set when 'clean' is set to yes, but I may be wrong :)
ps the more i look at the inner workings of this module, the more impressed i am
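A minimal sketch of the Tidy settings discussed here, assuming 'clean' is enabled in xhtml_tidy.conf (the thread does not confirm this) and leaving out whatever else the file contains. According to the Tidy documentation, merge-divs defaults to auto and only modifies the behaviour of 'clean', which would be consistent with the guess above:

```
// Sketch only: not the module's actual xhtml_tidy.conf
// Assumption: clean is already set to yes somewhere in the file
clean: yes
// Stop Tidy from folding <div class="classA"><div class="classB"> into one div
merge-divs: no
```

With merge-divs set to no, the nested divs from the original input should come through Tidy unchanged, so an XSL template matching classA keeps working.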
#5
debuglevel 4 is really noisy ;-)
There is LOTS of debug scaffolding left in there from my own attempts to trace the process(es). One day I should pull some of that out. A good third of the code is docs or debug - at least!
but I guess you are seeing all the steps it takes to massage arbitrary unknown input into something useful!
Any suggestions for making the process and structure easier to follow (or debug) would be helpful. It's grown a bit without much of an architectural plan (first make it work, then make it clever), although I've tried hard to pull as many of the exceptions and quirks out as I can. Abstracting per-module support into the inc files was a recent rewrite. You should have seen it when it was all-in-one!
And yeah, I think I'm lost on the UTF thing. I also think I fixed it eventually with a vicious conversion to numeric entities. But I'm probably wrong.
I'll try to fold that conf fix into the code sometime.
.dan.