Hi,
I was under the impression that HTML Purifier would correct my HTML issues (according to standards); however, it seems to remove almost everything, including images and Google Maps.
Is there any way of fixing this, or is this just normal behaviour?
Comment | File | Size | Author |
---|---|---|---|
#43 | html-one.png | 24.35 KB | Anonymous (not verified) |
#20 | iframes.patch | 1.5 KB | devkinetic |
Comments
Comment #1
sp_key commented:
Update:
Articles I've written that contain images are stripped as soon as I install HTML Purifier (without even enabling it).
Looking at the input formats, I can see Purifier is not enabled for all profiles, yet I still have no images on my site. As soon as I uninstall the module, my images come back.
Any help or suggestion would be hugely appreciated!
Comment #2
ezyang commented:
Hi!
External images are disabled by default, since Drupal's default HTML doesn't allow any images at all. You can turn them back on by setting "DisableExternalResources" to No.
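For anyone configuring the library directly rather than through the module's form, the same switch appears to correspond to the `URI.DisableExternalResources` directive. A minimal sketch, assuming the HTML Purifier library can be loaded from the path shown (the path and the sample image URL are placeholders):

```php
<?php
// Assumption: the HTML Purifier library is installed and this path resolves.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// "No" in the settings form corresponds to false here.
$config->set('URI.DisableExternalResources', false);

$purifier = new HTMLPurifier($config);

// Hypothetical input: an image hosted on another domain.
echo $purifier->purify('<img src="http://example.com/photo.png" alt="photo" />');
```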
As for Google Maps, it utilizes iframes, which are disallowed by HTML Purifier for obvious security reasons. There are some ways to work around this, but it will require writing a nominal amount of code. I can teach you how to do it, but it'll be kind of nontrivial if you've never written PHP before.
Cheers,
Edward
Comment #3
ezyang commented:
Comment #4
sp_key commented:
ezyang,
Many thanks for your response.
I can confirm that turning DisableExternalResources off does indeed allow us to embed images from external sources.
With regard to Google Maps, can you please think of a workaround that would allow us to use them?
I'm afraid my knowledge of PHP is simply nil. I'm confident enough to paste some code into a file, but writing my own code...
Any alternative approach would be extremely appreciated!
Cheers
Comment #5
ezyang commented:
Would SafeIframe functionality work for you? This would require you to explicitly whitelist domains that you'd want to allow iframes from.
Comment #6
bryancasler commented:
I would like to second the SafeIframe whitelist idea. I think being explicit about who you trust is a perfect solution.
Comment #7
sp_key commented:
Sounds like an excellent idea.
Can you point me in the right direction? I need to see a few examples, maybe an article or a Drupal resource?
Many thanks!
Comment #8
ezyang commented:
Renamed.
Comment #9
ezyang commented:
We also need this because YouTube changed its embed code to use iframes. I need some UI advice from you guys: what kind of whitelisting mechanism do you want? Domains? Regexes? Arbitrary code? If we allow multiple whitelisting mechanisms, how do they interact with each other?
Comment #10
bryancasler commented:
Domain whitelisting would work to solve issues for non-mainstream websites.
Example Embed Code
ex www.democracynow.org
Comment #11
gesko commented:
I have some problems embedding Amazon banner code:
<iframe src="http://rcm-de.amazon.de/e/cm?t=xxxxxxxxxxxx&o=3&p=20&l=ur1&category=generic&banner=1VH46RJT28QKG4Q5HM02&f=ifr" width="120" height="90" scrolling="no" border="0" marginwidth="0" style="border:none;" frameborder="0"></iframe>
I think domain whitelisting would be great.
Any other ideas on how I could embed this code in a block with htmlpurifier turned on?
Comment #12
kevinquillen commented:
I had to write a Filter for HTMLPurifier and tell the HTMLPurifier module to add the filter in the config:
http://stackoverflow.com/questions/5144189/htmlpurifier-iframe-regex-iss...
Now I can embed Google maps and other iFrame content.
It would be nice to add a domain whitelist, so iframes would be allowed if the source was Google, Youtube, Vimeo, etc.
Comment #13
mgifford commented:
Ya, this is annoying. I had trouble embedding YouTube videos with this module enabled.
Comment #14
ParisLiakos commented:
@Kevin Quillen:
Can you provide more specific details on how you managed this?
I did what you mention on Stack Overflow and also read the HTMLPurifier forum, but it won't work :(
Comment #15
kevinquillen commented:
In the HTMLPurifier module you also have to add to _htmlpurifier_get_config():
$config->set('Filter.Custom', array( new HTMLPurifier_Filter_MyIframe() ));
Comment #16
kevinquillen commented:
I know I should not hack the module, unless there is a hook I simply did not see.
It might be best to have the _config function invoke a hook so other modules can set their own filters or other HTML Purifier settings through code.
In my case, I cannot enable Advanced mode with the iFrame plugin (PHP error, something about it cannot render it in the form). So I had to adjust the module. Is there any other way to change settings through code without editing the module? I could not get the format specific config file to work.
Comment #17
ParisLiakos commented:
Thanks a lot Kevin!!
I know that I shouldn't hack the module either, but my client can't wait :/
So I'll just add this to my list of hacked modules to watch out for on upgrades.
Comment #18
devkinetic commented:
Kevin,
Can you provide a more detailed explanation?
I added:
$config->set('Filter.Custom', array( new HTMLPurifier_Filter_MyIframe() ));
to _htmlpurifier_get_config()
But I'm unsure where to add the snippet from http://stackoverflow.com/questions/5144189/htmlpurifier-iframe-regex-iss....
Thanks!
Comment #19
ParisLiakos commented:
devkinetic,
I added it to HTMLPurifier_DefinitionCache_Drupal.php and it works perfectly :)
Comment #20
devkinetic commented:
UPDATE: The issue I was having was that the line break converter in Drupal was wrapping the iframe in a P tag. The code was working correctly, but because the block element was placed within the inline p tag, Purifier was stripping it out anyway because it was invalid HTML.
Here is a patch file that combines the suggestions in this thread.
Back to the main point, though: a safe-list sounds like the best bet!
Comment #21
ParisLiakos commented:
Yep, +1 for domain whitelisting
Comment #22
El Bandito commented:
Another +1 for whitelisting.
Cheers
El B
Comment #23
kevinquillen commented:
I was able to get this working in 7.x but only briefly. Returning to the Text Format config form for any format utilizing HTML Purifier results in the following PHP error:
Object of class HTMLPurifier_Filter_MyIframe could not be converted to string (function: HTMLPurifier_Printer_ConfigForm_default->render())
The page is not editable, as it just shows a generic error message. The error also points to line 266 of ConfigForm.php in the Printer library of HTMLPurifier:
Commenting out $value makes the form show up.
What are some possible solutions to this problem? Is it the plugin code, or the way it is being interpreted? Casting the imploded value there to (string) also makes the form reappear, though I do not know what implications that has for the library.
Comment #24
kevinquillen commented:
Comment #25
btopro commented:
Version 4.4.0 of HTML Purifier now supports SafeIframe.
Comment #26
ezyang commented:
Fixed. You need HTML Purifier 4.4.0, and you need to access the "Advanced Settings" (they are not shown in the basic settings). The configuration you need to set is: turn on HTML.SafeIframe, and fill in URI.SafeIframeRegexp with the necessary values. Here is an example that allows YouTube and Vimeo:
%^http://(www.youtube.com/embed/|player.vimeo.com/video/)%
Don't forget to add iframe (and the necessary attributes) to your allowed elements list, if you are manually configuring this.
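Outside the module's Advanced Settings form, the same two directives can be set in code. A minimal sketch using the standalone library, assuming it can be loaded from the path shown; the sample iframe and its VIDEO_ID are placeholders:

```php
<?php
// Assumption: the HTML Purifier library (4.4.0 or later) is at this path.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Turn on iframe support, restricted to sources matching the regexp.
$config->set('HTML.SafeIframe', true);
$config->set(
    'URI.SafeIframeRegexp',
    '%^http://(www.youtube.com/embed/|player.vimeo.com/video/)%'
);

$purifier = new HTMLPurifier($config);

// Hypothetical embed code; VIDEO_ID is a placeholder.
$html = '<iframe src="http://www.youtube.com/embed/VIDEO_ID" width="560" height="315"></iframe>';
echo $purifier->purify($html);
```

Running the same script with a src from a host that is not in the regexp is a quick way to see what gets removed.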
Comment #27
kevinquillen commented:
This still doesn't work. It gets stripped out.
Comment #28
btopro commented:
If you have the "remove empty" option enabled, then yes, it will. To fix this, give the iframe a name attribute and Purifier will skip its remove-empty rule for it. You might also have to refresh the page after saving; I've noticed I have to do this every time after saving a new node (6.x, but it should be the same).
Comment #29
kevinquillen commented:
It's so confusing. I turned those off and cleared the cache, but iframes did not show up until the 8th reload. Why is that?
Comment #30
oriol_e9g commented:
I have tested this and you have to change more settings.
1. You need: RemoveEmpty: No
2. If you have: RemoveEmpty.RemoveNbsp: Yes, then you need to add > RemoveEmpty.RemoveNbsp.Exceptions: iframe
3. If you use HTML Allowed > You need to add here: iframe[frameborder|marginheight|marginwidth|scrolling|src]
4. Put SafeIframe: Yes and I use for SafeIframeRegexp:
%^http://(www.youtube.|player.vimeo.|maps.google.|www.slideshare.)%
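The four steps above can be collected into one set of directives. This is only a sketch: it assumes the form labels map to the `AutoFormat.*`, `HTML.*` and `URI.*` directive names shown, the function name `iframe_friendly_settings` is made up for illustration, and the settings are only applied if the HTML Purifier library happens to be loaded.

```php
<?php
// Hypothetical helper: returns the directives from steps 1-4 as an array.
function iframe_friendly_settings(string $existingAllowed = ''): array
{
    $iframe = 'iframe[frameborder|marginheight|marginwidth|scrolling|src]';

    $settings = [
        // Step 1: do not strip "empty" elements such as <iframe></iframe>.
        'AutoFormat.RemoveEmpty' => false,
        // Step 2: only matters when RemoveNbsp is on; keeps iframes exempt.
        'AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions' => [
            'td' => true, 'th' => true, 'iframe' => true,
        ],
        // Step 4: allow iframes whose src matches the regexp.
        'HTML.SafeIframe' => true,
        'URI.SafeIframeRegexp' =>
            '%^http://(www.youtube.|player.vimeo.|maps.google.|www.slideshare.)%',
    ];

    // Step 3: only relevant if an explicit allowed-elements list is in use.
    if ($existingAllowed !== '') {
        $settings['HTML.Allowed'] = $existingAllowed . ',' . $iframe;
    }

    return $settings;
}

// Apply the settings only when the library is actually available.
if (class_exists('HTMLPurifier_Config')) {
    $config = HTMLPurifier_Config::createDefault();
    foreach (iframe_friendly_settings() as $directive => $value) {
        $config->set($directive, $value);
    }
}
```

Passing an existing list, for example `iframe_friendly_settings('p,a[href]')`, appends the iframe entry instead of replacing the list, since setting `HTML.Allowed` to the iframe entry alone would disallow every other element.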
Comment #32
theMusician commented:
This does not appear to work with 7.x-1.0-rc1 of HTML Purifier. I am using 4.4.0 of the HTMLPurifier library.
I wish to embed the following video.
http://www.youtube.com/embed/e3OthM-seJs?wmode=opaque
My Settings:
SafeIframe: Yes
SafeIframeRegexp: %^http://(www.youtube.|player.vimeo.|maps.google.|www.slideshare.)%
RemoveEmpty: No
RemoveEmpty.RemoveNbsp: No
I have added the following to AllowedFrameTargets:
_blank
_self
_top
_parent
I am using the default allowed HTML.
The output src attribute of the iframe is stripped out when I use HTML Purifier, however with Full HTML allowed I can see that the src that is output is as follows: //www.youtube.com/embed/e3OthM-seJs?wmode=opaque
I am guessing the regex is incorrect but everything I have tried is not working. The src link is being created by the media module filter that runs before HTML Purifier.
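The regex guess can be checked in isolation with plain `preg_match`, without HTML Purifier involved. This sketch only tests the pattern against the two forms of the URL mentioned above; it assumes the filter compares the regexp against the src as written, and it says nothing about what other filters do to the markup first.

```php
<?php
// The SafeIframeRegexp value from the settings above.
$pattern = '%^http://(www.youtube.|player.vimeo.|maps.google.|www.slideshare.)%';

// The same video, once with an explicit scheme and once protocol-relative.
$withScheme       = 'http://www.youtube.com/embed/e3OthM-seJs?wmode=opaque';
$protocolRelative = '//www.youtube.com/embed/e3OthM-seJs?wmode=opaque';

var_dump(preg_match($pattern, $withScheme) === 1);       // bool(true)
var_dump(preg_match($pattern, $protocolRelative) === 1); // bool(false)
```

Because the pattern is anchored on `^http://`, a src beginning with `//` cannot match it.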
Any ideas as to why I cannot get a video to appear? If 7.x-1.0-rc1 does not support this where can I grab the 2.x-dev version?
Comment #33
theMusician commented:
I tried this in another environment and have the same results. No YouTube video is shown. The src attribute is stripped out upon save when using HTML Purifier.
Comment #34
heddn commented:
I tend to agree it's a regex thing. Can you confirm whether this is an upstream library issue? If so, I'll point you to http://htmlpurifier.org/phorum/list.php?3.
Comment #35
theMusician commented:
I have tried this with the standalone PHP library and it works great. The settings in the code block match what I have in Drupal.
I followed this thread, http://htmlpurifier.org/phorum/read.php?3,6237,6237#msg-6237 to set up the standalone version.
Comment #36
heddn commented:
Make sure that none of the other filters, including core's HTML filter, break what is going on with htmlpurifier. Disable all the other filters and see if it still doesn't work...
Comment #37
heddn commented:
Comment #38
theMusician commented:
I apologize for the delay. I am only using videos in a few areas of this site. I have been using the default Full HTML text format for the moment.
I turned off the two other filters, image resize and convert media tags to markup. I also tried each with one on and the other off, in both combinations, with no luck. However, perhaps the media tag markup upon conversion is messing with HTML Purifier. I am converting the media tags first in the filter processing order and HTML Purifier is running last.
If I switch that order and have the media tag markup filtered last the videos are output correctly. I am guessing this just avoids the regex check applied by HTML Purifier.
If it helps in diagnosis, the media markup that is output if I do not convert the markup with the media module's filter is as follows:
Video 1:
[[{"type":"media","view_mode":"media_large","fid":"33","attributes":{"alt":"Intro.mov","class":"media-image","typeof":"foaf:Image"}}]]
Video 2:
[[{"type":"media","view_mode":"media_large","fid":"35","attributes":{"alt":"WWU Summer Commencement 2011","class":"media-image","typeof":"foaf:Image"}}]]
For now, I will keep the order swapped as it works and the media sources are currently vetted before being posted. Thank you for the help.
Working Filter Order on text format
Comment #39
heddn commented:
Glad that worked for you.
Comment #41
Anonymous (not verified) commented:
Have you noticed a problem where the embedded media is wrapped in paragraph tags and mucks up the markup when the source is viewed?
Comment #42
Anonymous (not verified) commented:
Going to reopen this because I'm having the same issue.
Yes, I understand the fix is to run the Convert media filter after htmlpurifier; however, that still isn't optimal, since you lose out on stripping the automatic paragraph tags that are wrapped around your embedded content.
To reproduce, use wysiwyg, media + media_youtube, and htmlpurifier. Remove all filters except Convert media and htmlpurifier. Run 1) Convert media before htmlpurifier, then 2) htmlpurifier before Convert media.
Create some content and insert a youtube video with media. You'll notice in the first case the iframe renders but the src and embedded markup do not exist so you get a blank square, in addition there are no P tags around the iframe's container div if you view source. In the second case, the iframe and video are rendered correctly, but viewing source you see a pair of empty P tags above and below the iframe container div.
Any ideas?
Comment #43
Anonymous (not verified) commented:
Here are two images to illustrate my previous post.
Comment #44
Anonymous (not verified) commented:
Here are two images to illustrate my previous post.
Comment #45
trkest commented:
Thank you oriol_e9g - this works for me!
Comment #46
hawkeye.twolf commented:
#30 worked for me too (Thanks, oriol_e9g!) but note that I had to clear caches after making the configuration changes. Probably just clearing the HTML Purifier cache at admin/config/content/htmlpurifier should suffice.
Not recommended, but you can allow content from all sources by using
%^.*%
in the SafeIframeRegexp field.
Comment #47
ergow commented:
#30 doesn't work for me; I lose images and YouTube and Vimeo videos. I'm using the HTML Purifier module 7.x-1.0 and the HTML Purifier library v4.5.0. Should I change to HTML Purifier 7.x-2.x-dev?
Core is 7.23.
Thanks a lot!
Comment #48
csuggs4 commented:
I've had some success with the steps in #30, plus the following regex for the SafeIframeRegexp:
%^(https?:)?//(www\.youtube(?:-nocookie)?\.com/embed/|player\.vimeo\.com/video/)%
This way it accommodates a src that starts with "//", as well as http or https. I got it from the HTMLPurifier documentation.
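To see what this pattern accepts, it can be run against a few sample URLs with plain `preg_match`. A small sketch; the helper name `iframe_src_allowed` and the sample URLs (including the VIDEO_ID and 12345 placeholders) are made up for illustration:

```php
<?php
$pattern = '%^(https?:)?//(www\.youtube(?:-nocookie)?\.com/embed/|player\.vimeo\.com/video/)%';

// Hypothetical helper: true when the src matches the whitelist pattern.
function iframe_src_allowed(string $pattern, string $src): bool
{
    return preg_match($pattern, $src) === 1;
}

// Hypothetical sample URLs covering http, https and protocol-relative forms,
// plus one look-alike host that should not be accepted.
$samples = [
    'http://www.youtube.com/embed/VIDEO_ID',
    'https://www.youtube-nocookie.com/embed/VIDEO_ID',
    '//player.vimeo.com/video/12345',
    'http://www.youtube.evil.example/embed/VIDEO_ID',
];

foreach ($samples as $src) {
    $verdict = iframe_src_allowed($pattern, $src) ? 'allowed' : 'rejected';
    printf("%-50s %s\n", $src, $verdict);
}
```

The first three samples match and the last one does not, because the escaped dots and the full `.com/embed/` path pin down the host. The looser pattern from #30, by contrast, would also accept the last URL, since it only requires the src to start with `http://www.youtube.` followed by anything.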
Now, I said "some" success. I'm using the Media embed toolbar button. I can get the video to embed, and it shows when I view the node, but if I go to edit it again, all the iframe stuff gets stripped from the field. Has anyone else had this experience?
Comment #49
k.dani commented:
Same problem. Just one more thing: if I disable and re-enable CKEditor on the node edit form, the iframe is removed in the same way as when someone edits an existing node.
Comment #50
heddn commented:
Based on the conversation here, this is more of a support question, not a bug.
Comment #51
gisle commented:
Since this is still active, I'll just point to an alternative solution for whitelisting (or "purifying") HTML while allowing iframes to be embedded, but only from whitelisted domains.
This is the WYSIWYG filter + Src whitelist text filter.