Closed (outdated)
Project:
Drupal core
Version:
6.x-dev
Component:
base system
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
2 May 2009 at 04:31 UTC
Updated:
2 Mar 2016 at 22:18 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
loze commentedsubscribing. Just noticed this happening after upgrading to 6.11.
Comment #2
ainigma32 commentedEven drupal.org is outputting the content-type meta tag twice...
drupal_final_markup was added to prevent certain attacks (see the diff here http://cvs.drupal.org/viewvc.py/drupal/drupal/includes/common.inc?r1=1.7...)
As for the patch; I suggest removing the first line of the function as well.
After that, set this issue to patch needs review so the your patch will be tested automagically.
- Arie
Comment #3
damien tournoud commentedThis was introduced by SA-CORE-2009-005 on purpose, to prevent non updated themes to cause issues. The duplication of the meta content-type header is harmless.
Comment #4
Anonymous (not verified) commentedHarmless but inelegant.
Comment #5
gábor hojtsyYes, it is inelegant, we (at the security team) all agree it is inelegant, ugly, or whatever you want to call it. We are not going to argue over it. Unfortunately, we found most themes vulnerable to this bug, so we needed a fix which solves the vulnerability with all themes regardless of their release schedules. Thus we needed to make sure that the content type value is output no matter what. We ensured that the output still validates against the W3C validator and that we found no standard saying we should not output the same meta tag two times. We also looked at the output in various browsers and found that it had no issues in practice.
Comment #6
jcnventuraJust one thing.. It seems that there's this thing called FastCGI that seems to have an issue with duplicate content headers. See #462336: TCPDF: Pdf generator Problem with Fastcgi for more details.
I would actually prefer if we could apply the patch in #1 with the removal of the first line as suggested in #2 to remove one of the content-type lines, at least to Drupal 5 and 6.
Knowing how drupal_final_markup() is working I see no conflict in applying this patch and still complying with all the security concerns expressed in #5.
João
Comment #7
gábor hojtsyjcnventura: First, did you verify that when you remove one of the content type headers, it is indeed working right with TCPDF?
Comment #8
peacog commentedMy website delivers HTML4, not XHTML, so this new feature means that my site doesn't validate because of the trailing slash. Is there some configuration setting that tells Drupal whether to output HTML4 or XHTML, so that drupal_final_markup() can check which to use?
Comment #9
gábor hojtsyDrupal outputs XHTML, and the suggested XHTML notation for HTML backwards compatibility (from the W3C) is the slash separated with a space before the closing >. Drupal hardwires outputting XHTML in so many places, that it is hard to believe that you managed to output straight HTML without heavy hacking.
Comment #10
damien tournoud commented#462336: TCPDF: Pdf generator Problem with Fastcgi is unrelated. The "duplicate" headers in the error message is not the HTML one, but a true HTTP header.
Back to "by design".
Comment #11
jcnventuraGábor: I am the print module maintainer, that bug was reported by one of the users, so no I can't try to reproduce it with only one tag.. For me TCPDF works fine with one or two tags.
The problem is with FastCGI which is stupidly aborting when it detects a duplicate header. That's a known problem and is already solved in their CVS snapshot (of November 2008).. If you follow one of the links in the other issue, there's a patch there which simply removes the duplicate detection code to avoid that problem. (http://www.fastcgi.com/archives/fastcgi-developers/2008-September/000048...)
However, I would guess that most host providers which use FastCGI will be using the official release version of 2007 and not the CVS snapshot, so it may be that Drupal itself may run into problems in that setup. On the other hand, I don't understand how he can be running print 6.x-1.7 (which because it uses drupal_final_markup() requires D6.11 or later), and not have the same problem with core Drupal...
Anyway, how bad is it to commit the patch in this thread and have a single Content-Type tag??
João
Comment #12
peacog commentedI didn't have to do much hacking, just a couple of lines in phptemplate_preprocess_page to replace '/>' with '>'. But I will no doubt run into more problems as I add more modules. I'm using HTML4 for historical reasons because I ported an existing site to Drupal. I guess it's time to bite the bullet and move to XHTML.
Slightly off-topic, are there any plans to support HTML5 if it ever takes off? At that stage there might be a need for a Drupal setting to allow a choice between XHTML or HTML5.
Comment #13
damien tournoud commentedI take it that jcnventura cross-posted.
Comment #14
gábor hojtsyjcnventura: The patch you posted deals with HTTP headers, not meta tags in HTML. Therefore looks unrelated.
Comment #15
jcnventuraYes, I cross-posted..
And indeed, you're right, that's an HTTP header, so I very much doubt that he'll be able to generate PDFs as it is. I will investigate the issue further with the user on the other queue..
João
Comment #16
jcnventuraYes, I cross-posted..
And indeed, you're right, that's an HTTP header, so I very much doubt that he'll be able to generate PDFs as it is. I will investigate the issue further with the user on the other queue..
João
Comment #17
ao2 commentedHi, I also came across this one.
For me how the fix is implemented is actually an issue because I would like to serve my pages with content type application/xhtml+xml and having two conflicting content types is not correct, not only inelegant, which I could well live with.
I can't even
preg_matchin myTHEME_preprocess_page()hook becausedrupal_final_markup()seems to skip all the output mangling paths.Could you please suggest a way to work around that? Otherwise I will have to hack the core (sounds like a mild menace, eh?) :)
Thanks for the attention.
With Kind Regards,
Antonio
Comment #18
markus_petrux commentedSorry, to re-open, but I disagree on the status and I think it could be easily fixed. :)
Any reason for not fixing drupal_get_html_head() ?
s/harmless/hack-ish, I would say, and it could be easily fixed with the patch provided on top of the issue. It is completely unnecessary that drupal_get_html_head() prepends the Content-type meta while this job is now done by drupal_final_markup().
As a follow up to #17, I would say drupal_final_markup() would have to check if a Content-type meta exists in the header and ensure it is placed on top of it. So maybe drupal_get_html_head() does not need to be fixed, but add a bit more logic to drupal_final_markup(). And maybe even it wouldn't have to be restricted to certain values of $hook ('page' and 'book_export_html'). Think about alternative templates to 'page' that are used to render iframes, etc (several contrib modules do). So I think the issue status would have to be switched to "needs work" and let's try to elaborate on this please.
The security issue was fixed, so now that we have (relative) peace, I think we can try to think about a better approach.
Comment #19
gábor hojtsy@ao2: the reason we implemented it in a way that themes cannot override is exactly to ensure that it is always present first.
@markus_petrux: the reason we left the meta in drupal_get_html_head() is that we thought we cannot be 100% sure that the regular expression matches (however simple that regular expression is), and we still need to have that meta information on the page. If you can find a way which totally ensures that all themes are fixed regardless of their state, then that would be great.
Comment #20
markus_petrux commentedSince Drupal 6 has no API for using a different document type (as requested by #17), one possible approach would be to apply the patch in #0. That would mean Drupal 6 is still forcing the eixstence of the content-type meta, but injected on a different place. ie. what was doing drupal_get_html_head() to the header output, it now would be done from drupal_final_markup() only, but not on both places, which is the cause of duplicated meta tags for content-type.
Also, at the bottom of theme(), where drupal_final_markup() is invoked, I would remove the condition that triggers the call to drupal_final_markup(), so a mini-patch for discussion would look like this.
Changes to theme.inc:
Changes to common.inc:
And if you wanted to provide support for different mime types (rather than forcing 'text/html'), then Drupal could provide a $conf variable or something else than can be defined by the theme. But this part would really be a "feature request". Hence, probably something to take into account for D7, maybe.
Comment #21
gábor hojtsy@markus_petrux: it is totally possible that some theme function is used to generate arbitrary XML or other markup. That could easily contain markup which matches
<head[^>]*>or have<head>with a different meaning then HTML. Also, as said, we cannot ensure that<head[^>]*>is actually a character sequence present in all themes, so we cannot say we ensured that the meta tag appears, if we only rely on that being matched. Plus the regular expression is purposefully case insensitive. You now put a case sensitive strpos ahead of it, so themes using<HEAD[^>]*>would not work. We went through discussing these options in the security team, and were not at all happy to come out with the solution as we did, but we had the duty to try and ensure security with all custom themes, whoever and however did them.Comment #22
markus_petrux commentedhmm... I see.
Maybe drupal_get_html_head() could add a placeholder instead of the content-type meta tag, and that could be used in drupal_final_markup() to catch up it's a page template that was fine with that content-type meta tag, so it can be now safely added by drupal_final_markup().
Se the mini-patch for discussion would look like:
Changes to theme.inc:
Changes to common.inc:
Now, you can be sure strpos() will match.
Comment #23
markus_petrux commented#22 follow up:
The string '<!--head-->' could be replaced by something a bit more esoteric to ensure it's unique enough.
...or if we don't find a way to remove the check in theme(), then just apply the patch in #0, so we don't output more than one content-type meta tag?
Comment #24
ao2 commented@Gábor: I understand why you did this way, I just didn't like the hardcoded content-type. But IIUC this is a non issue for D6 since the content-type is hardcoded in other places as well, and this discussion is Off Topic here anyway. So I'll have to hack the core for the time being. I just wanted to be sure :)
Opening a feature request for D7 (#475242: Support Content-Type different from text/html).
Sorry for the noise.
Regards,
Antonio
Comment #25
samuele commentedI have solved with this code in page.tpl.php file
Is there a way to do it with a function in template.php file?
Comment #26
ao2 commented@samuele, something like:
Comment #27
gerritvanaaken commentedThank you for the quick fix @samuele and @ao2!
This needs to be fixed in a proper way, because the HTML 5 validator throws an error with this. In this particular case the validator might be wrong, because it’s not a real error to have two identical meta tags, but still ...
Comment #28
shelleyp commentedHaving to echo the above, as I found the additional meta tag was added when trying to validate HTML5. It would also be nice to be able to provide our own input with this, to meet differing standards.
Note, you can fix one of the meta tags using template.php, but not the other. The one that comes before Title. The only option seems to be to crack the core code, which I won't just to validate.
Comment #29
jgadrow commentedTo paraphrase this issue using an old commercial as a base:
Themer: Hey! You got some of your model in my view!
Developer: Hey! You got some of your view in my model!
Both: *munch* Whoa! This tastes like utter crap!
The argument about allowing the extra output in html_get_head is a moot point. If you NEED to place this call into drupal_final_markup (because you can't trust html_get_head to protect you from the vulnerability) then if the theme developer has done something screwy enough to mess with your regular expression, you still haven't guaranteed that the design has the content-type as the first line of the head. This will still render sites utilizing that theme vulnerable to attack. Thus, you probably should not be claiming that you have 'solved' this vulnerability issue.
While I agree that security is critical, I'm conflicted because I also believe this violates MVC. Clearly, if the theme is where the issue lies, why should it be the model's job to fix it? Granted, there are multiple examples at how Drupal already breaks the principles of MVC, but that's another argument. At most, Drupal should put out a warning for site admins to check that their themes are outputting a content-type and place the responsibility for this back where it belongs: with the theme designers. If a site admin is doing their job, they should be looking for news updates from Drupal anyways and if they decide they can't take 30 seconds to view the source of their page and verify that they see a content-type output, they deserve to be hacked!
However, I should state that I'm very against 'Big Brother' tactics like this, the seat belt laws, and pretty much anything ever made "mandatory" for my personal safety. So take my opinion with a grain of salt. ;)
Comment #30
damien tournoud commented@jgadrow: this is only true in Drupal 5 and 6, where is was not pratical to update all the themes distributed on drupal.org *and* all the custom themes deployed in the several hundred thousands of websites around the world. This ugly regexp is not in Drupal 7.
And by the way: (1) Drupal is not MVC (I'm still not sure what MVC means in the context of a web application anyway), (2) drupal_final_markup() is part of the theming layer, so it could be argued that it is part of the View (at least in the definition of MVC you seems to have).
Comment #31
jgadrow commentedWonderful that it's not in Drupal 7. So, it's only in the currently supported, stable systems. Which is, what? MOST Drupal sites then?
And I noticed that you completely skipped the major statement that I made: the fact that this vulnerability STILL exists because you can't be 100% positive that theme designers have:
1) Placed the $head variable at the beginning of their tag (or placed it at all for that matter)
2) Not done something stupid to mess up the regular expression in drupal_final_markup
So, you're forcing themers to remove an extra declaration of the content-type (or just look inelegant and inept) on the basis that this allows you to 'solve' a security vulnerability. And it STILL does not solve the vulnerability 100% of the time.
Once again, why not just leave the declaration out of html_get_head and put out a security warning advising administrators to check the output of their sites? Site admins can then alter their themes or not (at their leisure) to fix the vulnerability. You're not forcing themers to overcome a kludgy 'fix' to a problem that may not be affecting their theme anyways. Since themes are separate from the Drupal core, future updates will not invalidate a change made by a site admin meaning they make the change only once. And if a new version of their theme comes out and they overwrite their current theme. They should either already know to check the output for the content-type, or, the theme is being actively developed and the theme developer should have already altered their theme to remove the vulnerability. Note: If they haven't that's a good indicator to go with a different theme developer!
Additionally, there's the added performance hit (albeit a small one) of constantly telling PHP to remove that extra declaration that the theme I'm using didn't need in the first place. Or the extra bandwidth being consumed by the unnecessary characters if it's left in. Some people ARE charged for bandwidth, so, technically, there IS a cost (and an actual, physical cost at that!) associated with leaving the extra declaration over-and-above future semantic compatibility. So, the statement that "declaring it twice is harmless" is also misleading.
And, as for a definition of MVC:
Model-View-Controller - An architecture for programs that have graphical user interfaces. The architecture separates those parts of the program that handle the application data (the model) from those that present that data to users (e.g. graphically, on the screen) (the views) and from those parts that respond to changes made to the data (e.g. by the user manipulating the data presented on the screen) (the controllers). The architecture defines the ways these parts of the program communicate. The parts of the architecture have high cohesion and they are connected in such a way as to give low coupling.
That definition is just as suitable for web applications as it is for 'standard' applications. And I don't think you're allowed to state that Drupal is "not MVC" and then make a claim that it still follows "MVC." It's fine for a program to NOT be MVC. I had already stated in my previous post that Drupal is not truly MVC (try making your theme output an XML document utilizing XSL(T) instead of (X)HTML and serve it correctly as text/xml without hacking core... I dare ya! lol) but has tried to, largely, not interfere in the presentational layer. However, to claim that it's not MVC and then state that it could be argued that it conforms to it is, at best, contradictory.
Also, I would like to state that I'm not 'upset' or 'angry.' I know Drupal has limitations (just like any other system does) but I do enjoy in participating in intellectual debate. So, please, I'm not intentionallly trying to step on any toes here, only give a new opinion. My native language just happens to be sarcasm. :)
Comment #32
damien tournoud commented@jgadrow: according to your definition, the whole theming layer (and more) is part of the "View", that includes drupal_final_markup(), making your point in #29 moot.
Comment #33
jgadrow commentedI have already yielded on the grounds of MVC (see #31) as it's not terribly important. However, if you insist on making this 'by design' is there a known issue section for Drupal? I believe you should add one in that case:
Known issue:
If you are running Drupal 5 or Drupal 6 and the theme that you're using has not output the $head variable, has not supplied its own content-type declaration, and has irregular code that causes the regular expression in drupal_final_markup to fail, you will be vulnerable to an encoding type attack. Additional errors may crop up in some instances if your theme is not removing an extra content-type declaration supplied by Drupal in an attempt to safeguard MOST systems from this vulnerability.
Comment #34
design4effect commentedWhether it validates or not, having duplicate headers is a sloppy fix to anything.
There is no way to avoid putting the Content-Type meta before the TITLE tag when dynamically generating titles with user content.
If no content type is set, IE tries to interpret this:
as UTF-7 which is this:
For a full explanation please refer to: http://code.google.com/p/doctype/wiki/ArticleUtf7
So that is not an issue.
The issue is:
- drupal_get_html_head() [/includes/common.inc] tacks it on to the top of the headers.
- then, drupal_final_markup($content) [/includes/common.inc] adds one immediately after the HEAD tag (if there is a HEAD tag)
The two functions above are called in the given order. The result... is 2 identical headers.
So
- Drupal always adds it to the top of Headers
- Drupal adds it after HEAD only if there is a HEAD tag
My solution:
If there is a HEAD tag
- If there is a CONTENT-TYPE tag, remove it
- Insert a CONTENT-TYPE tag immediately after the HEAD tag
I changed the drupal_final_markup($content) function by adding two lines at the top as follows:
Which also removes the trailing space to keep the code good looking.
- Kent
Comment #35
rzelnik commentedsubscribe
Comment #36
jantoine commentedsubscribe
Comment #37
hass commented+
Comment #38
robert-hartl commentedsubscribe
security issue is okay, but for themers with standards in mind there must be a simple way to influence this. Framework!
Comment #39
rlmaltais commentedI don't have an elegant solution for this, so take this with a grain of salt. This fix just 'feels' wrong to me.
1. It is fixing a security issue in IE, not drupal.
2. It is forcing output that may not be desired.
I don't know a whole lot about charsets, but I did notice this in the link from #34: "Declaring an incorrect charset can be worse than not setting one at all."
To be honest, I never would have noticed if I wasn't working with HTML5. I generally skip over this part of the header when I view source.
Like I said, I don't have a good solution for this, my current one is to disable the action in common.inc and code my themes correctly. It's a pain, but it works.
Comment #40
naxoc commentedSubscribe
Comment #41
asak commentedsubscribing...
Comment #42
ianchan commentedsubscribe
Comment #43
summit commentedSubscribing
I added http://drupal.org/node/451304#comment-1711102 to my page.tpl.php and it worked.
But I also want the title tag to be the first and not the content line, is this possible using this core-patch. Will that ever be committed?
Hopefully no security issues doing this?
greetings, Martijn
Comment #44
sreynen commentedAfter reading this thread, the two meta tags don't seem to actually be "by design," but rather just one way to accomplish the security goal here. Seems like it should be possible to find a solution that accomplishes that goal and does not involve duplicate content types. Would be nice if someone in the "by design" camp would comment on the general idea of removing existing content types before inserting the new tag in drupal_final_markup().
Comment #45
hedac commentedamazing silly bugs drupal has...
oh well...
Comment #46
kimblim commented..unless you care about the HTML output of your page. Come on, outputting it twice is a hack and you know it - why not check if it's already been set and then decide whether or not to output it?
Comment #47
summit commentedAgree, also out of SEO-purposes it is better not to show it twice!
Greetings, Martijn
Comment #48
damien tournoud commentedWe output valid HTML (nothing in the (X)HTML specification says you should only have one of those meta-tag.
But yes, it is a hack. It was made to make everyone safer and it will be gone in Drupal 7. Overall, not a big deal at all.
Feel free to come up with an efficient code to "check if it's already been set and then decide whether or not to output it". I would be happy to review this patch.
You should fire your SEO consultant.
Comment #49
kimblim commentedValid HTML is not necessarily good HTML! Layouts built with tables validate, that doesn't mean that it's good HTML... The HTML validator service is a tool, not a stamp of approval.
Comment #50
Nagyman commentedThis is exactly how I came across this. I was generating a NewsML feed, which defines a
<HeadLine>tag. Mysteriously, a meta tag was being injected into the content. Ugh.Comment #51
MatthieuP commentedsubscribe
Comment #52
mattiasj commentedsubscribe
Comment #53
gollyg commentedsubscribe
Comment #54
gollyg commentedClearly there are some valid reasons for putting in this code, but it is (as everyone seems to agree) an inelegant hack. The people complaining here seem to be those who actually value their html validation (I include myself amongst them).
A major issue seems to be the lack of an option to disable the behaviour, and the way that this update has silently broken validation on, ironically, any site that is concerned enough about security to apply that upgrade. This is a slight betrayal of trust which is why it is evoking a strong response. I am not criticising the security team - they do an amazing job and I would rather they do things like this than not.
I have two ideas that may be worth discussing.
And on the issue of xhtml output, there are valid reasons to use HMTL 4.01. It would be great to have a theme level setting that triggered the empty tag syntax. I know, scratch my own itch ;)
Comment #55
hendrinx commentedSorry - mispost
Comment #56
hendrinx commentedDid you manage to come up with a workaround? I've got a similar issue when generating xml, which has a tag defined in it. Cheers
Comment #57
d3zign7reak commentedSubscribing!
Comment #58
mcrittenden commentedSub. Changing to CNW based on #48:
Please don't change back to 'by design' prematurely. Yes, it was originally done by design, but it's a hack and it's a pretty big DrupalWTF for D6 at the moment. Let's see if somebody wants to try and tackle this the right way.
Comment #59
brianmercer commentedsubscribed
Comment #60
sartogo commentedJust a small tweak to #25 to still give out the character encoding for html 5
You still need to edit drupal_final_markup($content) [/includes/common.inc] like this:
-- Alex Sartogo
Hot Sauce Studios
Comment #61
laja commentedsubscribe
Comment #62
denis_sle commentedIt's possible to solve the problem on theming level (without patching the drupal core)
Something like next line in your page.tpl.php (in used theme folder)
Comment #63
mcrittenden commentedRe: #62, first of all, something like that should go in template.php, and also, won't your example remove all instances of it?
Comment #64
ao2 commented@denis_sle, please see #26, and yes, with this _workaround_ you still need to emit your own content-type in page.tpl.php
Bye,
Antonio
Comment #65
sreynen commentedIn an effort to move a fix forward, I'm going to try to summarize the constraints such a fix would need to work under. Please correct any mistakes here:
Is that all correct?
Comment #66
danny englandersubscribing
Comment #67
System6Hosting commentedSubscribe.
Comment #68
antiorario commentedsubscribe
Comment #69
ayalon commentedstill an issue...
Comment #70
verikami commented...issue... and annoying thing...
I have a question for security team - what happens if we add this declaration to php header - as far as I know it takes precedence over the html declaration...
...is UTF-7 injection (in ie) is the case then...?
Comment #71
sobi3ch commentedsubscribing
Comment #72
codepeak-1 commentedsubscribing
Comment #73
fax8 commentedsubscribing
Comment #74
achtonsubscribing
Comment #75
sreynen commentedThis is a simple change to drupal_final_markup() that only adds the content-type meta tag at the top of head if we can't verify that it's already at the top of head. If we can verify it's already there, there's no benefit to adding it again, so we just return $content as-is.
Comment #76
perusio commentedThe patch in #75 is working correctly. Thank you.
Comment #77
intyms commentedThe patch from #75 fixes this issue.
I am using:
Drupal 6.17.
Comment #78
mcrittenden commentedRe: #75, seems like it might be a bit cleaner to only have 1 return statement, like so:
No?
Comment #79
sreynen commented@mcrittenden, I don't think that's really better, but it's pretty subjective and not a significant gain in speed or readability on either side (with the exact same functional result). If you think it's better, please post a patch so this can maybe get resolved before D8 is released.
Comment #80
mcrittenden commentedThis implements #78. Not sure what the Drupal coding style preference is on one or multiple return statements, so this may or may not be necessary, but it just looks better to me.
Comment #81
sreynen commentedLooks good to me. Since this is functionally identical to the previous patch, which had 2 positive reviews, I'm going to go ahead and move the status to RTBC.
Comment #82
gábor hojtsyBecause the security team introduced this code, I've sent a request to them to help double check.
Comment #83
pwolanin commentedLooks, reasonable - though this effectively is a minor performance hit for all sites without an updated theme, right?
Comment #84
davidwhthomas commentedSimilar to #26, this snippet will strip out only the duplicate metatag in template.php, without a core hack/patch.
Note: The assumes the tag is always output twice, which, as it's output in core is reasonable. But if a core patch is applied, this snippet will need to be removed.
*EDIT*
Here's a version that checks the string exists twice before stripping one:
DT
Comment #85
perusio commentedYes. It's a good idea. It can even be implement in a module:
Comment #86
guypaddock commentedHas anyone considered proposing a theme hook that allows the theme to certify that it correctly handles the content type? I mean, after all, the heavy-handed insertion of the content type meta tag was in response to lots of themes that exist that don't correctly set the content type. I would think that, with that in mind, it would make sense to give theme developers the option of turning off the insertion behavior simply by going the "Drupal Way" and implementing the appropriate hook.
I propose something like
THEME_content_type_is_provided().Comment #87
Kutakizukari commentedMy website will display just the source code one minute or the webpage the next. My web host says it is due of the duplicate content header tags. Which I really had three because I added one to my page.tpl.php file.
Comment #88
gábor hojtsy@Kutakizukari: did the patch solve your issue (I'm suspicious that it did not)?
@all: not sure its a good idea to preg_match case insensitively through the whole generated page content. That can easily get darn slow. Did you measure this?
Comment #89
sreynen commented@Gábor Hojtsy, this should actually be faster than the current code in the vast majority of cases, specifically cases where themes are doing things right, the cases we want to optimize for.
The existing code also does a case-insensitive search, but with preg_replace rather than preg_match. Because preg_match stops after the first match, and in the vast majority of cases a match will be found very early in the document, preg_match will almost never search the whole document. preg_replace, on the other hand, defaults to replacing all matches, meaning it currently does look through the whole document in every case.
I do think the preg_match could be case-sensitive, though, since it would still be case-insensitive on the replace. That would mean maintaining duplicate tags on invalid (uppercase) XHTML in themes, which I think is probably fine with people who don't care about valid markup anyway.
We could also apply a $limit of 1 to the preg_replace while we're at it, and speed it up even for the cases of bad themes. I'll make another patch with all of this soon if no one else gets to it first.
[edited to fix a typo]
Comment #90
sreynen commentedThis patch is essentially #80, with case-sensitive rather than case-insensitive preg_match. When I went to add a limit of 1 to preg_replace, I discovered it was already there, so some of what I wrote in #89 is wrong. Still, this should be faster than the current code in most cases now that it's case sensitive.
Comment #91
markus_petrux commentedJust a couple of minor improvements/suggestions over #90:
1) if the condition of the IF statement is reversed, then a) the patch is simpler, and b) the result of the preg_replace does not need to be assigned to $content.
2) preg_match could use a delimiter that does not exist in the regular expression, making it a bit easier to read.
The patch would look like this:
Comment #92
sreynen commented@markus_petrux, that's essentially undoing the change in #78, which is fine with me, but is starting to feel a bit like bikeshedding. Could you post an actual patch for review to keep this moving?
Comment #93
bleen commentedSubscribing
Comment #94
hellaswebnews commentedAnt help on the error below regarding printing would be appreciated.
www.hellaswebnews.com
Fatal error: Call to undefined function drupal_final_markup() in /home/.porche/myftp/hellaswebnews.com/sites/all/modules/print/print.pages.inc on line 32
array(4) { ["type"]=> int(1) ["message"]=> string(49) "Call to undefined function drupal_final_markup()" ["file"]=> string(80) "/home/.porchemyftp/hellaswebnews.com/sites/all/modules/print/print.pages.inc" ["line"]=> int(32) } n/a
Comment #95
sreynen commented@hellaswebnews, that looks like an unrelated issue that should have a separate thread.
Comment #96
dtsio commentedSubscribing
Comment #97
dragan_r commentedSubscribing
Comment #98
hillaevium commentedSubscribing :)
Comment #99
psynaptic commentedSeeing what happens...
Comment #100
psynaptic commentedOnly effects 6.x.
Comment #101
stemount commentedOnly affects 6.x.
Comment #102
momper commentedsubscribe
Comment #103
damienmckennaCould this be wrapped in a variable_get() check to allow some sites to disable it when needed? e.g.:
Comment #104
mcrittenden commented@DamienMcKenna, why would some sites need to disable it? Just curious as to what I'm missing.
Comment #105
sreynen commented@DamienMcKenna if you're going to create a variable (and I also don't see the point of that, as no one wants duplicate meta tags), you need to also provide an interface for updating that variable.
As this is currently in "needs review" status and we're well over a year in now, can we maybe focus on reviewing the current patch before we introduce more new ideas? If there's a better solution, let's state the benefits clearly and change the status to "needs work" while we work on that better solution. Or better yet, submit a better patch. But if there's just a different solution, that's not really helping move this forward.
Comment #106
TD912 commentedThis might also be affecting how favicons are displayed. The $head prints out a "shortcut icon" that does not change when I set a custom icon via the Theme settings. It always prints out "site/misc/favicon.ico". I could work around that by manually overwriting the favicon.ico file there, but I'd prefer not to.
Comment #107
hass commentedDo not highjack this case, please.
Comment #108
Kutakizukari commented@Gábor Hojtsy the host that I was using to get Drupal running I had to use Apache 1.2, PHP 5 Flex, CGI to get the file temporary directory right. I followed this https://members.nearlyfreespeech.net/wiki/Applications/Drupal.
File system settings and permissions
• All the Drupal files should belong to the web group (25000).
o All files you uploaded to the server will have your user ID for Owner and Group (A number like 165576, or "me"). The "web" group's number is 25000. You can change your files' group to web by typing "chgrp web filename" or "chgrp 25000 filename".
• sites/default/files or sites/default/yoursite.com/files should have 775 permissions, and again, belong to the web group.
• For the temporary directory, PHP's safe mode will keep it from writing to the system's /tmp directory. You should set this to your /home/tmp directory. To get the full path to this directory, visit your admin/reports/status/php page on your Drupal site and scroll down to the PHP Variables section and find the _SERVER["TMPDIR"] variable, which should look something like /f1/content/site-name/tmp/. Use that value for your Drupal site's temporary directory.
• Alternative: If the above doesn't work you might need to try creating a subdirectory of /home/tmp and chgrp it to web. Then in Drupal, go to Administer > Site Configuration > File System and update the temp directory path to use the subdirectory you just created. (Applicable to 6.16 and possibly other versions.)
Was able to get it to work with no problems running Apache 2.2, PHP 5.2 Fast, CGI.
Comment #110
mattwfx commentedWhat about delete some code from line 124 in includes/common.inc?
From:
$output = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />\n";To this:
$output = "";Simple... but it is safe?
_______________
Sorry for my English
Comment #111
nedim.hadzimahmutovic commentedSmall modification to #84:
Now there is no blank line.
Comment #112
anujahiniduma commentedtry this. this will do the magic.
function drupal_get_html_head() {
//$output = "\n";
return $output . drupal_set_html_head();
}
Comment #113
adamcadot commentedsubscribe
Comment #114
kndr#111 works. Thanks for simple workaround!
Comment #115
tomass_zutis commented#111 works. Thanks for simple workaround! +1
Comment #116
benschaaf commented#111 worked to remove the second occurrence for me as well- which is great!
Is it possible for the first occurrence to output in the html5 meta tag format for charset?
<meta charset="utf-8" />Comment #117
sreynen commented@snowfox, that format is only a new option in HTML5. The longer version Drupal uses is still valid in HTML5.
Comment #118
benschaaf commented@117 Correct. They would both still be valid in HTML5 (but what isn't?). But, I would prefer to use the correct(er) more valid(er) flavor of meta tag declaration.
Comment #119
lizac commented#111 also worked for me. Is this the best solution?
Also, just for the sake of the conversation on this thread. does this issue has been solved in D7?
Comment #120
rosshj commented@snowfox
re:
<meta charset="utf-8" />Would love to know how to do this as well.
Comment #121
arjun0819 commented#75: duplicate.patch queued for re-testing.
Comment #122
ReKoNe commentedIf you have more than 2 lines displayed (which should not append) :
this is a small change from #111
Comment #123
Eduart commented#111 it works for me thank you!
subscribed
Comment #124
chinita7 commentedThanks #111 also works for me.
Comment #125
janton commentedthanks!
Comment #126
jelo commentedSorry, I did not read the whole discussion, just glanced through and am wondering if 3 years after this was reported a solution has been found that
- avoids the security issue with IE
- avoids a duplicate declaration of content type?
In 6.26 core I am still having this issue... Is #111 the agreed on solution that should be implemented in the theme?
Comment #127
aaronmeister commentedThe snippet from #122 worked perfectly. Thank you for providing all the information necessary to insert this code to fix this issue, especially for a non-coder like myself.
I'm not sure if this is a core issue or a theming issue, but I think, for the betterment of the Drupal community, this should be included in documentation and applied to core files.
Comment #128
hass commentedComment #129
perusio commentedI think that the simplest thing is just to first remove all tags and add the correct one (for preventing UTF-7 encoding attacks on IE) later:
I'll provide a patch later (I'm on a very old D6 site now).
Comment #130
damienmckenna(fixed spelling of the "content-type" tag.
Comment #131
damienmckennaThis version of the patch removes all existing http-equiv meta tags and then adds its own properly-formatted one at the top, thus ensuring that a) there are no duplicates, b) the first tag in the HEAD section will always be a correct http-equiv meta tag. Additionally, it will use the first http-equiv 'content' definition it finds, which might be useful for sites that needed to hardcode its own version, but most sites will not notice this additional step. Lastly, it removes the meta tag that was manually included via drupal_get_html_head() as that had become superfluous.
Note: the logic will still create duplicate meta tags should an existing meta tag exist with either of the following formats:
Any feedback on this approach to resolve the problem?
Comment #132
damienmckennaDoh.
Comment #133
mheinke commented#132: drupal-n451304-131.patch queued for re-testing.
Comment #134
RytoEX commentedI hate to ask at the risk of stirring the pot, but what is the status of this issue? I notice that there is a patch that is marked "PASSED" (http://drupal.org/comment/reply/451304#comment-6483206), but then it was "queued for re-testing", and nothing has happened since then (September 2012). What exactly does that mean for this issue? Thanks!
Comment #135
mheinke commentedim not exactly sure if this was solved or not.
but it should be apart of a release.
Comment #136
geerlingguy commentedI would RTBC... but this adds some regex which would be run on every page request. It won't affect load times much, but is there any other way we could do this?