When I first got into Drupal, it came with the promise that it is fully standards compliant. Well, for starters, checking the Drupal.org site can be pretty disappointing--w3's validator shows quite a few errors--and that's not even XHTML Strict! I have been checking quite a few Drupal sites lately, and to my amazement, there was not one site that fully validated!

Today, I saved the default front page of a fresh Drupal 6 installation (using the Garland theme), and validated it via upload. Again, more than 30 errors were reported. And that's the default page!

I hope there is a good explanation, but I wonder where all these syntax errors come from. Is it because of the way Drupal outputs data, or is it due to the theme? Finally, why are we using default themes that generate a lot of bad code?

I believe everyone should be concerned about web standards, and we all know why. But the fact that Drupal doesn't validate out of the box is something to be worried about.

What can a new Drupaler do to make his site standards-compliant?

P.S.

Validating Wordpress.org

Comments

heine’s picture

Beware that a browser may mutilate pages if you save them. For instance, the default Drupal 6 front page saved with Google Chrome won't validate. When I save it with wget, it passes validation without a problem.
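If wget is not at hand, the same idea can be sketched in a few lines of Python: fetch the bytes the server sends and write them to disk untouched, so the validator sees Drupal's markup rather than a browser's re-serialization of it. The function names and the example URL below are my own placeholders, not anything from Drupal or wget.

```python
import urllib.request


def fetch_raw_markup(url: str) -> bytes:
    """Return the response body exactly as the server sent it."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_for_validation(url: str, path: str) -> int:
    """Write the untouched bytes to disk; returns the number of bytes written."""
    markup = fetch_raw_markup(url)
    # Binary mode: no newline translation and no re-encoding.
    with open(path, "wb") as handle:
        return handle.write(markup)
```

Calling something like `save_for_validation("http://localhost/", "frontpage.html")` (a hypothetical local address) should produce a file that can be uploaded to the validator as-is.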

yelvington’s picture

  <!-- Note: does not validate. We would like it to, but that would mean reduced user experience for the majority of our visitors. -->

Drupal core fully supports valid XHTML. Browsers used by real people, on the other hand, don't necessarily. So theme developers are forced to work around bugs.

apachelion’s picture

Drupal.org is not the "only" site that doesn't validate--actually, I have yet to see a Drupal site that fully validates! I don't know what the coders mean by "reduced user experience", but web standards are created so that everyone can have the best experience across all platforms and browsers.

Anyway, I am simply asking where the bad code comes from, and how one can start fresh with clean XHTML and CSS. Are the inconsistencies all due to the theme? If so, how is one supposed to go about designing the frontend?

Thanks.

cog.rusty’s picture

Probably only the specific errors reported by the validator will tell you what is wrong. Some module may be producing them.

I guess that your statement "there are no Drupal sites that do validate!" was added just for effect. How could you possibly verify such a statement?

"Reduced user experience" may mean that the theme would break in some browsers. Usually this is more important to the visitors than validation. Of course it would be better if there was a way to validate *and* not to break in any important browser, and I guess drupal.org people would be open to any suggestions which work.

apachelion’s picture

I didn't mean to bash Drupal developers or Drupal itself. I believe I made myself clear that the reason for starting this topic was that I couldn't find a single Drupal website that passed W3's validator. While I can't say *every* Drupal installation doesn't validate, I can assure you I checked quite a few of the most popular sites (using Buytaert.net and Drupal.org), and they simply don't validate! Now, I assume there are very proficient teams of coders behind the majority of these websites, and this led me to think that there is something in the software that might be problematic. Actually, that's why I am asking! What can I do to make my site adhere to standards, if even a default installation produces errors?

I gave Wordpress.org as an example because it seems like Wordpress's developers have found ways to make their software fully comply with W3C standards. At least the project's home page passes validation completely.

cog.rusty’s picture

Fair enough.

What validation errors are you getting in a default Drupal installation?

apachelion’s picture

The majority of errors are due to improperly closed tags, others due to starting tags. Typical warning messages are:

start tag was here ...
end tag for "..." omitted, but OMITTAG NO was specified

The document type is not properly recognized either.

There are some irritating XML parsing errors like:

Opening and ending tag mismatch
Premature end of data in tag meta line...

On some of the other pages I've checked, one of the most common errors was linked to the ALT attribute.

Obviously, these are not extremely bad errors per se, but having 100 of them in 100 different places can be very distracting when you are trying to create a website. Unfortunately, my knowledge of PHP is still very limited, but my understanding is that these XHTML inconsistencies will appear on every page Drupal generates--i.e. it won't be just one page that is affected, it will be all of them, since they are all dynamically created. My guess is I can't do a lot about this with my HTML/CSS abilities. What do you think?

cog.rusty’s picture

I am not getting those. Are you using Garland?

It would help if you could give one specific example of a tag where this is happening and the exact error message (forget the other 99 for the moment), to establish what we are talking about and identify where it comes from. Does it refer to a css style class or ID? (which one?). Or is it in explicit HTML in the content?

Jeff Burnz’s picture

Sounds to me either you are using a broken theme or borked your own content.

Jeff Burnz’s picture

My site validates - http://adaptivethemes.com/ - granted the validator chokes on some internal pages, but as previously stated, this is the real world.

Default Garland validates - http://adaptivethemes.com/starter-themes/ (the default theme is Garland unmodified), this site uses many Drupal core modules and even some from contrib, no problems validating.

mariusilie’s picture

There are a lot of Drupal websites that pass W3C validation. The problem is not Drupal. It could be the site content or the theme. If you have embedded YouTube clips in your content, the page will never validate as XHTML.

This is a simple theme I made for a website and it is w3c valid. As you can see, there are drupal websites that are fully validated.

website: http://dev.mariusilie.net/
validating website: http://validator.w3.org/check?uri=http%3A%2F%2Fdev.mariusilie.net%2F

apachelion’s picture

OK, I tried a few different ways, and here are the results. I created a fresh Drupal installation, and saved the default front page (with the user logged out.)

Since this is a local installation, I can't verify the page in a way other than file upload, so keep in mind I am not validating a web address, but a saved page. I now understand why your pages validate--you validate the plain html only. It seems like the problems appear when I make a full save of the page (with images etc.) If I save the HTML only, the pages validate, but when fully saving with Firefox, I get these errors:

Validation Output:  50 Errors

Line 1, Column 109: DTD did not contain element declaration for document type name

      …org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Line 2, Column 77: document type does not allow element "html" here

…http://www.w3.org/1999/xhtml" lang="en"><head>

Line 5, Column 68: end tag for "meta" omitted, but OMITTAG NO was specified

… content="text/html; charset=UTF-…

Line 5: start tag was here

><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

..................

Line 18, Column 9: XML Parsing Error: Opening and ending tag mismatch: link line 15 and head

  </head><body class="sidebar-left">

Line 74, Column 18: XML Parsing Error: Premature end of data in tag div line 24

    </body></html>

These are the basic types of errors. The end-tag-related and the XML-related ones are simply repeated throughout the page.
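The "Opening and ending tag mismatch" family of errors can be reproduced outside the validator with a small well-formedness check. This is only a sketch using Python's standard XML parser, not the validator's own logic. The second string copies the meta tag from the saved page quoted above; the first is the self-closed form that XHTML requires, which I am assuming is what the server actually sends.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """True if the fragment parses as XML, which XHTML markup must do."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Assumed served form: the empty element is self-closed.
served = '<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head>'
# Form found in the browser-saved file: the closing slash is gone.
saved = '<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>'

print(is_well_formed(served))  # True
print(is_well_formed(saved))   # False
```

Once a single empty element loses its closing slash, every enclosing tag appears mismatched to an XML parser, which would explain why the same two error types repeat down the page.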

Oddly enough, with IE8 the validator recognizes the page as HTML 4.0?! However, there are fewer errors (13), most of them linked to the CSS. Here they are:


Line 3, Column 32: Attribute "XML:LANG" is not a valid attribute

<HTML dir=ltr lang=en xml:lang="en"

Line 4, Column 7: Attribute "XMLNS" is not a valid attribute. Did you mean "onmouseup" or "onmouseover"?

xmlns="http://www.w3.org/1999/xhtml"><HEAD><TITLE>drupal1</TITLE>

Line 6, Column 31: NET-enabling start-tag requires SHORTTAG YES

rel="shortcut icon" type=image/x-icon href="/misc/favicon.ico"><LINK

Line 7, Column 25: document type does not allow element "LINK" here

rel=stylesheet type=text/css href="drupal1-ie_files/node.css" media=all><LINK

....................

Line 17, Column 25: document type does not allow element "BODY" here

<BODY class=sidebar-left><!-- Layout -->

Line 30, Column 8: an attribute value must be a literal unless it contains only name characters

action=/node?destination=node>

The other errors are simply reiterations of the errors found on lines 6 and 7.

What do you think is the cause for this strange behavior? Is there something that can be done to fix this?

Jeff Burnz’s picture

ummm, the software you are using to save the page is munging the html, clearly, e.g. there's no way in hell Garland uses upper case tags!

Look in the D6 Garland page.tpl.php <body<?php print phptemplate_body_class($left, $right); ?>> - lower case tag!

The way to do it is "view source" and paste the output into the validator as direct input.

So, the only thing to fix is your own methodology.
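For anyone who wants to check a saved file before blaming the theme, here is a rough heuristic in Python that looks for the two telltale signs seen in the IE8 output above: upper-case tag names and unquoted attribute values. The function name and the labels it returns are made up for illustration, and the regular expressions are a simplification, not a real HTML parser.

```python
import re

TAG = re.compile(r"<[^>]*>")                     # anything that looks like a tag
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")
QUOTED = re.compile(r"\"[^\"]*\"|'[^']*'")       # quoted attribute values


def rewrite_signs(markup: str) -> set[str]:
    """Return hints that a browser re-serialized the markup on save."""
    signs = set()
    for tag in TAG.findall(markup):
        name = TAG_NAME.match(tag)
        if name is None:
            continue  # comments, doctype, processing instructions
        if name.group(1) != name.group(1).lower():
            signs.add("upper-case tag name")
        # Drop quoted values; any "=" still followed by text is unquoted.
        if re.search(r"=[^\s>]", QUOTED.sub("", tag)):
            signs.add("unquoted attribute value")
    return signs


print(rewrite_signs('<BODY class=sidebar-left><!-- Layout -->'))
print(rewrite_signs('</head><body class="sidebar-left">'))
```

The first call reports both signs for the line saved by IE8, and the second returns an empty set for the lower-case, quoted form. An empty result does not prove the file is untouched; it only means these two particular symptoms are absent.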