Hi,

When meta tag content is automatically generated from the node content, the output is not properly sanitized: raw markup from the node's text fields is inserted into the <meta name="description"... and <meta name="dc.description"... fields.

Configuration from ./admin/content/nodewords:

  • "Generate meta tag content when the meta tag content is empty"
  • Generation source: "Generate meta tags content from the node teaser"

Steps to reproduce: install a third-party input filter from contrib, e.g. mediawiki_filter, and configure nodewords as described above. You'll see output like this in the generated HTML code:

<meta name="description" content="&#039;&#039;&#039;A famous name&#039;&#039;&#039; from Czech Republic is &#039;&#039;&#039;doing an excellent job&#039;&#039;&#039; in &#039;&#039;&#039;healthcare&#039;&#039;&#039; for the &#039;&#039;&#039;[[United Nations]]&#039;&#039;&#039;." />

&#039;&#039;&#039; is the MediaWiki markup for bold text ('''...'''), with each apostrophe encoded as an HTML character entity. MediaWiki-style links ([[...]]) are not interpreted at all. The same happens with the <meta name="dc.description"... meta tag, if used.
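
To confirm what the entities stand for, here is a quick check that decodes the start of the generated attribute value. Python is used purely for illustration; it is not part of Nodewords or Drupal.

```python
import html

# First part of the content attribute from the generated meta tag above.
encoded = "&#039;&#039;&#039;A famous name&#039;&#039;&#039; from Czech Republic"

# &#039; is the numeric character reference for the apostrophe (U+0027),
# so decoding brings back the raw MediaWiki bold markers.
decoded = html.unescape(encoded)
print(decoded)
```

So the wiki markup reaches the meta tag untouched and is only escaped for the attribute; it is never rendered.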

I don't know if this is intended behaviour, but as it is, the output is useless as a meta tag. I'd think the node body would have to be run through the input filter first to become proper HTML, and then all HTML tags would have to be stripped before the meta tag content is inserted into the header of the resulting HTML page.
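
A minimal sketch of the order of operations I have in mind, in Python for illustration only. It is not Nodewords code: `toy_wiki_filter`, `strip_tags`, `meta_description` and the `/wiki/` link scheme are all made up for this example, and the toy filter only handles the two constructs shown above.

```python
import html
import re


def toy_wiki_filter(text):
    """Hypothetical stand-in for an input filter such as mediawiki_filter."""
    # '''bold''' -> <strong>bold</strong>
    text = re.sub(r"'''(.+?)'''", r"<strong>\1</strong>", text)
    # [[Target]] -> <a href="/wiki/Target">Target</a> (link scheme is made up)
    text = re.sub(r"\[\[(.+?)\]\]", r'<a href="/wiki/\1">\1</a>', text)
    return text


def strip_tags(markup):
    """Naive tag stripper, adequate only for this illustration."""
    return re.sub(r"<[^>]*>", "", markup)


def meta_description(raw_body):
    rendered = toy_wiki_filter(raw_body)   # 1. input filter: markup -> HTML
    plain = strip_tags(rendered)           # 2. strip all HTML tags
    return html.escape(plain, quote=True)  # 3. escape for the attribute value


body = "'''A famous name''' works for the '''[[United Nations]]'''."
print('<meta name="description" content="%s" />' % meta_description(body))
```

The point is only the sequence: filter first, strip tags second, escape last. Escaping the raw body directly is what produces the &#039; clutter shown above.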

Thanks & greetings, -asb

Comment  File                       Size       Author
#8       nodewords-n971428-8.patch  587 bytes  DamienMcKenna

Comments

DamienMcKenna commented:

Issue tags: +v6.x-1.12 blocker

Tag.

Pomliane commented:

Hi,
Same issue here without any input filter from contrib activated.
The following meta tags are not displayed correctly: keywords, copyright, description, abstract, dc.contributor, dc.creator, dc.description, dc.publisher, dc.title.

antiorario commented:

Seems like Nodewords should take input filters into account. I use Markdown, which obviously ends up appearing in the meta tags.

quicksketch commented:

Version: 6.x-1.11 » 6.x-1.x-dev
Issue tags: -v6.x-1.12 blocker

As this issue exists in both 1.11 and the latest 1.12-RC, it doesn't look likely to get fixed in the 1.12 release. At the same time though, this is an obvious problem. I'm kicking this issue to the next version, which we can address once we get the long-overdue 1.12 release out.

DamienMcKenna commented:

Status: Active » Postponed (maintainer needs more info)

This should go in nodewords_metatag_from_node_content() in nodewords.module. The question, however, is whether we should use node_build_content(), node_view(), or something custom based on manually applying the input filter. I'm personally leaning towards node_build_content(); any other suggestions?

DamienMcKenna commented:

Status: Postponed (maintainer needs more info) » Active

DamienMcKenna commented:

Status: Active » Needs review
Attached file: nodewords-n971428-8.patch (587 bytes)

This inserts a simple check_markup() on the output before any other parsing is done.

DamienMcKenna commented:

Status: Needs review » Fixed

Committed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.