Download & Extend

Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)

Project:Drupal core
Version:7.x-dev
Component:aggregator.module
Category:task
Priority:normal
Assigned:Unassigned
Status:needs review
Issue tags:needs backport to D7

Issue Summary

The aggregator modules uses check_plain() on the title of a feed item but this results in html entity ugliness with quotes and other characters.

Patch attached used php function strip_tags() instead. Not sure if this is the way to go.

AttachmentSizeStatusTest resultOperations
aggregator_2.patch828 bytesIdleUnable to apply patch aggregator_2.patchView details | Re-test

Comments

#1

Important clarification. This occurs when the feed already has html entities in it. For example, if the feed has a title of:

It's an Honor

(It's an Honor)

When it gets run through check_plain, the ampersand in the html entity get converted to an html entity, and it becomes:

It&amp#39;s an Honor

So the final output is:

It's an Honor

in the title.

#2

I can't properly display the html entities in the comments. It's all screwed up. Let me try with spaces before the semicolon. You should mentally remove them.

It&#39 ;s an Honor

gets converted to

It&amp ;#39; an Honor

which results in the following output:

It&#39 ;s an Honor

#3

Status:needs review» needs work

I'm pretty sure this is correct behaviour. Most feed fields contain regular, unmarked up text. Double-escaping like that in the source feed is only valid if the field contains escaped, marked up HTML text (such as the body).

In any case, this needs to be solved at the parsing stage, not the output stage. Drupal takes feeds in various formats.

Finally, using strip_tags() on output as a validation measure is disallowed. It should only be used as a conscious filtering op (such as done for taxonomy terms in title attributes).

#4

> Most feed fields contain regular, unmarked up text.

Well, Google uses htmlentities. They are a pretty major supplier of feeds.

#5

yes google news is pretty important and i like it, and it would be nice to have at least a checkbox available in aggregator setup to specify whether to unescape the title. i know nothing about this, and can't do this myself, i'm sorry to say. i had other google news related problems (duplicate news items) so i'm open to suggestions from those who may have good reasons for thinking google news is not important, or not best. thanks for anything anyone does to help us. please share if there's an adopted fix. thanks.

#6

Version:x.y.z» 4.7.2

Does anybody have brainstorming ideas of how to work around this? I'd settle for just dropping the entities from the RSS feed, which would be better than the current mash of ascii characters on the front page of the site.....

#7

Rather than strip_tags(), you'll want to use filter_xss($item->title, array()) around the item title. It's been shown on the security mailing list that strip_tags() can be bypassed in terms of XSS exploits.

#8

Version:4.7.2» 4.7.3

I downloaded newest version of 4.7.3 aggregator.module, and it still has the problem. I has to modify the file in multiple locations (using advise for filter_xss) to get it to work both in the block AND the "more" summary page... just a heads up. http://wellweight.org

#9

Version:4.7.3» x.y.z
Assigned to:Steve Dondley» edmund.kwok
Status:needs work» needs review

Changed strip_tags to filter_xss as suggested and in two other places, block_item and summary_item, including the original page_item.

Btw, issue also applies in HEAD, http://drupal.org/node/63459, marking that duplicate because this was reported first. Also, changing version to cvs to get more attention.

Patch is for cvs version.

AttachmentSizeStatusTest resultOperations
aggregator_3_0.patch1.88 KBIdleUnable to apply patch aggregator_3_0.patchView details | Re-test

#10

Patch for 4.7 cvs.

AttachmentSizeStatusTest resultOperations
aggregator_47.patch1.88 KBIdleUnable to apply patch aggregator_47.patchView details | Re-test

#11

In this issue (which got sidetracked), I suggested cleaning up the title before it is inserted in the database. This has the obvious benefit of cleaning it up only once, making my patch a lot smaller than the one proposed here.

The argument against cleaning up the tile before insertion in the database is that it is not the Drupal way to do so. For example, when adding a title to a node, the title is inserted into the database as is (after making it database safe); no removal of tags etc. The rationale behind this is that when the user goes to edit the node, he would expect his/her original input and not a parsed/filtered version of it. There is an obvious need to keep the original input, for consistency to the user.

In the case of the aggregator module however, there is no need to store the original input, which is machine generated and which will never be edited. So I'ld say: filter before insertion in the database, and be done with it.

#12

Version:x.y.z» 5.x-dev

Will this patch be part of drupal 5.0 core?
I am running drupal 5 CVS on a test site, and I still see & # 3 9 ; (spaces inserted there to prevent it from being parsed)

#13

Status:needs review» needs work

Tags need to be stripped from titles.

#14

Assigned to:edmund.kwok» Anonymous

Releasing this issue back into the wild. I'm not sure what's the best way to fix this..

#15

I see that this problem still exists in 5.1. Any sign of a permanent, main-stream resolution? I'll just go hack the code for now as this is UGLY.

#16

I agree 100%

#17

Version:5.x-dev» 6.x-dev
Assigned to:Anonymous» csevb10

I repatched this for Drupal 6.
There's an aggregator_filter_xss so you can actually filter the feed content independent of the site content which seems like a slightly better solution than invoking the filter_xss directly. I patched the same 3 areas as before.
What does everyone think of this as a viable solution moving forward?

AttachmentSizeStatusTest resultOperations
aggregator_checkplain.patch1.86 KBIdleInvalid patch format in aggregator_checkplain.patch.View details | Re-test

#18

Status:needs work» needs review

#19

I'm going to make a run at closing the Aggregator queue for D6.

Personally, I like http://drupal.org/files/issues/aggregator-striptags.patch and agree with ahoeben from #11.

This is machine-text, so keeping the "integrity" of the original title string is not relevant. Stripping the entities seems perfectly fine during the parsing stage.

#20

Status:needs review» needs work

No longer applies.

#21

Since it was only one line, in D5.3 I just made the quick edit, removed the items from the feed, updated the feed and had the same #39 business.

#22

What is the status of this problem? I use Google news and get this ugly output as well.

#23

Drupal's architecture needs to be changed to allow admins complete control over filtering of both input (what is stored in the database) and output (the final, displayed HTML). Further, this filtering should be customizable as per every content type (newsfeed, user comment, and so on). E.g., machine-generated newsfeed titles (database stored) can be filtered for XSS, appropriate HTML tags, etc., during the input stage to avoid the overhead of similar filtering everytime the title is output as web page content.

Also, appropriate code/principles should be used from better (than Drupal's Filter module) filtering scripts like htmLawed and HTMLPurifier. Though the mentioned scripts can be used in plugged-in modules, they then either cannot have certain functionalities (like, to address the newsfeed-entity issue), or re-do some actions that Drupal's core does anyway (increasing the processing time).

#24

What is the valid standard in XML - " or " or & #39;? I don't know how much more attention people pay to feed validity than website validity - would it be enough to require validity and reject quirky feeds, or do we lose a lot of functionality?

Edit: I note that whatever filter is here on d.o. is one whose code I would probably yell upon seeing. You do not unescape & amp; # 39; twice to make '. You just don't.

#25

Lines 289, 1179, 1267, 1352, 1367 and 1394 of the drupal/modules/aggregator/aggregator.module file (version timed 1-10-08 22:14) have code like:

... check_plain($item->title) ...

Looks like replacing all those six check_plain occurrences with aggregator_filter_xss should take care of the issue.

#26

I implemented the changes suggested by ahoeben above in his patch in post #11. Then I tried changing all the check_plain references to aggregator_filter_xss, but still got funky characters in Google News feeds on my site .

ahoeben's patch seems to clean titles that show up in feed blocks, but not in the Drupal aggregator category view which was still giving me funky single quotes in feed titles. It looks like the funky symbols are still being stored in the database, and ahoeben's patch fixes them when they are displayed in blocks.

Since the funky stuff was still getting displayed I wanted to remove them before they go into the database & thus make ahoeben's display patch unnecessary. So I added htmlspecialchars_decode() in the aggregator.module aggregator_parse_feed function.

It works great, whenever a "& #39;" shows up in the title of my news feeds, it's replaced with a single quote and then inserted into the database for later use.

Here's the line before:

$title = $item['TITLE'];

Here it is after:

$title = htmlspecialchars_decode($item['TITLE'], ENT_QUOTES);

#27

Can somebody let me know what patch/changes are recommended for a Drupal 6.3 environment ? I am happy to modify code, but wanted to know which code changes to make so that these special characters are changed before the google news feed items are written to the database.

#28

For me on 6.3 it's as the above post. I have a google news feed displayed in a block on a site, and the title links were showing &blah; style text - much in the same way that Firefox RSS feed reader does for sites like Slashdot et all interestingly.

Anyway - all I had to do was as jdefay says:

add htmlspecialchars_decode() in the aggregator.module aggregator_parse_feed function.

Here's the line before:

$title = $item['TITLE'];

Here it is after:

$title = htmlspecialchars_decode($item['TITLE'], ENT_QUOTES);

jdefay rox.

D

#29

I tried this, but when I update feed I get:

Fatal error: Call to undefined function: htmlspecialchars_decode() in /data01/mysite/public_html/drupal/modules/aggregator/aggregator.module on line 738

Is there something else I need to do so that this function is available ?

Thanks,
Tim

#30

I just noticed that the htmlspecialchars_decode() function is only available in PHP >= 5.1.0. I am using PHP 4.4.4 which is supported by Drupal 6.3. Unfortunately I am unable to upgrade PHP because my site is hostsed on a server managed by my ISP and uses cpanel. So, the version of PHP is out of my control.

Is there a PHP 4.4.4 function which will do same/similar, or do I need to suffer this until PHP 5.1.0 or later is available ?

Thanks,
Tim

#31

I found some suggestions at http://uk.php.net/htmlspecialchars_decode

So, I have now changed code as shown below:

if ( !function_exists('htmlspecialchars_decode') )
{
    function htmlspecialchars_decode($text)
    {
        return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
    }
}

    // Resolve the item's title. If no title is found, we use up to 40
    // characters of the description ending at a word boundary but not
    // splitting potential entities.
    if (!empty($item['TITLE'])) {
      $title = htmlspecialchars_decode($item['TITLE'], ENT_QUOTES);
    }

I don't get any error now, so I will see if I get any news items with special characters included. I will report my findings in next few days.

#32

This bug still exists as of D6.6 in core aggregator.module. Any plans for a patch to be brought into core?

--Brad

#33

Version:6.x-dev» 6.8

It also/still happens on the "domain.com/user" page where the title is generated dynamically below the login form.

#34

I can confirm that @jdefay's patch (#26) works well for >PHP5; I've been using it for months. I have not tested the other version, nor can I vouch for whether or not doing this is 'the Drupal way.' (I just know that my users need the feeds to work properly!)

#35

thank you jdefay in #26, after much useless information your advice has finally solved my problem :)

#36

jdefay,

I just looked at your site, which has a Google News Feed here: http://jason.defay.org/aggregator/categories (titled 'UCSD In the News').

How did you get that to work? When I setup something similar, all the ampersands (&) are stripped out of the links, which breaks them. Example:

http://news.google.com/news/url?sa=T&ct=us/0-0&fd=R&url=http://www3.sign...

http://news.google.com/news/url?sa=Tct=us/0-0fd=Rurl=http://www3.signons...

Any suggestions? Thx.

-- Cronin

#37

Hi Cronin,

I just looked at the Google News Feed 'UCSD In the News' you mentioned on my site and noticed it does in fact have the ampersands in the links. Are you seeing something different on my site?

I just upgraded to 6.10 and haven't reimplemented any of these patches described in this thread. So I can't say for certain if the patches will actually strip the Google tags or not.

What I can say is that I got frustrated with the way Google feeds were causing frequent problems with the Drupal aggregator. I switched over to Yahoo news feeds and am getting consistently clean results. It's too bad because I like the Google news service better and would prefer to use it over Yahoo. I just didn't want to deal with headaches anymore.

Jason

#38

All,

Since Cronin asked about it, I decided to go in and re-implement the simple patch I described in my earlier post to this thread.

I added the htmlspecialchars_decode function back into the aggregator.module, removed all the old items from the news feed, then refreshed it. Viola, no more funky characters in the news feed titles.

Since several new updates/releases of Drupal have happened since my original patch, and all of them still contain the old problematic PHP code, perhaps someone can volunteer to help me get this fix added to the next version of Drupal so we don't all have to keep re-patching every time a new release comes out.

Please feel free to respond to this thread and/or email me if you can help make this happen.

Cheers,

Jason

#39

Status:needs work» reviewed & tested by the community

Tim,

Have you had any errors on this yet? If not, your aggregator patch seems to be a better, more backward compatible, improvement over mine.

I implemented your version and tested it on my site. It works great so I'd appreciate it if any of you who are reading this & willing to help would do the same and report back here.

Cheers,

Jason

#40

Status:reviewed & tested by the community» needs work

Oops, I shouldn't have changed the patch status since it looks like the original patch isn't the one we trying to implement. I'm setting back to 'code needs work' until someone can create and submit a patch that meets the Drupal patch standards. Here is what I'm proposing someone help me do:

Write a valid patch that replaces line 737 through 741 with the following:

if ( !function_exists('htmlspecialchars_decode') )
{
    function htmlspecialchars_decode($text)
    {
        return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS)));
    }
}

    // Resolve the item's title. If no title is found, we use up to 40
    // characters of the description ending at a word boundary but not
    // splitting potential entities.
    if (!empty($item['TITLE'])) {
      $title = htmlspecialchars_decode($item['TITLE'], ENT_QUOTES);
    }

As soon as that patch is up and tested I'll work on getting it into the queue for the next Drupal release.

Cheers,

Jason

#41

Been using this patch for a while with no issues; would be glad to see it get in the next version.

#42

Jason,

"I just looked at the Google News Feed 'UCSD In the News' you mentioned on my site and noticed it does in fact have the ampersands in the links. Are you seeing something different on my site?"

No, your site looks fine. But my site (http://www.zts.com, see the section "Google Web Search Filtered for Recent Thermoelectric Content"). Check the links: "&"s are stripped out, and broken. I checked and they are stored in the DB that way too.

I just upgraded to drupal 6.9 (a day before 6.10 came out!).

Any thoughts on what's going on? Thanks for the replies.

-- Cronin

#43

Sounds like the patch isn't working for some reason. Have you done all of the following?

1. Checked to make sure you have PHP version 5.1.0 or greater installed on your server
2. Changed line 741 in the aggregator.module from $title = $item['TITLE']; to $title = htmlspecialchars_decode($item['TITLE'], ENT_QUOTES);
3. Removed all previous posts from your DB and re-imported them.

You might also try out the patch I just created below. Please give me feedback on the patch, it's in need of QA testing!

Thanks,

Jason

#44

Version:6.8» 6.10
Assigned to:csevb10» jdefay

Hi everyone,

I've created and tested the attached patch on my Drupal 6.10 site and am looking for volunteers to do additional patch testing.

The patch chages the aggregator.module as generally descibed in my posts above. It first checks to see if the server has the htmlspecialcharacters_decode() function available (PHP 5.1.0 and greater). If so it uses the function to clean up HTML in the article titles and saves the results to the database. If the function isn't there for some reason, it defaults to the prior method for saving article titles.

I'll give it a few weeks for testing, then if I don't hear anything I'll assume it's good to go and try to get it into the next patch release.

Thanks,

Jason

P.S. Thanks to domesticat for turning me on to WinMerge!

AttachmentSizeStatusTest resultOperations
aggregator_final.patch1.05 KBIdleFailed: Failed to apply patch.View details | Re-test

#45

Jason,

Thanks much for the help, but after further testing & searching I'm pretty sure my problem is distinct from this thread.

A drupal thread (Aggregator Module Shows HTML Gibberish) which described my problem quite closely (dropping of key symbols like "&") is at this link: http://drupal.org/node/346990

It seems there is a known bug in one of the php libraries, libxml2 2.7.1,which strips out key html symbols (like <, >, & etc.). The bug is described here: http://bugs.php.net/bug.php?id=45996

My host's php installation has libxml2 2.7.2, which (if I'm reading the libxml release notes right) doesn't have the fix in it. But the latest release, libxml2 2.7.3, should. So only a few systems will be affected by this particular bug, and it will go away as the php libraries get updated.

In mean time, guess I'll use some work-around.

Thanks again .

#46

Status:needs work» reviewed & tested by the community

I fear there may not be sufficient traffic on this thread to get additional patch testing done. After waiting a few weeks for testers to evaluate the final version of the patch, and not getting any responses, I've gone ahead and updated the status to "reviewed & tested by the community".

While nobody seems to have tested the final version, it was tested and improved by a number of users since it was first created in May of 2006 by Steve Dondley. Since those earlier iterations led us to this final version of the patch, I think it has been sufficiently tested to be committed to the next version of Drupal.

I apologize in advance for not being able to get other developers to provide a supporting opinion on the patch. However, I'm confident this patch will work just fine, particularly since the final patch checks to make sure the htmlspecialchars_decode() function is available on the server before running it.

Cheers,

Jason

AttachmentSizeStatusTest resultOperations
aggregator_final.patch1.05 KBIdleFailed: Failed to apply patch.View details | Re-test

#47

Version:6.10» 6.x-dev

After re-reading the Drupal PATCH instructions, it appears that I should be upgrading this bug to Version "6.x-dev", so that's what I just did :-)

#48

I've been having problems with my aggregator in that my feeds are displaying HTML code. I've got Drupal 6.10 installed. This patch sounds like exactly what I need. I clicked on the attachment, but it appears to display some text in my browser. How do I download and install this patch?

#49

Version:6.x-dev» 7.x-dev
Status:reviewed & tested by the community» needs work

The patch seems to be reversed - htmlspeciachars_decode() is removed, not added by #46.

Additionally the current development version of Drupal is 7.x rather than 6.x - where possible we fix issues in 7 first, then backport. http://api.drupal.org/api/function/aggregator_parse_feed hasn't changed that much (although it's in a different file now), so it'd be roughly the same change to make, although Drupal 7 requires PHP 5.2 or greater, so no need for the function_exists() there. Only did a very cursory review of the patch, but if you could re-roll against 7 that'd help to get some more reviewers.

#50

Status:needs work» needs review

This might work.

AttachmentSizeStatusTest resultOperations
61456_html_decode_titles.patch1.04 KBIdleFailed: Failed to apply patch.View details | Re-test

#51

Status:needs review» needs work

The patch looks good, and this is a great idea, but we definitely need a test case for it.

#52

Status:needs work» needs review

Thanks for re-rolling this Alex_b, sounds like we're ready to test & patch

#53

Status:needs review» needs work

The last submitted patch failed testing.

#54

Assigned to:jdefay» Anonymous
Status:needs work» needs review

Re-rolled for current D7 HEAD.

#55

Re-rolled for current D7 HEAD.

AttachmentSizeStatusTest resultOperations
61456_html_decode_titles_2.patch1021 bytesIdlePASSED: [[SimpleTest]]: [MySQL] 20,138 pass(es).View details | Re-test

#56

Status:needs review» needs work

The last submitted patch failed testing.

#57

Status:needs work» needs review

Testbot says that the Error handlers test failed. It passes on my machine.
Patch still applies to HEAD.
Re-setting to Needs Review for another shot.

#58

Re-test of 61456_html_decode_titles_2.patch from comment #55 was requested by Arancaytar.

#59

#55: 61456_html_decode_titles_2.patch queued for re-testing.

#60

subscribing

#61

I have my aggregator feed titles ended with &#160;.
Subscribing. Thanks

#62

Version:7.x-dev» 8.x-dev

#63

So after 5 years, we should just declare that we don't care and close this.

Moving to D8 is pointless.

#64

LOL, I don't think this one is really that bad, and it's finally in 'needs review'. (I was trying to update the versions on the more promising issues).

#65

Version:8.x-dev» 7.0
Status:needs review» reviewed & tested by the community

Well, the patch is green, and should be RTBC, so let's give it a try for 7.1.0.

#66

Status:reviewed & tested by the community» active

I started writing a test for this for verification/assurance. The included patch adds this test but does not include the patch in #55, (nor any other proposed changes in this issue,) in order to allow testing with-and-without such changes.

The test in its current form adds a feed with an item whose title includes a quote (") and an ampersand (&).

However... these look to be displaying properly -- using SimpleTest and adding the feed manually -- without patching.
Am I testing this properly?
Is anyone actually seeing this error in D7?

I had some issues with testing the apostrophe ('). I see the correct output (manually and in SimpleTest's verbose results) when I use it in the test feed item's title, but assertText() doesn't seem to acknowledge it.

Changing this issue back from RTBC seems appropriate.

#67

sorry, here's the test...

AttachmentSizeStatusTest resultOperations
61456_html_decode_titles_test.diff2.19 KBIdlePASSED: [[SimpleTest]]: [MySQL] 31,484 pass(es).View details | Re-test

#68

@jeffschuler

Now if you can just combine the patch and the test into one, set the issue to Needs Review and let TestBot try it out.

#69

@agentrickard:

Sorry if I wasn't clear: the test passes without applying the patch.

We should demonstrate that something is broken, [change the test to do so,] before fixing it.

#70

So you are suggesting that this actually got fixed somewhere else?

#71

Status:active» needs review

Marking 'needs review' to test the test patch.

#72

Either it's been fixed elsewhere or I'm not testing it properly.

I just tried the same test feed (generated in #67) in Drupal 6 and the entities displayed properly. So, perhaps I'm not testing the correct conditions?

The test uses a manually-generated RSS 0.91 feed (copied from the existing RSS091 test file) whose only item's title includes & and " entities.

I've added items to this feed using the title phrase "It&39;s an Honor" (from comment #1,) as well as a few other examples folks cited as problems in this issue, and they all displayed properly in D6 and D7.

Maybe someone experiencing this issue could point us to a feed they're having a problem with?

#73

The test passes and accounts for the original condition, so I suppose this was fixed elsewhere and we can close this.

If it recurs, we can re-open?

#74

Status:needs review» closed (fixed)

Sounds reasonable to me.

#75

Version:7.0» 7.7
Status:closed (fixed)» active

Patch in #55 worked for me, mostly. This suggests to me that the result of #66 is a faulty test condition, not that the problem magically fixed itself. Testing against an external site known to cause the problem may be a better metric than creating a local feed to test against. (Of course, maybe something else has changed between 7.0 and 7.7)

I said the patch "mostly" worked for me. &rsquo; still appears in my feed titles, but that's a htmlspecialchars_decode() issue and may be outside the scope of this issue.

#76

smscotten: can you point us to the feed you're having issues with?

#77

Sorry, that probably should have been one of the first things I put in the post.

Feed: http://www.politifact.com/feeds/statements/truth-o-meter/

And here on my site: http://splicer.com/aggregator/sources/19

Ugh. Looked at the source. Looks like they have double-converted their characters to character entities, resulting in &amp;quote; all through, as described in reply #1.

So the patch in #55 is a workaround, not a fix, because the behavior in the aggregator isn't wrong.

Um... right?

EDIT TO ADD: I sent mail to politifact.com and got a response back saying they were looking into it so it may not even show the issue much longer.

#78

Title:Aggregator titles display quotes and other characters with HTML entity equivalents badly» Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)
Version:7.7» 8.x-dev
Category:bug report» task
Status:active» needs work

I saw this issue recently also, but same thing as above; it turned out that the feed itself (on the source site) was incorrectly escaping its titles. So there's no bug here for Drupal to fix, at least not in D7/D8.

If an actual bug is still reproducible in Drupal 6 (i.e. with a feed that is formatted correctly on the source site but broken when imported to Drupal), someone could feel free to set this issue back to 6.x.

In the meantime, the tests in #67 look pretty solid so it's worth getting the patch rerolled and committed; tests are good regardless of whether the bug itself exists...

#79

Status:needs work» needs review

Cool. Rerolled #67 for 8.x.

AttachmentSizeStatusTest resultOperations
61456-79-core-aggregator-title_entities_test.patch2.1 KBIdlePASSED: [[SimpleTest]]: [MySQL] 34,383 pass(es).View details | Re-test

#80

Status:needs review» reviewed & tested by the community

Patch looks good to me. The tests pass, and they're completely consistent with the way the other tests in the same area of the code are written.

I did a quick reroll to remove the "No newline at end of file" issue from the newly-added file. This should be good to go.

AttachmentSizeStatusTest resultOperations
61456-80-core-aggregator-title_entities_test.patch2.08 KBIdlePASSED: [[SimpleTest]]: [MySQL] 34,124 pass(es).View details | Re-test

#81

Ah, but someone is going to come point out that Drupal documentation standards now require the PHPDoc to say "Tests a feed.." rather than "Test a feed.." :)

Fixing that in the attached.

AttachmentSizeStatusTest resultOperations
61456-80-core-aggregator-title_entities_test.patch2.08 KBIdlePASSED: [[SimpleTest]]: [MySQL] 34,120 pass(es).View details | Re-test

#82

Haha, you heard me coming! But also....

+++ b/core/modules/aggregator/aggregator.testundefined
@@ -914,4 +918,15 @@ class FeedParserTestCase extends AggregatorTestCase {
+  /**
+  * Tests a feed that uses HTML entities in item titles.
+  */

Indentation is goofed on the second and third lines. Needs one more space before the asterisks.

#83

Fixing that. No commit credit please. :P

AttachmentSizeStatusTest resultOperations
61456-83.patch2.08 KBIdlePASSED: [[SimpleTest]]: [MySQL] 34,133 pass(es).View details | Re-test

#84

Oh, and yeah.

#85

Version:8.x-dev» 7.x-dev

Committed the tests to 8.x. Moving to 7.x.

#86

Title:Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)» [ROLLBACK] Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)
Version:7.x-dev» 8.x-dev
Priority:normal» major

This appears to have broken tests. Failure on "Quote&quot; Amp&amp;" found http://qa.drupal.org/8.x-status

#87

Title:[ROLLBACK] Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)» Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)

--- /dev/null
+++ b/core/modules/aggregator/tests/aggregator_test_title_entities.xml

this file was not commited

#88

Title:Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)» [BROKEN HEAD] Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)
Priority:major» critical

#89

+++ b/core/modules/aggregator/aggregator.test
@@ -914,4 +918,15 @@ class FeedParserTestCase extends AggregatorTestCase {
+    $this->assertText("Quote&quot; Amp&amp;");

Why are you testing the page output that has been converted to plain-text?

When testing HTML escaping/sanitization/entities, use assertRaw() to check the actual, unprocessed content in the page output.

assertText() runs the entire page output through filter_xss(), which performs a plethora of HTML escaping, entity conversions, and validations.

#90

Status:reviewed & tested by the community» needs review

Patch suggested by #87 and #89.

#91

Patch for real this time.

AttachmentSizeStatusTest resultOperations
core-aggregator-title_entities-test-61456-91.patch2.06 KBTest request sentNoneView details

#92

Category:task» bug report
Status:needs review» reviewed & tested by the community

Note that since HEAD is broken, this is a critical bug.

#93

Category:bug report» task
Status:reviewed & tested by the community» needs review

+1 to commit asap

# drush test-run FeedParserTestCase
Feed parser functionality 62 passes, 0 fails, 0 exceptions, and 15 debug messages                              [ok]
No leftover tables to remove.                                                                                  [status]
No temporary directories to remove.                                                                            [status]
Removed 1 test result.                                                                                         [status]

#94

Status:needs review» reviewed & tested by the community

#95

Category:task» bug report

#96

Sorry about that. I just committed the patch in #91. I think that should fix HEAD. Let's see ...

#97

Priority:critical» normal
Status:reviewed & tested by the community» fixed

#98

Title:[BROKEN HEAD] Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)» Aggregator titles display quotes and other characters with HTML entity equivalents badly (write tests)
Version:8.x-dev» 7.x-dev
Category:bug report» task
Status:fixed» patch (to be ported)

Looks good - seems like we can still use a patch for D7.

Regarding assertText() vs assertRaw()... either should be fine here since it's just a text string. Leaving it as assertText() actually has the nice property that it verifies that filter_xss() doesn't do anything funny to &quot; or ;&amp (which it shouldn't), but I suppose that's an extremely backwards way to test that :) Either one is OK.

#99

Status:patch (to be ported)» needs review

D7 backport. I *think* I got it right.

AttachmentSizeStatusTest resultOperations
core_aggregator_title_entities_test-61456-d7-99.patch2.08 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch core_aggregator_title_entities_test-61456-d7-99.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.View details | Re-test

#100

Status:needs review» needs work

The last submitted patch, core_aggregator_title_entities_test-61456-d7-99.patch, failed testing.

#101

Status:needs work» needs review

Trying again.

AttachmentSizeStatusTest resultOperations
core_aggregator_title_entities_test-61456-d7-101.patch2.05 KBIdlePASSED: [[SimpleTest]]: [MySQL] 37,023 pass(es).View details | Re-test

#102

Should probably use assertRaw() rather than assertText() to be consistent with what went into Drupal 8.

#103

Good call.

AttachmentSizeStatusTest resultOperations
core_aggregator_title_entities_test-61456-d7-103.patch2.04 KBIdlePASSED: [[SimpleTest]]: [MySQL] 37,012 pass(es).View details | Re-test
nobody click here