drupal_html_to_text() can only run once

salvis - January 20, 2008 - 02:40
Project:Drupal
Version:6.x-dev
Component:other
Category:bug report
Priority:critical
Assigned:Gábor Hojtsy
Status:closed
Description

The helper function _drupal_html_to_mail_urls() accumulates the URLs that drupal_html_to_text() encounters, so that it can return a list of them to put at the bottom of the email. If you want to run drupal_html_to_text() more than once (for example the Subscriptions module sending out a bunch of emails in one cron run), then the list of URLs just keeps accumulating.

drupal_html_to_text() needs to reset _drupal_html_to_mail_urls() at the end, and the first attached patch does that. It does not change the behavior of either function. IMO this needs to make it into D6, because drupal_html_to_text() is broken.

If the same URL occurs more than once in a text, then the list has duplicates. It would be nice, if these could be omitted. The second patch does that in addition to adding the reset mechanism and would be nice to have.

#1

salvis - January 20, 2008 - 02:42

Hmm, lost my attachments...

AttachmentSizeStatusTest resultOperations
drupal_html_to_text.more_than_once.patch891 bytesIgnoredNoneNone
drupal_html_to_text.more_than_once.remove_duplicates.patch1.11 KBIgnoredNoneNone

#2

chx - January 23, 2008 - 15:49
Priority:normal» critical

If D6 can't send more than HTML mail per page that's imo a critical problem

#3

Gábor Hojtsy - January 24, 2008 - 10:56
Status:needs review» needs work

Why not add a $reset parameter? It would be much cleaner, and compared to that it is a minor API modification to solve a critical omission. (I would not expand logic to collect the fold the same URL, so let's keep focusing on adding a clean reset option).

In fact, it looks like _drupal_html_to_mail_urls() needs a $reset parameter and it needs to be reset before the call to preg_replace_callback($pattern, '_drupal_html_to_mail_urls', $string);

#4

Arancaytar - January 24, 2008 - 14:26

Quick note: The diff "-p" option (as far as I know) adds the relevant function name to each chunk. That makes it a lot easier to see what function is being changed and what their parameters are; I can only guess that the first one is html_to_text and the second one is the urls function.

#5

Gábor Hojtsy - January 25, 2008 - 11:04
Status:needs work» needs review

Here is the approach I suggested, please review/test.

AttachmentSizeStatusTest resultOperations
reset_mail_urls.patch1.83 KBIgnoredNoneNone

#6

Gábor Hojtsy - January 25, 2008 - 16:40

A simple script to test before and after the patch:

<?php
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$text = '<a href="http://drupal.org">Drupal</a> test <a href="http://example.com">Example</a>';
echo
drupal_html_to_text($text);
echo
drupal_html_to_text($text);
?>

#7

webchick - January 25, 2008 - 16:58
Status:needs review» reviewed & tested by the community

Ok, confirmed the problem without the patch, applied the patch and did a few variation of the script. Things worked. :) RTBC.

#8

Gábor Hojtsy - January 25, 2008 - 17:04
Assigned to:salvis» Gábor Hojtsy
Status:reviewed & tested by the community» fixed

Thanks, committed.

#9

salvis - January 26, 2008 - 17:03

Thanks. I had a (pre-announced) power outage here that kept me from pursuing this in a timely manner.

#10

Anonymous (not verified) - February 9, 2008 - 17:11
Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.