testcase html:
<a href="http://drupal.org/" title="the official website">Drupal</a> is an open source content management platform.
result:
Drupal
is an open source content management platform.
expected result:
Drupal [1] is an open source content management platform.
[1] http://drupal.org/
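problem (the pattern expects ">" right after the href value, so a link with further attributes such as title is never matched):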
$pattern = '@(<a[^>]+?href="([^"]*)">(.+?)</a>)@i';
fix:
$pattern = '@(<a[^>]+?href="([^"]*)"[^>]*?>(.+?)</a>)@i';
Patch attached.
If I read this right, drupal_html_to_text() also fails for UTF-8 input containing non-ASCII characters, but that's a separate issue.
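The two patterns can be compared outside Drupal with plain preg_match(). This is only a rough sketch: the variable names are made up for illustration, and it checks the regular expressions alone, not the footnote handling in drupal_html_to_text().

```php
<?php
// Test case from the report: the link carries a title attribute after href.
$html = '<a href="http://drupal.org/" title="the official website">Drupal</a> is an open source content management platform.';

// Pattern as quoted under "problem" and under "fix".
$old_pattern = '@(<a[^>]+?href="([^"]*)">(.+?)</a>)@i';
$new_pattern = '@(<a[^>]+?href="([^"]*)"[^>]*?>(.+?)</a>)@i';

// preg_match() returns 1 on a match and 0 when nothing matches.
$old_hit = preg_match($old_pattern, $html, $old_match);
$new_hit = preg_match($new_pattern, $html, $new_match);

var_dump($old_hit);          // int(0): the title attribute blocks the match
var_dump($new_hit);          // int(1)
print $new_match[2] . "\n";  // http://drupal.org/
print $new_match[3] . "\n";  // Drupal
```

With the old pattern the link is never captured, so no [1] footnote can be produced for it. The added [^>]*? lets any attributes between the href value and the closing ">" pass through.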
Comments
Comment #1
gábor hojtsy
That is pretty straightforward and trivial. Committed. Thanks.