Posted by rmjiv on October 6, 2009 at 6:49pm
| Project: | Mime Mail |
| Version: | 7.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed (fixed) |
Issue Summary
The mimemail_extract_files function regex $pattern matches more items than it should. In our case, it matched the "src" suffix of a query parameter on a link.
Example: <a href="http://www.example.com?extsrc=foo"> was transformed into <a href="http://www.example.com?extsrc=Array">
because the "src=[\'"]?" clause of the regex matched.
Comments
#1
#319229: src="Array" if path to image is broken appears to be a duplicate of this issue, but it contains more followup, so I am closing this one and referring to followup there.
#2
I just noticed that this issue was marked as a duplicate of #319229. Unfortunately, having read that thread, I don't think that it is. That thread is dealing with a problem related to private download method for files. This issue is problem where the regex match is overly aggressive and matches the characters "src" in places that it shouldn't.
I'm relatively terrible at regular expressions, but it may be as easy as changing line 87 from
$pattern = '/(<link[^>]+href=[\'"]?|<object[^>]+codebase=[\'"]?|@import |src=[\'"]?)([^\'>"]+)([\'"]?)/emis';to$pattern = '/(<link[^>]+href=[\'"]?|<object[^>]+codebase=[\'"]?|@import | src=[\'"]?)([^\'>"]+)([\'"]?)/emis';The goal being to match "src" only if it is preceded with a space.
But I'm not sure that is even 100% right. Really what should happen is that
tag.
srcis only matched as an attribute of an#3
This code has changed in the latest release, alpha2 for a security advisory. Please review your change against this code. You should update your site to use a derivative of alpha2 as soon as possible.
#4
Thanks for the heads up on the security advisory. But that only removes the
/eswitch from the pattern which doesn't affect this either way. The pattern is still wrong in alpha 2.#5
I can confirm this issue. I am attaching a patch against current HEAD with the fix in #2.
#6
The previous patch is not good enough, it misses the case in which there is a line break or tab before the attribute. I'm attaching a new one which checks src attributes with any (\s) preceding white-space character.
#7
Committed to HEAD.
#8
Automatically closed -- issue fixed for 2 weeks with no activity.