
Regex for internal file broken

Project: Link
Version: 7.x-1.x-dev
Component: Code
Category: bug report
Priority: normal
Assigned: Unassigned
Status: needs review

Issue Summary

The regex for the internal file test is broken: it does not allow directory separators or spaces in the filename. The only regex that does allow directory separators is the one for internal menu paths, but that one does not allow "." characters. Hence a path such as "sites/default/files/my great presentation.pdf" fails every URL validation test even though the file exists.

I also discovered that the internal menu path regex doesn't allow spaces, which are legitimate (both the URL aliases module and the core menu router allow them), so I added that character there as well.
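To illustrate the failure, here is a minimal Python sketch. The two patterns are hypothetical stand-ins written for this example and are not the Link module's actual regexes; they only show how a character class without "/" and space rejects the path above, while a relaxed class accepts it.

```python
import re

# Hypothetical patterns for illustration only; NOT the Link module's real regexes.
# Strict: letters, digits, "_", "." and "-" only (no "/" and no space).
STRICT_FILE = re.compile(r"[a-z0-9_.-]+", re.IGNORECASE)
# Relaxed: the same class plus the directory separator "/" and the space.
RELAXED_FILE = re.compile(r"[a-z0-9_./ -]+", re.IGNORECASE)

path = "sites/default/files/my great presentation.pdf"

# fullmatch() requires the whole string to fit the character class.
strict_ok = STRICT_FILE.fullmatch(path) is not None
relaxed_ok = RELAXED_FILE.fullmatch(path) is not None

print(strict_ok, relaxed_ok)
```

The strict pattern fails on the first "/" it meets, so the path never validates no matter whether the file exists.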

Patch below.

Comments

#1

Status: active » needs review

Here's the small patch to those regexes...

Attachment: link-internal_path_regex-1599314-01.patch (766 bytes)

#2

Updated patch with open/close parentheses added to the regex string for files.

Attachment: link-internal_path_regex-1599314-02.patch (770 bytes)

#3

Status: needs review » fixed

Well, I thought %20 was the thing to do, but testing showed I was wrong.

So - merged, and will be live in 7.x-1.1.

#4

Yeah, if the file is saved with a space in its name, the %20 won't match it. We specifically ran into this with a data migration, but if I'm not mistaken, file uploads will also preserve the space.
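A rough Python sketch of why the encoded form misses the file on disk. It is not Drupal code; the temporary directory and the filename are just assumptions made for the example.

```python
import os
import tempfile
from urllib.parse import quote, unquote

with tempfile.TemporaryDirectory() as tmp:
    # The file is stored with literal spaces, as after a migration or upload.
    stored = os.path.join(tmp, "my great presentation.pdf")
    open(stored, "w").close()

    # Percent-encoding turns each space into "%20".
    encoded_name = quote("my great presentation.pdf")

    # Looking up the encoded name on disk fails; the literal name succeeds.
    encoded_exists = os.path.exists(os.path.join(tmp, encoded_name))
    literal_exists = os.path.exists(stored)

    # Decoding the URL form first recovers the real filename.
    decoded_exists = os.path.exists(os.path.join(tmp, unquote(encoded_name)))

print(encoded_name, encoded_exists, literal_exists, decoded_exists)
```

So the existence check has to run against the literal (decoded) path, and the encoding belongs only in the generated URL.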

Thanks for merging in so quickly!

#5

Status: fixed » needs review

OK, guess who's back with yet another regex change... I did some research after discovering filenames with & and * in them, and found that Un*x filenames can contain any character except NUL (\0) and the directory separator (/). Windows and OS X have their own restrictions, but files created on those operating systems are limited to a subset of what Un*x can handle.

In that case, we should probably accept any filename that is possible on a Un*x system and also check that the file exists. The attached patch does just that, and accepts a file with the following internal path (which I was able to create from the command line on a Linux system):

sites/default/files/test1234567890-=+_)(*&^%$#@!`~[]\|}{;'":,.?><

... and the Link module was able to generate a valid URL to download the file. Sorry for the follow-on, but I was not aware of just how much you could get away with in a Un*x filename!
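The idea of "any path without NUL, plus an existence check" can be sketched in Python as follows. This is not the attached patch: `is_valid_internal_file` and its `root` parameter are hypothetical names made up for the example, the sample filename assumes a Un*x-like filesystem (Windows would reject the `*`), and the sketch does not guard against ".." segments.

```python
import os
import tempfile


def is_valid_internal_file(path: str, root: str) -> bool:
    """Hypothetical helper: accept any non-empty path without NUL that exists under root."""
    # NUL is the one character a Un*x path can never contain.
    if not path or "\0" in path:
        return False
    # Everything else is left to the filesystem: the file either exists or it doesn't.
    return os.path.isfile(os.path.join(root, path))


with tempfile.TemporaryDirectory() as root:
    name = "report & notes*.pdf"  # assumes a Un*x-like filesystem
    open(os.path.join(root, name), "w").close()

    found = is_valid_internal_file(name, root)
    missing = is_valid_internal_file("no such file.pdf", root)
    has_nul = is_valid_internal_file("bad\0name.pdf", root)

print(found, missing, has_nul)
```

The point of this approach is that the filesystem, not a character whitelist, decides what counts as a valid filename.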

Attachment: link-internal_path_regex-1599314-05.patch (1 KB)