The regexes that match the <script>, <object> and <embed> tags are case sensitive. This makes it trivial to bypass the filter and insert arbitrary JavaScript simply by using <SCRIPT> instead of <script>!
The fix is simple:
change line 228 to:
$input = preg_replace_callback('@<(embed|object|script)([^>]*)/?>?@si', 'embedfilter_process', $input, 5);
change line 231 to:
$input = preg_replace_callback('@<(embed|object|script)([^>]*)>(.*?)@si', 'embedfilter_process', $input, 5);
(Note: the only change in both lines is adding the 'i' flag to the regex, which makes the match case insensitive.)
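To illustrate what the 'i' flag changes, here is a minimal sketch. The patterns are simplified versions of the ones above, and the $payload string is a made-up example rather than anything taken from the module.

```php
<?php
// Hypothetical attacker input that uses an upper-case tag name.
$payload = '<SCRIPT>alert(1)</SCRIPT>';

// Simplified patterns: identical except for the 'i' modifier.
$case_sensitive   = '@<(embed|object|script)([^>]*)>@s';
$case_insensitive = '@<(embed|object|script)([^>]*)>@si';

// preg_match() returns 1 if the pattern matched and 0 if it did not.
$missed = preg_match($case_sensitive, $payload);
$caught = preg_match($case_insensitive, $payload, $m);

// $m[1] holds the tag name exactly as it was written in the input.
var_dump($missed, $caught, $m[1]);
```

Only the case-insensitive pattern finds the tag, so only that variant would ever hand it to the filter callback.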
The XSS cheat sheet at http://ha.ckers.org/xss.html is a fantastic resource for familiarizing yourself with current XSS attack techniques. Embed_filter deliberately lets the object, embed, and script tags through, so it needs to fully sanitize the code associated with them; otherwise it opens holes.
Comments
Comment #1
ragaskar commented: It appears that I am having an issue other than the one I described above (as embed_filter DOES string-to-lower *after* it pulls the lines, supposedly). I need to do more research, clearly.
Comment #2
ragaskar commented: Never mind: this is still an issue. The tags will never reach the str-to-lower code if we do not search case-insensitively.
I.e., embed_filter sees 'SCRIPT', does not match it, and so never passes it through the filtering process.
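A rough sketch of that behaviour follows. The closure here is a hypothetical stand-in for embedfilter_process, not the module's actual callback; it only lower-cases the tag name and counts how often it is called.

```php
<?php
// Hypothetical stand-in for embedfilter_process(): lower-cases the tag
// name and records each invocation.
$calls = 0;
$process = function (array $match) use (&$calls) {
    $calls++;
    return '<' . strtolower($match[1]) . $match[2] . '>';
};

// Made-up input with an upper-case tag name.
$input = '<SCRIPT src="x.js">';

// Without 'i' the pattern never matches, so the callback is never invoked
// and the input comes back unchanged.
$out_sensitive = preg_replace_callback('@<(embed|object|script)([^>]*)>@s', $process, $input);
$calls_sensitive = $calls;

// With 'i' the callback runs and gets the chance to normalise the tag.
$out_insensitive = preg_replace_callback('@<(embed|object|script)([^>]*)>@si', $process, $input);
$calls_insensitive = $calls - $calls_sensitive;
```

Any lower-casing done inside the callback can only help once the pattern itself has matched the tag.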
Comment #3
ragaskar commented: After making these changes and some of the others I recommended, embed_filter passed all of the XSS cheat sheet attacks I tried. That was about 70% of them; I didn't bother with some of the URL obfuscation stuff and some other tests that were outliers or not really related to embed_filter's functionality.
Good enough for now. Thanks for this script.
Comment #4
steven jones commented: Patched