| Project: | Google Highly Open Participation Contest (GHOP) |
| Component: | Task idea |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
Issue Summary
Check whether input goes unsanitised in modules.
Check whether unsanitised input is passed on to core functions in the expectation that those functions sanitise it.
Report the weakest points in core for input sanitising.
Report the most common errors in modules.
Judging from how modules are written, where do people expect input to be sanitised?
1) ASAP
2) in module somewhere
3) delegate to core
Do you think responsibility for sanitising input is clear?
If not, when is it less clear? When passing arrays/objects/values to core, passing SQL, or XSS/content?
Which core functions do people rely on and pass unsanitised input to?
Any suggestions for improving the db API to improve security and make it easier to rely on?
Are there common techniques used by modules to sanitise input (cast, regexp...)?
Could they be pushed down to core functions? Could they be turned into a security API?
E.g. sanitising the values going into an SQL IN clause generally requires building two arrays to pass to db_query, or casting in place.
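To make the IN clause case concrete, here is a minimal sketch of the "two arrays" technique. It is written in Python with sqlite3 purely for illustration and is not Drupal's db_query; the `node` table and the helper names `build_in_clause` and `select_titles` are hypothetical.

```python
import sqlite3


def build_in_clause(values):
    """Return (placeholders, args) for an SQL IN clause.

    Every value is cast to int, so non-numeric input raises ValueError
    before it can get anywhere near the query string.
    """
    args = [int(v) for v in values]
    if not args:
        raise ValueError("IN clause needs at least one value")
    placeholders = ", ".join("?" for _ in args)
    return placeholders, args


def select_titles(conn, nids):
    """Fetch titles for the given node ids (hypothetical `node` table)."""
    placeholders, args = build_in_clause(nids)
    # Only the "?" markers are interpolated into the SQL text;
    # the values themselves are passed separately to the driver.
    sql = "SELECT title FROM node WHERE nid IN (%s) ORDER BY nid" % placeholders
    return [row[0] for row in conn.execute(sql, args)]
```

The cast and the placeholders are two separate layers: either one alone would stop the classic injection here, but a shared helper of this kind means module authors do not have to decide which one they need.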
Take a look at how the last security issue in taxonomy_select_nodes was solved, and at:
http://heine.familiedeelstra.com/a-security-vulnerability-waiting-to-happen
and
http://heine.familiedeelstra.com/drupal-confirmation-forms-csrf
Highlight some good and bad examples so we can add them to the Writing secure code section of the support page.
thx
Comments
#1
There are some good ideas in here, but we need to limit the scope and make this into an actual task. Please take a look at the How to write a task page at http://code.google.com/p/google-highly-open-participation-drupal/wiki/Ho...
Also, just FYI, we already have one official security related task (see http://code.google.com/p/google-highly-open-participation-drupal/issues/...)
so please try to make sure that your task doesn't overlap with that task.
Thanks
#2
Where is the list of "33 security issues for Drupal in 2007"?
I do think there is some overlap... but I'm more interested in metrics.
I recently did a very quick security audit of all Drupal modules. It was a few hours' audit and I didn't have time to really check my impressions, but what I got was:
1) there is no clear idea about who should take responsibility for sanitising input. That is somehow even written in the docs:
The Drupal philosophy - Escape or filter when appropriate http://drupal.org/node/101495
2) people don't know what core functions actually do, or whether they do any sanitising. Most people play it safe and sanitise, some don't.
3) people use several techniques to solve common problems; these could be put into an API, so people would be more aware that they have to sanitise input
4) several kinds of things are passed to core functions: SQL, content and "objects". People tend to play it safer when they are dealing directly with SQL. When they pass other stuff to core, the attention level gets lower.
Getting some metrics on the above points could help to prioritise efforts on docs/code.
a) If we get an idea of devs' expectations we could make that part stronger, or write documents that make them understand it is not safe to assume so.
b) we could make it more explicit, in the docs or inside the code, whether a function is "safe" or expects clean input
c) we could get an idea about offering a sanitising API
d) if devs consider writing clean SQL their responsibility, we should offer helpers to make them more aware that even objects and content have to be cleaned, or state more clearly that core functions expect clean input when dealing with objects/content.
The policy leaves two options: 1) oblige people to go all the way down the road to see if and how input is sanitised, or 2) play it safe and sanitise. People should get a bit of help with 1), or we lose part of the advantage of having an API (encapsulation and code reuse).
So to make a mockup of the things I'd like to know:
1)
* There are N1 circumstances in modules that sanitise input ASAP
* There are N2 circumstances in modules that sanitise input down the road
* There are N3 circumstances in modules that rely on core to sanitise input
2)
In places where there was a risk of passing dangerous stuff down the road (no sanitising took place but somehow there was no actual security problem) or where there was a security problem, were people passing:
* content (N of cases)
* objects/values (N of cases)
* SQL (N of cases)
3) In places where there was a risk of passing dangerous stuff down the road (no sanitising took place but somehow there was no actual security problem) or where there was a security problem, which core function received the input?
4) when people sanitise input in their modules, what are the most common techniques (cast, regexp, php filter functions...)?
5) Highlight some good and bad examples so we can add them to the Writing secure code section of the support page.
6) some free comment about the "feeling" they got
Does it look more defined, and narrow enough in scope, now?
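As an illustration of point 4) and of what a small sanitising API might look like, here is a rough sketch of the three techniques named (cast, regexp whitelist, escaping on output). It is Python rather than PHP, and the function names `clean_int`, `clean_machine_name` and `clean_text` are hypothetical, not existing core functions.

```python
import html
import re

# Whitelist for identifiers: lower-case letters, digits and underscores.
_MACHINE_NAME = re.compile(r"[a-z0-9_]+")


def clean_int(value):
    """Cast: accept only something that parses as an integer."""
    return int(value)


def clean_machine_name(value):
    """Regexp whitelist: reject anything outside the allowed characters."""
    if not isinstance(value, str) or not _MACHINE_NAME.fullmatch(value):
        raise ValueError("invalid machine name: %r" % (value,))
    return value


def clean_text(value):
    """Output escaping: make user text safe to embed in HTML."""
    return html.escape(str(value), quote=True)
```

Giving each technique one named entry point would also make the metrics above easier to collect, since a call to one of these helpers is simpler to spot in module code than an ad hoc cast or regexp.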
#3
Security announcements are here: http://drupal.org/security
As I understand it this task seems very large. You want a student to go through every module and look for security problems or places where input is sanitized to prevent security problems? That's way too much for one task, unless you think there is a good way to automate this.
I'd still like to see this in a task form. At this point I think it is way too broad still, but it may be that I am not understanding the task very well. Please take a look at the links I provided earlier in this thread and try to hash this out into the various sections. Once you do that, it will be easier to see whether the task is something appropriate for this contest or whether it needs to be split into multiple tasks, for example.
BTW, I do think this is a good idea. I've wondered myself many times about when exactly I needed to sanitize input and when it will be taken care of for me.
#4
Ok, my feedback:
1) Sounds fairly interesting, but way too vague to seem approachable: perhaps more definite guidelines?
2) What about the coder module? Doesn't that try to scan files for security vulnerabilities?
#5
#6
OK... this is very large because when I had to do it, it was limited to some tens of modules, while there are 500+ modules in contrib.
Otherwise the questions are reasonably precise.
Any idea about how to choose a limited number of modules that could give us clues about the behaviour of the "average" dev and their expectations?
Are there any statistics about the most downloaded modules? Modules with the most bugs? Modules with the longest history?
If we just analyse the ones in the spotlight, this won't give us a picture of the average dev's expectations, but looking at what kind of stats we have for modules may help to narrow the audit down to some tens of modules.
If we could find the modules of "casual" contributors who didn't have enough time to dig into Drupal core, maybe we could get a clue about their pristine expectations.
Is there a way to get the list of modules developed by the newest members? This could be a good criterion for choosing the modules to analyse.
BTW since I did it on a limited basis I'm ready to offer help to anyone who takes the task.
#7
Which part do you think is "too broad"?
If it is the number of modules, I proposed a way to limit the analysis to the ones that will give us a more interesting picture.
Otherwise, I built a mockup of what I'd like to see, and the questions look reasonably precise to me.
As for the "various sections", do you mean just splitting into: title, description, resources, primary contact, drupal issue(?)
#8
As for picking modules, I don't really know if there's a good solution to your question. Obviously you could go with some of the big modules (Views, CCK, OG) which are also frequently used. As for getting modules that should be pretty secure, maybe you pick modules maintained by people on the d.o security team (eg. dww, Heine, etc.). I'm not sure how you would find the newest modules or ones by less experienced developers. I don't think we have a recent list of most popular modules, but during the summer I seem to recall that Views and Google Analytics were both in probably the top 5 most used modules. I don't remember the others.
Well, going through all of contrib is clearly too broad, so you've got to limit it somehow. I think your questions themselves are pretty precise, but I'm not sure that it's obvious how to actually gather the data to answer your questions. I have the feeling that getting accurate information of the form you want will be difficult, because I don't see how you can clearly measure things like the risk of passing unsanitized information along, etc. Basically, this seems like it would require a line by line review of the code, and really understanding how the code works. Doing this for more than 1 or 2 modules that are reasonably complex seems beyond the capabilities of a student in a week, but just doing 1 or 2 modules won't give us much useful information.
Yes, that's what I mean. Once you split up a task into these sections, it's easier to see what the problems with it are (IMHO).
#9
As for the time required, I audited around 40 modules in a few hours, but I was looking for a specific vulnerability and for some modules I didn't follow the path down to core.
Some modules may have several inputs from the outside, but in the ones I had to review the sources of input were limited.
I understand that without a more formal definition of risk you can't get the number.
The task could be to report any unfiltered input that goes to the DB, is stored as content directly in the module, or is passed to core functions.
Since they should report to which core function it is passed... it should still be valuable data.
mockup:
If this seems a reasonable proposal, where can I get stats about the newest contrib people?
Now it looks like the only missing thing is a good list of modules before giving this GHOP task a formal shape.
I think most modules may require from 10 to 30 min each to be checked, plus time to write a report. And after the first 4, people should be able to say at a glance: this smells.
The requirements don't ask to fix bugs or to really check whether unfiltered input is actually dangerous.
Places to look at should be: forms, arg, $_GET. I could think of others, but I don't think there are many left.
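A first pass over those places could be partly automated. The following is only a sketch: it counts textual occurrences of a few input sources in a module's PHP source, so it shows where to start reading, not whether anything is actually unsafe. The pattern list (including $_POST and $_REQUEST, which I added) and the function name `count_input_sources` are assumptions, and form handling is not covered at all.

```python
import re
from collections import Counter

# Input sources to look for; this list is an assumption and not exhaustive.
INPUT_PATTERNS = {
    "$_GET": re.compile(r"\$_GET\b"),
    "$_POST": re.compile(r"\$_POST\b"),
    "$_REQUEST": re.compile(r"\$_REQUEST\b"),
    # Match calls to arg(), but not my_arg( or $obj->arg(
    "arg()": re.compile(r"(?<![\w>])arg\s*\("),
}


def count_input_sources(php_source):
    """Count occurrences of each input source in a PHP source string."""
    counts = Counter()
    for line in php_source.splitlines():
        for name, pattern in INPUT_PATTERNS.items():
            counts[name] += len(pattern.findall(line))
    # Report only the sources that actually appear.
    return {name: n for name, n in counts.items() if n}
```

Run over each module file, the per-module counts would give a rough ordering of which modules have the most external input to review by hand.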
#10
So GHOP has been over for nearly a year now. Is there any other way that we can mobilize the community to do a security audit? I haven't seen anything from Acquia about a security audit process for core and supported modules. However, it would be great to have some process whereby we can list modules that have gone through a defined audit process (with each release) and whereby we can have more confidence.
#11
coder module?