Problem/Motivation
When the #limit_validation_errors
form property is used, values that had not yet been validated could still end up in a list of submitted form values. This could mean that unsafe values could end up being used during a form submission (a very possible issue if contrib modules use #limit_validation_errors
for buttons that trigger #submit
handlers that do potentially risky things).
Proposed resolution
Replace the partial functionality provided by _form_set_value()
and provide 3 new helper functions:
drupal_array_set_nested_value(array &$array, array $parents, $value, $force = FALSE)
: Set the value of an item in the array(traversed through viaparents
). Using force would create the path if it did not previously existdrupal_array_get_nested_value(array &$array, array $parents, &$key_exists = NULL)
: Get the value of an item in the array (traversed through viaparents
.key_exists
would inform if the path existsdrupal_array_nested_key_exists(array $array, array $parents)
which is a wrapper fordrupal_array_get_nested_value
Remaining tasks
A series of patches have been committed to both Drupal 7 and Drupal 8. [...]
API changes
From Drupal 6 -> Drupal 7. _form_set_value()
has been removed for the functions outlined in Proposed resolution.
Original report by fago
When #limit_validation_errors is used, still form values that doesn't throw validation errors are put into $form_state['values']. However this is the place for validated and checked values, thus modules might rely on that. Form values of "not validated" elements should be in $form_state['values'].
Comment | File | Size | Author |
---|---|---|---|
#137 | 763376-variable-d7.patch | 657 bytes | andypost |
#133 | drupal.array-get-value.133.patch | 9.21 KB | sun |
#131 | drupal.array-get-value.131.patch | 9.54 KB | sun |
#128 | drupal.array-get-value.128.patch | 9.54 KB | sun |
#127 | drupal.array-get-value.127.patch | 9.14 KB | sun |
Comments
Comment #1
fagoNote: Limited validation errors were introduced in #370537: Allow suppress of form errors (hack to make "more" buttons work properly)
Comment #2
fagoComment #3
effulgentsia CreditAttribution: effulgentsia commented#limit_validation_errors was a stopgap measure to allow things like "add more" buttons where more relaxed security is less of a problem, not to give people a false sense of security that the property is safe to use for submit handlers that really need validated values. The problem is that there's not necessarily a perfect correspondence between element #parents properties and validated $form_state['values']. There is when elements are well behaved, but our validation pipeline doesn't enforce good behavior. I'm concerned if people start using #limit_validation_errors for buttons that trigger #submit handlers that do potentially risky things. I'm hoping that #735800: node form triggers form level submit functions on button level submits, without validation fixes the problems of core doing this. Given that we can't control contrib, stripping $form_state['values'] to only the "sections" in #limit_validation_errors before calling the #submit handlers seems like an idea worth considering, but given the issues we're facing with entity editing forms, I wonder what unwanted side-effects that will create: especially in multi-step forms where people are persisting prior steps' validated values in $form_state['values'].
[EDIT: 'values' is within form_state_keys_no_cache(), so my concern about multistep forms might not be a problem]
Comment #4
fago>Given that we can't control contrib, stripping $form_state['values'] to only the "sections" in #limit_validation_errors before calling the #submit handlers seems like an idea worth considering...
I think that's necessary. The pure cause to use the form API 'values' instead of $_POST is improved security and validation. If it doesn't offer that, there is no cause to use it at all :P Everything which is in the form API values has to be "save" such that it passed validation.
Of course good usage of #limit_validation would involve not using insecure form API values. But core has proven that sometimes this isn't obvious (e.g. #builder_function) - so people might be even not aware they do.
@multistep-forms:
Forms using a custom multistep persistence like the node form now would loose the interim form values if form API persistence has been disabled. For sure that's better than have it silently working but being insecure as we have it in core now. If something breaks, people fix it.
Comment #5
Gábor HojtsyI agree putting non-validated values in $form_state['values'] can end up being quite confusing.
Comment #6
effulgentsia CreditAttribution: effulgentsia commentedGreat. Let's see what fails.
Comment #7
rfaysubscribing
Comment #9
sunWould it be possible to backup this value earlier in drupal_process_form(), so as to prevent validation handlers to break into a unbreakable system?
The rest could use a comment or two, but it's probably also good enough if it's just us who understand it. :P
Powered by Dreditor.
Comment #10
sunwow. You managed to trigger a fatal error, but only in tests that seem to be related to #limit_validation_errors. Nice work :)
Comment #11
fagohm, doing it at this step would mean the values are there during validation, but not during submit. That's imho confusing and bad DX.
So why not do it before doing validating?
In that case we should probably skip the element validation handler for not validated elements too.
What about elements falling out the tree? #759222: #tree does not default to TRUE, fails to meet developer expectations
Your code would miss those values. I think in general your approach is fine, but that makes the above issue critical now.
@code:
+ // Leverage the ability of the internal _form_set_value() function
+ // to set data deep into an array.
Perhaps it's time to finally introduce a generic array_set_value() function? _form_set_value() is private and it's not the first usage outside of form_set_value().
You assume the #parents are an intersection of the #array parents - unfortunately that's not true in general as someone can set different #parents manually. I think we need to get the element first and then respect its #parents. -> array_get_value()? ;)
Comment #12
MichaelCole CreditAttribution: MichaelCole commented#6: form-remove_unvalidated_values-763376-6.patch queued for re-testing.
Comment #13
MichaelCole CreditAttribution: MichaelCole commentedThis issue is about unvalidated data being in $form_state['values'], which is confusing.
This only applies to fields with #limit_validation_errors. Why can't this be solved with documentation on #limit_validation_errors?
Not sure why it was marked "needs review" in #9, when tests were failing in #6. Marking "needs work"...
In #11 Fago implies that the patch makes the API *more* confusing by having the unvalidated data in $form_state['values'] during the validation step, but not during the submit step.
This seems either not a critical bug, or a documentation fix to me...
Comment #14
fago>This only applies to fields with #limit_validation_errors. Why can't this be solved with documentation on #limit_validation_errors?
* Because it doesn't cope with $form_state['values'] being validated values. Why not document that exception?
Well it might led to insecure code being written, which drupal core itself has proven.
I've rolled a patch based on the suggestions I made in #11. Thus the values aren't available during validation too and I introduced the utility functions drupal_set_array_deep() and drupal_get_array_deep() I made use of.
As #735800: node form triggers form level submit functions on button level submits, without validation hasn't been fixed yet, I expect the patch do generate errors. Non-JS button upload-submits though worked for me.
unrelated:
Imho the special casing of #type == 'form' in form_builder() should be moved to a separate after build callback functions for the #type == form element. That would clean up form_builder(), but that's out of scope.
Comment #16
moshe weitzman CreditAttribution: moshe weitzman commentedAnyone available to push this along? Is it really critical?
Comment #17
catchI think this more or less depends on #735800: node form triggers form level submit functions on button level submits, without validation.
Comment #18
effulgentsia CreditAttribution: effulgentsia commentedCorrect. It will probably be relatively easy once the other issue lands, and that one is very close.
Comment #19
effulgentsia CreditAttribution: effulgentsia commented#735800: node form triggers form level submit functions on button level submits, without validation landed, so this issue can proceed again: yay!
BTW: From #14:
I agree that this is a desirable thing to do in a separate issue. See http://randyfay.com/node/66#comment-102.
Comment #20
fago#14: fapi_values.patch queued for re-testing.
Comment #22
fagook, I had a first look at this. The approach I took in #14 doesn't fly as still form level validation handlers are invoked, thus the usual form values need to be available there. But furthermore cleaning up the values after validation doesn't work either, as still modules rely on button values ('$form_state['values']['op']) or similar to exist, thus this would break also existing code. Thus using #limit_validation_errors as whitelist for form values causes headache.
Therefore the only way I see to proceed is removing any values, for which validation fails. Attached patch implements that. For that to work we need $form_state available in form_set_error(). To achieve that I invented drupal_get_processed_form() and drupal_set_processed_form(), which work with a drupal_static() that is set by drupal_process_form() accordingly. This results in the small API change that any brave module that builds a form during processing of another form needs to set the processed form accordingly and set it back to the original value afterwards.
Note that having the $form_state in $form is big step towards finally cleaning up error setting. I tried to so without changing the API by inventing $form_state['errors'], but failed as form_set_error() only receives the #parents, thus the location of the element in question in the form is unknown to us, but we would need it in order to fix rendering of the error. (As of the time of the form rendering the form is not processed any more, thus $form is unavailable to form_get_error(), therefor we would need to set $element['#error'] or something similar beforehand.).
Finally having the $form_state available in form_set_error() enables us to clean up the internal #limit_validation_error logic, which is currently rather confusing (it sets the static for each validated value!). I cleaned up so that form_set_error() just obeys #limit_validation_errors of the triggered element of the current form. Note that this can't lead to a security issue even in case someone mixes up with drupal_get_processed_form() due to nested form processing or similar, as if
* #limit_validation_error is set wrongly due to a wrong $form_state it might suppress validation errors, but then it also removes the form values.
* #limit_validation_error is wrongly not set due to a wrong $form_state, then an unwanted validated error is shown.
Therefore form_set_error() as changed in the patch is more secure, as any time a validation failure is suppressed the according form values are also removed.
However I still can think of a case for which unvalidated form values could survive. If the module in questions uses the API wrongly and issues form_set_error() with wrong #parents. Then the validation failure wouldn't be properly marked due to wrong API usage, but if this failures are suppressed then the according form values won't be deleted! However we could ensure this only, if we make form_set_error() aware of the right $element. We need to this anyway for d8 in order to be able to clean up error setting (see above), but doing it now would be a rather big API change as we won't be able to support form_set_error('parent][child', 'message'); with parent and child being #parents anymore, as we would have to convert to #array_parents or totally drop this syntax. Additionally form_get_errors() outside of the form processing scope won't work any more. I'd love to have this finally cleaned up though.
Note: You can see the history of this patch at http://github.com/fago/drupal/commits/fapi_values. At tag is the latest version of my close-to-being-ready attempt to cleanup form errors.
Patch attached. Please carefully review.
Comment #23
fagoComment #25
fagoComment #26
effulgentsia CreditAttribution: effulgentsia commentedI haven't reviewed the latest patch yet, but:
From #22:
I disagree with this. Submit handlers don't have to use $form_state['values']['op']. They can use $form_state['triggering_element']. And in many cases, they won't even need that, since #limit_validation_errors is only in-effect if the button also has a button-level #submit, though you do have some cases like file.module, where the "upload" and "remove" buttons use the same #submit, and therefore, that handler does refer to $form_state['triggering_element'] to know whether to process the "upload" or "remove" action. If the submit handler wants to switch on $form_state['values']['op'], then I think it makes sense to require that the #limit_validation_errors of that button include array('op') as one of its items.
From #11:
I don't think it's confusing, because that's how FAPI works. During normal processing when there is no #limit_validation_errors, during validation, $form_state['values'] may contain invalid data. It is only known to be valid during the submit handlers. So, #6 doesn't introduce anything new in that regard. But you may be right about bad DX: do you have a specific example that illustrates what makes it bad DX?
#25 seems way more complicated than #6. But if #6 is truly bad DX, then ok, let's refine #25.
Comment #27
fagoAgreed. Well to a validation handler the form values might be different than to a submit handler, what might confuse some users. However as noted above, as we have to execute form level execution handler we cannot remove form values before the validation phase anyway.
>Submit handlers don't have to use $form_state['values']['op']. They can use $form_state['triggering_element'].
>If the submit handler wants to switch on $form_state['values']['op'], then I think it makes sense to require that the #limit_validation_errors of that button include array('op') as one of its items.
Agreed, but I got quite some errors due to modules not doing so and not stating the button values in #limit_validation_errors. I guess this would just change the way #limit_validation_errors works too late in the cycle.
I do think #25 is good way to proceed, it's secure, helps cleaning up things and it doesn't produce unnecessary side-effects due to missing button values, token values or similar.
Comment #28
effulgentsia CreditAttribution: effulgentsia commentedHm. I think we need some more FAPI maintainer feedback here. I'd really like to know what sun and chx think.
I think we have two approaches we can take:
1) fago's recommendation (#25), which has, IMO, these two potential down-sides:
a) It requires the drupal_set_processed_form() function in order to have a statically cached $form_state available within form_set_error(). I actually like the idea of a statically cached $form_state available to form_set_error(), because it's annoying that that's our one place where we don't have $form_state available, and I tried to add it in the original #limit_validation_errors issue (#370537-162: Allow suppress of form errors (hack to make "more" buttons work properly)), but chx removed that in the following comment and it stayed removed in the patch that landed.
b) It only pulls things out of $form_state['values'] that fail validation. So, if a developer does the thing we're specifically trying to protect them from, which is having their submit handler try to access
$form_state['values'][SOMETHING_OUTSIDE_A_LIMIT_VALIDATION_ERROR_SECTION]
, it will work as long as they submit valid input, and will only stop working when they submit invalid input. Seems like this isn't ideal DX, because it delays when the developer becomes aware of the bug in their code (they only become aware of it when trying to submit the form with invalid input).2) My earlier recommendation (#6), though after reading the last paragraph of #27, I'm not sure if this is still my recommendation, which avoids the two problems of #25, but has the two down-sides mentioned in the last paragraph of #27:
a) The clicked button's name/value (e.g., 'op') gets removed from $form_state['values'] unless it's whitelisted in #limit_validation_errors. This can be easily fixed by adding a little more code at the end of #6:
b) Other form elements that specify a #value (e.g., 'form_id') in the build phase, and that therefore, are always valid, get removed from $form_state['values'] unless whitelisted in #limit_validation_errors. As fago points out, it may be undesirable at this stage in D7 to require these kinds of tokens to be whitelisted everywhere #limit_validation_errors is used. If we wanted to auto-whitelist all elements that have a builder-specified #value, we could add code in _form_builder_handle_input_element() to track that in some new $form_state variable (e.g., $form_state['forced_values']), but that's starting to feel icky.
So, I'm torn. I see the appeal and annoyance with each approach, and haven't thought of anything better yet.
Comment #29
effulgentsia CreditAttribution: effulgentsia commentedActually, I think this is my recommendation at the moment:
#6 + the fix in #28.2a. And require use-cases that involve a #limit_validation_errors submit handler accessing $form_state['values']['form_id'] and similar tokens to add 'form_id' and the other ones that are needed by the submit handler to the #limit_validation_errors array. This requires the smallest code change to HEAD and I think only adds minimal annoyance to developers who use #limit_validation_errors. Note that none of our 3 submit handlers in core that use #limit_validation_errors require access to a form token value:
field_add_more_submit()
file_managed_file_submit()
poll_more_choices_submit()
Comment #30
fagoI agree with #29. 1)b is really bad DX. The patch in #25 solves the security problem, but introduces DX issues. With the added fix #28.2a the approach of #6 shouldn't affect much code, so I don't think we need a $form_state['forced_values']. So I've implemented that based upon #14 - thus scratched the *private* _form_set_value() function and introduced drupal_set/get_array_deep(). Buttons currently have form values based on #name and the #parents. Thus the attached patch keeps both values in case a button is the triggering element.
Comment #31
effulgentsia CreditAttribution: effulgentsia commentedThe code looks great to me. sun should review this anyway, so I'll let him make documentation suggestions. Also, I think a test should be added to the patch. Not sure if this issue is a beta blocker or not, but if it is, I believe it can be committed as-is.
Comment #32
fagook, I've improved the limit_validation_errors test case to ensure not validated values are not available to submit handlers.
Comment #33
effulgentsia CreditAttribution: effulgentsia commentedI asked chx to review and comment. @fago: while he's doing that, in case you want to do an interim improvement:
1. In IRC, chx requested that drupal_(get|set)_array_deep() not be recursive, since PHP recursion sucks performance and memory. They can be done with a simple foreach loop (in the case of set, a reference that moves during the foreach).
2.
This doesn't feel quite right. Since $form_state is by reference, changing #limit_validation_errors here seems unclean. Also, what if a textfield and a button have the same #parents (bad module developer, bad!)? Is there another way to achieve the same goal? Perhaps simply drupal_set_array_deep($values, $form_state['triggering_element']['#parents'], $form_state['triggering_element']['#value'])?
3.
Can we clarify this test by submitting an explicit 'title' => ''? And a comment that the !isset() doesn't suffer the risk of a false positive, even when empty string is submitted, since empty string is different from NULL. This stumped me until I understood that subtlety.
Powered by Dreditor.
Comment #34
fagoad 1): done.
ad 2): Good catch. Indeed developers fiddling around with #parents could achieve that the value of a not validated form element overwrites the value of the button. I added in your suggestion, as that way we still pass through the value of the button in that case.
ad 3): Added 'title' => '' and a comment.
Comment #36
fagoThere were some problems with string offsets that cannot be referenced when _form_builder_handle_input_element() tries to set the value. Interestingly the recursive function - as _form_set_value() was previously - had not problem with that. Anyway we do not want to set the value in that case, so I just improved drupal_set_array_deep() to make sure it operates with arrays and return whether it was successful.
Comment #37
fagotalked with chx in IRC. He was worried about what made #34 fail, so I had a closer look at it. It turned out that there is an error in the tableselect tests, which manually set $form_state['input'] with wrong values.
Also incorporated some comment improvements suggested by chx.
Comment #38
fagoforgot to remove "return TRUE".
Comment #39
sun2 new very helpful helper functions, 1 fine and fancy form API change, 0 remaining review issues.
Comment #40
chx CreditAttribution: chx commentedLooks very good. I am a little unsure of the names of the new functions but they will do.
Comment #41
chx CreditAttribution: chx commentedSorry crossposted.
Comment #42
jhodgdonchx asked me in IRC to post my opinion on the get/set functions.
I think the names are OK and the code is clear. Though maybe set/get_array_element_deep would be a better name?
Also, suggested minor updates to the doc headers (sorry, in a rush, don't have time to patch and probably the wrapping is incorrect):
Comment #44
ksenzeeReroll with jhodgdon's doc changes (plus an extra @see) and drupal_get/set_array_element_deep function names.
Comment #45
ksenzeeCross-post. Guess we can run it by the bot again though.
Comment #46
fagooh, there is another test case which sets $form_state['input'] wrong. Fixed that too.
I also incorporated the doc improvements from #44, however kept the name drupal_(get|set)_array_deep - as it is not about form elements only. The functions can be used with any array. If still drupal_get/set_array_element_deep is preferred, I'm fine with that too.
Comment #47
sun-1 on the function name change -- "element" adds no valuable meaning to the function names. And, there is no risk of function namespace pollution in this case. Even the "deep" is senseless in reality already. Arrays are deep, by nature. If ya really wanna change tha name, considering a proper [subject]_[verb] namespacing would've more value.
Comment #48
ksenzeeTo me element doesn't imply a form - it just means an array element. I think it's clearer to specify "element" in the name - otherwise it sounds like you're setting the whole array. But if element sounds too FAPI-ish maybe we could do drupal_get/set_array_value_deep.
Comment #49
ksenzeeOkay, what about drupal_get/set_deep_array_value? (I promise I will not bikeshed this too much. It's just a difficult function to understand in the first place, and it could use a good name.)
Comment #50
sunThese function names really don't need to be overlengthy. Keywords "array" and "set/get" are sufficient to let each and everyone understand that they are about handling array values. What else can you, actually, remotely, expect? Is there anything else, possibly?
Why on earth would you want to call a function for that?
Righto, because übercoredevs simplified the only possible array complexity for you. :)
Comment #51
ksenzeeI disagree that these names are clear enough to let everyone understand. I certainly didn't at first glance. And I am thinking of the poor person stepping through FAPI for the first time trying to understand wth is going on in there. It seems like a long function name would help in that case. These functions certainly won't be used often enough to need short names for convenience. But like I said, I won't bikeshed this.
Comment #52
sunNot limited to Form API, thus contained in common.inc.
We've worked with form_set_value() for years. Forms are straight arrays, and this patch replaces that entire function. Thus, array_set_value() would be the direct replacement, but since Array is not a Drupal sub-system or module, and we don't want to risk function name clashes with PHP, we can simply do what we always do, prepend drupal_ and be done with it.
--
However, for both function PHPDocs, we absolutely need more elaborate descriptions, so as to clarify, why someone would want to use them. The examples that have been added do not explain this very first WTF moment. People will ask, why not simply:
"How many needless functions can this bloated Drupal provide and even load on all requests?" :P
Let's try to prevent this reaction.
Comment #53
effulgentsia CreditAttribution: effulgentsia commentedI'm fine with the names of drupal_(get|set)_array_deep(), but throwing some other options into the mix:
Of these 3, I prefer the last, because even though "tree" is the technical term for this, it's not in widespread usage within Drupal, and people are already confused enough with #tree.
#52.2 is better than #52.1 and should be used wherever you can target each sub-index literally. The function is for when $parents is a variable, because for these cases, #52.2 is not an option. (hoping someone else can turn that into a good PHPDoc)
If we really want a short name, I experimented with a drupal_subtree() function in #370537-92: Allow suppress of form errors (hack to make "more" buttons work properly), but it got dropped in subsequent patches: I'm not suggesting we want that function, but it does provide a short name.
Comment #54
fagoad #52:
The patch doesn't remove form_set_value() - just the private _form_set_value() which was sometimes used as drupal_set_array_deep().. But drupal_array_get/set_value() makes sense to me, so I'd vote for that.
Exactly. I give it a try:
"In case the array parents are known beforehand the value should be directly set/retrieved from the array. This helper function should be used only when the passed parents may vary, i.e., the parents are determined by a variable."
Comment #55
jhodgdonHmmm.
If the parents are variables they can still be put into a direct call like $array[$var1][$var2]['etc']
The only case where you couldn't do that is if you don't even know the number/depth of the parents array.
Comment #56
fagoMy sentence says, it's needed if the parents are determined by a variable, not multiple. Anyway it looks like it's not clear :( Perhaps you can think of a better one?
Comment #57
tstoecklerMaybe:
"In case the array parents are known beforehand the value should be directly set/retrieved from the array. This helper function should be used only when the depth of the value may vary, i.e., the number of parents is determined by a variable."
?
Comment #58
tstoecklerI just re-read that and it's not very good IMO, as the $variable doesn't actually contain the NUMBER of parents, but the parents themselves, which in turn may vary in number. So the text is a bit misleading.
Maybe the following (just cut out 3 words):
"In case the array parents are known beforehand the value should be directly set/retrieved from the array. This helper function should be used only when the depth of the value may vary, i.e., the number of parents is variable."
Comment #59
sunComment #60
tstoecklerLooking good.
Comment #61
sunMe merely renamed two strings and copy'n'pasted #58 into PHPDoc. Back to RTBC.
Comment #62
jhodgdonThe current version is not punctuated correctly... you need a ; before an i.e.
How about
"If the array parent keys are known, the value should be directly set/retrieved from the array. This helper function should be used only when the depth of the value may vary (that is, the number of parents is variable)."
Also, "set/retrieved" should not be in there literally. It should be set in the set function, and retrieved in the get function.
Comment #63
bcn CreditAttribution: bcn commentedRe-rolled with the suggested text from #62.
Comment #64
sun"set from the array" => "set in the array".
Powered by Dreditor.
Comment #65
bcn CreditAttribution: bcn commentedAnd the change suggested by sun in 64...
Comment #66
fagofixed missing closing parenthesis:
>+ * the value may vary (that is, the number of parents is variable.
Comment #67
sunThanks! It's time to close down this critical. Further adjustments, if required, can happen in a non-critical follow-up patch.
Comment #68
marcingy CreditAttribution: marcingy commentedJust a reroll to head
Comment #69
marcingy CreditAttribution: marcingy commented#68: non_validated.patch queued for re-testing.
Comment #70
sunConfirmed that #66 and #68 are identical.
Comment #71
Dries CreditAttribution: Dries commentedPersonally not a big fan of these new helper functions. They are anything but easy and not particularly elegant from an API/DX point of view -- it is a bit of a WTF to be honest. Still need to think more about how to make it easier.
Comment #72
sunAt this point in time, I think we should go with the current solution. If it's still a big WTF, then it's only the docs that need to be updated. Otherwise, those new helper functions merely resemble what existing code in HEAD (even D6) already does, just that existing code is more complicated.
Comment #73
yched CreditAttribution: yched commentedField / Field UI has to do that same sort of array navigating in a couple places. I'd love to use those helper functions instead - precisely because the code to do that is tedious to write and to read :-).
Comment #74
fagoad #77: Agreed. Furthermore existing code, as previously in the file module (see #59), even made use of the *private* _form_set_value() function instead, a good sign that such a helper is long overdue.
Comment #75
webchickTotally agreed with Dries in #71. The examples in the docblock might make sense to the people involved in this issue, but they make absolutely no sense to a passerby. If we're hoping for these to be general utility functions for contrib authors (sounds like folks are saying that would be useful), we should spend some time on the DX here.
Probably at the very least need a fleshed-out "real world" example (and in actual @code tags) of how to use both of these. I tried looking for a good example in the patch itself and came up short. :\ Documenting this will probably help with figuring out how to name the functions/arguments a bit more sensibly.
Comment #76
sunI've heavily expanded the PHPDoc, included example code, and even included example snippets that explain that the only other way would be using evil eval(), which is why these helper functions exist.
Comment #77
jhodgdonPlease don't RTBC your own very changed patches! This one needs some work...\
a) Shouldn't these be literal 'b' and 'c' in the first example in this hunk? Either that or set $b and $c so we know what you are talking about.
b) And in the same chunk, it took me a while ot figure out the second comment about "resembled using evil eval()". First of all, probably you mean "reassembled", and second, I think it was confusing to say "this function" here, when the previous line is talking about something else... probably you should put your explanation outside the @code block and make it clearer.
c) Comment (a) also applies to the get function.
Comment #78
sunAnything else?
Comment #79
webchickSorry... By 'a fleshed-out "real world" example' I meant that instead of this 'a' 'b' 'c' stuff, let's show them an actual form with a use case and why one would want to use this function. The use case must exist, because yched mentions these helper functions are valuable, but I can't understand from the current docs why these functions exist nor why I might want to use them.
Comment #80
sunThose real world use-cases exist since Form API was introduced, in various contributed modules. The code of these two helper functions is duplicated in various modules, or even worse, I've actually copied that evil eval() example from a contributed module.
I think that the example values are simple and sufficient enough and at the same time not too complex to explain the functionality. To see other examples, users can click through the list of functions that call these functions. Among them will be that function in File module, which is a nice showcase already. We'll potentially see more.
Comment #81
jhodgdonI'll make some better examples
Comment #83
jhodgdonHow about this?
Comment #84
fagoYep, a FAPI example is better. I improved the documented get call to make use of list($value, $value_exists) and fixed the text to talk about retrieving, not setting.
Comment #85
sunThanks.
Comment #86
Dries CreditAttribution: Dries commentedReviewing this from the Core Developer Summit in Copenhagen.
Could we change this to say something along the lines of:
"Sets a value in a nested array with variable depth."
The function name and the documentation at the top of the phpDoc make it sound as this is the preferred way to set a value in an array. It should only be recommended when using an array with variable depth. Would be great if that could be represented in the documentation and/or function name.
This is very ugly, and won't work for me. It is much cleaner to introduce a third function called
drupal_array_exist()
.If we make those changes, this is RTBC for me and webchick.
Comment #87
Dries CreditAttribution: Dries commentedComment #88
fagothanks.
I know the list() stuff isn't that nice, but usually you need both values - so getting both in one call ala
is just useful.
Additionally if we would have 2 separate functions, this would mean that we would have to iterate all over the array two times, once for checking whether its there, and once for actually getting the value - what would unnecessarily affect performance.
Attached patch incorporates the suggested doc improvement. I've not changed the function name, as I don't think making it even more verbose helps us here. People probably need to read the docs anyway in order to understand why and when to use those functions.
Comment #89
Dries CreditAttribution: Dries commentedwebchick and myself discussed this, and we definitely want to add
drupal_array_exist()
so the calls tolist()
can be avoided. APIs need to be clean.Comment #90
webchickOk, Dries, Fago and I spent like 30 minutes going through this patch. At least 20 of that was me still being extremely confused. :P A big part of that was:
a) The API doc code example is not good for explaining why this function is needed. What helped me was when fago explained that the real issue is the following:
Your code wants to find the 'signature' field. The problem is, that signature field might be in an array like this:
or it might be in an array like this:
So to deal with this situation, your code needs to do some codey thing to figure out the path to the field ($form['signature_settings']['user']['signature'] or $form['signature_settings']['signature'] or...). You end up with a dynamic $parents array.
Without this pair of functions, the only way to get at it is this line of code:
...which is digusting and kills kittens.
The second part of that (eval() being the only other options) is spelled out pretty well, but the initial example totally isn't. Which is why we need to add that other bit about "Don't actually use this function at all, just reference the array values directly."
So let's fix the example code to explain the actual problem, and in the part that says "don't use eval(), let's put the commented out eval line just above the actual line you would call with the new helper function. Capice?
2. Speaking of killing kittens, this makes Dries and webchick cry:
Dear sweet lord mercy, no. From a DX POV this is awful, and a function should simply return success or failure, and not conflate it with values.
Fago explained that this is required, because we can't do a simple "return FALSE/NULL" because then there would be no difference between trying to reference a value that didn't exist (the 'banana' test) and a value that was legitimately set to FALSE/NULL.
So what Dries and I decided is we want a third function, an "array key exists" style function, to handle the TRUE/FALSE part, and the values separately. Fago has concerns about performance here (since we have to do that loop twice), but from a DX POV it is *much* easier to follow code like this:
...than code like this:
3. Finally, the name of this function. drupal_array_get_value() and drupal_array_set_value() sound suspiciously like I should use. I should never use this function. :P This function is for people like yched or fago or people developing low-level form elements (by and large). It also doesn't remotely describe what this function is actually doing, which was at least 10 minutes of the confusion.
So. We did a mini bikeshed session and decided on the following:
...aaaaaaaand done. Fago's working on a re-roll.
Comment #91
fagook, before I start drinking beer, here is the re-roll ;)
Comment #92
sunTBH, and sorry for being direct, but I believe the review issues + rejections given are totally over-complicating this patch.
1) Just because there is a function somewhere, anywhere in Drupal doesn't mean I'll use it. Why should I? Do you do that? I think + hope you are not.
2) We happily lived with form_set_value() in the past. In Drupal 7, we transposed the concept of structured $form arrays to the entire page. Therefore, it's just logical and sufficient to replace that with a more generic drupal_array_set_value(). It seems like we didn't face reality yet: Our renderable arrays are not regular arrays. Thus, this is merely about s/form/drupal_array/, nothing else.
3) The additional drupal_array_get_value() is the counter-part to drupal_array_set_value(). And it's only introduced now, because it was totally missing in the past - I could have used it, but had to re-implement it all over again.
4) The new drupal_array_key_exists() or whatever it is named now is plain performance overhead. It makes zero sense to recurse more than once into an array. Recursing deeply into an array is slow. drupal_array_get_value() is not the only function that returns an array. Are all of those functions considered bad? No, they are not, if there's a legit reason for doing so. That's the case here: We have to return an array to allow calling code to differ between NULL and NULL. Again, recursing more than once is 100% needless overhead for every use-case of this function. To use drupal_array_get_value(), you have to invoke drupal_array_key_exists() -- it simply makes no sense to split it.
To conclude yet again:
- Longer function names do not buy us anything. If a developer happens to use these functions where they shouldn't be used, no amount of docs and no amount of additional words in the function names will prevent that.
- drupal_array_key_exists() is plain performance overhead.
- The phpDoc of #78 and #84 was more than sufficient for the purposes of this function already. People who don't understand that, also don't understand form structures and renderable arrays, so the phpDoc of these functions is the entirely wrong place to document how Drupal's arrays work and need to be treated at run-time.
Comment #93
webchickI don't disagree about the performance overhead. But I'm concerned with people being able to understand what's happening in the code.
This:
...is very clear and makes complete sense.
This...
...makes absolutely no sense, and I must look up the API documentation of drupal_array_get_value() to understand what's going on. A "mixed" return value that includes both "value you want" and "status" is totally inconsistent from any other function in Drupal, apart from the ungodly awful update_sql() which we've thankfully ripped out from D7. Not eager to add more functions like this, ever, unless the performance shock is shown to be so unbelievably dastardly that there's no other option.
And sorry, I was in a hurry before so wasn't quite articulate enough to summarize the 30 minute discussion. The longer function name doesn't just buy us "scary, stay away", but it also buys us a name that actually describes what the function does. drupal_array_get_value() doesn't, remotely. I would expect that it's a wrapper function around some PHP function "array_get_value()" since that's normally why we use the drupal_ prefix. I showed it to a couple other random Drupalistas at the sprint and neither one of them could parse what the function was supposed to do, but having the keyword "nested" helped tremendously.
And the PHPDoc is totally not sufficient because the code examples totally do not actually describe the actual problem. It shows a normal run-of-the-mill form definition, on which you never would have to use this function because you know the structure in advance and can just reference it with $form['foo']['bar']['baz']. The PHPDoc goes on to qualify this itself further down which is a wonderful indication that it's confusing/bad PHPDoc and should be improved.
So, basically, I stand by these requests. Dries stands by these requests. If we can get them banged out we can close another critical tomorrow.
Powered by Dreditor.
Comment #94
sunSorry, but
solely means that we're still afraid that people could use these functions, just because they exist. That can only happen if people don't know PHP, and we totally cannot teach or explain PHP in our phpDoc. Otherwise, why would you remotely ever consider to use a function to get or set an array value? Please explain that logic, because I absolutely don't get it.
um? Can there actually be more precise variable names than $value_exists and $value? I don't get this either. That's most probably the most intuitive way to name the return values of the function.
Also, having you look up API documentation is much more performant than adding this run-time overhead to each and every Drupal site in the wild. But that said, people in the need for this functionality (no one else would remotely consider that such helper functions may exist) will perfectly understand the API documentation added earlier, and will also perfectly understand how to work with those return values.
I stand by #92, we are totally over-complicating things here. phpDoc, if something is still unclear, can be improved in a follow-up issue/patch. A separate drupal_array_key_exists() can happily discussed in a separate, non-critical issue, but it's useless and a plain performance overhead here.
Comment #95
fagoI agree that generic function names like drupal_array_get/set_value() might indeed lead to people starting using it just instead of $array[$value]. People might see the function and think - oh there is another drupal wrapper about some php stuff (as we have it for strings and stuff) and use it - that definitely needs to be avoided.
So I do think the whole discussion helped us to come up with a better docs example + function names. Then drupal_array_nested_value_exists() makes sense and the code using it is indeed more readable. However, as webchick mentioned above I'd also prefer the
list($value, $value_exists) = drupal_array_get_nested_value($array, $parents);
notation to avoid double-iterating over the parents.Anyway - all of that is much better than the previous practice of mis-using _form_set_value() :P
Comment #96
Frando CreditAttribution: Frando commentedI have to agree to sun - I find
list($values, $value_exists) = drupal_array_get_value($form_state['values'], $element['#parents']);
very concise, clear and readable. No need IMO for a second function that also adds a performance overhead. Traversing arrays in PHP is not cheap.Everybody who is in need to use this function (which is not as rare as one might think, and will get less rare in D7) will have to read the phpdoc, which includes an example on how to use the array that's returned. And it's not the only place in drupal we use an anonymous array and list() to have functions return more than one thing quickly and easily, e.g. http://api.drupal.org/api/function/entity_extract_ids/7.
Changing the names to inlcude _nested_, that's a good change. In all this discussion, don't forget that previously, we abused a private function with a totally misleading name and no helpful phpdoc at all, so having public functions in the proper namespace is a huge huge plus in any way.
Comment #97
jhodgdonWhat about this possible alternative pattern:
$ok = drupal_array_get_value($array, $parents, $value) with $value passed by reference? i.e. have the function return true/false for success/failure, and have the value be set in a variable that the calling function provides? Would that make it clearer?
Comment #98
Dries CreditAttribution: Dries commentedEach approach has pros and cons. We thought and talked about them a lot, and at the end of the day, I'm in the camp of using #91.
In fact, I think #91 is RTBC.
Comment #99
yched CreditAttribution: yched commenteddrupal_array_nested_value_exists() / drupal_array_get_nested_value() : isn't that what exceptions are for ? bubble up abnormal / failure conditions without clobbering the natural format for return values ?
Comment #100
jhodgdon#91 is not quite RTBC:
a)
I think this paragraph is very awkward. How about:
To deal with this situation, the code needs to figure out the route to the element, given an array of parents that is either ...
b)
Either needs to be "without this helper function" or "without these helper functions".
c)
Remove the blank line before @code, since this belongs with the text in one logical paragraph.
d)
This is a comma splice (it is illegal to connect two complete sentences by a comma in English). Use a semicolon ; or make it into two sentences.
e)
This is good PHP syntax sort of, but not so good in English. How about else -> otherwise?
Comment #101
sunSorry, but this does not work for me. Can we please be a bit more explicit than "we discussed this today and let's do X"? No amount of discussion can eliminate the performance issue of recursing multiple times into an array.
Comment #102
drunken monkeyI'm also in favor of doing this the more performant way with the returned array and
list()
, or the exception suggested by yched (if this wouldn't be a regular case – but probably that wouldn't be as good). Even passing a reference would be better, in my opinion.However, if Dries is already set on the three-functions approach: Here is a re-roll taking Jennifer's concerns into account. Only comments were changed, the patch still applied cleanly.
Comment #103
jhodgdonAgreed with #101 and #102. I think this API is a bad idea -- where else in core (or in any other project's decent API) do we require you to call one function to see if another function will work? The pattern of "return success/failure and pass in a reference" is pretty common in PHP, and a much better idea than this one IMO. Or the return-a-list idea, which is a fairly common pattern in PHP (and EXTREMELY common in Perl APIs) as well.
But if the Powers That Be are set on this idea, then I agree that the patch in #102 has fixed the concerns I expressed in #100.
Comment #104
fagothanks. Imo #102 is ready now -> setting RTBC as it implements the desired approach.
>"return success/failure and pass in a reference"
That would mean one would have to create the variable for returning the reference beforehand in order to be php 5.3 compatible:
Rather weird too. list() is the PHP solution for returning multiple variables, so we should use that or stick with the three functions.
@performance:
For that specific use-case I don't think it is that critical, as it runs only once per section to which validation is limited. However it might be different in other use-cases and lonely the need of unnecessarily diving into the array two times is awkward. In fact, when I want to to use the functions for a check + get the value I'd probably think - WTF I can't do it at once?
Comment #105
sunThe arguments against an array as return value are moot and have not been justified or clarified further. While the getter may not be used much throughout core currently, it's possible that contributed modules may use it to perform heavy alterations on the renderable $page array on every request, so performance has to be a concern here, and it makes no sense to make something slower just because people fail to read docs.
I've kept the _exists() (<- Function names are too long to remember now), as there's a single use-case.
Comment #106
jhodgdonGiven all of the documentation in the get function, I don't think it needs to say "see the set function for details"? I think all the details are right there.
Also:
List items should start with a capital letter, and I think this should say "if it exists, or NULL if it doesn't".
More concisely: "TRUE if the nested key exists, and FALSE if not". Also should start with capital letter.
Also as a note - see what the exists() function does:
These two functions should have exactly the same wording for this line.
Comment #107
sunIncorporated #106.
I've yet to see another critical issue that gets hold off by so many minor documentation issues. Why is it possible to do such improvements in follow-up patches for other issues, but not for this one?
Comment #108
jhodgdonSorry sun. If you get jhodgdon involved in reviewing patches on an issue, you will have doc wording reviews.
My feeling is that if every issue got "held up" by the documentation issues, i.e. if every patch got reviewed for documentation, then we would (a) have better documentation and (b) have a MUCH smaller documentation issue queue. It's huge right now, and there are only a few people interested in working on it. The follow-up doc issues mostly just sit there. So I think it's much better to get the doc right in the original patch.
But that's what you'd expect the maintainer of the API documentation to say...
Comment #109
jhodgdonAnyway, that patch in #107 is fine with me.
Comment #110
sunComment #111
webchickBecause you're adding new API functions with confusing arguments and confusing return values. Just because a bug is critical doesn't mean we rush in a sub-optimal API docs that might (and given our history, probably) never be fixed.
API docs look good to me now. Dries felt pretty strongly on Sunday that we should have that interim "_exists" function, but I think the performance arguments against that are pretty strong. I still hate the values coming back from the functions mixing "values" and "random status return parameter", but the only real alternative is passing &$ret around like we do in update_sql() and that seems even more sub-optimal.
Since Dries isn't here at the code sprint to discuss, I've decided to commit this to HEAD. But of course if Dries feels differently, feel free to revert.
Comment #112
effulgentsia CreditAttribution: effulgentsia commentedYay that this got in. One more critical down!
Sorry I didn't have a chance to review prior to it landing. But there's one key bug in it:
This returns FALSE for $value_exists even if the final parent key exists, but its value is NULL. This is precisely the condition we need to return TRUE for.
But otherwise, I actually agree with webchick and Dries that it's not ideal DX to return a list from drupal_array_get_nested_value(). Whether we like it or not, PHP uses isset() vs. array_key_exists(), and even if we disagree with PHP's choice in that matter, we should try to be consistent with it, because consistency improves DX (I think). Here's a patch that changes to that, including a consideration where performance needs require the logic to be inlined.
It's an API change, but to an API that's only existed for 5 hours, so I think that's okay.
Powered by Dreditor.
Comment #113
rfayI have a new #fail in the form_example multistep (#8). This worked until today's commit of this issue. I'm not figuring out what happened here.
It's in Form Example Tutorial, #8. (Examples project, #8, which is multistep).
You can see the #fail (with the actual page) in http://qa.drupal.org/pifr/test/26934 - Basically what happens is when the $form_state is reset when the next button is hit, it doesn't seem to show up properly when it gets to the next page.
Hoping this isn't a regression, but I can't figure out how to make the code work correctly.
To make things worse, I can't replicate it in my local, and that's why I put the debug() statements in and committed them (to see on the bot).
Comment #114
sunFor the same reasons stated earlier, I disagree with that change. Why didn't you simply change isset() to array_key_exists() in the getter? That's the bug that needs to be fixed, nothing else.
Comment #115
sunComment #116
sunAlthough the function argument consistency with array_key_exists() is nice, the resulting function argument inconsistency with the other drupal_array_nested_*() functions is ugly. I'm rather opposed to that change, but I don't find it important either.
Comment #118
effulgentsia CreditAttribution: effulgentsia commentedI'm going to roll another patch on this later today, unless sun or someone else wants to beat me to it.
I like what jhodgdon said in #103 a lot [referring to #98, not what landed]:
And this is a great rebuttal against my #112 patch. But I still don't like drupal_array_get_nested_value() returning a list. The other functions that we have in Drupal core that return lists are pretty clear about it in their name. entity_extract_ids() obviously returns multiple different ids. The only drupal core function I could find that returns a list despite not making that clear in its name is ajax_get_form() which does not return $form (and is therefore inconsistent with drupal_get_form()), but rather a list of $form, $form_state, and other stuff. But I think that function name/signature combination is unfortunate, and I don't like the idea of introducing a second name/signature WTF into Drupal core.
Plus, $key_exists is not an indication of whether the function worked. Whether the key exists and the corresponding value is NULL or whether the key doesn't exist, both are "working" conditions, and it's legitimate to return NULL either way. Potentially, most callers of this function won't care why the result is NULL. FAPI cares, but it's because FAPI is complex and has to deal with a lot of subtlety.
So, following jhodgdon's alternate suggestion, what do you all think of this:
Thought I'd get feedback on this, before rolling another patch.
[EDIT: Also, returning lists is common in Perl APIs, but Perl supports better handling of scalar and list contexts, where if the result of the function is being assigned to a scalar, then the first item in the list is assigned to that scalar. Sadly, no such luck in PHP.]
Comment #119
effulgentsia CreditAttribution: effulgentsia commentedAnd just to clarify #118, we could still have a wrapper function of drupal_array_nested_key_exists() for code readability where we're not as worried about every microsecond, but in performance-critical code, drupal_array_get_nested_value() could be used to return both pieces of information.
Comment #120
effulgentsia CreditAttribution: effulgentsia commentedHere's the patch I promised in #118. Ok, tear it apart :)
Comment #121
effulgentsia CreditAttribution: effulgentsia commented@rfay: I haven't had a chance yet to look into the details of #113, but it is definitely possible that #107 landing caused a regression with your form example, if that example was making assumptions about $form_state['values'] containing data that is outside of what is allowed by #limit_validation_errors. This issue has an "API Change" tag for a reason, and implementing this security was critical. OTOH, if there's some *unexpected* regression with multistep forms, we certainly want to get to the bottom of that.
Comment #122
yched CreditAttribution: yched commentedre rfay / effulgentsia #113 / #121 : yes, similarly this broke Field UI's formatter settings form on "Manage display" screens - #896162: Write tests for the Field UI formatter settings form. Simple fix.
Bottomline is : if a submit button uses #limit_validation_errors, and its submit handler or some associated multistep form building logic rely on the presence of specific values in $form_state['values'], those sections need to be explicitly included in #limit_validation_errors, You cannot go with a simple
'#limit_validation_errors' => array()
anymore, or $form_state['values'] comes up empty.Comment #123
yched CreditAttribution: yched commentedBTW : it seems the explanation for #limit_validation_errors in the PHPdoc for form_set_error() needs to be amended after the changes that got in here :
"... and the submit handlers will run despite $form_state['values']['step2'] and $form_state['values']['step2']['groupX']['choiceY'] containing invalid values"
$form_state['values']['step2'] / $form_state['values']['step2']['groupX']['choiceY'] are now not populated to begin with.
Comment #124
sunThe compromise in #120 would work for me.
Comment #125
fagook, once again... :/
This functions are only used in complex scenarios, so I think this difference is usually important. Still I agree that the possibility to use the function cleanly if the check is not necessary, is a plus.
Ouch. Thanks for spotting this.
Do we really need to bother users of the function with that implementation consideration?
Uhm, so we say to developers "this function is slow, so use something else if necessary.". That forces developers to decide which way to go each time the function is used, what makes using it really not simpler. Let's better remove this extra wrapper function - if one needs the distinction between NULL values and $value_exists, the way to go is clear: the unpleasant but fast way.
Comment #126
rfayHere's what I think we're saying now:
#limit_validation_errors can be used to prevent validation, but the unvalidated values are now discarded. As a result there is no way of preserving the values while skipping validation (a very common use case with a "back" button, of course).
Do I get it now?
If this is correct, then a significant update to both the form_set_error() phpdoc and the Form API reference are in order. Neither explains this.
This also makes #limit_validation_errors quite a lot less useful, as my general usage was to preserve the values and allow later validation. I guess I can see what that might mean. And the title of this issue seems to say as much.
Comment #127
sunIncorporated #125 and cleaned up a bit further. Also moved the note about PHP's array_key_exists() arguments similarity into the function body, although I originally wanted to remove that entirely, since I was quite confused after reading it and asked myself why I have to read that. ;)
Didn't work on rfay's documentation update request.
Comment #128
sunMay I additionally request to return the nested array value by reference?
Doing so allows to skip drupal_array_set_nested_value() in some cases, and I don't see how returning a reference could break anything.
Furthermore, we should add type hinting for $array and $parents to all of the helper functions added here.
Lastly, after having used these functions in a few places already, I'm very annoyed by their needlessly overly lengthy and verbose names. It's totally ugly to type them and have a function name that almost takes up 80 chars on its own. Could we revisit that decision, please? drupal_array_set|get_value() would be sufficient (and also long enough).
Powered by Dreditor.
Comment #129
sunCan we get this follow-up in first (so that the current function signatures are not too long in the wild) and care for any necessary documentation adjustments afterwards?
Comment #130
effulgentsia CreditAttribution: effulgentsia commenteds/$value_exists/$key_exists/
+1. If you think it won't be controversial, put it in now; otherwise, let's leave it to a follow-up.
+1. I'm fine if my above docs nit waits for a follow-up as well.
fago, jhodgdon, drunken monkey, Frando: any chance one or all of you can give this a quick review, since you had opinions on function signatures in earlier comments?
Comment #131
sunFixed that $key_exists. #128 as well as this patch already contains the tiny by-ref tweak.
Comment #132
fagoI still think list() PHPs method of returning multiple vars and we should use it if that is required, however as written in #125 this approach has the plus of being neatly usable if the NULL distinction doesn't matter. So I'm fine with changing it to the reference method.
Double-if (If if).
This duplicates the @return docs and is explained rather confusing. I'd prefer a simple docu of what it does regardless of the function return value + a short hint that checking the return value might suffice.
Comment #133
sunThanks, fago! Incorporated the requested changes of #132.
Comment #134
effulgentsia CreditAttribution: effulgentsia commentedsun, fago, and I like this, so RTBC. Besides, based on earlier comments from Dries and webchick, I think they'll like it too.
Comment #135
webchickGreat, now we're talking! Much better, thanks. That third parameter is a bit weird/non-standard, but at least it makes the return values far more sane for the 99% case.
Committed to HEAD.
Comment #136
effulgentsia CreditAttribution: effulgentsia commentedAll that's left on this issue now is #limit_validation_errors docs follow-up as per #123 and #126.
Comment #137
andypostVarible name in example is wrong
Comment #138
webchickCommitted that one, too.
Ok, now hopefully this issue will finally stay fixed. :)
Comment #139
sunAll that's left on this issue now is #limit_validation_errors docs follow-up as per #123 and #126.
Comment #140
the greenman CreditAttribution: the greenman commentedI have not seen anyone bring this up yet, but isset does not detect the difference between strings and arrays. This means that $ref = &$ref[$parent]; will WSOD if $ref is a string.
Here is a little example from the php contrib manual (http://www.php.net/manual/en/function.isset.php#98109):
This issue causes both drupal_array_set_nested_value and drupal_array_get_nested_value to WSOD if an attempt is made to create a reference within a string.
There is some typehinting in drupal_array_get_nested_value, but, that only works when the initial variable is passed in. If $parents is set incorrectly, and looks too deep into the array, then everything dies.
If we are avoiding checking for arrays with in_array, there is a faster option (http://uk.php.net/manual/en/function.is-array.php#98156), but it is a bit ugly.
Comment #141
rfay#111 broke #913846: Image/file field breaks after uploading two files. I don't know whether it's because of some missing changes in the file ui, or because of a problem in this issue.
Comment #142
tim.plunkettThis may also be the cause of #913528: Create new boolean field "Cannot create references to/from string offsets nor overloaded objects".
Comment #143
sunComment #144
Alan D. CreditAttribution: Alan D. commentedAnd maybe #935772: The function of the remove image button doesn't work correct too.
Comment #145
effulgentsia CreditAttribution: effulgentsia commentedI haven't studied in-depth, but I found KarenS also raising some concerns about the implications to Date fields in #929494-15: Everything is broken again.
Comment #146
yched CreditAttribution: yched commentedShameless plug : #941620: #skip_validation_errors : complement of #limit_validation_errors (D8 feature request)
Comment #147
kylebrowning CreditAttribution: kylebrowning commentedsubscribe. This bug is currently stopping services from having a 7.x release.
Comment #148
rfay@kylebrowning, there are no open tasks on this except documentation. This has already been committed.
I'm not sure that the ground has fully settled on this, and there may have been bugs introduced, but if so they have not been identified. There are almost certainly a number of bugs remaining in core that are due to less-than-adequate usage of #limit_validation_errors.
Comment #149
kylebrowning CreditAttribution: kylebrowning commented@rfay, after downloading the latest dev snapshot of drupal7, i now receive this error upon attempting $ret = drupal_form_submit($node->type . '_node_form', $form_state, $node);
Fatal error: Cannot create references to/from string offsets nor overloaded objects in /Users/kylebrowning/Sites/drupal7/includes/common.inc on line 6006
Comment #150
rfay@kylebrowning, then I encourage you to search the issue queue to see if somebody has reported that, and if they have not, file an issue.
Comment #151
Alan D. CreditAttribution: Alan D. commentedThis change has appeared to upset a number of contributed modules and the core Image removal functionality..., I got errors when trying to remove an image that also lead to the exact position in the form include. Was this change tested against array based values? I.E. Date, Image, ... Is there something that needs to be added to the element info declaration?
Comment #152
sunIt was wrong to add that "_nested" to the function names. The function names can be barely remembered now. Even though they are being used in many more locations now. Everyone has to look up their names in common.inc or api.d.o... Lots of code is going to be much harder to read and understand. It's as if drupal_render() was named drupal_array_render_element(), or element_children() was named drupal_array_get_nested_element_children(). Sucks. Heavily. Sad sun. :(
Comment #153
effulgentsia CreditAttribution: effulgentsia commented@Alan D.: Please see #926016: Several bugs when trying to remove files from a multivalued file/image field, and if you still get bugs with removing images even after that patch, please add a comment on that issue explaining the bug. Please also see #929494-21: Everything is broken again for a discussion about the Date field.
@fago: Care to comment any further on Alan D.'s question. Do you think there's any fix needed here to help make elements with array values easier?
Comment #154
fagoI don't think this patch caused troubles for element with array values. It hasn't changed the value assigning logic, it just made it use the new functions. However - by design - this patch introduced some troubles with forms that accessed not validated form values previously.
>Do you think there's any fix needed here to help make elements with array values easier?
Well, I think it is not intuitive to use for such elements now, so we definitely need to improve that for d8 -> #768312: Cleanup form element processing and value assignment However we might want to have a look at streamlining the approach I outlined in #929494-21: Everything is broken again.
Comment #155
effulgentsia CreditAttribution: effulgentsia commentedPlease see #745590-24: #managed_file element does not work when #extended not TRUE, or when ancestor element doesn't have #tree=TRUE for how this broke #managed_file, and please review the fix in the corresponding patch. Thanks!
Comment #156
jhodgdonAdding tag. It looks like this needs some updates in various API documentation places, as well as an entry in the 6/7 module update page?
Comment #157
jhodgdonI think this issue is only open for documentation, but what needs to be documented is not clear to me.
Could someone please post an issue summary here, with regards to what needs to be documented:
a) Things that need to be documented on drupal.org for developers of new modules.
b) API changes that need to be documented in the 6/7 module update guide for people updating modules from Drupal 6 to 7.
c) Documentation changes that need to be made on api.drupal.org
Thanks...
Comment #158
jhodgdonSince no one has provided any information on what needs to be documented in the update docs, I'm going to try another tactic and ask that the people involved in this issue make a change record node instead. I'm adding a note to the module/theme update pages for d7 to also look at the change list page, because there are already a few d6/7 changes there anyway.
Comment #159
fagoI think the form_set_error() docs related to this need updating, they are partly outdated:
Else, we should probably mention that behaviour in the form api reference.
As #limit_validation_errors is a new feature for d7, I'm not sure whether this should be included in the update docs - there is no d6 <-> d7 API change due to this issue. Thus, we probably don't need a change notification either?
Comment #160
jhodgdonRE #159 - those are probably best filed as one or two separate follow-up issues (drupal core / component documentation for form_set_error(), and documentation project / api docs files, tagged "FAPI reference" for the FAPI reference).
Regarding the change notification, this issue was tagged "api change", so unless that is wrong, it probably needs a change notification?
Comment #160.0
xjmInitial, partial summary written by btmash.
Comment #161
fagoIt was an API change that changed a new Drupal 7 feature which doesn't exist in D6 at all. So for people upgrading there is no API change, but just a new feature.
The change went in during Drupal 7's alpha phase, so it's an API change between d7 alpha releases. Are we documenting those?
Comment #162
jhodgdonAs long as it is not an API change from one released actual version (not alpha/beta) to another, no we don't need to document it. Thanks for the clarification!
If someone who understands this issue could file the follow-up issues (see last few comments), that would probably allow us to mark this fixed finally.
Comment #163
rfayI agree that this old stale issue doesn't need an API change notice.
However (and it's the wrong issue for this of course) but I don't agree that we only have to document changes between releases. The D7 changes were quite rough... We'd do one thing and then API-change it later, breaking all existing contrib. Once something is established in D8 it needs an API change notice when a change happens.
Comment #164
jhodgdonThere is no reason we can't use API change notices for that. Just mark it up correctly and it will be clear when the change happened.
Comment #165
Stepan Kabele CreditAttribution: Stepan Kabele commentedSolving this issue made hard to have Save Draft functionality implemented in D7. I understand that for security reasons approach like http://drupal.org/project/draft is safer, but it fundamentally can not work with more complex field types. Also this change caught many by surprise, recommended solutions for save draft functionality stopped working.
I found a way how to allow some selected fields to be saved unvalidated and I would like to ask if it is acceptable solution at least for D7 or if it is only hack which may easily stop working with any update:
In hook_form_alter add custom validation function like:
array_unshift($form['#validate'], 'mymodule_validate');
and in this custom validation function change #limit_validation_errors:
$form_state['triggering_element']['#limit_validation_errors'] = XXX;
where XXX contains fields which will be removed in drupal_validate_form, and is subset of original #limit_validation_errors. In extreme XXX can be set FALSE to allow all unvalidated values to be saved.
Comment #166
sun@Stepan Kabele: I'm fairly sure that usage is wrong and that the security team will contact you in case you publish a release that uses the approach. It's basically the kind of possible code that @effulgentsia hinted at in #3.
Comment #167
Stepan Kabele CreditAttribution: Stepan Kabele commentedWhy is ability to save node without some selected validations automatically security risk?
Common real world use case: Save Draft feature. Imagine content type with 100+ fields (some large questionnaire). It would be nice allow user to save unpublished node with some selected validations disabled (like required values) as a draft to be finished later, then saved as published with all validations.
Comment #168
rooby CreditAttribution: rooby commented@Stepan Kabele:
You cannot use #validate functions to affect required validation because required validation is done before your #validate functions are called.
For a solution that allows skipping required validation without touching other validation check out #1786442-4: Allow saving drafts with missing required fields
Comment #168.0
rooby CreditAttribution: rooby commentedUpdated issue summary.
Comment #173
dgtlmoon CreditAttribution: dgtlmoon commentedHmm, sorry for bumping, is this still an issue? I'm researching #2313517: Cannot create references to/from string offsets nor overloaded objects, should this issue be moved to D8?
Comment #174
amateescu CreditAttribution: amateescu for Chapter Three commentedThe patches here have been committed a long time ago, and keeping the issue open is just confusing at this point, so I'm going to mark it fixed. If anyone understands what are the documentation improvements requested in #123 / #126, please open a followup :)