Closed (fixed)
Project:
Grammar Parser
Version:
7.x-1.x-dev
Component:
Code
Priority:
Normal
Category:
Support request
Assigned:
Reporter:
Created:
16 Mar 2012 at 13:09 UTC
Updated:
13 Aug 2012 at 14:11 UTC
Jump to comment: Most recent file
Comments
Comment #1
jhodgdonReproducible. If the file has been recently touched, or if the job queue has been reset, running drush cron also triggers the segmentation fault. If the job queue hasn't been reset, or the file hasn't been touched, there's an expired parsing job in the {queue} table that keeps it from being parsed again. Which is how I knew which file it was, by the way.
Comment #2
jhodgdonThis is crashing in grammar_parser, so moving over there.
This bit of code is enough to crash it:
As I mentioned above, this crash causes a segmentation fault when I run cron on the command line. When I put the above code into a PHP-format block (eek!) on my test site, I got a WSOD. Normally I don't get WSOD, since I have xdebug running, and it will give me a stack trace for PHP problems, but this is a bit more drastic I guess.
Anyway... since I don't know much about the inner workings of grammar_parser, I'm moving it over to that issue queue in hopes that you know more than I do. :)
Meanwhile I will also try deleting parts of the offending file to see if I can narrow down where it is crashing. I had originally suspected some of the lines with backslashes in them, but there are plenty of other files that are parsing with similar lines in my repository without problems, so I don't think that's it. Anyway I'll report back shortly with a smaller test file if possible.
Comment #3
jhodgdonHere's a much smaller test file that also crashes the parser.
Comment #4
solotandem commentedThe Symphony code uses namespace syntax which the parser has not been updated to handle. This is a feature request.
Comment #5
jhodgdonHm. But why would it crash, and why on just that one file? There are a lot of other Symfony files (and also core 8.x Drupal database class files now) that have namespaces in them, and they don't cause crashes. Also, this crash is a segmentation fault, not just a mere PHP error... very odd!
Comment #6
jhodgdonYes, it is just that namespace stuff. If I take out the
it parses OK.
Interestingly, if I leave that part in, but take a few of the lines from the method out, it also doesn't crash. Also, the other namespace stuff at the top of the file doesn't cause problems, and the rest of the Symfony and other Drupal code with namespaces doesn't cause crashes either. Just this one file.
Anyway... This is probably going to be impacting api.drupal.org soon if it isn't already (seg faults during the cron job are not a happy situation, although perhaps it is a quirk of my OS/etc. that it is segfaulting? I don't really know).
So do you think this can be fixed sometime soon, and/or do we need to recruit help in order to get it fixed?
Comment #7
solotandem commentedPlease try this patch on several files with namespace syntax; report your findings.
Comment #8
jhodgdonAwesome! I'll test it out later today. Thanks!
Comment #9
jhodgdonWith this patch applied, I cannot get the regular (no namespaces) tests for the API module to run with our sample code. For instance, the test "Branch and object creation" in api/tests/api_objects.test (which basically just parses the files in api/tests/sample and verifies they were all created correctly) crashes with a PHP error, indicating not everything was parsed. Also before it got to that PHP error, it failed a bunch of the tests that see if the class inheritance was found correctly.
So, I didn't get to the point of testing the namespaced code, since non-namespaced code isn't working.
By the way... We have 15 tests right now in the API project. But from a Grammar Parser perspective, they are pretty much the same as far as what they are parsing. That Branch and Object Creation test is fairly quick, and is probably a good one to adopt if you want to just have a reality check of "grammar parser still works with the API module".
Comment #10
solotandem commentedTwo typos in the patch.
I tested this patch with namespace and use statements and some namespaced expressions such as $foo = \my\space\bar(). The patch is not intended to parse namespace code surrounded by brackets nor a statement that begins with a back slash.
Comment #11
jhodgdonI'm still getting the same test failures as with the previous patch applied. See #9.
Comment #12
solotandem commentedI am able to parse the entire API project (excluding tpl.php files). However, there is an issue in one of the Symfony files, interestingly not related to namespaces, but to an array of "new foo()" items. I am looking into the latter.
Are you using the patch with the latest release parser release?
Comment #13
jhodgdonI am running Grammar Parser and API module off of git, and then I applied the patch in #10. For API I am on branch 6.x, and for Grammar Parser, I seem to be on branch "master" (is that correct?).
With that setup, when I run the API module's tests, specifically the Branch and Object Creation test, quite a few of the test assertions fail. The ones that fail seem to be related to testing that the "class A extends/implements class/interface B" things are working.
Comment #14
solotandem commentedIf the files are being parsed properly (which occurs for me) but the tests are failing, then it may be due to another cause. What do you think?
Comment #15
jhodgdonDo I need to make any changes to the API module to use the new code? In other words, are there any changes to the output?
So I just tried applying the patch again (I had removed the patch), and reparsed the "sample" project that I have on my test API site (which parses the code in api/tests/sample).
It is not working correctly. For instance, we have in classes.php:
But when I go to the page for this class, it doesn't show me the class hierarchy. That indicates to me that this information is not being provided any more by Grammar Parser (which is the same problem that happened in the test). The hierarchy is also missing from other classes in the sample.
The relevant code from api/parser.inc is:
I haven't run this through a debugger, but based on the output I'm seeing, I would say that $statement->type is correct, but $statement->extends and $statement->implements have either changed or are not present.
Comment #16
solotandem commentedYou are correct. In the patch the extends and implements properties are no longer simple arrays but PGPList objects to accommodate the namespace syntax. I am not settled on how it is parsing now; I would rather each namespaced item be a single item in an array as before. But the tokenizer does not lend itself to this directly. I need to think about this.
You could try the following type of change to be consistent with what was there before (I suggest this without testing it :)):
$docblock['extends'] = array($statement->extends->toString());
Comment #17
jhodgdonAh. Well that explains it. I will try that type of thing out. I'll need to make a change that is consistent with the old or new code I guess.
Comment #18
jhodgdonI am now running the 7.x branch of the API module, using Grammar Parser installed as a library... and I'd like to get back to this issue.
Basically, we need to be able to support namespaces in the API module eventually. The support I think I would ideally want from Grammar parser would be:
a) Support of use/namespace declarations - it should return those as individual statements in $reader->getStatements(), so that I can recognize that a file contains them, and act appropriately.
b) Obviously, not crash when encountering use/namespace declarations, or ad-hoc uses of namespaces in code (see comment #3 above for an example of a file that crashes the current, unpatched Grammar Parser, using the code snippet in comment #2 to parse it).
c) Do something appropriate in the PGPEditor::statementOperandToText() function if there is a namespace involved (or have another method that deals with the namespace?). I'm not sure what "appropriate" would be, but here are some examples of namespace-using code, and I would need to be able to tease apart the namespace and the class in these cases:
http://us.php.net/manual/en/language.namespaces.basics.php
I think that's all the API module would need... I've outlined what I want the API module to do with namespaces in the issue summary of this issue
#1507476: Support namespaces in API module
and I can't think of anything else I would want/need Grammar parser to do to support this.
So... Does the patch above do all of this? If so, I'll go back to testing it and see what I can figure out.
And by the way, I believe that the namespace and use declarations need to be at the top of files:
http://us.php.net/manual/en/language.namespaces.definition.php
http://us.php.net/manual/en/language.namespaces.importing.php
There is a *discouraged* allowance for multiple namespaces in a file, but in any case they have to be outside of any other PHP code (not embedded in any other code):
http://us.php.net/manual/en/language.namespaces.definitionmultiple.php
Comment #19
fgmI just parsed the current Symfony and all of D8, and found 4 files which caused that same error, including the one originally mentioned.
Comment #20
jhodgdonRE #19 - Which specific files trigger the problem change over time, as Symfony files checked into Drupal 8.x change. That is why I attached a few specific files to comments above, so we could have something definite to test against.
Comment #21
fgm@jhodgdon: indeed, when I puller later yesterday, a whole new set of files has been added and failed too.
This is for a class inheritance graph I just wrote, BTW: maybe in some form it could be used within or along api.module to provide such graphs on a.d.o, just like doxygen has been doing for quite long ( http://drupal.org/sandbox/fgm/1553284 ).
Comment #22
jhodgdonThis issue is on Grammar Parser, so let's keep the discussion pertinent. I'll file a feature request on inheritance graphs on the API module though, good idea.
Comment #23
solotandem commentedJennifer, sorry about the delay in posting this patch. It is intended to handle:
- namespace declarations and use statements
- namespace expressions and statements
- namespace code wrapped in braces or not
Comment #24
jhodgdonThanks for the patch! I'll do some testing on this later in the week (possibly tomorrow).
Comment #25
jhodgdonI tested the patch in #23 a bit today.
For the first pass, I just patched Grammar Parser and didn't change the API module code (so it is not making use of any of the new functionality directly).
The good news is that at least in API version 7.x, cron/parsing is still fine (no syntax changes, and the output still looks the same, although I haven't yet run all the tests to make sure).
The bad news is that Grammar Parser is still crashing. I am consistently getting a segmentation fault when it tries to parse file:
core/vendor/symfony/http-foundation/Symfony/Component/HttpFoundation/HeaderBag.php
from the latest 8.x (I'm attaching the particular file it is trying to parse, since 8.x changes rapidly).
I will try excluding that directory and see what other files it still crashes on.
Comment #26
jhodgdonAnother file that crashes with this patch applied:
core/vendor/symfony/dependency-injection/Symfony/Component/DependencyInjection/SimpleXMLElement.php
(attaching this file too)
Comment #27
jhodgdonAdditional files that crash during parsing with the patch in #23 and the latest 8.x code base:
core/vendor/symfony/routing/Symfony/Component/Routing/RouteCollection.php
core/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/Tests/KernelTest.php
core/vendor/symfony/yaml/Symfony/Component/Yaml/Tests/DumperTest.php
Note that after each crashing file I found, I disabled parsing for that whole directory. So there could be other files in the same directory that also crash and I wouldn't know about them.
And another note: in IRC solotandem told me that he has some code that can successfully parse the file noted in comment #25, so he's planning to try posting a new patch soon.
Comment #28
solotandem commentedWith the attached patch, I parsed:
- the file in #25
- KernelTest.php in #27
The output files will differ from the input due to whitespace changes:
- line indents are 2 spaces, not 4
- opening brace on class and function definitions is on same line, not next line
Comment #29
jhodgdonWith the new patch, parsing still crashed for me with file:
drupal-8.x/core/vendor/symfony/http-foundation/Symfony/Component/HttpFoundation/HeaderBag.php
(comment #25).
Comment #30
solotandem commentedMy suspicion is it is not the parsing (as this works fine), but other things that API is invoking from the parser toolbox. The changes to the parsed code structure may require updates to other functions, but that has not been my focus. Is there a possibility you could narrow down where the crash occurs? Or do you have a line number now?
Comment #31
jhodgdonThe crash (which is a segmentation fault, not just a PHP error) is definitely happening in the call to $reader->buildGrammar(). Here is the code that the API module is using:
$docblock['source'] here has been set to the entire contents of core/vendor/symfony/http-foundation/Symfony/Component/HttpFoundation/HeaderBag.php (the same file that was attached to comment #25 above).
Some settings that could be of interest:
Grammar Parser memory limit: 512M
PHP version: 5.3.10-1ubuntu3.2
Comment #32
solotandem commentedI ran the file in #25 through the latest dev release of API project, manually invoking
api_parse_file()with this file. As indicated in #28 and #30, the file parses and the grammar structure is created. I allowed the API code to invokeapi_format_php()andapi_documentation_loop(). Both were successful based on a dump of $docblocks.I ran this test using the Grammar Parser 1.2 release patched with #28 as well as my latest code. Same result. Also, I ran this on a box with 512MB of RAM and a PHP memory limit of 128MB.
Based on these results, this is a support issue for the API project, not the parser. Please create a separate issue for this.
Comment #33
jhodgdonHm. It must be something to do with the version of PHP I am running or some setting, because you are able to run this on your machine, and I cannot.
In my previous tests in this original report, I had completely eliminated the API module -- did you ever get a segmentation fault in PHP at that time (see comment #2 above) running purely Grammar Parser code?
Comment #34
jhodgdonSo... Following what I did in comment #2, I just did the following:
a) Created a block with the PHP input format, and the following code (with an opening <?php tag):
b) Enabled this block on a particular page.
c) If I move the "return" line before the buildGrammar call, the block is fine and shows "It is OK". If I have the block as shown here, I get a white screen of death. (In my cron runs, PHP was having a segmentation fault.)
Apparently you do not have the same problem, but it is not an API module issue (this test does not use the API module)... I guess it is some kind of an internal PHP bug with my version of PHP and not with your version of PHP? Very odd.
Comment #35
solotandem commentedWhat version of PHP are you running? If it is not 5.3+, then some of the PHP token constants referenced in the parser would not be known. Could this be an issue?
Comment #36
jhodgdonI'm on 5.3.10. It could be the PHP modules that I have installed -- I did a Google search and it looks like some of the PHP modules have memory issues that can cause the PHP process to crash. Anyway, it isn't really a Grammar Parser problem at this point. I might try to narrow it down, and I might try to disable some PHP modules temporarily to see if the crashing goes away.
Meanwhile, I can parse almost all of the D8 source code, and I have the 4 Symfony directories that crash on my machine excluded, so I should be able to try out the new grammar parser code and see how to integrate it into the API module in the best way. Thanks!
Comment #37
solotandem commentedReverting this to a support request as most of the comment content is related to PHP issues not the grammar parser code. As mentioned above, this module successfully parses the Symfony files referenced.
Also creating a separate feature request issue.
Comment #38
jhodgdonGood idea. I guess that's
#1704294: Add support for use and namespace statements and expressions
and I think we should just close this one as "fixed", since it's obviously a PHP bug and not a Grammar Parser bug that is causing crashes on my system.