Firstly, I don't expect anyone to work on this patch or incorporate it into this module... it's purely here for reference and for use in Drush make files, or to help other people who have a situation like mine.

Aim of this patch: to add a redirect to the core .htaccess file of D7. The reasoning: I frequently deploy sites using Drush make, which includes Drupal core, and Drupal core ships with a robots.txt file that has to be deleted before this module (robotstxt) will work correctly.

This patch adds a rewrite rule (based on this discussion) that always redirects requests for the physical robots.txt file to index.php, so the response can be managed by the robotstxt module.
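For reference, a rule along these lines would do it. This is a hypothetical sketch rather than the exact contents of the attached patch; it assumes mod_rewrite is enabled and that the rule sits inside core's existing <IfModule mod_rewrite.c> block, before the generic pass-through to index.php:

  # Always hand robots.txt requests to Drupal's front controller so the
  # robotstxt module can answer them, even though the physical file exists
  # (the stock pass-through rule skips paths that match a real file).
  RewriteRule ^robots\.txt$ index.php?q=robots.txt [L]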


Comments

hass’s picture

Status: Patch (to be ported) » Needs work

Please create a patch for the readme.

johan.gant’s picture

Hi hass,

Sorry, I'm not sure what you mean by this. Do you want a note in the readme.txt file of this module?

Thanks

hass’s picture

Yes

jp.stacey’s picture

@hass would it be best for you to write the patch? It's not clear on this thread exactly what you want the changes to the readme to even *be*, let alone how to write such a patch!

@johan.gant: thanks for this! It at least mitigates the problem of hacking core in an untrackable manner, which has always been a stumbling block for using this (or any) module that requires a core hack, since that makes core upgrades harder.

jp.stacey’s picture

When this .htaccess fix is in place, the robotstxt_requirements() check on the site status report fails, because it is still checking for the robots.txt file on disk. But the real acid test of whether robots.txt is "hidden" is now "can the robotstxt module deliver a dynamic response over HTTP?", not "does file X exist on disk?".

Please find attached a patch that fixes this. It also adds more verbose installation instructions to README.txt, referencing the core .htaccess patch.

(Note that this second patch is worthwhile regardless of whether you use the first: it makes the requirements check behave the way a site crawler would, so it's a more robust test.)
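For the curious, the check might take roughly this shape (a hypothetical sketch, not the attached patch; the requirement key and warning text are invented for illustration):

  <?php

  /**
   * Implements hook_requirements().
   *
   * Sketch: instead of looking for robots.txt on disk, fetch it over
   * HTTP and report a warning when the request fails.
   */
  function robotstxt_requirements($phase) {
    $requirements = array();
    $t = get_t();

    if ($phase == 'runtime') {
      $response = drupal_http_request(url('robots.txt', array('absolute' => TRUE)));
      // A physical robots.txt served directly by Apache would also
      // return 200, so a real check would need to verify the body came
      // from the module (e.g. by comparing the configured content).
      if ($response->code != 200) {
        $requirements['robotstxt'] = array(
          'title' => $t('RobotsTxt'),
          'severity' => REQUIREMENT_WARNING,
          'value' => $t('robots.txt could not be fetched over HTTP (code @code).', array('@code' => $response->code)),
        );
      }
    }
    return $requirements;
  }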

jp.stacey’s picture

Status: Needs work » Needs review
hass’s picture

Version: 7.x-1.0 » 7.x-1.x-dev
Status: Needs review » Needs work

I'm sorry to say, but this is not a reliable test. See #965078: HTTP request checking is unreliable and should be removed in favor of watchdog() calls.

Additionally, there are several code style issues and other bugs in the translatable strings in this patch.

jp.stacey’s picture

Hi @hass,

The current test has been shown to be unreliable, as discussed above, in the sense that it cannot detect when Apache is configured to route around robots.txt on disk rather than requiring it to be deleted. So the worst case is that we're replacing one unreliable test with another, except the new test behaves much more like a web crawler than like a trusted PHP process, which I think is an improvement.

I don't really know what you mean by the HTTP test being unreliable in this context, except in the sense that server configuration in general is "unreliable". That's true, but that unreliability is being handled by that separate issue: right now any failure will be reported on the status page, and if #965078: HTTP request checking is unreliable and should be removed in favor of watchdog() calls gets through, it will be reported to the watchdog logs instead. Either way is fine. Also, that issue has been downgraded from critical to major.

Surely contributed code should be able to call drupal_http_request(), as it's part of Drupal's core API. If you're saying we shouldn't use certain Drupal core API functions, then I'd appreciate any thoughts on an alternative solution to this problem.

You mention code style issues and translatable string issues. I've attached the result of drush coder-review: as far as I can tell, none of the existing warnings come from code in my patch. If you can point out the specific issues, that would be great, as I believe I've used $t() throughout. I also still don't know what you want us to add to the README file (see comment 4), so any suggestions there would help too; otherwise it would be nice to know whether you're happy with the additions I've made.

Right now these two patches together are working well for us, so if anything is stopping them from working for you, it would be great to know what it is.

hass’s picture

Coder is bullsh**. We do not use l() inside t(). We use double quotes only if a string contains single quotes; \' is used only if both double and single quotes appear in the string. Coder should have detected the tabs. We don't use tabs, only spaces.
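For readers who don't know these conventions, a few hypothetical snippets of what is meant (not code from the patch):

  <?php
  // Don't pass l() output into t(); put the link in the translatable
  // string and supply the URL through a placeholder instead:
  $text = t('See the <a href="@url">status report</a>.', array('@url' => url('admin/reports/status')));

  // Quoting: single quotes by default...
  $a = 'A plain string';
  // ...double quotes when the string contains single quotes...
  $b = "Drupal's robots.txt file";
  // ...and \' only when the string contains both kinds of quotes:
  $c = 'He said "delete Drupal\'s robots.txt"';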

jp.stacey’s picture

Status: Needs work » Needs review

Well, the only tabs I could find with grep -P '\t' were in the README.txt file, which I don't think the Drupal coding standards really apply to, so probably neither does Coder. Still, I've changed them to two spaces.

I've also made all the other minor changes you requested, so please find a new patch attached.

jp.stacey’s picture

Testing code was left in the previous patch; re-rolled and attached.

hass’s picture

Component: Miscellaneous » Code
Status: Needs review » Needs work

This is not minor. We also use the @url placeholder for url() values, never !url. I'm also not sure we're allowed to access $_GET directly; that may need to go through a Drupal API.
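For illustration, hypothetical snippets of the two points (not code from the patch):

  <?php
  // @-placeholders are run through check_plain(); !-placeholders are
  // inserted raw, so a url() value belongs in @url:
  $msg = t('Could not fetch <a href="@url">robots.txt</a>.', array('@url' => url('robots.txt', array('absolute' => TRUE))));

  // Rather than reading $_GET directly, Drupal 7 offers a wrapper that
  // returns the query parameters minus the internal 'q' path:
  $params = drupal_get_query_parameters();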

Can we write tests for this, please?
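One hypothetical shape for such a test would be a D7 SimpleTest that fetches robots.txt over HTTP and checks that the module's dynamic output comes back (class name and assertion text invented for illustration; a real test might first set the module's content to a known marker string):

  <?php

  /**
   * Sketch of a functional test for dynamic robots.txt delivery.
   */
  class RobotsTxtDeliveryTestCase extends DrupalWebTestCase {

    public static function getInfo() {
      return array(
        'name' => 'robots.txt delivery',
        'description' => 'Checks that robots.txt is served over HTTP.',
        'group' => 'RobotsTxt',
      );
    }

    public function setUp() {
      parent::setUp('robotstxt');
    }

    public function testRobotsTxtIsDynamic() {
      $this->drupalGet('robots.txt');
      $this->assertResponse(200, 'robots.txt returns a 200 response.');
      // Sanity-check that the body looks like a robots.txt file.
      $this->assertText('User-agent', 'Response looks like a robots.txt.');
    }
  }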

theRuslan’s picture

Does anyone have a final solution?

johan.gant’s picture

Re-rolled the patch from #11 against the latest dev branch. Also replaced !url with @url, as this was flagged as important in #12.

Regarding tests: it's not clear to me what we're seeking to test here. @hass, if this is important to you as a maintainer, could you please list what you're aiming for?

hass’s picture

Component: Code » Documentation
Category: Task » Support request
Status: Needs work » Fixed

Both this issue and #1260912: Provide patch to remove core's robots.txt for use in make files have the same goal, but the .htaccess patch here still causes warnings on the status page and cannot be recommended. Please use the solution in #1260912: Provide patch to remove core's robots.txt for use in make files for make files.

hass’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.