I see Aegir supports the robots.txt module, but really, why should one use a module when a per-site robots.txt can be achieved with simple rewrites? As an added benefit, a per-site robots.txt could be version-controlled. Also, having settings like this in the database generally goes against the recent configuration-in-code movement (CTools, exportables, Features, etc).
Comment | File | Size | Author |
---|---|---|---|
#5 | provision_per_host_robots_1173954.patch | 1.13 KB | crea |
#3 | provision_per_host_robots_1173954.patch | 1.13 KB | crea |
Comments
Comment #1
crea commented:
I propose to store the file in the sites/example.com folder. We could even have rewrites that do the following checks:
1) try per-site file
2) try global file
3) fall-back to drupal
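The three checks above can be sketched as a small lookup function. This is only a Python model of the proposed order, not the rewrite rules themselves; the names `resolve_robots` and `demo` are hypothetical, and the per-site location follows the sites/example.com proposal in this comment.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def resolve_robots(platform_root: Path, host: str) -> Optional[Path]:
    """Return the static robots.txt to serve, or None to hand off to Drupal."""
    candidates = [
        platform_root / "sites" / host / "robots.txt",  # 1) per-site file
        platform_root / "robots.txt",                   # 2) global file
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # 3) fall back to Drupal


def demo() -> List[str]:
    """Build a throwaway platform tree and resolve robots.txt for two hosts."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Global file in the platform root.
        (root / "robots.txt").write_text("User-agent: *\nDisallow:\n")
        # Per-site override for example.com only.
        site_dir = root / "sites" / "example.com"
        site_dir.mkdir(parents=True)
        (site_dir / "robots.txt").write_text("User-agent: *\nDisallow: /\n")
        return [
            resolve_robots(root, "example.com").relative_to(root).as_posix(),
            resolve_robots(root, "other.example.org").relative_to(root).as_posix(),
        ]
```

Calling `demo()` resolves the per-site file for example.com and the global file for a host without an override; on a tree with neither file, `resolve_robots` returns `None`, which stands for passing the request to Drupal.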
Comment #2
crea commented:
Comment #3
crea commented:
This is for nginx only, so needs work.
Comment #4
crea commented:
dupe
Comment #5
crea commented:
Hmm, that attachment is broken... second try.
Comment #6
omega8cc commented:
The idea is nice, but the file shouldn't be stored directly in the site directory, because only the Aegir system user should have write access there, while any static file should be uploaded only to the files directory.
Example:
try_files /sites/$host/files/robots.txt $uri @cache;
Comment #7
crea commented:
In my opinion this is a sort of system setting (same as the global and local settings.php), and should be outside the files directory. Also, I want to put the file in VCS while the files directory should be kept out of it, and I don't want to introduce a custom policy just for this file.
Comment #8
omega8cc commented:
OK, then maybe:
try_files /sites/$host/robots.txt /sites/$host/files/robots.txt $uri @cache;
Comment #9
crea commented:
Is it true that the file will be overwritten during migration? I mean, maybe it would be a better idea to store the file in the profile directory so that it becomes a part of the platform?
Comment #10
crea commented:
Unfortunately, we don't know the path to the profile inside nginx.
Comment #11
omega8cc commented:
No, the file will be moved with the site as-is and never touched by Aegir.
It shouldn't be a part of the platform at the install profile level - it should be a part of the *site*, or there is no point in avoiding the standard robots.txt file in the platform root.
Comment #12
crea commented:
I meant exactly that - in a deployment scenario with the platform in VCS, the file won't be a part of the platform, so it's not possible to have it in the same VCS repo. If I roll out a new platform with an updated robots.txt, it will be overwritten with the old one.
Comment #13
crea commented:
We know the profile name in the vhost template. Thus we can generate a dynamic per-host rewrite using PHP.
I agree that (generally) this should be a part of the site and not a part of the profile. However, in order to support platforms in VCS as a site deployment model, Aegir already suggests storing site parts in the profile - site-specific modules and themes (i.e. the separate-profile-per-site model). I already do that, and it works great. So I think it would be OK to store configuration files such as robots.txt at the profile level (not as a general rule but as an option).
Comment #14
omega8cc commented:
Then maybe custom config overrides are the way to go, because we shouldn't add such non-standard locations in the generic setup.
I would never expect robots.txt to be a part of the *code* and part of the platform, by the way.
The problem with the standard multisite setup is that the default file in the platform root doesn't allow you to manage robots.txt per site, so if we want to introduce it, we need the current solution (support for the robotstxt module), maybe extended with support for a static file per site, but stored in the sites space.
Comment #15
anarcat commented:
Sounds like a great idea!! Why don't we do something like this in Apache:
Is there anything missing on the Nginx side?
Comment #16
anarcat commented:
Actually, we need a patch here :)
Comment #17
omega8cc commented:
Well, I'm against allowing robots.txt in the sites/domain directory, because then you have to open write access to this directory for the group (as we do for modules/themes/libraries), which is a rather crazy idea, IMO.
There could be a rewrite added to support sites/domain/files/robots.txt - and we don't have it in the Nginx config yet, as so far we support the standard, platform root level location for robots.txt and also the robotstxt module.
Comment #18
Steven Jones commented:
It should be noted that Drupal 7 includes a robots.txt at its root, so it seems that having a platform-level robots.txt or using the robotstxt module is the 'Drupal way'.
I don't see why this couldn't be handled in a contrib 'Aegir robotstxt' module?
Comment #19
anarcat commented:
I have a problem with the robotstxt module - it bootstraps Drupal, so it's a huge performance hit.
I would rather have a static file generated.
And I agree it goes in files/robots.txt, that's alright for me.
And I don't mind having this in contrib, but it seems like such a simple task (one rewrite rule!) that we should get it in. Besides, if we want to prioritize the platform robots.txt, we can do that too...
Comment #20
omega8cc commented:
Sounds good. So we could avoid the robotstxt module and create/copy the robots.txt file from the platform root to sites/domain/files/ by default, maybe on site deploy (not on verify, so as not to overwrite it).
Comment #21
crea commented:
A platform-level robots.txt doesn't make sense once you add multisite to the picture, and Aegir uses multisite heavily. It exists in Drupal simply for legacy reasons and because no patch with a better approach was submitted.
The question is simple: should we support something better, or stick with an inferior solution just because Drupal does it?
Comment #22
Steven Jones commented:
@omega8cc aren't we proposing a simple rewrite rule, not actually copying the robots.txt over?
I guess if someone wants it then they can still install the robotstxt module and delete any robots.txt in their platform and it'll just work, but otherwise they can use the Aegir magic.
Comment #23
omega8cc commented:
Sure, we can simply add a rewrite to support both the legacy location in the platform root and sites/domain/files.
Then we can leave it to the server admin and site admin to use either the legacy, platform-wide location (which still makes sense in a multisite env too, unless some site really requires a custom robots.txt) or upload the file to the sites/domain/files directory (after it was deleted from the platform root).
So we need only a simple rewrite and a how-to entry in the handbook/docs to explain that it is a built-in feature.
Comment #24
omega8cc commented:
Nginx patch for review: http://drupalcode.org/sandbox/omega8cc/1111100.git/commit/7fcc788
Comment #25
anarcat commented:
I have committed the patch for Nginx and I have rolled my own patch for Apache. I am not sure they work the same way though. In Apache, I first check if the site-specific robots.txt file exists, and redirect there only if so; otherwise the normal process follows its course (i.e. the platform-level robots.txt gets served).
Is that the way Nginx works?
Here's the apache patch:
http://drupalcode.org/project/provision.git/commitdiff/e7127de6027c54727...
Let's let this sit for a while in 2.x and merge when we have some more tests.
Comment #26
j0nathan commented:
Subscribing.
Comment #27
omega8cc commented:
The Nginx configuration checks for the platform-wide file first, then for the site-specific file if no platform-wide file exists, and then, if neither exists, it sends the request to Drupal via the php-fpm backend, so that the robotstxt module is supported too.
We don't use any rewrite or redirect here, only the file check in the above order, which is also cached in Nginx memory for better performance.
Comment #28
crea commented:
The feature should check the site file first, then fall back to the platform-wide file. Otherwise we are adding an additional step of removing the platform file as a requirement, and also killing the fallback mechanics in the process, since there's nothing to fall back to.
Comment #29
anarcat commented:
OK, we have a problem with the patches then, because they work in opposite ways.
I believe we should allow users to override the platform robots.txt. Otherwise it means we need to add a robots.txt to *every site* if we want to customize *one*. Seems backwards.
Can you change your patch so that the site-specific robots.txt has precedence?
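To make the difference between the two orders concrete, here is a minimal Python sketch. It only models "serve the first file that exists"; it is not the Nginx or Apache logic, and `first_existing`, `SITE` and `PLATFORM` are hypothetical names chosen for illustration.

```python
from typing import Iterable, Optional, Set

SITE = "sites/example.com/files/robots.txt"  # site-specific file
PLATFORM = "robots.txt"                      # platform-wide file


def first_existing(order: Iterable[str], existing: Set[str]) -> Optional[str]:
    """Return the first path in `order` that exists, or None (pass to Drupal)."""
    for path in order:
        if path in existing:
            return path
    return None


platform_first = [PLATFORM, SITE]  # order described in comment #27
site_first = [SITE, PLATFORM]      # order requested in comments #28 and #29

both = {SITE, PLATFORM}
print(first_existing(platform_first, both))  # robots.txt
print(first_existing(site_first, both))      # sites/example.com/files/robots.txt
```

With the platform-first order, the site file is only ever served after the platform file has been deleted, which then affects every site on the platform. With the site-first order, a single site can override the platform file while all other sites keep falling back to it.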
Comment #30
omega8cc commented:
Ah! Right, sorry. The fix: http://drupalcode.org/sandbox/omega8cc/1111100.git/commit/cfbc3e1
Comment #31
anarcat commented:
Alright, patch applied, we are now making sense again. Thanks! :)
Comment #32
anarcat commented:
Pushed to 1.x.
Comment #34
netpez commented:
I currently have Aegir 1.6...
I noticed the patch lines:
+
+ RewriteCond <?php print $this->root; ?>/sites/%{HTTP_HOST}/files/robots.txt -f
+ RewriteRule ^robots.txt /sites/%{HTTP_HOST}/files/robots.txt [L]
are in the template file within the .drush dir on my version....
So I created a robots.txt file that disallows all (/) on the manager (aegir) server in a specific /platform/site/files directory and I verified it so it would push out to all servers.
It is still not working :(
Anyone else still have issues?
Comment #35
anarcat commented:
Please open a new issue instead of posting in older ones. For such questions, you should probably use the community site. See also http://community.aegirproject.org/help