Posted by plan9 on June 18, 2009 at 2:18pm
| Project: | Boost |
| Version: | 6.x-1.x-dev |
| Component: | Miscellaneous |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | closed (fixed) |
Issue Summary
I have a number of other sites appearing in my cache folder. They are all WAP sites, such as wap.artyphoto.net and wap.artwheel.com.
They follow the normal Boost cache directory structure, e.g. cache/wap.artyphoto.net/0/index.html
This is quite worrying. I'm running on a Serverpoint VPS with a unique IP address, and I hadn't seen any other security issues on my site until this.
Anyone know what could be going on here? I have the Boost cache permissions set to read/write for the webserver user and group.
Thanks
G
Comments
#1
Are you running a multi-site, with these other sites?
#2
No - Just the one site.
#3
Are you an admin of the other site in any way, shape or form?
#4
#5
I experience this as well. I have a dedicated server with many domains - Boost somehow detects the host domain of my server and creates a directory cache based on it as well as a few sub domains. I suspect this may have something to do with search engine crawlers reaching my site via the hostdomain.com/~user rather than the actual domain of my site. This isn't an issue for me as I recognize the domain name, but for someone on shared hosting I can see how this could be very alarming to see an unrecognized domain name in the file caches. I can't be sure this is the same issue as mentioned above - but hosting administrators often utilize sub and parent domain naming in shared hosting environments, unnoticeable to the end user but apparent to crawlers, which appears to create file caches.
#6
@davidsnyder is this for 5.x as well? Drupal has to boot up & boost code has to run in order for this to happen; thus if this is happening, it's very bad SEO due to duplicate content.
Here's a thread about the exact opposite #665772: How to make boost work with external pages
#7
Closing all 5.x issues; will only reevaluate if someone steps up #454652: Looking for a co-maintainer - 5.x
Reason is 6.x has 10x as many users as 5.x; also last 5.x dev was over a year ago. The 5.x issue queue needs to go.
#8
Sorry to post to a closed thread, but I solved this same problem of mysterious cache dirs appearing in the cache folder by adding $base_url to settings.php
e.g. $base_url = 'http://www.example.com';
hth,
DT
#9
I have the same problem, but in my case the directories are for spam websites like sina.com.cn! I found it while investigating why my website on a VPS, which had run smoothly at around 470MB for weeks, suddenly jumped to almost 800MB. I'm afraid my website is under attack by spammers; I found a dozen users on my site who are listed on Stopforumspam.com as spammers.
Can you explain to me what the function of $base_url is? Is it supposed to allow only your website to be cached?
Thanks
#10
#11
@giovassi
It's in settings.php
<?php
/**
* Base URL (optional).
*
* If you are experiencing issues with different site domains,
* uncomment the Base URL statement below (remove the leading hash sign)
* and fill in the URL to your Drupal installation.
*
* You might also want to force users to use a given domain.
* See the .htaccess file for more information.
*
* Examples:
* $base_url = 'http://www.example.com';
* $base_url = 'http://www.example.com:8888';
* $base_url = 'http://www.example.com/drupal';
* $base_url = 'https://www.example.com:8888/drupal';
*
* It is not allowed to have a trailing slash; Drupal will add it
* for you.
*/
# $base_url = 'http://www.example.com'; // NO trailing slash!
?>
If you have some way of replicating this bug, I would appreciate hearing about it. All that is created is empty directories, correct?
#12
In the cache/normal folder there was just an empty folder.
In the cache/perm folder it looked like my website folder, with sites, modules and misc subfolders and Boost files pointing to those websites and IPs.
I'm very upset!!
mikeytown2,
I'm asking what exactly the function of "$base_url = 'http://www.example.com'" is.
Thanks
#13
@giovassi
If you want this issue fixed, I need a lot more info than what you're providing. Technically Boost doesn't need $base_url to be set; it should work with it not set, thus this appears to be a bug. I've encountered a similar bug before and thought I took care of it, but I could be wrong. This bug should be harmless; all you will see is a lot of empty directories being created.
What version of PHP/Apache are you using?
Can I get a full directory tree of what shouldn't be there?
Are there any access logs so I can see what the URL looks like?
#14
#15
I found out that one website that appeared in my cache folder has an article that links to my website, which means they shared my article on their website. But I still need to check why the other sites appeared in that cache folder.
#16
This might be the explanation you are all looking for: http://drupal.org/node/842756
#17
I have a similar problem with my site. Domain names other than my own have appeared in the cache/normal folder.
I also have folders that are similar to my domain name, but not exact, appearing. For example, a folder called www.mydomain.com. (with a period after it) is in the cache/normal folder and has cached files. In addition, a folder with the IP address of my server is there.
I have global redirect installed, but it has not helped.
Any help would be appreciated.
#18
The cache folder was set to 775 when I installed it, which means group write access. This could possibly be the cause of the problem on (some) shared server setups. Shouldn't it be 755 at least?
#19
Sorry to have neglected this issue after having started it.
My situation is exactly as described by kriss683 in #17. I'm going to try setting $base_url = 'http://www.example.com' and will report back.
#20
I've also got this same issue...
PHP 5.2.13
Apache/2.2.3 (Red Hat)
/var/www/html/cache/normal
drwxrwxr-x 2 apache apache 4096 Aug 4 21:02 www.qq.com
drwxrwxr-x 2 apache apache 4096 Aug 5 00:46 www.sina.com.cn
both dirs have a _.html in them
I can supply some access logs if you tell me more specifically what you need.
As a data point, I've always had the base_url set in settings.php, so that doesn't help here.
#21
what's inside _.html? your homepage?
#22
Yes, it is my homepage, with all the URLs that would normally contain my ServerName filled in with the odd domain name instead.
#23
Your site is run from the sites/default directory, correct? It's a little odd that you would be getting this even with the base_url set. Long story short, Drupal doesn't care what the hostname is; if you hit the site with the "wrong" hostname it will still generate the correct output, thus Boost will cache it. Core will cache it as well; it's just put into a database table, so you don't notice it.
Map your server's IP to any domain name and that domain name will show up in Drupal's cache_page table, if all you're using is the core cache. If using Boost, then you get a directory named after that domain. The best solution to this is to not use the default directory in your sites folder.
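To make the mechanism above concrete, here is a minimal sketch of how a cache path could end up containing whatever hostname the client sent. This is not Boost's actual code; the function name `example_cache_path` and its parameters are made up for illustration, and the only things taken from this thread are the cache/normal directory and the `_.html` front-page file reported in #20.

```php
<?php
// Hypothetical helper, NOT Boost's real implementation: builds a cache
// file path from the client-supplied Host header.
function example_cache_path($cache_root, $http_host, $request_path) {
  // The Host header is whatever the client sent; nothing here checks
  // that it actually belongs to this site.
  $host = strtolower($http_host);
  $path = trim($request_path, '/');
  // Assumption based on #20: the front page is stored as "_.html".
  $file = ($path === '') ? '_' : $path;
  return $cache_root . '/' . $host . '/' . $file . '.html';
}

// A request for the front page with a foreign Host header gets its own
// directory, right next to the one for the real site.
echo example_cache_path('cache/normal', 'www.sina.com.cn', '/'), "\n";
echo example_cache_path('cache/normal', 'www.example.com', '/'), "\n";
```

Under these assumptions, any hostname that resolves (or is forced to resolve) to the server's IP yields a new directory, which matches what people are reporting here.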
#24
You mean, is my settings.php in sites/default? If so, then yes... the site itself is run from the root httpd directory. It seems really strange that I would have served up pages for qq.com and sina.com.cn, because wouldn't that mean that the DNS for those sites was pointing to my webserver? Odd that someone else in this thread mentioned sina.com.cn too.
#25
In your hosts file, if you put
127.0.0.1 sina.com.cn
assuming 127.0.0.1 is the IP of your server, Drupal will process the request fully. I have no reliable way to tell if this is a fake or real request; I could make a whitelist, but core doesn't deal with this issue.
#26
How about checking if the file requested comes from the domain the Drupal installation is on? One could use the URL specified in settings.php for the check, or set one in the module. If the complete URL does not match the domain and the path, then it does not get added to the cache folder.
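A rough sketch of the kind of check suggested here, assuming $base_url is set in settings.php. This is not the patch from #27; `example_host_matches_base_url` is a made-up name, and the comparison rules (ignore case, ignore the port) are assumptions.

```php
<?php
// Hypothetical check, NOT the patch from #27: does the request's Host
// header match the host configured in $base_url?
function example_host_matches_base_url($base_url, $http_host) {
  // parse_url() returns NULL or FALSE when there is no usable host.
  $expected = parse_url($base_url, PHP_URL_HOST);
  if (!is_string($expected) || $expected === '') {
    // $base_url is unset or unparsable, so there is nothing to compare.
    return FALSE;
  }
  // Drop an optional ":port" suffix from the Host header, then compare
  // case-insensitively. A trailing dot ("www.example.com.", see #17)
  // is deliberately treated as a mismatch.
  $actual = preg_replace('/:\d+$/', '', $http_host);
  return strtolower($actual) === strtolower($expected);
}

$base_url = 'http://www.example.com';
var_dump(example_host_matches_base_url($base_url, 'www.example.com'));  // bool(true)
var_dump(example_host_matches_base_url($base_url, 'www.sina.com.cn'));  // bool(false)
```

A caching module could skip writing the page whenever a check like this returns FALSE, so requests with a foreign hostname would still be served but would not leave directories behind.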
#27
This patch requires the latest dev. This only works if $base_url has been set in settings.php.
#28
With reference to a similar issue I posted here - http://drupal.org/node/842756 - I would also like to point out that in my case Boost is not obeying its cache directory setting and is writing to the Drupal root.
#29
@deepesh
Is there any way for you to test the patch in a dev environment?
#30
For the record, I haven't added the chinese website to my hosts file... something else is going on.
#31
Someone else can though. If I know your IP I can add it to my hosts file and get the same effect.
#32
Sadly no, I guess the bug can be reproduced by searching a drupal site as posted in my thread and see if it produces the same effect.
#33
committed #27
leaving open because there is probably more that needs to be done.
#34
Hello!
I had the same issue 3 weeks ago. Visitors from a social network site told me that I had a virus in my pages; their antivirus had started to alert them. I found that an intruder had somehow injected a malicious script into my cached pages. I stopped Boost for a while and almost forgot about it.
The permissions on my cache folders are 777. I run php-fpm and nginx.
#35
teri@uhaaa.com
Can I see the script's contents? Use the contact form to send me more details, like one of the bad pages from the Boost cache and the URL that was called, if you know it.
http://drupal.org/user/282446/contact
#36
I deleted them, but I will try to find out which virus it was. I vaguely remember that it was some worm called Wordpress, maybe... I will write again soon.
#37
The virus was HTML:Iframe-inf. That is what one of my users reported for this page of my site (only for reference):
http://uhaaa.com/rakata-na-buda
I saw the JavaScript injected in the cached copy of the page. When I cleared the cache, the same user told me that the worm was gone. He alerted me because his antivirus program alerted him. It's strange, but on the social network site where people discussed the link, this user reported the worm while other users said that their antivirus was silent.
#38
#764494: boost-gzip-cookie-test redirection
Sounds like a false positive from this issue that I've now fixed in the latest dev. Iframes are not reliable so I switched over to ajax.
#39
Subscribing
#40
@highrockmedia
Do you have anything to add? From what I can tell, this is just how Drupal works. What files are in the other dirs? Also read #23.
#41
mikeytown2, no I don't have anything to add, I was interested in the issue so I subscribed. cheers.
#42
Just reporting back - as the original poster - that enabling base URL has definitely fixed the problem for me. I'm running Drupal 5.x from sites/default directory on a VPS (this was opened as a 5.x issue).
I don't know if it's related, but I also had an issue with expired cache files not being deleted on cron, and this seems to be fixed now as well.
Does having Base URL enabled require anything to be added to the .htaccess file?
I'm guessing not...
#43
I also noticed this. One thing I'm not sure was clear in the readme, and which also led to more domains showing up, is that the Boost .htaccess rules should go below the www redirection. That's right, isn't it?
Anyway, I will monitor this issue, but what about the patch? Does it make sense to have it? I even saw a www.yahoo.com directory there; how it got there, I have no idea!
#44
Per #42, seems like it is resolved, so I am closing the issue.
From reading the other comments, it seems like these are Apache servers with a default vhost pointing to their Drupal site ("apachectl -S" should confirm this). So unless your base URL is explicitly set, this is how Drupal works (cf. #23).
#45
Automatically closed -- issue fixed for 2 weeks with no activity.