Hi!

I found something about this problem. If I am logged into Drupal as any user, I get "permission denied" when I try to access sitemap.xml. When I log out of Drupal, I can reach sitemap.xml. I think the problem is caused by the function xmlsitemap_anonymous_access.

Comments

xmarket’s picture

Category: support » bug

One more thing:
I made a clean install with the 6.x-1.x-dev (2009-02-24) version, and sitemap.xml doesn't contain the nodes automatically, only the start page. I need to edit and submit the nodes to see them in sitemap.xml. What am I doing wrong?

I tried the following:
At http://www.example.com/admin/content/node I chose "Add the selected posts to the XML site map" and checked all content. I ran cron 2-3 times after the update, and when I looked at sitemap.xml again it was still empty.. :S

PROBLEM SOLVED! The way I tried works perfectly.

avpaderno’s picture

Category: bug » support

For the access denied error, see what the project page says.
If you make a clean install of XML Sitemap, the site map is unlikely to contain nodes that were last edited before the project modules were installed. If you want to see the nodes created before xmlsitemap_node.module was enabled, visit the admin/content/node page, which shows a list of operations that can be applied to nodes; some of the operations in that list are specific to XML Sitemap.
The same is true for the admin/user/user page, which covers the site's authenticated users.

avpaderno’s picture

The site map will appear empty until the first time the cron maintenance tasks are executed.

xmarket’s picture

Thanks for your reply. After a few cron runs, the nodes are shown in sitemap.xml. The permission denied is embarrassing, but that's all. I can live with it. :D

avpaderno’s picture

Category: bug » support

The site map is accessible only to the anonymous user because otherwise the links present in the site map would probably return a 403 (access denied) error to search engines, and in particular to Google Webmaster Tools.

avpaderno’s picture

Status: Active » Fixed

I am setting the report to fixed.

avpaderno’s picture

Title: Permission denied with authenticated users » Access denied to sitemap.xml

I am changing the title to something clearer.

mikeytown2’s picture

I have an idea for how to serve this to an authenticated user: call file_get_contents() on example.com/sitemap.xml and pass the result back to the user. The request will come from the web server itself, which doesn't keep any cookies.

avpaderno’s picture

I could use it on a tools page where the admin user could check the content of the site map as seen by search engines.

If you have any ideas about that, feel free to open a report.

mikeytown2’s picture

In xmlsitemap_menu(), instead of checking permissions there ('access callback' => 'xmlsitemap_anonymous_access'), check permissions inside the xmlsitemap_output() function. Since that function uses print, all that's needed is an if (user anonymous) then {same old} else {print file_get_contents()}. That should work in all cases.
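
The branch proposed above can be sketched as follows. This is a framework-neutral illustration in Python, not Drupal code: `serve_sitemap`, `render_sitemap`, `fetch_as_anonymous` and `fetch_without_cookies` are hypothetical names that do not exist in the module. The only assumption about the fetch is the one made in the comment above: a request issued by the web server itself carries no session cookie, so it is treated as anonymous.

```python
import urllib.request
from typing import Callable


def fetch_without_cookies(url: str) -> str:
    """Fetch a URL with a plain request; urlopen() attaches no cookies by default."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def serve_sitemap(is_anonymous: bool,
                  render_sitemap: Callable[[], str],
                  fetch_as_anonymous: Callable[[], str]) -> str:
    """Return the sitemap body for the current request.

    Anonymous visitors get the normally rendered output ("same old").
    Logged-in visitors get whatever an anonymous request would have seen.
    """
    if is_anonymous:
        return render_sitemap()
    return fetch_as_anonymous()


# Hypothetical wiring; the URL is a placeholder:
# body = serve_sitemap(
#     is_anonymous=False,
#     render_sitemap=lambda: "<urlset/>",
#     fetch_as_anonymous=lambda: fetch_without_cookies("http://www.example.com/sitemap.xml"),
# )
```

Passing the two behaviours in as callables keeps the decision itself easy to check without a running site.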

avpaderno’s picture

The site map has been made accessible only to the anonymous user for two other reasons as well (apart from the one I gave in earlier comments here):

  • To prevent somebody from putting a link to the site map in a Drupal menu; this is needed for two reasons:
    • To prevent the site map link from ending up in a Drupal menu that is used to add links to the site map; the site map would then contain a link to itself.
      I know it may sound silly, but when a menu has many links, somebody might not notice that he put the link to the site map in the menu that is then used to add links to the site map.
    • To prevent somebody from using the XML site map as a navigation aid for users; an XML site map is not the same thing as what is normally called a site map, which helps a user find the page he is looking for. The XML site map is meant to be used by a search engine, not by a human being.
  • To prevent somebody from obsessively checking the site map and reporting that it is missing 5 links to nodes he just created. The site map is not, and cannot be, generated instantly; therefore it may be missing some links, which will be added on one of the next cron runs.

All these cases cause a use of resources that can be avoided.

If I allow the page to be seen only by the anonymous user, it's clear that I am not going to adopt a shortcut that lets anybody see the site map; if I did, I might as well not have changed the code.
The only thing I can do is create a page that gives some statistics on the site map content and checks whether that content is correct. This would let the administrator check whether something is wrong in the site map, and could also provide information that is useful when a user submits a bug report here.
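
As a rough idea of what such a statistics check could compute, here is a minimal sketch in Python (not Drupal code). It assumes the site map follows the sitemaps.org 0.9 schema; `sitemap_stats` and the sample data are hypothetical and only illustrate counting entries and flagging obviously wrong ones.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_stats(xml_text: str) -> dict:
    """Count <loc> entries and flag duplicates and non-absolute URLs."""
    root = ET.fromstring(xml_text)
    locs = [(el.text or "").strip() for el in root.iter(SITEMAP_NS + "loc")]

    def is_absolute(url: str) -> bool:
        parts = urlparse(url)
        return parts.scheme in ("http", "https") and bool(parts.netloc)

    return {
        "total": len(locs),
        "malformed": [u for u in locs if not is_absolute(u)],
        "duplicates": sorted({u for u in locs if locs.count(u) > 1}),
    }


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/node/1</loc></url>
  <url><loc>http://www.example.com/node/1</loc></url>
  <url><loc>node/2</loc></url>
</urlset>"""

stats = sitemap_stats(SAMPLE)
print(stats["total"], stats["malformed"], stats["duplicates"])
```

A summary like this could be shown to the administrator or pasted into a bug report instead of the raw XML.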

mikeytown2’s picture

Create a new item in admin/settings/xmlsitemap called 'check sitemap output'. It puts the sitemap output into a scrollable div with pre tags. Not sure about chunks, and how to handle that, but at least having the main one there would be a nice feature. Then above or below the output you can place the usual warnings about new items not showing up until ___.

dman’s picture

Is there no way to give access to anonymous AND admin? I was puzzled for a while when trying to test. I had to open another browser just to see what, if anything, was happening. It was a last resort, and I found it weird when it worked.

avpaderno’s picture

See my comment at #11.

dman’s picture

Well, as the admin (user #1) who's installing the module, I don't think that 'obsessively' looking at the result after I've manually run cron is something I should be forbidden from doing. It's called testing!
And as an admin installing a site, it's my job to be sensible enough not to link directly to it in a menu or use it as a replacement for a site map. It seems that I could anyway if I wanted to be stupid...

It's just that at the moment, I can install something, but I'm being deliberately prevented from seeing if installing it even worked!

the1who’s picture

@kiam

I guess I don't get what you are saying. I have three sites, almost identical. One is newer, as I have been working on it since February. I downloaded the latest release version and installed it on this more recent server installation. I have updated all three sites to D6.1; two are running xml sitemap 6.x-1.x-dev and one is running 6.x-0.x-dev. On the two sites that have been running for a while now, one with 1.x-dev and the other with 0.x-dev, I can access sitemap.xml just fine. On the most up-to-date server I can't: it takes me to the access denied page whether I am logged in or not. I had trouble updating xml sitemap on this site compared to the other site; I know it returned errors, which I don't have at hand at the moment, so I did an uninstall, checked the database, and found that the xml sitemap table was empty. If there is another table I should have checked, I can check that, but I need to know which one.

Other than that, I did a clean install and now I can't access that page, and neither can the Google Webmaster Tools sitemap function. What am I missing here? I have other modules, like node privacy by role, and on the user permissions page there is no xml sitemap option to select for authenticated or anonymous users. As I said, I have the same setup on the other two servers and they work fine. I guess I could use them as a reference while seeking help, but I was wondering what is amiss here that the other sites work but this one doesn't. Thanks in advance.

Matthew

avpaderno’s picture

@dman: you missed the first reason for preventing an authenticated user from viewing the site map, and it's the most important one. Because the links added to the site map are checked against the current user, using the first user account would introduce links into the site map that the anonymous user (that is, the search engine) cannot access. I don't think people install XML Sitemap to help search engines find the content of a site, only to give them links that return a 403 error.
If you read the other comments I posted, I also said that I could introduce a page that analyzes the content of the site map as seen by the anonymous user and reports some data about it. That way, the administrator would not have to read all that XML to understand it.

Nobody is then preventing you from seeing the content of the site map; I am able to see it when I want, and I am using the same code you all are using.
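
The concern about links being checked against the current user can be illustrated with a small toy model. This is a sketch in Python, not Drupal code: the `Page` type, the two roles, and the access rule (only the admin sees unpublished pages) are assumptions made up for the example.

```python
from dataclasses import dataclass

ANONYMOUS = "anonymous"
ADMIN = "admin"


@dataclass(frozen=True)
class Page:
    path: str
    published: bool


def can_view(role: str, page: Page) -> bool:
    """Toy access rule: everyone sees published pages, only the admin sees the rest."""
    return page.published or role == ADMIN


def build_links(pages, role):
    """Collect the links visible to whoever triggers the site map build."""
    return [p.path for p in pages if can_view(role, p)]


def broken_for_anonymous(pages, built_as):
    """Links that end up in the site map but would be denied to an anonymous visitor."""
    by_path = {p.path: p for p in pages}
    return [path for path in build_links(pages, built_as)
            if not can_view(ANONYMOUS, by_path[path])]


pages = [Page("node/1", True), Page("node/2", False)]
print(broken_for_anonymous(pages, ADMIN))      # built while logged in as admin
print(broken_for_anonymous(pages, ANONYMOUS))  # built by an anonymous request
```

In this model a site map built during an admin request contains `node/2`, which an anonymous crawler would then be refused.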

jonjon’s picture

I really don't know the inner workings of the sitemap module, but it sure is awkward that I can't even see the sitemap.xml file because I'm the first user, whatever the reason is! Until I found this page, I thought this was a bug and the module wasn't working properly.

Normally, access rules work the other way and the first user should have access to everything. And I totally agree with dman in comment #15.

The way I see it, the module might need some refactoring to work differently if the simplest thing, testing sitemap.xml, is forbidden to any authenticated user. And I'm not saying this is a trivial task, nor do I underestimate your very hard work.

Regards,

John.

dman’s picture

OK, I understand what you are saying there then. I can see why that would be an issue.
Would it be helpful to try a patch to 'switch user' to anonymous for the purposes of sitemap building? I believe it's not that much harder to check node access and specify a user. I may be wrong.
Does this issue make a difference if I run cron manually while logged in versus running cron anonymously?
And does this have any effect on attempting to install xmlsitemap on a site while it is offline/under maintenance, when nothing is publicly accessible?
I understand if you say we just shouldn't do that then ... but like I say, I'd like to test things I install.
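
The 'switch user' idea can be sketched generically as a block that temporarily runs as the anonymous account and always restores the original one. This is Python, not Drupal code: `Session` and `as_anonymous` are hypothetical names, and the only Drupal-specific assumption is that the anonymous user has id 0. Whether Drupal's own access checks can be driven this way is a separate question that this sketch does not answer.

```python
from contextlib import contextmanager

ANONYMOUS_UID = 0  # assumption: the anonymous account has user id 0


class Session:
    """Hypothetical stand-in for the 'current user' state of a request."""

    def __init__(self, uid: int):
        self.uid = uid


@contextmanager
def as_anonymous(session: Session):
    """Run a block as the anonymous user, restoring the real user afterwards."""
    original_uid = session.uid
    session.uid = ANONYMOUS_UID
    try:
        yield session
    finally:
        # Restore even if the block raised, so the admin is never left logged out.
        session.uid = original_uid


session = Session(uid=1)
with as_anonymous(session):
    uid_during_build = session.uid  # links would be access-checked here
print(uid_during_build, session.uid)
```

The `finally` clause is the important part: a failed site map build must not leave the request running under the wrong account.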

Anonymous’s picture

So why can't the code updating {xmlsitemap} check the access with a user->id of 0? I'll take a look myself.

avpaderno’s picture

There isn't a way to check anonymous user access for every link added to the site map, short of rewriting a set of Drupal functions.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

strikehawkecomm’s picture

What I am seeing is the footer syndicate content has a "MORE" link to sitemap and then access denied. Why not take this out of the syndicate block?

Anonymous’s picture

Category: support » bug
Status: Closed (fixed) » Active

Well, yes for the authenticated user.

avpaderno’s picture

Category: bug » support
Status: Active » Fixed

The module doesn't put any link in any syndicate block.
The description is about an issue that is completely different from the original one.

the1who’s picture

Status: Fixed » Active

I hope you don't mind me changing this back to active, but no one really addressed my question, and I just updated to your latest release and it still doesn't fix my problem. The only thing that fixed it, from my point of view, was taking the site that was running the latest 6.x-1.x-dev (2009-Apr-05), removing that, and putting 6.x-0.x-dev (2008-Jun-09) on. That is how I would expect it to work: I can go to www.site.com/sitemap.xml and view it regardless of who I am. I prefer it that way so that users can reference it if they so choose and search engines, for example Google Webmaster Tools, can access it more easily.

Unless I am misunderstanding the whole purpose of this, I don't really get it. If you are saying you are preventing anonymous users, or those without the appropriate access, from viewing it, then the first user account, which on two of my sites is the only account, should be able to view it. But that isn't the case: I still can't view it with the 1.x-dev versions, whether I am anonymous or not. Am I missing something about this whole module?

So now I have two sites running the 0.x-dev version where I had planned to run the latest, and I will be moving the third site to 0.x-dev as well, so that Google Webmaster Tools can access the sitemap I have told it to look at. For example, you can see what I mean at www.bsarc.us/sitemap.xml, whereas with 1.x I can't at www.kingsavionics.net/sitemap.xml, for now, until I change it or get the right help.

Anonymous’s picture

Status: Active » Closed (works as designed)

The purpose of allowing only anonymous users is to populate the sitemap.xml database with only the links that anonymous users can see. Otherwise anonymous users would see a bunch of broken links and Google would raise an error for the sitemap.xml file.

I.e., if I access sitemap.xml as UID 1, then sitemap.xml will contain links that the anonymous user cannot see, and that is not desirable. The data is populated to the files when the page is visited.

the1who’s picture

So if you can help me, should I update and after updating, log out and see if I can see the sitemap then? I was getting sitemap errors from google and that is why I reverted my versions.

Anonymous’s picture

@the1who: see #379854-128: The site map is not being populated.

The D6 module is a work in progress; errors should be expected. You are warned about using the module on a production site because it isn't quite ready. However, I am using a 6.x-dev version on one of my production sites; yes, I see warnings from Google about unavailable links.

musashi39’s picture

Thanks! I was going crazy trying to figure out what was wrong. It all makes sense now.

the1who’s picture

@earnie

Thanks for the information. Looking forward to progress.

webalchemist’s picture

Has there been any progress in this? I am using xml sitemap and trying to get my sitemap to submit through google webmaster tools, but it keeps getting rejected.

What can I do to resolve this? Any ideas?

dave reid’s picture

@webalchemist: It works fine if you're using the latest official releases. File a new issue if you're still having problems instead of posting to an old issue.

avpaderno’s picture

The report, which is old, is marked as by design; I would not expect any progress in this case.
As Dave already said, if you are still having problems, file a new report.