Wikipedia: HTTP request failed! HTTP/1.0 403 Forbidden

Project: Create from Web
Version: 6.x-1.3
Component: User interface
Category: support request
Priority: normal
Assigned: Unassigned
Status: active

Issue Summary

Warning: file_get_contents(http://en.wikipedia.org/w/index.php?title=Alexandre_Dumas&action=raw...) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in createfromweb_operator_wikipedia_execute() (line 72 of /var/aegir/platforms/pressflow-6/sites/all/modules/createfromweb/operator_wikipedia.inc).

If you get something similar, check whether the URL in question can be accessed from your browser.

In my case, I got an error message from Wikipedia saying:

Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. Please try again in a few minutes.

Just noting it here in case someone searches for this. I'm marking this as "works as designed".

Comments

#1

Thanks for noting!

It could also be, though, that the servers don't like a PHP 'bot' GETting their content, and that a more sophisticated HTTP client is needed (maybe simply setting a fake browser ID ;), or using the Wikipedia API instead ...
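
A minimal sketch of the first idea in this comment, sending an explicit User-Agent header with the request. It is written in Python rather than the module's PHP, purely as an illustration. The function names, the agent string, and the contact address are made-up placeholders, not anything from Create from Web, and whether this alone lifts the 403 is an untested assumption.

```python
import urllib.request
from urllib.parse import urlencode

# Hypothetical agent string: name your own tool and give a way to contact you.
USER_AGENT = "CreateFromWebExample/0.1 (admin@example.org)"


def build_raw_request(title, user_agent=USER_AGENT):
    """Build (but do not send) a GET request for a page's raw wikitext."""
    query = urlencode({"title": title, "action": "raw"})
    url = "http://en.wikipedia.org/w/index.php?" + query
    # Without an explicit header, urllib sends its default "Python-urllib/x.y".
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_raw(title):
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_raw_request(title), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_raw("Alexandre_Dumas")` would request the same kind of URL as in the warning above, but with the custom header attached. An agent string that honestly identifies the tool is probably a safer choice than a faked browser ID.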

#2

Status: closed (works as designed) » active

I've been thinking the same, since it still doesn't work.

The thing is that when I go to the same URL with my browser, I get the same access-denied response:

Server: squid/2.7.STABLE7
Date: Tue, 22 Mar 2011 09:44:32 GMT
Content-Type: text/html
Content-Length: 3096
X-Squid-Error: ERR_ACCESS_DENIED 0
X-Cache: MISS from sq60.wikimedia.org, MISS from system.makta.no
X-Cache-Lookup: NONE from sq60.wikimedia.org:80, MISS from system.makta.no:3128
Via: 1.0 system.makta.no:3128 (squid/2.6.STABLE21)
Connection: close
403 Forbidden

which is what the Web Developer plugin for Firefox reports as the headers from http://en.wikipedia.org/w/index.php?title=Albert_Einstein&action=raw&sec...
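
For anyone who wants the same diagnosis without a browser plugin, here is a rough Python sketch that reports the status line plus the proxy-related headers of a failed request. The helper names `describe_http_error` and `probe` are hypothetical, and the header list is simply the set that looked informative in the dump above.

```python
import urllib.error
import urllib.request

# Headers that hint at which cache or proxy produced the response.
PROXY_HEADERS = ("Server", "Via", "X-Cache", "X-Squid-Error")


def describe_http_error(err):
    """Summarise an HTTPError: status line, then any proxy headers present."""
    lines = ["%d %s" % (err.code, err.reason)]
    for name in PROXY_HEADERS:
        value = err.headers.get(name)
        if value is not None:
            lines.append("%s: %s" % (name, value))
    return "\n".join(lines)


def probe(url):
    """Fetch url and return a short report (needs network access)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return "%d %s" % (resp.status, resp.reason)
    except urllib.error.HTTPError as err:
        return describe_http_error(err)
```

Running `probe()` once from the web server and once from a machine on a different network should help show whether the 403 comes from Wikimedia's Squid servers or from a local proxy in between.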

You'll notice that I have a cache on my router, so the question is: has anyone else noticed the same problem?

If so, perhaps the Wikipedia API would be the way to go, but I haven't got the foggiest idea about it. I'm going to give http://drupal.org/project/linked_data a try to see if it might help me out.
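
As a starting point for the API route, here is a small Python sketch that builds a MediaWiki Action API query for a page's current wikitext. The endpoint and the parameter set reflect the Action API as I understand it and should be checked against its documentation. `build_api_url` is a hypothetical helper name, and nothing here comes from Create from Web or linked_data.

```python
from urllib.parse import urlencode

# Assumed endpoint for English Wikipedia's Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_api_url(title):
    """Return an API URL asking for the latest revision content of title."""
    params = {
        "action": "query",     # read-only query module
        "prop": "revisions",   # ask for revision data of the page
        "rvprop": "content",   # include the revision text itself
        "rvslots": "main",     # main content slot (newer MediaWiki versions)
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

The resulting URL still has to be fetched with an HTTP client, so an access block like the 403 above may apply to it as well.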
