Problem/Motivation

Drupal's current outgoing-HTTP capability is, to be polite, minimal. We have one small function with a lousy API that can do basic requests, but that's it. If we want to be serious about web services we need strong bidirectional HTTP support.

Issues depending on a better API for HTTP requests (check these and move down to follow-up if correct, close if redundant now).
#1189464: Add an 'instant' queue runner
#1014086: Stampedes and cold cache performance issues with css/js aggregation - Put AdvAgg in core
#229905: Batch API assumes client's support of meta refresh - Can be fixed with something like Background Process
#106506: drupal_http_request() does not handle 'chunked' responses - Make it support HTTP 1.1
#786074: Allow hooks for drupal_http_request() request and responses
#205969: drupal_http_request() assumes presence of Reason-Phrase in response Status-Line
#386384: drupal_http_request does not support gz compressed data
#289820: drupal_http_request() could support digest authentication
#436240: Add a cURL replacement for drupal_http_request()
#164365: drupal_http_request() does handle (invalid) non-absolute redirects (RFC 7231)
All of these need something better than drupal_http_request() in order to be done well.

Proposed resolution

Writing a solid HTTP client is hard. Fortunately, there's plenty of existing ones out there that we can leverage, just like we pulled in Symfony.

Mikeytown2 did some extensive research into existing libraries, and the end conclusion was that Guzzle was the most full-featured and capable option available. It's only missing feature was asynchronous HTTP requests, which the author, Michael Dowling, added based on our feedback months ago.

The primary concern expressed about Guzzle is its size, which was decidedly not-small. (Multiple megabytes.) The size is justified for a fully capable HTTP client, because HTTP is complex, but not everyone was comfortable with a vendor library that was 2-3 MB in size.

Based on those concerns, Guzzle has been refactored in 3.x to be a suite of components, much the same way Symfony is. That allows us to just pull in 3-4 base components for core, and contrib modules that have need of the more advanced Guzzle components can pull those in when they need to. Problem solved.

The other concern was that Guzzle requires cURL, whereas drupal_http_request() does not. However, based on feedback so far it appears that cURL is far more common than it used to be, so making it a requirement for outgoing requests is not an onerous requirement. Arguably it's less onerous than Drupal's performance requirements, which effectively mandate APC already. The recommended resolution here is "meh, OK, you need cURL. Such is the modern web."

Therefore, Guzzle. Let's do.

Remaining tasks

- Get a green light from Dries on this issue. Dries?

- Add the necessary Guzzle components to core via Composer. Convention has been to do this in a separate patch from this issue to keep patch size down.

- Expose Guzzle to core via the Container.

- Replace calls to drupal_http_request() with Guzzle. (Don't bother wrapping an extra layer around it. There's no benefit to doing so.)

User interface changes

None.

API changes

drupal_http_request() goes away, instead you use Guzzle directly. Other systems like Simpletest could be refactored to use Guzzle's browser as well, but that's for follow-ups.

Follow-ups

- #1862536: Improve documentation of SimpletestHttpRequestSubscriber
- #1599622: Run poormanscron via a non-blocking HTTP request
- #1862398: [meta] Replace drupal_http_request() with Guzzle
- #1875818: Update composer.json to use the latest stable 3.0.x version of Guzzle
- #1875792: Standardize Guzzle exception handling

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

boombatower’s picture

The two recommended implementations of the Client are:

For a simple implementation of a browser based on an HTTP layer, have a look at Goutte.

For an implementation based on HttpKernelInterface, have a look at the Client provided by the HttpKernel component.

https://github.com/fabpot/Goutte
https://github.com/symfony/symfony/blob/master/src/Symfony/Component/Htt...

Based on http://www.garfieldtech.com/blog/refocusing-wscci we will already have the HttpKernel component available so I will start with that.

yched’s picture

"goutte" means "drop" in french. Just sayin' :-)

Crell’s picture

Issue tags: +WSCCI

I've also been told that Buzz (https://github.com/kriswallsmith/Buzz) is popular in Symfony circles. We should look at that, too.

For the record, I'm 100% in favor of using a separate library for outgoing requests. One that plays nice with Symfony's HttpFoundation (which does incoming requests) would be better for DX.

Since Web Services involve more than just receiving requests but contacting other systems, too, this is relevant to WSCCI. So, tagging.

Crell’s picture

From the readme, it looks like Goutte is more of a scraper, not a generic HTTP client object. That's more what we need, I think. The scraper would only be useful for Simpletest, for things that we shouldn't be testing that way anyway. :-) I don't know that it would work for general purpose REST communication.

boombatower’s picture

msonnabaum suggested http://mink.behat.org/, also looking into it.

catch’s picture

Mink would be really handy if we ever want proper browser acceptance tests in core, as well as real browser performance testing. I've not looked at it yet but based on the description it sounds encouraging. We discussed those quite a bit at #1194824: Pick a load test runner and scripting language but didn't know about Mink at the time.

boombatower’s picture

If we end up implementing this we should close #64866: Pluggable architecture for drupal_http_request().

catch’s picture

Title: Convert drupal_http_request() and DrupalWebTestCase to Symfony BrowserKit component » Convert drupal_http_request() and DrupalWebTestCase to Symfony BrowserKit component or Mink
catch’s picture

Title: Convert drupal_http_request() and DrupalWebTestCase to Symfony BrowserKit component or Mink » Evaluate third party libraries to replace drupal_http_request() and the browser in DrupalWebTestCase

Re-titling. Also see recent discussion on #64866: Pluggable architecture for drupal_http_request().

catch’s picture

mikeytown2’s picture

Subscribe from #64866: Pluggable architecture for drupal_http_request()

My original plan was to create a 2.x branch of HTTPRL that could be the base for a pluggable architecture #1593862: 2.x branch and target for core inclusion. Move extra stuff to sub-modules.. Replacing drupal_http_request() with a 3rd party library seems to be the winner so I've come up with a list of things that a HTTP client should have.

List of requirements based off of what I've seen/need.

Whats in core:
- Set Headers
- Method (GET, POST, etc)
- Max Redirects
- Data
- Timeout
- Error handling

Nice things to have (whats in httprl):
- Parallel HTTP streams.
- Non Blocking Requests.
- Set callback function with arguments.
- Domain Connections: Maximum number of simultaneous connections to a given domain name.
- Global Connections: Maximum number of simultaneous connections that can be open in this process.
- Global Timeout: A float representing the maximum number of seconds the call may take.
- Option to alter all streams mid execution (example: request 20 urls & break after at least 5 return).
- Chunk Size (needed for writing to REST server on IIS)
- HTTP Version.
- Add new requests to stack mid execution.
- Cookie Parsing.
- Proxy Support.
- Async Connect.
- Handle chunked encoding.
- Handle gzip and deflate.
- Handle more error conditions.
- Automatically handle given data (if not a string use http_build_query and set application/x-www-form-urlencoded content type).

Things that are not in HTTPRL (usually found in curl)
- Get things from FTP.
- Sending Files.
- Full HTTP 1.1 compliance.
- Support protocols other than HTTP and FTP.

Modules that use context (stream_context_create()) in drupal_http_request():
Acquia Network Connector in acquia_agent_stream_context_create()
Shorten URLs in _shorten_googl()
Both deal with SSL. All other uses for stream_context_create in contrib do not use drupal_http_request (mainly used to send headers when using file_get_contents()).

Other projects like HTTPRL in Drupal land:
Webclient
cURL HTTP Request
Async Jobs
Background Process

Other PHP HTTP Clients
https://github.com/guzzle/guzzle
https://github.com/kriswallsmith/Buzz
https://github.com/jmathai/php-multi-curl
https://github.com/LionsAd/rolling-curl <- closest to HTTPRL from what I can tell

RobLoach’s picture

Buzz, Buzy or Guzzle are on the top of my list. If they don't satisfy our requirements, we could easily work with those libraries to add the functionality we need.

Crell’s picture

Another nice-to-have would be request/response objects that are compatible with the Symfony ones. That's not a hard requirement, but it would make DX a bit nicer.

mikeytown, are you volunteering to drive this forward, I hope? :-)

mikeytown2’s picture

Status: Needs work » Active
Issue tags: -API change, -Testing system

A comparison of the various php http clients can be found here in this wiki: http://groups.drupal.org/node/233173

mtdowling’s picture

Hi! I'm the author of Guzzle. I've updated the matrix wiki posted by mikeytown2 with more information on Guzzle. I see that Mink and Goutte have both been mentioned. Guzzle is now the HTTP client used by Goutte, and Goutte is actually one of the browser implementations used in Mink. Let me know if you have any questions or need clarification on anything related to Guzzle.

-Michael

tassoman’s picture

On Symfony2 documentation the chapter abt "Testing" explains you can use browserkit with the DomCrawler component and phpUnit for testing easily your code. Running with that framework I think no more 3rd parties software is needed to get things done.

catch’s picture

I've opened #1599622: Run poormanscron via a non-blocking HTTP request for once this is in (assuming we get non-blocking http request support).

rbayliss’s picture

In reference to Goutte (mentioned above). Yes, Goutte is a scraper, but it's literally a single class wrapper (OK a class and a compiler class we wouldn't need) around Guzzle and DomCrawler, BrowserKit, and a couple other Symfony components. So you could still use Guzzle for HTTP requests, and Goutte for testing, all in the same package.

mikeytown2’s picture

Guzzle+Goutte/Guzzle look like the right way forward. What's the next step?

Crell’s picture

1) Adding the necessary libraries to the composer.json file, and pulling them in.
2) Replace one or two uses of drupal_http_request() with whatever the new setup would be, so we can see what it looks like and if we need to extend anything or if we can use it directly as-is. (I'd prefer the latter if it works.)
3) Post a patch here and let's have a look.

Assuming people don't hate it entirely we'll probably open a new issue to just add the libraries to core to keep the patch size here manageable.

If you want to work out of a branch, I'm happy to give you access to the wscci sandbox to spin off a new branch for this.

boombatower’s picture

So we are still committing libraries even with composer files?

RobLoach’s picture

Especially with composer.json files... Add "kriswallsmith/buzz" or "guzzle/guzzle" to the "require" section of core/composer.json, and run:

drush dl composer
drush composer update

It will install the new library accordingly. This patch would help as well: #1663404: Use Composer's defined namespaces to ease maintenance.

boombatower’s picture

Not really an explanation, kinda like using drush make files and committing all the modules and their dependencies into your own repo...doesn't make a whole lot of sense.

Crell’s picture

rbayliss’s picture

Here's a first pass at using Guzzle for a couple things. I switched the new default aggregator fetcher plugin and system_retrieve_file() to use Guzzle instead of drupal_http_request(). It seems like we can easily replicate all the behavior of drupal_http_request() using subscribers and configuration. The trick is in applying the settings to every client Drupal produces. In this patch, I've only created a default client, but to use the full extent of Guzzle, we would probably want to have custom clients (service clients in Guzzle terms) for more advanced applications. A couple of ideas that came to mind on how we can normalize settings across various clients:

1. Create an event dispatcher specifically for HTTP requests, load it up with subscribers in the container, and make sure a cloned version is attached to every client that's created.
2. Create a client factory that handles attaching a bunch of subscribers and common configuration.
3. drupal_alter().

RobLoach’s picture

Status: Active » Needs work

Very nicely done! This is looking absolutely amazing. Don't quite have that much time to evaluate thoroughly currently, but just wanted to say it looks really good.

+++ b/core/includes/common.incundefined
@@ -1020,6 +1021,26 @@ function drupal_http_request($url, array $options = array()) {
+ */
+function drupal_http_request_guzzle($uri, $method = RequestInterface::GET, array $headers = array(), $body = NULL) {
+  return drupal_container()->get('http_default_client')->createRequest($method, $uri, $headers, $body);

Do we want to turn drupal_http_request() into a wrapper around Guzzle?

RobLoach’s picture

Status: Needs work » Needs review

Also curious what the bot says.

Crell’s picture

Can you post a version that doesn't include the Guzzle source as well, so that it's Dreditor-able? :-) I am scared to click Review on a 1.22 MB patch...

Rob: I think we want to eliminate drupal_http_request() entirely and just let objects pull http_client from the DIC directly. The wrapper function is completely unnecessary, and just makes testing harder.

rbayliss’s picture

Yeah, here's the patch less Guzzle.

Crell’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php
@@ -73,6 +73,16 @@ class ContainerBuilder extends BaseContainerBuilder {
+    $this->register('http_client_simpletest_subscriber', 'Drupal\Core\Http\Plugin\SimpletestHttpRequestSubscriber');
+    $this->register('http_default_client', 'Guzzle\Http\Client')
+      ->addArgument(NULL)
+      ->addArgument(array(
+        'curl.CURLOPT_TIMEOUT' => 30.0,
+        'curl.CURLOPT_MAXREDIRS' => 3,
+      ))
+      ->addMethodCall('addSubscriber', array(new Reference('http_client_simpletest_subscriber')))
+      ->addMethodCall('setUserAgent', array('Drupal (+http://drupal.org/)'));

Just a note that this will change a bit once #1599108: Allow modules to register services and subscriber services (events) gets in.

+++ b/core/lib/Drupal/Core/Http/Plugin/SimpletestHttpRequestSubscriber.php
@@ -0,0 +1,42 @@
+class SimpletestHttpRequestSubscriber implements EventSubscriberInterface {
+
+  /**
+   * {@inheritdoc}
+   */
+  public static function getSubscribedEvents()
+  {
+    return array('request.before_send' => 'onBeforeSendRequest');
+  }
+
+
+  /**
+   * Event callback for request.before_send
+   */
+  public function onBeforeSendRequest(Event $event) {
+    // If the database prefix is being used by SimpleTest to run the tests in a copied
+    // database then set the user-agent header to the database prefix so that any
+    // calls to other Drupal pages will run the SimpleTest prefixed database. The
+    // user-agent is used to ensure that multiple testing sessions running at the
+    // same time won't interfere with each other as they would if the database
+    // prefix were stored statically in a file or database variable.
+    $test_info = &$GLOBALS['drupal_test_info'];
+    if (!empty($test_info['test_run_id'])) {
+      $event['request']->setHeader('User-Agent', drupal_generate_test_ua($test_info['test_run_id']));
+    }
+  }
+}

For now this works, but it should be moved to the Simpletest module as soon as modules have access to register subscribers, per the issue linked above.

+++ b/core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php
@@ -20,44 +20,34 @@ class DefaultFetcher implements FetcherInterface {
+    $request = drupal_http_request_guzzle($feed->url);
+

I'd rather skip this wrapper function entirely. Let's just expose Guzzle and be done with it.

+++ b/core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php
@@ -20,44 +20,34 @@ class DefaultFetcher implements FetcherInterface {
-    // Process HTTP response code.
-    switch ($result->code) {
-      case 304:
-        break;
-      case 301:
-        $feed->url = $result->redirect_url;
-        // Do not break here.
-      case 200:
-      case 302:
-      case 307:
-        if (!isset($result->data)) {
-          $result->data = '';
-        }
-        if (!isset($result->headers)) {
-          $result->headers = array();
-        }
-        $feed->source_string = $result->data;
-        $feed->http_headers = $result->headers;
-        break;
-      default:
-        watchdog('aggregator', 'The feed from %site seems to be broken due to "%error".', array('%site' => $feed->title, '%error' => $result->code . ' ' . $result->error), WATCHDOG_WARNING);
-        drupal_set_message(t('The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->title, '%error' => $result->code . ' ' . $result->error)));
-    }

Is this all subsumed into Guzzle directly? I like. :-)

+++ b/core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php
@@ -20,44 +20,34 @@ class DefaultFetcher implements FetcherInterface {
+      //Guzzle HTTP Headers are a slightly different format from what aggregator expects:
+      $feed->http_headers = array(
+        'last-modified' => $result->getLastModified(),
+        'etag' => $result->getEtag()
+      );
+      return TRUE;

Then shouldn't we be modifying aggregator to expect Guzzle's headers and then not need this comment?

+++ b/core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php
@@ -20,44 +20,34 @@ class DefaultFetcher implements FetcherInterface {
+    catch (\Guzzle\Http\Exception\BadResponseException $e) {

Should be 'use'd rather than a full namespace here.

+++ b/core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php
@@ -20,44 +20,34 @@ class DefaultFetcher implements FetcherInterface {
+      $result = $e->getResponse();

Minor nitpick. $result is usually used for a DB result object. $response for an HTTP response object. I'd rather risk confusion with the Symfony response object than with a PDOStatement object. :-)

+++ b/core/modules/system/system.module
@@ -3755,17 +3755,20 @@ function system_retrieve_file($url, $destination = NULL, $managed = FALSE, $repl
+  catch (\Guzzle\Http\Exception\BadResponseException $e) {

Ibid.

-15 days to next Drupal core point release.

sun’s picture

Issue tags: +API change, +Testing system

Guzzle looks absolutely fantastic to me. Especially love its extensive documentation.

Would love to see further progress here. (It is possible/likely that we're going to use the new library for the new testing framework.)

That said, if anyone is confused like me, the comparison matrix in #14 is outdated - http://groups.drupal.org/node/233173 is way more complete and some facts have been corrected. @mikeytown2, can you remove the misleading comparison table from your comment here (or replace it with the updated/current one)?

mikeytown2’s picture

Status: Active » Needs work
Issue tags: +API change, +Testing system

Old table removed; all that's left is a link to the wiki.

Crell’s picture

Title: Evaluate third party libraries to replace drupal_http_request() and the browser in DrupalWebTestCase » Adopt Guzzle library to replace drupal_http_request() and the Simpletest browser.

I think we can retitle this now.

sun’s picture

Title: Adopt Guzzle library to replace drupal_http_request() and the Simpletest browser. » Adopt Guzzle library to replace drupal_http_request()

Hm... given that we're likely going to NOT "update" WebTestBase for the revised testing framework, but rather "replace" it (with a new/separate class to extend from, and which will then tie into Guzzle), I'm really not sure whether this patch here should "waste" any time on the Simpletest browser...

No, actually, please ignore WebTestBase entirely -- it does not use drupal_http_request() at all (just grepped). WebTestBase uses cURL directly.

The other parts to support the WebTestBase architecture need to be retained though. But it looks like @rbayliss mastered those already :)

That said, I might have to disagree with @Crell ;) but only on a detail... unless I'm mistaken on how and when exactly it is involved, the Simpletest subscriber needs to work even if Simpletest module is not installed.

rbayliss’s picture

Yes, unless we're changing something drastic, the Simpletest subscriber has to be available outside of the Simpletest module. It's needed to modify the HTTP requests outbound from the site under testing, which typically doesn't have Simpletest enabled. I'll be rerolling this patch very soon.

Crell’s picture

In either case, I'm fine with replacing drupal_http_request() first, getting this committed, and then we can see what if any impact we want it to have on Simpletest. That can easily enough be a follow-up.

sun: Why does core need the ability to set a header on an outgoing request to be "from simpletest" when simpletest is not enabled? That makes no sense to me at all.

sun’s picture

This is done to allow a web test to "recursively" perform a HTTP request against the site under test.

I.e., to show a possible call chain (actual class/function names differ):

1. Test runner (has simpletest) calls AggregatorTest::testWhatever()
   which calls drupalGet()/drupalPost() (custom cURL) to trigger feed fetching in child site.

   2. Site under test (no simpletest) calls aggregator_fetch() to fetch a feed provided by itself,
      so it calls drupal_http_request() to itself.

      3. Same site under test (still no simpletest) responds to the drupal_http_request().

All of the issued HTTP requests need to use the User-Agent HTTP header, so that the site under test uses the correct database and configuration.

Without this TestSubscriber, the "nested" HTTP request between 2. and 3. (from the site under test to the site under test) would not use the required HTTP header, so the requested endpoint in 3. would effectively hit the database and configuration of the test runner (or "parent site"), which does not have the test configuration and might not have Aggregator module installed at all.

EDIT: This part will stay, even with the new testing framework.

Crell’s picture

Oh yuck. That's gross, but I can see the use case, ugly as it is. I'd still rather refactor our tests to not have to do that sort of inception-esque call chain, but that's not happening any time soon.

boombatower’s picture

How is the something we can avoid short of removing the web tests that call that execute code that calls the drupal site? This is also necessary for xmlrpc tests and possibly other service oriented tests (in contrib if not core).

rbayliss’s picture

Here's a reroll of the patch in #29, after taking @Crell's comments into account. The do-not-test version is the one that only includes changes outside of core/vendor. Assuming this still looks good and we want to proceed, here's a list of places that currently use drupal_http_request() and still need to be switched to Guzzle.

  • HttpRequestTest
  • aggregator_form_opml_submit()
  • _openid_xrds_discovery()
  • openid_association()
  • openid_verify_assertion()
  • SearchRankingTest
  • StatisticsAdminTest
  • _update_process_fetch_task()
  • _xmlrpc()
rbayliss’s picture

Status: Needs work » Needs review
sun’s picture

Would it make sense to commit the Guzzle library in a separate issue first, so we can focus on all the conversions in this issue?

Dries’s picture

At first glance, it seems like an enormous library to replace a small function in Drupal.

Also, Guzzle seems to require the PHP CURL extension.

I'm not sure we want this in core and definitely want to brainstorm about this more.

I don't think this is in the critical path for getting WSCCI done so let's not focus on it for now?

mikeytown2’s picture

If curl is a major issue we can have a contrib module that replaces guzzle with one that doesn't use curl, right?

@Dries
Issue is a lot of modules already use/require curl #64866-43: Pluggable architecture for drupal_http_request()

Sheldon Rampton’s picture

@Dries: I think Guzzle or something like it would be a significant thing to get added into Drupal. Right now the Services module for Drupal is exploding in popularity, and Drupal 8 will have a lot of Services-like functionality built into core. This will make it much easier to serve up Drupal content in non-HTML forms such as JSON, but there is no corresponding functionality in Drupal that makes it easy to RETRIEVE that content. Guzzle provides a fully-featured HTTP Client package which does more and does it more easily than the current drupal_http_request() function. I think this is important functionality strategically to make Drupal play well with other applications in the future as we move away from a focus on HTML-based web pages and become more focused on Drupal as an importer/exporter of data services.

effulgentsia’s picture

Perhaps an updated issue summary would be helpful. drupal_http_request() is currently 300 lines of ugliness. Whereas Guzzle is 20,000 lines (40,000 if you include tests) of what is probably much more modern, powerful, and pretty code. But what does it actually provide for core? What do we envision adding to core to warrant including such a large library? If part of the Web Services initiative roadmap includes specific service consumption features needed in core, please state what those are in the issue summary. Or, if our testing system, aggregator fetching, OpenId logic, or something else currently in core is running into a limitation that Guzzle solves, please add that to the summary too. If there's not enough benefit for core to justify the weight, why not let it be a contrib project?

Sheldon Rampton’s picture

The big thing in my view is that there are quite a few Drupal contrib modules which consume some external API via REST queries, and right now there is no standardization about how they make those API calls. For example, the D6 version of the Redmine API module relies upon a separate contrib module called rest_api_query. Dave Reid and I have had some conversations related to updating the Redmine module to D7, and the main area of difficulty/disagreement between our approaches had to do with what approach we should take to do the REST queries, because we both felt that the rest_api_query module was wrong for the task. Dave therefore started writing his own library for that purpose, while I found a PHP library for REST queries that was recommended on Redmine.org. Judging from what people are saying here about Guzzle, it sounds to me like we would have been better off (and would have found it easier to agree quickly upon an approach) if Guzzle or something like it already existed as a standard for Drupal.

If we take the approach of making Guzzle a contrib module for D8, that will be a step forward, but it will also mean that there is still no standard recommended and defined in core which the entire community of Drupal module developers can rely on for guidance on how they should make REST queries. We'll therefore have a proliferation of methods and will probably be trying to sort this out in Drupal 9.

The main value of this issue ticket is that multiple people have come together to discuss and evaluate different PHP library options for HTTP request handling in Drupal. It seems that a consensus has formed in favor of Guzzle as the best option. If that consensus is not yet solid, then maybe we should postpone putting Guzzle into core for the time being. However, if the consensus is solid, I think it should go into core.

As an aside: I think creating a contrib module for Guzzle in Drupal 7 would be a good idea in any case, regardless of what happens with Drupal 8.

Crell’s picture

One big use case is deployment:
http://groups.drupal.org/node/237443
http://groups.drupal.org/node/244158
http://groups.drupal.org/node/246748

(Incidentally, the latter two still need feedback.)

We want to do a lot more Drupal-to-Drupal communication. That means Drupal sending outgoing requests to other Drupal sites, and sanely handling redirects, recursive connections, POST/PUT/DELETE, Accept headers, etc. Those are all places that we're weak right now. Web services means being strong on outgoing requests, not just incoming.

webchick’s picture

Status: Needs review » Needs work
Issue tags: +API change, +WSCCI, +Testing system, +symfony

The last submitted patch, 1447736-drupal_http_request_guzzle-40.patch, failed testing.

mikeytown2’s picture

Created a session to present on this subject at the Pacific Northwest Drupal Summit.
http://2012.pnwdrupalsummit.org/sessions/issuing-http-requests-drupal-8

I believe mtdowling is Seattle based, so I'll see if we can team up on the presentation.

mikeytown2’s picture

Another bug that can be fixed by using guzzle
#229905: Batch API assumes client's support of meta refresh

Found a sandbox just for guzzle: http://drupal.org/sandbox/kimpepper/1723292

pwolanin’s picture

I don't think we should include guzzle or any other library for 8 - the security implications alone of adding extra libraries is something that hasn't been addressed.

Instead, let's define a good interface and get this issue done for 8: #64866: Pluggable architecture for drupal_http_request()

In general I think we should be working to define good interfaces with minimal implementations in core and allow them to be swapped for those sites that need something more.

boombatower’s picture

Drupal runs on the web...modules want to use the web...to do so we need a browser...Having one makes loads of sense for all the reasons above...and having core come with one lowers the barrier and allows for a default choice.

http://drupal.org/node/64866#comment-6382120 sums it up well.

pwolanin’s picture

Status: Needs work » Postponed

So, I think the top priority D8 for now should be #64866: Pluggable architecture for drupal_http_request() which I'm re-opening. I think this should be closed or postponed given the lack of concrete answers to Alex's questions in #46.

mtdowling’s picture

Guzzle has a lot of features and is extremely extendable. This is not code you had to write nor is it code you will need to maintain (though more contributors are always welcome). I'd love it if Drupal decided to add Guzzle to core, but I understand that some of you have concerns. The issues I'm seeing being raised are the following:

  1. It's not lightweight enough
  2. Undefined security implications
  3. Requires cURL
  4. Let's just define an interface instead

Antithesis to "it's not lightweight"

Guzzle is 20,000 lines

According to phploc, Guzzle currently is 11,156 non comment lines of code, but I'm curious as to why the lines of code of a project is significant. Are you working on an embedded system with PHP? Is the 1MB of source code (which includes the liberal use of docblocks for each method, class, and property) going to use up all of your disk space on a web server? (even most shared hosts claim to offer unlimited disk space these days).

157,775 - that's the number of non comment lines of code in Symfony2 according to phploc. Does that make Symfony2 a bad candidate for Drupal? Nope; it means Drupal is choosing a sophisticated framework to power its server side framework.

HTTP is not "lightweight" protocol. Heck, the book on HTTP is 656 pages and weighs 2.3 pounds. Does that mean I don't like the book? Nope; it means it can tell me almost everything I need to know about HTTP. Do I have to read the entire book for it to be useful? Nope; I can skip to the parts that I need right now, and if I ever needed to refer to other parts of the book, I would know where to look. </allegory>

While Symfony2 receives requests and powers the server side of HTTP, Guzzle works with the other side of HTTP -- the client. With the rise in web services, the client is just as important as the server in a HTTP transaction because the client is likely a critical part of your application. These days people are using HTTP clients to power their databases, file storage, queuing, emailing, and even their sessions.

Guzzle's aim is to provide access to every feature of HTTP/1.1, be as fast as possible, and provide some abstractions and patterns to make it easier to consume a web service. If you want a client that can handle almost every aspect of HTTP/1.1, then Guzzle is a great option. If you want a client that's 1-2k lines of code, then I don't know how you could come close to the feature set of Guzzle. mikeytown2 wrote a great comparison of PHP HTTP client libraries that can be found here: http://groups.drupal.org/node/233173

40,000 if you include tests

If anything, that should be a good sign that the library is extremely well tested. As a matter of fact, Guzzle has 99.98% code coverage -- that's 4,366 tested lines of code out of 4,367 total testable lines of code.

Undefined security implications

the security implications alone of adding extra libraries is something that hasn't been addressed

Can you elaborate on the security implications? I'd be happy to try to address them.

Requiring cURL

Guzzle seems to require the PHP CURL extension.

From what several people are posting, there are currently several places in core that already use cURL, so this would not be a new requirement.

It's extremely common for cURL to be installed on shared hosts. A quick Google search shows that GoDaddy, hostgator, dreamhost, and ServerGrove come with cURL. If a shared host does not support cURL, I would guess that they would enable if asked. Additionally, installing the PHP cURL bindings on Windows is trivial: http://curl.haxx.se/libcurl/php/install.html

Using Guzzle vs defining interfaces

[...] I think we should be working to define good interfaces with minimal implementations in core and allow them to be swapped for those sites that need something more

This is a great idea in theory, but I'm a bit of a pessimist when it comes to this sort of thing.

The problem is that you would need to abstract every feature that you like in Guzzle into various interfaces (e.g. Request, Response, Client, CookieJar, Cookie, Stream, EntityBody, Url, QueryString, RequestFactory, etc). Making these interfaces flexible and generic enough to be implemented by multiple providers while still providing all of the features Guzzle provides will be a monumental effort. Assuming you get past the task of agreeing on a set of potentially 20+ interfaces (which based on the related PSR discussion is not easy), you would then need to provide implementations of these interfaces for one or more clients. At this point, I would estimate that it's months later and you're using a stripped down version of Guzzle (and likely much slower due to the fact that every call would need to be proxied through an adapter).

-Michael

Dave Reid’s picture

As for undefined security implications, can we get answers for what the Guzzle project team's stance or position is on all of the following questions from http://groups.drupal.org/node/250578:

  • The project should have a simple, well documented way to report bugs
  • The project should follow Responsible Disclosure
  • The project should make security releases on a consistent schedule (e.g. every tuesday, every X day of the month)
  • The version of the libraries being incorporated into Drupal will be supported by their security process for as long as we reasonably expect Drupal Core to be supported (around 5 years)
  • The project should have active maintainers so that security vulnerabilities can be fixed in a reasonable amount of time (X weeks).
  • The project will ideally allow interested members of our security team to join their private mailing list/issue tracker so we can have some advance knowledge and coordination around releases.

I'm not sure we've also tested as to how secure/dummy-proof the code is from a security standpoint. Are there any active security issues (private or known)? What are the common gotchas (like XSS in Drupal)?

mtdowling’s picture

Hi Dave,

> The project should have a simple, well documented way to report bugs

Like most open source projects, issues are reported on GitHub.

> The project should follow Responsible Disclosure
> The project should make security releases on a consistent schedule (e.g. every tuesday, every X day of the month)

Sure, but there aren't many security concerns for an HTTP client. A client serializes an HTTP request, sends it over the wire, parses the response, then returns the response. As long as you use SSL and don't disable peer and host certificate verification, there isn't much more a client should be responsible for. Did you have any specific things in mind that might raise concern?

I typically tag a release once a week, but I don't think there's ever been a need for a security release.

> The project will ideally allow interested members of our security team to join their private mailing list/issue tracker so we can have some advance knowledge and coordination around releases.

I don't think this is necessary for Guzzle, but if it's a requirement, we can set something up. Is this something that can be handled on drupal.org somehow if needed?

> The version of the libraries being incorporated into Drupal will be supported by their security process for as long as we reasonably expect Drupal Core to be supported (around 5 years)

I'm sure we could work out a similar approach to what will be taken with Symfony2.

> The project should have active maintainers so that security vulnerabilities can be fixed in a reasonable amount of time (X weeks).

I'm the lead, and at this point in time, there have been 25 contributors: https://github.com/guzzle/guzzle/graphs/contributors

> I'm not sure we've also tested as to how secure/dummy-proof the code is from a security standpoint.

Guzzle has been thoroughly tested both in unit tests and in production systems, but I'd love for someone to do this.

> Are there any active security issues (private or known)? What are the common gotchas (like XSS in Drupal)?

None that I know of.

-Michael

catch’s picture

I've added two issues depending on this one to the summary. Those aren't enough in themselves to justify adding the library, so hopefully some other people will update the summary too.

catch’s picture

Issue summary: View changes

Updated issue summary.

mikeytown2’s picture

I've added a couple more issues to the top. The biggest benefit I see is in contrib. Having something as powerful as guzzle in core allows for a lot of cool things to happen. Looks like a decent sized project is now requiring httprl.

webchick’s picture

Just to play devil's advocate though, couldn't those contrib modules just add a contributed Guzzle module to their list of dependencies..?

RobLoach’s picture

Just chiming in here. Blown away by how much work and dedication mtdowling has put into helping us out. This should not be ignored.

It's not lightweight enough

mtdowling has split Guzzle into different components. If we need just the HTTP interface, we can do that. Composer will handle its dependencies for us.

Just to play devil's advocate though, couldn't those contrib modules just add a contributed Guzzle module to their list of dependencies..?

Yes, but that's contrib and we wouldn't gain benefit out of it out-of-the-box in Drupal Core. I'm all for keeping Drupal Core light, but the benefits shown in Guzzle extremely out-weight the cons.

pwolanin’s picture

@Rob Loach - the issue summary seems out of date above, but I'm agree with webchick's suggestion that modules that need it add it as a dependency. I still think this is unreasonable core bloat and overhead of coordinating core releases with this project.

The http component alone is still ~300k of code (~10k LOC).

mtdowling’s picture

In order to address some of the concerns that have been raised on this thread about the size of Guzzle, I've been working on a 3.0 release of Guzzle that will break the project further into components. The end result would be somewhere around 5k LOC that would need to be required by core. Here's the pull request if anyone is interested: https://github.com/guzzle/guzzle/pull/133

The plan is to break Guzzle into the following components via composer and possibly PEAR:

  • guzzle/batch
  • guzzle/cache
  • guzzle/common
  • guzzle/http
  • guzzle/inflection
  • guzzle/iterator
  • guzzle/log
  • guzzle/parser
  • guzzle/plugin
  • - guzzle/plugin-async
  • - guzzle/plugin-backoff
  • - guzzle/plugin-cache
  • - guzzle/plugin-cookie
  • - guzzle/plugin-curlauth
  • - guzzle/plugin-history
  • - guzzle/plugin-log
  • - guzzle/plugin-md5validator
  • - guzzle/plugin-mock
  • - guzzle/plugin-oauth
  • guzzle/service
  • guzzle/stream
  • guzzle/validation

A barebones installation would require guzzle/http, which requires guzzle/common and guzzle/stream. You will be able to install all of the plugin using the guzzle/plugin repository, or install individual plugins using guzzle/plugin-[name]. You will be able to install the entire project by using guzzle/guzzle. (Note that using these components will not pull in the tests)

These changes will allow a cleaner way to separate plugins and the ability to build more robust plugins. For example, the BackoffPlugin and the CachePlugin will be much easier to extend and modify in 3.0. Breaking the project into components like this will mean that contributors can build awesome plugins and new components without worrying about the lines of code. If you don't need a feature, you won't have to pull it into your project. If Guzzle's core were provided in Drupal's core, this setup would make it easier for contrib modules to pull in additional Guzzle modules as needed.

These are breaking changes, but the majority of the changes are just find and replace. I'll provide a small migration script when I release to help ease the migration. These changes are a long time overdue. I can promise that after this release, Guzzle will have much fewer BC changes between versions, and if Guzzle were added to Drupal core, I would try to coordinate with Drupal on BC changes.

Do these changes help to address some of the concerns? Is there something else that could be done?

-Michael

cosmicdreams’s picture

The naming conventions here make the inclusion of this sounds like a happy marriage of Web service client in core and Feeds in core.

This would be my wildest dream come true.

effulgentsia’s picture

@mtdowling: that sounds awesome. My only suggestion is to really try to be aggressive in cutting things out of guzzle/http and moving anything not essential to the architecture or very simple http requests into a separate component. For example, looking at https://github.com/guzzle/guzzle/tree/modular/src/Guzzle/Http, I'm wondering if:

- Cookie and CookieJar can move to a Cookie component
- Message/PostFile can move to a File component
- Curl can move to a Curl component

I'm just thinking that really minimizing what we need to pull into Drupal core will help a lot in getting this in, but these are just rough suggestions: of course, you also need to keep your project architecture sane and not over-componentized.

pwolanin’s picture

It would seem guzzle still requires cURL, which would then become a requirement for Drupal. That seems like the most likely reason not to pull an external library into core, in addition to possible maintenance issues mentioned above. A smaller library lessens those concerns, but doesn't remove them.

effulgentsia’s picture

ClientInterface has setCurlMulti() and getCurlMulti() methods. Would be good for those to go away.

Client has send() and prepareRequest() methods that use the Curl classes. Would be good if those could be abstracted, though if not, we could potentially extend the class and override those methods.

Note that as a last resort, we could swap out CurlHandle to not actually use cURL, though I hope we don't have to resort to such hackery.

Crell’s picture

Why is cURL such an evil thing to require, exactly? It's already required if you want to use Simpletest...

mikeytown2’s picture

In #56 mtdowling explains that curl is widely supported now. I would agree that it's rare to find a host that does not support cURL. I think that all we would need to do is add a page like the one we did for PDO.

effulgentsia’s picture

It's already required if you want to use Simpletest

That's not a compelling argument, since many production sites don't need to run Simpletest. However, one can make the argument that a site also doesn't need to run aggregator, openid, or xmlrpc. But, "Update Status" is the one core feature that I would find unfortunate to not be working on sites without cURL, if such sites actually exist.

webchick’s picture

It might indeed be more true now than it was when I started with Drupal that a cURL requirement is more standard. I know drupal_http_request() saved my bacon on DreamHost back in 2005 by the fact that it used sockets rather than cURL, but I noticed phpinfo there now shows libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6.

The only other host I have access to atm is Acquia and they have 7.19.7.

This might be worth a webform.com survey piped through g.d.o/core and see if we can get some data on this.

mtdowling’s picture

@effulgentsia Thanks for the feeback!

Cookie and CookieJar can move to a Cookie component

I'll think about moving the Guzzle\Http\Cookie object and Guzzle\Http\CookieJar namespace to Guzzle\Plugin\Cookie.

Message/PostFile can move to a File component

I think these should stay because they're required for sending POST requests.

Curl can move to a Curl component

I'm not sure about this one, but I'll consider it.

I could potentially add a Guzzle\Transport namespace, and update Guzzle\Http\Client to use a TransportInterface. This would open up the ability to support multiple transports in the future (socket, file_get_contents, etc).

The problem here is that it would be incredibly difficult to achieve or even emulate feature parity with a cURL transport, and the amount of effort required to implement additional transports would only benefit a very small percentage of users that do not have cURL installed. Using a socket based approach that can even come close to cURL, including parallel requests, would be a massive undertaking and tens of thousands of lines of code. Each adapter would hopefully support events for transferring data out, in, progress events, and true streaming of HTTP request and response bodies.

It might be a good idea to add the necessary interfaces regardless of if/when alternatives are ever implemented though.

This might be worth a webform.com survey piped through g.d.o/core and see if we can get some data on this.

@webchick: It would be awesome to get some data around this!

-Michael

Crell’s picture

+1 to collecting data rather than guessing.

pwolanin’s picture

imho, it's not worth adding more indirection and interfaces if Guzzle is only ever going to use cURL. The code is already hard to follow.

mtdowling’s picture

@pwolanin I evaluated the possibility yesterday, but will not be abstracting away cURL. The reason Guzzle is so powerful is because it is so tightly integrated with cURL.

effulgentsia’s picture

Some more thoughts about cURL:
- WordPress does not require it. They developed their own transport abstraction API in order to be compatible on as many servers as possible.
- ZenCart and Magento do require it. My guess is most big shared hosting companies want the business of small shop owners and make their hosting plans compatible with these.
- Per #71, a Drupal site doesn't need to consume web services, so cURL would only be needed for sites that choose to do so (e.g., Aggregator or OpenId). Even Update Status is optional (though a very nice security helper for site maintainers who don't subscribe to the security announcements mailing list).
- Per #44, if sites without cURL that want to consume web services is a small enough edge case, it might not make sense for core to address it. Contrib could provide a replacement implementation that conforms to Guzzle's interface. This would likely only come about if there's a large enough website that wants it. Until then, small website owners would need to switch to one of the many hosting companies that does offer cURL.

sun’s picture

mtdowling’s picture

That's some great insight, effulgentsia. Including what effulgentsia listed, these popular PHP projects all require cURL: Magento, ZenCart, Facebook PHP SDK, AWS SDK for PHP, Twitter OAuth, ThinkUp, Fuel (appears to only offer a cURL client), and the majority of web service clients I've worked with. I'm sure we could come up with a much larger list if we kept looking.

tunic’s picture

#77, about Drupal consumption of webservices: the point may be what things could be done with a good HTTP client implementation like Guzzle in core instead of actual needs. If Guzzle is included other core and contrib components will be using it (check mentioned cases in the issue body).

cosmicdreams’s picture

#80: In my opinion, if Drupal had Guzzle AND we accomplish everything we're targetting in the WSCCI initiative then we'd have web-service driven, platform agnostic Outputs and inputs.

Once we connect Guzzle with data import logic (Migrate, Feeds, Import/Export API, etc.) we can allow Drupal to simply act as a datastore in projects that

  • only want a built-in admin ui for managing the data
  • need to pull data from outside sources
  • need to create "stage" content on completely different site so that the finished content can be pushed to the production site
  • General data migration issues
  • or support functionality that we haven't thought of yet.

In my opinion, whenever I read about the progress of the Services module in D6 and D7, I always thought it would allow Drupal to act as a web client that connects to others, instead of acting like a Server that provided it's data.

Being able to pull in data from outside sources so that content can be curated is exciting to me because I love data warehousing. That's a whole category of stuff we can approximate also long as we had good clients and servers.

catch’s picture

It seems OK to add a cURL dependency to modules like aggregator and even update status. We have a nag when update_status is disabled, so while it'll be really annoying to have Drupal nag you about a module that it won't also let you enable, that's probably the right kind of annoying.

I also can't really think of a feature we'd need in core using outgoing http requests that would have to be required - we removed system_check_http_request() as well as the strange clean urls check which are the only places I can think of in core that were doing this and neither will be missed. Given that I can't see it creeping up to a full dependency.

Our PHP version and PDO requirements have been a lot more stringent than this and that's worked out OK. And the release is almost a year out so there's plenty of time for both hosting companies and small sites to prepare.

cweagans’s picture

Status: Postponed » Active

In the last two months, I think that there's been a pretty compelling argument for adding Guzzle to core, so I'm going to reopen this.

Answers were given to #46, and the project maintainer seems very amicable to making changes that we need made.

For the record, I'm okay with adding a curl dependency to all of core - I mean, what host doesn't have curl available? If it does turn out to be a problem, then is there any reason that Guzzle can't be extended with another class that uses sockets instead of curl? I don't see anything particularly difficult about that.

pwolanin’s picture

@cweagans - I don't think you can do that to guzzle itself. See #73.

I think the main question is whether core ships w/ some interface that overlays guzzle, or whether we depend on and ship the minimal parts of it and depend on some factory methods to swap out a piece if someone really has a use case that cannot be satisfied.

kim.pepper’s picture

I think the main question is whether core ships w/ some interface that overlays guzzle

I agree with comments by @mtdowling in #56. We would have to do a load of work to abstract every Guzzle interface

The problem is that you would need to abstract every feature that you like in Guzzle into various interfaces (e.g. Request, Response, Client, CookieJar, Cookie, Stream, EntityBody, Url, QueryString, RequestFactory, etc). Making these interfaces flexible and generic enough to be implemented by multiple providers while still providing all of the features Guzzle provides will be a monumental effort.

mtdowling’s picture

I just released Guzzle 3.0. Guzzle is now broken into the many components listed earlier in the thread. This would allow Drupal to pull in only the bare-bones requirements of Guzzle. You can read more about the release and the component installation options here: http://mtdowling.com/blog/2012/10/16/guzzle-3-dot-0-better-service-descr...

podarok’s picture

#86 nice update

Crell’s picture

Assigned: Unassigned » catch

Nice work, mtdowling!

pwolanin: Does the revised Guzzle aussage your concerns? Would you be comfortable with putting the base guzzle/guzzle package (and maybe one or two others as needed) in core, and saying "if contrib wants more, they can grab the other components with composer"? (That's pretty much what we're doing with Symfony.)

catch, how about you?

effulgentsia’s picture

I'm guessing that catch will care about this statement from #64:

I can promise that after this release, Guzzle will have much fewer BC changes between versions, and if Guzzle were added to Drupal core, I would try to coordinate with Drupal on BC changes.

With Symfony, fabpot formalized this a bit more into components that will keep strict BC across 2.x point releases and ones that won't, and so far in Drupal core we've only added the stricter set of components, though #1810472: Add Symfony's Serializer component to core despite Symfony potentially releasing BC-breaking updates after 2.3. may become our one exception. @mtdowling: are you thinking of doing something similar for Guzzle 3.x, where BC is treated differently per component, or in general what are your thoughts of what things would look like for Guzzle in Drupal 8.x in 2017/2018?

catch’s picture

Assigned: catch » Unassigned

@Crell, I've been generally in favour of this, and the updated structure of Guzzle looks like a really good change. It'd be good to see what the answer to effulgentsia's question about BC over the next 5-6 years is, but restructuring the library so it's easier for us to commit it suggests it should be easy to resolve issues in the future as well.

If we use Guzzle, we'd hopefully be able to unpostpone #1599622: Run poormanscron via a non-blocking HTTP request which would cheer me up for one. In this case most of the pushback has been from Dries, so unassigning from me.

effulgentsia’s picture

To move this along, let's get a patch up here that just adds Guzzle 3.0 core and the other 1 or 2 components we expect to need (I'm hoping Guzzle is setup to allow this to be done via Composer). Then, pwolanin, Moshe, Dries, and others who have been concerned about code weight / complexity can look it over (yes, they can also look it over on Guzzle's github, but having it here would put it in their face a little bit more). Perhaps it also makes sense to change drupal_http_request() to use Guzzle as part of this patch to demonstrate what a simple Guzzle consumer might look like, but I don't think it's worth changing any consumers of drupal_http_request() yet, since Dries hasn't agreed to Guzzle yet.

rbayliss’s picture

I definitely agree we should put a patch together that includes the library. A couple simple examples of how Guzzle (pre-3.0, anyway) can be used is found in #40 above. That patch was applying and passed tests at the time it was provided.

mtdowling’s picture

#89:

> are you thinking of doing something similar for Guzzle 3.x, where BC is treated differently per component

I don't have anything written up like what fabpot has, but I'd be happy to put something together next week.

In short, Drupal would only require guzzle/http, which in turn requires guzzle/common, guzzle/parser, guzzle/stream. These core packages in Guzzle will have the same promise Fabien made for his core packages. However, other packages, i.e. guzzle/service, could potentially have minor BC changes introduced over time. Further, Guzzle 3 will strictly follow semantic versioning.

> what things would look like for Guzzle in Drupal 8.x in 2017/2018

If between now and 2018 Guzze 4 is released, then Guzzle 3 will be maintained in a separate branch. This is the approach I'm taking with Guzzle 2. While I'm not actively maintaining Guzzle 2 anymore, I'll accept pull requests for it and will address any security concerns.

mikeytown2’s picture

Trying to manually merge in changes from #40 into latest dev. Got some questions. Hopefully the answers are easy :)

Looks like core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php has changed a lot. Where can I register http_default_client? Would it belong in core/lib/Drupal/Core/DependencyInjection/Compiler/RegisterKernelListenersPass.php?

diff --git a/core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php b/core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php
index a82f5b6..3b1ffe2 100644
--- a/core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php
+++ b/core/lib/Drupal/Core/DependencyInjection/ContainerBuilder.php
@@ -73,6 +73,16 @@ class ContainerBuilder extends BaseContainerBuilder {
     $this->register('httpkernel', 'Symfony\Component\HttpKernel\HttpKernel')
       ->addArgument(new Reference('dispatcher'))
       ->addArgument(new Reference('resolver'));
+
+    $this->register('http_client_simpletest_subscriber', 'Drupal\Core\Http\Plugin\SimpletestHttpRequestSubscriber');
+    $this->register('http_default_client', 'Guzzle\Http\Client')
+      ->addArgument(NULL)
+      ->addArgument(array(
+        'curl.CURLOPT_TIMEOUT' => 30.0,
+        'curl.CURLOPT_MAXREDIRS' => 3,
+      ))
+      ->addMethodCall('addSubscriber', array(new Reference('http_client_simpletest_subscriber')))
+      ->addMethodCall('setUserAgent', array('Drupal (+http://drupal.org/)'));
   }

Note: This has NOT been added to this patch. I don't know where to put it.

The missing bit of code is sorta important thus this patch won't work without it.

With this patch applied, I believe all one needs to do is run

cd core
php composer.phar update

and composer will do the rest. Or run

drush dl composer
drush composer update

if composer.phar isn't in the core directory but drush is installed.

effulgentsia’s picture

Where can I register http_default_client?

core/lib/Drupal/Core/CoreBundle

rbayliss’s picture

We might want to wait on a reroll though, since #1706064: Compile the Injection Container to disk is RTBC, and that will change where the service is registered again.

mikeytown2’s picture

Patch with http_default_client registered. Still not ready for test, will be doing that next (php composer.phar update).

mikeytown2’s picture

Status: Active » Needs review
FileSize
2.6 MB

full patch attached.

Status: Needs review » Needs work

The last submitted patch, drupal-1447736-98-full-guzzle.patch, failed testing.

Anonymous’s picture

Issue summary: View changes

Added in more issues that could be solved by using guzzle

Crell’s picture

Assigned: Unassigned » Dries

I've updated the issue summary. If I missed anything, please add it. Assigning to Dries for a green light.

Crell’s picture

Issue summary: View changes

Add proper issue summary.

mikeytown2’s picture

I was able to meet up with mtdowling at the Pacific North West Drupal Summit this weekend. Long story short guzzle has been heavily profiled for speed (overhead of guzzle), so if that is a concern, I wouldn't worry about it; the code is fast.

With things like service descriptions in guzzle, putting guzzle in core is starting to make more and more sense when WSCCI is taken into account.

effulgentsia’s picture

Status: Needs work » Needs review
FileSize
583.67 KB
+++ b/core/composer.json
@@ -13,6 +13,7 @@
+    "guzzle/guzzle": "3.*",

We only need guzzle/http and its dependencies.

+++ b/core/includes/common.inc
@@ -7,6 +7,7 @@
+use Guzzle\Http\Message\RequestInterface;

Useless line.

+++ b/core/modules/system/system.module
@@ -3639,17 +3640,20 @@ function system_retrieve_file($url, $destination = NULL, $managed = FALSE, $repl
-  $result = drupal_http_request($url);
-  if ($result->code != 200) {
-    drupal_set_message(t('HTTP error @errorcode occurred when trying to fetch @remote.', array('@errorcode' => $result->code, '@remote' => $url)), 'error');
-    return FALSE;
+
+  try {
+    $result = drupal_container()->get('http_default_client')->get($url)->send();
+    $local = $managed ? file_save_data($result->getBody(TRUE), $path, $replace) : file_unmanaged_save_data($result->getBody(TRUE), $path, $replace);
+    if (!$local) {
+      drupal_set_message(t('@remote could not be saved to @path.', array('@remote' => $url, '@path' => $path)), 'error');
+    }
+    return $local;
   }
-  $local = $managed ? file_save_data($result->data, $path, $replace) : file_unmanaged_save_data($result->data, $path, $replace);
-  if (!$local) {
-    drupal_set_message(t('@remote could not be saved to @path.', array('@remote' => $url, '@path' => $path)), 'error');
+  catch (BadResponseException $e) {
+    $response = $e->getResponse();
+    drupal_set_message(t('HTTP error @errorcode occurred when trying to fetch @remote.', array('@errorcode' => $response->getStatusCode(), '@remote' => $url)), 'error');
+    return FALSE;
   }
-
-  return $local;

I'm not clear why this is an improvement, so I took it out of this patch. We can do it in a follow up. The aggregator example demonstrates some clearer improvements.

6 days to next Drupal core point release.

effulgentsia’s picture

By the way, in #102, the composer.lock file grew a lot compared to HEAD. Is there a switch to php composer update for generating a less verbose lock file? I don't know how we have such a terse one in HEAD.

Crell’s picture

Composer changed its .lock file format a few weeks ago in preparation for an RC, I believe. If it works, I wouldn't worry about it.

cosmicdreams’s picture

Status: Needs review » Needs work
+++ b/core/vendor/composer/autoload_real.phpundefined
@@ -6,9 +6,15 @@
+        if (null !== static::$loader) {
+            return static::$loader;
+        }
+
+        static::$loader = $loader = new \Composer\Autoload\ClassLoader();

This doesn't seem related to using Guzzle but that looks nice.

effulgentsia’s picture

Status: Needs work » Needs review

That was automatically updated by Composer.

RobLoach’s picture

Issue tags: -API change, -WSCCI, -Testing system, -symfony

#102: drupal-1447736-101-guzzle.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, drupal-1447736-101-guzzle.patch, failed testing.

Anonymous’s picture

Issue summary: View changes

Fixed typo.

Fabianx’s picture

Status: Needs work » Needs review
Issue tags: +API change, +WSCCI, +Testing system, +symfony

#102: drupal-1447736-101-guzzle.patch queued for re-testing.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

We can flesh out usage further once this is in. Let's get the big patch in, then we can spin off other issues.

hass’s picture

As Link checker heavily relies on Drupal core and how it works today and as I know that cURL cannot work properly - has someone tested if Guzzle has implemented some workarounds or solved the inabilities of cURL. Just from memory.

Stop redirecting after the first request (e.g. the first HEAD answers with 301). cURL in this case will follow all redirects and only gives me the last 200 what break will link checker completely...

I'm sorry, but I've forgotten the details of these major limitations I had with cURL, but it was a show stopper for sure I've found with http://www.onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blo.... Maybe this have been issues with multi_curl, I tried to use for boosting the link check speed... but the redirect limit we have in Core is something that cURL cannot handle correctly. Have you really tested this status codes?

RobLoach’s picture

Guzzle can send requests via curl_multi. If their implementation is not enough for us, then it's still possible to inherit from it and swap out the definition we use using drupal_container().

mtdowling’s picture

I'm trying to understand what you're talking about, but you haven't provided enough details.

> As Link checker heavily relies on Drupal core and how it works today and as I know that cURL cannot work properly

Please elaborate with evidence.

> Stop redirecting after the first request (e.g. the first HEAD answers with 301). cURL in this case will follow all redirects and only gives me the last 200 what break will link checker completely...

You can turn off redirects if needed, or set the maximum number of allowed redirects. If you do allow redirects, you can get the chain of request and response objects that created the final response by calling getPreviousResponse() on a Response object.

I should also mention that I moved away from cURL powered redirects to redirect plugin that can work around some issues with redirects and non-repeatable entity bodies / cookies being added on a redirect. These changes are present in master and will be tagged and released likely this weekend.

> the redirect limit we have in Core is something that cURL cannot handle correctly

Can you elaborate?

-Michael

sun’s picture

Yes, we should definitely do this.

Also, to give an update from the testing system side: Meanwhile, discussions resulted in a concrete battle plan:
#1567500: [meta] Pave the way to replace the testing framework with PHPUnit and possibly rewrite Simpletest

We will start to work on this for reals after feature freeze in:
#1801176: Deploy a PHPUnit system with a bottom-up approach

This effort will also require Guzzle.

mikeytown2’s picture

Heads up: Amazon's AWS SDK for PHP now uses Guzzle
https://github.com/aws/aws-sdk-php/

RobLoach’s picture

Tagging.

RobLoach’s picture

#102: drupal-1447736-101-guzzle.patch queued for re-testing.

Crell’s picture

So what are we still waiting for here? Dries, catch, please either commit or mark back to CNW with an explanation.

catch’s picture

I already posted in #90, but new libraries need agreement from Dries as well, which is why it's assigned to him.

klausi’s picture

Priority: Normal » Major

While writing Simpletests for rest.module I really started to hate cURL. Cannot even access the Location response header without parsing the response myself. So let's please use Guzzle to have readable and maintainable tests for rest.module.

Not sure this qualifies as a critical task, but at least major.

Dries’s picture

Assigned: Dries » Unassigned

I looked into this and am +1 for this issue. I don't think we should be in the business of maintaining an HTTP library. My biggest concerns is around security. We need the Guzzle team to do security releases well. This can be committed once we're below the commit threshold.

mtdowling’s picture

That's great!

If Guzzle eventually goes beyond a 3.x release in the distant future, then security fixes will always be backported to a 3.x branch.

Is there more feedback or a specific process in mind for security fixes?

klausi’s picture

The Drupal community encourages responsible disclosure: http://en.wikipedia.org/wiki/Responsible_disclosure

When I search the web for "guzzle report security problem" the first hit is a link to a Drupal module ;-). I could not find a hint on reporting security problems on the Guzzle homepage.

Furthermore the Drupal security team has a security advisory process with RSS feeds and a mailing list for notifications: http://drupal.org/security

And the security release process follows a predictable timing scheme, i.e. security releases are done in the middle of the week on work days so that people are available to react accordingly.

So from Drupal's point of view it would be good to know when Guzzle schedules security releases, so that we can release on the same day or as close as possible.

effulgentsia’s picture

#102: drupal-1447736-101-guzzle.patch queued for re-testing.

effulgentsia’s picture

This can be committed once we're below the commit threshold.

We are currently 2 major tasks over thresholds and this is one of them. I don't know whether that means we should commit this or reclassify as a feature request. I queued a bot check just in case the answer is the former.

RobLoach’s picture

So from Drupal's point of view it would be good to know when Guzzle schedules security releases, so that we can release on the same day or as close as possible.

mtdowling stated above:
Sure, but there aren't many security concerns for an HTTP client. A client serializes an HTTP request, sends it over the wire, parses the response, then returns the response. As long as you use SSL and don't disable peer and host certificate verification, there isn't much more a client should be responsible for. Did you have any specific things in mind that might raise concern?

I typically tag a release once a week, but I don't think there's ever been a need for a security release.

So, as long as we roll the latest stable from Guzzle, then we'll be solid. No reason to hold a Guzzle release for a Drupal release. Their release cycle is a lot more aggressive than ours.

scor’s picture

@mtdowling said in #58:

> The project should have a simple, well documented way to report bugs

Like most open source projects, issues are reported on GitHub.

What about security issues? These are usually not reported / discussed publicly. Is there a email address to report these? or do we contact you personally? I couldn't find any indication on http://guzzlephp.org/ or github.

Sure, but there aren't many security concerns for an HTTP client.

The Drupal Security team would prefer to follow the same policy for all third party libraries, no matter the size and/or risk associated with a library, that's why we're asking you these questions...

mtdowling’s picture

Thanks for the feedback. I've create an email address for security issues, security@guzzlephp.org, and updated the documentation accordingly: http://guzzlephp.org/guide/contributing.html#reporting-a-security-vulner...

David_Rothstein’s picture

Priority: Major » Normal

So are we actually in a situation where this issue is blocking itself from being committed? :)

I think to resolve that catch-22 we can just demote this issue's priority. #120 bumped it to major based on a shortcoming in cURL, but the patch is more about replacing drupal_http_request() which I don't think has that same shortcoming. (Also, having to parse a response in a simpletest to work around a cURL limitation really doesn't seem like a major issue to me, just an annoyance.)

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs work

Actually, I just tried this patch out on a server with cURL disabled and got all sorts of "call to undefined function" fatal errors when using the Aggregator module.

This patch needs a strategy for how it's going to handle servers that don't have cURL, whether to allow the modules to be enabled but handle the failed requests gracefully, or make each individual module that depends on cURL declare it as a dependency (as Simpletest already does), or just make all of core depend on cURL, etc.

David_Rothstein’s picture

Mysteriously disappearing tag.

mikeytown2’s picture

@David_Rothstein
The server with cURL disabled, is cURL available on it? Is this a simple matter of turning it on or do you have a rare host that doesn't have any cURL support?

Comments about cURL #56, #66 - #83. Long story short we are okay with making cURL a core dependency (from my understanding). Vast majority of hosts can enable/install cURL if cURL isn't available.

Crell’s picture

Let's just add it as system_requirements() check and call it a day. Maybe we can push that out to specific modules later, but for now it sounds like it's not anymore of an onerous requirement than we have already.

sun’s picture

Status: Needs work » Needs review

I agree with that — haven't seen a host on which cURL wasn't available in the past years.

Essentially, as @Crell says, the check has to be moved into system_requirements(), but off-hand, I'm not 100% sure whether that is sufficient for the Drupal installer as well as update.php. 100% confidence can most likely only be achieved by temporarily disabling cURL via php.ini and manually testing those requirements checks.

That said, I wouldn't mind this patch from landing without that manual testing. Like it or not, we'll run into PHP extensions issues anyway with D8, since the Symfony components we're actively using have a range of undocumented dependencies. (E.g., it is more than possible that our entire includes/unicode.inc is completely obsolete, because we're indirectly relying on the multibyte extension in various of subsystems already and unicode.inc #1 kicks in too late and #2 isn't even known to standalone vendor components in the first place. Just to name the first battle-field.)

effulgentsia’s picture

Status: Needs review » Needs work

Back to needs work for some kind of requirements check added to this patch.

My preference would be that we release D8 with the ability to run without cURL, and only make Aggregator, OpenId, Update Status, Testing, etc. depend on it. But I'm happy to take that up in a follow up if it's easier to add it to system_requirements() for the time being. However, #102 only uses Guzzle for Aggregator, so I would consider it sufficient to add a check to aggregator_requirements() instead if that's just as easy.

catch’s picture

I'd also prefer to just add the requirement where it's needed, and just for aggregator for now. The worst that happens is you install a module that doesn't declare it and get some notices if someone forgets to add the dependency, as opposed to not being able to install in the first place.

David_Rothstein’s picture

The server with cURL disabled, is cURL available on it? Is this a simple matter of turning it on or do you have a rare host that doesn't have any cURL support?

Nope, it's very easy to turn on in my case - I was just pointing out that we need a better way to tell people that than a white screen of death :)

kim.pepper’s picture

I applied the patch at #102 but there are lots of composer.lock differences.

If I run composer update, it updates all components including symfony.

How do I re-roll this patch without updating everything?

Thanks!

Kim

effulgentsia’s picture

You should be able to run composer update guzzle/http to just update that.

kim.pepper’s picture

Status: Needs work » Needs review
FileSize
644.76 KB

I have:

  • re-rolled the patch in #102
  • ran composer update guzzle/http
  • added 'curl' to system_requirements()
mikeytown2’s picture

Status: Needs review » Reviewed & tested by the community

Test Passes. Moving this back to RTBC as curl has been added to the system requirements thus preventing unknown WSOD (main objection in #130).

cosmicdreams’s picture

Ah I see. The tests pass and this patch is green but the comment in this issue doesn't annotate it as so.

Fabianx’s picture

#140: PASSED: [[SimpleTest]]: [MySQL] 48,751 pass(es).

catch’s picture

Status: Reviewed & tested by the community » Needs work

We should be adding the requirement for individual modules that require cURL, not the entire system.

cweagans’s picture

Does it really matter? cURL is almost universally available so almost nobody will hit that check anyways.

catch’s picture

David hit it but I don't know if he turned it off especially for testing this patch.

I don't really mind adding cURL as a dependency in general, but it feels like an artificial limitation if we don't actually require it.

cweagans’s picture

Can we revisit that after feature freeze and commit the patch as-is for now?

Dave Reid’s picture

Since Guzzle would be available as a system library, it feels like it makes more sense to add cURL as a system dependency. As a contrib author, adding requirements for cURL for each module that would need to use the API would be frustrating, especially since I can't easily 'extend' the list of PHP required extensions and I have to write a new hook_requirements() for each module each time.

effulgentsia’s picture

Status: Needs work » Reviewed & tested by the community
Issue tags: +revisit before beta

One benefit of adding the hard dependency now is we can get feedback from the wild (to the extent that people testing D8 is sufficiently wild) for who doesn't have cURL. I agree we should revisit the artificially hard dependency before release though. Back to catch for comment.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: -revisit before beta

In my case I turned it off to test the patch. I'm pretty sure people do sometimes run Drupal in restricted environments though (where outgoing HTTP requests are forbidden); whether that translates to not even wanting to or being able to install cURL itself (regardless of whether its requests are going to be blocked), that I'm not sure.

As far as I can tell, the only places in core that call drupal_http_request() are optional modules, so making it a strict requirement seems a little overboard. The one exception is system_retrieve_file(), which is an API function in System module but isn't actually called from anywhere outside the Update module.

As a contrib author, adding requirements for cURL for each module that would need to use the API would be frustrating, especially since I can't easily 'extend' the list of PHP required extensions and I have to write a new hook_requirements() for each module each time.

I would think the right way to handle that is sort of like http://drupal.org/project/curl - core could move all the Guzzle and HTTP request code into its own independent module, then any module which relies on HTTP requests can just declare that module as a dependency (and the module itself can take care of hook_requirements). It's a better experience for end users also.

However, it's too big of a change to the patch at this point, so just adding the code to aggregator_requirements() (copied from the Simpletest module) like @catch suggested seems pretty reasonable, and anything else could be a followup... I would have done that myself but wasn't sure what to do about system_retrieve_file().

Either way, the above patch isn't quite right since adding the code to system_requirements() means the simpletest_requirements() code is obsolete, but it's still there.

David_Rothstein’s picture

Status: Needs work » Reviewed & tested by the community

Crosspost. Setting RTBC for @catch's feedback is good.

Although like I said, there's still a problem with the patch no matter what:

Either way, the above patch isn't quite right since adding the code to system_requirements() means the simpletest_requirements() code is obsolete, but it's still there.

But it's a minor one.

David_Rothstein’s picture

Issue tags: +revisit before beta
catch’s picture

Issue tags: -revisit before beta

For contrib modules there's #1322890: Add a property to .info files to allow modules to specify required php extensions which would avoid having to write system_requirements() every time. I don't think saving contrib modules from having to write even a system_requirements() though is worth completely blocking installs for anyone on a restricted environment unless they hack core.

Also iirc several linux distros (ubuntu is one I think) don't have php5-curl out of the box - you need to install that separately. Most people will have no restrictions on doing so if it's their desktop but it's still an instant failure when you try to install.

sun’s picture

IMHO, #1322890: Add a property to .info files to allow modules to specify required php extensions is the best and ultimate answer, but for both core and contrib modules.

I think we should commit this as-is with the system requirement for now, and migrate that into individual module .info files as soon as the capability exists.

Crell’s picture

+1 to #154. Let's get the library in so we can use it; we can fiddle with the requirements flags for months without changing the API after that.

Sylvain Lecoy’s picture

Come on even IBM mainframe system support curl!

effulgentsia’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
643.48 KB

#140 no longer applied, so I needed to reroll.

I also noticed that despite saying in #102 that I removed the change to system_retrieve_file() to cover that in a follow up, that I hadn't actually done so, but in this patch, I did.

And finally, I removed the change to system.install, and added an aggregator_requirements(). I copied the relevant bits from simpletest_requirements(). Since we have #1322890: Add a property to .info files to allow modules to specify required php extensions in the queue, I'm not concerned about the 10 lines of code duplication that each Guzzle consuming module will need until that issue lands.

mikeytown2’s picture

Status: Needs review » Reviewed & tested by the community

Last patch passed. #144 has been addressed with this latest patch thus moving this back to RTBC.

On a side note I'm getting married December 1st (yes code freeze day). This getting in would be a really nice gift. Also means I'm pretty busy and can't help much with pushing this through.

Crell’s picture

Congratulations, Mike! Let's commit this as an early wedding gift. :-)

catch’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +API change, +WSCCI, +Proudly Found Elsewhere, +Testing system, +symfony

The last submitted patch, 1447736-157-adopt-guzzle.patch, failed testing.

effulgentsia’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
643.49 KB

Just a reroll for HEAD drift. No actual changes.

Status: Reviewed & tested by the community » Needs work
Issue tags: -API change, -WSCCI, -Proudly Found Elsewhere, -Testing system, -symfony

The last submitted patch, 1447736-162-adopt-guzzle.patch, failed testing.

catch’s picture

Status: Needs work » Needs review

#162: 1447736-162-adopt-guzzle.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, 1447736-162-adopt-guzzle.patch, failed testing.

Fabianx’s picture

Status: Needs work » Needs review

#162: 1447736-162-adopt-guzzle.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +API change, +WSCCI, +Proudly Found Elsewhere, +Testing system, +symfony

The last submitted patch, 1447736-162-adopt-guzzle.patch, failed testing.

Fabianx’s picture

Status: Needs work » Needs review
FileSize
654.38 KB

And another re-roll to land on a different testbot ...

Status: Needs review » Needs work
Issue tags: -API change, -WSCCI, -Proudly Found Elsewhere, -Testing system, -symfony

The last submitted patch, core--adopt-guzzle--1447736--168.diff, failed testing.

mikeytown2’s picture

Status: Needs work » Needs review

#168: core--adopt-guzzle--1447736--168.diff queued for re-testing.

Testbot went fatal on line #30. Thinking this should still pass

Status: Needs review » Needs work

The last submitted patch, core--adopt-guzzle--1447736--168.diff, failed testing.

Fabianx’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
Issue tags: +API change, +WSCCI, +Proudly Found Elsewhere, +Testing system, +symfony

The last submitted patch, core--adopt-guzzle--1447736--168.diff, failed testing.

jthorson’s picture

The testbots have a hard cap at 90 minutes per test ... with everything we've dumped into core, I've seen testing of D8 + this patch exceeding that at least twice today.

Requeued ... if it comes back as failed during invocation of run-tests.sh in 90 minutes again, we're going to have to i) track down what's slowing down the tests (which may be outside of this patch ... *everything* is slow right now), or ii) increase the timeout on the bots.

jthorson’s picture

Status: Needs work » Needs review

Passed.

jthorson’s picture

Status: Needs review » Reviewed & tested by the community

Ooops ... RTBC as per #162.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

K, committing the crap out of this while it's still green. Taking a raincheck from earlier when we were at thresholds.

Committed and pushed to 8.x. Thanks!

ParisLiakos’s picture

Title: Adopt Guzzle library to replace drupal_http_request() » Change notice: Adopt Guzzle library to replace drupal_http_request()
Priority: Normal » Critical
Status: Fixed » Active
Issue tags: +Needs change record

Awesome!
I guess a change notice will be needed

twistor’s picture

Fantastic!

catch’s picture

Is there an issue to actually convert drupal_http_request() to use this yet?

Sutharsan’s picture

I like to see an example too. Need to convert the drupal_http_request in #1804688: Download and import interface translations. And like to find a way to route the calls to http://ftp.drupal.org made by its test script back to the calling test domain. The test domain should serve files as if they come from an external domain. A http request and a redirect in one.

Berdir’s picture

git show 5058fa3 core/modules/aggregator/lib/Drupal/aggregator/Plugin/aggregator/fetcher/DefaultFetcher.php

Or look for that file in the patch.

For testing, look for Mock Plugin on http://guzzlephp.org/guide/plugins.html.

According to a quick grep, we're not yet including the Mock plugin, so you will also have to update composer.json to fetch it. https://packagist.org/packages/guzzle/plugin-mock, so the line in composer.json should look like "guzzle/plugin-mock": "3.*" I guess.

EDIT: @catch I think the goal is to directly use guzzle everywhere and then just drop drupal_http_request() once everything is converted.

catch’s picture

@Berdir, leaving drupal_http_request() as is then just dropping it when it's not used in core works for me, but it'd still be good to know where the conversion issues are.

Berdir’s picture

Berdir’s picture

Berdir’s picture

Assigned: Unassigned » Berdir

Will try to get started with a change notice here.

Berdir’s picture

Status: Active » Needs review

Here is a start: http://drupal.org/node/1862446.

Please review and complete.

While writing the example, I noticed that there are some possibly relevant things that we lost during the conversion:

-    // Request feed.
-    $result = drupal_http_request($feed->url, array('headers' => $headers));
+    try {
+      $response = $request->send();
+      $feed->source_string = $response->getBody(TRUE);
+      $feed->etag = $response->getEtag();
+      $feed->modified = strtotime($response->getLastModified());
+      $feed->http_headers = $response->getHeaders();
 
-    // Process HTTP response code.
-    switch ($result->code) {
-      case 304:
-        break;
-      case 301:
-        $feed->url = $result->redirect_url;
-        // Do not break here.
-      case 200:
-      case 302:
-      case 307:
-        if (!isset($result->data)) {
-          $result->data = '';
-        }
-        if (!isset($result->headers)) {
-          $result->headers = array();
-        }
-        $feed->source_string = $result->data;
-        $feed->http_headers = $result->headers;
-        break;
-      default:
-        watchdog('aggregator', 'The feed from %site seems to be broken due to "%error".', array('%site' => $feed->title, '%error' => $result->code . ' ' . $result->error), WATCHD
-        drupal_set_message(t('The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->title, '%error' => $result->code . ' ' . $result->error)
+      return TRUE;
+    }
+    catch (BadResponseException $e) {
+      $response = $e->getResponse();
+      watchdog('aggregator', 'The feed from %site seems to be broken due to "%error".', array('%site' => $feed->title, '%error' => $response->getStatusCode() . ' ' . $response->g
+      drupal_set_message(t('The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->title, '%error' => $response->getStatusCode() . ' ' . $res
+      return FALSE;
     }

- The old code updated the feed url in case of a 301 response.
- The old code explicitly only accepted the response for a given set of response codes. I'm not really sure when the BadResponseException is actually thrown, what happens if the site returns a 5xx response code? Are we going to happily return this to the parser? Does Guzzle provide anything to differentiate a "good" response from a unexpected one? Or a feature to set expected status codes and throw an exception on an unexpected one?

Berdir’s picture

Another thing, SimpletestHttpRequestSubscriber is currently added by CoreBundle, but it should be in simpletest.module, as @Crell already pointed out after the first patches here.

But, that means that the event will only be executed in tests that have simpletest.module enabled, which is kinda pointless ;) So we either need to keep it there or add a TestBundle or something like that? I'm not exactly sure what the point of that subscriber is, anyway, is it just for requests done by drupalPost()/drupalGet()? Another follow-up, I guess...

Berdir’s picture

Ok, I did some testing with the BadResponseException and a) it throws an exception on e.g. not found (which is going to be interesting in drupalPost()/Get() because there sure are a lot of cases where this is the expected behavior, just like access denied and more. b) The current error handling in DefaultFetcher.php can result in a fatal error because there isn't always a response, for details see #1862524: Convert drupal_http_request usage in aggregator.module to Guzzle, I suggest we deal with that there. @mtdowling, it would be great if you could comment over there in regards to the recommended error/exception handling with Guzzle.

Also opened #1862536: Improve documentation of SimpletestHttpRequestSubscriber.

Will add these follow-ups to the summary and try to stop spamming this issue ;)

Berdir’s picture

Issue summary: View changes

Added a related issue to drupal_http_request()

sun’s picture

That was explained in #37 already.

effulgentsia’s picture

Title: Change notice: Adopt Guzzle library to replace drupal_http_request() » Adopt Guzzle library to replace drupal_http_request()
Priority: Critical » Normal
Status: Needs review » Fixed
Issue tags: -Needs change record

The change notice looks fine to me. Perhaps it can be improved as we follow through on remaining conversions, but it already identifies the key change, provides some example code, and links to documentation.

Berdir’s picture

Assigned: Berdir » Unassigned
Sheldon Rampton’s picture

While running SimpletestHttpRequestSubscriber, my website met an earlier version of itself and accidentally shared information that altered its own timeline. It then became self-aware and began recursively sending User-Agents back to eliminate Sarah Connor.

tim.plunkett’s picture

Issue summary: View changes

Updated issue summary.

David_Rothstein’s picture

Issue tags: +major and critical issue threshold sweep

Sorry for the noise; I'm told that slashes in tags can break the autocomplete on drupal.org.

David_Rothstein’s picture

And again...

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

arafathhashmi’s picture

Seems not fixed yet, however my host has solved few of the bugs.

bappa.sarkar’s picture

bappa.sarkar’s picture

Status: Closed (fixed) » Active
bappa.sarkar’s picture

Berdir’s picture

Status: Active » Closed (fixed)

There is no need to re-open this issue for that?

Berdir’s picture

Issue summary: View changes

Updated issue summary.

dgtlmoon’s picture

Is this worth backporting? (For those searching - yes adding Guzzle means it now supports HTTP/1.1 and with chunked encoding correctly)

mikeytown2’s picture