This is just so we can track the status of this library, which will be a PHP project outside of Drupal, for all to use, likely a part of Symfony. This is where we will figure out where to contribute. We can't contribute directly to Drupal.org because of licensing issues.

Per today's WSCCI IRC meeting:

11:38 <@Crell> What we'll want to do is be able to pull the accept header out of the request object, pass it to some library, and get back "so this is the mime type, version, and language you want to use. Trust me on this."

Thanks to the alphabet, you can see the spec for these headers at the top of this page: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

The benevolent lsmith77 is in need of a similar thing here:

https://github.com/FriendsOfSymfony/FOSRest/issues/1

which is all about re-implementing what Apache can do so that we have more say in it, and what Apache does is here:

http://httpd.apache.org/docs/2.2/mod/mod_negotiation.html
http://httpd.apache.org/docs/2.2/content-negotiation.html

We are interested in solving this in a general PHP library scope, which to me means something like taking a bastard string like this:

image/gif, image/jpeg, image/pjpeg, application/x-ms-application,
        application/vnd.ms-xpsdocument, application/xaml+xml,
        application/x-ms-xbap, application/x-shockwave-flash,
        application/x-silverlight-2-b2, application/x-silverlight,
        application/vnd.ms-excel, application/vnd.ms-powerpoint,
        application/msword, */*

and parsing that into an ordered hash of some sort so we know what it wants, by way of MIME types in this example. So, we need a way of supplying on the server side an index of types, languages, encoding and (I guess) charset that we support for the given resource (system path), we initialize our Negotiator with that, and it comes back with what we're going to serve up. Then this library is done, and we get to decorate the request object, and pass that on.

There are quite a few libraries for doing content negotiation in PHP:
- https://github.com/winmillwill/BadFaith initiated by @wamilton but seeking someone to continue the work
- http://ptlis.net/source/php/content-negotiation/ it's an old library, see its pros/cons in #6)
- https://github.com/ramsey/mimeparse (supports MIME types only)
- FLOW3 content negotiation implementation which was committed a few months ago
- https://github.com/FriendsOfSymfony/FOSRest/tree/master/Util (partial implementation by @lsmith)
- https://github.com/codeguy/Slim/pull/376 (includes tests)
- https://github.com/symfony/symfony/pull/5711 is a proof of concept content negotiation library which may become part of Symfony (see expression of interested by Fabien Potencier)

Related issues:
- #1833440: Implement partial matcher based on content negotiation MIME type

Files: 
CommentFileSizeAuthor
#30 1505080_30_mimeparse.patch25.54 KBscor

Comments

thx for opening this ticket .. while the ticket is on FOSRest (there is another one related to this on FOSRestBundle) .. i am not entirely sure where the final code for this should live. I want to get this all integrated into Symfony2 Routing, but I also want to ensure that people can use this code without having to feel "pressured" to also use the Routing system. Anyway, we can develop the code where ever and then later on decide where it should best live.

It seems to me that this logic is generic enough that it could/should be its own stand-alone PHP library. Then you simply implement a Symfony Matcher that takes this library as a dependency and works that into its routing logic however it wants. (Drupal and Symfony CMF would probably have different Matchers, for instance, but use the same negotiation library between them.)

I agree that this logic should exist in a stand-along library. It should also not depend on objects or data structures that are specific to Drupal or Symfony2.

One argument for this being stand-alone is that it could be hacked into the D6 services contrib module to allow negotiation of resources in D6 (which has no concept of Symfony2 objects such as Request).

Here's the first body of work I'm comfortable sharing:

https://github.com/winmillwill/BadFaith

I started off trying to build a Negotiator class straight-up, but then realized that in all the implementations I've seen so far, I didn't see any effort to treat the textblobs like 'media-range with parameters' and 'language-tag' as real objects, and they all use a common subset of parsing rules and rules for implicit q-value etc. Going forward, it should be way easier to write an expressive negotiation algorithm.

I'm happy to chat about anything about this if anyone thinks I'm doing something stupid or ugly. I'm happy to take pull requests and also happy to take patches via email. Because of the license issues, I don't want any patches actually posted here, because that could potentially make them intellectual property of the Drupal Assn or something crazy like that and then *pow* no one can use it.

Obviously, if there's real interest in this, I'm happy to move the project to an organization GitHub or similar.

After giving commitment to Crell on helping out on this (together with rmuilwijk) I have been looking at the following things so far:

  • Wikipedia: Read the article on it, kinda basic.
  • Apache: I've read the content negotiation documentation.
  • Apache: the actual mod_negotiation.c source code, hard to grok at once though.

In addition to that I spoke to Lin Clark about this as the negotiation activities are required for hers, and many other parts. I continued looking at wamilton's Github repository and although I'm still a bout to continue running it, it sounds like the right approach to me as well.

Happy to get any feedback on where I can help out. For what I can safely promise now, I can put in say 2x 2 hours per week (evening) and try to help commenting in between other activities. As relatively close, I can sit together with rmuilwijk as well.

Okay, after reading through the source code wamilton started I realized that it still needs quite a lot of work which led me to wonder if there isn't something like this already. Not that it's not worth finishing this effort - to the contrary - I just do see a lot of risk in getting the exact algorithm right and maintaining a extra library in the long run.

I stumbled upon something called "conNeg", a four file library that seems to be stable and had it's last release this year.
http://ptlis.net/source/php/content-negotiation/

After reviewing - without running it yet - the code I have the following pros/cons:

  • pros:
    • GPL 2.1.
    • Stable release history from 2006 and still active.
    • Object oriented, single class and seems very light-weight.
    • Everything is clearly documented.
    • Seems to support language, charset, and mime-type negotiating.
    • Algorithm is already implemented (looks very complex)...
  • cons:
    • Is not composer nor PSR-0 capable, can be though.
    • Although code quality seems good, style does not: mixed spacing everywhere.
    • There are no tests AFAIK.
    • It uses globals like $_SERVER directly, very bad thing for subrequests IMHO.

Please have a look at this as might influence our path forward...

i would certainly contact the author. the issues seem solveable if the author is on board. i do think it would be beneficial to follow the algorithm implemented by mod_negotation since i simply assume it has gotten more scrutiny. however potentially the author is also willing to pool resources on BadFaith.

FYI, there is also another content negotiation library in PHP: http://ptlis.net/source/php/content-negotiation/ - might be useful to see another approach for doing conneg.

It's used in the Drupal based Neologism project/distro, though that doesn't necessarily mean it fits exactly our needs for Drupal core, but worth investigating.

<wamilton> nielsvm: yeah, the library is not far along at all right now, but the issue is that this library isn't just for how Drupal wants to react to Accept-* Header Fields
<wamilton> nielsvm: Content Negotiation doesn't dictate that your algorigthm look anything like the Apache algo
<wamilton> nielsvm: in fact, it's fairly foreseeable that something in the contrib space would want to say something about the algo
<wamilton> nielsvm: I looked at the ptils thing, but it's pretty old and it didn't look that usable to me because there is no separation of concerns that would allow for a pluggable algo
<nielsvm> wamilton: well, agreeing on most you said, and yes, the library doesn't look nicely and properly decoupled but seems lighter and at least has some algo's in it already. I'm just very much wondering what would be the best approach for getting proper content negotiation in core.
<wamilton> nielsvm: my thought has been that the best approach is to have classes that represent the different Accept-* criteria as proper collections of objects of different types and that supply some natural, obvious functionality
<wamilton> nielsvm: the thing here is that we need to get variants from the calling code
<nielsvm> I totally agree on splitting on the different Accept- header types, mixing that into one codemonster seems not sustainable to me.
<nielsvm> wamilton: yeah exactly, so the library dictates Drupal (and any other system using it) what content variation to pick, understood.
<wamilton> nielsvm: for the record, I have no intention of sitting in the way of any approach that people want to work on -- I don't have time to do this now, and if there's an approach people want to work on, that one has my full support
<wamilton> nielsvm: so, there were quite a few missteps I made as I went along -- I don't even know if I have a class for variant and a variant queue yet
<nielsvm> wamilton: that's great news, although I'm not intending to change any direction of efforts as your approach seems the best to me as well but on the other hand we have a 3 months time window.
<nielsvm> wamilton: although my time is limited I'm trying to run with your lib the coming while to ehh... try using/understanding your approach and finishing/adding stubs to get it in proper shape.
<lsmith> btw Flow3 also recently added some code for content negotiation
<lsmith> it sits in their request class though
<lsmith> btw there is also https://github.com/ramsey/mimeparse
<lsmith> http://forge.typo3.org/issues/37561
<lsmith> ben ramsey is an active contributor in the php scene
<nielsvm> ^^, that's mimetypes alone, which def. covers parts of what we need I think.
<wamilton> nielsvm: yeah, only mime and language implement ranges though
<lsmith> btw .. maybe engineyard can be enticed to help
<lsmith> they are quite proud of FRAPI .. which afaik also provides something
<lsmith> and they have a bunch of competent php guys now
<lsmith> i will write a blog post tonight and see if i can get some movement

Core needs to define what interface it wants. The library mentioned in #6 and #8 looks like the way to go for me. Having working algorithms and documentation is a big plus compared to a good looking architecture that might not even work out. That makes it easy to start writing some unit tests and start refactoring to make it PSR-0 compatible and pluggable. What might be interesting is the ability to alter the quality factors or change the entire algorithm e.g. via strategy pattern.

Tests can be derived from the examples given in http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1

I have already written tests for the conNeg library, covering media range matching. I have also contacted the author about whether he would be interested in a collaborative effort. The test results indicate more work is still needed on the implementation. I'd rather like us to define an interface for the public facing part of the library before going ahead. Is there an established way of doing that?

I feel like what is most likely to work is to pronounce some piece of work finished and throw it at the people who will be involved in using this library for determining which Language and which MIME to use. There is actually no precedent to date for a Drupal Entity having variants for anything other than Language, which makes this difficult to understand, but fortunately simple in that it would be Someone Else's Problem. In the end, all this library can do is take in some variants and check them against some Accept-* header fields and eliminate those variants that are unacceptable and provide the calling application some insight into what would be preferred. It is entirely possible that when Drupal 8 ships all it does is call this library and then say "well, maybe the user agent will only accept Finnish Flash, but the closest thing I've got here is English HTML, so that's what I will send."

A major obstacle will be separating out abstract negotiation from what the major browsers will actually send you. It makes no sense whatsoever to handle the concern of properly parsing and prioritizing the stated preference of the user agent at the same level as caveats for handling the way that the major browsers completely negate HTTP spec in this regard. At the same time, it is a tremendous goal to implement the abstract types on the one hand and then support implementations specific to user agent as a plugin, and get it all scoped out and solid in 3 months.

What is more likely to work is acknowledging that the true needs are not actually content negotiation, and redefining the goal to something more realistic. For example, something that only understands how to react to a Symfony object (if that is deemed expedient) and can construct variants from our resources (roughly equivalent to drupal entities) and returns a list that contains only those variants which are acceptable with information about the meaning of the Accept-* fields, and the calling application can do whatever it wants to decide which of the acceptable ones to actually use.

After some discussion at DrupalCon, I think the best approach forward is to define a basic interface that we can use, back it with something simple based on the Symfony request object for now (as we are already doing), and punt. We will still need to figure out how to add additional types to the mapping, but for core's purposes for right now the important thing is a configurable exact-match library in the DIC with a defined interface. Once we have that, anything else could happen in contrib if needs be. So let's focus on that.

lsmith pointed me to this partially-written implementation that could be borrowable: https://github.com/FriendsOfSymfony/FOSRest/tree/master/Util

Another conneg PHP library was announced today which is part of the Slim framework: https://github.com/codeguy/Slim/pull/376 (it comes with tests too)

guys today crell presented and asked about interest from sf2 guys into this, i am reading this responses but seems like i need more solid direction. Anybody wants to take me by the hand on this, i want to take a deeper look.

i've spent a bit of time looking through the various emerging PHP libraries (the Slim pull req & FOSRest, as well as BadFaith), and i have some concerns that they'll be compatible with the RouteCollection-reduction approach necessitated by the PartialMatcher; most of them seem to be interested (unsurprisingly) in producing a best match, rather than removing "incompatible" matches.

i think it'd work if we could build up a list of acceptable mime types by looking at what's in the RouteCollection, ask the library to give us the best one, then reduce the RouteCollection to only those routes that match that. is this what we're thinking?

That seems like a reasonable approach, yes. In practice, between MIME and METHOD I can't imagine we'd have more than one route left (pending the RouteEnhancer discussion in #1784040: NestedMatcher may be too slow and non-deterministic), but either one of those could individually have multiple valid routes. To wit:

GET application/json-ld+json
PUT application/json-ld+json
GET application/atom
PUT application/atom

A MIME filterer would find, correctly, two "best" matches for a given request, as would MIME, but together they'd find only one. so yes, I think "here's a request, here's a list of possible mimes I can handle, which one should I use", which could still match multiple routes, sounds like a good approach.

I'll spend some time next week looking at this issue, so if anyone has made some progress on this in the meantime, please report here so I don't end up duplicating someone else's work. if you're really keen on working with this also that's fine, I can use any help. ping me on #drupal-wscci.

Assigned:wamilton» Unassigned

Sorry, I shouldn't have left this assigned to me. I have no time whatsoever to work on this and can only help by getting out of the way. Please let me know via pm if I'm passively blocking anything out of neglect. As I understand it, fabpot wants to pull this into Symfony straight up, and I said in that thread what I'm saying here: good luck and good hunting folks.

Fabien Potencier said he is planning to take over the work on the badFaith library and make it part of Symfony. According to lsmith on IRC, ETA is 2-3 weeks.

2-3 weeks was an educated guess .. best confirm with fabien when he plans to work on this.

Issue summary:View changes

Updating with a reminder to avoid licensing issues by contributing elsewhere and organizing here.

Assigned:Unassigned» scor

Fabien Potencier has asked @jfsimon (a Sensio employee) to work on the content negotiation for Symfony. He is experimenting in a pull request on Github which I'll keep an eye on. No ETA yet.

I've also updated the Issue Summary to list all the libraries which have been mentioned in this issue (including this recent PR).

Component:wscci» base system

The pull request mentioned in #24 still needs Drupal eyes. :-)

Refiling...

Issue summary:View changes

add list of conneg libs

The content negotiation PR for symfony hasn't seen much progress in the last couple of weeks, and the example showing how to use it is currently not working. A bug in the current way Symfony parses HTTP Accept header was reported and is being fixed in a separate PR.

In today's Drupal+JSON-LD call, @Crell had some concerns about relying on Symfony's POC conneg PR as we're not yet 100% sure we can use it in our NestedMatcher for routing. Our priority now is to write a partial matcher using the new Symfony conneg library to see whether it can work for us or not. I'll be sprinting on that this weekend at BADCamp.

Will Accept Headers be configurable for each resource plugin's route[1]?

1. #1816354: Add a REST module, starting with DELETE

Title:Content Negotiation for Accept Headers[META] Content Negotiation for Accept Headers

I've started a new issue specifically for the partial matcher that @Crell wanted to have as part of Drupal's nested matcher: #1833440: Implement partial matcher based on content negotiation MIME type.

I'd like to keep this issue open here to track Content Negotiation in general in both Drupal and Symfony. The partial matcher is independent from the issue of content negotiation and can be worked on / committed separately, even without a good content negotiation lib. (of course it will benefit from it when we have it!). Also, the partial matcher is not the only place where content negotiation could potentially take place, a route which does not define a set of formats could well rely on the controller to do conneg.

Issue summary:View changes

update status of Symfony conneg lib

During BADCamp, pdrake looked at the symfony PR for content negotiation and made some fixes which are waiting to be merged with the main PR.

Status:Active» Needs work
StatusFileSize
new25.54 KB

There hasn't been any activity in the symfony PR in the last few weeks, so I've started to work on integrating the mimeparse library in Drupal 8. We could still switch to Symfony's new content negotiation later on if it is merged on time before Drupal code freeze, but in the meantime we'll have a more robust Accept header parser. mimeparse conforms to PSR-0, has composer support and includes tests.

I'm attaching an initial patch which so far only includes the mimeparse library. I was hoping switching to mimeparse would be trivial but mimeparse can only provide the best match given an Accept header string and a list of supported mime types. We're handling this differently in Drupal/Symfony so I'll have to extend mimeparse to provide a list of acceptable types like Symfony does.

Hi, i'm the author of http://ptlis.net/source/php/content-negotiation/ and i'm eager to make the required changes required to allow clean integration of it into Drupal. Stefan contacted me several months ago but it was caught by an over-zealous spam filter and I only caught it a few days ago.

In it's current state it is far from perfect and has been neglected somewhat of late, but it has already been adopted a reasonable number of projects including Neologism (as scor mentioned) and several semantic web projects (including the semantic web dogfooding site, http://data.semanticweb.org/).

The four problems I see from reading through this thread are:
* Requirement for PSR-0 support.
* Direct use of $_SERVER superglobals.
* Indentation/consistancy issues.
* Lack of tests (and fixing of any bugs the tests reveal).

None of these are difficult issues to resolve, i've created a github repository for the library (https://github.com/ptlis/conNeg) and once i've received Sefan's pull request (support for PSR-0 & tests) i'll take a look at what further changes are required and integrate some tests I was working on a while ago.

I would still rather we use the upstream code from Symfony.

I spoke with fabpot. The developer is busy right now, so doesn't have much time for that PR. If someone from Drupal wants to fully take over, they can fork the PR and make a new PR to continue running with it. Please do so only if you're going to carry it to completion, though. :-) pdrakeweb, you up for it?

@ptlis since you acknowledge that there is some work to do .. would you be interested in collaborating with the Symfony2 people to finish https://github.com/symfony/symfony/pull/5711 ?

@Crell, possibly. Is there a clear functional spec somewhere that I can work toward (ie a way I can tell exactly what I'm getting into if I agree to take this)?

pdrake: I don't think there is a formal spec. The abstract is "support Apache mod_negotiation-style algorithm in Symfony matchers". Precise details, I'm not sure. You've looked at the Symfony PR more closely than I have. :-)

@pdrake, if you're interested in a collaborative effort, the conNeg library already implements most of the tricky parts of the Apache negotiation algorithm. The parts of that algorithm that aren't supported was a design decision because conNeg was originally intended to be general-purpose and non-prescriptive in what mechanism is used to handle negotiation & simply does the field parsing and selection of the best match on a single field.

@ptlis I'm always interested in collaborative efforts - I just wish there were a clear direction here. I know BadFaith was started and then dev was halted, a Symfony PR was started, then halted, there's conNeg... so many different directions, each with their advantages and disadvantages.

@Crell, are you (or could you find someone integral in putting this in core who is) interested in defining a simple PHP interface and/or abstract class as you see it being used, along with a couple really simple tests? @ptlis and I could then investigate which of the available options could most readily implement that interface... mostly, I want to be sure that if I put some effort into this, I'm taking it a direction that definitively meets our needs.

FYI, in the interest of keeping this issue focus on Symfony conneg, I've started a new issue for our stop gap measure: #1857138: Improve Content Negotiation with dedicated library (mimeparse).

@ptlis: thanks for chiming in. Bear in mind that we're hoping to use Symfony's improved conneg in the long term, so us using mimeparse or any other library is only a stop gap solution. It's great if you can improve ptlis wrt PSR-0 etc. Contributing to Symfony's content negotiation would be awesome, since that's what we're more likely to use in the end. Maybe there are piece of your code which could be contributed or adapted to Symfony. thanks.

Priority:Normal» Major

So as of #1969870-35: REST export view should default to JSON, we now have a concrete example of where we're violating HTTP specification due to the lack of content negotiation. The issue there is that for a REST export view, we have a route that can only respond in 'application/json' (or possibly other formats, but not 'text/html'). Visiting the URL in a browser, the browser sends an Accept header along the lines of text/html, ..., */*. The desired behavior in this case is to return the JSON. However, because we don't yet have this issue solved, we don't distinguish between text/html, */* and text/html, so the fix/hack in that issue was to return JSON even for the latter case, in violation of HTTP.

Therefore, escalating this issue to major.

The conneg library by ptlis (https://github.com/ptlis/conNeg) now complies with PSR-0 and passes tests derived from the examples of RFC 2616 Section 14.1. among others written by the author. It returns the best match as well as a list of matches ordered by qualifier. It also deals with Accept-Charset and Accept-Encoding headers. If we do not want to use it as a third party library and rather integrate it with Symfony; no problem, it's licensed under LGPL.

Stefan: I'm not sure we have time to switch to a new library at this point. However, if you're interested in giving it a shot I'm open to seeing a proof of concept. Is it in Packagist?

It's not in Packagist, though that's a formality. The required composer.json is there. From reading this issue it sounds like we need better support for content negotiation and the ptlis/connecg library woudl provide that level of functionality. There has been new development recently, it could definitely help us and others move forward. I could provide a patch. Shall I reopen http://drupal.org/node/1833440 ?

Assigned:scor» Unassigned

please open a new issue on ptlis. note there is also an issue to use mimeparse, but I don't really care for one or the other, especially if you think ptlis is better.

Stefan: No, don't reopen. Make a new issue and link it from a comment here. Thanks.

I'm not sure we have time to switch to a new library at this point.

@Crell: what do you mean by "switch"? We have no suitable library for this currently in Drupal and we need one. What would we be switching from?

From the half-arsed quick-n-dirty-but-works-for-now class we have been running for the past year. :-) I've no love for it, but it does what we need right now. At this point I don't consider switching to something more robust a release-blocker, hence my caution. I'm not against it, just cautious.

but it does what we need right now

Not really. Per #40, without the ability to get an array of matches, it's making us violate HTTP spec in Views REST export displays. And will likely cause similar problems with #1855260: Page caching broken by accept header-based routing. Whether those problems end up being release blocking or not remains to be seen, but I think a strong case could be made for getting a suitable library in that fixes them.

As I said, I'm in no way opposed to us getting a better library in place. Volunteers welcome. :-)

I opened #2026431: Make ContentNegotiation a "internal" service, used only by the router, so that core or contrib can implement real negotiation, which I think everyone here should be interested in. It's based on a conversation I had with @Crell 2 weeks ago. (@Crell: so sorry for the delay in doing this.)

likewise people following this issue might also want to follow the mimeparse issue which is has a patch waiting for review: #1857138: Improve Content Negotiation with dedicated library (mimeparse).

Issue summary:View changes

add related issue #1833440

Both Symfony's FOSRestBundle and Slim (via Middleware) use the Negotiation lib (https://github.com/willdurand/Negotiation) for a while now.
It works, it is extensively unit tested, and it is maintained.

Disclaimer: I am the developer behind Negotiation..

Hi folks,

if the implementation is still under discussion i'd like to chip in again and say that the library that I maintain has recently undergone a total re-write; ptlis/conneg is now PSR-1, PSR-2 and PSR-4 compliant, available via composer/packagist as well as being highly modular and extensible. I also maintain a Symfony2 bundle that I am continuing to improve as I work towards a version 1 release.

Currently the documentation for the bundle is more complete than for the underlying conneg library but this is something that will be remedied this week.

If there is interest in using the library I maintain I would be glad to do the contribute to the integration work or complete it myself if a high-level picture of how it should behave can be provided.