#311 | 2831944-3-311.patch | 166.44 KB | alexpott |
|
#311 | 310-311-interdiff.txt | 12.28 KB | alexpott |
#310 | 2831944-3-310.patch | 166.67 KB | alexpott |
|
#310 | 309-310-interdiff.txt | 3.21 KB | alexpott |
#309 | 2831944-3-309.patch | 165.59 KB | alexpott |
|
#309 | 306-309-interdiff.txt | 3.06 KB | alexpott |
#306 | interdiff-2831944-305-306.txt | 32.21 KB | phenaproxima |
#306 | 2831944-306.patch | 164.51 KB | phenaproxima |
|
#305 | 2831944-3-305.patch | 160.76 KB | alexpott |
|
#305 | 299-305-interdiff.txt | 19.22 KB | alexpott |
#299 | 2831944-2-299.patch | 157.4 KB | alexpott |
|
#299 | 294-299-interdiff.txt | 1016 bytes | alexpott |
#294 | 2831944-2-294.patch | 157.31 KB | alexpott |
|
#294 | 293-294-interdiff.txt | 11.18 KB | alexpott |
#293 | interdiff-2831944-291-293.txt | 2.64 KB | phenaproxima |
#293 | 2831944-293.patch | 157.57 KB | phenaproxima |
|
#291 | interdiff-2831944-289-291.txt | 6.11 KB | phenaproxima |
#291 | 2831944-291.patch | 157.48 KB | phenaproxima |
|
#289 | interdiff-2831944-287-289.txt | 2.49 KB | phenaproxima |
#289 | 2831944-289.patch | 153.94 KB | phenaproxima |
|
#287 | interdiff-2831944-281-287.txt | 8.46 KB | phenaproxima |
#287 | 2831944-287.patch | 154.01 KB | phenaproxima |
|
#281 | interdiff-2831944-278-281.txt | 2.06 KB | phenaproxima |
#281 | 2831944-281.patch | 153.57 KB | phenaproxima |
|
#278 | interdiff-2831944-275-278.txt | 3.55 KB | phenaproxima |
#278 | 2831944-278.patch | 153.57 KB | phenaproxima |
|
#275 | interdiff-2831944-273-275.txt | 11.41 KB | phenaproxima |
#275 | 2831944-275.patch | 149.91 KB | phenaproxima |
|
#273 | interdiff-2831944-273-fix.txt | 1.01 KB | phenaproxima |
#273 | interdiff-2831944-263-273.txt | 3.31 KB | phenaproxima |
#273 | 2831944-273.patch | 150.69 KB | phenaproxima |
|
#273 | 2831944-273-FAIL.patch | 150.66 KB | phenaproxima |
|
#263 | interdiff-2831944-258-263.txt | 5.47 KB | phenaproxima |
#263 | 2831944-263.patch | 149.82 KB | phenaproxima |
|
#258 | interdiff-2831944-257-258.txt | 1.54 KB | phenaproxima |
#258 | 2831944-258.patch | 149.64 KB | phenaproxima |
|
#257 | interdiff-2831944-254-257.txt | 7.78 KB | phenaproxima |
#257 | 2831944-257.patch | 149.31 KB | phenaproxima |
|
#254 | interdiff-2831944-251-254.txt | 588 bytes | phenaproxima |
#254 | 2831944-254.patch | 148.16 KB | phenaproxima |
|
#251 | interdiff-2831944-250-251.txt | 2.1 KB | phenaproxima |
#251 | 2831944-251.patch | 148.17 KB | phenaproxima |
|
#250 | interdiff-2831944-244-250.txt | 25.28 KB | phenaproxima |
#250 | 2831944-250.patch | 148.01 KB | phenaproxima |
|
#244 | interdiff-2831944-240-244.txt | 29.24 KB | phenaproxima |
#244 | 2831944-244.patch | 145.14 KB | phenaproxima |
|
#240 | 2831944-240.patch | 139.24 KB | phenaproxima |
|
#239 | interdiff-2831944-235-239.txt | 24.66 KB | phenaproxima |
#239 | 2831944-239.patch | 138.43 KB | phenaproxima |
|
#235 | interdiff-2831944-234-235.txt | 1.96 KB | phenaproxima |
#235 | 2831944-235.patch | 137.73 KB | phenaproxima |
|
#234 | interdiff-2831944-232-234.txt | 3.97 KB | chr.fritsch |
#234 | 2831944-234.patch | 137.73 KB | chr.fritsch |
|
#232 | interdiff-2831944-231-232.txt | 4.71 KB | chr.fritsch |
#232 | 2831944-232.patch | 137.5 KB | chr.fritsch |
|
#231 | interdiff-2831944-228-231.txt | 22.97 KB | chr.fritsch |
#231 | 2831944-231.patch | 136.05 KB | chr.fritsch |
|
#228 | interdiff-2831944-221-228.txt | 6.22 KB | phenaproxima |
#228 | 2831944-228.patch | 136.79 KB | phenaproxima |
|
#221 | interdiff-2831944-217-221.txt | 1.83 KB | phenaproxima |
#221 | 2831944-221.patch | 136.62 KB | phenaproxima |
|
#217 | interdiff-2831944-216-217.txt | 3.13 KB | phenaproxima |
#217 | 2831944-217.patch | 136.28 KB | phenaproxima |
|
#216 | interdiff-2831944-214-216.txt | 1.65 KB | phenaproxima |
#216 | 2831944-216.patch | 133.67 KB | phenaproxima |
|
#214 | interdiff-2831944-212-214.txt | 2.47 KB | phenaproxima |
#214 | 2831944-214.patch | 132.29 KB | phenaproxima |
|
#212 | interdiff-2831944-211-212.txt | 9.18 KB | phenaproxima |
#212 | 2831944-212.patch | 130.77 KB | phenaproxima |
|
#211 | interdiff-2831944-207-210.txt | 4.37 KB | chr.fritsch |
#211 | 2831944-210.patch | 130.44 KB | chr.fritsch |
|
#207 | interdiff-2831944-205-207.txt | 6.85 KB | chr.fritsch |
#207 | 2831944-207.patch | 130.62 KB | chr.fritsch |
|
#205 | interdiff-2831944-202-204.txt | 10.88 KB | phenaproxima |
#205 | 2831944-204.patch | 130.62 KB | phenaproxima |
|
#202 | interdiff-2831944-200-202.txt | 30.91 KB | phenaproxima |
#202 | 2831944-202.patch | 128.01 KB | phenaproxima |
|
#200 | interdiff-2831944-198-200.txt | 1.86 KB | chr.fritsch |
#200 | 2831944-200.patch | 122.32 KB | chr.fritsch |
|
#198 | 2831944-198.patch | 121.71 KB | chr.fritsch |
|
#198 | interdiff-2831944-195-198.txt | 5.12 KB | chr.fritsch |
#196 | Screen Shot 2018-04-23 at 8.44.48 PM.png | 231.41 KB | robpowell |
#196 | 2831944.mp4 | 1.09 MB | robpowell |
#195 | interdiff-2831944-192-195.txt | 739 bytes | chr.fritsch |
#195 | 2831944-195.patch | 121.52 KB | chr.fritsch |
|
#192 | interdiff-2831944-190-192.txt | 12.41 KB | chr.fritsch |
#192 | 2831944-192.patch | 121.48 KB | chr.fritsch |
|
#190 | interdiff-2831944-185-190.txt | 2.62 KB | phenaproxima |
#190 | 2831944-190.patch | 121.22 KB | phenaproxima |
|
#185 | interdiff-2831944-184-185.txt | 4.78 KB | phenaproxima |
#185 | 2831944-185.patch | 120.71 KB | phenaproxima |
|
#184 | interdiff-2831944-183-184.txt | 18.67 KB | phenaproxima |
#184 | 2831944-184.patch | 119.64 KB | phenaproxima |
|
#183 | interdiff-2831944-179-183.txt | 1.29 KB | chr.fritsch |
#183 | 2831944-183.patch | 118.08 KB | chr.fritsch |
|
#179 | interdiff-2831944-177-179.txt | 4.11 KB | chr.fritsch |
#179 | 2831944-179.patch | 118.03 KB | chr.fritsch |
|
#177 | interdiff-2831944-172-177.txt | 25.43 KB | chr.fritsch |
#177 | 2831944-177.patch | 118.01 KB | chr.fritsch |
|
#172 | interdiff-2831944-169-172.txt | 6.86 KB | chr.fritsch |
#172 | 2831944-172.patch | 113.66 KB | chr.fritsch |
|
#169 | interdiff-2831944-167-169.txt | 8.64 KB | chr.fritsch |
#169 | 2831944-169.patch | 113.62 KB | chr.fritsch |
|
#167 | interdiff-2831944-163-167.txt | 5.85 KB | chr.fritsch |
#167 | 2831944-167.patch | 110.76 KB | chr.fritsch |
|
#163 | interdiff-2831944-162-163.txt | 14.86 KB | phenaproxima |
#163 | 2831944-163.patch | 110 KB | phenaproxima |
|
#162 | interdiff-2831944-159-162.txt | 26.98 KB | chr.fritsch |
#162 | 2831944-162.patch | 109.83 KB | chr.fritsch |
|
#159 | interdiff-2831944-154-159.txt | 51.21 KB | phenaproxima |
#159 | 2831944-159.patch | 105.75 KB | phenaproxima |
|
#154 | interdiff-2831944-153-154.txt | 5.24 KB | phenaproxima |
#154 | 2831944-154.patch | 86.46 KB | phenaproxima |
|
#153 | interdiff-2831944-150-153.txt | 63.71 KB | phenaproxima |
#153 | 2831944-153.patch | 86.27 KB | phenaproxima |
|
#150 | interdiff-2831944-147-150.txt | 12.4 KB | chr.fritsch |
#150 | 2831944-150.patch | 86.47 KB | chr.fritsch |
|
#148 | interdiff-2831944-142-147.txt | 1.06 KB | chr.fritsch |
#147 | 2831944-147.patch | 81.55 KB | chr.fritsch |
|
#147 | 2831944-147.patch | 81.55 KB | chr.fritsch |
|
#142 | interdiff-2831944-136-142.txt | 6.59 KB | chr.fritsch |
#142 | 2831944-142.patch | 81.07 KB | chr.fritsch |
|
#136 | interdiff-2831944-134-136.txt | 9.11 KB | chr.fritsch |
#136 | 2831944-136.patch | 80.36 KB | chr.fritsch |
|
#134 | interdiff-2831944-128-134.txt | 13.41 KB | chr.fritsch |
#134 | 2831944-134.patch | 79.04 KB | chr.fritsch |
|
#131 | iframe-approach.txt | 9.36 KB | chr.fritsch |
#123 | interdiff-2831944-120-123.txt | 3.08 KB | chr.fritsch |
#123 | 2831944-123.patch | 74.39 KB | chr.fritsch |
|
#121 | oEmbed_video.gif | 1.62 MB | chr.fritsch |
#120 | interdiff-2831944-119-120.txt | 9.38 KB | chr.fritsch |
#120 | 2831944-120.patch | 74.49 KB | chr.fritsch |
|
#119 | interdiff-2831944-116-119.txt | 12.18 KB | chr.fritsch |
#119 | 2831944-119.patch | 73.79 KB | chr.fritsch |
|
#116 | interdiff-2831944-115-116.txt | 18.8 KB | chr.fritsch |
#116 | 2831944-116.patch | 73.74 KB | chr.fritsch |
|
#115 | interdiff-2831944-113-115.txt | 12.25 KB | chr.fritsch |
#115 | 2831944-115.patch | 78.33 KB | chr.fritsch |
|
#113 | interdiff-2831944-103-113.txt | 45.27 KB | chr.fritsch |
#113 | 2831944-113.patch | 76.39 KB | chr.fritsch |
|
#103 | interdiff-2831944-98-103.txt | 29.33 KB | chr.fritsch |
#103 | 2831944-103.patch | 73.56 KB | chr.fritsch |
|
#98 | interdiff-2831944-96-98.txt | 5.36 KB | chr.fritsch |
#98 | 2831944-98.patch | 71.49 KB | chr.fritsch |
|
#96 | interdiff-2831944-92-95.txt | 7.31 KB | chr.fritsch |
#96 | 2831944-95.patch | 71.67 KB | chr.fritsch |
|
#92 | interdiff-2831944-91-92.txt | 1.8 KB | chr.fritsch |
#92 | 2831944-92.patch | 72.53 KB | chr.fritsch |
|
#91 | interdiff-2831944-87-91.txt | 22.6 KB | chr.fritsch |
#91 | 2831944-91.patch | 72.53 KB | chr.fritsch |
|
#87 | interdiff-2831944-73-87.txt | 14.68 KB | chr.fritsch |
#87 | 2831944-87.patch | 73.07 KB | chr.fritsch |
|
#73 | 2831944-73.patch | 68.41 KB | chr.fritsch |
|
#71 | 2831944-71.patch | 70.11 KB | chr.fritsch |
|
#68 | interdiff-2831944-67-68.txt | 5.07 KB | chr.fritsch |
#68 | 2831944-68.patch | 69.97 KB | chr.fritsch |
|
#67 | interdiff-2831944-65-67.txt | 6.25 KB | chr.fritsch |
#67 | 2831944-67.patch | 68.18 KB | chr.fritsch |
|
#65 | interdiff-2831944-64-65.txt | 3.13 KB | chr.fritsch |
#65 | 2831944-65.patch | 66.95 KB | chr.fritsch |
|
#64 | interdiff-2831944-63-64.txt | 3.27 KB | chr.fritsch |
#64 | 2831944-64.patch | 67.52 KB | chr.fritsch |
|
#63 | interdiff-60-63.txt | 13.82 KB | seanB |
#63 | 2831944-63.patch | 67 KB | seanB |
|
#60 | interdiff-2831944-58-60.txt | 15.69 KB | chr.fritsch |
#60 | 2831944-60.patch | 67.34 KB | chr.fritsch |
|
#58 | interdiff-2831944-51-58.txt | 13.38 KB | chr.fritsch |
#58 | 2831944-58.patch | 64.87 KB | chr.fritsch |
|
#52 | interdiff-2831944-42-51.txt | 16.4 KB | chr.fritsch |
#52 | 2831944-51.patch | 64.98 KB | chr.fritsch |
|
#42 | interdiff-2831944-40-42.txt | 6.49 KB | chr.fritsch |
#42 | 2831944-42.patch | 61.02 KB | chr.fritsch |
|
#40 | interdiff-2831944-37-40.txt | 53.59 KB | phenaproxima |
#40 | 2831944-40.patch | 59.9 KB | phenaproxima |
|
#37 | interdiff-2831944-36-37.txt | 11.77 KB | chr.fritsch |
#37 | 2831944-37.patch | 59.69 KB | chr.fritsch |
|
#36 | interdiff-2831944-34-36.txt | 20.25 KB | chr.fritsch |
#36 | 2831944-36.patch | 55.71 KB | chr.fritsch |
|
#34 | interdiff-2831944-32-34.txt | 2.64 KB | chr.fritsch |
#34 | 2831944-34.patch | 55.66 KB | chr.fritsch |
|
#32 | interdiff-2831944-31-32.txt | 11.14 KB | chr.fritsch |
#32 | 2831944-32.patch | 57.83 KB | chr.fritsch |
|
#31 | interdiff-2831944-30-31.txt | 12.01 KB | chr.fritsch |
#31 | 2831944-31.patch | 53.33 KB | chr.fritsch |
|
#30 | interdiff-2831944-26-30.txt | 38.52 KB | chr.fritsch |
#30 | 2831944-30.patch | 50.04 KB | chr.fritsch |
|
#26 | interdiff-2831944-10-26.txt | 6.04 KB | chr.fritsch |
#26 | 2831944-26.patch | 36.04 KB | chr.fritsch |
|
#17 | youtube_media_type.gif | 1.59 MB | slashrsm |
#10 | youtube_media_type1.mp4 | 2.64 MB | marcoscano |
#10 | interdiff-4-10.txt | 40.36 KB | marcoscano |
#10 | 2831944-10.patch | 35.92 KB | marcoscano |
|
#4 | 2831944_4.patch | 30.16 KB | slashrsm |
|
Comments
Comment #2
seanBThe oEmbed option seems like the best approach to me, and delivers a whole bunch of video providers. It seems like a nice feature to be able to configure the list of supported providers, so maybe by default allow youtube and vimeo and add a settings page to allow extra providers?
Comment #3
tobiasbHint: Perhaps useful lib https://github.com/essence/essence
Comment #4
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedWIP proof of concept. Attached patch adds a media handler for oEmbed and some configuration to be easy to test. There is also a formatter that will display the resource using HTML code returned by the oEmbed provider.
We need to decide if we want to implement oEmbed integration by ourselves or use an external library.
I'd also like to add ability to limit to certain providers, but this is a bit tricky since the official database of providers isn't really reliable. We could achieve that by offering site-builders to enter URL wildcards/regular expressions, but that isn't the best UX. Any thoughts about that would be welcome.
Obviously this depends on #2831274: Bring Media entity module to core as Media module.
Comment #5
tedbowOne thing about just making oEmbed formatters for the existing link field type is that now all existing link fields will have this option. I think this will be very confusing to site builders.
Should we make another field type oEmbed Link that just extends the link field type? Then we could limit the oEmbed link formatters to this oEmbed Link field type and maybe update existing link formatters to work with it too. That would also allow use to remove options that don't make sense. Internal links for example. Would oEmbed be used for internal links. I know it would possible but seems very edge case.
Comment #6
Gábor HojtsyI agree it sounds like a UX improvement if we can remove unnecessary extra options.
Comment #7
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedI'd rather use
isApplicable()
to solve that. Adding a bunch of new field types will also create confusion and UX problems (speaking generally, not just related to this issue).The reason I didn't do that when initially working on this patch is that I've been trying to get PoC through the door before having to leave the sprint on the last day.
Comment #9
Dave ReidI don't think there's an actual reason to have a separate Link field type. I think site builders would be confused at that separation, and in my experience are already good at just creating 'LInks' fields. FYI the URL Embed module already provides this formatter for link field types.
Comment #10
marcoscanoOK So let's see if we can move this forward.
In this patch:
- Re-rolled #4
- Made all modifications necessary to adapt it to the API changes that ended up happening in #2831274: Bring Media entity module to core as Media module
- Used
::isApplicable
to only expose the oEmbed formatter to fields from media entities- Moved default config entities from standard profile to the media module, along with some fine-tuning in the default display settings.
- Finished implementing a first version of the
allowed_providers
configuration setting- Fixed some minor bugs, documentation and CS improvements
Remaining tasks:
- Validate the main approach (for example, decide if we want an external library to handle the oEmbed integration, etc)
- Decide what to do with the "Allowed Providers" validation. Due to the inconsistency of the oEmbed database, this validation will probably give false failures quite often. We can try to add workarounds for each specific case (an example is implemented in this patch for the YouTube provider), bit it's ugly and hard to maintain. Should we just skip this functionality entirely and always allow all providers?
- Define a default thumbnail icon (for now I'm using the "no-thumbnail.png" icon)
- Write tests
A quick demo of current 8.4.x HEAD with this patch applied:
youtube_media_type1.mp4
Comment #11
seanBJust watched the video, this looks AMAZING! Only wondering if Youtube should be called Video (with only the youtube provider enabled). This way you can easily add other video provider without having to add a new media type for each provider.
Will look at the code later, but marcoscano thanks for picking this up!
Comment #13
marcoscanotestbot: I know I know.. let's validate the approach first :)
Comment #16
martin107 CreditAttribution: martin107 as a volunteer commentedI like the idea behind this issue.
everyone_who_has_worked_on_this++
I see from the remaining tasks in #10 ... there are still things to decide
So while I am itching to fix this ... i will just post a code review comment as things might change radically.
This last patch introduces new code with calls to drupal_set_message().... this has become outdated.
New code should use the messenger service see-
#2278383: Create an injectible service for drupal_set_message()
Comment #17
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedConverted screencast into animated gif.
Comment #18
slashrsm CreditAttribution: slashrsm commentedI started gathering info about 3rd party oEmbed libraries: https://docs.google.com/document/d/111zSRYcnOLz9qlJ1RiP2ZzYlCeMPm1DhieX-...
We can use this info to eventually decide whether to use one of them or to implement our own.
Comment #19
vijaycs85+1 on that!
Comment #20
chr.fritschComment #21
phenaproximaRe-tagging.
Comment #22
phenaproximaThis patch is pretty nice, and a big selling point for Media in core. Great work so far.
I found a relatively minor suite of issues -- nothing major -- but my overarching concern is that I find the OEmbed service, and its interface, kind of confusing and a little haphazard. Granted, I don't know how oEmbed works, and I'm unfamiliar with all the terminology so maybe my feeling is totally invalid. However, I wonder if we can't maybe tighten up that API somewhat and do a better job of sealing the abstraction.
Nit: "youtube" should be "YouTube". Also, language like "...such as YouTube videos" implies that maybe the name and label of this media type should not be YouTube. How about "Remote video"?
These seem like they should be constants, not properties.
$this->providers could (theoretically) be an empty array, so this should be
if (!isset($this->providers))
Won't Guzzle normally throw an exception on a non-200 status code? If so, can we catch that exception and log it instead? It might give more detailed information that would be helpful for troubleshooting.
Namespaced functions are, if I remember correctly, specific to PHP 5.6, so we would be violating Drupal's minimum PHP version requirement by doing this. Can we just use the Drupal\Component\Serialization\Json class to decode the response?
Whoa.
Can we possibly store this kind of info in config, say, and make use of that? Hard-coding it is icky-sauce.
Again, would it be preferable here to take advantage of HTTP exceptions thrown by Guzzle?
This seems potentially quite expensive. Can we cache it?
Should we open a follow-up?
Should we leverage Guzzle HTTP exceptions here?
Once again, not sure we should be using namespaced functions instead of \Drupal\Component\Serialization\Json.
This smells like an ArrayPI. I kinda wish we just had a domain object to encapsulate all of this stuff. (One possible advantage of which is that such an object could implement CacheableDependencyInterface, for advanced caching possibilities...)
Wait -- the values are irrelevant? Why not simply pass a simple array of provider names, then?
Um...why?
Nit: This could be condensed to:
return $this->getMetadata($media, 'thumbnail_local') ?: $this->getMetadata($media, 'thumbnail_uri')
Comment #23
phenaproximaI'm picking this issue back up. I'd like to land it in 8.5.0; it'll be a highlight of the Media Initiative's progress.
What I think we are blocked on, at the moment, is two things:
So let's open the bike shed and hear from media folk. Would you prefer a third-party library, or rolling our own?
My preference is to use a third-party library. PIE FTW. Lower maintenance burden for us. The library (whichever one we choose) will be doing one thing, and doing it well. That is as it should be.
Let's hear from others.
Comment #24
chr.fritschI also prefer a third-party lib. For the same reason like @phenaproxima. Less code to write and maintain for us.
I think embed is most widely used and best maintained.
Comment #25
seanBThe libraries posted in the Google doc in #18 seem to be well supported and I don't think it would be a good idea to implement and support our own implementation if we don't have to. So big +1 to using a existing library.
We could debate on which one to choose, I don't have a strong preference at the moment, but I lean towards https://github.com/oscarotero/Embed since that seems to have the biggest community support (more contributors, installs, commits etc).
Comment #26
chr.fritschI moved the config files into the standard profile and renamed the youtube type into video.
Comment #27
chr.fritschI looked a bit deeper into the proposed libraries and now i am not 100% convinced anymore that it's a good idea to use one of them, because they are not fitting well into our needs.
Some of them are just using static functions and all of them are not providing a method to get the available providers.
Beside that, the stuff @slashrsm already implemented is super cool and works quite good.
Comment #28
phenaproximaI have some architectural concerns about this patch. I also find the documentation sketchy; there's no sense of what a "resource" or "provider" is, so I think that will need to be improved. Also, tests!
Should this be https? Also, should we maybe make this a constructor parameter so it can be overridden by a service provider (and be more amenable to unit testing)?
What's the difference between a "fetched" and "discovered" oEmbed resource? Can the doc comments here explain that distinction?
This service should use UseCacheBackendTrait, which means the cache backend can be optional.
$this->provider could theoretically be an empty array, so this should be changed to a
if (isset($this->providers))
check instead.Why does the cache key need to be a static property?
Won't get() throw an exception if the response code is not 2xx? If so, we should catch that instead so we can potentially log more detailed error information.
Function namespacing is a PHP 5.6+ thing. We should use \Drupal\Component\Serialization\Json::decode() here instead.
Can there be an example here, for clarity?
The cache key and cache lifetime probably do not need to be static properties.
This definitely needs a comment to explain what's going on here.
I believe \Drupal\Component\Utility\Html has some static methods that may prove helpful here.
Needs a follow-up issue or something.
Let's use \Drupal\Component\Serialization\Json::decode() instead.
Er, wat? If the format is XML, we encode it as JSON, then decode it, and expect an array? I don't get it. Definitely needs a comment.
This seems like an ArrayPI to me. Maybe we should implement a simple value object to encapsulate the provider info instead.
If the values are irrelevant, this should be a simple array of codified provider names, rather than an associative array.
Considering the doc comment, I find this method name a little odd. How about isProvider($url) instead?
I'm not sure, but it seems like these two methods can be combined. Not sure what one does that the other does not.
This message could be better. How about something like "The %provider provider is not allowed", with %provider being replaced at validation time with the actual name of the provider?
Follow-up?
Not sure this should be here :)
There's no need for the closure here. Just this will work:
$allowed_providers = array_map('hex2bin', $source_config['allowed_providers']);
Wrong class name.
Comment #29
phenaproximaMore thoughts that occur to me...
Comment #30
chr.fritschThanks for the extensive reviews @phenaproxima
I tried to implement the most of your recommendations.
I changed so much, that it would be nice to get some feedback if we are on the right track.
Since we are shipping a specific formatter with this patch, i think we should also think about #2865184: Allow MediaSource plugins provide default field form/view display settings a bit more.
Comment #31
chr.fritschNext wave of improvements, now with more tests.
Comment #32
chr.fritschNext wave, minor improvements and more tests
Comment #34
chr.fritschFixed one test, and removed the one that already exists in the Functional space.
Comment #35
mtodor CreditAttribution: mtodor at Thunder commentedNice work! I have tested this a bit and reviewed code too.
Here are some things that ware not nice when I tested:
twitch.tv
oEmbed because it's listed on oembed.com - and I wasn't able to create media entity. Then I have seen that some providers are not supported. I'm not sure is that big concern, but at least it's relevant information.GuzzleHttp\Exception\RequestException: cURL error 60: SSL certificate problem: unable to get local issuer certificate (see http://curl.haxx.se/libcurl/c/libcurl-errors.html) in GuzzleHttp\Handler\CurlFactory::createRejection() (line 187 of .../test-sites/drupal_2831944/vendor/guzzlehttp/guzzle/src/Handler/CurlFactory.php).
https
but in oembed check onlyhttp
is support, so I can't save it. And when I change tohttp
then it's possible to save it, but nothing is rendered, only title is correct.And here are some code related concerns:
I'm not sure
http*
is a good way to check schema, because in this case, alsohttps://www.drupal-youtube.com/watch?v=92a7Hj0ijLs
is valid link.I don't see a reason for $providers in this exception. Since it's never set with some data.
This should be checked, it's kinda not logical.
Should be injected.
I think this one can go inside foreach, instead of $filter and then also
break
since there is no reason looping when discovery and schema are found.These two Exceptions are not used in class, so uses can be removed.
It would be better to put
$media->get($this->configuration['source_field'])->first()
into variable and then there would not be need for$main_property
I'm maybe missing something, but - if getMetadata is called with
thumbnail_local_uri
, then it will executed getMetadata withthumbnail_url
, and that case will call getMetadata withthumbnail_local_uri
and so on. I see it ass infinite recursive calls, or there is something I'm missing?And in general this
getMetadata()
function is quite difficult to read. It's like spaghetti.Small typo
Use thie media
Good progress!
Comment #36
chr.fritschThanks @mtodor for the review.
After the review i discussed the implementation with @mtodor in person and we basically found a way to support all the providers, doesn't matter if they support discovery and schemes or not.
If a provider doesn't support discovery, we can use the url key from the provider plus ?url=http://youtube.com/something
So i implemented that and and tested the current patch with Twitter, YouTube, Flickr, Twitch and Vimeo. Everything works OOTB :-)
Also fixed the nits from #35
Not sure if interdiff is usefull, because a lot of stuff was changed.
Comment #37
chr.fritschI added support for the maxheight and maxwidth oEmbed parameter in the formatter.
Also added some more comments and a few more tests.
Comment #38
aheimlichUnless I'm misunderstanding something, the only oEmbed resource types that support the
html
response parameter are thevideo
andrich
types.I realize that this issue is about remote videos, but nothing about the code in this patch actually prevents people from trying to embed, say, an image using the oEmbed field formatter. If my understanding is correct, then if someone were to try and embed an image using the oEmbed field formatter, nothing would show up, and they'd get an
Undefined index "html"
PHP notice.I'd suggest either accommodating the other oEmbed resource types, or explicitly forbidding their use.
See https://oembed.com/, section 2.3.4 for details.
Comment #39
seanBThanks for the input! I think the oEmbed service should accommodate other types than video. We might want to have different formatters for each oEmbed type?
The media types with a oEmbed source should be able to configure which specific type of oEmbed resource it allows. We can use that information to validate the user input by type. Based on the supported resource type in the media source we can also determine which formatter is applicable.
Another approach could be to have different oEmbed media sources per resource type with a oEmbed source base class. This would mean adding more media sources but would probably make it a little easier to add source field constraints and configure supported formatters per source.
Comment #40
phenaproximaI did some substantial refactoring on this patch.
We still need to implement support for all oEmbed types defined by the specification, and add a ton of test coverage.
Comment #42
chr.fritschThanks, @phenaproxima for your work. That looks really good.
I just fixed some namespaces and docs. I also added a check if there is an HTML key in the formatter.
Comment #44
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedI finally managed to check the latest version of the patch. First of all I'd like to express my satisfaction with the direction we're taking. A lot of valuable points were made. Refactoring into two services makes a lot of sense and after a long thought I also independently came to the same conclusion about the 3rd party libraries; amount of code that they would replace is not huge and removing that wouldn't justify bringing in another huge dependency.
Would we need to escape in a strange case where the pipe character would appear in the URL? Is that even possible?
"following values" are missing.
Do we have any idea how big could this cache item become? If I understand it correctly count($media_items) ~= count($cached->data) AKA it could be huge.
Huge cache items can cause problems from my experience.
Do we want to validate this value to make sure nothing strange is put into it?
This is a long list! Can we guarantee that all listed providers actually work? And more importantly, can we guarantee that any new providers that will be added to the list will?
Would it make sense to maintain our own list? It could be based on the official one, but would only contain providers that we tested and can be sure are working correctly with in our integration. We could store it as a config object, which would allow us to easily add or update entries. Thoughts?
We are displaying raw HTML that arrived from a 3rd party. Isn't that a potential security problem?
We want to limit allowed providers in the default configuration.
Comment #45
martin107 CreditAttribution: martin107 as a volunteer commentedyes i am including a link with some interesting background reading
-- under the heading
"Common characters that need encoding"
The pipe char | - %7C.
https://developers.google.com/maps/documentation/urls/url-encoding
I hope that helps
Comment #46
chr.fritsch#2865184: Allow MediaSource plugins provide default field form/view display settings landed, so let's use it to set the oEmbed formatter for the source field.
Comment #47
chr.fritschAlso we should hide the name field and bring the source field to the top in prepareFormDisplay. Then we are aligned with the shipped configuration.
Comment #48
phenaproximaWe also need to fix (and have a test confirming) #45. Y'know, in addition to everything @slashrsm said in #44. :)
Comment #49
phenaproximaLet's talk about #44.6:
I would think so, yes. The oEmbed specification has this to say about video resources from a third party (source, section 2.3.4.2):
That, to me, implies that the oEmbed provider assumes no responsibility for XSS filtering. So it's our job to filter anything we receive from an oEmbed provider.
I discussed this with @chr.fritsch on Slack, and he told me that Facebook, for one, does include <script> tags in oEmbed resources. In my opinion, we definitely cannot trust that kind of thing, at least not out-of-the-box in core. I feel that our templates and code should filter everything we get from an oEmbed provider, but we should also provide some sort of mechanism, like an alter hook, to allow coders (NOT site builders, IMHO, due to how potentially dangerous this is) to disable XSS filtering for trusted resources on a case-by-case basis.
Comment #50
phenaproximaTagging for security review.
Comment #51
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedComment #52
chr.fritschI fixed most of the mentioned things.
Still open are #44.1 and #44.6
We should also add some description to the source field, which providers are currently available.
Comment #53
phenaproximaWe could hard-code a list of providers which we know to be safe, then provide an API-level mechanism (an alter hook?) allowing developers, who are equipped to understand and mitigate the XSS risks, to add additional providers that they trust. As long as we don't expose that mechanism to site builders, I think we'd be OK.
Comment #55
seanBIf we encourage people to create a different source for each oEmbed provider (or a subset of them) as I suggested in #39, new providers can only be added by coders. This also solves the "maintain our own list" part. We could potentially build the Facebook, Twitter, Instagram types in contrib on top of an oEmbed base class.
It might take a lot of work since there could be an unlimited amount of providers. This also makes it harder to support regions outside of Noth America and Europe. For example China has its own set of social platforms for video etc. This is really something we have to look into a bit more since a remote video media type only allowing Youtube/Vimeo will exclude a large part of the world not using that as the main platform.
Comment #56
seanBWe just had a short conversation in Slack in which the following was proposed:
The oEmbed approach saves a lot of pain with the specific implementations of oEmbed providers. Basically we only have a couple of oEmbed types and there could be generic formatters for each type. This is a lot better than needing specific implementations for each provider. The real benefit would be when you start grouping the allowed providers for plugins.
This will allow people to create all the sources they want and change the list of providers each source allows. This issue will solve point 1, 2 and 3.
We can create follow ups if we want 3 and 4 (this could also be contrib though, but having them in Core would be great since this removes the need for a lot of contrib modules).
Any thoughts?
Comment #57
phenaproximaMy suggestion: define the allowed providers in the plugin definition. That way, developers can add additional ones without us needing to define a completely new hook.
Comment #58
chr.fritschI implemented the proposed approach from #56. But only the points 1.-3. because I want to see if I am going in the right direction.
Comment #60
chr.fritschAnother push:
And now:
Merry Christmas everyone 🎄
Comment #61
kmajzlik CreditAttribution: kmajzlik at Ciklum Western Europe commentedI think that Link should not be dependency of Media. We can use Document/Image media types only.
Comment #62
phenaproximaDefinitely coming along; I like where this is heading!
Can these be max_width and max_height, with the labels "Maximum width" and "Maximum height"?
&$element and $context should have the array type hint.
Also, we can remove the instanceof LinkWidget check if we implement hook_field_widget_WIDGET_TYPE_form_alter(), as well as the hard dependency on Link. Let's do that :)
Nit: There's an extra blank line here.
We should be using $media->getSource()->getSourceFieldDefinition(), rather than $source_config['source_field'].
Let's change the !empty to isset(), so that we add the message even if the description items are an empty array.
I'm not 100% sure we should be doing this. Annotations are not really supposed to be this free-form; besides, the annotations can already contain arbitrary keys (it's all transformed to an array anyway by the plugin discovery system).
We're no longer doing the hex conversion stuff, so we should probably remove this comment.
Nope, this is the 'oembed' formatter. :)
This label is exposed in the UI, so should we make it something more human-friendly?
Let's rename this to $resourceFetcher.
Wrong class name.
"plugin_id" should be "plugin ID".
Let's rename the parameter to $resource_fetcher and update the description.
Can these be max_width and max_height, and also have a sane integer default?
Rather than wrap nearly the whole function body in this if statement, can we do this:
Again, rather than wrap so much code in a single try block, let's flatten this a bit:
Let's add a default case in which we throw an UnexpectedValueException if $resource['type'] is not one of the four known types.
Is this safe, from an XSS standpoint?
Should be max_width and max_height, and the labels should say "Maximum" instead of just "max"
This method will also need maxwidth and maxheight converted to max_* (and the labels should be "Maximum").
It might be helpful to log the full exception message somewhere, for debugging purposes.
We do this SO OFTEN...let's just add a public helper function to MediaSourceBase for it. MediaSourceBase::getSourceFieldValue($media)
This doesn't need to be a class. If we make the OEmbed class non-abstract, we can add this plugin definition by implementing hook_media_source_info_alter(). Or better yet, we can have a deriver which simply generates a hard-coded set of plugin definitions based on the OEmbed plugin class implementation.
Comment #63
seanBHad an hour so started on processing #62. Here is what I have so far.
1. Fixed
2. Fixed
3. Fixed
4. Not sure what this code is doing and what it's for? What is the reason for
$source_config['source_field'] != $field_definition->getName()
?5. Fixed
6. This needs some more discussion.
7. Fixed
8. Fixed
9. Changed it to 'oEmbed content', not sure though.
10. Fixed
11. Fixed
12. Fixed
13. Fixed
14. Fixed
15. Fixed
16. Fixed
17. Fixed
18. Fixed
19. Fixed
20. Fixed
21. Still todo
22. We had this, but it was deleted since this was deemed unnessecary. I think it would be very helpful to add it back though. It would be good to scroll through the patchzilla issue to find out what the exact arguments were.
23. Still todo
Comment #64
chr.fritschWorked on #62.23
Comment #65
chr.fritschWorked on #62.6. It seems to work without the new annotation key as well. So I removed that.
Comment #66
seanBDiscussed the link dependency with chr.fritsch in slack. We agreed link fields are best suited to store the links to oEmbed media. String fields are currently also allowed as a source field and we should probably remove that. Any thoughts?
Comment #67
chr.fritschBased on the discussion with @seanB, I re-add the link dependency and removed the string types from the allowed_field_types.
Also:
- Fixed #62.4
- Changed the field name of the optional config to field_media_oembed_remote_video
Still open:
#62 21 & 22
Comment #68
chr.fritschOk, fixed the last two open points
Comment #70
benjifisherThe new
getSourceFieldDefinition()
method is now added in #2934966: Make it easier to get source field values from media items.I think this issue already needs a re-roll: there seems to be a conflict in
media.schema.yml
.Comment #71
chr.fritschRebased on all the fancy things that landed in the last couple of hours 😃
Renamed the media type to "remote_video", because "video" is already used now by the local video type.
Comment #73
chr.fritschRebased on top of #2934966: Make it easier to get source field values from media items
I also reverted #62.18. We know it's not super safe to directly output what we get from the oEmbed service, but that's the reason why we only support a few oEmbed providers. Because we trust them.
Comment #74
phenaproximaI tend to agree, but I think we need explicit review and sign-off from the security team for this approach.
Comment #75
alexpottOne quick thought that's going cause quite a bit of work... more review on the way.
This needs a corresponding hook_update_N / post update to install the link module. Not sure about what type of update it should be. I think hook_update_N because without link being installed the system is not correct.
Comment #76
alexpottThis should use core's time service. Also 60*60*24*7 is a constant. Why make the program do the work? I'd add a comment about caching the result for a week.
Can we create a value object to encapsulate the array and not pass around arrays of info. Same with endpoints.
Comment #77
chr.fritschDiscussed "adding a link dependency" again with @phenaproxima and we agreed that adding a new dependency might not be the best solution. Instead of adding a new dependency we want to solve #2935973: Media source doesn't check if a field type exists before so that the source can fallback to a string field if link module is not present.
Comment #78
alexpottI'm not sure that falling back to a string field if the link module is not present is a good idea.
What happens in these two situations:
1. Media installed
2. Remote field created
3. Link module installed
1. Media installed
2. Link module installed
3. Remote field created with exactly the same settings as the first time
Are the two fields different?
Comment #79
phenaproximaIn this case, the remote media source will use the first available field type that it is compatible with. That will probably be a string field. If Link is subsequently installed, nothing changes for existing media types; new remote media types, though, will create a link field.
In this case, the remote media source will create a link field, since that's what it most prefers.
This does not present a problem at all, because the media source will be compatible with either type of field, pretty much interchangeably.
Comment #80
alexpott@phenaproxima I think that that behaviour is pretty unexpected though and it could lead to hard to debug bugs because you have to realise the code is behaving differently depending on the order things have occurred.
Comment #81
dawehnerI am wondering whether a solution to the dilemma out there would be to use string by default always, but allow people who wants more security (URL validation) or simply a title field to use the link field.
I would have expected an additional widget here to be honest.
I don't think we need this optionalness. Why do we not pass in a null cache backend for this case?
Nitpick: Doesn't the implementation throws an error in the case of "bool"?
❓ I'm curious, doesn't this make it impossible to translate it?
That is a really nice test!
Comment #83
samuel.mortensonI think this is something we should consider - what does the Link module provide that would improve the UX of the OEmbed source? To me the nicest part of the Link module is its handling of internal URLs, but for OEmbed almost all links will be external. In terms of security, if we go with normal string fields, the only part of the Link module we'll be missing is protocol validation (see
\Drupal\link\Plugin\Validation\Constraint\LinkExternalProtocolsConstraintValidator
). I think the oembed_resource/oembed_provider constraints will handle this for us, but it wouldn't hurt to enforce http/https URLs as well.Comment #84
phenaproximaI can't assign it to him, but @robpowell has agreed to implement the changes requested since #73.
Comment #85
phenaproximaChanging the title for increased accuracy points. :)
Comment #86
robpowellAssigned it to myself and will take a look tonight.
Comment #87
chr.fritschI started implementing value objects for Provider and Endpoints.
Comment #89
robpowell@phenaproxima will not have time to work on this till saturday 1/27/18. Will change to unassigned.
Comment #90
robpowellComment #91
chr.fritschSo I picked it up again.
I adjusted several things:
Comment #92
chr.fritschDamn, forgot to remove the interfaces from the patch
Comment #95
alexpottA quick drive-by un complete review - just concentrating on the new value objects.
The argument should be $supports_discovery
I think
supportsDiscovery()
is better.Tempting to put this logic into the Endpoint value object.
I think this could be part of the Provider and not here.
Comment #96
chr.fritschSo fixed the last failing tests and improved some minor things.
Comment #97
phenaproximaI just want to give a wholehearted +1 to @alexpott's comments in #95.3 and #95.4. The Provider and Endpoint value objects are definitely the place for that logic.
Comment #98
chr.fritschSo I moved some logic into the value objects and fixed all the things from #95
Comment #99
chr.fritschCreating a second media type from the remote_video source would currently fail because of #2797047: Problems creating a d6 user profile field storage with an name longer than 32 characters
Comment #100
phenaproximaGreat work! More hacking ahead :)
This config schema defines all oEmbed source derivatives, so the label should be something like "oEmbed media source configuration".
We don't need to load the media type. $media->getSource() will work just fine.
We should add a new metadata attribute to return the transliterated machine-readable provider name, I think (provider_id or something like that).
Rather than wrap all of the function's logic in an if statement, let's just return early if $element['value'] is empty.
We should be using $media->getSource()->getSourceFieldDefinition() here, not peeking directly into config. Keep those abstractions sealed! :)
I think this is also a bit of a leaky abstraction. Maybe we can add a method to the OEmbed plugin which returns the allowed providers, and call that here.
I'm not sure we should use the technical term "oEmbed providers" here. How about something like "You can link to media from the following services: @providers"?
The endpoint should also get its provider passed in, and let's add a method to retrieve it as well.
This should be an InvalidArgumentException, not a ProviderException. And let's adjust the message to match; something like "oEmbed endpoint must have a URL."
We should probably verify that the URL is valid as well, and throw an exception if not. I'm not exactly certain how we should go about that...maybe freak out if parse_url() returns FALSE?
"build" should be "built".
Can we call $this->getFormats() once and reuse its return value? Also, under what circumstances would getFormats() *not* return an array? And shouldn't the constructor throw an exception if $formats is empty?
Is there anywhere that I can get a list of the supported formats, and what they look like? (Even if it's not in the Drupal API?) An @see or @link would be very helpful here.
I'm not sure this should be an array of objects. Why don't we simply make getEndpoints() instantiate the Endpoint objects? That way the calling code doesn't need to do all that work.
Er...because the foreach loop returns, this will only iterate once, if at all. We should probably just retrieve the first endpoint's URL and go from there; no need to loop.
Needs @var int.
I think we should move this logic into Endpoint as well. Something like $endpoint->matchUrl($url), which returns TRUE or FALSE if the given URL matches the endpoint.
This doc comment needs adjustment! :)
We can remove this. The switch-case makes it redundant.
I think we should log an error here, rather than throw an exception.
We can use $media->getSource()->getSourceFieldValue() here. :)
supported_providers? Never heard of it. This is another case for having a method on the OEmbed source which can retrieve the allowed providers.
Let's use $media->getSource()->getSourceFieldValue() here.
We should probably call, file_prepare_directory() here, and try to set up the location if it doesn't exist. However, I'm OK deferring that to a follow-up.
Why are we removing the name from the form display?
Nit: There's an extra blank line here.
I don't think we need to call this "remote_video"; the fact that it's an oEmbed derivative implies that the video is remotely hosted. Let's just call it 'video', so that the full plugin ID will be 'oembed:video'.
Comment #101
samuel.mortensonSome manual testing notes:
Comment #102
chr.fritschWorking on this
Comment #103
chr.fritschFixed most of the things #100. Regarding the removed name field: The oEmbed providers are supporting the name of the resource. So we should use that by default. But we could also show the name field and make it non-required.
Comment #104
tstoecklerComment #110
dawehnerI'm curious, shouldn't the bundle one be more specific then the generic media__oembed suggestion?
What about making this a bit more dense, by moving the description inline.
I like this level of defensiveness!
It'd be nice to explain why we use the first one ... I guess there is no better idea? Also I think it would be nice if getFormat ensures to always returns an array. No reason to check for is_array
Let's drop that :)
Why do we not check $url in this case?
Let's add documentation that and why we return the first available endpoint.
What about making this a bit easier by using
$endpoint += ['formats' => [], 'schemes' => [], ...]
?I'm personally a huge fan of having the common case being not inside a condition aka.
if (!error_case) { ... throw an exception } ... continue normally
It would be nice to have at least one place which explains the overall concept, aka. how providers, fetchers and provider discovery work together.
Shouldn't this be the same interface?
Interesting, why is this one returning a FALSE instead of throwing an exception?
Why can $provider be empty?
That is good to know!
It looks like this documentation is out of date.
I'd actually log the error instead. In an API first world this isn't helpful. Also, the enduser doesn't care about that
It doesn't feel like we would actually need a browser test for that. Can't you trigger rendering the formatter directly and save a bit of setup code?
Nice!
Nitpick, let's drop this empty line
I had a look, for some reason, $schemes is empty for youtube:
I also naively tried things out on vimeo: I went up using
https://vimeo.com/channels/staffpicks/251853214
, which didn't work either :( Here is an idea: Could we do something more clever here: Show some message like this: "It looks like you are trying to provide a youtube video, these are the allowed schemes ...". In order to do that we could use a String distance function and find the most likely matching one.Interesting point. To be honest this feels like something which could be put into the configuration level, but it doesn't have to be the in the UI itself, given it reads, at least for me, like an edge case. Most people are fine with using the name of the provider they are given by the provider itself.
Comment #111
chr.fritschThanks again for reviewing. Will work on that
Comment #112
tstoecklerRead through the patch except for the test coverage and the exported config.
So, instead of altering this would it not make sense to provide a dedicated widget with this description and then do all those various checks in
::isApplicable()
This would also be nice if we want to improve the input of oembed fields in the future, i.e. provide a preview, or do some magic input processing via JS or whatever. Then it would be nicer to have a proper widget instead of a form alter. What do you think?
The
|OEmbedInterface
is wrong, I think.This is going to fatal for non-media entities using the string widget.
So the question to me is whether we use the base plugin ID or the derivate ID or both. I.e. the resulting ID could be "field_media_oembed", "field_media_youtube" or "field_media_oembed_youtube". You chose the second one here, but I'm whether the first one actually wouldn't be more sensible? That would promote more re-use of the field data.
Generally I think we would call this "MAX_AGE" in core.
Should we validate, that
$provider
has the required keys?Let's not invoke the markup component, can we not just use string concatenation or
strstr()
or something?Should we give some context to modules doing this, then? I.e. the provider, for example?
Why can we not use the
AllowedValuesConstraintValidator
with an allowed values callback?Can we inline the call instead of having the extra
$ready
variable? That would be more standard with howfile_prepare_directory()
is used elsewhere.Comment #113
chr.fritschI fixed most of the things from #110
Some notes to that:
1. Yes, but changing that would be an BC break, right?
6. The provider URL is not relevant for us. Endpoint URLs are much more important.
11. I don't think so. getAll() throws ProviderException if it's not possible to retrieve anything from the oEmbed API. get() throws InvalidArgumentException if someone tries to get a non-existent provider.
13. Because we also throw this exception during the fetching of oEmbed data from the API. At that point there are no instanciated providers.
17. I did that corresponding to FileAudioFormatterTest
I also did a big refactoring. The YouTube thing didn't worked, because in OEmbedProviderConstraintValidator we only checked the URLs against the schemes. YouTube has no schemes, but provides discovery. So getProviderByUrl() has to use the discovery as well. Which leads to use the resource fetcher inside the provider discovery.
So I reorganized the things a bit. The ProviderDiscovery is now a ProviderCollecter, becuase thats everything it does. The ResourceFetcher does now only fetching resources. And the new OEmbedManager (not satisfied with this name), has getProviderByUrl and getResourceUrl.
Didn't looked at #112. Thats the next thing on my list
Comment #114
phenaproximaCan we call $media->getSource() once and reuse the return value?
We should use !== here.
Can we rename this service media.oembed_manager?
If the plugin is not a derivative, won't getDerivativeId() return NULL?
If we're going to use $this->formats in the empty() check, we might as well use it in the reset() call as well.
For consistency, let's cast this to bool.
Nit: "an" should be "a".
"TRUE if the URL matches..."
Doc comment needs to be fleshed out.
This is redundant and can be removed.
This is also redundant and can be removed.
We need to explain under which circumstances the ResourceException can be thrown.
This doc comment seems out of date. Shouldn't getResourceUrl() return a URL? If it should return a bool, we should rename the method.
$endpoints is not an array of strings. Also, we need to document the circumstances under which the ProviderException is thrown.
I think that this method should instantiate the Endpoint objects, not the constructor.
"Builds and returns the endpoint URL."
This seems like it should be part of the Endpoint object, not the Provider object.
The doc comment needs to be more detailed.
Should be rephrased a bit: "Returns an array of Provider objects, keyed by provider name." Also, let's make this the @return description.
We need to document the circumstances of the ProviderException.
Two things: 1) Document the circumstances of the ResourceException; and 2) Let's add a @see or @link to the expected structure of the oEmbed resource.
We should log the actual URL as well, plus additional information (HTTP status code? Response content?) if we can get it.
Nit: We can shorten this to
return $this->configuration['allowed_providers'] ?: $this->getSupportedProviders();
Can we improve the doc comment here?
I think we should move all of this default configuration into a follow-up issue.
Comment #115
chr.fritschOk, in this patch I addressed #112
I am still on it to address #114
Comment #116
chr.fritschAddressed #114
Comment #118
tstoecklerAwesome work! I really like the new setup with OEmbedManager/ProviderCollector/ResourceFetcher. I really found almost only nitpicks, the whole patch is very nice to read now. Yay!
Nice!
This is unused.
I guess the
->toString()
should come before using it as an array key.Super-nitpick: Could add a trailing comma.
Let's add all of these as different messages, i.e.
$constraint->unallowedMessage
,$constraint->emptyMessage
,$constraint->invalidMessage
(or something like that).I think this is leftover debug.
Nice! ;-)
Could use
::class
for extra credit.Superfluous newline.
Let's use snake_case here. Or use lowerCamelCase consistently throughout the class.
Let's use double quotes here to avoid the escaping.
Missing
@inheritdoc
@chr.fritsch told me I should tell him that we should also test that provider_id is returned (I would obviously have never seen this, because I am too focused on finding pointless nitpicks)
s/youtube/Twitch/
This description doesn't really match the name
getResourceUrl()
@daniel.bosen points out this is not actually TRUE.
Very nice!
Comment #119
chr.fritschAfter 14h of heavy hacking around, and fixing more than 100 suggestions, I am done for today. I think we made a good step forward today. Hopefully, tomorrow will be as productive as today 😀 Thank you everyone for continuously reviewing.
Comment #120
chr.fritschMore cleanup and documentation changes.
Comment #121
chr.fritschUpdated the video
Comment #122
alexpott@chr.fritsch asked me to look at the security considerations. From oembed.com:
I think we have to make it easy and recommended to set-up your site in the secure way oembed.com recommends.
Comment #123
chr.fritschJust more cleanups
Comment #124
bhanuprakashnani CreditAttribution: bhanuprakashnani at Google Summer of Code commentedPatch applies cleanly.
Comment #125
phenaproximaThanks for checking it, but this is not quite ready to go. Still needs pretty thorough review and an explicit answer to the security concerns pointed out by @alexpott.
Comment #126
chr.fritschOk, what about that:
I extended the oEmbedFormatter to define which additional HTML tags should pass the XSS filtering. Then I adjusted the deriver to pass in allowed HTML tags into the oEmbed source. This configuration will then used to configure the formatter in prepareViewDisplay.
Comment #128
chr.fritschOk, this obviously not a patch file...
Comment #129
alexpottI was asked in slack what I think about filtering using
Xss::filter($youtube_html, ['iframe']);
and allowing custom tags to be allowed. I still think this is some way away from the oembed's best practice of serving the content from an different domain. I wonder if https://www.owasp.org/index.php/HTML5_Security_Cheat_Sheet#Sandboxed_frames helps us.Comment #130
alexpottBasically we need this:
Comment #131
chr.fritschSo I tried a bit the iFrame with sandbox parameter approach. It would work for us but we would have to add
sandbox="allow-same-origin allow-scripts"
to get YouTube and Vimeo running.From https://html.spec.whatwg.org/multipage/iframe-embed-object.html#attr-ifr...
In my understanding, this would mean our effort is basically pointless 😢
Comment #132
phenaproxima@alexpott, @chr.fritsch and I discussed this in Slack today. We came away with some pretty solid conclusions.
To display oEmbed content securely, we need to sandbox it in an iframe which has no access to the Drupal domain's cookies, in order to prevent session hijacking. Core, however, makes this rather difficult because session cookies are, by default, accessible to both the main domain and any of its subdomains. (See \Drupal\Core\Session\SessionConfiguration::getCookieDomain()).
So the first thing that needs to happen is that #2522002: Do not strip www. from cookie domain by default because that leaks session cookies to subdomains needs to be revisited (#2347877: Move DrupalKernel::initializeCookieGlobals() into a SessionConfiguration service was also cited as related literature). Session cookies should probably not apply to all subdomains, but that is potentially quite complicated. In any event, @alexpott determined that this work does not need to block Media, because we came up with a strategy for how to proceed...
With the current oEmbed architecture proposed in this patch, the ideal way to display oEmbed content securely is to show it in an iframe that is served from the same Drupal site, on an alternate domain (not a subdomain). So Media will:
@alexpott gave us this handy guide to determine if we're showing oEmbed content securely:
So, in essence: although we are going to allow people to do something that's not very secure, we are going to make it very obvious and easy to configure it in a secure manner.
Another snippet which came out of this conversation is the possibility of not supporting the http scheme for any oEmbed endpoint, at least not by default. There is no longer any good reason to be using HTTP instead of HTTPS in general, and no serious oEmbed provider should be preferring HTTP over HTTPS at this point. So it seems reasonable to force all endpoints to use the HTTPS scheme, or maybe not support providers which don't support secure endpoints. This could and probably should be done in a follow-up issue, though.
Comment #133
alexpottSo I've read through #2522002: Do not strip www. from cookie domain by default because that leaks session cookies to subdomains again and a patch was committed to 8.x for that so if you force to www on your domain your cookie domain will be
.www.example.com</com>. I'm not sure why we are prefixing the dot though. But the good news is that in most sensible site configurations we'll be subdomain secure - apart from where you force <code>www.example.com
toexample.com
Comment #134
chr.fritschOk, I started to implement the proposal from #132.
What I did so far:
Still missing is the extensive warning and I suppose that some tests will fail now.
I will continue tomorrow morning on that.
Comment #135
tstoecklerWow, it's really cool to be surrounded by such smart people in this community. Very impressive research and solution-finding by all of you!
Some thoughts, all but 4. are just nitpicks, though.
Our descriptions often aren't really meaningful, but I think we can do just a little bit better ;-)
I think this is not needed.
I think the renderer should be injected.
I was going to ask why you are not just returning the render array directly and let the request processing system render that. The answer, I think, is that if you were to do that the oEmbed HTML would be "decorated" by all the usual blocks, page template, ...
I then thought that this could also be achieved by using a
_wrapper_format
in the URL that is requested in the formatter. Not sure if we already have one that just renders the controller result without any wrapper markup but I guess modals/dialogs have a similar problem, so maybe it would actually work.On the other hand, the resulting HTML that will be output by this template actually includes an
<html>
and<body>
tag already, so if we were to utilize such a wrapper format, the resulting HTML on a page without that wrapper format would actually be invalid due to the nested HTML tags. So perhaps this is the best way to go.Maybe someone more into the request processing system (@Wim Leers ?!) can comment. Also, if we do stick with the manual rendering here could probably use a little code comment.
Instead of using the URL generator, can't we just do
$url->setOption('base_url', ...)
? That would also avoid having to inject it.Maybe adding a link to the URL @alexpott posted above would be nice, as reasoning for why we are doing this.
Was about to complain about missing validation of the URL, but the URL element already does that for us. Nice!
Super nitpick:
a -> an
Our regular
html.html.twig
indents the<body>
element so we should do that here, as well.The
{{ post|raw }}
should also be indented (twice) and the empty line can be removed, I think.You can use
$this->getSession()->switchToIFrame()
to actually test stuff within the iframe.Comment #136
chr.fritschThanks for reviewing again.
Fixed all the mentioned suggestions.
I am not sure if we should store the iFrame URL in the field formatter, or better create a settings page for that, because you probably want to configure the domain just once and not everytime you use the formatter.
Comment #137
samuel.mortensonReviewing the patch from #136, there appear to be multiple security vulnerabilities:
/media/oembed?url=https://cdn.rawgit.com/mortenson/deabebe5ff31e8afb6ced6b3a2a13b62/raw/f1fd7d413f0b0438c69ccf74a525864c40c0d180/oembed.json
on their site. The Github gist is available here: https://gist.github.com/mortenson/deabebe5ff31e8afb6ced6b3a2a13b62simplexml_load_string
is used in \Drupal\media\OEmbed\ResourceFetcher::fetchResource and is vulnerable to XXE. The core \Drupal\serialization\Encoder\XmlEncoder should be used instead, which extends the Symfony class which appropriately protects against XXE.Comment #138
phenaproximaThanks for reviewing, @samuel.mortenson!
If we use this, won't Media have a dependency on Serialization? That's not the worst thing in the world, but it would be preferable not to introduce additional dependencies. Is there any chance of moving XmlEncoder into the \Drupal\Component namespace, or maybe just using the Symfony class directly?
Comment #139
samuel.mortenson1.
This method also seems spoofable. I didn't create a full POC but managed to create an HTML file that had an oembed link to a JSON file with malicious content - which was fully validated until it was eventually passed to YouTube which didn't know how to handle the oembed. You can debug this locally by putting https://rawgit.com/mortenson/af20b4f7faecb1aa75362e04354ec64f/raw/ccea86... into a remote video source field.
Sources:
https://gist.github.com/mortenson/af20b4f7faecb1aa75362e04354ec64f
https://gist.github.com/mortenson/0d0d746685053b54df07df01e8e78649
We should validate the passed in $url to this method somehow, so that we're not requesting HTML from arbitrary sources. In addition to that, we should validate that the URL returned by this method is supported by a provider - right now we just request it and determine the provider based on the provider_name it contains:
Which seems backwards to how this normally works.
2.
Can we use ^ and $ in the regex, for safety reasons? Also, when we convert "*" to ".*", that's going to allow any number of forward slashes as well. With the way this currently works, the scheme
http://www.23hq.com/*/photo/*
will match a lot of bad URLs likehttps://mortenson.coffee?foo=http://www.23hq.com/foo/photo/bar
Comment #140
alexpottI've updated the issue summary with some of the security discussions from recent comments so that people don't have to read everything to know what has been considered.
Re XXE and the XML encoder - just create a new \Symfony\Component\Serializer\Encoder\XmlEncoder in our code. Moving the serializer seems unnecessary.
@samuel.mortenson++
Comment #141
alexpottI've opened #2940879: Don't automatically set a cookie domain to make
example.com
's cookie as secure aswww.example.com
's cookie.Comment #142
chr.fritschThis patch should address #137 2/3.
Regarding 1:
For that reason, we are offering the possibility to define a different domain which would render the iframe content. See the field formatter settings for this.
Comment #143
chr.fritschRe XMLEncoder:
I am using
Symfony\Component\Serializer\Encoder\XmlEncoder
because Wim wants to killDrupal\serialization\Encoder\XmlEncoder
in #2926034: Deprecate 'xml' serialization format in Drupal 10Comment #145
samuel.mortenson@chr.fritsch So to use Oembed securely, Drupal users will have to register a new sub-domain that points to the same site, but does _not_ share cookies? Also, that sub-domain will share cookies by default if the primary domain does not redirect to www.domain.com. How many Drupal sites will actually set this up in production?
Comment #146
samuel.mortensonFor #145 I don't expect us to solve the oembed security problem, but I'm just trying to get a handle on what it would take to really use this securely. The more I research oembed the more I realize that basically everyone just trusts the HTML response, so the fact that we have a path forward for security is better than most implementations I've seen.
I think we can ignore #137.1 unless someone has a better idea than what we're doing now. I would love to use something like the iframe
srcdoc
attribute for this but it's not supported in IE11.Comment #147
chr.fritschRe #139.1:
I am not sure how to solve that because not all providers are supporting schemas. That's the reason we also have to try to get a resource when schema fetching doesn't work.
Solved ##13.2
Comment #148
chr.fritschForgot the interdiff
Comment #149
alexpottThis needs a description and an explanation of why using a sub domain is a good idea. Plus I think media should implement a hook_requirements warning that not using a separate iframe_domain is insecure.
Plus it would be great to have reviews on #2940879: Don't automatically set a cookie domain
This isn't an array of endpoint objects. It's an array of endpoint info stored in an array.
Comment #150
chr.fritschI talked to @phenaproxima and we came to the conclusion that the iFrame domain should be stored globally and not on formatter level. If you did the setup once, you want to use that in every formatter.
For that reason, I created a new settings form for media.
Comment #151
samuel.mortensonMinor review from today:
Can we handle this more gracefully? We throw a number of specific exceptions with messages that could be valuable for end-users. If every exception also sets a Drupal message I suppose that's OK as well.
Is this ever used? If not, I think we can remove it.
I also wanted to re-phrase #146 and say that my prior security concerns have been addressed.
Comment #152
dawehnerThis is some amazing piece of work!
$request->get supports both POST and GET. I guess oembed just uses GET, so let's use
$request->query->get()
here.Is there a reason we don't provide a cacheable response?
I'm curious whether we could link to an article maybe, explaining why this is the case.
Note: We could use the request 'router.request_context' service instead of $base_url. It has a
getCompleteBaseUrl
to use.I'm curious whether this method should have a static cache.
I'm curious, could we put this result into its own value object, with nice getters for the common keys?
seems to be a list we could use here.
+1 for adding constraints!
It feels like this could be totally a kernel test
Comment #153
phenaproximaApologies for the big interdiff. I did some refactoring, documentation fixes, and renaming of things. I also addressed some of @dawhener's feedback in #152:
Comment #154
phenaproximaUgh. Forgot to complete the rename of thumbnails_location to thumbnails_uri.
Comment #155
dawehnerI think in the case of an exception we don't want to cache it. Also not, when you don't assign any cacheabiity metdadata at all, it'll cache forever. Any idea what we could do there?
Comment #156
phenaproximaYes. I'd like to have the resource value objects implement CacheableDependencyInterface, and we'll have the CacheableResponse use their cache metadata. But that is blocked on @chr.fritsch's implementation of the value objects.
Comment #159
phenaproximaApologies for another gigantic interdiff. It took some doing, but we implemented a Resource value object and made tests pass with it. The next step is to mock HTTP client requests in our tests, since we really can't rely on random junk from Vimeo in order for our tests to pass :)
Comment #160
aheimlichThis class doesn't seem to exist.
This class also does not seem to exist.
Comment #161
phenaproximaPer #151, security team concerns have been addressed. Also, we have been hacking heavily on this patch at DrupalCon Nashville. So, adjusting tags to match.
Comment #162
chr.fritschSo I reworked the tests and added a bunch of fixtures to simulate oEmbed requests
Comment #163
phenaproximaYou're a rockstar, dude. That is amazing work -- totally removes any external Internet-based dependency from our tests, even the provider database at oembed.com. WOW!
I did a couple of minor changes, plus a bit of refactoring on the tests -- namely, I removed any need to mock the HTTP client service :) Trying to wrap a mock object around a real object, especially an object that has as many meta-methods as the HTTP client, is utterly painful with Prophecy. This way should be a lot easier.
This is awesome teamwork. I have a feeling that this is very close to ready. Don't be put off by the size of the interdiff; I just renamed a few files, and git is being stupid about it.
Also, really removing the security review tag this time, as per #161.
Comment #164
phenaproximaOh, and: we briefly discussed this patch in a Media Initiative meeting in Nashville. We explained how we are warning people about the insecure nature of third-party embeds (unless in a properly sandboxed iFrame), and @xjm asked us to run our solution past the usability folks. So I'm tagging this for usability review.
Comment #165
bkosborneComing into this real late. I scanned through the comments but didn't read them all carefully, but here's two concerns I have:
1) I get why the iframe solution is needed, and I've often balked at how oembed encourages people to blindly trust data from a third party. But even so, putting everything into an iframe seems to be really bad for UX, no? Iframes make styling much more difficult, especially when dealing with responsiveness. Also, many oembed providers will themselves just return the markup for an iframe, so we'd be embedding an iframe in an iframe in those cases. I think this is something you've all probably thought about, but there's really no good solution, and I don't have one, I just haven't seen this concern explicitly mentioned so here it is.
2) All the Guzzle requests to a 3rd party need to define network/response timeouts. Right now they're just defaulting to whatever Guzzle has as the default, which I think is 30 or 60 seconds. Imagine this scenario: You have an article on a site with 15 twitter or instragram media embeds. This is a popular article. Caches are rebuilt on your site (maybe a code deploy), and suddenly you have X amount of people hitting this page in a cold cache scenario. Every visit to this page results in 15 additional hits to Drupal, each of which is blocked on network I/O waiting for oembed data. If the oembed endpoint is down, or just slow, this can easily lock up your site.
I wrote about that concern in issue queues for the popular media entity provider modules, like #2910292: Performance issue if Instagram API is down. I think a good solution would be to introduce a configurable timeout, maybe 5 seconds, and also institute a "backoff" functionality, where if a certain amount of requests fail within X amount of time, all future requests won't even be attempted for Y number of seconds.
Comment #166
phenaproximaThis is a valid concern for sure and I think it's one we can address.
This is a great idea:
However, it sounds like something that should be implemented upstream in Guzzle, if it's not already.
Any concerns about the iframe, though, I'd rather deal with in a follow-up. This patch is big and complex enough and we should not do anything to increase the scope. Security is the top priority, and what we came up with is good enough for jazz in that department. IMHO the surrounding issues should be secondary for the time being, but I defer to the UX team for the final word on that.
Comment #167
chr.fritschSo let's give that another push. I also tested it heavily and I am feeling really good about it now.
Comment #168
ckrinaJust reviewed this with @chr.fritsch and @phenaproxima and here are some suggestions:
I’d be handful to let users know about this when they are creating the new media type in a non-intrusive nor scaring way. Just briefly inform them about the recommended step before “they create a problem that needs to be resolved” and ideally point them to the next actionable step for that.
I wouldn’t use the warning styles for that because they haven’t done anything wrong and it could be confusing having it inside a form, it might make them feel like they’re doing something wrong. So I’d recommend adding a really short text with a link to the actionable step only when a oembed is selected.
I’d also keep the warning message in the status page and in the config page until the domain is added.
Just removing the 'Needs usability review' for now, but feel free to add it again.
Comment #169
chr.fritschThanks @ckrina
So in this patch, I implemented all the suggestions that were raised by the usability review.
Comment #170
samuel.mortensonSome quick review of the value object. Still need to read through the entire patch (again).
Why are we using inline_template here? Doesn't this mean that OEmbed responses could return Twig template responses? :)
Missing documentation block.
Why is this list hard-coded? Is that based on a specification somewhere?
Missing documentation block.
Comment #171
samuel.mortensonThis should also use
$request->query->get
.Maybe we could have an object representing each kind of resource in the future so they don't all share attributes that may or may not be useful.
Is $this->useCaches ever used?
Remove the space after
list
.Is this a valid way to determine the format of a response? I'm wondering if there's an API method for this.
Can we set $this->urlCache to an empty array in the constructor to avoid this check?
Also, if urlCache is only used in this method I wonder if we could remove it completely and instead just check from $this->cacheGet instead.
Would using #type => html_tag be cleaner than using Twig for this?
Should this be required? It could be set to an empty string as is, I think.
Just confirming - with this and other example data, no external HTTP requests will ever be sent from tests right?
Comment #172
chr.fritschThank you @samuel.mortenson for reviewing it again!
Fixed #170
Re #171
1. Fixed
2. We had that in the first place and decided then to merge everything into one object because of simplicity. The different types have most of the properties in common and we don't want to over-engineer it. This patch is already quite huge...
3. Yes in UseCacheBackendTrait
4. Fixed
5. I don't know. If there is something we could use I am open to suggestions.
6. Fixed
7. Fixed
8. Makes sense
9. Yes, that's what we are trying to achieve. I ran the tests without an internet connection, found one bug and fixed that.
Comment #173
samuel.mortensonInterdiff looks good - I can't find a better API-method for checking the Content-Type header, so I'm going to mark this as RTBC.
Comment #174
alexpottLet's ensure that url is a string.
Let's ensure that max_width and max_height are of the expected type.
Given the data is coming from a third party I think we need to be a bit more suspicious and do some stronger typing and checking of each value.
I think we need to document that the Resource can contain data that should not be trusted. Also I think we should be more strict on the things we accept. Ie. width and height need to be ints.
Comment #175
phenaproximaThat is already the case. See Resource::setThumbnailDimensions() and Resource::setDimensions(). If we don't get numeric data, we throw an exception. The Resource class is quite strict.
Comment #176
chr.fritschI will work on all these things on my flight back from DrupalCon.
I also found out that OEmbedFormatterTest is still calling oEmbed and Twitter API. I will try to prevent that.
Comment #177
chr.fritschOk, here is a new patch where I fixed all the suggestions from @alexpott.
I also adjusted the OEmbedFormatterTest. Now it's not calling the oEmbed API anymore.
Comment #179
chr.fritschOh, my fault.
Let's try to fix the tests.
Comment #181
alexpottI didn't know about getInt() but this is very nice indeed.
First one?
Should we log?
Looks good. I wish there was an annotation like @UserInput or something that denoted stuff is not to be trusted.
Seems a bit strict.
Comment #182
phenaproximaSection 2.3.4 of the oEmbed specification says that all resources must have a version number, and it must be 1.0 (source). I don't see any real reason not to validate that.
Comment #183
chr.fritschAddressed @alexpott's review:
2. Fixed
3. But where? I guess that would flood the watchdog table after every cache rebuild.
5. What @phenaproxima said
I also tried to fix the last test. Lets see.
Comment #184
phenaproximaI did an in-depth review and fixed a bunch of small things I found. Remaining open questions are all documentation things:
We need to expand on this sentence, or remove it entirely. We can't just leave a blithe comment like this one. In what ways might the data be insecure, and how might the threat(s) be mitigated by this object? If we can't explain these things here, we should remove this line.
Is the thumbnail actually optional, according to the spec?
Same question.
Is the thumbnail actually optional? If not, we should remove the word "optional" from the $thumbnail_width and $thumbnail_height descriptions.
Comment #185
phenaproximaFixed my remaining review points.
Comment #187
dawehnerNote: you can use type: uri here
_crsf_token seems to be fine for anonyomus users. I was just pondering about that.
Should we do something to not cache these results?
I'm curious, what would be the advantage of storing the thumbnail locally?
I'm curious whether we could somehow instead of a warning ensure that people at least checked a checkbox to know that they are doing something insecure.
Comment #188
phenaproximaThanks @dawehner!
If we don't, we're basically hotlinking to an unreliable external resource every time we want to display the thumbnail. I'll open a follow-up to expose a configuration option on the oEmbed source plugin to toggle whether or not the thumbnails are downloaded.
I'm personally not in favor of that, but in any event it's outside the scope of this patch. I will open a follow-up issue for further discussion and possible implementation.
Comment #190
phenaproximaFixed @dawehner's review points. Tests will still fail, though...not sure what I did to cause that.
Anyway, responding to #187:
Comment #192
chr.fritschSo I found the reason for the failing test.
The format replacement has to be done before checking the URL.
Also, I changed the MediaSourceOEmbedVideoTest to use the oEmbedTestTrait as well.
Comment #193
phenaproximaOnly one request: let's log the exception message.
Otherwise, I think we have played patch-pong long enough. Any further tweaking can be done in follow-ups. This is a big win for the Media Initiative, a (reasonably) secure implementation of oEmbed, and effectively OOTB support for remote media in Drupal. Fantastic work.
RTBC from me once the complaint above is fixed. Then it's on to @webchick for the moment of truth (WebchickTestCase).
Comment #194
phenaproximaUpdating the IS.
Comment #195
chr.fritschFixed #193
Comment #196
robpowellI attempted to test #195. Here are the steps taken:
#d01bb4f633
)drush si -y --account-pass admin
Filling out the form in this incomplete state will give you the following error:
We should make it so the ajax doesn't fail and the form can be completed correctly (#2932226).
Secondly, when I filled the form out in correct order (ie title then media source), I was able to successfully save a new remote media type. However, there was a "File could not be created " error message and there was an incorrect thumbnail. When selecting this youtube video, I used the short link, https://youtu.be/dQw4w9WgXcQ.
Comment #197
phenaproximaThanks for testing, @robpowell!
Regarding the first problem, my knee-jerk reaction is to assume that #2557299: Any AJAX call disregards machine name verification when AJAX is used and leads to a fatal error is responsible. I have rerolled that patch for 8.6.x and we should try to land it as quickly as possible.
And as for the second problem...damn, nice find. The thumbnail download code was changed in a recent patch, so it's possible we need to improve our testing around that. So I'm kicking this back for that.
Comment #198
chr.fritschDamn it. We had a test for that, but apparently, I broke that as well.
So I fixed that and I found out that we are still calling the API for retrieving a thumbnail image. So I mocked that as well.
Comment #200
chr.fritschSince testbot is using a subdirectory, I think we should use baseUrl instead of just scheme and host.
Comment #202
phenaproximaI did some major refactoring on the tests, so I'm sorry for the big interdiff -- but it works for me locally now. Let's see how Drupal CI fares with it. The nature of the refactoring is so that we can do less HTTP request/response mocking, and more "put stuff into state so that we can actually write useful tests that don't need to do a universe of dirty work in order to properly test that things are working".
Comment #203
phenaproximaI demoed this patch in today's UX call. I'm pleased to say that everyone really liked it, and the remaining complaints are all either relatively minor, or punt-able to follow-up issues.
The oEmbed formatter doesn't appear to respect the max width and max height settings. I created a Vimeo video, set the formatter to 150x150, and the resulting video (as displayed in an entity reference field) was still larger than those dimensions. I think we need to test this more thoroughly and fix the bug.Fixed in #212.Comment #205
phenaproximaCleaned up doc blocks in new test stuff, and fixed a couple of small asks from the UX call.
Comment #207
chr.fritschThat are really nice test improvements. I like the way of mocking the providers.
So I did a couple of things:
Comment #209
phenaproximaThanks for fixing that :) I'm not sure how I feel about adding it to the plugin definition, though; I'd rather we made it part of the configuration form for *all* media sources. So maybe we should remove this for now, then open a follow-up issue to make this part of MediaSourceBase.
If we punt the label work to a follow-up, this can also be reverted.
This is not correct anymore; the problem is, if you look at the JSON file for the Vimeo fixture, I changed the thumbnail URI to internal:/core/misc/druplicon.png. Which resulted in a new filename for the thumbnail.
Comment #210
alexpottI wonder if we could achieve slightly cleaner mocking if instead of locking the http client we used an approach similar to https://lornajane.net/posts/2017/testing-api-calls-in-php-with-guzzle-mocks?
Comment #211
chr.fritschCrosspost with @alexpott
Addressing #209
1/2. Ok, reverted that change and will open a follow-up for that
3. Since the test refactoring in #202 the resource URL is different on every machine that runs the tests. That also means that the Thumbnail name changes every time. In order to prevent that, we are now using the source field value to generate a thumbnail name. I also changed from provider_name to type, because it's the only field where is guaranteed that it is set.
Comment #212
phenaproximaA few more changes in this iteration:
Comment #213
phenaproximaI've opened a follow-up to make the field label configurable when creating a media type: #2965988: Make the source field label configurable when creating a media type.
And with that, all follow-ups needed so far are filed. So I'm removing the tag.
And regarding #210: in a quick Slack chat, @alexpott agreed to defer that work to the existing follow-up for better oEmbed tests (#2964636: Improve oEmbed tests), so I'm copying his comment there.
Comment #214
phenaproximaI've added some verbiage explaining the iFrame stuff, as requested by the UX team in #203.
I don't think we need anything else in order for this to be RTBC.
Comment #215
phenaproxima@alexpott thought it might be a good idea to have a change record, so I wrote one: https://www.drupal.org/node/2966029.
Comment #216
phenaproximaAdded test coverage of hook_oembed_resource_url_alter() as requested by @alexpott on Slack. I also added an update function to create the all-important oembed_providers value in media.settings.
Comment #217
phenaproximaTested the update path as requested by @alexpott on Slack, and wrote a little documentation on the OEmbed plugin, explaining how to extend it to support additional providers.
Comment #220
phenaproximaThe failures are probably due to infrastructure stress because of the security update that came out a short time ago. Setting back to NR while I wait for the mania to subside, then I'll re-queue for testing.
Comment #221
phenaproximaI discussed this briefly with @webchick in Slack and she felt that deferring the source field label changes to #2965988: Make the source field label configurable when creating a media type was going to block this patch. So, after a short stint in the bikeshed, we decided to rename the source field to "Remote video URL", but to do it somewhat generically, such that any additional derivatives of the OEmbed plugin will get a source field labeled "$plugin_label URL" by default. So, in the case of the remote video derivative, the source field is labeled "Remote video URL".
To me, that's a pretty good compromise until we figure out how to make the source field label configurable (which is what the follow-up is for).
Comment #222
tstoecklerLet's use "@" instead of ":" for the variable placeholder, though. ":" is just for urls e.g. in < a href = "" situations
Comment #223
Gábor HojtsyReviewed #214 as requested on slack. I agree on the "Remote video URL" label change that was done later. This text was not changed in later patches yet as per the interdiffs.
1. If others add settings to this screen, it will look super odd to have this help at the top and then a bunch of settings that are unrelated to iframes. Putting all of this explanation on the setting may be a bit too long (maybe we can look into cutting it), but I think that would make the page future-proof.
2. To make sure it is clear to people that we talk about THIS site, IMHO that would be good to include. "Your Drupal site" may not make it clear we are talking about this same site.
"to prevent this"? That looks like a half written sentence to me :)
oEmbed content will not display *at all* if this is not set correctly, right?
Comment #224
phenaproximaIt sort of depends on how it was set, but it's likely to result in a lot of iframes with 404 errors in them.
Comment #225
Gábor HojtsyI think it's better to be super clear here so people will not fiddle with this setting if their iframe is too wide or too dark or whatever :) "not displayed properly" is a very wide category :)
Comment #226
phenaproximaDiscussed #225 on Slack with Gábor and we decided that "...not displayed properly, or at all" would be better, clearer wording.
Comment #227
phenaproximaToday, @chr.fritsch and I jumped on a call with @webchick to test the patch out. Here's what we found:
When creating a new media type that uses oEmbed, you are informed about the use of the iframe to display the oEmbed content. There is a link to configure the iframe domain, and that link should open in a new window. For clarity, we should also add "(opens in a new window)" after the link.Fixed in #228.YouTube videos are by far the most common use case for oEmbed, but we don't have any test fixtures for YouTube videos. Instead, we used Vimeo. We should open a follow-up to either test YouTube fixtures in addition to Vimeo, or just remove the Vimeo fixtures entirely in favor of YouTube.Opened #2966655: Test oEmbed with a YouTube fixtureThe iframe/security warnings are good, but could use a few small tweaks. @webchick will post the specifics.Improved in #228, but needs validation from @webchick.There was general agreement that #223.1 is an excellent idea. Let's wrap the settings form's iframe stuff in its own fieldset, labeled "Security".Fixed in #228.It is possible to use the oEmbed formatter on any link, string, or string_long field. And when you do, there is no filtering by allowed providers. I discussed the security implications with @samuel.mortenson and @webchick, and the agreed-upon course of action is to make the formatter apply only to media types that use an OEmbedInterface source plugin. This is better UX (one way to do something instead of three), and is more secure.Fixed in #228.Attempting to embed a Flickr URL produced an exception. We should investigate that, fix it, and test it.Fixed in #231.Some oEmbed providers -- notably YouTube -- will completely ignore the max_width and max_height settings of the oEmbed formatter, which can result in a video WAY overflowing a frame that is far too small for it, and that looks extremely weird. Therefore, given a max_width/height and the width/height of the resource, the formatter should choose whichever value is bigger.Opened #2966656: Negotiate max width/height of oEmbed assets more intelligentlyComment #228
phenaproximaLet's see how this does. I fixed:
Comment #230
phenaproximaAll needed follow-ups have been opened, so I'm removing the tag.
Comment #231
chr.fritschRegarding #227.2 I found a small bug in the formatter. I fixed that and added test coverage in the formatter test. By doing that I refactored the that test heavily. It's now testing with Media items because we limit the formatter to only work with media entities. I also introduced a dataProvider.
By testing the formatter I found that the provider_name is not a mandatory property. So we can not expect that it will be retrieved by oEmbed. So I adjusted the code a bit to take that into account.
I also noticed that the tests are still rely on an active internet connection and expect that all the oEmbed providers will respond. We should definitly fix that.
Comment #232
chr.fritschOk, here is the fix for the oEmbedFormatterTest.
::discoverResourceUrl
is now mocked and returns directly the local endpoint URL.Comment #233
phenaproximaI don't really see anything to seriously complain about here. Just minor stuff.
Do we really need to check if $media_type is set? If it's not, several things have gone wrong and it should be allowed to fail hard.
Nit: Extra blank line. Also, can we type hint $field_settings and $selectors as arrays, rename $field_settings to $formatter_settings, and document the method parameters?
Nit: s/url/URL
Comment #234
chr.fritschFixed #233
1. Yes, we have to check, because the method is called for every field. And on a 'Name' field, for example, no MediaType can be loaded
2. Fixed
3. Fixed
I also replaced
\Unicode::strtolower()
withmb_strtolower
since #2849669: Fix \Drupal\Component\Utility\Unicode() because of the Symfony mbstring polyfill landed.Comment #235
phenaproximaVery minor doc comment fixes. Otherwise, I think this is ready for final review.
Comment #236
idebr CreditAttribution: idebr at ezCompany commentedWhat great work! This will make many Media users happy :)
I found a number of nitpicks in patch and I am somewhat hesitant to post them given the progress. However, I think they collectively would finish up the patch so here goes:
Since we are expecting a URL, can we use "type: 'string'"?
Since this is an important configuration form, let's add it as a the 'configure' item in media.info.yml for discoverability.
"Constructs a" -> "Constructs an"
I think this violates our class naming convention, specifically #10:
Classes and interfaces should have names that stand alone to tell what they do without having to refer to the namespace, read well, and are as short as possible without losing functionality information or leading to ambiguity.
In other modules in Core, the settings form is typically prefixed with the module name, for example:
- \Drupal\locale\Form\LocaleSettingsForm
- \Drupal\statistics\StatisticsSettingsForm
- \Drupal\update\UpdateSettingsForm
I suggest this class is named MediaSettingsForm.
Use
$this->t()
overt()
in classes.The data type in the construct method does not match the @var declarations (string[] vs array). Let's bring these in line, personally I prefer string[]
@param array[] $endpoints would clarify the expected datatype.
"Low long" -> "How long"
The class mentioned in the doc comment does not match the actual class name (ProviderRepository)
Let's cast the Client to ClientInterface instead. The same goes for ResourceFetcher, UrlResolver, OEmbed.
If there is no known oEmbed provider with the specified name.
phpcs: the @var tag must be the first tag in a member variable comment
The code actually checks for: "$type is anything other than 'link', and $width or $height is empty."
The code actually checks for: "$type is 'rich' or 'video', and $html is empty."
According to the Drupal coding standards, NULL should be written lowercase null in @param and @return types
Missing description for the @return value
The class name in the __construct method does not match the actual class name (UrlResolver)
"Constructs a" -> "Constructs an"
"a iFrame" -> "an Iframe"
The StringTranslationTrait is available from PluginBase, so you can use $this->t() rather that t() here
The class name mentioned in the __construct method does not match the actual machine name (OEmbedResourceConstraintValidator)
Comment #237
samuel.mortensonReviewing #235:
There are three @todo-s without issues.
XML data is being decoded then encoded into JSON, then immediately decoded again. I don't see how this would "unify" the data - why not just use the decoded XML?
Comment #238
phenaproximaOpened #2972846: Improve oEmbed exception logging, but we need to mention it in the patch.
Comment #239
phenaproximaFixed everything in #236, except:
We're already using type: string...
And I fixed #238.1, but I left the JSON/XML stuff alone. That came from the original oEmbed patch and I didn't want to mess with things I don't understand :)
Comment #240
phenaproximaFixed #238.2, and ran all tests with my Internet connection disabled in order to ensure that tests are properly isolated. I did get a failure, so we should probably look into that before calling this done.
Sorry for the lack of an interdiff -- had to reroll against 8.6.x HEAD, for some reason.
Comment #241
samuel.mortensonThis looks good to me - my prior feedback has been addressed. Marking as RTBC.
Comment #242
martin107 CreditAttribution: martin107 as a volunteer commentedI can see the smallest of imperfections.
According to the test run output CS error sections there is a @file tags missing from ....
media_test_oembed.module
Comment #243
larowlanshouldn't there be a default for this in media.settings.yml?
same deal here?
its possible that the provider ID will contain characters that are invalid in PHP function names - e.g. should we be casting - to _ etc?
these bubbles up as a 500, should we instead use something that bubbles up as a 400?
is fieldset the right choice here? we did away with fieldsets for grouping (other than radios/checkboxes where that is their intended use) in #1168246: Freedom For Fieldsets! Long Live The DETAILS.
nit: the english is slightly off here.
should we document that this expects a {format} slug?
is there an existing prescribed format to which these conform or should we be using preg_quote here?
is it worth adding a ::fromProvider factory method given that is the most likely argument folks will pass?
given the large number of optional parameters here, would it be worth having some factory methods?
Resource::html()
Resource::photo()
etc
So it was clear which arguments were required for each flavour.
You could even go a step further and make the constructor protected so you had to use one of the factory methods.
As it stands, the number of arguments, and the valid combinations for each type might end up being a bit 'trial and error' for developers.
is it worth putting this in a local variable to aid readability?
given these have special meaning both here and in the constructor for Resource, should they be constants?
i didn't check these, but we're semi-endorsing them by linking to them here - so are we sure they're something we want to be associated with?
xml resource with a .json url? is it worth a comment here to point out this uses a rel link that is in xml format?
Comment #244
phenaproximaAddressed #243:
In addition to this stuff, I fixed a few other things and addressed a couple of local test failures. Sorry for the rather weighty interdiff; the factory methods added a lot of code (mostly doxygen) to Resource.
Comment #245
samuel.mortensonI don't see that this check has moved to the factory method.
Why was this done?
Comment #246
phenaproximaThe factory method calls setDimensions(), which will validate the incoming dimensions and throw an exception if they're not numbers greater than zero.
Because switching to DrupalSelenium2Driver is out of scope for this patch, and that line is an artifact from my local testing which I forgot to remove from the previous patch.
Comment #247
samuel.mortensonGot it - moving to RTBC based on my review of the interdiff from #244.
Comment #248
alexpottre #243.13 - there are still links to www.collegehumor.com in the patch - do we need them?
Comment #249
Wim LeersFor years, I've wished for oEmbed support. Looks like it's now actually happening! 🎉😵🎉
I could not resist reviewing this…
Overall, zero mentions of
@internal
in this issue. We should make as much as possible@internal
to avoid committing to supporting APIs that only one in a thousand or even fewer people will ever use.This triggered my security OCD, but I see @alexpott already brought that up 👌
Übernit: we usually write
iframe
.Perhaps mention this is expected to be JSON, and point to the relevant section of the spec at https://oembed.com/?
Oh, nice, an actual value object!
Can't we make this a more meaningful example?
For example, automatically make all YouTube URLs use the "nocookie" YouTube domain?
Interesting…
wouldn't it better to call this route/method something like
iframeSource
? So that we can also explicitly document that the HTML returned by this is not to be trusted?Nit: why not split these up?
Return a response instead of a render array so that …
Why
Markup::create()
? AFAICT you can call the constructor without any parameters.TIL
getInt()
is a thing.We're not bubbling cacheable metadata. We're capturing the cacheability of the rendered HTML.
Perhaps add an explicit comment to document that using
Markup::create()
here creates a trusted ("safe") string, even though it's for external HTML. This is only okay because this response is going to be used in an iframe.Can we make this
@internal
?Should this list that only
json
andxml
are allowed?@code {format} @endcode
?Let's then also do
assert(empty(array_diff($formats, ['json', 'xml'])));
Can't be
@internal
because exposed in an API (the alter hook).Typehint mismatch.
Übernit: s/$url/URL/
@internal
?RefinableCDI
and not justCDI
, this is a mutable value object.But apparently it already is mutable anyway.
Are you sure you need this mutability? I don't think you do. It seems you only have this for some of the static helper methods on this class.
Why is this an array instead of a
Drupal\media\OEmbed\Resource
value object?An array of what? URLs?
Drupal\media\OEmbed\Resource
objects?Nice!
Why hash this?
Query string? This alter hooks allows you to alter much more than just the query string!
If we'd move this into a helper method, then this could become:
i.e. then
viewElements()
would truly only contain view (render) logic.Übernit:
Iframe
, that's yet another way of capitalizing it.Shouldn't this also set a max age, per the max age setting for the resource providers?
Given that this will likely be seen quite often by end users, how about prefixing it with
Sorry,
?Wow, cool to see all this metadata become available in Drupal!!! Structured data FTW :)
Comment #250
phenaproximaWell, you know what they say...250th time's the charm.
I was able to address almost all of Wim's feedback in #249:
As for @alexpott's question in #248: CollegeHumor is listed in the canonical oEmbed provider database, and they support serving resources in XML format, which makes them a useful and needed test case since we do support XML resources. I watched the video itself and it's totally G-rated (not sure what the international equivalent is, the G is the rating that Disney cartoons get).
Comment #251
phenaproximaMinor doc comment fixes.
Comment #254
phenaproximaWhoops. Fixing the bad assert().
Comment #256
Wim Leers👍
new CacheableResponse()
. RemoveMarkup::create()
.protected
indeed, all the more reason to NOT useRefinableCDI
then! Because that means this value object is not mutable, except for its cacheability. That does not make sense.@internal
:Comment #257
phenaproximaEverything else is marked neither internal nor part of the API. The services and their interfaces are covered under the 1:1 class/interface rule, as are the new OEmbedInterface interface and OEmbed base class for media sources.
Comment #258
phenaproximaThis will fix the broken test.
It turns out that Twitter triggers an edge case which our tests are catching -- Twitter gives its oEmbed resources an insanely high cache lifetime of 100 damn years. When that number of seconds is added to the current request timestamp, it overflows the 'expire' column of most database cache tables, causing an SQL exception. To prevent that, we are going to arbitrarily limit resource cache lifetimes to 5 years.
Comment #260
Wim Leers👌
😂
It seems Twitter is forcing many HTTP clients to take care of edge cases, including us!
I think this cannot be internal because
Provider
is not internal.Let's see what committers think.
Why not internal?
Why not internal?
Why not internal?
Comment #261
phenaproximaQuoth the BC policy:
I believe these interfaces/classes fall under that umbrella. This is a new API, so it's probably unwise to consider it "final except for critical security fixes" -- which rules out the use of the
@api
tag. On the other hand, these are stable interfaces with implementing classes -- so they are not exactly@internal
, the 1:1 rule applies, and we don't anticipate changing them any time soon. The BC policy does not say that there should be any sort of marker on these interfaces, then, although I can add an explicit comment on all three if you'd like. But it seems clear that neither@api
nor@internal
applies in their case.Comment #262
Wim LeersI personally prefer explicitly communicating that they should be considered
@internal
, because most people don't ever read https://www.drupal.org/core/d8-bc-policy#interfaces. But that shouldn't hold up this patch.Thanks for addressing all that feedback so quickly; I'm glad we squashed a few bugs that we'd have run into later otherwise, and I'm especially glad that this now increases D8's API surface far less!
Comment #263
phenaproximaFixed a few other very small nits -- mostly tiny things in comments -- plus:
@internal
, which is in line with Drupal\aggregator\Form\SettingsForm.hash()
in the URL resolver, as per Wim's previous feedback about a similar call in the resource fetcher.Comment #264
alexpottCrediting reviewers who made comments material to the outcome of the patch.
Comment #265
alexpottRe #260.1 I think removing from
@internal
is fine. So happy to put it in as is.Comment #266
alexpottCommitted 8f6fcbd and pushed to 8.6.x. Thanks!
Fixed on commit.
Comment #268
marcoscanoWell done everyone! This is a huge win. 👏 👏 👏
A small minor bug seems to have slided into the formatter though: #2976795: Fix warning being thrown when configuring media type displays
Comment #269
steinmb CreditAttribution: steinmb as a volunteer commentedFantastic work everyone!!
Comment #270
mondrakeI am afraid this has broken HEAD testing on SQLite https://www.drupal.org/pift-ci-job/979584
Comment #271
alexpottWith #268 and #270 unfortunately I think this means we need a revert.
Comment #273
phenaproximaHere is a fail patch, and fix, for #2976795: Fix warning being thrown when configuring media type displays. I have also included an interdiff against #263, and an interdiff of the bug fix. Thanks for finding this one, @marcoscano!
I'll try to look into the SQLite problem too.
Comment #275
phenaproxima@alexpott and I worked on this for a few hours over Slack.
I got stymied by issues related to the brittle nature of the HTTP request mocking in the oEmbed tests, but @alexpott discovered that SQLite is failing due to a database transaction lock conflict during the entity save process, triggering this exception:
As @alexpott said, we do not want to be in an HTTP request during a locking database transaction, SQLite or no SQLite. So we have some work to do in this department.
However, we agreed that a good (very temporary) solution is to skip the failing test on SQLite, and address the failure over in https://www.drupal.org/project/drupal/issues/2964636. We intend to refactor the media_test_oembed module so that it wraps intelligently around the HTTP client service, which should minimize or eliminate a lot of the problems we are experiencing in testing this functionality. So I will be bumping that issue to major.
In addition to skipping the broken test, I have made the following changes in this patch as a result of talking with Alex (and my own attempts to debug the problem):
Comment #276
alexpottOne thing the SQLite fails show is that the following line
$thumbnail_uri = $this->getSource()->getMetadata($this, $this->getSource()->getPluginDefinition()['thumbnail_uri_metadata_attribute']);
can end up with
$thumbnail_uri = NULL
Which then breaks:
Comment #277
phenaproximaRe #276: I opened an issue to describe that in detail: #2976875: oEmbed source plugin can return null thumbnail URI and break things
Comment #278
phenaproximaIn further discussion with @catch and @alexpott on Slack, we agreed to try moving the potentially brittle code which may trigger HTTP requests out of Media::preSave() -- which is called in the middle of a locked database transaction -- and into Media::save(). Hopefully this works as a quick-fix, but it definitely needs to be revisited so that remote API interactions are never taking place during an entity save.
Comment #280
tstoecklerHmm... doing this in
Media::save()
is not super nice, because we explicitly support$storage->save($entity)
as a (or to be precise: the) method to save an entity, so it's pretty important thatEntity::save()
stays just a simple wrapper around that and is not overridden. I guess we can get away with it as a quick-fix/workaround, but just saying...I know people aren't excited about hooks these days, but @hchonov opened #2892654: Entity save post transactional hook a while back because we had the same use-case at bio.logis. Maybe we can get some eyes one that.
Comment #281
phenaproximaThat's the idea. It definitely sucks, but it's a temporary solution, not a permanent one. I have a few ideas on how to improve things, but they're out of scope for this issue.
Meanwhile, this should fix the test failures. I have also removed the SQLite test skip from #275. Fingers crossed.
Comment #282
phenaproximaHere, for the record, is how I suggest we improve things in a follow-up(s)...
Following the SQLite fiasco, it's clear that Media has some serious implementation issues when it comes to dealing with remote data related to a media item. That includes things like thumbnails which need to be fetched from a remote source, or miscellaneous metadata from an oEmbed resource.
The problem boils down to the fact that we are trying to update thumbnails and metadata -- which, in the case of oEmbed, usually involves talking to some remote API over HTTP -- in the Media::preSave() method. This method is called once the entity saving process has begun, which means a database transaction has started. Interacting with a remote API is an expensive and complicated process, and the database transaction may introduce a lock which can break the API interactions, or vice versa, or any number of different kinds of weirdness. The long and short of it is that we do NOT want to be waiting on an HTTP request(s) while we are in the middle of a database transaction! Now that we (will) have a remote media implementation in core, we need to consider all situations in which a media source may interact with a remote API, and ensure that we encounter none of those situations during a destructive operation like an entity save.
When dealing with oEmbed (and really, any media source that deals with remote systems in any way), virtually any available metadata attribute may trigger a remote API interaction. Therefore, we should not be trying to update a thumbnail, or retrieve any sort of remote metadata, for oEmbed items during an entity save. Only before or after it.
So, this is what I propose: move all operations that may talk to a remote API -- that is, metadata mapping and thumbnail fetching -- into a queue, so that it can all be done when the entity is not in the middle of being saved. One possible way that could look:
This approach would have the advantage of treating the thumbnail like a piece of remote data (which, for many media types, it is), and solving the as yet-unsolved debate about how to correctly handle thumbnails.
I would love to hear thoughts about this.
Comment #283
marcoscano@phenaproxima @alexpott thanks for digging into this!
I like the approach from #282, but I would suggest a minor adjustment:
- We could maybe use the https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Destructa... mechanism to trigger the first processing of the queue item still during the same php process that saved the entity, but after the response is sent back to the browser. If for some reason this processing fails, we can re-queue the item to be processed again later, but at least we have an opportunity to try to have everything set almost at the same moment of the entity save.
For the record, this could either be solved directly in #2878119: Whether queued or not, update the media thumbnail and metadata before beginning the entity save database transaction or the follow-up to solve this could supersede that issue.
Comment #284
alexpottI think we have to solve #276 in this issue because we have to expect that the network request for the thumbnail will fail sometimes. At the moment that will stop you from saving an entity and there will be an error.
Comment #285
alexpottCan do something like - when we create a media entity we always generate a thumbnail pointing to
modules/media_entity/images/icons/generic.png
and then update it once the media entity is saved and if$this->getSource()->getMetadata($this, $this->getSource()->getPluginDefinition()['thumbnail_uri_metadata_attribute']);
is successful?Comment #286
Wim Leers➕
Things will fail. YouTube will be down at some point. Everything will be down at some point.
Comment #287
phenaproximaIn discussion with @alexpott on Slack, we realized that the thumbnail logic of the oEmbed plugin is very convoluted. So I decided to untangle it.
This patch removes the thumbnail_url, thumbnail_local_uri, and thumbnail_local metadata attributes. There is now only thumbnail_uri, which will do the following things:
Additionally, this patch renames the thumbnails_uri configuration option to thumbnails_directory, for clarity's sake.
My fingers and toes are crossed that this passes tests.
Comment #289
phenaproximaThis oughta fix the tests.
Comment #290
alexpottI've been thinking about this line. I think we need to include something that would vary when the thumbnail url varies. Because if the source is updated then we regenerate thumbnail but the file is going to exist. Also for the purposes of caching this feels problematic. Maybe
crc32()
the url or something.Comment #291
phenaproximaThis patch implements two suggestions @alexpott made, both here and in Slack:
This will definitely fail tests, but it shouldn't (hopefully) be too hard to fix those failures.
Comment #293
phenaproximaOkay, this should fix the tests. Also made an additional small change @alexpott requested on Slack -- changed the SHA256 hash to Crypt::hashBase64().
Comment #294
alexpottComment #295
alexpottFor me the Media::save() stuff is still the wrong way around - we should save the media with the default thumbnail first and then attempt to get the remote thumbnail once the media entity is created.
Comment #298
phenaproximaAs in, save it once with the default thumbnail, fetch the remote thumbnail, then re-save it? I was trying to avoid a double-save, but if that's what we gotta do, that's what we gotta do...
Comment #299
alexpott@phenaproxima it's only likely to be one double save right? To create the file entity pointing to the default thumbnail. After that all media entities will be linked to the same thumbnail - right?
Here's a fixed patch.
Comment #300
phenaproximaI must be confused about what you're asking, because this is what I was thinking:
That's a lot of saves, which is why I set the default thumbnail in preSave(), then request an update just before we do the actual save operation. What am I missing?
Comment #301
alexpott@phenaproxima ah yes we'll need to re-save the media in order to update the reference. Maybe this is the best solution for now until we can discuss a better fix which involves queues. One thing I end up thinking is why are we even bothering doing the local thumbnail creation in the first place. Your content is already coupled to the availability of the third party and the whole behaviour of local thumbnails makes it super complex.
Comment #302
phenaproximaIt's unsustainably messy, no doubt about that. I know there's a follow-up issue (filed when the Media module was originally committed to core), about cleaning up thumbnail generation and update code. I'll find that issue and bump its priority.
Comment #303
phenaproximaAdding #2878119: Whether queued or not, update the media thumbnail and metadata before beginning the entity save database transaction as a related issue.
Comment #304
alexpottI've tried to do a naming things review - a few things have come up including a security issue. I've also got a patch that addresses some of this. I'll post in the next comment detailing what's been addressed.
Should we provide a default value of an empty string? And set in the update? I think so because if we don't then config will change by just going to the form and pressing save. Also this is a uri too :)
This should a type of
uri
plus the name could be better - maybeoembed_providers_url
- atm it sounds like a list.Is this new hook in the CR?
This theme function is only about iframes. It should be named accordingly and
post
is also not a great name. Let's called itmedia
as that's what is being displayed.A couple of thoughts here.
1. Do we fear clashes when a bundle and view mode clash with the name oembed and the provider ID?
2. Should we generalise this so its availabble for all sources not just oembed - we could use the source plugin ID.
Been thinking about the security of this. I don't think we should be passing url or other info here. I think we should be passing the media ID and loading it and getting the info from there. This will stop users from being able to manipulate the url and displaying things on sites that the site did not intend to be displayed. Or we should also pass a token on the URL so the attributes can't be futzed with.
This should use a special media markup object specific to the purpose. Like ViewsRenderPipeline.
I don't think this is the right place for this. It's not really about URL resolution. Not sure where to put it though.
This is clever but feels overly complex. We have the static methods - let's just call them. We don't have to shift arguments depending on type we just call each one.
Comment #305
alexpottPatch addresses #304.1, .2, .4, .6 by using a token - but we need to add tests for a bad token, .7 and .9.
The answer to #304.3 is yep it is on the CR - nice one.
We still need to resolve #304.5, add tests for .6 and .8
Comment #306
phenaproximaThis patch:
media__source_$source_id
(for non-derived sources only)media__source_$source_id__derivative_$derivative_id
(for derived sources only)media__source_$source_id__derivative_$derivative_id__provider_$provider_id
(only for oEmbed sources which know the provider name)Comment #308
BerdirI fear the people who worked so much on this aren't going to like this, but I wanted to bring it up..
So we have this 165kb patch that adds a lot of very new, pretty complex (security-wise for example) stuff to the media module, which means it is stable. We also just found a pretty serious security issue, *after* it has been committed for the first time.
There was also a lot of discussion about API stability, @internal and so on.
What if.. we add this instead as a media_oembed module that we can mark as experimental? That means the barrier for getting it committed is much lower. And the only reason media got into core as non-experimental (although hidden) module was that it was fairly close conceptually to the stable contrib module, while this is large amount of completely new code. The media module is growing rapidly and while it made sense to add all the basic media source plugins directly into that module, we also have a bunch of separate modules for various field types for example.
Comment #309
alexpottPatch attached fixes the test fails.
@Berdir thinking about the suggestion to put this in another module. That would make it easier to get this into core, I agree. And perhaps be a little less risky. But I'm not sure that adding another alpha experimental module for this is a good idea. Here's my thinking:
I don't think this patch far away now.
I need to review @phenaproxima's changes in #306 more.
Comment #310
alexpottHere's some testing of the anonymous access case.
Comment #311
alexpottReviewed #306 more. All looks good. Patch attached makes the test mocking consistent and stops using test class properties unnecessarily. Also use
iFrameUrlHelper
consistently to avoid confusion withUrlHelper
.Comment #312
samuel.mortensonMy review of the latest interdiffs looks good to me. I'm not sure if we need another core-person to response to #308, but I agree with @alexpott.
Comment #313
borisson_I agree with @alexpott's reponse to #308.
Comment #314
alexpottSecond time lucky!
Committed 00c2069 and pushed to 8.6.x. Thanks!
Comment #317
xjmWhat update instructions do module developers or site owners need from this issue? If it's just a highlighted new feature of 8.6 it should be tagged "8.6.0 highlights" instead.
Comment #318
xjmComment #319
2dareis2do CreditAttribution: 2dareis2do commented"Enter a different domain from which to serve oEmbed content, including the http:// or https:// prefix. This domain needs to point back to this site, or existing oEmbed content may not display correctly, or at all."
If I understand this correctly there advice here is to set up a subdomain that points to the main url (301 redirect) and add this to your media config? What am I missing here.? When following these instructions I simply get a 404 error where the iframe would be?
Any detail on how the is done would be useful. Do I need to set up a drupal multisite config to achieve this?
I am sure this is a great security feature, but as with quite a lot of things in Drupal the documentation is either hard or simply lacking.