Problem/Motivation
RDF relies on embedding markup inline with objects on the page. This was extremely challenging to implement in the first place, does not have good coverage in contrib, and is getting outdated.
There are various contrib alternatives:
https://www.drupal.org/project/schema_metatag - 25,000 users (depends on metatag)
https://www.drupal.org/project/jsonld
https://www.drupal.org/project/json_ld_schema
https://www.drupal.org/project/jsld
However, those projects don't have exact parity with RDF module. For example, schema_metatag requires configuration, and its current config model requires altering other config objects rather than being able to ship default config that would provide parity with the default RDF config that's currently in the Standard profile.
Proposed resolution
- Deprecate RDF in Drupal 9.4 core. Remove it from Drupal 10.0 core.
- Move RDF to contrib, thereby making it easy for existing sites to keep using it in the short to medium term.
- Create a Change Record that mentions both the RDF contrib module and alternatives, such as schema_metatag.
Remaining tasks
- A maintainer is needed for the RDF contrib module.
- Once we have that maintainer, create an implementation issue to do what's in the proposed resolution.
Original issue summary
As a maintainer of the RDF module, I don't have much confidence that the module provides the reliability we need for Drupal 8. We don't have a solution for compound fields (such as addressfield). Not all core field formatters are tested, to say nothing of contrib field formatters. And we have criticals that haven't seen any major movement in many months.
In the meantime, Google has announced that it will consume JSON that is embedded in pages using <script> tags, and other search engines have followed suit. As I outline in my companion blog post, this is preferable to HTML data formats (RDFa, microdata) for a number of reasons. The ones relevant to this issue are:
- It would mean we could use the same pipeline for REST’s serialization and HTML data
- It removes complexity from the field formatters and theme layer
- It makes it easier to replace the implementation from contrib
While this is a large change in terms of the description on the Drupal 8 box, I'm fairly certain it will actually require less work than getting our RDFa up to snuff and has a better chance of being reviewable by other people, which this work hasn't really been in the past.
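As a rough illustration of the embedded-JSON approach described above, the following Python sketch serializes a small data structure and wraps it in a script element. The media type `application/ld+json` is the one registered for JSON-LD; everything else (the function name `render_json_ld_script`, the schema.org types, the field values) is a hypothetical example and not taken from any Drupal module.

```python
import json


def render_json_ld_script(data: dict) -> str:
    """Wrap a JSON-serializable dict in a JSON-LD script element."""
    payload = json.dumps(data, indent=2, sort_keys=True)
    # "\/" is a valid JSON escape for "/", so this keeps the JSON intact
    # while preventing a "</script>" inside a value from closing the tag.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical data, as it might come out of a serialization pipeline.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example article",
    "author": {"@type": "Person", "name": "Example Author"},
}

snippet = render_json_ld_script(article)
print(snippet)
```

The point of the sketch is that the markup is produced in one place from structured data, so field formatters and templates do not need to know about it.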
Comments
Comment #1
linclark
Comment #2
webchick
Should probably get scor's opinion on this.
Is this actually a beta blocker? It seems to me if we do choose to remove this module, we could do so all the way up until a very late RC?
Comment #3
linclark
Oh, I just assumed that removing one module and adding another was a beta blocker, but I'm totally cool with it if it isn't.
Comment #4
nod_
I read the blog post as well, but just to be sure: can contrib still implement RDFa, or is there code in other core modules added to support the current RDFa module that wouldn't be possible to do from contrib?
Just asking since that was the case for overlay and it means that module is dead and impossible to implement from contrib. Not a big deal for overlay but I could see people actually wanting RDFa at some point.
Comment #5
linclark
The only issue that would make it potentially harder, if it were reverted, is #1778122: Enable modules to inject attributes into field formatters, so that RDF attributes get output.
If a contrib module wanted to support RDFa in the same way that D7 supports it, it would be entirely possible to do from contrib even if the issue is reverted. I believe it would even be possible to support the more reliable RDFa Lite processing model (which is one of the big differences between D7 and D8) from contrib if that issue is reverted, though it would be messier.
Comment #6
nod_
Ok, perfect. Thanks :)
Comment #7
Crell
Lin, is the net change here that we'd have fewer semantic attributes sprinkled throughout the page, but instead have, essentially, an alternate "machine-targeted" version (in semi-JSON-LD) embedded within the page? Viz:
Or something vaguely along those lines?
If that's correct, then in concept I am very +1. As we've discussed in the past on REST team calls, at this point I think the idea of commingling essentially two different data models into one XML tree (HTML and RDFa) was a mistake from the get-go and they should be separate documents simply linked to each other.
My concerns would be:
1) You've expressed in the past that our data model really doesn't map to JSON-LD well at all. That's why we dropped it in favor of HAL. Would it map better to "JGoogle-LD" (or whatever), or are we still looking at a hard mismatch?
2) What would be the net impact on file size? We would presumably be removing a lot of attribute markup, but adding a big string to the header. Would that be a net win on file size, big net loss, or close enough that we shouldn't bother caring? (My gut feeling is that the non-data markup on the page vastly dwarfs the amount of text we're talking about here either way, but I figure it's important to ask.)
Logistically, Lin is correct that D8's improved pluggability should make this much easier to do. To that end, I'd suggest an approach of ripping out RDFa entirely and planning to ship 8.0 with "none of the above". All of the various bits to make this happen (new encoder for serializer, hooks into the HTML Head, etc.) can happen in a contrib module for the moment, and evolve way faster than core.
Then when that's done we can fold it back into core. If that happens by 8.0, great. If not, it's exactly the sort of functionality we can add to 8.1 and show off as a hot-new-thing for 8.1. The important part for now is just ensuring that the hooks (conceptually, not hook functions) that we'd need are in core. Actually using them can then be a separate non-blocking task.
Comment #8
linclark
Yes, your understanding of how this would be included in the page is basically correct... though in the example in their docs, Google places the JSON object in the body (rather than the head).
The hardest thing about the work before was building in the flexibility to handle mapping the same entity type to multiple domain models which have incompatible ways of structuring data. And then doing so in a way that wasn't super confusing to users. If we are targeting a very specific domain model, this becomes easier.
If a person wants to expose every single property on their page as part of the JSON, then this would be true... for example, if they want to include the body as one of the properties. However, if they just expose the data that is necessary for rich snippets, I expect the difference will only be slight, since those are usually small bits of data and we often have to introduce a span element or the like to mark them up in HTML.
Comment #9
catch
I think it'd be fine to leave the API support we put in specifically for RDFa in core, but move the module to contrib. For module removal this issue should be assigned to Dries sooner rather than later but agree scor would be good to hear from.
Even if there's a bigger file size, having fewer attributes all over the place ought to be a bit better for browser performance. Whether any of that is measurable compared to the rest of the stuff on the page is a different issue.
I'd probably go for a <script> tag at the bottom of the page, then it absolutely doesn't block anything else getting rendered/parsed?
With JSON-LD, what happens if a block is rendered in isolation (client side include for example)? Can it extend the main JSON lump with its own separate bit? Or would we have to merge those? How does it work in general for multiple things rendered on the same page?
Also we'd need to ensure that the JSON-LD data that's added during rendering is compatible with render caching - it's got similar problems to drupal_set_*() drupal_add_*() vs. #attached. i.e. if the node is rendered from the render cache, then the JSON-LD should be too - rather than not at all, or from scratch each time.
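To make the merge question concrete, here is a minimal Python sketch of one possible approach: each rendered piece contributes a bare node object, and the page-level output combines them under the JSON-LD `@graph` keyword, folding together nodes that share an `@id`. The function name `merge_fragments`, the example URLs, and the assumption that all fragments share a single schema.org context are hypothetical. Whether any particular consumer handles such output is a separate question that this sketch does not answer.

```python
import json


def merge_fragments(fragments):
    """Combine per-block node objects into one document with an @graph."""
    nodes = {}      # nodes keyed by @id, merged property by property
    anonymous = []  # nodes without an @id cannot be matched, keep them as-is
    for fragment in fragments:
        node_id = fragment.get("@id")
        if node_id is None:
            anonymous.append(dict(fragment))
            continue
        merged = nodes.setdefault(node_id, {"@id": node_id})
        for key, value in fragment.items():
            if key == "@id":
                continue
            if key not in merged:
                merged[key] = value
            elif merged[key] != value:
                # Same property, different value: keep both in a list.
                # (Naive: list-valued inputs are not flattened.)
                existing = merged[key] if isinstance(merged[key], list) else [merged[key]]
                if value not in existing:
                    existing.append(value)
                merged[key] = existing
    return {
        "@context": "https://schema.org",
        "@graph": list(nodes.values()) + anonymous,
    }


# Hypothetical fragments, e.g. one per rendered (and possibly cached) piece.
node_block = {"@id": "https://example.com/node/1", "@type": "Article", "headline": "Hello"}
author_block = {"@id": "https://example.com/node/1", "author": "Example Author"}
sidebar_block = {"@type": "WebSite", "name": "Example"}

page_data = merge_fragments([node_block, author_block, sidebar_block])
print(json.dumps(page_data, indent=2))
```

Because each fragment is plain data, it could in principle be stored alongside the cached markup it describes and merged only when the page is assembled, which is the same shape of problem as #attached.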
Comment #10
Dave Reid
/me wonders how this works with Views or more complex things that aren't necessarily "only one primary thing on a page" since this would switch from inline data to one big chunk of data.
Comment #11
scor
As appealing as your proposal sounds, I think it would be a big step backwards from all the efforts we've put into RDFa and the theme layer over the years, and comes with its own trade-offs. Here are my arguments against your approach:
Overall it seems too much of a drastic change at the last minute to an approach that is still very new (like a few months old) and we're not sure how the JSON-in-HTML is going to pan out in the coming years. On the other hand we know that search engines are comfortable extracting data from traditional HTML as they have done for years, so this support is not going away. Google is currently able to extract RDFa from Drupal 7 and present results in the form of rich snippets (recipe, person), so why take that away? (Note that Google's webmaster documentation is admittedly outdated as has been reported before). I agree the markup generated by D7 leaves a lot to be desired, but in comparison the equivalent markup in D8 is much more efficient.
Regarding the technical concerns you have on the reliability of the markup:
So obviously, a big -1 from me! Here is my counter proposal: give more time to fix the remaining major issues and re-evaluate the situation before RC1.
Comment #12
pwolanin
-1 from me also.
Certainly you could turn off the RDFa and add a JSON mapping, or add the JSON mapping to a later 8.x release when the spec is more stable?
This seems like a bit of a distraction compared to deep API issues for beta.
Comment #13
linclark
You miss the point of what I'm saying... I'm not talking about config. AddressField outputs multiple properties in one blob of HTML. In order to mark up the individual properties, you need to actually have code in the formatter (or in a very hard-coded preprocess function) which knows where to place attributes, and add spans if needed.
So the way you would handle this in schemaorg_contrib is by having a long list of conditionals in hook_preprocess_field... if this is addressfield, use the tightly coupled addressfield special preprocessor. This seems unsustainable to me.
Comment #14
linclark
To me, it's less of an issue of what I would do personally. It's more about what kind of implicit promises we make to users when we say we support something.
The data exposed by Drupal 7's RDFa was unreliable. We can chalk that up to the fact that it was the first pass at implementing it. But if we continue to say that we support it in Drupal 8, we should make sure that it really works.
Having been close to this work for a number of years, and having seen the patterns of contribution around it, there is nothing at this point that leads me to believe that we will have solid support.
Comment #15
jneubert
I'm writing as a user of RDFa in Drupal 7. The RDFa support in core was the one thing which convinced me, and allowed me to convince my organization (the German National Library of Economics), to use Drupal for a Labs website which publishes information about projects for humans and machines. It worked quite well in this use case with the Drupal 7 standard features. The limitations I met were not general reliability, but rather missing support for nested RDF structures, which I could work around via custom field templates. This experience, and the strong commitment of the core development team to RDFa, made me kind of a Drupal evangelist in the library linked data community.
An approach as suggested by Lin in this post wouldn't have worked for our use case, simply because schema.org does not provide the expressivity we wanted in the descriptions of projects. It wouldn't work either for some other use cases (a thesaurus of economics and a press archive application), which we implemented with "handcrafted" RDFa applications before Drupal 7 was published. I'd hope that Drupal 8 allows us to build such applications on a solidly grounded framework with much less effort, letting us combine schema.org and domain specific vocabularies easily.
Having RDFa dropped now in core would indeed feel like a serious setback.
Comment #16
webchick
@jneubert Thanks for your perspective as a Drupal user/evangelist in this space. By chance, are you or someone else from your organization able to help tackle some of the issues in #1778226: [META] Fix RDF module, particularly #1778410: Throw exception when RDF namespaces collide (the only D8 release-blocker I'm aware of related to RDF)? I think at least part of the genesis of this issue is the fact that those issues aren't moving in a timely manner due to a lack of contributors to semantic web stuff.
Comment #17
linclark
If you're already writing your own field formatters to make things work, then this change (to move it from core to contrib) wouldn't have any impact on how it works for you.
Comment #18
linclark
Off the top of my head, these two would also be release blockers:
Comment #19
webchick
If they're actual release blockers, they should be marked as critical so we can track 'em. But make sure the issue summary has a justification for why it must hold up the entire release of Drupal 8 if it's not fixed before then, and why it could not be fixed in a later point release (e.g. 8.0.1 or 8.1.0).
Comment #20
jneubert
@lin: To be more precise: I could use 10 or so fields just as they are, and had to hack only one field template, where the nested structure was required (still using the RDF Mapping API for the inner level).
@webchick: Unfortunately, I'm not coding on the level which is required for these patches (just trying to learn about the basics of OO-PHP and Symfony). I'd be happy to help with testing, however.
Comment #21
jneubert
Re. #1777688: RDFa output incorrect when not using entity template (Views, Panels, etc) or when render array is altered, particularly views integration: I'd love to see this (because it would allow publishing OAI-ORE aggregations). But if it is not available, that's the state of affairs currently, and maybe I or somebody else may be able to extend RDFa support to this use in the future.
Comment #22
linclark
Late response to @Dave Reid:
We currently don't support inline data very well in Views unless you are using the entity system to render (which most Views I've seen do not). I note this in #1777688: RDFa output incorrect when not using entity template (Views, Panels, etc) or when render array is altered. It would be possible to support it with inline data, but would require a lot more markup than we currently output.
Comment #23
linclark
Late response to @catch:
There is nothing in JSON-LD that would block us from doing this (it is, in fact, one of the things RDF is really good at).
However, Google implements its own processors and they tend to not support anything more complicated than what they show in their examples... meaning that something which should work based on the standard does not actually work for Google.
Their testing tool is still in beta, and when I tried providing two items with the same ID, it gave me an itemtype error (which it really shouldn't). So it's unclear how that would work.
I think this should be fine. The change frequency of the JSON-LD should be the same as the HTML (or the RDFa) would be as far as I can see.
Comment #24
webchick
scor chimed in, so unassigning. This should ultimately be assigned to Dries for the final decision, but it feels like we aren't quite done with the discussion yet.
Comment #25
jneubert
Some additional, not primarily technical thoughts, which I hope are not out of scope for this issue: As far as I can see, there seems to exist no specification for the "Google-flavored JSON-LD", besides the short example in the Gmail Actions description. So all bets are off as to what extent something similar but slightly different will work in another sphere of the Google world, such as search results. Google seems to provide notoriously inaccurate and/or outdated structured data documentation, which may differ from the results of the rich snippets tool. The latter could provide some orientation, but again is explicitly not binding on what Google will actually do when preparing search results. In each of these three very loosely coupled areas (actual system behavior, testing/validation tool and specification) things may change without further notice. (And of course other search engines using schema.org will interpret markup slightly differently again.)
So a commitment to "Schema.org-focused JSON" sounds to me like a commitment to continuous reverse engineering, with an unknown number of moving parts (because we don't even know if Google will interpret the same JSON structures the same way for different subject areas, e.g., events vs. hotel ratings vs. health information).
A whole lot of SEO experts and companies try to keep up with this. As a developer of a data interface, I wouldn't like to, but that's up to the ones who actually create and maintain the code. However, this also puts a heavy burden on everybody else (outside Google) who just wants to use the data which is published through this data interface. Jeni Tennison (W3C TAG) has published a thoughtful article about Schema.org and the Responsibility of Monopoly, where she writes about the lacking "clarity, detail, and conformance criteria within the schema.org vocabulary specification":
To me, this seems just as true with regard to an apparently almost unspecified use of JSON syntax, compliant with none of the existing standards, and tightly coupled with and restricted to the schema.org vocabulary. Smaller institutions or companies which want to consume data published by Drupal and other sources trying to follow Google's course of action are in the very same situation, but can put only a vanishingly small fraction of the resources of Bing, Facebook or Yandex into the task of dealing with it.
Also for these reasons, I'd plead for carrying on with the commitment to RDFa as a widely accepted standard. All the more so as extensive reverse engineering already seems to show that Google deals with RDFa quite well.
Comment #26
Damien Tournoud
Totally +1.
Trying to mix HTML and machine-readable metadata in the same pipeline has proven to be nearly impossible for everything except the most trivial use cases.
We tried to add support for RDFa in both Addressfield and the Commerce Price Field (both since Drupal 7), and it didn't go very far.
The RDFa serialization introduces a strong dependency between the HTML theming and the machine-readable metadata, so it reduces the flexibility of both.
Kudos to @linclark for the forward thinking here.
Comment #27
jneubert
If a common feeling among developers should evolve that RDFa in the cases described above is impossible or just too hard to integrate, and/or that it is not acceptable to start with a solution broken in some known places (non-essential in my eyes, but others can judge this better), please consider resorting to another standard-based solution (such as JSON-LD, which reached Proposed Recommendation status in November), which
This would require change down the chain, but would furthermore support the use of Drupal in Linked Data publishing.
Comment #28
scor
Quick update: I went through the major issues / blockers and updated them.
- #1778194: RDF module can't handle compound fields and #1778410: Throw exception when RDF namespaces collide can both be closed (see rationales in the respective issues).
- I closed the views issue since the bug was fixed in another issue: #1777688: RDFa output incorrect when not using entity template (Views, Panels, etc) or when render array is altered. Full views support should be left for contrib, core should only support RDFa in the regular entity_view() output.
Comment #29
jessebeach
I would like to voice a vote for process here. We are 3 months away from an aggressive date to cut a beta.
We are reaching a point in the product lifecycle where launching the whole is more important than perfecting any single sub-system. We should never forget the lessons of Duke Nukem Forever.
It makes me very very nervous to start talking about removing and replacing large sub-systems at this point in the 8.0 release cycle.
I do not want to stifle this conversation. I would like to remove the beta-blocking status and take off the table the proposal to get this very risky change into the initial Drupal 8 release.
Comment #30
webchick
Yeah, I agree this is not a beta blocker, at least unless proven otherwise. The most likely scenarios to come out of this issue are either 1) closed (won't fix) because RDF gets enough work done on it, or 2) RDF module gets removed, and json_embed module or whatever happens in contrib, perhaps moved into core in 8.1.x or a later minor release. Either of which could happen anytime between now and RC1, most likely.
Comment #31
xtfer
Or both (1) and (2), theoretically. However, I think this is a step in the right direction.
Comment #32
bkudrle
Just some more thoughts along the lines of @jneubert...
The incorporation of RDFa and Semantic Web technologies into Drupal 7 was a strong attraction for me. It led to a couple of academic publications that in one sense evangelized the use of Drupal 7 for scientific applications. So to remove it from Drupal 8 core would seem a step backwards IMHO.
Also, in a case with much larger impact, the Structured Dynamics Group has created their Open Semantic Framework with Drupal 7 at its core. I am not familiar enough with the details of this framework, but their website says that Lullabot took over management of one of the sites based on this framework, so that is a good endorsement of the viability of this framework and, by extension, the advantage of Semantic Web technologies in core.
I have not been able to work much on Drupal for a couple of years, but I hope to in the near future with Drupal 8 and just wanted to voice the sentiment that it would be really nice in many ways to keep Drupal's cutting edge concerning the Semantic Web technologies by keeping these technologies in Drupal 8 core.
Comment #33
jhedstrom
Is this still under consideration for 8.0.x?
Comment #34
pwolanin (at Acquia)
If we are going to do it during the 8.x cycle, we'd need to add another model, while leaving RDF available. This would be a new feature that might be suitable for 8.1.x.
Comment #41
Berdir
Reviving this in light of #3031710: Remove scor from MAINTAINERS.txt.
As far as I see, there has not been a single commit to rdf.module in the last 3 years that was not about coding standards, deprecations or migrations. The reality is that the module is unmaintained, has been for a long time, and IMHO is also not meaningful anymore.
I'm not quite sure how we'd approach this in 8.x, we could add a hook_requirements() error that will tell users that this module will be removed in 9.x and they must either uninstall it or switch to a contrib version. If nobody else steps up, I'm willing to create a contrib project but I have no plans to actually maintain it, but if someone wants to, they can reach out then.
Sites that need the module can then already switch to the contrib version and it will use that instead and the requirements warning will go away.
I don't think that core needs to provide a replacement functionality to be able to replace it, modules like poll were also removed in the past without a 1:1 replacement. There are alternatives in contrib, we recently used https://www.drupal.org/project/schema_metatag successfully.
Comment #42
andypost
Comment #44
borisson_
I think that a requirements hook in rdf.module seems like the best way to go here. Maybe we should add this to system.module as info as well so that people know that this can happen to their site in 9.x even when they don't have the rdf module installed?
Comment #45
e0ipso
It seems that there are two topics. One is about whether or not to remove RDF, and the other is trying to come up with a core process to remove a module in the next major release.
All the data here is compelling. I'll give my +1 to move rdf to contrib without a core replacement.
That, of course, will drop a feature, so we'll need product owners involved. Also, it would potentially impact end users, so we'll need to come up with a process to have them uninstall or switch to the contrib solution. I'm foggy on this point, but maybe @alexpott has thoughts.
Comment #46
naveenvalecha
A few questions: I don't see this in the deprecation policy. How do we deprecate a module in a minor release cycle? It looks like this is a topic of discussion: how to deprecate a module?
+1 to deprecate it in 8.x cycle as there's not any active maintainer of the module and
We can also get the stats of the rdf module from the drupal infra team and add it to the issue. That would also be compelling data to take the decision.
Comment #47
catch
We already have deprecated modules in core, but we don't have a policy for actually deprecating stable modules as such. See #3013276: [META] Remove deprecated modules on the Drupal 9 branch.
Comment #48
xjm
So we should talk about whether we want to deprecate the module before 8.8.0-beta1, and if so, if we're comfortable doing it without adding a different feature to core as in the title (because that's unlikely to happen in the next month).
Comment #49
larowlan
My 2c here
- we actively use the RDF module on production sites to output microdata
- clients still actively ask for microdata
What other options are there for microdata on Drupal 8 at present?
If we remove this from core, is there a {thing} we can point people towards as the 'current best approach'?
Comment #50
Berdir
We used https://www.drupal.org/project/schema_metatag successfully in a project. https://www.drupal.org/project/json_ld_schema was also mentioned somewhere but I have no personal experience with it.
Comment #51
colan
Rather than constantly throwing things out of core, and then bringing other things back in, can we perhaps develop some sort of framework for one or more plug-in managers, which sit in core, for these types of things? Plug-ins can then sit in contrib, generally. If this is possible, wouldn't it be much more sustainable than the never-ending story of "What's the best machine-readable thingee to use now?"
Schema.org Metatag looks good, yes, but it assumes schema.org (obviously), which clearly isn't always what folks want.
Besides JSON LD Schema API, there's also JSON-LD REST Services, which does RDF.
Ultimately, it would be nice if I could go to some core admin form, and:
...depending on which plug-ins are installed.
As this is a bigger architectural issue, it might be better to resolve it first, either by postponing this issue on a new one, or repurposing it.
For what it's worth, I came across this issue while doing research on Exposing Drupal's Taxonomy Data on the Semantic Web. As I'm not the only one to run into these types of issues, it would be fantastic if we could agree on an architecture that helps everyone.
Comment #52
webchick
That indeed seems like a good idea, if we can figure out a way to make it performant.
Wearing my "product manager" hat, the #1 thing we need to accomplish over the next few months is make the upgrade from Drupal 8 to 9 easy. https://dri.es/making-drupal-upgrades-easy-forever
Given that, I'm not super keen on deprecating/removing any more stuff in D8 unless for very good reason (e.g. security). Every single one of these changes adds to a growing list of things our end users need to tweak/fiddle with between major versions, and it becomes "death by a thousand cuts" to the point that people say "eff it" and choose to replatform onto something else entirely that requires less tweaking, and less fiddling.
The main reasons given here seem to be that no one's maintaining it (OTOH, it's also not a major source of bugs, either), and that the recommended standards have shifted (which, at least according to one source of data I was able to find: http://webdatacommons.org/structureddata/#results-2018-1 RDFa is definitely not the most prevalent, but is still in fairly widespread use). It also seems to be part of the HTML5 spec: https://www.w3.org/TR/rdfa-in-html/
I dunno, this isn't really my area; it's much more a "frameworky" feature. Just a general plea from the product managers to stay laser focused on making sure D9 is as smooth a trip as possible. 🙏
Comment #53
catch
@colan #51, that seems tricky, because RDF is rendered directly with different components (partly why it was so difficult to introduce and why supporting it for different new elements in contrib is hard), whereas JSON LD is a blob of data on the page somewhere. i.e. it's not just a different format but a different delivery mechanism too. This might work for some options though but not at all for what we currently have in core.
@webchick #52. While I agree 8.x-9.x smoothness should be the priority, this issue or a previous one previously drifted during the 8.0.x alpha phase because we were trying to get 8.0.x released. So if we don't deprecate in 8.8.x, I think we should try to deprecate quite soon after 9.x is opened, for 10.0.x. That way we won't be trying to discuss this two weeks before 11.0.x is opened.
For me personally, RDF while it's unmaintained, as webchick points out is also pretty stable/harmless, so I think deprecating for 10.0.x is a reasonable option here. I do think given there are equal or better options in contrib that just allowing contrib to provide this is fine. If we do try to do that, the change record should summarise the pros and cons of the different contrib modules to help people choose.
Comment #54
webchick
If we're discussing deprecating it in Drupal 9 for removal in Drupal 10, that definitely could be on the table.
Then I would personally prefer to see it replaced with something else, versus just removed and booted to contrib. Drupal being able to generate some form of semantic output by default seems desirable as a core feature. This is important for SEO, etc. (If this is not possible, so be it, but that would be my preference.)
I have no idea if JSON is the preferable/modern way of doing this, so would defer to others on that. "Microdata" is the #1 thing on that set of bar graph charts, so if those are the same thing, even better!
Comment #56
catch
So a new issue that has come up is that easyrdf (which we don't use at runtime, but is a big dependency for RDF module's test coverage) is not PHP 7.4 compatible, and the project looks more or less unmaintained. This means we'll either need to fork it or otherwise refactor RDF module's test coverage in order for it to pass on PHP 7.4.
This is the beginnings of a maintenance burden for RDF which we've not really run into until the past couple of weeks, but should be considered here I think.
Comment #57
larowlan
Added #3090017: Isolate test dependency on easyrdf/easyrdf to a single trait to attempt that.
Comment #59
catch
Updated the issue summary a bit; it could use more work.
Comment #60
ressa at Ardea
I recently added structured metadata to a web site, and after a bit of research found that JSON-LD is currently one of the more popular formats, and will most likely become the dominant one. I implemented it with the https://www.drupal.org/project/metatag and https://www.drupal.org/project/schema_metatag modules.
Between 2018 and 2019 (http://webdatacommons.org/structureddata/2019-12/stats/stats.html), JSON-LD usage by domains increased by ~1,265,000, whereas RDFa usage decreased by ~343,000:
[Chart: Usage by domains, 2018 to 2019]
html-microdata is still #1, but like RDFa it is embedded within the HTML of the website, which could complicate the addition and removal of structured data. JSON-LD, on the other hand, is a separate chunk of data.
See also Schema.org And Metadata in Drupal.
Comment #61
catch
According to the stats on #3158669: By default deprecate non-experimental modules that are used by less 5% of sites before the next major version, around a quarter of remaining RDF usages are from Drupal sites, probably because RDF module is enabled by default in the standard profile (i.e. many sites won't have made a conscious decision to use it).
Comment #62
ressa at Ardea
That's interesting @catch, but isn't that only the Drupal 8 stats? Looking at #2867597: Top Drupal 7 and Drupal 8 core sub-modules, it seems like RDF is also enabled by default in Drupal 7, so could we add another ~530,000 (80% of 672,250 sites using D7) if we include Drupal 7?
It looks like Drupal 9 also has RDF enabled by default, so a rough estimate could be 80% of the total number of installs, excluding Drupal 5 and 6, which is 1,075,115 - 32,518 = 1,042,597. 80% of that result is ~834,000, which means that Drupal 7, 8 and 9 could account for as many as ~834,000 out of the 1,039,623 domains currently using RDF, which is more than 80%.
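The rough estimate in the comment above can be reproduced with a few lines of arithmetic. This is only a sketch: the install counts and the RDF domain count are the figures quoted in the comment, the 80% share is the commenter's assumption rather than a measured value, and the variable names are made up for illustration.

```python
# Figures as quoted in the comment above; not independently verified.
total_installs = 1_075_115   # total reported Drupal installs
d5_d6_installs = 32_518      # Drupal 5 and 6 installs, excluded from the estimate
rdf_domains = 1_039_623      # domains quoted as currently using RDF
assumed_rdf_share = 0.80     # ASSUMPTION: share of D7/D8/D9 sites with RDF enabled

# Installs on versions where the Standard profile enables RDF by default.
eligible_installs = total_installs - d5_d6_installs

# Estimated number of Drupal sites emitting RDF markup.
estimated_rdf_sites = eligible_installs * assumed_rdf_share

# Fraction of all RDF-using domains that could be Drupal sites.
drupal_fraction = estimated_rdf_sites / rdf_domains

print(f"Eligible installs:   {eligible_installs:,}")
print(f"Estimated RDF sites: {estimated_rdf_sites:,.0f}")
print(f"Share of RDF usage:  {drupal_fraction:.1%}")
```

The result is sensitive to the assumed 80% share: since the two totals are nearly equal, the final fraction is roughly whatever share is assumed.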
Comment #63
catch
@ressa that sounds right!
Comment #65
xjm
Moving this to the Ideas queue for discussion there. (We separate the policy discussion from implementation for these since there are different needs.)
Comment #66
xjm
Comment #67
Gábor Hojtsy
Parenting to the Drupal 10 deprecations meta.
Comment #68
catch
I asked @webchick if she had more thoughts since #59, and she said that she still has the same opinion (i.e. would prefer replacing rather than removing with no core alternative), but wouldn't block signing off on removal either.
For me personally, our last chance to deprecate RDF for Drupal 10 is in the next few months, whereas we could introduce a new schema.org experimental module at any time between now and during the Drupal 10 cycle, so it would be good to deprecate and point people to contrib alternatives for now, and then if there's a good core candidate (either from contrib or from scratch) I do think it's something useful to have in core since pretty much any public facing site benefits from it.
Comment #69
Gábor Hojtsy
Reading most of the recent comments, I am not sure how the contrib options are equal or better as per @catch from #53. For example, jsonld says it depends on the core RDF mappings in the first place. So it's not so much a replacement as a different output format (on a different endpoint?). How would wrapping the data in a different format solve the problems with compound fields and others?
Comment #70
catch
I'm not the best person to answer this, but as I understand it the main difference is this:
RDF: Requires markup inline next to the things being represented - so for example an author name might have some RDF markup right next to it on the page, telling the machine what it is. This was very technically challenging when developing RDF, we had to make changes to the render system and every element needs to be compatible.
JSON/everything else: uses a 'blob of JSON' in the header somewhere, which holds the metadata about the page author, along with metadata for other page elements.
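The contrast described above can be sketched in a few lines. This is an illustrative sketch only, not how Drupal's RDF module or any contrib module is implemented: the `article` data and both function names are hypothetical, and the schema.org `Article`/`Person` types are just one plausible vocabulary choice.

```python
import json
from html import escape

# Hypothetical page data, used only for illustration.
article = {"headline": "Hello world", "author": "Jane Doe"}


def render_rdfa(data):
    """Inline approach: each rendered element carries its own semantic attributes."""
    return (
        '<article vocab="https://schema.org/" typeof="Article">'
        f'<h1 property="headline">{escape(data["headline"])}</h1>'
        f'<span property="author">{escape(data["author"])}</span>'
        "</article>"
    )


def render_json_ld(data):
    """Blob approach: plain markup plus one separate block of metadata."""
    metadata = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": data["headline"],
        "author": {"@type": "Person", "name": data["author"]},
    }
    markup = (
        "<article>"
        f'<h1>{escape(data["headline"])}</h1>'
        f'<span>{escape(data["author"])}</span>'
        "</article>"
    )
    # Escape "</" so the JSON cannot close the script element early.
    payload = json.dumps(metadata).replace("</", "<\\/")
    script = f'<script type="application/ld+json">{payload}</script>'
    return markup, script


print(render_rdfa(article))
markup, script = render_json_ld(article)
print(markup)
print(script)
```

In the inline version, every template that prints a field has to know about the metadata attributes. In the blob version, the visible markup is untouched and the metadata can be built from the same data in one place.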
Comment #71
bbrala
To add to that, a simple example of JSON-LD.
Comment #72
Niklan
Added another contrib module for that.
Comment #73
catch
To expand on #68, here's an example of the extremely close coupling of RDF with the theme layer:
https://api.drupal.org/api/drupal/core%21modules%21rdf%21rdf.module/func...
Comment #74
effulgentsia at Acquia
From a framework management perspective, I agree that embedded JSON-LD would be a better implementation for Drupal than RDFa. I don't think JSON-LD was far enough along when we first put RDFa into Drupal core, but now it is.
I share @webchick's concern in #68 about putting core, and therefore the Standard profile, into a state where it produces no schema.org output of any kind. That does feel like a regression to me as well. However, core doesn't include any SEO modules to begin with, so people who want SEO-friendly sites are already having to get https://www.drupal.org/project/metatag and others from contrib. Given that, having to also get https://www.drupal.org/project/schema_metatag from contrib doesn't seem like a big extra leap.
Originally when we put RDF into core, the thinking was to make the internet better overall by having Drupal sites by default including machine-readable structured data, even for sites who don't proactively want to increase their search engine friendliness. We'd be losing that by punting to contrib, because then only sites motivated by SEO reasons will end up taking the extra step of installing the contrib module.
I realize that Drupal core's RDF module is currently unmaintained, but what's the current state of the problems/risks associated with that? Is the dev dependency described in #56 the biggest risk, or have other significant issues surfaced since then?
I guess my overall feeling is that in terms of framework management, +1 on removing an unmaintained core module that has a solution in contrib that both implements a better technical decision (for 2021, even though that wasn't the better technical decision in 2010) and is better maintained. However, I also don't think those reasons alone are enough to outweigh a desire from product managers to have some schema.org output within Drupal generated pages as a product feature of the Standard profile, unless we have more solid arguments for how the RDF module is creating problems for us.
Comment #75
DamienMcKenna
+1. This is past due.
I would suggest removing RDF and leaving the space to contrib modules, or at the very least separating the two goals (1. removing RDF. 2. adding modern microdata formatting to core).
Disclaimer: I'm somewhat biased as lead maintainer of Metatag.
Comment #76
catch
@effulgentsia
The main issue from my perspective is the close coupling with the theme layer and the reliance on preprocess, for example https://api.drupal.org/api/drupal/core%21modules%21rdf%21rdf.module/func...
We have longstanding (although not very active) issues to massively reduce our reliance on preprocess, for example #2702061: Unify & simplify render & theme system: component-based rendering (enables pattern library, style guides, interface previews, client-side re-rendering).
Switching to a format which is 'single blob of data somewhere on the page' as opposed to 'lots of little bits of data intertwined with HTML' immediately removes all of that need for preprocess.
Comment #77
effulgentsia at Acquia
Thanks! Yeah, I can see how #76 is compelling.
The biggest barrier that I see for existing RDF module users switching to https://www.drupal.org/project/schema_metatag, other than having to find it and composer require it, is that that module doesn't come with default config (and arguably shouldn't), which means when you first enable it, you don't get any JSON-LD output at all, and have to explicitly add the metatags that you want and populate them with the correct tokens for where that data is in the site. Core's rdf module doesn't include default config either, but the Standard profile includes default rdf mappings, so the user doesn't need to do any work to have their site outputting structured data.
I wonder if a prerequisite for removing RDF module from core should be to create a contrib module that provides default config for schema_metatag that outputs (approximately) equivalent data that you currently get with Standard profile's RDF config. That way, the migration instruction for existing D9 sites (that didn't customize their RDF mappings, which is probably the vast majority of them) could just be to install that module.
I don't know enough about Metatag module to know how easy or hard this would be. For example, the metatag.metatag_defaults.node config object might already exist on the site (for other, not Schema.org, metatags), and I don't know if schema_metatag's metatags would need to be merged into that config object or if such a module could provide its own config objects that don't conflict with existing metatag_defaults ones.
Comment #78
DamienMcKenna
There's an architectural gap in Metatag around assigning new default values when a submodule with new meta tags is enabled. Right now it isn't technically (easily?) possible, but maybe it's something to look at in #2826669.
Comment #79
webchick
Apparently one of the reasons this isn't moving forward is because I voiced strong opinions back in the day about this. I no longer have strong opinions about this (or indeed, most Drupal things) these days. :)
Less flippantly, unlike with something like Multisite, where there is always end-user pushback whenever we propose removing it, this issue has been around for almost a decade, and has seen no similar pushback. And it's a fair point that anyone who cares about SEO has to go and download several contrib modules today anyway (sad panda).
So if the rest of the reasons to do this are valid, no need to hold it up on my account.
Comment #80
effulgentsia at Acquia
Taking one tag off per #79.
For framework manager review, I'm +1 for moving the RDF module as-is from core to contrib, so if that's what we want to make this issue's scope, I'd remove that tag. For the current issue title, I'm not confident in that due to #77. However, I would not be opposed to another framework manager removing that tag if they don't think #77 needs to be a blocker to us recommending schema_metatag as the preferred replacement.
Comment #81
catch
For deprecating from core, providing the same RDF module in contrib should be enough; that allows people to maintain their current site configuration without any changes.
However, I think it's also a good idea for the change record to point out that schema_metatag (and other modules) exist too.
Comment #82
quietone at PreviousNext
Comment #83
catch
I think all the release managers are agreed on moving RDF to contrib, so removing that tag.
We need a core implementation issue(s) next, probably one issue to isolate RDF support to the module, one to deprecate in 9.4.x, and one to remove in 10.0.x.
Comment #84
effulgentsia at Acquia
I think #83 makes this policy issue RTBC, so doing so for visibility. To mark it fixed, we probably need to open the implementation issue per #83. Do we also want to communicate this decision in some way, or wait until the implementation issues are done before doing that?
Comment #85
catch
I think we need someone to volunteer to create the contrib project before we can do the core implementation issues. Communication can probably happen as the implementations are landing or even afterwards.
Comment #86
effulgentsia at Acquia
Comment #87
effulgentsia at Acquia
Comment #88
effulgentsia at Acquia
Updated IS for #85 and other cleanup.
Comment #89
effulgentsia at Acquia
Comment #90
quietone at PreviousNext
Added child issue for tracking the move of RDF from core to contrib, #3267267: [Meta] Tasks to deprecate RDF.
Comment #91
quietone at PreviousNext
As for implementation, there is also #3273976: [Meta] Tasks to remove RDF from core and move to contrib, and the issue to approve RDF maintainers for the contrib version, #3304913: Offering to co-maintain RDF.
There is nothing left to do in the remaining issues.
Comment #92
quietone at PreviousNext
This work has been completed in core.
Thanks!