Throw exception when RDF namespaces collide [#1778410]

Updated: Comment #44

Problem/Motivation

The RDF module assumes that each prefix is mapped to a unique namespace URI and that this namespace will never change in time. This is because these prefixes are used as is in Drupal config, so changing the namespace would mean a change in the meaning of the mapping. This is easy to avoid in core since there is only one single implementation of hook_rdf_namespaces(), but contrib or custom modules might try to register a prefix which is already registered by another module with a different namespace, and thus end up with conflicting namespaces for the same prefix. This is unlikely to happen in practice given that popular namespaces usually have a unique best practice prefix associated to them, but we need a way to warn developers if they try to register conflicting namespaces.

Proposed resolution

Throw an exception if this situation occurs.

Remaining tasks

Review patch.

User interface changes

none

API changes

none. No change record required since conflicting namespaces were already not supported in Drupal 7, it's just that Drupal 8 will complain loudly about it instead of silently ignoring them.

Older report:

Problem/Motivation

RDFa supports prefix-based indirection, a mechanism for abbreviating URIs. The prefix is a pointer to a full URI. When the local term is combined with the full URI, it creates the meaningful identifier for the term.

The Drupal implementation of this mechanism is flawed. It mistakes the prefix as the global identifier for the vocabulary, instead of the URI it points to. The mappings are stored in the database using CURIEs; e.g. schema:name. However, the mapping between prefix and URI can be changed independently.

Drupal's two collaborating hooks, hook_rdf_namespaces and hook_rdf_mapping, do not properly handle the indirection. The namespace mapping is set in hook_rdf_mapping, which compiles the list of namespace mappings dynamically on every call. However, the CURIEs are not dynamically generated, they are stored (using prefixes) in the database. (For reference, there are two [1] [2] previous issues regarding this).

This results in two problems.

Intra-module inconsistency

The namespace mappings are computed and then temporarily cached, so changes to the mapping are registered with every cache clear. The CURIEs that use those namespace mappings are stored in the database and never recomputed. Thus, if a namespace mapping is changed in a module and it uses a different prefix, the CURIEs will no longer expand properly.

Inter-module inconsistency

hook_rdf_mapping keep track of which module defined which namespace mapping. This causes the same problems that global variables pose in a modular system like Drupal—it is easy for two mappings with the same prefix to collide. If two modules define different namespace mappings for the same prefix, then the terms from one module will expand to the wrong URI.

Proposed resolution

Use a CURIE array for any mappings that use CURIEs. This CURIE array would be structured as follows:

array(
  'local_name' => 'interactionCount',
  'prefix' => 'schema',
  'prefix_module' => 'rdf',
);

Then, compute prefixes so that if the same prefix is used twice, one of them gets a number appended to the end. For example, if a contrib module schema defines the prefix "schema" to map to http://ex.com/my-schema, then any mappings which use the prefix "schema" with the prefix_module "schema" would have its CURIEs processed out with the prefix "schema1". Any mappings that use the prefix "schema" with the prefix_module "rdf" would have its CURIE processed out with the prefix "schema".

API changes

Removes hook_rdf_namespaces and replaces with configuration object.

Changes mapping with CURIE from:

    properties:
      - schema:author

    properties:
      - local_name: 'author'
        prefix: 'schema'
        prefix_module: 'rdf

Comment	File	Size	Author
#68	interdiff.txt	2.82 KB	kay_v
#68	8482667-68-rdf-conflicting-namespace.patch	4.67 KB	kay_v
#63	rdf-conflicting-namespace-1778410-63.patch	4.66 KB	ashepherd
#63	interdiff.txt	1.15 KB	ashepherd
#59	interdiff.txt	985 bytes	ashepherd
#59	rdf-conflicting-namespace-1778410-59.patch	3.52 KB	ashepherd
#58	interdiff.txt	1.37 KB	ashepherd
#58	rdf-conflicting-namespace-1778410-58.patch	3.52 KB	ashepherd
#55	interdiff-1778410-52-55.txt	2.18 KB	Stefan Freudenberg
#55	rdf-conflicting-namespace-1778410-55.patch	4.54 KB	Stefan Freudenberg
#52	rdf-conflicting-namespace-1778410-52.patch	5.58 KB	Stefan Freudenberg
#49	1778410-49-rdfnamespace-conflict_check.patch	1.38 KB	adorsk
#37	1778410-37-rdfnamespace-config__includes1784234.patch	63.87 KB	linclark
#37	1778410-37-rdfnamespace-config-do-not-test.patch	35.82 KB	linclark
#33	1778410-33-rdfnamespace-config_includes1784234.patch	63.82 KB	linclark
#33	1778410-33-rdfnamespace-config-do-not-test.patch	35.82 KB	linclark
#30	1778410-30-rdfnamespace-config_includes1869600.patch	169.62 KB	linclark
#30	1778410-30-rdfnamespace-config-do-not-test.patch	27.75 KB	linclark
#30	interdiff.txt	7.14 KB	linclark
#24	1778410-24-rdfnamespace-config_includes1869600.patch	168.17 KB	linclark
#24	1778410-24-rdfnamespace-config-do-not-test.patch	22.91 KB	linclark
#24	interdiff.txt	11.97 KB	linclark
#22	1778410-22-rdfnamespace-config_includes1869600.patch	165.56 KB	linclark
#22	1778410-22-rdfnamespace-config-do-not-test.patch	19.78 KB	linclark
#22	interdiff.txt	14.45 KB	linclark
#20	1778410-20-rdfnamespace-config.patch	149.22 KB	linclark
#17	1778410-17-rdfnamespace-config.patch	134.49 KB	linclark
#17	1778410-17-rdfnamespace-config__for-review.txt	26.03 KB	linclark
#15	1778410-15-rdfnamespace-config__for-testing.patch	85.05 KB	linclark
#15	1778410-15-rdfnamespace-config__for-review.txt	26.29 KB	linclark
#15	interdiff.txt	14.71 KB	linclark
#11	1778410-11-rdfnamespace-config__for-testing.patch	73.5 KB	linclark
#11	1778410-11-rdfnamespace-config__for-review.txt	14.21 KB	linclark
#11	interdiff.txt	8.94 KB	linclark
#9	interdiff.txt	7.41 KB	tim.plunkett
#9	rdf-1778410-9-do-not-test.patch	8.38 KB	tim.plunkett
#9	rdf-1778410-9-with-other-issue.patch	62.36 KB	tim.plunkett
#8	1778410-08-rdfnamespace-config-do-not-test.patch	6.27 KB	linclark

Comments

Comment #1

Anonymous (not verified) commented 10 September 2012 at 12:53

Component:	base system	» rdf.module
Issue tags:		+RDFa, +html data, +json-ld

Comment #2

Anonymous (not verified) commented 21 September 2012 at 17:21

#1791678: Change the way RDF mappings are stored describes a way to store each term mapping as its own record, instead of as a single serialized array.

Comment #3

scor commented 29 September 2012 at 05:38

The advanced registry handling is interesting and worth considering, thanks for posting this. It's powerful but more complex compared to the full URI. I'm not opposed to it though. Performance wise, although the advanced registry would require more queries and joins to build the mappings for each bundle, it would not have a major impact since the mappings used for entity rendering would be cached (the only impact would be when initializing the caches). In D7 we went for a serialized storage because we didn't need to filter down the mappings within a bundle, this might change in D8, let's see. I'd like to discuss the use cases you're hoping to solve with this approach.

When fields are deleted, it's hard to delete their mappings, which results in cruft in the database

I agree, this would be more straight forward to do with a db query.

Additionally, this would open the possibility to specify configuration on a term-by-term or vocab-by-vocab basis. For example, that only Schema.org mappings should show up in HTML, but all vocabularies should be used in JSON-LD.

Hum, are you thinking of the local vocab vs external mappings? I don't think the local vocab should be stored alongside the external mappings, I think the local vocab terms should be generated on the fly when needed and not clutter the external RDF mappings. Also, how is your idea above different from what you advocated a while back? I think that having one external vocabulary in one serialization and another combination of external vocabs in another serialization would lead to confusion and inconsistencies. I make one exception to this rule though, the local vocab, which is not necessary for some case and should be optional. My point is, external vocabs should remain together in all serializations. There might be additional data in some serializations, but the same data item should not have different mappings depending on which serialization you used to retrieve it. makes sense?

How do you expect people to debug this, compared to having the mappings serialized with a row for each bundle?

How would the import work in the context of features for example? Imagine I want to export my mappings and import them on another site, would there be any primary key id conflicts?

How does your proposal related to evoc? Are you trying to provide a solution for Neologism's vocabulary terms management (replacement for evoc)? I think at best Neologism could leverage this system, but it would require more fine grained provenance of the namespaces beyond the module level, and have groups instead.

Comment #4

Anonymous (not verified) commented 30 September 2012 at 18:25

Also, how is your idea above different from what you advocated a while back?

I'm suggesting that we allow people to choose a subset of vocabulary terms to expose the data. For example, if the startDate of an event is mapped to both schema:startDate and rdf_repo:startingTime, then the user could specify that only Schema.org is used in the HTML data, but both are used in a Turtle version, which will be consumed by the RDF repo that defines the rdf_repo vocab.

I believe providing such a subset is different than allowing developers to totally rearrange the model and insert triples that aren't in the model. If you provide a subset, there is still compatibility between the HTML data version and the other serializations... nothing that is said in the one contradicts what is said in the other. You can't say the same thing for an _alter.

How do you expect people to debug this, compared to having the mappings serialized with a row for each bundle?

I think this is related more to the #1791678: Change the way RDF mappings are stored than this issue, but I'll answer it here.

I honestly don't think that people are debugging the RDF mapping by examining the serialized array in the database... for a person, it is hard to parse a serialized array, especially if the mapping is large.

I believe that providing a row for each property makes it way easier to debug from the database than it is now.

How would the import work in the context of features for example? Imagine I want to export my mappings and import them on another site, would there be any primary key id conflicts?

You would simply write the RDF mapping feature code to handle this upon feature installation. Just because a particular ID is used to connect multiple things in feature code, that doesn't mean that the same ID needs to be saved to the database when the feature code is processed.

How does your proposal related to evoc?

It is not related to Evoc. It is simply a way to fix our broken prefix-based indirection. It doesn't include any vocabulary import.

Fixing the prefix-based indirection should be the main goal. I don't think it's worthwhile to consider Neologism too heavily in current D8 development, as there isn't any clear momentum for it to be ported to D7 at this point and it has a limited user base. However, this proposal would provide a foundation for Neologism, as it is based on a plan I worked on with the current Neologism project lead to ensure that RDFx in D7 would work for them.

Comment #5

tim.plunkett

he/him

English

Philadelphia

commented 5 December 2012 at 01:06

Coming from #1356170: Remove all uses of array_merge_recursive, or document why they are being used instead of NestedArray::mergeDeep(), rdf_get_namespaces() depends on module_invoke_all() using array_merge_recursive(), and then preforms its own filtering.

Furthermore, the actual behavior is that any conflicts result in the prefix not being added at all.
See the last assertion in GetRdfNamespacesTest::testGetRdfNamespaces().

This issue implies that none of the hook-based solutions are ideal, whether the first or last takes priority or none do. Any advice on how to move forward in that issue while not complicating this issue further?

Comment #6

scor commented 5 December 2012 at 15:55

FYI, it's very likely that this part of the rdf module will be refactored and we could definitely keep this requirement in mind when rewriting it.

Comment #7

Anonymous (not verified) commented 15 December 2012 at 23:27

I'm starting to think that an RdfVocabulary config entity might make sense. Then we could attach relevant metadata to the vocabulary as well.

Comment #8

Anonymous (not verified) commented 22 December 2012 at 04:33

Status:

Active

» Needs review

Status	File	Size
new	1778410-08-rdfnamespace-config-do-not-test.patch	6.27 KB

This patch creates an RdfNamespaces config entity and uses that to store the mappings. It also removes hook_rdf_namespaces.

I'm working on top of the patch in #1869600: Refactor RDF mappings to be inline with the new Entity Field API, so it might not apply directly.

Comment #9

tim.plunkett

he/him

English

Philadelphia

commented 23 December 2012 at 00:11

Issue tags:

+Configurables

Status	File	Size
new	rdf-1778410-9-with-other-issue.patch	62.36 KB
new	rdf-1778410-9-do-not-test.patch	8.38 KB
new	interdiff.txt	7.41 KB

I just have some feedback from the ConfigEntity front, attached as an interdiff.
I've also rerolled the above, also using the other issue as a starting point.

I think this is a reasonable usecase of them, but will there ever be a UI to create them? Otherwise it will be our first ConfigEntity with no UI.

Comment #10

23 December 2012 at 01:01

Status:

Needs review

» Needs work

The last submitted patch, rdf-1778410-9-with-other-issue.patch, failed testing.

Comment #11

Anonymous (not verified) commented 23 December 2012 at 16:34

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	8.94 KB
new	1778410-11-rdfnamespace-config__for-review.txt	14.21 KB
new	1778410-11-rdfnamespace-config__for-testing.patch	73.5 KB

Thanks Tim :)

The fails in Tim's patch are because the patch it depends on fails RDFa placement tests.

Since Tim's patch:

Renamed namespaceUri property to namespace_uri. That seems to be the config convention.
Introduced RdfNamespaceManager. This manager processes the RdfNamespace definitions, resolving conflicts. This doesn't solve the issue yet, because the individual CURIEs used to map types and fields need to be connected to the correct namespace definition. I will be posting that in the next patch.

Comment #12

23 December 2012 at 17:17

Status:

Needs review

» Needs work

The last submitted patch, 1778410-11-rdfnamespace-config__for-testing.patch, failed testing.

Comment #13

tim.plunkett

he/him

English

Philadelphia

commented 23 December 2012 at 18:14

The config key maps to the name of the property in the ConfigEntity, which are lowerCamelCase, so that needs to be reverted.

Comment #14

Anonymous (not verified) commented 24 December 2012 at 14:36

I'm confused, the handbook page on the Configuration API uses underscores, http://drupal.org/node/1762632... see some_string, some_int, some_bool

Comment #15

Anonymous (not verified) commented 24 December 2012 at 15:35

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	14.71 KB
new	1778410-15-rdfnamespace-config__for-review.txt	26.29 KB
new	1778410-15-rdfnamespace-config__for-testing.patch	85.05 KB

This patch adds RdfNamespaceManager::prepareCuries() and tests for it. The method accepts an array of CURIES in the following form:

<?php
array(
  // This uses a namespace definition from RDF module, so just use the CURIE as is.
  'schema:Article',
  // This uses a namespace definition from another module. Indicate which module defines the prefix.
  array(
    'curie' => 'schema:Foobar',
    'prefix_module' => 'my_module',
  ),
);
?>

The resulting array would be:

<?php
array(
  'schema:Article',
  'schema1:Foobar',
);
?>

Comment #16

24 December 2012 at 16:23

Status:

Needs review

» Needs work

The last submitted patch, 1778410-15-rdfnamespace-config__for-testing.patch, failed testing.

Comment #17

Anonymous (not verified) commented 29 January 2013 at 19:11

Status:

Needs work

» Needs review

Status	File	Size
new	1778410-17-rdfnamespace-config__for-review.txt	26.03 KB
new	1778410-17-rdfnamespace-config.patch	134.49 KB

Reroll.

Comment #18

29 January 2013 at 22:34

Status:

Needs review

» Needs work

The last submitted patch, 1778410-17-rdfnamespace-config.patch, failed testing.

Comment #19

Anonymous (not verified) commented 3 June 2013 at 17:18

Title:

When RDF namespaces collide, terms expand incorrectly

» RDF namespaces collisions result in invalid CURIEs

Renaming to better describe the issue. I'll be rerolling shortly.

Comment #20

Anonymous (not verified) commented 3 June 2013 at 18:25

Status:

Needs work

» Needs review

Status	File	Size
new	1778410-20-rdfnamespace-config.patch	149.22 KB

Rerolled. This will probably fail miserably.

Comment #21

4 June 2013 at 00:10

Status:

Needs review

» Needs work

The last submitted patch, 1778410-20-rdfnamespace-config.patch, failed testing.

Comment #22

Anonymous (not verified) commented 10 June 2013 at 17:32

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	14.45 KB
new	1778410-22-rdfnamespace-config-do-not-test.patch	19.78 KB
new	1778410-22-rdfnamespace-config_includes1869600.patch	165.56 KB

I've incorporated the changes from the RDF mapping issue, which should help with test failures.

In addition, I've changed the config naming to be a bit more standard and have removed the custom storage controller.

Comment #23

10 June 2013 at 19:55

Status:

Needs review

» Needs work

The last submitted patch, 1778410-22-rdfnamespace-config_includes1869600.patch, failed testing.

Comment #24

Anonymous (not verified) commented 12 June 2013 at 20:00

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	11.97 KB
new	1778410-24-rdfnamespace-config-do-not-test.patch	22.91 KB
new	1778410-24-rdfnamespace-config_includes1869600.patch	168.17 KB

This patch introduces a RdfUri class which handles conversion between a CURIE array and a CURIE string.

Comment #25

12 June 2013 at 21:26

Status:

Needs review

» Needs work

The last submitted patch, 1778410-24-rdfnamespace-config_includes1869600.patch, failed testing.

Comment #26

scor commented 12 June 2013 at 21:47

Status:

Needs work

» Needs review

+++ b/core/modules/rdf/lib/Drupal/rdf/Plugin/Core/Entity/RdfMapping.php
@@ -81,6 +82,17 @@ public function getBundleMapping() {
+    // Transform CURIE arrays into CURIEs.
+    if (!empty($types)) {
+      foreach ($types as $key => $type) {
+        if (is_array($type)) {

are there situations where $type may not be an array? is that to allow full URIs as types, or related to the temporary CURIEs as array approach?

+++ b/core/modules/rdf/lib/Drupal/rdf/Plugin/Core/Entity/RdfMapping.php
@@ -176,4 +209,20 @@ public function getExportProperties() {
+  /**
+   * Temporary function to be used during development to test the concept of
+   * using arrays everywhere without changing everything. If arrays work, then
+   * this will be removed and all CURIE strings will be changed to arrays.
+   */
+  protected function tempConvertCurieToArray($curies) {

ok. did you want to add a @todo to remember to evaluate that later, so we don't forget it?

+++ b/core/modules/rdf/lib/Drupal/rdf/RdfUri.php
@@ -0,0 +1,55 @@
+  /**
+   * Constructor.
+   *
+   * @param array|string $uri_info
+   *   Either a URI string or a CURIE array in the following structure:
+   *     - local_name: The term name. E.g. name in schema:name.
+   *     - prefix: The prefix. E.g. schema in schema:name.
+   *     - prefix_module: The module that defines the mapping between the
+   *       prefix and its full URI.
+   */
+  public function __construct($uri_info) {
+    if (is_array($uri_info)) {
+      $this->localName = $uri_info['local_name'];
+      $this->prefix = $uri_info['prefix'];
+      $this->prefixModule = $uri_info['prefix_module'];
+    }
+    $this->namespaceManager = \Drupal::service('rdf.namespace_manager');

The docs say that $uri_info can be a string, but in this case why is it not stored in some property of the object? also looks like the tests don't test the case where CURIEs are strings (at least in the interdiff, but might have missed something).

Comment #27

scor commented 12 June 2013 at 22:02

This patch introduces a RdfUri class which handles conversion between a CURIE array and a CURIE string.

I'm not sure where the idea of CURIE as array came from? not that I'm against it, but curious to know why you think it's better than CURIEs as strings. Doesn't it add a level of complexity for the developer who sets up the mappings in the YAML file? or is the array just a data structure used behind the scenes in the code logic, but no visible in YAML? If this impact the YAML, could you post an example.

Also, it seems the issue summary is outdated after these discussions. For example, it seems we won't really need to worry about any kind of database storage since we're using a config entity now.

Comment #28

Anonymous (not verified) commented 12 June 2013 at 22:53

The CURIEs as arrays have been part of this patch since around #15, and I discussed it in #11. Using the array makes it possible to indicate which module defines the prefix. Yes, it would impact the YAML... the CURIE string would be replaced by the array.

Prefixes should never be used as persistent identifiers in and of themselves... URIs are identifiers. Prefixes in RDF are inherently rebind-able. The prefix is just a temporary alias, which could be re-aliased in the next request. Therefore, you shouldn't use them as identifiers in your persistence layer.

Regarding adding complexity... it doesn't add complexity, it simply reveals complexity. What we are doing by using CURIE strings, to borrow phrasing from Benji Nowack, isn't reducing complexity... we're just hiding it. And we're hiding it in a way that makes things break. When someone just pastes "schema:name" into a YAML file, they don't understand that some other part of the system that they've never looked at or touched will dictate whether this is a valid CURIE or not. By making it clear that they have to indicate the prefix_module, we signal how exactly this works. We take the dark magic out of it.

We could still provide UIs that do not expose this complexity, because it could be handled in the code behind the scenes. But I don't think that hiding the complexity at the API level is a good thing.

Regarding the implementation comments (the todo and the docs inconsistency), this patch was simply meant to get your feedback on the approach before fully committing to it.

Comment #29

scor commented 13 June 2013 at 18:53

oh, I didn't know you were talking about #15. sure that looks good, let's go ahead with that.

Comment #30

Anonymous (not verified) commented 15 June 2013 at 23:39

Status	File	Size
new	interdiff.txt	7.14 KB
new	1778410-30-rdfnamespace-config-do-not-test.patch	27.75 KB
new	1778410-30-rdfnamespace-config_includes1869600.patch	169.62 KB

This patch adds documentation for the RdfUri class, adds support for passing in full URIs, and makes the tests that were failing pass.

Comment #31

16 June 2013 at 00:34

Status:

Needs review

» Needs work

The last submitted patch, 1778410-30-rdfnamespace-config_includes1869600.patch, failed testing.

Comment #32

scor commented 25 June 2013 at 02:23

Great news! The main refactoring patch for RDF was committed (#1869600: Refactor RDF mappings to be inline with the new Entity Field API), so this issue can move forward now.

Comment #33

Anonymous (not verified) commented 28 June 2013 at 15:15

Status:

Needs work

» Needs review

Status	File	Size
new	1778410-33-rdfnamespace-config-do-not-test.patch	35.82 KB
new	1778410-33-rdfnamespace-config_includes1784234.patch	63.82 KB

OK, this now includes the updated Schema.org mappings, and removes the temp function.

Comment #34

scor commented 28 June 2013 at 21:25

Status:

Needs review

» Needs work

+++ b/core/profiles/standard/config/rdf.mapping.taxonomy_term.tags.yml
@@ -2,11 +2,17 @@ id: taxonomy_term.tags
 fieldMappings:
   name:
     properties:
-      - 'schema:name'
+      - local_name: 'name'
+        prefix: 'schema'
+        prefix_module: 'rdf'

I appreciate the concerns in #28, I think the way you solved it with the array is good. Given that we won't have a mapping UI in core, should we try to flexible model in YAML to let people edit their mappings there directly? I found it quite handy when testing the other the earlier RDF patches to be able to do just edit my YAML files using CURIEs directly. I feel that we're losing in terms of DX when we're switching from:

   title:
     properties:
       - 'schema:name'
   created:
     properties:
       - 'schema:dateCreated'
     datatype_callback: 'date_iso8601'
   body:
     properties:
       - 'schema:text'

   title:
     properties:
       - local_name: 'name'
         prefix: 'schema'
         prefix_module: 'rdf'
   created:
     properties:
       - local_name: 'dateCreated'
         prefix: 'schema'
         prefix_module: 'rdf'
     datatype_callback: 'date_iso8601'
   body:
     properties:
       - local_name: 'text'
         prefix: 'schema'
         prefix_module: 'rdf'

Is there a way we could still support the advanced syntax for those who want to define fancy prefixes, but let those who just want to use the prefixes defined by core (which are known, stable, and always enabled) write CURIEs like in the first snippet above? In other words, if it's a string, then use the prefixes defined by rdf.module. In fact if I use a string in the YAML, it works, and it outputs as is, so this scenario is possible, except there is no validation. So it's up to the developer to check there is no typo. Just wondering whether we should document this practice, or even encourage it for the majority of the users?

Comment #35

Anonymous (not verified) commented 28 June 2013 at 21:59

I think that inconsistency will only confuse the issue for novice users. I would prefer to have the API be explicit and consistent.

Comment #36

Anonymous (not verified) commented 28 June 2013 at 22:07

In fact if I use a string in the YAML, it works, and it outputs as is, so this scenario is possible, except there is no validation.

Also, we can't really do validation on CURIE strings if we also want to allow URIs for the values. Even though CURIEs are not URIs, they fit the URI pattern.

<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]

Therefore, if we throw exceptions for "prefixes" that we don't recognize, we might be rejecting a valid URI.

So if we want to do validation to ensure that the prefix is defined and also allow arbitrary schemes in URIs, I think we have to use the array structure.

Comment #37

Anonymous (not verified) commented 3 July 2013 at 15:47

Status:

Needs work

» Needs review

Status	File	Size
new	1778410-37-rdfnamespace-config-do-not-test.patch	35.82 KB
new	1778410-37-rdfnamespace-config__includes1784234.patch	63.87 KB

Reroll.

Comment #37.0

Anonymous (not verified) commented 3 July 2013 at 15:47

Issue summary:

View changes

Updated issue summary.

Comment #37.1

Anonymous (not verified) commented 3 July 2013 at 20:13

Issue summary:

View changes

Switch proposed solution.

Comment #37.2

Anonymous (not verified) commented 3 July 2013 at 20:19

Issue summary:

View changes

Add explanation of CURIE processing.

Comment #37.3

Anonymous (not verified) commented 3 July 2013 at 20:36

Issue summary:

View changes

Added API changes.

Comment #37.4

Anonymous (not verified) commented 3 July 2013 at 20:37

Issue summary:

View changes

Fix broken end tag.

Comment #38

tim.plunkett

he/him

English

Philadelphia

commented 17 July 2013 at 17:32

Priority:	Normal	» Critical
Issue tags:		+Approved API change

As part of #1802750: [Meta] Convert configurable data to ConfigEntity system

Comment #39

catch

he/him

English

commented 3 September 2013 at 13:18

Issue tags:

+D8 upgrade path

This looks like it'll require a HEAD-HEAD upgrade path if it goes in after beta, so tagging for upgrade path.

Comment #40

Anonymous (not verified) commented 23 September 2013 at 12:18

Top. This needs review.

Comment #41

catch

he/him

English

commented 16 October 2013 at 19:27

Issue tags:

-RDFa, -D8 upgrade path, -Configurables, -html data, -json-ld, -Approved API change

#37: 1778410-37-rdfnamespace-config__includes1784234.patch queued for re-testing.

Comment #42

16 October 2013 at 19:36

Status:

Needs review

» Needs work

The last submitted patch, 1778410-37-rdfnamespace-config__includes1784234.patch, failed testing.

Comment #42.0

(not verified) commented 16 October 2013 at 19:36

Issue summary:

View changes

Change h3 to h2.

Comment #43

alexpott

he/they

English

🇪🇺🌍

commented 12 November 2013 at 20:12

Issue summary:	View changes
Issue tags:		+RDFa, +D8 upgrade path, +Configurables, +html data, +json-ld, +Approved API change

Restoring tags eaten by the d6 tag monster

Comment #44

scor commented 11 December 2013 at 21:14

This is unfortunate, but we can live with this. I don't think the added complexity in the mapping (and performance impact) proposed in this issue is worth it, especially given that it would only benefit a very small portion of use cases. Let's look at the audience of this issue first: the majority of the sites will only use a handful of popular vocabularies (e.g. schema.org and possibly SKOS, provenance, VOID, dcat), and that's all they will need. In the 0.1% of the other use cases, they might require more of the long tail of vocabularies. To avoid any kind of prefix conflict, I've created a documentation page: RDF namespaces registry. It lists all the popular vocabulary prefixes that people can pick from as needed. Secondly, people should never change their prefix bindings. I've never seen this problem in the wild. I don't think Drupal needs to support such feature.

Let's close this issue.

Comment #45

jneubert commented 15 December 2013 at 10:17

+1 for closing.

For the 0.1% using long tail vocabularies, and don't find their vocabs on RDF namespaces registry, perhaps it could be suggested there to

check their prefix at http://prefix.cc
add it on the page (without creating any duplicates for prefixes nor vocabularies!)

Comment #46

Anonymous (not verified) commented 15 December 2013 at 16:57

So there was at least one case of this in contrib where two modules defined the og prefix differently. Yes, one didn't do it in a standard way, but that doesn't mean that users who installed the two modules should expect the prefix declaration to fail.

Silent failure is bad design for something that few people understand or think to check. The system could at least throw an exception when prefixes are defined in a way that would cause the system to drop the prefix from the list of namespace declarations.

That is my two cents. As I am resigning as a maintainer of this module, I will not be adding any further objections.

Comment #47

scor commented 15 December 2013 at 17:37

Title:	RDF namespaces collisions result in invalid CURIEs	» Throw exception when RDF namespaces collide
Priority:	Critical	» Major
Status:	Needs work	» Active

Silent failure is bad design for something that few people understand or think to check. The system could at least throw an exception when prefixes are defined in a way that would cause the system to drop the prefix from the list of namespace declarations.

Thanks Lin for the suggestion, this is something that would be trivial to add so that this problem gets raised as early as possible during site development process.

NB: I will be offline until Dec 23rd and then on and off until Jan 7th.

Comment #48

scor commented 22 January 2014 at 17:03

Issue tags:

+RDF code sprint

Comment #49

adorsk commented 25 January 2014 at 21:28

Priority:

Major

» Minor

Status	File	Size
new	1778410-49-rdfnamespace-conflict_check.patch	1.38 KB

Here's a first pass at this, let me know what you think.

Comment #50

adorsk commented 25 January 2014 at 21:31

Status:

Active

» Needs review

Comment #51

26 January 2014 at 15:37

Status:

Needs review

» Needs work

The last submitted patch, 49: 1778410-49-rdfnamespace-conflict_check.patch, failed testing.

Comment #52

Stefan Freudenberg commented 5 February 2014 at 02:56

Status	File	Size
new	rdf-conflicting-namespace-1778410-52.patch	5.58 KB

I have created a new patch with a test that checks whether the exception is thrown.

Comment #53

Stefan Freudenberg commented 5 February 2014 at 02:58

Status:

Needs work

» Needs review

Comment #54

scor commented 5 February 2014 at 13:41

This looks great, Stefan!

Let's avoid adding a new test GetRdfNamespacesConflictTest, and instead, test for the exception in the existing GetRdfNamespacesTest (the tests will run faster that way). For that, all you need is to enable the rdf_conflicting_namespaces module midway through testGetRdfNamespaces(), and place your code that catches the exception right afterwards....

Comment #55

Stefan Freudenberg commented 5 February 2014 at 15:11

Status	File	Size
new	rdf-conflicting-namespace-1778410-55.patch	4.54 KB
new	interdiff-1778410-52-55.txt	2.18 KB

See attached patch and interdiff implementing previous comment.

Comment #56

scor commented 5 February 2014 at 17:19

Priority:	Minor	» Normal
Status:	Needs review	» Needs work

+++ b/core/modules/rdf/rdf.module
@@ -114,9 +113,16 @@ function rdf_get_namespaces() {
+          $namespaces[$prefix] = $namespace;
+      }
     }
   }
+  }
   return $namespaces;

indentation issues here.

+++ b/core/modules/rdf/lib/Drupal/rdf/Tests/GetRdfNamespacesTest.php
@@ -39,6 +39,15 @@ function testGetRdfNamespaces() {
-    $this->assertEqual($ns['dc'], 'http://purl.org/dc/terms/', 'When a prefix has conflicting namespaces, the first declared one is used.');
+
+    \Drupal::moduleHandler()->install(array('rdf_conflicting_namespaces'), TRUE);
+
+    try {
+      $ns = rdf_get_namespaces();
+      $this->fail(t('Expected exception not thrown for conflicting namespace declaration.'));
+    }
+    catch (\Exception $e) {
+      $this->pass(t('Expected exception thrown: @message', array('@message' => $e->getMessage())));
+    }

Maybe add a short comment to explain what's going on here, why we're suddenly enabling that module.

Comment #57

scor commented 5 February 2014 at 17:38

Issue summary:

View changes

Comment #58

ashepherd commented 7 February 2014 at 18:53

Status:

Needs work

» Needs review

Status	File	Size
new	rdf-conflicting-namespace-1778410-58.patch	3.52 KB
new	interdiff.txt	1.37 KB

Addressed the problems of comment 56

Comment #59

ashepherd commented 7 February 2014 at 18:57

Status	File	Size
new	rdf-conflicting-namespace-1778410-59.patch	3.52 KB
new	interdiff.txt	985 bytes

Missed a space in the comment. This patch removes that extraneous space character

Comment #60

7 February 2014 at 20:02

Status:

Needs review

» Needs work

The last submitted patch, 59: rdf-conflicting-namespace-1778410-59.patch, failed testing.

Comment #61

7 February 2014 at 20:46

The last submitted patch, 58: rdf-conflicting-namespace-1778410-58.patch, failed testing.

Comment #62

scor commented 8 February 2014 at 01:23

@ashepherd: The files rdf_conflicting_namespaces.info.yml and rdf_conflicting_namespaces.module are missing from the last patches (they were present in #55). Patch #55 should be re-applied so you can get these two files, and then you need to "git add" these 2 files and commit them to your branch for them to be part of your patch. (the other files that are affected by the patch will fail to get patched, but that's fine, you can just ignore them).

Comment #63

ashepherd commented 8 February 2014 at 12:17

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	1.15 KB
new	rdf-conflicting-namespace-1778410-63.patch	4.66 KB

added the missing rdf_conflicting_namespaces files that were present in #55

Comment #64

scor commented 8 February 2014 at 13:14

Status:

Needs review

» Reviewed & tested by the community

Thanks @ashepherd, this patch looks good now.
(aside: Nice to see the issue metadata switches back in the d.o comment form!)

Comment #65

ashepherd commented 8 February 2014 at 13:16

@scor, no problem! thanks for showing me the ropes

Comment #66

alexpott

he/they

English

🇪🇺🌍

commented 14 February 2014 at 10:54

Status:

Reviewed & tested by the community

» Needs work

+++ b/core/modules/rdf/lib/Drupal/rdf/Tests/GetRdfNamespacesTest.php
@@ -39,6 +39,16 @@ function testGetRdfNamespaces() {
+      $this->fail(t('Expected exception not thrown for conflicting namespace declaration.'));
...
+      $this->pass(t('Expected exception thrown: @message', array('@message' => $e->getMessage())));

+++ b/core/modules/rdf/rdf.module
@@ -114,7 +113,14 @@ function rdf_get_namespaces() {
+          throw new Exception(t('Tried to map @prefix to @namespace, but @prefix is already mapped to @orig_namespace.', array('@prefix' => $prefix, '@namespace' => $namespace, '@orig_namespace' => $namespaces[$prefix])));

We don't use t() in test messages. Use String::format to do placeholder replacement.

+++ b/core/modules/rdf/tests/rdf_conflicting_namespaces/rdf_conflicting_namespaces.module
@@ -0,0 +1,15 @@
+    'dc'       => 'http://purl.org/conflicting/namespace',

Weird formatting

Also noticed whilst reviewing that rdf_get_namespaces has a completely superfluous call to function_exists() - this is checked in ModuleHandler::getImplementations() - can we open a follow up to remove this?

Comment #67

scor commented 14 February 2014 at 13:30

since rdf_get_namespaces() is being refactored in this patch, let's fix the function_exists() in the next reroll.

Comment #68

kay_v commented 18 February 2014 at 23:22

Status:

Needs work

» Needs review

Status	File	Size
new	8482667-68-rdf-conflicting-namespace.patch	4.67 KB
new	interdiff.txt	2.82 KB

implementing changes requested in #66

however, assuming t in rdf.module at
throw new Exception(t('Tried ...
should be maintained per Coding standards > PHP exceptions

also omitting fix for function_exists() per #67

Comment #69

scor commented 19 February 2014 at 02:43

Status:

Needs review

» Reviewed & tested by the community

Alex's comments in #66 have been addressed. back to RTBC.

Comment #70

andypost

he/him

Russian

commented 19 February 2014 at 02:55

Issue tags:

+Needs change record

Comment #71

scor commented 19 February 2014 at 03:35

@andypost: Is the Needs change record tag in light of what's written in the OP:

No change record required since conflicting namespaces were already not supported in Drupal 7, it's just that Drupal 8 will complain loudly about it instead of silently ignoring them.

Happy to be proven wrong, but wanted to check if you had seen this ^^.