There are currently a few different approaches to handling db inserts/updates in the code base, and I've only just started really moving them over to dbtng, as that was higher priority at first. Now that that's all pretty much set, though, we should move on to updating the writes to be smarter, too. There's currently a set of (some-commented) methods on VersioncontrolEntity that form the basis of the write logic. These need to be thought through, expanded, and ultimately properly implemented by all the descendants.

Comments

sdboyer’s picture

Issue tags: +git phase 2

Tagging. We need this for a solid API, I'd be embarassed to finish phase 2 without it.

webchick’s picture

Priority: Normal » Critical

Marking as a migration blocker.

chrisstrahl’s picture

Issue tags: +git sprint 2

Tagging for sprint 2

sdboyer’s picture

Status: Active » Needs work

Been working on this, probably halfway there.

sdboyer’s picture

Title: Unify the various save/insert entity methods under DBTNG » Unify entity C(R)UD
Issue tags: +git sprint 3

Collapsing this with #724616: provide a way to delete an entities from the database, as it really makes most sense to just deal with all the CRUD at once.

eliza411’s picture

Issue tags: +git sprint 4

Tagging Sprint 4 for completion week 1

sdboyer’s picture

StatusFileSize
new20.83 KB

OK, here's an initial patch. What's in here is really kinda straightforward, but it took a while because working through it raised a LOT of things to think about. Some notes:

  • There's no batched insert in this patch. Batch inserts require quite a bit of backflipping to make sure that we've attached new ids to the right objects, as well as making sure that any secondary, dependent inserts get accomplished as well. So, while it'd probably be very nice to have, I'm going to hold off on doing it until performance tests actually indicate it to be a bottleneck in need of addressing.
  • There's also no batched delete in the patch. Batch deletes are less complicated than batch inserts (because we don't have to realign serial-generated IDs later after the fact), but I'm still applying the same reasoning as with inserts.
  • ensure() methods are gone (or if they managed to stick around, they're certainly deprecated). They need to be replaced, as it's potentially useful for most of our entities, possibly all (the question is on VersioncontrolRepository). This patch shouldn't be committed until they're added.
  • More minor, kinda - seems I totally forgot VersioncontrolOperation in this patch. Not sure how that happened, I know I made a lot of those changes...

There are some remaining schema issues that really need to be cleared up - versioncontrol_source_items is the easy example (do we keep it? or do we just kill it and stick the relevant data into versioncontrol_item_revisions?). Representing operations vs. entities, which is what #879600: Meta: introduce an activity stream separate from commit logs is turning into, is more complicated. Obviously, how the schema works is going to affect CRUD. To the extent that it's possible, though, I think we can keep this issue focused on creating the interface/skelestructure, committing it with the schema we have, then changing the internal logic behind the interface as we make schema changes.

marvil07’s picture

+++ includes/VersioncontrolRepository.php
@@ -214,37 +209,29 @@ abstract class VersioncontrolRepository implements VersioncontrolEntityInterface
-    $this->_update();
+    $this->backendUpdate();

this looks good, moving to another consistent way to name backend interaction(vs. general hook interaction)

As some hooks are changing the apid doc file should also be changed(I just committed a minor change) so please update versioncontrol.api.php with this patch.

Powered by Dreditor.

sdboyer’s picture

Yeah, glad you like that change. I figure it's a pattern we can use elsewhere, too - if we know it's something the backend should be working with, just stick 'backend' at the beginning of the method name.

sdboyer’s picture

StatusFileSize
new33.56 KB

Updated patch. Adds two big things:

  • Moves VersioncontrolOperation over to the new, unified system. We skip the {versioncontrol_operation_labels} table entirely and use 'commit' as the named entity type in the hook invocations, as I'm expecting that this class will shortly be moved over as a dedicated handler for commits (see #879600: Meta: introduce an activity stream separate from commit logs). This also rips out a lot of logic for managing attached/nested objects (like, the items on an operation), though, which will ultimately need to be restored.
  • An $options array as the sole parameter to be passed to every one of the CRUD methods, so that we still have a unified interface but the caller can instruct the object to take different logical paths.

The $options parameter is tied to the nested objects issue. It might end up not being necessary, but from where I'm sitting right now there are too many possibilities for how CUD should behave for me to be comfortable with a single grand-unified method. These possibilities have nothing to do with how each backend operates individually, so inheritance can't help us.

I'm gonna try to get another patch round in on this tonight.

marvil07’s picture

+++ includes/VersioncontrolBranch.php
@@ -9,6 +9,8 @@
+  protected $_id = 'label_id';

Maybe we can also use this issue to review all the classes data members, IIRC I have seen some inconsistencies on data members like repo_id instead of a repository object and only use repo_id on the drupal_write_record and related stuff.

+++ includes/VersioncontrolBranch.php
@@ -59,55 +61,41 @@ class VersioncontrolBranch extends VersioncontrolEntity {
+  public function update($options = array()) {

If we are using an $options variable we should pass it at backend<crud_op>()< /code> method and <code>versioncontrol_entity_<entity>_<crud_op>() hook.

+++ includes/VersioncontrolBranch.php
@@ -118,8 +106,20 @@ class VersioncontrolBranch extends VersioncontrolEntity {
+  public function deleteRelatedCommits() {
+    $commits = $this->loadCommits();
+    foreach ($commits as $commit) {
+      $commit->delete();
+    }
   }

In DVCS a commit can be applied to many branches/tags, so we should not delete commits before making sure they are not applied to other branches, but maybe that should happen overwriting the method. Not really convinced about this method.

+++ includes/VersioncontrolOperation.php
@@ -197,160 +198,37 @@ abstract class VersioncontrolOperation extends VersioncontrolEntity {
-    foreach ($item_revisions as $path => $item) {
-      $item->sanitize();
-      $item->vc_op_id = $this->vc_op_id;
-      $item->ensure();
-      $item['selected_label'] = new stdClass();
-      $item['selected_label']->get_from = 'operation';
-      $item['selected_label']->successor_item = &$this;
-
-      // If we've got source items (which is the case for commit operations),
-      // add them to the item revisions and source revisions tables as well.
-      foreach ($item->source_items as $key => $source_item) {
-        $source_item->ensure();
-        $item->insertSourceRevision($source_item, $item->action);
-
-        $source_item->selected_label = new stdClass();
-        $source_item->selected_label->get_from = 'other_item';
-        $source_item->selected_label->other_item = &$item;
-        $source_item->selected_label->other_item_tags = array('successor_item');
-
-        $item->source_items[$key] = $source_item;
-      }
-      // Plus a special case for the "added" action, as it needs an entry in the
-      // source items table but contains no items in the 'source_items' property.
-      if ($item->action == VERSIONCONTROL_ACTION_ADDED) {
-        $item->insertSourceRevision(0, $item['action']);
-      }

I know this logic is obscure, and it needs a replacement, but for now is needed. I do not see how the operation nested objects are inserted, you also mentioned this to me on IRC, so I am just writting it here to avoid forgetting it and to put it in context.

+++ includes/VersioncontrolOperation.php
@@ -197,160 +198,37 @@ abstract class VersioncontrolOperation extends VersioncontrolEntity {
+    return $this;

It seems like this patch also tried to unify the way we are returning/throwing exceptions, and it seems like it is going to be like DBTNG, returning the object to be able to chain method calling.

+++ includes/VersioncontrolOperation.php
@@ -359,64 +237,14 @@ abstract class VersioncontrolOperation extends VersioncontrolEntity {
+  public function delete($options = array()) {

Yep, all VersioncontrolOperation CRUD operations are nested.

+++ includes/VersioncontrolOperation.php
@@ -442,26 +270,6 @@ abstract class VersioncontrolOperation extends VersioncontrolEntity {
-  private function setLabels($labels) {
-    db_query("DELETE FROM {versioncontrol_operation_labels}
-    WHERE vc_op_id = %d", $this->vc_op_id);
-
-    foreach ($labels as &$label) {
-      $label->ensure();
-      db_query("INSERT INTO {versioncontrol_operation_labels}
-      (vc_op_id, label_id, action) VALUES (%d, %d, %d)",
-        $this->vc_op_id, $label->label_id, $label->action);
-    }
-    $this->labels = $labels;
-  }

+++ includes/VersioncontrolRepository.php
@@ -258,24 +249,19 @@ abstract class VersioncontrolRepository implements VersioncontrolEntityInterface
-    watchdog('special',
-      'Version Control API: added repository @repository',
-      array('@repository' => $this->name),
-      WATCHDOG_NOTICE, l('view', 'admin/project/versioncontrol-repositories')
-    );

We need this for the nested crud for operation labels or an analogue method for it, like you mentioned in IRC.

+++ includes/VersioncontrolRepository.php
@@ -220,37 +215,33 @@ abstract class VersioncontrolRepository implements VersioncontrolEntityInterface
-    watchdog('special',
-      'Version Control API: updated repository @repository',
-      array('@repository' => $this->name),
-      WATCHDOG_NOTICE, l('view', 'admin/project/versioncontrol-repositories')
-    );

Maybe is the right time also to unify where we should add watchdog messages. By now it seems like it is going to be nowhere at crud. Is that what we want?

+++ includes/VersioncontrolRepository.php
@@ -304,70 +289,21 @@ abstract class VersioncontrolRepository implements VersioncontrolEntityInterface
-    db_query('DELETE FROM {versioncontrol_labels}
-              WHERE repo_id = %d', $this->repo_id);
-
-    // Delete item revisions and related source item entries.
-    $result = db_query('SELECT item_revision_id
-                        FROM {versioncontrol_item_revisions}
-                        WHERE repo_id = %d', $this->repo_id);
-    $item_ids = array();
-    $placeholders = array();
-
-    while ($item_revision = db_fetch_object($result)) {
-      $item_ids[] = $item_revision->item_revision_id;
-      $placeholders[] = '%d';
-    }
-    if (!empty($item_ids)) {
-      $placeholders = '('. implode(',', $placeholders) .')';
-
-      db_query('DELETE FROM {versioncontrol_source_items}
-                WHERE item_revision_id IN '. $placeholders, $item_ids);
-      db_query('DELETE FROM {versioncontrol_source_items}
-                WHERE source_item_revision_id IN '. $placeholders, $item_ids);
-      db_query('DELETE FROM {versioncontrol_item_revisions}
-                WHERE repo_id = %d', $this->repo_id);
-    }
-    unset($item_ids); // conserve memory, this might get quite large
-    unset($placeholders); // ...likewise
-
-    // Delete accounts.
-    $accounts = $this->loadAccounts();
-    foreach ($accounts as $uid => $usernames_by_repository) {
-      foreach ($usernames_by_repository as $repo_id => $account) {
-        $account->delete();
-      }
+    // FIXME accounts are changing significantly, this will need to, too
+    foreach ($this->loadAccounts() as $account) {
+      $account->delete();
     }
 
-    // Announce deletion of the repository before anything has happened.
-    module_invoke_all('versioncontrol_repository', 'delete', $this);
+    db_delete('versioncontrol_repositories')
+      ->condition('repo_id', $this->repo_id)
+      ->execute();
 
-    $this->_delete();
+    $this->backendDelete();
 
-    // Phew, everything's cleaned up. Finally, delete the repository.
-    db_query('DELETE FROM {versioncontrol_repositories} WHERE repo_id = %d',
-      $this->repo_id);
-
-    watchdog('special',
-      'Version Control API: deleted repository @repository',
-      array('@repository' => $this->name),
-      WATCHDOG_NOTICE, l('view', 'admin/project/versioncontrol-repositories')
-    );
+    module_invoke_all('versioncontrol_entity_repository_delete', $this);
   }
 

Definitely we need to resurrect some of this, repo deleting needs to remove all related stuff. So, again nested crud operations for repo this time are needed.

Ok many transversal comments, but I like the target of this issue, clean the CRUD is one thing we must do :-)

Powered by Dreditor.

sdboyer’s picture

Maybe we can also use this issue to review all the classes data members, IIRC I have seen some inconsistencies on data members like repo_id instead of a repository object and only use repo_id on the drupal_write_record and related stuff.

Not a bad time to do this, probably. I've sorta been doing it as I go along. I've tended to avoid tasks like this because they end up spiraling into other API-fixing tasks, but as we're really getting quite near the end of the API work, it might actually be feasible. But I'm gonna keep it in the "I'll do it if there's time" category.

If we are using an $options variable we should pass it at backend<crud_op>() method and versioncontrol_entity_<entity>_<crud_op>() hook.

Agreed. It already is passed to the backend*() methods, but should also be passed to the hook invocation.

I know this logic is obscure, and it needs a replacement, but for now is needed. I do not see how the operation nested objects are inserted, you also mentioned this to me on IRC, so I am just writting it here to avoid forgetting it and to put it in context.

Yeah, this is one of those nested cases...however, even if we do retain the source items table (which I continue to be on the fence about) the logic here is still too much...I think. More importantly though, that logic should be on the Item, not the Operation/Commit. In any case, though, this logic has been removed and that needs to be dealt with (schema change, some other CRUD additions), somewhere.

It seems like this patch also tried to unify the way we are returning/throwing exceptions, and it seems like it is going to be like DBTNG, returning the object to be able to chain method calling.

Yep, to the extent that it makes sense, I'd like to present a fluent/chainable API.

Maybe is the right time also to unify where we should add watchdog messages. By now it seems like it is going to be nowhere at crud. Is that what we want?

Yes, I stripped out all the watchdog calls - I don't really see a reason to have them in there. Then again, maybe I'm misinterpreting what watchdog is really supposed to track, but I tend to think of it as only having "WARN" and "NOTICE"-type loglevels, not "INFO" or "DEBUG" ones. We can re-add them if that's a legitimate use case for watchdog. Given that it's yet another insert to do, though, it does merit justification.

Definitely we need to resurrect some of this, repo deleting needs to remove all related stuff. So, again nested crud operations for repo this time are needed.

It's more nested deleting, yes, but that's actually one place that I left some some code in there - look at the beginning of delete(), it loads all contained entities and deletes them. That was prior to me starting work on the nested stuff, though, so it does need refactoring - I just mention it because, in that particular case, I pulled out all the old code because those foreach()s at the beginning should actually accomplish the same thing.

sdboyer’s picture

StatusFileSize
new51.56 KB

OK, added the framework for nesting and implemented a bunch of the necessary nested logic. Also updated to be consistent with #975864: Merge {versioncontrol_item_revisions} and {versioncontrol_source_items} tables. We're pretty close to complete, here, but this DEFINITELY needs a sanity check as I'm going cross-eyed on it. There's a really nasty potential situation with VersioncontrolItem::insert() due to the way that source items now must work, but other than that, I think we're in pretty good shape.

marvil07’s picture

Status: Needs work » Needs review
StatusFileSize
new28.15 KB
new69.61 KB

Third round review

+++ includes/VersioncontrolItem.php
@@ -51,12 +53,15 @@ abstract class VersioncontrolItem extends VersioncontrolEntity {
+  protected $source_item_revision_id;
+

We have some simple fields as data members directly, but for repository we have a special case that loads also the repository based on repo_id. It would be great to have "autoload related entities" instead of just adding foreign keys on the target entity. I remmeber I have implemented some of it for the logic you made at controllers load. Well, maybe that deserve another issue :-)

+++ includes/VersioncontrolItem.php
@@ -51,12 +53,15 @@ abstract class VersioncontrolItem extends VersioncontrolEntity {
+  protected $sourceItem;
 

IMHO VersioncontrolItem::sourceItem should have the same casing that all other data members. Any reason for use camelCase here?

+++ includes/VersioncontrolItem.php
@@ -682,127 +673,102 @@ abstract class VersioncontrolItem extends VersioncontrolEntity {
+  public function getSourceItem() {

at VersioncontrolItem::getSourceItem() we should handle the source_item_revision_id=0(is a new item) instead of return FALSE ?

+++ includes/VersioncontrolRepository.php
@@ -220,37 +221,36 @@ abstract class VersioncontrolRepository implements VersioncontrolEntityInterface
-  public final function update() {
+  public function update($options = array()) {
+    if (empty($this->repo_id)) {

another change I see is the no-more-controlled environment(so backend can see exactly where it should interact), I mean no more final methods as part of a-controlled-crud, not really sure about it, since the backend<crud_operation>() method makes sense only in that context(if we use public maybe the recommended way to implement it is just call the parent on the child method).

Powered by Dreditor.

about the patches I am attaching

I have made some changes that IMHO we must have in. I am providing two patches, one that have all the thing joined, and the other one on top of the patch at #13 as requested.

After this test are passing again :-)

What I have done:

    - Remove backend<crud_operation>() methods from VersioncontrolEntityInterface.
      "All methods declared in an interface must be public, this is the nature
      of an interface." - http://php.net/manual/en/language.oop5.interfaces.php
    - Non-abstract method VersioncontrolEntity::backendInsert() must contain body.
    - Make VersioncontrolAccount class use interface signature, but defering
      the real logic chnage until we have a real one-PK for account.
    - update versioncontrol_account_status module to the new crud and hooks.
    - actually return item revisions on VersioncontrolOperation::loadItemRevisions().
    - Use consistent signature for save() methos at VersioncontrolEntityInterface.
    - Pass  parameter to backend<crud_op>() and remove unnecessary over-writes from repo class.

what IMHO would be great to review in detail before commit

These things woud e great if can be reviewed, for consistency.

  • all crud methods must return the object for chained ops
  • add docs to non-crud methods
  • no-repeated data members between inheritance relations on our entities
  • unify hook names (IIRC I see a problem in account hook names), but it worths to review them all at once.
  • review the flow at each crud method, making sure we are always using the same pattern.
sdboyer’s picture

We have some simple fields as data members directly, but for repository we have a special case that loads also the repository based on repo_id. It would be great to have "autoload related entities" instead of just adding foreign keys on the target entity. I remmeber I have implemented some of it for the logic you made at controllers load. Well, maybe that deserve another issue :-)

Lemme tell ya, the thought of introducing an "autoload related entities"-type system was pounding through my skull when working on this. It's one of those rough areas that we don't handle consistently, but really _really_ need to. Since we've essentially now got a poor-man's ORM, we need to make sure that the objects we have loaded always & only get loaded a single time, and get properly updated throughout the course of execution. The fact that we don't have a unified system for managing the autoload of related entities is a huge problem, and something that does need to be addressed, probably first at the controller level and then within the entities themselves.

...but, while it's a bit messy, I think we'll be able to survive, for now, without doing it. Unless we run into a case where it's really just murdering us, I'm ok with deferring it until after we do the public unveil and we've got some more wiggle room.

IMHO VersioncontrolItem::sourceItem should have the same casing that all other data members. Any reason for use camelCase here?

The only, _only_ time I haven't used camelCase (when adding new stuff) is when it's a property that maps to something in the db. Since we don't have a layer that translates from db fields into proper camelCase for properties on our classes, I suffer through that. But - and this does need fixing - if it's not a db property, then it should follow coding standards and use camelCase.

at VersioncontrolItem::getSourceItem() we should handle the source_item_revision_id=0(is a new item) instead of return FALSE ?

Hoo man...the reason I don't deal with that case is because VersioncontrolItem::getSourceItem() is totally irrelevant for new items. That method is strictly about populating the property based on known values that are db-derived. It's not about doing any of the calculations or discovery that might go into figuring out what the item might be. So if I'm following your question correctly, then no, the only time that we can't get an item will be if the item we're currently on HAS no source item - that is, is the first revision of a given item in the repo.

But...maybe I'm overthinking how complicated that all is.

another change I see is the no-more-controlled environment(so backend can see exactly where it should interact), I mean no more final methods as part of a-controlled-crud, not really sure about it, since the backend() method makes sense only in that context(if we use public maybe the recommended way to implement it is just call the parent on the child method).

I think you probably remember me arguing against using final way back when y'all first introduced it :)

The architectural thought pattern I'm following here goes like this:

  1. We need to provide a public-facing method for external code to call. This method should encompass as much backend behavior as possible, without doing backflips. The real benefit of this is reducing code duplication.
  2. We should provide additional sub-methods, called from the main method at structured points in the logic flow, that are expected to be overridden for backends to run their decorating/additional logic.
  3. This second point is especially true in cases, like update/insert/delete, where the logic sequence is highly sensitive and not easily overridden.
  4. In the event that a backend cannot do what it needs to with the specialized sub-methods, that's fine - it can simply override the parent's main method and alter the logic as needed.

The crucial thing to remember is that the API and backends are colluding to present a simple interface to third-party code - they, together, are API producers, and third-party code is the API consumer. Thus, it should be our goal to make API<->backend interaction as flexible as possible, so that it can produce the simplest, most intelligible API for its consumer. Controlling the environment with things like final are the antithesis of that.

So that's to respond to your text bit...now I'll look at the patch.

sdboyer’s picture

OK, basically all of VersioncontrolAccount's direct updates to any tables other than {versioncontrol_account} are wrong, because it assumes it can just pwn the mapping itself. But we agreed to ignore VersioncontrolAccount for the moment, so I'm gonna mostly skip those.

all crud methods must return the object for chained ops

Probably worthwhile, but I don't want to block this patch on that - it's an easy follow-up later :) I did, however, check, and we're consistent for now - all insert() and update() (and therefore all save()) methods return $this. delete() methods do not, as it doesn't really make sense to.

add docs to non-crud methods

Yeah...later. I want to doc sprint through the entire API once we've got it all shored up. This isn't core, which means I don't think this is a reason to hold up this huge patch :)

But...wait. non-crud methods? like, which ones are you referring to, especially given that this is the CRUD patch?

no-repeated data members between inheritance relations on our entities

Don't quite follow that.

unify hook names (IIRC I see a problem in account hook names), but it worths to review them all at once.
review the flow at each crud method, making sure we are always using the same pattern.

Reviewed, checked, and done. I'm gonna let this cool for a bit while I attack some other things, and probably commit it in a couple hours.

Aggregate and incremental patches are attached.

sdboyer’s picture

Status: Needs review » Fixed

OK yeah, this sucker's goin in. http://drupal.org/cvs?commit=453330

Status: Fixed » Closed (fixed)
Issue tags: -git phase 2, -git sprint 2, -git sprint 3, -git sprint 4

Automatically closed -- issue fixed for 2 weeks with no activity.

  • Commit 8dd392f on repository-families, drush-vc-sync-unlock by sdboyer:
    Issue #879858 by sdboyer and marvil07: unify all VersioncontrolEntity...
  • Commit d6decf5 on 6.x-2.x, repository-families, drush-vc-sync-unlock by marvil07:
    Issue #879858 follow-up: Fix operation related documentation on main api...

  • Commit 8dd392f on repository-families by sdboyer:
    Issue #879858 by sdboyer and marvil07: unify all VersioncontrolEntity...
  • Commit d6decf5 on 6.x-2.x, repository-families by marvil07:
    Issue #879858 follow-up: Fix operation related documentation on main api...