The API combines retrieving data from the database with delegating to backend modules. It's extensively (some would say ridiculously) documented with Doxygen/PHPDoc-compatible function descriptions, to make it straightforward for higher-level modules to use the API and for backends to implement it.

An introductory presentation video about Version Control API
Doxygen documentation for Version Control API

Central concepts and object/array structures

In essence, the API revolves around the same concepts present in every common VCS. They are mostly represented as associative arrays (think of them as objects) which can be retrieved and passed around using the various API calls. The following list briefly introduces those entities:

  • Repositories: Contain fundamental information about the repository, like its name, root path/URL and the backend that powers it. A repository can be local or remote, and any number of repositories can exist at the same time for any given VCS backend. (With a DVCS like Git, the number of repositories is potentially huge.) Most admin preferences are stored per repository, or should be.
  • Accounts: Just a (probably incomplete) association of Drupal uids to VCS usernames. Not represented as an array but rather as a combination of Drupal uid, VCS username and repository. The current API (6.x-1.x or earlier) is limited in this respect: it does not allow multiple accounts per Drupal user and repository.
  • Items (a.k.a. item revisions): Files or directories inside a specific repository, including information about the path, type ("file" or "directory") and (file-level) revision, if applicable. Most item revisions, but probably not all of them, are recorded in the database.
  • Labels: The unified term for "branches and tags", including the label type ("branch" or "tag") and name (e.g. "trunk" or "DRUPAL-6--1-0"). Labels are also stored in the database, but are potentially unreliable as they can change without us knowing. Also note that as an API user, you should not assume any fixed label names such as "HEAD" or "master" for the main development branch, because such names are not VCS independent and can change even within a single repository.
  • Operations: In a nutshell, operations are what you see on http://drupal.org/cvs or http://example.org/commitlog: things that happened in a repository at a specific time. That includes commits as well as the creation and deletion of branches and tags (which are then called "branch operations" or "tag operations"). An operation includes information about the revision author, repository, date/time of the operation, revision number/id, and the log message. An operation is also associated with any number of items ("operation items") and labels ("operation labels") that it modifies or affects. When referenced from an operation, both labels and items feature a couple more properties, like the action that was performed on them in that operation.
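
To make the entities above more concrete, here is a minimal sketch of what such associative arrays might look like. All key names and values below are illustrative assumptions made for this example, not the documented array keys; the Doxygen documentation is the authority on the real structures.

```php
<?php
// Hypothetical array shapes. The key names are assumptions made for this
// sketch and are NOT taken from the Version Control API documentation.

// A repository: name, root path/URL and the backend that powers it.
$repository = array(
  'name' => 'Example repository',
  'root' => '/var/vcs/example',
  'vcs'  => 'cvs',
);

// An item (item revision): path, type and file-level revision.
$item = array(
  'path'     => '/modules/example/example.module',
  'type'     => 'file',
  'revision' => '1.5',
);

// A label: the unified term for branches and tags.
$label = array(
  'type' => 'branch',
  'name' => 'DRUPAL-6--1',
);

// An operation: author, repository, date, revision and log message, plus the
// items and labels it affects. When referenced from an operation, items and
// labels carry extra properties, such as the action performed on them.
$operation = array(
  'type'       => 'commit',
  'repository' => $repository,
  'author'     => 'jdoe',
  'date'       => 1230768000, // Unix timestamp.
  'revision'   => '1.5',
  'message'    => 'Fix a typo.',
  'items'      => array($item + array('action' => 'modified')),
  'labels'     => array($label + array('action' => 'modified')),
);
```

Accounts are deliberately left out of the sketch, since they are described above as a combination of uid, VCS username and repository rather than as an array of their own.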

Version Control API is based on the idea that the current state of a repository is essentially unknown, but all log information up to a certain point in time is available in complete form in the database so that commit logs can be shown (and commit statistics calculated) without invoking the VCS binary. For browsing the repository, direct interfacing with the VCS itself is required. Also, the association of items to branches and tags cannot possibly be recorded in a correct & maintainable way, so determining that is also left to on-the-fly invocations.
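
As an illustration of why recorded logs are enough for statistics, the following sketch counts commits per author from operation arrays that are assumed to have been loaded from the database already. The function name example_commit_counts() and the array keys are hypothetical; no VCS binary and no real API call is involved.

```php
<?php
// Hypothetical operation rows, standing in for log data already recorded in
// the database. The keys are assumptions made for this sketch.
$operations = array(
  array('type' => 'commit', 'author' => 'alice', 'date' => 1230768000),
  array('type' => 'commit', 'author' => 'bob',   'date' => 1230771600),
  array('type' => 'tag',    'author' => 'alice', 'date' => 1230775200),
  array('type' => 'commit', 'author' => 'alice', 'date' => 1230778800),
);

/**
 * Counts commits per author, ignoring branch and tag operations.
 *
 * Returns an array keyed by author name, sorted by commit count (descending).
 */
function example_commit_counts(array $operations) {
  $counts = array();
  foreach ($operations as $operation) {
    if ($operation['type'] !== 'commit') {
      continue;
    }
    $author = $operation['author'];
    $counts[$author] = isset($counts[$author]) ? $counts[$author] + 1 : 1;
  }
  // arsort() sorts by value in descending order and keeps the author keys.
  arsort($counts);
  return $counts;
}

$counts = example_commit_counts($operations);
```

Anything that depends on the current state of the repository, such as browsing a directory or finding out which branch an item is on, cannot be answered from rows like these and would need the VCS itself.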

Version Control API takes care of managing the above entities, and provides hooks for modules to act when e.g. a commit has been recorded (that would be hook_versioncontrol_operation()). Those hooks are documented in the hook_versioncontrol.php file. Repositories and accounts can be extended by backends and other modules by implementing hook_versioncontrol_repository() and/or hook_versioncontrol_account() respectively - for example, the CVS backend adds passwords to accounts and a list of CVS "modules" to each repository.
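
A module reacting to recorded commits might look roughly like the sketch below. The module name "example", the parameter list ($op, $operation, $operation_items), the 'insert' value and the array keys are all assumptions made for illustration; hook_versioncontrol.php documents the actual signature. So that it runs outside Drupal, the sketch collects messages in a global array instead of calling Drupal functions.

```php
<?php
// Collected messages. In a real Drupal module this would more likely be a
// call to watchdog() or drupal_set_message() instead of a global array.
$example_log = array();

/**
 * Sketch of an implementation of hook_versioncontrol_operation().
 *
 * ASSUMPTION: the parameters and the 'insert' value are guesses for this
 * example, not the documented hook signature.
 */
function example_versioncontrol_operation($op, $operation, $operation_items) {
  global $example_log;
  // Only react when an operation has been recorded, not when it is removed.
  if ($op !== 'insert') {
    return;
  }
  $example_log[] = sprintf(
    '%s changed %d item(s): %s',
    $operation['author'],
    count($operation_items),
    $operation['message']
  );
}

// Simulate the API invoking the hook for a freshly recorded commit.
$operation = array('author' => 'jdoe', 'message' => 'Fix a typo.');
$operation_items = array(
  array('path' => '/example.module', 'type' => 'file', 'action' => 'modified'),
  array('path' => '/example.info', 'type' => 'file', 'action' => 'modified'),
);
example_versioncontrol_operation('insert', $operation, $operation_items);
```

Implementations of hook_versioncontrol_repository() and hook_versioncontrol_account() would follow the same naming pattern, but their parameters are not described here, so they are left out of the sketch.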