Building a Drupal site with Git

Last updated July 22, 2011.

Introduction

This document outlines a basic process for using Git in a typical site building, testing and deployment workflow. While there are many possible approaches to fitting Git into this workflow, this particular set of procedures should work in most circumstances and contains many best practices for using Git in this manner. When applied properly and with some forethought, Git is a very powerful tool for helping to manage collaboration, configuration and code changes during the life cycle of a Drupal-based project. Further documentation will be written to show how best to integrate other tools such as Drush into this process.

This documentation assumes that the project will be following a basic 4-tier development environment model: developers work on most code locally, then push that code up through Development, Staging and Production environments. It can easily be adapted to fewer tiers, as necessary. We will be illustrating the process by building out a Drupal site called ‘FooProject’, and will use fooproject as a placeholder anywhere a project or site name would be used.

What can you manage with code?

One challenge of using any kind of change management with Drupal is that many configuration changes normally reside only in the project’s database. You can move some of these configuration settings into code using certain Drupal modules, such as Ctools exportables or the Features module. You may want to take advantage of these tools to export as much of your site’s configuration as possible into your repository for deployment and collaboration purposes.

Fortunately, since a Drupal site's module and theme files already live in code, you can manage those in Git without any extra tools.

Creating the Central Repository

When working on a project with multiple environments, a good first step after server provisioning is to create a central repository from which all other environments will pull. This could live on one of the servers provisioned for the project, a Gitosis server, a Gitolite server, GitHub or any other repository hosting solution. For purposes of illustration, assume that your FooProject is running on a server configured with a user named ‘fooproject’ and that the Development, Staging and Production environments will all run from separate directories under that user’s public_html directory. For convenience, create the central repository inside fooproject’s home directory.

$ cd ~
$ git init --bare fooproject.git

This command creates a new directory called fooproject.git that contains all of the git objects. This directory is not the working tree, where you edit and commit code. Rather, it is simply the central location for the git objects and history, and is essentially empty at this point.
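
One way to confirm that the directory really is a bare repository is to ask Git directly. The following is a minimal sketch that runs in a throwaway directory; the mktemp location is an assumption made for illustration and is not part of the setup described above.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q --bare fooproject.git      # same command as above, run in the scratch directory

# A bare repository answers "true" here and has no working tree.
is_bare=$(git --git-dir=fooproject.git rev-parse --is-bare-repository)
echo "bare: $is_bare"

# The Git internals (HEAD, objects/, refs/) sit directly inside fooproject.git.
ls fooproject.git
```

In a non-bare repository the same files would live in a hidden .git directory next to your working files.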

Locally Cloning Drupal

This example begins the development process on the developer’s local development environment, but you could follow these steps on the server as well. Let's clone Drupal to create a local development environment.

$ git clone http://git.drupal.org/project/drupal.git fooproject
$ cd fooproject
$ git checkout 7.0

The first command clones the Drupal core Git repository from Drupal.org and saves it in a directory named fooproject. The fooproject directory will become your working tree. The final command, git checkout 7.0, ensures your code is on the Drupal 7.0 release. To choose another release before install, you can run the following command to view a list of all releases:

$ git tag

Then, to switch to the version you want, you would type the following command, where <tagname> is the name of the release you want to use:

$ git checkout <tagname>
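
Drupal core carries a long list of tags, so it can help to narrow the listing with a pattern. The sketch below uses a small stand-in repository rather than the real Drupal history; the file name, committer identity and commit messages are made up for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q demo && cd demo
git config user.name "Foo Dev"         # hypothetical identity for the demo commits
git config user.email "dev@example.com"

# Stand-in history: three commits tagged like Drupal releases.
for release in 6.22 7.0 7.1; do
  echo "$release" > VERSION.txt
  git add VERSION.txt
  git commit -q -m "Release $release"
  git tag "$release"
done

# List only the tags matching a pattern instead of every release.
seven_tags=$(git tag -l '7.*')
echo "$seven_tags"

# Check out one of them, then ask Git which tag the working tree is on.
git checkout -q 7.0
current=$(git describe --tags --exact-match)
echo "now on: $current"
```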

Updating Remotes

You’ll need to update your remotes to reflect that you won’t be pushing to drupal.org with your project’s code. You should rename the original origin remote (the drupal.org Drupal project repository) to ‘drupal’ and create a new origin pointed at the bare repository you’ve created on your server.

$ git remote rename origin drupal
$ git remote add origin path/to/your/central/git/repo

(example: ssh://fooproject@fooproject.com/home/fooproject/fooproject.git)

To see a list of your remote repositories, run the command:

$ git remote

For a more detailed listing that includes the remote repositories' URLs, add a -v flag (for verbose) to the end of the command:

$ git remote -v
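
To see the effect of the rename without touching a real project, the steps can be rehearsed against local stand-in repositories. This is only a sketch: the upstream directory plays the role of the drupal.org repository, and the paths and committer identity are assumptions for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"

# Stand-in for the drupal.org repository, with a single commit.
git init -q upstream && cd upstream
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Stand-in for Drupal core history"
cd "$work"

git init -q --bare fooproject.git      # stand-in for your central repository
git clone -q "$work/upstream" fooproject
cd fooproject

# The two commands from the text above.
git remote rename origin drupal
git remote add origin "$work/fooproject.git"

git remote -v
drupal_url=$(git config --get remote.drupal.url)
origin_url=$(git config --get remote.origin.url)
```

After the rename, drupal points at the repository you cloned from and origin points at your own central repository.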

Creating a Working Branch

Now, you need a branch where you can track not only Drupal core, but also all of the contributed and custom modules and themes for your site. Create a branch using the command:

$ git checkout -b fooproject

This command creates a new branch named fooproject and checks it out. It is equivalent to running the commands:

$ git branch fooproject
$ git checkout fooproject

You can use this fooproject branch as a working branch to add contributed and custom modules and themes to your site. Consider it the equivalent of the default Git ‘master’ branch for your project.

At this point, you should complete the Drupal installation process to get a working local installation.

Setting up the .gitignore file

There are a few things that you probably do not want to have tracked in the repository, namely sites/default/files and sites/default/settings.php. You would exclude the files directory if you prefer only to track application files in Git and to omit site content from the repository. You probably want to exclude settings.php because it contains sensitive database access information and will be different on each environment.

One way to tell Git to exclude certain files and directories from the repository is to set up a .gitignore file. When you clone Drupal 7 from the Git repository, it comes with a .gitignore file. Drupal 6 does not come with a .gitignore file at the time of this writing.

The settings in the Drupal 7 default .gitignore file are as follows:

# Ignore configuration files that may contain sensitive information.
sites/*/settings*.php

# Ignore paths that contain user-generated content.
sites/*/files
sites/*/private

Customizing .gitignore

You may want to keep the above settings for your own site. However, if you decide to use different version control policies for your site, for example by deliberately excluding certain modules, themes, or libraries from your repository, you need different .gitignore settings. Here are some options for getting around the default .gitignore settings:

  1. If you don't want sites/all to be controlled at all (you want to ignore all modules and themes and libraries), add a file at sites/all/.gitignore whose contents are a single line containing nothing but *.
  2. Simply change the .gitignore and commit the change. You won't be pushing it up to 7.x, right?
  3. If you track core code using downloads (and not git) you can simply change the .gitignore and check it into your own VCS.
  4. Add extra things into .git/info/exclude. This basically works like .gitignore (it has good examples in it) and is not under source control at all.
  5. Add an additional master gitignore capability with git config core.excludesfile .gitignore.custom and then put additional exclusions in the .gitignore.custom file.

Note that only 1 and 2 are completely source-controlled. In other words, options 3, 4, and 5 require a small amount of extra configuration on a deployment site to work correctly, but they work perfectly for a random dev site.

For more information about the above options, see Randy Fay's blog post:
http://randyfay.com/node/102
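
Whichever option you pick, git check-ignore can tell you which rule, if any, is hiding a given path. The following sketch recreates the default Drupal 7 rules in a scratch repository; the file names under sites/ are placeholders for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q site && cd site

# The default Drupal 7 rules quoted earlier.
printf '%s\n' 'sites/*/settings*.php' 'sites/*/files' 'sites/*/private' > .gitignore

mkdir -p sites/default/files sites/all/modules
touch sites/default/settings.php sites/default/files/logo.png sites/all/modules/README.txt

# -v prints the ignore file, line number and pattern responsible for each path.
git check-ignore -v sites/default/settings.php sites/default/files/logo.png

# With -q, the exit status says whether a path is ignored (0) or not (1).
if git check-ignore -q sites/all/modules/README.txt; then
  modules_ignored=yes
else
  modules_ignored=no
fi
echo "modules ignored: $modules_ignored"
```

The same command also reports rules that come from .git/info/exclude or from a file named by core.excludesfile.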

If you add a new ignore file or edit one that is not being tracked by Git, remember to add it to your Git repository using the git add command, and then commit those changes using git commit. For example:

$ git add .gitignore.custom
$ git commit -m "Initial FooProject commit"

Creating a global .gitignore file

You can also create global ignore settings across all of your Git projects. First, you create a global .gitignore file in your home directory (~/.gitignore). Then, you add it to your global configuration using the following command:

$ git config --global core.excludesfile ~/.gitignore

Pushing Code to the Central Repository and Completing Initial Deployment

Now, you can push your code up to the origin remote on your server:

$ git push origin fooproject

This command copies your local branch fooproject to a branch of the same name in your remote repository origin.
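
A common variation, which is not part of the original instructions, is to add the -u flag on the first push so that Git records origin/fooproject as the upstream of your local branch. A plain git pull or git push on that branch then knows which remote branch to use. The sketch below rehearses this against a local bare repository; the paths and committer identity are assumptions for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q --bare fooproject.git      # stand-in for the central repository

git init -q local && cd local
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "FooProject" > README.txt
git add README.txt
git commit -q -m "Initial FooProject commit"
git checkout -q -b fooproject
git remote add origin "$work/fooproject.git"

# -u (--set-upstream) records origin/fooproject as the branch's upstream.
git push -q -u origin fooproject

upstream=$(git rev-parse --abbrev-ref 'fooproject@{upstream}')
echo "fooproject tracks: $upstream"
```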

You can now provision your other tiers with this code from the repository. Log into your server and provision a development environment from the code you’ve committed:

$ git clone --branch fooproject ssh://fooproject@fooproject.com/home/fooproject/fooproject.git fooproject_dev

Now, you have a fooproject_dev directory that you can use as the root of a new virtual host. You can proceed through the normal Drupal installation process using this development copy of your site and a separate database for it. Repeat this process for the Staging and Production environments - we’ll assume that they live on the same server in directories fooproject_stg and fooproject_prod.
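
Since the three environments are provisioned the same way, the clone can be repeated in a loop. This is a sketch against a local stand-in repository; on a real server the clone URL would be your central repository's address, and the public_html layout shown here is an assumption based on the description above.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"

# Stand-in central repository with one commit on the fooproject branch.
git init -q seed && cd seed
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "FooProject" > README.txt
git add README.txt
git commit -q -m "Initial FooProject commit"
git checkout -q -b fooproject
cd "$work"
git clone -q --bare seed fooproject.git

# One clone per environment, each checked out on the fooproject branch.
mkdir public_html
for env in dev stg prod; do
  git clone -q --branch fooproject "$work/fooproject.git" "public_html/fooproject_$env"
done
ls public_html
```

Each directory is an independent clone with its own working tree, so each environment can be moved to a different commit or tag without affecting the others.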

Adding Contributed Modules and Themes

The site development process rarely ends with core Drupal - you’ll likely be adding contributed modules and themes throughout the development process. There are a number of possible approaches to this process that additional documentation on Submodules, Drush and Dog will describe. For the purposes of this documentation, we are not concerned with keeping Git history for contributed modules. Simply download and install these modules and themes to your site and add them to your main development branch. Let’s use Views as an example:

$ cd sites/all/modules
$ wget http://ftp.drupal.org/files/projects/views-7.x-3.0-beta3.tar.gz
$ tar -xzf views-7.x-3.0-beta3.tar.gz
$ rm views-7.x-3.0-beta3.tar.gz

If you check the status of your repository at this point, Git will point out to you that you have some new untracked files living in your working tree:

$ git status

# On branch fooproject
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
# views/
nothing added to commit but untracked files present (use "git add" to track)

So, we’ll now add them to the Git index:

$ git add views
$ git commit -m "Added Views 7.x-3.0-beta3"

You can now push this code back up to your central repository and pull it down to your Dev server:

(Locally)

$ git push origin fooproject

(On Dev server)

$ git pull origin fooproject

At this point, you should see the Views code in both your local environment as well as on the Development server’s codebase, and you should be able to enable this module in both environments. You can follow this basic procedure for adding any contributed module, contributed theme, or custom code to your site.

Topic / Issue Branches

As your project progresses, you may find yourself in a situation where you need to work on an issue or feature outside of the main code-base, where your code changes won’t impact your fellow developers or your client. Alternately, it’s not uncommon to start fixing an issue with your code, only to find that the solution is more involved than you have resources to commit to it at the moment. In these situations, it’s helpful to be comfortable with Git’s branching system, which gives you a clean sandbox to work on your changes without losing the ability to pull code from the central repository.

Assuming you’re currently on the main fooproject branch before you begin working on an issue submitted by your client, the workflow is simple:

$ git checkout -b issue_606_theme

This command creates a new branch for the issue based on the branch you were on before issuing the checkout command. At this point, you can work on code and commit changes to this local branch without any worry of losing your work in progress or being unable to switch back to the main code-base to work on other issues.

Let’s assume that this particular issue involved changing something in your theme’s page.tpl.php file. Once you’ve edited the file, you can follow the basic change/add/commit workflow of Git:

$ git add sites/all/themes/footheme/page.tpl.php
$ git commit -m "Issue 606: Removed offending div from page.tpl.php"

You can make as many changes and commits to your branch as required to fix your issue or implement your feature. At this point, all of these commits are living in a branch that exists only on your local copy of the repository. If you have need to switch back to the main branch, you can add and commit your files in your topic branch and issue the following command to get back:

$ git checkout fooproject

Once you’ve brought your topic branches back into your mainline fooproject branch, it’s a good idea to clean up after yourself. When you’ve decided you no longer need that topic branch, deleting it is a simple command:

$ git branch -d issue_606_theme

If you’ve created a topic branch but never merged it back into its parent, Git will keep you from deleting it accidentally. If you’re sure that you no longer wish to have this unmerged branch, simply replace -d with -D to force deletion of the unmerged branch.
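
The difference between -d and -D can be seen in a scratch repository. The sketch below creates a topic branch with one unmerged commit and then tries both forms of deletion; the file contents and committer identity are made up for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q site && cd site
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "base" > page.tpl.php
git add page.tpl.php
git commit -q -m "Initial commit"
git checkout -q -b fooproject

# Topic branch with one commit that is never merged.
git checkout -q -b issue_606_theme
echo "fix" >> page.tpl.php
git commit -q -am "Issue 606: Removed offending div from page.tpl.php"
git checkout -q fooproject

# -d refuses because issue_606_theme has a commit that fooproject lacks.
if git branch -d issue_606_theme 2>/dev/null; then
  safe_delete=allowed
else
  safe_delete=refused
fi
echo "git branch -d: $safe_delete"

# -D deletes it anyway; the unmerged commit is no longer on any branch.
git branch -q -D issue_606_theme
remaining=$(git branch --list issue_606_theme)
```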

Bringing Branches Back into the Main Codebase

Once you have finished working on your fix or feature, you’ll need to bring those changes from your topic branch back into the main fooproject branch. There are a number of different strategies and opinions on managing the merging of code using either Git’s rebase or merge commands. Here are some basic considerations:

  1. If your branch’s commit history is considered public (i.e. multiple developers working on the same topic branch in their own repositories) - ALWAYS choose Merge.
  2. If you’re concerned with keeping a clean, linear history - Rebase.

Please note that these ground rules are nearly mutually exclusive.

For the purposes of this documentation, let's assume that you are going to use the basic merge-based strategy. For suggestions about using Git's rebase command, read Randy Fay's post at http://randyfay.com/node/91.

Merging

If you are the primary developer on a project, and you are not concerned with maintaining a completely linear history from your topic branches, using Git’s merge command provides a straightforward way for you to bring your topic branches back into your mainline codebase.

$ git checkout fooproject # Puts you back into your main branch
$ git pull origin fooproject # Fetches and merges any new changes from the central repository into your local fooproject
$ git merge issue_606_theme # Merges your topic branch back into fooproject
$ git push origin fooproject # Sends your newly updated fooproject code back to the origin repository
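
The merge step itself can be rehearsed locally to see what history it produces. In this sketch the main branch gains a commit while the topic branch is in progress, so Git records a merge commit rather than fast-forwarding. The file names and committer identity are assumptions; --no-edit is used only so the demonstration does not stop to open an editor for the merge message.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q site && cd site
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "base" > page.tpl.php
git add page.tpl.php
git commit -q -m "Initial commit"
git checkout -q -b fooproject

git checkout -q -b issue_606_theme
echo "fix" >> page.tpl.php
git commit -q -am "Issue 606: Removed offending div from page.tpl.php"

# Meanwhile fooproject moves on, so the merge cannot be a fast-forward.
git checkout -q fooproject
echo "notes" > CHANGES.txt
git add CHANGES.txt
git commit -q -m "Added change notes"

git merge -q --no-edit issue_606_theme

# The topic branch is now listed as merged, so "git branch -d" would accept it.
merged=$(git branch --merged)
echo "$merged"
git log --oneline --graph
```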

Staging and Production - Tag Based Deployment

You could simply allow your Staging and Production environments to run from the code you’ve been adding and merging into your main fooproject branch - but it’s worth considering the fact that the fooproject branch is going to be moving forward as development progresses even after your site’s launch, and production sites should really be running a single, well-tested snapshot of the code-base during ongoing development. For this reason, it’s a good idea to use tags instead of branches for managing your non-development code. Tags are simply references to the state of the code-base at a specific commit - a snapshot of the project at one specific moment in time.

When your code has been tested and is ready to be deployed into the production environment, you could follow this process locally:

$ git tag prod_20110419  ## Creating a tag from the current commit.  You can specify a commit here if you wish.

Now, you can push this tag up to your repository:

$ git push origin prod_20110419

Now, in your server’s fooproject_prod directory:

$ git fetch --tags origin
$ git checkout prod_20110419
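
Once several tags exist, it is useful to be able to ask an environment which one it is running. The sketch below uses an annotated tag (git tag -a), which is a variation on the lightweight tag shown above and records who created the tag and why. The tag message and committer identity are assumptions for the demonstration.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q site && cd site
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "FooProject" > README.txt
git add README.txt
git commit -q -m "Initial FooProject commit"
git checkout -q -b fooproject

# -a creates an annotated tag object with its own author, date and message.
git tag -a prod_20110419 -m "Production release, 19 April 2011"

# Simulate the production checkout, then ask Git which tag is deployed.
git checkout -q prod_20110419
deployed=$(git describe --exact-match HEAD)
echo "deployed: $deployed"
```

Without --tags, git describe only considers annotated tags, which is one reason some teams prefer them for releases.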

Handling Hotfixes

An inverse procedure can be used to handle changes that need to be made to code in the production environment. Because your production and staging environments are running from tags, they are in a ‘detached HEAD’ state: HEAD points directly at a commit rather than at a branch, so there is no branch for new commits to land on. You probably saw Git warning you of this situation when you checked the tag out. Still, Git makes it easy to manage hotfixes in this way. For example, suppose you’ve been told about a bug that was just found in production, and it needs to be cleaned up right away. Open your local development copy and perform the following:

$ git checkout prod_20110419 # switching to the tag that production is currently running
$ git checkout -b prod_hotfix_issue_707 # you’re now starting a new branch that begins at your current prod tag
# Code to fix the problem
$ git add <changed files to be committed>
$ git commit -m "Production hotfix for issue 707: Fixing production bug"

Now, you can create a new tag from this commit for production to run on:

$ git tag prod_20110515_hotfix_707 # create the new tag
$ git push origin prod_20110515_hotfix_707 # and push it to the repository

Now, logging in to the production repository:

$ git fetch --tags # update the production repo with the tag you pushed from your local copy
$ git checkout prod_20110515_hotfix_707 # puts you back into a detached HEAD state against your new tag

Now, back in your local copy, you can merge that code into your fooproject branch:

$ git checkout fooproject
$ git merge prod_20110515_hotfix_707 # brings the hotfix code back into the mainline code-base
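
The whole hotfix cycle can be rehearsed in a scratch repository to check that the new tag carries the fix but none of the unreleased work. This is a sketch under stated assumptions: the file names, contents and committer identity are invented, and the two checkout commands above are combined into one by passing the starting tag to git checkout -b.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"
git init -q site && cd site
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "v1" > page.tpl.php
git add page.tpl.php
git commit -q -m "Initial commit"
git checkout -q -b fooproject
git tag prod_20110419

# Development continues on fooproject after the release.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "New feature in progress"

# Hotfix: branch from the production tag, fix, and tag the result.
git checkout -q -b prod_hotfix_issue_707 prod_20110419
echo "v1-fixed" > page.tpl.php
git commit -q -am "Production hotfix for issue 707: Fixing production bug"
git tag prod_20110515_hotfix_707

# Bring the fix back into the mainline (--no-edit skips the message editor).
git checkout -q fooproject
git merge -q --no-edit prod_20110515_hotfix_707

# The hotfix tag contains the fixed file but not the unreleased feature.
tagged_files=$(git ls-tree --name-only prod_20110515_hotfix_707)
echo "$tagged_files"
```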

Updating Drupal Core

In this workflow, managing updates to Drupal core is a fairly trivial process:

$ git checkout fooproject # making sure we’re on our main fooproject branch
$ git fetch drupal # update our repository with changes from the main Drupal upstream repo
$ git merge 7.1 # merge in the updates to Drupal

Then, run update.php and your code and database should be properly updated. If you wish to test the upgrade first, you could always create a new topic branch for the update testing process and merge the Drupal release tag to that branch before bringing it into your mainline codebase. From there, simply create new stg_ and prod_ tags, push them to the repository and pull them into your other environments as above, making sure to run update.php after checking them out!
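
The test-first variant mentioned above might look like the following sketch. A local repository stands in for drupal.org, and the topic branch name core_update_7_1 is hypothetical. Listing the commits between the two release tags before merging shows what the update will bring in.

```shell
set -e
work=$(mktemp -d)                      # scratch location, for illustration only
cd "$work"

# Stand-in for the drupal.org repository, starting with a 7.0 release.
git init -q upstream && cd upstream
git config user.name "Foo Dev"         # hypothetical identity
git config user.email "dev@example.com"
echo "7.0" > CHANGELOG.txt
git add CHANGELOG.txt
git commit -q -m "Drupal 7.0"
git tag 7.0

# The site repository starts from 7.0; -o names the remote "drupal".
cd "$work"
git clone -q -o drupal "$work/upstream" site
cd site
git checkout -q -b fooproject 7.0

# Upstream publishes 7.1 afterwards.
cd "$work/upstream"
echo "7.1" > CHANGELOG.txt
git commit -q -am "Drupal 7.1"
git tag 7.1

# Fetch, review, then merge into a topic branch instead of fooproject.
cd "$work/site"
git fetch -q drupal
git log --oneline 7.0..7.1             # commits the update would bring in
git checkout -q -b core_update_7_1     # hypothetical topic branch name
git merge -q --no-edit 7.1
```

If testing on the topic branch goes well, merging core_update_7_1 into fooproject brings the update into the mainline; until then fooproject stays on 7.0.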

Notice that in this particular workflow, modules are kept as part of your own code-base and not pulled with Git from drupal.org. Update these modules as per the normal Drupal documentation or with the drush up command, run update.php, then add and commit the updated modules to your repository.

[TODO: Managing the Database
Using Submodules
Moving a Custom Module or Theme into its own repository]

Comments

remotes renaming

I suggest the origin gets renamed to drupalorg rather than drupal for the sake of clarity.

If you think this is a good idea, can you submit an issue to the infrastructure issue queue?

github.com

I would really love a github specific howto. Really stumbling through how to do this with myself, github and a remote collaborator. Especially the part about provisioning the server.

Thanks,

Landon

I'm trying to work up specific documentation for github as well - just taking me some time.

I get lost around here as well.

First off, great job, thanks. It's exactly the info I've been looking for. As a beginner in version control I was feeling lost.

On your post, I'm getting stuck around the same spot. I'm not clear on which "box" things reside and are created on, i.e. the branches like the fooproject_dev are created on the Repo and cloned to a hosting server, or created on the hosting server itself?

In my case what I'm trying to set up would look like:

DRUPAL.ORG
LOCAL (LAMP)
REPO (Github)
HOST (Somewhere)

I think a basic diagram with boxes and a few arrows would really help me.

Last thing, any tips if the HOST does not have Git, and/or does not allow shell access?

Thanks again
Alan

Development, Staging and Production Remotes

Is it advisable to rename the remotes in your development, staging and production environments to point to your repository's new origin in the same way that you do for your local environment:


You’ll need to update your remotes to reflect that you won’t be pushing to drupal.org with your project’s code. You will want to rename the original origin remote (the drupal.org Drupal project repository) to ‘drupal’ and create a new origin pointed at your bare repository you’ve created on your server.

$ git remote rename origin drupal
$ git remote add origin path/to/your/central/git/repo

(example: ssh://fooproject@fooproject.com/home/fooproject/fooproject.git)

The way this walkthrough works, your initial (local) repository should be the only repo that you need to rename origin - it's the only one that has pulled from d.o. Your other environments will have your own central repository as origin, because that's where you cloned them from.

Thanks

I probably should have poked around the different environments a little more; I would've found the same answer you just provided.

For instance, in my development environment, I ran git remote -v and it lists where the remotes point to.

Thanks again.

Updating Remotes

command line

$ git remote add origin path/to/your/central/git/repo

was a bit confusing to me.
We have a fixed project name, "fooproject", and are currently in a path under /home/...
Why not change this line to the following?

$ git remote add origin home/[user]/fooproject.git

less code - less bugs
johanneshahn - @eccence.de

Multisite and Git

In a multisite setup using Git, what are the advantages and disadvantages of:

1. Setting up nested repositories for each site? For example, having a Git repository for Drupal core but then also having a repository for each sites/site1, sites/site2....etc.

2. Setting up one repository with separate branches for each site?

3. Setting up one repository using submodules for each site?

4. Setting up one single repository?

Lynn Taylor

Multisite and Git

I too am interested in the best approach for managing multiple sites from one core.

problem with external repository

Hi, I followed the instructions to set up version control and tried to use gitenterprise.com as the bare repo.
When I get to the final push:

git push origin myproject #where origin is an ssh://path/to/repo

i got this result:

Counting objects: 108330, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (23412/23412), done.
Writing objects: 100% (108330/108330), 30.19 MiB | 53 KiB/s, done.
Total 108330 (delta 80222), reused 106501 (delta 78600)

followed by a list of:

remote: You are not <supposed drupal user name> <supposed drupal user email>.

and:

remote: error: hook declined to update refs/heads/myproject
protocol error: error refs/heads/myproject *** PERMISSION DENIED *** hook declined
To ssh://<username>@gitent-scm.com/git/<mydomain>/<myrepo>
! [remote failure]  myproject -> myproject(remote failed to report status)
error: failed to push some refs to 'ssh://<username>@gitent-scm.com/git/<mydomain>/<myrepo>'

---

If i try the same using bettercodes service as repo provider. I get this:

Fetching remote heads...
  refs/
  refs/heads/
  refs/tags/
updating 'refs/heads/myproject<'
  from 0000000000000000000000000000000000000000
  to   <hash>

then it takes a long time (minutes) until it outputs
the number of objects to be transferred (usually more than 100,000),
then it stalls for hours before outputting an HTTP error.

Any suggestion?

Check if the credentials of the commits you are trying to push are the same as on gitenterprise. If there are commits done by somebody else, you can't push them under your name.

I had a similar problem today, not with Drupal, just with some of my projects hosted on Git Enterprise after I switched to another computer. I viewed git log | head and found out that I had committed a change before setting my user.name and user.email to those from gitenterprise, so the username and email were taken from the OS user account. Resetting the last commit did the trick for me: git reset HEAD~1. After that I set user.name and user.email, committed the last change again, and then managed to push it to the remote repo.

UPD: But looks like your problem is easier. <supposed drupal user name> and <supposed drupal user email> should be those from Git Enterprise. See "Help" tab in your gitent-repo web frontend.

supposed drupal user name

supposed drupal user name refers to actual drupal users, working on and patching drupal itself.

i did this:
cloned drupal from drupal org
changed remote: drupal git repo is now drupalorg, and my project on gitenterprise became origin.

If I type git history I can see all the commits of the drupal users working on the project, and I suppose git enterprise is complaining those history items.

I hope i was clear... not so experienced yet.

Documentation fix

Anchor "updatecore" is not correct.
<h2 code="updatecore"> should be <h2 id="updatecore"> :)

error: src refspec ..... does not match any

I get the following error when trying to git push origin:

error: src refspec ....... does not match any
error: failed to push some refs to 'ssh://.......'

Please help,

Thanks

Question about Creating the Central Repository

I have been running a drupal site for several years and when drupal.org switched to git, I thought this would be a good time to put my site under version control. I am very new to git, so I apologize if my questions do not make a lot of sense. I followed the above instructions to clone the repository from drupal.org:

$ cd ~/Sites
$ git clone  http://git.drupal.org/project/drupal.git fooproject
$ cd fooproject
$ git checkout 7.0

I got the usual "detached head" warning:
Note: checking out '7.0'.
You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 4979149... Drupal 7.0

So right here I'm stuck. After reading about "detached HEAD", it is clear that I need my head attached. Also, my site is running Drupal 6, and I do not want to upgrade to 7.0 yet, just apply security and other updates to my 6.x site.

I decided I needed a local branch, so I followed git's suggestion above, and created a branch and checked out the new branch.

git checkout -b 7.0
Now I have my own branch of drupal 7.0.

When I type git branch I see:

* 7.0
8.x

where I am on branch 7.0 in my repository, and I have an 8.x branch from where HEAD pointed when I cloned the drupal.org repository.
This is where I get stuck again. If I try to update my 7.0 branch from drupal.org using git fetch, won't I get the 8.x updates, because that is where HEAD points in the drupal.org repository? Is fetch always going to fetch wherever HEAD is in the upstream remote?

Also, I do not quite understand the instructions above on Updating Drupal Core:

$ git checkout fooproject # making sure we’re on our main fooproject branch
$ git fetch drupal # update our repository with changes from the main Drupal upstream repo
$ git merge 7.1 # merge in the updates to Drupal

Since a specific tag (7.0) was checked out/branched, there will never be any updates to that tag. Why would fetch ever be used? Shouldn't the commands be the following:

git checkout fooproject # making sure we’re on our main fooproject branch
git branch 7.1 drupal # get a new local branch of 7.1 from drupal.org assuming the upstream remote was renamed to drupal.
git merge 7.1 # merge in the updates in the 7.1 release into my fooproject code

Thank you for your patience.

Some brief and quick answers :)

You don't have to be "attached", being "detached" simply means that you're not going to get updates etc. that would follow from being on a branch, like a given tag can belong to a variety of branches. Whenever you need to be on a branch, simply checkout the branch you wish to go to.

If you wish to stay on 6.x, you will need to check out one of the 6.x tags (the 6.22 tag should bring you to the latest release) or the 6.x branch.

Even if you're "detached" due to being on a non-changing tag, fetch will still pull in the data from the origin repository (or whichever repos you specify), which also pulls in e.g. new tags, allowing you to do git checkout/merge new-tag. Note here, that fetch doesn't change any of the files you're currently working on, it just fetches data from the external repository you specify.

Thanks! And a few more questions.

Wow. I have a lot more to learn. So, if I checkout the 6.x branch, and then fetch, I will get the data for the latest update for the dev release, right? If I want to actually change my files, then I have to do a merge after the fetch. Is this correct?

Great tutorial! Thanks!

The guide left off just when it was getting interesting! Would love to learn how to handle the database with this workflow.

gitignore

Hi All,

I consider myself an intermediate level user of Drupal and a n00b with git. I am involved in the development of a site with drupal setup as a multisite (single code base and multiple databases). The site is primarily focused on user-generated content (where uploaded files play a significant role) and these files can, if we get our user base, grow to very large numbers.

I am in a dilemma as to whether to gitignore the files (and hence achieve, I believe, a significant boost in server performance; backing up the files folder regularly using rsync) or, alternatively, to use git itself as a backup mechanism (therefore not using gitignore) facilitating easy rollback if (and when) necessary.

Would appreciate responses from the community and thanks in advance.

Best regards,
Arun.

Great tutorial. Images would make it better.

Thanks for this great tutorial.

It would be great to add images with repos and arrows between them to represent actions for each command (or at least at the beginning) - that would remove any ambiguity between dev, staging and production environments.

Also, it is still unclear about using DB (I saw TODO about it) in branches. Maybe, using Backup and Migrate for every push to remote (through git hook) could solve this issue.