Cron script for multi source, multi site setup
Last modified: March 22, 2008 - 15:44
I've just hacked up this little script because I have multiple Drupal installations, each with multiple sites, and more get added every day. It should figure out which sites cron needs to run for.
#!/bin/bash
SITESROOT=/var/www/sites
MYIPRANGE=41.204.221
# get the base installs
cd "$SITESROOT" || exit 1
for drupaldir in $(find . -maxdepth 2 -name INSTALL.mysql.txt | awk -F/ '{print $2}')
do
  cd "$SITESROOT/$drupaldir/sites" || continue
  for site in $(find -L . -maxdepth 1 -type d -iregex "./[a-z].*\.[a-z].*" | awk -F/ '{print $2}')
  do
    IP=$(dig "$site" | sed '/AUTHORITY SECTION/,$d' | grep -v "^;" | grep "IN[[:space:]]*A" | sed 's/.*A\W*//' )
    if echo "$IP" | grep -qF "$MYIPRANGE"
    then
      #echo "Doing cron for $site"
      wget -O - -q "http://$site/cron.php"
    else
      # ':' is the shell no-op; it keeps the else branch valid
      :
      #echo "Skipping cron for $site"
    fi
  done
done
Why did you add the IP-range
Why did you add the IP-range that you use? Is it because every Drupal site (domain) in your multisite setup uses its own unique IP address?
If I have only 1 IP address for all the sites in the multi-site setup, what do I enter in the IP-range line? The last 3 digits of the IP address or the entire IP address?
Why the IP?
Because we develop some sites on this server that subsequently get moved to other servers, and I only want to run cron for the ones that should still be active on this server.
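The check can be sketched as a small function that needs no network access. The function name `resolves_here` and the sample address are hypothetical; in the script above the addresses come from `dig`. Note that this sketch anchors the prefix at the start of each address, which is slightly stricter than the plain `grep` in the original.

```shell
#!/bin/bash
# Hypothetical helper: succeed if any address in $1 (one per line)
# begins with the prefix given in $2, e.g. 41.204.221
resolves_here() {
  local ips=$1 prefix=$2
  local ip
  while IFS= read -r ip; do
    case $ip in
      "$prefix".*) return 0 ;;   # prefix followed by a literal dot
    esac
  done <<< "$ips"
  return 1
}

MYIPRANGE=41.204.221
# Sample address for illustration only
if resolves_here "41.204.221.17" "$MYIPRANGE"; then
  echo "run cron"
else
  echo "skip"
fi
```

A site that has moved to another server resolves to an address outside the prefix, so the function returns non-zero and cron is skipped for it.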
A simpler bash script
The script above works, but is a little over-engineered for an environment where you mostly trust the contents of your 'sites' directory to be accurate. It also depends on 'dig' which is not available on every server (I tend to not install it on mine).
This version of the script does a few things differently.
The one case where I'm unsure how the script will behave is this:
The domain (foo.example.com) points at your server and you have a directory named 'foo.example.com' in your sites directory, but you have not configured Apache to route incoming foo.example.com requests to the Drupal multi-site installation. This should not stop the other sites from running; I mention it only because I have not tested this case. The expected cases work, and domains not pointed at the server are skipped.
Put this in a file with a .sh extension, and call that from your crontab.
#!/bin/bash
# This script will iterate through the sites directory of a multi-site install
# and run the cron.php for each named site in the directory.
# NOTE: the site defined in 'sites/default' must have its URL set statically here.
# set the domain of the site defined in 'sites/default'
# comment these lines out if you don't need or use them.
# DEFAULTDOMAIN=www.example.com
# wget -O - -q http://$DEFAULTDOMAIN/cron.php
# set the system path for the multi-site sites directory
SITESROOT=/var/www/drupal-5.10/sites
# set the IP of your server
MYIPRANGE=192.168.1.101
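cd "$SITESROOT" || exit 1 # work in the right dir
# -x matches whole names only, so a site such as 'mall.example.com' is not excluded
for site in $(ls | grep -Evx "all|default")
do
  if ping -c 1 "$site" | grep -q "$MYIPRANGE"
  then
    wget -O - -q "http://$site/cron.php"
  fi
done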
Assume
I assume you meant:
if
done
and not
fi
done
I try and not ass.u.me :-)
In shell scripting, 'fi'
In shell scripting, 'fi' means 'end of the if statement'.
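A minimal sketch of how the two closers pair up: `fi` ends the `if`, and `done` ends the `for` loop. The function name `check_sites` and the site names are made up for illustration.

```shell
#!/bin/bash
# 'if' is closed by 'fi'; 'for' is closed by 'done'.
check_sites() {
  local site
  for site in "$@"; do
    if [ "$site" = "default" ]; then
      echo "skip $site"
    else
      echo "run $site"
    fi    # ends the if statement
  done    # ends the for loop
}

check_sites default www.example.com
```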
egg on face
Got it! Thanks.
===
Doug O.
Video Walkthrough
I just want to give a shout-out to gnat's solution. It worked great for me! Just be sure to change the value of MYIPRANGE to your server's IP address -- after several moments of mucking around with the code and trying to get my cron job to run, I realized I hadn't entered my own IP address.
I found gnat's solution through a helpful video by Matt Petrowsky. The video gives a good overview of cron for new users: http://gotdrupal.com/videos/setting-up-drupal-cron
Script hanging w/out finish
When I run this script from crontab, it doesn't finish.
I went to a bash irc room and they said this about the script:
Please never parse, pipe, grep, capture, read, or loop over the output of 'ls' or 'find'. Despite popular belief, 'ls' is not designed to enumerate files or parse their statistics. Using 'ls' this way is dangerous (word splitting) and there's always a better way; e.g. globs, find -exec, etc.
Is the script not finishing because of the ls command? Thanks.
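Whether 'ls' is what makes the script hang is not established in this thread, but the glob approach the IRC advice mentions is easy to sketch. The function name `list_sites` and the demo directory names below are hypothetical; the idea is to let the shell expand the directory names itself and skip 'all' and 'default' by exact match.

```shell
#!/bin/bash
# Print each site directory under the given sites root, one per line,
# skipping 'all' and 'default'. Uses a glob instead of parsing 'ls'.
list_sites() {
  local root=$1 dir name
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue   # an unmatched glob stays literal; skip it
    name=${dir%/}               # drop the trailing slash
    name=${name##*/}            # keep only the directory name
    case $name in
      all|default) continue ;;
    esac
    printf '%s\n' "$name"
  done
}

# Demo against a throwaway directory (site names are made up)
demo=$(mktemp -d)
mkdir "$demo/all" "$demo/default" "$demo/www.example.com" "$demo/mall.example.org"
list_sites "$demo"
rm -rf "$demo"
```

In the cron script the loop body would call ping and wget (or curl) on each name instead of printing it.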
This works for VHosts with static IPs
Under Mac OS X Leopard server I have my Apache's virtual hosts set up with static IP addresses (for SSL certs, etc.), so setting the MYIPRANGE variable to one IP address failed for me. Here is my adjusted script (based on gnat's) that works for the IP range of the domains and uses curl, since wget isn't on OS X by default.
#!/bin/bash
# This script will iterate through the sites directory of a multi-site install
# and run the cron.php for each named site in the directory.
# NOTE: the site defined in 'sites/default' must have its URL set statically here.
# set the domain of the site defined in 'sites/default'
# comment these lines out if you don't need or use them.
# DEFAULTDOMAIN=www.example.com
# curl --silent --compressed http://$DEFAULTDOMAIN/cron.php
# IP range of vhost domains on server, written as an extended regex
# (assumes the intended range is 192.168.1.110 through 192.168.1.122)
MYIPRANGE="192\.168\.1\.1(1[0-9]|2[0-2])"
# set the system path for the multi-site sites directory
SITESROOT=/home/drupal/html/sites
cd "$SITESROOT" || exit 1 # work in the right dir
for site in $(ls | grep -Evx "all|default"); do
  # get IP of your vhost domain, if need be - comment MYIPRANGE above if used
  #MYIPRANGE=$(host $site | egrep -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$")
  # verify ping
  if ping -c 1 "$site" | grep -Eq "$MYIPRANGE"; then
    # run cron.php for vhost domain
    curl --silent --compressed "http://$site/cron.php"
  fi
done
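The commented-out `host` line in the script pulls the trailing IPv4 address out of the lookup output. Here is a minimal sketch of that extraction run on a canned line rather than a live lookup; the function name `extract_ip` and the sample address are hypothetical.

```shell
#!/bin/bash
# Extract a trailing IPv4 address from each input line, as the
# commented-out 'host' line in the script does.
extract_ip() {
  grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$'
}

# Canned sample line; real input would come from: host "$site"
printf '%s\n' 'www.example.com has address 192.168.1.115' | extract_ip
```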
And here is the 'localhost.user.drupalcron.plist' for launchd to run the 'drupalcron.sh' script above at an interval (in seconds) when placed in /Library/LaunchDaemons. Cron is replaced by launchd on OS X server, though cron is still available.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>localhost.user.drupalcron</string>
<key>ProgramArguments</key>
<array>
<string>/home/drupal/drupalcron.sh</string>
</array>
<key>LowPriorityIO</key>
<true/>
<key>Nice</key>
<integer>1</integer>
<key>StartInterval</key>
<integer>86400</integer>
</dict>
</plist>
Then load the .plist job on demand with the following command in Terminal, or reboot...
sudo launchctl load -w /Library/LaunchDaemons/localhost.user.drupalcron.plist