Continuous integration/Architecture/Castor
Castor is an umbrella term for caching the dependencies and package manager material used by the isolated CI instances.
CI jobs start up in a fresh environment, so they have to retrieve dependencies over the internet and, for native dependencies, compile them. The download phase can be arbitrarily long with package managers such as maven, which download a long list of dependencies, and it carries the risk of upstream blacklisting our network for abusing bandwidth. The installation and compile phases can be quite slow as well, and it does not make sense to compile the same material again and again.
We introduced a very lame system based on rsync. It copies a list of directories from the instance to a central place whenever a change succeeds in the Zuul gate-and-submit pipeline. When a job starts, it first attempts to retrieve the material from the central cache, thus warming up the local cache before invoking the package manager. The cache itself is namespaced by:
Variable | Description |
---|---|
ZUUL_PROJECT | The git project name |
ZUUL_BRANCH | The git branch the patch has been made against |
JOB_NAME | The Jenkins job name |
Mechanism
For reference, see integration/config.git:jjb/castor.yaml
- Instance: integration-castor05.integration.eqiad.wmflabs, configured in Jenkins via the CASTOR_HOST env variable
- Location: /srv/castor/
When a job in gate-and-submit is successful, it triggers the Jenkins job castor-save, which runs on the Castor instance. That job connects to the instance the original gate job ran on and then rsyncs the package manager caches to the Castor instance.
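The save step described above can be sketched as a small planning function. This is only an illustration under stated assumptions, not the content of jjb/castor.yaml: the function name, the `SUCCESS` result string, the rsync flag, the list of cache directories and the layout under /srv/castor/ are all hypothetical.

```python
from typing import List

CASTOR_ROOT = "/srv/castor"  # storage location named on this page


def plan_cache_save(pipeline: str, result: str, origin_host: str,
                    namespace: str, cache_dirs: List[str]) -> List[List[str]]:
    """Return the rsync commands a save job might run, or [] if nothing is saved.

    Hypothetical helper: the real castor-save job may behave differently.
    """
    # Only successful gate-and-submit builds are allowed to update the cache.
    if pipeline != "gate-and-submit" or result != "SUCCESS":
        return []
    commands = []
    for directory in cache_dirs:
        # Pull from the instance the gate job ran on (path relative to the remote home dir).
        source = f"{origin_host}:{directory}/"
        # Assumed layout: one sub-directory per namespace under the Castor root.
        target = f"{CASTOR_ROOT}/{namespace}/{directory}/"
        commands.append(["rsync", "--archive", source, target])
    return commands
```

Restricting saves to successful gate-and-submit builds means that only material produced by changes that were actually merged ends up in the shared cache.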
The cache is namespaced by: Gerrit project name with / replaced by - (eg: mediawiki-core), target branch (eg: master) and job name (eg: rake-jessie).
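As a minimal sketch of that namespacing rule, the three components could be combined as follows. The function name and the use of / to join the components into a path are assumptions made for illustration; only the replacement of / by - in the project name comes from the description above.

```python
def castor_namespace(zuul_project: str, zuul_branch: str, job_name: str) -> str:
    """Build a cache namespace from the Zuul/Jenkins variables (hypothetical helper)."""
    # The Gerrit project name has "/" replaced by "-", e.g. mediawiki/core -> mediawiki-core.
    project = zuul_project.replace("/", "-")
    # Assumption: the three components form nested directories, joined with "/".
    return "/".join([project, zuul_branch, job_name])


print(castor_namespace("mediawiki/core", "master", "rake-jessie"))
# prints: mediawiki-core/master/rake-jessie
```

Keying the cache on project, branch and job keeps caches from unrelated jobs apart, at the cost of storing the same packages several times.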
Each job has a builder macro that attempts to rsync the cache from castor into the home dir, thus populating the local cache. When the package manager installer is run (eg: npm install), it hits the local cache, which saves it from having to download packages over the internet.
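The restore step can be sketched in the same spirit. The function below only builds the rsync command line from the environment; the rsync daemon module name `caches`, the flag and the function name are assumptions and are not taken from the actual JJB macro.

```python
from typing import List, Mapping


def build_restore_command(env: Mapping[str, str], home: str) -> List[str]:
    """Build an rsync command that would warm the local cache (hypothetical helper)."""
    host = env["CASTOR_HOST"]  # global variable set on the Jenkins controller
    namespace = "/".join([
        env["ZUUL_PROJECT"].replace("/", "-"),  # "/" replaced by "-" in the project name
        env["ZUUL_BRANCH"],
        env["JOB_NAME"],
    ])
    # Assumption: the Castor instance exposes the caches through an rsync
    # daemon module, here called "caches".
    source = f"rsync://{host}/caches/{namespace}/"
    return ["rsync", "--archive", source, home.rstrip("/") + "/"]
```

Since the macro only attempts the retrieval, a caller would presumably ignore a failing rsync (for example when no cache exists yet for the namespace) and let the package manager download everything as usual.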
The JJB macro refers to the host using the CASTOR_HOST environment variable, which is configured as a global variable on the Jenkins controller.
/srv/castor is a Cinder volume mounted in the instance.