For a small RPM repository, one can mirror the %{_topdir} directory of the build machine (at Rutgers, /usr/local/src/rpm-packages) over NFS. Repository users can use rpm -qp to find dependencies and package information. However, for large repositories this is infeasible: NFS is insecure, and rpm -qp is slow and unfriendly. Instead, we use apt and rpm2html over FTP and HTTP to distribute packages.
Apt requires a repository layout that differs from %{_topdir}. Instead of grouping RPMs by architecture alone, it groups RPMs into "distributions" and then "components." At Rutgers, we have one distribution for each architecture, for example sparc-sun-solaris2.6 and sparc-sun-solaris2.7.
Each distribution is split into three components: stable, unstable, and testing. This structure is represented by directories:
rpm-packages/sparc-sun-solaris2.6/RPMS.stable
rpm-packages/sparc-sun-solaris2.6/RPMS.testing
rpm-packages/sparc-sun-solaris2.6/RPMS.unstable
rpm-packages/sparc-sun-solaris2.7/RPMS.stable
rpm-packages/sparc-sun-solaris2.7/RPMS.testing
rpm-packages/sparc-sun-solaris2.7/RPMS.unstable
.
.
.
The RPMS for component x go into RPMS.x.
Strictly speaking, each distribution should have parallel
directories for SRPMS (i.e. SRPMS.stable,
SRPMS.testing, and SRPMS.unstable). However,
since each Rutgers distribution contains identical packages, with
few exceptions, there is no reason to duplicate SRPMS. Instead we
have top-level directories SRPMS.x for each
component x.
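The layout described above can be created with a few mkdir calls. The following is a minimal sketch, not part of the repository tools; the function name make_layout is hypothetical, and the distribution names are passed in by the caller.

```shell
#!/bin/sh
# Sketch: create the apt repository skeleton described above.
# Usage: make_layout TOPDIR DISTRIBUTION [DISTRIBUTION ...]
# (make_layout is a hypothetical helper, not a Rutgers tool.)
make_layout () {
    topdir=$1
    shift
    for component in stable testing unstable; do
        # Shared top-level source directory, one per component.
        mkdir -p "$topdir/SRPMS.$component" || return 1
        for dist in "$@"; do
            # Per-distribution binary directory for this component.
            mkdir -p "$topdir/$dist/RPMS.$component" || return 1
        done
    done
}
```

For example, make_layout rpm-packages sparc-sun-solaris2.6 sparc-sun-solaris2.7 creates the six RPMS directories listed above plus SRPMS.stable, SRPMS.testing, and SRPMS.unstable.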
Apt adds a "base" directory to each distribution. These are generated with the genbasedir command:
genbasedir --topdir=topdir distribution component [ component ... ]
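When there are several distributions, it can help to derive the genbasedir invocations from the directory tree. The sketch below only prints the commands, using the syntax shown above; it assumes that every subdirectory of topdir containing RPMS.x directories is a distribution. The function name genbasedir_commands is hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) one genbasedir command per distribution.
# Usage: genbasedir_commands TOPDIR
# Assumption: a distribution is any subdirectory of TOPDIR that
# contains at least one RPMS.<component> directory.
genbasedir_commands () {
    topdir=$1
    for dir in "$topdir"/*/; do
        [ -d "$dir" ] || continue
        dist=$(basename "$dir")
        components=
        for rpmdir in "$dir"RPMS.*; do
            [ -d "$rpmdir" ] || continue
            # Strip everything up to and including "RPMS.".
            components="$components ${rpmdir##*/RPMS.}"
        done
        if [ -n "$components" ]; then
            echo "genbasedir --topdir=$topdir $dist$components"
        fi
    done
}
```

Review the printed commands before running them, for instance by piping the output to sh once it looks right.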
To generate web pages for the repository, we use rpm2html. It generates static web pages that index the RPMs in each distribution and component by name, date, group, vendor, maintainer, and RPM distribution, as well as a static web page for each package.
On rpm.rutgers.edu the rpm2html configuration file is stored in /etc/rpm2html.config (which should be checked out from /etc/RCS/rpm2html.config,v). It has a header:
; $Id: maintaining_repository.html,v 1.1.1.1 2001/12/14 20:38:46 sbi Exp $
maint=Sam Isaacson
mail=sbi@nbcs.rutgers.edu
dir=/free/public/file/0/rpm-html
; prefix on web server
url=http://rpm.rutgers.edu/rpm-html
and sections for each distribution-component combination, e.g.:
[/free/public/file/0/rpm-packages/sparc-sun-solaris2.8/RPMS.stable]
name=Stable sparc-sun-solaris2.8 packages
ftp=ftp://rpm.rutgers.edu/rpm-packages/sparc-sun-solaris2.8/RPMS.stable
ftpsrc=ftp://rpm.rutgers.edu/rpm-packages/SRPMS.stable
color=#66aa66
subdir=sparc-sun-solaris2.8
The format of /etc/rpm2html.config is documented here.
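Because every distribution-component combination needs its own section, the sections can be generated rather than typed. The following sketch reproduces the pattern of the example section above; the function name rpm2html_section is hypothetical, and the color is a parameter because only the color for stable appears in this document.

```shell
#!/bin/sh
# Sketch: print one rpm2html config section in the pattern shown above.
# Usage: rpm2html_section REPO_ROOT FTP_BASE DISTRIBUTION COMPONENT COLOR
# (rpm2html_section is a hypothetical helper.)
rpm2html_section () {
    root=$1
    ftp=$2
    dist=$3
    component=$4
    color=$5
    # Capitalize the component for the display name (stable -> Stable).
    first=$(printf '%s' "$component" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$component" | cut -c2-)
    cat <<EOF
[$root/$dist/RPMS.$component]
name=$first$rest $dist packages
ftp=$ftp/$dist/RPMS.$component
ftpsrc=$ftp/SRPMS.$component
color=$color
subdir=$dist
EOF
}
```

Calling rpm2html_section /free/public/file/0/rpm-packages ftp://rpm.rutgers.edu/rpm-packages sparc-sun-solaris2.8 stable '#66aa66' prints the example section shown above.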
To run rpm2html, execute the command rpm2html /etc/rpm2html.config. This must be done as a user with write permission to the directory "dir" specified in the /etc/rpm2html.config header.
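A quick check of the write permission before starting a long rpm2html run might look like the sketch below. It assumes the output directory is given by the first dir= line of the configuration file, as in the header shown earlier; both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: verify write access to the rpm2html output directory.
# Assumption: the first "dir=" line of the config names that directory.

# Print the value of the first dir= line in the given config file.
rpm2html_output_dir () {
    sed -n 's/^dir=//p' "$1" | head -n 1
}

# Succeed only if that directory exists and is writable by this user.
can_run_rpm2html () {
    outdir=$(rpm2html_output_dir "$1")
    [ -n "$outdir" ] && [ -d "$outdir" ] && [ -w "$outdir" ]
}
```

One possible use is can_run_rpm2html /etc/rpm2html.config && rpm2html /etc/rpm2html.config.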
In addition to RPMs, the Rutgers RPM repository contains the associated specfiles and sources, and caches of virtual packages per distribution. Since rpm does not add %include'ed specfiles to SRPMs, distributing the specfiles separately facilitates repository maintenance.
The caches are used by the RPM bootstrap scripts. They contain the RPMs generated by building and signing the output of solaris-contents.pl -c and shells.pl -c on the build machines. Bootstrap.pm expects to find the cache for distribution x in x/CACHE. For more information on configuring the location, see the file bootstrap.html in the bootstrap source.
Moving hundreds of RPMs and then running genbasedir and rpm2html by hand can be quite a pain. Thus the Perl script update-apt.pl is provided to automate most of the installation process. It requires the presence of "pending" directories /free/PENDING.x for each component x. When run, it installs each file in /free/PENDING.* in the repository, and runs genbasedir and rpm2html if necessary. For more information on configuring and running update-apt.pl, see scripts.html in the repository-scripts source.
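Before running update-apt.pl, it can be useful to preview what is waiting in the pending directories. The sketch below only lists files and does not reproduce anything update-apt.pl does; the function name list_pending is hypothetical, and the parent directory is passed in (at Rutgers it would be /free).

```shell
#!/bin/sh
# Sketch: list the files waiting in PENDING.<component> directories.
# Usage: list_pending PARENT_DIR     (for example: list_pending /free)
# (list_pending is a hypothetical helper; it changes nothing on disk.)
list_pending () {
    parent=$1
    for component in stable testing unstable; do
        pending="$parent/PENDING.$component"
        [ -d "$pending" ] || continue
        find "$pending" -type f | sort | while read -r file; do
            # Print the path relative to the pending directory.
            echo "$component: ${file#"$pending"/}"
        done
    done
}
```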
In addition to update-apt.pl, clean-up-rpm2html.pl is provided to remove the search form from the rpm2html output. It is run automatically by update-apt.pl. It takes no command-line arguments. Again, see scripts.html.
Documents on writing specfiles and building RPMs are located in the rpm-guide directory on smeagol.rutgers.edu, the OSS CVS server.
Once you have working RPMs, build them for each supported architecture on identically configured machines. Use remote-rpm (also in CVS on smeagol.rutgers.edu). Make sure that none of the files that you use will overwrite ones already in the repository. Sign the output with your GPG private key.
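The overwrite check can be sketched as a comparison of file names against the existing repository tree. This is an illustration only; the function name find_collisions is hypothetical, and it assumes that file names contain no whitespace or shell pattern characters.

```shell
#!/bin/sh
# Sketch: report new files whose names already exist in the repository.
# Usage: find_collisions REPO_DIR FILE [FILE ...]
# Returns 1 if at least one collision is found, 0 otherwise.
# Assumption: file names contain no whitespace or glob characters.
find_collisions () {
    repo=$1
    shift
    status=0
    for new in "$@"; do
        name=$(basename "$new")
        # Look for any existing file with the same name.
        existing=$(find "$repo" -type f -name "$name" | head -n 1)
        if [ -n "$existing" ]; then
            echo "$new would overwrite $existing"
            status=1
        fi
    done
    return $status
}
```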
Use update-apt.pl to install the packages. If the stable distribution is frozen, install them in testing or unstable. If the stable distribution is stabilizing, and your packages fix bugs, install them in stable. If your packages are of alpha software or are otherwise unfit for public consumption, install them in unstable. If possible, add your specfile to the SPECS directory in CVS on smeagol.rutgers.edu.
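The placement rules above can be written down as a small decision function. This is one possible reading, not an official policy: the function name choose_component is hypothetical, and sending every case the rules do not cover to testing is an assumption.

```shell
#!/bin/sh
# Sketch: pick a component according to the rules above.
# Usage: choose_component STABLE_STATE FIXES_BUGS UNFIT
#   STABLE_STATE: frozen | stabilizing
#   FIXES_BUGS:   yes | no
#   UNFIT:        yes | no   (alpha or otherwise unfit for public use)
choose_component () {
    state=$1
    fixes_bugs=$2
    unfit=$3
    if [ "$unfit" = yes ]; then
        echo unstable
    elif [ "$state" = stabilizing ] && [ "$fixes_bugs" = yes ]; then
        echo stable
    else
        # Assumption: anything not covered by the rules goes to testing.
        echo testing
    fi
}
```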
Removing old packages from the repository is generally discouraged. Apt will automatically grab the newest version of a program, so you can safely leave old versions without confusing users. Moreover, several systems depend on the presence of exact versions of RPMs. When stable is frozen, put new versions in testing. When stable is stabilizing, check with sysadmins who need fixed versions of software before deleting anything.
If you must prune a repository, the script find-dup.pl in the rpm-tools directory in CVS on smeagol.rutgers.edu can be used to get a rough list of old packages. Beware, though: sometimes its version-sorting algorithm differs from what you may expect. To eliminate all the remnants of a package, including the sources and specfiles, try this sample, untested, script:
#!/bin/sh
# Usage: pass one or more specfiles as arguments. Because of the cd
# below, use absolute paths or paths relative to /free/public/file/0.
ATTIC=/free/ATTIC

# Move the named files into the attic instead of deleting them:
remove () {
    [ $# -gt 0 ] || return 0
    [ -d "$ATTIC" ] || mkdir -p "$ATTIC"
    echo "Removing: $*"
    for i in "$@"; do mv "$i" "$ATTIC"; done
}

cd /free/public/file/0 || exit 1

# Iterate through basenames of package. By default, rpmquery will
# print "%{NAME}-%{VERSION}-%{RELEASE}\n" for all the packages it finds:
for BASENAME in `rpmquery --specfile "$@"`; do
    # Find source RPMs, if possible:
    SRPMS=`find rpm-packages -name "$BASENAME.src.rpm"`
    if [ "X$SRPMS" != "X" ]; then
        # Find the source associated with each SRPM. With the query
        # format %|x?{y}:{z}|, rpmquery searches for tag x in each
        # specified package. If found, it prints y; otherwise, it
        # prints z. We must enclose the inner %{SOURCE} in square
        # brackets as it is a list. Unfortunately the sources are
        # only listed in SRPMs, not specfiles. sed prefixes each
        # source name with its directory in the repository:
        SOURCES=`rpmquery --queryformat '%|SOURCE?{[%{SOURCE}\n]}:{}|' \
            -p $SRPMS | sed 's|^|rpm-packages/SOURCES/|'`
        # $SOURCES and $SRPMS are left unquoted on purpose, so that
        # each file name becomes a separate argument:
        remove $SOURCES
        remove $SRPMS
    fi

    # Remove RPMS:
    remove `find rpm-packages -name "$BASENAME.*.rpm"`
done

# Remove specfiles:
remove "$@"
If you want to change a large number of packages, consider building
an entirely new repository with the build-repository tools (in CVS
on smeagol.rutgers.edu).
Everything that goes into the repository should be signed by its builder with GPG. Use check-sigs.pl to ensure that the signatures of all packages in the repository are valid.
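For a quick spot check without check-sigs.pl, the same idea can be sketched with rpm's own signature check. This is not check-sigs.pl, whose behavior is not described here. The function name check_all_sigs and the SIGCHECK variable are hypothetical; the default command is rpm --checksig, and the sketch relies only on its exit status, which may vary with the rpm version and the keys installed on the machine.

```shell
#!/bin/sh
# Sketch: run a signature check on every RPM under a directory.
# Usage: check_all_sigs REPO_DIR
# SIGCHECK (hypothetical variable) overrides the checking command;
# the default is "rpm --checksig". Only the exit status is used.
# Assumption: package paths contain no whitespace.
check_all_sigs () {
    repo=$1
    failed=0
    for pkg in $(find "$repo" -type f -name '*.rpm' | sort); do
        if ! ${SIGCHECK:-rpm --checksig} "$pkg" >/dev/null 2>&1; then
            echo "BAD SIGNATURE: $pkg"
            failed=1
        fi
    done
    return $failed
}
```

The function returns a non-zero status if any package fails, so it can be used in a cron job or before running update-apt.pl.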