My Own Private Debian Repository

So now that I’ve got all these .deb files made from non-free commercial software packages, and some more packages from unstable (since ANSYS depends on libopenmotif, and I needed a more current version of puppet), and a deb package of Torque based off the work of the nice folks at SARA, I need a place to put them. I already had a local Debian mirror, since many of my systems don’t get off-campus Internet access, and it’s used by other Debian users on campus.

So I made a new Debian-style repository off to the side of my regular Debian mirror. Features include:

  • A pool directory where all the binaries and sources actually reside
  • A release directory for etch, including binaries for both i386 and amd64 architectures, and sources
  • GPG-signed, so I can convince apt to download from it without warning me about unauthenticated sources
  • Protected from unauthorized access via Apache directive, proftpd directive, and keyed ssh access.

This covers some of the same territory as Roberto Sanchez’s 2005 howto, but goes a bit further in some areas.

Let’s start with the directory structure and contents. Starting at the top-level directory (topdir), there’s a pool directory where all the packages get uploaded. Its directory structure includes:

./pool
./pool/contrib
./pool/main
./pool/main/p
./pool/main/p/puppet
./pool/main/t
./pool/main/t/torque
./pool/non-free
./pool/non-free/a
./pool/non-free/a/abaqus66
./pool/non-free/a/ansys10
./pool/non-free/a/ansys11
./pool/non-free/m
./pool/non-free/m/maple11
./pool/non-free/m/matlab74
./pool/non-free/o
./pool/non-free/o/openmotif
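
The layout follows the usual Debian pool convention: component, then the first letter of the package name, then the package name. As a rough sketch of that naming rule (pool_dir is a hypothetical helper, not part of my mkpackages script, and it ignores the official archive's special case of four-character prefixes such as libf for lib* packages):

```shell
#!/bin/bash
# Hypothetical helper: print the pool directory for a package,
# given its component (main, contrib, non-free) and its name.
pool_dir() {
    local component="$1" package="$2" first
    # First character of the package name picks the one-letter subdirectory
    first=$(printf '%s' "$package" | cut -c1)
    echo "pool/$component/$first/$package"
}

# Show where a new upload would go; wrap the call in mkdir -p "$(...)"
# to create the directory before copying files in.
pool_dir non-free matlab74
```

Run from topdir, the last line prints the directory a package belongs in, such as pool/non-free/m/matlab74.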

These are all created as needed, as I upload new packages. Once that’s populated, a mkpackages script I wrote handles the rest:

#!/bin/bash
# mkpackages: rebuild the Packages/Sources indexes and the signed Release
# file for every release in the repository.

topdir="REDACTED"
releases="etch"
categories="main non-free"
architectures="i386 amd64"

for release in $releases; do
    cd "$topdir" || exit 1
    for category in $categories; do
        # One Packages index per architecture
        for architecture in $architectures; do
            mkdir -p dists/$release/$category/binary-$architecture
            dpkg-scanpackages -a $architecture pool/$category /dev/null \
                2>/dev/null > \
                dists/$release/$category/binary-$architecture/Packages
            gzip -c dists/$release/$category/binary-$architecture/Packages > \
                dists/$release/$category/binary-$architecture/Packages.gz
        done

        # Sources are architecture-independent, so index them once per category
        mkdir -p dists/$release/$category/source
        dpkg-scansources pool/$category /dev/null \
            2>/dev/null > \
            dists/$release/$category/source/Sources
        gzip -c dists/$release/$category/source/Sources > \
            dists/$release/$category/source/Sources.gz
    done

    cd dists/$release || exit 1
    # Remove the old Release files first so they are not listed in the new one
    rm -f Release Release.gpg
    # Write to a file outside this directory so Release does not list itself
    apt-ftparchive release . \
        -o APT::FTPArchive::Release::Origin="TTU CAE Network" \
        -o APT::FTPArchive::Release::Codename="$release" \
        > /root/Release
    mv /root/Release .
    gpg -abs -o Release.gpg Release
done

There are several stages in that script. dpkg-scanpackages makes a Packages index for each architecture and component. Similarly, dpkg-scansources makes a Sources file for each component. Excerpt from the non-free amd64 Packages file:

Package: matlab74
Version: 7.4.0.287+0.3
Priority: extra
Section: unknown
Maintainer: Mike Renfro <renfro@tntech.edu>
Depends: libxext6, libxp6, libxt6
Architecture: amd64
Filename: pool/non-free/m/matlab74/matlab74_7.4.0.287+0.3_amd64.deb
Size: 790362804
Installed-Size: 1479976
MD5sum: 54db1d34c9d7f377d4d67116bebe9bcb
Description: MATLAB - The Language of Technical Computing
 MATLAB is a high-level language and interactive environment that enables

and from the non-free Sources file:

Format: 1.0
Package: matlab74
Version: 7.4.0.287+0.3
Binary: matlab74
Maintainer: Mike Renfro <renfro@tntech.edu>
Architecture: any
Standards-Version: 3.7.2
Build-Depends: debhelper (>= 5)
Directory: pool/non-free/m/matlab74
Files:
 086a9f30fb37bdefb8319f3dee9270e6 271 matlab74_7.4.0.287+0.3.dsc
 ec0c7521724f54fcd457f4a2bcf6f69a 1598177945 matlab74_7.4.0.287+0.3.tar.gz
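
The Packages file is plain text made of Field: value stanzas, so a quick summary is easy to pull out with awk after each rebuild. A minimal sketch, assuming the stanza layout shown in the excerpt above, where Filename comes after Package and Version (list_index is a hypothetical helper name):

```shell
#!/bin/bash
# Hypothetical helper: read a Packages file on stdin and print one
# "name version filename" line per stanza.
list_index() {
    awk '
        $1 == "Package:"  { pkg = $2 }
        $1 == "Version:"  { ver = $2 }
        $1 == "Filename:" { print pkg, ver, $2 }
    '
}

# Usage, from topdir:
#   list_index < dists/etch/non-free/binary-amd64/Packages
```

A package that was uploaded to the pool but is missing from this listing usually means the index has not been regenerated, or the .deb landed under the wrong component.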

So far, this gets us a usable apt repository, but one that will cause apt to constantly complain about unauthenticated packages. There’s always the apt.conf option to allow unauthenticated sources (APT::Get::AllowUnauthenticated "true";), but it’s a hack. So let’s make a properly signed repository, configure our clients to trust the repository key, and have some small degree of tamperproofing on the packages.

For repository signing:

  1. Generate a GPG key for the repository with gpg --gen-key
  2. Make a Release file with apt-ftparchive. To make this match the regular Debian Release files, I run apt-ftparchive from the top-level release directory (topdir/dists/etch, in this case), and redirect the output to a temporary Release file elsewhere. This is mostly to keep the Release file from being self-referential.
  3. Move the Release file into the etch directory, and sign it with gpg, resulting in a Release.gpg file.

Step 1 needs to be done manually, but only once. Steps 2 and 3 need to be done every time you upload a new package, a new version of an existing package, etc. My mkpackages script does both those steps, so I just run it instead.
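
The tamperproofing works as a chain: Release.gpg signs Release, Release lists hashes of the index files, and the index files list hashes of the packages. The signature itself can be checked by hand with gpg --verify Release.gpg Release. For the next link in the chain, here is a sketch that compares the hashes recorded in Release with the files on disk. It assumes the Release file has a SHA256 section, which current apt-ftparchive versions write; older versions may only list MD5Sum and SHA1, in which case this check passes without testing anything. check_release_hashes is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: check every file listed in the SHA256 section of
# <dist_dir>/Release against the file actually on disk.
check_release_hashes() {
    local dist_dir="$1"
    [ -r "$dist_dir/Release" ] || return 1
    # Entries in a checksum section are indented: " <hash> <size> <path>"
    awk '/^SHA256:/ { on = 1; next }
         /^[^ ]/    { on = 0 }
         on && NF == 3' "$dist_dir/Release" |
    {
        status=0
        while read -r hash size path; do
            actual=$(sha256sum "$dist_dir/$path" 2>/dev/null | cut -d' ' -f1)
            if [ "$actual" != "$hash" ]; then
                echo "MISMATCH: $path"
                status=1
            fi
        done
        # Exit status of this group becomes the function's return value
        [ "$status" -eq 0 ]
    }
}

# Usage: check_release_hashes /full/path/to/topdir/dists/etch
```

A mismatch right after uploading a package just means mkpackages has not been rerun yet.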

For getting clients to trust the repository’s key:

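  1. Export the repository’s public key with gpg --export --armor > repository_key.asc (if the keyring holds more than one key, name the repository key’s ID after --armor so only that key is exported)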
  2. Store this exported key file somewhere on the repository server, in your puppet manifests, etc.
  3. On each client, download the exported key file, then run apt-key add repository_key.asc to add it to the trusted key list.
  4. Now, when you add an entry for your repository to sources.list, the Release and Release.gpg files will be downloaded and authenticated, and your clients will have some degree of confidence that the packages they download are unaltered.

Regarding limiting the access to the repository, this isn’t for reducing network traffic as much as it is for averting enormous copyright problems. The non-free areas of my repository contain some relatively expensive software packages. Granted, they’re educational licenses, and so they might not have all the features of the commercial version. Also granted that outsiders are more likely to just pirate cracked commercial versions from their favorite P2P network than to grab my copies. But my systems administration career has already included one visit from the U.S. Customs Service (not at the university, but elsewhere — and it wasn’t anything I did, but some less than legal activities one of our customers was conducting); I’d rather not have a second visit from any similar organizations.

So I’ve got three methods of accessing the repository: http, ftp, and ssh. Securing each is done as follows:

http — Added the following lines to the server’s Apache configuration:

        <Location REDACTED>
                order deny,allow
                deny from all
                Allow from A.B.C.0/24
        </Location>

where REDACTED is the directory part of the URL for apt, and A.B.C is the first three numbers in the class C that the majority of my systems are in.

ftp — Added the following two stanzas to proftpd.conf:

<Class cae-managed-system>
  From A.B.C.0/24
</Class>

and

  <Directory REDACTED>
    <Limit ALL>
      AllowClass cae-managed-system
      DenyAll 
    </Limit>
  </Directory>
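
Both the Apache and proftpd rules key on the same A.B.C.0/24 network, which covers every address sharing its first three octets. As a small illustration of what that match means (in_slash24 is a hypothetical helper, and 192.0.2.0 is a documentation-only example network standing in for the real one):

```shell
#!/bin/bash
# Hypothetical helper: succeed if an IPv4 address lies in the /24 network
# whose network address is given as the second argument.
in_slash24() {
    local address="$1" network="$2"
    # A /24 mask compares only the first three octets, so strip the last one
    [ "${address%.*}" = "${network%.*}" ]
}

in_slash24 192.0.2.17 192.0.2.0 && echo "192.0.2.17 would be allowed"
in_slash24 192.0.3.17 192.0.2.0 || echo "192.0.3.17 would be denied"
```

Systems outside that one network, such as a managed machine on a different subnet, would need their own Allow from and From lines.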

ssh — This is a neat and much more secure method, but seems much slower than http and ftp methods. The speed problem may be unavoidable with the Xen system I’m running the ftp server on, and it may still be fast enough for what I need. It’s not like I’ll be updating packages on a daily basis once I’ve got the initial set of them running. For this method, we need the following two items:

  1. An unprivileged account on the ftp server. This account could be a system account, but it needs a real shell, not /bin/false. To minimize the damage that can be done with this account, I made sure that it can’t write anything to its home directory. It can only write to /tmp and other public areas.
  2. Root-owned RSA or similar identities on the client systems that will be accessing the repository over ssh. Each of these root accounts will effectively get password-less ssh access to the ftp server as this unprivileged user, so don’t hand these out to just anybody.

Generate the root RSA key on your client systems with ssh-keygen (use these instructions at debian-administration.org if you need them). Copy the public half of the identity to the ftp server, and add it to the end of the unprivileged account’s authorized_keys file. Now a sources.list line of the form:

deb ssh://unprivileged-user@ftp-host/full/path/to/topdir/ etch main non-free

should let you pull all the packages and other files over ssh.
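
Since each client key gives shell access to the unprivileged account, it may be worth narrowing what the key can do. A sketch, assuming OpenSSH's authorized_keys options (from=, no-port-forwarding, no-X11-forwarding, no-agent-forwarding). restricted_key_line is a hypothetical helper, the key shown is a placeholder, and these restrictions are untested against apt's ssh method:

```shell
#!/bin/bash
# Hypothetical helper: wrap a public key in OpenSSH authorized_keys options
# so it is only accepted from the managed network and cannot forward anything.
restricted_key_line() {
    local from_pattern="$1" public_key="$2"
    printf 'from="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding %s\n' \
        "$from_pattern" "$public_key"
}

# Example with a placeholder key; substitute the contents of the client's
# public key file and the real network pattern.
restricted_key_line 'A.B.C.*' 'ssh-rsa AAAA...placeholder root@client'
```

The printed line would go into the unprivileged account's authorized_keys file in place of the bare public key.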

10 Comments

  1. Mike, you can cut one step out from your (Packages|Sources).gz generation scripts:

    mkdir -p dists/$release/$category/binary-$architecture
    dpkg-scanpackages -a $architecture pool/$category /dev/null \
    2>/dev/null | gzip -c > \
    dists/$release/$category/binary-$architecture/Packages.gz

    mkdir -p dists/$release/$category/source
    dpkg-scansources pool/$category /dev/null \
    2>/dev/null | gzip -c > \
    dists/$release/$category/source/Sources.gz

  2. Yeah, I can drop the uncompressed Packages files entirely, but I kept them in there because I occasionally browse them outside of apt. I may have misconfigured Apache2, or else my normal Windows Firefox settings, because if I try to view a .gz file, Firefox will just try to download it rather than uncompressing it on the fly.

  3. Nice.

    But what happens if I have multiple releases? My pool directory will contain different revisions of the same package. I want dpkg-scan* to scan only the package revisions for a particular release (as in http://ftp.debian.org). How do I do that?

  4. I think you’d do multiple releases with an overrides file for dpkg-scanpackages and dpkg-scansources.

    I haven’t needed that myself, since I tend to stick with Debian stable, only upgrade or reinstall during a semester break, and most of my packages are big commercial binaries that would work on multiple Debian versions anyway.

    But this Berlios page has instructions that indicate something along the lines of

        dpkg-scanpackages -m -ai386 pool/contrib/d/damaris indices/override.testing.contrib | gzip -c > dists/testing/contrib/binary-i386/Packages.gz

    for testing and

        dpkg-scanpackages -ai386 pool/contrib/d/damaris indices/override.stable.contrib | gzip -c > dists/stable/contrib/binary-i386/Packages.gz

    for stable.

    That might be enough to get you working. Let me know if it is.

  5. Wow, now this is very interesting. I tried doing this following your method exactly as you have set it out, but somehow I’m not able to pack those files in my final .deb?
