It all started so simply: I was going to set up a little Xen instance to be my next cluster submit host, and needed a spare address for it:
- I started setting up an instance for ch208i.cae.tntech.edu, since it was no longer on the Xen host like it was several months ago. Crap: it’s no longer on the Xen host because I moved it to its own dedicated hardware, and it’s still my main ftp/mirror server. Ctrl-C that one.
- Hmm, what’s available from old Xen instances? mail2.cae.tntech.edu.cfg, from when I was testing out a new mail server setup last fall: doesn’t ping, doesn’t show up in `xm list`, no problem.
  xen-create-image --hostname=mail2.cae.tntech.edu --ip=22.214.171.124 \
      --gateway=126.96.36.199 --netmask=255.255.255.0 --size=10Gb --memory=256Mb \
      --swap=1Gb --debootstrap --force
A few minutes later, my instance is debootstrapped and ready to go.
- Oh, crap. Why am I getting an error on `xm create` that says my LVM is already in use on a domU somewhere?
- Further crap. Looking in `/etc/xen/mail.cae.tntech.edu.cfg` for the production mail server, I see that it apparently uses the old mail2.cae.tntech.edu LVMs. Wonderful. `ssh mail`? It works since sshd was already memory-resident, but `/root/.profile` doesn’t exist. And neither does much of anything else.
- Great. I’ve just killed the mail server. Off to the Amanda server to do a quick restore of its data. What? I never put mail.cae.tntech.edu into the backup list? Not normally the end of the world, since the mail stores are held on the main file server and accessed over NFS, but what about my dovecot and postfix configurations?
- Oh, well. Time to see how good my puppet manifests are for the mail server.
Not too bad, as it turns out. Total downtime was only a couple of hours, including having to redo the postfix and dovecot configurations (which were then copied off to the puppetmaster). I still have a few more things to fix, but mail delivery is up, and IMAP is running. TLS support for sending mail from home isn’t up yet, but it’ll be fixed shortly.
I still need to fix that submit host, though. Next time, I think I’ll use an IP address reserved for my office.
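In hindsight, pinging the address and checking `xm list` only proved that no running domU was called mail2; neither check would have caught another domU's config pointing at the mail2 volumes. A minimal sketch of the check I skipped is below. It assumes the domU configs live in `/etc/xen/*.cfg` and name their disks with `phy:` device paths whose last component contains the hostname (which is how xen-tools appears to name the volumes it creates). The function names and the `/dev/vg0` volume group in the usage note are made up for illustration.

```python
import re
from pathlib import Path

# Matches the device path in Xen disk entries such as
# 'phy:/dev/vg0/example-disk,xvda2,w' (volume group name is hypothetical).
PHY_RE = re.compile(r"phy:([^,'\"\s]+)")


def volumes_in_config(cfg_path):
    """Return the set of phy: device paths mentioned in one domU config file."""
    volumes = set()
    for line in Path(cfg_path).read_text(errors="replace").splitlines():
        # Skip commented-out lines so stale entries do not raise false alarms.
        if line.lstrip().startswith("#"):
            continue
        volumes.update(PHY_RE.findall(line))
    return volumes


def configs_using(hostname, cfg_dir="/etc/xen"):
    """Map each config file name to the volumes it references whose
    device name contains the given hostname."""
    hits = {}
    for cfg in sorted(Path(cfg_dir).glob("*.cfg")):
        matching = sorted(
            v for v in volumes_in_config(cfg) if hostname in Path(v).name
        )
        if matching:
            hits[cfg.name] = matching
    return hits
```

Calling `configs_using("mail2.cae.tntech.edu")` before reaching for `--force` should list every config file that still points at a mail2 volume. If anything other than the config for the host being rebuilt shows up, stop. The substring match is deliberately loose: for a safety check, a false alarm costs a lot less than a miss.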
Update: after getting a partial TLS/SASL setup going late Wednesday night, I went to sleep without realizing I’d killed mail delivery again. Finally got it straightened out Thursday morning.