Authentication Servers

The whole point of an authentication service is that it allows the client to prove itself to be trustworthy, or at least to prove itself to be the same nefarious character it claims to be.

Infrastructures.org

I want to make our existing Active Directory the source for all the following:

  • Lists of users allowed to log into the managed infrastructure
  • UIDs and GIDs of those users
  • Passwords of those users

But I’m striking out on everything except the passwords themselves, and I’ve had that part working for years. Despite reading tons of LDAP and NSS howtos, bug reports, etc., I’ve made no new progress on that front. So instead, let me lay out our planned implementation and workflow:

  1. Windows administrator makes a new user account in Active Directory.
  2. Windows administrator (or sometimes me) logs into the main file server and runs a nearly-unmodified adduser script that creates the user’s Unix UID and home directory, and registers the user on either our student mailing list or our faculty/staff mailing list. This script currently has a big ‘ssh in a for loop’ section for adding accounts on the client systems, but that part will be removed.
  3. A cron job on the file server runs every two minutes (even-numbered ones) and puts all the usernames, UIDs, and GIDs into a protected file accessible via NFS export. This takes only a fraction of a second to run.
  4. A cron job on the rest of the Linux systems runs every two minutes (odd-numbered ones — good thing I got time synchronization working earlier, right?), reads the protected file and the contents of our main home directory NFS area, and determines which users should be added to and deleted from its local passwd and shadow files (a sketch of this job follows the list). Even on a freshly-reinstalled system with no domain users added, this takes only seconds to run.
  5. Whenever a user gets deleted via our deluser script, the rest of the Linux systems will detect the deletion by the fact that the user’s home directory will now be owned by root and inaccessible to all other users.
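
As a sketch of steps 4 and 5 from a client’s point of view: the script below assumes the export is one username:UID:GID line per user, and the paths, filenames, and format are placeholders for illustration, not our production code. The real job also has to worry about groups, locking, and skipping system accounts.

#!/bin/sh
# Hypothetical client-side sync job; a cron entry for odd minutes might be:
#   1-59/2 * * * * root /usr/local/sbin/sync-domain-users
EXPORT=/net/fileserver/etc/domain-users   # NFS-exported "username:UID:GID" lines
HOMES=/home                               # main home directory NFS area

# Step 4: add any exported user missing from the local passwd file.
while IFS=: read -r user uid gid; do
    if ! getent passwd "$user" >/dev/null; then
        # (assumes the group already exists locally or is managed elsewhere)
        useradd -u "$uid" -g "$gid" -d "$HOMES/$user" -s /bin/bash "$user"
    fi
done < "$EXPORT"

# Step 5: a home directory handed back to root by the deluser script is
# the signal that the account should be removed here, too.
for dir in "$HOMES"/*; do
    user=${dir##*/}
    if getent passwd "$user" >/dev/null && [ "$(stat -c %U "$dir")" = root ]; then
        userdel "$user"   # a real script should also exclude system accounts
    fi
done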

You may have noticed that there’s nothing in any of this about the user’s actual password. It’s stored only on the domain controllers, never on any of the Linux systems. We use Kerberos and PAM to authenticate users directly against the Active Directory servers. Code and configurations for this setup after the jump.
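
The full files come after the jump, but in rough outline the Kerberos side looks like the sketch below. The realm, KDC names, and UID cutoff are placeholders rather than our production values, and the exact PAM stack varies by Debian release.

/etc/krb5.conf (sketch):

[libdefaults]
    default_realm = EXAMPLE.EDU

[realms]
    EXAMPLE.EDU = {
        kdc = dc1.example.edu
        kdc = dc2.example.edu
    }

/etc/pam.d/common-auth (sketch):

# Try Kerberos against the domain controllers first for ordinary users,
# then fall back to local Unix passwords (root and system accounts).
auth    sufficient  pam_krb5.so minimum_uid=1000
auth    required    pam_unix.so nullok_secure try_first_pass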

Client Application Management (Part 1, for .deb packages)

(Original infrastructures.org writeup here.)

Wow, this part has been a learning experience. The things I’ve picked up out of this stage:

  • aptitude is not apt-get. Obvious, yes. But how different they are was not apparent until this weekend.
  • pkgsync is great, and does exactly what it claims, but read its claims very carefully, since it can wreak legally-precise and well-defined havoc on your system (see the sketch after this list).
  • Xen instances beat actual hardware for testing puppet configurations. As long as the dom0 is accessible, I can pull up a console for a domU and investigate why my host fell off the face of the earth. Doing the same investigation on a real system requires a bit of a drive.
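
To make that pkgsync point concrete: as I read its documentation, it drives aptitude from three lists under /etc/pkgsync, and anything installed that appears in neither of the first two gets removed. Hypothetical list contents:

/etc/pkgsync/musthave (must be installed everywhere):
openssh-server
puppet
ntp

/etc/pkgsync/mayhave (tolerated if already present):
xemacs21

/etc/pkgsync/maynothave (removed wherever found):
telnetd

Leave a package you care about off the lists, and pkgsync will dutifully remove it. That’s the well-defined havoc.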

More after the jump.

Time Synchronization

Time synchronization makes lots of things work better (sample failure messages after the list), including:

  • make
  • Kerberos
  • tar
  • syslog
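
A taste of what breaks when clocks drift, with messages paraphrased from memory (exact wording varies by version):

make: warning:  Clock skew detected.  Your build may be incomplete.
kinit: Clock skew too great while getting initial credentials
tar: foo.c: time stamp 2007-06-14 09:00:00 is 600 s in the future

And syslog timestamps from different machines stop lining up, which makes correlating events across systems miserable.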

We’ve got a central NTP server on campus, and I’m using that to sync from. Puppet handles ntp and ntpdate configuration on the managed systems. Components of that setup:

  • ntp.pp and ntpdate.pp classes imported from puppet/classes
  • Virtualization-detecting facter recipe (originally from here, but also included below since it’s short, and in case the original gets moved). This does two things. First, Xen domUs get their time from the dom0 by default; they won’t fail running ntp, but if the dom0 has the wrong time, you’ll have a hard time ever getting the domUs to the right time, so we’ll make sure ntp isn’t running there, as a reminder. Second, according to the virtualization recipe’s author, VMware guests can’t run ntp at all, so we’ll disable it there, too.

/etc/puppet/facts/virtual.rb

Facter.add("virtual") do
  confine :kernel => :linux
  result = "physical"
  setcode do
    # `which` prints to stdout; we only care about its exit status.
    lspciexists = system "which lspci >/dev/null 2>&1"
    if lspciexists
      output = %x{lspci}
      output.each_line do |p|
        # --- look for the VMware video card to determine
        # if it is virtual => vmware.
        # ---     00:0f.0 VGA compatible controller: VMware ...
        result = "vmware" if p =~ /VMware/
      end
    end
    # VMware server 1.0.3 rpm places vmware-vmx in this place,
    # other versions or platforms may not.
    if FileTest.exists?("/usr/lib/vmware/bin/vmware-vmx")
      result = "vmware_server"
    end
    if FileTest.exists?("/proc/sys/xen/independent_wallclock")
      result = "xenu"
    elsif FileTest.exists?("/proc/xen/capabilities")
      txt = File.read("/proc/xen/capabilities")
      if txt =~ /control_d/i
        result = "xen0"
      end
    end
    result
  end
end

/etc/puppet/manifests/classes/ntp.pp

class ntp {
  $ntppackage = $operatingsystem ? {
      Solaris => "SUNWntpu",
      default => "ntp"
  }
  package { $ntppackage:
      ensure => installed,
      provider => $operatingsystem ? {
          Solaris => "sun",
          default => "apt"
      }
  }

  file { ntpconf:
    path => $operatingsystem ? {
      Solaris => "/etc/inet/ntp.conf",
      default => "/etc/ntp.conf"
    },
    owner => root, group => root, mode => 644,
    source => "puppet://REDACTED/ntp.conf",
    require => Package[$ntppackage],
  }

  service { ntp:
    ensure => $virtual ? {
      vmware => stopped,
      xenu => stopped,
      default => running
    },
    enable => $virtual ? {
      vmware => false,
      xenu => false,
      default => true
    },
    subscribe => [Package[$ntppackage], File[ntpconf]]
  }
}

/etc/puppet/manifests/classes/ntpdate.pp

class ntpdate {
  package { ntpdate: ensure => installed }
}

and one entry from /etc/puppet/manifests/site.pp:

node ch405l {
  include ntp, ntpdate
}

Minor annoyances or deviations from the way things used to be configured: as of Debian 4.0, ntpdate is run when network interfaces are brought up, rather than at a user-defined time via the SysV init system. So if a system was installed with a bad time (most commonly on our dual-boot systems) and you want to avoid reboots, you’ll have to run ntpdate-debian once to get the clock in sync with the NTP server before ntpd will do anything right.
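
That one-time fix looks something like this (ntpd holds UDP port 123, so it has to be stopped before ntpdate can bind the socket):

/etc/init.d/ntp stop
ntpdate-debian        # server list comes from /etc/default/ntpdate
/etc/init.d/ntp start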

The Gold Server

In infrastructures.org terminology, the gold server is the one location that all clients check in with to see if they need to make any configuration changes. No changes needed? No changes made. No gold server available right now? No changes made, check back later. In theory, this should be a pretty simple server to provision. Do a base OS installation, then enable the configuration management daemons, and let everything else grow from there.

Our gold server is a Xen instance with 128 MB memory and 10 GB disk running Debian 4.0. Top-level packages installed include puppet, puppetmaster, cfengine2, subversion, and xemacs21. The subversion repository for the gold server has /etc/cfengine and /etc/puppet as its top-level folders:

[Screenshot: top-level folders of the gold server’s Subversion repository]
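
Seeding that layout is nothing exotic; something along these lines, with the repository path a placeholder:

svnadmin create /var/lib/svn/gold
svn import /etc/puppet file:///var/lib/svn/gold/etc/puppet -m "initial import"
svn import /etc/cfengine file:///var/lib/svn/gold/etc/cfengine -m "initial import"
# then swap each live directory for a working copy of itself
mv /etc/puppet /etc/puppet.pre-svn
svn checkout file:///var/lib/svn/gold/etc/puppet /etc/puppet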

Puppet’s getting started docs work pretty well here. One thing to keep in mind with the Debian and Ubuntu puppet packages is that they ship with --waitforcert set to 0, so they exit immediately after sending their key to the gold server. Once you sign the key on the gold server, puppet will work fine after a /etc/init.d/puppet start, but this gets in the way of unattended installation. I’ll probably put waitforcert back to its default of 120 seconds when I get back to the bootstrapping procedure. I’ve made a /etc/default/puppet file that contains DAEMON_OPTS="--server REDACTED --factsync", which returns waitforcert to its default, brings down new facter facts, and tells puppet where the master server is. That puppet file gets copied during the bootstrap procedure, similarly to the cfengine update.conf. Earlier, I had incorrectly assumed that a nonzero waitforcert value would cause puppet to hang, but it just backgrounds itself and waits for the certificate to be signed.
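
For reference, the certificate exchange between a new client and the gold server goes roughly like this; the hostname is made up, and puppetd/puppetca are the command names from the 0.2x-era packages:

puppetd --server REDACTED --waitforcert 120 --test   # client: send key, wait for signing
puppetca --list                                      # gold server: show pending requests
puppetca --sign newclient.example.edu                # gold server: sign the new client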

As for cfengine setup, I defer to the articles at debian-administration.org, as I don’t have much to add at this time.

Abaqus 6.6 on Debian Etch (amd64 port)

Q. Why does the condemned man get a last cigarette, instead of one of those through-the-skin stick-on nicotine thingies?

A. Don’t patch the executable.

rec.humor.funny

Bad Abaqus! Or more properly, bad ZeroG InstallAnywhere! This post indicates that AMD64 Java doesn’t have the NPTL problems you keep checking for. And then, when I comment out the parts of the installer script that check for that, it shifts the position of the godawful binary you put at the end of what was otherwise a shell script (or two), so it won’t extract properly, and I end up spending half the day trying to figure out why I keep getting

Exception in thread "main" java.lang.NoClassDefFoundError: com/zerog/lax/LAX

errors every time I run the installer. So, a note to anyone else installing Abaqus on amd64 Debian (i386 works fine if you use the -nosystemchecks flag): for any line you comment out, either edit in overwrite mode or delete one character for every # sign you add. Example patch section:

--- /cdrom/lnx86_64/product/UNIX/Disk1/InstData/NoVM/install.bin
+++ /tmp/aba/lnx86_64/product/UNIX/Disk1/InstData/NoVM/install.bin
@@ -2086,49 +2086,49 @@
if [ `uname` = "Linux" ]; then
debugOut "checking for NPTL + JVM vulernability..."
#check libc to see if it was compiled with NPTL
-       nptl="`strings /lib/libc.so.6 | grep -i nptl`"
-       if [ "$nptl" ]; then
-               debugOut "NPTL detected! checking for vulnerable JVM....";
+#nptl="`strings /lib/libc.so.6 | grep -i nptl`"
+#if [ "$nptl" ]; then
+#      debugOut "NPTL detected! checking for vulnerable JVM....";

Note how each # sign has replaced one tab or other whitespace character. Your ultimate goal is to convince the installer that you do not, in fact, need any LD_ASSUME_KERNEL hackery. If you let Abaqus continue with its out-of-the-box amd64 install, all kinds of things like ls, nawk, etc. will complain about problems loading shared libraries.

So copy the installation CD to the hard drive, edit install.bin along the lines shown above, and run the main setup script with the -nosystemchecks flag. You might also want the -jre system flag if you have Sun Java installed.
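
In other words, something like the following sequence; the top-level setup script name here is from memory, so check your media:

cp -a /cdrom /tmp/aba
# edit in overwrite mode so the appended binary payload doesn't shift
vi /tmp/aba/lnx86_64/product/UNIX/Disk1/InstData/NoVM/install.bin
/tmp/aba/setup -nosystemchecks -jre system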

And on a much less aggravating note, to paraphrase from The Princess Bride: this flag -console, I do not think it means what you think it means:

[Screenshot: the Abaqus installer run with the -console flag]

Infrastructures: Version Control

The infrastructures mothership says the following about version control:

It may seem strange to start with version control. Many sysadmins go through their entire careers without it. But infrastructure building is fundamentally a development process, and a great deal of shell, Perl, and other code tends to get generated. We found that once we got good at “doing infrastructures”, and started getting more work thrown at us, we had several distinct infrastructures at various stages of development at any given time. These infrastructures were often in different countries, and always varied slightly from each other. Managing code threatened to become a nightmare.

Managing Unix systems has been either a hobby or a job for me since 1991. I have never, prior to this infrastructure management thing, ever put any of that work under version control. Nor, as far as I know, did any other sysadmins of my acquaintance. The VMS folks had backup versions of DNS and other configuration files made automatically as they edited. The Windows folks didn’t have a whole bunch of text to sling around to begin with. And I guess the Unix folks just sort of winged it like I did.

I have occasionally used version control on more complicated development projects, for the usual reasons: easier change tracking, automatic publishing of the latest versions, and so on. But if the code was simple and nobody else had to see it or use it, it was easier to just write it and let the automatic tape backups handle it from there.

Unattended Debian Installations (or How I Learned to Stop Worrying and Love the preseed.cfg)

A CMR project recently bought 12 new Dell PowerEdge SC1435 servers to replace some of our aging compute cluster systems. In previous server rollouts, I’d generally get one system installed and configured, image it with SystemImager, and then PXE-boot the rest of the systems to pull that image. However, it’s tough to audit exactly what got installed, and how. It’s also arguably a waste of space to keep images for every type of cluster system we have (PowerEdge 2650, PowerEdge 1850, PowerEdge 1855, Dimension 9200, etc.). So enter Debian preseeding. With preseeding, I can write a text file that describes which packages I want installed, which debconf settings differ from the defaults, and how I want the partitioning done, and that can copy configuration files from web or ftp servers onto the target drive. It took a few days to get going, but the long-term payoff should be enormous.

The resulting installation timeline for a PowerEdge SC1435 (relative to power-on in HH:MM:SS):

  • 00:00:25 – Power-on self tests have completed enough to allow me to hit F12 for PXE boot.
  • 00:01:20 – Rest of power-on self tests have completed, PXE boot process starts. All installation parameters are passed in from the pxelinux.cfg file on the DHCP server or the preseed.cfg file on a nearby ftp server.
  • 00:04:50 – Base and standard task packages installed. openssh-server and puppet packages are installed. puppet defaults file is downloaded. System starts formatting a 500GB SATA drive for /tmp space.
  • 00:08:20 – System finishes formatting /tmp, and reboots since that was the last step of the unattended install.
  • 00:09:30 – System is at login prompt. Root password works. puppet will take over installing miscellaneous packages and other post-installation configuration tasks. Once puppet has finished copying over root@adminserver’s public key, I can log in from adminserver without being prompted for a password.

Total number of keypresses from power-up to login: 1. Annotated preseed and pxelinux files after the jump.
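
For a flavor of what’s involved, the two pieces look roughly like this; the hostnames, paths, and label names are placeholders, not our production files:

pxelinux.cfg fragment:

label etch-amd64-auto
  kernel debian-etch/amd64/linux
  append initrd=debian-etch/amd64/initrd.gz auto=true priority=critical preseed/url=ftp://ftpserver/preseed.cfg

preseed.cfg fragment:

d-i mirror/http/hostname string ftp.us.debian.org
d-i pkgsel/include string openssh-server puppet
d-i preseed/late_command string wget ftp://ftpserver/puppet.default -O /target/etc/default/puppet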


Exporting Figures from MATLAB

I just discovered the WordPress.com MATLAB feed today. Frinkytown’s complaint about copying and pasting figures reminded me of things I had to do to write my M.S. thesis, and other things I discovered afterwards:

  1. MS Word is the devil, and Equation Editor is its evil spawn. When I started writing my thesis, I had been using WordPerfect for 9 years. I cannot imagine writing a technical document without Reveal Codes or a close equivalent. As you might also have guessed, I greatly preferred the old WP method for entering equations: typing out a math code for an equation rather than pointing and clicking through palettes of symbols, clicking placeholders for subscripts, superscripts, and other elements. I’m a LaTeX geek now.
  2. What Word does to Windows Metafiles is a close third on deviltry.


Work in Progress: Policy-Driven Blog Registrations for Universities

In response to this thread at the WPMU forums, I’ve hacked up parts of the signup procedure to match what I wanted for my student/faculty/staff blogs here. Particularly, the goals for us were:

  1. Restrict registration to people with email addresses in our subdomain cae.tntech.edu — this would include every engineering graduate student plus a sizable chunk of the engineering faculty, and a few staff.
  2. Force blog addresses to be of the form blogs.cae.tntech.edu/username/ , where username is the individual’s designated CAE username. >99% of these match the university’s username.
  3. Streamline signup as much as possible. If the user has activated their CAE login account, all they should have to do is enter their username, hit the register button, check their email, and get going.
  4. Remove the 4-character minimum on usernames and blog names. Mostly because two of us have 3-letter usernames here.
  5. Restrict creating multiple blogs. One blog per user by default — if a working group or project team needs a group blog space, I’ll create it myself.

These patches for wp-signup.php and wpmu-functions.php appear to be working fine for goals 2-5. Goal 1 was already handled in WPMU by restricting the email domains that valid users can sign up from. I’ve not vetted these for blogs in subdomains instead of subdirectories, nor for security. But the changes should be small and simple enough to be easily audited.