{"id":24,"date":"2007-05-19T14:04:48","date_gmt":"2007-05-19T20:04:48","guid":{"rendered":"http:\/\/blogs.cae.tntech.edu\/mwr\/infrastructure-management-work-in-progress\/"},"modified":"2007-05-19T14:04:48","modified_gmt":"2007-05-19T20:04:48","slug":"infrastructure-management","status":"publish","type":"page","link":"https:\/\/sites.tntech.edu\/renfro\/infrastructure-management\/","title":{"rendered":"Infrastructure Management"},"content":{"rendered":"<p>(Work in Progress)<\/p>\n<p>Looking in the CAE Nagios monitor today (April 30, 2007), I see we&#8217;re watching 200 hosts, including the following:<\/p>\n<ul>\n<li>Linux workstations: 5<\/li>\n<li>Linux X-terminals: 8<\/li>\n<li>Linux servers and cluster boxes (including Xen instances): 29 (plus other recent Xen instances that need to be added)<\/li>\n<li>Linux firewalls: 4<\/li>\n<li>Solaris workstations: 5<\/li>\n<li>Dual-boot Windows workstations that also function as an after-hours cluster annex: 42<\/li>\n<\/ul>\n<p>So for me, that makes right at 100 root filesystems, software loads, etc. that are getting to be a major pain to keep consistent. It&#8217;s easy enough to consistently install Debian systems in the beginning: make one model system, image it with <a href=\"http:\/\/sysresccd.org\/\">System Rescue CD<\/a>, <a href=\"http:\/\/www.tux.org\/pub\/people\/kent-robotti\/looplinux\/rip\/\">RIP<\/a>, or <a href=\"http:\/\/systemimager.org\/\">systemimager<\/a>, and deploy the image to the rest of the systems. The hassle comes later, when you&#8217;re trying to maintain them consistently. Make a needed change to one system, then make absolutely certain it ends up in the system image. Write init scripts, especially for the dual-boot systems, that download any updates to the system image on each boot as soon as the network comes up and the filesystems are available. 
Install <a href=\"http:\/\/packages.debian.org\/cron-apt\">cron-apt<\/a> everywhere you can remember to, so that security updates get installed automatically. Create accounts on the non-dual-boot systems via a looped ssh with useradd, but write scripts for the dual-boot ones that add and remove accounts with useradd and userdel according to the contents of the big NFS share that holds everyone&#8217;s home directories. Remember which systems have Matlab 7.0.1, which have 7.2, which have 6.5.1, and which have more than one version. It&#8217;s worked out fine overall, but there&#8217;s a lot of very reliable baling twine holding things together. And even if it&#8217;s reliable, it&#8217;s still baling twine.<\/p>\n<p>Enter <a href=\"http:\/\/www.infrastructures.org\/\">infrastructures.org<\/a> and friends. I had read through their stuff months ago, but the arrival of 12 new Opteron computational servers and a Xeon server that I can run Xen instances on gives me the incentive, the test-beds, and the spare hardware to do it right.<\/p>\n<p>Servers and roles that go into the management infrastructure, plus links to any posts where they&#8217;re explained in more detail:<\/p>\n<ul>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/04\/30\/infrastructures-version-control\/\">Version control<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/06\/the-gold-server\/\">Gold server<\/a><\/li>\n<li>Host install tools: <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/04\/17\/unattended-debian-installations-or-how-i-learned-to-stop-worrying-and-love-the-preseedcfg\/\">Debian<\/a> and <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/10\/31\/solaris-jumpstart-installations-in-an-all-debian-environment\/\">Solaris<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/17\/ad-hoc-change-tools\/\">Ad hoc change tools<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/17\/directory-servers\/\">Directory 
servers<\/a><\/li>\n<li>Authentication servers (<a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/16\/authentication-servers\/\">old and nasty version<\/a>, <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/08\/02\/authentication-servers-the-next-generation\/\">much improved version<\/a>)<\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/13\/time-synchronization\/\">Time synchronization<\/a><\/li>\n<li>Network file servers (part <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/08\/02\/the-new-file-server-preseeding-and-lvm\/\">1<\/a> and <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/08\/02\/the-new-file-server-puppet-and-modules\/\">2<\/a>)<\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/10\/31\/file-replication-servers\/\">File replication servers<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/10\/31\/client-file-access\/\">Client file access<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/19\/client-os-update\/\">Client OS update<\/a><\/li>\n<li><a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/10\/31\/client-configuration-management\/\">Client configuration management<\/a><\/li>\n<li>Client application management (<a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/14\/client-application-management\/\">for official deb packages<\/a>, <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2008\/02\/01\/the-autostow-is-dead-long-live-stowedpackage\/\">for stowed packages<\/a> <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/19\/client-application-management-part-2-for-stow-packages\/\">(old version here)<\/a>, <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2008\/02\/05\/stupid-puppet-trick-agreeing-to-the-sun-java-license-with-debconf-preseeds-and-puppet\/\">using debconf preseeds to answer installation questions<\/a>, <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2007\/05\/28\/making-debian-packages-from-commercial-software\/\">for making deb packages of enormous 
commercial packages<\/a>, and <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2008\/05\/21\/making-solaris-packages-from-commercial-software\/\">for making Solaris packages of commercial packages<\/a>)<\/li>\n<li>Mail<\/li>\n<li>Printing<\/li>\n<li>Monitoring<\/li>\n<\/ul>\n<p>I have <a href=\"http:\/\/blogs.cae.tntech.edu\/mwr\/2008\/04\/22\/giving-a-presentation-at-the-tennessee-higher-education-it-symposium\/\">a presentation summarizing these pages<\/a>, too.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(Work in Progress) Looking in the CAE Nagios monitor today (April 30, 2007), I see we&#8217;re watching 200 hosts, including the following: Linux workstations: 5 Linux X-terminals: 8 Linux servers and cluster boxes (including Xen instances): 29 (plus other recent Xen instances that need to be added) Linux firewalls: 4 Solaris workstations: 5 Dual-boot Windows &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/sites.tntech.edu\/renfro\/infrastructure-management\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Infrastructure 
Management&#8221;<\/span><\/a><\/p>\n","protected":false},"author":87,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"open","ping_status":"open","template":"","meta":{"footnotes":""},"class_list":["post-24","page","type-page","status-publish","hentry","entry"],"_links":{"self":[{"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/pages\/24","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/users\/87"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/comments?post=24"}],"version-history":[{"count":0,"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/pages\/24\/revisions"}],"wp:attachment":[{"href":"https:\/\/sites.tntech.edu\/renfro\/wp-json\/wp\/v2\/media?parent=24"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}