|
Linux in the Library
What can it do for you?
Last modified October 21, 2008
|
|
This page grew out of a Linux presentation I gave at the 1999 Customers of Dynix Inc. (CODI) conference. I continue to update this page as time permits. Portions of it have been used to give presentations at the following conferences:
-
The 1998 Colorado CODI (Customers of Dynix, Inc) Conference (Westminster, Colorado).
-
The 1999 CODI (Customers of Dynix, Inc) Conference (Seattle, Washington).
-
The 2000 ALA (American Library Association) Conference (Chicago, Illinois).
-
The 2001 CLA (Colorado Library Association) Conference (Snowmass, Colorado).
-
The 2001 CODI (Customers of Dynix, Inc) Conference (Salt Lake City, Utah).
-
The Douglas Public Library District "Geekfest '02" (Castle Rock, Colorado).
-
The 2002 CAL (Colorado Association of Libraries) Conference Technology Panel (Keystone, Colorado).
-
The October 2002 Boulder Linux Users Group meeting (Boulder, Colorado).
-
The 2002 joint CODI/HUG (Customers of Dynix, Inc / Horizon Users Group) Conference (Orlando, Florida).
-
The 2003 Pathfinder Library System Annual Retreat (Crested Butte, Colorado).
-
The 2003 CAL (Colorado Association of Libraries) Conference "Son of Geekfest" Technology Panel (Keystone, Colorado).
-
The 2004 CAL (Colorado Association of Libraries) Conference "Geekfest 2004" Technology Panel (Denver, Colorado).
-
The 2004 CODI (Customers of Dynix, Inc) Conference (Portland, Oregon).
-
The 2005 CODI (Customers of Dynix, Inc) Conference (Minneapolis, Minnesota).
-
The 2006 IUG (Innovative Users Group) Conference (Denver, Colorado).
-
The July 2006 North Colorado Linux User's Group meeting (Ft. Collins, Colorado).
-
The 2006 CODI (Customers of Dynix, Inc) Conference (Salt Lake City, Utah).
-
The October 2006 North Colorado Linux User's Group meeting (Ft. Collins, Colorado).
The original version is kept here for historical purposes and to see if my writing style and html editing skills have gotten any better. ;-)
For those who aren't familiar with SirsiDynix, they are a vendor of Integrated Library Systems. Their two primary products are called Dynix (their legacy product) and Horizon (their current product). The Horizon database server is supported on Linux, as are some of their middleware products. Presently there is no Linux client for Horizon, but the vendor is planning to add one. Our Horizon database server originally ran on HP-UX, but was migrated to Linux in 2006.
This page might be of interest even if you don't work for a library but are simply curious about what Linux can do. This is by no means a complete list of all the things that are possible, just some of the things we've used it for.
Comments, suggestions, opinions & questions are welcome.
Please send them to: Eric Sisler
(esisler@cityofwestminster.us)
Index
Although everything listed in the index is on this web page, I've included it in the event you want to look at a particular section or just want to know what you're getting yourself into. ;-) Enjoy!
-
Introduction
-
How did I get started with Linux?
-
Why did I recommend Linux?
-
College Hill Library
-
Network Infrastructure
-
Linux Servers
-
PC Configuration
-
76th Avenue Library
-
Irving Street Library
-
Network Infrastructure
-
Linux Servers
-
PC Configuration
-
The Linux kernel, GNU, BSD & friends
-
What is a Linux distribution?
-
What distributions are available?
-
What comes with a Linux distribution?
-
Will Linux work with other operating systems?
-
Choosing Linux - direct costs
-
Hardware
-
Software - The Red Hat conundrum
-
Choosing Linux - indirect costs
-
Services
-
The Learning curve (time)
-
Learning materials
-
Pre-production server hardware
-
Remote system administration
-
Ongoing system administration
-
Choosing Linux - stability & performance
-
Stability
-
Performance
-
Rebooting
-
The Kernel & other processes
-
Shared libraries
-
Software "bit-rot"
-
Disaster recovery
-
Choosing Linux - support
-
Choosing Linux - updates & source code
-
Software updates
-
Software package management with RPM
-
Source code
-
Package updates
-
Least privilege
-
Proper configuration of running services
-
Disabling and/or removal of unnecessary packages & services
-
Using the Linux firewall tools to protect a server
-
Logfiles & monitoring tools
-
Backup, verification & recovery
-
Uninterruptable Power Supply (UPS)
-
Windows viruses, worms & exploits
-
Domain Name System (DNS)
-
Samba (network file & print services)
-
Apache (Internet web server)
-
Internet filter & cache (Smart Filter & Squid)
-
Dynamic Host Configuration Protocol (DHCP)
-
Automating tasks with cron
-
Internet e-mail
-
Running listservs with Mailman
-
Rsync (file & directory synchronization)
-
Network Time Protocol (NTP)
-
Trivial File Transfer Protocol (TFTP)
-
Calcium web scheduling software
-
Secure communication with OpenSSH
-
Remote logging with syslog
-
Perl programming
-
Web development with LAMP
-
VMware virtual servers
-
Firewalling with iptables
-
CUPS printing
-
Revision control with RCS
-
Bugzilla
-
Ethereal packet sniffing
-
Linux on the desktop
-
Future projects at the Library using Linux
-
Sources & further reading
-
Thank you's
-
Dedication
PART I: INTRODUCTION & OVERVIEW
|
Introduction - who am I, anyway?
I am Eric Sisler, Library Applications Specialist for the City of Westminster. I have worked for the Library for 21+ years in various jobs: page, circulation clerk, courier, bookmobile operator & staff, technical services processor and cataloger. I am currently part of the Library's two member Automation Services department, providing computer and network support to two library facilities. My primary responsibilities include the care & feeding of 12 Linux servers (7 physical / 5 virtual), 7 Windows servers (1 physical / 6 virtual), assorted network gear and far too many client PCs.
Index
How did I get started with Linux?
In 1996, we were in the process of moving our Dynix system from a shared HP-UX box to its own HP-UX box. (HP-UX is Hewlett Packard's proprietary version of Unix.) I was just beginning to learn more about Unix in general and wanted something I could use as a learning tool without the worry of destroying it. Linux fit the bill nicely - a distribution could be had for around $50 and it would run on my home PC. We were also in the process of planning the automation needs for a new library facility. I began thinking about what kind of services we wanted to provide and how we might go about doing so. I knew we would be providing access to some CD-ROM databases, so I began experimenting with Linux as a CD-ROM server, discovering it was easily capable of much more. Like many who like to tinker with computers, I have been accused of trying to re-create the Internet in my home out of spare parts!
Index
Why did I recommend Linux?
Obviously, I felt it was the best solution to our needs, which initially included serving some CD-ROM databases, a DNS server and some basic file & print services. Seemingly on its own, this list has grown to include many other services: more file/print services, domain logons & scripts, MARC record storage & retrieval, public PC administration & security, staff & public web pages, Internet object caching, DHCP, shell scripting, task automation and others I've probably missed.
I also felt limited by other Operating Systems for a variety of reasons:
-
Knowledge:
Although I wasn't (and still am not) a Unix guru, what I had learned intrigued me and I could see the possibilities Linux had to offer. My knowledge of other server operating systems was limited and I wasn't interested in trying to learn more about Unix and other operating systems at the same time - I had enough to do already. Although I began using Linux to teach me more about Unix in general, I now know more about Linux than any other flavor of Unix, so I'm not sure if it worked as a learning tool or not... ;-)
-
Cost:
We had a limited computer budget to work with and every dollar spent on the server OS was one less dollar available for client PC hardware & software.
-
Services available:
Linux could do "out of the box" everything I wanted and a whole lot more, something I was unable to find with other operating systems. The services either weren't available or were only available as an add on (read: added cost).
-
Reliability / performance:
Dynix running on HP-UX had always proven stable & a good performer, and all indications were that Linux would perform the same way.
If I had to do it all over again, would I make the same decision? Absolutely! I can't imagine providing all the services currently available as efficiently & reliably any other way.
Index
PART II: LIBRARY FACILITIES
|
Library facilities - College Hill Library
College Hill Library is a joint project between The City of Westminster and Front Range Community College. The 76,000 square foot facility was opened in April of 1998 and is run by both agencies, a story in itself that is beyond the scope of this document.
-
Network infrastructure
-
The network at College Hill Library is a switched environment with 100 Megabit copper connections to all clients. Servers are connected via Gigabit fiber, bonded Gigabit copper or bonded 100 Megabit copper connections. Although College Hill Library is located on the Front Range Community College campus, it is a network unto itself, separate from both the College and City networks. Access to City Hall is provided via a Gigabit fiber optic Wide Area Network (WAN). College Hill has two paths to the Internet: Comcast cable Internet service is used for most traffic from staff & public computers. Additionally, two T-1's shared with the City are used for remote access to library services (like this web page) and for online databases using IP address based authentication.
-
Linux servers
-
There are five Linux servers at College Hill Library - Gromit, Nick, Preston, Wendolene & Mr-Tweedy. They provide the services described later in this document.
-
Gromit:
Gromit was the Library's first production Linux server and initially ran on the following hardware:
-
Compaq Deskpro 2000; 200MHz Pentium II processor, 64Mb RAM.
-
8.5Gb disk space (IDE).
-
4 drive external CD-ROM enclosure.
-
4mm DDS-2 tape drive.
-
APC Smart-UPS 700.
-
Initially Gromit was built from desktop class hardware because it was the only hardware available. I had done much of the setup at home and I wanted to prove to myself (and others) that Linux was capable of what I wanted to do with it, even on "regular" hardware. If Linux failed to meet expectations I could re-use the hardware for a different OS. Happily, Linux has met and greatly exceeded expectations. Gromit was moved to server class hardware in December of 1998, partly to add drive space, partly to gain a little more performance, but mostly because Linux had more than proven itself and we wanted to reduce the chance of hardware failure by moving it to better hardware. Gromit's hardware is replaced on a regular schedule, and currently lives on the following hardware:
-
Dell PowerEdge 2850; 3.2GHz Xeon processor, 4Gb RAM, redundant power supply.
-
373Gb disk space:
-
2 x 73Gb Ultra-320 SCSI hard disk drives in hardware RAID1 configuration.
-
2 x 300Gb Ultra-320 SCSI hard disk drives in hardware RAID1 configuration.
-
2 x 1 Gigabit copper network cards, bonded (load balance & failover configuration).
-
External Dell PowerVault 124T DLT VS1 16-tape changer.
-
APC Smart-UPS 1500.
-
Nick:
Nick is the library's Horizon database server. It was migrated from PA-RISC/HP-UX to Intel/Linux in 2006:
-
Dell PowerEdge 2650; 3.2GHz Xeon processor, 6Gb RAM, redundant power supply.
-
72Gb disk space (4 x 36Gb Ultra-320 SCSI hard disk drives in hardware RAID10 configuration + hot spare).
-
1 Gigabit copper network card.
-
APC Smart-UPS 1500.
-
Preston:
Preston's first responsibility was acting as a firewall between the Library & College networks. Although assembled from recycled pieces & parts, it was more than adequate for the task at hand:
-
Gateway 2000; 66MHz 486 processor, 16Mb RAM.
-
600Mb disk space (IDE).
-
APC Back-UPS Pro 420.
-
As the library's use of Linux grew, some of the services on Gromit were moved to Preston to balance the load. Preston's hardware is also replaced on a regular basis and currently resides on the following hardware:
-
HP / Compaq DL380 G3; 2 x 2.8GHz Xeon processors, 6.0Gb RAM, redundant power supply.
-
216Gb disk space (4 x 72Gb Ultra-320 SCSI hard disk drives in hardware RAID5 configuration).
-
4mm DDS-4 tape drive.
-
2 x 1 Gigabit copper network cards, bonded (load balance & failover configuration).
-
APC Smart-UPS 1500.
-
Wendolene:
In 2004 the Library's use of Linux grew significantly, and a server running Windows 2000 was switched to Linux:
-
Compaq DL380 G2; 2 x 1.4GHz Pentium III processor, 3.5Gb RAM, redundant power supply.
-
108Gb disk space (4 x 36Gb Ultra-320 SCSI hard disk drives in hardware RAID5 configuration).
-
4mm DDS-4 tape drive.
-
2 x 100 Megabit copper network cards, bonded (load balance & failover configuration).
-
APC Smart-UPS 1500.
-
Mr-Tweedy:
Mr-Tweedy serves as a router/firewall, so it runs on regular desktop hardware:
-
Compaq Evo D500; 1.7GHz Pentium IV processor, 768Mb RAM.
-
20Gb disk space (IDE).
-
3 x 100 Megabit copper network cards.
-
APC Smart-UPS 1500.
-
College Hill Library PC configuration (113 total):
-
- 49 staff PCs.
- 64 public PCs:
  - 23 Internet.
  - 8 catalog only.
  - 2 word processing.
  - 5 stand-alone children's CD-ROM stations.
  - 1 instructor's workstation (Library instruction classroom).
  - 22 student workstations (Library instruction classroom).
  - 3 SAM sign-up stations (PC time management / print cost recovery).
- 15 network printers.
- 4 self-check units.
Index
Library facilities - 76th Avenue Library
76th Avenue library is the former main library for the City of Westminster. Originally built in 1961 and remodeled several times, it is 6,000 square feet in size. The 76th Avenue library was closed in March of 2004, replaced by the Irving Street library.
Library facilities - Irving Street Library
The Irving Street library opened in April of 2004 and is 15,000 square feet in size. It replaced the 76th Avenue Library, located just a few miles away.
-
Network infrastructure
-
The network at Irving Street Library is a switched environment with 100 Megabit copper connections to all clients. Servers are connected via Gigabit fiber or 100 Megabit copper connections. Access to City Hall and servers at College Hill are also via a Gigabit fiber optic WAN (Wide Area Network). Like College Hill, Irving Street also has two paths to the Internet: Comcast cable Internet service is used for most traffic from staff & public computers. Additionally, two T-1's shared with the City are used for remote access to library services and for online databases using IP address based authentication.
-
Linux servers
-
Irving Street has two Linux servers, Shaun & Mrs-Tweedy. They provide many of the same services found at College Hill. One reason for giving Irving Street its own servers was the small data circuit size available at the time Irving Street opened - 512K. Windows 2000 roaming profiles can chew up a great deal of bandwidth, making retrieving them from College Hill slow and painful. Another reason is to provide some independence between the facilities - a downed data circuit or server at one location does not affect the other as much. Although Irving Street is now connected via Gigabit fiber, there are currently no plans to remove the servers.
-
Shaun:
Shaun is a VMware GSX host. Virtual machines on it provide many of the services on Gromit, Preston & Wendolene, just on a smaller scale. Its hardware was upgraded in 2006:
-
Dell PowerEdge 2850 II; 2 x 3.6GHz Xeon processors, 8GB RAM, redundant power supply.
-
146Gb disk space (4 x 73Gb Ultra-3 SCSI hard disk drives in hardware RAID10 configuration).
-
1 Gigabit copper network card.
-
APC Smart-UPS 1500.
-
Mrs-Tweedy:
Mrs-Tweedy serves as a router/firewall between the library and the Internet. It runs on hardware identical to Mr-Tweedy:
-
Compaq Evo D500; 1.7GHz Pentium IV processor, 768Mb RAM.
-
20Gb disk space (IDE).
-
3 x 100 Megabit copper network cards.
-
APC Smart-UPS 1400.
-
Irving Street Library PC configuration (37 total):
-
- 14 staff PCs.
- 23 public PCs:
  - 13 Internet.
  - 3 catalog only.
  - 2 word processing.
  - 3 stand-alone children's CD-ROM stations.
  - 2 SAM sign-up stations (PC time management / print cost recovery).
- 4 network printers.
- 1 self-check unit.
By the way - if you're wondering about the naming scheme for our Linux servers, they're all named after characters from Nick Park's excellent claymation series Wallace and Gromit and the motion pictures Chicken Run and Wallace & Gromit: The Curse of the Were-Rabbit.
Index
The Linux kernel, GNU, BSD, etc.
-
The Linux Kernel:
When you talk about Linux, you're really talking about the Kernel itself - the core software that allows other software to "talk" to the hardware. Linux was created by Linus Torvalds, then a student at the University of Helsinki, Finland. It is an open-source Unix clone that aims at POSIX (Portable Operating System Interface) compliance. It is developed and maintained by many Unix programmers & wizards across the Internet.
-
GNU General Public License (GPL) software & utilities:
Linux would not exist without the many drivers, compilers, utilities, services & programs ported from the Free Software Foundation (FSF) under the GPL.
-
Berkeley Unix (BSD):
Linux also takes advantage of many Internet daemons & utilities ported from BSD, one of the original flavors of Unix.
-
Commercial software from vendors:
Many vendors are beginning to port their software to Linux or write new software for Linux. Some distributions may include full-blown commercial packages and/or demos.
-
A (perhaps bad) analogy using Windows:
Windows = Linux kernel.
Windows drivers, utilities & software = GNU & BSD.
Index
What is a Linux distribution?
When most people talk about Linux, what they're really talking about is a Linux distribution, which typically comes with the following:
-
A stable version of the Linux kernel:
Although experimental versions of the kernel are usually included with a Linux distribution, production servers are normally installed with a stable kernel.
-
A collection of frequently used services & software:
As a full-featured Unix implementation, Linux comes with just about any Internet or network service you can think of. Most distributions install many of these by default using some kind of package management software.
-
Installation guide, documentation & media:
In addition to the actual installation instructions, the installation guide usually contains general information about Linux, the particular distribution you're installing, a basic user's guide and some essential system administration information. The installation media is usually a boot floppy and 1 or more CD-ROM's.
-
Commercial software:
Some Linux distributions come with commercial software as an added feature, or demo versions of the full package.
-
Commercial support:
Many distribution vendors now offer several levels of commercial support for their product. Anything from "per-incident" contracts to specialized consulting and 24 x 7 support. Some hardware vendors also offer Linux pre-installed on their equipment and often include an option for support as well.
Index
What distributions are available?
There are a number of Linux distributions available. While not a complete list, some of the better-known ones include:
-
Red Hat
We run Red Hat Enterprise Linux on some of our production servers. See The Red Hat conundrum for more information about this product.
-
CentOS
CentOS is an Enterprise-class Linux Distribution derived from sources freely provided to the public by a prominent North American Enterprise Linux vendor. CentOS conforms fully with the upstream vendor's redistribution policy and aims to be 100% binary compatible. (CentOS mainly changes packages to remove upstream vendor branding and artwork.) CentOS is free. We use this distribution for firewall boxes and other small servers where justifying the cost of RHEL is difficult.
-
Fedora
The Fedora Project is an open source project sponsored by Red Hat and supported by the Fedora community. It is also a proving ground for new technology that may eventually make its way into Red Hat products.
-
White Box Enterprise Linux
Another free recompiled clone of Red Hat Enterprise Linux. This variant endeavors to be as close to RHEL as possible. It uses a modified version of up2date for package updates.
-
Slackware
The distribution I started with. At the time I used it, it was a little difficult for the novice. A favorite of hardcore Linux enthusiasts.
-
Debian GNU/Linux
A popular distribution, Debian has no commercial arm whatsoever. It is organized & run entirely by volunteers.
-
SuSE
Linux with a German flavor, SuSE is now owned by Novell.
-
TurboLinux
TurboLinux specializes in the Asia Pacific market.
-
Gentoo Linux
Gentoo is a specialized flavor of Linux with a unique twist. Its packaging system downloads software packages as source code, which are then compiled and optimized for your system.
-
Knoppix
Knoppix is a bootable CD with a collection of GNU/Linux software, automatic hardware detection, and support for many graphics cards, sound cards, SCSI and USB devices and other peripherals. Knoppix can be used as a Linux demo, educational CD, rescue system, or adapted and used as a platform for commercial software product demos. It is not necessary to install anything on a hard disk. Due to on-the-fly decompression, the CD can have up to 2 GB of executable software installed on it.
-
Ubuntu
Ubuntu is a complete desktop Linux operating system, freely available with both community and professional support. The Ubuntu community is built on the ideas enshrined in the Ubuntu Manifesto: that software should be available free of charge, that software tools should be usable by people in their local language and despite any disabilities, and that people should have the freedom to customise and alter their software in whatever way they see fit. "Ubuntu" is an ancient African word, meaning "humanity to others". The Ubuntu distribution brings the spirit of Ubuntu to the software world.
If one of these distributions isn't to your liking, Distrowatch has an extensive list of them, complete with announcements, reviews and general information.
Index
What comes with a Linux distribution?
As a full-fledged Unix clone, Linux comes with everything you'd expect, and then some. This is by no means a complete list, just a sampling of what's included:
-
Command interpreters (shells):
Bash, korn, csh, zsh, sash, tcsh.
-
Networking protocols:
Common network protocols including TCP/IP, PPP, IPX, ethernet.
-
TCP/IP applications & services:
Telnet, ssh, samba, apache, php, squid, dhcp, ftp, e-mail (pop, imap & smtp), news, nfs, dns, MySQL, Postgres.
-
Languages, compilers & scripting tools:
C, C++, Fortran, Pascal, assembly, BASIC, perl, python, Tcl/Tk, lisp, scheme, expect, php.
-
X Windows applications:
The OpenOffice suite or its cousin, Sun's StarOffice suite. OpenOffice and StarOffice include typical office productivity applications - word processor, spreadsheet, database, e-mail client & slide show. A wide variety of other X Windows applications are also included.
-
Commercial software and/or demos of full products:
Many distributions include a "bonus" CD that contains commercial software from various vendors.
-
Many more too numerous to list:
For details, check out the website of one or more Linux distribution vendors.
Index
Will Linux work with other operating systems?
Because Linux "speaks" many network protocols, it works well with other operating systems, including:
-
DOS.
-
Windows 3.x.
-
Windows 95 / 98 / NT / 2000 / XP / 2003.
-
Novell NetWare.
-
OS/2.
-
Macintosh.
-
Other "flavors" of Unix.
Index
Choosing Linux - direct costs
-
Hardware
-
Linux will run on nearly any of Intel's family of x86 processors (and clones such as AMD's), from the 386 to the Xeon and beyond. It also runs on a variety of other architectures including Alpha (DEC, now owned by Compaq) and SPARC (Sun), and is being ported to even more platforms, large (IBM's S/390) and small (3Com's Palm Pilot).
-
Choosing the correct hardware is really a balance of the server's intended purpose and how much you want to spend. Although the Library's first production servers ran on desktop class or older "recycled" hardware, most have been upgraded to server class hardware. They have become too important during daily operations to run the risk of having them down because of hardware failure. Some things to keep in mind when choosing hardware:
-
The Server's intended purpose:
There are times when recycled hardware makes sense. A backup DNS server will run just fine on older hardware - no need to waste expensive new hardware on it.
-
Falling hardware costs:
Hardware costs continue to come down as performance goes up. For some services, perhaps desktop hardware is sufficient, as is the case with our firewall boxes.
-
Getting more for less:
Because Linux is so efficient, it generally doesn't need to be on that ultra-high end, multiple-processor box to perform well. It can make do with less, so maybe last year's soon-to-be-discontinued (and possibly discounted) model is sufficient.
-
Software costs:
Since you can buy (or download for free) a copy of Linux and install it on as many machines as you want, it's easy to justify spending a little more on the hardware.
-
Split the load:
Since Linux doesn't have licensing issues, consider splitting the functions onto two or more smaller servers rather than having one "super" server. Having several smaller servers provides options that a single server doesn't:
-
Hardware replacement:
Replacing/upgrading a single server can take quite a bite out of the budget. Replacing/upgrading a group of smaller servers can be done over a several year period, easing strain on the budget.
-
Single point of failure:
By having several servers you avoid having a single point of failure. If a server fails, access to some services will be lost, but chances are they can be moved to a different server temporarily.
-
Server clustering:
It's also possible to build a cluster of servers so that a single server failure will not interrupt any services. This option can be a bit more expensive though because each server in the cluster must be able to handle the full load.
-
Intangibles:
Although they will make your server a little more costly, there are some "accessories" no server should be without:
-
A tape drive for backups:
Even though your server may have hardware or software RAID to protect it from disk drive failure, it's still important to have (and follow!) a routine backup procedure. RAID will not protect you from accidental or intentional file deletion and should not be used in place of a good backup procedure.
-
Uninterruptable power supply (UPS):
A server running Linux will aggressively use RAM and swap space to cache data reads & writes, so it's important to shut it down cleanly. In addition to preventing unexpected server downtime due to power failures, a UPS also provides conditioned power, thereby extending the server's lifespan by protecting it from power spikes & surges.
-
Spare parts:
Spare parts can help minimize downtime. By choosing a standard hardware platform, many of the parts are often interchangeable - disk drives, network cards, power supplies and RAM. Having similar server hardware also opens up the possibility of having one server become another by switching disks.
-
Software - The Red Hat conundrum
-
When we began using Red Hat Linux in 1998, boxed sets could be purchased for $50 to $150, or you could download ISO disk images for free. Package updates were accomplished by downloading the RPM files from Red Hat's errata website and installing them. Red Hat streamlined this process by starting the Red Hat Network, which offered easier-to-use package updates for a small fee - $60/year per server. Quite affordable, so we continued purchasing at least one of every new boxed set and added subscriptions to Red Hat Network for each of our servers.
In 2003, Red Hat significantly changed their product line, and Red Hat Linux 9 was the last of the retroactively dubbed "community" releases. When Red Hat announced the Red Hat Enterprise Linux product line and the associated costs we nearly went into shock, as did a lot of other loyal Red Hat customers/users. There was a huge flurry of commentary (read: confusion, anger & cries of "sell out") on Red Hat related lists, slashdot and many other places. The bare minimum for a server version of Red Hat Enterprise Linux (RHEL) was $349/year per server - nearly six times the annual cost of a Red Hat Network subscription. Ouch! RHEL is a subscription service, and as such subscribers are entitled to package updates and new versions of RHEL, but the baseline versions don't include support beyond 30 days of basic installation & configuration support. For that type of Service Level Agreement (SLA), you need to step up to the standard or premium support package. We weren't interested in an SLA, just package updates and new releases, but $349/year per server still seemed like a lot of money.
For a comparison of the RHEL line see Red Hat's comparison chart and System Configuration Limits. For support options and subscription costs, see Server support options & pricing and Client support options & pricing. One thing I found confusing at first was the differing products & support options. There are four products (Workstation, Desktop, ES & AS) and three support levels (basic, standard & premium). Workstation & Desktop are similar, with Desktop designed for large corporate installations. ES & AS are for servers and include the same packages, but ES is designed for smaller servers and has processor and RAM limits. Within the products you can choose whatever support option you want, although not all support options are available for all products. Confused yet? I was at first.
The obvious question is, why did Red Hat change their product line and costs so drastically? Thoughts & opinions differ, some of mine are:
-
Red Hat is a for-profit company, and a publicly-held one. They obviously have to make money to stay in business, and I certainly don't begrudge them that. They have done good things for Linux and the Open Source community, and continue to do so. I think they may have to adjust their pricing a little, and I'd certainly like to see them offer discounted RHEL subscriptions for libraries of all types.
-
They are trying to attract ISVs (Independent Software Vendors) to Linux. Software development and porting to multiple operating systems is an expensive and time consuming process. Having a stable, supported OS with a long-term release cycle makes it more likely the ISV will port their software to Linux. Some would argue that "we don' need no steenking commercial applications", but I think that's a bit short-sighted. IBM has made a huge commitment to Linux, offering it on their servers, porting their applications to it, donating code to it and providing funds for development. Sure they're making money off Linux, but they are also giving back, and there's nothing wrong with that. Other commercial companies have embraced Linux whole-heartedly, in some cases opening their source code and providing support for a fee. Some provide binary-only software that runs on Linux, but at least it does run on Linux, giving customers another OS choice.
-
They are trying to attract corporate customers to Linux. Providing a comprehensive solution that includes servers, applications, desktops, migration services and a variety of support options is probably the only way to lure some companies away from that "other" OS. While long-time Linux users know just how good Linux is, corporate customers often need extra convincing, often taking the form of "someone to call and/or blame when something goes wrong".
The big question is, where does that leave organizations that can't justify (or afford) the price jump? Linux and open source are all about choice, and here are some:
-
To quote Red Hat user Dave Ihnat, "Pay for the full-bore system. Live in RPM harmony. All is RedHat supported, life is good, the bills are higher." During the firestorm of anger, frustration, confusion and cries of "sellouts" following Red Hat's announcement, his e-mail struck me as a voice of reason amid the chaos. Part of his original e-mail and my thoughts on it can be found here.
-
Run Fedora, a Red Hat sponsored and community-supported project. Fedora is not "officially" recommended for production servers, as much of it is beta quality software. Package updates are provided, but the release cycle is short (2-3 releases per year), potentially a problem for long-term use. (But see the Fedora Legacy Project, below.) Once mature & stable, many of the packages and features of Fedora will make their way into future RHEL releases.
-
Run a recompiled (clone) of RHEL, perfectly acceptable under the GPL. Two of the better known versions are probably White Box Enterprise Linux and CentOS. Obviously neither one includes any kind of support options, but both have an update mechanism similar to Red Hat Network. They may lag a little behind the official RHEL package updates & releases.
-
For those institutions that qualify, purchase Red Hat's academic products. The Desktop and AS versions are available for dirt cheap, but no support options are available. Currently academic pricing doesn't include public libraries. I have inquired about adding them or establishing a library pricing structure, though so far without success. :-( If your e-mail address ends in ".edu", your institution probably qualifies.
-
Roll the cost of RHEL into your server hardware purchase. We were able to include a 3-year RHEL ES subscription when purchasing our first Dell server. Although the price break wasn't all that great, at least we won't have to worry about the subscription for three years.
-
Purchase RHEL subscriptions from your hardware vendor. Since many hardware vendors now offer Linux pre-installed, you may be able to get a price break from them.
-
Continue to run older versions of Red Hat Linux and Fedora Core. The Fedora Legacy Project provides security and critical bug fix package updates for select end of life Red Hat Linux and Fedora Core distributions. Eventually you'll have to upgrade, but if you are going to run an older distribution, Fedora Legacy provides a way to do it safely.
-
Switch to another distribution entirely. Distrowatch has reviews, rankings and links for a large number of Linux distributions.
Now that you know some of the available options, which one did we choose? Well, we're doing a variety of things:
-
We run RHEL ES on our physical servers and those virtual servers that require it. We run CentOS on firewall boxes and other small servers where justifying the cost of RHEL is difficult.
-
I mentioned rolling the cost of RHEL into purchasing server hardware. We will likely continue to do this as long as it's an option. Since our servers are currently on a 5 year replacement cycle, we'll have to supplement the initial 3-year subscription with an extension when the time comes.
Index
Choosing Linux - indirect costs
-
Services
-
Most of the services we wanted to provide were available "out-of-the-box". Those that didn't come with the distribution were available from the Internet. With other operating systems, the services we wanted either weren't included or were only available at an additional cost.
-
Time (the learning curve)
-
Yes, it is Unix and it does have a steep learning curve, but I felt it was well worth the effort. If you already know one flavor of Unix, learning another isn't that difficult and since I was already trying to learn Linux to teach myself more about Unix in general, this gave me a practical reason for doing so. The time required to get proficient with Linux really depends on the person learning it - your experience will almost certainly be different from mine. One thing that helps is breaking the task down into smaller, more manageable chunks - something that is relatively easy to do. Pick a service and configure it. After the first service is up and running, pick another and work on it. This will help you get comfortable with Linux, gain some experience and build on the knowledge acquired from earlier projects.
-
Learning materials
-
Although there are a wide variety of man pages, FAQ's, Howto's, web sites and other documentation available for every aspect of Linux, sometimes there's just no substitute for a book. I've read several of the O'Reilly "animal" series, titles by other publishers and the user's guides that come with the distribution.
-
Pre-production server hardware
-
It's always a good idea to have a pre-production server around to experiment on before rolling changes out to a production server. Although RPM makes it easy to revert to an older version of a software package, it's a bit tricky after you've upgraded the entire server to the vendor's latest release. Individual services may have undergone major changes, which will sometimes necessitate a new configuration file structure. The new release may also have new features that are worth investigating. A pre-production server can also be useful for testing out a new service you're planning on making available. The hardware for the pre-production server doesn't need to be anything fancy - an old desktop PC will generally do nicely.
-
Remote system administration
-
With servers at 2 different locations, remote administration is a must, and Linux fits the bill nicely. All administration can be done remotely from the shell prompt, although there are some graphical (GUI) administration tools as well. Most people prefer one or the other for system administration; you can read my musings on the subject here if you'd like. I routinely perform the following administration tasks remotely from the command line (a few representative commands are sketched after this list):
-
Server status check - logfiles, uptime, CPU load average, free disk space, backup completion, etc.
-
Stop / restart services.
-
Package updates.
-
New package installation.
-
Package removal.
-
Configuration file changes.
-
Kernel compilation & server reboot.
-
User account additions / changes / deletions.
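As a rough illustration, most of these tasks amount to running ordinary commands over an OpenSSH session. The hostname, service and package names below are only examples, not a description of our actual setup:

    # check uptime, load average and free disk space on a remote server
    ssh gromit "uptime; df -h"
    # restart a service after changing its configuration file
    ssh gromit "/sbin/service httpd restart"
    # freshen an already-installed package with an updated RPM
    ssh gromit "rpm -Fvh /tmp/openssh-*.rpm"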
-
Ongoing system administration
-
While many of the routine housekeeping tasks are performed either automatically by Linux or shell scripts, there are obviously tasks that require human intervention:
-
Daily tasks (approx. 5 minutes/server):
-
Read root's e-mail.
-
Read logwatch's e-mail.
Logwatch is a program that summarizes some of the logfiles generated by Linux and e-mails the results to you. It's a good way to keep an eye on your logfiles without having to read through all of them each day.
-
Backup completion check - normally included in root's e-mail.
-
Server status check - free disk space, load average, uptime, etc.
Performing this task regularly gives you a "feel" for what the server is doing, sometimes making it easier to spot trouble before it gets serious. Think of it as the system administrator's equivalent of knowing how your car feels when it's running well. (A few typical status-check commands are shown after this list.)
-
Weekly tasks (approx. 10 minutes/server):
-
Verify a randomly chosen backup tape.
-
Peruse previous week's logfiles.
Choosing a logfile to review each week gives you a good idea of what the server should be doing, which can sometimes tell you when it's misbehaving.
-
Backup & verify filesystems that contain mostly static data.
There is one filesystem that contains software, PC image files and the like, none of which change very often. In order to speed backup (and possibly restore) time, this filesystem is backed up once a week to a separate tape.
-
Monthly tasks (approx. 15 minutes/server):
-
Verify the monthly backup tape.
-
Other tasks, performed as needed (time varies depending on the task):
-
RPM package updates:
The number of packages that need to be updated depends on the services running on the server and the maturity of the vendor's release. Typically a new release will go through a flurry of package updates at first and then stabilize to a few packages a month as it matures.
-
Pre-upgrade:
Unless you're a glutton for punishment, I don't recommend blindly upgrading a production server to the vendor's latest release without doing some testing first. By using a test server as a guinea pig, you can ensure a smooth upgrade on the production server(s).
-
Install & configure a new server.
-
Install & configure a new service.
-
Add & delete user accounts.
-
Print queue maintenance:
It's inherent to network printing that sometimes the queues get stuck, and printing with Linux is no exception. Although some of the cleanup is handled automatically, sometimes it's necessary to manually "unjam" a network print queue.
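For what it's worth, the daily server status check mentioned above usually boils down to a handful of commands - a sketch, not a prescription:

    uptime                       # how long the server has been up, plus load average
    df -h                        # free disk space on each filesystem
    free -m                      # memory and swap usage
    tail -n 50 /var/log/messages # a quick look at recent syslog entries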
Index
Choosing Linux - stability & performance
-
Stability
-
Linux has proven to be an extremely stable server OS. The old Gromit ran continuously from February to December 1998 with only 3 minor interruptions: an extended power outage, a physical move of the server, and the installation of some additional hardware. The current continuous uptime record is held by Mrs-Tweedy - 757 days and counting!
-
Knock on wood, since the Library's first Linux server went into production in 1998, I've only had one software problem that required a reboot, which was probably my fault. I was moving the server from a 10 megabit switch to a 10/100 megabit switch (and back - the new switch decided not to work) and had forgotten to properly bring down the network interface before doing so. I think the TCP/IP stack got confused and couldn't decide if it was supposed to be operating at 10 or 100 megabit. Even so, the server managed to limp along until closing time. I can't ask for better reliability than that!
-
Performance
-
Linux performs well on most hardware, including older hardware. It uses the CPU and RAM efficiently, has one of the fastest TCP/IP implementations available and frequently outperforms other operating systems on the same hardware.
-
You can compile your own kernel (not as difficult as it sounds) to add or remove support for specific hardware or services, thus making the kernel image smaller and more efficient. It's also possible to tweak specific settings, like memory management, if you have a service that's a memory hog. (A rough sketch of the rebuild commands appears below.)
-
I hesitate to quote any of the many benchmarks floating around on the Internet. Benchmarks are frequently biased in some way and the same set of data is sometimes interpreted in different, often contradictory, ways. From personal experience I can say that our servers have been up to whatever task we've thrown at them.
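For the curious, the kernel rebuild mentioned above is mostly a matter of running a few make targets. The exact steps differ between kernel versions (the "make dep" step applies only to 2.4.x and earlier), and the source directory varies by system, so treat this as a sketch rather than a recipe:

    cd /usr/src/linux              # wherever the kernel source tree lives on your system
    make menuconfig                # choose which drivers & features to build in
    make dep                       # 2.4.x and earlier kernels only
    make bzImage modules           # build the kernel image and the modules
    make modules_install install   # install the modules & kernel, update the boot loader
    # reboot when convenient to start running the new kernel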
-
Rebooting
-
There are only a few circumstances when Linux must be rebooted: after upgrading to a new release, after compiling a new kernel, to replace/install hardware and of course after an extended power outage. While frequent rebooting may be a necessary evil on the client end, my philosophy is rebooting the server should be a rare event, something that Linux seems to agree with.
-
The Kernel & other processes
-
Very few things can crash the Linux kernel, faulty hardware being the #1 culprit. I have had services crash, generally due to misconfiguration on my part (oops!), but fixing the configuration and restarting the service is all that's been necessary.
-
Shared libraries
-
Like most other operating systems, Linux uses shared libraries (similar to Windows .dll files) to reduce the size of compiled binaries (programs) and provide a standard set of functions and procedures. Unlike some operating systems, the only time these shared libraries are changed is when you either (a) upgrade to a new release of Linux, which generally includes new shared libraries or (b) specifically upgrade them. Regular software packages that are linked to these libraries do not arbitrarily overwrite shared libraries or install their own versions. This prevents a newly installed piece of software from breaking others or having to install software in a particular order to get everything to work.
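If you're curious which shared libraries a given program uses, the ldd and rpm utilities will tell you. The binary and library paths below are just examples:

    # list the shared libraries a binary is linked against
    ldd /usr/sbin/httpd
    # find out which package a particular library file belongs to
    rpm -qf /lib/libcrypt.so.1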
-
Software "bit-rot"
-
Although it may take a bit more work to set up, once a Linux server has been properly configured you can run it until the hardware croaks without ever having to re-install the base operating system. Some operating systems can suffer mysterious performance slow-downs and stability problems after being installed for a while, sometimes requiring a complete reinstallation of the operating system. Linux will keep on running regardless of the number of software packages added or removed.
-
Disaster recovery
-
With a good backup routine and a little preparedness, it's possible to completely restore a crashed system to the state of the last backup without having to reinstall the operating system. For more information, see Restoring Feathers from the dead. In Feathers' case the data was stored on another server, but the same would have been true had the data been on tape or other media.
Index
Choosing Linux - support
One question I've been asked while presenting is "What would happen to the Library's Linux servers if you left?" A very good question and one that was more difficult to answer when we first began using Linux. At the time, Linux was still relatively unknown and support options were limited. Some distribution vendors provided installation support and maybe limited initial configuration support, but that was about it. Today Linux is growing rapidly and there are many more support options. Many distribution vendors are happy to sell you a support contract, including whatever SLA (Service Level Agreement) you need. Everything from basic installation and configuration to custom programming/data services and Linux migration roll-outs. Hardware vendors like HP/Compaq, IBM, Dell, Gateway and others are now on the Linux "bandwagon", offering pre-configured systems and support contracts for Linux running on their hardware. There are also companies and independent Linux consultants supporting Linux regardless of platform or distribution. If I left, a short term solution could involve using the City's IT staff in conjunction with a short term support contract from a vendor or working with a Linux expert, either locally or via e-mail. A long term solution would (obviously) be to hire a Linux system administrator to take my place.
That said, commercial support has never been an issue here. It is nice to see support options becoming available for (a) small agencies that can't afford or don't have a resident expert and (b) IT departments which, although they may already have Unix/Linux expertise on staff, are required by management to have a support contract.
Don't think that just because commercial support is now available, it's something you must have. To quote an anonymous Linux user - "There's a bordering-on-clinically-interesting level of support from the Linux community at large." There are many Linux user groups, listservs, and gurus who are more than willing to help other Linux users. Whether you join a listserv or use e-mail mentoring, you can usually count on help from the Linux community.
My best sources for support and information include:
-
Program documentation & man pages.
-
Books, books & more books. (After all, I do work in a library.)
O'Reilly publishing has a book on nearly every computer topic you'd care to learn about.
-
O'Reilly Publishing's Safari Bookshelf.
O'Reilly also offers a large set of technical books from many different publishers via their Safari service. For a small monthly fee, you can search, browse & read over 2,000 titles from more than 12 publishers online.
-
Listservs, e-mail & newsgroups.
For a list of some Linux resources, click here.
Index
Choosing Linux - updates & source code
-
Software updates:
-
Software updates, especially security related ones, are released in a timely manner (sometimes days or even hours after a problem is discovered) via the Internet. Many vendors of proprietary operating systems release updates quarterly or less often. Sometimes they are reluctant to even admit the presence of bugs, especially security related ones. It's often been said that "Security through obscurity is no security." That's never been more true than in today's world of the ever-expanding Internet.
-
Software updates are also released as individual packages rather than one big bundle. This allows you to pick & choose which packages get updated. "All inclusive" updates from vendors of proprietary operating systems may include fixes for things you don't have or install things you don't want. Others can even break functioning software. In the event an updated package does cause problems, RPM makes it easy to revert to the older package.
-
Software package management with RPM:
-
RPM is a software management utility created by Red Hat that has since been adopted by other Linux distributions. It makes software installation, upgrades and even removal quite easy. Other distributions that do not use RPM generally have their own software management utility.
One thing that makes RPM especially useful is that it includes all installed packages, not just operating system related ones. Update mechanisms offered by other vendors sometimes include only the operating system and drivers, which makes keeping these systems up-to-date a multi-step process. Use one tool to update the OS, search a website to download updates for other applications. Time consuming, and at times, frustrating!
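A few representative RPM commands give a feel for how simple day-to-day package management is. The package names and versions below are placeholders:

    rpm -q openssh-server                  # is this package installed, and which version?
    rpm -Uvh openssh-server-x.y.z.rpm      # install or upgrade from a package file
    rpm -Fvh *.rpm                         # "freshen" - upgrade only packages already installed
    rpm -e telnet-server                   # remove a package entirely
    rpm -Uvh --oldpackage openssh-server-x.y.w.rpm   # revert to an older version if an update misbehaves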
-
Source code:
-
Source code is available for all open-source, GPL'ed software included with a distribution. This can be useful if you discover a bug, want to make changes or just practice your programming skills.
Index
Choosing Linux - security
Security is something that frequently gets overlooked regardless of the operating system the server happens to be running. Because Linux is sometimes considered the "plaything" of Hackers and college students, it has an undeserved reputation for being insecure. Although older versions of Linux often had insecure services running by default, newer ones are much better and often include the option to configure a firewall during installation.
-
Package updates:
-
Probably the #1 reason servers get "cracked" (broken into) is because system administrators don't keep up with software package updates. Nearly all Linux distribution vendors have websites that list updated packages. While packages that fix minor bugs or add new features are optional, updating packages that fix security related problems is a MUST, especially for servers that are used by the public. With package management tools like RPM to make life easier, there's no excuse for not updating critical packages.
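How the updated packages actually reach the server depends on the distribution; for the Red Hat family it is roughly the following (treat the commands as a sketch):

    # Red Hat Enterprise Linux (via a Red Hat Network subscription)
    up2date --list        # show available package updates
    up2date -u            # download & install them
    # CentOS and other yum-based distributions
    yum check-update
    yum update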
-
NOTE: The term "Hacker" and derivatives like "hacked" have been given a negative, even sinister, connotation by the popular media. There are many variations to the meaning of the word, most of them shedding a positive light on the term. According to The New Hacker's Dictionary one meaning is "A person who enjoys exploring the details of programmable systems and how to stretch their capabilities, as opposed to most users, who prefer to learn only the minimum necessary." (See the dictionary for additional meanings.) Call any respectable longtime system administrator, programmer or computer geek a Hacker and he'll probably take it as a compliment. Most Hackers consider those who try to break into computer systems to be "Crackers" or "Script Kiddies", also listed in the dictionary. Since people on both sides seem to like the term, some have resorted to referring to them as "White Hat" or "Black Hat" Hackers.
-
Least privilege:
-
Least privilege is another good way to help make a server more secure. Rather than denying activities you don't want and allowing everything else, the concept of least privilege states "allow only these specific activities, deny everything else." A good example is shell (often via telnet) access. Just because Linux can provide shell access to all staff, does everyone really need it? By denying shell access to everyone but those who need it you close a potential security hole.
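Closing that hole is a one-line change per account: give accounts that don't need a command prompt a non-interactive shell. The usernames below are examples only:

    # existing account: replace the login shell with one that politely refuses to run
    usermod -s /sbin/nologin circdesk
    # new accounts can be created without shell access in the first place
    useradd -s /sbin/nologin pacuser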
-
Proper configuration of running services:
-
Services that have many configuration options are a good place for potential holes to exist. If unsure, find some documentation (man pages, FAQ's, Howto's, books, etc.) that explains it or ask - there are many Linux-related websites, newsgroups & listservs where you can post a question.
-
Disabling and/or removal of unnecessary packages & services:
-
Just as proper configuration of running services is important, why run services you're not using? With RPM and other package managers it's trivial to remove an unused package. You can also re-install the package later if you discover a need for it. When removing the package is not an option, disable it or deny access to it. Telnet is a good example of this: while outbound telnet is often useful, incoming telnet is disabled on all of the Library's servers by making a small change to /etc/inetd.conf.
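The change itself is tiny; something along these lines (a sketch - newer Red Hat releases use xinetd instead of inetd, in which case chkconfig does the job):

    # comment out the telnet entry so inetd no longer listens for it
    sed -i 's/^telnet/#telnet/' /etc/inetd.conf
    killall -HUP inetd        # tell inetd to re-read its configuration
    # on xinetd-based systems the equivalent is:
    chkconfig telnet off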
-
Using the Linux firewall tools to protect a server:
-
While Linux firewall tools like ipchains and iptables are frequently used on a firewall server to protect an entire network, there's no reason why they can't be used to protect an individual server. Libraries are a unique example of why this is useful, because there are typically both staff and public PCs on the same network. A traditional firewall only protects PCs and servers from attacks originating from the Internet. But what about those publicly accessible PCs that are already inside the firewall? By using the firewall tools on each server you can help protect your internal servers from harm while still allowing legitimate access.
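As a minimal sketch (the addresses and subnets below are examples, not our actual network), a per-server iptables ruleset might look something like this:

    # default policy: drop anything not explicitly allowed
    iptables -P INPUT DROP
    # always allow loopback traffic and replies to connections the server initiated
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    # ssh only from the administrator's workstation
    iptables -A INPUT -p tcp -s 192.168.1.10 --dport 22 -j ACCEPT
    # Samba only from the staff subnet, not from the public PCs
    iptables -A INPUT -p tcp -s 192.168.1.0/24 --dport 139 -j ACCEPT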
-
Logfiles & monitoring tools:
-
During normal operation a Linux server will generate quite a number of logfiles. There are automated tools that will summarize these logfiles and even alert you based upon criteria you've chosen. Perusing the logfiles periodically to get a "feel" for what your server is doing is also a good idea.
-
Backup, verification & recovery:
-
While the importance of having (and following!) a good backup routine cannot be overstated, verification and recovery are important too. Periodically pick a tape at random and restore several files to a temporary directory. Compare them to the ones on disk to be sure files are really being backed up. If they don't match, find out why. Has the file changed since the backup or was the file backed up incorrectly? Be comfortable with the backup software's recovery options in the event you need to use them. While you're trying to recover from a disk failure is not a good time to learn the nuances of your backup utility.
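A spot-check like the one described above can be as simple as the following; the tape device and file paths are examples, and the details depend on your backup software:

    mkdir /tmp/verify && cd /tmp/verify
    mt -f /dev/nst0 rewind                     # rewind the tape
    tar -xvf /dev/nst0 home/staff/report.doc   # restore one file into the scratch directory
    diff /tmp/verify/home/staff/report.doc /home/staff/report.doc   # compare it to the live copy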
-
Uninterruptable Power Supply (UPS):
-
A properly configured UPS and monitoring software will not only provide protection against momentary power outages, but extended ones as well. Software is available to shut down the server cleanly during an extended power outage and reboot it once the power is restored. Test the functionality of the software on a pre-production server if possible just to be sure it works properly.
-
Windows viruses, worms & exploits:
-
With all the Windows viruses, worms & exploits "du jour", it's nice to run a server operating system that's immune to all of them. This isn't to say that Linux is exploit-free (no OS or application is), but because Linux was designed with multiple users and network connectivity in mind it is much more secure. When a serious flaw is discovered in Linux, the problem is generally fixed quickly - often within hours or days.
Index
PART VI: WHAT WE USE LINUX FOR AT THE LIBRARY
|
Domain Name System (DNS)
The Domain Name System or DNS is the Internet "phonebook" of hostnames & IP addresses. Anytime you connect to a computer on the Internet using its hostname, DNS provides the translation from the hostname to the corresponding IP address.
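You can watch this "phonebook lookup" happen with the host or dig utilities; the hostname below is just an example:

    host www.example.org        # short answer: the IP address(es) the name resolves to
    dig www.example.org         # the same lookup with full details about the DNS response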
-
Internal DNS:
We provide hostnames for all our PCs and other networked equipment to make configuration, location & troubleshooting easier. Using DNS internally avoids website "confusion" - you can point browsers to the same DNS name internally and externally, only the resolved IP address will be different. By configuring clients to use DNS resolution whenever possible, you can change a server's IP address without having to reconfigure each client - simply change the IP address the name resolves to.
-
External DNS:
The City's IT staff maintains all City & Library public DNS records, but the library provides a DNS zone slave server. This provides redundancy and avoids overloading the zone master. If you have or are considering getting your own domain, Linux can be an inexpensive way to administer it.
-
Remote DNS resolution:
We have multiple DNS servers for speed & redundancy.
-
"Fixing" online database vendor's DNS records:
Having multiple paths to the Internet can complicate access to online reference databases that authenticate by IP address, especially since the Comcast addresses change periodically. Normally this can be fixed by adding a static route out the City's Internet connection to the database vendor's website. This ensures that the vendor always "sees" the same IP address when patrons & staff are accessing the resource. Many database vendors also use web acceleration services like Akamai to speed access to their products. The initial request goes directly to the vendor's site, but static objects (graphics, buttons, etc) on the page are retrieved from an acceleration server "closer" to the client making the request. By providing a static route out the City's Internet connection to the website, the initial request comes from an IP address authorized by the vendor. Other objects on the page are fetched from one of Akamai's servers via the Comcast connection. In most cases, this works just fine.
Recently we began having trouble with access to one of our databases. A little troubleshooting revealed that the DNS records were no longer pointing directly at the vendor's website, but rather to Akamai's servers. According to their website, Akamai has 20,000 servers spanning 1,000 networks in 71 countries. Akamai frequently rotates their DNS records to balance the load and help prevent outages, so in theory we could be using any one of their 20,000 servers to access the database. Needless to say, trying to add static routes for all these servers would have been difficult at best. It would also have somewhat defeated the purpose of having the Comcast connection. Many companies use Akamai's web acceleration service, so we would have been routing that traffic out the City's Internet connection as well, rather than using the faster Comcast connection.
The fix for this problem was to add DNS records for the vendor's website to our internal DNS servers, effectively making them "authoritative" for those addresses. Rather than receiving a random address for one of Akamai's servers, now the DNS record always resolves to the vendor's server. Initial requests go to the vendor's website and static objects are fetched from the Akamai servers. Problem solved!
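For the curious, "making our internal DNS servers authoritative" boils down to adding a small zone to BIND. The sketch below uses an invented hostname and address rather than the real vendor's; the idea is simply that internal clients now get the vendor's IP-authenticated address instead of a rotating Akamai address. In named.conf:
    zone "search.vendor-example.com" {
        type master;
        file "search.vendor-example.com.zone";
    };
And a bare-bones zone file:
    $TTL 3600
    @   IN  SOA  ns1.library.example.org. hostmaster.library.example.org. (
                 2006093001 ; serial
                 3600       ; refresh
                 900        ; retry
                 604800     ; expire
                 3600 )     ; minimum
        IN  NS   ns1.library.example.org.
        IN  A    203.0.113.10  ; the vendor's IP-authenticated address
A quick "dig @localhost search.vendor-example.com" confirms the override is being handed out.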
Index
Samba (network file & print services)
Samba provides logon, file & print services, much like Windows or NetWare. It supports domain logons, logon scripting and a "browse" list of available shares. There are many access control options, both system-wide and share specific.
All of our client machines are currently Windows 2000, and the requirements for them are simple - TCP/IP networking and the Microsoft "Client for Microsoft Networks", both of which are included with Windows 2000. The NetBEUI protocol is NOT needed or helpful.
-
Domain logons & scripts:
-
Samba supports domain logons by username or machine name. Logon scripts are written as DOS style batch files, with the initial script often calling a series of "service" scripts. This method simplifies administration when changes are needed - just edit the service script instead of each user's individual script. Logon scripts typically perform the following functions:
-
Mapping drive letters to shares for staff & public PCs.
-
Updating the Dynix PAC for Windows configuration file.
-
Updating various other configuration files, including the registry.
-
Copying shortcuts to the Windows Startup folder to automatically start applications.
-
Windows roaming profiles:
-
Samba also supports Windows roaming profiles. While it's nice to have your desktop "follow" you around, roaming profiles can get quite large if not managed properly. (We learned this the hard way and are working to correct it.)
-
CD-ROM databases:
-
At one point we used Samba to serve a number of CD-ROM databases. Since nearly all of our databases have moved to the Internet, we no longer need Samba for this purpose.
-
Although we have a number of multimedia PCs available in the children's area, multimedia & DVD CD-ROMs are not served from the network as they tend to be bandwidth hogs.
-
Server shares:
-
A "share" is merely a directory on the server that is accessible from a client PC, via a mapped drive letter or UNC (Universal Naming Convention) path.
-
Domain logon scripts:
Although the logon scripts can be edited from the Linux shell prompt, we found it was just as easy to provide access to all the scripts & configuration files using Samba. Editing a script is as simple as starting a text editor like notepad or wordpad and opening the file you want to change. Changes to other configuration files are made in much the same way.
-
Client software configuration files:
Not all public PCs have access to the full range of services provided. Since the configuration files are stored on and distributed from the server, making changes to what's available on a specific PC is usually as easy as editing its logon script and having a staff member reboot it.
-
Software & client image storage space:
Also housed on the server are frequently installed software packages and ImageCast images of all staff & public PCs. ImageCast is a drive imaging utility (similar to Ghost) that takes a "snapshot" of a hard drive and stores it in a file. This allows us to easily restore a public machine that has been trashed and to install periodic updates to all staff & public PCs. Rather than having to manually install software updates to each machine, the hard drive is reformatted and the image installed. This allows the installation to be clean and guarantees that each machine will at least start out with the same configuration.
-
Staff home directories:
Staff members have their own home directory on the server where they can store their private documents. The files are backed up every night and are available from most any PC they happen to be using. This eliminates the need to back up staff PCs, which we lack the staff and time to do.
-
Group directories for projects:
In addition to private directories, there are group directories where staff can store documents that need to be accessible by others.
-
Dynix PAC for Windows:
Public Access Catalog (PAC) for Windows is a front-end menu system for Windows. The configuration file is updated from the server during the network logon process, allowing us to make changes to the buttons quickly and uniformly.
-
Network printers:
-
Samba also provides access to network printers. Access to these printers can be configured by user login, group or individual PC. Although some printers are not directly supported by Linux, if the client OS has a driver for the printer, Samba and Linux will cheerfully spool print jobs to it. We have since phased out printing via Samba in favor of CUPS - while installing SAM, a PC time management / print cost recovery solution, we had some difficulty getting it to work with Samba printing.
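To make the share examples above a bit more concrete, here's a hedged smb.conf sketch - the paths, group and share names are illustrative, not our actual configuration:
    [global]
        workgroup = LIBRARY
        security = user
        domain logons = yes
    [netlogon]
        comment = Domain logon scripts & configuration files
        path = /srv/samba/netlogon
        read only = yes
    [homes]
        comment = Staff home directories
        browseable = no
        read only = no
    [projects]
        comment = Group directories for projects
        path = /srv/samba/projects
        valid users = @staff
        read only = no
Clients then map the shares to drive letters in the logon script, or reach them via UNC paths like \\gromit\projects.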
Index
Apache (Internet web server)
Apache is the most widely used web server software on the Internet, and we use it at the library to host a variety of web pages:
-
Automation Services pages:
Various documents written by Automation Services, including:
-
Automation Services updates - what we've been up to.
-
Automation Services upcoming projects.
-
PC & data recovery / storing documents on the server.
-
Power outages & your computer.
-
Software supported by Automation Services.
-
Windows file management 101.
-
Dynix, Horizon and other statistics:
Statistical reports for our old Dynix system, our new Horizon system, Telecirc and various electronic reference products are made available via the web. This allows easy distribution to library staff and permits electronic archival instead of using paper. Current reports include:
-
Dynix circulation statistics.
-
Horizon circulation statistics.
-
Telecirc statistics.
-
Electronic reference products statistics.
-
Other reports for staff.
-
Internet browser proxy autoconfig files:
Most browsers have the ability to automatically configure their proxy settings by downloading a small file from a web server. This is much easier than having to change the proxy settings on every computer in the building anytime your proxy server address/software/port/etc changes, which we've been through a couple of times. The proxy autoconfig file can also instruct the browser when to fetch documents directly, which is handy (a) if you have no proxy server but want to be ready for that possibility, or (b) for sites that don't work well (or at all) when cached. The file syntax is a bit terse and documentation is somewhat scarce, but once you get the hang of it automatic proxy configuration can come in quite handy (see the example at the end of this section).
-
Meeting room schedule & policy pages:
College Hill Library has several meeting rooms and the schedule for each week is made available on the web via Calcium, web calendar software.
-
Public PC resources:
Although our public PCs have their web browser's home page set to our online catalog, there are a number of links from it to additional public resource pages on our Apache web server:
-
Online database information & links.
-
Favorite sites of the reference staff.
-
Recommended search engines.
-
Library locations & hours of operation.
-
Circulation policies.
-
Short & long versions of the library's Internet acceptable use policy.
-
Homework help pages for K-12 students.
-
Colorado REFORMA:
We host the Colorado REFORMA website. Colorado REFORMA's mission is to promote and improve library services for Latinos, Hispanics and other Spanish-speaking persons in the United States.
-
Bugzilla and Mailman:
Apache is also the webserver used by our Bugzilla and Mailman servers.
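As promised under the proxy autoconfig item above, here's a small example of what a proxy.pac file can look like. The proxy host, port and domains are placeholders, but the structure is typical:
    function FindProxyForURL(url, host) {
        // Internal hosts and anything in our own domain are fetched directly.
        if (isPlainHostName(host) || dnsDomainIs(host, ".library.example.org"))
            return "DIRECT";
        // Sites that misbehave when cached can also be sent direct.
        if (dnsDomainIs(host, ".fussy-database-vendor.example.com"))
            return "DIRECT";
        // Everything else goes through the proxy, falling back to a direct
        // connection if the proxy is unavailable.
        return "PROXY proxy.library.example.org:3128; DIRECT";
    }
The browser downloads the file and calls FindProxyForURL() for each request to decide whether to use the proxy.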
Index
Internet filter & cache (Smart Filter & Squid)
Smart Filter is a commercial Internet filter developed & supported by Secure Computing. It uses Squid as the proxy/cache engine and is a good example of combining Open Source software with commercial software. We use it in conjunction with SAM, a PC time management / print cost recovery system. PCs in the children's area are always filtered, minors are filtered regardless of the PC they use, and adults can choose filtered or unfiltered access. Although there are other solutions available, many of them open source (and often free), we wanted to ensure compliance with the federal CIPA (Children's Internet Protection Act) guidelines as well as Colorado's own laws.
Index
Dynamic Host Configuration Protocol (DHCP)
Dynamic Host Configuration Protocol or DHCP is a way to configure a PC's TCP/IP settings during startup, including IP address, hostname, domain name, default gateway, DNS servers, WINS servers and more. Note: Although hostnames can be changed with DHCP, NetBIOS (computer) names cannot. There are other ways to change the NetBIOS name remotely, and I recommend the hostname & NetBIOS name be the same. It just makes life a little easier. ;-)
For security reasons and to aid troubleshooting, we statically assign IP addresses to all PCs using a MAC-to-IP address map in the DHCP configuration file. Since all PC configurations are identical after an image is restored, DHCP reconfigures most of the network settings on the next reboot. In the event of a change in domain name or a router failure, these settings can be changed on the server and propagated to the PCs simply by rebooting them.
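Here's a rough sketch of what the MAC-to-IP mapping looks like in an ISC dhcpd.conf - the addresses, names and MAC below are invented for the example:
    subnet 192.168.20.0 netmask 255.255.255.0 {
        option routers 192.168.20.1;
        option domain-name "library.example.org";
        option domain-name-servers 192.168.20.5, 192.168.20.6;
        option netbios-name-servers 192.168.20.5;
    }
    host pac-01 {
        hardware ethernet 00:11:22:33:44:55;
        fixed-address 192.168.20.101;
        option host-name "pac-01";
    }
One host stanza per PC keeps every machine at a known address while still letting DHCP hand out the rest of the settings.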
Index
Automating tasks with cron
An ongoing project has involved trying to automate some of the more routine system administration tasks. After all, what good is having a computer or two if it can't do some of the more mundane tasks for you? Some of the tools I've used so far include:
-
Cron:
Cron is a job scheduling daemon. It provides a way to run tasks at defined intervals - by minute, hour, date, month and day of week. A task can be anything from a simple command to a complicated script or other program.
-
A Command interpreter or shell:
The command interpreter or shell is what you normally see when you login to a Linux server. There are different types of shells available, including bash, korn, C and others. While shells are used for executing commands interactively, they also have their own internal scripting language, which can be used to write shell scripts. The logon process runs a series of shell scripts that set up your user environment - your user ID, group ID, home directory and even your default shell.
-
Perl:
Perl (Practical Extraction and Reporting Language, aka Pathologically Eclectic Rubbish Lister) is an interpreted, cross platform programming language created by Larry Wall. It excels at text processing, regular expression pattern matching and has many, many uses. It has been described as "The Swiss Army chainsaw of Unix programming."
Write a shell or Perl script, schedule the script using cron and you've got an easy way to automatically complete routine tasks. Some automated tasks include:
-
Nightly backups:
Nightly backups were one of the first things to be automated - a simple cron entry coupled with full & incremental backup scripts (see the sketch at the end of this section). Just remember to change the tape!
-
Horizon SQL full & transaction log dumps:
The normal way to back up an SQL database involves "dumping" the data & table structure to a disk file and writing the file to tape. Our Horizon database also creates transaction logs, which are dumped hourly to provide incremental backups. Both are created via a simple shell script scheduled by cron.
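As promised above, here's roughly what the cron entries and a bare-bones backup script might look like - a sketch, not our production scripts (paths, schedule and tape device are examples):
    # /etc/crontab entries: minute hour day month weekday user command
    30 1 * * 0    root  /usr/local/sbin/backup-full.sh
    30 1 * * 1-6  root  /usr/local/sbin/backup-incremental.sh
    5  * * * *    root  /usr/local/sbin/dump-horizon-tranlog.sh

    #!/bin/sh
    # backup-full.sh - write a full backup to the tape drive and note the result.
    tar -cf /dev/st0 /etc /home /srv/samba \
      && echo "Full backup completed $(date)" >> /var/log/backup.log \
      || echo "Full backup FAILED $(date)"    >> /var/log/backup.log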
Index
Internet E-mail
One of the major uses of the Internet is e-mail, and we use Postfix for delivery. It is fast, secure and easier to administer than sendmail. It is designed to be sendmail compatible, so in most cases you can use it as a drop-in replacement, which is what we did. While sendmail is a monolithic application with a difficult to learn (at best) configuration syntax, Postfix is broken up into smaller modules and uses a "plain English" configuration syntax. The City & College provide Exchange/Outlook accounts for staff, so our use of Postfix is somewhat limited:
-
Staff e-mail groups:
Since each agency has its own Exchange server, the simplest way to create staff e-mail groups was with Postfix. Mail is sent from each Exchange server to a sendmail-style list alias, which Postfix expands and routes back to individual City and College accounts (see the sketch at the end of this section).
-
Delivering mail from Bugzilla, Calcium and Mailman:
Postfix delivers ticket updates from Bugzilla, room reservation confirmations from Calcium, and listserv e-mail from Mailman.
-
Automation Services staff accounts:
Automation Services staff are typically subscribed to a lot of listservs, so we receive (and keep) a lot of mail. Since this would fill up our Outlook quota (and we'd get scolded!), we use Eudora and Evolution to keep our e-mail organized.
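For the staff e-mail groups mentioned above, the sendmail-style aliases that Postfix expands look roughly like this (the names and domains are invented for the example):
    # /etc/aliases - run "newaliases" after editing so Postfix picks up the changes
    circulation-staff: asmith@cityofexample.us, bdoe@college.example.edu
    reference-staff:   cjones@cityofexample.us, dlee@college.example.edu
Each Exchange server simply sends mail for the group address to this box; Postfix expands the alias and routes the individual copies back out.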
Index
Running listservs with Mailman
Mailman is mailing list software similar to Majordomo, listserv, smartlist and other Internet mailing list (aka "discussion list") software. Each list has its own informational, archival and administrative web pages. The administrative pages allow the list owners to maintain and customize nearly all aspects of the list without resorting to e-mail commands or bugging the server's administrator. ;-) Although many search portals like Yahoo! will allow you to run your own listservs, it's nice to have the mailing list addresses be a little more "official".
Index
Rsync (file & directory mirroring)
Rsync is kind of a cross between the Unix rcp (remote copy) program and ftp. Rsync can run from the command line or as a daemon (service), and can also use ssh as the transport protocol for extra security. Rsync is much more flexible and generally faster than either rcp or ftp, is easy to run unattended and can make an exact copy of a directory structure, including ownership, permissions and timestamps. This comes in handy when you want to synchronize data between multiple servers. Rsync is currently used to keep the following data in sync, often automatically via cron:
-
User account synchronization:
Since many library staff may work at College Hill one day and Irving Street the next, their user account information must be on both Gromit and Feathers. User accounts are added, modified and deleted on Gromit and the information is then replicated to Feathers via rsync.
-
Logon script & configuration file updates:
The logon scripts and associated files used by Samba to authenticate a client, map share points, grant access to printers and update the client's configuration file are also stored on Gromit & Feathers. Rather than having to make changes on both servers, changes are made on Gromit and propagated to Feathers using rsync.
-
Centralized backup of all servers:
Rather than having to deal with multiple tapes & tape drives, data from all servers is mirrored to Gromit and then backed up to a single tape drive. This gives us an online backup and an archival backup.
-
Mirror of Horizon database server SQL data:
Nightly Sybase SQL dumps and hourly transaction logs are mirrored via rsync to provide an online, off-site backup. The dumps are also written to tape, creating an archival copy.
-
VMware virtual machine backups:
To minimize VMware virtual machine downtime during backups, the virtual machine is stopped and rsync is used to mirror the files to a different server. The virtual machine is restarted and the copy is backed up to tape.
-
Nightly mirror of selected directories on VMware virtual machines:
Since the VMware virtual machines are not normally backed up on a daily basis, rsync is used to make a mirror of critical directories on these servers. The mirrored data is then backed up to tape nightly to prevent loss of important data.
-
Restoring Feathers from the dead:
During 2004's System Administrator Appreciation Day (when else?), Feathers crashed and refused to reboot. It became obvious that troubleshooting the problem and getting replacement parts was going to take longer than we could afford to have the server down. I was able to use Knoppix and rsync to copy Feathers' online backup to temporary hardware, restoring services while we impatiently waited for parts to be delivered. Several days, one RAID card and a main system board later I was able to use rsync again to put Feathers back on its regular hardware.
-
Change the Linux distribution and version on a firewall server:
By using rsync and Knoppix, I was able to pre-stage a firewall server upgrade on similar hardware. Once I installed the new version of Linux and copied the necessary configuration files from the production firewall server, I booted with Knoppix and used rsync to mirror the disk drive to another server. Reboot the production firewall using Knoppix, repartition/reformat the disk drive, rsync the new OS, reinstall the boot loader and the production firewall is now running a new Linux distribution & version. Total downtime for the production firewall was less than 30 minutes.
If you're getting the impression that rsync is an incredibly useful tool to have, you're right!
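A couple of hedged examples of the sort of rsync invocations involved (paths and hostnames are placeholders):
    # Mirror the netlogon share from Gromit to Feathers over ssh, preserving
    # ownership, permissions and timestamps and deleting files that no longer exist.
    rsync -av --delete -e ssh /srv/samba/netlogon/ feathers:/srv/samba/netlogon/
    # Pull the nightly Sybase dumps and hourly transaction logs to the backup server.
    rsync -av horizon:/dumps/ /backup/horizon-dumps/
The trailing slashes matter - they tell rsync to copy the contents of the directory rather than the directory itself.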
Index
Network Time Protocol (NTP)
Network Time Protocol (NTP) provides an easy, automated way to keep the time synchronized between devices (PCs, servers, network equipment, etc.) on a network. NTP servers are arranged in a hierarchical fashion, with each layer called a "stratum". A stratum 1 NTP server is directly connected to some type of highly accurate clock - a good example is the official United States time maintained by the National Institute of Standards and Technology (NIST). A stratum 2 NTP server receives time from a stratum 1 server, and so on.
-
NTP server configuration:
In order to avoid overloading one of the many publicly available NTP servers on the Internet, it is customary to pick a single device/server on your local network and configure it to receive the time from a remote source. This device/server then provides NTP services to clients on your local network.
-
NTP client configuration:
We use NTP to keep the time in sync for our servers, network gear and Windows clients. Generally it's a matter of installing (or enabling) the NTP client software and pointing it at your NTP server.
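A minimal ntp.conf for the local time server might look like this - the upstream servers shown are placeholders, so substitute servers you're entitled to use (your ISP's, or a public pool):
    # /etc/ntp.conf - sketch only
    server ntp1.example-isp.net
    server ntp2.example-isp.net
    driftfile /var/lib/ntp/drift
Linux clients just list the local server in their own ntp.conf; on a Windows 2000 client it can be as simple as "net time /setsntp:ntp.library.example.org" (hostname invented for the example).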
Index
Trivial File Transfer Protocol (TFTP)
Trivial File Transfer Protocol (TFTP) is similar to regular FTP except there's generally no authentication (username/password) involved. Although it was designed for quick & easy transfer of small files (hence the name trivial), many times the data being transferred is hardly "trivial". We use it for:
-
Printer firmware upgrades.
-
Network equipment software upgrades.
-
Backups of network equipment configuration files (routers, switches, firewalls, etc).
Index
Calcium web scheduling software
There are a number of meeting rooms available at College Hill Library, and when we first opened in 1998 scheduling these rooms was done on paper, which worked reasonably well. As more groups began using the rooms, it became obvious that the paper method was lacking in both access and efficiency. A centralized notebook worked, but was difficult for more than one person to maintain, and each day's schedule had to be copied and distributed to various service desks in order for staff to direct patrons to the correct meeting room. The daily copies were often outdated shortly after being printed, not to mention the waste of paper. The web is an ideal place to put information you want to make available to a wide audience, and our room schedule calendar was no exception. The College was demoing some software for similar scheduling needs, but in addition to being somewhat expensive, it lacked web access. We wanted everyone (staff & patrons alike) to be able to view the calendars from a web browser. We also wanted staff responsible for room scheduling to be able to edit the calendars via a browser without requiring any additional software.
Unable to find suitable software for the moment, I created a set of ugly but functional templates for staff to schedule rooms. Each week was a separate html file, updated using a simple editor. Although primitive, it at least made the room schedules available to staff and patrons via a web browser. We continued this way for some time until it became obvious that the rooms were getting even more popular and our scheduling "system" was badly in need of an overhaul.
Enter Calcium from Brown Bear Software. It slices, it dices, it makes thousands of julienne fries...no wait - that's another product entirely. Although commercial software, the vendor is easy to work with and $500 for the entire package was very reasonable. Some of Calcium's features include "master" calendars, pop-up windows, grouping & coloring, e-mail confirmations, e-mail reminders and searching/filtering. Calcium is written in Perl and can be modified for local use if desired. You can view our room schedule here. Although not without its own quirks, I think the only way I'd be able to take Calcium back would be to pry it from staff's cold, dead hands! ;-)
Index
Secure communication with OpenSSH
Applications like telnet, rsh, ftp & rcp typically send the username/password in plain text. This is bad enough on a LAN with public computers on it, but it's completely (IMO) unacceptable for remote system administration across a public network (like the Internet) because you never know who might be listening with a packet sniffer. In its most basic form, ssh is a secure replacement for all of these. All data is encrypted, preventing sniffing of the authentication process (username/password) and session data. ssh is quite useful for other tasks as well and some of the things we use it for include:
-
Secure remote administration:
Whether from home or while attending a conference out of state, ssh provides an easy and secure way to administer & troubleshoot our servers remotely.
-
Application port forwarding:
Even more useful is the ability to create a secure tunnel with ssh and then forward an insecure application's traffic over the tunnel. The destination of the forwarded traffic can be the ssh server or another network device entirely, which is extremely useful for equipment that doesn't support ssh. Think of application port forwarding as "VPN lite".
-
One example is VNC, a remote control application similar to PC Anywhere. Rather than allowing VNC traffic from the Internet, a connection is established using ssh. The VNC data is then forwarded over this secure channel, which prevents snooping and avoids having to open additional holes in our firewall. We use this method for remote troubleshooting of desktop PCs and various Windows servers (see the sketch at the end of this section).
-
Another example is sending e-mail via SMTP. Since our mail server doesn't allow relaying from outside the library, when sending e-mail from home I forward it over an ssh tunnel.
-
The transport protocol for rsync:
rsync is a wonderful file/directory mirroring tool and it can be made secure by using ssh as the transport protocol. We use this method to synchronize username/password information between some servers.
-
Secure file copying:
When rsync is overkill, scp (secure copy) provides an easy and secure alternative to ftp.
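Here's a sketch of the VNC and SMTP forwarding described above - the hostnames are placeholders:
    # From home: forward local port 5900 across the ssh connection, ending up
    # at the VNC service on a staff PC inside the library network.
    ssh -L 5900:staff-pc.library.example.org:5900 gromit.library.example.org
    # Then point the VNC viewer at the local end of the tunnel.
    vncviewer localhost:0
    # Sending mail works the same way: forward a local port to the mail
    # server's SMTP port and tell the home mail client to use localhost:2525.
    ssh -L 2525:mailserver.library.example.org:25 gromit.library.example.org
Only the ssh port needs to be open at the firewall; everything else rides inside the encrypted tunnel.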
Index
Remote logging with syslog
Most network equipment (hubs, switches, routers, firewalls) generates logfiles, but often has only a small buffer to store the messages. syslog provides the ability to collect these messages on a central server and store them indefinitely. We use syslog to collect logging messages from our network gear. These logs are analyzed periodically and kept as a record of network traffic.
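Setting this up is a two-step affair: tell each device where to send its messages (usually just a matter of giving it the syslog server's IP address), and tell syslogd on the central server to accept them. A hedged sketch for the traditional sysklogd package:
    # Start syslogd with remote reception enabled (usually set in the init
    # script or /etc/sysconfig/syslog rather than run by hand).
    syslogd -r
    # /etc/syslog.conf - network gear commonly logs to the local7 facility,
    # so send those messages to their own file.
    local7.*    /var/log/network.log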
Index
Perl programming
No discussion of Linux would be complete without Perl, although this one was added a bit late. ;-) Although Perl is widely used for system administration scripts, dynamic web pages, database manipulation, text processing, etc, etc, etc, I didn't begin using it until we migrated from Dynix to Horizon in May of 2003. My initial use of Perl is covered in the next section as it involved more than just Perl, but here are a few of the things I've accomplished with Perl:
-
Preparing Front Range Community College (FRCC) student records for import into Horizon:
-
Twice a semester the library receives a file of students registered for the upcoming semester at FRCC. Horizon includes a utility (bimport) for importing borrower records en-masse, but requires the records to be in a specific format. The data we receive isn't readable by bimport and may contain invalid characters, which causes bimport to die a horrible screaming death. Since most students register for more than one semester, the file also contains many potentially duplicate records. I needed a solution that could:
-
Read a file of records exported from FRCC's SIS (Student Information System).
-
Cleanup any characters invalid in Horizon.
-
Query the Horizon database for a match based on student ID or SSN.
-
Create an add or update record readable by bimport.
-
Write the records to a file readable by bimport.
My SIS to bimport conversion script currently does the following:
-
Prompts the user for borrower type, record type, campus, expiration date and source file.
-
Sets defaults for assorted Horizon fields like phone type, age group and address type.
-
Reads records from the source file, cleans up invalid data and puts useful fields into variables.
-
Creates a hash array of Horizon city codes & descriptions. Attempts to match the spelling of the city in the source record with a Horizon city code. If unable to match, the city is used as is.
-
Creates a PIN based on the phone number, which is required for access to various library services.
-
Queries the Horizon database for existing records based on student ID and/or social security number. If a match is found, an update record is created and written to the output file. If no match is found, a new record is created and written to the output file.
-
Summarizes the record counts (total records processed, new/update records, match/no match on city code) for the user.
It's a little more complicated than that, but those are the main features. Fields included in the output file are:
- Record type (new or update).
- Student ID.
- Student SSN.
- Horizon borrower type.
- Horizon borrower record expiration date.
- Name.
- A note indicating when the record was added/updated by bimport.
- PIN.
- Horizon borrower number (update records only).
- Phone number.
- Horizon phone type.
- Horizon borrower statistical codes (age group & gender).
- Horizon city code -or- city & state from record.
- Zip code.
- E-mail address.
I can't imagine how we'd ever import records into Horizon without Perl! This ability became even more important when the College's online learning program grew significantly. They wanted to offer their online students access to some of the online databases we have at the library. We suggested using Horizon Remote Patron Authentication (RPA) for this purpose as it was something we already had. RPA uses Horizon records for determining what (if any) remote databases someone is entitled to use. Since I had already written the necessary Perl code to create records for bimport, it was a matter of making a few modifications to my program to handle the new student type so RPA would know what kind of authorization to grant. We receive between four and six files of online student & instructor records per semester and are able to get them loaded into Horizon quickly.
One of Perl's many strengths is the enormous amount of code, often in the form of modules, contributed by Perl hackers worldwide. Perl would not have been able to talk to Sybase without the modules written by Tim Bunce (DBI) and Michael Peppler (DBD::Sybase), so thank you very much! And of course thanks to Larry Wall for creating Perl in the first place!
-
Horizon bibliography & reading list data extraction:
-
During the years we were on Dynix, I was able to purchase and write a number of tools useful for extracting data from our system. The migration to Horizon was a huge change in database engines (from UniVerse to SQL) and of course none of my tools were any good anymore - back to square one, I guess.
The next thing staff wanted was a way to produce printed bibliographies and reading lists, but the Horizon client wasn't able to pull together all the data required. The overall design philosophy of SQL dictates table structure and ultimately spreads the data over a number of tables. Add to that the complexity that is Horizon and the weird constraints of storing MARC (MAchine Readable Cataloging) records and you wind up with a large number of tables and data that isn't necessarily legible (or useful) in raw format. There are a number of SQL tools available, but the ones we had access to were either difficult for the novice to use, seemed better at summarizing information than displaying details or both. I needed a script that could:
-
Read a file of Horizon BIB numbers. Staff would search Horizon for the titles they wanted and export the BIB numbers to a text file.
-
Get & cleanup title, author and item data.
-
Write the data to a file that could be easily imported into another program by staff, minimizing my involvement in the process.
I already had code available to connect to the Sybase database and I was beginning to learn my way around the Horizon table structure. I had a pretty good idea how to go about getting the data I wanted, but what I needed was a way to deal with "processed" fields, where the data isn't legible without "unprocessing" it (for lack of a better term). Once again another Perl hacker was able to provide the missing code, and my thanks go to Paul H. Roberts of Alpha-G Consulting, LLC. Currently my script does the following:
-
Reads through a file (or several files) of Horizon BIB numbers, discarding non-numeric data.
-
Performs a basic sanity check to see if the BIB number read actually exists in Horizon.
-
Gets the title. This may sound simple, but is actually quite complicated. The title is stored in differing locations depending on length, and may include processed data. There are a lot of things to check and quite a few to clean up just to produce a legible title that can be sorted without leading articles getting in the way.
-
Gets the author. Once again it sounds simple but isn't. Author information isn't stored directly in the MARC record, but rather in an authority record, which the MARC record is linked to. Therefore it's necessary to get the author from the authority record. More checking and cleanup to produce reasonably legible author information.
-
On to the item information extraction, which includes barcode, location, collection code, call number, item status, etc. More cleanup including processed fields and fields that need to have their code translated to the corresponding description.
-
Finally the record is written to the output file as a series of tab-separated fields.
-
The file is imported into Excel or Word by library staff members, where they can remove unwanted fields & records, sort the report, and do pretty much whatever else they want with it.
The reports have been used to create "bookmark" reading lists for patrons, shelf lists for pulling items and shared with smaller Colorado libraries as topical reading lists. Many of the smaller libraries lack an automation system capable of complex subject searches or simply have no automation system at all.
In keeping with Perl tradition, I wrote much of the code as functions and placed them in a Perl module. This lets me centralize code that I might use again and makes it available in the event anyone else wants it. The program is capable of processing a large number of records in a short amount of time, certainly much faster than having to extract the data manually via an SQL tool. At present the program only works with Horizon BIB numbers, but I have plans to expand it to work with barcode numbers.
Index
Web development with LAMP
LAMP is an acronym for Linux, Apache, MySQL and Perl (although the M & P can mean other things). I became acquainted with LAMP during our migration from Dynix to Horizon. We discovered during the migration that BIB (title) and holdings (item) use statistics wouldn't be carried over to Horizon, something collection development staff were rather unhappy about. My (evil) plan was to investigate the possibility of dumping our Dynix BIB & holdings use statistics into a MySQL Database and then using Perl CGI pages to search the database by BIB number or barcode and display the results as a web page.
My first stop was the book Open Source Web Development with LAMP by James Lee and Brent Ware. On the cover is a Swiss Army knife and the authors explain "A Swiss Army knife contains many useful tools, but most people only ever use the knife & screwdriver. Our purpose in writing this book isn't to teach you all the nuances of any of the topics we cover, because there are already plenty of books available for that. Following the 80/20 rule, our goal is to teach you the 20% of commands you'll use 80% of the time while including pointers to more in-depth reading." I highly recommend this title for anyone considering web development using Open Source tools.
-
Dynix "Stats-O-Matic"
-
After reading the LAMP book, I decided my plan was workable and so Dynix "Stats-O-Matic" was born. The first step was extracting the title, author and use statistics from Dynix and cleaning them up. I won't cover the details here because it was a time-consuming and ugly process. The available UniVerse data extraction tools were pretty decent, but very slow and there was a *lot* of data to extract.
The next step was loading the data into a MySQL database which was *way* faster than extracting it. With the data available in a MySQL database, I needed to tackle the Perl CGI script. The search page is a simple form that accepts user input and hands the data off to the CGI script, which does the following:
-
Rule one - Never trust user input! Validate user input - check for no data, both BIB & barcode entered, and non-numeric input. If the data is still ok, proceed. If not, generate an error page.
-
If a barcode was entered, get the BIB number. If the barcode or BIB is not found, notify the user.
-
Select title, author and various use statistics from the MySQL database.
-
Create an html report including BIB number, title, author, BIB use statistics, item information and item use counts. Hand the report back to the browser.
That's it - sounds simple but it was my first LAMP project. It was a *lot* of work and even more trial and error. You can see the fruits of my labor here if you'd like. (With appropriate BIB & barcode title results, of course!)
-
Horizon Technical Services "Requests-A-Mundo"
-
Success with earlier Perl and LAMP projects gave me the experience & knowledge I needed to tackle the next Dynix to Horizon migration gap. When we were on Dynix, I routinely created a list of on order items that had requests (holds) on them. Technical Services staff used the list to locate these items so they could be processed, cataloged and delivered to the waiting patron quickly. There was no such report available in Horizon and I wanted to (a) create one that was better than the old Dynix report and (b) could be updated automatically.
The information Technical Services wanted on the report included PO number, title, author, item status and BIB number. I needed additional fields to select the correct items, translate status codes into descriptions and whatnot, so I knew there would be quite a few tables involved. TS Requests-A-Mundo performs the following functions:
-
Populates a hash array of collection codes to descriptions and an array of status codes to descriptions.
-
Gets all unique BIB numbers from the Horizon requests table.
-
Gets all item record numbers for each BIB number.
-
Gets a variety of fields from various Horizon tables, including: BIB number, barcode, collection code, call number, item status, PO number, author and title.
-
Inserts item record information into a MySQL database.
-
Gets & sorts all unique PO numbers from the MySQL database where the item's barcode is a "fake" (on order) barcode. Since all requests were put into the MySQL database, this process selects only those PO's where items are still on order. Older PO's won't have any items left on order, but may still have requests. I didn't pre-select only items on order because I thought I might want to have all items with requests available for any future reports.
-
For each PO, select only those items that are still on order. We may have received a partial shipment, so some copies would already be in patrons' eager hands while others are still on order.
-
Creates and writes an html report file.
The report is updated twice a day automatically by cron, and Technical Services is very happy to have it. It includes more useful information than the old Dynix report and I don't have to create it manually. You can see the current Technical Services Requests-A-Mundo report here if interested.
-
Horizon Public Services "Requests-A-Mundo"
-
During the Library's 2005 Summer reading program, I "broke" the Horizon utility staff had been using to create a report of display items with requests on them. As I dug deeper into the problem, I discovered that the utility was actually now working correctly, and that by putting it back the way it was I would be creating more problems than I was solving. Staff still needed a way to create a report of these items, but I didn't want to risk further problems by restoring the utility to its original state. Fortunately I had written the Technical Services Requests-A-Mundo report using two scripts: the first collects the data and puts it in a MySQL database, and the second creates the report. I added a number of new fields to the MySQL database schema and modified the data collection script to gather the additional information needed to produce a second report for Public Services staff. This report is quite different from the Technical Services report, consisting of items already in the system rather than those on order. Although both reports share a common database, the PS report selects items using different criteria:
-
Selects items from the request database containing a specific location and display status code. Each location has its own report and the items are grouped by display status.
-
Counts the total number of requests and subtracts copy-specific requests and suspended requests. If the number of requests is still greater than zero, the item is added to the report.
-
If there are items to report, creates and writes an html report file. If there are no items to report, creates and writes an html report stating "No items to report."
Like the TS report, the PS report is updated twice a day automatically by cron. The reports contain more information than the ones previously created by Public Services staff, which were somewhat labor intensive and error prone. You can see the current Public Services reports here if interested.
Does all this Perl programming qualify me as JAPH (Just Another Perl Hacker)? I'm not sure if I've gained enough experience for that title just yet, so perhaps I'm still JAPN (Just Another Perl Novice). ;-)
Index
VMware virtual servers
VMware is software that creates multiple virtual machines on a single piece of hardware. This allows one piece of hardware to do the work of several servers. VMware consists of three main parts:
-
The host operating system:
-
For VMware Workstation, GSX & Server, a host operating system is required (Linux or Windows).
-
For Virtual Infrastructure 3, VMware ESX is the host OS.
-
The host operating system is installed on the physical hardware and may provide services beyond just being a host for VMware virtual machines.
-
The VMware software provides a virtual hardware layer for the virtual machines to run on and manages physical hardware utilization. Virtual hardware can include anything a "real" computer would have: CPU, RAM, disk drives, CD-ROM drive, sound card, network adapter, serial/parallel/USB ports, etc. One highly important and useful note is that VMware always presents the same virtual hardware, regardless of the real hardware. This makes virtual machines highly portable - simply copy the directory containing the virtual machine from one server running VMware to another.
-
VMware GSX & ESX include a web-based management user interface, which is used to:
-
Provide an overview of all virtual machines (CPU/RAM usage, uptime, heartbeat, etc)
-
Start, stop, suspend & reboot virtual machines.
-
Create new virtual machines.
-
Modify an existing virtual machine's hardware & other settings.
-
All virtual machines have their own virtual console, which allows you to do anything you'd do while sitting in front of a real computer:
-
Watch POST and the boot process.
-
Change BIOS settings.
-
Login to the virtual machine and actually use it!
-
The guest operating system or virtual machine (VM):
-
The guest operating system runs on virtual hardware provided by VMware. Many operating systems are supported, including:
-
MS-DOS 6.
-
MS Windows 3.1, 95, 98, NT, ME, 2000, XP, 2003.
-
Linux (various distributions & releases).
-
Novell NetWare 4.2, 5.1, 6.0 & 6.5.
-
FreeBSD.
-
Sun Solaris for the Intel x86 platform (experimental).
-
All virtual machines run independently of each other and the host operating system. Each virtual machine has its own separate configuration, including network settings. A virtual machine may be powered on, powered off or rebooted without affecting other virtual machines running on the same host. Of course if the host OS crashes, the virtual machines will *probably* go down with it. ;-)
VMware comes in three flavors, depending on your needs:
-
VMware Workstation:
-
Run multiple virtual machines on a desktop PC without having to "dual-boot".
-
Create & test new virtual machines for later migration to GSX or ESX.
-
Use as a sandbox to test software and patches before rolling out to production machines.
-
Limited to a certain number of running virtual machines & RAM.
-
No management interface or scripting tools.
-
VMware Server (replaces VMware GSX, which we currently run):
-
Free! (as in beer). VMware has committed to giving this version away for free. Support & subscription options are available for purchase.
-
Run multiple virtual servers on Intel server hardware to consolidate hardware, reduce operating costs and conserve server room resources (space, power & A/C).
-
Browser-based management user interface to control virtual machine operation (start, stop, reboot), add new virtual machines and edit the configuration of existing virtual machines.
-
Remote console allows you to be "at the console" of a virtual machine from anywhere and do anything you'd do while sitting in front of a "real" computer.
-
Scripting and command line tools for programmed interaction with and automated control of virtual machines.
-
Symmetric Multi-Processing (multiple processors) available for virtual machines.
-
VMware Virtual Infrastructure 3 (ESX Server + add-ons):
-
All the features of VMware Server & GSX.
-
Runs directly on the physical hardware - no host operating system required.
-
Cluster virtual machines across multiple physical servers for maximum availability.
-
Move running virtual machines to a different VMware host using VMotion!
-
Virtual Center for managing groups of VMware ESX hosts.
-
Configuration options to guarantee physical server resources for critical virtual machines.
When we migrated our ILS (Integrated Library System) from Dynix to Horizon, we were on the verge of drowning in hardware. Things that used to run on the database server or in combination with other services now seemed to need their own server. Rather than putting each piece of middleware on its own small server, we decided to use VMware GSX to reduce our hardware needs, control server growth and (hopefully) make supporting all these middleware servers easier. We purchased one copy of VMware GSX server and installed it on Preston. Initially Preston hosted 4 virtual machines and we later added 2 more. We were so impressed with VMware we decided to further consolidate our server hardware by upgrading an existing server and installing VMware on it. In 2006, we replaced a server at Irving Street and elected to make it a 3rd VMware host. Having VMware running on three servers allows us to balance the load of our virtual machines and provides redundancy in the event of hardware failure. Having VMware servers at two locations also provides some basic disaster recovery capabilities. Preston, Shaun and Wendolene currently host the following services & virtual machines:
-
Preston:
-
Internet filtering via Smart Filter & Squid (Host OS).
-
Centralized logging for network equipment via syslog (Host OS).
-
Public DNS records via BIND (Host OS).
-
SAM PC time management / print cost recovery system (Windows Server 2003 standard).
-
SIP2 protocol server for SAM system (Windows 2000 Professional).
-
SIP2 protocol server for 5 self checkout units (Windows 2000 Professional).
-
Test server for compiling & testing new versions of Smart Filter (Red Hat Linux).
-
Shaun:
-
Feathers, the primary Linux server at Irving Street.
-
Shaun's hardware is largely underutilized at the moment. In 2006 we'll be collapsing the staff/server networks at College Hill & Irving Street into a single network. It will then be much easier to move virtual machines between facilities, at which time we will re-balance the load by distributing the VMs among all 3 VMware servers.
-
Wendolene:
-
Apache web server & Postfix mail server (CentOS).
-
Bugzilla - Automation Services work ticket tracking system (CentOS).
-
Horizon Day End Processing & Debt Collect (Windows 2000 Professional).
-
Horizon Information Portal (HIP) public access catalog (Red Hat Enterprise Linux).
-
Horizon Remote Patron Authentication (Windows 2000 Professional + Apache).
-
Mailman mailing list software (CentOS).
-
Norton AntiVirus Corporate Edition (Windows 2000 Professional).
-
Automation Services remote access support VM (Windows 2000 Professional).
Additionally, there are a number of other, non-production virtual machines running on our servers. We use them to experiment with new software, test major system upgrades (like Horizon & HIP), make modifications to other systems (like Bugzilla) and for training. Staff must attend a one-hour class before they begin using Bugzilla, so we copy the production VM, rename it Trainzilla and use it for training.
Overall we've been very happy with VMware - for us VMware is technology that's way cool and useful! As with any software there are pros and cons; here are some to consider before getting started:
Pros:
-
Avoid drowning in hardware!
-
Make better use of existing hardware and control future hardware growth.
-
Virtual machine portability. Since VMware always presents the same virtual hardware regardless of the physical hardware, virtual machines are quite portable. This provides redundancy when running multiple VMware servers. It also makes server upgrades easy - just install VMware on the new server, copy the VMs over and you're good to go.
-
"Undoable" mode prompts for saving or discarding changes made when powering off the virtual machine. Handy for "sandbox" type testing of new software, patches, etc.
-
Use the remote client's CD-ROM drive to install software without physical access to the VMware server. Way cool!
-
Disconnect and/or disable unused devices to conserve system resources.
-
Virtual machines boot faster than real hardware.
-
Short downtime during virtual machine backups - just copy the VM's files and back them up.
-
Make snapshots of virtual machines for quick rollback after an upgrade or major change that went wrong.
-
Be on the local LAN while troubleshooting remotely. Map drive letters, access local services, print to network printers, etc - all without exposing internal services to the Internet or requiring complex firewall rules.
-
The management user interface and remote console can be available from anywhere. Both can communicate encrypted and only two TCP ports are necessary.
-
The Server product is free! Support & subscription options are available.
Cons:
-
Can be costly to get started with unless using the free version. Even when paying for support or purchasing Virtual Infrastructure 3, it's still cheaper in the long run than buying server after server.
-
Probably not ideal for hardware intensive applications unless high end hardware is used.
-
Not designed for heavy-duty multimedia gaming, but you probably wouldn't use VMware for this purpose anyway.
-
Specialized hardware may or may not be supported.
-
Starting many virtual machines simultaneously can slow all of them down until all are running. This can be alleviated by staggering VM startups.
-
Unless you're running VMware on multiple servers, VMware is potentially a single point of failure for many services. Use good hardware and run VMware on multiple servers or have backup hardware available if at all possible.
The VMware website includes product information & documentation, FAQ's, a good knowledge base and user forum. VMware Server is free, so try it already!
Index
Firewalling with iptables
Linux firewalling tools have come a long way since the days of ipfwadm. The current tool, iptables, is a full-featured firewall rivaling some commercial offerings. In fact, there are some commercial products based on iptables. Some of the features include:
-
Full inbound & outbound NAT (Network Address Translation). Outbound traffic can be NATed to a single IP address or group of IP addresses. Inbound traffic can be re-directed to the correct e-mail, web or other server.
-
"Masquerading", which is simiar to outbound NAT, but useful when the NATed address is unknown or subject to periodic change. A good example of this is a home DSL or Comcast cable Internet connection, where the address changes from time to time.
-
Connection tracking, sometimes called "stateful" or "dynamic" filtering. Outbound connections are tracked and ports are opened and closed dynamically so that ports don't get left open indefinitely.
-
Loadable modules for complex applications like active-mode ftp, some multimedia applications, etc.
-
Granular logging control, superior to some commercial offerings (IMO).
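A few hedged one-liners to give a feel for the syntax - the interface names and addresses are placeholders, not our actual rules:
    # Masquerade everything leaving the Comcast interface (address changes periodically).
    iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
    # Outbound NAT to a fixed public address on the City connection.
    iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to-source 198.51.100.10
    # Redirect inbound web traffic to an internal web server.
    iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j DNAT --to-destination 192.168.10.20:80
    # Stateful filtering: allow replies to established connections, log & drop the rest.
    iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A FORWARD -j LOG --log-prefix "FW-DROP: "
    iptables -A FORWARD -j DROP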
I've already mentioned the library's two firewalls, Mr-Tweedy & Mrs-Tweedy. Both use iptables, and they firewall the following networks from each other:
-
Mr-Tweedy:
-
Library network.
-
Comcast cable modem Internet connection.
-
Community college network.
-
Self-check units.
-
Dial-in support modem.
-
Mrs-Tweedy:
-
Library network.
-
Comcast cable modem Internet connection.
-
Self-check units.
In 2006, our network configuration will become much more complex. In addition to collapsing the staff/server networks between the two facilities, we'll also be moving the public PCs to their own network and adding a wireless network for patrons to bring in their own laptops. Prior to beginning this project, my preferred method of firewall creation was to use a shell script with lots of variables. This method was already becoming increasingly cumbersome to maintain across multiple servers and I wanted a better option for managing multi-network firewalls. Firewall Builder is a GUI for creating & managing firewall configurations. It lets the user focus on the rules instead of the syntax by abstracting hosts, firewalls & services as objects. It can create rulesets for a variety of operating systems & firewall tools, including: Linux (ipchains & iptables), OpenBSD (PF) and Cisco (PIX). I now use Firewall Builder to manage firewall & individual host iptables configurations.
Another handy tool to have is IP Tables State (iptstate), which creates a "top-like" display of active connections through the firewall. Some distributions now include iptstate.
Index
CUPS printing
Unlike printing via Samba, CUPS printing uses no Windows authentication, making it easy to use from Windows or Linux. CUPS supports a number of printing methods, including LPR, which is what we use. LPR is an older Unix printing service, but Windows clients can be configured to use it as well. We switched to LPR printing via CUPS after discovering that our public PC time management / print cost recovery system was having difficulty printing to network printers via Samba. We also use it to print from Windows servers that aren't part of our Samba domain.
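Defining a queue in CUPS for a network printer can be as simple as this hedged example (the queue name and address are invented):
    # Create and enable a queue pointing at a JetDirect-style network printer.
    lpadmin -p ref-desk -E -v socket://192.168.20.50:9100
Windows clients then print to it through an LPR port, using the CUPS server as the host and "ref-desk" as the queue name (the cups-lpd mini-server needs to be enabled on the CUPS box for that).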
Index
Revision control with RCS
As the name indicates, RCS is a system for managing multiple versions of files. The ".bak", ".old", ".older" & ".save" method of preserving files can get confusing and out of sync in a hurry - which version of the file did you really want? RCS alleviates this problem by storing each revision along with user-entered notes, and it can display the differences between versions line-by-line. When you "check in" a file, RCS prompts you for text describing the changes and increments the file's version number. Decide you need to start over with the last working version? Simply use RCS to check out an older version. Checking out a file for editing locks it so others can't change it; checking out a file without locking it makes it available read-only. I use it primarily for firewall scripts and Perl programs, although I should be using it for configuration files as well. Would this web page be a good candidate for RCS? Probably so, although if it gets much bigger I'll have to consider moving it to Wiki format.
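A typical RCS session looks something like this (firewall.sh is just an example filename):
    # Check the current version in and keep a locked working copy for further
    # edits; ci prompts for a description of the changes.
    ci -l firewall.sh
    # Show the revision history and the notes entered at each check-in.
    rlog firewall.sh
    # Compare the working file against the last checked-in revision.
    rcsdiff firewall.sh
    # Throw away the current mess and check out an older, known-good revision.
    co -f -r1.4 firewall.sh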
Index
Bugzilla
Bugzilla is an industrial-strength bug tracking system. Bug tracking systems allow developers to keep track of outstanding bugs in their products effectively. Most commercial defect-tracking software is expensive. Despite being "free", Bugzilla has many features its expensive counterparts lack. Consequently, it has quickly become a favorite of hundreds of organizations across the globe. As of September 2006, 571 known companies and organizations worldwide use Bugzilla, including: Mozilla (Netscape & Mozilla browsers), the Linux kernel developers, NASA, Red Hat, the Apache Project, and Id Software.
Although Bugzilla is designed to track bugs in shipping products, it works well for tracking work requests of all kinds, which is how we use it at the library - to better track requests & issues submitted to Automation Services. Trying to track requests via paper, e-mail, voicemail, sticky notes, written lists and other means was proving increasingly difficult, not to mention frustrating for all staff. Bugzilla provides a web-based method of entering, tracking, updating and resolving issues submitted to Automation Services staff. All non-emergency requests to Automation Services are handled using Bugzilla. In retrospect, Bugzilla is one of those things we wish we'd installed much earlier. During our search for a solution, we reviewed a number of products before choosing Bugzilla.
Products reviewed:
-
Track-IT:
Track-IT is a commercial product aimed primarily at Windows environments, and it has many optional features beyond just issue tracking that we weren't interested in. It can also be quite expensive, depending on the version and features desired.
-
FogBugz:
FogBugz is also a commercial product, although the pricing is better and they offer a discount to public libraries. Its primary platform is Windows / IIS / MS-SQL / ASP, but a Linux / Apache / MySQL / PHP version is also available. FogBugz attempts to create a "low barrier to entry" for using the product, which we liked at first, but ultimately felt it was lacking some features we knew we wanted. (It may have some of these features now.)
-
Bugzilla
Bugzilla is an Open Source bug tracking system, so cost isn't an issue. Its primary platform is Linux / Apache / MySQL / Perl, although many other configurations are available. Unlike FogBugz, Bugzilla offers a myriad of options - actually too many for our needs. However, since Bugzilla is Open Source and modifications are freely permitted, we felt we could customize it to suit our needs.
Reasons for using a ticket tracking system:
-
Limited Automation Services (IT) staff:
There are only two of us and anything that can help us keep better track of requests is welcome.
-
To reduce interruptions:
Staying focused on a complex project is difficult enough at times without having to answer the phone for a minor request that could have been reported some other way. Before Bugzilla there was no standard way to report such requests, so staff used whatever means they preferred - often the phone. I estimate my support-related phone calls have dropped 70%, allowing me to focus and get my work done more effectively.
-
Previous methods were inefficient and frustrating for all staff:
We relied on e-mail, voicemail, phone calls, PDA tasks, written lists, sticky notes, hallway conversations, etc. Bugzilla also prevents Veronica & me from having to say "Ok, one of us has that request, but which one of us is it and which list is it on?" We were overdue for a centralized method of reporting & tracking requests.
-
More effective communication with staff about their issues & requests:
Bugzilla provides pro-active (some would say hyper-active) notification via e-mail anytime a ticket is updated. Although there are some areas of Bugzilla that are limited to Automation Services staff only, library staff can add themselves (and others) to any staff-available ticket if they wish to be notified when a ticket is updated. Staff can also add e-mail group addresses to a ticket, which provides a quick way to notify everyone in a particular work group. Staff often use this feature when reporting an out-of-order computer - library staff working other shifts are informed of the problem, which also helps minimize duplicate reports.
-
Building a knowledge base / FAQ:
Since tickets are retained indefinitely, Bugzilla provides a knowledge base of problems and solutions which is shared by all support staff. Eventually we hope to formalize some of these solutions as procedures and create a Wiki for them, but for now at least valuable information is retained and there's less "reinventing the wheel" when a similar problem occurs. Bugzilla also provides a place to thoroughly document difficult or infrequent problems, including things like command syntax or difficult-to-find settings - for example, a complicated SQL statement you may be able to re-use later, or a hidden administration URL that comes in handy.
Accountability, responsibility & process transparency for all staff:
-
Centralized reporting of issues:
Since Bugzilla is the method used to report all non-emergency issues, the responsibility for reporting an issue or requesting service rests squarely on the shoulders of the staff member who needs something done - if it isn't important enough for them to enter into Bugzilla, then it doesn't get done, period. To put it another way, "No tickee, no workee!" ;-) While this may seem a little harsh at first, it avoids later claims of, "...but I asked you to do this last week when I stopped you in the hall as you were headed home for the day." Aaargh!
-
Checking an issue's status:
Staff can check the status of an outstanding ticket and see who is responsible for working on it. If they have additional information to add or would like to see the ticket bumped up on the priority list, they have an easy way to do so.
-
Responsibility & process transparency:
Staff share the responsibility to offer an opinion, add additional information or make a decision when needed. For issues that affect a large group or the library as a whole, Bugzilla provides a forum where staff can discuss the issue and (hopefully) reach a consensus, or at least a decision. For complex problems, more information is often needed from staff, and Bugzilla provides an easy way to gather it. In cases where meetings occur, meeting notes and additional thoughts are often added to the ticket after the meeting.
-
Organizational memory:
Bugzilla also provides a "paper trail" of thoughts, opinions and decisions. Since comments can be added by all staff, there's no excuse for staff not being involved in issues that may affect them. No more "he said, she said" or wondering why a particular decision was made and who approved it. This has already come in quite handy for a few issues where tempers were a bit high.
-
Tracking of low priority issues:
Keeping a better eye on projects that may have been moved to the "back burner". If staff want to know the status of a project or ask why their issue isn't being handled in a timely manner, they now have an easy way to do so.
-
Better coverage by Automation Services (IT) staff:
Since all outstanding requests are available in a central location, important issues don't get delayed during staff sick days, vacations, etc.
Once we had decided on Bugzilla, our first thought was, "Wow, Bugzilla really has a lot of fields to be filled in. There's no way staff will use it in out-of-the-can form." Bugzilla has a lot of features and we knew that we wouldn't need many of them. We didn't want to mandate a new method for requesting IT support that was so complex no one would use it. Issues would go unreported and staff would be frustrated. We had some policy decisions to make and customization work to do before we could begin using Bugzilla.
Initial design:
-
Open vs login required:
Bugzilla is designed to be an "open" system, meaning anyone can search for existing tickets and create their own login to add a comment or report a new issue. Since our use of Bugzilla is limited to staff only and tickets may contain sensitive information, we use Bugzilla in a "closed" fashion. New accounts are created by Automation Services after the staff member has attended a short Bugzilla training class. Bugzilla also runs encrypted via SSL to further protect information.
-
"Regular" staff only:
We also limit Bugzilla accounts to regular staff, and only those with official City or FRCC e-mail accounts, no outside e-mail accounts permitted. This was a difficult decision, as the library employs a number of substitutes on an as-needed basis. In addition to protecting any sensitive information, we also wanted to be sure questions would be answered in a timely manner, and there's no guarantee when (or if) a particular sub will be working again.
-
Initial notification of new tickets:
Rather than having the initial notification of a new ticket go to only one Automation Services staff member, we decided to have it notify both of us for a number of reasons:
-
No worries about tickets getting "lost" if one of us is out for an extended period.
-
Some tickets are "no brainers", meaning it's obvious who should take it. We each have our areas of expertise and for these tickets it's just a matter of assigning the ticket and setting the priority level as appropriate.
-
We often discuss who will work on what ticket anyway and then assign the ticket accordingly.
-
We work on some tickets together. In this case one of us will accept the ticket, becoming the "point" person to make sure the ticket gets worked on. Normally the other person will be added to the cc: list for notification.
-
It's a good indication of how much new work is coming in and a good way to head off duplicate tickets before we both waste time and effort working on the same thing.
-
Products, components & versions:
Bugzilla uses products, components & versions to categorize tickets. While this is geared toward shipping software releases, we were able to use it to categorize things we support at the library. A sample of our current products & components:
-
Horizon (our integrated library system):
- Cataloging
- Circulation
- Searching
-
Horizon Information Portal (the online catalog for patron use):
- Appearance
- Booklists
- Patron empowerment
- Searching options
-
Computers:
- Staff
- Public
- Network printers
- Discard
-
Network services:
- Bugzilla
- E-mail
- File storage
- Internet connectivity
- Internet filtering
-
Don't Panic!:
Use this when you don't know how to categorize the ticket. We attempt to categorize these if possible, adding a new product/component if necessary.
-
Ticket type:
The ticket type helps to clarify the nature of the ticket. We currently have three:
-
Fix Me! (bug) - Something was working, now it isn't. This applies to hardware and software.
-
Request - A general request for new service.
-
Information - A general question or discussion item. Tickets of this type may become a request if a discussion item becomes an action item.
-
Ticket priority:
Relative importance as compared to other tickets and projects being worked on by Automation Services. We currently have five priorities:
-
1-Critical (pager) - This priority is used for critical issues, including major problems for which we've been called or paged. Often this type of ticket will be entered after the fact for tracking purposes.
-
2-High - High priority projects, patron problems, time-sensitive issues and system-wide problems affecting normal operation.
-
3-Medium - The default priority used for most tickets.
-
4-Low - Projects that are on the "back burner" or projects that are in the information-gathering stage. Once we begin working on a ticket with this priority, it is generally moved to "medium" priority.
-
5-Ultralow - Future projects, tracked in Bugzilla instead of on written lists. These tickets may be in the very early stages of planning; the priority is generally changed as the ticket starts to get more attention.
-
Ticket resolutions:
There are currently four ticket resolutions:
-
Fixed - This is the normal resolution for most tickets. We don't normally close the ticket until we receive confirmation from staff that the issue is resolved. Occasionally we will resolve & close tickets due to lack of interest/feedback from staff.
-
Can't fix - The issue requires the software vendor to fix the problem, which may or may not happen in a future release. We check for these tickets when installing a new version of library applications.
-
Won't fix - Technically we could fix the problem, but we won't for a number of reasons, including: library policy, violating system integrity, impact on other staff, etc.
-
Duplicate - This ticket is a duplicate. Bugzilla will mark the ticket as such and update both tickets with this information.
Initial customization:
-
Form simplification:
We simplified the forms & re-arranged field elements to better suit our needs. Removed fields can always be added back in if the need for them arises.
-
Terminology changes:
"Bug" became "Ticket" in most places. Since some requests aren't to fix something that's broken, but rather a general request, ticket seemed better terminology than bug.
-
Rewrite ticket entry guidelines:
Most of the language was geared toward reporting a bug in commercial software, so we wanted something that would reflect the needs of the library.
-
Rewrite the Bugzilla home page:
We re-wrote the home page and added a link to the Automation Services on-call pager guidelines. There are some issues that Automation Services should be called or paged about right away and we wanted staff to have easy access to the guidelines for doing so.
-
Add a link to the staff e-mail groups page:
We added a link to the staff e-mail groups webpage after the cc: list box. Since notification is one of Bugzilla's strengths, we wanted to make notifying other staff as easy as possible. The staff e-mail groups page opens in a new browser window and e-mail addresses can be copied & pasted into the cc: list box. Initially the cc: list box was left off the new ticket entry form, but enough staff were using it that it was re-added.
-
Remove the [reply] link:
We removed the "[reply]" link in each comment's header. This link copies the text from the original comment and inserts it into a new comment. Some staff were using the link and adding their comment before the original message, others were just adding their comment. This made tickets somewhat confusing to read, and since each ticket already includes all previous comments in order, the link seemed unnecessary.
-
Remove the [mailto] link:
We also removed the "mail to:" link in each comment's header. The whole idea of Bugzilla is for users to add their comments *to* the ticket so that everyone can see them, not send private e-mail to other staff.
-
Make the "Assign to" field a drop-down box:
We made the "assign to" field a drop-down box with Automation Services staff names in it. There are only two of us, so this is easier than having to fill in the field.
-
Reduce the amount of e-mail sent:
We felt there was no reason for Bugzilla to send an e-mail when something like the product, component or version changed, so Bugzilla now sends e-mail only when a comment or attachment is added. Since a comment is required when marking a ticket fixed or closed, staff will always be informed when a ticket has been resolved.
Automation Services (IT) staff only customizations:
-
Hidden products:
Some products are hidden from library staff. This includes products we use for internal tracking and products that regular staff wouldn't be making requests about, like network equipment, server configuration, etc.
-
Changing a ticket's priority:
Only Automation Services staff may assign tickets or change a ticket's priority. We determine who will handle a ticket and must balance new tickets with our existing workload. Staff can request a ticket be given high priority and we will attempt to meet their needs if possible and appropriate, but most tickets just go "in the queue" and are worked on in order.
-
Ticket resolution:
Only Automation Services staff may resolve, close and reopen tickets. Once we think the issue has been resolved, we will mark it "resolved/fixed" and wait for confirmation from staff before closing the ticket. Fixed & closed tickets may still have comments added to them, so staff can request a ticket be reopened if it seems the problem was not fixed.
-
Hidden comments & attachments:
Only Automation Services staff may view comments & attachments marked as "private". This isn't generally as nefarious as it sounds. A private comment/attachment is a good place to add details about work in progress or things to check on that would be of little interest to staff. Bugzilla will not send e-mail notification to staff who aren't permitted to view private comments/attachments, so this is also a way to help reduce the amount of e-mail sent.
Bugzilla concerns, issues & pitfalls:
-
Staff reluctance:
Some staff have been reluctant to use Bugzilla and we have tried to "gently" steer them toward it. We can't force them to use it, but in order to get things done they will need to.
-
Entering a ticket vs calling/paging:
Staff confusion about when to use Bugzilla vs. calling or paging. Our pager guidelines cover this question in general terms, but it basically boils down to:
-
Call or page for major system problems.
-
Call during regular Automation Service office hours when patrons need "immediate" assistance. We make every effort to assist patrons with computer-related problems while they are in the library.
-
Call if unsure whether your need constitutes an emergency or not - you may be told to "Bug it, man!"
-
Realistic expectations:
Some staff have the expectation that we will begin working on their issue as soon as it is entered into Bugzilla, which isn't necessarily true. While we will make every attempt to at least acknowledge high priority items, most issues just get added to the "queue"; we prioritize them and work on them in turn. We haven't added any staff, we're just tracking issues differently.
-
No response from ticket reporter:
Some tickets remain open for long periods, awaiting further input from staff. Sometimes a gentle reminder (via Bugzilla) is sufficient to garner the necessary input. Other times the ticket is eventually closed "due to lack of interest." (Sometimes the threat of closing a ticket is enough to elicit a response.)
-
Unreported issues:
No doubt there are "unresolved issues" that have not been entered into Bugzilla. We have largely refused to operate on rumor, hearsay and innuendo - if it isn't in Bugzilla, it doesn't get done! This isn't to say we don't communicate with staff by any other means! Essentially if it's going to take more than 10 minutes or has implications for other staff then it probably needs to be in Bugzilla. Somewhat of a hard line to take perhaps, but many other organizations (IT and otherwise) have a similar policy for work requests - they must be submitted via the official method to get done. Aside from tracking work performed, one additional reason for this policy is to help justify the need for future staffing increases.
-
Ticket scope creep:
Sometimes multiple issues are reported on the same ticket. We try to adhere to the mantra of "one issue, one ticket" whenever possible. Partly this is to prevent a ticket from ballooning way beyond its original scope. Additionally the same person may not be responsible for all items on the ticket, making assigning the ticket somewhat difficult. When a ticket comes in with multiple issues, we will either move the additional issues to new tickets or ask the staff member to do so. For large-scale projects we know will have multiple tasks, we often enter a "summary" or "status" ticket and create additional tickets for specific tasks. Bugzilla has a cool feature called "auto-linkification" that will automatically create a URL to another ticket by using the syntax "ticket X" or "ticket X, comment Y".
-
Time specific events:
One thing Bugzilla can't currently do is remind you of time-specific events, like turning on the data jack in a meeting room on a certain date. What I normally do is add an appointment to my PDA and reference the ticket number.
Further information:
For more information about our customization of Bugzilla, including screenshots and downloadable templates, click here.
Index
Ethereal
This section is under construction
Index
Linux on the desktop
So, now that you've read all that blather, what about using Linux on the desktop? Well, I certainly use it on one of my desktop PCs for things like: e-mail, web browsing, server administration, network equipment configuration, programming and other assorted tasks. Linux is also installed on a laptop for "walking around the building" troubleshooting as well as weekend work and on-call support.
For staff computers, Windows 2000 is the only available option at this point. The Horizon client requires Windows and both the City and College are heavily invested in the MS Office Suite and Internet Explorer. The 8.0 release of Horizon is supposed to include Linux client support, at which time we will re-examine our options. One possible solution could involve using CodeWeavers' CrossOver plugin to run the required MS applications.
Although our Horizon web catalog doesn't require Windows, our current PC time management / print cost recovery system does. I don't expect this to change anytime soon, but if it does we'll certainly investigate the possibility of moving our public computers to Linux.
Index
Future projects
What does the future hold for Linux at the Library? Some of the projects I have in mind for "down the road" include:
-
Investigate the possibility of running the Horizon client application on Linux. Currently the vendor doesn't support Linux on the client end, but has indicated they will when version 8.0 is released "sometime" in 2006. The City and College are heavily invested in the MS Office suite, so this may not be possible for many of the staff PCs, but it might prove useful on circulation desk PCs. It would also be one more application I could run on my Linux workstation. ;-)
-
More scripts to automate some system administration tasks.
-
Additional web pages, including more technical information about the services we provide with Linux.
-
Begin documenting system procedures, problem resolution notes, Horizon tips & tricks and anything else that would benefit from being available in an online, searchable format using a Wiki. A good example of a working (and useful) Wiki is Wikipedia.
-
Additional LAMP projects including automatic creation of a barcode & PIN for CCC Online students. Many online students never set foot on the FRCC campus, but to access the library's online resources a Horizon barcode & PIN are required. The script should read user input from a web form, which would include the student's ID number and e-mail address. The script would then query the Horizon database, generate a barcode & PIN (or read existing ones) and e-mail them to the student. This would allow online students to "register" for library services 24 hours a day and remove the need for staff involvement.
-
An additional network segment at College Hill to provide Internet access for meeting rooms and patron laptops. For security reasons, this will be a physically separate segment and will have a DHCP server for easy configuration of patron PCs.
Index
Sources & further reading
Index
Thank you's
Patricia, my Wife:
-
For not asking too many questions when I come home with yet another computer or book.
-
For listening to me babble about things she has no real interest in.
-
For not panicking when I'm still awake at 4:00am because I've got a problem to work through.
Veronica Smith, my supervisor:
-
For being a great partner - I couldn't (and probably wouldn't) keep all the various pieces & parts at the library running without you!
-
For putting her trust in me when I say "this will work".
-
For trying to learn way too much about way too many technologies from me - no small task.
-
For encouraging me to write this web page and being a tireless editor of it. While not always implemented, your suggestions were appreciated. ;-)
-
For helping me keep my sense of humor during well over two years of non-stop enormous projects. "Things will settle down sometime soon" - right?
My parents, Mel & Fran:
-
For teaching me to think for myself and giving me a lifelong love of learning.
Scott Hewes:
-
For being the catalyst that got me started in the Unix world. At the time, we were both working for the City of Westminster and he invited me to help install the Library's new HP-UX server. From there I took Unix and Linux and "just ran with them" as Scott would say.
Gerald (Jerry) Carter, Open Source developer, Samba team member, author & all-around explainer:
-
My first real foray into setting up a service on Linux was Samba, and I struggled mightily at first. Jerry was kind enough to take me under his wing, answering all my questions via e-mail for months at a time until I was finally able to say "Eureka - I get it!". He never lost patience with me or suggested I seek answers elsewhere, and for that I am immensely grateful. In keeping with the fine tradition of Linux & Open Source, I have tried to repay him by helping others. I hope I've succeeded.
Bill Leeb, Rhys Fulber & company:
-
They are, in various guises and with various other members: Conjure One, Delerium, Equinox, Front Line Assembly, Noise Unit, Pro-Tech, Synaesthesia and others I've probably missed or just don't know about.
-
Thanks for music that helps me concentrate, and other music for when I don't want to.
All those people who make Linux possible:
-
I can't even begin to list them all - but they know who they are! ;-)
Index
Dedication
This page is dedicated to the memory of Judith A. Houk, my friend and mentor for many years.
-
Thank you for always encouraging me to learn and take on new projects.
-
Thank you for being a sounding board when I had an idea or just needed to rant.
-
I regret you were unable to see the College Hill Library open to the public.
-
Although you would be suitably impressed by what I've managed to accomplish with Linux, you would not be surprised. You always believed in me, even when I didn't.
Index
Back to Eric's Linux pages
This page created February 25, 2000, based largely on an earlier document.
This page last modified October 21, 2008 by:
Eric Sisler (esisler@cityofwestminster.us)