This article has been republished in its entirety from the Fall 2001 issue of Library Journal Net Connect magazine with the permission of Library Journal, 245 West 17th Street, New York, NY 10011.
Media coverage and general awareness of Linux have exploded in the past few years, but perhaps a short review is in order. Linux was created by Linus Torvalds, then a student at the University of Helsinki, Finland. It is an open source Unix clone that aims at POSIX (Portable Operating System Interface) compliance. It is developed and maintained by many Unix programmers and wizards worldwide via the Internet.
Linux itself is the kernel, the core piece of software that allows other programs to communicate with the hardware. When most people talk about Linux, what they're really talking about is a Linux distribution, which includes the Linux kernel and a wide variety of software from other sources. Linux would not exist without major contributions from two of these sources: the Free Software Foundation's GNU project and Berkeley Software Distribution (BSD) Unix, one of the original flavors of Unix. Beyond the Linux hype, what are the benefits and drawbacks of using Linux? What are the criteria for deciding if Linux is right for your Library? If it is, how do you get started?
First, the good news: there are many advantages to using Linux as your server's operating system. Linux doesn't have the overwhelming hardware requirements of other operating systems. It will run on nearly any of Intel's family of x86 processors and clones, which means getting started with Linux can be relatively inexpensive. Although my Library's Linux boxes now live on server-class hardware, they started out running on desktop PCs. One server was a recycled 486/66 with 16MB of RAM and a whopping 300MB disk drive! This may not sound like much by today's standards, but it was more than adequate for the server's duties at the time.
Linux can be purchased on CD-ROM from one of several Linux distribution vendors or downloaded for free from the Internet. Purchasing Linux on CD-ROM is a good way to get started because distributions typically include an easy-to-use graphical installer, a wide variety of commonly installed software, documentation and often limited installation support. All the software packages included in the distribution have been checked and will, in most cases, work and play well together. Linux doesn't have any licensing issues, so one copy can be installed on any number of computers, a definite advantage over commercial operating systems.
Linux is open source, which means that all noncommercial software includes the source code. Although this may not be of interest to the average user, it allows those so inclined to make improvements and submit bug fixes back to the software's maintainer(s). These changes are often incorporated into the next release of the software, thereby benefitting all users.
Linux is an extremely stable operating system. Benchmarks and testimonials aside, it has been astoundingly reliable at our library, Westminster PL, CO. There are only a few occasions when a server running Linux must be rebooted: to install or replace hardware, after an extended power outage or when installing a new kernel. There are no regularly scheduled reboots and the record for continuous uptime at the Library stands at 201 days, with computers powered down only to install more memory, a redundant power supply, and a larger UPS (uninterruptible power supply). We have never experienced software-related downtime and frankly don't expect to.
Linux also doesn't suffer from software "bit rot": system performance doesn't degrade over time until the operating system must be reinstalled from scratch. Like many modern operating systems, Linux uses shared libraries, similar to Windows .dll files. These shared libraries reduce the size of compiled binaries (programs) and provide a standard set of functions and procedures. Unlike other operating systems, regular software packages do not arbitrarily overwrite these shared libraries or install their own versions. This greatly increases system stability and prevents newly installed software from breaking existing software. It also means that software doesn't have to be installed in a particular order to get everything to work correctly. The only time these shared libraries are upgraded is when installing a new release of Linux or when the libraries are intentionally upgraded.
Most Linux distributions use some type of software package management utility, similar to the Windows install/uninstall wizard. This makes all aspects of software management much easier to live with by handling package installation, upgrading, verification, removal and dependency checking. Verification ensures that all files originally installed are still present and reports any differences. Dependency checking verifies that all other required software is already installed and prevents new software from overwriting files already installed by another package. Software packages can also be digitally signed by the package's creator, ensuring that the package hasn't been tampered with.
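The verification step described above can be illustrated with a short Python sketch. It does not show how any particular package manager is implemented: the function names record_manifest and verify_manifest are hypothetical, and SHA-256 is simply an assumed choice of checksum. The idea is to record a checksum for each file at install time and later report any file that is missing or has changed.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_manifest(paths):
    """At 'install' time, remember a checksum for every installed file."""
    return {str(p): file_digest(p) for p in paths}


def verify_manifest(manifest):
    """Compare files on disk with the manifest and report differences.

    Returns a dict mapping each problem path to 'missing' or 'changed'.
    An empty dict means every file still matches what was installed.
    """
    problems = {}
    for path, expected in manifest.items():
        if not Path(path).exists():
            problems[path] = "missing"
        elif file_digest(path) != expected:
            problems[path] = "changed"
    return problems
```

A real package manager tracks much more than this (permissions, ownership, dependencies and digital signatures), but the compare-against-a-recorded-manifest idea is the same.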
With the recent attacks on servers running Microsoft's Internet Information Server (IIS) and the ongoing battle against Outlook-transmitted viruses, it's refreshing to know that Linux is immune to both. It is not completely free of vulnerabilities, but its overall design makes it difficult for viruses and worms to have any effect.
Now the bad news: as with any operating system, there are some disadvantages to using Linux. A big potential disadvantage is the steep learning curve associated with Linux. There just isn't any way around this except to grit one's teeth and get through it. However, once some of Linux's initial complexity has been conquered, things begin to make more sense and get easier.
Many libraries don't have computer support staff and either make do with technically minded library staff or, in some cases, rely on support from their parent organization. A possible solution could include training library staff and/or a support contract from a distribution vendor, hardware retailer or local consultant. Fortunately, the number of Linux-related training services and support providers continues to grow.
Because no one owns Linux, organizations considering its use often get nervous. They feel that without "someone to blame" they risk being stuck with a problem they can't fix or with an obsolete operating system. Linux relies heavily on support from the community to report and fix bugs, often resolving problems much faster than traditional software vendors. With no single commercial entity owning Linux, there's no danger of it going "out of business", even though distribution vendors may come and go.
At Westminster PL, Linux evolved from a UNIX learning tool to the operating system on servers that provide access to virtually everything except the catalog. The list is a veritable plethora of computer terms and acronyms: DNS, DHCP, FTP, Sendmail, POP3, NTP, Apache and Samba to name a few. A complete description is an article unto itself, but you can read more about it in the sidebar "Evaluating Linux at the Westminster Public Library". Linux isn't for everyone, so here are some things to consider before jumping into it:
One of the best ways to get started with Linux is to pick a single task. The knowledge gained from configuring that first service will help with subsequent tasks, and gradually things will get easier.
Most libraries have small connections to the Internet, which may be shared by a number of branches or even with the rest of the organization. Setting up an Internet cache server is a good way to take some of the load off the Library's Internet connection.
Squid is a good, solid Internet cache and is included with most Linux distributions. Basic configuration is relatively easy, although a library with a lot of Internet PCs will probably want to run Squid on a server with plenty of RAM and fast disk drives. Our main branch has a 23-workstation instructional classroom, and the effects of installing Squid are quite noticeable there. As the instructor is demonstrating a site, Squid is busy caching many of the site's pages and graphics. Squid then serves these cached items as each student workstation requests them, resulting in faster retrieval time and a reduction in Internet traffic.
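The classroom effect described above can be illustrated with a toy Python model. This is only a sketch of the caching idea, not Squid's implementation: the names CachingFetcher, simulate_classroom and fake_origin are hypothetical, and the model ignores expiry times, cache-control headers and disk or memory limits, all of which a real cache has to handle.

```python
class CachingFetcher:
    """Toy in-memory web cache: fetch each URL from the origin only once."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is any callable that takes a URL and returns content.
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1  # served locally, no Internet traffic
            return self._store[url]
        self.misses += 1  # first request goes out to the origin server
        content = self._fetch(url)
        self._store[url] = content
        return content


def simulate_classroom(urls, workstations):
    """Simulate an instructor plus N workstations requesting the same pages.

    Returns (number of origin requests, number of cache hits).
    """
    origin_requests = []

    def fake_origin(url):
        # Stand-in for a real web server; records every request it receives.
        origin_requests.append(url)
        return "content of " + url

    cache = CachingFetcher(fake_origin)
    for _ in range(1 + workstations):  # instructor first, then each student
        for url in urls:
            cache.get(url)
    return len(origin_requests), cache.hits
```

With three pages and the 23 workstations mentioned above, simulate_classroom contacts the origin 3 times and answers the other 69 requests from the cache.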
Although more and more databases are moving to the web, many libraries still have the CD-ROM version, often available at only a single PC. These standalone PCs often have changers attached to them that are slow to switch discs and can be a hassle to maintain. A Linux server running Samba can provide CD-ROM database access to multiple PCs. If the server has enough disk space, the CD's contents can be copied to the drive, eliminating the overhead of multiple CD-ROM drives or towers.
Apache is the most widely used web server on the Internet and we use it to serve a variety of web pages: the public PCs' home page, staff recommendations, statistics, Linux information and so on. A web server is a great place to put documents of all types that need to be easily accessible, regardless of whether they are regular HTML documents, PDF files or even Excel spreadsheets.
If you have or are considering getting your own Internet domain, Linux can provide an inexpensive way to administer it. Providing DNS records for internal devices like PCs, printers and network equipment aids in device identification and troubleshooting. DNS, or the Internet "phonebook," maps host names to their corresponding IP addresses.
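The "phonebook" idea can be sketched in a few lines of Python. This is an illustration only: a real DNS server works from zone files and answers queries over the network, none of which is shown here. The function names are hypothetical, and any host names and addresses passed in would be placeholders for a library's own inventory of PCs, printers and network equipment.

```python
import ipaddress


def build_lookup_tables(inventory):
    """Build forward (name -> address) and reverse (address -> name) tables.

    inventory is a list of (host_name, ip_address) pairs. Duplicate names
    or addresses raise ValueError, since they would make lookups ambiguous.
    """
    forward = {}
    reverse = {}
    for name, address in inventory:
        name = name.lower()  # host names are case-insensitive
        ip = str(ipaddress.ip_address(address))  # raises ValueError if malformed
        if name in forward or ip in reverse:
            raise ValueError("duplicate entry: %s %s" % (name, ip))
        forward[name] = ip
        reverse[ip] = name
    return forward, reverse


def reverse_pointer_name(address):
    """Return the in-addr.arpa name used for reverse (PTR) lookups."""
    return ipaddress.ip_address(address).reverse_pointer
```

The reverse table is what makes troubleshooting easier: given an address seen in a log, it gives back the name of the device that owns it.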
Dynamic Host Configuration Protocol, or DHCP, provides centralized management of a client's TCP/IP settings - host/domain name, IP address, default gateway, DNS servers, etc. Once a network is up and stable, changes are generally pretty minimal, but DHCP can save you a great deal of time should an unexpected change occur. Although the "D" in "DHCP" means "dynamic", it can be used to assign static addresses to PCs, which we do for a couple of reasons. One, some of our web databases are authenticated by IP address, which would present a problem if IP addresses "drifted" from PC to PC. Two, by assigning static addresses to all PCs, troubleshooting is easier: we know that each PC always gets the same IP address. Matching a PC with its corresponding hostname makes it easy to locate.
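The difference between static and dynamic assignment can be shown with a toy Python model. It is a sketch under stated assumptions, not a DHCP implementation: the class name AddressAssigner is hypothetical, clients are identified only by their hardware (MAC) address, lease expiry is ignored, and the reserved addresses are assumed to lie outside the dynamic pool.

```python
import ipaddress


class AddressAssigner:
    """Toy model of DHCP with fixed reservations plus a small dynamic pool."""

    def __init__(self, reservations, pool_start, pool_end):
        # reservations maps a hardware (MAC) address to a fixed IP address.
        self._reserved = {mac.lower(): ip for mac, ip in reservations.items()}
        start = int(ipaddress.ip_address(pool_start))
        end = int(ipaddress.ip_address(pool_end))
        # Every address from pool_start to pool_end inclusive is available.
        self._free = [str(ipaddress.ip_address(n)) for n in range(start, end + 1)]
        self._leases = {}

    def request(self, mac):
        """Return the address for this client, leasing one if needed."""
        mac = mac.lower()
        if mac in self._reserved:
            return self._reserved[mac]  # reserved: same address every time
        if mac not in self._leases:
            if not self._free:
                raise RuntimeError("dynamic pool exhausted")
            self._leases[mac] = self._free.pop(0)
        return self._leases[mac]
```

A PC with a reservation always receives the same address, which is what keeps IP-authenticated databases working; a PC without one receives whatever address is free in the pool.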
Linux can provide a wide variety of network services, all at a fraction of the cost of a commercial operating system. If your Library is considering alternatives (to commercial operating systems), Linux is definitely worth investigating.
Eric Sisler (esisler@cityofwestminster.us) is the Library Computer Technician, Westminster Public Library, CO
In December of 1996, the library received an early Christmas from the City's IT department: our own HP-UX server to run Dynix on. HP-UX is Hewlett Packard's proprietary version of UNIX. Installing and configuring the new server only whetted my appetite for learning more about UNIX. What I didn't want to do was destroy the Library's brand new server. Clearly I needed something to use as a learning tool and Linux fit the bill quite nicely. A distribution could be purchased for around $50 and it would run on a desktop PC.
We were also in the process of planning the technology needs of a new jointly run library facility, and we began thinking about what kind of services we wanted to provide and how to go about doing so. One of the first things that came to mind was some kind of CD-ROM database server. We had a few CD-ROM products available on a standalone PC basis. Our partner in this new joint-venture Library facility, Front Range Community College (FRCC), also had some CD-ROM products. These databases lived on a Novell server that had never quite lived up to expectations. I installed Linux on a spare PC and began configuring Samba to serve CD-ROMs.
Domain Name System (DNS) resolution was another service we knew we'd need right away at the new library. In addition to providing resolution for Internet hosts, we also wanted records for all network devices to aid troubleshooting. Setting up zone files for DNS proved a bit tricky at first but was a good lesson in configuration file syntax and interpreting error messages.
Two services successfully configured, it was time to see if the library's fledgling Linux server, named Gromit, would fly. All the testing in the world just can't compete with real-world usage. At this point both services were running on a very modest "server", really just a desktop PC with some extra hardware including extra memory, a CD-ROM tower, UPS (uninterruptible power supply), and tape drive. Fingers crossed, we added the server to the Library's network and brought it to life.
Several months went by, and it was obvious that Linux had indeed lived up to expectations. We felt confident that Linux would be able to handle whatever challenges we threw at it. The next step was to provide a central place for library staff to store their documents. Given the limited number of computer support staff available (two), backing up each and every computer simply isn't feasible. Since all computers are periodically wiped clean and new software is installed via ImageCast, having a backup of the computer's software is unnecessary as well.
Before requiring staff to store their documents on the server, we wanted to minimize the likelihood of hardware failure. Since Linux has modest hardware requirements, we were able to purchase a refurbished HP NetServer that would meet our needs at a good price. This new server had SCSI (Small Computer System Interface) disk drives, plenty of RAM, a tape drive, UPS and room for growth. After moving Gromit to its new hardware, staff began storing their documents on the "H" and "L" drives as they are commonly known. Initial fears that staff wouldn't like or use this new method were allayed when one of our nontechnical children's librarians later said, "I use it every day. I can work on a document in my office in the morning and then later on the Children's desk." Mission accomplished!
There isn't enough space in this article to describe everything Linux does. For the rest of the story, please visit gromit.westminster.lib.co.us/linux.