|Self-Service Linux: Determining Problems and Finding Solutions|
|author||Mark Wilding and Dan Behman|
|publisher||Prentice Hall, PTR|
This is not a basic Linux HOW TO book: authors Wilding and Behman take the reader to a level past the introductory Linux OS installation instructions and KDE/GNOME window dressing changes. In all real life scenarios and at some critical point, a Linux user or admin will need to troubleshoot some aspect of the system they use at home or the systems they manage on the job. This book is for that power user, systems administrator or developer who will, out of base necessity, require a proper bag of tools and practical guidance in establishing an effective set of skills for troubleshooting one or more Linux systems.
A quick scan of the table of contents gives a very abbreviated summary of the book and actually belies the depth of the contents. The authors break the chapters into very self-contained topics including best practices and initial investigations, strace and system call tracing explained, the /proc filesystem, compiling, GDB (GNU Debugger), Linux system crashes and hangs, and kernel debugging, among others. These chapters are filled with detailed examples that perfectly illustrate real world scenarios that any Linux user will be familiar with.
Chapter 1 is an overview of the complex process of problem determination and resolution and begins with steps to configure your Linux system(s) for optimal troubleshooting. The authors outline a selection of tools they recommend the reader/user install on their Linux system(s) in anticipation of future problems: strace, ltrace, lsof, top, traceroute/tcptraceroute, ping, hexdump, tcpdump/ethereal, GDB and readelf. These and many others are categorized by type (process information and debugging, network, system information, files and object files, kernel and miscellaneous) in Appendix A, "The Toolbox." Wilding and Behman stress the importance of balancing the need to solve issues immediately vs. building troubleshooting skills. They outline four phases of problem investigation (using your own knowledge and skills to investigate, using the Internet, conducting a deeper investigation, and getting help) and discuss where the various tools fit into different scenarios, how to collect information about system changes, what resources are available on the Internet (Google, USENET, Bugzilla, etc.), how to handle more difficult problems and where and how to get outside help, if necessary.
Chapter 2 explains system call tracing, introduces the strace tool (traces system calls between a process and the kernel) and how to use it to diagnose errors related to the operating system. This is a very well organized chapter with plenty of depth. Wilding and Behman offer an extensive discussion of this first tool, progressing from simple examples that illustrate how to read the strace output, how/when to use strace options, timing system call activity, tracing an existing running process, to many practical debugging examples.
Chapter 3, "The /proc Filesystem," looks at user process information (/proc/self, /proc/<pid>, /proc/<pid>/environ, /proc/<pid>/mem), kernel information and manipulation (/proc/cpufreq, /proc/cpuinfo, /proc/devices, /proc/meminfo, /proc/partitions), and system information and manipulation (/proc/sys/fs, /proc/sys/kernel, /proc/sys/vm). The authors run through files and directories relative to /proc and describe how to view information about the kernel and currently running processes. This chapter gives a good example of how to use the "kernel magic sysrq key" feature (using the ALT-SysRq hotkeys to get kernel information) when a system hangs. Output from the commands showPC, showMem and showTasks are given as examples.
The next chapter details the GCC (GNU Compiler Collection) and compiling. The authors don't attempt to walk the reader through basic kernel source compiling but rather they concentrate on how to decipher errors that arise from compiling source. They give a basic outline of some basic compile failures (environment/setup errors, compiler version differences, user error, code error, etc.) then show a common error involving both incorrect code and different allowances made between compiler versions. Wilding and Behman show the reader how to decipher the kernel error and how to use both existing documentation and bugs posted on the Internet to correct the errors and rerun the compilation successfully. This is a very practical demonstration of how compile errors can be worked out and solved quickly.
Chapter 5 begins with a definition of "stacks" and a description of stack structure, local variables, and stack frames. Also shown is how to display the raw stack in a debugger like DDD (Data Display Debugger) or GDB (The Gnu Debugger) in order to perform a detailed stack analysis. The authors use the backtrace command to look at stack traceback output from GDB, "walking the stack" (manually walk the raw stack frame by frame using the dladdr function), common causes of stack corruption, and SIGILL signals.
Debugging applications is the subject of the next chapter with the majority of the chapter dedicated to The GNU Debugger. This is a logical place for a discussion on debuggers as the authors point out that they are particularly useful when problems can't be solved through log files, error messages, etc., when a problem is of an immediate nature (i.e. doesn't extend over a long period of time) and when source is available. GDB command line editing is covered along with how to control a live process by running the process directly through the debugger, how to attach to a running process and how to use a core file (or a process image) to perform debugging. The authors also examine viewing the memory map and variables, looking at the contents of register dumps, working with C++ (inline functions and exceptions), and problems with threaded applications. A brief description of the Data Display Debugger (DDD) GUI front-end to GDB is included at the conclusion of the chapter.
Chapter 7 deals with "System Crashes and Hangs" and how to assemble the appropriate information necessary to troubleshoot a system problem using various tools and techniques: using the syslog, setting up and using a serial console, using the SysRq kernel magic hotkey, examining the oops report generated by a manual kernel trap, considering hardware failure issues, and setting up cscope to index kernel sources. This chapter prepares the reader for documenting proper and extensive details about errors and problems not only for rapid diagnosis but also in the event he or she needs to call in an expert.
Kernel Debugging with KDB is a brief chapter that instructs the reader on how to enable and activate KDB, basic commands associated with its use, and some examples on how to use it. Several good illustrations of where KDB proves useful over other tools are included.
The final chapter explores the ELF file format (executable and linking format) for shared libraries and executables. The authors provide a comprehensive look at the ELF standard on Linux. They start with basic definitions and concepts (symbols names and C versus C++, linking with static libraries and run time linking, and run time linker) and prep the reader with some source code that is used in later chapter examples. They examine the ELF file structure (the header using hexdump, segments/sections with readelf, the program header table, and the section header table). This is probably the strongest chapter in the book. There is enough information and instruction in this chapter to arm a Linux system troubleshooter to follow the practical examples with little effort.
The book concludes with two valuable appendices that detail the authors' selected tools for Linux problem determination and include a data collector script intended to capture basic critical system information in the event of a problem. As discussed above, the "Toolbox" appendix is a summary of the authors' selection of best Linux tools for diagnosing problems. Each tool has a brief description, where to get it, level of usefulness, when to use the tool and additional notes. Appendix B, "Data Collection Script," offers the reader a sample bash script tool that gathers a broad range of system information. The authors provide several optional switches to increase the amount of data collected with the caveat that time to collect that information also increases.
Wilding and Behman assume some familiarity with the Linux system: their advice and instruction are intended for those users who are not afraid of the CLI and who understand basic Linux operating and file system structure. That said, Self-Service Linux: Mastering the Art of Problem Determination is a valuable resource for advanced users and system administrators. In short, this book is for anyone who uses Linux on a daily basis on one or multiple systems. The examples are fully detailed: the reader gets commands, options, output, sample code, and a variety of possible outcome scenarios. Wilding and Behman set out a realistic and practical approach to problem solving; they satisfy the troubleshooter in all of us. Self-Service Linux is a welcome addition to the Bruce Perens Open Source series of Prentice Hall PTR professional reference books."
You can purchase Self-Service Linux: Determining Problems and Finding Solutions from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.