Minimal Perl for Unix and Linux People
Ravi writes "Perl (Practical Extraction and Report Language), the language created by Larry Wall, is arguably one of the greatest programming languages. But it has a reputation for being excessively cryptic, which gives it an image, especially among Perl novices, as a language that is complex and hard to master. Minimal Perl for Unix and Linux People, authored by Tim Maher and published by Manning Publications, addresses the obstacles presented by Perl's complexity. The book, which is divided into two parts comprising a total of 12 chapters, takes a unique approach to explaining Perl syntax and its use. The author emphasizes Perl's grep-, awk- and sed-like features and relies on concepts such as inputs, filters and arguments to allow Unix users to directly apply their existing knowledge to the task of learning Perl." Read on for the rest of Ravi's review.
What I found while reading this book is that "Minimal Perl" is a specially crafted subset of the Perl language, designed to be easily grasped by people who have a Unix background and wish to use Perl to write their scripts. Its aim is to filter out the complex ways of writing Perl programs and, whenever possible, to accomplish tasks using just one or two lines of Perl. In the first part of the book, the author explains how Perl can be used to do the same tasks as common Unix tools such as grep, awk, sed and find. He goes one step further by explaining how one can accomplish much more, and in a much simpler way, by using Perl techniques.
Minimal Perl for Unix and Linux People | |
author | Tim Maher |
pages | 464 |
publisher | Manning Publications |
rating | 8 |
reviewer | Ravi |
ISBN | 1932394508 |
summary | Provides a slice of Perl which when mastered can accomplish most of the jobs which require Perl |
Throughout the book, the author makes sure that the learning curve in acquiring Perl skills remains gentle. Perl is a language whose syntax has a multitude of options, so the book is peppered with numerous tables which provide excellent information at a glance. For example, in the third chapter, titled "Perl as a (Better) grep command", the author lists and compares the fundamental capabilities of Perl and the different grep commands (grep, egrep and fgrep), which clearly shows the advantages that Perl has over grep. In another table, you get a bird's-eye view of the essential syntax of Perl's regular expressions and their meaning. This chapter alone has around 12 tables. This is a really nice feature because the book doubles as a Perl reference where you can flip to the respective page and get the information you need.
The main strength and drawback of a language such as Perl is its dependence on regular expressions for accomplishing complex tasks. Once you master the regular expressions, the sky is the limit for ordering and segregating data using this language. In Perl, there is more than one way of doing the same thing. What is unique about this book is that the author specializes in explaining the easiest way of doing a particular task.
In many places, the author demonstrates complex tasks using just a few lines of Perl code. Many of the examples covered in this book are practical ones which give an idea of how the commands relate to the final outcome. For instance, while elaborating on the one-line grep-like commands in Perl, the author illustrates a web-oriented application of pattern matching where he shows how to extract and list the outline of the slashdot.org front page. The surprising thing is that this is accomplished using just a single line of Perl code. The book has lots of such one-line examples which teach how to use Perl intelligently with minimal effort.
While part I of this book focuses on ways in which simple Perl programs can provide superior alternatives to standard Unix commands, the second part throws light on the other aspects of Perl, concentrating on the syntax of the language and the various built-in functions and modules which do away with a lot of re-invention of the wheel, so to speak, and help churn out code which is portable.
Chapter 7, titled "Built-in functions", introduces an eclectic mix of functions available in Perl. There are functions for extracting a list of fields from a string, accessing the current date and time, generating random numbers, sorting lists, transforming lists, managing files and so on. These functions are broadly classified into those which generate and process scalars and those which process lists.
In chapter 8 of this book, the author walks the reader through numerous scripting techniques that can be used to write better Perl programs.
It was quite surprising that the author has chosen to discuss variables, more specifically the list variables comprising arrays and hashes, as well as the looping constructs, only in the 9th and 10th chapters, when one would expect them somewhere up front. In hindsight, I feel it is a good decision. Once you have executed the one-liner Perl programs in the initial chapters, you will be fairly confident in using Perl by the time you reach the 9th chapter.
The last two chapters deal with creating sub-routines and modules. Over the years various Perl programmers have created modules which are used for diverse purposes. With an aim to share these modules, they are collected and stored at one central place known as CPAN, which is an acronym for Comprehensive Perl Archive Network. The final chapter, apart from teaching how to create modules in Perl and manage them, also introduces the CPAN and ways in which one can find the right module by searching on CPAN.
The special variables cheat-sheet and the guidelines for parenthesizing code provided in the two appendices are really useful as a quick reference while writing Perl programs.
This is not a comprehensive book on Perl; rather, the author provides a slice of Perl which, when mastered, can accomplish most of the jobs which require Perl. You won't find the object-oriented concepts of Perl mentioned in this book. In many ways the author has moved beyond explaining a subset of Perl by providing a section titled "Directions for further study" at the end of each chapter, which lists further material for learning more about the topic covered.
I really enjoyed going through this book, especially because of its focus on the practical side of using Perl and taking a minimal approach.
Ravi Kumar maintains a blog titled "All about Linux" where he shares his thoughts and experiences in using Linux, Open Source and Free software.
You can purchase Minimal Perl for Unix and Linux People from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Re:*sigh* (Score:4, Insightful)
Should I read this or continue with sed/awk? (Score:3, Insightful)
I'm currently going through http://www.oreilly.com/catalog/sed2/ [oreilly.com], but I can see myself using perl more, the more I do website programming. Would an experienced scripter suggest that I switch to perl (for it seems it can perform similar text manipulation functions conveniently in a programming language), or spend more time with sed/awk?
I'll probably do both incidentally, but opinions would be appreciated. It seems everyone rates perl.
I was going to switch to Python, but apparently Perl is better for smaller/one line regexp manipulation in scripts, and python for building large applications.
Cryptic? (Score:4, Insightful)
Is it just me, or is it possible to create perfectly legible code in Perl if you use good technique, just like in any other language?
The cryptic/convoluted stuff only comes out when you try to be too cute.
Not hard to learn, very easy to remember (Score:5, Insightful)
The thing I've noticed, as a Perl programmer, is that it is the *only* language I've ever used (amongst bash, c, c++, java, rexx, fortran, basic) that I can take a break from for a year, come back, and be able to write a simple script without the need to refer to any books or online manuals. That is VERY useful for those of us who are more sysadmins than programmers. This power is partly due to the "more than one way to do it" philosophy, that lets you program in a style that works for you, hence allowing you to remember *how* to write in that language.
Then again, that's what most anti-Perl folks bitch about. Any language can be obfuscated. If you write hard to decipher code in Perl, you'll write it that way in any language.
Re:Cryptic? (Score:3, Insightful)
Whenever I come across code that just looks like `an explosion at an ascii factory', I first indent it (which usually fixes the readability). If that doesn't work, I try to figure out what it's trying to do (likely developer didn't know any better, and used some clunky code to do something trivial; usually code that can easily be `clarified' by replacing whole sections of it with a single regex).
Perl is a surprisingly beautiful and easy to read language---you just have to know it well.
Cryptic whitespace (Score:4, Insightful)
A language which makes a semantic distinction between tabs and spaces may give the appearance of enforcing legibility but in fact does little useful to help legibility.
A programming language should not make a distinction on meaning based on whether tabs or spaces are used; all whitespace should be regarded equally (except, understandably, end of line characters).
Otherwise, python seems ok. I just wish it had a whitespace-agnostic mode.
*I* cannot visually tell the difference between tabs and spaces, why should the programming language?
Re:Cryptic? Complex!? (Score:3, Insightful)
I was a Python aficionado, although most of my professional experience was with Java. Then I joined a Perl project. I was open minded, any language can be good in the right hands. Now, two and a half years later, I'm pretty good at it.
As the team grows, we find ourselves relying more and more on standard techniques. They're not your standard techniques, they're just what we came up with as our standard way. They work well. We have a beautiful object oriented mod_perl/Template Toolkit system, unit tests, RoboDoc, the works. We know how to do this.
But, exactly as you say, we need coding standards. Lots. Just to make code more comprehensible, it needs to look pretty uniform. We can do that.
But then, note that objects are just hashes. Sometimes, you get odd data in them, due to some bug. Where did that happen? Of course you use grep, but there are so many ways to put something into a hash, that you run into problems. So you use getters and setters and make sure that all the code everywhere uses them.
But even things like renaming functions... different calling syntax can make it hard to grep for uses of a function, even. It's getting too ridiculous. Our book of coding standards is getting so thick that we could be coding fucking Java instead, and feel liberated. It's madness.
So, yes, you can do Perl for larger projects. It's possible. But you have to tie yourself down so badly, most of Perl's strengths as a language can't be used.
Now I want to get back to Python or Java...
Re:Cryptic? Complex!? (Score:5, Insightful)
perl5 has run its course (Score:4, Insightful)
I've been programming perl for 10 years. I've written enough XS modules to be sadly familiar with perlguts and perlapi. I've used perl for a huge array of applications, not excluding some pretty twisted apache hacks using mod_perl. I write perl code every day in my job.
Lately however, I've grown more and more frustrated with this language. Here's some reasons why:
Strategies for complex perl code bases (Score:4, Insightful)
That's a pretty common way of implementing objects in perl, but it is, of course, not the only way... The current thinking seems to be we should all switch to using "Inside-Out Objects" (briefly: object data is moved to class data, and the object only needs to be a unique id to pick out the correct values from the class data -- so you bless a scalar ref, and get a lightweight object which stringifies to a unique id). The point being that if you do things this way, you really *have* to use the accessors, you can't cheat and treat the object as a hash reference any more. Unfortunately, last I looked there was some argument about what precisely was the right way to do this (there's some issue with thread support), though the best publicized way of doing it is certainly the one recommended by Damian Conway in his newish book "Perl Best Practices".
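For anyone who finds the inside-out idea easier to follow in code, here is a rough sketch of the same shape. It is written in Python rather than Perl, so it only illustrates the concept and is not the "Perl Best Practices" recipe; the `Account` class and its fields are made up for the example.

```python
import itertools


class Account:
    """Inside-out style object: the instance carries only a unique id."""

    # No per-instance attribute dict, so callers cannot bolt fields on.
    __slots__ = ("_id",)

    # Class-level storage: one dict per field, keyed by the object's id.
    _owner = {}
    _balance = {}
    _next_id = itertools.count(1)

    def __init__(self, owner, balance=0):
        self._id = next(Account._next_id)
        Account._owner[self._id] = owner
        Account._balance[self._id] = balance

    # Accessors are the only route to the data.
    def owner(self):
        return Account._owner[self._id]

    def balance(self):
        return Account._balance[self._id]

    def deposit(self, amount):
        Account._balance[self._id] += amount

    def __str__(self):
        # The object stringifies to its unique id.
        return f"Account#{self._id}"

    def __del__(self):
        # Class data outlives the instance unless it is cleaned up by hand.
        obj_id = getattr(self, "_id", None)
        Account._owner.pop(obj_id, None)
        Account._balance.pop(obj_id, None)
```

The point of the exercise is the same as in the Perl version: something like `acct.stash = 1` fails with an `AttributeError` instead of silently growing a new field, so every read and write has to go through a method you can grep for.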
If you're not interested in re-writing your entire code-base to conform to someone's notion of "Best Practices", myself I might suggest looking into "lock_keys" in the Hash::Util module. You could adopt the practice of doing a lock_keys on the hashref at the end of the object/creation initialization stage, and then if anyone accidentally tries to create a new hash field later, it will throw an error. A simple, effective trick, and I wish it were better publicized...
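To make the idea concrete without claiming anything about Hash::Util internals, here is a loose Python analogue of the "lock the keys once construction is done" practice. `LockableDict` and `make_account` are hypothetical names invented for this sketch, and it only guards plain item assignment.

```python
class LockableDict(dict):
    """A dict whose set of keys can be frozen after initialization."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._locked = False

    def lock_keys(self):
        # From here on, only keys that already exist may be assigned.
        self._locked = True

    def __setitem__(self, key, value):
        if self._locked and key not in self:
            raise KeyError(f"attempt to add key {key!r} to a locked dict")
        super().__setitem__(key, value)
        # Note: dict.update() and setdefault() bypass this check; a fuller
        # version would have to override them as well.


def make_account(owner):
    """Hypothetical constructor: fill in every field, then lock."""
    account = LockableDict(owner=owner, balance=0)
    account.lock_keys()
    return account
```

With this in place, `account["balance"] = 10` still works, while a typo such as `account["balanse"] = 10` raises `KeyError` at the point of the mistake rather than leaving odd data to be discovered later.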
On occasion I wonder how hard it would be to write an automated test that would look for cases where someone has done a "$obj->{hash_field}"...
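A crude version of such a check is not much code. The sketch below is Python, uses a regex rather than a real Perl parser, and assumes that `$self->{...}` inside a class is allowed while any other `$var->{field}` counts as a violation; both the pattern and the `find_direct_access` name are assumptions made for illustration.

```python
import re

# Matches $name->{field}, $name->{'field'} or $name->{"field"},
# except when the variable is $self.
DIRECT_ACCESS = re.compile(r"\$(?!self\b)\w+->\{\s*['\"]?\w+['\"]?\s*\}")


def find_direct_access(source):
    """Return (line_number, matched_text) for each suspicious hash access."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Crude comment stripping: a '#' inside a string or in $#array
        # will truncate the line too early.
        code = line.split("#", 1)[0]
        for match in DIRECT_ACCESS.finditer(code):
            hits.append((lineno, match.group(0)))
    return hits
```

Run over each source file from a unit test, a non-empty result would fail the build. Being regex-based, it will miss nested or computed keys, so treat it as a tripwire rather than a proof.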
In general, coding standards are important, and where the language is really flexible, they arguably become even more important -- but I think a lot of that problem can be solved with some good automated testing. For example, there's a CPAN module called Perl::Critic that will do things for you like check to make sure your code matches a given set of coding standards (it defaults to Conway's "Best Practices", as I remember it).
Re:Cryptic? Complex!? (Score:4, Insightful)
Re:PseudoHashing (Score:2, Insightful)
Actually, pseudohashes made everything slower, so they've been long deprecated and won't be in Perl 5.10.
Re:Not hard to learn, very easy to remember (Score:2, Insightful)
When it comes to programming, however, I have discovered, my worst enemy is always that guy pretending to be me from around two years past. Admittedly, he does some pretty clever things sometimes, but invariably in totally strange ways. He's been following me around for quite a long time, by now, but I've never quite figured out his ways. I found, giving him at least some constraints on how he can believably impersonate my past self is often rather helpful.
Re:Cryptic? Complex!? (Score:4, Insightful)
You know, it seems like *everyone* is put off by this aspect of Python at first. The first time I looked at it, it drove me nuts, and then I ignored Python for another two years.
But once I actually tried to write a program in Python, I found I didn't mind it one bit. Within a few hours my eyes didn't get confused by the lack of braces. I think it's actually easier on the eyes once you get used to it.
So I can't say, "don't knock it", because I've done that myself for sure. But do give Python another look, maybe play around with the tutorial for an hour or two.
Re:I really want to know... (Score:3, Insightful)
Re:Cryptic? Complex!? (Score:3, Insightful)
Yay! Let's reinvent the wheel by writing 10, 20, or more lines of code for something regular expressions would be able to handle in one. Furthermore, let's claim this is done for the sake of keeping the code 'pretty,' because it's far too embarrassing to admit that we don't really understand how to use regular expressions!
Hmm, like string functions that allow the use of regular expressions to make your string manipulation quick, efficient, and useful?
Yes, regex can be an odd concept to deal with at first, as they tend to be quite a bit more succinct than the languages you're more familiar with. Are you aware, however, that regular expressions can contain comments [regular-expressions.info] and extra whitespace [perl.com]?
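As a small illustration of the commented style, here is a sketch using Python's `re.VERBOSE` flag, which plays roughly the role of Perl's `/x` modifier. The pattern and the `parse_header` helper are made up for this example; they pick apart comment headers of the "Title (Score:N, Label)" form.

```python
import re

SCORE_HEADER = re.compile(
    r"""
    ^(?P<title>.+?)       # comment title, matched lazily
    \s+                   # whitespace before the score block
    \(Score:              # literal "(Score:"
    (?P<score>-?\d+)      # numeric score, possibly negative
    ,\s*                  # comma separator
    (?P<label>\w+)        # moderation label
    \)$                   # closing parenthesis at end of line
    """,
    re.VERBOSE,
)


def parse_header(line):
    """Return (title, score, label), or None if the line is not a header."""
    match = SCORE_HEADER.match(line)
    if match is None:
        return None
    return match.group("title"), int(match.group("score")), match.group("label")
```

In verbose mode the whitespace and `#` comments inside the pattern are ignored, so the expression can be laid out one piece per line and still behave like its one-line equivalent.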
Maybe you're paid by the line of code, or are attempting to squeeze in every extraneous hour of programming to inflate your consultant fee. If that's the case, I would certainly recommend avoiding regular expressions; they save far too much time and work entirely too well.