
Refactoring: Improving the Design of Existing Code

Posted by Hemos
from the solid-advice-for-programmers dept.
SEGV has returned and is continuing his excellent set of reviews. This time around, we're looking at Martin Fowler's (with Kent Beck, John Brant, William Opdyke, and Don Roberts) Refactoring: Improving the Design of Existing Code. Click below for more details.
Refactoring: Improving the Design of Existing Code
author Martin Fowler with Kent Beck, John Brant, William Opdyke, and Don Roberts
pages 431
publisher Addison-Wesley
rating 9/10
reviewer SEGV
summary Just what the working programmer ordered: a catalogue of practical refactorings with solid advice on when and how to apply them.


This book could very well do for refactoring what the "Gang of Four" book did for design patterns. In fact, with the number of contributing authors, this might well become known as the "Gang of Five" book. (They contributed content to chapters 3 and 12 through 15.)


Refactoring leaps in feet first with an extended example. I found this to be a surprisingly effective opener: it didn't overwhelm me, and left me hungry for more. The first chapter follows a sample program through several incremental refactorings, and the reader gets the idea via osmosis.

To illustrate the technique of refactoring, the first chapter presents the original code on the left page, and the resulting code on the right, with changes in bold. This presentation, coupled with explanatory text, makes it easy to see exactly what changed at each step. It's as if you're looking over the author's shoulder as he edits, compiles, and tests code in his development environment.

What is Refactoring?

Now that you've done a refactoring, you might be curious to know more about what refactoring is. The next few chapters provide the relevant background.

Refactoring is what the book's subtitle suggests: changing code in ways that preserve behaviour, but improve the way that behaviour is generated. This could be as trivial as renaming a method, or as tricky as separating domain and presentation classes.

Why go through this trouble? In the end, the code is different but it acts the same; there has been no new functionality added. Why? You do this to place yourself in a better position to add new functionality to the software. If you don't, you eventually end up with spaghetti code that is unmaintainable and will not support new functionality at all.
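To make the idea concrete, here is a minimal sketch of one small refactoring (pulling a loop out into its own named method) in Java, the language the book uses. It is not taken from the book; the class and method names (`RefactoringSketch`, `summaryBefore`, `summaryAfter`, `totalOf`) are invented for illustration.

```java
class RefactoringSketch {
    // Before: one method both sums the prices and formats the result.
    static String summaryBefore(int[] prices) {
        int total = 0;
        for (int p : prices) {
            total += p;
        }
        return "Total: " + total;
    }

    // After: the summing loop now lives in its own named method...
    static int totalOf(int[] prices) {
        int total = 0;
        for (int p : prices) {
            total += p;
        }
        return total;
    }

    // ...and the summary method only formats. The output is unchanged.
    static String summaryAfter(int[] prices) {
        return "Total: " + totalOf(prices);
    }

    public static void main(String[] args) {
        int[] prices = {2, 4, 1};
        // Same behaviour before and after: the two strings are equal.
        System.out.println(summaryBefore(prices).equals(summaryAfter(prices)));
    }
}
```

Nothing new has been added, but `totalOf` can now be reused or changed on its own, which is the "better position" the paragraph above describes.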

I think anyone who has worked on real code can appreciate the need for refactoring. In fact, most good programmers already do it, although perhaps only on a subconscious level. What this book aims to do is to raise that ad-hoc activity to a higher level of applied technique. Just as there are principles and practices in GUI design (as opposed to merely throwing widgets together randomly), there are principles and practices in refactoring activity: this book catalogues them.


Sandwiched between introductory and summary chapters is the meat of the book: a catalogue of over seventy refactorings. This catalogue follows in the footsteps of the highly successful Design Patterns format: Pattern Name and Classification, Intent, Also Known As, Motivation, Applicability, Structure, Participants, Collaborations, Implementation, Sample Code, Known Uses, and Related Patterns. Since the individual refactorings are less complex than patterns, this catalogue uses the format: Name, Summary, Motivation, Mechanics, and Examples.

The idea is the same. The name and summary provide a definitive vocabulary and a reference-card example. The motivation explains the relevance of the refactoring. The mechanics cover the step-by-step details of how the refactoring is executed. Then a series of examples demonstrate the variations.


I like the catalogue. Although some refactorings seem deceptively trivial, it is useful to have them laid out in step-by-step detail. You never know when you will make a mistake, and when you absolutely positively must fix a bug or add a feature by the next day, and need to refactor to do it, slow and steady wins the race.

Further, other refactorings are not so trivial and familiar, and it is certainly useful to have their traps and pitfalls exposed. Frequently, they rely on the smaller refactorings themselves.

I can see this book becoming well-used in a shop with plenty of production code.

Supplementary Material

The non-catalogue chapters are informative as well. I especially appreciate the metaphor of bad smells in the code: the "if it stinks, change it" philosophy is the perfect counter-point to the oft-cited "if it ain't broke, don't fix it" mentality.

The chapter on refactoring tools discusses the possibility of automating much of the mechanical work of refactoring. Although there is a Refactoring Browser for Smalltalk, I suspect that Java and C++ versions are a little ways off. I'd wager that, as with the UML, tool support will lag industry practice for some time.


As always, the author's writing style is down-to-earth and easy to read. Martin tells you straight up what he's found useful and what he hasn't. He tells you where he's made mistakes, and where the risk is less pronounced.

I like the way he goes through an example, then goes through it again under different conditions, thereby revealing the many-splendoured variations. Frequently he continues examples that were left off from other refactorings.

Plenty of further reading is suggested; I always like that.


The book has a Java focus, and that is the language used for the examples. There is some mention of Smalltalk and C++, but not much; far less than Design Patterns, for example. Still, the book is quite understandable to anyone with object-oriented development experience.

The book references design patterns; some refactorings even apply and manipulate patterns. However, I wish there were more direct references to the Design Patterns book. That would especially help those new to both refactorings and design patterns.

There are a few minor typos (nothing major), so check the author's web site for errata and try to get a recent printing if you can.


It's no secret that I think this is a book whose time has come. I'm hoping it will codify my approach to refactoring, to help me be more efficient in my development.

I recommend this book as both a practical catalogue, and as a general work on the theory and practice of refactoring. I think that the refactoring community will grow much as the patterns community before it, and that we will see more published on the subject.

Until then, this book is a good start.



1. Refactoring, a First Example
2. Principles in Refactoring
3. Bad Smells in Code
4. Building Tests
5. Toward a Catalog of Refactorings
6. Composing Methods
7. Moving Features Between Objects
8. Organizing Data
9. Simplifying Conditional Expressions
10. Making Method Calls Simpler
11. Dealing with Generalization
12. Big Refactorings
13. Refactoring, Reuse, and Reality
14. Refactoring Tools
15. Putting It All Together
List of Soundbites

This discussion has been archived. No new comments can be posted.

  • ...This is an important new concept that needs to be looked at. Seems like open source works pretty well but there is still a lot of over development being done, I know this debate has existed between desktop environments etc, but it seems like on things that are smaller projects there could be a lot of effort put into existing projects or starting new projects rather than writing another ftp client when there are already 30 of them.
  • by Anonymous Coward
    I think a lot of people underestimate the importance of refactoring code. It's put to good use (as I can attest from experience) in the Extreme Programming software development methodology. (If you haven't heard of this, check it out. It seems kind of radical, but it works very well in practice if applied appropriately.)
  • by Desdicardo (71571) on Thursday September 23, 1999 @05:14AM (#1664622)
    I've owned this book for a couple months now and I feel it was definitely worth buying. The section on self-testing code was quite useful, even if it was short. My one complaint is that the book does not address how to apply refactoring in an environment that closely tracks SPRs. On a project where there is formal witness testing you usually try to keep SPRs small with limited impact. This is exactly the opposite of how refactoring works... i.e. redesign the whole thing if it is the Right Thing to Do. While self-testing code helps, having to pay for a complete formal regression test for each SPR would get expensive. Other than that, however, this is an excellent book. I would be very happy if it attracted a following as large as the Design Patterns book.
  • I haven't read either this book or the book on patterns, so I may be well off the mark. However the key idea behind both of them seems to be old chestnuts in software engineering.

    For design patterns, read reusability. Some languages, eg. Haskell, support a high degree of reusability in the way they work, and so one can implement the idea-in-itself once and for all, but in most languages you will find yourself reimplementing the same idea again and again.

    Similarly with the current book, for refactoring read refinement. If we have programs M and N, and N terminates on all of the inputs M does, and with the same observables, then N refines M.

    Both of these ideas are important, and fraught with hazards in practice, so they are well deserving of a book length treatment. What irritates me is the contention that until now these ideas are ones we were only subconsciously aware of. Absurd: they are old ideas, and ones good software engineers are very conscious of.
  • by Anonymous Coward on Thursday September 23, 1999 @05:30AM (#1664626)
    The more I write code, the more I realize that it is like any other kind of writing. And after years of looking at garbage spaghetti code, I have come to the conclusion that the best way to raise the level of coding is for experienced and talented programmers to review the code of others, making revisions if necessary. Many programmers would no doubt scream in protest, and this too would be a good thing: in my experience the worst programmers are also the ones with the most vanity (especially in regards to their code).

  • ...can replace good, solid comments. Comment lines are soooo understated in schools, but are sooo comforting in the real world.

    At University of Toronto, examples in first year had 2 lines of comments for each line of code. I try to stick with that ratio at work. (I said TRY :( )

  • There is a school of thought which contends that if your code requires that many comments, it hasn't been very well-written. Good code should be almost self-documenting - an excess of comments can swamp the code and make it just as difficult to understand, if not more so, than the same code with no comments at all.

    Additionally, if you're spending twice as much time writing comments as you are writing code, then only one third of your time is being immediately productive. I appreciate the long-term value of well-documented code, but also tend to believe that brevity is the soul of clarity. And with that in mind, I'll shut up now.


  • No amount of programming methodology...can replace good, solid comments. Comment lines are soooo understated in schools, but are sooo comforting in the real world.

    No amount of comments will help with poorly structured code with badly chosen variable and function names. Code should be written such that it needs very few comments (admittedly, this is a subjective matter).

  • by Anonymous Coward
    ...Good code should be almost self-documenting.

    Well, I guess that means that there is no good C++ or Java code (duck & cover)

    Just another AC.

  • This is not meant to be an insult, but I suspect that some of your disdain for architectural analysis will disappear when you work on bigger projects. I'm personally a hacker at heart, but it simply doesn't work when you get into projects that involve many people, multiple sites, and millions of lines of code. In big projects, like many open source projects, you might never even meet most of the people that you work with. As for the re-factoring book, I think it's about time. There's an old engineering aphorism that goes "if you don't have the time to do it right, when will you find the time to do it over"? I think the opposite is true with software. There will never be enough time to do it optimally from the start, so refactoring and review is crucial.
  • by tuffy (10202) on Thursday September 23, 1999 @05:59AM (#1664634) Homepage Journal
    I don't spend a lot of time on comments the first time around writing things, but invariably I comment the second time around as an eternal reminder why I did things the way I did. This holds especially true for the low-level bits of logic I try to abstract away first and hardly look at until the boss calls for some tweaks.

    Too many comments can be as bad as too few, and trying to get the right mix is somewhat of an art that I still haven't quite mastered. But I think using them as reminders has come in very handy.

  • Excuse my ignorance but what is a SPR?
  • An SPR is a Software Problem Report. When a bug is found or an enhancement is requested an SPR is created to track the changes to the code base. The term is often used interchangeably with Software Change Report (SCR) and bug report. Check out GNATS. It's the GNU SPR-tracking tool.
  • Another book on this subject that you all may find interesting is Anti-Patterns
  • There is a school of thought which contends that if your code requires that many comments, it hasn't been very well-written. Good code should be almost self-documenting - an excess of comments can swamp the code and make it just as difficult to understand, if not more so, than the same code with no comments at all.

    I think it is much more important to have comments outside a function, method or class, explaining what it is and how to use it, than inside the code itself. This is for two reasons: firstly, if the comment is outside (say, just before the function definition), it doesn't clutter the code. But more importantly, somebody who wants to use a function shouldn't have to read the code to see what it does. You should put a comment at the top explaining the interface and what the code does, and any 'gotchas' in using it. Below that, you can optionally comment on how it does it, for the benefit of those trying to debug or improve the implementation.

    Of course, having comments doesn't excuse you from choosing meaningful method names, variable names, and so on. But in most cases, a name like get_user() isn't sufficiently descriptive in itself.
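As a rough illustration of the "comment outside the function" style described in this comment, here is a small Java sketch using a documentation comment. The class `UserDirectory` and its methods are hypothetical, invented only to show where the interface description and the 'gotchas' would go.

```java
import java.util.HashMap;
import java.util.Map;

class UserDirectory {
    private final Map<Integer, String> namesById = new HashMap<>();

    /** Registers (or replaces) the display name stored for the given id. */
    void addUser(int id, String name) {
        namesById.put(id, name);
    }

    /**
     * Looks up the display name for a user id.
     *
     * Interface: returns the stored name, or null if no user has this id.
     * Gotcha: ids are not validated, so an unknown or negative id is not an
     * error; callers must check for null themselves.
     */
    String getUser(int id) {
        return namesById.get(id);
    }

    public static void main(String[] args) {
        UserDirectory directory = new UserDirectory();
        directory.addUser(7, "segv");
        System.out.println(directory.getUser(7));
        System.out.println(directory.getUser(8));
    }
}
```

A caller can learn from the header alone that a missing user yields null, without reading the body of `getUser`.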

  • I tend to agree: You can write bad code in any language. (Is there a name for this axiom anyone?)

    Too much code, and you're spending too much time reading TWO languages. But how many programmers usually over-comment? ;-)

    Aside, does anyone have a list of the "must-have" software development books?
    i.e. "Computer Graphics" by Foley and van Dam and the Graphics Gems series are the "bibles" of graphics.
    I have "Code Complete" and "Rapid Development" both by Steve McConnell along with Design Patterns, which I find are great. Anyone have any others that every developer should have?
  • > What irritates me is the contention that until now these ideas are ones we were only subconsciously aware of.

    I think the main reasons for books like this and Design Patterns are to
    a) provide a common vocabulary so everyone knows what we're talking about - for example I 'discovered' the Command pattern and had a load of classes derived from command. But I talked about compounds rather than composites. I frequently use template methods but until I read Design Patterns I had no concise way of referring to them - I had to explain it every time. Refactoring is the same.
    b) show you more than you already knew, or cast new insight on something you're already familiar with.

    This second reason is probably why Design Patterns is so popular and why I think Refactoring will also be. You may already be using more than half of what the book describes, but the book(s) show you more than you knew. Because you're already in agreement with the authors (because some of the stuff is already familiar) the new stuff goes in easy.
  • Good software engineers may already be aware of this sort of thing; I know as I read that book review I see things I've done and things I've wanted to do but not had the time and things I'm perfectly well aware of.

    But that's not the primary value of a book like this. If the ideas were brand new and untested, they would be less valuable to have written down. The thing is that there are at least N+1 ways, for any given value of N, to re-engineer or refine or redesign a piece of code, and ideally you want to consider as many as possible before choosing which one to do. A book listing lots of them gives you a massive boost because it reduces the chance that you might overlook the one strategy that could be the biggest win. Think of it as a checklist: you may know, if you look in the fridge, that you have no milk, and as you walk around the house you may see any one of forty things you need to buy and they're all obvious to you, but you still make a shopping list when you go out because otherwise there's a good chance you'll forget at least one of them.

    In addition, writing these down might help turn bad software engineers or learning software engineers into good software engineers. I think learning software engineers, if they're going to be good ones, probably are already subconsciously aware of these ideas and benefit from having them brought up into the conscious level and carefully reviewed.

  • That is funny, someone moderate it up!
    (And yes, I'm a professional programmer who likes C/C++/Java.)
  • I know this sounds wishy-washy, but it all depends on your mode of development. For a strict, heavily reviewed development cycle, such as a large amount of DoD work, you need to do as much of the work in the requirements analysis and architectural design as possible. This will minimize (though never erase) the number of "gotchas" down the road. This is very important with fixed-price contracts! :)

    On the other hand, if you're doing more of a RAD style, the best advice I've ever heard is for each development mini-cycle, to "do the simplest thing that works, then refactor". Unfortunately, people love the first half of the advice but rarely remember to do the second half... :(


  • ..." should have?"

    All of the Dilbert books.

    Just to maintain your sanity.
  • I often write small assembly programs, where correctness, then time optimization are the critical design goals. I have to agree with your statement on over architecting small one-off systems, though careful design is always rewarded.

    I'd also agree architecture is of critical importance in large multi-programmer projects, especially any that must be expanded and grow over time. Many open source projects qualify nicely of course.

    But refactoring really rings a bell, even with small assembly programs. Typically I develop a simulation in C, test it, translate to assembly, verify equivalence of output, then refactor it until it's time optimal. Don't know if this book would suggest methods useful to me, but rewriting code while preserving its function is something I do a lot of.
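The workflow in this comment (keep a simple reference version, rewrite it for speed, then verify that the output is equivalent) can be sketched as follows. The commenter works in C and assembly; this sketch uses Java only to match the book's examples, and the bit-counting task and all names (`EquivalenceCheck`, `bitCountReference`, `bitCountFast`, `agreeOnRange`) are assumptions chosen for illustration.

```java
class EquivalenceCheck {
    // Reference version: plainly correct, tests each of the 32 bits in turn.
    static int bitCountReference(int x) {
        int count = 0;
        for (int i = 0; i < 32; i++) {
            count += (x >>> i) & 1;
        }
        return count;
    }

    // Reworked version: clears the lowest set bit on each pass, so the loop
    // runs once per set bit instead of once per bit position.
    static int bitCountFast(int x) {
        int count = 0;
        while (x != 0) {
            x &= x - 1;
            count++;
        }
        return count;
    }

    // Returns true if both versions agree on every input in [from, to].
    // Assumes to < Integer.MAX_VALUE so the loop counter cannot overflow.
    static boolean agreeOnRange(int from, int to) {
        for (int x = from; x <= to; x++) {
            if (bitCountReference(x) != bitCountFast(x)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnRange(-1000, 1000));
    }
}
```

The comparison only covers the inputs it is given, so it raises confidence that behaviour was preserved rather than proving it.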
  • by Anonymous Coward
    You need the right environment for reviews to work:
    1) social: if programmers feel the need to protect their code from their coworkers, you do not have a team development. Your whole application, being put together by modules from each programmer, will only be as good as your weakest team member.
    2) testability: if you cannot easily test (parts of) your application code, refactoring will not work. No one, other than the original designer, will dare to do any changes. Especially if it is spaghetti.
    3) concurrent version control: using something like cvs, clearcase or such is vital. For doing refactoring you need your own sandbox to be able to play in.

    If you have all three items working in your project, you will get, in my experience, the best results. Everybody can learn from everyone else.

    The worst experience was in a department where everyone worked isolated. Being new there, I got a piece of code to extend with some minor function. Peeking into the main header file I found:
    #define TRUE (-10)
    #define FALSE (-11)
    The guy who had written this had left the company. Showing this piece of marvel around and asking, if this was common practise, I was told not to do such thing. Someone could feel embarrassed by it. "Well, someone should", I thought, went back to my desk and refactored heavily...

  • Good structure and good variable names help you understand what is going on, but don't help explain why.

    Comments that just restate what the code says, like

    foocnt++; /* increment foocnt */

    are absolutely useless, and those who put them in code need to be taken out and beaten. But comments like

    foocnt++; /* yes, this situation counts as a foo */

    (ok, a bit of a contrived example) are more useful. Function and module headers that explain design and interface are very useful, and should be required on all sizeable projects, IMHO.

    Inside the code itself, perhaps the best way to comment is to write down what a section of code is going to do before you write it. Thus, you write the comment

    /* now we have to mung the frobnitz for each element of the barbaz array */

    before you write the code that loops over the array. (Assuming that the code is more complex than for (i=0;i) Then, when you're going back over the code, and you see something that makes you pause for a second - or even a quarter of a second - and say "why are we doing this?", comment it. Trust me, the guy who inherits your code will love you for it.

    (I have a calligraphic button that says "Code as if whoever maintains your code is a violent psychopath who knows where you live." Great advice. Anyone know the original source?)

    Anyway, I'm glad that I now have a term for refactoring. I've been doing it for years, but it was sometimes difficult to explain to management what had to be done and why. I shall add this tome to my purchase queue.

  • >There is a school of thought which contends that if your code requires that many comments, it hasn't been very well-written


    Most comments I write are for the sanity of my future self and others, who are not perfect and are not experts at coding. (e.g. 6 months of simple programming from college)

    Another reason why comments are needed is to get around limitations of development environment. I'm not talking about GNU C + vi. I'm talking about full environments like Developer 2000 or Uniface. Developer 2K doesn't even have an effective global search (fixed in next version).

    >if you're spending twice as much time writing comments as you are writing code, then only one third of your time is being immediately productive

    I disagree with you on this. Comments can save you/other people/your company lots of time/money when you come back 1 year later, having forgotten what everything does and its limitations. They also make you look better in front of customers when you can fix things quickly. This is more important than any academic elegance of code.

    >I'll shut up now

    I should take your advice. :)
  • While undercommenting is bad, so is overcommenting. At least I find that too many comments obfuscate the code. There is no reason to comment every single line in most code. If you don't understand what it does, and you have a description of what the block of code does (or is supposed to do), it's probably either because you need to think more about what it does, or it is just plain badly written (in which case you should rewrite it), not because there aren't enough comments. Especially since comments can be misleading, while code can't be.
  • ARRGGH! That little code snippet looked fine in Preview mode. I guess What You See Is Not What You Get. Anyway, hope the idea came through.
  • WikiWikiWeb is one of the underappreciated jewels for programmers and anyone who wonders how you can turn into a 'good' programmer, or what a good programmer is anyway.

    I bought this book as soon as I saw Martin Fowler's name on it, and it hasn't disappointed me. On some level, it's design patterns in practice, but it deals with the niggling details of loose code far more effectively than DP. I liked it.

    The testing framework methodology really interests me, but I found I was spending more time writing the tests than actually writing the code. YMMV.
  • No amount of commenting can make bad code any better.

    Comments also often contain bugs, or are out of date.

    When code is properly designed and implemented, its intent is usually pretty clear. Code that requires explanation can nearly always be rewritten so that its intent is clear even without comments.

  • Java's strong typing makes for awfully easy to read code. The objects wind up being nouns and the methods are usually pairs. The result is very verbose stuff that reads pretty well. Like:




    Maybe Perl or Python would be better examples :)

  • >"Code as if whoever maintains your code is a violent psychopath who knows where you live."


    Actually, I try to include the business rules/concepts in my comments also.

    /* Since each box contains a max of 28 packages and each shelf has 4 rows of 8 boxes.*/

    Important when the business rules change. And they will :(

    Also, your post is formatted enough to get the idea.

  • I've only got about three months' experience as a real live (commercial) hacker -- but most of my colleagues think Design Patterns is the bible. I read it and it was a somewhat laborious catalogue of systematic ways to write in C++ or Java constructs which are completely natural in ML or Haskell. Maybe this book is different, more about people than about code; but "refactoring" sounds suspiciously like another one of those commercial-software-engineering buzzwords which exist only to conceal the fact that commercially used languages have no abstraction power.

    Coding in Java for my job is beginning to rot my brain. I have to write Haskell programs no one will ever use to keep my sanity. Every time I write a type-cast I want to cry. My wrists hurt at the end of the day from the amount of typing required to implement the simplest concepts. Worst of all I look at my own code and I'm not sure whether it's right -- and it hurts because I know that if it were written in ML, or Haskell, or one of a dozen other civilized languages, I would be able to look at it and know that it was right.

    Is there any escape?

  • Sigh. Submitted too soon.

    >/* Since each box contains a max of 28 packages and each shelf has 4 rows of 8 boxes.*/

    Business rules could change the assumptions you made: there is now a new type of box, the rows are now different depending on which warehouse it is in, a box can now contain packages or containers or some of each.

    Business rules suck. They get in the way of my beautiful code.
  • Same difference really...
    If You think about it... :)
  • Many times in reviewing other peoples' code I see things that look like a half-assed implementation or poor design. It isn't until I dig deeper that I realize why it was done that way. I mean, you could write a script that says "reads through file line by line and process contents" but you need a person to say "Used positional delimitation within the '\' delimited fields to avoid multiple uses of strtok(,,)." It helps future users, and yourself.
  • by kuro5hin (8501) on Thursday September 23, 1999 @08:02AM (#1664663) Homepage
    In my experience, the times when 'well' architectured systems suck is when some manager with experience in, say, Visible Analyst decides it's up to him to engineer the whole system from the get-go, unencumbered by any knowledge of the language it'll be written in, or the strengths and limitations of that language.

    The most important thing about all of this is that software development goes in cycles. First you make it work, then you make it right, then you make it fast. Leaving out any of these steps is very bad.

    Another very bad thing is when you have the whole system planned out in excruciating detail before you write line one of code. Inevitably, one of your assumptions will turn out to be totally unworkable, and if it's already set in stone, that will probably break everything else. Generally you have to sketch the broad strokes, fill in the major code, find out what works and what doesn't, throw away what you've done so far, and start for real. That's just the way it is, and if you don't plan to throw away your first try, you'll just end up being overbudget and late when you have to throw it away anyway.

    We all take pink lemonade for granted.

  • by Anonymous Coward
    Your idea has a fatal flaw: where are you going to find senior people qualified to act in that role *and* willing to do so? It's not that we don't consider it important, especially the professional development issues associated with it, but the people who can act as editors will usually be working as tech managers (even project managers) or as extremely senior developers working on cutting edge stuff.

    From a business perspective it's even worse. Do you spend $X to hire Alice to write some code, or many times that to have Alice review the code that Bob wrote? Depending on the problem, you could literally have Bob take a week to write code that Alice could write in a few hours (e.g., because Alice realizes that it's a perfect example of when you should use bison/flex), then she would have to spend some time explaining what tools she used to reduce a big hairy problem to a little one.

    (coyote-san unlogged in)
  • I feel that there are merits to more than one style of commenting, but the appropriate style is based on the language being used and to some degree, the type of project being coded. However, I feel that the code itself should always be as readable as possible. Why? Well, suppose you are working on a piece that has been revised many times by many different programmers. The comments may or may not give an accurate picture of what's going on with the program. However if it compiles, the code will always tell what's happening. That should be made as painless a process as possible for those reading it.

    That aside, I would offer my opinions for the following languages:

    Stack Assemblies: Comment anything tricky, and include a stack status comment on every line! This is necessary to make sure the stack never underflows or overflows due to not knowing what to expect in it after unconditional branching.

    Other Assemblies: Commenting most lines is still likely a good idea. That way it's possible to see what's being moved and so forth.

    Higher Level Languages: Inline comments are usually a waste. In fact they can reduce the clarity of the code. Assume that the reader of your code knows the language. Save inline comments for tricky algorithms, highly mathematical content, or maybe obscure functions in a large language like Perl or Ada. Block comments have a great deal of use though. I'd probably recommend a block comment to describe each function.

    ...just my 2 cents; it's saved me a heck of a lot of time.
  • I'm interested in this "Extreme Programming" concept, but the link provided is either dead or /.ed or both. Can someone provide additional pointers or descriptions to "Extreme Programming"? Thanks!

    - Mike

  • The ideas of both patterns and refactoring are acknowledged to be old. The design patterns book specifically states that each pattern had to be used in at least two successful, major projects to be included in the catalogue. Similarly, the refactorings in this book have been tested and tried. The best programmers have been working this way for ages.

    What is new, is the codification of these ideas into a more textbook form. This is a step along the way from art to science. You can open a textbook now, and discover the steps of a proven method to get from software point A to B, when to do so, and when not to.

    This makes it more akin to engineering. These are proven recipes for engineering software, and now they are codified in books.
  • No way. Without some sort of accountability, whether it be pair programming or code reviews or whatever, dirty little secrets will become pervasive in the code base and eventually pull the works down. In addition, the best way for new members to get up to speed as quickly as possible is to have design and code reviews. It takes time from senior developers, but pretty soon the people getting reviewed are doing reviews on others. Development done in isolation is doomed to fail.
  • I'd really recommend reading Stroustrup's "The Design and Evolution of C++" to understand *why* C++ is the way it is. It is *not* entirely about C... it is about other issues, such as performance, static type checking, etc.

    Then Stroustrup's "The C++ Programming Language (Third Edition)" to understand the language as it now is. If you can see past the support for casting and operator overloading, you will discover that Stroustrup has as much to say on large scale software design as Booch or Lakos.

    There are whole chapters on expressing architecture in C++. Of course, you have to understand the syntax and semantics of the language to achieve that goal. But C++ is about more than obfuscation and C compatibility. Lurking in there is a language that combines the best of Simula, C, and other languages, into a successful language that supports large scale programming.

    Don't forget: Stroustrup's background is heading a research centre for large scale programming at AT&T. He doesn't just work with toy problems.
  • A manager where I work said that I should not just comment the "hairy parts" of functions and give a general overview of what each function does, but also include "dependencies" in a boxed comment near the front of each function. That is, like so:

    struct mystruct
    foo(int bar, int baz, FILE *fnord)
    /* This calls functions "function1" and "function2" and needs a FILE*
     * that points to the comma-delimited data that gets updated in function "blorf". */

    That's saved me a few hassles. ("Why is this not working right?" "Hold on, I tweaked foo... looks like function blorf needs an update too; I'll get right on it.") Does CVS do something like this automatically? I ask because I have never used CVS; it seems like overkill for the essentially one-person project I'm working on. Besides, the users always let me know semi-immediately whenever anything breaks.

  • Oh the humanity...why why why..why do we continue to get these snide comments about Perl being difficult to read. ANY high-level language can be difficult to read if it's written poorly, and easy to read if it's done properly.... I write my Perl code just like I'm writing C ( style wise ) and I think it's VERY easy to read and follow. Hmph.

    Now ..python with all of those spaces etc... that's another story :)
  • by gid-foo (89604) on Thursday September 23, 1999 @09:24AM (#1664678)
    Hear hear, this is the big problem with code reviews, in my experience. They tend to either be a rubber stamp or a tedious examination of religious principles (I like 2 tab indents, not 4, or your open bracket should be on the next line not following your function declaration, etc). In a best-case scenario you actually have engineers reading over your code and looking for problems or bogus assumptions. Too often most "engineers" are mediocre and uninterested in anything but widening their cubicle-monkey asses. Egoless development doesn't just apply to the individual writing the code but to all coders in the shop. On another note, in an Intel-style environment where the whole goal is to make yourself appear superior to your colleagues by putting down their ideas and abilities (your manager won't hire anyone potentially better than them because they are in direct competition with their own engineers) code reviews are merely political tools. In other words, an excellent chance to sink the hatchet into your cubicle mate's back.
  • Here is a link that might help: []

  • There's another good one I've seen about that says "4 or 5 months of development can easily save you 2 or 3 hours in the library." Or something to that effect. Managers and marketing people love it when an engineer uses this as their sig. gid-foo
  • Semantics??? Forgive me, but I think you missed my point totally.

    C++ doesn't HAVE a semantics. C++ has a tangled, elephantine heap of ambiguities. ML has a semantics (Milner et al, The definition of Standard ML, second edition).

    As for syntax ... I read a story once about someone (I forget who) who was trying to write a yacc grammar for C++, and every time he ran across an ambiguous case and ran it through cfront (the defining implementation at the time) to check its parsing, cfront dumped core. But that doesn't even matter; any language could have its syntax reduced to Lisp's and it would be fine with me.

    Type checking? After working in a language with a really powerful static type system (such as ML or Haskell), trying to express concepts as types in C++ feels like moving a sand dune with tweezers. Parametric polymorphism and algebraic sum types are just the beginning of what I miss.

    In short, individual language features (casts, overloading, whatever -- see Haskell to understand what overloading should be about) aren't what bother me. What bothers me is the whole philosophy of clumsy thinking at too low a level of abstraction. It astonishes me that people can and do write large programs in C++ and Java. I can respect that after a fashion, but just because the wall is bloody from everyone else banging their head against it doesn't mean I should do the same.

    I apologize for the strong language in this post, but I feel very strongly that the use of insufficiently abstract languages (principally C++ and Java these days) in production software is a major reason why said software is so frequently so bad; and that there is no excuse for the failure to use more advanced language technology to improve our design of and reasoning about programs.

  • I think refactoring is a Good Thing. The problem, though, is with the politics of implementing it in an organization. Since the benefits are long term, it can be difficult to convince people to devote resources to it. Most places I've worked at, programmers who quickly churn out massive amounts of barely functional, but glitzy, prototype code are regarded by management types as geniuses, while those who argue for reengineering code are considered obstacles.

    Perhaps the book addresses this (I haven't read it). Anyone actually work anywhere where management signed on to refactoring?

  • The problem with high levels of abstraction crops up when the abstract model isn't a good fit for the task or process you're trying to write code for. Bad as coding in C++ may be, it's easier than trying to change the business model. (That said, I've seen more bad C++ code than perhaps any other language -- I've also seen some very good C++ code).

    I used to do a lot of development in APL -- now there's a language with a lot of high level abstraction, but it's oriented in a particular direction that is not necessarily a good fit for some of the things I've seen it applied to (email!? business management!?).

    It may well be that the applications you're working on could in fact be better developed in ML or Haskell (I'm not familiar with either of those), but in the commercial world that's only one consideration. Other considerations are: who supports the development tools, and how big is the available pool of talent to support what gets built. I've beaten my head against that wall, too (I was an early adopter of C++ back in the 'cfront 1.0' days because the OO approach was a much better fit for some of the applications we were developing -- this in a UNIX/C shop that had just barely finished migrating some of their developers from VMS/FORTRAN. C++ has gone downhill since then, in my opinion.)
  • It strikes me that your argument is somewhat circular.

    Those who really, really want "a better language" are going to stick with something like Haskell, or Eiffel, etc. But the majority of people are going to accept C++'s warts (yes it has a few, I believe all languages do) because of its other advantages.

    So, more people use C++. Granted? Not everyone is a good programmer. Probably there are worse programmers using C++ (sheer numbers argument here) than Haskell.

    It's not surprising you see bad C++ code. Again, sheer number of users.

    You haven't seen my C++ code. Do I sound like the kind of person who would tolerate inelegance? I've used quite a few languages (although admittedly not Haskell, I'll take your word that it's nice). I understand C++ seems inelegant in places, but I believe that's more of a surface impression.

    C++ is currently the best, prevalent, language we have for software architecture. Java lacks language features (particularly const, and stack-allocation for user-defined types) and performance. C lacks higher-order features (such as classes). Smalltalk/Lisp/etc. lack performance and static type checking. Eiffel/Haskell/etc. lack prevalence.

    As long as C coders use C++, we will be stuck with programmers who don't understand the difference between initialization and assignment. The language can't help that.
  • Luckily, where I work the emphasis is on rock-solid code. Management will accept the re-engineering of already 'working' code, but it's the same battle every time: convince me that the pay off is worth the extra time invested. I understand their point of view, but sometimes it gets tedious...
  • That's exactly the kind of comment that should never be written. First, it's incredibly easy for the comment to become wrong - if one of your numbers changes and someone only changes the code and not the comment. Second, you've hard-coded (in your comment, at least) numbers that should clearly be named constants.

    If I have constants defined like this, then I can't imagine ever needing to write a comment like that:

    public interface BusinessRuleConstants {
        public static final int MaxPackagesPerBox = 28;
        public static final int BoxesPerRow = 8;
        public static final int RowsPerShelf = 4;
    }

    Also, I wouldn't call the statement in your comment a business rule. A business rule in my experience is more like, "If the number of packages in a box is less than 5, order more packages." And that rule could easily be expressed as a language construct, with no comment needed, if the data structures and constants have been given clear, correct names. And if they haven't, maybe you can refactor so they have.
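
    As a rough sketch of that last point, the example rule might read like this in code. The names `ReorderRule`, `MIN_PACKAGES_PER_BOX`, and `needsReorder` are hypothetical, made up for illustration; only the threshold of 5 comes from the rule quoted above.

```java
// Hypothetical names throughout; only the threshold of 5 comes from the example rule.
final class ReorderRule {
    static final int MIN_PACKAGES_PER_BOX = 5;

    // True when a box has run low enough that more packages should be ordered.
    static boolean needsReorder(int packagesInBox) {
        return packagesInBox < MIN_PACKAGES_PER_BOX;
    }
}
```

    A caller can then write `if (ReorderRule.needsReorder(count))`, and the rule reads off the page without a comment restating the number.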
  • "A Fortran programmer can write Fortran in any language."

    Not a complement.
  • I always imagine writing a good program to be similar to writing a good novel. Most writers will rewrite a given paragraph or chapter tens or hundreds of times until it's just perfect.
    Programs are no different--whenever you write a function or module, consider it a draft, and don't worry about throwing it away and writing it again. Too often I see people spending hours and hours trying to get acceptable behaviour out of their fundamentally flawed "first draft", when it would have been much simpler and easier to just toss the code and rewrite it, now that the problem is better understood. That way, when you're done, you have an elegant, easy-to-understand, simple program, instead of an inscrutable mess that "seems to work okay" (as far as your testing shows, anyway!)

  • Or a compliment for that matter.
  • Stop coding in Perl and you won't need so many comments.
  • Chapter 13, "Refactoring, Reuse, and Reality", by William Opdyke, covers this.

    It's a hard topic, especially since many of the best technical people are not the best politicians. How do we cope?

    At one point, Fowler addresses the question of "What if your manager won't let you refactor?" His controversial advice is "Don't tell the manager you're refactoring." His justification is, you are a professional, you know what it takes to do your job, and if refactoring here and there is the right thing to do, just do it. When your development improves because the code improves, your manager won't complain.
  • I am the reviewer. I have trashed books in the past, when deserved. I gave a game programming book 5/10. This book is 9/10, deservedly so. Please don't question my reviewing integrity.

  • >That's exactly the kind of comment that should never be written.

    I would agree with you with this.

    But what happens when you are working with a language and business requirement which really pushes the boundaries? (I asked a consultant if I could do what I wanted to do. Reply: "Why would you want to do this?" And then we proceeded to spend 2 hours on how to do it without making it crash. I am forced to use this language because the customer requested it.)

    It's kinda hard to explain but the example is not what I did. It was a difference between comment/program like this _or_ risk running into a known issue in the programming language.

    Some languages are not as nice as C. I would love it if I could have used C.

    I think I'll shut up now before I get in trouble.

  • >comments are not necessarily.

    It was a first year course and it was mostly to explain to students what was going on. But coming from high-school it was quite a bit of comments.

    >why Waterloo produces better and more sought-after developers than UofT. :)

    I know and worked with quite a few Waterloo grads and I must admit they are _all_ very nice people and good programmers. Considering that the university is in the middle of nowhere. :)

  • Another good book on this subject is AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis (ISBN: 0471197130).

  • If you are interested in looking into Haskell, my main advice is to be prepared to spend time learning it: if you have never used a lazy functional programming language it takes time to learn how to use these features, and the way Haskell uses state (indirectly via `monads') is tricky to grasp.

    The time is worth it in my opinion; laziness provides a powerful heuristic for attacking difficult optimisation problems efficiently: an application of Haskell is in providing efficient dynamic programming solutions to NP hard optimisation problems, and then step-by-step transforming these into industrial strength `C' code. This strategy has produced some of the best solutions to the problem, because laziness captures a fruitful intuition about how to minimise resources in such problems.

    If you are not enthusiastic about putting the time in, it is worth having a look at Objective CAML (`ocaml'), a language that offers an excellent marriage of object-oriented and functional programming with one of the best development suites in functional programming. References to both can be found at the FAQ for comp.lang.functional [].

    Of course both languages lack `prevalence', though that may change, since Simon Peyton-Jones, one of the chief architects of Haskell, has taken a post at Microsoft...
  • I can't say I find his argument convincing. Problem is, you don't see any results for a long time. It might save time when some new feature is added, or some maintenance work done, but this most likely will be a year or two down the road. In the meantime, if you are secretly refactoring you just seem slow.

    Also, you won't achieve much if you are the only one doing it. If you're part of a 5 person team, and only your 20% of the code is refactored, you're not going to accomplish much.

    My pessimistic view is that this just requires too long a view -- especially when most companies are only looking ahead a few months.

  • I've seen companies shut down new development for a quarter, while they refactor and rearchitect their code base. Not due to foresight, though, but rather because functionality could no longer be safely added, and sales people could no longer sell the outdated product. When *managers* finally realize refactoring is necessary, the *technical* people really know it is too late.
  • That information is great to have, but think about how it will be maintained. Chances are, it won't. You may maintain it religiously, but the next guy will change something, and not realize that the change makes the information wrong.

    Once the documentation is not reliable, people will stop reading it, and it will grow obsolete at an ever increasing rate.

    So I'm doubtful about this mechanical approach.

    But any documentation that is in the code itself is always 10 times better than the one that is on its own in a binder or web site somewhere. That stuff never gets either read or updated, and is just a pure waste of effort.
  • Refactoring requires inexpensive testing.

    When refactoring, you run unit tests. To finalize a refactoring, you must run a full test suite on the entire product.

    This is all part of ExtremeProgramming.
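
    A minimal sketch of what that looks like, assuming a hypothetical `Invoice` class (the names and the 10% rate are invented for illustration, not taken from the book): the old and new versions of a method are checked against each other after an Extract Method step.

```java
// Hypothetical example; class, method names, and the 10% rate are assumptions.
final class Invoice {
    static final int TAX_PERCENT = 10;

    // Before refactoring: one dense expression.
    static int totalBefore(int priceCents, int quantity) {
        return priceCents * quantity + priceCents * quantity * 10 / 100;
    }

    // After Extract Method: same arithmetic, but each step has a name.
    static int totalAfter(int priceCents, int quantity) {
        int subtotal = subtotal(priceCents, quantity);
        return subtotal + tax(subtotal);
    }

    static int subtotal(int priceCents, int quantity) {
        return priceCents * quantity;
    }

    static int tax(int subtotalCents) {
        return subtotalCents * TAX_PERCENT / 100;
    }

    // The "unit test": the refactored method must agree with the original.
    static boolean behaviourPreserved() {
        for (int price = 0; price <= 500; price += 50) {
            for (int quantity = 0; quantity <= 5; quantity++) {
                if (totalBefore(price, quantity) != totalAfter(price, quantity)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

    The check is cheap enough to run after every small step, which is the point: if it is expensive, people skip it.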
  • After commenting code, you should rewrite it so the comment is no longer needed. Embed the semantic content into the source code. This will fix bugs.


    foo++; // got another special foo

    turns into
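
    The second snippet is missing here. One guess at the kind of thing meant, with hypothetical names (`FooTally`, `recordSpecialFoo`, `specialFooCount`) that are not the poster's, would be to move the comment's meaning into the identifiers:

```java
// Hypothetical names; the information from the old comment now lives in the identifiers.
final class FooTally {
    private int specialFooCount = 0;

    // Replaces "foo++; // got another special foo".
    void recordSpecialFoo() {
        specialFooCount++;
    }

    int specialFooCount() {
        return specialFooCount;
    }
}
```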

  • The WikiWikiWeb is probably *the* finest resource of information for professional object developers that I've ever found. The knowledge laid out there by its little community has completely changed the way I view (and do) software development -- and for the better.
  • > Most comments I do are for the sanity of my future self/others, who are not perfect and are
    > not experts at coding. (eg 6months of simple programming from college)

    I'm not advocating having no comments at all, which seems to be the way everyone's reading what I said. I'm simply saying that if you find you need to write two lines of comments per line of code something is definitely wrong. I used to write code like this, and it was a living nightmare to update later on.

    Code sensibly; choose sensible variable and function names, comment at the beginning of function definitions explaining the purpose of the function and at the beginning of each logical block - a for loop, for example. If there are a couple of lines which need additional explanation, put a comment there. I find that this works, and code I have written in this way is still actively in use and maintainable by others some six years on.

    Again, the closer your code is to being self-documenting the easier it will be to maintain, irrespective of how many comments are in it. There is little worse, in my opinion, than finding code like:

    a = a + b; // Add a to b and store the result in a

    A well-commented line, to be sure. But what additional information does the comment pass on? If you're writing two comment lines per line of code, I can't see how you're avoiding stuff like this.
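
    For contrast, a small sketch of the style described above: one comment on the function, one on the logical block, and names that carry the rest. The class and method here (`Orders`, `sumLargeOrders`) are hypothetical and only for illustration.

```java
// Hypothetical example of "comment the function and the block, not every line".
final class Orders {
    // Returns the sum of all order amounts that exceed the given threshold.
    static int sumLargeOrders(int[] orderAmounts, int threshold) {
        int total = 0;
        // Add up only the orders above the threshold.
        for (int amount : orderAmounts) {
            if (amount > threshold) {
                total += amount;
            }
        }
        return total;
    }
}
```

    None of the individual lines needs a comment, because `total += amount` already says what an "add amount to total" comment would.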

  • I've used Miranda and NIAL (Nested Interactive Array Language). Both are lazy evaluating, functional languages.

    I have a background in computer science, and have studied dynamic programming, functional programming, etc.

    You should check out C++'s standard valarray templated class. It is designed for optimum performance. Implementations typically use proxy objects for intermediate access and operations. This, in effect, means lazy evaluation. However, it's partly done at compile time, which effectively means performance.
  • Agreed. However, even writing has its books on "elements of style" and "how to write effectively" etc.
  • Yes, my web page has a comprehensive catalogue of my software development library.

    Of course not all are "must haves" but I did buy them all.
  • There is a school of thought which contends that if your code requires that many comments, it hasn't been very well-written. Good code should be almost self-documenting - an excess of comments can swamp the code and make it just as difficult to understand, if not more so, than the same code with no comments at all.

    Good comments are like newspaper headlines -- they give you a quick summary of a section of code without having to read the article (code) in detail.

    Even though the article may be well-written, it still takes more time than the headline if all you want is the gist. Sometimes you are simply hunting for something and need a way to filter out the unlikely code "paths" faster.

    Also, giving a "hint" before the actual code may make its purpose jump out faster because you were prepared with the general idea.

    Good commenting is an art form (just like programming itself).

  • I've given away dozens of programming books, but here is a list of books still on the shelf. [] Used bookstores are better off without a computer section.
  • After reading this review I went to The Bookpool [] (where Refactoring is available for $28; sorry, Amazon) and ordered it. I've now had it a few days, sampled a number of sections, and started seriously on reading from cover to cover.

    Maybe SEGV's seen something I haven't, but I'm tempted to give it at least 9.5/10, and thinking about more.

    Yes, as many posters above note, I too have been refactoring for much of my career, to save my sanity if for no other reason. But I called it ``cleaning up the code'', and often couldn't articulate to my peers or bosses why it was the right thing to do. I was abstracting the form of the code, changing it to make it easier to understand. Fowler has abstracted the form of the changes, to make them easier to recognize and execute correctly. This higher level of abstraction is what makes the book worthwhile.

    In addition, he's labelled and codified abstractions I haven't thought of, but which will be useful now that they've been brought to my attention.

    It's also nice that he's given ``guest authors'' chapters to themselves, so we get different views of the subject. Fowler's upfront about what he owes to others in developing the concepts; he says they should have written the book, but since he's the one to get around to it, he's at least roped them in for their expertise.

    All in all, if you ever have to touch sub-standard code, get and apply this book. I would have killed for this at my last job.

Real computer scientists don't comment their code. The identifiers are so long they can't afford the disk space.