
High Performance Web Sites

Posted by samzenpus
from the heavy-duty-net dept.
Michael J. Ross writes "Every Internet user's impression of a Web site is greatly affected by how quickly that site's pages are presented to the user, relative to their expectations, regardless of whether they have a broadband or narrowband connection. Web developers often assume that most page-loading performance problems originate on the back-end, and thus that developers have little control over performance on the front-end, i.e., directly in the visitor's browser. But Steve Souders, head of site performance at Yahoo, argues otherwise in his book, High Performance Web Sites: Essential Knowledge for Frontend Engineers." Read on for the rest of Michael's review.
High Performance Web Sites
author: Steve Souders
pages: 168
publisher: O'Reilly Media
rating: 9/10
reviewer: Michael J. Ross
ISBN: 0596529309
summary: 14 rules for faster Web pages
The typical Web developer — particularly one well-versed in database programming — might believe that the bulk of a Web page's response time is consumed in delivering the HTML document from the Web server, and in performing other back-end tasks, such as querying a database for the values presented in the page. But the author quantitatively demonstrates that — at least for what are arguably the top 10 sites — less than 20 percent of the total response time is consumed by downloading the HTML document. Consequently, more than 80 percent of the response time is spent on front-end processing — specifically, downloading all of the components other than the HTML document itself. In turn, cutting that front-end load in half would improve the total response time by more than 40 percent. At first glance, this may seem insignificant, given how few seconds or even deciseconds it takes for the typical Web page to appear using broadband. But any delays, even a fraction of a second, accumulate in reducing the satisfaction of the user. Likewise, improved site performance not only benefits the site visitor, in terms of faster page loading, but also the site owner, with reduced bandwidth costs and happier site visitors.
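
The arithmetic behind that claim can be checked with a short sketch. The 80/20 split comes from the figures quoted above; the one-second total and the function names are illustrative assumptions, not anything taken from the book.

```python
def improved_total(total_s, share, cut):
    """Return the new total response time after shrinking one portion of it.

    total_s -- original total response time in seconds (assumed value)
    share   -- fraction of the total spent in the portion being optimized (0..1)
    cut     -- fraction of that portion which is eliminated (0..1)
    """
    untouched = total_s * (1.0 - share)
    optimized = total_s * share * (1.0 - cut)
    return untouched + optimized


def percent_saved(total_s, share, cut):
    """Percentage improvement in the total response time."""
    new_total = improved_total(total_s, share, cut)
    return 100.0 * (total_s - new_total) / total_s


if __name__ == "__main__":
    # Halving a front end that accounts for 80 percent of the total.
    print(percent_saved(1.0, 0.80, 0.5))
    # Halving a back end that accounts for only 20 percent of the total.
    print(percent_saved(1.0, 0.20, 0.5))
```

With a front-end share of exactly 80 percent the saving works out to 40 percent of the total, so any share above 80 percent gives more than that, while the same effort spent on a 20 percent back end saves about 10 percent.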

Creators and maintainers of Web sites of all sizes should thus take a strong interest in the advice provided by "Chief Performance Yahoo!," in the 14 rules for improving Web site performance that he has learned in the trenches. High Performance Web Sites was published on 11 September 2007, by O'Reilly Media, under the ISBNs 0596529309 and 978-0596529307. As with all of their other titles, the publisher provides a page for the book, where visitors can purchase or register a copy of the book, or read online versions of its table of contents, index, and a sample chapter, "Rule 4: Gzip Components" (Chapter 4), as a PDF file. In addition, visitors can read or contribute reviews of the book, as well as errata — of which there are none, as of this writing. O'Reilly's site also hosts a video titled "High Performance Web Sites: 14 Rules for Faster Pages," in which the author talks about his site performance best practices.

The bulk of the book's information is contained in 14 chapters, with each one corresponding to one of the performance rules. Preceding this material are two chapters on the importance of front-end performance, and an overview of HTTP. Together these form a well-chosen springboard for launching into the performance rules. In an additional and last chapter, "Deconstructing 10 Top Sites," the author analyzes the performance of 10 major Web sites, including his own, Yahoo, to provide real-world examples of how the implementation of his performance rules could make a dramatic difference in the response times of those sites. These test results and his analysis are preceded by a discussion of page weight, response times, YSlow grading, and details on how he performed the testing. Naturally, if and when a reader peruses those sites, checking their performance at the time, the owners of those sites may have fixed most if not all of the performance problems pointed out by Steve Souders. If they have not, then they have no excuse, if only because of the publication of this book.

Each chapter begins with a brief introduction to whatever particular performance problem is addressed by that chapter's rule. Subsequent sections provide more technical detail, including the extent of the problem found on the previously mentioned 10 top Web sites. The author then explains how the rule in question solves the problem, with test results to back up the claims. For some of the rules, alternative solutions are presented, as well as the pros and cons of implementing his suggestions. For instance, in his coverage of JavaScript minification, he examines the potential downsides to this practice, including increased code maintenance costs. Every chapter ends with a restatement of the rule.

The book is a quick read compared to most technical books, not just because of its relatively small size (168 pages), but also because of its writing style. Admittedly, this may be partly the result of O'Reilly's in-house and perhaps outsourced editors, oftentimes the unsung heroes of publishing enterprises. This book is also valuable in that it offers the candid perspective of a Web performance expert, who never loses sight of the importance of the end-user experience. (My favorite phrase in the book, on page 38, is: "...the HTML page is the progress indicator.")

The ease of implementing the rules varies greatly. Most developers would have no difficulty putting into practice the admonition to make CSS and JavaScript files external, but would likely find it far more challenging, for instance, to use a content delivery network, if their budget puts it out of reach. In fact, differences in difficulty levels will be most apparent to the reader when he or she finishes Chapter 1 (on making fewer HTTP requests, which is straightforward) and begins reading Chapter 2 (content delivery networks).

In the book's final chapter, Steve Souders critiques the top 10 sites used as examples throughout the book, evaluating them for performance and specifically how they could improve that through the implementation of his 14 rules. In critiquing the Web site of his employer, he apparently pulls no punches — though few are needed, because the site ranks high in performance versus the others, as does Google. Such objectivity is appreciated.

For Web developers who would like to test the performance of the Web sites for which they are responsible, the author mentions in his final chapter the five primary tools that he used for evaluating the top 10 Web sites for the book, and, presumably, used for the work that he and his team do at Yahoo. These include YSlow, a tool that he created himself. Also, in Chapter 5, he briefly mentions another of his tools, sleep.cgi, a freely available Perl script that tests how delayed components affect Web pages.

Like any work, this book is not perfect. In Chapter 1, the author could make clearer the distinction between function and file modularization, as otherwise his discussion could confuse inexperienced programmers. In Chapter 10, the author explores the gains to be made from minifying JavaScript code, but fails to do the same for HTML files, or even to explain the absence of this coverage, though he does briefly discuss minifying CSS. Lastly, the redundant restatement of the rules at the end of every chapter could be eliminated, if only in keeping with the spirit of improving performance and efficiency by reducing reader workload.

Yet these weaknesses are inconsequential and easily fixable. The author's core ideas are clearly explained; the performance improvements are demonstrated; the book's production is excellent. High Performance Web Sites is highly recommended to all Web developers seriously interested in improving their site visitors' experiences.

Michael J. Ross is a Web developer, freelance writer, and the editor of PristinePlanet.com's free newsletter.

You can purchase High Performance Web Sites from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

  • Or does this sound suspiciously like an advertisement for YSlow in book form? Not only is YSlow specifically mentioned as part of the book review, but every single item covered is one of those little checks YSlow completes.
    • Re: (Score:3, Insightful)

      by NickFitz (5849)

      does this sound suspiciously like an advertisement for YSlow in book form?

      What's suspicious about the fact that a book written by the creator of YSlow addresses the very issues that YSlow, a free open source Firefox extension, addresses? It would be pretty strange if it didn't.

      If you want to be so paranoid about the intentions of an author, at least find one it's reasonable to be suspicious about in the first place.

      • Seeing that this is about YSlow, just thought I'd mention that it isn't much of an extension. It'll make a few notes about your web pages (that you should probably already know if you created them) then give you links to Yahoo's website for suggestions on how to fix them.

        So, in the spirit of cutting out the middleman, here's all the information you'd get about speeding up your web site without having to install YSlow: developer.yahoo.com... [yahoo.com]

    • Incidentally, YSlow gives Slashdot a C.

  • by xxxJonBoyxxx (565205) on Wednesday October 10, 2007 @02:49PM (#20930297)
    Learned the hard way:

    Rule #34: Don't be the first Java site your users visit during the day. (Unfortunately, this pretty much turned into "don't use Java applets" unless you could find a hidden way to load a throwaway applet in another frame, etc.)
    • by Anonymous Coward
      Everyone knows that Rule 34 on the internet is "If it exists, there is porn of it".
      • Re: (Score:3, Funny)

        by Anonymous Coward
        Of course. Grandparent started talking about Java and Rule 34, and I was preparing to avert my eyes.
    • by Cyberskin (1171659) on Wednesday October 10, 2007 @03:17PM (#20930739)
      Rule #34 is all about how slow loading the java plugin is for any browser. It's always been slow, it was supposed to improve w/6 and really it's still slow. The main problem is that NOTHING shows up until the plugin gets loaded. My solution was two-fold. Write the object in javascript (which conveniently allows the rest of the html in the page to load and display, but also eliminates the IE problem of "click on this" to activate the applet) and create an animated gif loading screen div which I block when the applet div has finished loading. (I ended up loading the applet at the bottom of the page below the watermark because otherwise I couldn't catch the finished loading event. I just made the loading screen match the page background so the only way you could tell anything was going on was that the scrollbar on the right changed sizes.) Not exactly elegant, but it was better than the blank screen you get waiting for the plugin to load and provided a nice custom animated loading gif instead of the default applet loading logo.
  • Think I will buy a couple dozen copies of the ebook version and send it to many of the sites that get slashdotted every time they are posted.
  • Why?

    All my pages are static HTML. Not a web application in sight, not even PHP. Yes, it's a drag when I need to do some kind of sitewide update, like adding a navigation item.

    I also have less to worry about where security is concerned: as long as my hosting service keeps their patches up to date, I know I haven't introduced any holes myself.

    Also, for the most part, my pages are very light on graphics, with most of the graphics present being repeated on every page such as my site's logo, which gets cached.

    Finally, all

    • by NickFitz (5849) <slashdot AT nickfitz DOT co DOT uk> on Wednesday October 10, 2007 @03:01PM (#20930483) Homepage

      You forgot to link to your site... [amish.org]

      • by StikyPad (445176)
        My website [tallyhouniforms.com] is not only 99% pure HTML, it can save you time and money on air travel, AND you can tell everyone you got laid [tallyhouniforms.com]. Perfect for people who never get laid.
    • by Sciros (986030) on Wednesday October 10, 2007 @03:09PM (#20930591) Journal
      Why?
      It's a bicycle!!1
    • Solution (Score:5, Interesting)

      by dsginter (104154) on Wednesday October 10, 2007 @03:09PM (#20930597)
      All my pages are static HTML. Not a web application in sight, not even PHP.

      This is a great point, but here is my anecdotal experience:

      Years ago, I tested static HTML vs. PHP by simply benchmarking a simple document (I used the GPL license). On the particular box, I was able to serve over 400 pages per second with static HTML but only about 12 pages per second with PHP. I was blown away. I went one step further and used PHP to fetch the data from Oracle (OCI8, IIRC) and that went down to 3 requests/sec. You can see that caching does help, but not a whole lot.

      So, rather than whine about it, what is the solution?

      AJAX, done properly, will solve the problem. Basically, instead of serving dynamic pages with PHP, JSP, ASP or whatever... just serve an AJAX client (which is served in a speedy manner with no server side processing to bog things down). This client loads in the browser and fetches a static XML document from the server and then uses the viewer's browser to generate the page - so everything thrown down by the server is static and all processing is done on the client side.

      Now, to facilitate a dynamic website (e.g. - message board, journal, or whatever), you have to generate the XML file upon insert (which are generally a small fraction of the read load) using a trigger or embedded in the code.

      Voilà! Static performance with dynamic content using browser-side processing.
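
The "generate the XML file upon insert" step described in the comment above could look roughly like the following sketch. It is only an illustration under stated assumptions: the in-memory list stands in for the database, and `add_post`, `write_feed`, and the element names are hypothetical, not taken from the comment.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_feed(posts, path):
    """Regenerate the static XML file from the full list of posts.

    The document is written to a temporary file and then renamed, so a
    client fetching the file never sees a half-written version.
    """
    root = ET.Element("posts")
    for post in posts:
        item = ET.SubElement(root, "post", id=str(post["id"]))
        ET.SubElement(item, "author").text = post["author"]
        ET.SubElement(item, "body").text = post["body"]
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        ET.ElementTree(root).write(handle, encoding="utf-8", xml_declaration=True)
    os.replace(tmp_path, path)


def add_post(posts, author, body, path):
    """Hypothetical insert hook: store the post, then rebuild the static XML."""
    posts.append({"id": len(posts) + 1, "author": author, "body": body})
    write_feed(posts, path)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "posts.xml")
    store = []  # stands in for the database table
    add_post(store, "alice", "First post", target)
    add_post(store, "bob", "Second post", target)
    with open(target, encoding="utf-8") as handle:
        print(handle.read())
```

The cost of rebuilding the file is paid once per write, which matches the comment's premise that inserts are a small fraction of the read load; the replies below raise the objections (JavaScript-disabled clients, search indexing) that this sketch does nothing to address.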
      • As a solution to speed alone, the right answer (as some other posts mentioned) is a CMS/publishing solution that makes static HTML pages once on a change. The most braindead way to do this is to put an aggressive squid/apache cache in front of your server, and only refresh the cache every half-hour or on demand; nobody gets to go directly to the dynamic site and you have a minimal investment in the conversion. But certainly just using an automated system to write-out HTML files works too.

        Using AJAX you ha
        • Re: (Score:3, Informative)

          by Anonymous Coward
          Uh... most implementations of Ajax are used in conjunction with a server side programming language of some sort. The only performance boost is that you don't have to reload the entire page... only the part that you need to update. The obvious drawback is that if users don't have javascript enabled you have eliminated your users... or you write a second site to handle users without javascript. It can be used to help, but be careful of the suggestions you carelessly throw out there.
      • Re:Solution (Score:5, Informative)

        by chez69 (135760) on Wednesday October 10, 2007 @04:14PM (#20931605) Homepage Journal
        sounds good, except you may or may not know that a lot of javascript implementations are sloooow. not to mention you usually have to set the no cache headers for everything in the page so your javascript works right.

        I find that sites built with the method you describe are the asshole sites that fuck with browser history, disable the back button, try to disable the context menu, and those dumb ass tricks to get around the fact they don't know how to write proper server side code.

        There's no reason you can't make a fast serverside site (with ajax too, that works without the stupid tricks I described above). If you can't, I suggest you educate yourself, or don't use a Walmart PC for production use.

        I've personally written many J2EE webapps (no EJB BS, spring & struts & jsp/velocity) that were very fast. With proper coding you can let the browser cache stuff so it doesn't constantly have to refetch crap. When you do this, all you push down to the client is the HTML to render, which browsers are really good at doing quickly.
        • These sites are pretty useful when it comes to planning high performance websites:

          http://highscalability.com/ [highscalability.com]
          http://www.allthingsdistributed.com/ [allthingsdistributed.com]
        • Re: (Score:3, Insightful)

          by dsginter (104154)
          Wow - this is wonderful, constructive feedback. But allow me to make some suggestions on your wording. For example, the following statement:

          sounds good, except you may or may not know that a lot of javascript implementations are sloooow. not to mention you usually have to set the no cache headers for everything in the page so your javascript works right.

          I find that sites built with the method you describe are the asshole sites that fuck with browser history, disable the back button, try to disable the con
          • by chez69 (135760)
            Yeah, i'm an old mofo. I've worked professionally for a while. You refute what I said with I don't "get it". You really didn't refute my claim that a full ajax web client is dumb.

            I've used ajax type techniques in many projects. I know what it does well, and what it doesn't.

            I'm assuming that you're talking about sites where the user never submits data. Using no serverside validation on user input is completely retarded, just ask slashdot what happened when they didn't scrub the input from their users for j
      • Years ago, I tested static HTML vs. PHP by simply benchmarking a simple document (I used the GPL license). On the particular box, I was able to serve over 400 pages per second with static HTML but only about 12 pages per second with PHP. I was blown away. I went one step further and used PHP to fetch the data from Oracle (OCI8, IIRC) and that went down to 3 requests/sec. You can see that caching does help, but not a whole lot.

        12 page/sec eh? You didn't put a busy wait in there did you? I've never seen per

      • Not to mention that that particular approach is probably a huge no-no when it comes to accessibility and search indexing. I mean, do you really expect Google to run all of your scripts when it spiders your page?
      • Now, to facilitate a dynamic website (e.g. - message board, journal, or whatever), you have to generate the XML file upon insert (which are generally a small fraction of the read load) using a trigger or embedded in the code.
        This is silly.
        If you have to "generate the XML file" every time the data changes, why not just write an x/html file and serve it?
        Even better, why not cache the x/html file instead of generating it all the time?
        • I think the answer is this:

          You generate XML for each piece of content. Multiple templates can then serve the dynamic content while the templates themselves are static. You only need to generate the content once, regardless of how many templates it appears in.

          That said, I still think generating the entire pages is the way to go. Fewer requests should give better performance. Not to mention the other issues (search engine friendly (or other non-JS aware), proper back/forward support, less work for the browser m
      • precompile your HTML (Score:3, Informative)

        by victorvodka (597971)
        One solution that gives you a dynamic website with the advantages of a database and server-side scripting is to precompile your site to static HTML - you update it by recompiling more HTML. It can be done fairly transparently, with all the actual precompiling happening via automatic scripts. Obviously you can't have a user-login-based site work effectively this way, but for a site of modest dynamics (such as a blog, product catalog, or even some message boards), pre-compiling to HTML can be a real benefit.
      • Better yet, use a system like Rails' page caching.

        With URL rewrite rules, have your server check for a static page matching your URL (e.g. index.html -> index.html.cache). If you get a 404, pass the request to your interpreter of choice, and write out the result to a cache. After the initial request, it's just static pages.

        If your site is more complex, use fragment caching, [rubyonrails.com] which sounds like the solution you've described.
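
The check-the-cache-then-generate flow described in the comment above can be sketched as follows. In Rails the lookup is done by web server rewrite rules; this sketch folds it into application code purely for illustration, and `cached_page`, `expire_page`, the `.cache` suffix handling, and the `render` callable are all assumptions.

```python
import os
import tempfile


def cache_file_for(url_path, cache_dir):
    """Map a request path to its static copy, e.g. /index.html -> index.html.cache."""
    name = url_path.strip("/").replace("/", "_") or "index.html"
    return os.path.join(cache_dir, name + ".cache")


def cached_page(url_path, cache_dir, render):
    """Return the page body, calling the dynamic renderer only on a cache miss."""
    path = cache_file_for(url_path, cache_dir)
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return handle.read()  # hit: served from the static copy
    except FileNotFoundError:
        body = render(url_path)   # miss: fall back to the interpreter
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return body


def expire_page(url_path, cache_dir):
    """Delete the static copy so the next request regenerates it."""
    try:
        os.remove(cache_file_for(url_path, cache_dir))
    except FileNotFoundError:
        pass


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    render = lambda path: "<html><body>generated for %s</body></html>" % path
    print(cached_page("/index.html", directory, render))  # rendered, then stored
    print(cached_page("/index.html", directory, render))  # read back from disk
```

Calling `expire_page` whenever the underlying data changes keeps the static copy from going stale, which is the same trade-off the earlier comments make between regenerating on write and regenerating on read.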

      • The other solution might have been to use something not as dog slow as PHP. OK, that is trolling, it is a lot better these days, especially with caching compilers and the like.

        If you want really fast and scalable, AOLserver still blows PHP away today (like it did at the time you were testing), despite PHP's (and Apache's!) improvements.

        AJAX is a terrible solution in my opinion. First of all, it doesn't free you from server-side processing; all you are doing is caching what would otherwise have come from the
      • by kbahey (102895)

        Years ago, I tested static HTML vs. PHP by simply benchmarking a simple document (I used the GPL license). On the particular box, I was able to serve over 400 pages per second with static HTML but only about 12 pages per second with PHP. I was blown away. I went one step further and used PHP to fetch the data from Oracle (OCI8, IIRC) and that went down to 3 requests/sec. You can see that caching does help, but not a whole lot.

        PHP by default can be slow, because it has to be parsed, tokenized and then execut

        • by kbahey (102895)
          The 120 requests per second are not all PHP pages of course. They are at the Apache level, and include .js, .css, and graphics that form the page in aggregate.
        • Bad policy: first you just excluded all the non-javascript browsers, and those who disable javascript. Second, the client side parsing can be intensive on the CPU of the machine, and hence the user can feel slowness. Third, you do not solve the database backend issue.

          You may want to check AHAH, which is basically AJAX without the XML parsing part. Straight HTML to the browser.

          I'm not sure you intended to say it does, but that doesn't really solve any of the problems you list for AJAX. It might solve the cl

      • The idea of the book, and real life, is to optimize the user experience, not get insane server efficiency at the expense of simplicity.

        It's possible (and easy) to write PHP scripts that do 50 or more selects, a few inserts and updates, tons of string manipulation and still load in about a second over a decent connection. Besides, when every page you generate is dynamic, and the content changes every second, the only things cache-able are images, scripts and style-sheets. So you generate the HTML on the fly,
    • by Anonymous Coward on Wednesday October 10, 2007 @03:10PM (#20930615)

      In other words, it's a smalltime hobby site, and you're not a web developer. That's fine, and I agree that it's quite nice and reassuring to simplify like this where possible. However...

      Go on out into the job market advertising your incredible "static page" skills, and what lightning-fast load times you'll bring to your employer. Offer to convert their entire 20GB of online content to static XHTML 1.0 Strict to obtain the peace of mind that comes with knowing you haven't introduced any holes yourself. Hell, I'm going to go right now and submit a patch to MediaWiki that generates static versions of every article and then deletes all the PHP from the entire web root! I'm sure as soon as I tell them about the performance boost, they'll be right on board!

      • by jgrahn (181062)

        In other words, it's a smalltime hobby site, and you're not a web developer. That's fine, and I agree that it's quite nice and reassuring to simplify like this where possible. However...

        I think the grandparent thinks of it as "not complicating" rather than as "simplifying" ...

        I also find it slightly amusing that you can tell from this that he's not a web developer ;-)

    • by Dekortage (697532) on Wednesday October 10, 2007 @03:11PM (#20930633) Homepage

      All my pages are static HTML. Not a web application in sight, not even PHP. Yes, it's a drag when I need to do some kind of sitewide update, like adding a navigation item.

      Umm... there are plenty of content management systems (say, Cascade [hannonhill.com]) that manage content and publish it out to HTML. Even Dreamweaver's templating system will do this. Just because you use pure HTML, doesn't mean you have to lose out on sitewide management control.

      • Are there any good FOSS solutions which work on this principle? I have home rolled a CMS which stores stuff in a MySQL database but writes it all out to static HTML files which I can upload to the server via rsync. Advantages are that should I move college I just can rsync my site to the new web server without having to get and maintain a database on it.
    • Re: (Score:2, Informative)

      by perlwolf (903757)
      Your music page [geometricvisions.com] fails the XHTML W3C check though it says it's XHTML compliant.
      • The page used to validate, but the W3C validation service recently added a requirement that the html element declare its namespace.

        I've been adding it to my pages as I work on them, but I haven't worked on that page for a while.

        And yeah, the need for such sitewide updates is the best argument against my method.

        • by oni (41625)
          I've been adding it to my pages as I work on them, but I haven't worked on that page for a while.

          You should use something like PHP. Then you could have an includable header... oh wait, nevermind.
    • I bet you don't even own a television [theonion.com]
    • What is it with technology advances and people? Is there an old-fart gene operating here? Every damn time someone talks about a new technology someone has to pipe up with the "I build systems out of sand and raw electrons" argument as if that somehow is attached to great achievement and moral superiority. Use the tools you pretentious Luddites.
      • by thc69 (98798)
        What is it with technology advances and people? Is there a thoughtless idiot genre operating here? Every damn time someone pipes up with "I build systems out of sand and raw electrons", some idiot starts babbling about using the latest and greatest...while conveniently ignoring that it's results that matter, not how pretty your tools are.
        • It's actually the most cost-effective results that matter. The cost effectiveness of the various approaches usually determines how you get there and therefore the tools you should use. The prettiness or otherwise of your tools is unimportant, as you say. Sometimes ugly is good, sometimes pretty is good. And sometimes pretty ugly is good too!
  • by Ant P. (974313)
    Why is gzip the only content encoding option browsers support? It seems to me they'd be better off supporting something like bzip2 since it works far better on plain text.
    • Bzip2 consumes far more memory and CPU cycles than gzip. There are a lot of scenarios where this tradeoff isn't desirable for a busy webserver.
    • Re:gzip (Score:4, Insightful)

      by cperciva (102828) on Wednesday October 10, 2007 @03:28PM (#20930875) Homepage
      Unlike bzip2, gzip is a streaming compression format; so the web browser can start parsing the first part of a page while the rest is still being downloaded.
      • by LordVorp (988488)
        Oh yeah? Not contradicting you, by any means, but can you name one HTTP server program that actually DOES this? All of my research shows that the best you can hope for is a GZIP document broken into HTTP/1.1 Chunked transfer-encoding bits.
        • by cperciva (102828)
          I didn't say that HTTP servers could take advantage of the fact that gzip is a streaming compressor -- I said that HTTP clients could take advantage of that. Even if the server generates the entire compressed response before it starts to send anything back to the client, using a streaming compression format allows the client to overlap HTML parsing with the portion of download time which results from having finite bandwidth.
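
The client-side benefit described in the comment above can be illustrated with Python's `zlib` module, which exposes an incremental decompressor. This is a sketch of the general idea rather than of how any particular browser works; the 64-byte "packets" and the sample page are made-up values.

```python
import gzip
import zlib


def stream_decompress(chunks):
    """Yield decompressed bytes as each compressed chunk arrives."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    tail = decoder.flush()
    if tail:
        yield tail


def split(data, size):
    """Cut a byte string into fixed-size pieces to imitate network packets."""
    return [data[i:i + size] for i in range(0, len(data), size)]


if __name__ == "__main__":
    rows = b"".join(b"<p>row %d</p>" % i for i in range(5000))
    page = b"<html><body>" + rows + b"</body></html>"
    packets = split(gzip.compress(page), 64)
    received = 0
    for part in stream_decompress(packets):
        received += len(part)  # a browser could hand `part` to its HTML parser here
    print(len(packets), "packets,", received, "bytes of HTML recovered")
```

Because usable output becomes available before the last packet has arrived, parsing can overlap with the remainder of the download, which is the point being made about finite bandwidth.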
          • Firefox 2 and IE7 do indeed begin to display gzip-compressed pages while they are still loading over the net. The method used to verify this was to insert a local Squid cache that uses "delay pools" to limit transfer bandwidth and that records time and duration of all network transfers made. Using this method, I could see that a lengthy compressed HTML page was transferred in 14 seconds, and the content became visible in IE7 after 6 seconds and finished loading after 18 seconds.

            If you have a physically

        • by sphix42 (144155)
          Nope. Lost all my namespace to pharmies. Oxycontin: it's a line; like Oycontin?
  • by QuietLagoon (813062) on Wednesday October 10, 2007 @02:59PM (#20930453)
    ... if Yahoo's website were not dog slow all the time.
  • by redelm (54142) on Wednesday October 10, 2007 @03:02PM (#20930493) Homepage
    If you're responsible for the response time of some webpages, then you've got to do your job! First test a simple static webpage for a baseline.

    Then every added feature has to be justified -- perceived added value versus cost-to-load. Sure, the artsies won't like you. But it isn't your decision or theirs. Management must decide.

    For greater sophistication, you can measure your dl rates by file to see how much is in users' caches. And decide whether these are also not a cause of slowness!

  • by morari (1080535)
    I hate that the typical webpage assumes that everyone has broadband these days. The finesse and minimalist approach of yesteryear no longer applies. Even with broadband at 100%, smaller is always better. No one wants to put the effort in that would go toward efficiency though.
    • Re: (Score:3, Insightful)

      by guaigean (867316)

      No one wants to put the effort in that would go toward efficiency though.
      That's not an accurate statement. A LARGE amount of time is spent on the very big sites to maximize efficiency. It is the largest of sites that truly see the benefits of optimization, as it can mean very large savings in fewer servers, bandwidth fees, etc. A better statement might be "People with low traffic sites don't want to put the effort in that would go toward efficiency though."
  • This is a great idea for a book--hope the execution lives up to the idea. Without having read the book, I will venture some obvious things:
    * Profile your app before you optimize. Don't guess where you are slow: know.
    * If you use Struts, don't do client-side validation. (Look at the mass of JavaScript that gets added to your page if you question this.)
    * Use AJAX if you can. (Also an amazing speed boost).
    * Use few images.
    * Do AJAX validation without leaving the page.
  • That web site designers don't give a flying fuck about the speed their web sites load.

     
    • Seems true. MeeVee is my current peeve. Lots of cool eye-candy, pop-up show descriptions and such but all I really want is to know when "Reaper" or "Nova" is on. Google would have it back to me in plain html in a fraction of a second - sponsored ads included. Loading the guide page of MeeVee involved 114 separate requests, 900kB of data and took over 20 seconds to load and render. And that's connecting via T1. Poor suckers using modem or mobile connections are SOL.
    • Re: (Score:3, Interesting)

      by PHPfanboy (841183)
      From my observations of web developers in the wild this does seem to be scientifically true. Actually, most web developers are so overloaded with projects that even if they did give a shit, they simply don't have time to benchmark, test and optimize properly.

      It's not an excuse, it's just that teams are so fluid, project work is chaotic and project management is driven by marketing considerations (read: "get it out", not "enterprise stability") that site performance is seen as a server hardware issue.

      Shame r
    • by cjjjer (530715)
      Ahh no, it's the kiss-ass salespeople who fill the client's head full of eye candy from other sites.

      "Oh look your competitor has all these cool gradient images, border images and flash up the ass, how cool would it be to have something like that in your site".
  • by Josef Meixner (1020161) on Wednesday October 10, 2007 @03:16PM (#20930723) Homepage

    I guess I am not alone in noticing that often the ads on a page drag the load time way down. I find it interesting, that there is no rule about minimizing content dragged in from other servers you have no or little control over. Blind spot because of Yahoo's business, I guess.

    • by demonbug (309515)
      I guess I am not alone in noticing that often the ads on a page drag the load time way down.

      This is especially annoying on sites where the ads are apparently forced to load before things like the text (i.e., the content I am actually looking for) render. Anandtech used to really piss me off in this respect - the ad server would take forever, and there was nothing to read until the ads loaded (haven't noticed this behavior lately).
      I suppose I might be able to block the ads, but it is my feeling that as long
    • On the nose, Josef.

      How much time have we all spent looking at a blank browser window with "...completed 12 of 13 items." at the bottom?

      Whatever else [spacetoast.net] I might think of it, Facebook has a nice trick that appears to work as follows. The page loads with a blank graphic where the ad should be. Afterward, an onLoad script fires requesting the ad and replacing the blank graphic with it. The ads take a moment to load: the page is instantly on. Proper priorities.

      (As a corollary, I've got a Dice ad at the to

    • by NickFitz (5849)

      I find it interesting, that there is no rule about minimizing content dragged in from other servers you have no or little control over.

      This is covered by rule 9 (reduce DNS lookups) and rule 1 (make fewer HTTP requests).

      You shouldn't confuse what's in a book review with what's in the book itself...
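
      To make rules 1 and 9 concrete, here is a minimal sketch (not from the book) that counts the components a page pulls in and the distinct hostnames behind them, using only Python's standard library. The class name, the function name, the sample markup and the example.* hosts are all made up for illustration, and it only looks at img, script, iframe and stylesheet link tags, so treat the numbers as a rough lower bound.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ComponentCounter(HTMLParser):
    """Collect the URLs of components a browser would fetch for a page.

    Hypothetical helper for illustration; it ignores CSS background
    images, fonts, and anything loaded later by scripts.
    """

    # tag -> attribute that names the fetched resource
    FETCH_ATTRS = {"img": "src", "script": "src", "iframe": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attr = self.FETCH_ATTRS.get(tag)
        if attr is None:
            return
        attrs = dict(attrs)
        # Only stylesheet links trigger a fetch we care about here.
        if tag == "link" and "stylesheet" not in (attrs.get("rel") or "").lower():
            return
        value = attrs.get(attr)
        if value:
            self.urls.append(urljoin(self.base_url, value))


def summarize(html, base_url):
    """Return request count, host count, and hosts other than the page's own."""
    counter = ComponentCounter(base_url)
    counter.feed(html)
    hosts = {urlparse(url).hostname for url in counter.urls}
    own_host = urlparse(base_url).hostname
    return {
        "requests": len(counter.urls),        # rule 1: fewer is better
        "hosts": len(hosts),                  # rule 9: each new host is a DNS lookup
        "third_party_hosts": sorted(hosts - {own_host}),
    }


# Made-up page with two first-party and two third-party components.
sample = """
<html><head>
<link rel="stylesheet" href="/site.css">
<script src="http://ads.example.net/show.js"></script>
</head><body>
<img src="logo.gif">
<img src="http://stats.example.org/pixel.gif">
</body></html>
"""

report = summarize(sample, "http://www.example.com/index.html")
print(report)
```

      The third_party_hosts list is the part the parent post is complaining about: every entry there is a server you have little or no control over.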

  • by zestyping (928433) on Wednesday October 10, 2007 @03:23PM (#20930793) Homepage
    The title of the book should be "High Speed Web Sites" or just "Fast Web Sites."

    "Performance" is not a general-purpose synonym for "speed." "Performance" is a much more general term; it can refer to memory utilization, fault tolerance, uptime, accuracy, low error rate, user productivity, user satisfaction, throughput, and many other things. A lot of people like to say "performance" just because it's a longer word and it makes them sound smart. But this habit just makes them sound fake -- and more importantly, it encourages people to ignore all the other factors that make up the bigger picture. This book is all about speed, and the title should reflect that.

    So, I beg you: resist the pull of unnecessary jargon. The next time you are about to call something "performance," stop and think; if there's a simpler or more precise word for what you really mean, use it.

    Thanks for listening!
    • by improfane (855034)

      lot of people like to say "performance" just because it's a longer word and it makes them sound smart.

      I see where you are coming from but I don't like this quote. Performance was a decent term to use since this covers a lot of ground. A fast performing website 'performs' well for the user. All your examples are factors of a well performing website:

      memory utilization [browser uses less memory]
      low error rate [user doesn't make so many mistakes, doesn't misclick something due to lag, doesn't forget what they

      • by zestyping (928433)
        Sure, many of these factors are related. But what is the book really about?

        It isn't a book about making users more productive, or a book about reducing user error. This is a book about making websites fast. The other factors are only peripheral effects. Those 14 rules that Souders is pushing are all about speed, not these other factors.

        Making a website fast may improve many things about the experience. But speed is not the only thing you need to make a website perform well.

        The reviewer makes the same m
  • Odd Summary (Score:4, Insightful)

    by hellfire (86129) <deviladv@@@gmail...com> on Wednesday October 10, 2007 @03:24PM (#20930815) Homepage
    Web developers often assume that most page-loading performance problems originate on the back-end, and thus the developers have little control over performance on the front-end, i.e., directly in the visitor's browser. But Steve Souders, head of site performance at Yahoo, argues otherwise in his book, High Performance Web Sites: Essential Knowledge for Frontend Engineers."

    Let's correct this summary a little bit. First, it's NOVICE Web developers who would think this. Any web developer worth their salt knows the basic idea that Java, Flash, and other things like them make a PC work hard. The website sends code, but the PC has to execute the code, rather than the website pushing static or dynamic HTML and having it simply render. We bitch and moan enough here on Slashdot about Flash/Java-heavy pages, so I feel this summary is misdirected, as if web developers here didn't know this.

    Secondly, there's no argument, so Steve doesn't have to argue with anyone. It's a commonly accepted principle. If someone didn't learn it yet, they simply haven't learned it yet.

    Now, I welcome a book like this because #1 it's a great tool for novices to understand the principle of optimization on both the server and the PC, and #2 because it hopefully has tips that even the above average admin will learn from. But I scratch my head when the summary makes it sound like it's a new concept.

    Pardon me for nitpicking.
    • by mcmonkey (96054)
      Well, as the summary itself sez:

      As with any book, this one is not perfect -- nor is any work.
    • Sort of... (Score:3, Insightful)

      It's really irrelevant whether they actually understand the real problem or not when what they do is broken. I don't care if they really don't know or just have a mandate from someone who doesn't know or if they're just too clueless to realize that what happens on their high end system on their high speed LAN has little to do with what Jenny and Joey Average see at home on their cheap Compaq from WalMart with about half the RAM it should have for their current version of Bloated OS. The end result is the
  • by lgordon (103004) <larry...gordon@@@gmail...com> on Wednesday October 10, 2007 @03:28PM (#20930885) Journal
    Banner ads are what cause most page loading time, and it's usually more a fault of the browser renderer than anything else. A lot of times these JavaScript ad servers are horrible performance-wise. It can also be the fault of the ad networking company when their servers get overloaded, causing undue delay before the ad is served to the client. Something to think about when choosing ad placement on a site.

    Installing an ad blocker of some sort, such as Adblock Plus for Mozilla, is a great way to speed up any page (from the user's point of view, of course).
    • by shmlco (594907)
      Ads from third-party sites. Scripts and trackers from third-party sites (like Google Analytics or page counters). Scripted web page widgets from third-party sites.

      Basically anything that's not under your control can slow your site down significantly.
  • I wish it did focus on the backend more. Optimization is the second biggest problem with software these days, security/stability being number one.

    Web development is especially bad at optimization. This thread demonstrates the problem:
    http://forums.devnetwork.net/viewtopic.php?t=74613 [devnetwork.net]

    People there are actually recommending you wait until your server fails before you look to optimize.
    • by Shados (741919)
      While waiting for it to fail is pretty extreme, "premature optimization is the root of all evil".

      The problem is poor requirement specifications. I worked for so many companies that had all the pretty UML architecture and usecases down, but no requirement specifications.

      Uptime requirements, response time, security requirements (so all around QOS), maintenance requirements, support, ANYTHING would help, but they don't. So when the application (web or otherwise) is slow, the developers don't know if it's "good enou
  • by spikeham (324079) on Wednesday October 10, 2007 @03:39PM (#20931109)
    In the mid-90s Yahoo! pared down every variable and path in their HTML to get the minimum document size and thus fastest loading. You'd see stuff in their HTML like img src=a/b.gif and a minimum of spaces and newlines. However, back then most people had dialup Internet access and a few KB made a noticeable difference. In the past few years, mainstream Web sites pretty much assume broadband. Don't bother visiting YouTube or MySpace if you're still on a modem. Aside from graphics and videos, one of the main sources of bloat is Web 2.0. Look at the source of a Web 2.0 site, even Yahoo!, and often you see 4 times as many bytes of Javascript as HTML. All that script content not only has to be retrieved from the server, but also takes time to evaluate on the client. Google is one of the few heavily visited sites that has kept their main page to a bare minimum of plain HTML, and it is reflected in their popularity. If you visit a page 10 times a day you don't want to be slowed down by fancy shmancy embedded dynamic AJAX controls.

    - Spike
    Freeware OpenGL arcade game SOL, competitor in the 2008 Independent Games Festival: http://www.mounthamill.com/sol.html [mounthamill.com]
    • While talking about Yahoo ... I've been using yahoo mail since the late 90's. I don't use it for everything but since I've had that e-mail address for almost 10 years now I still use it for certain purposes.

      Now, I'm on a "modern" PC (1GB RAM, 1.8GHz CPU ... starting to show its age but still plenty capable of surfing YouTube and MySpace if I'm so inclined) and I have a business-grade cable line that I pay extra for to get more bandwidth since I'm running a home business and I have to transfer a lot of data
    • by saforrest (184929)
      In the mid-90s Yahoo! pared down every variable and path in their HTML to get the minimum document size and thus fastest loading.

      "Variable and path" in HTML? What are you talking about?

      Anyway, Yahoo's site started off relatively small in '95 or so, as most sites were then. But as I remember it, they were one of the first to unleash those bloated late-nineties "portal" sites, complete with stock ticker, 14-day forecast, and the latest celebrity gossip.

      I hardly think they were terribly concerned about fast
      • by KlomDark (6370)
        > "Variable and path" in HTML? What are you talking about?

        Swap "Filename and path" for "Variable and path" and you will glean enlightenment.
  • by mr_mischief (456295) on Wednesday October 10, 2007 @03:41PM (#20931147) Journal
    The new interface is a joke for performance compared to the old server-generated HTML one. Sure, they might be saving some hardware resources, but it's slow, and the message bodies are the bulk of the data anyway. The main transfers they cut out using JavaScript and dynamic loading seem to be updates to the message list when you delete a bunch of spam. That would be better handled by putting it in the spam folder where it belongs. OTOH, I often delete non-spam messages without reading them as I do subscribe to a few legit mailing lists from my Yahoo address but don't want to read every message.
  • by geekoid (135745) <dadinportland @ y a hoo.com> on Wednesday October 10, 2007 @03:52PM (#20931311) Homepage Journal
    "Web developers often assume that most page-loading performance problems originate on the back-end, and thus the developers have little control over performance on the front-end,"

    Those Web designers should be called "Unemployed"
  • The book is a quick read compared to most technical books, and not just due to its relatively small size (168 pages), but also the writing style. Admittedly, this may be partly the result of O'Reilly's in-house and perhaps outsource editors -- oftentimes the unsung heroes of publishing enterprises.

    So not only do they now outsource the web page designers, they are outsourcing the technical writers?
    What's next? Outsource the audience?
  • ISBN redundancy (Score:3, Informative)

    by merreborn (853723) on Wednesday October 10, 2007 @04:23PM (#20931741) Journal
    FTFA:

    High Performance Web Sites was published on 11 September 2007, by O'Reilly Media, under the ISBNs 0596529309 and 978-0596529307

    There's no need to list both the ISBN 10 and the ISBN 13. ISBN 13 is a superset of ISBN10. Notice that both numbers contain the exact same 9 data digits:
    0596529309
    9780596529307

    The only difference is the 978 "bookland" region has been prepended, and the check digit has been recalculated (using the EAN/UPC algorithm, instead of ISBN's old algo). You can just give the ISBN 10, or just the ISBN 13. You can trivially calculate one from the other. All software that deals with ISBNs should do this for you. e.g., if you search either the ISBN13 or ISBN10 on amazon, you'll end up at the exact same page.
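
    For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion in both directions, using the two numbers above. The function names are made up for illustration. One caveat worth stating: going from ISBN-13 back to ISBN-10 only works for numbers with the 978 prefix.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Prefix 978, drop the old check digit, recompute the EAN-13 check digit."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    body = "978" + digits[:9]
    # EAN-13 check digit: weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(body))
    return body + str((10 - total % 10) % 10)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Strip the 978 prefix and recompute the mod-11 ISBN-10 check digit."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    body = digits[3:12]
    # ISBN-10 check digit: weights 10 down to 2, modulo 11, with 'X' meaning 10.
    total = sum(int(ch) * weight for ch, weight in zip(body, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return body + ("X" if check == 10 else str(check))


print(isbn10_to_isbn13("0596529309"))      # the ISBN-10 from the review
print(isbn13_to_isbn10("978-0596529307"))  # the ISBN-13 from the review
```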
  • Use Varnish HTTP cache http://en.wikipedia.org/wiki/Varnish_cache [wikipedia.org]
    It's designed from the ground up as an HTTP accelerator. It's extremely fast, in most cases way faster than Squid. However, if you rely a lot on cookies you should look somewhere else.
  • Squeezing a few milliseconds here and there using clever optimisation is fine (and worth doing) but isn't the whole objective defeated somewhat as soon as you have to embed adverts from the major ad-delivery networks (which most sites of any size do)?

    I have lost countless hours of my life waiting for pages to render while they suck down banner ads from overloaded delivery networks (e.g. Falkag).
  • by SirJorgelOfBorgel (897488) * on Wednesday October 10, 2007 @04:40PM (#20931979)
    I have read a large number of excerpts (one for every paragraph) of this book in response to a mention of this book in the #jquery IRC channel. A few people were very much anticipating this book. A lot of discussion followed on some of the subjects. Of course, this book makes some very good points, like how the front-end speed is important and only partially dependent on server response times. I will not go into the specifics (I could write a book myself :D), but on some things, you might think the author is smoking crack.

    I have looked at the book again now, and there seem to have been some changes. For example, there were only 13 rules when I was reviewing those before. Now there are 14. As one example, ETags were advised to not be used at all (IIRC, my biggest WTF about the book - if used correctly, ETags are marvellous things and complement 'expires' very nicely), instead of the current 'only use if done correctly'. Some other things are nigh impossible to do correctly cross-browser (think ETag + GZIP combo in IE6, AJAX caching in IE7, etc). To be honest, I found pretty much all of this stuff to be WebDevelopment 101. If you're not at the level that you should be able to figure most of these things out for yourself, you probably won't be able to put them into practice anyway, and you should not be in a place where you are responsible for these things.

    I might pick up this book just to read it again, see about the changes and read the full chapters, just to hear the 'other side of the story', but IMHO this book isn't worth it. In all honesty, the only thing I got out of it so far that I didn't know is the performance toll CSS expressions take (all expressions are literally re-evaluated at every mouse move), but I hardly used those anyways (only to fix IE6 bugs), and in response have written a jQuery plugin that does the required work at only the wanted times (and I've told you this now, so no need to buy the book).

    My conclusion, based solely on the fairly large number of excerpts I've read is: if you're a beginner, hold off on this book for a while. If you're past the beginner stage but your pages are strangely sluggish, this book is for you. If you've been around, you already know all this stuff.
  • a broadband or narrowband connection
    Suggest "fast or slow connection".
  • Actually, every user's impressions are created by Flash:

    Some think Flash is essential to the web browsing experience and a site without Flash is not worth the bother.

    Others think that a site with Flash is sure evidence of a triumph of style over content, and guarantees it's not worth waiting for it to load.

    Since Adobe chooses not to support FreeBSD, it's fairly clear that FreeBSD users all fall in the second category. You will have to do other analyses yourself.

  • I know it's not totally relevant, but I just wanted to vent. Their website is ALWAYS horribly slow compared to all the others I frequent no matter the state of the pipe on my end. It amazes me that a company that big hasn't figured out where the bottlenecks are in their architecture and fixed them, unless their basic architecture is the problem. Anyway, I feel marginally better.
  • The best practice is "don't weigh down your home page with a huge Flash animation"!
  • Huh? I thought this was a news site, not advertising. And why does it load so slowly?
  • I started reading the first chapter and I was surprised when I read the following paragraph:

    Gzip is currently the most popular and effective compression method. It is a free format (i.e., unencumbered by patents or other restrictions) developed by the GNU project and standardized by RFC 1952. The only other compression format you're likely to see is deflate, but it's slightly less effective and much less popular. In fact, I have seen only one site that uses deflate: msn.com. Browsers that support deflat

    • by Raphael (18701)

      Also in the sample chapter, "Table 4-2. Compression sizes using gzip and deflate" shows that gzip performs better than deflate:

      It's clear from Table 4-2 why gzip is typically the choice for compression. Gzip reduces the response by about 66% overall, while deflate reduces the response by 60%. For these files, gzip compresses ~6% more than deflate.

      Unfortunately, this table does not specify which compression settings were used for each method. Although both mod_gzip and mod_deflate should default to compr
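
      A quick way to see why the settings matter: the gzip format (RFC 1952) is a DEFLATE stream (RFC 1951) with a header and a CRC-32 trailer around it, and HTTP's "deflate" coding is nominally the zlib format (RFC 1950) around the same kind of stream. The sketch below, which uses Python's standard zlib and gzip modules rather than mod_gzip or mod_deflate, compresses one made-up payload at the same level in all three wrappers. The payload and the level are assumptions chosen for illustration, not the book's test files.

```python
import gzip
import zlib

# Hypothetical payload: repetitive HTML-like text, for illustration only.
payload = b"<div class='item'><a href='/story'>High Performance Web Sites</a></div>\n" * 200

LEVEL = 9  # same compression level for every wrapper


def raw_deflate(data: bytes, level: int = LEVEL) -> bytes:
    """Bare DEFLATE stream (RFC 1951): no header, no checksum."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


raw = raw_deflate(payload)                        # bare DEFLATE
zl = zlib.compress(payload, LEVEL)                # zlib wrapper (RFC 1950)
gz = gzip.compress(payload, compresslevel=LEVEL)  # gzip wrapper (RFC 1952)

for name, blob in (("raw deflate", raw), ("zlib", zl), ("gzip", gz)):
    saved = 100.0 * (1 - len(blob) / len(payload))
    print(f"{name:12s} {len(blob):6d} bytes  ({saved:.1f}% smaller)")
```

      At equal settings the wrappers differ only by their fixed header and trailer bytes, with gzip being the largest of the three. So a gap of several percentage points like the one in Table 4-2 would more plausibly come from different compression levels or configurations than from the formats themselves.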
