Follow Slashdot stories on Twitter

 



Forgot your password?
typodupeerror
×

Advanced Excel for Scientific Data Analysis 303

cgjherr writes "If the recent financial meltdown has left you wondering, 'When does exponential decay function stop?' then I have the book for you. Advanced Excel for Scientific Data Analysis is the kind of book that only comes along every twenty years. A tome so densely packed with scientific and mathematical formulas that it almost dares you to try and understand it all. A "For Dummies" book starts with a gentle introduction to the technology. This is more like a "for Mentats" book. It assumes that you know Excel very well. The first chapter alone will have you in awe as you see the author turn the lowly Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude." Read on for the rest of Jack's review.
Advanced Excel for Scientific Data Analysis
author Robert de Levie
pages 700
publisher Oxford Press
rating 9
reviewer Jack Herrington
ISBN 9780195370225
summary Use Excel for high end scientific data analysis akin to Mathemetica
When I first opened this book my mouth just dropped. It had been years since I had seen a book typeset using LaTeX. But in an instant it made sense as the book is crammed packed with the kind of equations that would have been a nightmare to build with any other tools. Chapter after chapter has everything a really smart person needs to do curve fitting, statistical measures, differential equations, time-frequency analysis. But don't expect a play by play here. You will get the equations, set within a few dense paragraphs, with maybe a spreadsheet and a chart or two to show the results.

The first chapter concentrates on the getting the most out of Excel as a tool. All the chapters that follow dig into specific data analysis techniques. Chapters two, three and four are on least squares. Chapter five and six cover the analysis in the time domain including fourier transforms. Chapter seven covers differential equations. Chapter eight returns to Excel by digging in deeper into macros. Which leads into chapter nine, where we dig deeper into basic mathematical operations. Chapter ten covers matrix operations. And chapter eleven wraps it all up by giving you some spreadsheet best practices.

In University style there are also some exercises that you can do along the way if you want to tweak your brain pan a little more. To amuse myself I tried a few and I believe the book would have assessed my attempts 'wanting' if it had a voice to tell me.

Where most books like this would have several authors this book has just one; Roberte de Levie. This means that the tone, style and quality of the book is consistent throughout. A fact that you will come to appreciate as the book wades in ever increasingly deep data analysis concepts as the chapters roll on.

Though I would have preferred the book to have code samples in C#, I understand that the language of Excel is VBA and I guess I have to live with that. Thankfully VBA has come a long way and if you so inclined it would likely be easy to translate the code into C#, Java, or whatever else you like.

The fact that one person wrote the book left me wondering, "Who is this guy?" In my minds eye I kinda of figured he would look like one of those pulsing brain guys from Star Trek. Turns out he is a professor at Bowdoin College. And his fields of study include ionic equilibria, electrochemical kinetics, electrochemical oscillators, stochastic processes, and a whole lot more stuff that almost seems made up to sound impressive.

When this book isn't serving as an amazing reference for both Excel, scientific problem solving, or just insane equations it serves other purposes as well. It's a handy portable IQ test, as the count of pages you can grind through in one sitting, plus 90, is roughly your intelligence quotient. And if you fail at that you can always put a copy of the book, along with the Orange Bible, under your pillow and try to osmose your way to becoming the Kwisatz Haderach.

In all seriousness, this is a great book. It represents the kind of in-depth work and research we used to see in books that came out twenty years ago. Robert is to be applauded for his work. This is an excellent resource for anyone looking to do scientific data analysis but who was unaware of the powerful capabilities that Excel provides that is likely waiting just one Startup menu click away.

The book is not without fault. I would have preferred that it had been in color, or at least have one color section to show some of the more impressive visualizations that I'm sure would look great in color. In addition the index is silly short for a book that clocks in at 700 pages. But those are only minor quibbles for what is all-in-all an amazing piece of work.

You can purchase Advanced Excel for Scientific Data Analysis from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
This discussion has been archived. No new comments can be posted.

Advanced Excel for Scientific Data Analysis

Comments Filter:
  • alternately.... (Score:5, Insightful)

    by mattdm ( 1931 ) on Wednesday October 01, 2008 @01:04PM (#25221095) Homepage

    Don't do it! [burns-stat.com]

  • Re:incongruous (Score:4, Insightful)

    by fbjon ( 692006 ) on Wednesday October 01, 2008 @01:09PM (#25221167) Homepage Journal
    Why isn't it a good idea, and does this apply equally to OpenOffice?
  • eh? (Score:4, Insightful)

    by Anonymous Coward on Wednesday October 01, 2008 @01:16PM (#25221273)

    "The first chapter alone will have you in awe as you see the author turn the lowly Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude."

    Then why not just use Mathematica?

  • Re:incongruous (Score:3, Insightful)

    by johnny cashed ( 590023 ) on Wednesday October 01, 2008 @01:26PM (#25221433) Homepage
    Well, 2007 has bugs in it. I don't use Excel, I use something that can utilize math correctly. Have you checked your spreadsheet program? Or do you just assume that Microsoft does everything correctly?
  • That's nothing (Score:5, Insightful)

    by MarkusQ ( 450076 ) on Wednesday October 01, 2008 @01:30PM (#25221483) Journal

    turn the lowly Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude

    So? What's so special about that? You can turn C, Fortran, or even assembly language into something that rivals Mathematica using brains and a heaping helping of fortitude. This is arguably a better deal, since you don't need the VBA.

    --MarkusQ

  • by Vornzog ( 409419 ) on Wednesday October 01, 2008 @01:30PM (#25221497)

    ...everything looks like a snowglobe!

    Hardcore data analysis in Excel is almost always a bad idea. You can almost always find a way to do it in excel, and you can almost always find a way to do it better, faster, and cheaper somewhere else.

    R, MatLab, Mathemateica, Python/Numpy, SigmaPlot, and any number of old, well written, debugged and vetted numerical libraries written in C or Fortran. I've used all of these at various times to solve something that a co-worker couldn't figure out how to do in Excel.

    I fit quick linear regressions in Excel. For *anything* else, there is a better choice.

  • Re:eh? (Score:5, Insightful)

    by goofballs ( 585077 ) on Wednesday October 01, 2008 @01:33PM (#25221519)

    "The first chapter alone will have you in awe as you see the author turn the lowly Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude."

    Then why not just use Mathematica?

    1. you want to interact directly with excel data you receive
    2. you need to give the results to someone w/out mathematica
    3. a license of mathematica costs $2500, vs $150 for Office Home and Student
  • by slashdotlurker ( 1113853 ) on Wednesday October 01, 2008 @01:35PM (#25221571)
    for scientific data analysis.

    I know it is popular and many science and engineering faculty lazily encourage their graduate students to use it. However, something like matlab beats the crap out of excel any day. Spreadsheets tend to obfuscate relationships between data, require a lot more clicking (read human intervention) and waste time that could be spent thinking about the data, and are singularly unsuited for analysis of similar sets of data (a situation any scientist faces when he has to do a series of experiments).
    Matlab might take sometime to initially write the scripts, but it is so powerful and extensible that no one in their right mind would want to use excel. If you are a slave to spreadsheets, get yourself a copy of Microcal Origin or Labplot.

    Excel is especially unsuited to the task of preparing figures for scientific publications. The default formatting is at once wrong for the task and hard to change. Once you set your preferences in matlab (easy to do), you are set for life.

    In my experience, excel is also rarely used for anything serious outside of US. Maybe its an indictment of how lazy, slow witted and easily misled our pool of talent is becoming.
  • Re:eh? (Score:2, Insightful)

    by The Dancing Panda ( 1321121 ) on Wednesday October 01, 2008 @01:35PM (#25221573)
    Because I already have Excel, and Mathematica is another 120 dollars?
  • Re:incongruous (Score:3, Insightful)

    by TarrVetus ( 597895 ) on Wednesday October 01, 2008 @01:41PM (#25221673)

    Well, 2007 has bugs in it. I don't use Excel, I use something that can utilize math correctly. Have you checked your spreadsheet program? Or do you just assume that Microsoft does everything correctly?

    I use Excel for daily business functions and data analysis, and will continue to do so, but I don't assume Excel is perfect. I do what I should do with any program I use for calculations, though: I stay aware of all of the quirks and bugs I can of the program, and try to work around them.

    Every program is going to have a bug or two (or five thousand, seeing as Excel is part of MS Office), but part of working with software is to know what those are and learn to not let them ruin work.

  • Re:Wrong Tool (Score:2, Insightful)

    by internerdj ( 1319281 ) on Wednesday October 01, 2008 @01:41PM (#25221675)
    While I agree, sometimes being an engineer or analyst means working with one or two or six hands tied behind your back because of time, money, or IT-imposed user-permissions. If you aren't capable of identifying the sources of error in your data as well as those caused by your tools, then you are probably going to do a poor job even with the best tools. Bad tools should never be an acceptable excuse for delivering faulty analysis.
  • Re:incongruous (Score:5, Insightful)

    by johnny cashed ( 590023 ) on Wednesday October 01, 2008 @01:42PM (#25221695) Homepage
    Well, I do put faith into my fortran compiler.
  • Re:incongruous (Score:3, Insightful)

    by drfireman ( 101623 ) <dan@kiMOSCOWmberg.com minus city> on Wednesday October 01, 2008 @01:54PM (#25221939) Homepage

    Care to expand on why you think you can't do 'meaningful scientific data analysis in Excel?' Are you one of these people who 'reviews' books without actually reading them?

    Someone else has already posted a link to a page that nicely summarizes many (not all) of the problems with using excel for science. But there is virtually no statistical technique which isn't already better implemented in R (free) and many other statistical packages. Real stats packages provide implementations of a given technique that are at least as reliable, provide more control, more options, more diagnostics, and often more guidance. The built-in stuff in Excel is so oversimplified that I think if you're really forced to use it for serious statistics, you'd have to re-implement things using basic arithmetic operations. It's graphing capabilities, last time I checked, lacked the majority of even the most basic kinds of statistical/scientific graphs/plots. Sure, you can do this or that in Excel, and if you're willing to put in enough work you can often get what you really need out of it. But it's rarely if ever the best tool for the job of scientific data analysis.

    I don't review books about reading them, and this is no exception. But I do have an informed opinion about the premise of this book (and to a lesser extent about the level of insight of a reviewer who seems, to put it mildly, easily awed). The premise that Excel is good to use for scientific data analysis is pretty deeply misguided. I'd be happy to be convinced otherwise if I were really wrong, but I can only set aside so much time for listening to arguments from nutcases (just in case one of them may have a point). I'm sure if I actually read this book, I'd learn about various useful things Excel can do that would surprise and impress me. But I already have all the information I need to form a reliable opinion on this question, and I value my time too much to read books about space aliens living among us or about doing analysis in Excel.

    Truth to tell, I use openoffice calc (more or less an Excel clone) quite a bit for research-related things. But I'm careful with it, and don't rely on it for much more than moving numbers around.

  • by backwardMechanic ( 959818 ) on Wednesday October 01, 2008 @02:01PM (#25222047) Homepage
    ...in the same way that MS paint is as capable as photoshop...

    Yes, I use both. LaTeX if I have a choice, Word if I need to exchange docs with less enlightened colleagues.
  • by lowwave ( 469952 ) on Wednesday October 01, 2008 @02:21PM (#25222337)

    I thumbed through the book but not impressed.
    The author probably has used excel as best as any one can in doing the task he intends. But for most of people, the effort to acquire the skill by reading the book is not well-spent, since one can probably learn other tools which really intended for scientific analysis.

    For statistic packages, R probably is much better, though I would prefer SAS. Try a huge data set (200MB, and put it in excel, your system will crawl before excel crashes, but in SAS, it will be really fast, and provide much more statistics. How much faith do you have in Excel's statistical function anyway?
    I don't.

    As for differential Equations, I would try matlab, if I really doing NA, I will choose netlib's packge anytime over Excel.

    One can know Excel really well, and bend it to do all kinds of job, but in the end, it can only do that much. you probably can use bash to write a trading system, but why bother?

  • by excelsior_gr ( 969383 ) on Wednesday October 01, 2008 @02:26PM (#25222397)

    So according to the book, here's the recipe:
    1. Write your data analysis software in VBA
    2. Use the Excel cells, buttons, bells, and whistles for the I/O
    3. Profit!

    The math is actually irrelevant. Any computational mathematics book that respects itself uses pseudocode for the examples. If it is possible to program in one programming language it should be possible in any other language too.

    I tried it too, although I wasn't nearly as crazy to do any numerical computations in VBA. I wrote the program in pure good old Fortran 95, wrote some VBA scripts to read the Fortran ASCII output, and set everything else up in Excel that my boss liked (I'm a chemical engineer). There you go: it's fast to program, fast to run, easy to maintain.

    I would like to see anyone try to keep up with the Microsoft paraphernalia between VBA-Excel versions, if the whole thing is written in VBA. Not to mention the problems that I had with the locale when I tried to run the VBA code in a computer running a german version of Excel that had decided that the decimal point is there as a thousands separator and the comma was used as a decimal point. The setting for it in Excel was nowhere to be seen (I still haven't really figured it out. The central Windows setting seemed to have no influence on it, although I suppose it should) and 1.234 was then 1234 and 1.2E-02 was a character string. Oh, the pain... Thankfully, my *basic* Fortran part absolutely did not care, it just worked, and only the I/O needed to be reviewed.

    Try to send the program to a customer without knowing what kind of Excel version he is running. We had to go as far as Office 97 just to be sure, and there was still the problem with the locale. After a year, the I/O was useless, but who cares? It was only 1% of the code.

    I would still use Excel, but for nothing other than the most trivial tasks. There are wonderful libraries out there that work with Fortran and produce very nice graphs on the fly.

  • by Cyclopedian ( 163375 ) on Wednesday October 01, 2008 @02:49PM (#25222779) Journal

    Look at all those posts saying "Excel is not the right tool for this" or "When all you have is a hammer...". The point was not grokked by those folks.

    I'll lay it out for you, plain and simple:

    This book is like installing a linux kernel onto a wristwatch.

    We should be marvelling at the feat, not lambasting a tool that was "hacked" to do so much more than it is normally used for. If you can't appreciate that kind of work, maybe you should just stick to appreciating fine arts.

  • by Anonymous Coward on Wednesday October 01, 2008 @02:59PM (#25222933)

    I've been using spreadsheets for scientific calculations since my freshman physics class, when we were using Lotus 1-2-3. The first time that I saw Excel in operation (1993?),I became a convert.

    I still use Excel quite a lot for everything from simple data analysis (e.g. ANOVA [but not the built-in ANOVA function - that's lame], regression analysis, &c.), data aggregation, &c. It works fine for just about any application that I throw at it -- as far as number crunching goes. Worse case scenario, I have to write a bit of VBA to handle a special function. The built in data base tools (e.g. Match(), Index(), VLOOKUP(), &c.) make it easy to model multiple scenarios, as well (and link them to e.g. a list-box of options with a couple of mouse clicks). The array functions make it easy to perform parallel operations on large related sets of data with minimum work.

    Where Excel fails (and does so miserably, in my opinion) is in it's display capabilities. Have you ever tried to plot a response surface in Excel? Yuck! - it's UGLY. I have, on more than one occasion, used Excel to aggregate/crunch a large data set, then exported the results into MiniTab for report generation.

    Don't knock it till you've tried it.

    Oh - and for you open source fanatics, out there (I'm a fan of open source, and contribute a bit, myself) - Open Office Calc is still *far* behind Excel in this type of application. Getting better, but still not there.

  • by UnknowingFool ( 672806 ) on Wednesday October 01, 2008 @03:05PM (#25223029)
    While some of us admire the author for doing something akin to fitting Linux on watch, some of the objectors are pointing out the watch still needs to accurately tell time. There are a number of papers that show how Excel does not have the accuracy necessary for detailed scientific analysis. Some things like random number generation are not implemented correctly. So back to the watch analogy, as long as the author clearly divulged that the watch only tells time accurately to the minute and is not waterproof, etc, I don't have a problem with it.
  • by synthespian ( 563437 ) on Wednesday October 01, 2008 @03:31PM (#25223429)

    Now, who told you SPSS is crappy software? It's a widely used software for not only Social Science but for the Biology and Medical fields - in short, for anyone serious about statistics who's not a statistician.

    Excel, OTOH, has a long track record of errors. Microsoft does not have the expertise for numerical and statistics software.

    Which is not surprising, if you remind yourself that Microsoft did not even have security expertise for its own main product line...This software landscape is dominated by Matlab, Maple, SAS, S-Plus, R, Scilab, SPSS and Mathematica. This is what people in the field use. *Not* Excel.

    People have got to stop thinking that what Microsoft does, it does always well. That just shows lack of analysis and reading.

  • Obvious Flaw! (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 01, 2008 @03:45PM (#25223663)

    The real problem with doing real scientific or research work with excel are the file sizes. Excel can handle more and more data each release, but my data files are gigs in size and way too big for excel.

You knew the job was dangerous when you took it, Fred. -- Superchicken

Working...