Forgot your password?
typodupeerror
Book Reviews Books Media

MySQL Stored Procedure Programming 206

Posted by samzenpus
from the a-little-lite-afternoon-reading dept.
Michael J. Ross writes "MySQL may be the most popular open source relational database management system (RDBMS) in the world, but during the first decade of its existence, it lacked support for stored programs, i.e., store procedures, functions, and triggers. The major commercial RDBMS vendors — including Oracle, IBM, and Microsoft — could point to this deficiency as reason enough to choose their proprietary systems over MySQL or any other open source system, such as PostgreSQL. But with the release of MySQL version 5.0, in October 2005, the "little database engine that could" dramatically improved its position against the competition. The most comprehensive discussion of these new capabilities is in the book MySQL Stored Procedure Programming." Read below for the rest of Michael's review
MySQL Stored Procedure Programming
author Guy Harrison and Steven Feuerstein
pages 636
publisher O'Reilly Media
rating 9
reviewer Michael J. Ross
ISBN 0596100892
summary A comprehensive guide to developing MySQL stored procedures, functions, and triggers.


Written by Guy Harrison and Steven Feuerstein, and published by O'Reilly Media in March 2006 under the ISBNs 0596100892 and 978-0596100896, this book is the first one to offer database programmers a full discussion of the syntax, usage, and optimization of MySQL stored procedures, stored functions, and triggers — which the authors wisely refer to collectively as "stored programs," to simplify the manuscript. Even a year after the introduction of these new capabilities in MySQL, they have received remarkably little coverage by book publishers. Admittedly, there are three such chapters in MySQL Administrator's Guide and Language Reference (2nd Edition), written by some of the developers of MySQL, and published by MySQL Press. Yet this latter book — even though published a month after O'Reilly's — devotes fewer than 50 pages to stored programs, and the material is not in the printed book itself, but in the "MySQL Language Reference" part, on the accompanying CD. That material, in conjunction with the online reference documentation, may be sufficient for the more simple stored program development needs. But for any MySQL developer who wishes to understand in-depth how to make the most of this new functionality in version 5.0, they will likely need a much more substantial treatment — and that's exactly what Harrison and Feuerstein have created.

The authors are generous in both the technical information and development advice that they offer. The book's material spans 636 pages, organized into 23 chapters, grouped into four parts, followed by an index. The first part, "Stored Programming Fundamentals," provides an introduction and then a tutorial, both taking a broad view of MySQL stored programs. The remaining four chapters cover language fundamentals; blocks, conditional statements, and iterative programming; SQL; and error handling. The book's second part, "Stored Program Construction," may be considered the heart of the book, because its five chapters present the details of creating stored programs in general, using transaction management, using MySQL's built-in functions, and creating one's own stored functions, as well as triggers. The third part, "Using MySQL Stored Programs and Applications," explains some of the advantages and disadvantages of stored programs, and then illustrates how to call those stored programs from source code written in any one of five different programming languages: PHP, Java, Perl, Python, and Microsoft.NET. In the fourth and final part, "Optimizing Stored Programs," the authors focus on the security and tuning of stored programs, tuning SQL, optimizing the code, and optimizing the development process itself.

This is a substantial book, encompassing a great deal of technical as well as advisory information. Consequently, no review such as this can hope to describe or critically comment upon every section of every chapter of every part. Yet the overall quality and utility of the manuscript can be discerned simply by choosing just one of the aforesaid Web programming languages, and writing some code in that language to call some MySQL stored procedures and functions, to get results from a test database — and developing all of this code while relying solely upon the book under review. Creating some simple stored procedures, and calling them from some PHP and Perl scripts, demonstrated to me that MySQL Stored Procedure Programming contains more than enough coverage of the topics to be an invaluable guide in developing the most common functionality that a programmer would need to implement.

The book appears to have very few aspects or specific sections in need of improvement. The discussion of variable scoping, in Chapter 4, is too cursory (no database pun intended). In terms of the book's sample code, I found countless cases of inconsistency of formatting — specifically, operators such as "||" and "=" being jammed up against their adjacent elements, without any whitespace to improve readability. These minor flaws could be easily remedied in the next edition. Some programming books make similar mistakes, but throughout their text, which is even worse. Fortunately, most of the code in this book is neatly formatted, and the variable and program names are generally descriptive enough.

Some of the book's material could have been left out without great loss — thereby reducing the book's size, weight, and presumably price. The two chapters on basic and advanced SQL tuning contain techniques and recommendations covered with equal skill in other MySQL books, and were not needed in this one. On the other hand, sloppy developers who churn out lamentable code might argue that the last chapter, which focuses on best programming practices, could also be excised; but those are the very individuals who need those recommendations the most.

Fortunately, the few weaknesses in the book are completely overwhelmed by its positive qualities, of which there are many. The coverage of the topics is quite extensive, but without the repetition often seen in many other technical books of this size. The explanations are written with clarity, and provide enough detail for any experienced database programmer to understand the general concepts, as well as the specific details. The sample code effectively illustrates the ideas presented in the narration. The font, layout, organization, and fold-flat binding of this book, all make it a joy to read — as is characteristic of many of O'Reilly's titles.

Moreover, any programming book that manages to lighten the load of the reader by offering a touch of humor here and there, cannot be all bad. Steven Feuerstein is the author of several well-regarded books on Oracle, and it was nice to see him poke some fun at the database heavyweight, in his choice of sample code to demonstrate the my_replace() function: my_replace( 'We love the Oracle server', 'Oracle', 'MySQL').

The prospective reader who would like to learn more about this book, can consult its Web page on O'Reilly's site. There they will find both short and full descriptions, confirmed and unconfirmed errata, a link for writing a reader review, an online table of contents and index, and a sample chapter (number 6, "Error Handling"), in PDF format. In addition, the visitor can download all of the sample code in the book (562 files) and the sample database, as a mysqldump file.

Overall, MySQL Stored Procedure Programming is adeptly written, neatly organized, and exhaustive in its coverage of the topics. It is and likely will remain the premier printed resource for Web and database developers who want to learn how to create and optimize stored procedures, functions, and triggers within MySQL.

Michael J. Ross is a Web programmer, freelance writer, and the editor of PristinePlanet.com's free newsletter. He can be reached at www.ross.ws, hosted by SiteGround.


You can purchase MySQL Stored Procedure Programming from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
This discussion has been archived. No new comments can be posted.

MySQL Stored Procedure Programming

Comments Filter:
  • Stred pocedures (Score:5, Informative)

    by karavelov (863935) on Wednesday April 18, 2007 @03:38PM (#18786873) Homepage
    "The major commercial RDBMS vendors -- including Oracle, IBM, and Microsoft -- could point to this deficiency as reason enough to choose their proprietary sstems over MySQL or any other open source system, such as PostgreSQL..." They could not point this because PostgreSQL has stored procedures for decades... fully tested, stable, offering e bunch of languages for writing functions/triggers, including Perl, Python, Tcl and their own PlSQL that is very close to Oracle's variant.
  • by RedElf (249078) on Wednesday April 18, 2007 @04:00PM (#18787189) Homepage
    Mod parent up!

    Stored procedures have added benefits such as additional security, and forcing application developers to implement database functionality properly, not sloppily.
  • by Overly Critical Guy (663429) on Wednesday April 18, 2007 @04:32PM (#18787615)
    I've never had a problem with documentation. Are you saying the only reason you discounted PostgreSQL as a superior database was the documentation, despite the fact that for years, MySQL would happily drop data without telling you?
  • by XorNand (517466) on Wednesday April 18, 2007 @04:39PM (#18787739)

    I recently decided to move to Postgres from MySQL. There are some pretty cool things about Postgres; flexible authentication options and sequences being my current favorite. I've found that it's docs are actually pretty good (I don't know what they were like before). However, since the topic here are stored procedures, I have to really complain bitterly about Postgres's implementation of stored procedures. Creating a sproc that returns a dataset rather than just a outpur parameter is a convoluted dance. You have to exactly define the output using a "TYPE" and then iterate over the dataset using a cursor-like syntax.

    Consider this simple example that I'm currently working on. It's for an internal peer review application where coworkers give one positive and one negative comment about each other:

    DROP TYPE answers_type CASCADE;
    CREATE TYPE answers_type AS (
    positive text
    ,negative text);

    CREATE OR REPLACE FUNCTION answers_get(
    _review_id int
    ) RETURNS setof answers_type AS $$
    DECLARE
    rec answers_type;
    BEGIN

    FOR rec IN SELECT
    positive
    ,negative
    FROM answers
    WHERE review = _review_id
    ORDER BY id DESC
    LOOP
    RETURN NEXT rec;
    END LOOP;
    RETURN;
    END; $$ LANGUAGE plpgsql;

    This ends up being very unwieldy for larger and more complex stored procedures. I recently compared simple output -parameter only stored procedures [rightbrainnetworks.com] between Postgres, MySQL, and MS SQL on my blog. Fortunately it's a lot more straightforward when using that method. I'll stick with Postgres for now and just hope that this is eventually improved.

  • by LordLucless (582312) on Wednesday April 18, 2007 @05:00PM (#18788045)
    Actually, if all you want is a big table, MySQL is probably a good choice. It's generally very fast. However, as soon as you start getting a complex database schema, you're probably starting to look at other RDBMSs. MySQL provides a lot of advanced features with InnoDB, but then, InnoDB has just been bought by their biggest competitor. If you're needing any of those features (foreign keys, transactions, etc), then I'd be going with PostGreSQL, because, really, I don't think the InnoDB team is going to be releasing any fixes/upgrades to their MySQL engine any time soon.
  • by consumer (9588) on Wednesday April 18, 2007 @05:02PM (#18788079)
    I'm currently looking at a table on a production MySQL db with 176 million rows in it. It has no problems at all, other than being a huge amount of data. Your ideas about MySQL's limitations are not based on fact.
  • Re:Not really FUD (Score:5, Informative)

    by elp (45629) on Wednesday April 18, 2007 @05:02PM (#18788085)
    Yep it was really irritating. A lot like the Postgres users who kept claiming the next version of their server was going to be really fast.

    In the end the postgres crowd have a right to be upset. They could have been the major player for open databases but their devs were too arrogant to listen to their users and design it to be friendly.

    For all its faults mysql is trivial to install and works out of the box for most applications. Last time I looked the default install for postgresql still seemed to be tuned as though it would never have to handle more than a handful of users. Stored procedures and TCP/IP are off by default and the auto vacuum thing needs to be set up manually. Then don't forget that while the postgres query parser does more error checking that mysql its error messages are incredibly cryptic so its MUCH hard to trouble shoot.

    Your average newbie takes one looks at it, gives up and moves to mysql. When that newbie finally grows up he has too much time invested in mysql to be bothered learning the intricacies of postgres. Every extra feature that mysql adds is one less reason for anyone to try postgres.

    Postgres is a better DB once its setup and tuned properly but considering its usability issues and that mysql users probably out number postgres users a thousand to 1 or more I think mysql is always going to be the number one opensource DB.

  • by kpharmer (452893) on Wednesday April 18, 2007 @05:51PM (#18788771)
    > I'm currently looking at a table on a production MySQL db with 176 million rows in it. It has no problems at all,
    > other than being a huge amount of data. Your ideas about MySQL's limitations are not based on fact.

    Without partitioning you face the choice of selecting data only via a btree index (only typically works if you need to select less than 3% of the data) or scanning it all. With a more typical alternative (certainly db2, oracle, etc) you can partition the data. This can result in 10:1 differences in performance between mysql & db2/oracle for typical reporting queries.

    Without query parallelism you're single-threading all of your queries, and unable to take advance of those extra processors. Since oracle/db2 get near-linear performance benefits from parallelism, you're again suffering a 4:1 performance penalty on a four-way smp.

    Without a robust optimizer you will choke on complex queries against even moderate data - so a query that joins a dozen tables together will inevitately go into the ditch with poorly chosen nested-loop joins. Performance penalty? could easily be 100:1 in some situations.

    Of course, this doesn't mean that you can't keep 200 million rows in msyql - you certainly can. Just don't plan to get the same kind of performance out of a lot of typically complex queries against it - that you would get from oracle/db2/informix or even sql server. Not unless you spend 20x as much on the hardware anyway.

    > Your ideas about MySQL's limitations are not based on fact.
    Unfortunately, your ideas about mysql's capabilities are based upon insufficient experience
  • by Shados (741919) on Wednesday April 18, 2007 @06:26PM (#18789223)
    Don't worrie, I am well aware of corporate software development. I am one of the primary developer for one of the largests non-software companies in the world :) We also have an army of DBAs and database-specific developers, and right now since it IS a stored-procedure-only environment, we have (just on the part I work with) several -douzans of thousands- stored procedure, that do everything you can imagine (including non-database related tasks). I've seen, or at least heard, most everything they can do, no worries :)

    Optional parameters and parametrised order bys are the least of my worries here (though having a stored procedure with a dynamic amount of fields, multi-dimentational arrays of parameters, and dynamic order by clauses that allow to sort by a varying amount of fields end up being quite long to make, no matter the magic you apply to them. Easy to write, don't get me wrong, but...). And even with all that, I could just use, for example (on SQL Server), a CLR stored procedure, and even those problems would go away. Thats not the issue at hand.

    The issue is that with a well thought out abstraction layer and automated scripts on top of the less critical parts of the applications, you can decouple your model and your database in ways that even a full layer (or two) of views and stored procedures simply can't do.

    I mean, great: today my data is coming from 6 datacenters, using several different RDBMS, 2 different ETL technologies, and (thank god) only 1 OLAP system. Thats cute. If tomorrow I decide to change something in there....well, stored procedures aren't the most abstracted thing in the whole wide world. If I upgrade, let say, a SQL Server 2000 to 2005, well, the way to handle something as silly as filtering/paging efficiently changed. Now I have to go through the 4 thousand SPs that used the SQL Server 2k crippled way and optimise them? Hell no, that will probably be simply forgotten and never done. If it ain't broken don't fix it. If I have an SQL abstraction layer, I'll just change 2-3 functions, pass them through the unit and integration tests, and if everything comes up green the douzans of apps in my company will all take advantage of it by the next day.

    On top of that, there's the notion that some things are simply easier to think of in an object oriented way, and it just feels real, REAL dumb to have 4 stored procedure for every god damn definition table (thousands!) in there. 90% of those won't be holding money amounts nor credit card numbers, and will be hiding behind locked down web services or remoting APIs -anyway-, so its not like anyone is getting direct access to the tables either. And if thats troublesome anyway (because of the risk of failure of these layers), you still have your views. And if thats still not enough, THEN of course (like in the example you gave), you go the stored procedure way, in which case the SP ends up being just an alternate datasource, the same way my ETLs and OLAP providers are. No biggy: even .NET's silly typed dataset allows you to switch between inline SQL and stored procedures completly transparently (though using inline SQL in a .NET dataset is unmaintainable, so thats probably a bad example :) )

    What I'm getting at, is that (again, as your example shows), in corporate environments, doing without stored procedure is virtually impossible. No argument there! You -need- them. However, using ONLY stored procedures is simply a disaster waiting to happen, on top of being a hell of a waste of man hours. Some stuff is simply 10x faster to do using a well made SQL abstraction API. The important part is to always have a good way of using the stored procs that will come in.
  • by Anonymous Coward on Wednesday April 18, 2007 @06:57PM (#18789579)
    You really shouldn't link to a page about MySQL's partitioning support without linking the following:

    http://dev.mysql.com/doc/refman/5.1/en/partitionin g-limitations.html [mysql.com]
  • by swusr (689597) on Wednesday April 18, 2007 @07:07PM (#18789711)
    Since PostgreSQL 8.1:

    CREATE OR REPLACE FUNCTION
    answers_get(_review_id INT, OUT positive TEXT, OUT negative TEXT)
    RETURNS SETOF RECORD AS $$
    SELECT POSITIVE, NEGATIVE
    FROM ANSWERS
    WHERE REVIEW = $1
    ORDER BY ID DESC
    $$ LANGUAGE SQL;
  • Re:Not really FUD (Score:3, Informative)

    by cortana (588495) <sam@robo[ ]org.uk ['ts.' in gap]> on Wednesday April 18, 2007 @07:15PM (#18789781) Homepage
    I have to say that you haven't checked out PostgreSQL for far too long. None of your criticisms apply to any vaguely modern version of it.
  • Re:Not really FUD (Score:5, Informative)

    by gullevek (174152) on Wednesday April 18, 2007 @08:12PM (#18790545) Homepage Journal
    the Postgres speed was true until the release of Version 8. There it might such a huge jump, that was just unbelievable. You should give it a try.
  • Re:bullshit.. (Score:2, Informative)

    by 00lmz (733976) on Wednesday April 18, 2007 @09:01PM (#18791187)

    The 'poorly choosen nested-loop joins' are the decision of the fucktard programmer. And if he is blowing his app out from fucked up loops how does the database lay blame for that?

    Do you even know what a 'nested-loop join' is? Hint: It happens inside the RDBMS if the optimizer can't figure out a better way of joining two tables.

  • by umeboshi (196301) on Wednesday April 18, 2007 @09:45PM (#18791723)
    I suspected the same thing, as I have been using foreign keys for a long time in postgresql now, and I have never come across those problems. I read the section right above this and went pale. I was told that mysql got support for foreign keys a couple of years ago, but I now understand what they consider "support" for foreign keys.

    The real kicker for me was this sentence.

    The only nice aspect of FOREIGN KEY is that it gives ODBC and some other client programs the ability to see how a table is connected and to use this to show connection diagrams and to help in building applicatons.
    I always believed that relational integrity was the major "nice aspect" of foreign keys. I guess this really sets the record straight, and gives some of the best insight into the capabilities of the developers in making a good relational database.

  • Re:case study (Score:3, Informative)

    by rapiddescent (572442) on Thursday April 19, 2007 @04:57AM (#18795075)
    also, if you put this level of detail (see parent) into a stored procedure then it is much easier to unit test the database separately and hand over reponsibility of that unit test to the "database team". It allows the database element to have a version on it's own independant of the source code.

    I find that (in big teams, where multi-tier applications are being produced) it is far better to maintain tier separation and avoid having sql code in the app servers or worse, the client - because any minor database change has a ripple effect time and cost impact on other parts of the team.

    also; choosing this methdology makes offshore development easier where, perhaps for security reasons the database team is local and the application developers are somewhere else. I've seen this where the dba's and the sql coders where local because the client had risky non-functional requirements and preferred to keep the database people local.

    At work I use DB2 on Z and Oracle (including RAC) most of the time. Sybase seems to be getting rarer in the finance industry nowadays. I've never seen mysql or postresql deployed in a very large organisation (yet!). It would be nice to get slashdotters to say if they're using postresql or mysql in large organisations.

    rd

You know that feeling when you're leaning back on a stool and it starts to tip over? Well, that's how I feel all the time. -- Steven Wright

Working...