Books Education Google

Google Books As "Train Wreck" For Scholars 160

Following up on our earlier discussion, here's more detail on Geoffrey Nunberg's argument that Google Books could prove detrimental to academics and other scholars. Recently Nunberg gave a talk at a conference claiming that the metadata in Google Books is riddled with errors and is classified in a scheme unfit for scholarly use. This blog post was fleshed out somewhat a few days later in the Chronicle of Higher Education. Quoting from the latter: "Start with publication dates. To take Google's word for it, 1899 was a literary annus mirabilis, which saw the publication of Raymond Chandler's Killer in the Rain, The Portable Dorothy Parker, [and] Stephen King's Christine... A search on 'internet' in books written before 1950 turns up 527 hits. ... [Google blames some errors on the originating libraries.] ...the libraries can't be responsible for books mislabeled as Health and Fitness and Antiques and Collectibles, for the simple reason that those categories are drawn from the Book Industry Standards and Communications codes, which are used by the publishers to tell booksellers where to put books on the shelves. ... In short, Google has taken a group of the world's great research collections and returned them in the form of a suburban-mall bookstore." The head of metadata for Google Books, Jon Orwant, has responded in detail to Nunberg's complaints in a comment on the original blog post, and says his team has already fixed the errors that Nunberg so helpfully pointed out.

  • Error free system? (Score:3, Informative)

    by Bacon Bits ( 926911 ) on Monday September 07, 2009 @08:01PM (#29345201)

    So, the argument is that the new system is bad because it may have errors or bad data?

    Were card catalogs immune to this? It's a database. It's only as good as what you put into it. A bad database is not useful. It just means someone needs to do it better. Honestly, if anything this seems like an argument that the database shouldn't be proprietary. It should be open to everyone so that someone can always make a better version of the metadata with the same base data.

    "It's a piece of shit" shouldn't be the same argument as "nobody should even try it". The Wright brothers didn't exactly start out with a 747 or an F-35.

  • by Anonymous Coward on Monday September 07, 2009 @09:06PM (#29345657)

    WorldCat.org

    Find it on Google Books, look it up on there; Google Scholar if it is an article. I am a historian, and when I check citations (for journals or my own work), that is how I get it done.

  • Re:Card catalogs (Score:5, Informative)

    by Peter H.S. ( 38077 ) on Monday September 07, 2009 @11:03PM (#29346473) Homepage

    Well, organizing books according to the city they are from (printed in) is among the oldest ways of cataloging printed books. The practice goes back to Gutenberg and the so-called "incunabula" period, when book dealers/printers/publishers (often the same persons) would make book catalogs out of a certain city. So if you needed a certain edition of a title, you would have to track it through such book catalogs, since the Leipzig edition would be different from the Mainz edition.

    It is of course sad that what was once common knowledge among scholars now seems forgotten. That is probably not a hindrance when working with modern sources, but it is still necessary to know when working with old stuff, just like knowing that words/names starting with J were filed under I, etc.
    Many academics still put the printing city in their sources, though many seem to have forgotten why they do so.

    You just happened to stumble into a book/journal catalog organized by a centuries-old and previously very well-known method. The error wasn't in the card catalog or the way it was organized, but in that no one ever told you about these ancient methods in your library course.

    --
    Regards

  • by RandomUsername99 ( 574692 ) on Tuesday September 08, 2009 @02:29AM (#29347679)

    I worked for the Harvard Law School Library and saw such a work in progress for the documents used in the Nazi war crimes tribunal at Nuremberg. The process of putting this together was extraordinarily expensive, and even with the HLSL donating the server, traffic, labor to maintain the back end code (which it still does), etc., the project ran out of funding 13,904 scans in and is currently seeking funding.

    Although the metadata surrounding the scans of these books would not have to be nearly as detailed, it's worth noting that Google is not a non-profit organization with a set of gigantic grants for book preservation. They needed to put together something that would make enough money to at least fund its own existence immediately.

    Why did they bother? Is it enough that it's useful to many people even if it's not useful to everyone?

    One could certainly put together the electronic preservation project of everyone's dreams... I wouldn't be surprised if some very smart people somewhere in academia have already designed it. Sooo if you would be so kind as to cut them a check so it doesn't have to be up to a company who's worried about it being a financially solvent program from a business perspective, I bet they'd start tomorrow.

  • by caitsith01 ( 606117 ) on Tuesday September 08, 2009 @02:52AM (#29347819) Journal

    Why did they bother?

    Why did you bother to comment on it? If you don't like it - don't use it.

    You are clearly ignorant of the key problem with the Google Books settlement (as it currently stands), which is that Google and only Google will be given the right to reproduce orphaned works. I assume the morons tagging this "caveat emptor" are also ignorant of this.

    So your glib remark should more correctly read, "if you don't like it, never have access to millions of pages of orphaned copyright works again because Google has an exclusive licence to reproduce them electronically". Which doesn't quite work as well, really, does it?
