Counting the World's Books 109
The Google Books blog has an explanation of how they attempt to answer a difficult but commonly asked question: how many different books are there? Various cataloging systems are fraught with duplicates and input errors, and only encompass a fraction of the total distinct titles. They also vary widely by region, and they haven't been around nearly as long as humanity has been writing books. "When evaluating record similarity, not all attributes are created equal. For example, when two records contain the same ISBN this is a very strong (but not absolute) signal that they describe the same book, but if they contain different ISBNs, then they definitely describe different books. We trust OCLC and LCCN number similarity slightly less, both because of the inconsistencies noted above and because these numbers do not have checksums, so catalogers have a tendency to mistype them." After refining the data as much as they could, they estimated there are 129,864,880 different books in the world.
8 or 9-place estimate (Score:2, Insightful)
estimate would be about 130 million, not 129,864,880
Re:8 or 9-place estimate (Score:3, Insightful)
Wow (Score:3, Insightful)
They should write a book!
Re:Seriously... (Score:4, Insightful)
Who cares? Does it matter?
Does anything?
Re:1 in 50 people wrote a book (Score:3, Insightful)
I suspect that most books have been written recently and their writers are still alive.
And I suspect that you are full of crap.
129,864,880 published books, that is. (Score:4, Insightful)
This estimate isn't bad for published works, but it does not adequately answer the question posed, ``Just how many books are out there?''
ISBN sucks for digital books (Score:4, Insightful)
ISBNs suck as identifiers for digital books, especially digital books that are free. There are two problems.
Problem number one is that they cost money. Let's say someone writes up a really nice manual documenting some open-source software. He wants the manual to be free, just like the software. But now if he wants an ISBN, he has to pay money to get the ISBN, which means expending dollars on a book that is not going to be bringing in any dollars. The fact that ISBNs cost money is out of step with the fact that we have this thing called the World Wide Web, which is basically a huge machine for letting people do publishing without the per-copy costs that are associated with print publishing.
The other problem is that ISBNs are supposed to uniquely identify an edition of the book. This makes sense for traditional print publishing, where the economics of production forced people to make discrete editions widely spaced in time. It makes no sense for print on demand or for pure digital publishing. I've written some CC-licensed textbooks. When someone emails me to let me know about a typo or a factual error, I fix it right away in the digital version, and I usually update the print-on-demand version within about 6 months. No way am I going to assign a different ISBN every 6 months.
We can say that ISBNs are for printed books, not for ephemeral web pages, but that doesn't really work. The two overlap. My textbooks exist simultaneously as web pages, pdf files, and printed books. Amazon sells a book for the kindle using one ISBN, assigning a different ISBN to the printed version. Print-on-demand books share some characteristics with printed books (e.g., they're physical objects) and some with the web (can be updated continuously).
By the way, why do you think library catalogs don't show ISBNs? It's because ISBNs are meant as commercial tools, like the barcode on a box of cereal. If google finds ISBNs useful for other purposes than selling copies of books, it's probably because google is trying to deal with a massive number of books using a minimum amount of human labor.