I guess my first comment on this subject would be, don’t
hold your breath. But this is a real concern for genealogists and others and
cannot be dismissed quite that cavalierly.
This is especially true when there are some very large companies in the
world whose goal is the digitization of every book and, in some cases,
every record in the world.
If you believe it, the Wikipedia
article on Google Books claims there are exactly 129,864,880 known printed
books in the world. I am always suspicious of very large exact numbers, especially when
it would seem that counting every published book all around the world since
moveable type was invented is highly unlikely. Of course, this number is wrong
the second one more book is published. This number was published back on 5
August 2010, so today it is wrong anyway. See “You
can count the number of books in the world on 25,972,976 hands.” How did
Google arrive at the number? See “Books
of the world, stand up and be counted! All 129,864,880 of you.”
But let’s assume that the number really is around 130
million or so. Could Google digitize all those books? Well, if they had them
available to scan, yes, they could. According to some estimates, Google has
already scanned over 30 million books since starting in 2004 and has done over
10 million in the last year. At that rate, they would be “done” in about ten
years. But the real question is not whether Google is going to digitize every
last book in the world, but whether anyone at all is going to do so.
Of course, if you think about it for a minute (or more, as the case may be), you
will soon realize that there are some apparently insurmountable obstacles
to achieving this goal. There are the physical limitations of access created by
national boundaries and attitudes. Do you really believe that every library in
the world is just going to sit there and let Google (or anyone else) waltz in
and start scanning away?
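For anyone who wants to check the "about ten years" figure, here is a minimal sketch of the arithmetic, assuming the round numbers quoted above (roughly 130 million books, 30 million already scanned, 10 million scanned per year) and a scanning rate that never changes. The function name is mine, purely for illustration, and the last two lines simply confirm the "hands" figure from the article title by dividing the book count by five fingers.

```python
def years_to_finish(total_books: int, already_scanned: int, scanned_per_year: int) -> float:
    """Years needed to scan the remaining books at a constant yearly rate."""
    remaining = total_books - already_scanned
    return remaining / scanned_per_year


# Round figures quoted in the post; all of them are estimates.
estimate = years_to_finish(130_000_000, 30_000_000, 10_000_000)
print(f"Years to finish at the current rate: {estimate:.0f}")

# The "hands" figure: 129,864,880 books at five fingers per hand.
hands, leftover_fingers = divmod(129_864_880, 5)
print(f"Hands needed: {hands:,} (leftover fingers: {leftover_fingers})")
```

The calculation says nothing about whether the remaining books can actually be reached, which is the real problem.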
Don’t think I have ignored the issue of copyrights. Really,
copyright isn’t an issue with the digitization of books; it is only an issue
with displaying the digitized books or making them available online. Let me
give you an example of one problem. This problem hits home because it is
sitting in the Mesa FamilySearch Library. Many of the books in the Mesa
FamilySearch Library are essentially unique. They are very limited editions.
What's more, they are extremely unlikely to have been included in Google's
estimate of the number of books. So the question of whether or not all the
world's books will be digitized is not a legal issue, nor is it a
digitization issue. In the end, it is a totally practical problem of making all
of the books available to be digitized. Now, I should mention that many of the
books in the Mesa FamilySearch Library have already been digitized and are
already available online on FamilySearch.org. But under present policies and procedures,
the remaining books that are under copyright and are unique or in limited editions
will likely not be digitized, ever. In this context, "ever" means until the
copyrights run out, and that is a very long time, even assuming that additional
extensions of the copyright coverage are not passed by the United States
legislature in the future.
So the answer to the question is 42.
Just in case that answer is not satisfying, here is the full
quotation:
"Good Morning," said Deep Thought at last.
"Er..good morning, O Deep Thought" said Loonquawl nervously, "do you have...er, that is..."
"An Answer for you?" interrupted Deep Thought majestically. "Yes, I have."
The two men shivered with expectancy. Their waiting had not been in vain.
"There really is one?" breathed Phouchg.
"There really is one," confirmed Deep Thought.
"To Everything? To the great Question of Life, the Universe and everything?"
"Yes."
Both of the men had been trained for this moment, their lives had been a preparation for it, they had been selected at birth as those who would witness the answer, but even so they found themselves gasping and squirming like excited children.
"And you're ready to give it to us?" urged Loonquawl.
"I am."
"Now?"
"Now," said Deep Thought.
They both licked their dry lips.
"Though I don't think," added Deep Thought, "that you're going to like it."
"Doesn't matter!" said Phouchg. "We must know it! Now!"
"Now?" inquired Deep Thought.
"Yes! Now..."
"All right," said the computer, and settled into silence again. The two men fidgeted. The tension was unbearable.
"You're really not going to like it," observed Deep Thought.
"Tell us!"
"All right," said Deep Thought. "The Answer to the Great Question..."
"Yes..!"
"Of Life, the Universe and Everything..." said Deep Thought.
"Yes...!"
"Is..." said Deep Thought, and paused.
"Yes...!"
"Is..."
"Yes...!!!...?"
"Forty-two," said Deep Thought, with infinite majesty and calm.
― Adams, Douglas. The Hitchhiker's Guide to the Galaxy. Ballantine, 1980.
Because the 'Inside Google Books' article refers to "books of the world", I would hope that the count isn't limited to those published in the English language. However, there's no explicit mention of the issue, James.
While I can easily imagine OCR works OK for other Latin-based languages, I admit that I have no knowledge of its usage for other scripts, e.g. Cyrillic, Japanese, Chinese, Korean.
There must be a large number of published works in the associated languages but you wouldn't find many of them in a US/UK library.
Maybe I'm just being pessimistic.
Hmm. I think I will look into the issue of OCR in alternative scripts. That sounds interesting. Thanks for the comment and the idea.
All I can say is… this would require a lot of time to process metadata and other details associated with the digitized books, should it eventually come to pass. Although digitized books do pave the way to easier access, their usefulness depends on how well the digital catalog is handled and how extensive it is. Digitizing just for the sake of having digital copies isn’t exactly productive.
Ruby Badcoe
Yes, good point. But I think that most of the work has already been done by the libraries as far as metadata and cataloging.