Some people eat, sleep and chew gum, I do genealogy and write...

Saturday, November 17, 2018

Where are all the genealogy ebooks?


During the past few years, there has been considerable discussion about the future of paper-based books in light of the development of digitized copies of existing books, commonly called ebooks, and the fact that a percentage of newly published books are being published only in digital editions. The discussions have centered on the future of libraries, bookstores, and all others involved in the traditional book trade. However, the latest statistics on book sales are inconclusive about the future of paper-based books, with some statistics indicating that ebook sales are declining and people are returning to printed books. See "Real books are back. E-book sales plunge nearly 20%"

I am certainly a proponent of digital books, especially for research purposes. It is far easier to use a digital book that can be searched word by word than to rely on an incomplete index, or no index at all, in a paper-based book. From the standpoint of doing genealogical research, however, the issue of digital vs. paper is irrelevant because the market for either format is extremely small. Most of the books that have genealogical research value are either old or had very limited print runs. For example, most of the "surname books," or books about a particular family or individual and his or her descendants or ancestors, had perhaps a dozen copies printed, or at any rate fewer than fifty. When these books find their way into a library, the library may have the only copy publicly available. Even genealogy trade books have very small print runs. Almost all the books I have collected about genealogy over the years are out of print and will probably never be reprinted.

The fact that these genealogically valuable books are essentially "rare" makes finding them and using the information they contain very difficult. I recently found a reference to a book that was apparently in manuscript format, the only copy of which was in a library in Florida. I tried to obtain the book through Interlibrary Loan, but the library refused to send it because they had the only known copy. Fortunately, I found a friend who knew someone who had access to the book, and I was able to get the few pages I was interested in sent to me as digital images. If that book had been digitized and made available online, I could have been spared the time and necessity of working through a network of researchers to find out whether the book was useful. In most cases, even learning that such a book exists would be well beyond the research resources of anyone who did not have direct access to that library in Florida.

But even supposing that a book is digitized, there is still the huge issue of how to find the book and gain access to it. A paper book sitting on a shelf in a library somewhere in the world may be accessible, but only if you can find the book in the library and only if you can physically travel to the place where it is located. You might assume that digitizing a book makes it more accessible, but there are a number of barriers that prevent this from happening.

The first barriers to finding and accessing the information in ebooks are the arcane and overly restrictive United States copyright laws. Unless a book was originally published as an ebook, the author has no incentive to reissue it as one unless he or she can maintain the same level of restrictions. Because the genealogy market is so specialized, there are few avenues for authors to publicize their books, so even if an author chooses to publish an ebook, there is little expectation of a high sales volume. In many instances, university libraries turn out to be among the biggest customers. Libraries provide ebooks to those who have library cards, but the most common ebook supplier, Overdrive.com, includes no more than a small handful of genealogically related ebooks in its inventory.

It is ironic that you can go to a large library, such as the one I live near, the Brigham Young University Harold B. Lee Library, and find thousands of valuable, paper-based, genealogy-related books on the shelves, yet none of them are available to me, a non-student, as ebooks.

Of course, the question that needs to be asked is whether there are any genealogically related ebooks at all. The answer is yes: there are hundreds of thousands available, and they are concentrated in five major websites.

So where are the genealogically related ebooks?

Books.FamilySearch.org presently has an online collection of 372,477 ebooks. But not all of these are readily available. Here is an example of one of the warning messages that come up if you click on a restricted book:

Unfortunately, there are no readily available instructions about how to go about obtaining sufficient rights to view the item. If the "book cannot be viewed online due to copyright restrictions," then why list the book in the online catalog as an ebook at all? What is the point? It appears that the actual number of ebooks that are readily available is considerably less than the number claimed to be digitized and online. There are some readily available books, but you may become frustrated with the restrictions. I am going to write a follow-up post about how to find all these books.

The next online source for genealogically related ebooks is obvious: Books.Google.com. Google does not publish the number of publicly available books, and determining the number of genealogically significant books is probably impossible. Although Google's collection is huge, its books are divided into the following access categories, as summarized in the Wikipedia article "Google Books":
The four access levels used on Google Books are:
  • Full view: Books in the public domain are available for "full view" and can be downloaded for free. In-print books acquired through the Partner Program are also available for full view if the publisher has given permission, although this is rare.
  • Preview: For in-print books where permission has been granted, the number of viewable pages is limited to a "preview" set by a variety of access restrictions and security measures, some based on user-tracking. Usually, the publisher can set the percentage of the book available for preview. Users are restricted from copying, downloading or printing book previews. A watermark reading "Copyrighted material" appears at the bottom of pages. All books acquired through the Partner Program are available for preview.
  • Snippet view: A 'snippet view' – two to three lines of text surrounding the queried search term – is displayed in cases where Google does not have permission of the copyright owner to display a preview. This could be because Google cannot identify the owner or the owner declined permission. If a search term appears many times in a book, Google displays no more than three snippets, thus preventing the user from viewing too much of the book. Also, Google does not display any snippets for certain reference books, such as dictionaries, where the display of even snippets can harm the market for the work. Google maintains that no permission is required under copyright law to display the snippet view.
  • No preview: Google also displays search results for books that have not been digitized. As these books have not been scanned, their text is not searchable and only the metadata information such as the title, author, publisher, number of pages, ISBN, subject and copyright information, and in some cases, a table of contents and book summary is available. In effect, this is similar to an online library card catalog.
Once again, you might get frustrated when you search for an ebook and find that the one you want falls into one of the three restricted categories out of the four.

Moving on, we find an online treasure: Archive.org. With 19,379,045 total ebooks and texts, Archive.org is far ahead of even its closest competitor. Of these, 18,547,866 books and other resources are readily available and 672,961 are available to borrow online for up to two weeks. Here is the largest library you probably have access to from your home without any particular restrictions other than logging into the website. How many of these are genealogically significant? Tens of thousands. Why are all these readily available online? What happened to copyright? Nearly all of the items made available on Archive.org are in the public domain. No copyright restrictions. Why is this significant for genealogists? Because we like old stuff.

What can we do for an encore? How about MyHeritage.com? Well, according to their catalog, they have 447,870 books and publications on their website, all of which are completely searchable. MyHeritage.com is a subscription website, but you could go to a Family History Center and use the institutional version of the program.

Where do we go next? HathiTrust.org. This website is free online and currently has 16,797,545 total volumes, but most of the collection is restricted to those associated with participating universities. Even so, 6,310,561 volumes are in the public domain and can be read and searched online.

Well, other than bumping up against copyright restrictions all the time, you can probably see that there is a huge number of possibly genealogically significant books and other digitized texts online. Stay tuned for the next post about how to find all this stuff.

2 comments:

  1. MyHeritage Published Sources https://www.myheritage.com/research/collection-90100/compilation-of-published-sources is free. You don't need a subscription. The description says, "This collection includes a compilation of thousands of published books ranging from family, local and military histories, city and county directories, school, university and hospital reports, church and congregational minutes and much more."

  2. I've used Archive.org, the Family Search ones and the Google Books ones but probably not to the fullest extent possible. The other sites I'm not as familiar with. I look forward to learning more about these. Thank you.
