As genealogists, we are all winners in the race to digitize all the world's records. So how do the major websites stack up? Who has the books and records we need to do our research?
Currently, there is a hot topic in astrophysics concerning the amount of dark matter in the universe. Estimates vary, but some believe that up to 80 percent of the mass of the universe is made up of material that scientists cannot directly observe. This unobservable mass is known as dark matter. Well, as genealogists, we have our own "dark matter." We have billions of digitized records online, and for most genealogists this vast collection of data goes unobserved. I frequently mention large online collections to fellow researchers who are totally unaware that these records and books exist. So it is probably time to write up another summary of these vast online resources.
The website with the most "dark matter" is the Google Books website. There is no apparent way to determine exactly how many items are digitized on this website, but estimates run to the millions. I can do a search for the term "genealogy," but the results include other uses of the term that are not related to our type of research.
Google's own history of Google Books only includes information up through 2007. Online estimates of the number of books range from 25 million to over 30 million. The main limitation for genealogists is that the availability of the books is restricted by copyright, both real and assumed. As with all of these large websites, finding what you are looking for is a major challenge.
The two websites where I find the most useful books and other items are Archive.org and HathiTrust.org. Here are the numbers:
- Archive.org -- 12,839,057, and 550,000 modern ebooks can be borrowed with a free archive.org account.
- HathiTrust.org -- 15,750,409 with 5,898,787 volumes in the public domain. The limitation here is that only those who are part of participating universities and colleges have complete access to the collections.
FamilySearch.org has 364,088 books in its Books collection. These are all genealogically significant books. Again, there is a limitation in that some of the books are not readily available on the website unless you are in Salt Lake City at the Family History Library or in one of the Family History Centers. These books are in over 130 different languages.
The idea here is that large eclectic collections of books and other items will inevitably contain many genealogically significant items.
Of course, there is one really big collection out there: Trove.nla.gov.au, from the National Library of Australia, with 532,915,906 items. The Europeana.eu website has 53,335,287 items that include artworks, artifacts, books, videos, and sounds from across Europe. The key here is that you can never know what is in these huge websites unless you spend some time getting to know how to search their contents. The United States is trying to catch up with these large collections through the Digital Public Library of America, or DP.LA, with a fast-growing collection of 16,619,411 items.
Here is a list of links to some of the other larger collections:
- 250+ Killer Digital Libraries and Archives
- Congregational Library and Archives
- List of digital library projects
- Harvard Library, Library Research Guide for History
- British Library Labs, Digital Collections
- Bodleian Libraries, University of Oxford, Digital Collections
- Grand Valley State University, Government Documents: Historic Documents
- Russian Digital Collections
Remember that nearly every university and college in the United States has its own digital collections. You can always search state by state for such collections.
I could just keep on listing more and more websites. The point is that there is a lot of "dark matter" missed by genealogists doing online research.