Is the total amount of information in the world finite or infinite?
The question may seem trivial, but I can assure you I am serious. I am using the term "information" in the sense of data that computers can store digitally. The question of the amount of information in the world has been extensively considered and it is definitely finite, but very large. I started with this question to demonstrate that the sum of our present individual knowledge about the world's information has to be very, very small compared to all of the information available.
In 2015, the total number of active mobile connections surpassed the total world population. It is also estimated that 46% of the world population or about 3.3 billion people are active Internet users. The total world population is just over 7.3 billion people. There are around 1 billion websites. Google is estimated to have over 46 billion indexed webpages.
Some of the largest online family tree programs claim over 2 billion individual entries (many of which are duplicates) and population figures show that in 2015 the number of people who die will reach over 56 million. It is estimated that just over 100 billion people have lived on the earth. Even my own very conservative estimate is that just the four largest genealogical data websites have records of over 5 billion people.
Genealogists are fond of pointing out that "not all the world's records are online" but those same people really have no idea (nor does anyone for that matter) about how much information really is online. On the other hand, the statements about the remaining paper-based records yet to be digitized are true. For example, as far as I can ascertain, only about 30% of the microfilmed records on FamilySearch.org have been digitized and of those records, only about 10% have been indexed. I have no way of verifying these figures, but my estimate is based on several "off the record" conversations.
Google has estimated the total number of books published in the history of the world to be about 129 million with just over 2 million new books being published each year.
I am writing here about massive amounts of data. No one person can possibly even imagine effectively searching all that information. Some genealogists are also fond of speaking and teaching in terms of "evidence" and "proof." Effective and relatively exhaustive genealogical research today requires a set of complex skills that include the ability to do primary research in record repositories and also requires the same level of competence in searching online. The time when any one person could possibly know all about even the records for one limited geographic area is long past. Even though I spend a considerable part of almost every day online and in a major library, I regularly learn about new-to-me extensive historical resources. It is literally impossible for me to remember all of the information sources I have available.
In addition to the vast store of information available, genealogists in general are just now beginning to incorporate DNA testing as part of their research considerations. It only takes one direct line ancestor whose parentage was undisclosed in the available records to completely revise even the most carefully crafted genealogical proof statement. Even then, DNA testing is still in the process of being fully developed, much less fully utilized.
I have seen some very intense sessions of cross-examination in my years in court, including cross-examination of "expert" witnesses. I often wonder how well some of the genealogical experts would hold up under such pressure, especially if the seasoned trial attorney had done his or her homework about the expert's area of expertise. When I have mentioned this subject before, I have often received comments about the "intense" peer review process some genealogists go through before publication in scholarly journals. I am also well aware of that process and know that the "peers" reviewing the proposed articles seldom know much about the subjects they are reviewing. Peer review frequently involves more issues of formatting and style than real content. For example, when I had my Master's Thesis reviewed by my department's committee, they all admitted they had no idea what I was talking about and had no criteria to judge whether or not my conclusions were correct.
Genealogical researchers put themselves in the same position. An individual researcher is usually the only person who has extensively researched a particular pedigree line. Who can say whether his or her conclusions are correct without repeating all of that research?
The reason I started out with some observations on the amount of information presently available and the size of the Internet is to demonstrate that the time for certainty in any particular area of research is long past. Mankind has always debated an individual's ability to know anything with certainty. One interesting study in the journal Neuron found a correlation between the certainty of our choices and the time spent making the decision. See "Certainty in Our Choices Often a Matter of Time, Researchers Find." See also the following:
“Decisions as a Window on Cognition.” Being Human. Accessed December 10, 2015. http://www.beinghuman.org/article/decisions-window-cognition.
One thing I did learn after spending years in a court setting is that our personal degree of certitude and our ability to convince others we are right are much more persuasive than facts.
You might conclude from my writing that I do not agree with much of the writing on the subject of genealogical proof. You would be right in part. I do agree with the commonly accepted methodology of doing genealogical research. I also recognize that any researcher has to arrive at some conclusion. But I am fundamentally iconoclastic. I see the world of information changing in dramatic ways but I do not see the genealogical community generally recognizing those changes. The technologists who are crafting and running the huge online websites are, for the most part with some notable exceptions, not experienced genealogical researchers. At the same time, those people who are widely acclaimed as genealogical experts and authorities have little technical training or competence. We need to begin merging the two aspects of our genealogical community.
I have several friends and acquaintances who are prominent employees of large genealogy companies with the responsibility of overseeing huge genealogical databases. Many of these people have never been inside of a genealogical library or done any of their own genealogical research. In effect, they are writing and developing programs for a subject they know nothing about. At the other end of the spectrum, I know recognized genealogical authorities who barely know how to turn on a computer.
Of course, those who suppose themselves to be technologically savvy and are experts in online matters look at me and consider my opinions to be uninformed and naive. Those who are the core "experts" in genealogy (whom I irreverently call the "letter people") also consider me to be an outsider who really doesn't know what he is talking about. But I do not have to be an Emperor to point out that the present ruling one has no clothes. What I do bring to the table here is years of experience, day after day, doing research for and assisting hundreds and thousands of patrons and answering questions. You will have to decide if my perspective is valid or not.