Some people eat, sleep and chew gum, I do genealogy and write...

Wednesday, September 30, 2015

Where are the probate records?

The question in the title of this post, about locating probate records, came up this week at the Brigham Young University Family History Library. First, just in case you don't know, probate is the process, usually by a court or magistrate, of transferring title to a deceased person's property to the heirs after death. The records of these transactions can be extremely valuable for genealogical research. During the next two weeks or so, I will be recording a series of nine videos on the subject for the BYU Family History Library YouTube Channel.

So, where are the records? The answer could be as simple as doing a search on Ancestry.com or FamilySearch.org or as complex as locating a courthouse and manually searching boxes of records in a basement or attic. The reason why I decided to write about this subject is that searching for probate records is a microcosm of the entire process of locating records. If you know how to find probate records, you can use the same technique to find just about any type of record in existence.

The first step in finding any genealogically significant record is identifying the physical location of an event associated with the ancestor's death. Probate records are most likely associated with the place where the person died.

It may be necessary to do extensive research into other types of records in order to determine where the person died. The court or other entity handling the probate is the one that had jurisdiction at the time and place of the event. What I mean by jurisdiction is the ability of the court to hear a probate action. The rule is that probate actions were conducted in a court handling such cases according to one or more of the following:
  • The place where the death occurs, or
  • The place where the property is located, or
  • The place where the deceased resided, or 
  • The place established by a will or other testamentary document
The import of these requirements is that probate records may be difficult to locate. They may require an extensive amount of supplementary research into the events of the deceased's life in order to establish a place to begin the search for the court records. Once the place of death is established, it is then necessary to determine the court that would have had jurisdiction at the time. For example, if the deceased died in Utah, some historical research will disclose that the county courts were established to hear probate matters.

Once you determine which court had jurisdiction over the probate matters, you must then determine the location of the court records. As I mentioned above, they could be online or in boxes in the courthouse. The records could also be stored in the state archives, state historical society, in a university special collections library, or in almost any other record storage location.

Most researchers think that by searching for a year or so around the death date, they have determined whether or not there was a probate. I suggest that the proper way to establish both the existence of a probate action and the location of the court is to examine land and property records. If a deceased's property passed through a probate action, it is very likely that there was a deed recorded transferring property to the heirs or to a creditor. If the deed is found with the designation of an administrator's deed, an executor's deed or a deed of distribution, then you are put on notice that there was a probate action filed.

Unless records have been accumulated in a central repository and then digitized, finding the records will need to follow essentially the same process I outlined here.
  • Identify the location of an event in the deceased's life
  • Identify documents associated with the event
  • Identify the location of the documents
  • Search for the information needed
Because the online genealogical databases are adding millions of documents on a regular basis, it is very important to continue looking online.

In discussing this topic, the question is commonly raised as to whether a very poor individual was required to have a probate action. The answer is that whether or not a probate was required depends on the time of the event and the jurisdiction involved. Early in the history of the Americas, it was not unusual to find a probate action for someone with very little personal property. More recently, the states have established thresholds for the filing of a probate action. Very poor people generally do not have to file probate actions.

Tuesday, September 29, 2015

Gathering Fruit From Family Trees

One of the more controversial issues in the very non-controversial area of genealogical research is the validity of the information in online family trees. There are millions of these individually accumulated trees online and the entries vary from well documented to pure invention. I am reminded of this issue every time I open MyHeritage.com and see that I have over 100,000 Smart Matches. Here is a screenshot showing that number:


You might also note that I have 14,425 Record Matches, but that is another issue. What am I supposed to do with over 100,000 suggested relatives? That is almost the entire population of Provo, Utah, where I now live. This large number comes from the fact that I uploaded my entire file into the MyHeritage.com website many years ago and I have been actively working with the file since then. The large number is really an indication of how well the program works.

The obvious answer is to focus only on those particular family members you are currently interested in researching and ignore the rest. But this begs the issue. The real issue underlying all of these potential connections is that lurking out there might be the solution to some of my end-of-line issues. What is lacking is time to work on everything at once. It is all too easy to dismiss lowly "family trees" as beneath the consideration of a true research genealogist. Don't we have much better things to do with our time than mingling with the unwashed masses?

Whenever I have written about the issue of the proliferation of online family trees, I have always had comments about how important it was to use the unsupported information in those family trees as the basis for research. That concept works (although I think it is mostly a waste of time) as long as there are only a few such research suggestions. But now, with over 100,000 such possibilities out there from just one program, there needs to be a workable strategy that does not involve just sticking my head in the sand and ignoring the issue altogether.

In theory, collaboration is a positive idea. It is a way to avoid duplication of effort and crowdsource efforts to clean up the data, but in the face of this reality, there is a point when the crowd gets too large to manage. In the above MyHeritage situation, I could literally spend all my time just confirming Smart Matches and communicating with the huge mass of connecting family trees. How do I distinguish those who have just copied their entries from those who have something legitimate to offer?

At the other end of this spectrum of family trees is the FamilySearch.org Family Tree and other such constructs such as WikiTree.org, WeRelate.org and Geni.com where there is an attempt to avoid unnecessary duplication. In MyHeritage.com, for example, if I sort by the number of matches for individuals in my family tree, I have a lot of ancestors with close to 200 matches each in other family trees in the system. Most of these matched individuals are 10 or 11 generations back in my own family tree and lived in the 1600s. Even if I wanted to investigate one of these ancestors, it is very likely that serious research needs to be done in that particular family line before getting back to this particular ancestor. It would be nice if there were some way to determine those connections that had information I did not have and those that were merely copies.

My present strategy is to concentrate on one ancestral generation at a time. I am now working on my 6th generation and adding all the sources and correcting the entries from the sources. Obviously, this is a never ending task, but since I am making the changes and adding the sources to the FamilySearch.org Family Tree, I am making some progress. I am also adding sources from Ancestry.com, MyHeritage.com and Findmypast.com, as well as many other programs, microfilm records and my own pile of paper records.



Call for Papers for South Davis Family History Fair

The Utah Genealogical Association's annual South Davis Family History Fair is now calling for papers for the event to be held 23 April 2016 at the Woods Cross High School, 600 West 22 South, Woods Cross, Utah. Here is the information about the proposals and the conference:
Each presentation will be 60 minutes in length which includes time for questions and answers. Presentations should reflect the latest status of research and publication on the topic. The deadline for proposals is Monday, November 16, 2015.

We welcome proposals that allow participants to gain new skills and information in the following:

 Getting Started: Those new to family history or who have never done research, or other beginner topics

 Online Research: Using computers, technology and the Internet for family history research, Genealogical websites, etc.

 Research Methodology: Beginning, intermediate and advanced research methodology in an area specific region in the world. Including pedigree analysis, evidence evaluation, tracing immigrants, LDS research, records sources, etc.

 Technology: Family History databases and programs, getting teenagers involved, Facebook, Twitter, blogging, APPs for Smartphones, IPADs, YouTube, EBooks, digital photography, audio recording, etc.

 Family History: Family organizations, family collaboration, writing a personal or family history, editing and publishing family history, etc.

Proposals must include:

 Full name of the presenter, current e-mail, telephone number

 A brief biographical sketch of the presenter for the syllabus (50 words maximum)

 Title of the presentation

 Short class descriptions (50 words maximum)

 Lecture experience

Compensation:

Speakers participating in the Conference will receive:

 Complimentary registration

 Free lunch

 Computers, projectors, and Internet access will be provided for speakers to use for their presentations.

Please e-mail presentation proposals in Microsoft Word or .PDF format to Ginny Ackerson at ugaconferences@gmail.com no later than Monday, November 16th. Completed syllabus must be submitted no later than Thursday, March 31st.

Monday, September 28, 2015

Dealing with VLPs (Very Large Pedigrees)

There is a tipping point in genealogical research that occurs when the individual records in your personal database hit a certain number. This number varies from person to person, but is usually around 5,000 or so. I first encountered the very large pedigree issue when I inherited over 30 years worth of records from my great-grandmother. My first reaction was, who are these people? Today, with an additional 20 years of accumulation and having gone into hyper-speed with online family trees, I am still asking the same question.

Roughly speaking I see the following transition points as I work with genealogists around the world.
  • Very Small Pedigree -- 0 to 100 names
  • Workable Small Pedigree -- 100 to 1000 names
  • Developing Large Pedigree -- 1000 to 5000 names 
  • Very Large Pedigree -- 5000 names or more
  • Extremely Large Pedigree -- Any pedigree over 10,000 names
I have seen files that had well over 100,000 names. At this level, the file reaches critical mass and begins an implosion. Usually because of the natural process of pedigree collapse, the duplicate and unrelated entries outnumber the valid relationships. The data problems and inconsistencies begin to move like viruses through the data.

The fact is anyone can create a file with over 100,000 names in a matter of hours of consistent work online. In my case, I could do this easily by using programs I have on my computer for downloading generations of names from the FamilySearch.org Family Tree. This can be done by disregarding accuracy and any demonstrable relationship. For the life of me I cannot understand why I would want to do this.

One of the first questions I always get when I mention genealogy is "How far back have you gone with your genealogy?" The second question is "How many names do you have in your file?" These are asked as if genealogy were some sort of competition. As a matter of fact, I don't know the answer to either one of those questions and I am not going to take the time to find out. A much more appropriate question would be "How many individuals have you sourced and verified?" But even that question seems to beg the point of what I do anyway.

Going back to my great-grandmother and her genealogical efforts, after thirty years of accumulation she had recorded most of her lines three separate times. That much paper made it impossible for her to see that she had done the research previously. Duplication of effort became her biggest problem and it is likely that she didn't even know this had happened. 

One of my persistent themes in teaching classes is that the researchers should verify every link in every family line. Online family trees make this possible, but the common practice of ignoring sources suggested by these programs makes the activity difficult. If I go to the FamilySearch.org Family Tree, an accumulation of all of my family's genealogy for the past 100 years or so, I have yet to find a line that does not end with some ridiculous and unsupported factual assertion with no sources. My Tanner line ends with Francis Tanner, b. 1708, d. 1777. After that, the information is garbled and lacking in sources. Some lines go back further, but inevitably they end. Interestingly, in the FamilySearch.org Family Tree, all of these lines continue back into the dim past. That same Tanner line continues back to a Matthew Tanner, b. abt 1510 in Wiltshire, England and d. 1565. The interesting fact in this supposed line is that the line goes from a William Tanner, b. abt 1608 in Kent, England to Wiltshire, without any supporting connection.

Oh well, that is a constant background to all I do. But the point here is that size really does matter. The question is, how do we deal with our data when we hit the Developing Large Pedigree stage or if we inherit a VLP?

This question is the same as the one that asks how we eat an elephant (not that anyone is going to try this anytime soon). The answer is one bite at a time. We need to have the intestinal fortitude to "prune the tree." Cut off those parts that really don't have any demonstrable relationship and focus on the real issue of sourcing the information we already have. New individuals will be a natural result of careful, systematic research. I would so much more appreciate some researchers who verified the Tanner line on the Family Tree with valid information and forgot about adding more names from English Parish Registers gathered willy-nilly from different parishes. 

I will likely come back to this subject as I have in the past quite a few times. 

Sunday, September 27, 2015

What Do the Reviews Have to Say About Genealogy Programs?

I have mentioned GenSoftReviews.com a number of times in the past. This website is one of the very few places where you can go to get user views of genealogy software products, both online and desktop based. After seeing a few reviews in my blog reader list, I decided to see what they had to say about some of the popular programs. Keep in mind the fact that any program with only a few reviews is subject to question. With reviews, more is always better. Here is my summary with a few comments of my own.

MyHeritage.com
There are 218 total reviews of MyHeritage.com and it is at 4.22 stars out of 5.0. The most recent reviews are most complimentary about the support from the company. Where else do I see that? Some of the reviews characterize customer support as "fantastic," "personalized help," and "simply the best." Some of the other positive comments include statements such as "ease of use," "intuitive software," "their instructions were perfect," and so forth. Not all the entries are uniformly so bubbly and positive, but by and large the almost five stars seems warranted. One comment caught my eye, "I am thrilled with “Discoveries” as it gave me an ancestor earlier that my earliest! WOW!"

Customer support is mostly very spotty in the online genealogical software business. It is nice to know about something positive. Family Tree Builder, the MyHeritage.com free desktop program, is rated at 4.42 stars, putting it in the high range for all programs. Again, excellent customer support is the big factor.

I found no reviews for either Ancestry.com or FamilySearch.org. There were only three reviews for Findmypast.com, all of which were quite old. There were no reviews for AmericanAncestors.org. I find that interesting, but puzzling.

As of the date of this post, GenSoftReviews.com has 3331 reviews of 931 products. Here is one that is not quite so highly ranked.

Family Tree Maker - Current Version (This is Ancestry.com's desktop program)
With a ranking of 1.69 stars out of 5.0, you have to wonder what Ancestry.com is doing with this program. There are 466 reviews so the "average" can't be skewed by a few sour notes. The Mac version of this program is rated even lower at 1.5 stars with 116 reviews. Complaints include the following statements, "can't get the program to open," "my tree shuts down constantly," "lack of compatibility," and many comments that there are no pros. I have thought about this issue for some time. I think the program is OK. But I realize that they have updated the program almost every year, often with a completely new version. Many of the complaints compare the present program with previous versions and do so in an uncomplimentary fashion.

The next review is one of the great genealogical mysteries of all time.

Personal Ancestral File (PAF)
What can I say? Personal Ancestral File hasn't been updated now for 13 years. It is unsupported and has few of the features of the new programs. It does not connect with any online program at all. I could go on, but the reviews give this program 4.68 stars. It seems that all of the other programs are still competing with the free version of PAF. This comment summarizes the users' opinions: "It does most of what you want and may not be perfect but I haven’t found one that is yet." There are only 46 reviews. The reviews for this program illustrate the reason why I do not do software reviews per se.

Legacy Family Tree 
At 2.59 stars, this program must be experiencing some problems. The reviews from before 2014 were at 3.95 but the reviews for 2015 are at 1.77. User support seems to be one big factor. Another big factor seems to be lost data on imports. This confirms several reports I have heard from users first hand the last few weeks. The lost information seems to happen when you update the program. This is an excellent reason for reading reviews.

Ancestral Quest
This may well be the highest rated desktop genealogy program at 4.76 stars but it only has 26 reviews. Apparently, the people who buy it, like it. It has been as high as 4.98 stars in the past.

RootsMagic.com
Here is another very popular program. At 4.34 stars overall and 4.79 for 2015, it is a close race with Ancestral Quest. There are 93 reviews, which makes the rating a little more solid. The biggest pros seem to be that it is full featured and easy to use. They also have customer support. My blogging friend, Randy Seaver, left a five star review.

There are a lot more. Take a look for yourself.





Genealogy and Probabilistic Record Linkage

Record or data linkage is one of the critical issues in genealogical database construction. Now, before you nod off to sleep about this topic, let me point out that this is exactly what genealogists are involved in doing every day they do research. As genealogical researchers, we are immersed in the issues of data cleaning, removing duplicates, merging individual level datasets and other record linkage activities.

How do you go about recognizing two records in two files that represent identical people? Additionally, how do you go about recognizing the existence of duplicate individuals in the same program? Here are some examples of why I am writing about this subject:


This screenshot shows the results of searching for a duplicate for a person named John Bryant identified by a Person Identifier of LHP9-RZP. This is from the FamilySearch.org Family Tree. How do I determine which, if any, of these suggested duplicates are actually duplicates? Could I design a computer program that would determine the correct solution to this problem? The limitation in designing such a program for genealogy lies in the details present in the records. There are several possible issues with original historical records:

  • The information in the record may be complete and accurate
  • The information may be incomplete and accurate
  • The information may be incomplete and only partially accurate
  • The information may be incomplete and inaccurate
Researchers are faced with this challenge every time they find a name in a record and have to decide whether or not the found record should be included, in whole or in part, into an existing database. 

Let me put this into a hypothetical situation. Suppose that I am doing research to find information about an ancestor. I find an entry in a parish register. I have several variables that need to be considered. Such as the following:
  • Spelling variations
  • Variations of dates and places
  • Variation in the identity of associated individuals
Depending on the amount of information in the original record there could also be many other variables. Probabilistic record linkage involves assigning various degrees of possible linkage based on those factors which agree or disagree with what we consider to be accurate. At this point, I should note that this particular issue is usually discussed in the context of managing large databases. Due to the fact that there are numerous possibilities for error, this whole process is really at the core of the accuracy of genealogical research. The challenge is whether or not a computer program can be designed to accurately make these determinations.
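To make the idea of "degrees of possible linkage" a little more concrete, here is a minimal sketch of the kind of scoring a linkage program might do. It is only an illustration, loosely in the style of the classic agreement-weight approach described in the record linkage literature, and it is not how any particular genealogy website works. The field names, the probabilities in FIELD_WEIGHTS, the similarity threshold, the two-year date tolerance, and the sample records (including the years and the parish name) are all assumptions I made up for the example.

```python
import math
from difflib import SequenceMatcher

# Hypothetical (m, u) probabilities per field, invented for illustration:
#   m = chance the field agrees when two records ARE the same person
#   u = chance the field agrees when they are NOT the same person
FIELD_WEIGHTS = {
    "surname": (0.95, 0.01),
    "given_name": (0.90, 0.05),
    "birth_year": (0.85, 0.02),
    "parish": (0.80, 0.10),
}


def text_agrees(a, b, threshold=0.8):
    """Treat spelling variants as agreeing when they are similar enough."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def years_agree(a, b, tolerance=2):
    """Allow a small variation in recorded years."""
    return abs(a - b) <= tolerance


def match_score(rec_a, rec_b):
    """Sum log-likelihood weights over the fields present in both records."""
    score = 0.0
    for field, (m, u) in FIELD_WEIGHTS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # missing information neither helps nor hurts
        agrees = years_agree(a, b) if field == "birth_year" else text_agrees(a, b)
        if agrees:
            score += math.log2(m / u)              # agreement adds weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement subtracts
    return score


# Made-up records: an entry in a tree and two candidate register entries.
tree_entry = {"surname": "Bryant", "given_name": "John",
              "birth_year": 1730, "parish": "St Mary"}
candidate_1 = {"surname": "Briant", "given_name": "John",
               "birth_year": 1731, "parish": "St Mary"}
candidate_2 = {"surname": "Briant", "given_name": "Thomas",
               "birth_year": 1702, "parish": "St Mary"}

print(match_score(tree_entry, candidate_1))
print(match_score(tree_entry, candidate_2))
```

The first candidate scores higher than the second because the spelling variation and the one-year difference are tolerated, while a different given name and a birth year decades off count against a match. A score like this only ranks possibilities; deciding where to draw the line is still a judgment call.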

It is apparent that some online genealogical database programs have achieved a high degree of accuracy, at least in the area of finding records that match those individuals in a particular family tree. The degree of accuracy depends heavily on the amount of information present in the original record and the accuracy of the information already in the family tree.

For this process to work properly, it is absolutely necessary to go through any database and clean up the data. Returning to my example above involving John Bryant, here is an example of the information that is presently in the Family Tree.


If you examine this data closely you'll see some anomalies. First, the birth and christening dates are the same. Second, the death and burial dates are the same. This could happen but it is unlikely. We should also note that there are three alternate names, each of which is designated as a "Birth Name." One of the names is "Thomas Bryant." It is very likely that this is not the same person. To understand why these birth names exist it is important to review the history of the entire program. Without doing so, I can simply conclude that someone has made a mistake. If I delete all three of these "birth names" what will be the consequences? One of the three names appears to be a spelling variation. This could indicate that the information contained in the "Vital Information" section is inaccurate. In this particular case there are a number of sources listed. Can these questions be answered by examining the sources?

If we focus on the dates involved in this particular example, we would realize that spelling variations in names should be expected rather than being the exception. The real question here is whether or not all of the considerations that go into resolving the apparent problems with the data in this particular entry could be programmed into a computer? Haven't we really gotten to the point where we need to have some additional information? In this particular record, the question most certainly arises upon examination of the sources when we find that one of the children, Sarah Bryant, was born after her father died. By the way, all of the source records show the spelling of the surname as Briant.
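Some of these considerations are simple enough to write down as rules. Here is a minimal sketch of what such checks might look like. The record layout, the function name find_anomalies, the 300-day allowance for a child born after the father's death, and all of the dates in the example are my own assumptions for illustration; they are not taken from the actual Family Tree entry.

```python
from datetime import date, timedelta


def find_anomalies(father, children, posthumous_days=300):
    """Return a list of warnings for one (hypothetical) father's entry."""
    warnings = []
    birth, christening = father.get("birth"), father.get("christening")
    death, burial = father.get("death"), father.get("burial")

    # Identical event dates are possible but worth a second look.
    if birth and christening and birth == christening:
        warnings.append("birth and christening dates are identical")
    if death and burial and death == burial:
        warnings.append("death and burial dates are identical")
    if birth and death and death < birth:
        warnings.append("death is recorded before birth")

    # A child can be born after the father dies, but only within months.
    if death:
        limit = death + timedelta(days=posthumous_days)
        for child in children:
            child_birth = child.get("birth")
            if child_birth and child_birth > limit:
                warnings.append(
                    f"{child['name']} was born more than "
                    f"{posthumous_days} days after the father's death"
                )
    return warnings


# Made-up dates, purely to exercise the rules.
example_father = {
    "birth": date(1700, 1, 1),
    "christening": date(1700, 1, 1),
    "death": date(1750, 6, 1),
    "burial": date(1750, 6, 3),
}
example_children = [
    {"name": "Child A", "birth": date(1752, 1, 1)},
    {"name": "Child B", "birth": date(1750, 12, 1)},
]

for warning in find_anomalies(example_father, example_children):
    print(warning)
```

Rules like these can flag an entry for attention, but they cannot tell you which of the conflicting dates is wrong. That still takes a look at the sources.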

In the context of a family tree, even before I begin making any corrections to this record from back in 1730, it is absolutely necessary that I correct the information for more recent individuals to have some assurance that I'm actually related to this individual. Far too many genealogical researchers rely upon information which they have inherited from others. We must also remember that in this particular case, there were three potential duplicates. Any potential matches between this individual and a record depend entirely upon the accuracy of the information already in the family tree. Presently, none of the larger online databases or individual genealogical databases provide "pruning" activities that show you exactly where your family tree ceases to be accurate.

The FamilySearch Family Tree attempts to do this with icons indicating grossly inaccurate information but ultimately the corrections of the data rely upon the individual judgment of the researchers.

If you would like to get into some reading about probabilistic record linkage, here are a few references:

Australian Bureau of Statistics. Assessing the Quality of Linking School Enrolment Records to 2011 Census Data: Deterministic Linkage Methods, Dec 2013 Research Paper. Canberra: Australian Bureau of Statistics. http://www.abs.gov.au/ausstats/abs@.nsf/cat/1351.0.55.045.

Batini, Carlo, and Monica Scannapieca. Data Quality: Concepts, Methodologies and Techniques. Berlin; New York: Springer, 2006.

Dong, Xin Luna, and Divesh Srivastava. Big Data Integration, 2015. http://dx.doi.org/10.2200/S00578ED1V01Y201404DTM040.

Fair, Martha, Statistics Canada, and Canadian Perinatal Surveillance System. Validation Study for a Record Linkage of Births and Infant Deaths in Canada. Ottawa: Statistics Canada, 1999. http://www.statcan.ca/cgi-bin/downpub/listpub.cgi?catno=84F0013XIE.

Herzog, Thomas N, Fritz Scheuren, and William E Winkler. Data Quality and Record Linkage Techniques. New York: Springer, 2007.

Machado, Carla Jorge. Early Infant Morbidity and Infant Mortality in the City of São Paulo, Brazil a Probabilistic Record Linkage Approach, 2002.

Machado, Carla Jorge, and Kenneth Hill. Probabilistic Record Linkage and an Automated Procedure to Minimize the Undecided-Matched Pair Problem Relacionamento Probabilístico de Dados E Um Procedimento Automático Para Minimizar O Problema Da Incerteza No Pareamento de Registros. [Rio de Janeiro]: SciELO, 2004. http://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=178144.

Newcombe, Howard B, John D Abbatt, and Eldorado Resources Ltd. Probabilistic Record Linkage in Epidemiology: Computer Methods for Searching Death or Cancer Files Yield Risk Data for Large Cohorts. Ottawa, Ont.: Eldorado Resources Ltd., 1983.

Statistics Canada, and Statistics Canada International Symposium on Methodological Issues. Symposium 2010 social statistics: the interplay among censuses, surveys and administrative data : proceedings = Symposium 2010 : statistiques sociales : interaction entre recensements, enquêtes et données administratives : recueil. [Ottawa]: Statistics Canada = Statistique Canada, 2011.

Saturday, September 26, 2015

A Few Things I Learned Today That I Didn't Know Yesterday

Sometimes I glean some of the most interesting things from the Web but there is not much I can say or comment about. Today seems to be one of those days. Here are a few things I found out the past few days reading my genealogy blog feed. I had to check my calendar on some of these to make sure it wasn't April. Some are noteworthy, some are just silly.

YouTube is a genealogy blog. Yes, I didn't realize that this vast collection of video from around the world was actually a genealogy blog. See Best Genealogy Blogs of 2015 from FamilyTree Magazine.

On that same website, I learned that podcasts are also genealogy blogs. See Best Genealogy Blogs of 2015 from FamilyTree Magazine. I guess I need to expand my idea of a blog.

I learned that Yogi Berra died. See Yogi Berra, 1925-2015.

I learned that Facebook was working with the UN to bring Internet access to refugee camps because Internet access is a fundamental human right. See Facebook working with UN to bring Internet access to refugee camps. What can I say? What can I say?

The U.S. Copyright Office issued a report entitled "Orphan Works and Mass Digitization." Apparently the Copyright Office has recognized what many of us have been saying for years that the copyright system breaks down when we talk about digitizing a lot of books and records. I may actually have something to say about this when I get a chance to read the entire report.

Resales of year-old iPhones are starting to heat up. See CNET news.

Mary Queen of Scots Is Cleared of Murdering Her Husband Just 428 Years After She Died

I found out the first person in history whose name we know. His name is Kushim. I can't wait until someone puts him in their family tree. 

Well, I had to stop sometime. 

Go the Distance -- Living in the World of your Ancestors

One of the most common errors made by inexperienced genealogical researchers is to cross jurisdictional boundaries without realizing that the two families found in the record could not be related because of the distance between the places. I consistently look up every recorded location when someone asks me to help them find an ancestor by using Google Maps to determine how far apart the locations are on the map. I gave an example recently of some sources added to the FamilySearch.org Family Tree that involved a family that lived in Huntingdonshire, England in the early 1800s. Someone had added a family that appeared on a parish register in Manchester, Lancashire. A quick look at the distances involved shows that Lancashire is over 100 miles from Huntingdonshire. The father of the family was born in 1808.
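If you have a long list of places to compare, the same check can be roughed out with a few lines of code instead of a map. The sketch below uses the standard haversine formula for great-circle distance. The coordinates are approximate values I supplied for the town of Huntingdon (standing in for Huntingdonshire) and for Manchester, and the function name is my own. Straight-line distance is also only a lower bound on how far anyone actually had to travel by road.

```python
import math


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two latitude/longitude points."""
    earth_radius_miles = 3958.8  # mean radius of the Earth
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


# Approximate coordinates (assumed for illustration): (latitude, longitude)
HUNTINGDON = (52.33, -0.18)
MANCHESTER = (53.48, -2.24)

miles = great_circle_miles(*HUNTINGDON, *MANCHESTER)
print(f"Huntingdon to Manchester: about {miles:.0f} miles in a straight line")
```

A result of more than 100 miles for a family in the early 1800s does not prove the two records belong to different families, but it does mean the connection needs some supporting evidence.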

This type of situation is sometimes dismissed by noting that the error is one of assuming that someone with the same name is the same person. But this is really a situation requiring a great deal more analysis than simply recognizing that the two families are not the same. The fact that people traveled and moved from one location to another is not at all unusual. The real question is whether they did so solely for the purpose of having children christened or for marriages or burials. In other words, are there other substantiating circumstances that support multiple locations? The same family in England with the questionable locations for christenings ends up moving to Australia.

Travel alone is not a deciding factor in determining whether or not two families are the same. But in all circumstances where the given locations appear to be different for different events such as births, marriages and deaths, it is imperative to determine whether or not the multiple locations are reasonable. Here is the rule:

The further you go back in time, the more likely it is that events occurred in physical locations closer together.

In our modern world of rapid transportation, we are used to the idea of traveling great distances in a very short period of time. Several of my friends travel extreme distances to commute to work. Some travel regularly across the United States and others travel around the world. It may seem obvious, but as you go back in time, transportation options were slower. Here is a series of maps from the following book that illustrate the time it took to travel by the fastest means possible during different time periods. The book is:

Paullin, Charles Oscar, John Kirtland Wright, Carnegie Institution of Washington, and American Geographical Society. Atlas of the Historical Geography of the United States. Baltimore: [s.n.], 1932.

The first map shows the time for travel in 1800:


The images are from the David Rumsey Map Collection. Moving forward in time to 1830, we see that with the introduction of the railroad the distances change.


By 1857, railroads had spread across the United States. You need to focus in on the fact that the construction of railroads and other means of transportation would have progressed at a different rate in different countries and in different areas in the same country.


Granted, the distance traveled is only one factor, but it is a threshold factor. What I mean by this is that if the time frame of an event and the distances recorded are not consistent, then there is a question that needs to be resolved. In places such as Wales, Denmark, Norway, Sweden, Germany and other European countries, the distance might be a matter of a few blocks or even a mile or more. It is not unusual for people to have very similar names in the same small communities. It is mostly just a cop-out to record all the names as if they were all related. This becomes troublesome when people start assuming relationships across parish or other jurisdictional boundaries.
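The distance check described in this post can be roughed out in a few lines of Python. This is only a sketch: the coordinates for Huntingdon and Manchester are approximate values I have assumed for illustration, the function names are hypothetical, and the threshold is a number you would have to choose yourself for the time period and place. It uses the standard haversine formula for straight-line distance, which will understate the distance actually traveled by road.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius of the Earth


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def needs_review(miles, threshold_miles):
    """Flag a pair of event locations that are farther apart than the threshold."""
    return miles > threshold_miles


# Approximate coordinates, assumed for illustration only.
HUNTINGDON = (52.33, -0.18)
MANCHESTER = (53.48, -2.24)

miles = great_circle_miles(*HUNTINGDON, *MANCHESTER)
# threshold_miles=25 is an arbitrary example value, not a historical rule.
print(round(miles), needs_review(miles, threshold_miles=25))
```

A flag from a check like this does not prove the two families are different. It only marks a question that needs to be resolved with other records.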

Friday, September 25, 2015

Tune in to Genealogy Webinars and Virtual Expos

Until the Mesa FamilySearch Library was shut down precipitously last November, the Library was producing a free series of webinars once a month. The last of these webinars was held on November 12, 2014 and presented by Emily H. Garber. This recorded list of webinars, including several by me, is still online as Webcasts. While I was in Salt Lake recently teaching a class on automatic, online, genealogy record hints, the class was converted into a webinar by FamilyHistoryExpos.com.

The Family History Library in Salt Lake City, Utah is holding free webinars. I have found you have to be vigilant to see the schedule and/or the announcements before the webinars have already been held. I will try to point these out in the future.

These online educational opportunities are both free and fee based. Webinars are usually limited to some number of total attendees due to the arrangement of the broadcast and what the promoter has paid for the service. In some cases, such as the Legacy Family Tree Webinars, the initial webinar is free and an online webcast is available for a limited time, but archived copies of the webinars are on sale through a fee-based website.

The FamilyHistoryExpos.com webinar I did recently is part of a continuing series of online virtual Expos. Holly Hansen, the CEO of FamilyHistoryExpos.com has initiated a three-level Expo experience for a full week: in person attendance, full-week online attendance at the classes or attendance at a single class webinar. The in-person experience involves a week in Salt Lake City, Utah. Attendees can stay at the Crystal Inn, a local hotel, and have class instruction both in the morning and late afternoon. The highlight of the experience is mentored research time at the Family History Library. If you cannot come to Salt Lake City, you can attend the classes virtually on your own home computer. If you just want to attend one class, you can tune in to the weekly webinar. You can find out about future webinars and Expos, both in person and virtual by subscribing to the Family History Expos Newsletter.

Attendees at each of the Expo experiences in Salt Lake City receive a copy of FamilyHistoryExpos.com's book on the subject of the Library Learning Experience. There are now nine published books containing a wealth of resources. The entire collection of nine books is available on the Shopping section of the FamilyHistoryExpos.com website. Some of the books are being sold on Amazon.com and the rest will shortly be online on Amazon.com as well as the FamilyHistoryExpos.com website. You can find the books on Amazon by searching for my name, "James L. Tanner."

There is a wealth of genealogy webinars available online. Of course I am partial to the ones I participate in, but there are many more available every month throughout the year. Don't forget that RootsTech 2016 is coming and they will undoubtedly be broadcasting some of the sessions of the Conference live and have video recordings of many of the sessions afterwards.

Maps of German Empire -- Always Finding New Things Online


During our classes and FamilyHistoryExpos.com workshop in the Family History Library in Salt Lake City, Utah this week, Ruth Maness, AG, showed us some old German maps on microfilm. Holly Hansen quickly located the maps online: Posselt-Landkarten. This turned out to be exactly the same maps as we were shown on the old Family History Library microfilm. The maps are high resolution, extremely detailed and are organized in collections by years beginning in 1782. The website is in German (of course) but by using Google Translate on the entire page, I can read the website in fairly good English.

Now if you have looked at a paper map with a magnifying glass or tried to read a map on a microfilm viewer, you can begin to appreciate this tremendous map collection. Here is a screenshot of the map viewer from 1782. 


Using the viewer, I can zoom in on any portion of the map.


Here is a list of the map collections:
  • Map of the Empire of Germany from 1782
  • Deutschlandkarte von 1813
  • General-Karte von Europa 1845-1847
  • Atlas des Deutschen Reichs von 1883
  • Karte des Deutschen Reichs von 1907
  • Monumentaler Plan von Breslau 1910
  • Justus Perthes' Karte des Deutschen Reiches
  • Karte des Deutschen Reiches
These are all on this one website. In addition, here are some more detailed maps on the same website:
Here is a screenshot of the "Karte des Deutschen Reiches."


Here is a zoomed in view of one of the maps:


Here is a detail view from this same map:


Then I can zoom in to see the detail in this map:


No more blurry, half-readable images in a dark microfilm reader!

Wait, there is more. While Holly and I were clicking around looking for different copies of these maps, we found some of the same maps on the David Rumsey Map Collection:


These maps were linked to Google Earth and could be seen at magnification superimposed on a current satellite image. 

I am beginning to feel like a late night TV advertisement. Wait, there is still more.








Thursday, September 24, 2015

Why would I want to be a professional genealogist?

The Association of Professional Genealogists (APG) lists 16 professionals when I specify Utah and Utah County in a search. Of those professionals, five do not do work for clients. I would assume that Utah would have the highest concentration of professionals in the country, so this is an interesting observation. Of course, there are likely some professional genealogists that do not belong to the APG, but even then, the number cannot be large. I did another search for Salt Lake County, the location of the Family History Library, and the number increased to 34 with 8 not taking any clients. As a control, I searched for Arizona, the entire state. There were 34 in the entire state with only 3 not taking any clients.

As a comparison, Arizona has over 16,000 attorneys while the state of Utah has almost 8,000. This is interesting because all a genealogist has to do to join the APG is to pay the dues and sign an agreement to abide by the rules. A lawyer has to go to school for 3 years, take a 3-day examination and submit to a background check. I would say that what I need to know as a genealogist is roughly equivalent to what I learned in law school. I might also add that attorneys need to be accepted by the Bar Association in their state to practice law, but a genealogist doesn't have to have any certifications or credentials at all. Out of all the genealogists listed on the APG searches, in Arizona only three had CG or AG designations. In Utah there are 138 genealogists listed. Of those listed, only 35 have CG or AG designations. In case you don't know, a CG designation comes from the Board for Certification of Genealogists (BCG) and the AG designation comes from the International Commission for Accreditation of Professional Genealogists (ICAPGen).

As I write part of this post, I am on a train to Salt Lake City, Utah with hundreds of Comic Con attendees. The last Comic Con in Salt Lake had 140,000 people attend. This may be why I am thinking about numbers. I am sitting across from a lady who has loudly announced that she is getting a Masters Degree in fairy tales. Maybe I missed my calling in life. (Afterthought: What am I talking about? I do deal in fairy tales. I am a genealogist!).

Now, why the numbers? For one thing, genealogy is not a growth industry. The demand for professional genealogists in businesses is pretty marginal. Even large companies like FamilySearch and Ancestry.com do not employ very many "genealogists" as such. They are more interested in programmers and people with other skills. Relatively few genealogists end up with full-time employment in a genealogy related business. There are some larger genealogy research companies, such as Ancestry ProGenealogists in Salt Lake City, Utah, but they list only ten genealogists.

What would be my prospects if I were to go to Brigham Young University and major in family history? Would I get a job when I graduated? I would probably try to get another degree in something like Library Science or History, just to make sure I had something that had an attraction to a prospective employer. ICAPGen presently lists four possible openings for genealogists. I would not find that to be very promising.

Do we see a significant growth in the future? Well, no. Some years ago, we started our graphic design business. At the time, there were dozens of local print shops in the Mesa, Arizona area. Within a few years, because of technological changes, mostly desk-top publishing, many of the local shops had disappeared. Genealogy is less volatile than something like printing, but it is affected by technology. Back then the idea was that anyone could do publishing on their own computer. That same type of message is being sent out to the genealogy community today: anyone can do genealogy on their computer at home. Why would they need a professional?

Personally I am driven to achieve "professional level" competence but I am past the point of wanting to make a business of genealogy. Nevertheless, I write and teach and occasionally get paid for both, but nearly all my teaching is now voluntary and unpaid.

If I were to go back in time and do it all over, I would seriously consider being a professional librarian/genealogist. I think the job satisfaction would be much higher than law. I certainly would not make as much money, but I would not have had all the conflict I had in law. Realistically I would be concerned about the job prospects and the openings. 




Plumbing the Depths of Online Record Collections -- Part Two

One of my favorite books of all time is the following:

Dunning, Stephen, Edward Lueders, and Hugh Smith. Some Haystacks Don’t Even Have Any Needle: And Other Complete Modern Poems. Glenview, Ill.: Scott, Foresman & Company, 1969.

The title of the book comes from one of the poems. Here the question asked by the title applies to the vast online genealogical database companies and is a serious consideration. How many of the huge, online databases are missing exactly those records you need to find your ancestors? Do you sometimes feel like you are looking for a needle in a haystack when you face the huge database programs?

Let me ask a few more questions about these huge programs to get started in plumbing their depths.
  • How many of the huge online genealogical databases have complete copies of the United States Federal Census Records? 
  • What percentage of the records on each of the websites is duplicated on other similar websites?
  • How many of the records on the large database websites are repetitious of other similar records on the same website?
  • What percentage of the records on these websites are not source records, but user contributed copies?
These questions point out some significant limitations in relying on the numbers supplied by any one of the websites. I am not picking on any particular site; this is a general issue with all of them. In fact this issue extends to websites with claims to far fewer collections and records. I certainly see the need for a fair degree of redundancy. It is comforting to know that records such as the United States Federal Census are available from several sources online, but when the duplicates start being used to puff up the total numbers then that becomes a concern.

One fact is clear. The total number of original source records being digitized and put online continues to soar. Millions of new, previously unavailable records are being added every day. It is also clear that records are being added from areas around the world that were not previously represented.

In genealogy, redundancy is absolutely necessary. It is very, very seldom that complete information is available about an individual from one record. For example, it is naïve to assume that a recorded birth date is accurate without some additional corroboration. Unsophisticated genealogists have a tendency to rely on the reporting of a single event in creating their genealogical view. This individualized focus engenders an atmosphere of uncertainty. This is especially true of situations involving distinguishing individuals with similar names.

In most cases, the larger websites provide a window into their inner workings. Usually, this is in the form of a catalog. For example, Ancestry.com has a "Card Catalog." This card catalog lists all of the "collections" individually on the website. In addition, Ancestry.com and some of the other websites provide a method of filtering the list of "collections" in a way to show what is available in any geographic area or in a specific topic. If you would like to know how the various websites compare in their specific holdings, I suggest a close examination of their catalogs. These listings of the various collections in each of the individual large websites are usually designated either as a catalog or as a place where you can search all the collections or view all of the collections. Sometimes it takes some searching to find the list of all the resources.

Size is far from the only concern with large online genealogical databases. Whatever the size of the database, the quality of the search engine is far more important. One of the most persistent complaints about all of the websites is the apparent lack of responsiveness of the searches. The user enters a search term, such as the name of an ancestor, and the program returns results that vary considerably from the expected results. It is usually the case that the returns vary wildly as to geographic areas and time periods. For example if I search for John Jones in New York in 1850, I do not expect to see John Jones in California in 1900 or even John Jones in England in 1640.

Part of this apparent unresponsiveness of the search engines is actually designed into the program. The process or set of rules implemented by the developers and programmers, often referred to as an algorithm, provides for wider responses if the main search terms are not met. So if the program cannot find "John Jones" in New York in the time period specified by the search, the program will default to providing any John Jones that appears similar. Because of the content of the databases, the results may appear random. Some of the larger databases tried to avoid this problem by separating the results of the searches into categories, either by awarding the results a rating from 1 to 5 stars, or by actually separating out the results into different categories depending on their perception of the reliability. The fact that the search turns up a variety of responses reflects the reality that these results were exactly what the programmers anticipated.
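To make the idea of a widening search concrete, here is a toy sketch in Python. It is not how any particular website actually works; the record structure, the tier names and the five-year window are all assumptions I made up for illustration. It tries the strictest match first and only falls back to looser matches when nothing is found, which is why a search for a New York record can come back with California and England.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    place: str
    year: int


def search(records, name, place, year, year_window=5):
    """Return (tier, matches), trying the strictest tier first and then widening."""
    tiers = [
        ("exact",
         lambda r: r.name == name and r.place == place
         and abs(r.year - year) <= year_window),
        ("same name and place",
         lambda r: r.name == name and r.place == place),
        ("same name only",
         lambda r: r.name == name),
    ]
    for label, matches in tiers:
        hits = [r for r in records if matches(r)]
        if hits:
            return label, hits
    return "none", []


# A made-up database that has no John Jones in New York at all.
records = [
    Record("John Jones", "California", 1900),
    Record("John Jones", "England", 1640),
    Record("Mary Smith", "New York", 1850),
]

tier, hits = search(records, "John Jones", "New York", 1850)
print(tier, [(r.place, r.year) for r in hits])
```

Reporting the tier along with the results plays the same role as the star ratings mentioned above: it tells the user how far the program had to widen the search before it found anything.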

If you want to get an idea of the accuracy of any given search engine, just search for something that you know is already in the database. You might be surprised at the results.

Another important factor in this accumulation of digitized documents is the quality of the indexing. Since accurate handwriting recognition is still an unobtainable goal and, given the fact that optical character recognition is also not perfect, the online documents have to be indexed by people, one letter at a time. Everyone who has researched old scripts and bad handwriting knows the challenge of accurate indexing. So amassing huge collections of scanned documents may make for easier access than seeing the same documents on a roll of microfilm, but without accurate indexing the advantage stops there.





Instagram usage passes Twitter with 400 million users

Did you know that Facebook owns Instagram? Did you know that Instagram recently passed Twitter in total users worldwide? In the past few months, I have been noticing a distinct decline in online, genealogical blogging activity. The major, well-known bloggers keep pumping out content, but the background of more local, family oriented blogs has decreased dramatically. At the same time, I have observed a dramatic increase in Instagram usage. According to Statista.com, Facebook leads the social network pack with about 1.4 billion users. However, the numbers shown on Statista.com do not agree with the news article on CNET. Statista.com shows Twitter, Skype, Google+ and Instagram almost neck and neck with around 300 million users each. Other figures show that Facebook steadily increases its user base.

My point is that there is only so much time in a day to spend on social media. Even if you are glued to your smartphone or tablet, you can only spend the time you are awake. So each individual only has a certain amount of elasticity to increase their use. What can change is the amount of time spent on any one social media outlet. If Facebook usage continues to grow and arguably still captures the same amount of time from each user, then the increase in Instagram use has to come from somewhere; my observations point at a decrease in blog traffic in certain areas. Statistics are hard to come by, but one website, BuiltWith.com, shows a steady decline in Blogger usage over 2015. Worldwide, according to WeAreSocial.net, the average online user spends 2.4 hours a day on social networking. Their conclusion is that social networking is now converging around mobile devices.

As I have indicated previously, among my very active, online, children, I am seeing a dramatic shift from blogging and Facebook to Instagram.

If social networking is becoming pervasive worldwide (which it is) and assuming that the conclusion that mobile devices prevail is correct, then it is only natural to see a decline in personal blogging. You should also see a decline in access to those websites and venues that depend on a full keyboard. Hmm. When was the last time you posted a blog from your smartphone?

How does all this compare to usage of the Web for genealogy? Interestingly, according to Google Trends, searches for the term "genealogy" continued their decline. Here is the graph.


There are various ways to measure interest and usage in a website. Actual use statistics from various online websites seem to contradict each other on occasion. I tend to put more credence in search trends than in usage figures because search trends are more device independent and indicate more what people are doing rather than just the fact that they are looking at the Web. Of course there is a wide latitude for controversy in this area.

Here is an update of the Google Trends graph for the three big online websites as targets for searches:


If I change the search terms to eliminate the URL designation of the searches the graph changes significantly:


What part does a decrease in searching for the term "genealogy" play in the decline of genealogical blog posts? Are we attracting any new readers or just maintaining our developed readership? If all of you out there stop reading my blog, does that mean I can stop writing and do something else?

Wednesday, September 23, 2015

Plumbing the Depths of Online Record Collections -- Part One

The seminal development in genealogical research in the past few years is likely the online, digitized record collection. Some of these collections have grown into massive conglomerations of whatever happened to be available to the company accumulating the records. The whole idea of accumulating vast quantities of digitized records for genealogy only dates back to about the 1990s with the founding of a company called Infobases, Inc., the predecessor of Ancestry.com. For a brief history of Ancestry.com, see the article reviewing its development in Wikipedia. By the way, these companies are not particularly interested in giving anyone a detailed history of their development. Corporate secrecy and the resultant carefully crafted press releases usually result in quite vague statements about how these companies have grown so large.

The combined technology that allowed the creation of these huge websites had to wait for the individual growth of a number of related developments. First, there needed to be a cheap and efficient way to convert paper documents into digital images. At the same time, the capabilities of computers and data storage had to increase so that the large digital images created could be viewed. Enough people had to acquire the new technology to create a market for the images and a way had to be developed (the Internet) to disseminate the images to a wide audience. In addition, a commercial, subscription-based company model had to be developed and people induced to pay for the privilege of simply viewing online content. For example, without the inexpensive, large, high resolution, computer monitors (aka televisions) available today, how many of us would be trying to read old documents online? The number of developments that had to come together to produce these large online collections is truly staggering.

At the same time, without the explosive growth of online businesses, there would have been no incentive to produce huge databases collecting original source documents. This puts genealogists in the position of having easy access to billions upon billions of records. Think about the numbers. If you were to look at a billion documents and spend only one second on each document and look 24 hours a day without any breaks, it would take you almost 32 years to look at that many documents but of course, you would not live more than a few weeks at most. Some of the larger companies claim to have many billions of records. So how do we really know what they have?
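If you want to check that arithmetic yourself, or try it with other numbers, here is a short Python sketch. The function name is just something I made up for the illustration, and I have assumed a year of 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumes an average year of 365.25 days


def years_to_view(record_count, seconds_per_record=1.0):
    """Years needed to look at every record once, viewing around the clock."""
    return record_count * seconds_per_record / SECONDS_PER_YEAR


print(f"{years_to_view(1_000_000_000):.1f} years")  # prints 31.7 years
```

At the same rate, a collection of 6 billion records would take about 190 years to look through.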

In past posts I have commented on the fact that none of the larger companies use the same method of measuring their huge collections. In determining if the claims of billions of records are correct, we are quite literally at the mercy of the large companies, their public relations and marketing departments, and their search engines. We can see unimaginably large numbers, but these numbers are essentially meaningless. What am I going to do with 6 billion records? The implicit assumption made by those who promote the mega-databases is that bigger is always better. Centralization is always considered to be a positive. No one seems to stop to think about how the average person is going to process all that information and no one stops to question the advantages (if any) of centralization.

But if we go behind the facade of big data, we find a different issue. What business are these large online database companies actually in? Each of the large companies now has "strategic partnerships" with other entities and, in the most recent developments, these entities include DNA processing companies. I am very far from being an alarmist conspiracy type person. It is also, arguably, very advantageous to genealogical researchers to have access to this sort of consolidated company, but on the other hand, can these companies continue to expand into related areas and keep providing adequate service in every area? Is larger always better? There is an old saying, "The bigger they are the harder they fall." I am reminded of the recent announcement of Ancestry.com's venture into the "health industry." See health.ancestry.com.

Can we really contemplate what effect it would have on genealogy if one of these larger online database operations were to "go out of business?"

What are the practical realities of a vast centralization of the world's genealogically significant records? We assume that putting all of our eggs in one basket is a positive development. Where does that assumption come from? Of course, I personally benefit from easy access to billions of records but I also have acquired the individual skills and have the computer power and equipment to utilize all that information. How many people really have the time, the money, the education, the inclination and the perseverance to use the information already accumulated?

I do not think that these questions involve value judgements. These large accumulations of records are neither "good" nor "bad." I am not addressing these issues from an ethical or value standpoint. I am addressing the issue of size, per se. It appears that this discussion is going to continue in another post or may even end up being a series. See you next time.

If you are interested in doing your own thinking on this subject, I would suggest that you start by reading the following book:

Taleb, Nassim Nicholas. The Black Swan. London: ALLEN LANE, 2011.

Tuesday, September 22, 2015

Find Your Immigrant Ancestors - Naturalization Records -- Part Three

This is the third in a series of posts about naturalization records. Here are links to the previous installments:

http://genealogysstar.blogspot.com/2015/09/find-your-immigrant-ancestors_21.html
http://genealogysstar.blogspot.com/2015/09/find-your-immigrant-ancestors.html

By the time I got into writing the second part of this series on naturalization, I began to remember the complexity of the immigration laws in the United States and I realized that this subject was going to continue for a while. Who can become a citizen of the United States and the status of immigrants has always been a political football and it is one of many subjects I can get very involved in and move into my tirade mode.

As I have noted, discovering the origin of the immigrant is one of the most challenging and at the same time, most common, issues facing genealogists. I spent a considerable time searching for the birthplace of one of my ancestors in Ireland and thought I might find the place noted in naturalization records in Pennsylvania. In this case, I looked for records from his children who were born in Ireland. I was disappointed to find that the naturalization documents listed the birthplaces as "Ireland" and that was not much help.

The question for this installment is where are the records?

As I noted in the last installment, before 1906, naturalization was handled by local courts. So the documents are scattered all over the country. After 1906, the records have been kept in the National Archives. There are some concentrations of the records however. Here are some of the major repositories of naturalization records:

There are many more websites with articles, lists and links. Do a Google search on "online naturalization records" for hundreds of links.

How do you go about finding the records?

The first step is locating the immigrant in the United States. You have to identify a specific event in the ancestor's life at a specific time and place. Without a specific location of an event, you cannot be sure you are searching the right records. Next, you need to review census records and other records to see if there is any mention of the ancestor's naturalization. You need this information to narrow the extent of the record search. Once you have a place and a time frame, you need to identify the courts having jurisdiction in the time and place where your ancestor lived and determine which courts would have had jurisdiction over naturalization petitions. Moving on, you then need to determine where the court records are located and search the records (if they still exist). As I indicated, before 1906, you might be disappointed at the lack of information in the record. Then again, you might hit the jackpot. 


Genealogy and the Narrative Fallacy

The narrative fallacy was most recently popularized in the following book:

Taleb, Nassim Nicholas. The Black Swan. London: ALLEN LANE, 2011.

If you have read this book it would certainly help you to understand what I am writing about in this post, but it is not essential.

The narrative fallacy is our need to fit a story or pattern into a series of connected or disconnected facts. Part of Taleb's essential premise is that we impose our individual explanations on past events based on post hoc ergo propter hoc (after this, therefore because of this) reasoning.

Most recently I have been stuck in just this sort of situation. On my Tanner family line, my own research had traced the ancestral line  back to the 1600s in Rhode Island. In 1680, William Tanner first appears as a witness on two disclaimer deeds. Although his existence is well documented and his grave site located, his marriages are unclear and it is uncertain as to whether or not subsequent references are to William Tanner or to a son with the same name. The real mystery is connecting him to his family in England. One supposition is that he was "transported" as an undesirable or criminal to America. There is an extensive discussion of this entire issue on TheAncestorFiles.blogspot.com (in the list of people, look for William Tanner).

The narrative fallacy becomes relevant because there are those who purport to connect him to a William Tanner in England. The claim has been published online many times and the pedigree in FamilySearch.org reflects parents in England. Currently, the Family Tree identifies William Tanner as born in England and married in Connecticut. Here is a screenshot:


William Tanner is identified as William Francis Tanner, Sr. and he is shown as the son of John Tanner born in Bromley, Kent, England, whereas William Francis Tanner, Sr. was supposedly born in Chipstead, Surrey, England. William Francis Tanner, Sr. is currently shown with nine different wives. The name of "William Frances Tanner, Sr." comes originally from the Ancestral File. Part of the confusion here comes from the existence of another prominent Tanner family in Connecticut at about the same time. Here is a screenshot of the book which is readily available from a number of online sources including Google Books and Archive.org.


This Thomas Tanner, Sr. has a son, born in Rhode Island, named William Tanner. At about the same time, there are other individuals named William Tanner in Massachusetts and other parts of the Colonies. Here are a few facts:
  • My ancestor, William Tanner, is not recorded as having a middle name in any record attributed to him. None of the other members of the family had middle names.
  • None of the records produced by the various parties claiming to have connected the Rhode Island William Tanner who appears in 1680 connect him to a family in England. The arguments for all of the records in England are basically that the same name = the same person. 
  • The William Francis Tanner, Sr. born in England is born in a different parish than the John Tanner reported as his father. 
  • William Tanner was a member of the Seventh Day Baptist Church in Rhode Island and this could explain his appearance in America. 
Both my daughter and I have asked for documentation showing the connection between the William Tanner born in Chipstead, Surrey, England and the William Tanner in Rhode Island. I have never had a response to my request for documentation except to claim that they have found the ancestor in England. I am still waiting to see a connection. Now, I am wondering how the John Tanner in England, who is listed as his father, is born in Bromley, Kent, England and is buried in Chipstead, Kent, England. This could all be true, but no one has produced any documentation connecting these various people. 

Now back to the narrative fallacy. The common genealogical fallacy of claiming that a person with the same name is the same person is a very good example of the narrative fallacy. When we do this, we impose our own view of the history without substantiating facts. The fact that a person discovered in a relationship has the same name does not establish a relationship. What is missing in my example is some documentary evidence connecting the English Tanner with the person in Rhode Island. It may be that no such evidence exists. If he were a transported person because of his religious beliefs, he may have changed his name when he came to America and he may not have been a Tanner in England at all. A conclusive record would be one that connects the two individuals, such as finding a Seventh Day Baptist Church Record showing where he came from in England. It would even be helpful to show that the William identified in England was a Seventh Day Baptist and/or moved to America or was transported to Rhode Island. 

There is a lot more history here and the issues become very complicated. Because of the lack of connecting documentation, there is a doubt as to the conclusion. I am not a bad person just because I don't happen to see sufficient documentation to establish a connection. What happens in these situations is that the narrative fallacy created becomes the reality for those who believe the story. A real issue and problem arises when the narrative fallacy imposes its story on the facts to the exclusion of any alternate explanations.

Here is a good example of a genealogical narrative fallacy: what happened to the 1890 U.S. Census Records? If you have bought into the narrative, you will reply that the records were destroyed in a fire. Is this actually the case? How and when did most of the records get destroyed? Do you know enough about the history of the 1890 U.S. Census to answer that question?

If you want to see the answer to the Census question, see "First in the Path of the Firemen": The Fate of the 1890 Population Census, Part 1, by Kellee Blake, from the U.S. National Archives.

Monday, September 21, 2015

Two New BYU Family History Library Genealogy Videos


The Family History Library at Brigham Young University has published two new videos. The first is an introduction to The Family History Guide website. I had the opportunity to present a series of classes at the Library and I distilled the classes into a much shorter presentation. The second video is entitled "Using FamilySearch Record Hints." You may wish to subscribe to the BYU Family History Library Channel to get notices of new videos as they are posted.


We are closing in on 100 videos. Please feel free to make suggestions for topics for future videos.

What is the largest online source for genealogical information?

Several websites claim to be the largest online source for genealogical information but depend on their own definition to establish their claims. Size is always considered a selling point. Think about the giant, economy sizes in your local warehouse store. We tend to be impressed with records of any sort. The Guinness Book of World Records is already out for 2016 and the website has 3804 records for the "largest" things. Some of the largest things recorded include the largest crochet blanket and the largest wearable turban. Size doesn't always equal value; although the largest diamond in the world might be an exception. But I might think someone was grasping at straws to claim the largest collection of stamps featuring Popes.

My rule is that size does not matter if the website does not have the records you are looking for. Notwithstanding my rule, there has to be a certain fascination about size or the websites (or libraries) wouldn't claim to be the largest.

The dividing line between including Google in the claim for being the largest "anything having to do with genealogy no matter how remote the connection" and an obvious genealogy site such as Ancestry.com or FamilySearch.org, is the characterization of the website. In short, does the website purport to be used for genealogy or is genealogical information merely one of the side benefits from having a lot of stuff? Granted, having a lot of records in one place may seem to be an advantage, but if I do a search on a large website for an ancestor and get over a million results, what am I supposed to do with that?

In reality (whatever that is) there is no practical way to compare the size of the larger online genealogical database programs. As I have observed in the past, there are no clear definitions for the terms records, collections, names, etc. used by the various websites. For example, Ancestry.com lists 32,222 record collections, while FamilySearch.org lists 2036 collections. How do you compare the two lists of "collections" when one collection can contain one record or millions? In addition, Ancestry.com's collections list all of the records they have available. On the other hand, FamilySearch.org has millions of records still in microfilm format that are not yet included in the Historical Record Collections.

Any claims about the total number of records or names in those records must be an estimation at best. I can't imagine anyone sitting down and counting a billion records. Technically, if the individual records were entered into a computer database, the computer database could return a number of total records. That would assume that each individual record was uniquely defined in the database. However, take for example a U.S. Census record: a single record may have 50 names or no names. Here, I use the term "record" to refer to a single U.S. Census sheet or page.
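To make the counting problem concrete, here is a minimal sketch in Python. The data is entirely made up for illustration, and the names `census_pages` and `summarize` are my own inventions, not anything used by Ancestry.com or FamilySearch.org. It simply shows that the same small set of census pages gives three different "sizes" depending on whether you count pages, name entries, or distinct names.

```python
# Hypothetical, made-up data: each inner list stands for one census page
# (one "record" in the sense used above) and holds the names written on it.
census_pages = [
    ["William Tanner", "Mary Tanner", "John Tanner"],
    [],  # a page with no names at all
    ["William Tanner", "Thomas Tanner"],
]


def summarize(pages):
    """Return three different 'sizes' for the same set of pages."""
    page_count = len(pages)                          # one record per page
    name_entries = sum(len(page) for page in pages)  # every name written down
    distinct_names = len({name for page in pages for name in page})
    return page_count, name_entries, distinct_names


page_count, name_entries, distinct_names = summarize(census_pages)
print(f"pages: {page_count}, name entries: {name_entries}, "
      f"distinct names: {distinct_names}")
```

Even in this tiny example the three numbers disagree, and none of them tells you how many actual people are involved, since two entries with the same name may or may not be the same person.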

If I have a smaller database, conceivably I could search the entire database manually to determine if there were any records pertaining to my ancestors. With very large databases, we are entirely dependent upon the ability of the search engine to tell us whether or not there are pertinent records available. Although I may search a large database over and over I am never quite certain that I have effectively determined the existence or nonexistence of any specific record. For example, I am not going to search every record in an entire census year simply for the purpose of determining whether an ancestor lived in the United States.

Claims that a database has "billions" of records usually ignore the issue of duplicates. In addition, including user-generated family tree entries as "records" obviously obscures the entire claim to a large number of records. My guess is that users will be more impressed with the accuracy of the searches of even a limited number of records than they will be if the searches are too general and thereby less useful.

I recently visited two different stores in one day. One was a large warehouse store with thousands of products in large quantities. The second store was an extreme contrast. It was a small store with very specialized products. I found both stores to be helpful. The interesting thing is that I would not have gone to the large store to purchase the items I ended up buying at the small store. I think genealogy databases work the same way.