RootsTech 2014

Some people eat, sleep and chew gum, I do genealogy and write...

Wednesday, August 27, 2014

The Limits of Accuracy

In quantum mechanics, the uncertainty principle is any of a variety of mathematical inequalities asserting a fundamental limit to the precision with which certain pairs of physical properties of a particle, known as complementary variables, such as position x and momentum p, can be known simultaneously. This is sometimes referred to as the Heisenberg principle, but it was formally derived by Earle Hesse Kennard and Hermann Weyl. See Wikipedia: Uncertainty principle.

In 1931, Kurt Gödel proved two theorems of mathematical logic. Quoting from Wikipedia: Gödel's incompleteness theorems,
The first incompleteness theorem states that no consistent system of axioms whose theorems can be listed by an "effective procedure" (e.g., a computer program, but it could be any sort of algorithm) is capable of proving all truths about the relations of the natural numbers (arithmetic). For any such system, there will always be statements about the natural numbers that are true, but that are unprovable within the system. The second incompleteness theorem, an extension of the first, shows that such a system cannot demonstrate its own consistency.
One of the most influential books I have ever read discusses the application of the Heisenberg principle and the incompleteness theorem of Kurt Gödel. The book is Hofstadter, Douglas R. Gödel, Escher, Bach: An Eternal Golden Braid. New York: Basic Books, 1979.

These mathematical principles were "discovered" and changed mathematics in a fundamental way. So, you say, here we go again, how can this possibly have anything to do with genealogy? Well, humor me. Here is what I have to say on the subject of certainty. In essence both principles boil down to the statement that some measurements can never be precise and some systems cannot ever be proved completely without resorting to proofs outside of the system. These limitations were not known until quite recently. Now, I am not saying that genealogy is very much like mathematics, but there are several concepts that suggest similarities.

It has only been quite recently that a large mass of genealogical data has been created in centralized repositories and made reasonably available. Historically, genealogists had to focus their efforts entirely on a smattering of records scattered pretty uniformly all over the world. Those large centralized data accumulations, such as the libraries of the Genealogical Society of Utah (i.e. the Family History Library in Salt Lake City, Utah), the New England Historic Genealogical Society and others were only accessible if you had the time and the resources to travel to the library itself. Now today, most of these scattered collections still exist, but technological changes have made untold millions of records accessible for free or for a moderate cost. For example, even with the advent of microfilmed records, obtaining access to the microfilm could be a slow and inefficient process. You could only guess from the microfilm's catalog description if it might contain information about your ancestors and it was very disappointing to order a film, wait until it appeared and then find that it had no useful information.

Most recently, advancing technology has resolved some of the microfilm issues by making the images available in digital format online. But the changes being wrought by technology are more basic than just increasing genealogical record availability. The vast accumulations of genealogical source records and online family tree entries are pointing out some ultimate limitations on accuracy and completeness. Ultimately, these physical limitations of the system will be the most significant barriers to genealogical research.

The effect of the accumulation of massive genealogical data concentrated online and available to huge numbers of people who, previously, would never have had such access is starting to reveal the fundamental limitations in the overall system. The most evident of these limitations is the level of accuracy of the overall system. Researchers are becoming painfully aware of the inconsistencies and contradictions contained in the accumulation of records. You can rail against the inconsistency, but as we accumulate more and more sources and records, the fact that there are such inconsistencies becomes more and more apparent. For example, if my knowledge of my ancestors was limited to one source, perhaps a surname book or a pedigree chart from a relative, I had one version of my ancestral history. Today, we can have hundreds or even thousands of versions. It is not just the compiled family tree records that are at fault; the very documents we refer to as sources are inconsistent and vague. There are few days that go by that I do not hear complaints about the inaccuracy of the system.

What is less apparent is that there is an ultimate limit to our accuracy. No matter how careful we are and no matter how experienced, we will encounter inconsistency. We will be forced to choose between different dates, places and even different people. It is not that the records are unreliable; it is just that we now have access to multiple versions of the same types of records and multiple records that were either previously unavailable or impossible to obtain. Even if we resort to the time-honored "proof statement," in many more cases we will be forced to admit that the conflicting evidence cannot be reconciled. We have found our own genealogical uncertainty principle, the limit of our ability to prove our ancestry.

But there is an even deeper issue: genealogy, as such, is incapable of even defining itself. This is similar, in some ways, to the limits imposed by the theorems proved by Kurt Gödel. By analogy, genealogy can never be completely defined or complete. In saying this, I am expressing a physical certainty. There will always be more records, but the search for absolute proof will never be complete. We cannot even come up with an adequate definition of "genealogy," much less come to a universal agreement as to what constitutes an adequate proof.

We can wring our collective hands over these limitations but they are physically imposed by the system we have collectively created. Genealogy will always be uncertain and incomplete. 

Tuesday, August 26, 2014

A Radical Change in Technology: Why I no longer need a Windows Partition on my Mac

I admit that the title of this post is somewhat techy in nature, but it points out a fundamental shift in the way that computers operate and, in particular, a shift in genealogy programs in general. Three separate trends have coalesced to create this dramatic change in the way I approach the interface between my use of the computer and my genealogical activities.

For many years, my involvement in genealogy necessitated that I maintain both a Windows-based operating system computer and an Apple-based system. Quite frankly, the Apple-based system was a matter of choice but because of my extensive involvement in typesetting and graphics that was the best choice. So for many years, we had two computer systems sitting next to each other in our computer room. A few years ago, as we upgraded our Macintosh computer we found that progress in the power and speed of the processor made it possible, for the first time, to practically run the Windows operating system in an emulation program on the Macintosh. At this point, we switched to using two Macintosh computers running Parallels Desktop.

The next step in this process involved a decision by my wife to discontinue using Parallels Desktop for several reasons. Most of these reasons involved the difficulty of maintaining two separate operating systems on the same computer. I continued to need access to Windows-based programs, particularly those I supported and used for genealogy.

The reason for my wife's shift away from Windows-based software entirely was not based on any lack of ability to use Windows-based software or work with the newer versions of the Windows operating system. In fact, we purchased a third computer to use solely for the few programs that required us to use Windows. So now we have our Windows-based computer sandwiched between two iMacs. This has gone on now for probably the last five or six years or longer.

The next step in the process occurred very subtly as online programs and those desktop programs directly connected to online programs began to dominate genealogy. I began to notice that my involvement in Windows-based programs began to decline precipitously with the introduction of several very adequate or even excellent Mac-based genealogy software programs. I continued to maintain my involvement with the PC-based programs because I felt the need to continue to support people coming into the Mesa FamilySearch Library. However, my direct involvement in teaching programs diminished rapidly because other volunteers at the Library were more than capable of teaching the software classes.

The shifts in technology accelerated and could be summarized as follows:

  1. The increased power and sophistication of the online genealogy programs, particularly those programs developed by the larger genealogical database programs.
  2. The availability of faster Internet access.
  3. The development of programs utilizing connections with the large online databases.
Recently, when I was notified of an upgrade to Parallels Desktop, I became aware that I had not used the program for many months, perhaps even in the last year. I still had the latest version of the program on my computer and could easily switch over to use the many programs I had on my computer that required the Windows operating system, but I had not used it. Part of the reason was that I had installed those programs that required the Windows-based operating system on the third computer. The advantage of doing this was the savings in time and frustration of switching from the OS X operating system to Windows on my Macintosh. It was much easier simply to switch to a Windows-based computer. The next shift was highly personal. The number of programs that I used that required the Windows operating system continued to shrink, so my involvement with the third computer became less and less necessary.

There was another factor that contributed to a shift in the overall use of the computers and that was the increased utility of tablets and smartphones. A significant part of our usage of computers began to be transferred to iPads and iPhones. The increased use of mobile devices highlighted the fact that we were using online programs to a greater extent and that our reliance on desktop programs was diminishing. Obviously, there were some significant exceptions since I still relied heavily on graphics-based programs, but for genealogy the shift became more and more apparent.

One day, I finally realized I no longer needed a Windows-based operating system on my Macintosh. All of the programs that required the Windows operating system had now been transferred to the Windows machine and, frankly, it was much easier to move over and use the Windows computer than to wait around for Parallels Desktop to operate.

I also realized that my use of the computer was now dictated by the dramatic shift to online genealogy programs. I expect that this trend will continue. I further expect that this technological shift will begin to seriously change the present use of genealogy programs.

A valuable perspective on the International Association of Jewish Genealogical Societies Conference

I received a link to an excellent summary of the International Association of Jewish Genealogical Societies Conference held in Salt Lake City, Utah recently. Obviously, my perspective of the Conference was considerably different than that of a person who was deeply involved in Jewish genealogy. The post on The Jewish Daily Forward is entitled, "A Report From the Jewish Genealogists' Summer Camp, The Author of 'The Family' Heads to Utah." I think the post very clearly expresses what I felt and experienced at the Conference as a complete "outsider." It is sometimes hard to break out of our "genealogical comfort zone" and really take the time to learn about an area of genealogy with which we are not familiar. But it is always worth the effort.

Genealogy is genealogy and the subject matter is only one variable in a sea of constants. Research methodology is the same no matter what the ethnic background, time period, geographic location or social construct. But in the case of this Conference, there was a distinct difference in the dedication and fierce interest of the attendees.

Monday, August 25, 2014

The Online Digital Book Collections

This series of posts consists of links to online websites in different categories useful to genealogists and researchers. The first post listed online digital map websites. The next posts in the series listed both national and state digital newspaper websites. This post was intended to be a list of all of the websites I can find that have collections of digital books. Well, that turned out to be impossibly lengthy. For example, it would include nearly every university and college library and/or special collections department in the world. I finally realized that the list would go on forever, so I settled for a representative sample.

Some of these websites will be commercial and have a subscription charge, but even those with a charge for some books may also have a selection of free books. In some cases, you may have to dig down into the website to find the digital books, but they will be there. I finally decided to stick with English language websites. I might try some of the other language websites some time, but the difficulty is determining if they are legitimate or pirate sites without knowing the languages that well.

In making this list, it is apparent that there are several large websites that provide online ebook services to libraries. For example, the Overdrive website provides a centralized service of digital books only to libraries. Individual users of the Overdrive system gain access to the books in their own public library's online collection through logging in with their library card number and a password. Is this an online digital book collection? That is a good question. The number of books available to the subscribers is substantially different from institution to institution. The question is whether Overdrive is the online website or the local public library is. Depending on how you view this question and many others, the number of online websites is determinate or indeterminate.

Another example is It has millions of digital books, mostly for sale. It also has a "free library" of books. Is it a library or a bookstore? Or both?

It became immediately evident that the availability of ebooks is changing rapidly. Many of the websites listed have some sort of restriction on access. They are either university libraries where only those associated with the school can access the books or they are private companies or repositories with other restrictions. My goal was to make a list long enough and inclusive enough that you can find a way to acquire access to the material you are searching for. Unfortunately, the list became so tangled that I soon realized that this particular goal was unobtainable.

No list of digital books would be complete without a reference to Google Books. Google has clearly amassed the largest online collection of digital books in the world. Some time ago, Google estimated the total number of books, counting every book ever published, to be 129,864,880. Of course that was back in 2010 and millions more have likely been published since then. The real question is how many of those books has Google digitized? The last estimate was made in April of 2013 and claimed that Google had digitized more than 30 million books. We can assume that Google is still out there digitizing books, so there are probably millions more on the website at any time after 2013. Trove, the website of the National Library of Australia, has a published number of books of 18,920,532 as of the date of this post. A search in Trove's book collection for the term "genealogy" results in 137,242 hits. The same search on Google Books returns 7,720,000 results. In either case, the number of books is unimaginably large.

For genealogists, the online collection of digital books on includes the content of several large genealogical libraries. The total number of books is well over 100,000 and growing. When you go to the website, click on the Research tab at the top of the page and then click on books.

One of the challenges of this project is to figure out how to list the websites. I began with the idea to start by listing all the ones I use and/or know about and then move on to all that I can find online anywhere in the world. I have explained what happened at this point. The numbers of books or items listed for some of the websites are more an indication of their size than a current count of holdings. Most of these online libraries continue to acquire more books and other items and the numbers will change daily.

Here is a representative list of online book websites, not in any particular order:
What you need to learn from this list, and from my attempt to compile it, is to search for a book by title in every case, copyright or no copyright, to see if a website has an ebook for your particular search. Even then, it would be a good idea to search in the larger websites to see if Google missed the ebook. At some point, it will become possible to assume that an ebook always exists and then the search will be to find it online in some sort of library and gain access to the book. Interlibrary loans will very likely become completely digitized.

The above list gives no idea as to how many digital libraries exist in languages other than English. The number I found seems endless. Listing digitized map and newspaper sites was nothing compared to the number of digital book sites I began to discover. Try searching. You will soon see what I mean.

Sunday, August 24, 2014

Fitting the Tools to the Task

If any of you have experience fixing a dripping faucet over the years, you realize that years ago, you could use a wrench and screwdriver and replace the washer in the faucet and fix the problem. Today, almost all new faucets come with special cartridges and it takes a special tool, in addition to a wrench and possibly a screwdriver or two, to fix the leak. Without that special faucet tool, the job is virtually impossible.

Now, as usual, you are probably trying to figure out what this has to do with genealogy. Well, the answer is pretty simple. At one time, you could do genealogy with a pencil and piece of paper. Some people would argue that you still can, but in reality, there are a lot of specialized tools needed to do the job today. One example will suffice: the Family History Library in Salt Lake City, Utah has been digitizing its collection of books and microfilm for a number of years. As books and microfilm are digitized, they are no longer kept on the shelves or available to order (in the case of microfilm). So, now, just to view the microfilm or digitized books, you need "specialized equipment" such as a computer or other device and a connection to the Internet.

Of course, you could reject technological changes and ignore all of the online resources, but sooner or later, just like with the leaky faucet, you either have to obtain the tools to make the repair or hire someone to do it for you. Unfortunately, there are some people who ignore the problem permanently and refuse to "make the investment" in technology, just as there are those who choose to live with leaky faucets. There are excuses. One common issue is economic. Adapting to technological change is viewed as expensive and a luxury. This is an issue of which I am painfully aware. Genealogists today (although this may change) are usually older and at or near retirement. Many live on fixed incomes. But on the other hand, those who plead poverty are usually unaware of the alternatives available for free computer use at libraries and family history centers.

Now we could keep arguing about economics, but these arguments are essentially correct. Being involved in genealogy costs time and money. It is not a particularly expensive activity compared to many very popular activities today. From another aspect, the tools used for genealogy, such as computers, mobile devices, an Internet connection etc. are also general purpose tools. In my case, my wife and I do not have or pay for a cable TV connection. Likewise, we do not have separate land-line telephone service. We very rarely eat out and we do not attend movies or other paid-for entertainment regularly. Some people would be unwilling to "give-up" those activities and services. We don't really care about them and do not feel disadvantaged in any way.

On the other hand, I believe in having good tools. A cut-rate, second-rate tool is sometimes worse than no tool at all. We likely spend much more than the average person on computers and the associated software and external devices. We probably use those computers and other devices much more than the average person does also.

I think that it is important to fit the tool to the task. If there is a tool that will help me do a job faster, easier, better or at all, then I see the tool as a benefit, not an extra cost. Sure, I could hire a plumber to come in and fix my leaky faucet and pay $100 or more, or I could go to the store and buy the $12 tool and the $25 cartridge and save my money. From another aspect, there is no way I could be writing this blog post without the proper equipment.

I look at tools as facilitators. Desktop computers, laptops, tablets and smartphones all facilitate my genealogy work. By maximizing the use of these tools, I maximize my genealogical efforts. These devices, and the software that goes with them, are the power tools of my avocation. Sometimes I buy a tool, such as a software program, and some time later, I find out that there is a better program (tool). So, I try the new tool to see for myself. Yes, there is a cost associated with this, but remember, I am allocating time and time to me is more valuable than money. Sometimes the new program moves into my arsenal of tools. Sometimes the new program is a dud and I quickly move on to another program or back to the original one.

If we understand that the technology is the tool and that the computers and other devices are the specific tools we use to do genealogy, then the idea of upgrades, technological changes and other issues begin to take on a proper perspective.

finds new home for MyCanvas

After announcing its intention to discontinue the program, MyCanvas, has come to an agreement with Alexander's, an Orem, Utah-based printing company, to host the MyCanvas program. The Blog post announcement stated:
We’ve heard from many people who love MyCanvas and hate the idea of it going away. Well, we have some good news for you: It’s not going away after all! We were successful in finding a new home for the service at Alexander’s.

Founded 35 years ago, Alexander’s is a Utah-based printing production company that has been the long-term printer of MyCanvas products including its genealogy books, calendars, and other printed products. This makes the transition of MyCanvas to Alexander’s a natural fit. 
It’s our hope that this agreement will not change the experience for MyCanvas customers. In fact, Alexander’s plans to make some exciting improvements we think you’ll love. Additionally, MyCanvas will continue to be available from the website as we believe in the importance of sharing family history discoveries and see MyCanvas as a way to deliver this ability to our customers. 
The transition of MyCanvas will take about six months. But in the meantime, all MyCanvas projects will remain accessible on until it moves over to Alexander’s next year. We will continue to communicate details as the transition moves forward. 
We want to thank our loyal MyCanvas customers for all the projects you have built and printed with us over the years. We’re excited about this new owner of MyCanvas—and we think you will be too.
From the number of comments, it appears that this announcement is driven by user demand. I would suggest that all of us online can and do exert an influence over what the larger genealogy companies do. The market does speak.

Saturday, August 23, 2014

I Respectfully Disagree

A few days ago, The Blog@Evidence Explained, Quick Tips blog had a short post entitled, "It's Not Just about Giving Credit Where Due." I read and re-read the post because I couldn't quite agree with the analysis. The argument made seemed to focus on a detail of citation, that is, including the publisher in the citation. However, I disagree with the post's extension of the term "publisher." I think use of this term implies a degree of involvement not present in the examples given. Although this entire issue is really more of an academic exercise than of particularly genealogical concern, there is an attitude expressed that wants to impose an additional requirement on "serious genealogists" (my term).

The post equates "online provider" (its term) with the term "publisher." By equating the terms, the author would put identifying the online provider (whatever that means) on the same footing as identifying the publisher of a book. First of all, I must say that I have been including the publisher of a book, when known, in my citations since high school and that is now a long time ago. I guess I never really thought there was any controversy over whether or not to include the publisher of a book in a citation. For example, if I were to cite the Evidence Explained book, it would look something like this:

Mills, Elizabeth Shown. Evidence Explained: Citing History Sources from Artifacts to Cyberspace. Baltimore, Md: Genealogical Pub. Co, 2007.

Here are several more examples for good measure:

Mills, E. S. (2007). Evidence explained: Citing history sources from artifacts to cyberspace. Baltimore, Md: Genealogical Pub. Co.

Mills, Elizabeth Shown. 2007. Evidence explained: citing history sources from artifacts to cyberspace. Baltimore, Md: Genealogical Pub. Co.

MILLS, E. S. (2007). Evidence explained: citing history sources from artifacts to cyberspace. Baltimore, Md, Genealogical Pub. Co.

Mills, Elizabeth S. Evidence Explained: Citing History Sources from Artifacts to Cyberspace. Baltimore, Md: Genealogical Pub. Co, 2007. Print.

I have also been taught to put the title of the book in italics, although that does not seem to be such a big issue today. If you were inclined to do so, you could spend your time arguing about which of these different formats is the most acceptable for genealogists. In order, for your information, the formats are Turabian, APA, Chicago, Harvard, and MLA. I should also point out that only a very few genealogical software programs and online citation systems in the online family tree programs provide a format that follows any one of these particular citation formats. Usually, there are simply places to fill in some general information. For example, here is a citation from of one of the many Tanner surname books: Descendants of John Tanner : born August 15, 1778, at Hopkintown, R.I., died April 15 1850, at South Cottonwood, Salt Lake Coun [database on-line]. Provo, UT: The Generations Network, Inc., 2005.

This is an exact copy from with the spacing and typos. Here is another form of citation from the same entry from

Tanner, Maurice,. Descendants of John Tanner : born August 15, 1778, at Hopkintown, R.I., died April 15 1850, at South Cottonwood, Salt Lake County, Utah. unknown: The Tanner Family Association, 1923.

Now the question here is a little bit obscure, is more like a library or is it more like the publisher of the book. In this case, the publisher is marked as "unknown" but the book is attributed to The Tanner Family Association, located in South Cottonwood, Salt Lake County, Utah. 

Here is a screenshot of the title page of the book from the collection:

Who printed the book? I'm sorry, but that information is missing. By the way, the exact same book is also in's Books, which has three different digitized copies of the same book. Here is the citation from
Descendants of John Tanner, born August 15, 1788, at Hopkintown, R. I., died April 15, 1850, at South Cottonwood, Salt Lake County, Utah 
Author: Tanner, Maurice, 1889- 
Description: Compiled family history. Includes index. Indexed in the Early Church information card file at the main library only.

Language: English;English;English
Original Publisher: [S.l.] : Tanner Family Association

Provenance: Owning Institution:Genealogical Society of Utah d.b.a FamilySearch;

Patron Usage Instructions:; Public

Title Number: 19703
This looks less like a "citation" and more like a catalog entry in a library. 

OK, so you got this far in my commentary and you are saying to yourself, who cares? My point exactly. Here is the crucial question. Who published the John Tanner book? What is the purpose of the citation? If you were trying to publish a genealogical journal article in a major publication, you would have to "jump through their hoops" and conform to whatever citation style they required. Does that mean that genealogists in general, especially those with no aspirations to publishing serious academically oriented or professionally oriented journal articles, need to conform to one format or another? As I have said many times before, the emperor has no clothes. The idea of a citation in keeping your genealogy is to identify where you got the information so someone else can find it. In the case of the John Tanner book, whether you had a physical paper copy, viewed the book on or on really doesn't matter. What does matter is that there is a reference to the book. In fact, shows that this same book is in 49 different libraries in 9 different editions.

What if there are different publishers of the same book? Is that what we are talking about here? Well, then I would show the publisher of the book that I used. But are and with their digitized copies of the original book, different publishers? Not at all. They have a copy of exactly the same book. Here is a copy of the title page of the John Tanner book from one of the copies on

The same exact book. In fact, as a photographer and having scanned books and documents by the hundreds of thousands, I can see from the images of the two title pages, that they are exactly the same digitized copy of the book. They have the same "artifacts," that is defects in the scan that show up as dots etc. Here is another copy of the book page with arrows pointing to what I am talking about:

So aren't and more like libraries? I think so. What does it matter which library had my book? 

The post cited above argues that putting a copy of the book online is the equivalent of "publishing" the book. The post also dismisses, without discussion, the idea that the online repository of the book, such as's version, is the equivalent of a library. But wait a minute. Do we even know if digitized the book? Not really. Both and could have gotten the digitized copy from the same source, which certainly looks like the case. But if we switch to a purely genealogical document such as a copy of a U.S. Census record, it is entirely possible that all or some of the online copies came from the same digitizing source. But by simply acquiring a copy of the U.S. Census, the online service provider does not assume the position of a publisher.

If I refer to a record in the U.S. Census, what do you need to know to find that same record? In reality, all you really need to know is that it was part of the U.S. Census. As a genealogist, rather than worry about formatting a "correct" citation to the source, why don't I just include a link to that source? Like I would in or or or or in my own genealogy program. Or better yet, why not attach a digitized copy of the page to my ancestor in all these programs so you can see the same document? Do you really care if the image came from a specific online database? If you have a copy of the book or article attached, then everyone can find it and any particular format of citation is superfluous. If the document were on a microfilm, I would like to know where you got the microfilm, but I would never consider the microfilm repository to be a publisher of the information on that microfilm. In fact, what I would prefer is that you use any one of the microfilm scanners and provide a copy of the record you are using.

Now, if you do have aspirations of publishing your findings in any one of the genealogy journals, then I suggest you get a copy of their style sheet or publication requirements, study it carefully and conform. If you are writing a master's thesis or a doctoral dissertation, you have the same challenge: conform, conform, conform. But for the rest of us out here in genealogy land, just be sure to give enough information about where you got your data so that I can go back and verify the source. Thanks.