Monday, July 1, 2013

Storing your genealogy online -- access to the Cloud

A comment to my recent post about the shift to Internet or Cloud based applications raised an important issue: what about access to the Cloud? What if I have my genealogical data stored on an online database and cannot access that data for any number of possible reasons? Fundamentally, this is the same issue we have with any storage media, including paper. What happens if I have all my genealogy on paper forms and there is a fire or a flood? Ultimately, what happens to our genealogical research when we die?

There are always contingencies. In the world of online computing, there seems to be a basic assumption that access to the Internet (the term "the Cloud" is merely another name for the Web) is a given. Lack of access can happen at any level; you might have an equipment failure, a software or operating system failure, a physical disaster or any other type of event that interrupts service. But Internet service has moved from being a luxury to becoming a utility. When telephone or electrical service is interrupted for any reason, it becomes a local or national emergency, and service is restored as soon as humanly possible. Internet connection service now falls in that same category. So, I assume that the real question is more like this: what happens if I can no longer afford telephone or electrical service?

There seems to be a societal assumption in the United States that people are somehow entitled to basic utilities. Here in Arizona, there are programs to help people keep their electrical and water service, even if they cannot pay for it, for an extended period of time. Losing electricity when it is 119 degrees outside can be fatal. But the real issue, when we are talking about genealogy, is who has ultimate control over the data files I create?

Assuming that Cloud or Internet based applications become pervasive, where will the data reside and who will ultimately have control of that data? The model of the system now looks a little like this:

User (genealogist) => Local genealogy program => Data on local storage device => Copy to Internet => Synchronizing local and remote copy of data

The new model is probably more like this:

User (genealogist) => Online or Cloud based program => Data on remote Cloud based server => Copy to local storage (optional)

The idea of local ownership of the "genealogical data" is likely to become modified. Whether or not future genealogists (or even people in general) will consider the local storage option necessary is one of the factors that could go either way. But if you look at a program such as Family Tree Maker from Ancestry.com, you will see the direction the market will likely take. There are probably many more people who have their "genealogy" online on Ancestry.com's Public Member Family Trees than maintain their own personal database on the local program. One question that came up asked, "How do you maintain your files when you have them in multiple locations?" More needs to be said about this, but basically, the model of the future is integration of the data in multiple locations with the ability to synchronize different databases.
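To make the synchronization problem a little more concrete, here is a minimal sketch of one common approach, a three-way comparison between a local copy, a remote copy and the state both had at the last sync. This is only an illustration under assumptions of my own: the record layout, the IDs and the function name three_way_sync are hypothetical and do not describe how Family Tree Maker, Ancestry.com or any other product actually works.

```python
def three_way_sync(base, local, remote):
    """Merge two edited copies of a record set against their last-synced base.

    Each argument maps a record ID to a dict of fields (hypothetical layout).
    Returns (merged, conflicts), where conflicts lists the IDs that were
    changed differently on both sides and need a human decision.
    """
    merged, conflicts = {}, []
    for rec_id in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(rec_id), local.get(rec_id), remote.get(rec_id)
        if l == r:            # both sides agree (or both deleted the record)
            chosen = l
        elif l == b:          # only the remote copy changed
            chosen = r
        elif r == b:          # only the local copy changed
            chosen = l
        else:                 # both changed, in different ways
            conflicts.append(rec_id)
            chosen = l        # keep the local version until someone reviews it
        if chosen is not None:
            merged[rec_id] = chosen
    return merged, conflicts


# State at the last sync, then independent edits on each side.
base = {"I1": {"name": "John Doe", "birth": "1850"}}
local = {"I1": {"name": "John Doe", "birth": "1850", "death": "1910"}}
remote = {
    "I1": {"name": "John Doe", "birth": "1852"},
    "I2": {"name": "Jane Roe"},
}

merged, conflicts = three_way_sync(base, local, remote)
print(conflicts)
print(sorted(merged))
```

In this example the new person added remotely is simply accepted, while John Doe is flagged for review because both copies changed him in different ways. The sketch compares whole records; a real genealogy database would have to compare individual events, sources and relationships, which is where most of the difficulty lies.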

3 comments:

  1. "model of the future ... with the ability to synchronize different databases"
    Logically, I need a database which is totally under my control (possibly with optional granting of update access to others I trust) plus the ability to sync that with a "world" tree open to everybody (call it "one tree to rule them all" or "a tree to assimilate all others" depending on your favourite mythology).

    I need a database under my control since I need to keep my view on things without having some guy merge his ancestor with mine just because they have the same name.

    But if I don't contribute to the world's view of things, why am I doing genealogy?

    Synchronising stuff is incredibly difficult. The first sync is just a long hard slog. The hard one is the second go, when you want to update your previously synchronised John Doe with your updated events and attributes, only to find someone else has merged their John Doe into yours so he doesn't look anything like yours any more.

    I remain to be convinced anyone is even thinking about this. So I'm relaxed about the physical location - it'll work eventually, even for people like me who don't trust that we'll always have Internet access (because we don't). It's the logical control that's obscure.

    Replies
    1. Tools like SVN, and more recently Google Drive, obscure the distinction between what is stored locally and what is stored in the cloud. The synchronisation issue has been solved for simple files, as opposed to complex databases.

      I can envisage a world where that paradigm is used to hold private stuff in the cloud. My big issue is the assumption by most vendors that it's possible to merge everyone's data and generate a "Borg tree" that represents true history.

      You hinted yourself, Adrian, that you don't want such a merge, and I entirely agree. I am not aware of any current model that treats contributions as individual trees until a "virtual merge" is requested by a user (analogous to a SQL VIEW), and which gives full control over the visibility of such things as sensitive information, or half-formed hypotheses.

      In my aged scepticism, I believe that the difference between a technologically correct design and what's there now is possibly commerce and profit.

    2. I tend to agree. Thanks for the new topics to write about.
