Some people eat, sleep and chew gum, I do genealogy and write...

Thursday, May 21, 2015

More about "free" images -- be careful

One of the phenomena of our age is the ability we have to "pass along" posted information almost instantly. When we see a cute or attractive photo, we can "share" it by simply clicking a link to Facebook or some other social networking website. In addition, many online genealogy websites have provisions for uploading photos and making them available online. I recently wrote a post about where to find free genealogical images online. But after some consideration, I decided that it would be a good idea to follow up with a post about why some of these images are not freely usable.

Back in 1886 a multi-national agreement concerning copyright was signed in Berne, Switzerland. This agreement is known as the Berne Convention and applies to the signatory countries. The United States did not ratify the Berne Convention and become a signatory until 1989. That's right. It took the United States more than a hundred years to accept the agreement. The main provision of the Berne Convention, now enforceable in the United States, is that copyright protection becomes automatic and any requirement for formal notice is prohibited. This agreement applies to all of the Berne Convention signatories. See the list from the World Intellectual Property Organization.

This means that any "work" as defined by the U.S. Copyright Law becomes automatically covered without any notice. It is not necessary for the work to have the copyright symbol or any mention of copyright to be protected. The term of the copyright is determined by the country of origin.

So why are people able to republish photos and other content on the Internet? This is a really interesting question. You cannot assume that an image that is being "passed around" on the web is not legally subject to a claim of copyright. This is doubly the case if you find an image on someone's website. In fact, you must assume that the document is subject to a copyright claim unless there is specific information to the contrary.

In the United States, part of the issue of online content was addressed in another major copyright statute. In 1998 the United States passed the Digital Millennium Copyright Act, implementing the provisions of two 1996 World Intellectual Property Organization treaties. The provisions of this statute are summarized by Wikipedia as follows:
The Digital Millennium Copyright Act (DMCA) is a United States copyright law that implements two 1996 treaties of the World Intellectual Property Organization (WIPO). It criminalizes production and dissemination of technology, devices, or services intended to circumvent measures (commonly known as digital rights management or DRM) that control access to copyrighted works. It also criminalizes the act of circumventing an access control, whether or not there is actual infringement of copyright itself. In addition, the DMCA heightens the penalties for copyright infringement on the Internet.
A complete copy of the legislation can be found on the Copyright.gov website.

So what if I have an image and I do not care about copyright protection? How can I notify the world that they can use my image without worrying that I will send them a nasty letter?

There are various ways to legally release all or part of your copyright interest in any work. One of the most commonly used methods for works on the web is a license from CreativeCommons.org. If you post a work on the web, you can retain your copyright, but at the same time, you can specify a less restrictive use policy. You can do this for your whole website or for individual documents. You will need to go to the Creative Commons website and spend some time studying the different levels of license that you can give to those who use your work.

If you are trying to find public domain images or other works on the web, be sure to look carefully at the content. If the image appears on a webpage without any indication of copyright or license, it is protected under copyright and it would not be a very good idea to assume otherwise. In other words, you must have a positive assertion that a document, image or any other work has been released from its copyright either due to a specific license or the passage of time.

Return to the beginning -- online genealogy resources

As the online genealogical community continues to grow and evolve, we sometimes need to remind ourselves of our more humble online beginnings. What is more important is that some of these humble beginnings are still with us and have grown and become even more valuable. I would like to take a short trip down memory lane and point out some of the online golden oldies of genealogy that still need to be used and remembered.

Cyndi's List
First on my list is Cyndi's List. Cyndi Howells has been online with her comprehensive list of genealogical resources for over 18 years. This website has been growing steadily during those long years and now has 333,873 links to genealogy websites all over the world. As of the date of this post, Cyndi has 1146 new and updated links. This was a genealogical resource before we knew what online genealogy was all about. Here is the description of what this website entails quoted from her startup page:
What exactly is Cyndi's List?
  • A categorized & cross-referenced index to genealogical resources on the Internet.
  • A list of links that point you to genealogical research sites online.
  • A free jumping-off point for you to use in your online research.
  • A "card catalog" to the genealogical collection in the immense library that is the Internet.
  • Your genealogical research portal onto the Internet.
 To understand and gain an appreciation of what this one person has done for genealogy, you need to read the section that talks about Cyndi herself.

If you haven't used this resource or even if you haven't used Cyndi's List lately, I suggest it is time to get back to using this valuable and comprehensive list of genealogy websites.

Archive.org
I just wrote about Archive.org very recently and so I will defer to my previous post for the details. The Internet Archive or Archive.org was founded by an American computer engineer named Brewster Kahle back in 1996. See Wikipedia: Brewster Kahle. The Internet Archive is still going strong and although it is not specifically a genealogical resource, the information in its massive digital files is extremely interesting and helpful to genealogical researchers. The mission of the Internet Archive is to provide "universal access to all knowledge." See Wikipedia: Internet Archive.

U.S. GenWeb
1996 was a significant year in the development of the Internet. Many of the original genealogically significant websites began about that time. The USGenWeb Project belongs in that category. This is another of those extremely valuable websites that have continued in their development and usefulness, although I have seen few comments about the Project recently. Here is how the Project is described:
The USGenWeb Project consists of a group of volunteers working together to provide Internet websites for genealogical research in every county and every state of the United States. The Project is non-commercial and fully committed to free access for everyone. 
Organization is by county and state, and this website provides you with links to all the state websites which, in turn, provide gateways to the counties. The USGenWeb Project also sponsors important Special Projects at the national level and this website provides an entry point to all of those pages, as well.
I would suggest that if you are not familiar with this website, you certainly should be.

RootsWeb
Another of the old websites that still keeps plugging along is RootsWeb. This website also dates back to the early days of 1996 (or even before). However, the website was purchased by Ancestry.com in 2000. It is a "free" website and has a huge amount of genealogical information. Originally, it was one of the foremost venues for posting genealogical requests for collaboration and help. It is still a very useful website if you ignore all the advertisements.

This list could go on. There have been some very persistent online blogs for example. If you have an old-time favorite, post a comment and let us know about it.

Ancestry.com up for sale at auction

If you happen to have an extra two or three billion dollars lying around, you could bid on purchasing Ancestry.com. The genealogical giant is up for sale according to an exclusive Reuters.com article entitled, "Exclusive: Genealogy website Ancestry.com explores sale: sources." The article states, in part,
Ancestry.com LLC, the world's largest family history website helping users trace their heritage, is exploring a sale that could value it at between $2.5 billion and $3 billion, including debt, according to people familiar with the matter. 
Permira Advisers LLC, the buyout firm that owns most of privately held Ancestry, has hired investment banks to run an auction for the company, the people said this week.
Permira Advisers LLC purchased Ancestry.com back in October 2012, about two and a half years ago. That purchase took Ancestry.com private and moved its ownership to Europe.

The rest of the Reuters article gives some insight into the profitability of Ancestry.com. Here is another quote from the article:
Based in Provo, Utah, Ancestry has a database of more than 15 billion historical records and more than 2.1 million paying subscribers. Subscription fees accounted for 83 percent of its total revenue of $619.6 million last year. 
In addition to offering genealogical data, Ancestry provides a DNA service that allows customers to discover their genetic ethnicity and find relatives with a common ancestral match. 
Permira outbid other private equity firms to take Ancestry private in 2012 for $1.6 billion. Ancestry's subscription revenues have grown to $553.8 million last year from $334.6 million in 2012. 
Ancestry's adjusted earnings before interest, tax, depreciation and amortization were $214.8 million in 2014, according to its most recent annual report.
I guess we can begin speculating about whether or not a sale will happen and if so, what effect it might have on the operation of the website and further, the partnership agreement with FamilySearch.org.

Tuesday, May 19, 2015

Genealogy on Facebook


I am finally seeing some significant movement away from Facebook.com. For some time, it has appeared as if Facebook would become like a massive black hole sucking everything into its clutches. I noticed the glimmer of escape when most of my family started using Instagram.com more than Facebook. This is not to say that my Facebook traffic has slowed down any lately; in fact, the opposite is true. If anything, Facebook traffic has increased and in some cases dramatically. But change is in the wind.

A short time ago, there were a number of comments and features in the news stream about Facebook adding "real time" news. You can now subscribe to just about any TV station, newspaper or other "news" outlet and receive immediate news updates right on your Facebook feed, for example, the New York Times Facebook page shown above. This is a less than subtle transition from a "social networking" environment to a data supplier environment. Wait, how will we know about the latest cute cat video? Never fear. Facebook isn't going anyplace, but it is expanding into areas that can only peripherally be considered to be "social networking." The distinction between the results of a search from Google and a search on Facebook is becoming less clear.

Some time ago, I wrote about a movement I detected in the genealogical community to move from formal blogs to posts on Facebook. Several prominent genealogists have decisively moved onto the Facebook stage in a dramatic way and accumulated thousands of followers (friends etc.). Some of them are combining a major presence on Facebook with other networking outlets such as webinars or similar online broadcasts. One of the most successful of these Facebook genealogy outlets is Tracing the Tribe, Jewish Genealogy, moderated by Schelly Talalay Dardashti with over 7,600 members. Schelly is also one of the bloggers who have moved their primary emphasis away from blogging to Facebook. Some Facebook personalities are still trying to maintain their blogs.

There is a difference between moving to Facebook and merely posting notices of blog posts on Facebook. You might have noticed that everything I publish in my blog gets posted to a variety of social networking websites, including Facebook.com, Twitter.com and Google+. But there is a difference between having a presence on Facebook and running a Facebook dominated outlet. Blog posts are generally substantial. Most Facebook posts are short, concise and usually link to another item. There is not a lot of introspection and analysis on Facebook (yet?). I am certainly not predicting the demise of blogs, they are here to stay, but I am noticing that there are some significant trends in the Internet dominated communications area.

One thing I have noticed in the past few weeks is a dramatic decline in the number and variety of blog posts from the genealogical community. I monitor over 300 genealogy related blogs every day. In the not-too-distant past, two or three months ago, it was not unusual for me to have over 200 new posts almost every day. Today, for example, I have gone hours with only one new post. My guess is that blog readers are abandoning the content of blogs for the news bites on Facebook and other social networking venues.

At some point, if my own readership drops off, I guess I will have to evaluate what is happening and move to the new venue. I am already on Instagram with my family. I have not yet decided whether to go public with Instagram. I have been on Facebook for a very long time, but I am not yet ready to move completely to another format. If it matters, I have a lot to say and venues such as Facebook do not give much in the way of substance even with the addition of the news. Time will tell.

Is Genealogy Complicated?

The simple answer to the question in the title of this post might be considered to be, "Yes, genealogy is complicated." But it is helpful to understand what we mean when we say something is "complicated." There is a whole area of study called "complexity theory" which extends to strategy, economics, complex systems and includes areas such as chaos theory and computer related topics such as computational complexity theory. See Wikipedia: Complexity theory.

Looking at the topic of complexity from a very simplistic standpoint, we can detect many levels of complexity. However, there is no universally accepted definition of exactly what we mean by complexity. I began thinking about this recently when I was talking to a person who was explaining a very convoluted family relationship involving the historical practice of plural marriage (aka polygamy). The person doing the explaining seemed to think that the issues involved some high degree of complexity. Although I said nothing about the situation, I thought what she was explaining was quite simple and no different from the issues commonly faced by genealogical researchers. She was using the excuse that the family relationship was "so complicated" that there was no way the genealogical relationships could be properly researched and so she was not going to do any further research. I think I smiled and said I would be glad to help her if she wanted assistance.

Genealogical research definitely deals with a "system." In addition, we are dealing with a system of limited relationships rather than a complexity of random relationships. Although there are variances in any ancestral system, there is certainly a rather limited number of possible combinations of elements (individuals related either culturally or by blood or marriage). Since we are dealing with a system of organized complexity, we can predict that current patterns of relationship extend into the past, that is, unless our ancestral families came from an area with a distinctly different culture than the one we presently live in.

Genealogical research does, however, involve a complexity of elements. The study of this type of complexity is usually associated with network theory. One graphical representation of this type of social or ancestral network comes from the program called Puzzilla.org. Here is an example of a screenshot showing a graphic representation of some of my ancestral relationships.

This particular graphic representation is based on ancestral information obtained from the FamilySearch.org Family Tree and is only as accurate as the source information. In addition, this is a two dimensional representation of a multidimensional system. Many of the people represented by nodes (dots) in this graphic had more than one spouse. Some had many spouses. But this particular diagram only includes those direct family line individuals that have been selected by the user (me) as the preferred lines. But it is a good place to start in understanding the complexity of doing genealogical research. Each generation doubles the number of direct line ancestors in the preceding generation, i.e. you have 2 parents, 4 grandparents, 8 great-grandparents etc. If, in addition, you add in the multiple marriages and all of the descendants, you get large numbers very quickly. It is this rapid expansion into rather large numbers of relatives that makes genealogy appear complex.
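To put rough numbers on that doubling, here is a minimal sketch in Python. The function names are my own, chosen for illustration, and the count assumes that no ancestor appears in more than one line of the pedigree, so it gives the number of positions in the chart rather than a count of distinct people. It also leaves out additional spouses and descendants.

```python
def direct_ancestors(generation: int) -> int:
    """Direct-line ancestor positions in one generation (1 = parents)."""
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    # Every person has two parents, so each generation doubles the last.
    return 2 ** generation


def cumulative_ancestors(generations: int) -> int:
    """Total direct-line positions from the parents back through the given generation."""
    return sum(direct_ancestors(g) for g in range(1, generations + 1))


# Parents, grandparents, great-grandparents, and ten generations back.
for g in (1, 2, 3, 10):
    print(g, direct_ancestors(g), cumulative_ancestors(g))
```

Ten generations back there are already 1,024 positions in that single generation and 2,046 in the whole chart, before any spouses or collateral relatives are added.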

In addition, each of the individuals represented above and all of the other spouses, children and collateral relatives of each, could be the basis of extensive research. In a very real sense each person could potentially be the subject of a long and detailed biography. As genealogists, we often arbitrarily either limit or discard information that exceeds our interest or the "scope of our research." If we did not do that, we would all be involved in writing extensive biographical works about our parents or perhaps, our grandparents.

When you think genealogy is complex, you are really comparing it to other systems that may seem to be less complex. Actually, genealogy is not very complex by any absolute criteria. Genealogical research is repetitious and examines each node in the system (each ancestor) with the same or similar set of basic criteria. For example, I believe that I am advancing with my genealogical research as I fill in the blanks in an arbitrarily designed genealogy program. When I feel satisfied that I have enough information about any particular person, I move on to another person in the pedigree construct. This is essentially a repetition of what I just considered to be finished. The appearance of complexity comes from the multiplication of these basic systematic units, i.e. families. Of course, I am not limited to biologically based family units; I can add in adoptive units, foster units, and so forth.

What did the lady referred to above mean when she expressed the idea that her polygamous family was complex? She was, in effect, comparing that particular family to some idealized family unit that did not have as many variations in the relationships. However, as I pointed out above, there is no absolute measure of complexity. But even on a relative scale, the variations in the relationships of individuals in an ancestral system cannot be considered to be very complex. Complexity at this level is usually a result of a large number of random associations. A truly complex system is based on a large measure of unpredictability. This is not the case with genealogy. The system is regular. Alternative familial relationships, such as adoption, foster care, etc., merely add additional, yet similar, nodes to the general outline of the system.

None of this means that becoming involved in genealogical research is either easy or unchallenging. The initial state of the system, when we begin our research, contains few members of the system, i.e. we know very little about our ancestors. Historical research can involve a huge amount of time and effort due to the lack of availability of records and the difficulty of either finding them or accessing them. The acquisition of the knowledge and skills necessary to adequately do historical research can be overwhelming.

Contrary to the evaluation of the lady with the challenging ancestral family, those relationships do not involve a significant increase in the overall complexity of her ancestry. In fact, she already seemed to know how the various people were related. Her evaluation was most likely based on a lack of enthusiasm for pursuing the research necessary to document the system, not the complexity of the system itself.

Sunday, May 17, 2015

How to find genealogy images that are free of copyright

By Smalljim (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
It is extremely easy to copy a photo or other image on the web. But indiscriminate copying is fraught with dangerous consequences. Most of the images we see online are subject to copyright or are trademarks or tradenames. To avoid the entanglement of legal demands and possible legal action, the best practice is to avoid re-publishing protected content. I have written a lot about what is covered by copyright, so I decided to talk about the other side of the issue: what is not covered by copyright. In this post, I am focusing on images and will not get into the issue of "fair use" of written material.

Under the current copyright law in the United States, every "work" (which includes a long list of publications and includes photos and images) is automatically protected by copyright. There is no requirement for any notice of a copyright claim. A very complete explanation of U.S. Copyright law is online at Copyright.gov.

First, there is a difference between "permission" to use an image and being free of copyright. A copyright holder can give permission to use an image without giving up their copyright protection and ownership. On the web, there are several ways this can be done. The common way is to grant a limited license for the use of the image. There are several organizations that provide a venue for limited sharing of images online. Probably the most used of these organizations is CreativeCommons.org. There are more than 882 million Creative Commons licensed works on the web. Here is a quote from their website about what they do:
What is Creative Commons? 
Creative Commons is a nonprofit organization that enables the sharing and use of creativity and knowledge through free legal tools. 
Our free, easy-to-use copyright licenses provide a simple, standardized way to give the public permission to share and use your creative work — on conditions of your choice. CC licenses let you easily change your copyright terms from the default of “all rights reserved” to “some rights reserved.” 
Creative Commons licenses are not an alternative to copyright. They work alongside copyright and enable you to modify your copyright terms to best suit your needs.
 Let me show you how I got the image at the beginning of this post.

1. I used Google Images to find the image. I searched Google Images for an image of a graveyard and I clicked on the link that says "Search tools" in the menu bar on the page:


I then clicked on the selection that says, "Labeled for reuse." That gives me a rough indication of which of all the images are either in the public domain (not subject to copyright) or are in the Creative Commons and can be used with some restrictions.

2. I chose an image from all those that appeared and clicked on it to see its status. Here is what I saw:


To avoid dealing with licenses for all the other images, I blurred out the ones that I wasn't interested in.

3. Then I went to the "View Page" link to see if this image could be used in my blog post. Ideally, I would be looking for images that are labeled "Public Domain" so that I could use them freely in my posts, but in this case, the image was available under a Creative Commons license.


The limitations in this case were "attribution" and "share alike." The limitations have their own explanation of what they entail.

4. I decided to use the image and so I clicked on the icon for use on the web on the right-hand side of the image.


That selection gives me the information I need to attribute the image. Here is a screenshot of the selections:


I then copied the Attribution from the form and put it as a caption to the image I used at the beginning of this post.
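If you caption many images, the attribution format can be built the same way each time. Here is a minimal sketch in Python that reproduces the caption used at the beginning of this post. The function name and its parameters are hypothetical, made up for illustration, and the layout simply copies the attribution text quoted above. Always check the wording that the license page itself supplies.

```python
def cc_attribution(author: str, license_name: str, license_url: str,
                   source: str, note: str = "Own work") -> str:
    """Build a caption in the style of the attribution copied from the image page."""
    return f"By {author} ({note}) [{license_name} ({license_url})], via {source}"


# Values taken from the caption on the graveyard image in this post.
caption = cc_attribution(
    author="Smalljim",
    license_name="CC BY-SA 3.0",
    license_url="http://creativecommons.org/licenses/by-sa/3.0",
    source="Wikimedia Commons",
)
print(caption)
```

Keeping the author, license name, license link and source together in one caption is what satisfies the "attribution" condition in this case.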

If you were to re-copy the image without the caption, you would be violating the copyright of the owner of the image, unless the image was in the public domain. As you investigate the images, you will find references to other licenses and conditions of use.

I sometimes use my own images in my blog posts. I certainly use all my own images on my WalkingArizona.blogspot.com blog. All of my images are subject to copyright and have my copyright notice embedded in them as metadata.

If the image is in the public domain, then that will be noted on the image page. Here is an example of an image in the public domain:


Here is the caption and notice from Google:
Johnson, Helen Kendrik (Ed.) (?) - Johnson, Helen Kendrik (Ed.): “World’s Best Music”' (1900)[1] 
Permission details 
public domain, hence royalty-free stock image for all purposes and no usage credit required
 Now what about the screenshots? That is yet an unresolved issue. But a screenshot of a copyrighted image would still be covered by copyright law. Google views screenshots as an exercise of the "fair use" doctrine and so do I. See Wikipedia: Screenshot.

Saturday, May 16, 2015

What do the automatic genealogical search programs find?

Four of the largest online genealogical document websites, FamilySearch.org, Ancestry.com, Findmypast.com and MyHeritage.com, have all implemented automatic record hint capabilities. What do these four programs find for the people in your ancestry? Are they finding the same records or are the results different? Are the documents they find useful for research into your family lines?

In order to use all four programs, you must be registered with each of the four. FamilySearch.org is free but the other three are paid subscription programs.

Since I have been using all four programs for quite a while now, my impression is that they are all extremely effective in accurately finding documents pertaining to particular ancestors. Overall, their accuracy is, in some cases, astounding. The general limitations are rather obvious. The programs cannot find records that are not in their own particular set of documents or collections. Their accuracy diminishes as they go back in time, just as the number of online, available documents diminishes. They have a difficult time distinguishing between people with very similar names, dates and geographic information, just as human researchers do. All in all, they work very well going back about 200 to 250 years, but then their usefulness decreases rather rapidly.

Each of these four programs requires the user to enter some basic information before the automatic search process can begin. In other words, you have to have a family tree on each of the four programs to take advantage of the automatic search capabilities. This fact creates some rather serious perceived issues, including the need to somehow keep all four of the family trees synchronized and the difficulty of moving information from one tree to another. This problem becomes even more complicated if you elect to maintain your own family tree on your own local genealogical database program. Then, in effect, you have a minimum of five family trees to contend with.

In the past, I have written about the process of determining what documents each of these websites have in their collections. Here is a brief summary of where you should look in each of the four programs. The idea here is to ascertain whether or not the programs have any documents that may help you find your ancestors in a specific location. However, you should always be aware that there may be pertinent documents in collections that you are completely unfamiliar with and would never have searched.

Another point is the utility of doing manual searches on each of the programs. In general, the automated search functions of each program are better at finding some types of documents than you would be by searching manually. But I have found that there seem to be some documents that can only be found by manually searching in the collections individually.

Here is the summary:

FamilySearch.org
All of the records on this website are listed in the FamilySearch Catalog. Those records that have been digitized and are in the Historical Record Collections are indicated by a link to the collections. However, the only documents that are currently available for Record Hints are those that have been indexed and added to the Record Hints Search. These records that are automatically searchable constitute only a small percentage of the entire set of records on the website. The number of automatically searchable records is constantly increasing as new collections are added to the automatic Record Hints category.

Ancestry.com
All of the records on Ancestry.com are indexed and searchable. There are no obvious limitations on the number of records from the entire collection that are included in the automatic "Shaky Green Leaf" record hint technology. However, you may want to review the Ancestry.com Card Catalog and become acquainted with its contents.

Findmypast.com
The collections available from Findmypast.com are more focused on the United Kingdom and related records than other parts of the world. Of the four programs, their search capabilities are the most rapidly evolving. The records on Findmypast.com are listed in the A-Z of Record Sets.

MyHeritage.com
Each of the other three programs starts you out with a search for records for your ancestors. MyHeritage.com, more than the other three programs, relies almost completely on its automated search capabilities. You can search for records directly, if you wish to do so, and the list of collections or databases in the program is available, but doing your own searching is not nearly as effective as letting the program do the searching. The list of collections is available by geographic region under the "Research" tab on the home page of your family tree.

Choosing an ancestor to use as an example is rather difficult for me. Since I have had all of my family tree data in each of the programs for an extended period of time, they have all had ample time to find suggested records. There is no way to compare the number of hints made available by all four programs for my entire record, since FamilySearch.org and Findmypast.com do not provide you with a "total" number of hints they have found. If you were to choose one particular ancestor, the choice itself would likely determine the outcome. For example, if I chose an ancestor from England, then Findmypast.com would have the advantage in any comparison, simply by virtue of the fact that they have the most indexed records from the U.K. In each case, the number of records found will be determined by the match between your ancestral lines and the particular types of records in each of the programs.

From another standpoint, the records they do find may be extraordinarily valuable for one ancestor and merely cumulative for another. Without a method of evaluating the effectiveness of the hinting process, there is also no way of comparing raw numbers. It also seems that no matter how accurate the programs' search capabilities may be, they are still liable to produce a number of "false positives," or record hints that do not apply to your ancestor.

Finally, record hints are not a substitute for careful evaluation of the record contents and application. Just because the programs are accurate and helpful does not mean they are always right. You have to examine each record carefully before incorporating the information into your file. You must always remember that the records themselves may not be accurate.