Thursday, December 23, 2010

More on metadata and genealogy

In my last post on this subject, I talked about the advantage of using metadata to help identify background information about digital photographs, including geolocation information. There are two ways to obtain metadata: straight from the camera or scanner, or by adding the information through a photo editing program. Both Apple and Microsoft operating systems (and others too, like Linux) allow you to name and rename photos. The default name given to images by both cameras and scanners is usually some arbitrary string of numbers and/or letters, occasionally followed by a date and time. Most scanning software, and some of the software for downloading images from a digital camera, also lets you select a name for the files you are downloading to your computer. But unless you take the time to specify an alternative name, the images load onto your computer as "HPCX11002" or whatever.

Even if you don't go any further in identifying your pictures, you should spend the time to give each one a meaningful name and perhaps identify the people in the picture. Many programs, like iPhoto and Picasa, and a number of online sites, like Flickr and Fotopedia, let you add captions and descriptions. However, if the image is moved to another computer or another program, usually all of that information except the file name is lost. We have come a long way from the old eight-character DOS naming conventions. File names can now contain a lot of information. I use a format for scanned images and photos that starts with the date the image was taken, scanned or created. For example, a picture of a birthday party would have a file name like "2010-12-29 Eva Overson birthday party". I view these names as temporary, since I still need to go back and identify all of the people in the photo.
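The naming format described above can be sketched in a few lines of Python. This is only an illustration: the function name photo_file_name, the default "jpg" extension and the list of characters to strip are my own assumptions, not part of any standard.

```python
from datetime import date


def photo_file_name(taken: date, description: str, extension: str = "jpg") -> str:
    """Build a name like '2010-12-29 Eva Overson birthday party.jpg'.

    Hypothetical helper: the date comes first in YYYY-MM-DD form so that
    an alphabetical listing of the files is also a chronological one.
    """
    # Characters that are not allowed in file names on common systems
    # (an assumed list; adjust it for your own operating system).
    forbidden = '<>:"/\\|?*'
    cleaned = "".join(ch for ch in description if ch not in forbidden).strip()
    return f"{taken.isoformat()} {cleaned}.{extension}"


print(photo_file_name(date(2010, 12, 29), "Eva Overson birthday party"))
# prints: 2010-12-29 Eva Overson birthday party.jpg
```

Putting the year first, then the month, then the day is the design choice that matters here, since it keeps the files in date order when sorted by name.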

Originally, we tried making a list of the names of the photos and identifying the people in a separate document, at first mostly handwritten and later in a computer file. The main problem is that the identifying information gets lost or separated from the pictures. But until relatively recently, there were no alternatives. Having the information saved within the image file itself means that the information will generally stay with the photo or image even if it is moved to another computer or used by a different program.

So how do you attach or embed the identifying information in the photo or image itself? Through the use of metadata. There are a number of standards for including metadata in an image. Adobe has one standard for storing metadata in digital images called XMP, for Extensible Metadata Platform.  XMP is based on XML, or Extensible Markup Language. XML is a flexible format for creating structured computer documents, and a variety of XML specifications exist for different applications. XML is itself an application profile, or restricted form, of SGML, the Standard Generalized Markup Language [ISO 8879]. XML uses tags to define (or "mark up") the identity or purpose of each piece of information in a file.
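To make the idea of tags concrete, here is a small sketch using Python's built-in XML library. The tag names (photo, title, date, people, person) and the function name describe_photo are invented for illustration only; they are not the actual XMP vocabulary, which is considerably more involved.

```python
import xml.etree.ElementTree as ET


def describe_photo(title: str, taken: str, people: list[str]) -> str:
    """Return a tiny XML document describing one photo.

    The tag names are hypothetical examples, not real XMP tags.
    """
    photo = ET.Element("photo")
    # Each tag names the purpose of the text it surrounds.
    ET.SubElement(photo, "title").text = title
    ET.SubElement(photo, "date").text = taken
    people_element = ET.SubElement(photo, "people")
    for name in people:
        ET.SubElement(people_element, "person").text = name
    return ET.tostring(photo, encoding="unicode")


xml_text = describe_photo("Eva Overson birthday party", "2010-12-29", ["Eva Overson"])
print(xml_text)

# Any program that understands XML can read the pieces back out by tag name.
parsed = ET.fromstring(xml_text)
print(parsed.find("title").text)
```

The point is that each piece of information carries its own label, so another program can pick out the title or the people without guessing.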

Another standard for the use of metadata is more concerned with the technical information supplied by the scanner or camera and is called EXIF or Exchangeable Image File Format.  This is the data that is automatically embedded in the photograph by the camera or scanner.

Yet another standard is the IPTC or International Press Telecommunications Council Information Interchange Model. This standard is used by news organizations to embed information such as the title, caption, author or photographer, the date of the image and the location as well as other information. The IPTC standard has been supported by Adobe for a long time. Nearly all of the current software will recognize and support XMP, EXIF and IPTC metadata.

Presently, there is no universal way to either view or attach metadata. Although some of the online "free" programs like Picasa can view some of the data, you need to use a program like Adobe Photoshop or Photoshop Elements to add extensive metadata about a picture. Interestingly, once metadata has been added to an image, Apple's OS X search program will find files with metadata embedded.

Now that you know a little bit about what metadata is, be sure to read my next post (or so) about using metadata to keep track of images.

  1. XMP was a good idea as it addresses RDF and can be inserted into a number of different data formats. Unfortunately, although it is now associated with an ISO standard, the Adobe trademark still prevents it being universally adopted.

    An interesting alternative is to turn it all inside-out. Instead of hiding the metadata in some other format, wrap the other data in an XML-based "container" that has all the required metadata.

    Inserting binary data into an XML file is possible, although sometimes frowned upon. The image and its metadata could even be maintained as separate but linked pairs of files, thus mimicking the separate Data and Resource forks of the old Macintosh filing system.
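
The "inside-out" container idea in the comment above can be sketched roughly as follows. This is only one possible reading of it: the tag names (photoContainer, metadata, imageData) and the function names wrap_image and unwrap_image are hypothetical, and Base64 is assumed as the way to carry the binary image data as text inside the XML.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_image(image_bytes: bytes, title: str, people: list[str]) -> str:
    """Wrap image data and its metadata in one XML container (hypothetical layout)."""
    container = ET.Element("photoContainer")
    meta = ET.SubElement(container, "metadata")
    ET.SubElement(meta, "title").text = title
    for name in people:
        ET.SubElement(meta, "person").text = name
    # XML holds text, so the binary image is encoded as Base64 text.
    data = ET.SubElement(container, "imageData", encoding="base64")
    data.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(container, encoding="unicode")


def unwrap_image(xml_text: str) -> bytes:
    """Recover the original image bytes from the container."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.find("imageData").text or "")


# Stand-in bytes for the demonstration; a real image file would be read
# from disk in binary mode instead.
sample_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0])
wrapped = wrap_image(sample_bytes, "Eva Overson birthday party", ["Eva Overson"])
print(unwrap_image(wrapped) == sample_bytes)
```

Base64 text takes up roughly a third more space than the original binary data, which may be one reason embedding binary data in XML is sometimes frowned upon.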