Some people eat, sleep and chew gum, I do genealogy and write...

Sunday, March 8, 2015

Digitizing Genealogy -- Beyond Resolution to Standards Part Two

What are the archival standards for digitizing both documents and photographs? In past posts, I have mentioned the fact that advertisements for scanners include claims for extraordinarily high "dpi" resolution. It is interesting to note that the manufacturers of high-quality photographic printers no longer talk about the "dpi" or dots per inch of their prints, but merely claim high quality. You can find out what the dpi is, but you have to look for it carefully. The resolution claimed for the high-end printers is now about 5760 x 1440. In essence, the printer manufacturers have started using the total pixel count in a way that is similar to how monitors and TVs are now sold. Most monitors are also advertised by their number of pixels. You see claims for monitors of 1080p and higher. See Understanding HDTV Resolution for more details.

What do the archivists say about resolution? What is the current standard?

In this case, I always start with the Library of Congress' Preservation Directorate. The preservation efforts of the Library of Congress address issues with Audio-Visual materials, Books, Paper (including manuscripts, drawings, newspapers, prints, posters, maps, etc.) and Photographs. Most of the standards they support involve the preservation of the original documents or books, but that is beyond the scope of this series. The pertinent information from the Library of Congress is in the digital preservation section. Their general guidelines are simple:
  • Identify and select what to save
  • Organize the files selected to be saved
  • Save copies on at least two different storage media (e.g., USB drive and external hard drive) and keep these in separate physical locations
  • Migrate saved copies to a current storage medium about every five years
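One practical question the guidelines above leave open is how you know that the copies on your two storage media are really identical. Here is a minimal sketch in Python, assuming you compare SHA-256 checksums. The function names and the file paths in the comment are my own hypothetical examples and do not come from the Library of Congress.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks to spare memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, *copies):
    """True only if every copy has the same checksum as the original."""
    expected = sha256_of(original)
    return all(sha256_of(copy) == expected for copy in copies)


# Hypothetical usage: one copy on a USB drive, one on an external hard drive.
# copies_match("scans/deed_1887.tif",
#              "/media/usb/scans/deed_1887.tif",
#              "/media/external/scans/deed_1887.tif")
```

Running the same check again after each migration to a new storage medium would show whether a file changed along the way.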
They also make the following statement about the resolution of digital images:
What resolution should I use when digitizing? 
Note that resolution is not the only consideration when digitizing. See Technical Guidelines for Digitizing Cultural Heritage Materials. In short, it depends on what is being digitized and the intended use of the digitized image. See pp. 49-67 of the guidelines above.
If you are involved in scanning or photographically digitizing your genealogical documents, you should, at least, be familiar with the standards offered in the linked Technical Guidelines above. Over the years, I have heard many genealogical presentations and read a number of articles by genealogists who provide "standards" for scanning and photographs, but few of them refer to, or are apparently aware of, the national standards. They often throw out numbers for dpi or whatever without any qualification or support. I have read and heard presentations that claim that you should scan photos at the "highest resolution supported by your scanner" and other such nonsense. The real issue, as I started to illustrate at the beginning of this post, is how the images are going to be displayed. In short, if you scan a photograph at 600 dpi and then print it out at 300 dpi, what was the point of the higher resolution scan? And if you scan a photo at some huge level of dpi claimed by the scanner's manufacturer, what happens when you view the document on a lower resolution screen?
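The arithmetic behind the 600 dpi versus 300 dpi question can be worked through in a few lines of Python. This is only a sketch: the 4 x 6 inch photo and the 1920 x 1080 screen are example values I have assumed, and the function names are my own.

```python
def scan_pixels(width_in, height_in, scan_ppi):
    """Pixel dimensions produced by scanning an original at scan_ppi."""
    return round(width_in * scan_ppi), round(height_in * scan_ppi)


def print_size_inches(pixels, print_ppi):
    """Largest print, in inches, that uses every scanned pixel at print_ppi."""
    width_px, height_px = pixels
    return width_px / print_ppi, height_px / print_ppi


def fraction_shown_on_screen(pixels, screen=(1920, 1080)):
    """Share of the scanned pixels a screen can show when the whole image is fit to it."""
    width_px, height_px = pixels
    screen_w, screen_h = screen
    # Never scale up: an image smaller than the screen is shown pixel for pixel.
    scale = min(screen_w / width_px, screen_h / height_px, 1.0)
    return scale * scale


photo = scan_pixels(4, 6, 600)          # assumed 4 x 6 inch print scanned at 600 ppi
print(photo)                            # (2400, 3600)
print(print_size_inches(photo, 300))    # (8.0, 12.0)
print(round(fraction_shown_on_screen(photo), 2))
```

In other words, the extra pixels from the 600 dpi scan only matter if you enlarge the print or zoom in on the screen. Printed at the original size at 300 dpi, or fit to an ordinary monitor, most of them are never seen.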

The Library of Congress document explains all this in detail. The summary of their comments on the subject is as follows:
Higher spatial resolution provides more pixels, and generally will render more fine detail of the original in the digital image, but not always. The actual rendition of fine detail is more dependent on the spatial frequency response (SFR) of the scanner or digital camera (see Quantifying Scanner/Digital Camera Performance below), the image processing applied, and the characteristics of the item being scanned. Also, depending on the intended usage of the master files, there may be a practical limit to how much fine detail is actually needed.
I absolutely agree with this statement. Did I mention that the document is 101 pages long and has four pages of links to additional documents? If you are at all serious about the subject of digitization, you should be familiar with the current standards. Here is the Library of Congress statement in summary on the issue of resolution:
Resolution: Requires sufficient resolution to capture all the significant detail in originals. Currently the digital library community seems to be reaching a consensus on appropriate resolution levels for preservation digitization of text based originals – generally 400 ppi for grayscale and color digitization is considered sufficient as long as a QI of 8 is maintained for all significant text. This approach is based on typical legibility achieved on 35mm microfilm (the current standard for preservation reformatting of text-based originals), and studies of human perception indicate this is a reasonable threshold in regards to the level of detail perceived by the naked eye (without magnification). Certainly all originals have extremely fine detail that is not accurately rendered at 400 ppi. Also, for some reproduction requirements this resolution level may be too low, although the need for very large reproduction is infrequent. 
Unlike text-based originals, it is very difficult to determine appropriate resolution levels for preservation digitization of many types of photographic originals. For analog photographic preservation duplication, the common approach is to use photographic films that have finer grain and higher resolution than the majority of originals being duplicated. The analogous approach in the digital environment would be to digitize all photographic camera originals at a resolution of 3,000 ppi to 4,000 ppi regardless of size. Desired resolution levels may be difficult to achieve given limitations of current scanners. 
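To get a feel for the numbers in the two quoted paragraphs, here is a rough Python sketch. The pixel and file size arithmetic is straightforward. The Quality Index (QI) function is an assumption on my part: it uses the formula commonly attributed to Cornell University's digital imaging tutorial, QI = (ppi x 0.039 x h) / 2 for grayscale and color (divided by 3 for bitonal), where h is the height of the smallest significant character in millimeters. Check the Technical Guidelines themselves before relying on it. The 24 x 36 mm film frame and the 8 x 10 inch print are my own example sizes.

```python
MM_PER_INCH = 25.4


def pixel_dimensions(width_mm, height_mm, ppi):
    """Pixel dimensions of an original of the given size scanned at ppi."""
    return (round(width_mm / MM_PER_INCH * ppi),
            round(height_mm / MM_PER_INCH * ppi))


def uncompressed_megabytes(pixels, bytes_per_pixel=3):
    """Uncompressed size in MB, assuming 24-bit color (3 bytes per pixel)."""
    width_px, height_px = pixels
    return width_px * height_px * bytes_per_pixel / 1_000_000


def quality_index(ppi, char_height_mm, bitonal=False):
    """Assumed QI formula (see the note above); not taken from this post's source."""
    divisor = 3 if bitonal else 2
    return ppi * 0.039 * char_height_mm / divisor


# Text: 400 ppi with 1 mm characters comes out at roughly QI 8.
print(quality_index(400, 1.0))

# Photographs: a 35mm film frame (24 x 36 mm) at 4,000 ppi.
frame = pixel_dimensions(36, 24, 4000)
print(frame, round(uncompressed_megabytes(frame), 1))

# The same 4,000 ppi applied to an 8 x 10 inch original (203.2 x 254 mm).
sheet = pixel_dimensions(203.2, 254, 4000)
print(sheet, round(uncompressed_megabytes(sheet)))
```

A small film frame at 4,000 ppi is a manageable file, but the same resolution applied "regardless of size" to an 8 x 10 inch original runs to well over a billion pixels, which helps explain the remark that desired resolution levels may be difficult to achieve with current scanners.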
You might say that this doesn't help much, but in fact, it is exactly appropriate. Both when I was digitizing documents directly for FamilySearch and during my recent digitization of the photographic collection that ended up at the University of Arizona, I worked closely with the organization involved to follow its own guidelines. In both cases, my digitized images were acceptable for archive purposes. 

Unfortunately, images uploaded to various websites do not always comply with the standards of quality set forth by the Library of Congress. In addition, if you look around on the Web, you will see a number of entirely different standards. What I can say is that anyone selling you a device and claiming a particular resolution in dpi or ppi or lpi is probably not telling the entire story, and anyone telling you the same thing about doing your own digitization project is probably not telling you the whole story either. 

Of course, I have a lot more to say about these issues in future posts in this series. This will be a very long series. 
