The number and variety of "educational opportunities" are overwhelming, starting with classes on computers at local colleges and universities, but how many genealogists spend the time to obtain a degree in computer science or information science before starting out to do research on their family?
Basic computer skills range from the physical mechanics of entering data using a keyboard and mouse to understanding file structure and the operation of complex programs. But even with a good background in computer usage, it is a fact of life that the technology changes constantly. So today's genealogist is confronted with learning about computers while trying to understand the equally complex field of genealogy. As a side note, many people involved in genealogy assume that younger people, who have grown up using computers and cell phones, are a "step ahead" in entering the field of genealogy because of their background in technology. This is an illusion. Genealogical research requires additional skills of analysis and evaluation that are gained only by experience. It may be discouraging to the beginner, but learning computer skills is only the first step in doing effective genealogical research using all of the vast online sources.
I am going to have to assume that the readers of this blog post have at least a basic idea about how to use a computer or other computer-based device or they would not have gotten to this venue. This particular post is called Web basics because I find that even with good computer skills, researchers are not aware of the different ways you need to conduct searches online.
There are three basically different online search techniques that reflect three completely different ways of organizing information. Like it or not, as genealogists we are involved in the analysis, collection, classification, manipulation, storage, retrieval, movement, and dissemination of information. But to perform any of those activities, we have to first find the information. Following is a short analysis of the three different methods of approaching the finding function of online research.
You can think of research in the abstract as searching through an infinitely large pile of paper. Each piece of paper has a small piece of information. If you were to sit by the side of the pile and randomly pull out pieces of paper, what would be the chance that you would find what you were looking for? My guess is that the probability of finding what you want is close to zero. What is more, how do you know what you are looking for is even in the pile? Genealogists should be painfully aware that not all the information they need has yet been transferred to the vast online pile.
So ignoring the three different search techniques for a while, we should also have a basic idea of the types of records we are searching for and whether or not the particular types we need have migrated to the Web, that is, been digitized and indexed. Hmm. That brings up another issue. Genealogical information may be on the web as images of documents. Unfortunately, the technology for searching images of documents is still very rudimentary. So as genealogists we rely heavily on indexing and indexes. Even with all our vast electronic wonders, we still have to rely on someone, someplace, looking at each document image and manually transcribing the information. Of course, if the information we seek is text, it is much easier to find and search. But if the information is locked up in an image, we are back to visually searching the records, which is no different than going to a library or searching through microfilm copies.
Now back to the infinite pile. We all seem to instinctively understand that the pile needs to be organized in some way so that we can find what we are looking for. But how do we organize the pile? Well, librarians have been organizing their piles for quite a long time. They use a variety of complex cataloging systems. As children going to a school library, we probably heard of the "Dewey Decimal System" of organization and the corresponding card catalog. Books were (and still are in some libraries) organized on shelves by subject and then numbered in a way to make it easier to find the books. For genealogists this is an awkward system because almost everything ends up in Dewey Decimal Classification number 929. Here is a list of categories:
929.1 Genealogy
929.2 Family histories
929.3 Genealogical sources
929.4 Personal names
929.5 Cemetery records
929.6 Heraldry
929.7 Royal houses, peerage, orders of knighthood
929.8 Orders, decorations, autographs
929.9 Forms of insignia and identification
You can see that this set of categories is not all that useful. In any event, the whole Dewey Decimal System of classification has been supplanted by other more complex cataloging systems such as the Library of Congress Standards. Warning: getting into this area of searching can be very discouraging, as in, "I had no idea how complicated this could be." Just for fun, here are the Library of Congress Standards by category:
Resource Description Formats
- BIBFRAME (Bibliographic Framework Initiative) - serves as a general model for expressing and connecting bibliographic data
- EAD (Encoded Archival Description) - XML markup designed for encoding finding aids
- Extended Date/Time Format (EDTF) - comprehensive date/time definition for the bibliographic community
- MADS (Metadata Authority Description Standard) - XML markup for selected authority data from MARC21 records as well as original authority data
- MARC 21 formats - Representation and communication of descriptive metadata about information items
- MARCXML - MARC 21 data in an XML structure
- MODS (Metadata Object Description Standard) - XML markup for selected metadata from existing MARC 21 records as well as original resource description
- VRA Core -- The VRA Core is a data standard and XML schema for the description of works of visual culture as well as the images that document them
Digital Library Standards
- ALTO - Technical Metadata for Optical Character Recognition
- AudioMD and VideoMD - XML Schemas that detail technical metadata for audio- and video-based digital objects
- METS (Metadata Encoding & Transmission Standard) - Structure for encoding descriptive, administrative, and structural metadata
- MIX (NISO Metadata for Images in XML) - XML schema for encoding technical data elements required to manage digital image collections
- PREMIS (Preservation Metadata) - A data dictionary and supporting XML schemas for core preservation metadata needed to support the long-term preservation of digital materials.
- TextMD (Technical Metadata for Text) - XML schema that details technical metadata for text-based digital objects.
Information Resource Retrieval Protocols
- CQL (Contextual Query Language) - Formal, user-friendly query language for use between information retrieval systems
- SRU/SRW (Search and Retrieve URL/Web Service) - Web services for search and retrieval based on Z39.50 semantics
- Z39.50 - Supports information retrieval among different information systems
ISO Standards
- ISO 639-2: Codes for the representation of names of languages-- Part 2: Alpha-3 code.
- ISO 639-5: Codes for the representation of names of languages-- Part 5: Alpha-3 code for language families and groups.
- ISO/DIS 25577 - Information and documentation -- MarcXchange
- ISO 20775 - Schema for Holdings Information
Metadata for Digital Content: Developing institutional policies and standards at the Library of Congress
Recommended Format Specifications: Best practices for ensuring the preservation of, and long-term access to, the creative output of the nation and the world in both analog and digital formats
OK, now you can begin to see the first type of search. That is a search based on a cataloging system developed and imposed on the data pile by someone who makes up the systems. Searching in a catalog is a whole complicated study in itself. I spent my first few years of work as a bibliographer in a major university library. I became very familiar with the complexity of the cataloging systems.
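To make the idea of a catalog search a little more concrete, here is a minimal sketch in Python. It assumes a tiny catalog where someone has already assigned each book a Dewey class number. The class labels come from the 929 list above; the book titles and the function names (browse_by_class, describe) are invented purely for illustration and do not represent any real library system.

```python
# Dewey subdivisions of 929, taken from the list earlier in this post.
DEWEY_929 = {
    "929.1": "Genealogy",
    "929.2": "Family histories",
    "929.3": "Genealogical sources",
    "929.4": "Personal names",
    "929.5": "Cemetery records",
    "929.6": "Heraldry",
    "929.7": "Royal houses, peerage, orders of knighthood",
    "929.8": "Orders, decorations, autographs",
    "929.9": "Forms of insignia and identification",
}

# Hypothetical catalog cards: (title, class number). The titles are made up.
CATALOG = [
    ("The Ellis Family of Ohio", "929.2"),
    ("Parish Registers Explained", "929.3"),
    ("Surnames and Their Origins", "929.4"),
    ("A Hartley Family History", "929.2"),
]


def describe(class_number):
    """Return the label a cataloger gave to a class number."""
    return DEWEY_929.get(class_number, "unknown class")


def browse_by_class(class_number):
    """Return the titles filed under one class number, in catalog order."""
    return [title for title, number in CATALOG if number == class_number]


if __name__ == "__main__":
    print(describe("929.2"))
    for title in browse_by_class("929.2"):
        print("  " + title)
```

Notice that the search only works if you already know which category the cataloger chose. You are searching the cataloger's labels, not the contents of the books.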
Is there any hope? Sorry. Not much. The second method is the brute-force, bulldozer method called a string search. You can think here of Google. You type in a series of characters and the search engine tries to match your string of characters with any other characters out there in the pile. I wish it were just that simple. What really happens is that Google and other such search engines create their own catalog or structure of the data before beginning the string search (not string as in tying knots but string as in a series of text characters). You can probably guess that I am going to write more completely about each type of search, but at this point, what you need to know is that you type in a name and the program sees if it can find that name anywhere. Of course, you soon find that the searches return millions of results that simply illustrate the size of the selected pile, so there must be more to searching on Google than simply wishing that your results show up. Yes, there is, but you will have to wait until my subsequent posts.
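Here is a rough sketch in Python of the difference between a plain string search and a search that first builds its own structure over the pile. The three "documents" are invented, and the word index shown here is only my assumption about the simplest form such a structure could take. It is not a description of how Google or any other search engine actually works.

```python
from collections import defaultdict

# A hypothetical pile: document number -> text. All names are made up.
PILE = {
    1: "Mary Ellis born 1778 in Rhode Island",
    2: "Marriage of Thomas Ellis and Anna Hartley",
    3: "Census record for Mary Hartley, Salt Lake City",
}


def naive_search(query):
    """Brute force: scan every document for the exact string of characters."""
    q = query.lower()
    return sorted(doc for doc, text in PILE.items() if q in text.lower())


def build_index(pile):
    """Build a word -> set of document numbers lookup before any search."""
    index = defaultdict(set)
    for doc, text in pile.items():
        for word in text.lower().replace(",", " ").split():
            index[word].add(doc)
    return index


INDEX = build_index(PILE)


def indexed_search(query):
    """Return the documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    return sorted(set.intersection(*(INDEX.get(w, set()) for w in words)))


if __name__ == "__main__":
    print(naive_search("ellis"))
    print(indexed_search("mary hartley"))
```

The brute-force version has to read the whole pile for every search. The indexed version reads the pile once and then only looks up words, which is why the preparation step matters so much when the pile is enormous.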
Last, but certainly not least, computer programmers have come up with an entirely different way of organizing vast quantities of information that they call a wiki. Searching a wiki turns out to be completely different from searching a traditional (or even non-traditional) cataloging system and has its unique advantages and some disadvantages.
Perhaps you can now begin to grasp the complexity of the pile of information and the fact that there are different and somewhat complex methods of organizing the piles. As genealogists, I suppose we could blissfully ignore all this and go on our merry ways seeking our ancestors. We might even acquire some or many of the skills necessary over time. But now, we are faced with the huge online world, and sitting in a library in Salt Lake City or wherever is not all of the answer to our investigations.
The next posts on this subject will explore each of the three major methods of pile organization and give some ideas of how searches differ or are the same in each method.