Saturday, March 15, 2014

Thoughts on User-controlled search filters

I recently received the following comment:
Once we build skill at using a particular search engine, we are frustrated when a company changes it and forces us to abandon that knowledge and relearn (a perceived slow-down in accomplishing our task). I think the company isn't focused on improvements for the user as much as improved access to the collection it hosts. What are your thoughts about user-controlled search filters? It gives us perceived control of results, but feels more like the automated phone service where we continually press 1 for English, press 3 for another option,...then another and another, losing our focus which was an answer to our question. To what extent are use-accessed filters an aid or a distraction?
This is an interesting question. Here is a screenshot of the filter menu in the Card Catalog:

Here is a similar filtering option from

It is fairly common to have filtering choices in a database search engine. The commenter suggests that filtering may be a problem. The answer is fairly complex and turns on the difference between word-by-word searches (also called string searches) and catalog-based searches. When you do a word-by-word search, such as the kind Google uses, you are in reality doing what is called a keyword search. In a catalog search, you are searching by pre-determined subjects created by the cataloger. In both types of searches, you are, in a sense, guessing at which terms to use and relying on the accuracy of your guess.

Let me give a general hypothetical example of how these two different systems work. Let's suppose you were searching for information about your ancestor, Albert Edward Doe. You need birth, marriage and death information and have only very general information to go on. You know he lived in New York in the 1840s.

If you were doing a Google search (keyword) you could enter the following items including the quotation marks:

"Albert Edward Doe" New York 1840 genealogy

You are asking Google to look for a website where all of those terms show up at the same time. As a note, the date is problematic because, if you think about it, you are asking for one specific year rather than a range. Of course, you could use a variety of wildcard searches or whatever, but the real issue is that you are looking at the entire Internet and guessing at what is there. It is easy to vary the terms and come up with different results. What if I put the same terms into the search engine, minus the word genealogy? In effect, I am doing the same thing with a much more restricted data set. I am guessing that someone recorded information about my ancestor using the words (or wildcard entries) I choose.
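To make the "all of those terms at the same time" idea concrete, here is a minimal Python sketch. It is only a simplified model of a keyword search, not how Google actually ranks or matches pages, and the three sample pages and the function names are made up for illustration.

```python
import shlex

def parse_query(query):
    """Split a query into terms, keeping quoted phrases together."""
    return [term.lower() for term in shlex.split(query)]

def matches_all(page_text, query):
    """True if every term (or quoted phrase) appears somewhere in the text.

    This is plain substring matching, a deliberate simplification.
    """
    text = page_text.lower()
    return all(term in text for term in parse_query(query))

# Hypothetical pages standing in for "the entire Internet".
pages = {
    "page_a": "Genealogy of the Doe family: Albert Edward Doe, born 1840 in New York.",
    "page_b": "Albert Edward Doe married in New York in 1843.",
    "page_c": "Edward Albert Doe of New York, 1840 census, genealogy notes.",
}

query = '"Albert Edward Doe" New York 1840 genealogy'
hits = [name for name, text in pages.items() if matches_all(text, query)]
print(hits)  # ['page_a']

# Dropping the exact year and the word "genealogy" changes the results.
looser = '"Albert Edward Doe" New York'
looser_hits = [name for name, text in pages.items() if matches_all(text, looser)]
print(looser_hits)  # ['page_a', 'page_b']
```

In this toy data, the page about the 1843 marriage is missed by the first query only because it does not contain the exact string 1840, which is the problem with searching on a specific date.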

Does using a catalog give you any advantage in this searching process? By the way, this is really what the commenter above is asking. Another way of asking the question would be, does using a pre-determined set of search parameters help you find what you are looking for? The first answer to that question is simple: it depends on the accuracy of the catalog. If you approached the search by using the catalog entry system, you would go to the Card Catalog and start filtering out all of the extraneous categories. You would filter out anything except New York in the United States and then filter the time frame to include the time your ancestor lived. This narrows the search, but what if you are wrong and your ancestor lived in Massachusetts? Didn't you just shoot yourself in the foot by looking only in New York?
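The same trade-off can be sketched in a few lines of Python. This is only an illustration under assumed data: the collection titles, the fields, and the filter_catalog function are hypothetical and do not describe how any real Card Catalog is built.

```python
from dataclasses import dataclass

@dataclass
class Collection:
    title: str
    state: str
    start_year: int
    end_year: int

# Hypothetical catalog entries, as a cataloger might have described them.
catalog = [
    Collection("New York Marriage Index", "New York", 1820, 1860),
    Collection("New York Probate Files", "New York", 1870, 1900),
    Collection("Massachusetts Vital Records", "Massachusetts", 1800, 1850),
]

def filter_catalog(entries, state=None, year_from=None, year_to=None):
    """Return titles of entries that pass every filter that was set."""
    results = []
    for entry in entries:
        if state is not None and entry.state != state:
            continue  # wrong place
        if year_from is not None and entry.end_year < year_from:
            continue  # collection ends before the period of interest
        if year_to is not None and entry.start_year > year_to:
            continue  # collection starts after the period of interest
        results.append(entry.title)
    return results

# Filtering to New York in the 1840s narrows the list...
narrow = filter_catalog(catalog, state="New York", year_from=1840, year_to=1849)
print(narrow)  # ['New York Marriage Index']

# ...but it also hides the Massachusetts collection entirely.
broader = filter_catalog(catalog, year_from=1840, year_to=1849)
print(broader)  # ['New York Marriage Index', 'Massachusetts Vital Records']
```

If the ancestor actually lived in Massachusetts, the narrow search can never find him, no matter how accurate the catalog is.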

In all these examples, it is presupposed that you would make a whole series of searches using both systems and varying the search terms extensively. If you realize that there is no "one way" to make a search that will produce positive results, then you are not overly concerned with the efficacy of any one method. You must also take into account that the cataloging system used in any given repository may be poorly constructed or incomplete. You must also always remember that you might be looking for the wrong name in the wrong place at the wrong time.

As genealogists we should practice searching as much as possible in both keyword-based search engines and in catalog-based search engines. In both cases, it is a good idea to vary the search terms in a systematic way so that you have a reasonable chance of hitting on terms that have been used either by the target websites or by the catalogers. The direct answer to the question is that user-based filters are an aid but not an end in themselves. For more information, see Keyword Searching vs. Subject Searching from the George Mason University Library.
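One way to vary the search terms systematically is to write down the alternatives for each part of the search and work through every combination. The short Python sketch below does just that; the name variants, places, and the build_queries function are assumptions chosen for illustration, and you would substitute your own.

```python
from itertools import product

# Hypothetical alternatives for each part of the search.
name_variants = ['"Albert Edward Doe"', '"Albert E. Doe"', '"A. E. Doe"']
place_variants = ["New York", "Massachusetts"]
extra_terms = ["genealogy", ""]  # the empty string means "leave the word out"

def build_queries(names, places, extras):
    """Build one query string for every combination of the alternatives."""
    queries = []
    for name, place, extra in product(names, places, extras):
        # Skip empty parts so there are no stray spaces in the query.
        queries.append(" ".join(part for part in (name, place, extra) if part))
    return queries

queries = build_queries(name_variants, place_variants, extra_terms)
for q in queries:
    print(q)
```

Three names, two places, and two choices for the extra word give twelve searches to try, and keeping the list makes it easy to record which ones you have already run.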

