Some people eat, sleep and chew gum, I do genealogy and write...

Monday, December 22, 2025

An AI Genealogical Source Reliability Scale and Conflict Audit Framework

There is, as yet, no generalized standard that clarifies the levels of trust appropriate for validating AI-generated information, including research by small organizations or individuals. Although developing a suggested standard would involve an international committee or a similar effort, I think the following is a good basis for discussion.

AI Source Reliability Scale and Conflict Audit Framework
Grade 1
Verified Primary
  • Human researcher has compared the AI transcription word-for-word against the original image. Zero discrepancies found.
  • Word-for-word human audit (comparing AI transcription against original image).
  • Zero (Human verified)
Meets the Genealogical Proof Standard (GPS) for accuracy; preferred for final evidence.

Grade 2
High-Confidence AI
  • AI confidence score >95%. Document is a clear, printed text (e.g., modern book). Facts align with known historical timelines.
  • Verification of metadata and spot-checking alignment with historical timelines.
  • Low (High confidence scores and printed format)
High evidential weight but requires citation of AI involvement.

Grade 3
Probable Draft
  • AI-transcribed cursive or archaic script. Readable but contains "low-confidence" markers or [?] symbols.
  • Full human review required; manual audit if surnames or dates are missed.
  • Moderate (Risk of misread archaic script)
Treat as a draft; requires human collation to meet GPS.

Grade 4
Unverified Lead
  • Summary or extraction provided by AI without a direct link to a specific line in the image.
  • Finding direct links to specific lines in images; manual disentanglement of FAN (Friends, Acquaintances, Neighbors) club.
  • High (Risk of name-merging or date-shifting)
AI suggestions are clues, not evidence; requires manual verification.

Grade 5
Suspected Fiction
  • AI-generated "fact" that contradicts established records or lacks a verifiable citation (Hallucination).
  • Re-verification of physical files; do not enter data into tree.
  • Extreme (Hallucination/Fabrication)
Violates the Genealogical Proof Standard (GPS) requirements for complete and accurate source citations.
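For readers who keep research logs in software, the five grades above could be encoded roughly as follows. This is only a minimal sketch for discussion, not part of any standard: the `AIOutput` field names, the `assign_grade` function, and the order in which the checks run are all my own assumptions. Only the criteria themselves (word-for-word human audit, a confidence score above 95%, printed text, timeline alignment, a direct link to a line in the image, a verifiable citation) come from the scale.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    """The five grades of the scale; a lower number means more trust."""
    VERIFIED_PRIMARY = 1
    HIGH_CONFIDENCE_AI = 2
    PROBABLE_DRAFT = 3
    UNVERIFIED_LEAD = 4
    SUSPECTED_FICTION = 5


@dataclass
class AIOutput:
    """Hypothetical record describing one piece of AI-generated information."""
    human_verified_word_for_word: bool = False
    discrepancies_found: int = 0
    ai_confidence: float = 0.0            # 0.0 to 1.0, as reported by the tool
    printed_text: bool = False            # clear printed source, e.g. a modern book
    aligns_with_timeline: bool = False    # facts fit known historical timelines
    linked_to_image_line: bool = False    # points to a specific line in the image
    has_verifiable_citation: bool = False
    contradicts_established_records: bool = False


def assign_grade(item: AIOutput) -> Grade:
    """Assign a grade, checking the least trustworthy cases first (an assumption)."""
    # Grade 5: contradicts established records or lacks a verifiable citation.
    if item.contradicts_established_records or not item.has_verifiable_citation:
        return Grade.SUSPECTED_FICTION
    # Grade 4: summary or extraction with no direct link to a line in the image.
    if not item.linked_to_image_line:
        return Grade.UNVERIFIED_LEAD
    # Grade 1: word-for-word human audit with zero discrepancies.
    if item.human_verified_word_for_word and item.discrepancies_found == 0:
        return Grade.VERIFIED_PRIMARY
    # Grade 2: confidence above 95%, printed text, consistent with timelines.
    if item.ai_confidence > 0.95 and item.printed_text and item.aligns_with_timeline:
        return Grade.HIGH_CONFIDENCE_AI
    # Grade 3: everything else is a draft that needs full human review.
    return Grade.PROBABLE_DRAFT


if __name__ == "__main__":
    book_page = AIOutput(
        ai_confidence=0.97,
        printed_text=True,
        aligns_with_timeline=True,
        linked_to_image_line=True,
        has_verifiable_citation=True,
    )
    print(assign_grade(book_page).name)
```

Checking Grade 5 and Grade 4 first is a deliberately cautious choice in this sketch: a high confidence score never outranks a missing citation or a missing link to the image.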

In this case, the levels of trust are set out in a descending manner. Genealogical organizations might be relied upon to provide such guidance. I am open to discussion. 

2 comments:

  1. I am a thoroughgoing AI skeptic. I read articles like this one and think: "If we have to go through all this folderol to assess the veracity, or lack thereof, of an AI product's output, why bother with it at all? I'd rather use that time to expand my research." I read about AI interpreting old handwriting, and I laugh myself silly. I've worked with the 1950 census; I've seen what their AI application did to names, and more. I transcribed a number of entries in that census, correcting the errors. I am trained as a paleographer, so I can do a better job of interpreting old handwriting than any AI. As far as I'm concerned, AI is not ready for prime time.

    Replies
    1. For more than three years now, I have been deeply involved with the issue of whether or not AI can be reliably used for genealogical research. The question is still under consideration. The current discussion revolves around the need for a genealogical researcher to know more than the AI chatbot in order to make sure the chatbot is correct in its responses to prompts. This is an ongoing issue for discussion. Google Gemini 3 is now the lead in handwriting recognition, but it still needs watching and interpretation. Thanks for your comment.
