RAG (Retrieval-Augmented Generation)

Retrieval-Augmented Generation, or RAG, is a technique that combines a language model with a search or retrieval system, allowing the model to pull in relevant information before generating a response. Without RAG, a model can only draw on what it absorbed during training, which has a fixed cutoff date and doesn't include anything proprietary, recent, or specialized. With RAG, a model can search a document library, a company knowledge base, or the open web and incorporate that material into its answer. This is how AI tools can accurately respond to questions about internal documents they were never trained on, and how some products can cite current sources rather than relying on potentially outdated training data. RAG also substantially reduces hallucination in knowledge-intensive tasks because the model is working from retrieved source material rather than generating from pattern memory alone.
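To make the retrieve-then-generate idea concrete, here is a minimal sketch of the RAG pattern in Python. The scoring function is deliberately simple (keyword overlap); production systems use embedding similarity and a vector database, and the example documents are invented for illustration. The final prompt would then be sent to a language model.

```python
import re

# Minimal RAG sketch: (1) retrieve the documents most relevant to a query,
# (2) build an augmented prompt that grounds the model in those documents.
# Scoring here is naive keyword overlap; real systems use embeddings.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by shared words with the query, best first."""
    query_words = set(re.findall(r"\w+", query.lower()))
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(re.findall(r"\w+", doc.lower()))),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble the augmented prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"

# Hypothetical source library (invented for this example).
docs = [
    "John Smith was born in 1850 in Albany County, New York.",
    "The 1850 Agricultural Census lists no land for John Smith.",
    "Weather in Albany is cold in winter.",
]
prompt = build_prompt("When was John Smith born?", docs)
print(prompt)
```

The key design point is that the model is instructed to answer only from the retrieved context, which is exactly the restriction NotebookLM enforces on its vetted sources.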
The RAG process is modeled closely by the Google Gemini/NotebookLM/Gems workflow. You use Gems to create specifically designed prompts that restrict Gemini to an assigned research task; using that research prompt, you collect vetted sources into a NotebookLM notebook and then use the restricted Gemini AI to extract accurate information from those vetted sources. As a bonus, NotebookLM offers several pre-programmed ways to analyze and communicate the information derived from the workflow.
Here is an example of a Google Gem with the title Master Genealogist.
Revised Prompt: The Professional Genealogy Research Architect
Role: Act as a Board-certified Professional Genealogist (BCG) and Expert Research Consultant. Your primary goal is to guide a "Reasonably Exhaustive Search" while adhering strictly to the Genealogical Proof Standard (GPS).
Objective: Conduct rigorous evidence analysis to resolve complex identity and kinship problems. You will not accept "hints" as facts; you will treat every data point as a claim to be verified.
Operating Framework:
For every piece of information provided, you must apply this multi-layer analysis:
Source & Information Taxonomy:
Source: Original, Derivative, or Authored.
Information: Primary, Secondary, or Undetermined.
Evidence: Direct, Indirect, or Negative.
Correlation & Logic:
Compare independent sources to look for patterns or discrepancies.
Explicitly address Conflicting Evidence (e.g., age variances, name spelling shifts, or geographic outliers).
The "FAN Club" Filter: Analyze the person within the context of their Friends, Associates, and Neighbors to overcome brick walls.
Reliability & Weighting: Assign a Weight of Evidence score (Low, Moderate, High) to each conclusion based on the quality of the documentation.
Citations: Every record mentioned must include a full citation formatted according to the Evidence Explained (Elizabeth Shown Mills) style.
The Workflow:
Phase 1: The Research Objective. I will provide a specific, focused research question.
Phase 2: Evidence Audit & Gap Analysis. You will analyze my "Known Facts" and identify what is missing (e.g., "No evidence of land ownership despite 1850 Agricultural Census entry").
Phase 3: Strategic Research Plan. You will suggest a prioritized list of record types (Probate, Land, Military, Church, etc.) and specific repositories or databases to consult.
Action: Acknowledge your commitment to these standards. Then, ask me for my Research Objective and Known Facts to begin the investigation.
This prompt was developed from my initial descriptive statement and then refined by the Gem app. The results were then given to Gemini for further refinement. The resultant prompt can then be used automatically by Google Gemini to do additional research with extreme depth and near-total accuracy.
The sources and analysis obtained are then put into a NotebookLM notebook, which forces Gemini to use only the vetted sources and provide deep research analysis and responses. With the Pro level of Gemini (currently $20 a month) you can add as many as 300 sources to a notebook for analysis. Google Gemini also has one of the largest context windows (measured in tokens) of any of the AI programs. In the context of artificial intelligence, a token is the fundamental unit of data that AI models process during training and inference. Here is a list of current AI platforms ranked by context window.
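To give an intuition for what a token is, the sketch below approximates subword tokenization by breaking words into fixed-size chunks. This is purely illustrative: real tokenizers (BPE, SentencePiece, and the like) learn their splits from data, so the actual pieces a model like Gemini produces will differ.

```python
# Illustrative only: approximate subword tokenization by chopping each word
# into fixed-size chunks. Real model tokenizers learn their splits from data.

def rough_tokens(text: str, chunk: int = 4) -> list[str]:
    """Split text on whitespace, then break each word into chunk-sized pieces."""
    tokens = []
    for word in text.split():
        for i in range(0, len(word), chunk):
            tokens.append(word[i:i + chunk])
    return tokens

print(rough_tokens("Genealogical research"))
# → ['Gene', 'alog', 'ical', 'rese', 'arch']
```

A rule of thumb is that one token is roughly three to four characters of English text, which is why a 2,000,000-token window can hold the equivalent of thousands of pages of source material.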
Top Ranking AI Websites by Context Window (2026)
| Rank | AI Platform | Model(s) | Usable Token Limit | Key Strength |
|------|-------------|----------|--------------------|--------------|
| 1 | Google Gemini | Gemini 3.1 Pro / Deep Think | 2,000,000 | Largest mainstream window; deep integration with Google Workspace. |
| 2 | OpenAI (ChatGPT) | GPT-5.4 (xhigh) / 5.5 | 1,100,000 | High reasoning performance across the full context window. |
| 3 | Anthropic (Claude) | Claude 4.7 / Mythos Preview | 1,000,000 | Best for "needle-in-a-haystack" retrieval and stable long-form writing. |
| 4 | xAI (Grok) | Grok 4.1 | 1,000,000 | Rapid real-time information processing from X (formerly Twitter). |
| 5 | DeepSeek | DeepSeek V4 Pro (Max) | 1,000,000 | Most efficient high-context open-weight implementation. |
| 6 | Moonshot AI (Kimi) | Kimi K2.6 | 256,000 - 1M+ | Specialized in long-context memory (Kimi K2 has reached 1B tokens in labs). |
| 7 | Alibaba (Qwen) | Qwen 3.6 Plus | 262,000 | Excellent for repository-scale code understanding. |
🔍 Important Considerations
Theoretical vs. Web-Accessible: While models like Llama 4 Scout or Kimi K2 have been benchmarked at 10 million to 1 billion tokens, these limits are often restricted to specialized developer environments or high-tier API access rather than the standard "chat" interface you find on their websites.
The "Pro" Gap: Most platforms gate their highest token limits behind "Pro," "Ultra," or "Max" subscriptions. For example, Claude’s 1M token window is typically reserved for Tier 4+ organizations or specific beta previews.
Performance Degradation: Having a high number of "usable" tokens doesn't always mean the AI "remembers" everything perfectly.
Gemini 3.1 Pro and Claude 4.7 currently lead in benchmarks for accurately retrieving information buried in the middle of a 1M+ token prompt.