Some people eat, sleep and chew gum, I do genealogy and write...

Friday, December 6, 2024

Major Breakthrough: 3.4 Billion Records Extracted From Historical Newspapers Were Added to MyHeritage

 

https://www.oldnews.com/en and myheritage.com

Artificial intelligence is beginning to have a huge impact on the way genealogists do research on the large online genealogy database and family tree websites. A recent email from MyHeritage.com explains how they used AI to index and extract 3.4 billion records from nearly 200 million English newspaper pages. I did a preliminary search for my great-grandfather, Henry Martin Tanner, on the OldNews.com website and got these results.


All these articles and many more are about my great-grandfather. I am sure that additional searches will turn up many articles about my other relatives as well. Now on to MyHeritage.com's announcement. Here are some quotes from the email.

We’re happy to announce the publication of four huge new collections of names and stories on MyHeritage, extracted from newspaper pages on OldNews.com. The collections contain 658 million records from Florida, Georgia, Alabama, and Mississippi; 998 million records from Texas, Arizona, New Mexico, Nevada, Utah, and Nebraska; 1 billion records from Delaware, Maryland, Virginia, West Virginia, and Pennsylvania; and 651 million records from North Carolina, South Carolina, and District of Columbia.

The new collections are searchable on MyHeritage, with the full images of the newspaper pages available on OldNews.com via direct links from MyHeritage. 

As part of this update, we’re also thrilled to share that OldNews.com now hosts more than 300 million newspaper pages!

Search the new collections now:

Search Names & Stories in Newspapers from Florida, Georgia, Alabama, and Mississippi

Search Names & Stories in Newspapers from Texas, Arizona, New Mexico, Nevada, Utah, and Nebraska

Search Names & Stories in Newspapers from Delaware, Maryland, Virginia, West Virginia, and Pennsylvania

Search Names & Stories in Newspapers from North Carolina, South Carolina, and District of Columbia

Now I don't normally go into such detail with these announcements, but I wanted to explain that the structured records in the new collections were extracted from nearly 200 million English newspaper pages using cutting-edge AI technology developed by the MyHeritage team.

The new collections allow MyHeritage users to uncover rich information about their ancestors that was previously out of reach. This is because they are indexed and structured, so they can be searched using imprecise names, nicknames and synonyms; whereas searching in newspapers that are not indexed is typically done using keywords and requires the user to write the name exactly as it appears in the newspaper. 

This AI is designed to extract not just names from the newspaper articles but also the relatives of every person mentioned, as well as additional fields such as occupations, residences, travel from one location to another, and more.

Last but not least, every record includes a useful summary of the article, generated automatically by AI.
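
To see why indexed, structured records make such a difference compared with plain keyword searching, here is a small Python sketch. Everything in it is an assumption for illustration only: the names, the sample sentences, the nickname table, and the functions `keyword_search` and `indexed_search` are made up here and do not represent how MyHeritage actually builds or searches its collections.

```python
# Hypothetical nickname/abbreviation table. A real index would use a far
# larger one; this is only enough to show the idea.
NICKNAMES = {"bill": "william", "will": "william", "wm": "william"}

def normalize(name):
    """Lowercase a name, drop periods, and map nicknames to one canonical form."""
    words = name.lower().replace(".", "").split()
    return [NICKNAMES.get(word, word) for word in words]

def keyword_search(pages, query):
    """Unindexed search: the query must appear exactly as printed on the page."""
    return [page for page in pages if query in page]

def indexed_search(records, query):
    """Indexed search: every normalized query word must be in the record's name."""
    wanted = set(normalize(query))
    return [rec for rec in records if wanted <= set(normalize(rec["name"]))]

# Invented newspaper sentences standing in for page text.
pages = [
    "Bill Carter, a farmer of Mesa, visited his sister Mary Carter.",
    "Wm. Carter was elected to the school board.",
    "William Carter sold his ranch last week.",
]

# Invented structured records, one per person mentioned, with extra fields
# of the kind the announcement describes (occupation, residence, relatives).
records = [
    {"name": "Bill Carter", "occupation": "farmer", "residence": "Mesa",
     "relatives": ["Mary Carter"], "page": 0},
    {"name": "Wm. Carter", "occupation": None, "residence": None,
     "relatives": [], "page": 1},
    {"name": "William Carter", "occupation": None, "residence": None,
     "relatives": [], "page": 2},
]

exact_hits = keyword_search(pages, "William Carter")
indexed_hits = indexed_search(records, "William Carter")
print(len(exact_hits), "page(s) found by exact keyword search")
print(len(indexed_hits), "record(s) found by indexed search")
```

In this toy example the exact keyword search finds only the one sentence that spells the name out in full, while the indexed search also picks up "Bill" and "Wm." because the names are normalized before they are compared.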

We are going to see even more announcements like this one in the very near future. 

 

