Machine Learning: Digitizing the past, present and future

The world’s biggest catalogues

With twelve centuries' worth of documents, approximately 53 miles of shelving and 55,000 volumes in its catalogue, the Vatican Secret Archives is one of the world’s largest and most valuable collections of historical documents. Through the power of machine learning, those handwritten pieces of history are now being digitized.

 

The Codice Ratio, or “The Code System”, is a fascinating project with significant involvement from NVIDIA, with which omni:us is also partnered. It aims to help the Vatican transcribe a portion of its enormous secret archive. The collection contains many precious documents, including letters from Michelangelo, Henry VIII, Abraham Lincoln and more popes than you can count on two hands.

Digitizing all those handwritten documents manually through the Vatican’s own photographic and conservation team seemed highly impractical. So the team built a neural network on the cuDNN-accelerated TensorFlow deep learning framework and trained it to recognize entire words rather than individual letters.

As of now, the system generates exact transcriptions for 65% of the word images in the project's dataset, and its accuracy is improving as the neural network continues to learn. In this way, it gives the Vatican enormous support in preserving letters and scrolls steeped in history.
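To make the "exact transcription" figure concrete, here is a minimal sketch of how such a rate could be computed: the share of word images whose predicted text matches the reference text character for character. The function name `exact_match_rate` and the toy word lists are our own illustration, not code or data from the project.

```python
def exact_match_rate(predicted, reference):
    """Return the fraction of word images transcribed exactly right.

    A prediction counts only if it equals the reference string in full;
    a single wrong character makes the whole word a miss.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty dataset")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical toy data, not taken from the project's dataset.
reference = ["dominus", "papa", "ecclesia", "anno"]
predicted = ["dominus", "papa", "ecclesta", "anno"]

rate = exact_match_rate(predicted, reference)
print(f"{rate:.0%}")  # prints "75%"
```

Because one wrong character fails the whole word, this is a strict measure: a system can score well below 100% here and still produce transcriptions that are mostly readable.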

Such solutions can be used not only to conserve and store the centuries-old history of our cultures, but also to serve today’s businesses. With our own solution, which combines handwriting detection and machine learning, we were able to improve one client’s claims turnaround time by 80%, at 75% of the original costs. That client operates in the insurance sector. We emphasize this because we expect insurance data to grow by 94% in 2018, 84% of which will be in highly variable documentation, which makes improvements like these all the more important for a company.

To learn more about our corresponding solution, omni:us Claim, click here.

Check out our platform stack, which includes TensorFlow, here.