Why we use Google Vision for Optical Character Recognition in Catalytic's intelligent automation platform
Extracting data from documents is a core part of many automated processes built using Catalytic. For cases involving unstructured data and documents, one of the best solutions is to use Optical Character Recognition (OCR). Customers frequently ask what we use for document reading and extraction, so I’d like to provide some insight into how we came to the decision to use Google Vision.
When developing a new feature, there are four main paths we can choose from:
- Develop it ourselves from scratch
- Purchase a pre-built solution or use an open source project and host it ourselves
- Develop an ecosystem integration and use an API to power a Catalytic action
- Develop integrations to a number of third-party products and let customers connect to their own account
Developing an OCR integration
For OCR, after exploring the four paths, we opted to develop an ecosystem integration. We concluded that Google is the market leader in OCR and is highly likely to maintain that position, for these seven reasons:
- We tested and compared market leaders in OCR, and Google provided the best results. It’s hard to provide exact accuracy numbers for OCR, because results vary widely with the format and quality of a document. But based on our analysis of a range of document types (such as invoices and contracts) and file types (such as PDFs and JPEGs), we found that Google consistently provided superior character recognition, even for low-resolution files.
- Google has high levels of security. The company meets or exceeds our security standards for OCR with HIPAA and SOC2 Type 2 compliance.
- Google OCR is highly reliable and performant. It can process large documents quickly, and Google outages are extremely rare and, when they do occur, very short.
- Google continually invests in research to improve its OCR tech. The company is the main sponsor of Tesseract, the leading open source OCR product.
- Google has decades of experience in OCR and computer vision. These technologies are a fundamental part of many of its products, like Android Camera, Google Photos, Waymo, Image Search and Street View.
- Google’s APIs are well-designed, which shortens the development time Catalytic needs to adopt new features and functionality as they are added to Google Vision.
- Google is continually improving the quality of results on available features, while adding new ones. For example, when Google released handwriting OCR in December 2018, Catalytic instantly added and supported that feature as well.
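One common way to put a number on the kind of comparison described in the first reason is character error rate: the edit distance between a hand-checked reference transcript and an engine's output, divided by the length of the reference. The snippet below is a minimal sketch of that idea, not the methodology Catalytic actually used; the function names and the `engine_a` / `engine_b` labels are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute (or match)
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Fraction of reference characters the OCR output got wrong (lower is better)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


def rank_engines(reference: str, outputs: dict) -> list:
    """Return (engine name, error rate) pairs, best engine first."""
    scores = {
        name: character_error_rate(reference, text)
        for name, text in outputs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical outputs from two unnamed engines for the same scanned line.
    reference = "Invoice 1042"
    outputs = {"engine_a": "Invoice 1O42", "engine_b": "lnvoice IO42"}
    for name, rate in rank_engines(reference, outputs):
        print(f"{name}: {rate:.1%} character error rate")
```

Because results depend so heavily on document format and quality, a score like this is only meaningful when averaged over a representative set of documents rather than a single file.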
Google Vision comparisons
We’re not the only ones to have done this analysis and reached this conclusion. Here are some resources comparing Google Vision with other OCR providers:
- Google Vision the top image recognition system, study finds | CIO Dive
- Comparing the best image text recognition APIs | Data Turks
- Image Recognition Accuracy Study | Perficient Digital
- Our Search for the Best OCR Tool, and What We Found | Source: An OpenNews project
Based on all of these factors, it was clear that developing an ecosystem integration with the Google Vision API was hands down the best option to power our many uses of OCR for automation.
OCR is a rapidly evolving space, and there are amazing advances being made by commercial cloud offerings like Amazon Textract and Microsoft Computer Vision, as well as a number of new startups. This is an area we’re constantly monitoring so we can continually improve Catalytic’s ability to extract information from unstructured data using the best OCR technology available.
Using OCR with Catalytic
OCR is just one of the tools that Catalytic continually researches and connects to our cloud platform so we can offer the easiest way to build automations. The Catalytic platform lets you use information extracted with Google Vision OCR throughout an end-to-end automated process, with next steps that can include Natural Language Processing (NLP), database entry, email template filling, AI-powered sentiment analysis, or decision-making.
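For readers curious what an OCR step followed by a next step looks like at the API level, here is a rough sketch built on the public Google Vision REST endpoint (`images:annotate`) and its `DOCUMENT_TEXT_DETECTION` feature. It is not Catalytic's implementation: the helper names, the `route_document` rule and the sample response are hypothetical, and sending the request (with authentication) is left out.

```python
import base64

# Public REST endpoint for Google Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> dict:
    """Build the JSON body for a document OCR request on one image."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response_body: dict) -> str:
    """Collect the full recognized text from each result in a response."""
    texts = []
    for result in response_body.get("responses", []):
        annotation = result.get("fullTextAnnotation", {})
        texts.append(annotation.get("text", ""))
    return "\n".join(texts)


def route_document(text: str) -> str:
    """Hypothetical next step: pick a downstream path based on the extracted text."""
    return "invoice" if "invoice" in text.lower() else "other"


if __name__ == "__main__":
    body = build_ocr_request(b"raw image bytes would go here")
    print(f"POST {VISION_ENDPOINT} with {len(body['requests'])} request(s)")

    # Hand-written sample that follows the documented response shape;
    # it is not real API output.
    sample_response = {
        "responses": [{"fullTextAnnotation": {"text": "Invoice 1042\nTotal: $310.00"}}]
    }
    text = extract_text(sample_response)
    print(route_document(text))
```

In a real process the routing rule would be replaced by whichever step comes next, such as NLP, database entry, or filling an email template.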
Catalytic’s platform gives you the building blocks for your easiest path to digital transformation. OCR is one of many. Schedule a demo with our experts to see how to use the best OCR for intelligent automation.