Novobi - Optical Character Recognition

University of Dayton, Department of Computer Science, CPS 491 - Capstone II, Spring 2021, Instructor: Dr. Phu Phung

Team members

Marcus Chronabery

CIS major. Interested in web applications and DevOps.

Assoc. Research Software Eng. @ UDRI | 2014 - Present

Caleb Hoskins

CS major. Interested in Artificial Intelligence.

Currently seeking a professional position.

Jacob Blair

CS major. Interested in web application development.

Accepted a position at Dayton Freight.

Muhammad Ndao

CS major. Interested in Information Technology.

Currently seeking a professional position.


Company Mentors

Novobi
Hung Nguyen hung.nguyen@novobi.com
8920 Business Park Dr #250, Austin, TX 78759

Project Overview

The overall goal of this project for Novobi is to build an application that can:

  • Accept an uploaded image of a document
  • Parse the text from the image using an Optical Character Recognition (OCR) engine
  • Present the results to the user for verification
  • Store the results in a database if they are correct

Project Architecture

Client/server/database architecture with OCR integration.

Technology

CSS3, HTML5, Node.js, MongoDB, Heroku, and Tesseract.js (a JavaScript port of the Tesseract Optical Character Recognition engine)
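As a rough illustration of how the OCR engine is invoked from Node.js, the snippet below runs full-page recognition with Tesseract.js. The file path and language code are placeholders; this is a minimal sketch, not the project's actual code.

```javascript
// Minimal full-page OCR sketch using Tesseract.js (file path is illustrative).
const Tesseract = require('tesseract.js');

Tesseract.recognize('uploads/invoice.png', 'eng')
  .then(({ data: { text } }) => {
    // Raw text recognized from the entire image
    console.log(text);
  });
```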

Our Take On The Problem

We determined that one possible need for this type of application is at a company that processes a large volume of invoices. Someone in the accounting department scans each paper invoice into the system so that specific information can be identified in the document. To support this, we implemented a solution for Zonal OCR, sometimes called Template OCR.

Zonal (or Template) OCR allows a user to define zones on a well-formatted document so that, each time a document of that type is uploaded, the same areas of the document are processed. Only the relevant information is pulled from the image.
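To make the idea concrete, a template and its zones might be stored as a record along the lines of the sketch below. The field names, coordinates, and flags are illustrative assumptions, not the application's actual schema.

```javascript
// Hypothetical shape of a template record with user-defined zones (illustrative only).
const invoiceTemplate = {
  name: 'Acme Invoice',
  createdBy: 'accounting-admin',
  zones: [
    // Pixel coordinates measured on the template image
    { name: 'invoiceNumber', top: 40,  left: 420, width: 180, height: 30, editable: false },
    { name: 'invoiceDate',   top: 80,  left: 420, width: 180, height: 30, editable: true  },
    { name: 'totalDue',      top: 610, left: 450, width: 150, height: 35, editable: true  },
  ],
};
```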

The first step is to choose a template document and define where the relevant data is located in it. The second step is to upload an image of a document that uses the same layout. In the final step, the user is presented with the results.
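A sketch of how the upload step could apply previously defined zones is shown below. It assumes the Tesseract.js v2 worker API and its `rectangle` recognition option; the zone objects and the surrounding helper function are hypothetical.

```javascript
const { createWorker } = require('tesseract.js');

// Hypothetical helper: run OCR over each saved zone of an uploaded image.
async function parseZones(imagePath, zones) {
  const worker = createWorker();
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');

  const results = {};
  for (const zone of zones) {
    // Restrict recognition to the rectangle defined for this zone
    const { data: { text } } = await worker.recognize(imagePath, {
      rectangle: { top: zone.top, left: zone.left, width: zone.width, height: zone.height },
    });
    results[zone.name] = text.trim();
  }

  await worker.terminate();
  return results; // e.g. { invoiceNumber: 'INV-1042', totalDue: '$1,980.00' }
}
```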

A user can choose to "accept" what was identified in the document and save it to the database, or reject the results, for example if the image resolution was too poor and the recognized text is nonsense. We also allow specific zones to be modified by the uploading user (with the designated permission role) if something was not correctly identified in the image or certain information has changed. The user who initially defines the zones on the template determines which fields can and cannot be modified.
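When the user accepts the results, the server persists the verified fields. A minimal sketch of such an endpoint using Express and the MongoDB Node.js driver follows; the route, collection, and field names are assumptions for illustration, not the deployed application's actual API.

```javascript
const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
app.use(express.json());

// Hypothetical endpoint: save results the user has verified and accepted.
app.post('/documents/:templateId/accept', async (req, res) => {
  const client = await MongoClient.connect(process.env.MONGODB_URI);
  try {
    await client.db('zonal-ocr').collection('parsedDocuments').insertOne({
      templateId: req.params.templateId,
      fields: req.body.fields, // zone name -> verified text
      acceptedBy: req.body.userId,
      acceptedAt: new Date(),
    });
    res.status(201).json({ status: 'saved' });
  } finally {
    await client.close();
  }
});

app.listen(process.env.PORT || 3000);
```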

Project Outcome

Public Deployment On Heroku
Public Docker Image On Docker Hub

We successfully implemented a lightweight, dynamic Zonal OCR application. Our application can:

  • Accept Template Document Images (not required to be an Invoice)
  • Define Relevant Zones In The Template Document
  • Parse and Display The Information For An Uploaded Document Using Previously Defined Zones
  • Parse and Display The Information For An Uploaded Document In Its Entirety
  • Display Previously Saved Document Parse Results

Demo Video

https://youtu.be/ej2EwCG_0G0