Optical Character Recognition for Table Extraction from PDF

Post published: 31 August, 2023

Building and deploying a PDF-to-Excel system using PaddleOCR and FastAPI

What you will learn

How to use PaddleOCR to build a working PDF-to-Excel system

Basics of FastAPI

Building an OCR API

Taking a working solution from Google Colab to deployment

Description

Optical Character Recognition (OCR) systems are used in diverse industries today. As better-performing deep learning models are developed, OCR solutions keep improving.

In this course, we shall take you on a journey in which you’ll implement and deploy a working OCR solution. More precisely, we shall build a system in which a user inputs a PDF file and gets all the tables contained in the PDF as Excel sheets. We’ll start by understanding how this system works, then build a working prototype on Google Colaboratory (Colab). From there, we shall build a simple API with the FastAPI framework. This API will let users upload a PDF file and receive a compressed file containing folders, each of which holds Excel sheets with the tables found in the PDF.
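To make the output format concrete, here is a minimal sketch of the final packaging step only, using nothing but the Python standard library. It assumes the tables have already been extracted and rendered to Excel bytes by earlier steps. The function name `build_archive`, the one-folder-per-page layout, and the `table_N.xlsx` file names are illustrative assumptions, not the course's actual code.

```python
import io
import zipfile


def build_archive(tables_by_folder: dict[str, list[bytes]]) -> bytes:
    """Pack already-rendered .xlsx payloads into one in-memory ZIP archive.

    Hypothetical helper: keys are folder names (assumed here to be one
    folder per PDF page), values are lists of Excel file contents as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, workbooks in tables_by_folder.items():
            for index, workbook in enumerate(workbooks, start=1):
                # Assumed naming scheme: <folder>/table_<n>.xlsx
                archive.writestr(f"{folder}/table_{index}.xlsx", workbook)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder bytes stand in for real Excel files in this demo.
    demo = {"page_1": [b"first", b"second"], "page_2": [b"third"]}
    payload = build_archive(demo)
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        print(archive.namelist())
```

In a deployed API, bytes like these would typically be returned as the body of the HTTP response, so the user downloads a single compressed file.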

If you want to take your career a step further, this course is for you, and we are super excited to help you achieve your goals!


This course is offered to you by Neuralearn. As with every other Neuralearn course, we place great emphasis on feedback. Your reviews and questions in the forum will help us improve this course. Feel free to ask as many questions as you like on the forum. We do our very best to reply in the shortest possible time.

Enjoy!!!

Language: English
