

Building and deploying a PDF to Excel system using PaddleOCR and FastAPI

What you will learn

How to use PaddleOCR to build a working PDF to Excel system

Basics of FastAPI

Building an OCR API

Taking a working solution from Google Colab and deploying it

Description

Optical Character Recognition (OCR) systems are used in diverse industries today. As better-performing deep learning models are developed, OCR solutions keep improving.

In this course, we shall take you on a journey in which you'll implement and deploy a working OCR solution. More precisely, we shall build a system in which a user inputs a PDF file and gets all the tables contained in the PDF as Excel sheets. We'll start by understanding how this system works, then build a working prototype on Google Colaboratory (Colab). From there, we shall build a simple API with the FastAPI framework. This API lets users upload a PDF file and receive a compressed file containing folders, which in turn contain Excel sheets with the different tables found in the PDF.
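To make the shape of that compressed output concrete, here is a minimal sketch that uses only the Python standard library. It assumes one folder per PDF page and one spreadsheet file per table; the course does not specify that layout, so the `page_<n>/table_<k>.xlsx` naming and the `build_archive` and `list_archive` helpers are hypothetical. The table contents are passed in as bytes that are already serialised; in a real pipeline they would come from the OCR and Excel-writing steps, which are not shown here.

```python
import io
import zipfile


def build_archive(tables_by_page):
    """Pack per-page tables into a single in-memory ZIP file.

    tables_by_page maps a page number to a list of spreadsheet files
    (as bytes), one entry per table found on that page.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for page_number in sorted(tables_by_page):
            tables = tables_by_page[page_number]
            for index, workbook_bytes in enumerate(tables, start=1):
                # Assumed naming scheme: page_<n>/table_<k>.xlsx
                name = f"page_{page_number}/table_{index}.xlsx"
                archive.writestr(name, workbook_bytes)
    return buffer.getvalue()


def list_archive(zip_bytes):
    """Return the file names stored in a ZIP given as bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()
```

An API endpoint could return the bytes from `build_archive` as the response to a PDF upload, so the user downloads a single file no matter how many tables were found.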

If you want to take a step forward in your career, this course is for you, and we are excited to help you achieve your goals!




This course is offered to you by Neuralearn. As with every other Neuralearn course, we place great emphasis on feedback. Your reviews and questions in the forum will help us improve this course. Feel free to ask as many questions as you like on the forum. We do our best to reply as quickly as possible.

Enjoy!!!

Language: English

Content

Introduction

What we shall learn
How the overall system works
Link to Code

Building PDF to Excel Solution on Google Colab

Extracting Images from PDF
Extracting information from each image (page)

Deployment with FastAPI

Introduction to FastAPI
Building a simple API
Testing the solution
Conclusion

Bonus

Bonus