

Text Cleaning, Spacy, NLTK, Scikit-Learn, Deep Learning, word2vec, GloVe, LSTM for Sentiment, Emotion, Spam & CV Parsing

What you will learn

Learn complete text processing with Python

Learn how to extract text from PDF files

Use Regular Expressions for search in text

Use SpaCy and NLTK to extract complete text features from raw text

Use Latent Dirichlet Allocation for Topic Modelling

Use Scikit-Learn and Deep Learning for Text Classification

Learn Multi-Class and Multi-Label Text Classification

Use Spacy and NLTK for Sentiment Analysis

Understand and Build word2vec and GloVe based ML models

Use Gensim to obtain pretrained word vectors and compute similarities and analogies

Learn Text Summarization and Text Generation using LSTM and GRU

Description

Welcome to KGP Talkie’s Natural Language Processing (NLP) course. It is designed to give you a complete understanding of Text Processing and Mining with the use of State-of-the-Art NLP algorithms in Python.

We will learn Spacy in detail and explore the uses of NLP in real life. This course covers everything from the basics of NLP to advanced topics such as word2vec, GloVe, and Deep Learning for NLP with ANN, CNN, and LSTM. I will also show you how to optimize your ML code using the various tools of sklearn in Python. Both Multi-Class and Multi-Label classification are explained. Toward the end of the course, you will learn how to generate poetry using LSTM. At least 12 NLP projects are covered, and you will learn various ways of solving cutting-edge NLP problems.

You should have an introductory knowledge of Python and Machine Learning before enrolling in this course; otherwise, please do not enroll.

In this course, we will start from level 0 to the advanced level.

We will start with basics such as what machine learning is and how it works. Thereafter, I will take you through a crash course on Python, Numpy, and Pandas. If you have prior experience, you can skip these sections. The real NLP work starts with the Spacy Introduction, where I will take you through the various steps of NLP preprocessing. We will mostly use Spacy and NLTK for text data preprocessing.




In the next section, we will learn to work with files for storing and loading text data. That section is the foundation for the following section on Complete Text Preprocessing, where I will show you many ways of preprocessing text using Spacy and Regular Expressions. Finally, I will show you how to create your own Python package for preprocessing. It will help us improve our coding skills, and we will be able to reuse the code system-wide without rewriting the preprocessing steps every time. This is the most important section of the course.

Then, we will start the Machine Learning theory section and a walkthrough of the Scikit-Learn Python package, where we will learn how to write clean ML code. Thereafter, we will develop our first text classifier for SPAM and HAM message classification. I will also show you various types of text feature representations used in NLP, such as Bag of Words, Term Frequency, IDF, and TF-IDF. I will show you how to compute these features from scratch as well as with the help of the Scikit-Learn package.
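
As a rough illustration of computing these features from scratch, here is a minimal sketch in plain Python. The function names and the three toy messages are made up for this example and are not taken from the course. It uses the unsmoothed form idf = log(N / df), so the numbers will differ from those of Scikit-Learn's TfidfVectorizer, which applies smoothing and normalization by default.

```python
import math
from collections import Counter


def term_frequency(tokens):
    """Relative frequency of each term within one document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(tokenized_docs):
    """Unsmoothed IDF: log(number of documents / documents containing the term)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(tokenized_docs):
    """One {term: weight} dictionary per document."""
    idf = inverse_document_frequency(tokenized_docs)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in tokenized_docs
    ]


# Toy messages invented for illustration only.
docs = ["free prize call now", "call me now", "see you at lunch"]
tokenized = [doc.split() for doc in docs]
weights = tf_idf(tokenized)
```

In this toy corpus, "free" appears in only one message while "call" appears in two, so "free" receives the larger weight in the first message. That is the intuition behind TF-IDF: terms that are rare across documents carry more signal.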

Thereafter, we will learn about machine learning model deployment. We will also learn other important tools and techniques such as word2vec, GloVe, and Deep Learning with CNN, RNN, and LSTM.

By the end of this course, you will have learned everything you need to solve your own NLP problems.

Language: English

Content

Introduction

Machine Learning Intuition
Install Anaconda and Python 3 on Windows 10
Resources Folder
Install Anaconda and Python 3 on Ubuntu Machine
Install Git Bash and Commander Terminal
Jupyter Notebook Shortcuts

Python Crash Course

Introduction
Data Types
Variable Assignment
String Assignment
List
Set
Tuple
Dictionary
Boolean and Comparison Operator
Logical Operator
If, Else, Elif
Loops in Python
Methods and Lambda Function

Numpy Introduction [Optional]

Introduction
Array
NaN and INF
Statistical Operations
Shape, Reshape, Ravel, Flatten
Sequence, Repetitions, and Random Numbers
Where(), ArgMax(), ArgMin()
File Read and Write
Concatenate and Sorting
Working with Dates

Pandas Introduction [Optional]

Introduction
DataFrame and Series
File Reading and Writing
Info, Shape, Duplicated, and Drop
Columns
NaN and Null Values
Imputation
Lambda Function

Spacy Introduction

Introduction to NLP
Install Spacy
Introduction to Spacy
Tokenization
Parts of Speech [POS] Tagging
Dependency Visualization
Named Entity Recognition (NER)
Sentence Segmentation
Rule Based Phrase Matching
Regular Expression Part 1
Regular Expression Part 2
Processing Pipeline in Spacy
Hashtags and Emoji Detection

Working with Text Files

String Formatting
Working with open() Files in write() Mode Part 1
Working with open() Files in write() Mode Part 2
Working with open() Files in write() Mode Part 3
Read and Evaluate the Files
Reading and Writing .CSV and .TSV Files with Pandas
Reading and Writing .XLSX Files with Pandas
Reading and Writing .JSON Files
Reading Files from URL Links
Extract Text Data From PDF
Record the Audio and Convert to Text
Convert Audio in Text Data
Text to Speech Generation

Complete Text Preprocessing

Introduction
Word Counts
Characters Counts
Average Word Length
Stop Words Count
Count #hashtag and @mentions
Numeric Digit Count
Upper case Words Count
Lower case Conversion
Contraction to Expansion
Count and Remove Emails
Count and Remove URLs
Remove RT from Twitter Data
Special Chars Removal and Punctuation Removal
Remove Multiple Spaces
Remove HTML Tags
Remove Accented Chars
Remove Stop Words
Convert into Base or Root Form of Words
Common Words Removal
Rare Words Removal
Word Cloud Visualization
Spelling Correction
Tokenization with TextBlob
Nouns Detection
Language Translation and Detection
Sentiment Prediction with TextBlob
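
Several of the cleaning steps listed above can be sketched with the standard library alone. The following is only an illustrative sketch, not the course's own preprocessing package: the function name clean_text and the regular expressions are assumptions chosen for brevity, and they will not cover every real-world email or URL format.

```python
import re
import unicodedata

# Deliberately simple patterns; assumptions made for this sketch.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
HTML_TAG_RE = re.compile(r"<[^>]+>")
RETWEET_RE = re.compile(r"\bRT\b")


def clean_text(text):
    """Apply a handful of the cleaning steps in a fixed order."""
    text = HTML_TAG_RE.sub(" ", text)   # remove HTML tags
    text = URL_RE.sub(" ", text)        # remove URLs
    text = EMAIL_RE.sub(" ", text)      # remove email addresses
    text = RETWEET_RE.sub(" ", text)    # remove the retweet marker "RT"
    # Remove accents: decompose characters, then drop the non-ASCII marks.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.lower()                         # lower case conversion
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # special chars and punctuation
    return " ".join(text.split())               # collapse multiple spaces


sample = "RT <b>Café</b> offer! Mail me@example.com or visit https://example.com/x  now"
cleaned = clean_text(sample)
```

The order matters here: URLs and emails are removed before punctuation is stripped, because stripping punctuation first would break them into fragments that the patterns no longer match.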

Python Software Packaging for Redistribution

Code Files Setup
Readme and License File Preparation
Setup.py Preparation
Utils.py Code Along Part 1
Utils.py Code Along Part 2
Utils.py Code Along Part 3
Utils.py Code Along Part 4
__init__.py Code Along
GitHub Account Setup and Package Upload
SSH Key Setup for GitHub
Install Preprocess Python Package
Removing the Errors Part 1
Removing the Errors Part 2
Testing the Package

Introduction to Machine Learning with Scikit-Learn

Logistic Regression Intuition
Support Vector Machine Intuition
Decision Tree Intuition
Random Forest Intuition
L2 Regularization
L1 Regularization
Model Evaluation Metrics: Accuracy, Precision, Recall, and Confusion Matrix
Model Evaluation Metrics: ROC and AUC
Code Along in Python Part 1
Code Along in Python Part 2
Code Along in Python Part 3
Code Along in Python Part 4

Your First Text Classifier | Spam Text Classification

Text Feature Extraction Intuition Part 1
Text Feature Extraction Intuition Part 2
Bag of Words (BoW) Code Along in Python
Term Frequency (TF) Code Along in Python
Inverse Document Frequency (IDF) Code Along in Python
TFIDF Code Along in Python
Load Spam Dataset
Balance Dataset
Exploratory Data Analysis (EDA)
Data Preparation for Training
Build and Train SVM and Random Forest Models
Test Your Model with Real Data
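
The overall shape of such a classifier can be sketched with Scikit-Learn as follows. This is a minimal sketch under stated assumptions, not the course's notebook: the six messages and their labels are invented for illustration, a linear SVM (LinearSVC) stands in for whichever SVM variant the course uses, and a real project would load, balance, and split a labeled spam dataset before training.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for illustration only.
messages = [
    "win a free prize now",
    "free entry claim your cash prize",
    "urgent call now to claim your reward",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
    "can you send me the notes",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# The pipeline turns raw text into TF-IDF features, then fits a linear SVM.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(messages, labels)

# Predict on unseen messages; each prediction is one of the training labels.
predictions = model.predict(["claim your free prize", "lunch tomorrow at the office"])
```

Wrapping the vectorizer and the classifier in one pipeline means the same feature extraction is applied at training and prediction time, and the whole object can be saved and loaded as a single model.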

Real-Time Twitter Sentiment Analysis

Notebook Setup
SVM Model Training
Test Your Model
Data Cleaning and Retraining SVM Part 1
Data Cleaning and Retraining SVM Part 2
Fine Tune Your ML Model
Saving and Loading ML Model
Create Twitter Developer Account
Get the Access Tokens
Reading Twitter Timeline in Real-Time
Tracking Keywords in Real-Time on Twitter Part 1
Tracking Keywords in Real-Time on Twitter Part 2
Tracking Keywords in Real-Time on Twitter Part 3
Real-Time Sentiment Analysis with TextBlob
Real-Time Sentiment Analysis with Trained ML Model
Real-Time Twitter Sentiment Analysis of USA vs China Part 1
Real-Time Twitter Sentiment Analysis of USA vs China Part 2
Real-Time Twitter Sentiment Animation Plot Part 1
Real-Time Twitter Sentiment Animation Plot Part 2