Introduction
Introduction
Installation and getting started
Installing PostgreSQL and pgAdmin on your PC
This is a milestone!
If pgAdmin is not opening…
Course Resources
Case Study: Demo
Case Study Part 1 – Business problems
Case Study Part 2 – How SQL is Used
Fundamental SQL statements
CREATE
INSERT
Import data from File
SELECT statement
SELECT DISTINCT
WHERE
Logical Operators
UPDATE
DELETE
ALTER – Part 1
ALTER – Part 2
Restore and Back-up
Restore and Back-up
Debugging restoration issues
Creating DB using CSV files
Debugging summary and Code for CSV files
Selection commands: Filtering
IN
BETWEEN
LIKE
Selection commands: Ordering
Side Lecture: Commenting in SQL
ORDER BY
LIMIT
Aggregate Commands
COUNT
SUM
AVERAGE
MIN & MAX
Group By Commands
GROUP BY
HAVING
Conditional Statement
CASE WHEN
JOINS
Introduction to Joins
Concepts of Joining and Combining Data
Preparing the data
Inner Join
Left Join
Right Join
Full Outer Join
Cross Join
Intersect and Intersect ALL
Except
Union
Subqueries
Subquery in WHERE clause
Subquery in FROM clause
Subquery in SELECT clause
Views and Indexes
VIEWS
INDEX
String Functions
LENGTH
UPPER & LOWER
REPLACE
TRIM, LTRIM, RTRIM
CONCATENATION
SUBSTRING
LIST AGGREGATION
Mathematical Functions
CEIL & FLOOR
RANDOM
SETSEED
ROUND
POWER
Date-Time Functions
CURRENT DATE & TIME
AGE
EXTRACT
PATTERN (STRING) MATCHING
PATTERN MATCHING BASICS
ADVANCED PATTERN MATCHING – Part 1
ADVANCED PATTERN MATCHING – Part 2
Window Functions
Introduction to Window functions
Introduction to Row number
Implementing Row number in SQL
RANK and DENSE_RANK
NTILE function
AVERAGE function
COUNT
SUM TOTAL
RUNNING TOTAL
LAG and LEAD
COALESCE function
COALESCE function
Data Type conversion functions
Converting Numbers/ Date to String
Converting String to Numbers/ Date
User Access Control Functions
User Access Control – Part 1
User Access Control – Part 2
Nail that Interview!
Tablespace
PRIMARY KEY & FOREIGN KEY
ACID compliance
Truncate
Looker Studio
Introduction
Why Data Studio?
Terminologies & Theoretical concepts for Data Studio
Data Studio Home Screen & Dataset vs Data Source
Structure of Input data
Dimensions vs Measures (new definition)
Practical part begins here
Opening Data Studio and preparing data
Adding a data source
Managing added data source
Charts to highlight numbers
Data Table
Styling tab for data table
Scorecards
Charts for comparing categories: Bar charts and stacked charts
Simple Bar and Column chart
Stacked Column chart
Charting maps of a country, continent or a region – Geomaps
GeoMap
Charts to highlight trends: Time series, Line and Area charts
Time Series
Update to Time Series chart
Line Chart and Combo Chart
Highlight contribution to total: Pie chart & Donut Chart
Pie Chart and Donut Chart
Stacked Area Charts
Updated data for area charts
Relationship between two or more variables: Scatterplots
Scatter Plots and Bubble charts
Aggregating on two dimensions: Pivot tables
Pivot tables for cross tabulation
All about a single Metric: Bullet Chart
Bullet Chart
Chart for highlighting hierarchy: TreeMap
TreeMaps
Branding a Report
Branding a Report: Brand Logo and Company Details
Brand colors for report branding
Giving the power to filter Data to viewers
Filter controls for viewers
Add Videos, Feedback form etc. to your Report
URL Embed to include external content
Sometimes data is in multiple tables
Blending data from multiple tables
Different types of Joins while blending data
Sharing and collaborating on Data Studio report
Downloading report as PDF and Page Management
Sharing report and Data Credentials
Sharing report using a link
Scheduling emails
Embedding report on Website
Charting Best Practices
Highlighting chart message
Eliminating Distractions from the Graph
Avoiding clutter
Avoiding the Spaghetti plot
Machine Learning in Python
Introduction
Setting up Python and Jupyter notebook
Installing Python and Anaconda
Opening Jupyter Notebook
Introduction to Jupyter
Arithmetic operators in Python: Python Basics
Strings in Python: Python Basics
Lists, Tuples and Dictionaries: Python Basics
Working with Numpy Library of Python
Working with Pandas Library of Python
Working with Seaborn Library of Python
Basics of statistics
Types of Data
Types of Statistics
Describing data Graphically
Measures of Center
Measures of Dispersion
Introduction to Machine Learning
Introduction to Machine Learning
Building a Machine Learning Model
Data Preprocessing
Gathering Business Knowledge
Data Exploration
The Dataset and the Data Dictionary
Importing Data in Python
Univariate analysis and EDD
EDD in Python
Outlier Treatment
Outlier Treatment in Python
Missing Value Imputation
Missing Value Imputation in Python
Seasonality in Data
Bi-variate analysis and Variable transformation
Variable transformation and deletion in Python
Non-usable variables
Dummy variable creation: Handling qualitative data
Dummy variable creation in Python
Correlation Analysis
Correlation Analysis in Python
Linear Regression
The Problem Statement
Basic Equations and Ordinary Least Squares (OLS) method
Assessing accuracy of predicted coefficients
Assessing Model Accuracy: RSE and R squared
Simple Linear Regression in Python
Multiple Linear Regression
The F-statistic
Interpreting results of Categorical variables
Multiple Linear Regression in Python
Test-train split
Bias Variance trade-off
Test-train split in Python
Regression models other than OLS
Subset selection techniques
Shrinkage methods: Ridge and Lasso
Ridge regression and Lasso in Python
Heteroscedasticity
Introduction to the classification Models
Three classification models and Data set
Importing the data into Python
The problem statements
Why can’t we use Linear Regression?
Logistic Regression
Logistic Regression
Training a Simple Logistic Model in Python
Result of Simple Logistic Regression
Logistic with multiple predictors
Training multiple predictor Logistic model in Python
Confusion Matrix
Creating Confusion Matrix in Python
Evaluating performance of model
Evaluating model performance in Python
Linear Discriminant Analysis (LDA)
Linear Discriminant Analysis
LDA in Python
K-Nearest Neighbors classifier
Test-Train Split
Test-Train Split in Python
K-Nearest Neighbors classifier
K-Nearest Neighbors in Python: Part 1
K-Nearest Neighbors in Python: Part 2
Comparing results from 3 models
Understanding the results of classification models
Summary of the three models
Simple Decision Trees
Introduction to Decision trees
Basics of Decision Trees
Understanding a Regression Tree
The stopping criteria for controlling tree growth
Importing the Data set into Python
Missing value treatment in Python
Dummy Variable Creation in Python
Dependent-Independent Data split in Python
Test-Train split in Python
Creating Decision tree in Python
Evaluating model performance in Python
Plotting decision tree in Python
Pruning a tree
Pruning a tree in Python
Simple Classification Tree
Classification tree
The Data set for Classification problem
Classification tree in Python: Preprocessing
Classification tree in Python: Training
Advantages and Disadvantages of Decision Trees
Ensemble technique 1 – Bagging
Ensemble technique 1 – Bagging
Ensemble technique 1 – Bagging in Python
Ensemble technique 2 – Random Forests
Ensemble technique 2 – Random Forests
Ensemble technique 2 – Random Forests in Python
Using Grid Search in Python
Ensemble technique 3 – Boosting
Boosting
Ensemble technique 3a – Boosting in Python
Ensemble technique 3b – AdaBoost in Python
Ensemble technique 3c – XGBoost in Python
Alteryx
The Problem Statement
Case study and Alteryx Installation
Installing Alteryx
Alteryx Interface
DATA EXTRACTION: Extracting tabular data
Manually entering data into Alteryx
Importing Data from a CSV (Comma Separated Values) file
Importing Data from a TXT (text) file
Importing Data from an Excel file
Importing Data from a ZIP file
Importing Data from multiple files in a folder
DATA EXTRACTION: Extracting non-tabular data
Probable Issue with Extraction from XML
Extracting from XML
Extracting from an SQL table
Plan for importing sales Data
Installing PostgreSQL and pgAdmin on your PC
Creating Sales table in SQL
Extracting from an SQL table
Storing and Retrieving Data: Cloud storage
Storing Data on AWS S3
Importing data from AWS S3
Merging Data Streams
Union tool – Merging Customer Data
Data Cleansing and improving data quality
Find and Replace Tool
Data Cleaning Tool
Autofield and Select Tool – For controlling Field order and data type
Merging Sales and Product data
Select and Unique Tools – For Removing duplicates from product data
Date Parse – Changing Date format
Select and union – Merging Sales data
Sampling Data
Select Records Tool
Sample Tool
Random Percent Sample Tool
Train-Validation-Test Split sampling
Data Preparation
Multifield binning and Tile Tool – To create customer age categories
Formula Tool – Conditional Formula for giving category titles
Sort tool – Sorting customer Data based on ID
Formula Tool – Sales order date & ship date
Multifield Formula tool – Converting multiple currency fields
Filtering and Sorting – Positive number of days
Text to Columns – Splitting Product ID into 3 columns
Outputting Cleaned Data
Outputting Clean Customer & Product Data
Merging tables to create a datamart
The Joining Tool – Adding customer and Product data to Sales table
Extracting more info from the Date values
Performing Analytics/ Transformation on Datamart
The Summarize tool
Running Total Tool
Crosstab tool for creating Pivot tables
Transpose Tool – the opposite of Cross Tab tool
The Count tool
Creating a report in Alteryx
Introduction to Reporting
Interactive Chart tool – Bar chart to show region-wise sales
Interactive Chart tool – Line chart to show Sales trend
Table Tool – Formatting the Pivot table
Text Tool – Adding static text to a report
Visual Layout tool – Arranging charts, text and tables in a report
Header tool – Adding header in a report
Footer tool – Adding footer to a report
Rendering tool – rendering report as a PDF, HTML or PNG
Email Tool – Sending email with Alteryx
Image tool – Adding image to a report
Layout tool – Arranging charts, text or tables in a report
Scheduling a workflow in Alteryx
Schedule and Automate Alteryx workflow
Congratulations & about your certificate
Alternative to Alteryx
The final milestone!
Bonus Lecture