Learn to build predictive models with machine learning, using different R packages: ROCR, caret, XGBoost, rpart, and party
What you will learn
☑ The algorithm behind recursive partitioning decision trees
☑ Construct conditional inference decision trees with R's ctree function
☑ Construct recursive partitioning decision trees with R's rpart function
☑ Learn to estimate Gini impurity (see the sketch below)
☑ Construct ROC curves and estimate AUC
☑ Random Forests with R's randomForest package
☑ Gradient Boosting with R's XGBoost package
☑ Deal with missing data
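Since the course walks through estimating Gini impurity by hand, here is a minimal illustrative sketch in R; the class counts are made up for the example.

gini <- function(counts) {       # Gini impurity: 1 - sum(p_k^2)
  p <- counts / sum(counts)      # class proportions within the node
  1 - sum(p^2)
}

gini(c(8, 2))    # 0.32 for a node with 8 of one class and 2 of the other
gini(c(5, 5))    # 0.50, the maximum possible with two classes
gini(c(10, 0))   # 0.00 for a pure node

# A candidate split is scored by the size-weighted impurity of its children
left <- c(7, 1); right <- c(1, 4)
w <- c(sum(left), sum(right)) / (sum(left) + sum(right))
w[1] * gini(left) + w[2] * gini(right)   # roughly 0.26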
Description
Would you like to build predictive models using machine learning? That's precisely what you will learn in this course, "Decision Trees, Random Forests and Gradient Boosting in R." My name is Carlos Martínez, and I have a Ph.D. in Management from the University of St. Gallen in Switzerland. I have presented my research at some of the most prestigious academic conferences and doctoral colloquia at Tel Aviv University, Politecnico di Milano, Halmstad University, and MIT. Furthermore, I have co-authored more than 25 teaching cases, some of which are included in the case collections of Harvard and Michigan.
This is a comprehensive course that includes presentations, tutorials, and assignments. It takes a practical, learning-by-doing approach in which you will build decision trees and tree-based ensemble methods on a real dataset. In addition to the videos, you will have access to all the Excel files and R code developed in the videos, as well as to the solutions of the assignments, so you can evaluate your own work and gain confidence in your new skills.
After a brief theoretical introduction, we will illustrate, step by step, the algorithm behind recursive partitioning decision trees. Once we know this algorithm in depth, we will have earned the right to automate it in R, using the ctree and rpart functions to construct conditional inference and recursive partitioning decision trees, respectively. Furthermore, we will learn to estimate the complexity parameter and to prune trees in order to increase accuracy and reduce the overfitting of our predictive models. After building decision trees in R, we will also learn two ensemble methods based on decision trees: Random Forests and Gradient Boosting. Finally, we will construct the ROC curve and calculate the area under it (AUC), which will serve as a metric to compare the performance of our models.
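To give a feel for this workflow, here is a minimal sketch in R. It is only an illustration under stated assumptions: it uses the small kyphosis dataset that ships with rpart rather than the course dataset, the partykit implementation of ctree, and the classic xgboost interface; all variable names are my own.

library(rpart)         # recursive partitioning trees
library(partykit)      # conditional inference trees (ctree)
library(randomForest)  # random forests
library(xgboost)       # gradient boosting
library(ROCR)          # ROC curves and AUC

set.seed(1)
n     <- nrow(kyphosis)
idx   <- sample(n, floor(0.7 * n))    # 70/30 train/test split
train <- kyphosis[idx, ]
test  <- kyphosis[-idx, ]

# Conditional inference tree
ct <- ctree(Kyphosis ~ Age + Number + Start, data = train)

# Recursive partitioning tree, pruned at the complexity parameter (cp)
# with the lowest cross-validated error
rp <- rpart(Kyphosis ~ Age + Number + Start, data = train, method = "class")
best_cp <- rp$cptable[which.min(rp$cptable[, "xerror"]), "CP"]
rp      <- prune(rp, cp = best_cp)

# Random forest on the same formula
rf <- randomForest(Kyphosis ~ Age + Number + Start, data = train)

# Gradient boosting expects a numeric matrix and a 0/1 label
X   <- as.matrix(train[, c("Age", "Number", "Start")])
y   <- as.numeric(train$Kyphosis == "present")
bst <- xgboost(data = X, label = y, nrounds = 50,
               objective = "binary:logistic", verbose = 0)

# Confusion matrix for the pruned rpart tree
pred_class <- predict(rp, test, type = "class")
table(Predicted = pred_class, Actual = test$Kyphosis)

# ROC curve and AUC from the predicted probabilities
prob <- predict(rp, test, type = "prob")[, "present"]
pr   <- prediction(prob, test$Kyphosis)
plot(performance(pr, "tpr", "fpr"))    # ROC curve
performance(pr, "auc")@y.values[[1]]   # AUC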
This course is ideal for university students and professionals interested in machine learning and business intelligence. It includes an introduction to the decision tree algorithm, so the only prerequisites are basic knowledge of spreadsheets and R.
I hope you are ready to upgrade your skills and learn to build predictive models with Excel and R. I'll see you in class!
Language: English
Content
Introduction
Welcome to the Course!
Section Introduction
Introduction to Decision Trees
Building a Decision Tree. Part A.
Building a Decision Tree. Part B.
Building a Decision Tree. Part C.
Building a Decision Tree. Part D.
Data Preprocessing
Section Introduction
Teaching Case: Edutravel
Describing the Dataset
Importing CSV Data into R
Changing the Data Type
Dealing with Missing Data
Combining Rare Categories
Data Split: Training and Testing Datasets
Decision Trees with CTREE
Section Introduction
Decision Trees with CTREE
Interpretation of Results
Prediction with the CTREE Model
Confusion Matrix
ROC Curve
Area Under the ROC Curve (AUC)
Test 1
Decision Trees with RPART
Section Introduction
Decision Trees with rpart
Choosing Complexity Parameter
Classification and Confusion Matrix
ROC and AUC
Random Forests
Section Introduction
Theoretical Introduction to Random Forests
Building a Random Forest Model in R
Classification and Confusion Matrix
ROC & AUC
Gradient Boosting Trees
Section Introduction
Theoretical Introduction to Gradient Boosting
XGBoost Model
Prediction and Confusion Matrix
Conclusion