Effective Data Wrangling and Exploration with R – Part II

What you will learn

You are going to learn how to perform string manipulation with base R and the stringr package

You are going to master regular expressions with base R and the stringr package

You are going to learn how to perform categorical data manipulation with base R and the forcats package

You are going to master date and datetime manipulation with base R, the chron and lubridate packages

Description

This course will teach you all you need to know to manipulate string, date, and categorical data effectively in R. We shall make use of base R and the stringr, forcats, chron, and lubridate packages. This course is the second in a series of four courses dealing with data wrangling and exploration in R. The others are:


  • Importing and exporting data in R, which covers importing csv, tab, txt, xlsx and other file types into R
  • Effective Data frame manipulation in R, which uses base R and the dplyr, tidyr, data.table, and sqldf packages to manipulate data frames
  • Effective Data cleaning and exploration

In this course, we are going to look at the following:

  • reading and writing raw text data
  • formatting strings
  • joining and splitting strings
  • subsetting strings
  • cleaning strings
  • performing set operations on strings
  • regex functions in both base R and stringr
  • performing regex operations
  • creating factors and ordered factors
  • factor attributes and structure
  • inspecting factors
  • manipulating factors
  • converting strings and numbers to factors and vice versa
  • the Date class
  • the POSIXt classes
  • the chron package
  • the lubridate package
  • creating dates and date-times from integers and strings
  • extracting date-time parts
  • getting the current date and date-time
  • performing date-time calculations
  • rounding datetimes
  • formatting datetimes
  • timespans: durations, periods, and intervals
  • importing date-time columns
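
A few of the topics listed above can be sketched in base R as follows. This is only a minimal illustration, not material from the course: the sample values (the title string, the e-mail addresses, the sizes, and the dates) and all variable names are made up, and the stringr, forcats, chron, and lubridate equivalents are not shown.

```r
# --- Strings (base R) -------------------------------------------------
title   <- "  Data Wrangling with R  "      # hypothetical sample string
trimmed <- trimws(title)                    # strip leading/trailing white space
parts   <- strsplit(trimmed, " ")[[1]]      # split on single spaces
first4  <- substr(trimmed, 1, 4)            # subset by character position
joined  <- paste(parts, collapse = "_")     # join the pieces back together

# --- Regular expressions (base R) -------------------------------------
emails   <- c("ann@example.com", "not an email", "bob@example.org")
is_email <- grepl("^[[:alnum:]._-]+@[[:alnum:].-]+$", emails)  # crude check only
user     <- sub("@.*$", "", emails[1])      # drop everything from "@" onwards

# --- Factors -----------------------------------------------------------
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)             # ordered factor with explicit levels
size_codes <- as.integer(sizes)             # underlying integer codes
size_chars <- as.character(sizes)           # back to a character vector

# --- Dates and date-times ----------------------------------------------
d         <- as.Date("2022-03-15")          # Date from an ISO 8601 string
later     <- d + 30                         # Date arithmetic works in days
days_left <- as.numeric(as.Date("2022-12-31") - d)
yr        <- format(d, "%Y")                # extract the year as a string

dt      <- as.POSIXct("2022-03-15 14:30:00", tz = "UTC")
dt_next <- dt + 3600                        # POSIXct arithmetic works in seconds
hhmm    <- format(dt_next, "%H:%M")         # format the time part as a string

print(parts)
print(is_email)
print(sizes)
print(later)
print(hhmm)
```

The email pattern is deliberately simple and would accept many invalid addresses; it is there only to show anchors, character classes, and quantifiers working together.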

We hope you enjoy this course as much as we enjoyed developing it.

Language: English

Content

Course Introduction

Introduction
Exercise files

String manipulation with base R

Section Objectives
Reading and writing raw text
String length and case folding
Joining strings using the cat() function
Joining strings using the paste() and paste0() functions
Joining strings using the sprintf() function
Formatting numbers with the format() function
Formatting numbers with the formatC() function
Formatting numbers using the scales package
Splitting strings using the strsplit() function
Extracting and replacing parts of a string
Removing white spaces
Performing set operations and conclusion

String manipulation with stringr

Section Objectives and Introduction
String length and formatting
Joining and splitting strings – I
Joining and splitting strings – II
Extracting and replacing parts of a string
Removing white spaces
Sorting strings
Duplicating strings and conclusion

Pattern matching using regular expressions

Section Objectives
What is a regular expression?
Base R regex functions
stringr regex functions
Matching sequences
Alternates and ranges
Anchors and Word Boundaries
Quantifiers
Groups
Lookaround and conclusion

Manipulating Categorical data with Base R

Section objectives and What is a factor?
Creating a factor
Factor attributes and structure
Manipulating factors
Ordered Factors
Converting numeric and character vectors to factors and vice versa

Manipulating Categorical data with the forcats package

Section Objectives and Converting to a factor
Inspecting factors
Reordering levels
Restructuring levels and labels
Remove and add levels

Date manipulation with base R – The Date class

The class Date – Section objectives
Creating Dates from strings and integers
Getting the current date
Extracting date parts
Performing calculations with dates
Summary statistics with dates
Formatting dates and conclusion

Date manipulation with base R – The POSIXt class

The class POSIXt – Section objectives
Creating Datetimes from strings and integers
Extracting datetime parts
Getting the current datetime
Performing calculations with datetimes
Summary statistics with datetimes
Rounding datetimes
Formatting datetimes
Loading columns as date or datetime and conclusion

Date Manipulation with the chron package

The chron package – Section objectives
Creating Dates from strings and integers
Extracting date and time parts
Getting the current date and time
Performing calculations with dates
Summary statistics with datetimes
Rounding datetimes
Formatting datetimes and conclusion

Date Manipulation with the lubridate package

The lubridate package – Section objectives
Creating Dates from strings and integers
Extracting datetime parts
Getting the current Date and datetime
Performing Date and datetime calculations
Summary statistics with dates and datetimes
Timespan – Duration
Timespan – Periods
Timespan – Intervals
Rounding datetimes
Formatting datetimes as strings
Reading datetime columns as datetime and conclusion