DP-203: Azure Data Engineer Associate - Beginner to Advanced

Get certified FAST!! Specifically designed to cover the complete syllabus in just 12 hours.

What you will learn

Understand the exam format for DP-203 and key areas of focus to successfully achieve certification.

Prepare comprehensively for the Azure Data Engineer Associate (DP-203) exam with an emphasis on practical skills and knowledge application.

Master Data Processing with Azure Synapse for DP-203 with detailed content on Dedicated, Serverless and Spark Pools.

Understand robust Security and optimize Performance within Azure Synapse Pools

Comprehend Azure Data Lake Storage Solutions to secure and manage data cost-effectively and ensure durability.

Orchestrate Data Workflows with Azure Data Factory

Get introduced to Azure Databricks for Collaborative Data Engineering and understand different Cluster Configurations.

Learn Real-Time Data Processing with Azure Stream Analytics

Understand time handling strategies within Stream Analytics like Out of order events, Late arriving events, Early arriving events and Watermarks.

Become proficient in leveraging Azure’s data engineering tools to their fullest potential, ready to thrive as a data engineer in the Azure cloud ecosystem.

Description

Dive into the world of Azure Data Engineering with a focused and comprehensive course designed to prepare you for the Azure Data Engineer Associate exam: DP-203.

This course provides a comprehensive exploration of Azure Synapse Analytics and its integrated ecosystem, encompassing Dedicated SQL Pools, Serverless SQL Pools, and Spark Pools.
You will understand how to harness the power of massively parallel processing (MPP) in Dedicated SQL Pool by mastering Distributions and Indexing. The course also emphasizes performance optimization in Synapse’s Dedicated SQL Pools, highlighting techniques like Partitioning, the use of Dynamic Management Views, Materialized Views, and effective Workload Management strategies. Additionally, you’ll acquire skills in enhancing security for Dedicated SQL Pools through measures such as Conditional Access, Dynamic Data Masking, Column-level Security, Row-level Security and Encryption.
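To give a feel for what Distributions mean in practice, here is a minimal Python sketch, not taken from the course labs. A Dedicated SQL Pool spreads each table across 60 distributions; the sketch contrasts hash distribution (rows with the same key always land together) with round-robin (rows are dealt out evenly regardless of content). The SHA-256 hash is only a stand-in for the service's internal hash function, and the customer data is hypothetical.

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # a Dedicated SQL Pool spreads each table over 60 distributions


def hash_distribution(key: str) -> int:
    """Map a distribution-column value to a bucket (stand-in hash, not Synapse's own)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def round_robin_distribution(row_number: int) -> int:
    """Assign rows to buckets in turn, ignoring their content."""
    return row_number % NUM_DISTRIBUTIONS


# Hypothetical sample: 6,000 order rows belonging to 500 customers.
customer_ids = [f"customer-{i % 500}" for i in range(6000)]

hash_counts = Counter(hash_distribution(c) for c in customer_ids)
rr_counts = Counter(round_robin_distribution(n) for n in range(len(customer_ids)))

# Round-robin is perfectly even; hash placement depends on the key values.
print("round-robin rows per distribution:", min(rr_counts.values()), "to", max(rr_counts.values()))
print("hash rows per distribution:", min(hash_counts.values()), "to", max(hash_counts.values()))
```

The trade-off this illustrates: hash distribution keeps all rows for one key in the same place, which helps joins and aggregations on that key, but a poorly chosen key can leave some distributions holding far more rows than others.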
You will learn how to use Serverless SQL Pools for efficient on-demand data queries and transformations, and about the authentication strategies for Serverless SQL Pools.
The curriculum thoroughly covers Spark Pools, introducing concepts like Delta Lake and Data Lakehouse.
We’ll cover the Data Lake for scalable storage solutions, focusing on key features like Access Control Lists (ACLs) for securing data, Lifecycle Policies for managing data retention, different Access Tiers available in Azure Data Lake Storage to store data cost-effectively based on access frequency and retrieval needs, and Storage Redundancy for data durability. This will give you a solid foundation in managing vast amounts of data securely and efficiently in Azure.
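As an illustration of how Lifecycle Policies and Access Tiers fit together, the following Python sketch builds a lifecycle management policy document in the JSON shape Azure Storage uses for such rules. The rule name, the `raw/logs` prefix and the day thresholds are hypothetical values chosen for the example, and `build_lifecycle_policy` is a helper invented here, not part of any Azure SDK.

```python
import json


def build_lifecycle_policy(prefix: str, cool_after: int, archive_after: int, delete_after: int) -> dict:
    """Build a policy with one rule that tiers, then deletes, block blobs under `prefix`."""
    if not (cool_after < archive_after < delete_after):
        raise ValueError("thresholds must increase: cool < archive < delete")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tierThenExpire",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                    "actions": {
                        "baseBlob": {
                            # Move to cheaper tiers as the data ages, then delete it.
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                            "delete": {"daysAfterModificationGreaterThan": delete_after},
                        }
                    },
                },
            }
        ]
    }


# Hypothetical thresholds: Cool after 30 days, Archive after 90, delete after 365.
policy = build_lifecycle_policy("raw/logs", 30, 90, 365)
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of document you would review before applying a policy to a storage account; the right thresholds depend on how often your data is actually read.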
You’ll dive into the basics of Azure Data Factory, laying a foundation for understanding how to orchestrate data movement and transformation workflows effectively. You’ll learn the fundamentals of creating, managing, and deploying data pipelines that enable efficient data flow between different data platforms and services within the Azure ecosystem.
Azure Databricks sessions will introduce you to collaborative Apache Spark-based Data Engineering along with explanations on different cluster configurations.
Finally, the course delves into Azure Stream Analytics for real-time data processing. You will learn to ingest, process, and analyse data streams in real-time with a better understanding of time handling strategies within Stream Analytics like Out of order events, Late arriving events, Early arriving events and Watermarks.
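To make the time handling ideas concrete, here is a simplified Python model, not the Stream Analytics engine itself. It counts events per tumbling window and uses a watermark (the newest event time seen minus a tolerance) to decide what happens to stragglers: they are either dropped or re-stamped to the watermark. The window length, the tolerance, the half-open `[start, end)` window boundaries and the sample arrival times are all assumptions made for the sketch.

```python
from collections import defaultdict

WINDOW_SECONDS = 10         # tumbling window length (hypothetical)
OUT_OF_ORDER_TOLERANCE = 5  # seconds an event may lag behind the newest one (hypothetical)


def tumbling_counts(event_times, policy="adjust"):
    """Count events per tumbling window, keyed by window end time.

    `event_times` lists event timestamps in arrival order. An event older
    than the watermark is dropped or moved up to the watermark, per `policy`.
    """
    counts = defaultdict(int)
    max_seen = None
    for t in event_times:
        max_seen = t if max_seen is None else max(max_seen, t)
        watermark = max_seen - OUT_OF_ORDER_TOLERANCE
        if t < watermark:
            if policy == "drop":
                continue       # discard the straggler
            t = watermark      # "adjust": re-stamp the event to the watermark
        window_end = (t // WINDOW_SECONDS + 1) * WINDOW_SECONDS
        counts[window_end] += 1
    return dict(counts)


# Event timestamps in the order they arrive; 9 is slightly out of order, 3 is too late.
arrivals = [1, 4, 12, 9, 14, 3, 21]
print(tumbling_counts(arrivals, policy="adjust"))
print(tumbling_counts(arrivals, policy="drop"))
```

In this run the event stamped 9 arrives after 12 but is still inside the tolerance, so it is counted normally. The event stamped 3 arrives after 14 and falls behind the watermark, so it is re-stamped under `adjust` and left out under `drop`.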
This course not only prepares you for the Azure Data Engineer Associate exam but also equips you with the practical skills and knowledge needed to thrive as a data engineer in the Azure cloud ecosystem. Through a blend of theoretical knowledge and practical demonstrations, you’ll emerge ready to tackle real-world data challenges and leverage Azure’s powerful data engineering tools to their fullest potential.

Language: English

Content

Welcome to the Course

Course Introduction

Create Azure Free Account

Azure Synapse Analytics

Introduction to Azure Synapse Analytics

Lab – Create a Synapse Analytics Workspace

Lab – Tour in Azure Synapse Analytics

Azure Synapse Dedicated SQL Pool

Introduction to Dedicated SQL Pool

Data Warehousing and ETL in Dedicated SQL Pool

Data Warehousing and ETL

MPP Architecture of Dedicated SQL Pool

Distributions in Tables of Dedicated SQL Pool

Indexing of Tables of the Dedicated SQL Pool

Lab – Create Dedicated SQL Pool

Lab – Create Azure SQL DB

Lab – Populate Dedicated SQL pool using Synapse Link

Lab – Explore Dedicated SQL Pool

Data Loading Process (ETL) in Dedicated SQL Pool

Create and Load Staging Tables

Slowly Changing Dimensions

Loading Dimension Tables

Loading Fact Tables and Post Load Optimization

Performance Improvement of Dedicated SQL Pool

Table Partitioning

Lab – Create a partitioned Table

Partition splitting and switching

Partition splitting and switching Example Overview

Lab – Partition splitting and switching example

Dynamic Management Views

Identify Connection Information and activity

Identify and troubleshoot query performance

Materialized Views

EXPLAIN WITH_RECOMMENDATIONS

Workload Management

Lab – Workload Management

Secure a Dedicated SQL Pool

Conditional Access

Dynamic Data Masking

Lab – Dynamic Data Masking

Lab – Column Level Security

Lab – Row Level Security

Lab – Transparent Data Encryption

Azure Synapse Serverless SQL Pool

Introduction to Serverless SQL Pool

Query Data using Serverless SQL Pool

The OPENROWSET Function

Lab – Querying different file formats from Serverless SQL Pool

Wildcard expressions to filter files

Create External Objects in Serverless SQL Pool

Lab – Create External Objects in Serverless SQL Pool

Transform Data using Serverless SQL Pool

Lab – Transform (part1) – Use CETAS to Transform data

Lab – Transform (part2) – Add a Stored procedure within a Synapse Pipeline

Lake Database in Serverless SQL Pool

Lake Database Introduction

Lab – Create Lake Database

Secure Data in Serverless SQL Pool

Authentication in Serverless SQL Pool

Access Control Lists (ACLs)

RBAC in Serverless SQL Pool

Azure Synapse Apache Spark Pool

Spark Architecture

Lab – Create an Apache Spark Pool

Spark Documentation

Data Transformation in a Spark Pool using PySpark

Lab – Read data into a Dataframe

Lab – Transformations on Customer data

Lab – Transformations on Product data

Lab – Transformations on Monthly data

Lab – Partitioning data

Lab – Managed Tables Vs External Tables

Lab – Magic commands

Lab – TempViews Vs GlobalTempViews

Data Transformation in Spark Pool using SparkSQL

Lab – Create Database Objects

Lab – Transformations for Product data

Lab – Transformations on Monthly data and Partitioning

Delta Lake in Spark Pool

Introduction to Delta Lake

Lab – Create Delta Tables

Lab – Update Delta Table

Lab – Time Travel

Lab – Create Delta Tables using SQL

Implement Data Lakehouse using Spark Pool

Introduction to Data Lakehouse

Lab – Preparing Environment for Data Lakehouse

Lab – Populate Silver Zone

Lab – Populate Gold Zone

Lab – Create Synapse ELT Pipeline

Azure Data Lake Storage Gen 2

ADLS Gen 2 Introduction

Lab – Create ADLS Gen 2 Account

Azure Storage Explorer

ADLS File Formats

Access Control Lists (ACLs) in ADLS Gen 2

Access Tiers of ADLS Gen 2

Lifecycle Management Policies

Lab – Create Lifecycle Management Policies

Storage Redundancy

Azure Data Factory

Introduction to Azure Data Factory

Lab – Create Azure Data Factory Account

Lab (Optional) – Create Azure SQL DB

Lab – Create an ADF Data Pipeline

Lab – Create a DataFlow

Lab – Include Flowlets in the DataFlow

Lab – Create a Pipeline from Dataflow

Introduction to Triggers

Lab – Creating Triggers

Lab – Pipeline Dependency

Lab – Trigger Dependency

Integration Runtimes

Azure Databricks

Introduction to Azure Databricks

Lab – Create Azure Databricks Workspace

Cluster Configurations in Azure Databricks

Lab – Cluster Creation

Lab – Data Ingestion, SQL Warehouses and Dashboards in Azure Databricks

Lab – Mounting container onto DBFS using Key Vault

Azure Stream Analytics

Fundamentals of Azure Stream Analytics

Types of Window Functions

Tumbling Window

Hopping Window

Sliding Window

Session Window

Snapshot Window

Reference Data Inputs

Geospatial Functions

Create Azure Stream Analytics Job

Lab – Create Azure Event hub

Lab – Create Stream Analytics Job

Lab – Configure Input of Stream Analytics Job

Lab – Configure Output of Stream Analytics Job

Lab – Configure Query of Stream Analytics Job

Lab – Run Stream Analytics Job

Time handling in Azure Stream Analytics

Out of Order and Late Arriving Events

Timestamp adjustments with Watermark

Early arriving events, Watermark progression and Substreams

Watermark Delay

Performance Optimization in Azure Stream Analytics

Performance Optimization in Stream Analytics

End-of-Course Evaluation Questions for DP-203 Azure Data Engineer Associate

Practice Test

End of Course
