

Get certified FAST! Specifically designed to cover the complete syllabus in just 12 hours.

What you will learn

Understand the exam format for DP-203 and key areas of focus to successfully achieve certification.

Prepare comprehensively for the Azure Data Engineer Associate (DP-203) exam with an emphasis on practical skills and knowledge application.

Master Data Processing with Azure Synapse for DP-203, with detailed content on Dedicated, Serverless, and Spark Pools.

Understand how to implement robust security and optimize performance within Azure Synapse pools.

Comprehend Azure Data Lake Storage solutions to secure and manage data cost-effectively and ensure durability.

Orchestrate Data Workflows with Azure Data Factory

Get introduced to Azure Databricks for collaborative data engineering and understand different cluster configurations.

Learn Real-Time Data Processing with Azure Stream Analytics

Understand time-handling strategies within Stream Analytics, such as out-of-order events, late-arriving events, early-arriving events, and watermarks.

Become proficient in leveraging Azure’s data engineering tools to their fullest potential, ready to thrive as a data engineer in the Azure cloud ecosystem.

Description

Dive into the world of Azure Data Engineering with a focused and comprehensive course designed to prepare you for the Azure Data Engineer Associate exam: DP-203.




  • This course provides a comprehensive exploration of Azure Synapse Analytics and its integrated ecosystem, encompassing Dedicated SQL Pools, Serverless SQL Pools, and Spark Pools.
  • You will understand how to harness the power of massively parallel processing (MPP) in Dedicated SQL Pools by mastering Distributions and Indexing. The course also emphasizes performance optimization in Synapse’s Dedicated SQL Pools, highlighting techniques like Partitioning, the use of Dynamic Management Views, Materialized Views, and effective Workload Management strategies. Additionally, you’ll acquire skills in enhancing security for Dedicated SQL Pools through measures such as Conditional Access, Dynamic Data Masking, Column-level Security, Row-level Security, and Encryption.
  • You will learn how to use Serverless SQL Pools for efficient on-demand data queries and transformations, as well as the authentication strategies they support.
  • The curriculum thoroughly covers Spark Pools, introducing concepts like Delta Lake and Data Lakehouse.
  • We’ll cover the Data Lake for scalable storage solutions, focusing on key features like Access Control Lists (ACLs) for securing data, Lifecycle Policies for managing data retention, different Access Tiers available in Azure Data Lake Storage to store data cost-effectively based on access frequency and retrieval needs, and Storage Redundancy for data durability. This will give you a solid foundation in managing vast amounts of data securely and efficiently in Azure.
  • You’ll dive into the basics of Azure Data Factory, laying a foundation for understanding how to orchestrate data movement and transformation workflows effectively. You’ll learn the fundamentals of creating, managing, and deploying data pipelines that enable efficient data flow between different data platforms and services within the Azure ecosystem.
  • Azure Databricks sessions will introduce you to collaborative Apache Spark-based Data Engineering along with explanations on different cluster configurations.
  • Finally, the course delves into Azure Stream Analytics for real-time data processing. You will learn to ingest, process, and analyse data streams in real time, and to apply the time-handling strategies Stream Analytics offers for out-of-order events, late-arriving events, early-arriving events, and watermarks.

    This course not only prepares you for the Azure Data Engineer Associate exam but also equips you with the practical skills and knowledge needed to thrive as a data engineer in the Azure cloud ecosystem. Through a blend of theoretical knowledge and practical demonstrations, you’ll emerge ready to tackle real-world data challenges and leverage Azure’s powerful data engineering tools to their fullest potential.

Language: English

Content

Welcome to the Course

Course Introduction

Create Azure Free Account

Create Azure Free Account

Azure Synapse Analytics

Introduction to Azure Synapse Analytics
Lab – Create a Synapse Analytics Workspace
Lab – Tour in Azure Synapse Analytics

Azure Synapse Dedicated SQL Pool

Introduction to Dedicated SQL Pool

Data Warehousing and ETL in Dedicated SQL Pool

Data Warehousing and ETL
MPP Architecture of Dedicated SQL Pool
Distributions in Tables of Dedicated SQL Pool
Indexing of Tables of the Dedicated SQL Pool
Lab – Create Dedicated SQL Pool
Lab – Create Azure SQL DB
Lab – Populate Dedicated SQL Pool using Synapse Link
Lab – Explore Dedicated SQL Pool
Data Loading Process (ETL) in Dedicated SQL Pool
Create and Load Staging Tables
Slowly Changing Dimensions
Loading Dimension Tables
Loading Fact Tables and Post Load Optimization

Performance Improvement of Dedicated SQL Pool

Table Partitioning
Lab – Create a partitioned Table
Partition splitting and switching
Partition splitting and switching Example Overview
Lab – Partition splitting and switching example
Dynamic Management Views
Identify Connection Information and activity
Identify and troubleshoot query performance
Materialized Views
EXPLAIN WITH_RECOMMENDATIONS
Workload Management
Lab – Workload Management

Secure a Dedicated SQL Pool

Conditional Access
Dynamic Data Masking
Lab – Dynamic Data Masking
Lab – Column Level Security
Lab – Row Level Security
Lab – Transparent Data Encryption

Azure Synapse Serverless SQL Pool

Introduction to Serverless SQL Pool

Query Data using Serverless SQL Pool

The OPENROWSET Function
Lab – Querying different file formats from Serverless SQL Pool
Wildcard expressions to filter files
Create External Objects in Serverless SQL Pool
Lab – Create External Objects in Serverless SQL Pool

Transform Data using Serverless SQL Pool

Transform Data using Serverless SQL Pool
Lab – Transform (part 1) – Use CETAS to Transform data
Lab – Transform (part 2) – Add a Stored procedure within a Synapse Pipeline

Lake Database in Serverless SQL Pool

Lake Database Introduction
Lab – Create Lake Database

Secure Data in Serverless SQL Pool

Authentication in Serverless SQL Pool
Access Control Lists (ACLs)
RBAC in Serverless SQL Pool

Azure Synapse Apache Spark Pool

Spark Architecture
Lab – Create an Apache Spark Pool
Spark Documentation

Data Transformation in a Spark Pool using PySpark

Lab – Read data into a Dataframe
Lab – Transformations on Customer data
Lab – Transformations on Product data
Lab – Transformations on Monthly data
Lab – Partitioning data
Lab – Managed Tables Vs External Tables
Lab – Magic commands
Lab – TempViews Vs GlobalTempViews

Data Transformation in Spark Pool using SparkSQL

Lab – Create Database Objects
Lab – Transformations for Product data
Lab – Transformations on Monthly data and Partitioning

Delta Lake in Spark Pool

Introduction to Delta Lake
Lab – Create Delta Tables
Lab – Update Delta Table
Lab – Time Travel
Lab – Create Delta Tables using SQL

Implement Data Lakehouse using Spark Pool

Introduction to Data Lakehouse
Lab – Preparing Environment for Data Lakehouse
Lab – Populate Silver Zone
Lab – Populate Gold Zone
Lab – Create Synapse ELT Pipeline

Azure Data Lake Storage Gen 2

ADLS Gen 2 Introduction
Lab – Create ADLS Gen 2 Account
Azure Storage Explorer
ADLS File Formats
Access Control Lists (ACLs) in ADLS Gen 2
Access Tiers of ADLS Gen 2
Lifecycle Management Policies
Lab – Create Lifecycle Management Policies
Storage Redundancy

Azure Data Factory

Introduction to Azure Data Factory
Lab – Create Azure Data Factory Account
Lab (Optional) – Create Azure SQL DB
Lab – Create an ADF Data Pipeline
Lab – Create a DataFlow
Lab – Include Flowlets in the DataFlow
Lab – Create a Pipeline from Dataflow
Introduction to Triggers
Lab – Creating Triggers
Lab – Pipeline Dependency
Lab – Trigger Dependency
Integration Runtimes

Azure Databricks

Introduction to Azure Databricks
Lab – Create Azure Databricks Workspace
Cluster Configurations in Azure Databricks
Lab – Cluster Creation
Lab – Data Ingestion, SQL Warehouses and Dashboards in Azure Databricks
Lab – Mounting container onto DBFS using Key Vault

Azure Stream Analytics

Fundamentals of Azure Stream Analytics
Types of Window Functions
Tumbling Window
Hopping Window
Sliding Window
Session Window
Snapshot Window
Reference Data Inputs
Geospatial Functions

Create Azure Stream Analytics Job

Lab – Create Azure Event hub
Lab – Create Stream Analytics Job
Lab – Configure Input of Stream Analytics Job
Lab – Configure Output of Stream Analytics Job
Lab – Configure Query of Stream Analytics Job
Lab – Run Stream Analytics Job

Time handling in Azure Stream Analytics

Out of Order and Late Arriving Events
Timestamp adjustments with Watermark
Early arriving events, Watermark progression and Substreams
Watermark Delay

Performance Optimization in Azure Stream Analytics

Performance Optimization in Stream Analytics

End-of-Course Evaluation Questions for DP-203 Azure Data Engineer Associate

Practice Test

End of Course

End of Course