

Preparation course for the Databricks Data Engineer Associate certification exam, versions 2 and 3

What you will learn

Databricks Lakehouse Platform and its tools

Build ETL pipelines

Process data incrementally

Create production pipelines

Create dashboards in Databricks

Implement best security practices

Description

Whether you’re a seasoned data professional or just starting your journey, this course blends theory with hands-on examples to prepare you for the exam. With practical exercises and step-by-step guidance, you will learn how to navigate the Data Lakehouse architecture, explore the Data Science and Engineering workspace, and work confidently with Delta Lake.

Becoming a certified Databricks Data Engineer opens up opportunities in data processing and analytics. In this comprehensive course, you will gain the knowledge and skills to use the Databricks Lakehouse Platform and to tackle real-world data challenges with confidence and efficiency.




Here’s a breakdown of the topics covered in this course:

  • Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:
    • Data Lakehouse (architecture, descriptions, benefits)
    • Data Science and Engineering workspace (clusters, notebooks, data storage)
    • Delta Lake (general concepts, table management and manipulation, optimizations)
  • Build ETL pipelines using Apache Spark SQL and Python, including:
    • Relational entities (databases, tables, views)
    • ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)
    • Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)
  • Incrementally process data, including:
    • Structured Streaming (general concepts, triggers, watermarks)
    • Auto Loader (streaming reads)
    • Multi-hop Architecture (bronze-silver-gold, streaming applications)
    • Delta Live Tables (benefits and features)
  • Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:
    • Jobs (scheduling, task orchestration, UI)
    • Dashboards (endpoints, scheduling, alerting, refreshing)
  • Understand and follow best security practices, including:
    • Unity Catalog (benefits and features)
    • Entity Permissions (team-based permissions, user-based permissions)

These topics provide comprehensive coverage of the Databricks Lakehouse Platform and its tools, giving learners a solid understanding of data engineering concepts and practices on Databricks.

Language: English

Content

Introduction

Introduction
(lab) Databricks community account
(lab) Free Databricks on Azure
Databricks Workspace tour
Account and workspaces
Clusters
(lab) Creating Clusters
Course resources

Databricks Lakehouse Platform and its tools

What is Databricks
Databricks architecture
Notebook basics
Delta Lake
(Theory & lab) Reading txt and csv formats
(Theory & lab) Reading Delta format data
(Theory & lab) Creating Delta tables using CTAS
(Theory & lab) Delta table constraints
(Theory & lab) Delta table Partitions
(Theory & lab) Delta table Operations
(Theory & lab) Time-travel and optimization
(Theory & lab) Delta table optimization and VACUUM
(Theory) Delta cloning and external tables
(lab) Delta cloning and external tables

ETL pipelines using Apache Spark SQL and Python

(Theory & lab) Relational Entities
Persistent, Temporary & Global Views
Combining and reshaping tables, and higher order functions
(Lab) Combining and reshaping tables, and higher order functions
(Theory & Lab) Joins and Sets
Passing data between PySpark DataFrames and Spark SQL
(Theory & lab) UDF and control flows

Incremental data processing

Incremental data processing
(Lab) Structured streaming
(Lab) Auto Loader
Multi-hop architecture
Multi-hop architecture discussion
(Lab) Multi-hop architecture implementation
Delta Live Tables
(Lab) Delta Live Tables

Production pipelines for data engineering applications and Databricks SQL

(Lab) Notebook parameters
(Lab) Databricks SQL, alerts & dashboards
(Lab) Scheduling a job

Entity permissions and Unity Catalog

Entity permissions
Unity Catalog
(Lab) Entity permissions

Exam guidance

Exam Guidance