Preparation course for the Databricks Data Engineer Associate certification exam, versions 2 and 3
What you will learn
Use the Databricks Lakehouse Platform and its tools
Build ETL pipelines
Process data incrementally
Create production pipelines
Create dashboards in Databricks
Implement best security practices
Description
Whether you’re a seasoned data professional or just starting your journey, this course combines theory with hands-on examples to prepare you for the exam. Through practical exercises and step-by-step guidance, you will learn how to navigate the Data Lakehouse architecture, explore the Data Science and Engineering workspace, and work with Delta Lake.
Becoming a certified Databricks Data Engineer opens up opportunities in data processing and analytics. In this course, you will gain the knowledge and skills to use the Databricks Lakehouse Platform and to tackle real-world data challenges with confidence and efficiency.
Here’s a breakdown of the topics covered in this course:
- Understand how to use the Databricks Lakehouse Platform and its tools, and the benefits of doing so, including:
  - Data Lakehouse (architecture, descriptions, benefits)
  - Data Science and Engineering workspace (clusters, notebooks, data storage)
  - Delta Lake (general concepts, table management and manipulation, optimizations)
- Build ETL pipelines using Apache Spark SQL and Python, including:
  - Relational entities (databases, tables, views)
  - ELT (creating tables, writing data to tables, cleaning data, combining and reshaping tables, SQL UDFs)
  - Python (facilitating Spark SQL with string manipulation and control flow, passing data between PySpark and Spark SQL)
- Incrementally process data, including:
  - Structured Streaming (general concepts, triggers, watermarks)
  - Auto Loader (streaming reads)
  - Multi-hop Architecture (bronze-silver-gold, streaming applications)
  - Delta Live Tables (benefits and features)
- Build production pipelines for data engineering applications and Databricks SQL queries and dashboards, including:
  - Jobs (scheduling, task orchestration, UI)
  - Dashboards (endpoints, scheduling, alerting, refreshing)
- Understand and follow best security practices, including:
  - Unity Catalog (benefits and features)
  - Entity Permissions (team-based permissions, user-based permissions)
Together, these topics provide comprehensive coverage of the Databricks Lakehouse Platform and its tools, giving learners a solid understanding of data engineering concepts and practices on Databricks.
Content