Implement a Data Analytics Solution with Azure Databricks (DP-3011)


Course Overview

This course explores how to use Databricks and Apache Spark on Azure to take data projects from exploration to production. You’ll learn how to ingest, transform, and analyze large-scale datasets with Spark DataFrames, Spark SQL, and PySpark, while also building confidence in managing distributed data processing. Along the way, you’ll get hands-on with the Databricks workspace—navigating clusters and creating and optimizing Delta tables. You’ll also dive into data engineering practices, including designing ETL pipelines, handling schema evolution, and enforcing data quality. The course then moves into orchestration, showing you how to automate and manage workloads with Lakeflow Jobs and pipelines. To round things out, you’ll explore governance and security capabilities such as Unity Catalog and Purview integration, ensuring you can work with data in a secure, well-managed, and production-ready environment.
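To give a concrete flavor of the DataFrame-to-Delta workflow the course covers, here is a minimal PySpark sketch. It assumes a Databricks notebook where the SparkSession `spark` is predefined; the input path and table name are hypothetical placeholders, not course materials.

    from pyspark.sql import functions as F

    # Ingest: read raw CSV files into a Spark DataFrame, inferring the schema.
    raw = (spark.read
           .option("header", "true")
           .option("inferSchema", "true")
           .csv("/Volumes/demo/raw/sales/"))

    # Transform: drop incomplete rows, then aggregate revenue per region.
    summary = (raw
               .filter(F.col("amount").isNotNull())
               .groupBy("region")
               .agg(F.sum("amount").alias("total_revenue")))

    # Load: persist the result as a managed Delta table.
    summary.write.format("delta").mode("overwrite").saveAsTable("demo.analytics.sales_summary")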

Who should attend

This course is designed for data professionals who want to strengthen their skills in building and managing data solutions on Azure Databricks. It’s a good fit if you’re a data engineer, data analyst, or developer with some prior experience in Python, SQL, and basic cloud concepts, and you’re looking to move beyond small-scale analysis into scalable, production-ready data processing. Whether your goal is to modernize analytics workflows, optimize pipelines, or better manage and govern data at scale, this learning path will equip you with the practical skills to succeed.

Prerequisites

Before starting this learning path, you should already be comfortable with the fundamentals of Python and SQL. This includes being able to write simple Python scripts and work with common data structures, as well as writing SQL queries to filter, join, and aggregate data. A basic understanding of common file formats such as CSV, JSON, or Parquet will also help when working with datasets.
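As a rough calibration of that baseline, you should be able to read and write something like the following without difficulty. The data and query are illustrative placeholders, not course material.

    # Python: filter out incomplete records and total the remainder per region.
    orders = [
        {"id": 1, "region": "West", "amount": 120.0},
        {"id": 2, "region": "East", "amount": None},
        {"id": 3, "region": "West", "amount": 80.5},
    ]
    totals = {}
    for order in orders:
        if order["amount"] is not None:
            totals[order["region"]] = totals.get(order["region"], 0.0) + order["amount"]
    print(totals)  # {'West': 200.5}

    # SQL: the equivalent filter-and-aggregate query.
    query = """
        SELECT region, SUM(amount) AS total_amount
        FROM orders
        WHERE amount IS NOT NULL
        GROUP BY region
    """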

In addition, familiarity with the Azure portal and core services like Azure Storage is important, along with a general awareness of data concepts such as batch versus streaming processing and structured versus unstructured data. While not mandatory, prior exposure to big data frameworks like Spark, and experience working with Jupyter notebooks, can make the transition to Databricks smoother.
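To make the batch-versus-streaming distinction concrete, the sketch below contrasts a one-off batch read with a Structured Streaming read of the same directory in PySpark. It again assumes a Databricks environment with `spark` predefined; the paths, checkpoint location, and table name are hypothetical.

    # Batch: read whatever files exist right now, process them once, and finish.
    batch_df = spark.read.format("json").load("/Volumes/demo/raw/events/")

    # Streaming: treat the same directory as an unbounded source; Spark picks up
    # new files incrementally as they arrive.
    stream_df = (spark.readStream
                 .format("json")
                 .schema(batch_df.schema)  # streaming file sources need an explicit schema
                 .load("/Volumes/demo/raw/events/"))

    # Continuously append the stream to a Delta table, tracking progress in a checkpoint.
    (stream_df.writeStream
        .format("delta")
        .option("checkpointLocation", "/Volumes/demo/chk/events/")
        .toTable("demo.analytics.events"))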

Course Content

  • Explore Azure Databricks
  • Perform data analysis with Azure Databricks
  • Use Apache Spark in Azure Databricks
  • Manage data with Delta Lake
  • Build Lakeflow Declarative Pipelines (see the pipeline sketch after this list)
  • Deploy workloads with Lakeflow Jobs
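As a preview of the pipelines module, here is a sketch of what a declarative pipeline definition looks like in Python, assuming the `dlt` module that Databricks provides inside pipeline notebooks (Lakeflow Declarative Pipelines evolved from Delta Live Tables). The table names, source path, and quality rule are illustrative placeholders, and this code only runs as part of a Databricks pipeline, not as a standalone script.

    import dlt
    from pyspark.sql import functions as F

    # Bronze: declare a table that ingests raw files; the path is a placeholder.
    @dlt.table(comment="Raw sales records ingested from cloud storage.")
    def sales_bronze():
        return spark.read.format("json").load("/Volumes/demo/raw/sales/")

    # Silver: declare a cleaned table with a data-quality expectation that
    # drops any row violating the constraint.
    @dlt.table(comment="Cleaned sales records with a basic quality check.")
    @dlt.expect_or_drop("valid_amount", "amount IS NOT NULL AND amount >= 0")
    def sales_silver():
        return dlt.read("sales_bronze").withColumn("ingested_at", F.current_timestamp())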

Click on a town name or "Online Training" below to book.

Instructor-led Online Training: This is an Instructor-Led Online (ILO) course. Sessions are conducted via Microsoft Teams in a VoIP environment and require an Internet connection and a headset with a microphone connected to your computer or laptop.
This is a FLEX course, which is delivered both virtually and in the classroom.

North America

United States

Online Training. Time zone: Pacific Standard Time (PST). Course language: English.
Online Training. Time zone: Central Standard Time (CST). Course language: English.
Online Training. Time zone: Eastern Daylight Time (EDT). Course language: English.

Canada

Online Training. Time zone: Pacific Standard Time (PST). Course language: English.
Online Training. Time zone: Central Standard Time (CST). Course language: English.
Online Training. Time zone: Eastern Daylight Time (EDT). Course language: English.

Europe

Germany

Frankfurt

Netherlands

Utrecht

Poland

Online Training. Time zone: Central European Time (CET). Course language: Polish.

United Kingdom

Online Training. Time zone: Greenwich Mean Time (GMT). Course language: English.

Switzerland

Zürich

Middle East

United Arab Emirates

Dubai

Africa

Egypt

Cairo