Data Engineering 101
A comprehensive data engineering program for career switchers and data specialists
Program Overview
Today, every business is data-driven, and the demand for data specialists continues to grow. That's why we've developed this program: a micromaster's degree covering the fundamentals of data storage, processing, and retrieval.
Over three months, you'll learn how to work with data, from SQL queries to coordination and monitoring. After completing the program, you'll be ready to apply for a Junior Data Engineer position (if you're switching careers) or to solidify your current role (if you're already in the field).
PARTICIPANT REQUIREMENTS
Basic understanding of Python
Basic understanding of SQL
Basic understanding of Docker
English proficiency at B1 level or higher
EDUCATIONAL MODULES
Module 0: Prerequisites and Introduction to Data Engineering
This module is dedicated to introducing the key concepts of data engineering and structuring knowledge:
- Intro to data engineering: why it's more than just backend development
- Python: lists, classes, functions
- SQL, relational databases (RDBMS), and queries
- Docker: containers, images, versioning
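To gauge the prerequisite level, this is roughly the kind of Python and SQL the program assumes on day one (a minimal sketch for illustration only; the table and column names are invented, not course material):

```python
import sqlite3

# In-memory database: enough to practice the SQL covered in Module 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "login"), (1, "click"), (2, "login")],
)

# A basic aggregate query: number of actions per user.
rows = conn.execute(
    "SELECT user_id, COUNT(*) FROM events GROUP BY user_id ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 2), (2, 1)]

# Python side: a small function and a list comprehension, the level assumed here.
def active_users(counts, threshold=2):
    return [uid for uid, n in counts if n >= threshold]

print(active_users(rows))  # [1]
```

If snippets like this look readable to you, you meet the Python and SQL requirements above.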
Module 1: Data Storage
In this module, you will get acquainted with the basic principles of data storage:
- Types of databases and key differences
- Relational databases (SQL)
- Non-relational databases (NoSQL)
- Data formats and storage in object stores
- Data modeling
Module 2: Data Processing
This module is dedicated to the main approaches and tools for data processing:
- Stream processing
- Batch processing
- Working with PySpark
- Spark SQL
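The batch/stream distinction above can be shown without Spark: batch code sees the whole dataset at once, while stream code updates state one event at a time. A pure-Python sketch (the event values are invented):

```python
# Batch: the full dataset is available, so we aggregate it in one pass.
events = [3, 1, 4, 1, 5]
batch_total = sum(events)

# Stream: events arrive one by one, so state is updated incrementally.
def stream_totals(source):
    total = 0
    for value in source:
        total += value
        yield total  # running total after each event

running = list(stream_totals(iter(events)))
print(batch_total)  # 14
print(running)      # [3, 4, 8, 9, 14]
```

Both paths end at the same total; the module covers how PySpark applies the same idea at cluster scale.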
Module 3: Data Retrieval
In this module, you will learn how to collect and organize data from various sources:
- Organizing data in file systems and object stores
- Tools for batch data collection (PySpark, Airbyte)
- Basics of REST API
- Event streams
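For the data-organization topic, one widely used convention is Hive-style partitioned key prefixes in object storage; a sketch of how such keys might be built (the bucket, table, and file names are hypothetical):

```python
from datetime import date

def partition_key(bucket: str, table: str, day: date, part: int) -> str:
    """Build a Hive-style partitioned object key, e.g. for S3-compatible storage."""
    return f"{bucket}/{table}/dt={day.isoformat()}/part-{part:05d}.parquet"

key = partition_key("raw-data", "events", date(2024, 1, 15), 3)
print(key)  # raw-data/events/dt=2024-01-15/part-00003.parquet
```

Partitioning keys by date like this lets query engines skip whole prefixes instead of scanning every object.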
Module 4: Coordination and Monitoring
In this module, you will learn why and how to orchestrate, coordinate, and monitor data using various tools:
- Working with Airflow: DAG scripts, tasks, parameters
- Monitoring data with Prometheus
- Visualizing metrics with Grafana
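An orchestrator like Airflow runs tasks in an order derived from the DAG's dependency edges; conceptually this is a topological sort, which the following Airflow-free sketch illustrates (the task names are invented):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on,
# analogous to upstream tasks in an Airflow DAG.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'transform', 'load', 'notify']
```

The module shows how Airflow expresses the same dependencies in DAG scripts and adds scheduling, retries, and monitoring on top.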
Curator and Instructor
Dmytro Pryimak
Engineer with over 10 years of professional experience designing and building distributed data processing systems. Throughout his career, Dmytro has worked on numerous projects spanning insurance, healthcare and medical data processing, online media, and entertainment. In recent years, he has shifted his focus from hands-on engineering to team leadership, mentoring, and coaching.
Dmytro is also a guest lecturer at SET University, where he teaches Big Data in the Master's degree programs.
ADVANTAGES
The program provides the necessary skills and knowledge to start a career in one of the most in-demand IT fields
Flexible learning format, allowing you to combine it with full-time work
Training by expert practitioners who provide relevant feedback and quality support during the course
WHO IT’S FOR
Developers looking to grow in the field of data engineering
Data Scientists and Data Analysts who want to transition into a Data Engineer role
Junior Data Engineers looking to organize their knowledge and use data tools effectively
FAQ
I already work as a data engineer. Is it worth taking your course?
If you have been working in this position for a year or less, then yes, this course will help you structure your knowledge and fill in gaps in your mastery of specific tools.
Learn more about the SET University program
Still have questions?
Get a consultation: hello@setuniversity.tech