Stratpoint Engineering

Data Engineering Internship 2026

Chapter 10

Weekly Milestones Tracker

Self-assess at the end of each week. Bring unchecked items to your next consultation session.

Day 0 | Setup

  • WSL2 (Windows) or Terminal (macOS) is working
  • Docker Desktop is running and docker compose version works
  • PostgreSQL container starts and I can connect with psql
  • Python 3.11+ installed and pip works
  • dbt --version shows 1.8+
  • Airflow UI accessible at localhost:8080
  • PySpark SparkSession creates without errors
  • VS Code with all required extensions installed
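
Some of the command-line checks above can be scripted. The following is a minimal sketch, assuming the tools are expected on your PATH under the names docker, psql, and dbt. The tool list and the helper names check_tools and python_is_recent are illustrative, not part of the course material. If you only run psql inside the container, drop it from the list.

```python
import shutil
import sys

# Hypothetical list of executables to look for; adjust to your own setup.
REQUIRED_TOOLS = ["docker", "psql", "dbt"]


def check_tools(tools):
    """Return a dict mapping each tool name to True if it is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}


def python_is_recent(minimum=(3, 11)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    for tool, found in check_tools(REQUIRED_TOOLS).items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
    print(f"Python 3.11+: {'yes' if python_is_recent() else 'no'}")
```

This only confirms that the executables exist. It does not replace the checks that need a running service, such as opening the Airflow UI or creating a SparkSession.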

Week 1 | Database Design

  • I can explain the difference between OLTP and OLAP
  • I can design a normalized ERD from a business scenario
  • I can design a star schema with a fact table and dimension tables
  • I understand when to use star schema vs snowflake schema
  • I completed Project 1 and presented my ERD and star schema to instructors
  • I submitted my design justification document
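
As a reference point for the star schema items above, here is a minimal sketch of one fact table joined to two dimension tables. It assumes a made-up retail scenario, and the names fact_sales, dim_date, and dim_product are hypothetical. It uses Python's built-in sqlite3 module so it runs without a database server; PostgreSQL column types would differ slightly.

```python
import sqlite3

# Hypothetical retail example: one fact table, two dimension tables.
DDL = """
CREATE TABLE dim_date (
    date_key   INTEGER PRIMARY KEY,
    full_date  TEXT NOT NULL,
    month      INTEGER NOT NULL,
    year       INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL
);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    date_key    INTEGER NOT NULL REFERENCES dim_date (date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product (product_key),
    quantity    INTEGER NOT NULL,
    amount      REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO dim_date VALUES (20260105, '2026-01-05', 1, 2026)")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Notebook", "Stationery"), (2, "Pen", "Stationery")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [(1, 20260105, 1, 2, 10.0), (2, 20260105, 2, 5, 7.5), (3, 20260105, 1, 1, 5.0)],
)

# Analytical query: aggregate the fact table, grouped by a dimension attribute.
rows = conn.execute(
    """
    SELECT p.name, SUM(f.quantity) AS units, SUM(f.amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()
print(rows)
```

The fact table holds the measures (quantity, amount) and foreign keys only. Descriptive attributes live in the dimensions, which is what keeps the join pattern simple.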

Week 2 | Core Tools

  • I can write a bash script that downloads a file and handles errors
  • I can load a CSV into PostgreSQL using PySpark
  • I can clean a DataFrame with pandas (nulls, deduplication, type casting)
  • I can write window functions in SQL (RANK, LAG, LEAD)
  • I completed Project 2 and demonstrated the end-to-end pipeline
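
For the window function item above, the following sketch shows RANK, LAG, and LEAD in one query. It assumes a hypothetical daily_sales table and runs against an in-memory SQLite database (version 3.25 or later, which added window functions) as a stand-in for PostgreSQL. The window syntax used here is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per region per day.
conn.execute("CREATE TABLE daily_sales (region TEXT, day INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [
        ("north", 1, 100),
        ("north", 2, 150),
        ("north", 3, 120),
        ("south", 1, 80),
        ("south", 2, 60),
    ],
)

rows = conn.execute(
    """
    SELECT
        region,
        day,
        amount,
        -- Rank days within each region, highest amount first.
        RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region,
        -- Previous and next day's amount within the same region.
        LAG(amount)  OVER (PARTITION BY region ORDER BY day) AS prev_amount,
        LEAD(amount) OVER (PARTITION BY region ORDER BY day) AS next_amount
    FROM daily_sales
    ORDER BY region, day
    """
).fetchall()

for row in rows:
    print(row)
```

LAG returns NULL (None in Python) for the first row of each partition and LEAD returns NULL for the last, because there is no neighboring row to read from.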

Weeks 3–5 | Data Pipelines

  • I understand the difference between ETL and ELT
  • I can create a dbt project with staging, intermediate, and mart layers
  • I use source() for raw tables and ref() for dbt models — never hardcoded table names
  • I have unique and not_null tests on all primary keys
  • I can build an Airflow DAG that runs a multi-step pipeline
  • I completed Project 3 — dbt run and dbt test both pass with zero errors
  • I completed Project 3 — Airflow DAG runs end to end with all tasks green
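
To make the unique and not_null item above concrete: these dbt tests pass when a query for offending rows finds nothing. The sketch below reproduces that idea in plain Python with sqlite3. It approximates the logic and is not dbt's compiled SQL. The table stg_orders, the column order_id, and both helper functions are hypothetical.

```python
import sqlite3


def failing_not_null(conn, table, column):
    """Count rows where the column is NULL (zero means the check passes)."""
    # Identifiers are interpolated directly, so pass trusted names only.
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def failing_unique(conn, table, column):
    """Count distinct non-NULL values that appear more than once."""
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT {column}
            FROM {table}
            WHERE {column} IS NOT NULL
            GROUP BY {column}
            HAVING COUNT(*) > 1
        )
    """
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?)",
    [(1, 9.5), (2, 3.0), (2, 4.0), (None, 1.0)],
)

null_failures = failing_not_null(conn, "stg_orders", "order_id")
duplicate_failures = failing_unique(conn, "stg_orders", "order_id")
print(null_failures, duplicate_failures)
```

In this sample data both checks report a failure: one NULL key and one duplicated key value. A primary key column should report zero for both.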

Weeks 6–7 | Power BI

  • I can connect Power BI Desktop to a PostgreSQL database
  • I can build a star schema data model in Power BI Model view
  • I have written at least 5 DAX measures in a dedicated Measures table
  • My dashboard answers at least 3 specific business questions
  • I have written a data story that includes a recommended action
  • I completed Project 4 and presented the dashboard to instructors

Weeks 8–12 | Capstone

  • I read the full capstone brief on Day 1 of Week 8
  • I have an architecture diagram showing the full data flow
  • Python classes and modular functions are used for extraction and transformation
  • PySpark processes the data and writes to PostgreSQL
  • dbt project has staging/intermediate/mart layers with passing tests
  • Airflow DAG orchestrates the full pipeline on a schedule
  • Power BI dashboard is connected to the mart tables
  • I have submitted my capstone and delivered the final presentation
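
For the item above on Python classes and modular functions, one possible shape is sketched below. It is not the required capstone structure. The class CsvExtractor, the functions drop_incomplete and cast_amount, and the sample columns are all hypothetical.

```python
import csv
import io


class CsvExtractor:
    """Reads rows from any text file-like object as dictionaries."""

    def __init__(self, source):
        self.source = source

    def extract(self):
        return list(csv.DictReader(self.source))


def drop_incomplete(rows, required):
    """Keep only rows where every required field is non-empty."""
    return [row for row in rows if all(row.get(field) for field in required)]


def cast_amount(rows):
    """Return new rows with the 'amount' field converted to float."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


# In-memory sample standing in for a real source file.
raw = io.StringIO("order_id,amount\n1,9.50\n2,\n3,4.25\n")
rows = CsvExtractor(raw).extract()
clean = cast_amount(drop_incomplete(rows, required=["order_id", "amount"]))
print(clean)
```

Keeping extraction in a class and each transformation in its own small function makes every step testable in isolation, and the same functions can later be called from an Airflow task.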