Stratpoint Engineering

Data Engineering Internship 2026

Chapter 12

Glossary

| Term | Definition |
| --- | --- |
| OLTP | Online Transaction Processing. Databases optimised for fast row-level reads and writes, such as your app's production database. |
| OLAP | Online Analytical Processing. Systems optimised for complex aggregations across large datasets, such as data warehouses. |
| ETL | Extract, Transform, Load. Data is transformed before it is loaded into the target system. |
| ELT | Extract, Load, Transform. Data is loaded raw first, then transformed inside the target system using SQL (the dbt approach). |
| Star Schema | A data warehouse schema with a central fact table surrounded by denormalised dimension tables. |
| Snowflake Schema | A star schema where dimension tables are further normalised into sub-dimension tables. |
| Fact Table | Stores measurable, quantitative data (ratings, counts, amounts). Contains foreign keys to dimension tables. |
| Dimension Table | Stores descriptive context (movie titles, genres, user details, dates). |
| ERD | Entity-Relationship Diagram. A visual map of database tables and their relationships. |
| dbt | Data Build Tool. Runs SQL transformations and manages the dependency graph between models. |
| dbt model | A SQL SELECT statement saved as a .sql file. dbt compiles it and builds it as a table or view. |
| ref() | dbt function that references another dbt model. Builds the dependency graph automatically. |
| source() | dbt function that references a raw source table. Enables source freshness checks. |
| DAG | Directed Acyclic Graph. In Airflow, a DAG defines the tasks in a pipeline and their dependencies. |
| Operator | An Airflow class that defines what a task does. Examples: BashOperator, PythonOperator. |
| DataFrame | A 2D tabular data structure in pandas or PySpark. Like a spreadsheet in code. |
| Schema (dbt) | A dbt YAML file (schema.yml) that defines tests and documentation for models. |
| DAX | Data Analysis Expressions. The formula language used in Power BI for measures and calculated columns. |
| Measure (Power BI) | A dynamic calculation evaluated at query time based on filter context. Always use measures for aggregations. |
| Star Schema (Power BI) | The recommended data model layout in Power BI: fact table in the centre, dimension tables surrounding it. |
| Window function | SQL function that operates across a set of rows related to the current row. Examples: RANK, LAG, LEAD, SUM() OVER. |
| CTE | Common Table Expression. A named subquery defined using WITH. Makes SQL modular and readable. |
| Cardinality | The uniqueness of data values in a column, or the type of relationship between tables (1:1, 1:N, N:M). |
| Data Lineage | The path data takes from source to final model. Visible in the dbt docs lineage graph. |
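
Several of the SQL terms above (fact table, dimension table, CTE, window function) are easier to see together in one small query. The sketch below is illustrative only: it uses Python's built-in sqlite3 module with an in-memory database, and the table names (dim_movie, fact_rating), column names, and rows are hypothetical. It assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical in-memory star schema: one fact table and one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_movie (
        movie_id INTEGER PRIMARY KEY,
        title    TEXT,
        genre    TEXT
    );
    CREATE TABLE fact_rating (
        rating_id INTEGER PRIMARY KEY,
        movie_id  INTEGER REFERENCES dim_movie(movie_id),  -- foreign key to the dimension
        rating    REAL                                     -- the measurable value
    );
    INSERT INTO dim_movie VALUES
        (1, 'Alpha', 'Drama'),
        (2, 'Bravo', 'Drama'),
        (3, 'Charlie', 'Comedy');
    INSERT INTO fact_rating (movie_id, rating) VALUES
        (1, 4.0), (1, 5.0),
        (2, 3.0),
        (3, 2.0), (3, 4.0);
""")

query = """
-- CTE: a named subquery that joins the fact table to its dimension and aggregates.
WITH movie_avg AS (
    SELECT d.title, d.genre, AVG(f.rating) AS avg_rating
    FROM fact_rating AS f
    JOIN dim_movie AS d ON d.movie_id = f.movie_id
    GROUP BY d.title, d.genre
)
-- Window function: rank each movie within its genre without collapsing rows.
SELECT title, genre, avg_rating,
       RANK() OVER (PARTITION BY genre ORDER BY avg_rating DESC) AS genre_rank
FROM movie_avg
ORDER BY genre, genre_rank;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CTE keeps the join and aggregation separate from the ranking step, which is the same modular style a dbt model would typically use.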
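
The DAG, ref(), and Data Lineage entries all rest on the same idea: a dependency graph with no cycles, from which a valid run order can be derived. The sketch below illustrates that idea with Python's standard-library graphlib module (Python 3.9+). It is not the dbt or Airflow API; the node names and the upstream helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key depends on the nodes in its set.
dependencies = {
    "stg_movies": {"raw_movies"},
    "stg_ratings": {"raw_ratings"},
    "fct_ratings": {"stg_movies", "stg_ratings"},
    "rating_report": {"fct_ratings"},
}

# A topological sort yields an order in which every node runs after its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)


def upstream(node, graph):
    """Return every node that `node` depends on, directly or indirectly (its lineage)."""
    seen = set()
    stack = [node]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


lineage = sorted(upstream("fct_ratings", dependencies))
print(lineage)
```

If the graph contained a cycle, static_order() would raise graphlib.CycleError, which is why the "acyclic" part of the definition matters: without it there is no valid run order.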