Lo-Fi Python | Characterizing Database Workloads & Storage Models

Characterizing Database Workloads & Storage Models

Thank you Carnegie Mellon Database Group for putting this online! These are my notes from watching on YouTube.

Carnegie Mellon Databases Storage II, Lecture 4

Prof. Andy Pavlo [Watch on YouTube]

The Problem and Solution

How should the DBMS represent the database in storage files on disk? Solve it by choosing the right storage model for your target workload. The right strategy varies if you are reading data, writing data and with how many joins you are performing.

Workload Characterization

OLTP (Online Transaction Processing): "Simple queries with lots of writes."

OLAP (Online Analytical Processing): "Read only queries. Lots of joins. Doing a lot of reads, but they're more complex."

HTAP (Hybrid Transactional Analytical Processing): "is trying to do both of them. You still want to ingest new data, but analyze it as it comes in. It's used for companies making decisions on the fly as people are browsing websites, like internet advertising companies."