ETL Pipeline for UK Carbon Intensity Data with Python and PostgreSQL

I recently built an ETL pipeline around UK regional carbon intensity data. It extracts 24 hours of regional readings from the Carbon Intensity API, transforms the nested JSON response into a structured tabular format, aggregates the 30-minute interval readings into daily regional summaries, and loads the output into PostgreSQL for analysis.

On the transformation side, the workflow flattens both the carbon intensity values and the per-fuel generation mix, then uses Pandas to produce daily region-level metrics.

On the database side, the output lands in PostgreSQL tables designed for reporting, with date-based partitioning on the fact tables: older data can be archived or dropped one partition at a time, and queries prune to just the date ranges they touch as the data grows.

The result is a query-ready pipeline that turns raw API responses into structured daily carbon intensity and generation mix data for downstream analysis and reporting. Below are a few minimal sketches of what each stage can look like.
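
First, extraction. This is a minimal sketch: the pt24h endpoint and the ISO-8601 timestamp format follow the public Carbon Intensity API documentation, but the function name and the bare-bones error handling are my own simplifications.

```python
import requests
from datetime import datetime, timezone

BASE_URL = "https://api.carbonintensity.org.uk"

def fetch_past_24h_regional() -> dict:
    """Fetch the past 24 hours of 30-minute regional readings."""
    # The API expects timestamps like 2025-01-15T12:00Z
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    resp = requests.get(f"{BASE_URL}/regional/intensity/{now}/pt24h", timeout=30)
    resp.raise_for_status()
    return resp.json()

payload = fetch_past_24h_regional()
```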
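
Flattening and aggregation can then look roughly like this, assuming the response layout documented for the regional endpoints (a list of 30-minute periods, each holding per-region intensity and generationmix entries). The column names are illustrative, not the pipeline's actual schema.

```python
import pandas as pd

# Flatten: one row per (settlement period, region), one column per fuel source
rows = []
for period in payload["data"]:            # payload from the extraction sketch above
    for region in period["regions"]:
        row = {
            "period_from": period["from"],
            "region_id": region["regionid"],
            "region_name": region["shortname"],
            "intensity_forecast": region["intensity"]["forecast"],
        }
        for fuel in region["generationmix"]:
            row[f"mix_{fuel['fuel']}"] = fuel["perc"]
        rows.append(row)

df = pd.DataFrame(rows)
df["date"] = pd.to_datetime(df["period_from"]).dt.date

# Aggregate the 48 half-hourly readings into one daily summary per region
mix_cols = [c for c in df.columns if c.startswith("mix_")]
daily = (
    df.groupby(["date", "region_id", "region_name"], as_index=False)
      .agg(avg_intensity=("intensity_forecast", "mean"),
           **{c: (c, "mean") for c in mix_cols})
)
```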
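
And a sketch of the load step, with the connection string pulled from a YAML config. The config layout, table name, column set, and yearly partition bounds here are all hypothetical; the real schema carries one column per fuel source, and partitions would be created to cover whatever dates are being loaded.

```python
import yaml
from sqlalchemy import create_engine, text

# Hypothetical config.yaml:
#   database:
#     url: postgresql+psycopg2://user:pass@localhost:5432/carbon
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)
engine = create_engine(cfg["database"]["url"])

DDL = """
CREATE TABLE IF NOT EXISTS daily_regional_intensity (
    date          date    NOT NULL,
    region_id     integer NOT NULL,
    region_name   text    NOT NULL,
    avg_intensity numeric,
    mix_gas       numeric,
    mix_wind      numeric
    -- remaining fuel-mix columns trimmed for brevity
) PARTITION BY RANGE (date);

CREATE TABLE IF NOT EXISTS daily_regional_intensity_2025
    PARTITION OF daily_regional_intensity
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
"""

with engine.begin() as conn:
    conn.execute(text(DDL))

# 'daily' comes from the aggregation sketch above; insert only the
# columns this trimmed-down table defines
cols = ["date", "region_id", "region_name", "avg_intensity", "mix_gas", "mix_wind"]
daily[cols].to_sql("daily_regional_intensity", engine, if_exists="append", index=False)
```

With the parent table declared PARTITION BY RANGE (date), PostgreSQL routes each inserted row to the matching partition, and dropping a partition retires a whole date range without a slow DELETE.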

Tech used: Python, Pandas, PostgreSQL, SQLAlchemy, YAML
#DataEngineering #ETL #Python #PostgreSQL #SQL #DataPipeline #DatabaseDesign #AnalyticsEngineering
