Decoupling the Reporting Monster 📊
Decoupling the Reporting Monster 📊
Welcome back to the Migration Diaries! If you’ve ever maintained a legacy monolith, you know that eventually, you have to face the final boss: The Reporting Engine.
In our latest sprint for the Cobone project, we tackled the beast that was our legacy reporting system. We didn't just migrate it; we completely reimagined it using a modern Semantic Layer and a dynamic API Gateway.
Here is how we turned multi-minute query timeouts into secure, sub-50-millisecond analytical dashboards.
1. The "Before" Picture: The Read-Replica Bottleneck ⏳
In our old Grails monolith, we did what most engineering teams do to protect production: we routed our heavy reporting queries to a dedicated read-replica database.
The database was fine, but the application layer was buckling. If an admin ran a massive, year-long revenue report, crunching that raw data locked up the Grails reporting application. The app would become unresponsive, leading to timeouts and a frustrating experience for anyone else trying to pull data.
To make matters worse, core business definitions (like "Net Sales Revenue") were scattered across complex, multi-join raw SQL strings. If the formula changed, we had to hunt down every query.
2. The Paradigm Shift: Introducing the "Semantic Layer" 🏗️
We realized we needed to decouple reporting entirely. Enter Cube.js.
Instead of writing raw SQL in our application code, we built a strict 3-Layer Semantic Architecture:
// 1. BASE LAYER: Raw Table Mapping (Strictly Hidden)
cube(`Purchases`, {
sql: `SELECT * FROM purchase`,
shown: false,
dimensions: { id: { sql: `id`, type: `number`, primaryKey: true } }
});
// 2. LOGIC LAYER: Business Rules & Math (Strictly Hidden)
cube(`PurchaseRevenue`, {
extends: Purchases,
shown: false,
measures: { totalNSR: { sql: `IF(status='PAID', amount, 0)`, type: `sum` } }
});
// 3. VIEW LAYER: Clean API for the Frontend (Exposed)
view(`PurchaseRevenueReport`, {
description: 'Clean API for Frontend Consumption',
includes: ['PurchaseRevenue.totalNSR']
});
The result? Our frontend engineers no longer write SQL. They simply request a clean JSON object, and the Semantic Layer translates it.
3. Speed vs. Precision: The Pre-aggregation Strategy 🎯
Financial reporting requires 100% precision. You cannot estimate revenue, and you fall into the countDistinct trap if you try to mathematically sum up "daily unique orders" to get "monthly unique orders" (since a user might buy on both days).
Instead of calculating on the fly, we utilized CubeStore to build massive, disk-based pre-aggregations (Rollups).
Recommended by LinkedIn
Solving the "Data Consistency Trap": To guarantee that a Daily chart and a Yearly KPI card always show the exact same foundational truth, we set a uniform 1-hour refresh frequency across all rollups.
But wouldn't that crush the database? No. We used incremental partitions:
preAggregations: {
monthlyRollup: {
type: `rollup`,
measures: [PurchaseRevenue.totalNSR],
timeDimension: PurchaseRevenue.paidDate,
granularity: `month`,
// 🚀 The Magic: Only rebuilds the current active year!
partitionGranularity: `year`,
// ⏱️ Guarantees all charts tie out perfectly
refreshKey: { every: `1 hour` }
}
}
Because of partitionGranularity, Cube intelligently ignores historical data and only rebuilds the current partition. The DB load remains tiny, but the numbers are always perfectly tied out.
4. The Innovation: Securing Headless BI with Dynamic PBAC 🛡️
This was our proudest architectural win. Cube is an incredible data engine, but out-of-the-box, it doesn’t know about our granular, enterprise-grade Permissions-Based Access Control (PBAC/CASL).
We placed it securely behind our NestJS API Gateway and built a custom CubeReportGuard:
@Injectable()
export class CubeReportGuard implements CanActivate {
async canActivate(context: ExecutionContext): Promise<boolean> {
const { body, user } = context.switchToHttp().getRequest();
const ability = await this.caslAbilityFactory.createForUser(user.id);
// 1. Parse the AST to find requested Views (e.g., 'PurchaseRevenueReport')
const requestedViews = this.extractViewsFromQuery(body?.query);
// 2. Dynamically check CASL database permissions
for (const viewName of requestedViews) {
const reqPerm = CUBE_PERMISSION_MAP[viewName];
if (!ability.can(reqPerm.action, reqPerm.subject)) {
throw new ForbiddenException(`Access Denied to ${viewName}`);
}
}
return true; // 🟢 Proxy to Cube.js!
}
}
We successfully bridged a headless SQL engine with strict enterprise access control. If a user lacks explicit read permission for the PurchaseRevenueReport, our Gateway rejects the request before Cube even knows it exists.
5. The Outcome: The Green Lightning Bolt ⚡
We transitioned from an unresponsive Grails app plagued by multi-minute load times to a fully decoupled architecture.
Today, our mathematically exact, fully secure, year-long financial aggregations return in under 50 milliseconds. The frontend is snappy, the read-replica is breathing easily, and our security matrix remains impenetrable.
Have you ever had to untangle a monolithic reporting system? I’d love to hear how your team approached pre-aggregations and BI security in the comments below! 👇
#SoftwareEngineering #Microservices #NestJS #CubeJS #DataEngineering #Architecture #MigrationDiaries #SystemDesign #Typescript