The Hidden Architecture Behind Cube Multi-Datasource Setups (And the S3 Credential Trap)
When working with Cube, configuring multiple databases looks straightforward.
Until it isn’t.
Recently, while integrating DuckDB with Amazon S3 inside a multi-datasource Cube setup, I ran into a subtle configuration issue that perfectly illustrates how Cube handles datasource modes internally.
What looked like an S3 permissions problem turned out to be something much deeper: Cube has two mutually exclusive datasource configuration modes.
And understanding that distinction changes everything.
🧠 Cube Has Two Configuration Modes
1️⃣ Single (Default) Datasource Mode
If you configure Cube like this:
CUBEJS_DB_TYPE=duckdb
You are in single-datasource mode.
In this setup, every connection setting lives in the single, unscoped `CUBEJS_DB_*` namespace, and all of your cubes query that one database.
Architecturally, it looks like this:
Cube
└── Database (only one)
Simple. Clean. Predictable.
But there’s a constraint:
You cannot add another database unless you switch modes.
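In practice, a single-mode DuckDB-plus-S3 setup stays in that one flat namespace. A sketch with placeholder values (any variable names beyond the ones shown in this post should be checked against the driver's reference):

```shell
# Single-datasource mode: everything lives under CUBEJS_DB_*.
CUBEJS_DB_TYPE=duckdb

# S3 credentials for DuckDB's httpfs extension (placeholders).
CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=AKIA...
CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=...
CUBEJS_DB_DUCKDB_S3_REGION=us-east-1
```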
2️⃣ Multi-Datasource Mode
The moment you define:
CUBEJS_DATASOURCES=default,Redshift,DuckDB
Cube switches into multi-datasource mode.
Now every database must be declared explicitly:
CUBEJS_DS_DUCKDB_DB_TYPE=duckdb
CUBEJS_DS_REDSHIFT_DB_TYPE=redshift
Architecturally:
Cube
├── default
├── Redshift
└── DuckDB
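A second consequence of this mode: each cube in the data model has to say which datasource it belongs to, via the `data_source` parameter. A sketch, assuming the datasource names declared above and made-up cube and table names:

```yaml
cubes:
  - name: orders
    # Hypothetical DuckDB-backed cube reading Parquet from S3.
    sql: SELECT * FROM read_parquet('s3://my-bucket/orders/*.parquet')
    data_source: DuckDB

  - name: customers
    # Hypothetical Redshift-backed cube.
    sql_table: public.customers
    data_source: Redshift
```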
Important implications:
- Every datasource must be listed in `CUBEJS_DATASOURCES`, and a `default` datasource must always exist.
- Every connection setting must be scoped with the `CUBEJS_DS_<NAME>_` prefix; unscoped `CUBEJS_DB_*` variables no longer apply.
- Each cube in your data model has to declare which datasource it reads from.
And this is where things can get tricky.
🧩 The S3 Credential Trap
In single mode, DuckDB S3 credentials work like this:
CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=...
CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=...
But in multi-datasource mode, they must be scoped:
CUBEJS_DS_DUCKDB_DB_S3_ACCESS_KEY_ID=...
CUBEJS_DS_DUCKDB_DB_S3_SECRET_ACCESS_KEY=...
Miss that subtle difference — or rely on documentation that assumes single mode — and you’ll see S3 failures that look like:
- “Access Denied”
- “Missing credentials”
- “Unable to load httpfs extension”
But the real issue isn’t S3.
It’s configuration scoping.
🔍 The Breakthrough Insight
What finally resolved the issue?
Setting:
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
Instead of relying solely on Cube-scoped variables.
Why did this work?
Because DuckDB ultimately follows the standard AWS credential resolution chain. By setting global AWS environment variables, the driver bypassed Cube’s datasource mapping entirely.
(P.S.: Remember to include an `environment` section in your Docker Compose service definition, or the container will never see these variables.)
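For that Docker Compose note, a minimal sketch, assuming a service named `cube` and credentials passed through from the host shell (the service name, image tag, and port are illustrative):

```yaml
services:
  cube:
    image: cubejs/cube:latest
    ports:
      - "4000:4000"
    environment:
      - CUBEJS_DATASOURCES=default,Redshift,DuckDB
      - CUBEJS_DS_DUCKDB_DB_TYPE=duckdb
      # Global AWS variables: DuckDB's standard credential
      # resolution chain picks these up directly.
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
```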
That confirmed the real lesson:
The problem wasn’t S3 permissions. It was how Cube injects driver configuration in multi-datasource mode.
⚠️ The Rule That Isn’t Obvious in the Docs
These two configurations are mutually exclusive:
CUBEJS_DB_TYPE=...
and
CUBEJS_DATASOURCES=...
You must choose one mode.
They do not mix.
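Concretely, these are the two valid shapes (values are placeholders; in multi mode the `default` datasource needs its own scoped settings too):

```shell
# Mode 1: single datasource. Do NOT set CUBEJS_DATASOURCES.
CUBEJS_DB_TYPE=duckdb

# Mode 2: multiple datasources. Do NOT set CUBEJS_DB_TYPE;
# declare every database and scope every setting instead.
CUBEJS_DATASOURCES=default,DuckDB
CUBEJS_DS_DEFAULT_DB_TYPE=...
CUBEJS_DS_DUCKDB_DB_TYPE=duckdb
```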
🏗️ Architectural Takeaways
If you’re running multiple databases behind one Cube deployment, say a warehouse like Redshift alongside an embedded engine like DuckDB, then multi-datasource mode is the correct architectural decision.
But you must declare every datasource explicitly, scope every driver variable with the `CUBEJS_DS_<NAME>_` prefix, and drop the single-mode `CUBEJS_DB_TYPE` entirely.
💡 What This Really Taught Me
Most production debugging issues are not about permissions, drivers, or the services themselves.
They’re about configuration mode mismatches.
Understanding how your framework internally switches operational modes is often the difference between hours of misdirected debugging and a one-line fix.
Final Thought
Modern analytics stacks are increasingly hybrid: cloud warehouses, embedded engines, and object storage composed behind a single semantic layer.
The more composable your architecture becomes, the more important it is to understand how configuration boundaries work.
Because sometimes, the bug isn’t in your query.
It’s in your mode.
If you're working with Cube in a multi-datasource environment, I’d love to compare notes — especially around embedded engines + object storage patterns.