Lakehouse

A real lakehouse workflow—built for teams, not notebooks

Query datasets with DuckDB, transform them safely, materialize reproducible outputs, and profile quality in one place. Everything is tenant-isolated, version-aware, and designed to behave like production.

Want to see it live? Start in Studio, or read the Docs.
OpenAI · Kubernetes · PostgreSQL · DuckDB · Parquet
XALORRA LAKEHOUSE
DuckDB + Parquet
SQL, transform, materialize, and profile—tenant-safe, versioned, and production-shaped.
SQL Query · Transform · Materialize · Profile · Versioned Parquet · RLS / Tenant

Stop “data prep” whiplash.
Build clean, versioned datasets you can trust.

Query Parquet with DuckDB for fast iteration.
Transform and materialize outputs with reproducible versions.
Profile quality so downstream ML/RAG is predictable.

Designed for multi-tenant lakehouse workflows

SQL, transforms, and materialization—without breaking governance.

Tenant-safe by default
Keep data isolated across orgs with consistent access patterns and predictable contracts.
Versioned outputs
Every materialized dataset becomes a stable artifact you can build ML/RAG on top of.
Production-shaped behavior
The workflows you test in Studio match how your backend runs—no fragile notebooks.
Lakehouse Core
DuckDB execution over versioned Parquet. Transform pipelines that materialize clean datasets. Profiling that turns messy inputs into predictable downstream behavior.
DuckDB · Parquet · Transform · Materialize · Profile
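To make the profiling step concrete, here is a minimal sketch of a per-column profile in plain Python. It is an illustration only: the function names, the chosen metrics (row count, nulls, distinct values, min, max), and the sample rows are assumptions, not the product's actual profiler or its output format.

```python
from typing import Any


def profile_column(values: list[Any]) -> dict[str, Any]:
    """Summarize one column: row count, nulls, distinct values, and range."""
    non_null = [v for v in values if v is not None]
    summary: dict[str, Any] = {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }
    # min/max are only defined when at least one non-null value exists
    if non_null:
        summary["min"] = min(non_null)
        summary["max"] = max(non_null)
    return summary


def profile_table(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Profile every column that appears in a list of row dictionaries."""
    columns = sorted({name for row in rows for name in row})
    return {c: profile_column([row.get(c) for row in rows]) for c in columns}


# Illustrative rows only (hypothetical data, not a real dataset).
sample_rows = [
    {"age": 34.5, "sex": "male"},
    {"age": None, "sex": "female"},
    {"age": 26.0, "sex": "male"},
]
report = profile_table(sample_rows)
```

A report like this makes missing values visible before a model or retrieval index is built on the data; a real profiler would likely add type checks and distribution statistics.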
Query like an analyst

Teams that query datasets

Run SQL on tenant-scoped Parquet with predictable performance.
Stay safe: guardrails and limits keep Studio responsive.
Build confidence with stable response contracts for apps.
Prefer to explore first? Open Lakehouse.
Preview (engine: duckdb, format: parquet_simple, 48 ms)

SQL Editor

    SELECT * FROM default.titanic_debug_1000_clean__v2
    LIMIT 100;

Table preview

    survived  pclass  sex     age   fare
    0         3       male    34.5  7.8
    1         3       female  22    7.2
    0         2       male    26    29.0
    1         1       female  38    71.3
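One way a row-limit guardrail of the kind mentioned above could work is to wrap every interactive query in an outer LIMIT. The sketch below is an assumption about the approach, not the product's implementation; `with_row_limit` and its default of 1000 rows are hypothetical.

```python
def with_row_limit(sql: str, max_rows: int = 1000) -> str:
    """Wrap a single SELECT so it cannot return more than max_rows rows."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # Drop surrounding whitespace and any trailing semicolons
    inner = sql.strip().rstrip(";").strip()
    if not inner:
        raise ValueError("query is empty")
    # The outer LIMIT caps the result even if the inner query
    # carries its own, larger LIMIT.
    return f"SELECT * FROM ({inner}) AS limited LIMIT {max_rows}"


capped = with_row_limit(
    "SELECT * FROM default.titanic_debug_1000_clean__v2 LIMIT 100;",
    max_rows=50,
)
```

This is plain string wrapping: it assumes a single SELECT statement and does not parse or validate SQL, so a production guardrail would need statement validation as well.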
Transform like an engineer

Teams that transform data

Turn raw uploads into clean tables with deterministic transforms.
Keep changes reproducible—no notebook drift.
Route writes through controlled pipelines for governance.
Transform (task: LakehouseTransformTask, status: ok, engine: duckdb, 178 ms)

Transform SQL (alias: src)

    CASE
      WHEN COALESCE(CAST(age AS DOUBLE), 29.699) < 12 THEN 'child'
      WHEN COALESCE(CAST(age AS DOUBLE), 29.699) < 18 THEN 'teen'
      WHEN COALESCE(CAST(age AS DOUBLE), 29.699) < 35 THEN 'adult'
      ELSE 'mid'
    END AS age_bucket,
    LN(1 + COALESCE(CAST(fare AS DOUBLE), 0.0)) AS fare_log,

Status: success
Target: default.titanic_debug_1000_clean @v3
Rows written: 418
Duration: 178 ms
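The transform SQL in the panel above is deterministic: a missing age falls back to 29.699, a missing fare falls back to 0.0, and the same input always yields the same bucket. The Python sketch below mirrors those two expressions so individual rows can be checked by hand. The function names are hypothetical, and this is a restatement of the SQL logic rather than code the product runs.

```python
import math
from typing import Optional, Union

Number = Union[int, float, str]

DEFAULT_AGE = 29.699  # fallback used by COALESCE in the transform SQL


def age_bucket(age: Optional[Number]) -> str:
    """Mirror the CASE expression that produces age_bucket."""
    # float() stands in for CAST(age AS DOUBLE); None stands in for NULL
    value = DEFAULT_AGE if age is None else float(age)
    if value < 12:
        return "child"
    if value < 18:
        return "teen"
    if value < 35:
        return "adult"
    return "mid"


def fare_log(fare: Optional[Number]) -> float:
    """Mirror LN(1 + COALESCE(CAST(fare AS DOUBLE), 0.0))."""
    value = 0.0 if fare is None else float(fare)
    return math.log(1 + value)  # math.log is the natural logarithm
```

Note that a missing age lands in the 'adult' bucket, because the fallback 29.699 is below 35.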
Materialize what matters

Teams that materialize outputs

Create versioned datasets you can train ML or build RAG on top of.
Track what changed, when, and why across versions.
Keep the lineage clear so teams can ship with confidence.
Materialize (task: LakehouseMaterializeTask, status: success, engine: duckdb, 189 ms)

SELECT SQL (use aliases: s1, s2, ...)

    SELECT
      pclass,
      sex,
      COUNT(*)::BIGINT AS n,
      ROUND(AVG(CASE WHEN survived=1 THEN 1.0 ELSE 0.0 END), 6) AS survived_rate,
      ROUND(AVG(age), 6) AS age_avg,
      ROUND(AVG(fare), 6) AS fare_avg
    FROM s1
    GROUP BY 1, 2
    ORDER BY pclass, sex;

Status: success
Target: default.titanic_debug_1000_clean_stats @v1
Sources: 1 default.titanic_debug_1000_clean @v2
Duration: 189 ms
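The target and source shown in the panel above follow a `name @vN` pattern, which is enough to sketch how a versioned reference and a lineage record might be modeled. Everything below is an assumption for illustration: `DatasetRef`, `LineageRecord`, and the parsing rule are hypothetical and not the product's API.

```python
import re
from dataclasses import dataclass

# Assumed reference shape: "<schema.table> @v<integer>"
_REF = re.compile(r"^(?P<name>[A-Za-z_][\w.]*)\s*@v(?P<version>\d+)$")


@dataclass(frozen=True)
class DatasetRef:
    name: str
    version: int

    @classmethod
    def parse(cls, text: str) -> "DatasetRef":
        match = _REF.match(text.strip())
        if match is None:
            raise ValueError(f"not a versioned dataset reference: {text!r}")
        return cls(match["name"], int(match["version"]))

    def next_version(self) -> "DatasetRef":
        # Versions are immutable here; a new write yields a new reference
        return DatasetRef(self.name, self.version + 1)

    def __str__(self) -> str:
        return f"{self.name} @v{self.version}"


@dataclass(frozen=True)
class LineageRecord:
    target: DatasetRef
    sources: tuple[DatasetRef, ...]


record = LineageRecord(
    target=DatasetRef.parse("default.titanic_debug_1000_clean_stats @v1"),
    sources=(DatasetRef.parse("default.titanic_debug_1000_clean @v2"),),
)
```

Keeping references immutable means a downstream job that pins `@v2` keeps reading the same data after `@v3` is written.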

Lakehouse you can operate

Make datasets predictable—before ML and RAG depend on them.

If you care about tenant isolation, stable contracts, and versioned outputs, you want a workflow that behaves like production from day one.