Iceberg Managed Tables in Databricks Unity Catalog

When we think about a data lakehouse, the first thing that comes to mind is where data is stored: files in the cloud, partitions, formats like Parquet or Delta. But in reality, the key question isn’t just where data is stored, but how it’s managed and governed throughout its lifecycle.

This is where Managed Tables come in, and more recently, support for Iceberg Managed Tables in Databricks.

What are Iceberg Managed Tables?

An Iceberg Managed Table is an implementation of the Apache Iceberg format in which Databricks Unity Catalog assumes complete responsibility for the table's operational maintenance. With traditional Iceberg, teams must manually execute operations like OPTIMIZE to compact small files and VACUUM to remove obsolete files, and must manage catalog metadata themselves; Managed Tables automate these processes in the background. Developers get all the technical advantages of Iceberg (schema evolution, time travel, interoperability with multiple engines like Trino or Spark) without the operational burden of maintaining the format: a balance between the openness of the standard and the simplicity of a managed solution.
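For context, these are the kinds of maintenance jobs that teams typically have to schedule themselves with self-managed tables, and that Unity Catalog runs automatically for Managed Tables (a sketch in Databricks SQL; the table name is illustrative and exact commands vary by engine):

-- Without managed tables, jobs like these must be scheduled manually:
OPTIMIZE my_catalog.my_schema.events;   -- compact small files
VACUUM my_catalog.my_schema.events;     -- remove obsolete data files
ANALYZE TABLE my_catalog.my_schema.events COMPUTE STATISTICS;  -- refresh statistics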

What are they used for?

Iceberg Managed Tables solve a common problem in modern data architectures: the tension between technological openness and operational simplicity.

Traditional problem:

  • If you use “pure” Apache Iceberg, you have maximum flexibility but must manually manage optimizations, cleanup, permissions, and metadata
  • If you use proprietary formats like Delta Lake, you get automatic management but are limited to the vendor’s ecosystem

Solution:

Iceberg Managed Tables give you the best of both worlds: an open standard format with completely automatic management.

In what contexts are they used?

Multi-Cloud Architectures

Data accessible from Databricks, Trino, Snowflake, and BI tools without duplication or conversions.

Data Warehouse Migration

Migrate from Teradata, Oracle, or SQL Server while maintaining compatibility with existing tools.

Technologically Diverse Teams

Engineering uses Spark, analytics uses Trino, ML uses DuckDB - everyone accesses the same data.

Regulated Sectors

Finance and healthcare with complete auditing, automated retention policies, and granular control.

Practical Use Cases

Case 1: Multi-Cloud E-commerce with Automatic Management

-- User events accessible from multiple engines without manual configuration
CREATE TABLE ecommerce.events.user_interactions (
  user_id BIGINT NOT NULL,
  event_type STRING,
  product_id STRING,
  timestamp TIMESTAMP,
  session_id STRING
) USING ICEBERG
PARTITIONED BY (days(timestamp));

Managed Benefit: Unity Catalog automatically executes OPTIMIZE and VACUUM, the ML team accesses the table via DuckDB without separate permission setup, BI tools read it directly from Snowflake/Trino, and there are no manual maintenance tasks.

Case 2: Fintech with Automatic Regulatory Compliance

-- Transactions with automatic governance and cross-platform access
CREATE TABLE finance.transactions.payments (
  transaction_id STRING NOT NULL,
  amount DECIMAL(15,2),
  currency STRING,
  created_at TIMESTAMP,
  customer_id STRING
) USING ICEBERG
PARTITIONED BY (months(created_at));

Managed Benefit: Unity Catalog automates 7-year retention policies, auditors access from external tools (Tableau, Power BI), risk teams use Trino/Presto without data duplication, and regulatory compliance is automatic without manual intervention.

Case 3: Industrial IoT with Diverse Technology Ecosystem

-- Industrial sensors accessible from any tool without vendor lock-in
CREATE TABLE iot.sensors.temperature_readings (
  sensor_id STRING NOT NULL,
  temperature DOUBLE,
  location STRING,
  reading_time TIMESTAMP,
  device_type STRING
) USING ICEBERG
PARTITIONED BY (device_type, hours(reading_time));

Managed Benefit: Engineering uses Spark in Databricks, DevOps monitors with Grafana connected to Trino, Data Science prototypes in DuckDB, and plant teams access via Tableau. Unity Catalog eliminates data silos and automates optimization without each team managing different formats.

Case 4: Enterprise Migration from Traditional Data Warehouse

-- Migration from Teradata/Oracle to lakehouse maintaining enterprise governance
CREATE TABLE enterprise.sales.transactions (
  transaction_id BIGINT NOT NULL,
  customer_id BIGINT,
  product_sku STRING,
  sale_amount DECIMAL(18,2),
  sale_date DATE,
  region_code STRING
) USING ICEBERG
PARTITIONED BY (region_code, months(sale_date));

Managed Benefit: Eliminates Teradata/Oracle vendor lock-in, existing reports in MicroStrategy/Cognos continue working via Trino connectors, Data Science teams access the data with Python/R without additional ETL, and Unity Catalog centralizes the security policies that previously lived in the data warehouse, without per-tool re-configuration.

Key Benefits

| Benefit | Description | Practical Example |
| --- | --- | --- |
| Interoperability | Multiple engines can read the same data | Spark processes, Trino analyzes, DuckDB prototypes |
| Automatic Management | Unity Catalog handles optimizations and cleanup | No more manual OPTIMIZE and VACUUM jobs |
| Schema Evolution | Add/modify columns without rewriting data | Add a new column to billions of records in seconds |
| Time Travel | Query historical versions of data | SELECT * FROM table TIMESTAMP AS OF '2024-01-01 10:00:00' |
| Unified Governance | Centralized permissions and auditing | Granular access control from Unity Catalog |
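Schema evolution and time travel can be sketched in SQL against the e-commerce table from Case 1 (the new column name is illustrative; a snapshot version number with VERSION AS OF also works):

-- Add a column without rewriting existing data files
ALTER TABLE ecommerce.events.user_interactions
  ADD COLUMN referrer STRING;

-- Query the table as it was at a point in time
SELECT *
FROM ecommerce.events.user_interactions
TIMESTAMP AS OF '2024-01-01 10:00:00';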

Tip: If your organization values technological openness, uses multiple analytics tools, or plans to migrate between platforms in the future, Iceberg Managed Tables give you flexibility without complexity.

The context: from scattered files to managed tables

For years, data teams have worked with scattered files in a data lake. That works, but generates problems: who compacts the small files? How do we ensure consistent permissions? What happens when we want to delete an entire table without leaving “garbage” in storage?

Databricks managed tables solve precisely that:

Unified Control

Storage and metadata are under Unity Catalog control

Safe Deletion

When deleting the table, data is safely removed

Automatic Optimization

Optimizations and automatic maintenance run in the background

The data team stops worrying about operational tasks and can focus on analysis and the final product.

Advantages of Apache Iceberg format

Apache Iceberg is an open table format designed for modern data lakes. Its great advantage is that it allows working with petabyte-scale data, supports schema evolution, and is interoperable with multiple engines (Trino, Spark, Flink, DuckDB, among others).

The problem was that, until now, those using Iceberg had to take care of the uncomfortable part themselves: maintaining metadata, optimizing files, and running vacuum jobs.

With Iceberg Managed Tables in Databricks, that work becomes automatic. Unity Catalog takes care of:

  • Executing analyze, optimize, and vacuum periodically
  • Managing centralized permissions
  • Allowing external access with secure temporary credentials

Thus, teams combine the best of both worlds: the openness and flexibility of Iceberg, with the simplicity of managed administration.
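As a sketch of that interoperability, an external engine with an Iceberg connector pointed at the catalog can read the same table with plain SQL. The catalog, schema, and table names below are illustrative, and the connector configuration is engine-specific:

-- Example: reading the same managed table from Trino
SELECT event_type, count(*) AS events
FROM iceberg_catalog.events.user_interactions
GROUP BY event_type;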

Practical implementation

Creating a managed table with Iceberg is no more complex than writing:

CREATE TABLE my_catalog.my_schema.my_iceberg_table (
  id BIGINT NOT NULL,
  name STRING,
  created_at TIMESTAMP
) USING ICEBERG
TBLPROPERTIES (
  'write.target-file-size-bytes'='134217728'  -- target data file size: 128 MB
);

And if one day you no longer need it:

DROP TABLE IF EXISTS my_catalog.my_schema.my_iceberg_table;

Info: Unity Catalog keeps the dropped data in quarantine for 7 days (in case you change your mind) before deleting it permanently.
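Within that window, a dropped managed table can be recovered; Databricks exposes an UNDROP command for this (a sketch, reusing the illustrative table name from above):

-- Recover a managed table dropped within the retention window
UNDROP TABLE my_catalog.my_schema.my_iceberg_table;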

Business value and operational benefits

| Benefit | Description |
| --- | --- |
| Fewer hidden costs | No more team hours solving compaction or cleanup problems |
| Better performance | Optimized tables and metadata always ready for fast queries |
| Centralized governance | Permissions, security, and auditing under one system |
| Flexibility | You can use the table from Spark in Databricks or from external engines like Trino |

In other words: Iceberg Managed Tables simplify the data engineer’s life and ensure the organization maintains control without losing technological openness.

Conclusion

Iceberg Managed Tables in Databricks Unity Catalog represent the natural evolution of the lakehouse: an ecosystem where technological openness and operational simplicity converge to drive business innovation.

The value proposition is clear:

For teams using Iceberg

Migrate to a managed model without losing the advantages of the open format

For teams using Delta

Explore an open standard format while maintaining all Databricks guarantees

The result: You no longer need to choose between control and flexibility. Iceberg Managed Tables offer you both, freeing your team to focus on generating value through data, not maintaining infrastructure.
