Saturday, March 16, 2024

data tools: Delta Lake, Lakehouse, Parquet (free book)

Home | Delta Lake

Delta Lake is an open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for Scala, Java, Rust, and Python.




What is Apache Parquet?
Apache Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval. It provides efficient compression and encoding schemes that improve performance when handling complex data in bulk.
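
To make "column-oriented" and "encoding schemes" a bit more concrete, here is a toy sketch in plain Python. It is not the Parquet file format or any Parquet library API; the sample records and the helper names (to_columns, dictionary_encode, run_length_encode) are made up for illustration. It only shows why storing values column by column tends to compress well: repeated values sit next to each other, so dictionary and run-length encoding (two of the encodings Parquet supports) can shrink them.

```python
from itertools import groupby

# Hypothetical sample records, laid out row by row.
rows = [
    {"country": "DE", "status": "ok", "amount": 10},
    {"country": "DE", "status": "ok", "amount": 12},
    {"country": "DE", "status": "ok", "amount": 7},
    {"country": "FR", "status": "ok", "amount": 5},
    {"country": "FR", "status": "error", "amount": 9},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [r[name] for r in records] for name in records[0]}


def dictionary_encode(values):
    """Replace each value with an index into a list of distinct values."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> index in `dictionary`
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# Encode one column: first as dictionary indices, then as runs of indices.
country_dict, country_idx = dictionary_encode(columns["country"])
country_runs = run_length_encode(country_idx)

# A query that needs only one column reads only that column's values.
total_amount = sum(columns["amount"])

print(country_dict, country_idx, country_runs, total_amount)
```

In this sketch the five country strings reduce to a two-entry dictionary plus two runs, and summing amount never touches the other columns. Real Parquet files add row groups, pages, metadata, and general-purpose compression on top of ideas like these.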




