0010 - File Format
Status
DRAFT
Context
Choosing the file format component of the Storage Engine.
- Parquet
Let's start with the project's own description:
Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. Parquet is available in multiple languages including Java, C++, Python, etc...
Source: [^1]
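To make "column-oriented" concrete, here is a minimal sketch in plain Python. It does not use Parquet or any Parquet library, and it is not how Parquet lays out bytes on disk; the record shape (`id`, `country`, `amount`), the `ROWS` sample data, and all function names are hypothetical and exist only for illustration. It shows the basic idea behind a columnar layout: a query that needs one field only has to touch that field's bytes.

```python
import struct

# Hypothetical sample records: (id, country_code, amount).
ROWS = [(i, i % 3, float(i) * 1.5) for i in range(1000)]

# Two 64-bit integers and one 64-bit float per record, no padding.
ROW_FMT = "<qqd"


def to_row_layout(rows):
    """Row-oriented: all fields of one record are stored next to each other."""
    return b"".join(struct.pack(ROW_FMT, *row) for row in rows)


def to_column_layout(rows):
    """Column-oriented: all values of one field are stored next to each other."""
    ids, countries, amounts = zip(*rows)
    n = len(rows)
    return {
        "id": struct.pack(f"<{n}q", *ids),
        "country": struct.pack(f"<{n}q", *countries),
        "amount": struct.pack(f"<{n}d", *amounts),
    }


def sum_amount_rows(buf):
    """Sum `amount` from the row layout; every record has to be scanned."""
    total = 0.0
    for _id, _country, amount in struct.iter_unpack(ROW_FMT, buf):
        total += amount
    return total, len(buf)  # (result, bytes touched)


def sum_amount_columns(columns):
    """Sum `amount` from the column layout; only one column is read."""
    buf = columns["amount"]
    n = len(buf) // 8
    return sum(struct.unpack(f"<{n}d", buf)), len(buf)  # (result, bytes touched)


row_total, row_bytes = sum_amount_rows(to_row_layout(ROWS))
col_total, col_bytes = sum_amount_columns(to_column_layout(ROWS))
print(row_total == col_total, row_bytes, col_bytes)
```

Both layouts hold the same data and give the same sum, but the columnar read touches a third of the bytes here because the two unused fields are never scanned. Storing similar values together is also what gives compression and encoding schemes more to work with, which this sketch does not attempt to model.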
https://parquet.apache.org/docs/overview/motivation/
https://www.slideshare.net/HadoopSummit/file-format-benchmark-avro-json-orc-and-parquet