nsf1836650
Skyhook: Towards an Arrow-Native Storage System
With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data …
Jayjeet Chakraborty, Ivo Jimenez, Sebastiaan Alvarez Rodriguez, Alexandru Uta, Jeff LeFevre, Carlos Maltzahn
Zero-Cost, Arrow-Enabled Data Interface for Apache Spark
Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability …
Sebastiaan Alvarez Rodriguez, Jayjeet Chakraborty, Aaron Chu, Ivo Jimenez, Jeff LeFevre, Carlos Maltzahn, Alexandru Uta
Mapping Scientific Datasets to Programmable Storage
Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and …
Aaron Chu, Jeff LeFevre, Carlos Maltzahn, Aldrin Montana, Peter Alvaro, Dana Robinson, Quincey Koziol