diff --git a/README.md b/README.md
index 2b0aaea65..ab7871ef7 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,12 @@
 # delta-kernel-rs
 
-Delta-kernel-rs is an experimental [Delta][delta] implementation focused on
-interoperability with a wide range of query engines. It currently only supports
-reads.
+Delta-kernel-rs is an experimental [Delta][delta] implementation focused on interoperability with a
+wide range of query engines. It currently supports reads and (experimental) writes. Only blind
+appends are currently supported in the write path.
 
-The Delta Kernel project is a Rust and C library for building Delta connectors that can read (and
-soon, write) Delta tables without needing to understand the Delta [protocol
-details][delta-protocol]. This is the Rust/C equivalent of [Java Delta Kernel][java-kernel].
+The Delta Kernel project is a Rust and C library for building Delta connectors that can read and
+write Delta tables without needing to understand the Delta [protocol details][delta-protocol]. This
+is the Rust/C equivalent of [Java Delta Kernel][java-kernel].
 
 ## Crates
 
@@ -33,10 +33,12 @@ the acceptance tests against it.
 
 In general, you will want to depend on `delta-kernel-rs` by adding it as a dependency to your
 `Cargo.toml`, (that is, for rust projects using cargo) for other projects please see the [FFI]
-module. The core kernel includes facilities for reading delta tables, but requires the consumer
-to implement the `Engine` trait in order to use the table-reading APIs. If there is no need to
-implement the consumer's own `Engine` trait, the kernel has a feature flag to enable a default,
-asynchronous `Engine` implementation built with [Arrow] and [Tokio].
+module. The core kernel includes facilities for reading and writing delta tables, and allows the
+consumer to implement their own `Engine` trait in order to build engine-specific implementations of
+the various `Engine` APIs that the kernel relies on (e.g. implement an engine-specific
+`read_json_files()` using the native engine JSON reader). If there is no need to implement the
+consumer's own `Engine` trait, the kernel has a feature flag to enable a default, asynchronous
+`Engine` implementation built with [Arrow] and [Tokio].
 
 ```toml
 # fewer dependencies, requires consumer to implement Engine trait.
@@ -126,12 +128,13 @@ projects.
 
 There are a few key concepts that will help in understanding kernel:
 
 1. The `Engine` trait encapsulates all the functionality and engine or connector needs to provide to
-   the Delta Kernel in order to read the Delta table.
+   the Delta Kernel in order to read/write the Delta table.
 2. The `DefaultEngine` is our default implementation of the the above trait. It lives in
    `engine/default`, and provides a reference implementation for all `Engine` functionality.
    `DefaultEngine` uses [arrow](https://docs.rs/arrow/latest/arrow/) as its in-memory data format.
 3. A `Scan` is the entrypoint for reading data from a table.
+4. A `Transaction` is the entrypoint for writing data to a table.
 
 ### Design Principles
 
diff --git a/kernel/src/lib.rs b/kernel/src/lib.rs
index 1d43f4aba..110a822e7 100644
--- a/kernel/src/lib.rs
+++ b/kernel/src/lib.rs
@@ -1,45 +1,53 @@
 //! # Delta Kernel
 //!
 //! Delta-kernel-rs is an experimental [Delta](https://github.com/delta-io/delta/) implementation
-//! focused on interoperability with a wide range of query engines. It currently only supports
-//! reads. This library defines a number of traits which must be implemented to provide a
-//! working "delta reader". They are detailed below. There is a provided "default engine" that
-//! implements all these traits and can be used to ease integration work. See
-//! [`DefaultEngine`](engine/default/index.html) for more information.
+//! focused on interoperability with a wide range of query engines. It supports reads and
+//! (experimental) writes (only blind appends in the write path currently). This library defines a
+//! number of traits which must be implemented to provide a working delta implementation. They are
+//! detailed below. There is a provided "default engine" that implements all these traits and can
+//! be used to ease integration work. See [`DefaultEngine`](engine/default/index.html) for more
+//! information.
 //!
 //! A full `rust` example for reading table data using the default engine can be found in the
 //! [read-table-single-threaded] example (and for a more complex multi-threaded reader see the
 //! [read-table-multi-threaded] example).
 //!
-//! [read-table-single-threaded]: https://github.com/delta-io/delta-kernel-rs/tree/main/kernel/examples/read-table-single-threaded
-//! [read-table-multi-threaded]: https://github.com/delta-io/delta-kernel-rs/tree/main/kernel/examples/read-table-multi-threaded
+//! [read-table-single-threaded]:
+//! https://github.com/delta-io/delta-kernel-rs/tree/main/kernel/examples/read-table-single-threaded
+//! [read-table-multi-threaded]:
+//! https://github.com/delta-io/delta-kernel-rs/tree/main/kernel/examples/read-table-multi-threaded
+//!
+//! Simple write examples can be found in the [`write.rs`] integration tests. Standalone write
+//! examples are coming soon!
+//!
+//! [`write.rs`]: https://github.com/delta-io/delta-kernel-rs/tree/main/kernel/tests/write.rs
 //!
 //! # Engine traits
 //!
-//! The [`Engine`] trait allow connectors to bring their own implementation of functionality such as
-//! reading parquet files, listing files in a file system, parsing a JSON string etc. This trait
-//! exposes methods to get sub-engines which expose the core functionalities customizable by
+//! The [`Engine`] trait allow connectors to bring their own implementation of functionality such
+//! as reading parquet files, listing files in a file system, parsing a JSON string etc. This
+//! trait exposes methods to get sub-engines which expose the core functionalities customizable by
 //! connectors.
 //!
 //! ## Expression handling
 //!
-//! Expression handling is done via the [`ExpressionHandler`], which in turn allows the creation
-//! of [`ExpressionEvaluator`]s. These evaluators are created for a specific predicate [`Expression`]
+//! Expression handling is done via the [`ExpressionHandler`], which in turn allows the creation of
+//! [`ExpressionEvaluator`]s. These evaluators are created for a specific predicate [`Expression`]
 //! and allow evaluation of that predicate for a specific batches of data.
 //!
 //! ## File system interactions
 //!
-//! Delta Kernel needs to perform some basic operations against file systems like listing and reading files.
-//! These interactions are encapsulated in the [`FileSystemClient`] trait. Implementors must take
-//! care that all assumptions on the behavior if the functions - like sorted results - are respected.
+//! Delta Kernel needs to perform some basic operations against file systems like listing and
+//! reading files. These interactions are encapsulated in the [`FileSystemClient`] trait.
+//! Implementors must take care that all assumptions on the behavior if the functions - like sorted
+//! results - are respected.
 //!
 //! ## Reading log and data files
 //!
-//! Delta Kernel requires the capability to read json and parquet files, which is exposed via the
-//! [`JsonHandler`] and [`ParquetHandler`] respectively. When reading files, connectors are asked to
-//! provide the context information it requires to execute the actual read. This is done by invoking
-//! methods on the [`FileSystemClient`] trait.
-//!
+//! Delta Kernel requires the capability to read and write json files and read parquet files, which
+//! is exposed via the [`JsonHandler`] and [`ParquetHandler`] respectively. When reading files,
+//! connectors are asked to provide the context information it requires to execute the actual
+//! operation. This is done by invoking methods on the [`FileSystemClient`] trait.
 
 #![cfg_attr(all(doc, NIGHTLY_CHANNEL), feature(doc_auto_cfg))]
 #![warn(