-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Finish the Attribute API wrapping #6
Conversation
f6539b0
to
37d6222
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of my comments are not imminently actionable so I'm choosing to approve rather than request changes. But I think we definitely want the test improvement sooner rather than later.
|
||
pub fn get_fill_value<Conv: CAPIConverter>( | ||
&self, | ||
ctx: &Context, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I have a branch which is stylizing this to be like the other things, i.e. &'ctx Context
is a member of the struct. Let's merge this one first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. I'm gonna write that quick test to see what can be done with TypeId for slices to pick a replacement name for Datatype::is_valid
method and then will update and merge once that's done.
let mut c_size: u64 = 0; | ||
|
||
let res = unsafe { | ||
ffi::tiledb_attribute_get_fill_value( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Philosophical question time. Here's the implementation that I see of tiledb_attribute_get_fill_value
:
ensure_attribute_is_valid(attr);
api::ensure_output_pointer_is_valid(value);
api::ensure_output_pointer_is_valid(size);
attr->get_fill_value(value, size);
return TILEDB_OK;
Looks like get_fill_value
throws an exception if the size is zero, the value is null, or the attribute is not nullable.
The first two of these are effectively prevented by using Rust. The third one seems to me like it should really be an option type, not an error.
Anyway what I'm kind of postulating is whether getter functions like this should really return Result
since safe Rust should really be preventing the usages where this would be an error (and maybe for this specific scenario of a nullable attribute it should be Option<u64>
).
It would be a bit more user-friendly... but I guess this is probably being called alongside a bunch of other Result
code anyway. And the C implementation could always evolve. So I've answered my own question. There are some other functions we should probably fix up to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to go too far Rust-ifying our interface until more of the library is wrapped so that we have a better feel for common patterns, but I wonder if we couldn't do something along the lines of having a specific type for the value would be the way to go. Something along the lines of:
enum FillValue {
NonNullable(T),
Nullable(T, B)
}
fun set_fill_value(value: FillValue);
fun get_fill_value() -> FillValule;
@@ -0,0 +1,42 @@ | |||
pub trait CAPIConverter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure this trait gets it right. I think we need to play with strings to find a trait that does. Which is to say, this isn't really any different than what was here before, and I believe it will have to evolve again in the near future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I said this in slack already but we will need to distinguish between types which are trivially convertible (as each of this examples is) and types which are not (likely string). For the trivially convertible cases, we don't want any functions at all, just a marker - and then for the non-trivial cases we will end up with implementations that are more complicated than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion stuff isn't any different for sure. The issue was I focusing on was removing the Datatype dependence of the conversion. With get_fill_value I ran smack into the issue when I realized that DomainType was wrong because u64 would only have one associated Datatype via DomainType::DATATTYPE which was wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that makes sense. Yeah multiple datatypes backed by u64. Got it.
@@ -17,6 +18,7 @@ extern "C" { | |||
pub fn tiledb_datatype_size(type_: u32) -> u64; | |||
} | |||
|
|||
#[allow(non_snake_case)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bigger change but I kind of think this should change to be more similar to the smaller enums I've worked through - i.e. the sys
library just has the constants and then we put the enum in the main crate. We shouldn't need this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one I'm not even sure why it started complaining when it was there before the PR. The linter disable was purely to tell clippy to quiet up so I didn't have to include tugging on this loose end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does clippy run on all directories? If not maybe that's why it didn't catch this.
tiledb-sys/src/datatype.rs
Outdated
@@ -207,6 +209,96 @@ impl Datatype { | |||
_ => None, | |||
} | |||
} | |||
|
|||
pub fn is_valid<T: 'static>(&self, _: &T) -> bool { | |||
if TypeId::of::<T>() == TypeId::of::<f32>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a tuple-type match (self, TypeId::of::<T>())
could probably neaten this a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean here. Self would be the enum value in this case which is just a variant of an enum rather than the data being converted.
tiledb-sys/src/datatype.rs
Outdated
@@ -207,6 +209,96 @@ impl Datatype { | |||
_ => None, | |||
} | |||
} | |||
|
|||
pub fn is_valid<T: 'static>(&self, _: &T) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"valid" meaning "T
is a physical type which implements this logical type"?
I'd probably name it is_physical_type
or at the least give put the "physical type" terminology in a doc comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some vague ideas on how this would work with slices that we'll need for strings and multi-value cells. I'll spend a bit this morning setting up an example to see if I can't decide whether we'll be able to use this same pattern for everything or if we'll have to get more complicated.
Which is to say, I'm not 100% that this will only work for physical types. Though I could see renaming to is_compatible_type
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimental results show that we should be able to apply this same technique to non-physical types as well. According to the TypeId docs, it currently only works for types that are 'static
which I don't even know how it a type wouldn't be static yet. I'll go with is_compatible_type
for now unless you throw something at me before I finish updating and merging.
format!("{}", dt), | ||
"<UNKNOWN DATA TYPE>".to_string() | ||
); | ||
assert!(check_valid(&dt)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the assertion that only one physical type is valid for each logical type but the way this is written leaves a lot of room for failure if the wrong one is chosen. I think it's probably worth the map from logical to physical type (which we probably want somewhere anyway?) and then having the check_valid
instead return a vector representing which types were valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure. The only guarantee that we've picked the right type is to have the second map but I didn't have the brain power to figure out how best to do that in Rust. I'm more than open to adding it if you've got any idea on how to write it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to check out this branch and give it a whirl, or do so after you've merged
src/array/attribute.rs
Outdated
} | ||
|
||
#[test] | ||
fn attribute_test_set_fill_value() -> TileDBResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will definitely want string tests soon but I think we can skip for now - I left a related comment in convert.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, strings and multi-value cells are a looming monster on the horizon that I've mostly been ignoring while dealing with trying to cover quite a bit of the API to get a feel for anything that might be a problem. I've just not wanted to sink the time into figuring out that particular fight while I try and make sure that the more mundane things are all going to work out reasonably well.
Although, theoretically, we'll only need to solve that issue once and then re-uses it ~everywhere that we've got the whole void*/uint64_t*
pattern in the C API.
37d6222
to
4804ae0
Compare
This implements a different approach to converting between types and Datatype dependent type checks. The main idea here that's different than previously is that the DomainType enforced a 1:1 correspondence between TileDB's Datatype enumeration and the Rust type system. Unfortunately, multiple TileDB Datatypes map to the same Rust type. The idea here is to separate the conversion of types and the validation that a given Rust type is valid for a given TileDB Datatype. The basic idea is to move type conversion between Rust and FFI types (i.e., i32 -> uint32_t) to be a separate concern from "Is an i32 valid as a Datatype::Blob". The type conversions basically map directly to how DomainType worked. The validation is moved to a generic method on Datatype in tiledb-sys that makes heavy use of `std::any::TypeId` to restrict the generic type paramter to specific Datatype enum variants.
4804ae0
to
d083dae
Compare
The major change here is to reimplement the DomainType as a new
CAPIConverter
trait along with a generic method ontiledb_sys::Datatype
.