Primary Keys in Core #1200
Coverage: 30% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1062/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
Coverage: 31% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1065/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
@simonask This is single-column uniqueness, right? In general, candidate/primary keys can span multiple columns. To my knowledge, no bindings support that scenario, but we get requests for it (see realm/realm-java#1129). Uniqueness is useful not only for primary keys but also in the general case - see realm/realm-java#967. |
@kneth Indeed, this is single-column uniqueness. Multi-column uniqueness can be implemented in a similar way, but we would need a |
I think I like the concepts and the nice explanations, but I'm not going to claim I understand all the implications :-) |
@kspangsege, @danielpovlsen, @teotwaki, @rrrlasse I would love your input here as well. :) |
Is this still a proposal or to be considered ready for review? |
@danielpovlsen I consider a proposal inherently destined for review. ;-) Sorry if the intention was unclear. Please do review, and please do have opinions about the direction as well! :-) |
I think this is one of the major points, and that discussing this as "primary keys" is dangerous. As far as I can tell, this sort of behaviour is not possible in SQL or even NoSQL stores. @simonask, correct me if I'm wrong, but could this be rephrased as "unique on insert but not on update"? Are we convinced that what we're providing will fit with user expectations? How many people will expect merges to fail, or merges to be handled in a specific way; and can we properly document that and ensure a specific behaviour? In my opinion, there is a use case for letting conflicts occur and asking the user for manual intervention.
|
It is indeed tricky, because SQL and SQL-like stores provide no such feature directly. Many of them do provide "UPSERT", though, precisely for distributed use cases. I would argue that this feature is actually in the spirit of "primary keys", in the sense that a primary key is meant to identify an object - logically, if two objects then have the same primary key, they have the same identity, and can therefore be collapsed. When would you use primary keys in a spreadsheet? |
Sorry, my brain farted and I'm not entirely sure which point I was trying to get to. Thinking this over, merge could indeed be the correct default behaviour considering what we're trying to achieve. I guess the main roadblock I see comes from the TRIGGER concept in RDBMS, which we don't have in Realm, but which would run specific actions upon insertion. In those cases, a merge strategy could have very weird consequences, which is why duplicate PK insertion is blocked in those systems. |
# Conflicts: # src/realm/impl/transact_log.hpp # src/realm/replication.cpp # src/realm/table.cpp # src/realm/table.hpp
Coverage: 31% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1413/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
Talked a bit with Simon today about this. In my opinion it doesn't matter much if we solve this problem at the ObjectStore level or in Core. For the sake of keeping core as simple as possible (+1 when trying to reason about Sync), I think that solving it at the ObjectStore level would be natural. This means that this proposal would be enough to solve our problems as far as I can see. Some extra thoughts to consider
|
SQL has a "merge" keyword, and you can skip conflicting rows by adding a "where not exists" criterion. There is also a "when not matched then" clause that you can use if you want custom behaviour on conflicts... So SQL is a winner over Realm here, with respect to flexibility. By the way, do we really need the user to be able to specify/set the unique values? If they were only allowed to be auto-generated, then Realm could use a different number generator for each thread, one that would somehow be mathematically guaranteed never to collide. |
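One possible reading of the "different number generator for each thread" idea is to partition the key space: each generator gets a distinct id that is stored in the high bits of every key it hands out. The sketch below is purely illustrative; the class name `PartitionedKeyGenerator`, the 16/48 bit split, and the assumption that every generator is given a distinct id are all hypothetical and not part of Realm Core.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: each generator owns a disjoint slice of the 64-bit key
// space. Keys from two generators cannot collide as long as the generators
// were constructed with distinct ids and no counter exceeds 48 bits.
class PartitionedKeyGenerator {
public:
    static constexpr unsigned counter_bits = 48;
    static constexpr std::uint64_t max_counter = (std::uint64_t(1) << counter_bits) - 1;

    explicit PartitionedKeyGenerator(std::uint16_t generator_id)
        : m_prefix(std::uint64_t(generator_id) << counter_bits)
    {
    }

    // Returns the next key: generator id in the high 16 bits,
    // a per-generator counter in the low 48 bits.
    std::uint64_t next()
    {
        assert(m_counter <= max_counter); // beyond this the slice is exhausted
        return m_prefix | m_counter++;
    }

private:
    std::uint64_t m_prefix;
    std::uint64_t m_counter = 0;
};
```

A scheme like this only covers synthetic keys; it does nothing for values that arrive from outside, such as server-generated keys.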
@rrrlasse There are two types of primary keys: natural and synthetic. Your suggestion is to disallow natural keys, but we don't think this is what users actually want. I think it requires a deeper analysis to figure out whether or not we can achieve similar levels of flexibility under this scheme. |
Disallowing natural keys would not be a good idea IMO. Too many people rely on those (e.g. server-generated keys). Regarding conflict handling, SQLite, for example, has the following modes: https://www.sqlite.org/lang_conflict.html |
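For readers who do not want to follow the link: SQLite's conflict clause offers ROLLBACK, ABORT, FAIL, IGNORE and REPLACE. The following is a rough, simplified sketch of how three of them differ for a single insert into a keyed table. It is a toy model over `std::map`, not SQLite's or Realm's API; `OnConflict` and `insert_row` are names invented for the illustration, and ROLLBACK and FAIL are left out because they differ from ABORT only in how much of the surrounding statement or transaction is undone.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical policy enum, loosely modelled on three of SQLite's modes.
enum class OnConflict { Abort, Ignore, Replace };

// Inserts (key, value) into the table. Returns true if the row was written.
bool insert_row(std::map<int, std::string>& table, int key,
                const std::string& value, OnConflict policy)
{
    auto it = table.find(key);
    if (it == table.end()) {
        table.emplace(key, value); // no conflict: plain insert
        return true;
    }
    switch (policy) {
        case OnConflict::Abort:
            // Report the violation and leave the table untouched.
            throw std::runtime_error("primary key constraint violated");
        case OnConflict::Ignore:
            // Silently skip the conflicting row.
            return false;
        case OnConflict::Replace:
            // The incoming row wins over the existing one.
            it->second = value;
            return true;
    }
    return false; // not reached for valid enum values
}
```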
@cmelchior Interesting perspective — I hadn't looked into what SQLite actually does. Here is my evaluation of the strategies as they apply to Realm Sync:
|
Coverage: 75% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1429/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
Coverage: 78% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1430/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
Coverage: 81% Please check your coverage here: https://ci.realm.io/job/core_matrix_pr/1431/buildcommand=jenkins-pipeline-coverage,buildtype=debug,encryption=no,slave=fastlinux/Diff_Coverage/ |
PROPOSAL -- very open for discussion.
This does two things:
To implement primary keys on top of this, the binding must call only the set_unique operations (and never plain set) on a column. An extra consequence of this is that primary key constraints are only checked when set_unique is called, and not, for instance, when new rows are inserted. This replicates the current behavior of the mechanism used in the Cocoa binding (see RLMSetValueUnique).
Rationale
pk metadata table, and Core still knows nothing about which columns are primary key columns (no attribute or similar). It is solely up to the binding / object store to ensure that primary key columns are treated as such, as it is today. This has the additional benefit of not requiring a new method of creating objects, i.e. this retains the insert-empty-row-then-set-values mode.
I have also considered the word "identity" instead of "unique", but "identity" is already a heavily overloaded word... Still, it might be worth considering.
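To make the set_unique-only discipline concrete, here is a minimal toy model of a single column. It is not the Realm Core Table API: `ToyColumn` and its methods are hypothetical, and throwing on a duplicate is an assumption of the sketch, since the behaviour on a local conflict is not spelled out here. The point it illustrates is only where the check happens: inserting an empty row and calling plain set never check uniqueness, while set_unique does.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy single-column table (hypothetical, for illustration only).
class ToyColumn {
public:
    // Mirrors insert-empty-row-then-set-values: the new row gets a default
    // value and no uniqueness check is performed at this point.
    std::size_t add_empty_row()
    {
        m_values.push_back(0);
        return m_values.size() - 1;
    }

    // Plain set: never checks uniqueness.
    void set(std::size_t row, std::int64_t value)
    {
        m_values.at(row) = value;
    }

    // Checked set: the only place where the constraint is enforced.
    // ASSUMPTION: a duplicate is reported by throwing; the value is not written.
    void set_unique(std::size_t row, std::int64_t value)
    {
        for (std::size_t i = 0; i < m_values.size(); ++i) {
            if (i != row && m_values[i] == value)
                throw std::logic_error("duplicate value in unique column");
        }
        m_values.at(row) = value;
    }

    std::int64_t get(std::size_t row) const { return m_values.at(row); }
    std::size_t size() const { return m_values.size(); }

private:
    std::vector<std::int64_t> m_values;
};
```

In this model two freshly inserted rows both hold the default value without any error, which is why a binding that wants primary key semantics has to route every write to that column through set_unique.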
Fixes #831.
\cc @kspangsege