From 9fc3cf79b8daee1962d0f1a724feefd76a9c6375 Mon Sep 17 00:00:00 2001 From: b-fuze Date: Sat, 15 Sep 2018 13:33:18 -0400 Subject: [PATCH] Small update fixes --- docs/updates/2018_09_14.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/updates/2018_09_14.md b/docs/updates/2018_09_14.md index 4419226c..3d3f9ee0 100644 --- a/docs/updates/2018_09_14.md +++ b/docs/updates/2018_09_14.md @@ -4,9 +4,9 @@ It's been an intense couple of weeks, but we're coming out of it with more clari ## Embracing Git -Our previous vision for Memo was to store the full operational history for the repository in a global database, so that each file's full history would be available in a single structure dating back to its creation. This would essentially duplicate the role of Git as a content tracker for the repository, but with a much more fine-grained resolution. It may eventually make sense to build a global operation index to enable advanced features an analysis, but I don't think it makes sense to conceive of such an index as an independent version control system. For async collaboration, CRDTs probably won't offer enough advantages to induce people to switch away from Git. Even if we managed to build such a system, it would always need to interoperate with Git. So we may as well embrace that reality and build on top of Git. We can then focus on the area where CRDTs have their greatest strength: real-time collaboration and recording the fine-grained edits that occur *between* Git commits. +Our previous vision for Memo was to store the full operational history for the repository in a global database, so that each file's full history would be available in a single structure dating back to its creation. This would essentially duplicate the role of Git as a content tracker for the repository, but with a much more fine-grained resolution. It may eventually make sense to build a global operation index to enable advanced features and analysis, but I don't think it makes sense to conceive of such an index as an independent version control system. For async collaboration, CRDTs probably won't offer enough advantages to induce people to switch away from Git. Even if we managed to build such a system, it would always need to interoperate with Git. So we may as well embrace that reality and build on top of Git. We can then focus on the area where CRDTs have their greatest strength: real-time collaboration and recording the fine-grained edits that occur *between* Git commits. -Augmenting Git is definitely something I've considered in the past, but it's finally becoming clearer how we can achieve it. We will start by packaging the previous months' work into a library that is similar to `teletype-crdt`. With Teletype, you work with individual buffers. Each local edit returns one or more operations to apply on remote replicas, and applying remote operations returns a diff that describes the impact of those operations on the local replica. Memo will expand the scope of this abstraction from individual buffers to the working tree, but it won't represent the full state of this tree in the form of operations. Instead, we'll exploit the fact that Git commits provide a common synchronization point. The library will expect any data that's committed to the Git repository to be supplied separately from the operations.s +Augmenting Git is definitely something I've considered in the past, but it's finally becoming clearer how we can achieve it. We will start by packaging the previous months' work into a library that is similar to `teletype-crdt`. With Teletype, you work with individual buffers. Each local edit returns one or more operations to apply on remote replicas, and applying remote operations returns a diff that describes the impact of those operations on the local replica. Memo will expand the scope of this abstraction from individual buffers to the working tree, but it won't represent the full state of this tree in the form of operations. Instead, we'll exploit the fact that Git commits provide a common synchronization point. The library will expect any data that's committed to the Git repository to be supplied separately from the operations. By making the CRDT representation sparse and leaning on Git to synchronize the bulk of the working tree, we reduce the memory footprint of the CRDT to the point where it can reasonably be kept resident in memory. This also bounds the bandwidth required to replicate the full structure, which obviates the need for complex partial data fetching schemes that we were considering previously. This in turn greatly simplifies backend infrastructure. Because a sparse representation should always be small enough to fully reconstruct from raw operations on the client, server side infrastructure shouldn't need to process operations in any way other than simply storing them based on the identifier of the patch to which they apply. @@ -16,7 +16,7 @@ One big obstacle to making this patch-based representation work is undo. In `tel This felt like a show-stopper to me until I had a conversation with [@jeffrafter](https://github.com/jeffrafter) about his team's experience using `teletype-crdt` in [Tiny](https://tttiny.com/). I don't have a perfect understanding of the details of their approach, but they essentially bypass Teletype's built-in undo system and maintain their own history on each client independently of the CRDT. When a user performs an undo, they simply apply its to the current CRDT and broadcast it as a new operation. When I asked about some of the more diabolical concurrency scenarios that the counters were designed to eliminate, Jeff simply replied that it's working for them in practice. -Inspired by their experience, I have a hunch that we can implement undo similarly in our library. For each buffer, we can maintain two CRDTs. One will serve as a local non-linear history allows us to understand the effects of undoing operations that aren't the most recent change to the buffer. We'll perform undos against this local history first, then apply their affects to a CRDT that starts at the most recent commit. This will generate remote operations we know can be cleanly applied by all participants. The local history can be retained across several commits and even be stored locally. By fetching operations from previous commits, we could even construct such a history for clients that are new to the collaboration. +Inspired by their experience, I have a hunch that we can implement undo similarly in our library. For each buffer, we can maintain two CRDTs. One will serve as a local non-linear history that allows us to understand the effects of undoing operations that aren't the most recent change to the buffer. We'll perform undos against this local history first, then apply their affects to a CRDT that starts at the most recent commit. This will generate remote operations we know can be cleanly applied by all participants. The local history can be retained across several commits and even be stored locally. By fetching operations from previous commits, we could even construct such a history for clients that are new to the collaboration. ## Stable file references