Deployed 72e9bef with MkDocs version: 1.6.0
sabledb-io committed Sep 1, 2024
1 parent e40768d commit ce2048c
Showing 4 changed files with 27 additions and 27 deletions.
8 changes: 4 additions & 4 deletions design/auto-failover/index.html
@@ -587,12 +587,12 @@ <h2 id="auto-failover">Auto-Failover</h2>
<p>In order to detect whether the primary node is still alive, <code>SableDB</code> uses <a href="https://raft.github.io/">the Raft algorithm</a> while using the centralised database
as its communication layer and the <code>last_txn_id</code> as the log entry</p>
<p>Each replica node regularly checks the <code>last_updated</code> field of the primary node. The interval at which a replica node checks differs from node to
node - this is to minimise the risk of attempting to start multiple failover processes (but this can still happen and is solved by the <a href="/design/auto-failover/#a-note-about-locking">lock</a> described below)</p>
node - this is to minimise the risk of attempting to start multiple failover processes (but this can still happen and is solved by the <a href="/sabledb/design/auto-failover/#a-note-about-locking">lock</a> described below)</p>
<p>The failover process starts if the primary's <code>last_updated</code> has not been updated within the allowed time. If that time is exceeded,
the replica node does the following:</p>
<h3 id="the-replica-that-initiated-the-failover">The replica that initiated the failover</h3>
<ul>
<li>Marks in the centralised database that a failover has been initiated for the non-responsive primary. It does so by creating a <a href="/design/auto-failover/#a-note-about-locking">unique lock record</a></li>
<li>Marks in the centralised database that a failover has been initiated for the non-responsive primary. It does so by creating a <a href="/sabledb/design/auto-failover/#a-note-about-locking">unique lock record</a></li>
<li>The node that started the failover decides on the new primary. It does that by picking the one with the highest <code>last_txn_id</code> property</li>
<li>Dispatches a command to the selected replica instructing it to switch to Primary mode (we achieve this by using <code>LPUSH / BRPOP</code> blocking command)</li>
<li>Dispatches commands to all of the remaining replicas instructing them to perform a <code>REPLICAOF &lt;NEW_PRIMARY_IP&gt; &lt;NEW_PRIMARY_PORT&gt;</code></li>
@@ -611,8 +611,8 @@ <h3 id="all-other-replicas">All other replicas</h3>
<h3 id="a-note-about-locking">A note about locking</h3>
<p><code>SableDB</code> uses the command <code>SET &lt;PRIMARY_ID&gt;_FAILOVER &lt;Unique-Value&gt; NX EX 60</code> to create a unique lock.
By doing so, it ensures that only one locking record exists. If it succeeds in creating the lock record,
it becomes <a href="/design/auto-failover/#the-replica-that-initiated-the-failover">the node that orchestrates the replacement</a></p>
<p>If it fails (i.e. the record already exists) - it switches to reading commands from the queue as described <a href="/design/auto-failover/#all-other-replicas">here</a></p>
it becomes <a href="/sabledb/design/auto-failover/#the-replica-that-initiated-the-failover">the node that orchestrates the replacement</a></p>
<p>If it fails (i.e. the record already exists) - it switches to reading commands from the queue as described <a href="/sabledb/design/auto-failover/#all-other-replicas">here</a></p>
<p>The only client allowed to delete the lock is the client that created it, hence the <code>&lt;unique_value&gt;</code>. If that client crashes,
we have the <code>EX 60</code> as a backup plan (the lock will expire)</p>
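<p>The locking rules above can be sketched as follows. This is a minimal illustration, not SableDB's implementation: <code>FakeStore</code> is a hypothetical in-memory stand-in for the centralised database, and the helper names (<code>try_acquire_failover_lock</code>, <code>pick_new_primary</code>) are assumptions made for this example. It mimics the <code>SET ... NX EX 60</code> semantics, the owner-only delete, and the "highest <code>last_txn_id</code>" selection.</p>

```python
import time
import uuid


class FakeStore:
    """Hypothetical in-memory stand-in for the centralised database."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp)

    def set_nx_ex(self, key, value, ttl_secs, now=None):
        """Mimic `SET key value NX EX ttl`: succeed only if the key is absent or expired."""
        now = time.monotonic() if now is None else now
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # a live lock record already exists
        self._data[key] = (value, now + ttl_secs)
        return True

    def delete_if_owner(self, key, value, now=None):
        """Delete the lock only if the stored value matches the caller's unique value."""
        now = time.monotonic() if now is None else now
        current = self._data.get(key)
        if current is None or current[1] <= now or current[0] != value:
            return False
        del self._data[key]
        return True


def try_acquire_failover_lock(store, primary_id, ttl_secs=60, now=None):
    """Return the unique lock value on success, or None if another node holds the lock."""
    token = uuid.uuid4().hex  # the <Unique-Value> from the command above
    if store.set_nx_ex(f"{primary_id}_FAILOVER", token, ttl_secs, now=now):
        return token
    return None


def pick_new_primary(replicas):
    """Pick the replica with the highest last_txn_id (replicas are plain dicts here)."""
    return max(replicas, key=lambda r: r["last_txn_id"])
```

<p>A replica that gets a token back would act as the orchestrator; one that gets <code>None</code> would fall back to reading commands from its queue.</p>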

32 changes: 16 additions & 16 deletions design/data-encoding/index.html
@@ -559,9 +559,9 @@


<h1 id="overview">Overview</h1>
<p><code>SableDb</code> uses a Key / Value database for its underlying data storage. We chose to use <code>RocksDb</code>
<p><code>SableDb</code> uses a Key / Value database for its underlying data storage. We chose to use <code>RocksDb</code>
as it is mature, maintained and widely used in the industry by giant companies.</p>
<p>Because <code>RocksDb</code> is a key-value store and Redis data structures can be more complex, an additional
<p>Because <code>RocksDb</code> is a key-value store and Redis data structures can be more complex, an additional
data encoding is required.</p>
<p>This chapter covers how <code>SableDb</code> encodes the data for the various data types (e.g. <code>String</code>, <code>Hash</code>, <code>Set</code> etc)</p>
<div class="admonition note">
@@ -616,10 +616,10 @@ <h2 id="the-list-data-type">The <code>List</code> data type</h2>
is stored in a separate entry.</p>
<div class="highlight"><pre><span></span><code>List metadata:

A B C D
+-----+---- +--------+------------+
| 1u8 | DB# | Slot# | list name |
+-----+---- +--------+------------+
A B C D
+-----+---- +--------+------------+
| 1u8 | DB# | Slot# | list name |
+-----+---- +--------+------------+
E F G H I J
+-----+------------+--------- +------+------+-------+
=&gt; | 1u8 | Expiration | List UID | head | tail | size |
@@ -652,16 +652,16 @@ <h2 id="the-list-data-type">The <code>List</code> data type</h2>
<li><code>O</code> the UID of the next item in the list ( <code>0</code> means that this item is the last item)</li>
<li><code>P</code> the list value</li>
</ul>
<p>The above encoding allows <code>SableDb</code> to iterate over all list items by creating a <code>RocksDb</code> iterator and moving it to
<p>The above encoding allows <code>SableDb</code> to iterate over all list items by creating a <code>RocksDb</code> iterator and moving it to
the prefix <code>[ 2 | &lt;list-id&gt;]</code> (<code>2</code> indicates that only list items should be scanned, and <code>list-id</code> makes sure that only
the requested list items are visited)</p>
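<p>The prefix scan described above can be illustrated with a small sketch. It does not use <code>RocksDb</code>; a sorted Python list stands in for the ordered key space. The exact field widths are assumptions made for this example (a <code>u8</code> tag followed by big-endian <code>u64</code> list and item IDs), and the names <code>item_key</code> and <code>scan_list_items</code> are hypothetical.</p>

```python
import bisect
import struct

LIST_ITEM_TAG = 2  # per the text above, `2` marks list-item records


def item_key(list_id, item_id):
    """Build a list-item key. Assumed layout: u8 tag, u64 list id, u64 item id (big-endian)."""
    return struct.pack(">BQQ", LIST_ITEM_TAG, list_id, item_id)


def scan_list_items(sorted_keys, list_id):
    """Yield every key that starts with the prefix [ 2 | list_id ].

    `sorted_keys` must be sorted bytewise, which is how an ordered
    key-value store would present its keys to an iterator.
    """
    prefix = struct.pack(">BQ", LIST_ITEM_TAG, list_id)
    # "Seek" to the first key that is >= the prefix
    start = bisect.bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # left the prefix range: no more items for this list
        yield key
```

<p>Big-endian encoding is used in the sketch so that bytewise ordering keeps all keys of one list adjacent, which is what makes the early <code>break</code> safe.</p>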
<h2 id="the-hash-data-type">The <code>Hash</code> data type</h2>
<p>Hash items are encoded using the following:</p>
<div class="highlight"><pre><span></span><code>Hash metadata:

A B C D E F G H
A B C D E F G H
+-----+---- +--------+-----------+ +-----+------------+---------+-------+
| 1u8 | DB# | Slot# | Hash name | =&gt; | 2u8 | Expiration | Set UID | size |
| 1u8 | DB# | Slot# | Hash name | =&gt; | 2u8 | Expiration | Set UID | size |
+-----+---- +--------+-----------+ +-----+------------+---------+-------+

Hash item:
@@ -682,14 +682,14 @@ <h2 id="the-sorted-set-data-type">The <code>Sorted Set</code> data type</h2>
<p>The sorted set ( <code>Z*</code> commands) is encoded using the following:</p>
<div class="highlight"><pre><span></span><code>Sorted set metadata:

A B C D E F G H
A B C D E F G H
+-----+---- +--------+-----------+ +-----+------------+---------+-------+
| 1u8 | DB# | Slot# | ZSet name | =&gt; | 3u8 | Expiration | ZSet UID| size |
| 1u8 | DB# | Slot# | ZSet name | =&gt; | 3u8 | Expiration | ZSet UID| size |
+-----+---- +--------+-----------+ +-----+------------+---------+-------+

ZSet item 1 (Index: &quot;Find by member&quot;):

K L M O
K L M O
+-----+--------------+---------+ +-------+
| 4u8 | ZSet ID(u64) | member | =&gt; | score |
+-----+--------------+---------+ +-------+
@@ -739,9 +739,9 @@ <h2 id="the-set-data-type">The <code>Set</code> data type</h2>
<p>Set items are encoded using the following:</p>
<div class="highlight"><pre><span></span><code>Set metadata:

A B C D E F G H
A B C D E F G H
+-----+---- +--------+-----------+ +-----+------------+---------+-------+
| 1u8 | DB# | Slot# | Set name | =&gt; | 4u8 | Expiration | Set UID | size |
| 1u8 | DB# | Slot# | Set name | =&gt; | 4u8 | Expiration | Set UID | size |
+-----+---- +--------+-----------+ +-----+------------+---------+-------+

Set item:
@@ -760,13 +760,13 @@ <h2 id="the-set-data-type">The <code>Set</code> data type</h2>
</ul>
<h2 id="bookkeeping-records">Bookkeeping records</h2>
<p>Every composite item (<code>Hash</code>, <code>Sorted Set</code>, <code>List</code> or <code>Set</code>) created by <code>SableDb</code>, also creates a record in the <code>bookkeeping</code> "table".
A bookkeeping record keeps track of the composite item unique ID + its type (which is needed by the <a href="/design/eviction/#composite-item-has-been-overwritten">data eviction job</a>)</p>
A bookkeeping record keeps track of the composite item unique ID + its type (which is needed by the <a href="/sabledb/design/eviction/#composite-item-has-been-overwritten">data eviction job</a>)</p>
<p>The <code>bookkeeping</code> record is encoded as follows:</p>
<div class="highlight"><pre><span></span><code>Bookkeeping:

A B C D E
+-----+----+--------+-----------+ +----------+
| 0u8 | UID| DB# | UID type | =&gt; | user key |
| 0u8 | UID| DB# | UID type | =&gt; | user key |
+-----+----+--------+-----------+ +----------+
</code></pre></div>
<ul>
12 changes: 6 additions & 6 deletions design/eviction/index.html
@@ -467,21 +467,21 @@ <h2 id="expired-items">Expired items</h2>
the item is deleted and a <code>null</code> value is returned to the caller.</p>
<h2 id="composite-item-has-been-overwritten">Composite item has been overwritten</h2>
<p>To explain the problem here, consider the following data stored in <code>SableDb</code> (using the <code>Hash</code> data type):</p>
<div class="highlight"><pre><span></span><code>&quot;OverwatchTanks&quot; =&gt;
{
{&quot;tank_1&quot; =&gt; &quot;Reinhardt&quot;},
{&quot;tank_2&quot; =&gt; &quot;Orisa&quot;},
<div class="highlight"><pre><span></span><code>&quot;OverwatchTanks&quot; =&gt;
{
{&quot;tank_1&quot; =&gt; &quot;Reinhardt&quot;},
{&quot;tank_2&quot; =&gt; &quot;Orisa&quot;},
{&quot;tank_3&quot; =&gt; &quot;Roadhog&quot;}
}
</code></pre></div>
<p>In the above example, we have a hash identified by the key <code>OverwatchTanks</code>. Now, imagine a user that executes the following command:</p>
<p><code>set OverwatchTanks bla</code> - this effectively changes the type of the key <code>OverwatchTanks</code> and sets it to a <code>String</code>.
However, as explained in <a href="/design/data-encoding/#the-hash-data-type"><code>the encoding data chapter</code></a>, we know that each hash field is stored in its own <code>RocksDb</code> record.
However, as explained in <a href="/sabledb/design/data-encoding/#the-hash-data-type"><code>the encoding data chapter</code></a>, we know that each hash field is stored in its own <code>RocksDb</code> record.
So by calling the <code>set</code> command, the <code>hash</code> fields <code>tank_1</code>, <code>tank_2</code> and <code>tank_3</code> are now "orphaned" (i.e. the user cannot access them)</p>
<p><code>SableDb</code> solves this problem by running a cron task that compares the type of a composite item against its actual value.
In the above example: the type of the key <code>OverwatchTanks</code> is a <code>String</code> while it should have been <code>Hash</code>. When such a discrepancy is detected,
the cron task deletes the orphan records from the database.</p>
<p>The cron job knows the original type by checking the <a href="/design/data-encoding/#bookkeeping-records"><code>bookkeeping record</code></a></p>
<p>The cron job knows the original type by checking the <a href="/sabledb/design/data-encoding/#bookkeeping-records"><code>bookkeeping record</code></a></p>
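<p>The orphan check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual cron task: <code>BookkeepingRecord</code> and <code>find_orphaned_uids</code> are hypothetical names, the records are plain in-memory values rather than <code>RocksDb</code> entries, and treating a missing user key as a mismatch is a choice made for this example.</p>

```python
from typing import NamedTuple


class BookkeepingRecord(NamedTuple):
    """Hypothetical in-memory view of a bookkeeping record."""
    uid: int        # unique ID of the composite item
    uid_type: str   # type recorded when the item was created, e.g. "Hash"
    user_key: str   # the key the user sees, e.g. "OverwatchTanks"


def find_orphaned_uids(records, current_type_of):
    """Return UIDs whose recorded type no longer matches the key's current type.

    `current_type_of` maps user key -> the type currently stored under that key.
    A key that is absent from the mapping is also reported (an assumption here).
    """
    orphans = []
    for rec in records:
        if current_type_of.get(rec.user_key) != rec.uid_type:
            orphans.append(rec.uid)
    return orphans
```

<p>In the <code>OverwatchTanks</code> example, the bookkeeping record says <code>Hash</code> while the key now holds a <code>String</code>, so its UID would be reported and the records belonging to it could then be deleted.</p>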
<h2 id="user-triggered-clean-up-flushall-or-flushdb">User triggered clean-up (<code>FLUSHALL</code> or <code>FLUSHDB</code>)</h2>
<p>When one of these commands is called, <code>SableDb</code> uses the <code>RocksDb</code> <a href="https://rocksdb.org/blog/2018/11/21/delete-range.html"><code>delete_range</code></a> method.</p>

2 changes: 1 addition & 1 deletion search/search_index.json

Large diffs are not rendered by default.