Optimize code under storage (milvus-io#6335)
* rename AddOneStringToPayload/GetOneStringFromPayload to AddStringToPayload/GetStringFromPayload

Signed-off-by: yudong.cai <[email protected]>

* code optimize

Signed-off-by: yudong.cai <[email protected]>

* rename print_binglog_test to print_binlog_test

Signed-off-by: yudong.cai <[email protected]>

* update chap08_binlog.md

Signed-off-by: yudong.cai <[email protected]>

* fix unittest

Signed-off-by: yudong.cai <[email protected]>

* use SetEventTimestamp() to replace SetStartTimestamp() and SetEndTimestamp()

Signed-off-by: yudong.cai <[email protected]>

* code optimize

Signed-off-by: yudong.cai <[email protected]>

* rename AddStringToPayload/GetStringFromPayload to AddOneStringToPayload/GetOneStringFromPayload

Signed-off-by: yudong.cai <[email protected]>
cydrain authored Jul 7, 2021
1 parent 3652b9d commit 3387b07
Showing 16 changed files with 285 additions and 365 deletions.
151 changes: 66 additions & 85 deletions docs/developer_guides/chap08_binlog.md
@@ -2,8 +2,9 @@

InsertBinlog, DeleteBinlog, DDLBinlog

Binlog is stored in a columnar storage format, every column in schema should be stored in a individual file. Timestamp, schema, row id and primary key allocated by system are four special columns. Schema column records the DDL of the collection.

Binlog is stored in a columnar storage format, every column in schema is stored in an individual file.
Timestamp, schema, row id and primary key allocated by system are four special columns.
Schema column records the DDL of the collection.


## Event format
@@ -13,67 +14,63 @@ Binlog file consists of 4 bytes magic number and a series of events. The first e
### Event format

```
+=====================================+
| event | timestamp 0 : 8 | create timestamp
| header +----------------------------+
| | type_code 8 : 1 | event type code
| +----------------------------+
| | server_id 9 : 4 | write node id
| +----------------------------+
| | event_length 13 : 4 | length of event, including header and data
| +----------------------------+
| | next_position 17 : 4 | offset of next event from the start of file
| +----------------------------+
| | extra_headers 21 : x-21 | reserved part
+=====================================+
| event | fixed part x : y |
| data +----------------------------+
| | variable part |
+=====================================+
+=====================================+=====================================================================+
| event | Timestamp 0 : 8 | create timestamp |
| header +----------------------------+---------------------------------------------------------------------+
| | TypeCode 8 : 1 | event type code |
| +----------------------------+---------------------------------------------------------------------+
| | ServerID 9 : 4 | write node id |
| +----------------------------+---------------------------------------------------------------------+
| | EventLength 13 : 4 | length of event, including header and data |
| +----------------------------+---------------------------------------------------------------------+
| | NextPosition 17 : 4 | offset of next event from the start of file |
+=====================================+=====================================================================+
| event | fixed part 21 : x | |
| data +----------------------------+---------------------------------------------------------------------+
| | variable part | |
+=====================================+=====================================================================+
```
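As a rough illustration of the header layout above, the following Python sketch unpacks the 21-byte common header and follows `NextPosition` from one event to the next, starting after the 4-byte magic number. The byte order (little-endian) and the signedness of each field are assumptions, since the table gives only offsets and sizes. `parse_event_header` and `iter_event_headers` are hypothetical helper names, not functions from the Milvus code base.

```python
import struct

# Assumption: little-endian, signed integers. The table only states
# offsets and sizes: 8 + 1 + 4 + 4 + 4 = 21 bytes.
EVENT_HEADER = struct.Struct("<qbiii")
MAGIC_SIZE = 4  # the binlog file starts with a 4-byte magic number

FIELD_NAMES = ("Timestamp", "TypeCode", "ServerID", "EventLength", "NextPosition")


def parse_event_header(buf, offset=0):
    """Unpack the common event header that starts at `offset` in `buf`."""
    values = EVENT_HEADER.unpack_from(buf, offset)
    return dict(zip(FIELD_NAMES, values))


def iter_event_headers(data):
    """Yield (position, header) for each event in a whole binlog file."""
    pos = MAGIC_SIZE
    while pos < len(data):
        header = parse_event_header(data, pos)
        yield pos, header
        # NextPosition is an offset from the start of the file, so it
        # must move forward; otherwise the loop would never terminate.
        if header["NextPosition"] <= pos:
            raise ValueError(f"NextPosition does not advance at offset {pos}")
        pos = header["NextPosition"]


if __name__ == "__main__":
    # Synthetic file: placeholder magic bytes plus two events with
    # made-up type codes; the first event carries 3 bytes of data.
    first = EVENT_HEADER.pack(100, 1, 7, 24, 28) + b"abc"
    second = EVENT_HEADER.pack(200, 2, 7, 21, 49)
    for position, header in iter_event_headers(bytes(MAGIC_SIZE) + first + second):
        print(position, header)
```

The sketch reads only headers and skips over event data, which is enough to index a file by event type before decoding any payload.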



### Descriptor Event format

```
+=====================================+
| event | timestamp 0 : 8 | create timestamp
| header +----------------------------+
| | type_code 8 : 1 | event type code
| +----------------------------+
| | server_id 9 : 4 | write node id
| +----------------------------+
| | event_length 13 : 4 | length of event, including header and data
| +----------------------------+
| | next_position 17 : 4 | offset of next event from the start of file
+=====================================+
| event | binlog_version 21 : 2 | binlog version
| data +----------------------------+
| | server_version 23 : 8 | write node version
| +----------------------------+
| | commit_id 31 : 8 | commit id of the program in git
| +----------------------------+
| | header_length 39 : 1 | header length of other event
| +----------------------------+
| | collection_id 40 : 8 | collection id
| +----------------------------+
| | partition_id 48 : 8 | partition id (schema column does not need)
| +----------------------------+
| | segment_id 56 : 8 | segment id (schema column does not need)
| +----------------------------+
| | start_timestamp 64 : 1 | minimum timestamp allocated by master of all events in this file
| +----------------------------+
| | end_timestamp 65 : 1 | maximum timestamp allocated by master of all events in this file
| +----------------------------+
| | post-header 66 : n | array of n bytes, one byte per event type that the server knows about
| | lengths for all |
| | event types |
+=====================================+
+=====================================+=====================================================================+
| event | Timestamp 0 : 8 | create timestamp |
| header +----------------------------+---------------------------------------------------------------------+
| | TypeCode 8 : 1 | event type code |
| +----------------------------+---------------------------------------------------------------------+
| | ServerID 9 : 4 | write node id |
| +----------------------------+---------------------------------------------------------------------+
| | EventLength 13 : 4 | length of event, including header and data |
| +----------------------------+---------------------------------------------------------------------+
| | NextPosition 17 : 4 | offset of next event from the start of file |
+=====================================+=====================================================================+
| event | BinlogVersion 21 : 2 | binlog version |
| data +----------------------------+---------------------------------------------------------------------+
| | ServerVersion 23 : 8 | write node version |
| +----------------------------+---------------------------------------------------------------------+
| | CommitID 31 : 8 | commit id of the program in git |
| +----------------------------+---------------------------------------------------------------------+
| | HeaderLength 39 : 1 | header length of other event |
| +----------------------------+---------------------------------------------------------------------+
| | CollectionID 40 : 8 | collection id |
| +----------------------------+---------------------------------------------------------------------+
| | PartitionID 48 : 8 | partition id (not needed for the schema column) |
| +----------------------------+---------------------------------------------------------------------+
| | SegmentID 56 : 8 | segment id (not needed for the schema column) |
| +----------------------------+---------------------------------------------------------------------+
| | StartTimestamp 64 : 1 | minimum timestamp allocated by master of all events in this file |
| +----------------------------+---------------------------------------------------------------------+
| | EndTimestamp 65 : 1 | maximum timestamp allocated by master of all events in this file |
| +----------------------------+---------------------------------------------------------------------+
| | PayloadDataType 66 : 1 | data type of payload |
| +----------------------------+---------------------------------------------------------------------+
| | PostHeaderLength 67 : n | header lengths for all event types |
+=====================================+=====================================================================+
```
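The offsets in the descriptor event table follow from the field widths, which can be checked with a small running sum. The sketch below uses the widths exactly as the table lists them; it checks only that the table is internally consistent and makes no claim about what the writer actually emits. `field_offsets` is a hypothetical helper name.

```python
# (field name, size in bytes), in the order the table lists them.
COMMON_HEADER = [
    ("Timestamp", 8),
    ("TypeCode", 1),
    ("ServerID", 4),
    ("EventLength", 4),
    ("NextPosition", 4),
]

DESCRIPTOR_DATA = [
    ("BinlogVersion", 2),
    ("ServerVersion", 8),
    ("CommitID", 8),
    ("HeaderLength", 1),
    ("CollectionID", 8),
    ("PartitionID", 8),
    ("SegmentID", 8),
    ("StartTimestamp", 1),
    ("EndTimestamp", 1),
    ("PayloadDataType", 1),
]


def field_offsets(fields, start=0):
    """Return ({name: offset}, end_offset) for consecutive fields."""
    offsets = {}
    pos = start
    for name, size in fields:
        offsets[name] = pos
        pos += size
    return offsets, pos


if __name__ == "__main__":
    header_offsets, header_end = field_offsets(COMMON_HEADER)
    data_offsets, data_end = field_offsets(DESCRIPTOR_DATA, start=header_end)
    for name, offset in {**header_offsets, **data_offsets}.items():
        print(f"{name:16} {offset}")
    # The variable-length PostHeaderLength array starts where the
    # fixed-width fields end.
    print(f"{'PostHeaderLength':16} {data_end}")
```

Keeping the layout as a list of `(name, size)` pairs means a changed width shifts every later offset automatically, which is easier to keep in step with the table than hard-coded offsets.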



### Type code

```
@@ -88,12 +85,11 @@ DROP_PARTITION_EVENT

DESCRIPTOR_EVENT must appear in all column files and always be the first event.

INSERT_EVENT may appear in the binlog of any column except the DDL binlog file.

DELETE_EVENT may only be used in the binlog file of the primary key (currently, deletion is supported only by primary key).

CREATE_COLLECTION_EVENT, DROP_COLLECTION_EVENT, CREATE_PARTITION_EVENT and DROP_PARTITION_EVENT appear only in the DDL binlog file.


### Event data part
@@ -102,28 +98,21 @@ CREATE_COLLECTION_EVENT, DROP_COLLECTION_EVENT, CREATE_PARTITION_EVENT, DROP_
event data part
INSERT_EVENT:
+================================================+
| event | fixed | start_timestamp x : 8 | min timestamp in this event
| data | part +------------------------------+
| | | end_timestamp x+8 : 8 | max timestamp in this event
| | +------------------------------+
| | | reserved x+16 : y-x-16 | reserved part
| +--------+------------------------------+
| |variable| parquet payload | payload in parquet format
| |part | |
+================================================+
other events is similar with INSERT_EVENT
+================================================+==========================================================+
| event | fixed | StartTimestamp x : 8 | min timestamp in this event |
| data | part +------------------------------+----------------------------------------------------------+
| | | EndTimestamp x+8 : 8 | max timestamp in this event |
| | +------------------------------+----------------------------------------------------------+
| | | reserved x+16 : y | reserved part |
| +--------+------------------------------+----------------------------------------------------------+
| |variable| parquet payload | payload in parquet format |
| |part | | |
+================================================+==========================================================+
other events are similar to INSERT_EVENT
```
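A minimal sketch of reading the INSERT_EVENT data part, assuming the event data starts at offset 21 (right after the common header) and that both timestamps are little-endian signed 64-bit integers. The table does not say where the reserved part ends (`y`), so the length of the fixed part is a parameter here. `parse_insert_event_data` and its arguments are hypothetical names chosen for illustration.

```python
import struct

# Assumption: two little-endian signed 64-bit timestamps (8 bytes each).
INSERT_FIXED = struct.Struct("<qq")


def parse_insert_event_data(event, data_offset=21, fixed_part_length=16):
    """Split one whole INSERT_EVENT (header + data) into its parts.

    `fixed_part_length` covers StartTimestamp, EndTimestamp and any
    reserved bytes; the default of 16 assumes no reserved bytes.
    """
    if fixed_part_length < INSERT_FIXED.size:
        raise ValueError("fixed part must hold both timestamps")
    start_ts, end_ts = INSERT_FIXED.unpack_from(event, data_offset)
    if start_ts > end_ts:
        raise ValueError("StartTimestamp is greater than EndTimestamp")
    return {
        "StartTimestamp": start_ts,
        "EndTimestamp": end_ts,
        # Everything after the fixed part is the variable part: the
        # payload in parquet format, returned undecoded.
        "Payload": bytes(event[data_offset + fixed_part_length:]),
    }


if __name__ == "__main__":
    # Zeroed 21-byte header, two timestamps, then placeholder payload bytes.
    sample = bytes(21) + INSERT_FIXED.pack(10, 20) + b"placeholder"
    print(parse_insert_event_data(sample))
```

The payload is returned as raw bytes on purpose: decoding it requires a parquet reader, which is outside what the layout table describes.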

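The placement rules given in the Type code section (DESCRIPTOR_EVENT first in every file, INSERT_EVENT in any binlog except the DDL one, DELETE_EVENT only in the primary key binlog, DDL events only in the DDL binlog) can be written as a small checker. This is a sketch under stated assumptions: the three file-kind labels and the function name `check_event_placement` are hypothetical, and event types are compared by name because the numeric type codes are not shown here.

```python
DDL_EVENTS = {
    "CREATE_COLLECTION_EVENT",
    "DROP_COLLECTION_EVENT",
    "CREATE_PARTITION_EVENT",
    "DROP_PARTITION_EVENT",
}

# Hypothetical labels for the three kinds of binlog file the rules distinguish.
DDL = "ddl"
PRIMARY_KEY = "primary_key"
OTHER_COLUMN = "other_column"


def check_event_placement(file_kind, event_types):
    """Return a list of rule violations; an empty list means the file is valid."""
    problems = []
    if not event_types or event_types[0] != "DESCRIPTOR_EVENT":
        problems.append("DESCRIPTOR_EVENT must be the first event")
    for index, event in enumerate(event_types):
        if event == "INSERT_EVENT" and file_kind == DDL:
            problems.append(f"event {index}: INSERT_EVENT is not allowed in the DDL binlog")
        elif event == "DELETE_EVENT" and file_kind != PRIMARY_KEY:
            problems.append(f"event {index}: DELETE_EVENT is only allowed in the primary key binlog")
        elif event in DDL_EVENTS and file_kind != DDL:
            problems.append(f"event {index}: {event} is only allowed in the DDL binlog")
    return problems


if __name__ == "__main__":
    print(check_event_placement(PRIMARY_KEY, ["DESCRIPTOR_EVENT", "INSERT_EVENT", "DELETE_EVENT"]))
    print(check_event_placement(OTHER_COLUMN, ["DESCRIPTOR_EVENT", "DELETE_EVENT"]))
```

Returning all violations instead of stopping at the first one makes the checker more useful when inspecting a damaged or hand-built binlog file.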






### Example

Schema
@@ -212,12 +201,4 @@ CStatus GetFloatVectorFromPayload(CPayloadReader payloadReader, float **values,

int GetPayloadLengthFromReader(CPayloadReader payloadReader);
CStatus ReleasePayloadReader(CPayloadReader payloadReader);

```