[Docs] Translate clickhousefile, phoenix, rabbitmq, starrocks sink doc into chinese (apache#7015)

1 parent d296842, commit bff74ed
Showing 4 changed files with 611 additions and 0 deletions.

# ClickhouseFile

> ClickhouseFile sink connector

## Description

This sink uses the clickhouse-local program to generate ClickHouse data files and then sends them to the ClickHouse server, a process also known as bulk load. It only supports tables whose engine is 'Distributed' and whose `internal_replication` option is set to `true`. Both batch and streaming modes are supported.

## Key Features

- [ ] [exactly-once](../../concept/connector-v2-features.md)

:::tip

You can also write data to ClickHouse via JDBC.

:::

## Sink Options

| Name                   | Type    | Required | Default                                |
|------------------------|---------|----------|----------------------------------------|
| host                   | string  | yes      | -                                      |
| database               | string  | yes      | -                                      |
| table                  | string  | yes      | -                                      |
| username               | string  | yes      | -                                      |
| password               | string  | yes      | -                                      |
| clickhouse_local_path  | string  | yes      | -                                      |
| sharding_key           | string  | no       | -                                      |
| copy_method            | string  | no       | scp                                    |
| node_free_password     | boolean | no       | false                                  |
| node_pass              | list    | no       | -                                      |
| node_pass.node_address | string  | no       | -                                      |
| node_pass.username     | string  | no       | "root"                                 |
| node_pass.password     | string  | no       | -                                      |
| compatible_mode        | boolean | no       | false                                  |
| file_fields_delimiter  | string  | no       | "\t"                                   |
| file_temp_path         | string  | no       | "/tmp/seatunnel/clickhouse-local/file" |
| common-options         |         | no       | -                                      |

### host [string]

`ClickHouse` cluster address in the format `host:port`; multiple hosts can be specified at once, e.g. `"host1:8123,host2:8123"`.

### database [string]

The `ClickHouse` database name.

### table [string]

The table name.

### username [string]

The username for connecting to `ClickHouse`.

### password [string]

The password for connecting to `ClickHouse`.

### sharding_key [string]

When ClickhouseFile splits data, it needs to decide which node each row is sent to. By default a random algorithm is used; you can also use the 'sharding_key' parameter to specify a field whose value determines the target shard.

### clickhouse_local_path [string]

The path of the clickhouse-local program on the Spark nodes. Since it is invoked by every task, the clickhouse-local program must sit at the same path on every Spark node.

### copy_method [string]

The method used to transfer files. The default is scp; the allowed values are scp and rsync.

### node_free_password [boolean]

Because SeaTunnel uses scp or rsync to transfer files, it needs access to the ClickHouse servers. If password-free login is configured between every Spark node and the ClickHouse servers, set this option to true; otherwise configure the password of each node in the node_pass parameter.

### node_pass [list]

Holds the addresses of all ClickHouse servers and their corresponding access passwords.

### node_pass.node_address [string]

The address of the ClickHouse server node.

### node_pass.username [string]

The username of the ClickHouse server node; the default is root.

### node_pass.password [string]

The access password of the ClickHouse server node.
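
As an illustration of the node_pass options above, the fragment below lists credentials for two ClickHouse nodes; the addresses, usernames, and passwords are hypothetical.

```hocon
# hypothetical nodes; only consulted when node_free_password = false
node_pass = [
  {
    node_address = "10.0.0.1"
    username     = "root"
    password     = "password-for-node1"
  },
  {
    node_address = "10.0.0.2"
    username     = "clickhouse"
    password     = "password-for-node2"
  }
]
```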

### compatible_mode [boolean]

In older versions of ClickHouse, the clickhouse-local program does not support the `--path` parameter; set this option so that the effect of `--path` is achieved in another way.

### file_fields_delimiter [string]

ClickhouseFile uses the CSV format to store data temporarily. If the data contains the CSV delimiter, the program may fail; use this configuration to avoid that. The configured value must be exactly one character long.

### file_temp_path [string]

The directory where ClickhouseFile stores temporary files locally.

### common options

Common parameters of sink plugins; see [Sink Common Options](common-options.md) for details.

## Example

```hocon
ClickhouseFile {
  host = "192.168.0.1:8123"
  database = "default"
  table = "fake_all"
  username = "default"
  password = ""
  clickhouse_local_path = "/Users/seatunnel/Tool/clickhouse local"
  sharding_key = "age"
  node_free_password = false
  node_pass = [{
    node_address = "192.168.0.1"
    password = "seatunnel"
  }]
}
```
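
For reference, here is a variant of the example above that also sets the optional transfer and temp-file options described on this page; the host, path, delimiter, and other values are illustrative, not recommended defaults.

```hocon
ClickhouseFile {
  host = "192.168.0.1:8123"
  database = "default"
  table = "fake_all"
  username = "default"
  password = ""
  # illustrative path; must be identical on every Spark node
  clickhouse_local_path = "/usr/bin/clickhouse-local"
  # transfer the generated files with rsync instead of the default scp
  copy_method = "rsync"
  # needed when the local clickhouse-local does not support --path
  compatible_mode = true
  # a delimiter that does not appear in the data; must be exactly one character
  file_fields_delimiter = "|"
  file_temp_path = "/tmp/seatunnel/clickhouse-local/file"
  # assumes password-free login is configured between Spark nodes and ClickHouse servers
  node_free_password = true
}
```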

## Changelog

### 2.2.0-beta 2022-09-26

- Support writing data to ClickHouse files and migrating them to the ClickHouse data directory

### Next Version

- [BugFix] Fix the name conflict of generated data parts and improve the file commit logic [3416](https://github.com/apache/seatunnel/pull/3416)
- [Feature] Support compatible_mode to be compatible with lower versions of ClickHouse [3416](https://github.com/apache/seatunnel/pull/3416)

# Phoenix

> Phoenix sink connector

## Description

This sink writes Phoenix data through the [Jdbc connector](Jdbc.md) and supports both batch and streaming modes. The tested Phoenix versions are 4.xx and 5.xx.
Under the hood it uses the Phoenix JDBC driver to execute upsert statements that write data into HBase.
There are two ways to connect to Phoenix with Java JDBC: one connects to ZooKeeper over JDBC, the other connects to the query server through the JDBC thin client.

> Tip 1: This sink uses the (thin) driver jar by default. If you need the (thick) driver or another version of the Phoenix (thin) driver, you have to recompile the Jdbc sink module.
>
> Tip 2: This sink does not support exactly-once semantics yet (because Phoenix does not support XA transactions).

## Key Features

- [ ] [exactly-once](../../concept/connector-v2-features.md)

## Sink Options

### driver [string]

Phoenix (thick) driver: `org.apache.phoenix.jdbc.PhoenixDriver`
Phoenix (thin) driver: `org.apache.phoenix.queryserver.client.Driver`

### url [string]

Phoenix (thick) driver: `jdbc:phoenix:localhost:2182/hbase`
Phoenix (thin) driver: `jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF`

### common options

Common parameters of sink plugins; see [Sink Common Options](common-options.md) for details.

## Example

Thick driver:

```
Jdbc {
    driver = org.apache.phoenix.jdbc.PhoenixDriver
    url = "jdbc:phoenix:localhost:2182/hbase"
    query = "upsert into test.sink(age, name) values(?, ?)"
}
```

Thin driver:

```
Jdbc {
    driver = org.apache.phoenix.queryserver.client.Driver
    url = "jdbc:phoenix:thin:url=http://spark_e2e_phoenix_sink:8765;serialization=PROTOBUF"
    query = "upsert into test.sink(age, name) values(?, ?)"
}
```
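
To show where the Jdbc block sits in a complete job, here is a minimal sketch that pairs the thin-driver sink with a generated-data source; the env settings, FakeSource schema, and query server address are assumptions for illustration, not part of the original examples.

```hocon
env {
  # assumed settings for a small batch job
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
  # hypothetical source producing rows with the two columns the upsert expects
  FakeSource {
    row.num = 16
    schema = {
      fields {
        age  = "int"
        name = "string"
      }
    }
  }
}

sink {
  Jdbc {
    driver = org.apache.phoenix.queryserver.client.Driver
    url = "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF"
    query = "upsert into test.sink(age, name) values(?, ?)"
  }
}
```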

## Changelog

### 2.2.0-beta 2022-09-26

- Add Phoenix sink connector

# Rabbitmq

> Rabbitmq sink connector

## Description

This sink writes data to Rabbitmq.

## Key Features

- [ ] [exactly-once](../../concept/connector-v2-features.md)

## Sink Options

| Name                       | Type    | Required | Default |
|----------------------------|---------|----------|---------|
| host                       | string  | yes      | -       |
| port                       | int     | yes      | -       |
| virtual_host               | string  | yes      | -       |
| username                   | string  | yes      | -       |
| password                   | string  | yes      | -       |
| queue_name                 | string  | yes      | -       |
| url                        | string  | no       | -       |
| network_recovery_interval  | int     | no       | -       |
| topology_recovery_enabled  | boolean | no       | -       |
| automatic_recovery_enabled | boolean | no       | -       |
| use_correlation_id         | boolean | no       | false   |
| connection_timeout         | int     | no       | -       |
| rabbitmq.config            | map     | no       | -       |
| common-options             |         | no       | -       |

### host [string]

The Rabbitmq server address.

### port [int]

The Rabbitmq server port.

### virtual_host [string]

virtual host – the vhost to use when connecting to the broker.

### username [string]

The username to use when connecting to the broker.

### password [string]

The password to use when connecting to the broker.

### url [string]

A convenient way to set host, port, username, password, and virtual host in a single value.
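
As a sketch of what such a value can look like, the fragment below uses the standard AMQP URI form `amqp://user:password@host:port/vhost`; the broker address and credentials are hypothetical, and `%2F` is the URL-encoded default vhost `/`.

```hocon
# hypothetical broker; equivalent to setting host, port, username, password and virtual_host separately
url = "amqp://guest:guest@rabbitmq-e2e:5672/%2F"
```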

### queue_name [string]

The name of the queue the data is written to.

### schema [Config]

#### fields [Config]

The schema fields of the upstream data.

### network_recovery_interval [int]

How long automatic recovery waits before attempting to reconnect, in milliseconds.

### topology_recovery_enabled [boolean]

Set to true to enable topology recovery.

### automatic_recovery_enabled [boolean]

Set to true to enable connection recovery.

### use_correlation_id [boolean]

Whether the received messages all carry a unique ID, so that duplicate messages can be dropped for idempotence (in case of failure).

### connection_timeout [int]

The timeout for establishing the TCP connection, in milliseconds; 0 means unlimited.

### rabbitmq.config [map]

In addition to the parameters above that must be specified for the RabbitMQ client, you can also pass multiple optional parameters to the client, covering [all the parameters listed in the official RabbitMQ documentation](https://www.rabbitmq.com/configure.html).

### common options

Common parameters of sink plugins; see [Sink Common Options](common-options.md) for details.

## Example

simple:

```hocon
sink {
  RabbitMQ {
    host = "rabbitmq-e2e"
    port = 5672
    virtual_host = "/"
    username = "guest"
    password = "guest"
    queue_name = "test1"
    rabbitmq.config = {
      requested-heartbeat = 10
      connection-timeout = 10
    }
  }
}
```
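
For reference, here is a sketch that also sets the optional recovery, timeout, and correlation-id options described above; the broker address, credentials, and numeric values are illustrative only.

```hocon
sink {
  RabbitMQ {
    host = "rabbitmq-e2e"
    port = 5672
    virtual_host = "/"
    username = "guest"
    password = "guest"
    queue_name = "test1"
    # reconnect 10 seconds after a connection failure
    automatic_recovery_enabled = true
    network_recovery_interval = 10000
    # give up establishing the TCP connection after 5 seconds
    connection_timeout = 5000
    # messages carry a unique id so duplicates can be dropped in case of failure
    use_correlation_id = true
  }
}
```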

## Changelog

### Next Version

- Add Rabbitmq sink connector
- [Improve] Change the data type of the connector custom config prefix to Map [3719](https://github.com/apache/seatunnel/pull/3719)