[Fix][Connector-V2] User selects csv string pattern (#8572)
corgy-w authored Feb 6, 2025
1 parent 0bf0693 commit 227a11f
Showing 16 changed files with 281 additions and 78 deletions.
11 changes: 10 additions & 1 deletion docs/en/connector-v2/sink/CosFile.md
@@ -59,6 +59,7 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
| csv_string_quote_mode | enum | no | MINIMAL | Only used when file_format is csv. |
| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
@@ -107,7 +108,7 @@ Only used when `custom_filename` is `true`

When the `file_name_expression` parameter has the form `xxxx-${now}`, `filename_time_format` specifies the time format of the path. The default value is `yyyy.MM.dd`. Commonly used time format symbols are listed below:

| Symbol | Description |
|--------|--------------------|
| y | Year |
| M | Month |
@@ -199,6 +200,14 @@ When File Format is Excel,The maximum number of data items that can be cached in

The name of the sheet to write in the workbook.

### csv_string_quote_mode [string]

When the file format is CSV, this option sets how string fields are quoted.

- ALL: Quotes all string fields.
- MINIMAL: Quotes only fields that contain special characters, such as the field delimiter, the quote character, or any character in the line separator string.
- NONE: Never quotes fields. When the delimiter occurs in the data, the printer prefixes it with the escape character. If no escape character is set, format validation throws an exception.
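
The three modes can be illustrated with Python's standard `csv` module, whose `QUOTE_ALL`, `QUOTE_MINIMAL`, and `QUOTE_NONE` constants behave in a closely analogous way. This is only a sketch of the quoting semantics, not the connector's implementation: the `render` helper and the sample row are hypothetical, and Python raises its error at write time rather than at format validation.

```python
import csv
import io


def render(row, quoting, escapechar=None):
    """Hypothetical helper: serialize one row with the given quoting mode."""
    buf = io.StringIO()
    writer = csv.writer(
        buf, quoting=quoting, escapechar=escapechar, lineterminator="\n"
    )
    writer.writerow(row)
    return buf.getvalue().rstrip("\n")


# The second field contains the delimiter, so the modes differ visibly.
row = ["alice", "a,b"]

all_out = render(row, csv.QUOTE_ALL)          # every field is quoted
minimal_out = render(row, csv.QUOTE_MINIMAL)  # only the field with a comma is quoted
none_out = render(row, csv.QUOTE_NONE, escapechar="\\")  # comma is escaped instead

# Analogue of NONE without an escape character: the write is rejected.
try:
    render(row, csv.QUOTE_NONE)
    none_without_escape_failed = False
except csv.Error:
    none_without_escape_failed = True

print(all_out)
print(minimal_out)
print(none_out)
```

MINIMAL keeps files compact while staying parseable, ALL is useful when a downstream reader expects every string to be quoted, and NONE only makes sense when an escape character is configured or the data is known to be free of delimiters.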

### xml_root_tag [string]

Specifies the tag name of the root element within the XML file.
14 changes: 14 additions & 0 deletions docs/en/connector-v2/sink/FtpFile.md
@@ -58,6 +58,7 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
| csv_string_quote_mode | enum | no | MINIMAL | Only used when file_format is csv. |
| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
@@ -207,6 +208,14 @@ When File Format is Excel,The maximum number of data items that can be cached in

The name of the sheet to write in the workbook.

### csv_string_quote_mode [string]

When the file format is CSV, this option sets how string fields are quoted.

- ALL: Quotes all string fields.
- MINIMAL: Quotes only fields that contain special characters, such as the field delimiter, the quote character, or any character in the line separator string.
- NONE: Never quotes fields. When the delimiter occurs in the data, the printer prefixes it with the escape character. If no escape character is set, format validation throws an exception.

### xml_root_tag [string]

Specifies the tag name of the root element within the XML file.
@@ -237,17 +246,22 @@ Only used when file_format_type is json,text,csv,xml.
The encoding of the file to write. This param will be parsed by `Charset.forName(encoding)`.

### schema_save_mode [string]

How to handle an existing directory.

- RECREATE_SCHEMA: creates the directory if it does not exist; deletes and recreates it if it exists
- CREATE_SCHEMA_WHEN_NOT_EXIST: creates the directory if it does not exist; skips creation if it exists
- ERROR_WHEN_SCHEMA_NOT_EXIST: reports an error if the directory does not exist
- IGNORE: ignores the treatment of the table
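
The directory handling described above can be sketched against a local filesystem. This is an illustrative model only: `apply_schema_save_mode` is a hypothetical helper, not part of the connector, and treating `IGNORE` as a no-op is an assumption drawn from the one-line description.

```python
import shutil
import tempfile
from pathlib import Path


def apply_schema_save_mode(directory: Path, mode: str) -> None:
    """Hypothetical local-filesystem model of the schema_save_mode options."""
    if mode == "RECREATE_SCHEMA":
        # Delete an existing directory (with its contents), then create it fresh.
        if directory.exists():
            shutil.rmtree(directory)
        directory.mkdir(parents=True)
    elif mode == "CREATE_SCHEMA_WHEN_NOT_EXIST":
        # Create only when missing; an existing directory is left untouched.
        directory.mkdir(parents=True, exist_ok=True)
    elif mode == "ERROR_WHEN_SCHEMA_NOT_EXIST":
        if not directory.exists():
            raise FileNotFoundError(f"directory does not exist: {directory}")
    elif mode == "IGNORE":
        # Assumption: no directory handling at all.
        return
    else:
        raise ValueError(f"unknown schema_save_mode: {mode}")


# Demo in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "out"
    apply_schema_save_mode(target, "CREATE_SCHEMA_WHEN_NOT_EXIST")
    (target / "part-0.csv").write_text("a,b\n")
    apply_schema_save_mode(target, "RECREATE_SCHEMA")
    remaining = sorted(p.name for p in target.iterdir())
```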

### data_save_mode [string]

How to handle existing data.

- DROP_DATA: keeps the directory and deletes the data files
- APPEND_DATA: keeps the directory and the data files
- ERROR_WHEN_DATA_EXISTS: reports an error if data files exist
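
As a rough illustration of the three data modes, the sketch below applies them to a local directory. `apply_data_save_mode` is a hypothetical helper written for this example, and it assumes that "data files" means the regular files directly inside the directory; the connector itself may decide this differently.

```python
import tempfile
from pathlib import Path


def apply_data_save_mode(directory: Path, mode: str) -> None:
    """Hypothetical local-filesystem model of the data_save_mode options."""
    # Assumption: data files are the regular files directly under the directory.
    data_files = [p for p in directory.iterdir() if p.is_file()]
    if mode == "DROP_DATA":
        # Keep the directory, remove the files inside it.
        for path in data_files:
            path.unlink()
    elif mode == "APPEND_DATA":
        # Keep the directory and everything already in it.
        return
    elif mode == "ERROR_WHEN_DATA_EXISTS":
        if data_files:
            raise FileExistsError(f"data files already exist in {directory}")
    else:
        raise ValueError(f"unknown data_save_mode: {mode}")


# Demo in a throwaway temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    (target / "part-0.csv").write_text("a,b\n")
    apply_data_save_mode(target, "APPEND_DATA")
    after_append = sorted(p.name for p in target.iterdir())
    apply_data_save_mode(target, "DROP_DATA")
    after_drop = sorted(p.name for p in target.iterdir())
```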

## Example

For text file format simple config
9 changes: 9 additions & 0 deletions docs/en/connector-v2/sink/HdfsFile.md
@@ -65,6 +65,7 @@ Output data to hdfs file
| common-options | object | no | - | Sink plugin common parameters, please refer to [Sink Common Options](../sink-common-options.md) for details |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. The maximum number of data items that can be cached in memory. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. The name of the sheet to write in the workbook. |
| csv_string_quote_mode | enum | no | MINIMAL | Only used when file_format is csv. |
| xml_root_tag | string | no | RECORDS | Only used when file_format is xml, specifies the tag name of the root element within the XML file. |
| xml_row_tag | string | no | RECORD | Only used when file_format is xml, specifies the tag name of the data rows within the XML file |
| xml_use_attr_format | boolean | no | - | Only used when file_format is xml, specifies whether to process data using the tag attribute format. |
@@ -203,6 +204,14 @@ HdfsFile {

Only used when file_format_type is text or csv. false: don't write the header; true: write the header.

### csv_string_quote_mode [string]

When the file format is CSV, this option sets how string fields are quoted.

- ALL: Quotes all string fields.
- MINIMAL: Quotes only fields that contain special characters, such as the field delimiter, the quote character, or any character in the line separator string.
- NONE: Never quotes fields. When the delimiter occurs in the data, the printer prefixes it with the escape character. If no escape character is set, format validation throws an exception.

### For compress simple config

```