diff --git a/snooty.toml b/snooty.toml
new file mode 100644
index 00000000..f7bac1ef
--- /dev/null
+++ b/snooty.toml
@@ -0,0 +1,5 @@
+name = "kafka-connector"
+title = "MongoDB Kafka Connector"
+intersphinx = ["https://docs.mongodb.com/manual/objects.inv"]
+
+toc_landing_pages = ["/kafka-sink"]
diff --git a/source/includes/externalize-secrets.rst b/source/includes/externalize-secrets.rst
index 5d358409..3d8054a5 100644
--- a/source/includes/externalize-secrets.rst
+++ b/source/includes/externalize-secrets.rst
@@ -1,5 +1,4 @@
-.. admonition:: Avoid Exposing Your Authentication Credentials
-   :class: important
+.. important:: Avoid Exposing Your Authentication Credentials
 
    To avoid exposing your authentication credentials in your
    ``connection.uri`` setting, use a
diff --git a/source/kafka-docker-example.txt b/source/kafka-docker-example.txt
index b150605a..97589c6e 100644
--- a/source/kafka-docker-example.txt
+++ b/source/kafka-docker-example.txt
@@ -185,7 +185,7 @@ http://localhost:9021/ and navigate to the cluster topics.
 Next, explore the collection data in the MongoDB replica set:
 
 * In your local shell, navigate to the ``docker`` directory from which you
-  ran the ``docker-compose`` commands and connect to the `mongo1` MongoDB
+  ran the ``docker-compose`` commands and connect to the ``mongo1`` MongoDB
   instance using the following command:
 
   .. code-block:: none
diff --git a/source/kafka-sink-data-formats.txt b/source/kafka-sink-data-formats.txt
index df0e16bb..3f50e668 100644
--- a/source/kafka-sink-data-formats.txt
+++ b/source/kafka-sink-data-formats.txt
@@ -103,8 +103,7 @@ without schema** data format. The Kafka topic data must be in JSON format.
       value.converter=org.apache.kafka.connect.json.JsonConverter
       value.converter.schemas.enable=false
 
-.. admonition:: Choose the appropriate data format
-   :class: note
+.. note:: Choose the appropriate data format
 
    When you specify **JSON without Schema**, any JSON schema objects such
    as ``schema`` or ``payload`` are read explicitly rather than as a
diff --git a/source/kafka-sink-postprocessors.txt b/source/kafka-sink-postprocessors.txt
index 1db55322..7dad1aab 100644
--- a/source/kafka-sink-postprocessors.txt
+++ b/source/kafka-sink-postprocessors.txt
@@ -40,32 +40,37 @@ class or use one of the following pre-built ones:
     - | Full Path: ``com.mongodb.kafka.connect.sink.processor.DocumentIdAdder``
       | Uses a configured *strategy* to insert an ``_id`` field.
 
-      .. seealso:: :ref:`Strategy options and configuration `.
+      .. seealso::
+         :ref:`Strategy options and configuration `.
 
   * - BlockListKeyProjector
     - | Full Path: ``com.mongodb.kafka.connect.sink.processor.BlockListKeyProjector``
       | Removes matching key fields from the sink record.
 
-      .. seealso:: :ref:`Configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Configuration ` and :ref:`Example `.
 
   * - BlockListValueProjector
     - | Full Path: ``com.mongodb.kafka.connect.sink.processor.BlockListValueProjector``
       | Removes matching value fields from the sink record.
 
-      .. seealso:: :ref:`Configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Configuration ` and :ref:`Example `.
 
   * - AllowListKeyProjector
     - | Full Path: ``com.mongodb.kafka.connect.sink.processor.AllowListKeyProjector``
       | Includes only matching key fields from the sink record.
 
-      .. seealso:: :ref:`Configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Configuration ` and :ref:`Example `.
 
  * - AllowListValueProjector
    - | Full Path: ``com.mongodb.kafka.connect.sink.processor.AllowListValueProjector``
      | Includes only matching value fields from the sink record.
 
-      .. seealso:: :ref:`Configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Configuration ` and :ref:`Example `.
 
  * - KafkaMetaAdder
    - | Full Path: ``com.mongodb.kafka.connect.sink.processor.KafkaMetaAdder``
      | Adds a field composed of the concatenation of Kafka topic, partition, and offset to the document.
@@ -74,14 +79,16 @@ class or use one of the following pre-built ones:
    - | Full Path: ``com.mongodb.kafka.connect.sink.processor.field.renaming.RenameByMapping``
      | Renames fields that are an exact match to a specified field name in the key or value document.
 
-      .. seealso:: :ref:`Renaming configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Renaming configuration ` and :ref:`Example `.
 
  * - RenameByRegex
    - | Full Path: ``com.mongodb.kafka.connect.sink.processor.field.renaming.RenameByRegex``
      | Renames fields that match a regular expression.
 
-      .. seealso:: :ref:`Renaming configuration ` and :ref:`Example `.
+      .. seealso::
+         :ref:`Renaming configuration ` and :ref:`Example `.
 
 You can configure the post processor chain by specifying an ordered,
 comma separated list of fully-qualified ``PostProcessor`` class names:
@@ -178,8 +185,7 @@ To define a custom strategy, create a class that implements the
 interface and provide the fully-qualified path to the
 ``document.id.strategy`` setting.
 
-.. admonition:: Selected strategy may have implications on delivery semantics
-   :class: note
+.. note:: Selected strategy may have implications on delivery semantics
 
    BSON ObjectId or UUID strategies can only guarantee at-least-once
   delivery since new ids would be generated on retries or re-processing.
@@ -324,10 +330,10 @@ The previous example projection configurations demonstrated exact string
 matching on field names. The projection ``list`` setting also supports the
 following wildcard patterns matching on field names:
 
-* "``*``" (`star`): matches a string of any length for the level in the
+* "``*``" (``star``): matches a string of any length for the level in the
   document in which it is specified.
 
-* "``**``" (`double star`): matches the current and all nested levels from
+* "``**``" (``double star``): matches the current and all nested levels from
  which it is specified.
 
 The examples below demonstrate how to use each wildcard pattern and the
@@ -637,7 +643,7 @@ The post processor applied the following changes:
  subdocuments of ``crepes`` are matched. In the matched fields, all
  instances of "purchased" are replaced with "quantity".
 
-.. admonition:: Ensure renaming does not result in duplicate keys in the same document
+.. tip:: Ensure renaming does not result in duplicate keys in the same document
 
   The renaming post processors update the key fields of a JSON document
  which can result in duplicate keys within a document. They skip the
@@ -650,9 +656,9 @@ Custom Write Models
 
 A **write model** defines the behavior of bulk write operations made on a
 MongoDB collection. The default write model for the connector is
-:java-docs-latest:`ReplaceOneModel
+:java-docs:`ReplaceOneModel
 ` with
-:java-docs-latest:`ReplaceOptions `
+:java-docs:`ReplaceOptions `
 set to upsert mode.
 
 You can override the default write model by specifying a custom one in the
@@ -674,8 +680,9 @@ strategies are provided with the connector:
    - | Replaces at most one document that matches filters provided by the ``document.id.strategy`` setting.
      | Set the following configuration: ``writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy``
 
-      .. seealso:: Example of usage in :ref:`writemodel-strategy-business-key`.
+      .. seealso::
+         Example of usage in :ref:`writemodel-strategy-business-key`.
 
  * - DeleteOneDefaultStrategy
    - | Deletes at most one document that matches the id specified by the ``document.id.strategy`` setting, only when the document contains a null value record.
      | Implicitly specified when the configuration setting ``mongodb.delete.on.null.values=true`` is set.
@@ -685,8 +692,9 @@ strategies are provided with the connector:
    - | Add ``_insertedTS`` (inserted timestamp) and ``_modifiedTS`` (modified timestamp) fields into documents.
      | Set the following configuration: ``writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.UpdateOneTimestampsStrategy``
 
-      .. seealso:: Example of usage in :ref:`writemodel-strategy-timestamps`.
+      .. seealso::
+         Example of usage in :ref:`writemodel-strategy-timestamps`.
 
  * - UpdateOneBusinessKeyTimestampStrategy
    - | Add ``_insertedTS`` (inserted timestamp) and ``_modifiedTS`` (modified timestamp) fields into documents that match the filters provided by the ``document.id.strategy`` setting.
      | Set the following configuration: ``writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.UpdateOneBusinessKeyTimestampStrategy``
diff --git a/source/kafka-sink-properties.txt b/source/kafka-sink-properties.txt
index 40c0ceef..9861ee9c 100644
--- a/source/kafka-sink-properties.txt
+++ b/source/kafka-sink-properties.txt
@@ -43,15 +43,13 @@ data to sink to MongoDB. For an example configuration file, see
     - string
     - | A regular expression that matches the Kafka topics that the sink connector should watch.
 
-      .. example::
+      The following regex matches topics such as
+      "activity.landing.clicks" and "activity.support.clicks",
+      but not "activity.landing.views" or "activity.clicks":
 
-         | The following regex matches topics such as
-         | "activity.landing.clicks" and "activity.support.clicks",
-         | but not "activity.landing.views" or "activity.clicks":
+      .. code-block:: none
 
-         .. code-block:: none
-
-            topics.regex=activity\\.\\w+\\.clicks$
+         topics.regex=activity\\.\\w+\\.clicks$
 
      | *Required*
      | **Note:** You can only define either ``topics`` or ``topics.regex``.
@@ -61,11 +59,9 @@ data to sink to MongoDB. For an example configuration file, see
     - string
     - | A :manual:`MongoDB connection URI string
       `.
 
-      .. example::
-
-         .. code-block:: none
+      .. code-block:: none
 
-            mongodb://username:password@localhost/
+         mongodb://username:password@localhost/
 
      .. include:: /includes/externalize-secrets.rst
@@ -146,11 +142,9 @@ data to sink to MongoDB. For an example configuration file, see
     - string
     - | An inline JSON array with objects describing field name mappings.
 
-      .. example::
+      .. code-block:: none
 
-         .. code-block:: none
-
-            [ { "oldName":"key.fieldA", "newName":"field1" }, { "oldName":"value.xyz", "newName":"abc" } ]
+         [ { "oldName":"key.fieldA", "newName":"field1" }, { "oldName":"value.xyz", "newName":"abc" } ]
 
      | **Default**: ``[]``
      | **Accepted Values**: A valid JSON array
@@ -159,11 +153,9 @@ data to sink to MongoDB. For an example configuration file, see
     - string
     - | An inline JSON array containing regular expression statement objects.
 
-      .. example::
-
-         .. code-block:: none
+      .. code-block:: none
 
-            [ {"regexp":"^key\\\\..*my.*$", "pattern":"my", "replace":""}, {"regexp":"^value\\\\..*$", "pattern":"\\\\.", "replace":"_"} ]
+         [ {"regexp":"^key\\\\..*my.*$", "pattern":"my", "replace":""}, {"regexp":"^value\\\\..*$", "pattern":"\\\\.", "replace":"_"} ]
 
      | **Default**: ``[]``
      | **Accepted Values**: A valid JSON array
@@ -242,8 +234,7 @@ data to sink to MongoDB. For an example configuration file, see
     - int
     - | The maximum number of tasks that should be created for this connector. The connector may create fewer tasks if it cannot handle the specified level of parallelism.
 
-      .. admonition:: Messages May Be Processed Out of Order For Values Greater Than 1
-         :class: important
+      .. important:: Messages May Be Processed Out of Order For Values Greater Than 1
 
          If you specify a value greater than ``1``, the connector enables
         parallel processing of the tasks. If your topic has
diff --git a/source/kafka-source.txt b/source/kafka-source.txt
index 98f48fa2..6d7fe5f6 100644
--- a/source/kafka-source.txt
+++ b/source/kafka-source.txt
@@ -154,9 +154,10 @@ an example source connector configuration file, see
 
            [{"$match": {"operationType": "insert"}}, {"$addFields": {"Kafka": "Rules!"}}]
 
-      .. seealso:: :ref:`Custom pipeline example `.
+      .. seealso::
 
-      .. seealso:: :ref:`Multiple source example `.
+         - :ref:`Custom pipeline example `.
+         - :ref:`Multiple source example `.
 
      | **Default**: []
      | **Accepted Values**: Valid aggregation pipeline stages
@@ -195,60 +196,63 @@ an example source connector configuration file, see
   * - output.schema.key
     - string
-    - | The `Avro schema `__ definition for the key document of the SourceRecord.
-      | **Default**:
+    - The `Avro schema `__
+      definition for the key document of the SourceRecord.
+
+      **Default**:
 
-        .. code-block:: json
+      .. code-block:: json
 
-           {
-             "type": "record",
-             "name": "keySchema",
-             "fields" : [ { "name": "_id", "type": "string" } ]"
-           }
+         {
+           "type": "record",
+           "name": "keySchema",
+           "fields" : [ { "name": "_id", "type": "string" } ]
+         }
 
-      | **Accepted Values**: A valid JSON object
+      **Accepted Values**: A valid JSON object
 
   * - output.schema.value
     - string
-    - | The `Avro schema `__ definition for the value document of the SourceRecord.
-      |
-      | **Default**:
-
-        .. code-block:: json
-
-           {
-             "name": "ChangeStream",
-             "type": "record",
-             "fields": [
-               { "name": "_id", "type": "string" },
-               { "name": "operationType", "type": ["string", "null"] },
-               { "name": "fullDocument", "type": ["string", "null"] },
-               { "name": "ns",
-                 "type": [{"name": "ns", "type": "record", "fields": [
-                           {"name": "db", "type": "string"},
-                           {"name": "coll", "type": ["string", "null"] } ]
-                          }, "null" ] },
-               { "name": "to",
-                 "type": [{"name": "to", "type": "record", "fields": [
-                           {"name": "db", "type": "string"},
-                           {"name": "coll", "type": ["string", "null"] } ]
-                          }, "null" ] },
-               { "name": "documentKey", "type": ["string", "null"] },
-               { "name": "updateDescription",
-                 "type": [{"name": "updateDescription", "type": "record", "fields": [
-                           {"name": "updatedFields", "type": ["string", "null"]},
-                           {"name": "removedFields",
-                            "type": [{"type": "array", "items": "string"}, "null"]
-                            }] }, "null"] },
-               { "name": "clusterTime", "type": ["string", "null"] },
-               { "name": "txnNumber", "type": ["long", "null"]},
-               { "name": "lsid", "type": [{"name": "lsid", "type": "record",
-                          "fields": [ {"name": "id", "type": "string"},
-                                      {"name": "uid", "type": "string"}] }, "null"] }
-             ]
-           }
-
-      | **Accepted Values**: A valid JSON object
+    - The `Avro schema `__
+      definition for the value document of the SourceRecord.
+
+      **Default**:
+
+      .. code-block:: json
+
+         {
+           "name": "ChangeStream",
+           "type": "record",
+           "fields": [
+             { "name": "_id", "type": "string" },
+             { "name": "operationType", "type": ["string", "null"] },
+             { "name": "fullDocument", "type": ["string", "null"] },
+             { "name": "ns",
+               "type": [{"name": "ns", "type": "record", "fields": [
+                         {"name": "db", "type": "string"},
+                         {"name": "coll", "type": ["string", "null"] } ]
+                        }, "null" ] },
+             { "name": "to",
+               "type": [{"name": "to", "type": "record", "fields": [
+                         {"name": "db", "type": "string"},
+                         {"name": "coll", "type": ["string", "null"] } ]
+                        }, "null" ] },
+             { "name": "documentKey", "type": ["string", "null"] },
+             { "name": "updateDescription",
+               "type": [{"name": "updateDescription", "type": "record", "fields": [
+                         {"name": "updatedFields", "type": ["string", "null"]},
+                         {"name": "removedFields",
+                          "type": [{"type": "array", "items": "string"}, "null"]
+                          }] }, "null"] },
+             { "name": "clusterTime", "type": ["string", "null"] },
+             { "name": "txnNumber", "type": ["long", "null"]},
+             { "name": "lsid", "type": [{"name": "lsid", "type": "record",
+                        "fields": [ {"name": "id", "type": "string"},
+                                    {"name": "uid", "type": "string"}] }, "null"] }
+           ]
+         }
+
+      **Accepted Values**: A valid JSON object
 
   * - output.schema.infer.value
     - boolean
@@ -296,7 +300,9 @@ an example source connector configuration file, see
     - string
     - | Prefix to prepend to database & collection names to generate the name of the Kafka topic to publish data to.
 
-      .. seealso:: :ref:`Topic naming example `.
+      .. seealso::
+
+         :ref:`Topic naming example `.
 
      | **Default**: ""
      | **Accepted Values**: A string
@@ -314,14 +320,14 @@ an example source connector configuration file, see
      data. A namespace describes the database name and collection
      separated by a period, e.g. ``databaseName.collectionName``.
 
-     .. example::
+      .. example::
 
-        In the following example, the setting matches all collections
-        that start with "page" in the "stats" database.
+         In the following example, the setting matches all collections
+         that start with "page" in the "stats" database.
 
-        .. code-block:: none
+         .. code-block:: none
 
-           copy.existing.namespace.regex=stats\.page.*
+            copy.existing.namespace.regex=stats\.page.*
 
      | **Default**: ""
      | **Accepted Values**: A valid regular expression
@@ -346,18 +352,17 @@ an example source connector configuration file, see
      This can improve the use of indexes by the copying manager and
      make copying more efficient.
 
-     .. example::
+      .. example::
 
-        In the following example, the :manual:`$match
-        ` aggregation operator
-        ensures that only documents in which the ``closed`` field is
-        set to ``false`` are copied.
+         In the following example, the :manual:`$match
+         ` aggregation operator
+         ensures that only documents in which the ``closed`` field is
+         set to ``false`` are copied.
 
-        .. code-block:: none
+         .. code-block:: none
 
-           copy.existing.pipeline=[ { "$match": { "closed": "false" } } ]
+            copy.existing.pipeline=[ { "$match": { "closed": "false" } } ]
 
-     |
      | **Default**: []
      | **Accepted Values**: Valid aggregation pipeline stages
@@ -398,21 +403,24 @@ an example source connector configuration file, see
      start a new change stream when an existing offset contains an
      invalid resume token. If blank, the default partition name based
      on the connection details is used.
-      |
+
      | **Default:** ""
      | **Accepted Values**: A valid partition name
 
  * - heartbeat.interval.ms
    - int
-    - | The length of time in milliseconds between sending heartbeat messages to record a post batch resume token when no source records have been published. This can improve the resumability of the connector for low volume namespaces. Use ``0`` to disable.
-      |
+    - The length of time in milliseconds between sending heartbeat messages to
+      record a post batch resume token when no source records have been published.
+      This can improve the resumability of the connector for low volume namespaces.
+      Use ``0`` to disable.
+
      | **Default**: ``0``
      | **Accepted Values**: An integer
 
  * - heartbeat.topic.name
    - string
-    - | The name of the topic to write heartbeat messages to.
-      |
+    - The name of the topic to write heartbeat messages to.
+
      | **Default**: ``__mongodb_heartbeats``
      | **Accepted Values**: A valid Kafka topic name
@@ -533,8 +541,7 @@ from collections in a database to their associated topic as insert events
 prior to broadcasting change stream events. The connector **does not**
 support renaming a collection during the copy process.
 
-.. admonition:: Data Copy Can Produce Duplicate Events
-   :class: note:
+.. note:: Data Copy Can Produce Duplicate Events
 
    If clients make changes to the data in the database while the source
   connector is converting existing data, the subsequent change stream events
diff --git a/worker.sh b/worker.sh
index d2db2059..145442fc 100644
--- a/worker.sh
+++ b/worker.sh
@@ -1,2 +1 @@
-#!/bin/sh
-make html
+"build-and-stage-next-gen"
\ No newline at end of file