Commit: initial commit

terakilobyte authored and schmalliso committed Apr 26, 2022
1 parent 9995893 commit e55d421

Showing 15 changed files with 2,194 additions and 8 deletions.
Binary file added .static/logo-mongodb.png
16 changes: 10 additions & 6 deletions conf.py
@@ -5,6 +5,8 @@
#
# This file is execfile()d with the current directory set to its containing dir.

from giza.config.helper import fetch_config, get_versions, get_manual_path
from giza.config.runtime import RuntimeStateConfig
import base64
import sys
import os.path
@@ -13,15 +15,14 @@
project_root = os.path.join(os.path.abspath(os.path.dirname(__file__)))
sys.path.append(project_root)

from giza.config.runtime import RuntimeStateConfig
from giza.config.helper import fetch_config, get_versions, get_manual_path

conf = fetch_config(RuntimeStateConfig())
intersphinx_libs = conf.system.files.data.intersphinx
pdfs = conf.system.files.data.pdfs
sconf = conf.system.files.data.sphinx_local

sys.path.append(os.path.join(conf.paths.projectroot, conf.paths.buildsystem, 'sphinxext'))
sys.path.append(os.path.join(conf.paths.projectroot,
conf.paths.buildsystem, 'sphinxext'))

# -- General configuration ----------------------------------------------------

@@ -59,8 +60,11 @@
])

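# Each extlink role below maps to a (URL pattern with a %s placeholder,
# link caption prefix) tuple; an empty prefix renders the role's target
# text as the link caption.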
extlinks = {
'issue': ('https://jira.mongodb.org/browse/%s', '' ),
'issue': ('https://jira.mongodb.org/browse/%s', ''),
'manual': ('http://docs.mongodb.com/manual%s', ''),
'community-support': ('https://www.mongodb.com/community-support-resources%s', ''),
'kafka-21-javadoc': ('https://kafka.apache.org/21/javadoc/org/apache/kafka%s', ''),
'java-docs-latest': ('http://mongodb.github.io/mongo-java-driver/3.12/%s', ''),
}

intersphinx_mapping = {}
@@ -92,7 +96,7 @@
# -- Options for HTML output ---------------------------------------------------

html_theme = sconf.theme.name
html_theme_path = [ os.path.join(conf.paths.buildsystem, 'themes') ]
html_theme_path = [os.path.join(conf.paths.buildsystem, 'themes')]
html_title = conf.project.title
htmlhelp_basename = 'MongoDBdoc'

@@ -145,4 +149,4 @@
epub_identifier = 'http://docs.mongodb.org/kafka-connector/'
epub_exclude_files = []
epub_pre_files = []
epub_post_files = []
epub_post_files = []
2 changes: 1 addition & 1 deletion config/build_conf.yaml
@@ -29,7 +29,7 @@ system:
- 'sphinx_local.yaml'
- htaccess: ['htaccess.yaml']
version:
release: '0.1'
release: '0.2'
branch: 'master'
assets:
- branch: master
Empty file added source/.mongodb
Empty file added source/.static/.mongodb
39 changes: 38 additions & 1 deletion source/index.txt
@@ -1,7 +1,44 @@
.. _kafka:

.. _kafka-connector-landing:

=======================
MongoDB Kafka Connector
=======================

.. default-domain:: mongodb

Herein lies documentation.
Introduction
------------

`Apache Kafka <https://kafka.apache.org>`_ is a *distributed streaming
platform* that implements a publish-subscribe pattern to offer durable,
scalable streams of data.

The `Apache Kafka Connect API <https://www.confluent.io/connectors/>`_ is
an interface that simplifies integrating a data system, such as a
database or distributed cache, with Kafka as a new **data source** or
**data sink**.

The `MongoDB Kafka connector <https://www.mongodb.com/kafka-connector>`_ is
a Confluent-verified connector that persists data from Kafka topics into
MongoDB as a data sink and publishes changes from MongoDB into Kafka
topics as a data source. This guide provides information on available
configuration options and examples to help you complete your
implementation.

This guide is divided into the following topics:

.. toctree::
:titlesonly:
:maxdepth: 1

Install MongoDB Kafka Connector </kafka-installation>
Sink Connector Guide </kafka-sink>
Source Connector Guide </kafka-source>
Kafka Docker Example </kafka-docker-example>
Migrate from Kafka Connect </kafka-connect-migration>

For questions or issues, visit our :community-support:`Community Support
Resources </>`.

68 changes: 68 additions & 0 deletions source/kafka-connect-migration.txt
@@ -0,0 +1,68 @@
.. _kafka-connect-migration:

==========================
Migrate from Kafka Connect
==========================

.. default-domain:: mongodb

Follow the steps in this guide to migrate your Kafka deployments from the
`Kafka Connect MongoDB <https://github.com/hpgrahsl/kafka-connect-mongodb>`_
community connector to the
`official MongoDB Kafka connector <https://github.com/mongodb/mongo-kafka>`_.

Update Configuration Settings
-----------------------------

- Replace any property values that refer to ``at.grahsl.kafka.connect.mongodb``
with ``com.mongodb.kafka.connect``.

- Replace ``MongoDbSinkConnector`` with ``MongoSinkConnector`` as the
value of the ``connector.class`` key.

- Remove the "``mongodb.``" prefix from all configuration property key
names.

- Remove the ``document.id.strategies`` key if it exists. If the value of
this field contained references to any custom strategies, move them to the
``document.id.strategy`` field and read the :ref:`custom-class-changes`
section for additional required changes to your classes.

- Replace any per-topic and per-collection override keys that contain the
  ``mongodb.collection`` prefix with the equivalent keys listed in
  `Topic-Specific Configuration Settings
  <https://github.com/mongodb/mongo-kafka/blob/master/docs/sink.md#topic-specific-configuration-settings>`_.
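
As an illustration, a hypothetical sink connector registration that
previously resembled the following (the connector name, URI, and topic
values are placeholders, not defaults):

.. code-block:: json

   {
     "name": "mongo-sink",
     "config": {
       "connector.class": "at.grahsl.kafka.connect.mongodb.MongoDbSinkConnector",
       "topics": "pageviews",
       "mongodb.connection.uri": "mongodb://mongo1:27017/test",
       "mongodb.collection": "pageviews"
     }
   }

would, after the changes above, resemble:

.. code-block:: json

   {
     "name": "mongo-sink",
     "config": {
       "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
       "topics": "pageviews",
       "connection.uri": "mongodb://mongo1:27017/test",
       "collection": "pageviews"
     }
   }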

.. _custom-class-changes:

Update Custom Classes
---------------------

If you added any classes or custom logic to your Kafka Connect connector,
migrate them to the new MongoDB Kafka connector jar file and make the
following changes to them:

- Update imports that refer to ``at.grahsl.kafka.connect.mongodb`` to
``com.mongodb.kafka.connect``.

- Replace references to the ``MongoDbSinkConnector`` class with
``MongoSinkConnector``.

- Update custom sink strategy classes to implement the
``com.mongodb.kafka.connect.sink.processor.id.strategy.IdStrategy``
interface.

- Update references to the ``MongoDbSinkConnectorConfig`` class, which
  has been split into the `sink.MongoSinkConfig
  <https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/MongoSinkConfig.java>`_
  and `sink.MongoSinkTopicConfig
  <https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/MongoSinkTopicConfig.java>`_
  classes.

Update PostProcessor Subclasses
-------------------------------

- Update any concrete methods that override methods in the Kafka Connect
``PostProcessor`` class to match the new method signatures of the
MongoDB Kafka Connector `PostProcessor
<https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/processor/PostProcessor.java>`_
class.
213 changes: 213 additions & 0 deletions source/kafka-docker-example.txt
@@ -0,0 +1,213 @@
.. _kafka-docker-example:

======================================
MongoDB Kafka Connector Docker Example
======================================

.. default-domain:: mongodb

.. contents:: On this page
:local:
:backlinks: none
:depth: 1
:class: singlecols

This guide provides an end-to-end setup of MongoDB and Kafka Connect to
demonstrate the functionality of the MongoDB Kafka Source and Sink
Connectors.

In this example, we create the following Kafka Connectors:

.. list-table::
:header-rows: 1

* - Connector
- Data Source
- Destination

* - Confluent Connector:
`Datagen <https://github.com/confluentinc/kafka-connect-datagen>`_
- `Avro random generator
<https://github.com/confluentinc/avro-random-generator>`_
- Kafka topic: ``pageviews``

* - Sink Connector: **mongo-sink**
- Kafka topic: ``pageviews``
- MongoDB collection: ``test.pageviews``

* - Source Connector: **mongo-source**
- MongoDB collection: ``test.pageviews``
- Kafka topic: ``mongo.test.pageviews``

* The **Datagen Connector** creates random data using the
  **Avro random generator** and publishes it to the Kafka topic
  ``pageviews``.

* The **mongo-sink** connector reads data from the ``pageviews`` topic and
  writes it to the ``test.pageviews`` collection in MongoDB.

* The **mongo-source** connector produces change events for the
  ``test.pageviews`` collection and publishes them to the
  ``mongo.test.pageviews`` Kafka topic.

Requirements
------------

Linux/Unix-based OS
~~~~~~~~~~~~~~~~~~~
* `Docker <https://docs.docker.com/install/#supported-platforms>`_ 18.09 or later
* `Docker Compose <https://docs.docker.com/compose/install/>`_ 1.24 or later

MacOS
~~~~~

* `Docker Desktop Community Edition (Mac)
<https://docs.docker.com/docker-for-mac/install/>`_ 2.1.0.1 or later

Windows
~~~~~~~

* `Docker Desktop Community Edition (Windows)
<https://docs.docker.com/docker-for-windows/install/>`_ 2.1.0.1 or later

How to Run the Example
----------------------

Clone the `mongo-kafka <https://github.com/mongodb/mongo-kafka>`_ repository
from GitHub:

.. code-block:: shell

git clone https://github.com/mongodb/mongo-kafka.git

Change to the ``docker`` directory:

.. code-block:: shell

cd mongo-kafka/docker/

Run the shell script, **run.sh**:

.. code-block:: shell

./run.sh

The shell script executes the following sequence of commands:

#. Run the ``docker-compose up`` command

The ``docker-compose`` command installs and starts the following
applications, each in its own Docker container:

* Zookeeper
* Kafka
* Confluent Schema Registry
* Confluent Kafka Connect
* Confluent Control Center
* Confluent KSQL Server
* Kafka Rest Proxy
* Kafka Topics UI
* MongoDB replica set (three nodes: **mongo1**, **mongo2**, and
**mongo3**)

#. Wait for MongoDB, Kafka, and Kafka Connect to become ready
#. Register the Confluent Datagen Connector
#. Register the MongoDB Kafka Sink Connector
#. Register the MongoDB Kafka Source Connector
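
The registration steps use the Kafka Connect REST API, which listens on
port 8083 by default. A minimal sketch of such a registration request
follows; the connector settings shown are illustrative, not the exact
payload the script sends:

.. code-block:: shell

   curl -X POST -H "Content-Type: application/json" \
        --data '{"name": "mongo-sink", "config": {
          "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
          "topics": "pageviews",
          "connection.uri": "mongodb://mongo1:27017/test",
          "database": "test",
          "collection": "pageviews"}}' \
        http://localhost:8083/connectors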

.. note::

   You may need to increase the RAM resource limits for Docker if the
   script fails. Use the :ref:`docker-compose stop <docker-compose-stop>`
   command to stop any running containers if the script did not complete
   successfully.

Once the shell script has started the services, the Datagen Connector
publishes new events to Kafka at short intervals, which triggers the
following cycle:

#. The Datagen Connector publishes new events to Kafka
#. The Sink Connector writes the events into MongoDB
#. The Source Connector writes the change stream messages back into Kafka

To view the Kafka topics, open the Confluent Control Center at
http://localhost:9021/ and navigate to the cluster topics.

* The ``pageviews`` topic should contain documents added by the Datagen
Connector that resemble the following:

.. code-block:: json

{
"viewtime": {
"$numberLong": "81"
},
"pageid": "Page_1",
"userid": "User_8"
}

* The ``mongo.test.pageviews`` topic should contain change events that
resemble the following:

.. code-block:: json

{
"_id": {
"_data": "<resumeToken>"
},
"operationType": "insert",
"clusterTime": {
"$timestamp": {
"t": 1563461814,
"i": 4
}
},
"fullDocument": {
"_id": {
"$oid": "5d3088b6bafa7829964150f3"
},
"viewtime": {
"$numberLong": "81"
},
"pageid": "Page_1",
"userid": "User_8"
},
"ns": {
"db": "test",
"coll": "pageviews"
},
"documentKey": {
"_id": {
"$oid": "5d3088b6bafa7829964150f3"
}
}
}

Next, explore the collection data in the MongoDB replica set:

* In your local shell, navigate to the ``docker`` directory from which you
  ran the ``docker-compose`` commands and connect to the ``mongo1`` MongoDB
  instance using the following command:

.. code-block:: shell

docker-compose exec mongo1 /usr/bin/mongo

* If you insert or update a document in the ``test.pageviews`` collection,
  the Source Connector publishes a change event document to the
  ``mongo.test.pageviews`` Kafka topic.
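
  For example, you might insert a test document with a command along
  these lines (a sketch; the field values are arbitrary):

  .. code-block:: shell

     docker-compose exec mongo1 /usr/bin/mongo --eval \
       'db.getSiblingDB("test").pageviews.insertOne({"pageid": "Page_99", "userid": "User_9", "viewtime": 99})'

  A corresponding change event should then appear on the
  ``mongo.test.pageviews`` topic.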

.. _docker-compose-stop:

To stop the Docker containers and all the processes running in them, use
Ctrl-C in the shell running the script, or run the following command:

.. code-block:: shell

docker-compose stop

To remove the Docker containers and images completely, run the following
command:

.. code-block:: shell

docker-compose down