Examples of CDC with Debezium Engine (2)

Timothy Zhang
10 min readJul 8, 2024

Following the first article in this series, here I will introduce the second of two representative examples. This example demonstrates how to synchronize CRUD operations from MongoDB to a PostgreSQL database in near real-time. Here are the two representative example articles again:

  1. Change Data Capture (CDC) with Embedded Debezium and SpringBoot, with code shared on GitHub.
  2. Change Data Capture Made Easy: Debezium Integration with Spring Boot, MongoDB, and Postgres, with code also shared on GitHub.

The improved project code for the second example that I submitted is available on GitHub: cdc-mongodb-postgresql.

1. Source and Target Data Storage Systems

In this example, the source and target data storage systems are still created directly using a docker-compose.yml file. The source data storage system is a 3-node MongoDB cluster, while the target data storage system is a single-node PostgreSQL database.

networks:
mongo-net:
name: mongo-net
services:
mongo-replica-1:
hostname: mongo-replica-1
container_name: mongo-replica-1
image: mongo:latest
command: mongod --replSet rs --bind_ip localhost,mongo-replica-1
ports:
- "27018:27017"
restart: always
mongo-replica-2:
hostname: mongo-replica-2
container_name: mongo-replica-2
image: mongo:latest
command: mongod --replSet rs --bind_ip localhost,mongo-replica-2
ports:
- "27019:27017"
restart: always
mongo-primary:
hostname: mongo-primary
container_name: mongo-primary
depends_on:
- mongo-replica-1
- mongo-replica-2
image: mongo:latest
command: mongod --replSet rs --bind_ip localhost,mongo-primary
ports:
- "27017:27017"
restart: always
healthcheck:
test: echo 'rs.initiate({_id:"rs",members:[{_id:0,host:"mongo-primary:27017",priority:2},{_id:1,host:"mongo-replica-1:27017",priority:0},{_id:2,host:"mongo-replica-2:27017",priority:0}]}).ok || rs.status().ok' | mongosh --port 27017 --quiet
interval: 10s
start_period: 30s
volumes:
- ./src/main/resources/mongo-init.js:/docker-entrypoint-initdb.d/mongo-init.js
postgres-db:
image: postgres:latest
container_name: postgres-db
ports:
- "5432:5432"
environment:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: root

In this example, both the mongo and postgres images use the latest tag, and you can use the docker inspect command to check their exact versions.

docker inspect mongo | jq '.[].Config.Env'
[
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"GOSU_VERSION=1.17",
"JSYAML_VERSION=3.13.1",
"MONGO_PACKAGE=mongodb-org",
"MONGO_REPO=repo.mongodb.org",
"MONGO_MAJOR=7.0",
"MONGO_VERSION=7.0.12",
"HOME=/data/db"
]
docker inspect postgres | jq '.[].Config.Env'
[
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/16/bin",
"GOSU_VERSION=1.17",
"LANG=en_US.utf8",
"PG_MAJOR=16",
"PG_VERSION=16.3-1.pgdg120+1",
"PGDATA=/var/lib/postgresql/data"
]

It should be noted that the healthcheck for the mongo-primary container initializes the replica set (rs) during the check, with the priority parameter determining the roles within the cluster. Additionally, be sure to map the hostnames of each container to 127.0.0.1 in /etc/hosts to successfully start them locally.

127.0.0.1  mongo-primary
127.0.0.1 mongo-replica-1
127.0.0.1 mongo-replica-2

Once ready, execute this configuration file:

docker-compose up -d --wait
[+] Running 5/5
✔ Network cdc-mongodb-postgresql_default Created 0.0s
✔ Container mongo-replica-1 Healthy 1.0s
✔ Container postgres-db Healthy 1.0s
✔ Container mongo-replica-2 Healthy 1.0s
✔ Container mongo-primary Healthy

After successful execution, you can check the status of the containers:

docker ps --format "table {{.Image}}\t{{.Ports}}\t{{.Names}}"
IMAGE             PORTS                      NAMES
mongo:latest 0.0.0.0:27017->27017/tcp mongo-primary
mongo:latest 0.0.0.0:27018->27017/tcp mongo-replica-1
mongo:latest 0.0.0.0:27019->27017/tcp mongo-replica-2
postgres:latest 0.0.0.0:5432->5432/tcp postgres-db

2. Prepare the environment and run the application

2.1 Terminal window layout

Similarly, it is recommended to use a Π-shaped terminal window layout for our CDC: the upper half runs our application project, the lower left part operates on the source database, and the lower right part verifies the results in the target storage. In this example, the lower left part runs operations on MongoDB, where we can execute a container and enter the MongoDB CLI interface:

docker exec -it mongo-primary mongosh

And the lower right part enters the PostgreSQL CLI interface to execute SQL commands:

docker exec -it postgres-db psql -U postgres

2.2 Prepare the MongoDB Collection and Documents

The project’s resources in the code repository provide a JavaScript script for importing initial data into MongoDB. You can execute it directly in the MongoDB CLI window in the lower left part. Due to the script’s length, I opted to execute the script file using docker commands in the project’s running window:

docker exec mongo-primary mongosh /docker-entrypoint-initdb.d/mongo-init.js

After execution, there was no output displayed, but we can verify the import status of the initial data:

docker exec mongo-primary mongosh --eval 'show collections'
books
computers
phones

It’s evident that these three collections have been successfully imported into MongoDB. Thus, the preparation of the source and target data storage systems is now complete.

2.3 Run the Debezium Engine application

Next, we run this project. Considering Java 21 is specified in the pom.xml, ensure that the environment path for running the project contains this Java version.

jenv shell 21
mvn clean package
java -jar target/cdc-0.0.1-SNAPSHOT.jar

It’s worth mentioning that in the project’s application.yml, we have a parameter called logging.level:

logging:
level:
org.mongodb.driver: WARN

This setting instructs Spring to only print logs at the WARN level for the org.mongodb.driver module, rather than INFO level logs for other modules. If this parameter is not configured in the application.yml file, you can achieve the same effect by adding the following system property when running the project:

java -Dlogging.level.org.mongodb.driver=WARN -jar target/cdc-0.0.1-SNAPSHOT.jar

## OR

mvn spring-boot:run -Dspring-boot.run.arguments=--logging.level.org.mongodb.driver=WARN

Otherwise, the Java Sync Driver used in the project for MongoDB will frequently output monitoring logs for org.mongodb.driver.cluster, continuously checking the cluster deployment status. For more information, please refer to the Java Sync Driver Logging documentation. You can also learn about logging in the Logging in Spring Boot guide.

3. Verifying CDC on Source Database Operations

The Debezium Engine application is now up and running. We begin to verify how our application implements CDC.

3.1 READ

Since we imported initial data earlier, the project will automatically handle READ operations when it runs. Initially, the startup logs will display the reading in of 9 initial data entries:

c.t.c.l.DebeziumSourceEventListener      : Collection ¡: books , DocumentId : 66770447e735590f088db601 , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: books , DocumentId : 66770447e735590f088db602 , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: books , DocumentId : 66770447e735590f088db603 , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: phones , DocumentId : 66770447e735590f088db5fb , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: phones , DocumentId : 66770447e735590f088db5fc , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: phones , DocumentId : 66770447e735590f088db5fd , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: computers , DocumentId : 66770447e735590f088db5fe , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: computers , DocumentId : 66770447e735590f088db5ff , Operation : READ
c.t.c.l.DebeziumSourceEventListener : Collection ¡: computers , DocumentId : 66770447e735590f088db600 , Operation : READ

Then check the records imported into PostgreSQL:

select * from public.product;
price | id |                             description                              |         mongo_id         |         name          | source_collection
-------+----+----------------------------------------------------------------------+--------------------------+-----------------------+-------------------
12.99 | 1 | Classic novel by Harper Lee exploring themes of racial injustice. | 66770447e735590f088db601 | To Kill a Mockingbird | books
9.99 | 2 | Dystopian novel by George Orwell depicting a totalitarian society. | 66770447e735590f088db602 | 1984 | books
14.99 | 3 | F. Scott Fitzgerald's masterpiece capturing the Jazz Age in America. | 66770447e735590f088db603 | The Great Gatsby | books
999 | 4 | Premium Apple smartphone with powerful features. | 66770447e735590f088db5fb | IPhone 15 | phones
899 | 5 | Premium Android smartphone with powerful features. | 66770447e735590f088db5fc | Samsung Galaxy S23 | phones
799 | 6 | Flagship phone with top-notch camera and performance. | 66770447e735590f088db5fd | Google Pixel 6 | phones
1499 | 7 | High-performance laptop for professionals. | 66770447e735590f088db5fe | MacBook Pro | computers
1299 | 8 | Powerful laptop with stunning display and long battery life. | 66770447e735590f088db5ff | Dell XPS 15 | computers
1099 | 9 | Versatile 2-in-1 laptop with impressive design and performance. | 66770447e735590f088db600 | HP Spectre x360 | computers
(9 rows)

Thus, it can be seen that the initial data from MongoDB has been transferred to the PostgreSQL database. The current Terminal window looks like this:

2.2 INSERT/CREATE

Next, we insert a piece of data into MongoDB:

db.phones.insertOne({
"name": "OnePlus 10 Pro",
"price": 800,
"description": "Flagship OnePlus smartphone with advanced features and Hasselblad camera technology."
})
{
acknowledged: true,
insertedId: ObjectId('667704d5b44e4f1c1e8db5fb')
}

Check the Collection phones in MongoDB:

db.phones.find()
[
{
_id: ObjectId('668b0ee228597751f3f3f54e'),
name: 'IPhone 15',
price: 999,
description: 'Premium Apple smartphone with powerful features.'
},
{
_id: ObjectId('668b0ee228597751f3f3f54f'),
name: 'Samsung Galaxy S23',
price: 899,
description: 'Premium Android smartphone with powerful features.'
},
{
_id: ObjectId('668b0ee228597751f3f3f550'),
name: 'Google Pixel 6',
price: 799,
description: 'Flagship phone with top-notch camera and performance.'
},
{
_id: ObjectId('668b10728daabb9ad8f3f54e'),
name: 'OnePlus 10 Pro',
price: 800,
description: 'Flagship OnePlus smartphone with advanced features and Hasselblad camera technology.'
}
]

The last entry here is the Document we added. Correspondingly, the application logs will output CREATE operation information:

c.t.c.l.DebeziumSourceEventListener      : Collection ¡: phones , DocumentId : 668b10728daabb9ad8f3f54e , Operation : CREATE

Finally, check the data in the PostgreSQL database:

price | id |                                     description                                      |         mongo_id         |         name          | source_collection
-------+----+--------------------------------------------------------------------------------------+--------------------------+-----------------------+-------------------
1499 | 1 | High-performance laptop for professionals. | 668b0ee228597751f3f3f551 | MacBook Pro | computers
1299 | 2 | Powerful laptop with stunning display and long battery life. | 668b0ee228597751f3f3f552 | Dell XPS 15 | computers
1099 | 3 | Versatile 2-in-1 laptop with impressive design and performance. | 668b0ee228597751f3f3f553 | HP Spectre x360 | computers
999 | 4 | Premium Apple smartphone with powerful features. | 668b0ee228597751f3f3f54e | IPhone 15 | phones
899 | 5 | Premium Android smartphone with powerful features. | 668b0ee228597751f3f3f54f | Samsung Galaxy S23 | phones
799 | 6 | Flagship phone with top-notch camera and performance. | 668b0ee228597751f3f3f550 | Google Pixel 6 | phones
12.99 | 7 | Classic novel by Harper Lee exploring themes of racial injustice. | 668b0ee228597751f3f3f554 | To Kill a Mockingbird | books
9.99 | 8 | Dystopian novel by George Orwell depicting a totalitarian society. | 668b0ee228597751f3f3f555 | 1984 | books
14.99 | 9 | F. Scott Fitzgerald's masterpiece capturing the Jazz Age in America. | 668b0ee228597751f3f3f556 | The Great Gatsby | books
800 | 10 | Flagship OnePlus smartphone with advanced features and Hasselblad camera technology. | 668b10728daabb9ad8f3f54e | OnePlus 10 Pro | phones
(10 rows)

The last entry among the 10 records is the one we added.

2.3 UPDATE

In MongoDB, each Document has a built-in unique field called `_id`, which has a specific type of ObjectId. This serves as the equivalent of a primary key in relational databases. Therefore, for the UPDATE operation here, we need to use this field as the parameter to identify the Document that needs to be updated. Here, we update the Document added in the previous section:

db.phones.updateOne( 
{ "_id" : ObjectId("668b10728daabb9ad8f3f54e") },
{ $set: { "price" : 1700 } }
)
{
acknowledged: true,
insertedId: null,
matchedCount: 1,
modifiedCount: 1,
upsertedCount: 0
}

Here it shows that the update was successful. Let’s check the changes in the MongoDB Collection phones:

[
{
_id: ObjectId('668b0ee228597751f3f3f54e'),
name: 'IPhone 15',
price: 999,
description: 'Premium Apple smartphone with powerful features.'
},
{
_id: ObjectId('668b0ee228597751f3f3f54f'),
name: 'Samsung Galaxy S23',
price: 899,
description: 'Premium Android smartphone with powerful features.'
},
{
_id: ObjectId('668b0ee228597751f3f3f550'),
name: 'Google Pixel 6',
price: 799,
description: 'Flagship phone with top-notch camera and performance.'
},
{
_id: ObjectId('668b10728daabb9ad8f3f54e'),
name: 'OnePlus 10 Pro',
price: 1700,
description: 'Flagship OnePlus smartphone with advanced features and Hasselblad camera technology.'
}
]

The price of the last Document has been updated to 1700. Next, let’s examine the application log for the added entry:

c.t.c.l.DebeziumSourceEventListener      : Collection ¡: phones , DocumentId : 668b10728daabb9ad8f3f54e , Operation : UPDATE

It also shows the UPDATE operation. Finally, let’s check the records in the PostgreSQL database:

price | id |                                     description                                      |         mongo_id         |         name          | source_collection
-------+----+--------------------------------------------------------------------------------------+--------------------------+-----------------------+-------------------
1499 | 1 | High-performance laptop for professionals. | 668b0ee228597751f3f3f551 | MacBook Pro | computers
1299 | 2 | Powerful laptop with stunning display and long battery life. | 668b0ee228597751f3f3f552 | Dell XPS 15 | computers
1099 | 3 | Versatile 2-in-1 laptop with impressive design and performance. | 668b0ee228597751f3f3f553 | HP Spectre x360 | computers
999 | 4 | Premium Apple smartphone with powerful features. | 668b0ee228597751f3f3f54e | IPhone 15 | phones
899 | 5 | Premium Android smartphone with powerful features. | 668b0ee228597751f3f3f54f | Samsung Galaxy S23 | phones
799 | 6 | Flagship phone with top-notch camera and performance. | 668b0ee228597751f3f3f550 | Google Pixel 6 | phones
12.99 | 7 | Classic novel by Harper Lee exploring themes of racial injustice. | 668b0ee228597751f3f3f554 | To Kill a Mockingbird | books
9.99 | 8 | Dystopian novel by George Orwell depicting a totalitarian society. | 668b0ee228597751f3f3f555 | 1984 | books
14.99 | 9 | F. Scott Fitzgerald's masterpiece capturing the Jazz Age in America. | 668b0ee228597751f3f3f556 | The Great Gatsby | books
1700 | 10 | Flagship OnePlus smartphone with advanced features and Hasselblad camera technology. | 668b10728daabb9ad8f3f54e | OnePlus 10 Pro | phones
(10 rows)

2.4 DELETE

Finally, let’s delete the newly added Document:

db.phones.deleteOne({ "_id": ObjectId("668b10728daabb9ad8f3f54e") })
{ acknowledged: true, deletedCount: 1 }

Deletion successful. Check the Collection phones in MongoDB. The last added Document has been removed. The application log also reflects this operation DELETE:

c.t.c.l.DebeziumSourceEventListener      : Collection ¡: phones , DocumentId : 668b10728daabb9ad8f3f54e , Operation : DELETE

Finally, verify that this operation has been propagated to PostgreSQL, returning to the initial 9 records.

With this, the second example I have modified is complete. MongoDB, a NoSQL database, has successfully passed CRUD operations on its Documents through Debezium Engine to the RDBMS PostgreSQL.
In the next example, I will introduce a Debezium Engine application I created myself, which propagates CRUD operations from Oracle DB to Redis data storage.

--

--

Responses (1)