Blog post for moving data between JBOD disks using Cruise Control #469
Signed-off-by: ShubhamRwt <[email protected]>
@scholzj @ppatierno Hi, is it just me, or is the formatting in the blog post messed up for you as well?
If the data is not removed from the disk, and it is removed then potential data loss can happen. | ||
Currently, moving data between the JBOD disks is done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we are introducing the ability to move data between the JBOD disks using Cruise Control | ||
|
||
## Cruise Control to move data between JBOD disks |
If you move all your headers one level down, the webpage would look nicer, i.e. `##` -> `###`, `###` -> `####`, etc.
Not sure what exactly you mean by that, or where exactly.
volumeIds: [1, 2] | ||
``` | ||
|
||
Now let’s wait for the `KafkaRebalance` resource to move to `ProposalReady` state. You can check the rebalance summary by running the following command once the proposal is ready: |
@scholzj For example, here. I see the text in green when it should be white, as it is for the normal text above.
You mean in the GitHub review window? Or on the blog preview?
In the GitHub review window
So, from my experience, that is quite easily derailed by some code examples etc. So I would not make too much out of it as long as the preview looks good.
Okay, thanks for clearing that up. I thought I had some wrong formatting or something that was causing this.
|
||
### Additional notes | ||
|
||
1. This feature only works if JBOD storage is enabled |
Here also I see the text in blue, while this is just normal text and should be in white, if I am not wrong?
author: shubham_rawat | ||
--- | ||
|
||
Apache Kafka is a platform which provides durability and fault tolerance by storing messages on disks and JBOD storage is one of the storage configuration types supported by Kafka. |
Kafka supports multiple disks. JBOD storage configuration type is more of a Strimzi thing. That is how we call it in the API. I would either reword it or change it from Kafka to Strimzi?
The JBOD data storage configuration allows Kafka brokers to make use of multiple disks. | ||
Using JBOD storage, you can increase the data storage capacity for Kafka nodes, which can further lead to performance improvements. | ||
In case you plan to remove a disk, and it contains some partition replicas, then you need to make sure that data is safely moved to some other disks. | ||
If the data is not removed from the disk, and it is removed then potential data loss can happen. |
Maybe ...
If the data is not removed from the disk, and it is removed then potential data loss can happen. | |
If the data is not removed from the disk, and the disk is removed then potential data loss can happen. |
Or you could adjust the following:
"In case you plan to remove a disk, and it contains some partition replicas, then you need to make sure that data is safely moved to some other disks.
If the data is not removed from the disk, and it is removed then potential data loss can happen."
To:
"If you plan to remove a disk that contains partition replicas, the data must be safely moved to other disks first.
Failing to do so could result in data loss."
Using JBOD storage, you can increase the data storage capacity for Kafka nodes, which can further lead to performance improvements. | ||
In case you plan to remove a disk, and it contains some partition replicas, then you need to make sure that data is safely moved to some other disks. | ||
If the data is not removed from the disk, and it is removed then potential data loss can happen. | ||
Currently, moving data between the JBOD disks is done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we are introducing the ability to move data between the JBOD disks using Cruise Control |
Currently, moving data between the JBOD disks is done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we are introducing the ability to move data between the JBOD disks using Cruise Control | |
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we are introducing the ability to move data between the JBOD disks using Cruise Control |
As we're in 0.45 release, maybe change "we are introducing" to "we introduced"?
Also missing period at end of sentence.
|
||
## Cruise Control to move data between JBOD disks | ||
|
||
This feature will allow you to move the data between the JBOD disks using the `KafkaRebalance` custom resource that we have in Strimzi. |
`Between the JBOD disks` does not sound right here. That is what intra-broker rebalancing does, which we have had for some time already. Here you are moving all data from one disk to other disks.
## Cruise Control to move data between JBOD disks | ||
|
||
This feature will allow you to move the data between the JBOD disks using the `KafkaRebalance` custom resource that we have in Strimzi. | ||
This feature makes use of the `remove-disks` endpoint of Cruise Control that triggers a rebalancing operation which moves replicas, starting with the largest and proceeding to the smallest, to the remaining disks. |
This feature makes use of the `remove-disks` endpoint of Cruise Control that triggers a rebalancing operation which moves replicas, starting with the largest and proceeding to the smallest, to the remaining disks. | |
This feature makes use of the `remove-disks` endpoint of Cruise Control that triggers a rebalancing operation which moves all replicas, starting with the largest and proceeding to the smallest, to the remaining disks. |
spec: | ||
replicas: 3 | ||
roles: | ||
- broker |
Should this be controller?
I guess that based on the rest of the YAMLs, @ShubhamRwt is making the example with a ZooKeeper based cluster but I totally agree on using KRaft only from now on.
Yes, I will update this with a KRaft cluster
zookeeper: | ||
replicas: 3 | ||
storage: | ||
type: persistent-claim | ||
size: 100Gi | ||
deleteClaim: false |
Use KRaft please.
- brokerId: 0 | ||
volumeIds: [1, 2] |
How do you know what will be broker 0? You should probably also clean it from all brokers from the node pool and take the example till the end to remove the disks?
@scholzj Sorry, I didn't understand what you meant by `How do you know what will be broker 0`. I was just showing how we can move all the data from the two volumes to some other volume.
How do you know the ID 0 will not be a broker from the second node pool with only one volume? The YAML you use here does not give you a deterministic order in which the node IDs will be assigned.
@scholzj Hi, before I push my changes, let me check that I understood the above suggestion correctly: I am now using a KRaft cluster with 3 brokers and 3 controllers having 3 disks, then removing the 3rd disk from all the brokers (not the controllers), and taking the example to the end, where we remove the disks.
The point of this comment was mainly that without the node ID annotations, you do not know if nodes 0, 1 and 2 will be brokers or controllers.
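To illustrate the point about deterministic node IDs, here is a minimal sketch of two `KafkaNodePool` resources that pin their ID ranges with the `strimzi.io/next-node-ids` annotation. The pool names `controller` and `broker`, the ID ranges, and the volume sizes are assumptions chosen to match the pod names and broker IDs mentioned elsewhere in this review; they are not copied from the final blog post.

```yaml
# Sketch only: pool names, ID ranges and sizes are illustrative assumptions.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: controller
  labels:
    strimzi.io/cluster: my-cluster
  annotations:
    # Controllers get node IDs 0, 1 and 2
    strimzi.io/next-node-ids: "[0-2]"
spec:
  replicas: 3
  roles:
    - controller
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  labels:
    strimzi.io/cluster: my-cluster
  annotations:
    # Brokers get node IDs 3, 4 and 5
    strimzi.io/next-node-ids: "[3-5]"
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      # Three volumes per broker, so data can later be moved off volumes 1 and 2
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
      - id: 1
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
      - id: 2
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
```

With the ranges pinned like this, the `brokerId` values used in a `KafkaRebalance` resource are known up front rather than depending on assignment order.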
|
||
### Additional notes | ||
|
||
1. This feature only works if JBOD storage is enabled |
It only works when multiple disks are used, not when JBOD is enabled but has 1 disk, or?
2. Make sure you have more than one volume per broker else you will be prompted of not having enough volumes to move the data to. | ||
3. This endpoint does not provide `before` load since upstream Cruise Control project does not support `verbose` with this endpoint so the `loadmap` generated should only have `afterLoad` information. | ||
|
||
## What's next |
You should probably list the risks / incomplete parts as well ... in particular, new partition replicas might be scheduled to the disks between cleaning them up with Cruise Control and removing them, which might lead to data loss again.
## Setting up the environment | ||
|
||
Let's set up a cluster to work through an example demonstrating this feature. | ||
To get the Kafka cluster up and running, we will first have to install the Strimzi Cluster Operator and then deploy the `Kafka` resource. |
It's not going to be just `Kafka` but also `KafkaNodePool`.
metadata: | ||
name: my-cluster | ||
annotations: | ||
strimzi.io/node-pools: enabled |
The annotation that marks this as a KRaft-based cluster is missing. Please add it and remove the ZooKeeper section as well.
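For reference, a sketch of what the `Kafka` resource could look like with both annotations set and no `zookeeper` section. The Kafka version matches the `0.45.0-kafka-3.9.0` image tag used later in this thread; the listener and the empty `cruiseControl: {}` block are assumptions for a minimal setup, not the exact YAML from the post.

```yaml
# Sketch only: listener and operator settings are illustrative assumptions.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  annotations:
    strimzi.io/node-pools: enabled
    # Marks the cluster as KRaft-based, so no zookeeper section is needed
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 3.9.0
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
  # Cruise Control must be deployed for KafkaRebalance resources to work
  cruiseControl: {}
```

Replicas, roles, and storage are then defined in the `KafkaNodePool` resources rather than in `spec.kafka`.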
Looking good, Shubham. I left a few suggestions. I thought the example might benefit from some kind of introduction to what we're trying to achieve and how.
@@ -0,0 +1,345 @@ | |||
--- | |||
layout: post | |||
title: "Moving data between the JBOD disks using Cruise Control" |
title: "Moving data between the JBOD disks using Cruise Control" | |
title: "Moving data between JBOD disks using Cruise Control" |
segment.bytes: 1073741824 | ||
``` | ||
|
||
Once you create the topic, now you can check whether the volumes have some partition replicas assigned to them or not using the `kafka-log-dir.sh` tool. Let's see the partition replicas assigned to the volumes on broker with id 0. |
Once you create the topic, now you can check whether the volumes have some partition replicas assigned to them or not using the `kafka-log-dir.sh` tool. Let's see the partition replicas assigned to the volumes on broker with id 0. | |
Now you can check whether the volumes have some partition replicas assigned to the topics using the `kafka-log-dirs.sh` tool. Let's see the partition replicas assigned to the volumes on broker with id 0. |
} | ||
``` | ||
|
||
Now lets try to move the data of volume 1 and volume 2 to volume 0, present on broker with ID 0. For doing that let's create a `KafkaRebalance` resource with `remove-disks` mode. |
Now lets try to move the data of volume 1 and volume 2 to volume 0, present on broker with ID 0. For doing that let's create a `KafkaRebalance` resource with `remove-disks` mode. | |
Next, let's move the data from volumes 1 and 2 to volume 0 on the broker with ID 0. | |
To achieve this, we create a `KafkaRebalance` resource in `remove-disks` mode. |
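As a sketch of the resource this sentence describes, a `KafkaRebalance` in `remove-disks` mode might look like the following. The broker and volume IDs come from the quoted diff (`brokerId: 0`, `volumeIds: [1, 2]`); the resource name `my-rebalance` is taken from the `kubectl get kafkarebalance` command quoted in this review, and the field name `moveReplicasOffVolumes` is my understanding of the Strimzi 0.45 API, so please verify it against the API reference.

```yaml
# Sketch only: check field names against the Strimzi API reference.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: my-rebalance
  labels:
    # Must match the name of the Kafka cluster
    strimzi.io/cluster: my-cluster
spec:
  mode: remove-disks
  moveReplicasOffVolumes:
    # Move all replicas off volumes 1 and 2 of broker 0;
    # they end up on the remaining volume(s) of the same broker
    - brokerId: 0
      volumeIds: [1, 2]
```

If the node pools assign brokers the IDs 3, 4 and 5, as in the later revision of the example, the list would need one entry per broker with those IDs instead.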
### Additional notes | ||
|
||
1. This feature only works if JBOD storage is enabled | ||
2. Make sure you have more than one volume per broker else you will be prompted of not having enough volumes to move the data to. |
2. Make sure you have more than one volume per broker else you will be prompted of not having enough volumes to move the data to. | |
2. Make sure you have more than one volume per broker else you will be prompted for not having enough volumes to move the data to. |
kubectl get kafkarebalance my-rebalance -n myproject -o yaml | ||
``` | ||
|
||
and you should be able to get an output like this: |
and you should be able to get an output like this: | |
And you should be able to get an output like this: |
|
||
### Additional notes | ||
|
||
1. This feature only works if JBOD storage is enabled |
1. This feature only works if JBOD storage is enabled | |
1. This feature only works if JBOD storage is enabled and multiple disks are used. |
|
||
1. This feature only works if JBOD storage is enabled | ||
2. Make sure you have more than one volume per broker else you will be prompted of not having enough volumes to move the data to. | ||
3. This endpoint does not provide `before` load since upstream Cruise Control project does not support `verbose` with this endpoint so the `loadmap` generated should only have `afterLoad` information. |
3. This endpoint does not provide `before` load since upstream Cruise Control project does not support `verbose` with this endpoint so the `loadmap` generated should only have `afterLoad` information. | |
3. The optimization proposal does not show the load before optimization, it only shows the load after optimization. | |
This is because in upstream Cruise Control the verbose tag is not enabled with the `remove_disks` endpoint. |
## What's next | ||
|
||
We hope this blog post has provided you with a clear understanding of how you can use the `KafkaRebalance` custom resource in `remove-disks` to easily move the data between the JBOD disks. | ||
If you get stuck on any step or have any doubts, you can have read about this in or documentation on [Using Cruise Control to reassign parititon on JBOD disk](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str) |
If you get stuck on any step or have any doubts, you can have read about this in or documentation on [Using Cruise Control to reassign parititon on JBOD disk](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str) | |
If you encounter any issues or want to know more, refer to our documentation on [Using Cruise Control to reassign partitions on JBOD disks](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str) |
Signed-off-by: ShubhamRwt <[email protected]>
### Setting up the environment | ||
|
||
Let's set up a cluster to work through an example demonstrating this feature. | ||
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will be making use of the following resources: |
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will be making use of the following resources: | |
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will use Kafka and KafkaNodePool resources to create a KRaft cluster. |
|
||
Let's set up a cluster to work through an example demonstrating this feature. | ||
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will be making use of the following resources: | ||
We use a Kafka resource and KafkaNodePool resources to create a KRaft cluster. |
We use a Kafka resource and KafkaNodePool resources to create a KRaft cluster. |
|
||
Let's see the partition replicas assigned to the volumes on the brokers using the `kafka-log-dir.sh` tool. | ||
```shell | ||
kubectl exec -n myproject -ti my-cluster-pool-a-0 /bin/bash -- bin/kafka-log-dirs.sh --describe --bootstrap-server my-cluster-kafka-bootstrap:9092 --broker-list 3,4,5 --topic-list my-topic |
`my-cluster-pool-a-0` is not consistent with the YAML ... it should be `my-cluster-controller-0` or `my-cluster-broker-3` or any other right pod. There is no node pool named `pool-a`.
Sorry, looks like I copied the command from the other example I generated. I will fix this
It should be some helper Pod and not one of the Kafka nodes 😉
Okay, then I will add an extra step on deploying the helper pod and then this step
I think you should be able to do just something like this:
kubectl -n myproject run kafka-consumer -ti --image=quay.io/strimzi/kafka:0.45.0-kafka-3.9.0 --rm=true --restart=Never -- bin/kafka-log-dirs.sh --describe --bootstrap-server my-cluster-kafka-bootstrap:9092 --broker-list 3,4,5 --topic-list my-topic
|
||
After the rebalance is complete, use the `kafka-log-dirs.sh` tool again to verify that the data has been moved. | ||
```shell | ||
kubectl exec -n myproject -ti my-cluster-pool-a-0 /bin/bash -- bin/kafka-log-dirs.sh --describe --bootstrap-server my-cluster-kafka-bootstrap:9092 --broker-list 3,4,5 --topic-list my-topic |
Ditto as before about `my-cluster-pool-a-0`.
|
||
1. This feature only works if JBOD storage is enabled and multiple disks are used else you will be prompted for not having enough volumes to move the data to. | ||
2. The optimization proposal does not show the load before optimization, it only shows the load after optimization. | ||
3. New partition replicas might be scheduled to the disks between cleaning them up with cruise control and removing them which might lead to data loss again. |
3. New partition replicas might be scheduled to the disks between cleaning them up with cruise control and removing them which might lead to data loss again. | |
3. New partition replicas might be scheduled to the disks between cleaning them up with Cruise Control and removing them which might lead to data loss again. |
1. This feature only works if JBOD storage is enabled and multiple disks are used else you will be prompted for not having enough volumes to move the data to. | ||
2. The optimization proposal does not show the load before optimization, it only shows the load after optimization. | ||
3. New partition replicas might be scheduled to the disks between cleaning them up with cruise control and removing them which might lead to data loss again. | ||
4. After all replicas are moved from the specified disk, the disk may still be used by CC during rebalances and Kafka can still use it when creating topics so make sure to delete the disk manually if not required. |
4. After all replicas are moved from the specified disk, the disk may still be used by CC during rebalances and Kafka can still use it when creating topics so make sure to delete the disk manually if not required. | |
4. After all replicas are moved from the specified disk, the disk may still be used by Cruise Control during rebalances and Kafka can still use it when creating topics so make sure to delete the disk manually if not required. |
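To make the "delete the disk manually" step in note 4 concrete, here is a sketch of a broker node pool after the emptied volumes have been removed from its storage definition. It assumes a pool named `broker` that originally had volumes 0, 1 and 2 and keeps only volume 0; the names and sizes are illustrative.

```yaml
# Sketch only: assumes volumes 1 and 2 were emptied by the remove-disks rebalance.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      # Only volume 0 remains; the entries for volumes 1 and 2 were deleted
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
```

Doing this soon after the rebalance completes narrows the window in which new replicas could be placed on the volumes being removed.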
Signed-off-by: ShubhamRwt <[email protected]>
Thanks for the updates from the first review. I've left a few more suggestions, but looks good to me.
Regarding removing PVCs, we don't show that in the docs, but maybe we should?
Using JBOD storage, you can increase the data storage capacity for Kafka nodes, which can further lead to performance improvements. | ||
If you plan to remove a disk that contains partition replicas, the data must be safely moved to other disks first. | ||
Failing to do so could result in data loss. | ||
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. |
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. | |
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool, which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. |
|
||
### Cruise Control to move data between JBOD disks | ||
|
||
This feature will allow you to move the data from one JBOD disk to another JBOD disk using the `KafkaRebalance` custom resource that we have in Strimzi. |
This feature will allow you to move the data from one JBOD disk to another JBOD disk using the `KafkaRebalance` custom resource that we have in Strimzi. | |
This feature allows you to move the data from one JBOD disk to another JBOD disk using Strimzi's `KafkaRebalance` custom resource. |
### Setting up the environment | ||
|
||
Let's set up a cluster to work through an example demonstrating this feature. | ||
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will use Kafka and KafkaNodePool resources to create a KRaft cluster. |
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will use Kafka and KafkaNodePool resources to create a KRaft cluster. | |
In the example we will see how to safely remove the JBOD disks by moving the data from one disk to another, and we will use `Kafka` and `KafkaNodePool` resources to create a KRaft cluster. |
|
||
Let's set up a cluster to work through an example demonstrating this feature. | ||
During the example we will see how we can safely remove the JBOD disks by moving the data from one disk to another, and we will use Kafka and KafkaNodePool resources to create a KRaft cluster. | ||
Then, we create a KafkaRebalance resource in remove-disks mode, specifying the brokers and volume IDs for partition reassignment. |
Then, we create a KafkaRebalance resource in remove-disks mode, specifying the brokers and volume IDs for partition reassignment. | |
Then, we create a `KafkaRebalance` resource in remove-disks mode, specifying the brokers and volume IDs for partition reassignment. |
You can install the Cluster Operator with any installation method you prefer. | ||
You can also refer to the [Strimzi documentation](https://strimzi.io/docs/operators/in-development/deploying#con-strimzi-installation-methods_str). |
You can install the Cluster Operator with any installation method you prefer. | |
You can also refer to the [Strimzi documentation](https://strimzi.io/docs/operators/in-development/deploying#con-strimzi-installation-methods_str). | |
You can install the Cluster Operator with any installation method you prefer. The available methods are described in the [Strimzi documentation](https://strimzi.io/docs/operators/in-development/deploying#con-strimzi-installation-methods_str). |
# ... | ||
``` | ||
|
||
If you now check the PVCs then you will see that they are not deleted. |
If you now check the PVCs then you will see that they are not deleted. | |
Checking the PVCs, we see that they are not deleted. |
data-2-my-cluster-broker-5 Bound pvc-0c126dc5-863f-4e2a-96ae-1a4fef9d8839 100Gi RWO standard 63m | ||
``` | ||
|
||
It is because they are not deleted by default, and you need to remove them yourself. You can delete the PVC's using the following command. |
It is because they are not deleted by default, and you need to remove them yourself. You can delete the PVC's using the following command. | |
This is because PVCs are not deleted by default, so you need to remove them yourself. You can delete the PVCs using the following command. |
we don't have this step in the docs @ShubhamRwt -- should we add it?
If we are showing an example where we are trying to remove a JBOD disk, then yes. But if we are just showing how to use the endpoint, then no.
kubectl delete pvc data-1-my-cluster-broker-3 -n myproject | ||
``` | ||
|
||
You can remove the other PVC's in the same way. |
You can remove the other PVC's in the same way. | |
You can remove the other PVCs in the same way. |
|
||
#### Additional notes | ||
|
||
1. This feature only works if JBOD storage is enabled and multiple disks are used else you will be prompted for not having enough volumes to move the data to. |
1. This feature only works if JBOD storage is enabled and multiple disks are used else you will be prompted for not having enough volumes to move the data to. | |
1. This feature only works if JBOD storage is enabled and multiple disks are used. Otherwise, you will be told that there are not enough volumes to move the data to. |
### What's next | ||
|
||
We hope this blog post has provided you with a clear understanding of how you can use the `KafkaRebalance` custom resource in `remove-disks` mode to easily move data between JBOD disks. | ||
If you encounter any issues or want to know more, refer to our documentation on [Using Cruise Control to reassign partitions on JBOD disks](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str) |
If you encounter any issues or want to know more, refer to our documentation on [Using Cruise Control to reassign partitions on JBOD disks](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str) | |
If you encounter any issues or want to know more, refer to our documentation on [Using Cruise Control to reassign partitions on JBOD disks](https://strimzi.io/docs/operators/latest/deploying#proc-cruise-control-moving-data-str). |
A few more comments. Also, some general comments:
- Kubernetes uses the term volumes - I wonder if it would be more clear if you use volumes as well instead of disks (but feel free to treat this as optional and stick with disks if you want to).
- There is no fencing of the disks. So while you cleaned the 2 volumes with the rebalance, any newly created topics might still be created on these volumes. You seem to cover that in the additional notes 3 and 4. But I think it deserves its own section to make it more clear.
Apache Kafka is a platform which provides durability and fault tolerance by storing messages on disks and JBOD storage is one of the storage configuration types supported by Strimzi. | ||
The JBOD data storage configuration allows Kafka brokers to make use of multiple disks. |
There seems to be a bit of a disconnect / big jump in this sentence. Maybe you can lead it out with something like this?
Apache Kafka is a platform which provides durability and fault tolerance by storing messages on disks and JBOD storage is one of the storage configuration types supported by Strimzi. | |
The JBOD data storage configuration allows Kafka brokers to make use of multiple disks. | |
Apache Kafka is a platform that provides durability and fault tolerance by storing messages on persistent volumes. | |
In most cases, each Kafka broker will use one persistent volume. | |
However, it is also possible to use multiple volumes for each broker. | |
This configuration is called JBOD storage. |
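To make the "multiple volumes for each broker" idea concrete, a JBOD storage declaration in a `KafkaNodePool` might look like the sketch below. The pool name `broker`, the cluster name `my-cluster`, the three volume IDs, and the `100Gi` size are assumptions inferred from the PVC names shown elsewhere in the post (for example `data-2-my-cluster-broker-5`); they are not quoted from the post's own manifest.

```yaml
# Illustrative sketch only: names, IDs, and sizes are assumed, not taken from the post's manifest.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: jbod            # one broker, several persistent volumes
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false   # PVC is kept when the volume is removed from the spec
      - id: 1
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
      - id: 2
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
```

Each entry under `volumes` becomes one PVC per broker, and the `id` values are what the `volumeIds` in a `remove-disks` rebalance refer to.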
Apache Kafka is a platform which provides durability and fault tolerance by storing messages on disks and JBOD storage is one of the storage configuration types supported by Strimzi. | ||
The JBOD data storage configuration allows Kafka brokers to make use of multiple disks. | ||
Using JBOD storage, you can increase the data storage capacity for Kafka nodes, which can further lead to performance improvements. | ||
If you plan to remove a disk that contains partition replicas, the data must be safely moved to other disks first. |
You talk about Moving data between JBOD disks. So it would be good to include also adding volumes -even if briefly in one or two sentences and link to the docs.
If you plan to remove a disk that contains partition replicas, the data must be safely moved to other disks first. | |
It might happen that you need to add or remove volumes to increase or shrink the overall capacity and performance of the Kafka cluster. | |
When adding new volumes, you first need to add the volume and then move some of the data to it. | |
That can be done using the _intrabroker_ rebalance: TODO link to docs. | |
When removing volumes, you first have to safely move the data to other volumes. |
Failing to do so could result in data loss. | ||
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. | ||
|
||
### Cruise Control to move data between JBOD disks |
Following on the comment above - I don't think we want to go into details for adding volumes and focus on removing them for the rest of the blog post. So I would adjust the title accordingly.
I have changed this title to -> New remove-disks mode in KafkaRebalance now
- brokerId: 3 | ||
volumeIds: [1, 2] | ||
- brokerId: 4 | ||
volumeIds: [1, 2] | ||
- brokerId: 5 | ||
volumeIds: [1, 2] |
Maybe you can use the same YAML formatting as you have below in the output from Kubernetes? It might help to avoid confusion between the meaning of
volumeIds: [1, 2]
and
volumeIds:
- 1
- 2
among less experienced users.
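As a sketch of what the block-style form could look like in the full resource: the `mode` value and the broker and volume IDs come from the post, while the resource name `my-rebalance` is hypothetical and `my-cluster` is assumed from the PVC names.

```yaml
# Illustrative sketch: block-style lists instead of flow-style [1, 2].
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: my-rebalance          # hypothetical name
  labels:
    strimzi.io/cluster: my-cluster
spec:
  mode: remove-disks
  moveReplicasOffVolumes:
    - brokerId: 3
      volumeIds:
        - 1
        - 2
    - brokerId: 4
      volumeIds:
        - 1
        - 2
    - brokerId: 5
      volumeIds:
        - 1
        - 2
```

Both spellings parse to the same list, so this is purely about matching what users will see when they read the resource back from Kubernetes.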
@scholzj For the second pointer, I have added a separate section as |
But it is not incomplete or missing, or? That would suggest you add it for example in the next release. But AFAIK this is not supported in Kafka and there are no plans to change this, or? |
In the upstream CC PR, they have mentioned this as |
I don't think it can be done in Cruise Control. It would need to be done in Kafka so that you can fence the disks there. I think it is important to give the users the right expectations. For me:
|
Based on what I understood from the last comment, I will change the header to just -> What's missing and then update it with more context |
Signed-off-by: ShubhamRwt <[email protected]>
I left a few more nits. But it looks good to me overall. Nice work, thanks.
Failing to do so could result in data loss. | ||
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool, which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. | ||
|
||
### New `remove-disks` mode in KafkaRebalance |
If you check the preview, the formatting looks bad due to some bad styles I guess. Maybe you can just leave out the formatting?
### New `remove-disks` mode in KafkaRebalance | |
### New remove-disks mode in KafkaRebalance |
That can be done using the [_intrabroker_](https://strimzi.io/docs/operators/in-development/deploying#con-rebalance-str) rebalance. | ||
When removing volumes, you first have to safely move the data to other volumes. | ||
Failing to do so could result in data loss. | ||
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool, which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. |
Maybe check with @PaulRMellor ... but I wonder if something like this would sound better.
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool, which is not very user-friendly, therefore in Strimzi 0.45.0 we introduced the ability to move data between the JBOD disks using Cruise Control. | |
Moving data between the JBOD disks can be done using the `kafka-reassign-partitions.sh` tool, which is not very user-friendly. | |
Therefore - in Strimzi 0.45.0 - we introduced the ability to move data between the JBOD disks using Cruise Control. |
sessionId: 028a7dc8-8f6d-485e-8580-93225528b587 | ||
``` | ||
|
||
Now you can use the `approve` annotation to apply the generated proposal. |
Should we link to the docs for more details about the approval? People who try it for the first time might find that useful.
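For first-time readers, approval amounts to adding one annotation to the `KafkaRebalance` resource once it reports `ProposalReady`. A minimal sketch of the relevant metadata, assuming a resource named `my-rebalance` (a hypothetical name) that belongs to `my-cluster`:

```yaml
# Illustrative fragment: only the metadata relevant to approval is shown.
metadata:
  name: my-rebalance                   # hypothetical name
  labels:
    strimzi.io/cluster: my-cluster
  annotations:
    strimzi.io/rebalance: approve      # asks the operator to execute the ready proposal
```

In practice the annotation is usually applied with `kubectl annotate` rather than by editing the resource, and the linked docs would be the place for the other annotation values.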
Type of change
Select the type of your PR