|
| 1 | +# Kafka replicator |
| 2 | + |
| 3 | +*Kafka Replicator* is an easy to use tool for copying data between two Apache Kafka clusters with configurable re-partitionning strategy. |
| 4 | + |
| 5 | +Data will be read from topics in the origin cluster and written to a topic/topics in the destination cluster according config rules. |
| 6 | + |
| 7 | + |
| 8 | +# Features |
| 9 | + |
| 10 | +Lets start with an overview of features that exist in kafka-replicator: |
| 11 | + |
| 12 | + * [x] *Data replication:* Real-time event streaming between Kafka clusters and data centers; |
| 13 | + * [ ] *Schema replication:* Copy schema from source cluster to destination; |
| 14 | + * [x] *Flexible topic selection:* Select topics with configurable config; |
| 15 | + * [ ] *Auto-create topics:* Destination topics are automatically created for strict_p2p strategy; |
| 16 | + * [x] *Stats:* The tool shows replication status; |
| 17 | + * [ ] *Monitoring:* Kafka replicator exports stats via prometheus. |
| 18 | + * [-] *Cycle detection* |
| 19 | + |
| 20 | + |
| 21 | + |
| 22 | +# Use cases |
| 23 | + |
| 24 | + * Replicate data between Kafka clusters; |
| 25 | + * Aggregate record from several topics and put them into one; |
| 26 | + * Extend bandwidth for exist topic via repartitioning strategy. |
| 27 | + |
| 28 | + |
| 29 | +# Installation |
| 30 | + |
| 31 | + |
| 32 | +### System dependencies |
| 33 | + |
| 34 | +``` shell |
| 35 | +libsasl2-dev |
| 36 | +libssl-dev |
| 37 | +``` |
| 38 | + |
| 39 | + |
| 40 | +## Install from crates.io |
| 41 | + |
| 42 | + |
| 43 | +If you have the Rust toolchain already installed on your local system. |
| 44 | + |
| 45 | +``` shell |
| 46 | +rustup update stable |
| 47 | +cargo install kafka-replicator |
| 48 | +``` |
| 49 | + |
| 50 | + |
| 51 | +## Compile and run it from sources |
| 52 | + |
| 53 | +Clone the repository and change it to your working directory. |
| 54 | + |
| 55 | +```shell |
| 56 | +git clone https://github.com/lispython/kafka-replicator.git |
| 57 | +cd kafka-replicator |
| 58 | + |
| 59 | +rustup override set stable |
| 60 | +rustup update stable |
| 61 | +cargo install |
| 62 | +``` |
| 63 | + |
| 64 | + |
| 65 | +# Usage |
| 66 | + |
| 67 | +``` shell |
| 68 | +RUST_LOG=info kafka-replicator /path/to/config.yml |
| 69 | +``` |
| 70 | + |
| 71 | +## Run it using Docker |
| 72 | + |
| 73 | + |
| 74 | +``` shell |
| 75 | +sudo docker run -it -v /replication/:/replication/ -e RUST_LOG=info lispython/kafka_replicator:latest kafka-replicator /replication/config.yml |
| 76 | +``` |
| 77 | + |
| 78 | +### Example config |
| 79 | + |
| 80 | + |
| 81 | +``` yaml |
| 82 | +clusters: |
| 83 | + - name: cluster_1 |
| 84 | + hosts: |
| 85 | + - replicator-kafka-1:9092 |
| 86 | + - replicator-kafka-1:9092 |
| 87 | + - name: cluster_2 |
| 88 | + hosts: |
| 89 | + - replicator-kafka-2:9092 |
| 90 | + |
| 91 | +clients: |
| 92 | + - client: cl_1_client_1 |
| 93 | + cluster: cluster_1 |
| 94 | + config: # optional |
| 95 | + message.timeout.ms: 5000 |
| 96 | + auto.offset.reset: earliest |
| 97 | + - client: cl_2_client_1 |
| 98 | + cluster: cluster_2 |
| 99 | + |
| 100 | +routes: |
| 101 | + - upstream_client: cl_1_client_1 |
| 102 | + downstream_client: cl_1_client_1 |
| 103 | + upstream_topics: |
| 104 | + - 'topic1' |
| 105 | + downstream_topic: 'topic2' |
| 106 | + repartitioning_strategy: random # strict_p2p | random |
| 107 | + upstream_group_id: group_22 |
| 108 | + progress_every_secs: 10 |
| 109 | + limits: |
| 110 | + messages_per_sec: 10000 |
| 111 | + number_of_messages: |
| 112 | + |
| 113 | + - upstream_client: cl_1_client_1 |
| 114 | + downstream_client: cl_2_client_1 |
| 115 | + upstream_topics: |
| 116 | + - 'topic2' |
| 117 | + downstream_topic: 'topic2' |
| 118 | + repartitioning_strategy: strict_p2p |
| 119 | + upstream_group_id: group_22 |
| 120 | + progress_every_secs: 10 |
| 121 | + |
| 122 | + - upstream_client: cl_2_client_1 |
| 123 | + downstream_client: cl_1_client_1 |
| 124 | + upstream_topics: |
| 125 | + - 'topic2' |
| 126 | + downstream_topic: 'topic3' |
| 127 | + repartitioning_strategy: strict_p2p # strict_p2p | random |
| 128 | + default_begin_offset: earliest # optional |
| 129 | + upstream_group_id: group_2 |
| 130 | + progress_every_secs: 10 |
| 131 | + |
| 132 | + |
| 133 | +observers: |
| 134 | + - client: cl_1_client_1 |
| 135 | + name: "my name" |
| 136 | + topics: |
| 137 | + - 'topic1' |
| 138 | + - 'topic2' |
| 139 | + fetch_timeout_secs: 5 |
| 140 | + show_progress_every_secs: 10 |
| 141 | + |
| 142 | + - client: cl_2_client_1 |
| 143 | + topic: 'topic3' |
| 144 | + topics: |
| 145 | + - 'topic2' |
| 146 | + show_progress_every_secs: 5 |
| 147 | + |
| 148 | + |
| 149 | + - client: cl_1_client_1 |
| 150 | + topic: 'topic1' |
| 151 | + topics: [] |
| 152 | +``` |
| 153 | +
|
| 154 | +
|
| 155 | +### Options describing |
| 156 | +
|
| 157 | +Root config options: |
| 158 | + - _clusters_ - are a list of Kafka Clusters |
| 159 | + - _clients_ - are a list of configurations for consumers |
| 160 | + - _routes_ - are a list of replication rules |
| 161 | + - _observers_ - are a list of observers |
| 162 | +
|
| 163 | +
|
| 164 | +### Contributing |
| 165 | +Any suggestion, feedback or contributing is highly appreciated. Thank you for your support! |
0 commit comments