Kafka ecosystem



https://www.infoq.com/articles/apache-kafka-best-practices-to-optimize-your-deployment/

High Availability

Integrating a messaging system with persistent storage

In event-driven architectures, events stored as messages are handled by consumers, which may eventually store data in a persistent store, such as an RDBMS or any other database.

In such architectures, all business logic should be driven by events and none by persisted data, as persistent systems are considered the end of the pipeline.

This is completely different from traditional architectures, where microservices read data and run business logic based on that data.

In order to build an event-driven pipeline, where events are produced and consumed, companies with a traditional system may be interested in producing messages from persisted data.

In Kafka, this can be achieved with connector-based solutions:

  • Attunity Replicate: I do not recommend this option as an integration platform. Pricing is quite expensive, and Attunity did not offer any trial version for performing a Proof of Concept.
  • Striim: I could run a Proof of Concept of Striim with a trial version. Striim provided two kinds of connectors for this scenario: a JDBC connector, which pushed messages for all existing data, and a SQL Server Change Data Capture (CDC) connector, which pushed new CRUD events into the pipeline and finally into Kafka. Unfortunately, there was no straightforward, fault-tolerant way to load messages for existing data first and then messages for new CRUD events.
  • Debezium: I ran a Proof of Concept which pushed SQL Server Change Data Capture (CDC) events into Kafka through the SQL Server connector, with different configurations. This tool proved to be the most flexible, allowing it to receive either only messages for new CRUD events on the monitored table or also messages for the existing data; a configuration sketch is shown after this list.
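
For illustration, a Debezium SQL Server connector can be registered against the Kafka Connect REST API with a configuration roughly like the sketch below. Hostnames, credentials, database and table names are placeholders, and exact property names vary between Debezium versions; snapshot.mode governs whether existing rows are also turned into messages (initial) or whether only new CRUD events are streamed.

curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "sqlserver-cdc-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "sqlserver-host",
    "database.port": "1433",
    "database.user": "debezium",
    "database.password": "changeme",
    "database.dbname": "inventory",
    "database.server.name": "sqlserver1",
    "table.include.list": "dbo.orders",
    "snapshot.mode": "initial",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}'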

Tools

Monitoring

UIs

Kafka Tool

An easy-to-use and quick-to-install Kafka UI.

CMAK, formerly Kafka Manager

CMAK is a complete UI, with support for JMX-based Kafka Broker metrics.

Installation notes

Note that Kafka Manager needs to connect to a ZooKeeper ensemble, where it stores some of its own data. To set up Kafka Manager, set in application.conf the ZooKeeper server to which the UI will connect. By default, the UI listens on port 9000.
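
As a sketch, the relevant entry in conf/application.conf looks roughly as follows; the host and port are placeholders, and the property is named kafka-manager.zkhosts in Kafka Manager releases and cmak.zkhosts in recent CMAK releases:

kafka-manager.zkhosts="zookeeper-host:2181"
# In CMAK 3.x the equivalent property is:
# cmak.zkhosts="zookeeper-host:2181"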

Run Kafka Manager

./bin/kafka-manager
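
The configuration file and HTTP port can also be overridden at start-up, for example as below (the flags come from the underlying Play framework; the port value is just an example):

./bin/kafka-manager -Dconfig.file=conf/application.conf -Dhttp.port=9000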

In Windows, depending on the installation path, there may be errors when running Kafka Manager because the generated command line exceeds the maximum length. Open the browser at http://localhost:9000 and add the Kafka cluster by setting the ZooKeeper URL. For basic monitoring, it is not necessary to enable JMX in the Kafka brokers.
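
If the JMX-based broker metrics mentioned above are wanted, JMX can be enabled on each broker before starting it, for instance by exporting the JMX port (a sketch; the port number is an arbitrary choice):

JMX_PORT=9999 ./bin/kafka-server-start.sh config/server.properties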

Schema Registries