Confluent Kafka
Overview
Here's an overview of the data transfer process from Qlik to Actian Vector using Kafka:
Data Origin: Data originates in Qlik and is sent to Kafka topics.
Intermediary Role: Kafka acts as an intermediary to facilitate the transfer of data between Qlik and Actian Vector.
Kafka Server: The Kafka server (broker) is the backbone of this setup; it stores and manages the topics.
Kafka Connect: A tool within the Kafka ecosystem, used to integrate Kafka with external systems.
JDBC Sink Connector: Connects Kafka topics with Actian Vector, translating topic data into database records.
Actian Vector: The target database that stores data from Kafka topics.
Data Insertion: The JDBC Sink Connector ensures proper data insertion into Vector tables.
Schema Registry: If in use, it manages data schemas across Kafka topics to ensure compatibility between data producers (Qlik) and consumers (Actian Vector) via the JDBC connector.
This structured approach ensures efficient and accurate data transfer and storage.
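As an illustration of how the JDBC Sink Connector ties a topic to a Vector table, the sketch below writes a connector definition to a file. The connector class and property names are those of the Confluent JDBC sink connector. The connector name, topic, credentials, and output path are hypothetical placeholders, and the jdbc:ingres URL form is an assumption that should be checked against the Actian JDBC driver documentation.

```shell
# Sketch only: values marked "hypothetical" are placeholders, not values from this guide.
write_sink_config() {
  cat > "$1" <<'EOF'
{
  "name": "vector-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "qlik.orders",
    "connection.url": "jdbc:ingres://VECTOR_HOST:PORT/DATABASE",
    "connection.user": "dbuser",
    "connection.password": "change-me",
    "insert.mode": "insert",
    "auto.create": "false"
  }
}
EOF
}
# "vector-jdbc-sink", "qlik.orders", "dbuser", and "change-me" are hypothetical;
# the connection.url is an assumed URL form with placeholders left unfilled.

# Write the definition to a scratch location (hypothetical path).
write_sink_config "${TMPDIR:-/tmp}/vector-jdbc-sink.json"
```

A definition like this would typically be submitted to the Kafka Connect REST API once Kafka Connect is running, for example with curl -X POST -H "Content-Type: application/json" --data @vector-jdbc-sink.json against the /connectors endpoint.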
Architecture
The main components for this approach are Zookeeper, Kafka Brokers, Schema Registry, Kafka Connect, and Connector Plugins.
Qlik: Serves as the source of the data. Tasks configured in Qlik push data into Kafka, the intermediary component.
Zookeeper: Manages multiple Kafka Broker nodes within the Kafka Cluster.
Kafka Broker: Stores messages in topics; each topic can be divided into partitions.
Schema Registry: Provides REST API endpoints to register and retrieve schemas, and uses the Kafka Broker to store them in a dedicated schema topic.
Kafka Connect: Connects Kafka to different target systems and provides REST API endpoints to get/post connectors.
Connectors: Plugins added to Kafka Connect Worker that create tasks and handle the insertion of data into target systems.
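The REST API endpoints mentioned above can be queried with curl. The following is a minimal sketch, assuming both services run on the local host with their default ports (8081 for Schema Registry, 8083 for Kafka Connect); adjust the base URLs if your installation differs.

```shell
# Assumed base URLs: default ports on a single local host.
SCHEMA_REGISTRY_URL="${SCHEMA_REGISTRY_URL:-http://localhost:8081}"
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# List the subjects (schema names) registered in Schema Registry.
list_subjects() { curl -s "${SCHEMA_REGISTRY_URL}/subjects"; }

# List the connectors currently registered with Kafka Connect.
list_connectors() { curl -s "${CONNECT_URL}/connectors"; }

# List the connector plugins installed on the Kafka Connect worker.
list_plugins() { curl -s "${CONNECT_URL}/connector-plugins"; }
```

Each function prints the JSON response returned by the service, so a call such as list_plugins is a quick way to confirm that the JDBC Sink Connector plugin is visible to the worker.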
Starting Kafka Cluster
To start the Kafka Cluster, follow these steps:
1. Zookeeper
Zookeeper coordinates the Kafka Broker nodes in the cluster, which helps keep the cluster highly available.
Change the configurations as required in the /etc/kafka/zookeeper.properties file.
Start Zookeeper by running:
sudo systemctl start confluent-zookeeper
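When reviewing a properties file before or after editing it, it can help to see only the settings that are actually in effect. The helper below is a small sketch (the function name show_settings is hypothetical) that hides comments and blank lines.

```shell
# Print only the active settings of a properties file:
# comment lines (starting with #) and blank lines are filtered out.
show_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

For example, show_settings /etc/kafka/zookeeper.properties lists the values Zookeeper reads at startup. The same helper works for the other properties files in these steps.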
2. Kafka Broker
Kafka Broker is the Kafka server that stores all the topics and handles messages.
Change the configurations as required in the /etc/kafka/server.properties file.
Start Kafka Broker by running:
sudo systemctl start confluent-kafka
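One way to confirm that the broker is accepting connections is to list its topics. This sketch assumes the kafka-topics command from the Confluent packages is on the PATH and that the broker listens on localhost:9092; both are assumptions about your installation.

```shell
# Assumptions: kafka-topics is on the PATH and the broker listens on localhost:9092.
KAFKA_TOPICS_BIN="${KAFKA_TOPICS_BIN:-kafka-topics}"
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Ask the broker for its topic list; a connection error here usually means
# the broker is not up yet or is listening on a different address.
list_topics() {
  "$KAFKA_TOPICS_BIN" --bootstrap-server "$BOOTSTRAP_SERVER" --list
}
```

Calling list_topics on a freshly started broker may return an empty list, which still shows that the broker is reachable.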
3. Schema Registry
Schema Registry is a component in the Kafka ecosystem that maintains message schemas. It provides REST API endpoints, allowing Kafka producers (Qlik) and Kafka Connect (consumers) to access the message schemas.
Configure the Schema Registry properties in the /etc/schema-registry/schema-registry.properties file.
Start Schema Registry by running:
sudo systemctl start confluent-schema-registry
4. Kafka Connect
Kafka Connect is a tool used to connect Kafka to target systems. It requires connectors, installed as plugins, to do the actual message processing.
Kafka Connect can run in distributed or standalone mode, each with its own properties file. The distributed-mode file is /etc/kafka/connect-distributed.properties.
Start Kafka Connect by running:
sudo systemctl start confluent-kafka-connect
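The four start commands above must run in order, because each component depends on the previous one. The sketch below wraps them in one function and stops at the first unit that fails to start. The function name start_stack and the SYSTEMCTL override are hypothetical conveniences; the unit names are the ones used in the steps above.

```shell
# SYSTEMCTL can be overridden (for example, SYSTEMCTL=echo for a dry run).
SYSTEMCTL="${SYSTEMCTL:-sudo systemctl}"
UNITS="confluent-zookeeper confluent-kafka confluent-schema-registry confluent-kafka-connect"

# Start each unit in dependency order and check that systemd reports it active.
# Note: "active" means the process is running, not that it is ready for traffic.
start_stack() {
  for unit in $UNITS; do
    $SYSTEMCTL start "$unit" || return 1
    $SYSTEMCTL is-active --quiet "$unit" || {
      echo "$unit is not active" >&2
      return 1
    }
  done
}
```

Running start_stack returns a nonzero status if any unit fails, which makes it easy to use in a provisioning script.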
After starting all the components, initiate the full load from the Qlik Replicate side.
Note: In this setup, messages registered with Schema Registry must be in Avro format, so select the Avro options in the Qlik Replicate Task.
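If Qlik Replicate writes Avro messages, the Kafka Connect worker presumably needs a matching Avro converter to read them. This is an assumption based on how Kafka Connect converters generally work, not a step stated in this guide. The sketch below checks whether a worker properties file sets the Confluent Avro converter for message values; the function name uses_avro_converter is hypothetical.

```shell
# Returns 0 if the given worker properties file sets the Avro value converter,
# and 1 otherwise (including when the line is commented out).
uses_avro_converter() {
  grep -Eq '^value\.converter[[:space:]]*=[[:space:]]*io\.confluent\.connect\.avro\.AvroConverter[[:space:]]*$' "$1"
}
```

For example, uses_avro_converter /etc/kafka/connect-distributed.properties && echo "Avro converter configured" prints the message only when the setting is present.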