Confluent Kafka

Overview

Here's an overview of the data transfer process from Qlik to Actian Vector using Kafka:

Data Origin: Data originates in Qlik and is sent to Kafka topics.
Intermediary Role: Kafka acts as an intermediary to facilitate the transfer of data between Qlik and Actian Vector.
Kafka Server: The Kafka server (broker) is the backbone of this setup; it stores the topics and serves them to producers and consumers.
Kafka Connect: A tool within the Kafka ecosystem, used to integrate Kafka with external systems.
- JDBC Sink Connector: Connects Kafka topics with Actian Vector, translating topic data into database records.
Actian Vector: The target database that stores data from Kafka topics.
Data Insertion: The JDBC Sink Connector ensures proper data insertion into Vector tables.
Schema Registry: If in use, it manages data schemas across Kafka topics to ensure compatibility between data producers (Qlik) and consumers (Actian Vector) via the JDBC connector.

In short, Qlik publishes data to Kafka topics, and the JDBC Sink Connector running in Kafka Connect writes that data into Actian Vector tables.

Architecture

The main components for this approach are Zookeeper, Kafka Brokers, Schema Registry, Kafka Connect, and Connector Plugins.

Qlik: The data source. Tasks configured in Qlik push data into Kafka, which acts as the intermediary.
Zookeeper: Coordinates the Kafka Broker nodes within the Kafka Cluster.
Kafka Broker: Stores messages in topics. Each topic is divided into partitions, so its data can be spread across brokers.
Schema Registry: Provides REST API endpoints to post/get schema details and uses Kafka Broker to store schema details into specific schema topics.
Kafka Connect: Connects Kafka to different target systems and provides REST API endpoints to get/post connectors.
Connectors: Plugins installed on the Kafka Connect worker. They create the tasks that insert data into target systems.
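
The REST endpoints exposed by Schema Registry and Kafka Connect can be explored with curl. The following is a minimal sketch, assuming both services listen on localhost at their default ports (8081 and 8083). The variable names and helper function names are illustrative, not part of any Confluent tooling.

```shell
# Base URLs are assumptions (default ports on localhost); override via the environment.
SCHEMA_REGISTRY_URL="${SCHEMA_REGISTRY_URL:-http://localhost:8081}"
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# List the subjects (schema names) registered in Schema Registry.
list_subjects() {
  curl -fsS "${SCHEMA_REGISTRY_URL}/subjects"
}

# List the connector plugins installed on the Kafka Connect worker.
list_connector_plugins() {
  curl -fsS "${CONNECT_URL}/connector-plugins"
}

# List the names of the connectors currently registered.
list_connectors() {
  curl -fsS "${CONNECT_URL}/connectors"
}

# Build the status URL for one connector; the name is supplied by the caller.
connector_status_url() {
  printf '%s/connectors/%s/status\n' "${CONNECT_URL}" "$1"
}
```

Once the services are running, `list_connector_plugins` shows which connector classes the worker can load, and `curl -fsS "$(connector_status_url my-sink)"` reports the state of a connector, where `my-sink` is a placeholder name.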

Starting Kafka Cluster

To start the Kafka Cluster, follow these steps:

1. Zookeeper

Zookeeper keeps track of the Kafka Broker nodes in the cluster and coordinates them, which helps keep the data available when a broker fails.

Adjust the settings in the /etc/kafka/zookeeper.properties file as required.
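
Before starting a service, it can help to confirm which values its properties file actually contains. The helper below is a small sketch for reading a single key from a Java-style properties file; `get_prop` is a hypothetical name, and the function ignores commented-out lines.

```shell
# Print the value of one key from a .properties file.
# Commented-out lines and unknown keys produce no output.
get_prop() {
  file="$1"
  # Escape dots so a key such as "zookeeper.connect" is matched literally.
  key_re=$(printf '%s' "$2" | sed 's/[.]/\\./g')
  # Strip "key=" from matching lines; if the key repeats, the last entry wins.
  sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}
```

For example, `get_prop /etc/kafka/zookeeper.properties clientPort` prints the port Zookeeper listens on (commonly 2181), and `get_prop /etc/kafka/zookeeper.properties dataDir` prints its data directory. The same helper works for the other properties files mentioned in this section.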

Start Zookeeper by running:

sudo systemctl start confluent-zookeeper

2. Kafka Broker

Kafka Broker is the Kafka server that stores all the topics and handles messages.

Change the configurations as required in the /etc/kafka/server.properties file.

Start Kafka Broker by running:
```
sudo systemctl start confluent-kafka
```
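
To confirm that the broker is reachable and to see how topics are partitioned, the `kafka-topics` command-line tool shipped with Confluent Platform can be used. This sketch assumes a plaintext listener on localhost:9092; the variable and function names are illustrative.

```shell
# Broker address is an assumption (default plaintext listener on localhost).
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# List all topics known to the broker.
list_topics() {
  kafka-topics --bootstrap-server "${BOOTSTRAP_SERVER}" --list
}

# Show partition count, replication factor, and partition leaders for one topic.
describe_topic() {
  kafka-topics --bootstrap-server "${BOOTSTRAP_SERVER}" --describe --topic "$1"
}
```

After Qlik has started publishing, `list_topics` should include the topics used by the task, and `describe_topic example_topic` (a placeholder topic name) shows how that topic is laid out across partitions.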

3. Schema Registry

Schema Registry is a component in the Kafka ecosystem that maintains message schemas. It provides REST API endpoints, allowing Kafka producers (Qlik) and Kafka Connect (consumers) to access the message schemas.

Configure the schema-registry properties in the /etc/schema-registry/schema-registry.properties file.

Start the schema-registry by running:

sudo systemctl start confluent-schema-registry

4. Kafka Connect

Kafka Connect is a tool used to connect Kafka to target systems. It relies on connector plugins to do the actual message processing.

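Kafka Connect can run in distributed or standalone mode, and each mode has its own properties file. The distributed-mode file is /etc/kafka/connect-distributed.properties; the standalone equivalent is typically /etc/kafka/connect-standalone.properties.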

Start Kafka Connect by running:

sudo systemctl start confluent-kafka-connect
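
With Kafka Connect running, a JDBC Sink Connector can be registered through its REST API. The following is a sketch only: the connector name, topic, credentials, file path, and JDBC URL are all placeholders, and the `jdbc:ingres://` URL form is an assumption about the Actian JDBC driver that should be checked against the driver documentation. It also assumes the JDBC connector plugin and the Actian JDBC driver JAR are already available to the Connect worker.

```shell
# Assumed defaults; override via the environment.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"
CONFIG_FILE="${CONFIG_FILE:-/tmp/vector-jdbc-sink.json}"   # hypothetical file path

# Write the connector definition. Every value below is a placeholder.
cat > "${CONFIG_FILE}" <<'EOF'
{
  "name": "vector-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "example_topic",
    "connection.url": "jdbc:ingres://vector-host:VW7/exampledb",
    "connection.user": "example_user",
    "connection.password": "example_password",
    "insert.mode": "insert",
    "auto.create": "false",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081"
  }
}
EOF

# Submit the definition to Kafka Connect (POST /connectors).
register_connector() {
  curl -fsS -X POST -H "Content-Type: application/json" \
    --data @"${CONFIG_FILE}" "${CONNECT_URL}/connectors"
}
```

Calling `register_connector` creates the connector. The Avro converter settings tie the connector to Schema Registry, so the connector can decode the Avro messages that Qlik publishes.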

After starting all the components, initiate the full load from the Qlik Replicate side.

Note: In this setup, messages published through Schema Registry use the Avro format, so select the corresponding Avro options in the Qlik Replicate task.
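
Once the full load has started, the schemas registered by the task can be inspected through the Schema Registry REST API. This sketch assumes Schema Registry on localhost:8081. The subject name depends on the subject name strategy chosen in the Replicate task, so `example_subject` in the usage note is a placeholder; the function names are illustrative.

```shell
# Assumed default; override via the environment.
SCHEMA_REGISTRY_URL="${SCHEMA_REGISTRY_URL:-http://localhost:8081}"

# Build the URL for the latest registered version of a subject.
latest_schema_url() {
  printf '%s/subjects/%s/versions/latest\n' "${SCHEMA_REGISTRY_URL}" "$1"
}

# Fetch the latest schema registered under a subject.
show_latest_schema() {
  curl -fsS "$(latest_schema_url "$1")"
}
```

Running `show_latest_schema example_subject` returns a JSON document whose `schema` field holds the Avro schema for that subject.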