Migrating Existing Qlik Tasks to Kafka

Back up your existing Qlik tasks

If you are using the accelerator for the first time, create the source endpoint and the Kafka target endpoint, then create a new task that includes all the required tables.
If you have used an earlier version of the accelerator, export all your existing tasks before upgrading it, as part of the standard upgrade process.

Replace Existing Actian Vector Target Endpoint Connection with Kafka

After completing the installation and confirming that all components are running, create a Kafka endpoint with the settings below. Previous versions required a separate target endpoint for each Actian Vector schema. Now a single Kafka endpoint is enough.

Kafka Broker List:
Example: localhost:9092, localhost:9093, localhost:9094
Topic Naming Convention:
Example: source.schema.table
Message Properties:
1. Set the format to Avro.
2. Set Compression to Snappy.
3. Select Use Logical data types for specific data types.
4. Select Encode the message key in Avro format.
Data Message Publishing:
1. Select Separate topic for each table.
2. Select By message key as the Partition strategy.
3. Select Primary key columns as the Message key.
Metadata Message Publishing:
1. For the Publish option, select Publish data schemas to Confluent Schema Registry.
2. Input the Schema Registry server(s) (for example, localhost:8081).
Schema Registry Subject Properties:
1. Select Topic name for Subject name strategy.
2. Select Use Schema Registry defaults for Subject compatibility mode.
Test and Save:
1. Once all the required information is entered, click Test Connection.
2. If the test succeeds, click Save. Otherwise, fix the reported errors, test again, and then save the connection.
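
The endpoint values above can be sanity-checked before they are typed into the endpoint form. The following is a minimal Python sketch, not part of the accelerator: the function names are hypothetical, and the subject names assume that the Topic name strategy behaves like Confluent's TopicNameStrategy, which registers schemas under `<topic>-key` and `<topic>-value`.

```python
def parse_broker_list(raw: str) -> list[tuple[str, int]]:
    """Split a comma-separated 'host:port' list and validate each entry."""
    brokers = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers


def topic_name(source: str, schema: str, table: str) -> str:
    """Build a topic name following the source.schema.table convention."""
    return f"{source}.{schema}.{table}"


def subject_names(topic: str) -> tuple[str, str]:
    """Key and value subjects, assuming Confluent's TopicNameStrategy."""
    return f"{topic}-key", f"{topic}-value"


brokers = parse_broker_list("localhost:9092, localhost:9093, localhost:9094")
topic = topic_name("source", "schema", "table")
key_subject, value_subject = subject_names(topic)
```

With Separate topic for each table selected, each replicated table would then map to one topic and, under the stated assumption, two Schema Registry subjects.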

Import Existing Qlik Tasks

Import Tasks:
- Once the Kafka endpoint is ready, import all previously exported tasks.
- Before importing, change the task name if you want to keep both the existing tasks and the new tasks that use Kafka as the target endpoint.
Change Target Endpoint to Kafka Endpoint:

After successfully importing all tasks, navigate to each task, remove the existing endpoint, and replace it with the Kafka endpoint. Save the task.
Apply Global Transformations:

In the Qlik Replication tool, use Global Rules to add global transformations that apply placeholders to table names and column names, so that special characters are supported. Save the task. For details, refer to Support Special Characters in Table and Column Names via Global Rules in Qlik Replication.
Remove Actian Vector Metadata:

After importing the task, open Task Settings and clear the target table schema details.
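
Renaming tasks before import can be scripted when many exports are involved. The sketch below is illustrative only: it assumes the exported file is JSON with a top-level `cmd.replication_definition` object holding a `tasks` list whose entries carry `task.name`. Check that layout against your own export before relying on it. The `_kafka` suffix and the function name are hypothetical.

```python
import copy


def rename_tasks(export: dict, suffix: str = "_kafka") -> dict:
    """Return a copy of an exported task document with each task renamed.

    The key names used here are an assumed layout, not a documented contract.
    """
    renamed = copy.deepcopy(export)  # leave the original export untouched
    definition = renamed.get("cmd.replication_definition", {})
    for entry in definition.get("tasks", []):
        task = entry.get("task", {})
        if "name" in task:
            task["name"] = task["name"] + suffix
    return renamed


# Tiny stand-in for an exported task file (assumed structure).
sample_export = {
    "cmd.replication_definition": {
        "tasks": [{"task": {"name": "orders_to_vector"}}]
    }
}
renamed_export = rename_tasks(sample_export)
```

Working on a copy keeps the original export available as the backup described at the start of this guide.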

Create Sink Connectors

Create Sink Connector:
- Use the Qlik task JSON file exported before the migration to create a sink connector, which replicates data from Kafka to the target Actian Vector database.
- Access the sink connector API at http://{IP Address}/Swagger/index.html.
Create Sink Connector with Table Creation: If you are creating a new replication (the target tables do not exist yet), use the following endpoint to create the target tables along with the sink connectors:
- SourceDatabaseType: Select the source database type (SQL, ORACLE, DB2).
- SourceDatabaseConnectionString: Connection string of the source database.
- TargetDatabaseUserName: Target database username.
- TargetDatabasePassword: Target database password.
- NumberOfKafkaConnector: Number of Kafka connectors to be created (default 5).
- Request Body: Post the Qlik task JSON file to create the target tables and sink connectors.
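
As an illustration of how the parameters above fit together, the following Python sketch assembles a request URL. It assumes the parameters travel as query-string values, which may not match the real API, and the endpoint path is a hypothetical placeholder. Take the actual path and parameter placement from the Swagger page. The exported Qlik task JSON would be sent separately as the request body.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ALLOWED_SOURCE_TYPES = {"SQL", "ORACLE", "DB2"}


def build_sink_connector_url(base_url, endpoint_path, source_type,
                             source_conn, target_user, target_password,
                             connector_count=5):
    """Assemble the request URL; endpoint_path is supplied by the caller."""
    if source_type not in ALLOWED_SOURCE_TYPES:
        raise ValueError(f"unsupported source database type: {source_type!r}")
    if connector_count < 1:
        raise ValueError("connector_count must be at least 1")
    # Assumption: parameters are query-string values. Sending a password this
    # way is only reasonable over a trusted, encrypted connection.
    query = urlencode({
        "SourceDatabaseType": source_type,
        "SourceDatabaseConnectionString": source_conn,
        "TargetDatabaseUserName": target_user,
        "TargetDatabasePassword": target_password,
        "NumberOfKafkaConnector": connector_count,
    })
    return f"{base_url.rstrip('/')}/{endpoint_path.lstrip('/')}?{query}"


# Hypothetical host, path, and credentials, for illustration only.
url = build_sink_connector_url(
    "http://192.0.2.10", "/hypothetical/create-with-tables",
    "ORACLE", "host=db;port=1521", "vector_user", "secret",
)
params = parse_qs(urlsplit(url).query)
```
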
Create Sink Connector without Table Creation: If you are resuming an existing process (the full load has already completed), use the following endpoint to create Kafka connectors without dropping and recreating the tables:
- TargetDatabaseUserName: Target database username.
- TargetDatabasePassword: Target database password.
- NumberOfKafkaConnector: Number of Kafka connectors to be created (default 5).
- Request Body: Post the Qlik task JSON file to distribute tables into different Kafka connectors for replication.
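
How the API spreads tables across connectors is not described here. One plausible strategy is round-robin assignment, sketched below in Python purely to show what NumberOfKafkaConnector controls. The function name and table names are hypothetical, and the accelerator's actual logic may differ.

```python
def distribute_tables(tables, connector_count=5):
    """Assign tables to connectors round-robin (an assumed strategy)."""
    if connector_count < 1:
        raise ValueError("connector_count must be at least 1")
    groups = [[] for _ in range(connector_count)]
    for index, table in enumerate(tables):
        groups[index % connector_count].append(table)
    # Drop empty groups when there are fewer tables than connectors.
    return [group for group in groups if group]


tables = [f"schema.table{n}" for n in range(7)]
assignment = distribute_tables(tables, connector_count=3)
```
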
Run Qlik Task: Once the target table schema is ready, run the Qlik task to reload the target or resume processing.

Start the Qlik Task

Before starting, confirm that the target tables have been created, the sink connector configurations have been posted, and the exported task has been pasted into the Sink API Manager.
Start the Qlik task to begin data replication.

Note: The sink connectors are created from the exported task JSON, so replication works smoothly only if the export matches the task you start.

Monitor Replication in Kafka UI

Monitor Progress:
1. Use Kafka UI to monitor replication progress.
2. Check topic-wise message flow and consumer lag.
  - Topic-wise message flow: Each topic corresponds to a table. You will see the number of messages per topic.
  - Consumer Lag:
    - If consumer lag = 0, all records in that topic have been moved to the target database.
    - If lag > 0, there are still records pending processing.
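
Consumer lag is the difference between the latest offset written to a partition and the offset the sink connector's consumer group has committed. A minimal Python sketch of that arithmetic follows. The offset values are made up, and in practice Kafka UI reports these numbers for you.

```python
def partition_lag(end_offsets, committed_offsets):
    """Lag per partition: log end offset minus committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        # A partition with no committed offset is treated as fully pending,
        # which assumes the log starts at offset 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


def total_lag(end_offsets, committed_offsets):
    """Sum of the lag across all partitions of a topic."""
    return sum(partition_lag(end_offsets, committed_offsets).values())


end_offsets = {0: 100, 1: 50}   # latest offsets per partition (example data)
committed = {0: 100, 1: 40}     # offsets committed by the sink connector
lag_by_partition = partition_lag(end_offsets, committed)
pending = total_lag(end_offsets, committed)
```

Here partition 0 is fully drained and partition 1 still has 10 records waiting to reach the target database.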

Failure Handling: If replication failures occur, check the dead-letter queue (DLQ) topics for the failed records and their exceptions.

Connector Failures
- Check the Sink Connector status in the Kafka Connect UI or via the API.
- If the connector has failed or is in a degraded state, replication to the target database will stop.
- Review the connector logs for:
  - Schema mismatches
  - Network timeouts
  - Target database constraints or errors
- Restart or redeploy the connector after resolving the issue.
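
When checking status via the API, the Kafka Connect REST interface returns a JSON document from `GET /connectors/{name}/status` with a connector state and one state per task. The Python sketch below picks out the failed parts of such a document. The connector name and the sample payload are invented for illustration.

```python
def failed_components(status):
    """List the failed parts of a Kafka Connect status document."""
    failed = []
    if status.get("connector", {}).get("state") == "FAILED":
        failed.append("connector")
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            failed.append(f"task {task.get('id')}")
    return failed


# Example payload in the shape returned by GET /connectors/{name}/status.
sample_status = {
    "name": "vector-sink-1",  # hypothetical connector name
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.5:8083"},
    "tasks": [
        {"id": 0, "state": "RUNNING", "worker_id": "10.0.0.5:8083"},
        {"id": 1, "state": "FAILED", "worker_id": "10.0.0.5:8083",
         "trace": "org.apache.kafka.connect.errors.ConnectException: ..."},
    ],
}
problems = failed_components(sample_status)
```

The `trace` field of a failed task usually points at the cause. After fixing it, `POST /connectors/{name}/tasks/{id}/restart` restarts a single task and `POST /connectors/{name}/restart` restarts the connector.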

Monitor Broker Health and Disk Usage

To ensure stability, monitor broker health status and disk space consumption.

Note: Overloaded brokers or disks may lead to increased lag or replication failures.

Recommendations

Run the following source-specific and target-specific SQL queries to monitor ongoing data replication. Replace the sample schema and table names with your own.

Oracle:

SELECT SUM(row_count) AS total_rows
FROM (
  SELECT COUNT(*) AS row_count FROM R12QA2_REP.AP_AE_HEADERS_ALL_1
  UNION ALL
  SELECT COUNT(*) FROM R12QA2_REP.SAMPLE_ALL_DATATYPES
);

MSSQL:

SELECT SUM(p.rows) AS total_rows
FROM sys.tables t
JOIN sys.partitions p ON t.object_id = p.object_id
WHERE t.name IN ('table1', 'table2') AND p.index_id IN (0,1);

Actian Vector:

SELECT SUM(num_rows) AS total_rows
FROM iitables
WHERE table_name IN ('table1', 'table2') AND table_owner LIKE 'kafkalatest%';
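
The totals returned by the source and target queries can be compared to estimate how far replication has progressed. The following Python sketch shows that comparison with made-up numbers. The function name is hypothetical. Some of these queries read catalog statistics (for example, sys.partitions and iitables), which may be approximate, so treat the result as an indication, not an exact reconciliation.

```python
def replication_progress(source_rows, target_rows):
    """Compare source and target row totals from the monitoring queries."""
    pending = source_rows - target_rows
    if source_rows == 0:
        percent = 100.0  # nothing to replicate
    else:
        percent = round(target_rows / source_rows * 100, 2)
    return {"pending_rows": pending, "percent_complete": percent}


# Example totals, as if taken from the source and Actian Vector queries.
progress = replication_progress(source_rows=1000, target_rows=750)
```
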