How to use KafkaImporter transform option

Using omniSci KafkaImporter to load Kafka Topic data [String with comma separated] to omniSci Table.
How to use transform option in KafkaImporter

  1. To insert -1 for parameters which are missing in Topic data record.
  2. Combine 2 values of topic data record[Date and Time ] and insert a TIMESTAMP column of table.

Hi @zachariam,

The transformation option of KafkaImporter works like the one of StreamImporter so that you can apply a regular expression to one or more fields, and it can’t add or remove columns.

The transformation format is like this
Transformation format: <column name>:s/<regex pattern>/<fmt string>/
and you can specify transformations for more than one column using commas as delimiters, for example, using -t f1:"s/2002/02/", f2:"s/17/12/"
will transform the fields called f1 and f2, and the output of the transformation would be

transform f1: s/2002/02/
transform f2: s/17/12/

You can use the transformations you want using a syntax compatible with regex_replace of boost.

If you want to combine or remove fields using only the HeavyAI’s Kafkaimporter, you must transform the data after the load.

I can create a Jira to track this new feature request.

Regrads,
Candido

can you give regular expression to replace NULL or empty data to replace with -1. i tried s//-1 , but not working

Hi Candido,

Thank you for sharing transform format details of Kafka Importer. Can u please suggest how to achieve if value itself is not present in topic data to use reg expression and replace.

We have a set of Events data
EVENT 1 (Topic1) : Param1, Param2,Param3,Params4
EVENT 2 (Topic2) : Param1,Param5

Table EVENT_TAB to load each event data in real time .
EVENT_TAB(PARAM1 TEXT, PARAM2 TEXT,PARAM3 TEXT,PARAM4 TEXT,PARAM5 TEXT)

so in case of
EVENT1 : Param5 should be inserted as -1 in EVENT_TAB and rest values same as that of data published.
ex: insert into EVENT_TAB(Param1,param2,param3,param4,-1)

EVENT2: Details which are not present should be filled with -1.
insert into EVENT_TAB(Param1,-1,-1,-1,Param5) .

Looking forward to your valuable suggestions.

regular expression of multiple field not working, i given 2 replace but only first one is working.
–transform NAME:“s/zacharia/mathew/”, MARK:“s/20/12” , i given like this in KafkaImporter and send kafka data as “zacharia,20” , but in table only NAME column value replaced, MARK column value not changed.

select * from EMP;

NAME|MARK
mathew|20

Hi @zachariam ,

The teansformation for each field would end with a slash character to work.
Thats the reason why mark hasn’t been changed.

I tried to replace a non existent value ith something elae rhia afternoon bu it hasn’t worked.
I had to use a space to replace an arbitray value with the expression s/ /-1/
Using s//-1 hasn’t worked. I guess that I should go more in deep the syntax of regex_replace syntax

i tried --transform NAME:“s/zacharia/mathew/”,MARK:“s/20/12/” , but not working,

select * from EMP;

NAME|MARK
mathew|20

can u please test and verify it, please confirm for your test case is it working or not, i don’t know why it is not working for me.

Hi,

Ah, I’m sorry but it’s needed also a space between the two transformations so
NAME:“s/zacharia/mathew/”, MARK:“s/20/12/” it’s also working using just 1 space without the comma
NAME:“s/zacharia/mathew/” MARK:“s/20/12/”

I have to double-check the source of the commands and extend our docs for StramInserver and KafkaImporter, because they obliviously lack some pieces of information.

Thanks for pointing me out to this.

Regards,
Candido

1 Like