KafkaImporter crashes on RHEL 7.9

We are observing intermittent KafkaImporter crashes that report the error below:

*** Error in `/opt/omnisci/bin/KafkaImporter': corrupted size vs. prev_size: 0x000000000124fbb0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7f3e4)[0x7f6a7972e3e4]
/lib64/libc.so.6(+0x82bf4)[0x7f6a79731bf4]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7f6a797346fc]
/opt/omnisci/bin/KafkaImporter(_Znwm+0x15)[0xac0b75]
/opt/omnisci/bin/KafkaImporter(_ZNSt6vectorI12TStringValueSaIS0_EE17_M_realloc_insertIJRKS0_EEEvN9__gnu_cxx17__normal_iteratorIPS0_S2_EEDpOT_+0x88)[0x62c218]
/opt/omnisci/bin/KafkaImporter(_Z11msg_consumePN7RdKafka7MessageER17RowToColumnLoaderN13import_export10CopyParamsERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSF_12regex_traitsIcNSF_16cpp_regex_traitsIcEEEEEESt14default_deleteISL_EESE_ISC_SM_ISC_EEESt4lessISC_ESaISD_IKSC_SR_EEEb+0xa58)[0x623608]
/opt/omnisci/bin/KafkaImporter(_Z12kafka_insertR17RowToColumnLoaderRKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSA_12regex_traitsIcNSA_16cpp_regex_traitsIcEEEEEESt14default_deleteISG_EES9_IS7_SH_IS7_EEESt4lessIS7_ESaIS8_IKS7_SM_EEERKN13import_export10CopyParamsEbS7_S7_S7_+0x79b)[0x6242bb]

Environment details are below:

Operating system: RHEL 7.9
Kernel version:
Linux unvpcrf02.ims.unwls.com 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux

omnisql -v
OmniSQL Version: 5.10.2-20220218-4112053580

Kafka version: kafka_2.13-2.8.0

Please advise on possible solutions for this issue.

Hi @bsharsha ,

Is this happening only on RHEL 7.9, or also with older versions of the OS?

Regards,
Candido

Hi Candido,

No. We have been using KafkaImporter on the older RHEL 7.0 and it is stable there; we have seen this issue only on RHEL 7.9. We in fact tried upgrading glibc to the latest version, but the result remains the same: KafkaImporter fails to start with the error above (as reported by Harsha).

Hi @Raj_Kiran,

So I guess we are using an outdated version of the Kafka library and should probably upgrade.

I’ll check with the engineering team and get back to you.

Regards,
Candido

Thanks @candido.dessanti. We will await your further input. We use this utility extensively, so this has become a blocker for us.

@candido.dessanti - Please advise on the feedback received from the engineering team.

Regards, Harsha

Hi,

I don't have any news yet. We aren't able to reproduce the issue, so I'm going to compile the tool against a more recent librdkafka and give it to you, but I need some time.

Regards,
Candido

Thank you for the feedback @candido.dessanti. Much appreciated :+1:

We will await your further suggestions. Please let us know if you need any debug information from our labs to assist your analysis.

Hi @bsharsha and @Raj_Kiran ,

I compiled the utility with a recent version of the library (1.8.2) and shared it via our Google Drive.

Let me know if the problem is resolved.

Regards,
Candido

Thank you @candido.dessanti. On verification, we still observe the module crashing, with the memory corruption error output below. Kernel version details are shared for reference. Please advise further, and let us know if you need any more details.

Observation

[volte@unvpcrf02 scripts]$ tr*** Error in `/opt/volte/KafkaImporter': malloc(): memory corruption: 0x0000000001d5cb30 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x82aa6)[0x7f38ddef0aa6]
/lib64/libc.so.6(__libc_malloc+0x4c)[0x7f38ddef36fc]
/opt/volte/KafkaImporter(_Znwm+0x15)[0xb82655]
/opt/volte/KafkaImporter(_ZN17RowToColumnLoader18get_row_descriptorEv+0x5e)[0x64e20e]
/opt/volte/KafkaImporter(_Z11msg_consumePN7RdKafka7MessageER17RowToColumnLoaderN13import_export10CopyParamsERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSF_12regex_traitsIcNSF_16cpp_regex_traitsIcEEEEEESt14default_deleteISL_EESE_ISC_SM_ISC_EEESt4lessISC_ESaISD_IKSC_SR_EEEb+0x16c)[0x6351bc]
/opt/volte/KafkaImporter(_Z12kafka_insertR17RowToColumnLoaderRKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4pairISt10unique_ptrIN5boost11basic_regexIcNSA_12regex_traitsIcNSA_16cpp_regex_traitsIcEEEEEESt14default_deleteISG_EES9_IS7_SH_IS7_EEESt4lessIS7_ESaIS8_IKS7_SM_EEERKN13import_export10CopyParamsEbS7_S7_S7_+0x79b)[0x63675b]
/opt/volte/KafkaImporter(main+0x196d)[0x63195d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f38dde90555]
/opt/volte/KafkaImporter[0x634d2e]
======= Memory map: ========
00400000-00e11000 r-xp 00000000 fd:00 301657947 /opt/volte/KafkaImporter
01010000-01067000 r-xp 00a10000 fd:00 301657947 /opt/volte/KafkaImporter
01067000-0107a000 rwxp 00a67000 fd:00 301657947 /opt/volte/KafkaImporter
0107a000-01086000 rwxp 00000000 00:00 0
01d37000-01dc1000 rwxp 00000000 00:00 0 [heap]
7f38c4000000-7f38c4021000 rwxp 00000000 00:00 0
7f38c4021000-7f38c8000000 ---p 00000000 00:00 0
7f38c8000000-7f38c8021000 rwxp 00000000 00:00 0
7f38c8021000-7f38cc000000 ---p 00000000 00:00 0
7f38cc000000-7f38cc021000 rwxp 00000000 00:00 0
7f38cc021000-7f38d0000000 ---p 00000000 00:00 0
7f38d0000000-7f38d0021000 rwxp 00000000 00:00 0
7f38d0021000-7f38d4000000 ---p 00000000 00:00 0
7f38d4000000-7f38d4021000 rwxp 00000000 00:00 0
7f38d4021000-7f38d8000000 ---p 00000000 00:00 0
7f38db240000-7f38db255000 r-xp 00000000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f38db255000-7f38db454000 ---p 00015000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f38db454000-7f38db455000 r-xp 00014000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f38db455000-7f38db456000 rwxp 00015000 fd:00 86 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f38db456000-7f38db462000 r-xp 00000000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
7f38db462000-7f38db661000 ---p 0000c000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
7f38db661000-7f38db662000 r-xp 0000b000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
7f38db662000-7f38db663000 rwxp 0000c000 fd:00 202400 /usr/lib64/libnss_files-2.17.so
7f38db663000-7f38db669000 rwxp 00000000 00:00 0
7f38db669000-7f38db66a000 ---p 00000000 00:00 0
7f38db66a000-7f38dbe6a000 rwxp 00000000 00:00 0
7f38dbe6a000-7f38dbe6b000 ---p 00000000 00:00 0
7f38dbe6b000-7f38dc66b000 rwxp 00000000 00:00 0
7f38dc66b000-7f38dc66c000 ---p 00000000 00:00 0
7f38dc66c000-7f38dce6c000 rwxp 00000000 00:00 0
7f38dce6c000-7f38dce6d000 ---p 00000000 00:00 0
7f38dce6d000-7f38dd66d000 rwxp 00000000 00:00 0
7f38dd66d000-7f38dd66e000 ---p 00000000 00:00 0
7f38dd66e000-7f38dde6e000 rwxp 00000000 00:00 0
7f38dde6e000-7f38de032000 r-xp 00000000 fd:00 202382 /usr/lib64/libc-2.17.so
7f38de032000-7f38de231000 ---p 001c4000 fd:00 202382 /usr/lib64/libc-2.17.so
7f38de231000-7f38de235000 r-xp 001c3000 fd:00 202382 /usr/lib64/libc-2.17.so
7f38de235000-7f38de237000 rwxp 001c7000 fd:00 202382 /usr/lib64/libc-2.17.so
7f38de237000-7f38de23c000 rwxp 00000000 00:00 0
7f38de23c000-7f38de253000 r-xp 00000000 fd:00 202408 /usr/lib64/libpthread-2.17.so
7f38de253000-7f38de452000 ---p 00017000 fd:00 202408 /usr/lib64/libpthread-2.17.so
7f38de452000-7f38de453000 r-xp 00016000 fd:00 202408 /usr/lib64/libpthread-2.17.so
7f38de453000-7f38de454000 rwxp 00017000 fd:00 202408 /usr/lib64/libpthread-2.17.so
7f38de454000-7f38de458000 rwxp 00000000 00:00 0
7f38de458000-7f38de559000 r-xp 00000000 fd:00 202390 /usr/lib64/libm-2.17.so
7f38de559000-7f38de758000 ---p 00101000 fd:00 202390 /usr/lib64/libm-2.17.so
7f38de758000-7f38de759000 r-xp 00100000 fd:00 202390 /usr/lib64/libm-2.17.so
7f38de759000-7f38de75a000 rwxp 00101000 fd:00 202390 /usr/lib64/libm-2.17.so
7f38de75a000-7f38de75c000 r-xp 00000000 fd:00 202388 /usr/lib64/libdl-2.17.so
7f38de75c000-7f38de95c000 ---p 00002000 fd:00 202388 /usr/lib64/libdl-2.17.so
7f38de95c000-7f38de95d000 r-xp 00002000 fd:00 202388 /usr/lib64/libdl-2.17.so
7f38de95d000-7f38de95e000 rwxp 00003000 fd:00 202388 /usr/lib64/libdl-2.17.so
7f38de95e000-7f38de980000 r-xp 00000000 fd:00 202375 /usr/lib64/ld-2.17.so
7f38deb65000-7f38deb6d000 rwxp 00000000 00:00 0
7f38deb7c000-7f38deb7f000 rwxp 00000000 00:00 0
7f38deb7f000-7f38deb80000 r-xp 00021000 fd:00 202375 /usr/lib64/ld-2.17.so
7f38deb80000-7f38deb81000 rwxp 00022000 fd:00 202375 /usr/lib64/ld-2.17.so
7f38deb81000-7f38deb82000 rwxp 00000000 00:00 0
7ffc5c0bb000-7ffc5c0dd000 rwxp 00000000 00:00 0 [stack]
7ffc5c178000-7ffc5c17a000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Kernel details
uname -a
Linux unvpcrf02.ims.unwls.com 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
Server version
Red Hat Enterprise Linux Server release 7.9 (Maipo)

KafkaImporter version
md5sum /opt/volte/KafkaImporter
66858f5fa0d0a47287bf835d515de27f /opt/volte/KafkaImporter
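
The mangled C++ names in the backtrace above can be hard to read. As a minimal sketch, the helper below (`frame_symbol` is a name made up for this example) pulls the symbol out of a glibc backtrace frame of the form `binary(symbol+0xoffset)[address]` and, if `c++filt` from GNU binutils happens to be installed, demangles it. The sample frame is copied from the backtrace in this post.

```shell
# Hypothetical helper: extract the (possibly mangled) symbol name from a
# glibc backtrace frame such as  /path/binary(SYMBOL+0x5e)[0x64e20e].
# Frames without a symbol, e.g. libc.so.6(+0x82aa6), yield an empty line.
frame_symbol() {
  printf '%s\n' "$1" | sed -n 's/^[^(]*(\([^+)]*\)+0x[0-9a-f]*).*$/\1/p'
}

frame='/opt/volte/KafkaImporter(_ZN17RowToColumnLoader18get_row_descriptorEv+0x5e)[0x64e20e]'
sym=$(frame_symbol "$frame")
echo "$sym"

# c++filt is only called when it is available on this machine.
if command -v c++filt >/dev/null 2>&1; then
  c++filt "$sym"
fi
```

For this frame the demangled name is `RowToColumnLoader::get_row_descriptor()`, which makes it easier to see that both crashes happen inside `malloc` while `msg_consume` is building a row.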

Hi,

Could you share with us the DDL of the table you are loading, the command line you are using, and the logs of KafkaImporter itself? (You can find them in a directory called log, typically created under the path from which you run the executable.)

The log directory looks like this:

candido@zion-legion:/mapd_storage/github/alter_session/heavydb-internal/build/bin/log$ ls -ltr
total 100
-rw-rw-r-- 1 candido candido 12520 lug  1 16:21 KafkaImporter.INFO.20220701-162113.log
-rw-rw-r-- 1 candido candido 12435 lug  4 10:33 KafkaImporter.INFO.20220704-103342.log
-rw-rw-r-- 1 candido candido 12435 lug  4 10:34 KafkaImporter.INFO.20220704-103359.log
-rw-rw-r-- 1 candido candido 12435 lug  4 10:36 KafkaImporter.INFO.20220704-103519.log
-rw-rw-r-- 1 candido candido 12435 lug  4 10:37 KafkaImporter.INFO.20220704-103625.log
lrwxrwxrwx 1 candido candido    38 lug  4 10:39 KafkaImporter.INFO -> KafkaImporter.INFO.20220704-103907.log
-rw-rw-r-- 1 candido candido 12435 lug  4 10:39 KafkaImporter.INFO.20220704-103907.log

You can also try to bump up the log level to DEBUG1 or DEBUG2 with the --log-severity switch of the utility.
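
As a small sketch for picking out the most recent log to share, assuming the `KafkaImporter.INFO.<date>-<time>.log` naming shown in the listing above (the helper name `latest_log` and the `./log` path are made up for this example):

```shell
# Hypothetical helper: print the newest KafkaImporter log in a directory.
# Assumes the KafkaImporter.INFO.YYYYMMDD-HHMMSS.log naming from the listing,
# so the sorted order of the names is also their chronological order.
latest_log() {
  dir=${1:-log}
  found=
  for f in "$dir"/KafkaImporter.INFO.*.log; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    found=$f                  # globs expand in sorted order; keep the last one
  done
  printf '%s' "$found"
}

# Example: show the tail of the newest log, if one exists.
newest=$(latest_log ./log)
if [ -n "$newest" ]; then
  tail -n 50 "$newest"
fi
```

The `KafkaImporter.INFO` symlink in the listing already points at the current log, so this is mainly useful when logs from several runs have to be compared.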

Regards,
Candido

Hello @candido.dessanti, please find the requested DDL and log files. The KafkaImporter md5sum and the crash time are also mentioned in the file. Let me know if you need more data.

Hello @candido.dessanti,
Here are also the command lines we use to start the module:

    nohup /opt/omnisci/bin/KafkaImporter WDBS_GX_REQUEST voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222  --topic GxReqTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.GXREQUEST.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 --print_error > /dev/null &
	
    nohup /opt/omnisci/bin/KafkaImporter WDBS_GX_RESPONSE voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic GxResTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.GXRESPONSE.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
	
    nohup /opt/omnisci/bin/KafkaImporter WDBS_RX_REQUEST voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic RxReqTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.RXREQUEST.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
	
    nohup /opt/omnisci/bin/KafkaImporter WDBS_RX_RESPONSE voltetracker -u volte -p volte123 --port 6274 --batch 1 --delim ',' --host 10.0.1.222 --brokers 10.0.1.222 --topic RxResTopic --group-id TRACKER --log-directory '/opt/volte/var/log/' --log-file-name %d-%m-%Y.kafkaImp.RXRESPONSE.%H%M.log --log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000 > /dev/null &
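
The four commands above differ only in the table name, the topic, and the log file name. A dry-run sketch that makes this explicit is shown below; it only prints the command lines and does not start anything. `build_cmd` and the `DB_PASSWORD` variable are names introduced for this example, and every flag is copied from the commands above.

```shell
# Dry-run sketch: print (do not execute) one KafkaImporter command line.
# Arguments: table name, Kafka topic, tag used in the log file name.
# DB_PASSWORD is a placeholder so the password is not hard-coded here.
build_cmd() {
  table=$1 topic=$2 tag=$3
  printf '%s ' \
    "/opt/omnisci/bin/KafkaImporter $table voltetracker" \
    "-u volte -p \"\$DB_PASSWORD\" --port 6274 --batch 1 --delim ','" \
    "--host 10.0.1.222 --brokers 10.0.1.222 --topic $topic --group-id TRACKER" \
    "--log-directory '/opt/volte/var/log/'" \
    "--log-file-name %d-%m-%Y.kafkaImp.$tag.%H%M.log" \
    "--log-severity DEBUG2 --log-rotate-daily 1 --log-rotation-size 524288000"
  printf '\n'
}

# One line per importer: table, topic, log tag.
while read -r table topic tag; do
  build_cmd "$table" "$topic" "$tag"
done <<'EOF'
WDBS_GX_REQUEST GxReqTopic GXREQUEST
WDBS_GX_RESPONSE GxResTopic GXRESPONSE
WDBS_RX_REQUEST RxReqTopic RXREQUEST
WDBS_RX_RESPONSE RxResTopic RXRESPONSE
EOF
```

Note that the first original command also passes `--print_error`, which this sketch leaves out so that all four lines stay uniform.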

Hi @VishnuCleago,

I have been able to make the tool crash with this message:

malloc(): invalid next size (unsorted)
Aborted (core dumped)

I had to alter the messages, changing the size of an integer between one message and another; I'll dig deeper with a debug build tomorrow.

Hi @candido.dessanti

Thank you

Hi guys,

I have some news, and it is, as usual, bad and good at the same time.

The good news is that we're able to reproduce your issue with your DDLs and data on CentOS 7.9.

I haven't tried an older version, so I'm taking it for granted that, with the same version of the database, you aren't experiencing the issue on previous versions of CentOS.
The same executable with the same data and HeavyDB version works on Ubuntu, so it's likely an issue affecting CentOS only.

The bad news is that we aren't able to fix it right now, because it isn't clear which library is making the system crash (the utility crashed in different segments of code at random). So I'm going to open an internal ticket about it, and I hope it'll be fixed.

In the meantime, you can use an older version of KafkaImporter, which appears to work on CentOS with your DDLs and data.

For your convenience, you can download it here (it's the 5.5.3 version).

Let me know if the workaround works.

Regards,
Candido

Hi,

As an update, the engineering team came up with a solution to the problem.

I'm sharing the fixed version. Let us know if this fixes your issue.

Best regards,
Candido

Hi @candido.dessanti, thank you for your great assistance.

Hi,

I think I could have solved this faster, but without the data you gave me we weren't able to reproduce the issue; I should have asked for it earlier.

Anyway, have you tried the new executable? Is it working for you?

Hi @candido.dessanti, we have tested with the executable you shared and have not seen any failures or crashes so far. It is working with our data. We will inform you if any crashes are observed during further testing. Thank you.