[Opendnssec-user] segfault after system upgrade.

Fred.Zwarts F.Zwarts at KVI.nl
Mon Jan 9 15:18:09 UTC 2017


On our test system we have been running OpenDNSSEC (ods) 2.0.3 with SoftHSM 
2.2.0 for a few weeks without problems.
Last week we upgraded the system from
SUSE Linux Enterprise Server 12 (x86_64) SP1
to SP2.
After this upgrade the enforcer exits with a segfault a short time after 
startup.
In the system log we see:

2017-01-09T15:19:37.958829+01:00 kvivs20 ods-enforcerd: [engine] running as 
pid 17890
2017-01-09T15:19:37.959069+01:00 kvivs20 ods-enforcerd: [engine] enforcer 
started
2017-01-09T15:19:37.970328+01:00 kvivs20 ods-enforcerd: [enforcer] update 
zone: 15.125.129.in-addr.arpa
2017-01-09T15:19:37.978189+01:00 kvivs20 ods-enforcerd: [enforcer] update 
zone: 27.125.129.in-addr.arpa
2017-01-09T15:19:37.983407+01:00 kvivs20 ods-enforcerd: [enforcer] update 
zone: 37.125.129.in-addr.arpa
2017-01-09T15:19:37.988586+01:00 kvivs20 ods-enforcerd: [enforcer] update 
zone: 40.125.129.in-addr.arpa
2017-01-09T15:19:38.173046+01:00 kvivs20 kernel: [432557.821200] 
ods-enforcerd[17892]: segfault at 7efc1b23aff8 ip 00007efc1cd1d6bc sp 
00007efc1b23b000 error 6 in libc-2.22.so[7efc1cca4000+19a000]
2017-01-09T15:19:47.908556+01:00 kvivs20 systemd-coredump[17896]: Process 
17890 (ods-enforcerd) of user 0 dumped core.

It looks as if there is a problem with zone 40.125.129.in-addr.arpa or 
56.125.129.in-addr.arpa, because the crash occurs at the same point every 
time: 40 is the last zone logged before the segfault, and 56 is the next 
zone, which is never logged.
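
The kernel segfault line in the log above can be decoded a little further.
What follows is only a minimal sketch in Python, assuming the usual x86
page-fault error-code bits (bit 0 set = protection violation, otherwise page
not present; bit 1 = write; bit 2 = user mode). The helper name
decode_segfault is made up for illustration and is not part of any tool.

```python
import re

# Kernel log line copied from the system log above, joined onto one line.
LINE = ("ods-enforcerd[17892]: segfault at 7efc1b23aff8 ip 00007efc1cd1d6bc "
        "sp 00007efc1b23b000 error 6 in libc-2.22.so[7efc1cca4000+19a000]")


def decode_segfault(line):
    """Hypothetical helper: extract and interpret the fields of the line."""
    m = re.search(
        r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
        r"error (\d+) in \S+\[([0-9a-f]+)\+[0-9a-f]+\]",
        line,
    )
    if m is None:
        raise ValueError("not a kernel segfault line")
    addr, ip, sp, base = (int(m.group(i), 16) for i in (1, 2, 3, 5))
    err = int(m.group(4))
    return {
        # Distance between the stack pointer and the faulting address.
        "sp_minus_addr": sp - addr,
        # Offset of the faulting instruction inside the mapped library.
        "lib_offset": ip - base,
        # Assumed x86 error-code bits (see the text above).
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LINE)
for key, value in info.items():
    print(key, value)
```

For this line the faulting address lies 8 bytes below the stack pointer, and
the access is a user-mode write to a page that is not mapped. That pattern
might point to a thread running out of stack rather than to bad zone data,
but it is only a hint, not a diagnosis.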

I have tried to get a backtrace with valgrind, but that fails with an 
internal error in valgrind:
# valgrind ods-enforcerd -d

==16788== Memcheck, a memory error detector
==16788== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==16788== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==16788== Command: /usr/local/sbin/ods-enforcerd -d
==16788==

vex: the `impossible' happened:
   isZeroU
vex storage: T total 535943568 bytes allocated
vex storage: P total 640 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().

host stacktrace:
==16788==    at 0x3803D1C8: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3803D2F4: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3803D531: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3803D55A: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x38057F02: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x380FF028: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3810BF2D: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3810F9E1: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x38110A5E: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x38112345: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x381133F3: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x380FC885: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3805A3D3: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3808AD1A: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3808C9DF: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)
==16788==    by 0x3809BA7A: ??? (in 
/usr/lib64/valgrind/memcheck-amd64-linux)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 16788)
==16788==    at 0x6101260: ??? (in /lib64/libcrypto.so.1.0.0)
==16788==    by 0x60E4011: EC_POINT_mul (in /lib64/libcrypto.so.1.0.0)
==16788==    by 0x60EBC97: EC_KEY_check_key (in /lib64/libcrypto.so.1.0.0)
==16788==    by 0x60EC06D: EC_KEY_set_public_key_affine_coordinates (in
/lib64/libcrypto.so.1.0.0)
==16788==    by 0x61A0542: FIPS_selftest_ecdsa (in 
/lib64/libcrypto.so.1.0.0)
==16788==    by 0x619BEE9: FIPS_selftest (in /lib64/libcrypto.so.1.0.0)
==16788==    by 0x619ABF4: FIPS_module_mode_set (in 
/lib64/libcrypto.so.1.0.0)
==16788==    by 0x607616B: FIPS_mode_set (in /lib64/libcrypto.so.1.0.0)
==16788==    by 0x6072B5F: OPENSSL_init_library (in 
/lib64/libcrypto.so.1.0.0)
==16788==    by 0x400EC09: call_init.part.0 (in /lib64/ld-2.22.so)
==16788==    by 0x400ECF2: _dl_init (in /lib64/ld-2.22.so)
==16788==    by 0x4001189: ??? (in /lib64/ld-2.22.so)
==16788==    by 0x1: ???
==16788==    by 0xFFF00078E: ???
==16788==    by 0xFFF0007AC: ???


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.

Then I found the other thread on this mailing list about segfaults. I am not 
familiar with the debugger, but I tried the following:
I rebuilt ods with make clean; make CFLAGS="-g -O0"
Then I entered the /enforcer/src directory and tried:

# gdb -e ./ods-enforcerd
GNU gdb (GDB; SUSE Linux Enterprise 12) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) run -d
Starting program: /downloads/opendnssec-2.0.3/enforcer/src/ods-enforcerd -d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
OpenDNSSEC key and signing policy enforcer version 2.0.3
[New Thread 0x7ffff5584700 (LWP 18228)]
[New Thread 0x7ffff4d83700 (LWP 18229)]
[New Thread 0x7fffeffff700 (LWP 18230)]
[New Thread 0x7fffef7fe700 (LWP 18231)]
[New Thread 0x7fffeeffd700 (LWP 18232)]

Thread 3 "ods-enforcerd" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff4d83700 (LWP 18229)]
0x00007ffff60666bc in _int_malloc (av=av@entry=0x7ffff0000020, 
bytes=bytes@entry=16) at malloc.c:3320
3320    malloc.c: No such file or directory.
(gdb) bt
#0  0x00007ffff60666bc in _int_malloc (av=av@entry=0x7ffff0000020, 
bytes=bytes@entry=16) at malloc.c:3320
#1  0x00007ffff6069024 in __libc_calloc (n=<optimized out>, 
elem_size=<optimized out>) at malloc.c:3237
#2  0x0000000000430824 in ?? ()
#3  0x0000000000001f00 in ?? ()
#4  0x0000000000000020 in ?? ()
#5  0x00007ffff45840a0 in ?? ()
#6  0x000000000043084d in ?? ()
#7  0x0000000000001f00 in ?? ()
#8  0x00007ffff0027cb0 in ?? ()
#9  0x00007ffff0034100 in ?? ()
#10 0x000000000047b249 in ?? ()
#11 0x00007ffff45840d0 in ?? ()
#12 0x0000000000430bff in ?? ()
#13 0x00007ffff001f0f0 in ?? ()
#14 0x00007ffff00315f0 in ?? ()
#15 0x00007ffff4584100 in ?? ()
#16 0x00007ffff0027cb0 in ?? ()
#17 0x00007ffff4584100 in ?? ()
#18 0x0000000000430a61 in ?? ()
#19 0x00007ffff4d83700 in ?? ()
#20 0x00007ffff001f0f0 in ?? ()
#21 0x00007ffff0034100 in ?? ()
#22 0x00007ffff00315f0 in ?? ()
#23 0x00007ffff4584140 in ?? ()
#24 0x0000000000444db9 in ?? ()
#25 0x00007ffff0028080 in ?? ()
#26 0x00007ffff0031410 in ?? ()
#27 0x00007ffff002a3b0 in ?? ()
#28 0x00007ffff00008c0 in ?? ()
#29 0x00007ffff4584150 in ?? ()
#30 0x0000000000000003 in ?? ()
#31 0x00007ffff4584170 in ?? ()
#32 0x0000000000444b93 in ?? ()
#33 0x00000001f4584170 in ?? ()
#34 0x00007ffff0028080 in ?? ()
#35 0x00007ffff002bd50 in ?? ()
#36 0x00007ffff0031410 in ?? ()
#37 0x00007ffff4584200 in ?? ()
#38 0x000000000042437f in ?? ()
#39 0x00007ffff002b180 in ?? ()
#40 0x00007ffff002b0c0 in ?? ()
#41 0x00000001f45841b0 in ?? ()
#42 0x00007ffff4d82b60 in ?? ()
#43 0x00007ffff002b0c0 in ?? ()
#44 0x00007ffff002d780 in ?? ()
#45 0x0000000000000005 in ?? ()
#46 0x00007ffff0022160 in ?? ()
#47 0x00007ffff4584200 in ?? ()
#48 0x00007ffff4d82b60 in ?? ()
#49 0x00000001f4d83700 in ?? ()
#50 0x00007ffff002b0c0 in ?? ()
#51 0x00007ffff002b0c0 in ?? ()
#52 0xfffffffff002d780 in ?? ()
#53 0x00007ffff002b0c8 in ?? ()
#54 0x0000000000000003 in ?? ()
#55 0x00007ffff4584290 in ?? ()
#56 0x0000000000424902 in ?? ()
#57 0x00007ffff0028080 in ?? ()
#58 0x00007ffff0022f80 in ?? ()
#59 0x00000001f4584240 in ?? ()
#60 0x00007ffff4d82b60 in ?? ()
#61 0x00007ffff0022f80 in ?? ()
#62 0x00007ffff002d780 in ?? ()
#63 0x0000000000000005 in ?? ()
#64 0x00007ffff0022160 in ?? ()
---Type <return> to continue, or q <return> to quit---
#65 0x00007ffff4584290 in ?? ()
#66 0xfffffffff4d82b60 in ?? ()
#67 0x00000001f4d83700 in ?? ()
#68 0x00007ffff0031410 in ?? ()
#69 0x0000000000000000 in ?? ()
(gdb)

I don't know how to get more symbolic information for the frames shown as ??.
Any suggestions?




