[Opendnssec-user] ods-signer segfaults after C_DestroyObject

Matthijs Mekking matthijs at nlnetlabs.nl
Mon Mar 18 12:29:57 UTC 2013


Hi Casper,

The reason why the signer cannot find the keys is that it cannot reopen
the libhsm connection:

> Mar 18 00:10:55 ramanujan ods-signerd: [hsm] hsm_get_slot_id():
> could not find token with the name LocalHSM

Did you perhaps change anything with respect to the HSM in the conf.xml?
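
One quick way to check this is to list the token label each repository in conf.xml points at and compare it with the name in the error (LocalHSM). The following is only a rough sketch: it assumes the usual conf.xml layout with `<Repository name="...">` elements containing a `<TokenLabel>` child, and the repository name "SoftHSM", the module path and the PIN in the sample are hypothetical placeholders, not values taken from your setup.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical sample; only the token label "LocalHSM" comes from the log.
SAMPLE = """<Configuration>
  <RepositoryList>
    <Repository name="SoftHSM">
      <Module>/usr/lib/softhsm/libsofthsm.so</Module>
      <TokenLabel>LocalHSM</TokenLabel>
      <PIN>0000</PIN>
    </Repository>
  </RepositoryList>
</Configuration>"""


def token_labels(conf_xml_text):
    """Map each repository name to its configured token label (or None)."""
    root = ET.fromstring(conf_xml_text)
    labels = {}
    for repo in root.iter("Repository"):
        label = repo.findtext("TokenLabel")
        labels[repo.get("name")] = label.strip() if label else None
    return labels


def has_token(conf_xml_text, wanted):
    """True if any repository is configured with the given token label."""
    return wanted in token_labels(conf_xml_text).values()


if __name__ == "__main__":
    # Pass the path to conf.xml as the first argument; without one the
    # built-in sample is used so the script still runs.
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    for name, label in token_labels(text).items():
        print(f"{name}: {label}")
```

If the label printed here differs from the one the signer complains about, or the file changed while the signer was running, that would explain why the reopen fails.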

On 03/18/2013 01:10 PM, Casper Gielen wrote:
> Hello, I've recently experienced three segfaults from ods-signer. In
> all three cases the log contains errors about keys that can not be
> found. I initially assumed they were erroneously deleted from the HSM
> but 'ods-hsmutil' is able to find them. After restarting the signer
> it seems to work fine.
> 
> I'm using OpenDNSSEC 1.3.9-5 and SoftHSM 1.3.3-2 as provided by
> Debian/wheezy.
> 
> The logs below have been abbreviated and redacted. I'll provide full
> logs upon request. I've increased logging and enabled coredumps, so
> if it happens again I may have more information.

I would be interested in those core dumps.

Best regards,
  Matthijs

> 
> Crash on host Ramanujan
> Mar 18 00:10:53 ramanujan ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 00:10:53 ramanujan ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 00:10:53 ramanujan ods-enforcerd: Key remove successful.
> Mar 18 00:10:53 ramanujan ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 00:10:53 ramanujan ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 00:10:53 ramanujan ods-enforcerd: Key remove successful.
> Mar 18 00:10:54 ramanujan ods-signerd: [hsm] idle libhsm connection, trying to reopen
> Mar 18 00:10:55 ramanujan ods-signerd: [hsm] hsm_get_slot_id(): could not find token with the name LocalHSM
> Mar 18 00:10:55 ramanujan ods-signerd: [hsm] unable to get key: key d1f03a7b14eac19b355e23ce1b47d0d9 not found
> Mar 18 00:10:55 ramanujan ods-signerd: [hsm] unable to sign: get key failed
> Mar 18 00:10:55 ramanujan ods-signerd: [rrset] unable to sign RRset[6]: error creating RRSIG RR
> Mar 18 00:10:55 ramanujan ods-signerd: [worker[4]] sign zone example3.nl failed: 1 of 11 signatures failed
> Mar 18 00:10:55 ramanujan ods-signerd: [worker[4]] backoff task [sign] for zone example3.nl with 60 seconds
> ... same for many other zones ...
> Mar 18 00:14:02 ramanujan ods-signerd: [hsm] unable to get key: key d56c511a78d2a9406b0d135edc80a758 not found
> Mar 18 00:14:02 ramanujan ods-signerd: [hsm] unable to get key: key d56c511a78d2a9406b0d135edc80a758 not found
> Mar 18 00:14:02 ramanujan ods-signerd: [hsm] unable to sign: get key failed
> Mar 18 00:14:02 ramanujan ods-signerd: [rrset] unable to sign RRset[6]: error creating RRSIG RR
> Mar 18 00:14:02 ramanujan ods-signerd: [worker[7]] sign zone example3.org failed: 1 of 10 signatures failed
> Mar 18 00:14:02 ramanujan ods-signerd: [worker[7]] backoff task [sign] for zone example3.org with 120 seconds
> Mar 18 00:14:03 ramanujan ods-signerd: [hsm] unable to get key: key cf06b88e9a3867461af1bd628fda4d51 not found
> Mar 18 00:14:03 ramanujan ods-signerd: [hsm] unable to get key: key cf06b88e9a3867461af1bd628fda4d51 not found
> Mar 18 00:14:03 ramanujan ods-signerd: [hsm] unable to sign: get key failed
> Mar 18 00:14:03 ramanujan ods-signerd: [rrset] unable to sign RRset[6]: error creating RRSIG RR
> Mar 18 00:14:03 ramanujan ods-signerd: [worker[7]] sign zone example4.org failed: 1 of 13 signatures failed
> Mar 18 00:14:03 ramanujan ods-signerd: [rrset] unable to sign RRset[6]: error creating RRSIG RR
> Mar 18 00:14:03 ramanujan ods-signerd: [worker[7]] sign zone example4.org failed: 1 of 13 signatures failed
> Mar 18 00:14:03 ramanujan ods-signerd: [worker[7]] backoff task [sign] for zone example4.org with 120 seconds
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to sign: get key failed
> Mar 18 00:14:20 ramanujan ods-signerd: [rrset] unable to sign RRset[6]: error creating RRSIG RR
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to sign: get key failed
> Mar 18 00:14:20 ramanujan ods-signerd: [rrset] unable to sign RRset[12]: error creating RRSIG RR
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan ods-signerd: [hsm] unable to get key: key 462167fef14dff802a768a2234003d60 not found
> Mar 18 00:14:20 ramanujan kernel: [1059710.358951] ods-signerd[24797]: segfault at 10000000010 ip 00007fbb353b206a sp 00007fbb2ed4d490 error 4 in libc-2.13.so[7fbb3533b000+180000]
> 
> 
> 
> First crash on host Metagross
> Mar 14 05:10:36 metagross ods-signerd: [STATS] example.eu RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=1 reused=9 time=3(sec) avg=0(sig/sec)] AUDIT[time=0(sec)] TOTAL[time=3(sec)]
> Mar 14 05:10:36 metagross ods-enforcerd: Purging keys...
> Mar 14 05:10:36 metagross ods-signerd: [STATS] example2.eu RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=1 reused=8 time=2(sec) avg=0(sig/sec)] AUDIT[time=0(sec)] TOTAL[time=2(sec)]
> Mar 14 05:10:37 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 14 05:10:37 metagross ods-signerd: [hsm] idle libhsm connection, trying to reopen
> Mar 14 05:10:37 metagross ods-signerd: [hsm] idle libhsm connection, trying to reopen
> Mar 14 05:10:37 metagross ods-signerd: ../../../signer/src/daemon/engine.c at 367 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
> Mar 14 05:10:37 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 14 05:10:37 metagross ods-enforcerd: Key remove successful.
> Mar 14 05:10:38 metagross ods-enforcerd: Policy standbyyourkey found.
> Mar 14 05:10:38 metagross ods-enforcerd: Key sharing is Off.
> Mar 14 05:10:38 metagross ods-enforcerd: No zones on policy standbyyourkey, skipping...
> Mar 14 05:10:38 metagross ods-enforcerd: Purging keys...
> Mar 14 05:10:38 metagross ods-signerd: ../../../signer/src/daemon/engine.c at 367 could not pthread_join(engine->drudgers[i]->thread_id, NULL): Invalid argument
> Mar 14 05:10:39 metagross kernel: [52691.560115] ods-signerd[3896]: segfault at 7f30a8a3a9d0 ip 00007f30b83bad8c sp 00007f30b4a41dc0 error 4 in libpthread-2.13.so[7f30b83b3000+17000]
> 
> 
> 
> 
> 
> Second crash on host Metagross
> Mar 18 05:10:28 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:28 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:28 metagross ods-enforcerd: Key remove successful.
> ... many more ...
> Mar 18 05:10:41 metagross ods-signerd: [hsm] idle libhsm connection, trying to reopen
> Mar 18 05:10:41 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:41 metagross ods-enforcerd: Key remove successful.
> Mar 18 05:10:41 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:41 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:41 metagross ods-enforcerd: Key remove successful.
> Mar 18 05:10:41 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: Key remove successful.
> Mar 18 05:10:42 metagross ods-signerd: [hsm] idle libhsm connection, trying to reopen
> Mar 18 05:10:42 metagross kernel: [398294.951400] ods-signerd[28480]: segfault at 7f352f47b9d0 ip 00007f3535df9d8c sp 00007f3531c7fdc0 error 4 in libpthread-2.13.so[7f3535df2000+17000]
> Mar 18 05:10:42 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: Key remove successful.
> Mar 18 05:10:42 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: SoftHSM: C_DestroyObject: An object has been destroyed
> Mar 18 05:10:42 metagross ods-enforcerd: Key remove successful.
> ... enforcer continues destroying ...
> 
> 
> 

