[Opendnssec-user] Bad signerd crash.

Matthijs Mekking matthijs at nlnetlabs.nl
Thu Sep 19 05:37:30 UTC 2013


Hi,

On 09/18/2013 09:09 AM, Mathieu Arnold wrote:
> Hi,
> 
> (OpenDNSSEC 1.4.2 on FreeBSD.)
> 
> Yesterday evening, it seems something went wrong somehow. The signer
> complained that some keys were not there, (see attached log,) but they are,
> they're the current ZSKs :
> 
> # ods-ksmutil key list -z 1-wire.fr -v
> Zone:      Keytype:  State:   Date of next transition (to):  Size:  Algorithm:  CKA_ID:                           Repository:  Keytag:
> 1-wire.fr  KSK       active   2014-07-20 09:06:02 (retire)   2048   8           b511db6d9bec32ee33c90f0ba1b9f6b2  SoftHSM-KSK  50864
> 1-wire.fr  ZSK       active   2013-09-18 09:53:29 (retire)   1024   8           651d67ff14adc7e56071e83f07824ce4  SoftHSM-ZSK  51118
> 1-wire.fr  ZSK       publish  2013-09-18 09:31:50 (ready)    1024   8           e460a1aa5d1b4ebbde1abc4d4db48b3c  SoftHSM-ZSK  59416

Looking at the code (shared/hsm.c), it looks like hsm_find_key_by_id()
returns NULL, but libhsm does not provide an error. After a couple of
tries, the signer reports "key not found".

It looks like the connection to the HSM went bad:

Sep 17 20:27:01 ns1 ods-signerd: [hsm] idle libhsm connection, trying to
reopen

But then it also looks like the drudger threads didn't start up:

Sep 17 20:27:02 ns1 ods-signerd: daemon/engine.c at 444 could not
pthread_join(engine->drudgers[i]->thread_id, NULL): No such process

Do you have logs from between start and Sep 17 20:24:09?

> Then the signer crashed (btw, can't find a core file, should be in the tmp
> directory, right ?, how do I get one ?) leaving its control socket around,
> and, preventing itself from launching again...

Where the core file ends up depends on the system. In my experience when
testing, it is sometimes stored in the directory the binary was called from.
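
One common reason for a missing core file is a core size limit of zero in
the crashing process. As a rough sketch only (enable_core_dumps() is a
hypothetical helper, not something in the OpenDNSSEC source), a process can
raise its own soft limit to the hard limit with the standard
getrlimit()/setrlimit() calls. Whether a core is then actually written, and
where, still depends on the system configuration.

```c
#include <sys/resource.h>

/* Hypothetical helper, not OpenDNSSEC code: raise the soft core-size
 * limit to the hard limit so the kernel is allowed to write a core file.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        return -1;
    }
    rl.rlim_cur = rl.rlim_max;  /* soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is zero, this changes nothing and the limit has to
be raised in the environment that starts the daemon.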

We could keep the signer from blocking its own restart by setting the
SO_REUSEADDR option on the socket.
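
For illustration, setting that option looks roughly like the sketch below.
It assumes the control socket is a Unix-domain stream socket, and
open_reusable_socket() is a hypothetical name, not a function from the
signer. Note that on some systems SO_REUSEADDR does not change how bind()
treats an existing socket file, so this only shows how the option is set
and is not a verified fix.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not OpenDNSSEC code: create a stream socket with
 * SO_REUSEADDR enabled. The caller would bind() and listen() afterwards.
 * Returns the descriptor, or -1 on failure. */
int open_reusable_socket(void)
{
    int on = 1;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }
    /* The option must be set before bind() to have any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```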

Best regards,
  Matthijs

