[Opendnssec-user] Help with random signer crash
Matthijs Mekking
matthijs at NLnetLabs.nl
Mon Feb 14 12:18:00 UTC 2011
Hi Sebastian,
This sounds like a race condition to me. Although I have not been able
to reproduce it, it looks to me as if the command handler is handling
sign commands before the workers have been created.
In the OpenDNSSEC-1.2 branch, I committed a fix so that the tasklist,
zonelist and workers are created before the command handler is started
(see below). My guess is that this prevents the race condition from
happening (as well as the one you posted later).
Best regards,
Matthijs
Modified: branches/OpenDNSSEC-1.2/signer/src/daemon/engine.c
===================================================================
--- branches/OpenDNSSEC-1.2/signer/src/daemon/engine.c 2011-02-11 13:11:34 UTC (rev 4436)
+++ branches/OpenDNSSEC-1.2/signer/src/daemon/engine.c 2011-02-14 10:34:57 UTC (rev 4437)
@@ -552,6 +552,11 @@
se_log_assert(engine->config);
se_log_debug("perform setup");
+ /* set up the work floor */
+ engine->tasklist = tasklist_create(); /* tasks */
+ engine->zonelist = zonelist_create(); /* zones */
+ engine_create_workers(engine); /* workers */
+
/* create command handler (before chowning socket file) */
engine->cmdhandler =
cmdhandler_create(engine->config->clisock_filename);
if (!engine->cmdhandler) {
@@ -662,11 +667,6 @@
return 1;
}
- /* set up the work floor */
- engine->tasklist = tasklist_create(); /* tasks */
- engine->zonelist = zonelist_create(); /* zones */
- engine_create_workers(engine); /* workers */
-
return 0;
}
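
To make the ordering issue concrete, here is a minimal single-threaded
sketch of the window that the change above closes. It is not the signer's
code: engine_t, tasklist_t, handle_sign(), create_work_floor(),
start_cmdhandler() and engine_cleanup() are hypothetical stand-ins, and
the arrival of a "sign" command is modelled as a plain function call so
that the bad interleaving can be triggered on purpose.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real OpenDNSSEC types. */
typedef struct {
    int pending;           /* number of queued sign tasks */
} tasklist_t;

typedef struct {
    tasklist_t *tasklist;  /* part of the "work floor" */
    int         listening; /* non-zero once the command handler accepts commands */
} engine_t;

/* What happens when a "sign" command arrives.
 * Returns  0 when the task is queued,
 *         -1 when no command handler is listening yet,
 *         -2 when the handler is live but the tasklist is missing
 *            (the situation in which the real signer hits its assertion). */
static int handle_sign(engine_t *engine)
{
    assert(engine != NULL);
    if (!engine->listening) {
        return -1;
    }
    if (!engine->tasklist) {
        return -2;
    }
    engine->tasklist->pending++;
    return 0;
}

/* Allocate the shared state that command handling depends on. */
static int create_work_floor(engine_t *engine)
{
    engine->tasklist = calloc(1, sizeof(*engine->tasklist));
    return engine->tasklist ? 0 : 1;
}

/* From this point on, commands can reach handle_sign(). */
static void start_cmdhandler(engine_t *engine)
{
    engine->listening = 1;
}

static void engine_cleanup(engine_t *engine)
{
    free(engine->tasklist);
    engine->tasklist = NULL;
    engine->listening = 0;
}
```

With the old ordering (start_cmdhandler() first, create_work_floor()
later), a command arriving between the two calls gets -2. With the new
ordering, a command that arrives early is simply not accepted yet (-1),
and every command that is accepted finds a tasklist.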
On 02/11/2011 05:38 AM, Sebastian Castro wrote:
> My apologies in advance for this message, which is more venting than bug
> report.
>
> In our testing environment we are hitting unexpected crashes from the
> signer, and it refuses to shed any light on where they come from.
>
> The setup is a server running CentOS 5.5, equipped with a SCA6000,
> running openCryptoki and OpenDNSSEC 1.2.0
>
> During start, the ods-signerd spits out the message below and reaches a
> state where all the threads can't progress because there is no
> "listener" behind the control sock (engine.sock)
>
> sign2 openCryptokiModule[28609]: daemon/cmdhandler.c:209:
> cmdhandler_handle_cmd_sign: assertion cmdc->engine->tasklist failed
>
> Note: for some reason we haven't investigated, the messages are logged
> under openCryptokiModule and not ods-signerd.
>
> We've tried repeatedly to run ods-signerd using valgrind and gdb, and in
> both cases the signer DOESN'T CRASH!
>
> So, in my search for wisdom I must ask: has anyone seen this error (I
> doubt it)? If not, would anyone recommend a strategy to collect more
> useful information towards a diagnosis?
>
> Cheers,