[Opendnssec-user] ods-enforcerd in error loop required manual ods-ksmutil hacking to get unstuck :(

Paul Wouters paul at nohats.ca
Mon Sep 22 23:05:46 UTC 2014


On Mon, 22 Sep 2014, Paul Wouters wrote:

> Looking around I saw:
>
> [root@ns0 log]# ods-ksmutil key list --verbose | grep "NOT IN"
> SQLite database set to: /var/opendnssec/kasp.db
> libreswan.org                   ZSK           active    2014-09-25 05:44:55 (retire)   2048    8           e9fa25e8214d2920c9489069bdfe61e6 SoftHSM NOT IN repository
> libreswan.net                   ZSK           active    2014-09-25 05:44:55 (retire)   2048    8           635cc83526dfafc2c76605ae7db96ea3 SoftHSM NOT IN repository
> libreswan.com                   ZSK           active    2014-09-25 05:44:55 (retire)   2048    8           cd09e7a69e890ee7ad41a30e9a323b73 SoftHSM NOT IN repository
> httpca.org                      ZSK           active    2014-09-23 03:44:15 (retire)   2048    8           f11737ffedb1723031545fe488c89bad SoftHSM NOT IN repository
>
> It is possible that testing with softhsm-2 and then reverting to
> softhsm-1 caused these to happen, if these keys were generated during
> the 2 days of running softhsm-v2.
>
> Running ods-ksmutil key delete --cka_id <id> --force seems to have
> broken the loop and forced the enforcer to generate new keys
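
For anyone hitting the same state, here is a minimal sketch of pulling the affected CKA_IDs out of a listing like the one quoted above, so they can be reviewed before anything is deleted. The row layout is assumed from that quoted output only. The names KEY_ROW, missing_keys and SAMPLE are hypothetical, and the script only parses text; it does not call ods-ksmutil.

```python
import re

# Row layout assumed from the listing quoted in this thread:
# zone, key type, state, date, (next state), bits, algorithm, CKA_ID, repository.
KEY_ROW = re.compile(
    r"(?P<zone>\S+) (?P<type>KSK|ZSK) (?P<state>\w+) "
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \(\w+\) "
    r"(?P<bits>\d+) (?P<alg>\d+) (?P<cka_id>[0-9a-f]{32}) "
    r"(?P<repo>\S+) NOT IN repository"
)


def missing_keys(listing):
    """Return (zone, key type, CKA_ID) for every key flagged NOT IN repository."""
    # Drop mail quote markers, then collapse line wraps and column padding
    # into single spaces so a row split over several lines still matches.
    unquoted = [re.sub(r"^\s*>\s?", "", line) for line in listing.splitlines()]
    flat = " ".join(" ".join(unquoted).split())
    return [(m["zone"], m["type"], m["cka_id"]) for m in KEY_ROW.finditer(flat)]


# Two rows copied from the quoted output, including the mail line wraps.
SAMPLE = """\
> libreswan.org                   ZSK           active    2014-09-25 05:44:55
> (retire)   2048    8           e9fa25e8214d2920c9489069bdfe61e6 SoftHSM NOT
> IN repository
> httpca.org                      ZSK           active    2014-09-23 03:44:15
> (retire)   2048    8           f11737ffedb1723031545fe488c89bad SoftHSM NOT
> IN repository
"""

for zone, key_type, cka_id in missing_keys(SAMPLE):
    print(zone, key_type, cka_id)
```

Each printed CKA_ID is a candidate for the key delete command mentioned above. Given what followed, it is probably worth checking that a usable ZSK remains for the zone before forcing the delete.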

And for unknown reasons it now creates only a single RRSIG record, the
one over the DNSKEY RRset (by the KSK), and none of the RRSIG records by
the ZSK, which makes these 4 zones bogus :(

Deleting all files in /var/opendnssec/tmp/ and /var/opendnssec/signed/
and even /var/opendnssec/signconf/ and running ods-ksmutil update all
did not resolve this issue:

Sep 22 14:51:51 ns0 ods-signerd: [STATS] libreswan.org 2014092200 RR[count=60 time=0(sec)] NSEC3[count=38 time=0(sec)] RRSIG[new=1 reused=0 time=4(sec) avg=0(sig/sec)] TOTAL[time=4(sec)] 
Sep 22 15:11:31 ns0 ods-signerd: [STATS] libreswan.org 2014092201 RR[count=1 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=0 reused=1 time=2(sec) avg=0(sig/sec)] TOTAL[time=2(sec)] 
Sep 22 15:14:34 ns0 ods-signerd: [worker[2]] CRITICAL: failed to sign zone libreswan.org: General error
Sep 22 15:14:34 ns0 ods-signerd: [worker[2]] backoff task [configure] for zone libreswan.org with 60 seconds
Sep 22 15:14:58 ns0 ods-enforcerd: Zone libreswan.org found.
Sep 22 15:14:58 ns0 ods-enforcerd: Policy for libreswan.org set to default.
Sep 22 15:14:58 ns0 ods-enforcerd: Config will be output to /var/opendnssec/signconf/libreswan.org.xml.
Sep 22 15:14:58 ns0 ods-enforcerd: WARNING: ZSK rollover for zone 'libreswan.org' not completed as there are no keys in the 'ready' state; ods-enforcerd will try again when it runs next
Sep 22 15:14:58 ns0 ods-enforcerd: Called signer engine: /usr/sbin/ods-signer update libreswan.org
Sep 22 15:15:54 ns0 ods-signerd: [signconf] zone libreswan.org signconf: RESIGN[PT7200S] REFRESH[PT604800S] VALIDITY[PT1209600S] DENIAL[PT1209600S] JITTER[PT43200S] OFFSET[PT3600S] NSEC[50] DNSKEYTTL[PT3600S] SOATTL[PT3600S] MINIMUM[PT3600S] SERIAL[datecounter]
Sep 22 15:16:03 ns0 ods-signerd: [STATS] libreswan.org 2014092200 RR[count=60 time=0(sec)] NSEC3[count=38 time=0(sec)] RRSIG[new=1 reused=0 time=3(sec) avg=0(sig/sec)] TOTAL[time=3(sec)]
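
The symptom is visible in the [STATS] lines above: 60 RRs but only one RRSIG. A minimal sketch of a log check that flags this, assuming the [STATS] line format shown in this log excerpt. The rule "more than one RR but at most one signature" is a hypothetical heuristic rather than anything opendnssec defines, and the names STATS and rrsig_shortfall are made up for illustration.

```python
import re

# Fields taken from the ods-signerd [STATS] lines quoted in this thread.
STATS = re.compile(
    r"\[STATS\] (?P<zone>\S+) (?P<serial>\d+) "
    r"RR\[count=(?P<rr>\d+) .*?"
    r"RRSIG\[new=(?P<new>\d+) reused=(?P<reused>\d+) "
)


def rrsig_shortfall(line):
    """Return (zone, serial, rr_count, rrsig_count) if the line looks undersigned."""
    m = STATS.search(line)
    if m is None:
        return None  # not a [STATS] line
    rr_count = int(m["rr"])
    rrsig_count = int(m["new"]) + int(m["reused"])
    # Hypothetical heuristic: many records but at most one signature
    # suggests only the DNSKEY RRset was signed.
    if rr_count > 1 and rrsig_count <= 1:
        return (m["zone"], m["serial"], rr_count, rrsig_count)
    return None


LINE = (
    "Sep 22 14:51:51 ns0 ods-signerd: [STATS] libreswan.org 2014092200 "
    "RR[count=60 time=0(sec)] NSEC3[count=38 time=0(sec)] "
    "RRSIG[new=1 reused=0 time=4(sec) avg=0(sig/sec)] TOTAL[time=4(sec)]"
)
print(rrsig_shortfall(LINE))
```

Run against syslog, a check like this would have flagged the zone at 14:51:51, before validators started treating it as bogus.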

[two hours later]

I ended up using dnssec-keygen to generate a new key, importing it
manually into softhsm, and then importing it as a ZSK using ods-ksmutil.

Not an easy recovery. Opendnssec should have signed with the only
existing ZSK instead of skipping all ZSK RRSIGs and producing a signed
zone with only a single RRSIG, the one over the DNSKEY RRset from the
KSK :(

Paul


