[Opendnssec-user] Same SOA#, different content?!?

Havard Eidnes he at uninett.no
Thu Mar 22 14:13:00 UTC 2018


Hi,

Recently our ods-signerd crashed as a result of my removing a
zone from our setup.  The crash happened 5-10 minutes after the
removal and has been reported as a bug.  (We're on OpenDNSSEC
1.4.13, with zone transfers in and out.)

Following this crash I had to remove our /var/opendnssec/tmp
directory to "start from a clean slate" and restart OpenDNSSEC.
(Actually, I saved the old directory after a false start, where
I saw in the log e.g.

Mar 14 14:58:02 tilfeldigvis ods-signerd: [axfr] zone 60.39.128.in-addr.arpa expired at 1518409250, and it is now 1521035882: not serving soa (serial_xfr_acquired=1517804450, expire=604800)

which is the result of another bug, and which would have caused
even more indigestion.)
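
As a sanity check on that [axfr] log line, the numbers are self-consistent: the expiry time is the transfer time plus the expire interval. A minimal Python sketch using only the values quoted in the log; the variable names and the reading of serial_xfr_acquired as "time of the last successful inbound transfer" are assumptions, not taken from the OpenDNSSEC source:

```python
# Values copied from the [axfr] log line.
serial_xfr_acquired = 1517804450  # assumed: time of last successful inbound transfer
expire = 604800                   # expire interval in seconds (7 days)
now = 1521035882                  # the "it is now" value from the log

expired_at = serial_xfr_acquired + expire
overdue_days = (now - expired_at) / 86400

print(expired_at)              # 1518409250, as in "expired at"
print(round(overdue_days, 1))  # 30.4
```

So by the time of the false start the zone had been past its expiry for about a month.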

A few days ago I got an alarm for remaining signature
validity for a zone dropping below 2 days less than our re-sign
interval, so I bumped the SOA serial in the "source zone" and
that pushed a new copy through, resolving the issue for this zone.

Now the same problem has been flagged for one of our other zones,
and this got me worried -- how many more of our 400-or-so zones
have this latent problem?

The problem appears to be that the new signed zone on our signer
is re-using a SOA serial number which was already in use when
the previous signer instance ran, but that the contents of the
zone file to be pushed to our downstream NS are *different*.

The downstream NS is reporting:

4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.             3600 IN RRSIG     DNSKEY 8 14 3600 20180328030759 20180307052312 15839 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa. rhcR+e9wNH4Qu+gBRnzOjDHiLm/0boUW6pxxH11Mqwq/ETOA0uJn21uJ Vj0NgzmaimkXd5AftaM4MRpaBwIPXiOX1T/RelVs4LSb8YtDLz0C4hfU 8oEILzS6o9W+0vvsw4Ofl965uTlmkfPd9RuKOvoanDpmj42pqVoY0e+F eBJYM/xbpVjVL/QZd8Rihk4LtphfTfxSs7JraeIRDCpraHDaBlc7Fwlq stE5ixzhNpf9glNZFACTxPftkAzXpkt3RG5YRzMon8uWlZlNGsVPJk5W QsPgDGhqLT53eoDImzYtYbAcg7Vzi+8WAjvxI+cjZFhOOg1LzYA2Aq/G YkSPig==

when I look at the slaved-down zone file, while the
4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.axfr file on the signer machine
has

4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.       3600    IN      RRSIG   DNSKEY 8 14 3600 20180405011650 20180314133440 15839 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa. sqsUG0MZcWLNVubG8Akxnrj7ZBcfFaobTf587+VoR1u0ZoI4XZQkm4erTTxvl8R3MxZI3krtUNlylnUiiStp4abtaiPytmYWON6Q2Jo02KcoWRUwa+lkX3JZBi3i2iTPNbPotEDujb/OgVo0zmFgyz4/03HVQU5XiQ9UdcsH5hdvAFTIjQ18TpV+LRiMtayRKrimatWH0dNdWQ+5pScHvPZcS0Vn8VwdUd6ChU3Fw1usO6LN/sdeAGdqBnqb4jM7CwXkVWmYZcu67RX3VllZyTfDdu76F15H+imu3iY7RMlB7mMsqYMh99Sri2d3/F4ljQp/1TQ4ESFeMI6G2NYCZQ==

which clearly has a different validity window(!)

The two SOA records do however have the *same* SOA serial number:

4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.             3600 IN SOA       ns.uninett.no. hostmaster.uninett.no. 2018031400 28800 3600 1814400 900

on the downstream and

4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.       3600    IN      SOA     ns.uninett.no. hostmaster.uninett.no. 2018031400 28800 3600 1814400 900

on the signer.
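
The comparison above could be scripted to check the remaining zones. A minimal Python sketch, assuming the records are available as zone-file text lines from both sides (for example from the .axfr file on the signer and the slaved-down copy downstream). The helper names soa_serial and rrsig_window are hypothetical, and the signature data is left out of the sample lines because only the timestamps are parsed:

```python
from datetime import datetime, timezone

def soa_serial(line):
    """Return the serial from a SOA record line in presentation format."""
    fields = line.split()
    i = fields.index("SOA")
    # After SOA: mname, rname, serial, refresh, retry, expire, minimum
    return int(fields[i + 3])

def rrsig_window(line):
    """Return (inception, expiration) from an RRSIG record line."""
    fields = line.split()
    i = fields.index("RRSIG")

    def parse(ts):
        return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

    # After RRSIG: type covered, algorithm, labels, original TTL,
    # expiration, inception, key tag, signer name, signature
    return parse(fields[i + 6]), parse(fields[i + 5])

zone = "4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa."
soa_rdata = " 3600 IN SOA ns.uninett.no. hostmaster.uninett.no. 2018031400 28800 3600 1814400 900"
downstream_soa = zone + soa_rdata
signer_soa = zone + soa_rdata
downstream_sig = zone + " 3600 IN RRSIG DNSKEY 8 14 3600 20180328030759 20180307052312 15839 " + zone
signer_sig = zone + " 3600 IN RRSIG DNSKEY 8 14 3600 20180405011650 20180314133440 15839 " + zone

# Flag the case described here: identical serial, different signatures.
mismatch = (soa_serial(downstream_soa) == soa_serial(signer_soa)
            and rrsig_window(downstream_sig) != rrsig_window(signer_sig))
print(mismatch)
```

A zone flagged this way will not be picked up by the downstream NS on its own, since a secondary only transfers when the serial increases.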

One log entry for a "scheduled re-signing" says:

Mar 21 17:34:42 tilfeldigvis ods-signerd: [worker[2]] report for duty
Mar 21 17:34:42 tilfeldigvis ods-signerd: [worker[2]] nothing to do
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] report for duty
Mar 21 17:34:43 tilfeldigvis ods-signerd: [scheduler] pop task for zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [scheduler] unschedule task [sign] for zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] start working on zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] perform task [sign] for zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa at 1521650083
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] sign zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [namedb] zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa update serial: format=datecounter in=2018020504 internal=2018031400 out=2018031400 now=1521650083
Mar 21 17:34:43 tilfeldigvis ods-signerd: [namedb] zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa update serial: 2018031400 + 700 = 2018032100
Mar 21 17:34:43 tilfeldigvis ods-signerd: [zone] zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa set soa serial to 2018032100
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] wake up
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[2]] report for duty
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] somebody poked me, check completed jobs 106 appointed, 106 completed, 0 failed
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[2]] nothing to do
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] sign zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa ok: 106 of 106 RRsets succeeded
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] write zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [tools] skip write zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa serial 2018032100 (zone not changed)
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] next task [sign] for zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [file] openfile 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa.backup2.tmp count 1
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] finished working on zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [scheduler] schedule task [sign] for zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [task] On Wed Mar 21 19:34:43 2018 I will [sign] zone 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] report for duty
Mar 21 17:34:43 tilfeldigvis ods-signerd: [worker[1]] nothing to do

i.e. in this instance nothing was actually done, but you can also
see how it computes the downstream SOA serial number.
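
The serial arithmetic in the [namedb] lines can be reproduced with a small sketch. This is only a reading of the log output, not OpenDNSSEC's actual code: it assumes that "datecounter" means "today's date as YYYYMMDD followed by a two-digit counter", that dates are taken in UTC, and it ignores serial number wrap-around. The function name datecounter_serial is hypothetical:

```python
from datetime import datetime, timezone

def datecounter_serial(prev_outbound, now):
    """Assumed datecounter rule: YYYYMMDD00 for today, else prev + 1."""
    day = datetime.fromtimestamp(now, tz=timezone.utc).strftime("%Y%m%d")
    today = int(day) * 100
    return today if today > prev_outbound else prev_outbound + 1

# The re-sign from the log: previous outbound 2018031400, now=1521650083.
new_serial = datecounter_serial(2018031400, 1521650083)
print(new_serial, new_serial - 2018031400)  # 2018032100 700

# On 2018-03-14 (timestamp taken from the earlier [axfr] log line):
remembered = datecounter_serial(2018031400, 1521035882)  # previous outbound known
forgotten = datecounter_serial(2018020504, 1521035882)   # only the inbound serial known
print(remembered, forgotten)  # 2018031401 2018031400
```

Under these assumptions, a signer that no longer knows the previous outbound serial would hand out 2018031400 again on 14 March. That is consistent with the guess below, but it does not prove it.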

I'm guessing (have not dug down to the details) that the SOA
"offset" isn't saved on-disk between runs, and this has therefore
caused a serial# to be re-used, triggering this problem(?)

Either that, or the memory of the previous outbound SOA# (and
therefore the offset which needed to be bumped on re-signing) was
stored only in the tmp/ backup2 file header, and so was lost when
I removed that directory.  The current header looks like this:

;OpenDNSSEC-backup-v3
;;Time: 1521729283
;;Zone: name 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa class 1 inbound 2018020504 internal 2018031400 outbound 2018031400
;;Signconf: lastmod 1520466103 maxzonettl 0 resign PT7200S refresh PT777600S valid PT1814400S denial PT1814400S jitter PT43200S offset PT3600S nsec 50 dnskeyttl PT3600S soattl PT3600S soamin PT900S serial datecounter 
;;Nsec3parameters: salt 8a6b88f85bb4751d algorithm 1 optout 0 iterations 5
;;Key: locator 1d8c9a33b21f9f8e779f951db9b16168 algorithm 8 flags 257 publish 1 ksk 1 zsk 0 rfc5011 0
;;Key: locator 39b74a5b177b9febaabebb9c5835ad79 algorithm 8 flags 256 publish 1 ksk 0 zsk 0 rfc5011 0
;;Key: locator c39269697c2ea402c69c3dcdcc5c0bfc algorithm 8 flags 256 publish 1 ksk 0 zsk 1 rfc5011 0
;;
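
To survey many zones, the serials could be pulled out of the ;;Zone: header line. A minimal Python sketch, assuming the line always has the alternating key/value layout shown above (only this one example is available here, so the format is an assumption; parse_zone_header is a hypothetical helper):

```python
def parse_zone_header(line):
    """Parse a ';;Zone:' backup header line into a dict of key/value pairs."""
    tokens = line.split()
    if not tokens or tokens[0] != ";;Zone:":
        raise ValueError("not a ;;Zone: header line")
    # Remaining tokens alternate key, value: name ..., class ..., inbound ...
    pairs = dict(zip(tokens[1::2], tokens[2::2]))
    for key in ("inbound", "internal", "outbound"):
        pairs[key] = int(pairs[key])
    return pairs

header = (";;Zone: name 4.0.0.0.0.0.7.0.1.0.0.2.ip6.arpa class 1 "
          "inbound 2018020504 internal 2018031400 outbound 2018031400")
info = parse_zone_header(header)
print(info["name"], info["inbound"], info["internal"], info["outbound"])
```

Comparing the outbound value against the serial the downstream NS is serving would show which zones are stuck on the same number.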

Comments?

Regards,

- Håvard


