[Opendnssec-user] 1.4.9: Zones not properly updating -- fallout from upgrade?

Havard Eidnes he at uninett.no
Tue Mar 8 11:03:56 UTC 2016


Hi,

further on this problem, I took a closer look at the .ixfr file
for this particular zone, comparing to the other bits and pieces:

1) .backup2 file contains

;;Zone: name korunikhum.no class 1 inbound 2015062912 internal 2016022600 outbound 2016022600
; ...
korunikhum.no.  3600    IN      SOA     ns.uninett.no. elisabeth.uninett.no. 2016030800 28800 3600 604800 900
; ...
korunikhum.no.  3600    IN      RRSIG   SOA 8 2 3600 20160329185004 20160308080507 31363 korunikhum.no. TYORJ/Hdwp3Mgr9e2Ffbrh4fkfsZnwaw7OdICOkBqO6/Jg0TPVSnT75V9ZZd+I3+/w12e1YrUAaMmeV2CJCl6V5tesqCARepntLnl035EFB0dP/nj/setcW+hzUsrmzGj6axOyxPgYkR1HTAj7fUL7etHXkAsG1faoNMmR2VRHE=; {locator 559d9d0ca1306b4c56895e4bc31dfd00 flags 256}

   So... There is a new SOA record (dated today) and a new
   signature for the same (also created today) in the .backup2
   file.

2) The last trace in the log of a signing of the zone is

Feb 26 14:05:11 hugin ods-signerd: [STATS] korunikhum.no 2016022600 RR[count=4 time=0(sec)] NSEC3[count=2 time=0(sec)] RRSIG[new=7 reused=0 time=4(sec) avg=1(sig/sec)] TOTAL[time=4(sec)]

   This matches up with the 2016022600 "internal" and "outbound"
   SOA version numbers, but not with the SOA record above which
   is at 2016030800(!)

   It also follows shortly after the upgrade of OpenDNSSEC from
   version 1.4.7 to 1.4.9, where the latter discarded all the
   .backup2 files due to the introduction of the rfc5011 flag and
   the lack of backward compatibility.

   This is also the last occurrence of '[STATS] korunikhum.no' in
   our logs, so it is probably the last time the zone was signed(?)
   I write a question mark because there's a newer SOA and
   signature in the .backup2 file.

3) Massaging the .ixfr file for the zone with some awk gives:

hugin: {47} awk '/IN    SOA/ { if (num != 0 ) { printf("RRs %d: ", num); num=0; for (r in rrs) { printf("%s %d ", r, rrs[r]); } delete rrs; printf("\n"); } printf("SOA %s\n", $7); next; } // { rrs[$4]++; num += 1 }' korunikhum.no.ixfr
SOA 2016022500
SOA 2016022200
RRs 8: RRSIG 8 
SOA 2016022200
RRs 8: RRSIG 8 
SOA 2016022400
RRs 27: RRSIG 27 
SOA 2016022400
RRs 28: DNSKEY 1 RRSIG 27 
SOA 2016022500
RRs 5: RRSIG 5 
SOA 2016022500
RRs 5: RRSIG 5 
SOA 2016022500
hugin: {48}

   You'll notice that none of the SOA version numbers in the .ixfr
   file correspond to the zone signing which was done on the 26th of
   February.  So of course when an inbound IXFR request comes in,
   the IXFR code reads this file, and the new zone data is
   *not* transferred.
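
The same check can be scripted so it is easy to repeat for other zones.
What follows is only a sketch: it assumes the .ixfr file holds one record
per line in the same presentation format as the SOA line quoted from the
.backup2 file (which is also what the awk script relies on when it prints
field 7). The function names and the sample lines are made up for
illustration and are not part of OpenDNSSEC.

```python
def soa_serials(lines):
    """Return the SOA serials found in the given lines, in file order.

    Assumes one record per line laid out as:
    name ttl IN SOA mname rname serial refresh retry expire minimum
    """
    serials = []
    for line in lines:
        fields = line.split()
        # Field 7 (index 6) is the serial, as in the awk script's $7.
        if len(fields) >= 7 and fields[2] == "IN" and fields[3] == "SOA":
            serials.append(int(fields[6]))
    return serials


def journal_has_serial(lines, serial):
    """True if the journal mentions the given SOA serial at all."""
    return serial in soa_serials(lines)


# Hypothetical sample shaped like the records quoted in this message.
sample = [
    "korunikhum.no. 3600 IN SOA ns.uninett.no. elisabeth.uninett.no. "
    "2016022500 28800 3600 604800 900",
    "korunikhum.no. 3600 IN RRSIG SOA 8 2 3600 (signature elided)",
    "korunikhum.no. 3600 IN SOA ns.uninett.no. elisabeth.uninett.no. "
    "2016022200 28800 3600 604800 900",
]

print(soa_serials(sample))
print(journal_has_serial(sample, 2016022600))
```

To run it against a real file, pass open("korunikhum.no.ixfr") as the
lines argument; if the outbound serial from the .backup2 header is
missing from the result, the journal has fallen behind.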

The date stamps for "last written" for the various files are:

hugin: {51} ls -l korunikhum.no*
-rw-r--r--  1 ods  ods   3713 Feb 26 14:05 korunikhum.no.axfr
-rw-r--r--  1 ods  ods   4759 Mar  8 10:05 korunikhum.no.backup2
-rw-r--r--  1 ods  ods  22723 Feb 25 05:06 korunikhum.no.ixfr
-rw-r--r--  1 ods  ods  33866 Feb 21 20:10 korunikhum.no.ixfr-bad
-rw-r--r--  1 ods  ods    325 Feb 26 13:49 korunikhum.no.xfrd-state
hugin: {52}
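
The listing shows the .ixfr file last written a day before the .axfr
file. A small check along those lines could flag other zones in the same
state. This is a sketch only: the function name is made up, and an older
.ixfr file is merely a hint that the journal is lagging, not proof.

```python
import os


def journal_older_than_zone(axfr_path, ixfr_path):
    """True if the .ixfr file was last modified before the .axfr file."""
    return os.path.getmtime(ixfr_path) < os.path.getmtime(axfr_path)
```

For the zone above this would be called as
journal_older_than_zone("korunikhum.no.axfr", "korunikhum.no.ixfr").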

Hmm...  I think I see what's going on, and it all comes back to
rejecting the .backup2 file:

1) The only two places which set the per-zone "is_initialized"
   variable to 1 are a) the piece which reads the .backup2 file (which
   was aborted, so the variable wasn't set), and b) the tools_output()
   function after adapter_write() is called.

2) adapter_write() has a check whether the zone "is_initialized", and
   skips writing the .ixfr file if it's not set.

3) On later invocations, tools_output() skips writing a new zone if
   the contents haven't changed; see this block of code:

    /* prepare */
    if (zone->stats) {
        lock_basic_lock(&zone->stats->stats_lock);
        if (zone->stats->sort_done == 0 &&
            (zone->stats->sig_count <= zone->stats->sig_soa_count)) {
            ods_log_verbose("[%s] skip write zone %s serial %u (zone not "
                "changed)", tools_str, zone->name?zone->name:"(null)",
                zone->db->intserial);
            stats_clear(zone->stats);
            lock_basic_unlock(&zone->stats->stats_lock);
            zone->db->intserial =
                zone->db->outserial;
            return ODS_STATUS_OK;
        }
        lock_basic_unlock(&zone->stats->stats_lock);
    }

   and of course on subsequent invocations, the zone doesn't really
   change compared to the previous version.

So to me this looks like

a) it skips recording the actual change of the initial signing after
   the upgrade to 1.4.9 in the .ixfr file
b) all subsequent internal changes go unnoticed because the check
   above short-circuits the writing of the output data

so it ends up stuck in this state while the remaining signature
validity for the actual published records keeps on counting downwards
towards zero, and this will not mend itself without manual
intervention.
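
How long one has before this bites can be estimated from the expiration
field of an RRSIG, which RFC 4034 gives as YYYYMMDDHHmmSS in UTC. The
sketch below uses the expiration field of the RRSIG quoted from the
.backup2 file and the posting time of this message purely as sample
input; the signatures actually being served have their own expiration,
which would have to be read from the published zone. The function names
are made up.

```python
from datetime import datetime, timezone


def rrsig_time(stamp):
    """Parse an RRSIG inception/expiration field (YYYYMMDDHHmmSS, UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def remaining_validity(expiration, now):
    """Time left until the signature expires (negative if already expired)."""
    return rrsig_time(expiration) - now


# Sample input only: expiration field from the quoted RRSIG, and the
# time this message was sent.
now = datetime(2016, 3, 8, 11, 3, 56, tzinfo=timezone.utc)
left = remaining_validity("20160329185004", now)
print(left)
```

A monitoring job could raise an alarm when the result drops below some
threshold of one's own choosing.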

My conclusion: "Not robust!"

Regards,

- Håvard


