[Opendnssec-user] Re: Zone stuck, not updating

Fred.Zwarts F.Zwarts at KVI.nl
Tue Nov 4 07:58:11 UTC 2014

"Havard Eidnes" wrote in message 
news:20141031.172405.489878262.he at uninett.no...
>> It seems that the problem is that the SOA version number used in
>> the IXFR request is totally "off the wall"; I'm seeing
>> 3180924024, which is way bigger than what's in the .xfrd-state
>> file (2014091709), but still "bigger" according to the serial
>> number arithmetic used for SOA version numbers.
>Ding, found the bug which caused the odd-looking SOA serial
>number.  Someone has been running on autopilot...
>xfrd_recover() in signer/src/wire/xfrd.c which restores the data
>for a zone from the .xfrd-state file contains this piece of code:
>                /* all ok */
>                xfrd->soa.ttl = htonl(soa_ttl);
>                xfrd->soa.serial = htonl(soa_serial);
>                xfrd->soa.refresh = htonl(soa_refresh);
>                xfrd->soa.retry = htonl(soa_retry);
>                xfrd->soa.expire = htonl(soa_expire);
>                xfrd->soa.minimum = htonl(soa_minimum);
>Um, why the htonl() invocations?!?  This is data written to and
>read from a local file, so on a little-endian machine, a byte-
>swap should not be needed.
>Byte-swapping the "strange" serial number:
>% dc
>3180924024 16op
>% dc
>gives us back what it should have been in the first place.
>And of course when I fix this bug, the byte-swapped version of
>the SOA serial has already been written to the *.xfrd-state
>files, forcing me to stop OpenDNSSEC, remove all the *.xfrd-state
>files, and restart it.
>Hopefully *that* should fix the problem permanently.
>It seems that some other files in my /var/opendnssec/tmp/ also
>needed removing (ixfr state files messed up?), so I stopped
>OpenDNSSEC, renamed the dir and re-created it and then restarted
>OpenDNSSEC.  Things seem to be back on track again.
>Now I just need to write that monitoring script so that I can be
>alerted if this should ever happen again -- hopefully it won't.
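
The byte swap described in the quoted message can be checked with a few
lines of C. This is only an illustrative sketch: swap32() is a made-up
helper, not an OpenDNSSEC function, and it assumes a little-endian host,
where htonl() reverses the byte order of a 32-bit value.

```c
#include <stdint.h>

/* Hypothetical helper (not part of OpenDNSSEC): reverse the byte order
 * of a 32-bit value. On a little-endian host this is the effect of
 * htonl(); on a big-endian host htonl() leaves the value unchanged. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

swap32(2014091709) gives 3180924024 (0x780C99BD becomes 0xBD990C78), which
is the "strange" serial seen in the IXFR request, and swapping it again
restores 2014091709.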

This might explain why we see this problem more often than others. We have a 
cron job that once a day stops OpenDNSSEC, makes a backup of the softhsm and 
opendnssec directories and databases, and then restarts OpenDNSSEC. This 
means that in our case the serial gets corrupted every day. Sites that do not 
shut down OpenDNSSEC daily won't see the problem that often.
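
For reference, the "serial number arithmetic" mentioned in the quoted
message is the comparison defined in RFC 1982. A minimal sketch, with
serial_gt() as a hypothetical helper name (not taken from the OpenDNSSEC
source), shows why the corrupted value still counts as newer than the
real one:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if serial a is "greater than" serial b
 * under RFC 1982 arithmetic with 32-bit serials, otherwise 0. The
 * unsigned subtraction wraps modulo 2^32; a difference of exactly 2^31
 * is undefined in RFC 1982 and is reported as 0 here. */
int serial_gt(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;
    return diff != 0 && diff < 0x80000000u;
}
```

serial_gt(3180924024, 2014091709) returns 1 because the difference,
1166832315, is below 2^31, so the byte-swapped serial compares as newer
than the one stored in the .xfrd-state file.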
