[Opendnssec-user] Problems adding largish # of zones

Havard Eidnes he at uninett.no
Thu Dec 17 19:07:15 UTC 2015


Hi,

I'll try to digest the rest of your message more properly, but
this is just my initial feedback:

> So to cut to the chase. Based on my testing, in the short term your
> troubles should go away if you increase the value of this define in
> tcpset.h
>
> #define TCPSET_MAX 50
>
> Make this something on the order of the number of zones you are
> adding at once. I'd stay a bit away from 1024 so as to allow
> the signerd some room for other file descriptors. So
> I'd advise maybe 500 to 900.

My initial reaction is that this will make OpenDNSSEC into the
local network villain, and will create a "thundering herd" of TCP
connections on startup, possibly overwhelming the upstream auth
name server (which in our case also does other things than feed
OpenDNSSEC its zones(!)).

Instead I'd rather see OpenDNSSEC properly pace itself in its
interaction with its surroundings: there really is no need to
start 300-500 parallel TCP sessions on startup if it is only
configured to have two worker threads!  Even on a relatively
beefy machine it takes its own sweet time "configuring" and
"reading" all the zone files on startup, so why the rush?  And
I'm quite certain that, in total, doing this paced instead of
with 300-500 parallel sessions isn't actually going to be slower
(possibly quite the opposite), and it will additionally create a
lot less stress on its surroundings.

If the signer wants to read a zone file, and the zone file isn't
there, it should do a zone transfer if configured to do so, and
wait for it to complete before proceeding, instead of "retrying".
If there's no connection slot available, it should take a place
in the queue.  It should not simply declare "input adapter
failed", *not* initiate a zone transfer, and then spin around ever
more slowly trying to read a zone file which won't be there until
a zone transfer is actually attempted.  (That's what the behaviour
looks like judging from the log files, and it seems an awfully
clumsy way to go about these things...)

Instead of the signer trying 300-500 parallel zone transfers, I
think a more reasonable behaviour would be to not do more than 4
parallel zone transfers(!)
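
To make the pacing idea concrete, here is a minimal sketch in Python.
It is not OpenDNSSEC code and does not reflect how the signer is
structured internally; TransferPacer, fake_transfer and load_all are
hypothetical names invented for the illustration, and the limit of 4
is simply the figure suggested above.  Each zone asks for a transfer
slot, waits in line if none is free, and only returns once its
transfer has completed, so nothing ever has to "retry".

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_TRANSFERS = 4  # figure suggested above; not an OpenDNSSEC setting


class TransferPacer:
    """Let at most `limit` zone transfers run at once; the rest wait their turn."""

    def __init__(self, limit=MAX_PARALLEL_TRANSFERS):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous transfers observed

    def fetch(self, zone, transfer):
        # Block until a slot is free instead of failing and retrying later.
        with self._slots:
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                # Returns only once the transfer itself has finished.
                return transfer(zone)
            finally:
                with self._lock:
                    self._active -= 1


def fake_transfer(zone):
    """Hypothetical stand-in for a real zone transfer (AXFR)."""
    time.sleep(0.01)
    return f"contents of {zone}"


def load_all(zones, pacer, transfer=fake_transfer, workers=32):
    """Request every zone at once; the pacer decides how many really run."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {z: pool.submit(pacer.fetch, z, transfer) for z in zones}
        return {z: f.result() for z, f in futures.items()}
```

With, say, 368 zone names passed to load_all, every zone is requested
up front, but pacer.peak never exceeds the configured limit, so the
upstream name server sees at most 4 sessions at any one time.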

But ... this partly depends on what you actually mean when you
say "the number of zones you are adding at once".  I can think of
a couple of interpretations:

1) The number of zones configured where there's no cached file
   on disk, i.e. the number of configured zones.  When I get the
   dreaded "soamin not set" assertion (which unhelpfully doesn't
   say *which* zone or which file triggers the error condition;
   possibly I'll take a look at fixing that), I as an operator
   have no other recourse than to remove all the cached files.
   So am I then "adding the number of configured zones" (in my
   case, at present, 368)?
   
2) The number of zones added in one batch with "zone add", and
   counted when you do the corresponding "update zonelist"?  In
   our situation this is usually quite modest, except for the
   round 2-3 weeks ago where I added around 300 in one go.

Or perhaps "both 1 and 2"?


I'll need to look in more detail at the suggested patches and the
other comments.

Regards,

- Håvard


