[Opendnssec-user] OpenDNSSEC in ISP environment (lots of small zones)?

Simon Mittelberger simon.mittelberger at united-domains.de
Fri Jan 28 11:12:10 UTC 2011


hi,

we are currently trying to achieve the same thing (many small zones). at
the moment we have opendnssec up and running on 1500 zones, with a ksk
lifetime of 2 days and a zsk lifetime of 1 day.

what we did to achieve this amount of zones is the following:
1. use mysql as database backend instead of sqlite in order to get rid
of the kasp.db locks.
2. we created more policies, each with its own softhsm, and distributed
the zones equally over them. so we now have 50 policies with 50
softhsms. we found that softhsm performed a lot better with fewer keys
in each repository.
3. we also disabled the auditor.
4. we added the zones slowly (one zone every 30 seconds) so that the
enforcer would not be stuck in the key creation queue forever and the
signer could take up its work.
5. we built a small script which checks whether keys are published in
the parent and issues the ds-seen commands for the ksks automatically.
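
steps 2 and 4 above could be sketched roughly as follows. this is only
an illustration, not the script we actually run: the policy and zone
names are made up, and the 'ods-ksmutil zone add --zone ... --policy ...'
syntax is an assumption about the 1.x command line that you should check
against your own installation.

```python
import subprocess
import time
from typing import Callable, Dict, List, Sequence


def assign_round_robin(zones: Sequence[str],
                       policies: Sequence[str]) -> Dict[str, str]:
    """Map each zone to a policy so that policy sizes differ by at most one."""
    if not policies:
        raise ValueError("at least one policy is required")
    return {zone: policies[i % len(policies)] for i, zone in enumerate(zones)}


def zone_add_command(zone: str, policy: str) -> List[str]:
    # Assumed ods-ksmutil 1.x syntax; verify on your installation.
    return ["ods-ksmutil", "zone", "add", "--zone", zone, "--policy", policy]


def add_zones_slowly(assignment: Dict[str, str],
                     delay_seconds: float = 30.0,
                     run: Callable[[List[str]], object] = subprocess.check_call,
                     sleep: Callable[[float], object] = time.sleep) -> None:
    """Add one zone at a time, pausing between additions.

    `run` and `sleep` are injectable so the pacing can be tried out
    without touching a real OpenDNSSEC installation.
    """
    for index, (zone, policy) in enumerate(assignment.items()):
        if index:
            # Pause before every zone except the first one.
            sleep(delay_seconds)
        run(zone_add_command(zone, policy))
```

with 1500 zones and 50 policies, assign_round_robin puts 30 zones on
each policy; add_zones_slowly(assignment) then feeds them to the
enforcer one by one.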

but we are noticing some other locking problems as well:
1. when the enforcer is creating keys while the signer is signing, the
signer sometimes runs into errors and crashes. if we wait and restart
the signer after key creation is done, it works again.
2. we noticed that the command 'ods-ksmutil key ds-seen --zone xxx
--keytag 123' also runs into a lock while the enforcer is creating keys,
so the ds-seen command cannot be issued.
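
one way to live with the second problem is to simply retry the ds-seen
command until the enforcer has released the lock. the sketch below
assumes that a blocked attempt shows up as a non-zero exit status, which
we have not verified; the retry count and wait time are arbitrary
example values.

```python
import subprocess
import time
from typing import Callable, List


def ds_seen_command(zone: str, keytag: int) -> List[str]:
    # Same command as quoted above, with the zone and keytag filled in.
    return ["ods-ksmutil", "key", "ds-seen",
            "--zone", zone, "--keytag", str(keytag)]


def run_with_retry(command: List[str],
                   attempts: int = 10,
                   wait_seconds: float = 60.0,
                   run: Callable[[List[str]], int] = subprocess.call,
                   sleep: Callable[[float], object] = time.sleep) -> bool:
    """Run `command` until it exits with status 0.

    Returns True on success and False once `attempts` tries have failed.
    Assumption: a locked database makes the command exit non-zero.
    """
    for attempt in range(1, attempts + 1):
        if run(command) == 0:
            return True
        if attempt < attempts:
            # Give the enforcer time to finish before trying again.
            sleep(wait_seconds)
    return False
```

for example, run_with_retry(ds_seen_command("xxx", 123)) would try up to
ten times, one minute apart, before giving up.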

however, we believe these issues are known to the opendnssec developers
and we think they are trying to resolve them. the project plan at
http://trac.opendnssec.org/wiki/ProjectPlan/EnforcerNG sounds very
promising.


all the best,
simon

On Friday, 28.01.2011, at 11:19 +0100, Jan-Piet Mens wrote:
> Hello,
> 
> We are envisioning the deployment of OpenDNSSEC in an ISP environment in 
> order to provide DNSSEC services to clients. As an ISP, the typical use 
> is thus many thousands of small zones.
> 
> We've been looking at OpenDNSSEC for some time now, and have read and 
> (hopefully) understood most of the documentation and the very useful 
> hints and messages on the mailing-list. Initial tests with prior 
> versions on a couple of zones looked very promising, so we upgraded 
> to OpenDNSSEC 1.2.0 and attempted to put a bit of load on the service. 
> Wanting to avoid spending money on a HSM during testing, we are using 
> SoftHSM, also version 1.2.0.
> 
> I realize what follows may sound like a rant, which it isn't supposed to 
> be; it is more of a cry for help, coupled with the question of whether 
> OpenDNSSEC is the right tool for our job. :)
> 
> We are basically using a default configuration as provided by the 
> project. As mentioned, the first couple of zones work like a charm, and 
> I was delighted to see that the round-trip-time of a dynamic update to 
> BIND, the zone transfer to OpenDNSSEC, it signing the zone and providing 
> it to an NSD server could be completed in just a couple of seconds! 
> Lovely. ;-)
> 
> For testing, we added 10,000 synthetic zones, each with 610 RR all 
> configured to use a single default policy. From that point onwards, it 
> all becomes a bit blurry; the following observations are based on "look 
> and feel".
> 
> For example, after about 2,000 key pairs were created, we notice 
> concurrency seems to be a problem. While the enforcer is running, the 
> KASP database has a lock on it so that I can't even look at a key, an 
> operation which is surely read-only?
> 
>     ods-ksmutil key list -z c1767.aa
>     SQLite database set to: /usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db
>     /usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db.our_lock already locked, sleep
>     ...
> 
> Our test system has 6GB of RAM on it. While enforcer and signer were 
> running it locked up (swap), so we had to pull the plug on it. After 
> restart, we notice that starting OpenDNSSEC with `ods-control start' 
> doesn't start the enforcer (only the signerd is started). It appears 
> that files left over in /var/run make the enforcer think it is still 
> running.
> 
> Just before the reboot, about 2,000 key pairs had been created. An 
> `ods-ksmutil key list' then took an inordinate amount of time to complete:
> 
> 	time ods-ksmutil  key list  > x.01
> 	SQLite database set to: /usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db
> 
> 	real	5m28.749s
> 	user	4m44.685s
> 	sys	0m43.702s
> 
> The first 10,000 key pairs took over 4 hours to generate. During that 
> time the signer was blocked (kasp.db.our_lock exists). After the four 
> hours, there was no activity: no signing, no nothing. Two signer 
> processes apparently hung. I killed off one of them, and the enforcer 
> continued working.
> 
>     Jan 26 19:53:15 sign1 ods-signerd: zone fetcher transferred zone c1111.aa serial 1 successfully
>     Jan 26 19:53:15 sign1 ods-signerd: daemon/cmdhandler.c:209: cmdhandler_handle_cmd_sign: assertion cmdc->engine->tasklist failed
>     Jan 26 19:53:15 sign1 ods-signerd: zone fetcher transferred zone c1112.aa serial 1 successfully
> 
> Killing off the processes and restarting didn't help. An `ods-ksmutil 
> update all' seems to have "fixed" the issue (I was able to launch 
> ods-control), but the question remains as to what happened.
> 
> What we then did was to completely disable the auditor in the 
> configuration and on the zone policy (all zones have the same policy), 
> hoping to strongly decrease the load of the system. After an `update 
> all` and a restart of the OpenDNSSEC daemons we experienced once again 
> that the enforcer starts and the signers appear to wait on something 
> (a01.aa is the first zone in zonelist.xml):
> 
> 	1955 ?        Rs   193:45 /usr/local/stow/opendnssec-1.2.0/sbin/ods-enforcerd
> 	1959 ?        Ss     0:00 /usr/local/stow/opendnssec-1.2.0/sbin/ods-signerd -vvv
> 	1967 ?        S      0:00 sh -c /usr/local/stow/opendnssec-1.2.0/sbin/ods-signer sign a01.aa > /dev/null 2>&1
> 	1968 ?        S      0:00 /usr/local/stow/opendnssec-1.2.0/sbin/ods-signer sign a01.aa
> 
> (This has been the case for a while now; again, note the times:
>   -rw-r--r-- 1 opendnssec opendnssec 5223424 Jan 28 11:12 kasp.db
>   -rw-r--r-- 1 opendnssec opendnssec       0 Jan 28 07:52 kasp.db.our_lock
> )
> 
> I understand OpenDNSSEC is used mainly in TLD environments, which have few 
> but large zones. Is OpenDNSSEC theoretically suited to be used in 
> production in a lots-of-small-zones environment?
> 
> Is what we are attempting to do realistically feasible with OpenDNSSEC?
> 
> Thank you & regards,
> 
> 	-JP
> 
> _______________________________________________
> Opendnssec-user mailing list
> Opendnssec-user at lists.opendnssec.org
> https://lists.opendnssec.org/mailman/listinfo/opendnssec-user




