[Opendnssec-develop] Re: [OpenDNSSEC] #262: Possible race condition causing CPU-bound loop in signerd?
OpenDNSSEC
owner-dnssec-trac at kirei.se
Tue Oct 25 13:27:31 UTC 2011
#262: Possible race condition causing CPU-bound loop in signerd?
-------------------------------+--------------------------------------------
Reporter: goeran@… | Owner: matthijs
Type: defect | Status: accepted
Priority: major | Component: Signer
Version: 1.3.0 | Resolution:
Keywords: CPU-bound loop |
-------------------------------+--------------------------------------------
Comment (by matthijs):
Hi,
I had the time to take a look at the strace. If I am correct, you have
configured 8 worker threads and 2 signer threads.
I see that all 8 worker threads are putting RRsets in the sign queue, so
8 zones are being signed in parallel at the moment. The strace shows that
7 of them are waiting for q_lock and one of them is releasing q_lock, so
that part is probably working as intended.
All 8 workers have a lock on zone_lock (zone_lock is different in all of
these cases, because each zone has its own zone_lock). The command
handler has received an update command for the zone "chalmers.eu", so it
requires the zone_lock for that zone. Probably, one of the workers
currently has that zone_lock.
The two drudger (signer) threads are sleeping. That is kind of strange.
They should have received a broadcast signal as soon as the threshold of
1 queued RRset was reached:
    if (count == 0 && q->count == 1) {
        lock_basic_broadcast(&q->q_threshold);
        ods_log_deeebug("[%s] threshold %u reached, notify drudgers",
            fifoq_str, q->count);
    }
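To make the notification pattern above concrete, here is a minimal sketch of a bounded queue that broadcasts only on the 0 -> 1 transition. It is not the signer's actual code: it uses plain pthreads instead of the lock_basic_* wrappers, and the names struct fifoq, FIFOQ_MAX, head, notifications, fifoq_init, fifoq_push and fifoq_pop are assumptions made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FIFOQ_MAX 8              /* hypothetical capacity, not the real one */

struct fifoq {
    void *items[FIFOQ_MAX];      /* ring buffer of queued RRsets */
    size_t head;                 /* index of the oldest item */
    size_t count;                /* number of queued items */
    unsigned notifications;      /* illustration only: broadcasts sent */
    pthread_mutex_t q_lock;
    pthread_cond_t q_threshold;
};

void fifoq_init(struct fifoq *q)
{
    q->head = 0;
    q->count = 0;
    q->notifications = 0;
    pthread_mutex_init(&q->q_lock, NULL);
    pthread_cond_init(&q->q_threshold, NULL);
}

/* Worker side. Returns 0 on success, -1 if the queue is full. */
int fifoq_push(struct fifoq *q, void *item)
{
    size_t before;
    int rc = -1;

    pthread_mutex_lock(&q->q_lock);
    before = q->count;
    if (q->count < FIFOQ_MAX) {
        q->items[(q->head + q->count) % FIFOQ_MAX] = item;
        q->count++;
        rc = 0;
        /* Same condition as in the snippet above: wake the drudgers
         * only when the queue goes from empty to one item. */
        if (before == 0 && q->count == 1) {
            pthread_cond_broadcast(&q->q_threshold);
            q->notifications++;
        }
    }
    pthread_mutex_unlock(&q->q_lock);
    return rc;
}

/* Drudger side. The predicate is re-checked under q_lock before every
 * wait, so a drudger that was busy when the single broadcast was sent
 * still sees the queued work instead of sleeping on a non-empty queue. */
void *fifoq_pop(struct fifoq *q)
{
    void *item;

    pthread_mutex_lock(&q->q_lock);
    while (q->count == 0) {
        pthread_cond_wait(&q->q_threshold, &q->q_lock);
    }
    item = q->items[q->head];
    q->head = (q->head + 1) % FIFOQ_MAX;
    q->count--;
    pthread_mutex_unlock(&q->q_lock);
    return item;
}
```

With this shape only the first push into an empty queue triggers a broadcast. If a consumer went to sleep without re-checking the count under the lock, it could miss that one wakeup, which would look like sleeping drudgers next to a non-empty queue.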
Given this reasoning, my analysis is that the RRset queue is full, so the
workers cannot queue the whole zone for signing. Because of that, they
cannot finish their job and release their zone_lock. You notice this
because an ods-signer update command requires the zone_lock of a zone
that is being re-signed at the moment.
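The suspected chain of events can be sketched in a few lines of C. This is only a model of the hypothesis, under the assumption that a worker keeps its zone_lock until the whole zone is queued. QUEUE_CAPACITY, struct rrset_queue, struct zone, rrsets_left and all function names are made up for illustration and do not come from the signer source.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4         /* hypothetical, kept small on purpose */

struct rrset_queue {
    size_t count;                /* RRsets waiting for a drudger */
};

struct zone {
    pthread_mutex_t zone_lock;
    size_t rrsets_left;          /* RRsets the worker still has to queue */
};

/* Non-blocking push: 0 on success, -1 when the queue is full. */
int queue_try_push(struct rrset_queue *q)
{
    if (q->count >= QUEUE_CAPACITY) {
        return -1;
    }
    q->count++;
    return 0;
}

/* The worker takes the zone_lock before it starts signing the zone. */
void worker_start(struct zone *z)
{
    pthread_mutex_lock(&z->zone_lock);
}

/* The worker queues RRsets and releases zone_lock only once the whole
 * zone is queued. Returns 1 when done, 0 when it stalled on a full
 * queue with zone_lock still held. */
int worker_queue_rrsets(struct zone *z, struct rrset_queue *q)
{
    while (z->rrsets_left > 0) {
        if (queue_try_push(q) != 0) {
            return 0;
        }
        z->rrsets_left--;
    }
    pthread_mutex_unlock(&z->zone_lock);
    return 1;
}

/* The command handler needs zone_lock for an update. Returns 0 if the
 * lock was free, EBUSY if a worker still holds it. */
int command_update(struct zone *z)
{
    int rc = pthread_mutex_trylock(&z->zone_lock);
    if (rc == 0) {
        pthread_mutex_unlock(&z->zone_lock);
    }
    return rc;
}
```

In this model, as long as nothing drains the queue, worker_queue_rrsets() keeps returning 0 and command_update() keeps returning EBUSY. That matches the observed symptom: the update command cannot get the zone_lock of a zone whose worker is stuck on a full queue.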
That is my best guess. If someone has any other insights, please provide
them. In the meantime, I will investigate if and how this scenario is
possible.
--
Ticket URL: <http://trac.opendnssec.org/ticket/262#comment:5>
OpenDNSSEC <http://www.opendnssec.org/>
OpenDNSSEC