[Opendnssec-develop] Locking mechanism problems

Jerry Lundström jerry at opendnssec.org
Thu Sep 29 08:23:13 UTC 2011


On 2011-09-29 10.00, Matthijs Mekking <matthijs at NLnetLabs.nl> wrote:

>> A quick fix is to make engine_update_zones() aware of the lock on the
>> zonelist so that it doesn't try to lock it and release it afterwards;
>> suggested patch below.
>> 
>> However, I am seeing this almost everywhere and I would like to discuss
>> our approach. I would really like to see read and write locks implemented
>> to better support thread interoperability, using
>> pthread_mutex_trylock()/pthread_mutex_timedlock() where it suits to
>> counter hangs, and using PTHREAD_MUTEX_RECURSIVE so different segments can
>> lock the same object without having to know about each other, which gives
>> better OO. Of course, as complexity in the locking mechanism increases, so
>> does the chance of deadlocks.
>
>The last sentence is exactly my reason to keep the locking mechanism
>simple, so only lock/unlock.

But the cost of keeping it simple might be a crash or, worse, corrupted data.

Just using PTHREAD_MUTEX_RECURSIVE would be a good start, so that each
separate segment in the code does not have to know which locks its callers
already hold. I don't know how well supported PTHREAD_MUTEX_RECURSIVE is
across the OS targets we have.
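
A minimal sketch of the recursive-mutex idea, under stated assumptions:
zonelist_sketch and its three functions are hypothetical stand-ins, not the
signer's real types or API. The outer function holds the lock and calls an
inner function that takes the same lock again, which would deadlock with a
default mutex. The init function returns the pthread error code, so a
platform that rejects PTHREAD_MUTEX_RECURSIVE would be detected at start-up.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a lockable object such as the zonelist. */
typedef struct {
    pthread_mutex_t lock;
    int zone_count;
} zonelist_sketch;

/* Initialise the lock as recursive; returns 0 or a pthread error code. */
int zonelist_sketch_init(zonelist_sketch *zl)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(zl != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) rc = pthread_mutex_init(&zl->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc == 0) zl->zone_count = 0;
    return rc;
}

/* Inner segment: takes the lock itself, without knowing about callers. */
int zonelist_sketch_add(zonelist_sketch *zl)
{
    int rc;

    assert(zl != NULL);
    rc = pthread_mutex_lock(&zl->lock);
    if (rc != 0) return rc;
    zl->zone_count++;
    return pthread_mutex_unlock(&zl->lock);
}

/* Outer segment: holds the lock and calls the inner segment n times. */
int zonelist_sketch_update(zonelist_sketch *zl, int n)
{
    int rc, urc;

    assert(zl != NULL);
    rc = pthread_mutex_lock(&zl->lock);
    if (rc != 0) return rc;
    for (int i = 0; i < n && rc == 0; i++)
        rc = zonelist_sketch_add(zl);   /* relocks the same mutex */
    urc = pthread_mutex_unlock(&zl->lock);
    return rc != 0 ? rc : urc;
}
```

Each lock call is still paired with exactly one unlock; the recursive type
only removes the need for the inner segment to know that its caller already
holds the lock.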

>>Thoughts?
>
>I am not sure if this patch fixes the problem. When I look at the
>assertion error, it looks to me like there is a call that invalidly sets
>zone->task to NULL.
>
>http://trac.opendnssec.org/browser/branches/OpenDNSSEC-1.3/signer/src/daemon/engine.c#L897
>
>Here task may be set to NULL, if the task is being worked on (not
>scheduled).
>
>http://trac.opendnssec.org/browser/branches/OpenDNSSEC-1.3/signer/src/daemon/engine.c#L923
>
>Here the zone->task is set to task, which might be NULL if the task was
>being worked on. This should not happen.
>
>The same happens in cmdhandler.c when an update call is received.

I know that this patch does not fix everything; there are more places where
this can happen, and more data structures may be affected by incorrect
locking.
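
For the zone->task case described in the quoted analysis, one possible guard
is sketched below. This is an assumption about how a fix could look, not the
actual signer code: zone_sketch, task_sketch and zone_sketch_set_task() are
hypothetical names. The idea is to update the pointer under the zone's lock
and leave it untouched when no task was returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real signer types. */
typedef struct {
    int id;
} task_sketch;

typedef struct {
    pthread_mutex_t lock;
    task_sketch *task;
} zone_sketch;

/*
 * Replace zone->task only when a task was actually returned.
 * Returns 1 if the pointer was updated, 0 if it was left alone because
 * new_task is NULL (for example, the task is currently being worked on),
 * and -1 if the lock could not be taken or released.
 */
int zone_sketch_set_task(zone_sketch *zone, task_sketch *new_task)
{
    int updated = 0;

    assert(zone != NULL);
    if (pthread_mutex_lock(&zone->lock) != 0) return -1;
    if (new_task != NULL) {
        zone->task = new_task;
        updated = 1;
    }
    if (pthread_mutex_unlock(&zone->lock) != 0) return -1;
    return updated;
}
```

A caller could log or retry when the function returns 0, rather than
continuing with a zone whose task pointer has been cleared.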

Do we want to do an overhaul of the locks, or are there already plans to
redo the signer?

I could start looking at it in a week or so.

/Jerry





