[Opendnssec-develop] Database optimisation

Fri Sep 27 08:00:56 UTC 2013

Youri, others,

When we discussed database optimisation yesterday, I got the feeling that some of the reasons for choosing a database are not well understood in the group.  So, hoping not to state too many obvious things, here's how I see the general approach of OpenDNSSEC following the same design flaw that underpins many websites as well.

SQL is a Set Query Language, and its operations are designed to work on sets, preferrably in bulk.  This means that overhead on queries are often an investment in efficient bulk handling.  The common use in OpenDNSSEC and web applications to use SQL for per-object queries risks the overhead without getting the appropriate "return on investment" that would have applied on bulk operations.

Overhead takes a number of shapes, but at least:
 - updating indexes when inserting, deleting or modifying indexes fields
 - locking to achieve ACID (Atomicity, Consistency, Isolation, Durability) properties (especially locks for individual rows can be inefficient -- and incorrect [1])
 - maintaining a split-brain data set to be able to rollback, possibly in two phases in support distributed transactions
 - deadlock avoidance algorithms (detecting cyclic dependencies on locks)
 - using a file system as a storage vehicle for a series of blocks
 - guarding consistency: foreign key properties

When MySQL started, they evaded many of these forms of overhead.  What they did, was introduce an intuitive notion that N queries of size 1 could run as efficiently as 1 query of size N.  In other words, that it did not matter that you did individual record queries.  The choices made were clear too -- locking was a matter of manually locking an entire table.  As soon as MySQL introduced transactions, I knew their performance would plummet, especially in those not-so-blik cases.  I think that is what we are experiencing now.

There are many alternatives to the SQL data model; usually they attack either the notion of bulk operation or the strong guarantees (well… [1]) provided by ACID.  Removing these assumptions underpinning your data model is often not at all harmful to your application, although it requires some explicit thought, but it will improve efficiency.  Given this, there are a few ways out I gues.

One would be to use MySQL without transactions; I don't know if you can mix it with MySQL and if the semantics are right, but inserting one record in one table is atomic and if it does wait for locks by transactions running, it would be possible to do this without the overhead of a transaction -- but not for one record in one table and another record in another table, you would risk problems with referetial integrity.  You could design your system to even work well in that setting, but this is not how the database layer of the Enforcer was… designed.

Another option is to collate information and offer it all in one set to OpenDNSSEC, *and* process it with a single INSERT statement or otherwise within one transaction.  The mention of multiple -z arguments and feeding a -Z filelist are the options that came up in that direction yesterday.  What I've seen of the KASP code suggests that it too doesn't make bulk queries, so it might be possible to improve things there too.

You could investigate the indexes that are setup (I looked but it looks like there aren't many -- although the foreign key references might create some).

Finally, and most importantly, you could choose a database model whose semantics better fit the purposes of OpenDNSSEC.  This would be a drama to introduce, so in practice it is probably a very remote option.  But it does take away the basic fault of using SQL for mostly singular lookups.  The other models are sometimes referred to as NoSQL databases; they do not include just frivolour-dataformat-of-the-day-over-HTTP but also stuff like text files, DBM, LDAP.  I already mentioned BerkeleyDB in the past, as a key->value database with redundancy (with self-recovery based on majority voting, much more advanced and useful than what MySQL does), thread support and so on.  LDAP is the option that runs on top of that if you need stronger query options including delivering lists that fulfil a property.   The limitation of LDAP is that it cannot join objects to form a larger object, and so you can filter on (attr1=3) or (attr2=*Rein) but not (att3=attr4).  I fully understand that LDAP is too far off for the project, and DBM probably lacks query luxoury, but it should be clear that we are then relying on a database paradigm that was not designed to do what we do with it.  There is one alternative that may be more suitable; Oracle provides a drop-in replacement for SQLite3 that is founded on BerkeleyDB.  Allas, I mentioned all of this before but not with the underpinning I'm giving here.

I wrote a series of articles entitled "Silly SQL" for the Dutch DB/M database magazine; since it's Dutch, I will send it directly to you.

I hope this is helpful,

Cheers,
 -Rick

[1] Row locking can only lock on data that has been seen, not data that has shown to not be present.  So-called "ghost entries" can be inserted by other transactions and leak into your transaction, possibly invalidating your assumptions that rows are not present in a table.  AFAIK, database engineers have never compared queries but only been focussing on row locking as the most refined form of concurrency control.