a little madness

A man needs a little madness, or else he never dares cut the rope and be free – Nikos Kazantzakis


Phrases You Should Never See in an FAQ

Today’s phrase of choice is “you don’t need”, with the word of the day being “never”. Consider this entry in the Hibernate FAQ:

The hibernate.hbm2ddl.auto=update setting doesn’t create indexes

SchemaUpdate is activated by this configuration setting. SchemaUpdate is not really very powerful and comes without any warranties. For example, it does not create any indexes automatically. Furthermore, SchemaUpdate is only useful in development, per definition (a production schema is never updated automatically). You don’t need indexes in development.

What’s the problem with this? The FAQ answer is saying (in a roundabout way) that you should not be asking this question, because you don’t need an answer. Telling your users what they need is a good way to alienate them. If the question is common enough to warrant an entry in the FAQ in the first place, aren’t your users telling you that they do need this functionality? That doesn’t mean you have to jump to implement it – just don’t patronise your users by telling them you understand their needs better than they do.

Personally, I was looking at this entry because we use Hibernate for persistence in Pulse. Pulse has a built-in upgrade framework that updates the schema automatically for you when you install a new version. So much for the assertion above that “a production schema is never updated automatically”. And while recently adding an upgrade that required new indices, I certainly did “need indexes in development”: I want my development environment to match production as closely as possible (not to mention that the indices saved hours of testing time against large data sets).

The most interesting thing underlying all this is that the existence of this (and similar) FAQ entries suggests that the simple SchemaUpdate code could actually be the beginnings of an extremely useful tool. Too few applications have decent upgrade capabilities: our users are often pleasantly surprised by what we have been able to build, with considerable help from SchemaUpdate, to simplify their upgrades. Maybe the Hibernate team are underestimating the potential of their own tool?
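For the curious, driving SchemaUpdate from your own upgrade code is straightforward. Below is a minimal sketch against the Hibernate 3 API of the time, assuming mappings and connection settings live in the usual hibernate.cfg.xml:

import org.hibernate.cfg.Configuration;
import org.hibernate.tool.hbm2ddl.SchemaUpdate;

public class UpdateSchema {
    public static void main(String[] args) {
        // Pick up mappings and connection details from hibernate.cfg.xml
        Configuration configuration = new Configuration().configure();

        // Diff the mapped schema against the live database and apply the
        // changes; the first flag echoes the generated script, the second
        // actually executes it
        new SchemaUpdate(configuration).execute(true, true);
    }
}

As the FAQ entry itself notes, the generated diff will not include indexes, so those still need to be handled by hand.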

SQL schema upgrades a thing of the past?

I would like to draw your attention to the recent release of the new Java persistence API for Berkeley DB. In short, Berkeley DB is designed to be a very efficient embedded database. There is no SQL query support; instead, you talk directly to its internal BTree via a very simple API. A very good write-up is available at TheServerSide for those interested in the full details.
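To give a feel for just how simple that API is, here is my own minimal sketch against the Berkeley DB Java Edition persistence package (not code from the write-up; the class, field and store names are made up):

import java.io.File;

import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.PrimaryIndex;
import com.sleepycat.persist.StoreConfig;
import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.PrimaryKey;

@Entity
class Person {
    @PrimaryKey
    long id;
    String address;
}

public class BerkeleyExample {
    public static void main(String[] args) throws Exception {
        // Open (or create) the environment; the directory must already exist
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(new File("people-env"), envConfig);

        StoreConfig storeConfig = new StoreConfig();
        storeConfig.setAllowCreate(true);
        EntityStore store = new EntityStore(env, "people", storeConfig);

        // No SQL anywhere: the index is typed on the entity class itself
        PrimaryIndex<Long, Person> people = store.getPrimaryIndex(Long.class, Person.class);

        Person person = new Person();
        person.id = 1;
        person.address = "12 Main St, Springfield";
        people.put(person);

        System.out.println(people.get(1L).address);

        store.close();
        env.close();
    }
}

Everything is plain Java objects and a typed index: no mapping files, no query language.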

This style of persistence mechanism is certainly not for everyone. If you want ad hoc query support for writing reports and extracting data, look elsewhere.

If you don’t, then Berkeley DB should be considered when defining the persistence architecture of your next project – just don’t tell your DBAs. Not only is it reportedly very fast, due largely to the lack of an SQL-style interface, but it also supports transactions, hot backups and hot failover: all of the things that help you sleep at night. However, what has me intrigued is the idea of not having to deal with SQL schema migration.

I consider schema migration to be one of the more tedious, yet non-trivial, tasks required by any application that employs relational persistence. Yes, Hibernate makes the task somewhat easier to deal with. Even with Hibernate, however, you will still need to roll up your sleeves and write some SQL to handle the migration of the data.
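For a concrete flavour of that hand-written step, here is a sketch in plain JDBC of splitting a single address column into parts (the person table and its columns are hypothetical):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SplitAddressMigration {
    public static void migrate(Connection connection) throws SQLException {
        // Schema change: add the new columns
        Statement ddl = connection.createStatement();
        ddl.executeUpdate("ALTER TABLE person ADD COLUMN street VARCHAR(100)");
        ddl.executeUpdate("ALTER TABLE person ADD COLUMN city VARCHAR(100)");
        ddl.close();

        // Data migration: extract via SQL, convert in Java, update via SQL
        PreparedStatement update = connection.prepareStatement(
                "UPDATE person SET street = ?, city = ? WHERE id = ?");
        Statement select = connection.createStatement();
        ResultSet rows = select.executeQuery("SELECT id, address FROM person");
        while (rows.next()) {
            String[] parts = rows.getString("address").split(",", 2);
            update.setString(1, parts[0].trim());
            update.setString(2, parts.length > 1 ? parts[1].trim() : "");
            update.setLong(3, rows.getLong("id"));
            update.executeUpdate();
        }
        rows.close();
        select.close();
        update.close();
    }
}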

Managing schema migration within Berkeley DB is different. Whereas previously you extracted the data via SQL, converted it and then updated the DB via SQL, with Berkeley DB you just convert the data in plain old Java. There are some examples in their Javadoc that give a reasonable idea of what is involved. Below is one of these examples, a case where the Person object’s address field is split out into a new Address object with four fields. The core of the work is done by the convert method:


public Object convert(Object fromValue) {
    // Parse the old address and populate the new address fields
    String oldAddress = (String) fromValue;
    Map<String, Object> addressValues = new HashMap<String, Object>();
    addressValues.put("street", parseStreet(oldAddress));
    addressValues.put("city", parseCity(oldAddress));
    addressValues.put("state", parseState(oldAddress));
    addressValues.put("zipCode", parseZipCode(oldAddress));

    // Return the new raw Address object
    return new RawObject(addressType, addressValues, null);
}
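The convert method is only half the story: the conversion also has to be registered as a mutation when the store is opened. Here is a sketch of what that might look like, using the Mutations and Converter classes from the same package (the ToAddressConversion class and the version number are my own placeholders):

import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.persist.EntityStore;
import com.sleepycat.persist.StoreConfig;
import com.sleepycat.persist.evolve.Converter;
import com.sleepycat.persist.evolve.Mutations;

public class OpenWithConversion {
    public static EntityStore open(Environment env) throws DatabaseException {
        // Run our conversion over the address field of version 0 of Person
        Mutations mutations = new Mutations();
        mutations.addConverter(new Converter(
                Person.class.getName(), 0, "address", new ToAddressConversion()));

        StoreConfig storeConfig = new StoreConfig();
        storeConfig.setMutations(mutations);
        return new EntityStore(env, "people", storeConfig);
    }
}

As I understand it, the store then applies registered conversions as old-format records are read, so there is no separate migration step to run.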

Personally, I think this is a great improvement. Now, if only I had been aware of this at the start of this project, things might have been a little different – and faster, and some other good stuff as well.

So are schema upgrades a thing of the past? Maybe not, but they don’t have to be a part of every project.