Archive for the ‘Opinion’ Category

Would You Like Some Tech News With Those Job Ads?

Wednesday, September 19th, 2007

Yes, I’m whining, but I can’t be the only person to notice that job ads are taking over the internet. More specifically, every tech news site seems to have its own job ads, including:

I guess the reason is that good people are hard to find, and our industry is no exception. I do wonder at the viability of these boards; it will be interesting to see if this spreads even further. Are the advantages of a centralised board offset by the differentiating factors offered by each site (in particular the ability to target a niche audience)? Perhaps a combination of the two ala the Google Adwords Content Network is our final destination?

The Key Thing In Python’s Favour

Wednesday, September 5th, 2007

I recently ran across this post advocating Python. This made me think about why I prefer Python over the rest of the scripting crowd. And I realised that there is just one key reason:

Python is easier to read

That’s it. For all the joys of programming in dynamically-typed languages, I think there is one major problem: in many ways these languages favour writers over readers. I covered one example of this problem back in my post on Duck Typing vs Readability. This is why it is so important that a dynamic language is designed with readability in mind.

In my opinion, Python really shines readability-wise. And this is no accident — if you follow design discussions about Python you will see that Guido is always concerned with code clarity. A feature is not worth adding just because it makes code more compact; it must make the code more concise. Features have even been removed from Python because they were seen to encourage compact but difficult to comprehend code (reduce being a classic example). Even controversial design choices like significant whitespace and the explicit use of “self” are actually good for readability.

The philosophy behind Python also extends beyond the language design. Clever tricks and one-liners are not encouraged by the community. Rather, the community aims for Pythonic code:

To be Pythonic is to use the Python constructs and datastructures with clean, readable idioms.

This is surely refreshing compared to the misguided boasting about how much can be crammed in to a few lines of [insert other scripting language]. Certainly, if I ever have to pick up and maintain another person’s dynamically-typed code, I hope it is Python. And since I know I will have to maintain my own code, when I go dynamic I always pick Python.

It’s Not About the Typing, Or Even the Typing

Tuesday, July 10th, 2007

There has been plenty of buzz lately about so-called “dynamic” languages. I guess we have the hype around Rails and similar frameworks to thank (blame?) for this most recent manifestation of an ages-old debate. Unfortunately, the excitement around the productivity increase (real or perceived) afforded by these languages has led to some common fallacies.

Fallacy 1: Misclassification of Type Systems

I’ve rolled a few related fallacies into one here. The most common mistakes I see are classifying static as explicit typing, and dynamic as weak typing. I won’t go into too much detail, as this one is well covered by the fallacies in What To Know Before Debating Type Systems. In any case, you can usually tell what people are getting at, and concentrate on more interesting arguments.

Fallacy 2: Less “Finger” Typing == Drastically More Productivity

This one really gets to me. When was the last time you told your customer “Sorry, I would have made that deadline: I just couldn’t type all the code fast enough!”? The number of characters I need to type has very little to do with how fast I code. This is for a few reasons:

  1. For anything but short bursts, I type a lot faster than I think. Most programming time goes into thinking how to solve the non-trivial problems that come up every day. This reason alone is enough to dispel the myth.
  2. Every second thing I type is spelled the same: Control-Space. Honestly, good IDEs do most of the typing for you, especially when they have plenty of type information to work off.
  3. Sometimes more verbosity is better anyway, otherwise we would be naming our variables a, b, c…

The only ways in which I believe less finger typing give a significant increase in productivity are when:

  • The chance to make a trivial error is removed; or
  • The reduced noise in the code makes it more readable.

Personally, I do not find static typing introduces noise, even when full explicit type declarations are required. On the contrary, I find code with type information is usually easier to read without the need to refer to external documentation (see my earlier post on Duck Typing vs Readability)1.

Fallacy 3: Dynamic Typing Is The Big Factor In Increasing Productivity

This is the end conclusion many people draw to the whole issue. I think this derives from the earlier fallacies, and misses the important points. It is true that people are finding many significant productivity improvements in dynamic languages. However, I think very few of these have anything to do with the static vs dynamic typing debate. So, where are the productivity increases? Here are a few:

  1. Frameworks: Rails is creating the latest hype around dynamic languages, because a framework of its style is a big productivity boost for a certain class of applications. However, Rails-like frameworks are possible in other languages, as demonstrated by the many clones. Not everything will be cloneable in some languages, but the important things can be and those that can’t are unlikely going to be because of static typing.
  2. Syntactic sugar: there is a correlation between languages that are dynamically typed and languages that have convenient syntax for common data structures and other operations (e.g. regular expressions). Note that I do not believe the value of syntactic sugar is in the reduced finger typing (see fallacy 2). Rather, this syntax can reduce the chance of silly errors (e.g. foreach constructs vs a classic for-loop) and can make for significantly more readable code (compare the verbosity of simple list operations in Java vs Python). There is nothing that prevents something like a literal syntax for lists from being supported in a statically-typed language, it is just not common in popular static languages.
  3. Higher-level programming constructs: this is the big one. The biggest productivity gains given by a programming language come from the abstractions that it offers. Similar to the previous point, there is a correlation between languages that use dynamic typing and those with higher-level constructs like closures, meta-programming, (proper) macros etc. However, statically-typed languages can and in some cases do support these constructs.

Let’s also not forget that dynamic typing can harm productivity in certain ways: such as less powerful tooling.

Conclusion

My point is simple: get your argument straight before you write off static typing. Even though dynamic typing is the most obviously “different” thing when you first try a language like Python or Ruby, that does not mean it is the reason things seem much more productive. In the end, allowing higher levels of abstraction and maintaining readability are far more important factors, and these are largely independent of the static vs dynamic religious war.


1 Actually, the ideal readability-wise would be to have explicit type information where the type is not immediately obvious. For example, API parameters need type information (when it is not in the code it ends up in the docs). On the other hand, most local variables do not, for example loop counters.

Linus Makes a git of Himself

Tuesday, June 5th, 2007

Recently, Linus gave a Google tech talk on git, the SCM that he wrote to manage Linux kernel development after the falling out with BitKeeper. In the talk, Linus lambasts both CVS and Subversion, for a variety of reasons, and argues why he believes that git is far superior. In true Linus fashion, he frequently uses hyperbole and false(?) egotism as he lays out his arguments. This in itself could all be in good fun, but Linus takes it a too far. The even worse part, in my opinion, is that Linus confidently pedals arguments that are either specious or poorly focused. In the end, he just loses a lot of respect.

Now both CVS and Subversion are far from perfect. In fact, I personally dislike CVS and am disappointed with the progress of Subversion. Git clearly includes some advanced features that empower the user when compared with either of these tools. However, the way Linus presents the superiority of git is very misleading. Firstly, he throws out several insults towards CVS/Subversion that are plain wrong, such as:

  1. Tarballs/patches (the way kernel development was managed way, way back) is superior to using CVS. This almost has to be a joke, but sadly I do not think it was meant that way. CVS may be lacking many powerful features, but this statement ignores the many extremely useful features that CVS does have.
  2. There is no way to do CVS “right”. Only if you buy the argument that decentralised is the only way, which Linus does not convince me of one bit.
  3. It is not possible to do repeated merges in CVS/Subversion. It may suck that the tools do not support this directly, but everyone knows how to do repeated merges with a small amount of extra effort on the user’s part. I dislike the extra effort, but this is very different from it being impossible.
  4. To merge in CVS you end up preparing for a week and setting aside a whole day. Only a stupid process would ever lead to such a ridiculous situation; one that is not adapted to the tool being used. A more accurate criticism would be to say that CVS limits your ability to parallelise development due to weak merging capabilities.

Ignoring the insults, the very core of Linus’ arguments does not make sense. He claims the absolute superiority of the decentralised model, but the key advantages that he uses to back this up are not unique to decentralised systems. Almost all of the key advantages are much more about the ability to handle complex branching and merging than they are about decentralisation. The live audience was not fooled, and one member even questioned Linus on pretty much this exact point a bit over half way through the talk. In response, Linus starts off on a complete tangent about geographically distributed teams and network latency, leaving the question totally unanswered. The closest he gets is making some very weak points about namespace issues with branches in centralised systems (like somehow that would be difficult to solve).

The inability to answer (or perhaps the ability to avoid) important questions is in fact a recurring theme throughout the talk. Another prime example is when an audience member asks an insightful question about whose responsibility it is to merge changes. In a centralised system, the author is naturally tasked with merging their own new work into the shared code base. The advantage here is they know their own changes and are thus well equipped to merge them. In a decentralised system, people pull changes from others and thus end up merging the new work of others into their own repository. Linus is so happy with this question he quips that he had payed to be asked it. Such a shame, then, that he fails to answer it in any meaningful way. Instead, he waffles about the powerful merging capabilities of git (again, not unique to distributed systems). He then hypothesises a case where he pulls a conflict and instead of merging the change himself he requests the author of the change to do the merging for him. He concludes proudly that this is so “natural” in the distributed model. But hang on a second, the original point was that this would have actually been more natural in the centralised model: the author would have merged their own change and Linus need never have been bothered. Honestly, this is as close as you can get to stand up comedy when giving a talk about an SCM!

All in all, a very disappointing talk. Distributed SCMs are a great thing, and bring many advantages. The powerful merging that the distributed model has brought (through necessity) is something that other SCM vendors should be taking a good look at. This is no way to promote the distributed model, however. A more thoughtful view on the pros and cons of distributed versus centralised models, including what each can take from the other, would have been much more informative. My feel is that centralisation has many benefits for certain projects, and like most other things, the best solution would adapt the good parts of each model to the project at hand.

5 Reasons You Should Delete That Code Now!

Friday, February 2nd, 2007

As programmers we tend to get attached to code we have written. After all, we spent all that effort dreaming it up, typing it in, and (sometimes) making it work. Many of us also like to look over how much we have churned out, to see that lines of code metric crack the 500,000 (or whatever) mark. These sorts of things make us loathe to delete code we have written. But we forget these important points:

  1. Maintenance: someone has to maintain all of that code, and I’m sorry, but it’s you! The less code you have to maintain, the better. So next time you try and crack the million LOC, think about how fun it is going to be to maintain it all :).
  2. Dead code rots: leaving unused code around “just in case” is a particularly bad idea. If it isn’t exercised, it will become broken very soon. Its very presence distracts the reader from their real task.
  3. Source control: think you might need that dead code again? Think that a bad refactor may bring you unstuck? That’s what source control is for: it has got your back.
  4. Efficiency: every little bit of code slows everything down just that little bit. More disk space. Longer compile times. Slower IDE performance. Larger reports. Over time, all these little things add up.
  5. Elegance: verbose code is often a sign of a non-optimal solution. When refactoring, you should be able to use your superior knowledge of the problem to simplify. Cutting back code in this way removes the cruft that builds up over time.

So, stop being so happy that you’ve produced so much code. You should be much happier to achieve the same result with a smaller codebase, even if it means throwing away yesterday’s hard work!

The Perfect Programming Job

Thursday, February 1st, 2007

Fresh out of a Computer Science degree at uni, I hit the job market looking, naturally, for the best job I could find. I immediately ruled out moving into big business (think banking, insurance) and the consulting roles that tend to take you to such places. I wanted to work with technology I could enjoy, on new projects, churning out interesting “stuff”. That is what led me to the wonderful world of the VC-funded startup.

There are many great things about startups:

  1. New projects: by their very nature, startups have brand new projects to work on. We all know that starting from scratch is much more stimulating and fun than maintaining a 30-year-old COBOL codebase.
  2. Great people: because they have interesting projects and questionable job security, startups tend to attract people with a real interest in technology. You don’t get people who moved into “IT” as a career strategy, you get people who love what they do.
  3. Great atmosphere: partly because of 1. and 2., but also because of their small size and youth, startups often have a great feel. They haven’t had time to develop the painful politics and beuraucracy of older, larger companies. Hopefully they never will!
  4. No customers: when you’re just starting out, there are none of those annoying end-users to pester you with support questions and Real World problems. You can just code away in your own little world.

So I jumped aboard the VC-funded startup roller-coaster, and I had a great time. I lost my job a couple of times, but there were always new ventures to start on. I had a ton of fun and I learned heaps.

But something did start to bug me. I poured my efforts into these ventures and their projects, and even though the technology was fun, there was somewhat of a hollow feeling. The problem was that all of this effort was going to waste: without customers everything I created was consigned to obscurity. Of course the startups planned to get customers, and lots of them, but the business model was the classic boom-or-bust strategy. Either we would “disrupt” the world, or we would fade into oblivion. Of course, the odds lean heavily towards the latter.

So this supposed advantage of not having to worry about customers turned out to be the one great drawback of working for startup ventures. Jeff Atwood touches on it in his post Shipping Isn’t Enough:

How many users actually use your application? Now that’s the ultimate metric of success.

I achieved plenty in terms of development as a programmer, and I even managed to do some pretty good work (I hope!). But at the end of the day, I didn’t have the sense of achievement that only comes from creating something that other people use.

However, I have not given the startup game away :). I realised that you can have all the advantages, just with a different strategy. So at Zutubi we’re not planning to “disrupt” anyone. We just make software that people love to use, and grow organically off the back of that. And for once, I have Real Customers that use my software, and I interact with them closely. Who would have thought it: I love dealing with customers! Not because they pay (though I thank them for that ;)), but because every time I learn of someone new using our software, I get that sense of achievement I had been looking for.

Even while I wrote this entry, an email landed from a customer commenting on how they loved the user experience of Pulse. Honestly, it does not get any better than that!

SQL schema upgrades a thing of the past?

Tuesday, October 17th, 2006

I would like to draw your attention to the recent release of the new Java Persistence API for the Berkeley DB. In short, the Berkeley DB is designed to be a very efficient embedded database. With no SQL query support, you instead talk directly to its internal BTree via a very simple API. A very good write up is available at the server side for those interested in the full details.

This style of persistence mechanism is certainly not for everyone. If you want adhoc query support for writing reports and extracting data, look elsewhere.

If you don’t, then Berkeley should be considered when defining the persistence architecture of your next project, just don’t tell your DBAs. Not only is it reportedly very fast, due largely to the lack of SQL style interface, but it also supports transactions, hot backups and hot failover, all of the things that help you sleep at night. However, what has me intrigued is idea of not having to deal with SQL schema migration.

I consider Schema migration to be one of the more tedious and yet non-trivial tasks that is required by any application that employs relational persistence. Yes, Hibernate makes this task somewhat easier to deal with. However, even with Hibernate, you will still need to roll up your sleeves and write some SQL to handle the migration of the data.

Managing schema migration within the Berkeley DB is different. Where as previously you extracted the data via SQL, converted it and then updated the DB via SQL, with Berkeley, you just convert the data in Plain Old Java. They have some examples in there javadoc that gives a reasonable idea of what is involved. Below is one of these examples, a case where the Person object’s address field is split out into a new Address object with 4 fields. The core of the work is done by the convert method:

public Object convert(Object fromValue) {

    // Parse the old address and populate the new address fields
     
    String oldAddress = (String) fromValue;
    Map<String,Object> addressValues =
                                  new HashMap<String,Object>();
    addressValues.put(“street”, parseStreet(oldAddress));
    addressValues.put(“city”, parseCity(oldAddress));
    addressValues.put(“state”, parseState(oldAddress));
    addressValues.put(“zipCode”, parseZipCode(oldAddress));

    // Return new raw Address object
     
    return new RawObject(addressType, addressValues, null);
}

Personally, I think this is a great improvement. Now, if only I had been aware of this at the start of this project, things might be a little different, and faster, and some other good stuff as well.

So are schema upgrades a thing of the past? Maybe not, but they don’t have to be a part of every project.

Do the Simplest Thing That Will Knock Their Socks Off

Wednesday, August 2nd, 2006

It appears that I am not the only one wondering if the pragmatic keep it real approach to software development will result in programs that are missing that something extra, those features that you can not do without the moment you find them.

Kathy writes:

Our users will tell us where the pain is. Our users will drive incremental improvements. But the user community can’t do the revolutionary innovation for us. That’s up to us.

Dave takes a more direct approach when he writes:

It does seem, though, that agile teams will be less likely to either prioritize or implement some of the more subtle touches of the kind that the Wired article discusses. When forced to choose between the features “Send email”, and “Implement graceful date column resizing”, guess which one is likely to get short shrift.

An excellent example of this type of feature, a feature that you didn’t know you needed until you saw it, is iTunes smart shuffle. It “allows you to control how likely you are to hear multiple songs in a row by the same artist or from the same album”.

itunes-smartshuffle.png

I don’t know how I managed without it.

I can not help but wonder whether this feature would have been added by the agile keep it real approach? It is not an essential feature, and I would not have noticed if it was not there. But the fact that it is makes me a very happy user.

The internet, best viewed in 1280 x 1024

Wednesday, May 24th, 2006

Every time that I have upgraded my monitor, I have marvelled at how much more screen realestate is available. Each time I have gone through a period of adjustment where I adjusted the layouts of my applications to take maximum advantage of the extra space. My latest move from a 19inch to one of the nice Dell 24inch screens has allowed me to move to a three column layout in IDEA with the project view on the left, the structure on the right, and still plenty of room for 120 char lines in the editor. Awesome. Moving on to Thunderbird has yielded a similarly pleasing outcome with its nice three column vertical layout. Nice. Even JEdit supports a configurable layout that allows me to make good use of the screen.

Unfortunately my amusement came to an end when I opened up my browser. Either one, it matters not. Where ever I go, whether it’s my favourite news or community website, any one of the blogs I try to keep up with, I end up with one of two things. Either a fixed width page with half my screen empty on either side (but all of the content nicely squeezed into the middle), or a 100% width page where the lines are so wide that they become difficult to read and large patches of whitespace spread around the page.

Now I understand that large screens are only just becoming affordable, and so I do not expect websites to suddenly rearrange their content to suit me and my dislike of unused pixels. It does however bring to attention one aspect of building web applications that can be very frustrating. The page layout. Now certainly some people are doing nice things with the tools that are available, namely css, but that can only go so far. What I would really like to see now is a good old fashioned trustworthy layout manager.

Only just last week I was trying to achieve a relatively simple task. Opening a popup window that is an appropriate size for the content it will display. Well, I had hoped that it would be a relatively simple task. After loading the content into an iframe, doing my best to work out what its visible width and height would be and then dropping it into a popup, I am still no closer to a solution that works in both Firefox and IE. With a slightly better API for handling page layouts, this type of task would become dead simple. Speaking of layout managers, those guys over at qooxdoo are doing some pretty interesting things. Now, if only it didn’t look so much like windows.

10 Ways To Improve Wikis; 2

Wednesday, May 3rd, 2006

Sorry readers, I know you have all been waiting with bated breath for the final installment in the wiki series :). Well, here goes:

  1. Unreadable syntax for tables

    The problem is actually more generic: the simple wiki syntax works well for inline elements, but poorly for structured elements. Why? Mainly because whitespace is significant. This is convenient most of the time, but gives no way to format a structured entity such that it can be easily read. A possible solution is to allow insignificant whitespace so that structured elements can be formatted with indentation. Naturally, for regular paragraphs, significant whitespace would have to stay. I think the semantics for whitespace can still be clear if the insignificant whitespace is only allowed “outside” of the actual content, in the space between whatever syntax elements are used to delimit the table (or whatever it happens to be). Or you could just use a WYSIWYG editor, but you won’t catch me doing it :).

  2. Editing in a browser text area sucks

    A couple of readers actually pointed out some neat solutions for this one, such as the mozex extension for Firefox, which lets you use an external program for editing instead of the browser text area. Nice idea, but still a little awkward, as you can’t avoid interacting with the web UI for other actions (this is a wiki after all!). I think the real answer is the slow, painful progression towards browsers as a richer application platform. Many have tried to create the ubiquitous thin client technology for the web, but none can compete with the inertia of the trusty old web browser. Unfortunately for us users, this means we’ll be waiting a long time for browser technology to evolve to a state where web UIs are actually enjoyable to use. Just look at how excited we all get when AJAX lets us emulate two-decade-old desktop technology in our web apps.

  3. Poor support for versioning

    Let’s start simple: a wiki that doesn’t have at least RCS-level versioning support (simple revisions, diffs, history) should not even be allowed to exist. A wiki that doesn’t have some form of merging support (with conflict resolution) isn’t worth using. Personally, my requirements are even more strict. I maintain documentation for software products, and these products have multiple release streams. I manage these streams in my code base using branches, and ideally would do the same with the documentation. I don’t know of a wiki that supports this. This seems a shame, because the version control technology has been around for a long time. Rather than reinventing a weaker versioning scheme, I don’t understand why more wikis don’t use a full-blown version control system (like Subversion) in the back end. There is a wealth of existing tools and technologies that just aren’t being leveraged here.

  4. Lost edits due to browser crashes

    First off, for those wikis with autosave, this is much less of a problem. I would like to mention one insightful comment from reddit, however. There, poster rahul turned this problem back at the browser itself. A good point: why don’t the web browsers implement their own autosave for form data? They are already part of the way there by remembering form input, it is not much of a leap to get from there to autosaving for text areas. The obvious benefit of this approach is that once it is implemented in the browser, there is no longer a need for every web app to reimplement the same functionality.

  5. Wiki discussions

    People, please just stop using flat wiki pages for discussions! It hurts us, precious! Some wikis have a more enlightened approach, and allow threaded comments on each page. This is a huge improvement, no doubt. Still, will they ever match dedicated forum solutions? Maybe wiki implementors should be looking again to adopt existing technologies. I wonder how feasible it would be to integrate an existing forum solution into the wiki. Perhaps a looser coupling would even be benefitial, allowing more diverse tools to be linked to a wiki page. The counterbalancing force is of course the benefit of tight integration.

So there you have it: some good can come of a rant after all. At the very least, I have picked up a few new ideas myself!

——–
Into continuous integration? Want to be? Try pulse.