a little madness

If Java Could Have Just One C++ Feature…

I have been immersed in Java for a while now, but having worked in C++ for years before, there is one big thing I miss: destructors. Especially in a language with exceptions, destructors are a massive time and error saver for resource management.

Having garbage collection is nice and all, but the fact is that we deal with a multitude of resources and need to collect them all. How do we do this in Java? The Hard Way: we need to know that streams, database connections, etc. need to be closed, and we need to explicitly close them:

FileInputStream f = new FileInputStream("somefile");
// Do some stuff.
f.close();

Of course, with exceptions it gets worse. We need to guarantee that the stream is closed even if an exception is thrown, leading to the oft-seen pattern:

FileInputStream f = null;
try
{
    f = new FileInputStream("somefile");
    // Do some stuff
}
finally
{
    if(f != null)
    {
        try
        {
            f.close();
        }
        catch(IOException e)
        {
            // Frankly, my dear…
        }
    }
}

The noise is just incredible. A common way to reduce the noise is to use a utility function to do the null check and close, but noise still remains. Repeating the same try/finally pattern everywhere is also mind-numbing, and it is easily forgotten, leading to incorrect code.
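A utility function of the kind mentioned here might look something like the following. This is only a minimal sketch: the class name IoUtil and the method name closeQuietly are hypothetical, chosen for illustration, and swallowing the exception is only reasonable when you genuinely do not care whether the close failed.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper class; the names are illustrative, not from any library.
final class IoUtil {
    private IoUtil() {}

    /**
     * Closes the resource if it is non-null, ignoring any IOException.
     * Returns true if close() completed normally, false otherwise.
     */
    static boolean closeQuietly(Closeable resource) {
        if (resource == null) {
            return false;
        }
        try {
            resource.close();
            return true;
        } catch (IOException e) {
            // Deliberately ignored: the caller has said it does not care.
            return false;
        }
    }
}
```

With a helper like this the finally block shrinks to a single call, IoUtil.closeQuietly(f), but the try/finally itself still has to be written out by hand every time.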

In C++, this problem is solved elegantly using the Resource Acquisition Is Initialisation (RAII) pattern. This pattern dictates that resources should be acquired in a constructor and disposed of in the corresponding destructor. Combined with the deterministic destruction semantics for objects placed on the stack, this pattern removes the need for manual cleanup and with it the possibility of mistakes:

{
    std::ifstream f("somefile");
// Do some stuff
}

Where has all the cleanup gone? It is where it should be: in the destructor for std::ifstream. The destructor is called automatically when the object goes out of scope (even if the block is exited due to an uncaught exception). The ability to create value types and place them on the stack is a more general advantage of C++, but Java can close the gap with smarter compilers¹.

Interestingly, C# comes in halfway between Java and C++ on this matter. In C#, you can employ a using statement to ensure cleanup occurs:

using (TextReader r = File.OpenText("log.txt")) {
// Do some stuff
}

In this case the resource type must implement System.IDisposable, and Dispose() is guaranteed to be called on the object at the end of the using statement. The using statement in C# is pure syntactic sugar for the try/finally pattern we bash out in Java every day.
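To make the "syntactic sugar" point concrete, here is a rough sketch of how the same try/finally expansion could be packaged as a library method in Java. The names Resources, ResourceAction and using are hypothetical, and the lambda in the usage note assumes a Java version with lambda expressions (Java 8 or later); in earlier versions an anonymous inner class would have to take its place.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical callback type: the body to run while the resource is open.
interface ResourceAction<T extends Closeable, R> {
    R apply(T resource) throws IOException;
}

// Hypothetical helper that wraps the try/finally pattern in one place.
final class Resources {
    private Resources() {}

    /**
     * Runs the action with the given resource and closes the resource
     * afterwards, whether the action returns normally or throws.
     */
    static <T extends Closeable, R> R using(T resource, ResourceAction<T, R> action)
            throws IOException {
        try {
            return action.apply(resource);
        } finally {
            if (resource != null) {
                // Note: if close() throws here, its exception replaces
                // any exception already thrown by the action.
                resource.close();
            }
        }
    }
}
```

A call might then read: Integer first = Resources.using(new FileInputStream("somefile"), in -> in.read()); The cleanup lives in one place, but unlike a real language construct nothing forces callers to go through the helper.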

What’s the answer for Java?² Well, something similar to using would be a good start, but I do feel like we should be able to do better. If we’re going to add sugar why not let us define our own with a full-blown macro system? Difficult yes, but perhaps easier than always playing catch up? An alternative is to try and retrofit destructors into the language³. It is possible to mix both garbage collection and destructors, as shown in C++/CLI⁴. However, I don’t see an elegant way to do so that improves upon what using brings. If you do, then let us all know!

—
¹ it appears that Mustang already has some of the smarts such as escape analysis.
² if you’re the one down the back who shouted “finalizers”: you can leave anytime you want as long as it’s now!
³ I said NOW!
⁴ See also Herb Sutter’s excellent post on the topic: Destructors vs. GC? Destructors + GC!.

This entry was posted on Thursday, August 3rd, 2006 at 1:36 am and is filed under C++, Programming Languages, Java, Technology.

20 Responses to “If Java Could Have Just One C++ Feature…”

Martin Says:
August 3rd, 2006 at 4:13 am
I am living a quite happy life without destructors. What I cannot live without are closures! With closures your example could look something like this:

File.open("filename.txt") { |f|
    # do something with f.
    # f is the FileInputStream that is automatically closed when this code block has finished executing.
}

Have a look at Ruby, this construct is ubiquitous there.
Nestr Says:
August 3rd, 2006 at 4:59 am
It is interesting how programmers detect the problem of allocating resources of various types independently and realize that there has to be a better way than repeating boilerplate over and over again.

In Java garbage collection takes care of memory allocation. The problem is that custom destructors and garbage collection are not a very good match because most garbage collectors are not deterministic; you never know when the destructor is going to run; worst case: when the program finishes. That is not always acceptable.

The C++ alternative of adding new and delete all over the place and being careful when there are exceptions is very flexible but not that appealing either, and is akin to unstructured gotos. But there is a nice solution:
http://www.gnu.org/software/libc/manual/html_node/Advantages-of-Alloca.html
Allocate on the stack! Resources for that block get automagically removed as soon as they go out of scope.

This is the exact approach many are taking: Use scope!
In C# we have using. Ruby uses blocks:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/20983
And the version of Python coming out next month is adding a ‘with’ statement for exactly this purpose:
http://docs.python.org/dev/whatsnew/pep-343.html
Shai Almog Says:
August 3rd, 2006 at 5:07 am
As a former C++ developer it was very hard for me to make the switch to Java. Destructors? It’s the one thing I don’t miss…
You picked about the only place where a destructor might come in handy but you missed a couple of basic things:
1. You don’t have to close streams in Java. All finalizers (yes I know they aren’t destructors and I did find your notes amusing) will close the stream when they get a chance so in all the years of my programming in Java only once did I run into a problem caused by an unclosed stream (on Mac OS 9.x where file access and threads were “problematic”).

2. Most Java programs have generic error handling code where you don’t have to wrap things with nested try{}catch() within a catch statement and it doesn’t look as ugly (although it is functionally the same thing).

3. Destructors will remove the need for closing the file but to get them working better than finalizers (without performance penalty) you need stack objects which are a world of PAIN.

4. Even if destructors exist and close the stream for you, you still need to catch and handle the exception…

5. I think the REAL problem is that close() throws an IOException
Tom Says:
August 3rd, 2006 at 7:28 am
Shai, you _do_ have to close streams in Java. Based on real life use. Even abstractly speaking, finalizers are defined as being unreliable. And Java’s inconsistent conventions for “destruction” also make knowing what to do a pain. C#’s IDisposable is much clearer. At least the new Closeable in Java 5 is an improvement but not all the way there.

Also, I think RAII is great. Python deterministic garbage collection (if you don’t have reference cycles) also effectively becomes like C++’s RAII only better because it’s based on real liveness, not the block scope. But even block scope ain’t so bad. The D language has RAII like C++, but it’s a bit more explicit if I remember.

All that said, I’m with Martin. I find Ruby to be my favorite on this topic. Seems cleaner and more flexible, although perhaps less consistent, than C#. And while both Ruby and C# (and Java’s inconsistent try/finally) require a block for each resource, I find that it tends to make code clearer than a half dozen strewn resources in local vars in a single block.
Tom Says:
August 3rd, 2006 at 7:40 am
Shai, but I do agree that close() throwing IOException is a pain.
Tom Says:
August 3rd, 2006 at 7:43 am
One more, if I don’t overstay my welcome too badly, is in reference to Python’s “with” as referenced by Nestr. I take the “with” thing as meaning that they felt Ruby envy, didn’t trust the reference counting (because of alternative implementations of GC in Jython, IronPython, and perhaps in CPython in the future), and thought it was more clear for defining things based on scope. Sort of interesting, at least, and I find it too complicated compared to Ruby/Groovy, personally.
Jason Says:
August 3rd, 2006 at 10:47 am
Martin,

Thanks for pointing out how it is solved in Ruby, though I would say that this issue is separate to closures. The fact that Ruby uses blocks for both is interesting, though perhaps mixing two concepts? Also it appears that the Ruby solution suffers from the same issues as “using” in C#: you can forget the block, and it is more awkward to handle multiple resources. Correct me if I’m wrong: I don’t program in Ruby.
Jason Says:
August 3rd, 2006 at 10:53 am
Nestr,

Custom destruction and garbage collection are only a bad match if you tie them together using something like *shudder* finalizers. C++/CLI shows how to marry the two: perform custom destruction at a deterministic time, separate to garbage collection. Leave the GC to clean up the memory only.

Also, in C++ you should not be using new/delete everywhere, nor even alloca. Memory (like all other resources) should be acquired in a constructor and freed in a destructor. By sticking to RAII you make it impossible to forget to deallocate the resources.
Jason Says:
August 3rd, 2006 at 11:09 am
Shai,

Didn’t I ask you to leave? Seriously, finalizers are no solution: they are a horribly broken concept. To address your points one by one:

1) As Tom says you absolutely *do* have to close streams. Finalization provides no guarantees, plus it is a terrible waste of resources (which are limited) to wait for collection.

2) Yes, I did mention that utility classes are used to reduce the code in the catch+close block. An improvement, but we could do much better.

3) I am not sure what performance penalty you are talking about. Finalizers themselves complicate the garbage collector, and I suspect this complication hurts performance. Anyhow, I disagree completely that stack objects are painful. In fact, using the stack is dead simple and performs extremely well. So many of the objects we use are only needed within a stack frame, but so few languages allow us to take advantage of it.

4) Often you do not need to handle the exception, rather it will just be propagated. This is a major advantage of the exception model in the first place. Even if you do handle it, the finally block is extra work that shouldn’t be necessary.

5) Yes, close() throwing an IOException is a pain. The checked/unchecked debate aside, this is just a fact of life: closing can fail. It’s awkward to deal with in any language. Fortunately, a utility function can hide this detail when you don’t care about the failure.
Jason Says:
August 3rd, 2006 at 11:22 am
Tom,

Reference counting in Python is interesting. It does seem that the introduction of “with” indicates that Python is moving away from guarantees about GC. I actually think this is a good thing: GC can perform very well, but the implementation needs to be freed from unnecessary requirements. We can leave memory to GC, but shouldn’t expect it to help with resources it knows nothing about.
Tom Palmer Says:
August 3rd, 2006 at 2:44 pm
Jason, I see Ruby’s use of closures/blocks as a simple and convenient language construct that is very versatile.

As for “forgetting the block” in Ruby, I only have some experience myself, but I look at things as a path-of-least-resistance issue. In any language, you sometimes need a resource to outlive the current block, so some API has to be available for that. The questions I see are, “How obvious is what I’m doing here?” and “How easy is it to accidentally forget or get lazy and cheat?”. And C# is perhaps risky. The code looks the same. You just have to remember to use “using”. Compared with Ruby, the constructs look more different (so code structure makes meaning clearer than in C++), but it’s also very convenient and pretty (at least within the confines of the language). Being convenient and pretty makes it a nice path of least resistance.

So, anyway, all that together is why I like Ruby’s style best, but I haven’t carried out any statistical research or anything.

Meanwhile, thanks for the great post. Java is very much lacking in this important area, and I think raising awareness is important. And either C++, C#, or Ruby style would be an improvement over the current easily buggy style that Java has.

(By the way, I remembered to put in my last name finally. Figured it was worth doing after the confusion the last time I commented on your blog.)
Tom Palmer Says:
August 3rd, 2006 at 4:34 pm
Um, or some linguistic construct for outliving the block. My “API” comment was a bit off. Anyway, done chattering now.
Shai Almog Says:
August 4th, 2006 at 12:26 am
Jason,
I specifically said that I don’t compare finalizers to destructors

To respond to your responses:
1) Resources are not limited; that’s a mistake common among C++ programmers. Once they become limited the GC will run and finalizers will occur, pretty much guaranteed. Finalizers are bad because they require another GC cycle to occur, but for the special case of streams it’s really as close to an elegant solution as GC can get. Unless you are arguing against GC, which is the only proven non-manual memory handling solution (reference counting breaks in very hard to detect ways).

3) Finalizers have a performance penalty (we both mentioned it); it’s unavoidable. So do destructors (function call cost and the cost of the content of that function). The advantage of the finalizer approach is that this cost can occur whenever the GC algorithm chooses (low priority thread unless resources are low). In the case of destructors they have to occur NOW, which might not be a “good time”.
Stack is a pain depending on how you look at it. As a C++ programmer I was a maintainer of a project with over 1 million lines of code. However, for more than 4 years we had to release the project with a specific module compiled in debug mode (this was passed on by our predecessors). Anyway, one day after upgrading to a new version of Visual Studio I got a crash in a section of that particular module; it turns out someone returned a pointer to a stack object… It took us 4 years to discover this (and yes, Purify… right, it wouldn’t even look at the amount of code we had). That problem is a C++ problem; in Java this would probably be simpler, but then you have to explain to most users the difference between a “stack object” and “heap object” (hell, try to explain it to most people who call themselves C++ programmers; they say they understand but they don’t really). The problems can be illustrated quite clearly by C#, which has the terrible support for this nonsense:
a) An object isn’t REALLY a stack object anymore if you pass it onwards (to prevent memory corruption), so you might expect the performance of a stack object but you are really using a heap object.
b) There is no reason to complicate people’s lives with such complex concepts when the JIT can scope these things for us… Hell, with virtual method inlining the application flow changes completely and stack objects can be considerably wider.
c) There are very few cases where stack objects can be used in a proper real world Java application. Streams are one such exception but these are rare; having built MANY J2ME/Java SE and Java EE applications over the past 10 years, I’d say that it is hardly necessary. You can argue that members initialized in the class constructor/fields/static section could be stack objects but I disagree since this is a JIT optimization and really doesn’t require a language modification.

4) We agree that you don’t need to handle the exception in place. However, what I was trying to say is that EVENTUALLY you will need to catch an IOException (this is actually a great example of Java’s declared exceptions), since any real world application MUST handle an IOException at some place. You can catch it in a generic way tack on the stream object into the thread local and throw it onwards letting the generic handling code close the stream for you.

5) I agree, close can fail but I doubt whether we care about this. Close is probably one of those rare cases that should return a boolean for the special cases who do care… Checked exceptions are great IMO but overused in MANY cases in the API!
Jason Says:
August 4th, 2006 at 12:38 am
Tom,

Agreed, there is an element of style to it. Your point about there still needing a way to specify a lifetime outside of the scope is quite right. In C++, the modern way to achieve this is to use an object on the heap wrapped in a smart pointer that gives the desired lifetime (reference counting, ownership transfer, …). This is elegant, but does leave more decisions up to the programmer.
Jason Says:
August 4th, 2006 at 12:54 am
Shai,

Let the debate continue :).

1) You are mixing the general concept of resource management with the specific task of memory management. Resources *are* limited, and the GC does not know a thing about them (except memory), so it doesn’t know when it needs to kick in. This is the first problem with finalizers: they burden the GC with management of resources it knows nothing about. Nothing against GC, just don’t expect it to solve everything.

3) The cost of the destructor function is clearly unavoidable when we need to clean up. In an ideal world we would have a resource collector for every kind of resource that would know a timely and efficient way to trigger and perform cleanup. In reality, the number of possible resources is infinite and there needs to be a generic method for dealing with them. This generic method cannot have all the advantages of a GC which knows the kind of resource it is dealing with.

As for the stack problem, I concede these bugs can be a pain to find (but 4 years?!?). This is a massive disadvantage of C++, but can be largely avoided by use of RAII everywhere except where you absolutely *must* start dealing with raw pointers (very rare in practice). I don’t think this is a reason to write off stack usage (after all, there are plenty of other ways to corrupt memory in C++). Indeed, the vast majority of objects created in a typical application have a very short lifespan bounded by some stack frame. I can’t wait till compilers catch up and take full advantage of this!

4) You’ve lost me now. Attach the stream object to a thread local and bother some code that shouldn’t know a thing about it?

5) Close returning a boolean probably would have been a decent compromise. The main argument against it would be API consistency.
Shai Almog Says:
August 4th, 2006 at 2:30 am
Jason,
I think we agree on most things but see different sets of problems.

1) You are very right with your differentiation between limited resources other than memory and GC. The question is: what are these resources you speak of and do they matter?
In GUI we have GDI memory which is quite limited yet AWT does not require any freeing of memory and works (sort of works, but none of the problems I ran into related to GDI), when programming MFC GDI leaks would just kill me… SWT does take the approach of manually freeing stuff but this is really unrelated to destructors and is more of a “taste” issue.
Databases are resource pooled by the application server and if an exception occurs the application server should recycle everything for me (still Hibernate freaks out when I don’t close the session but it has a configuration option to turn this off).
In networking we have a limited set of connections we can open, but then most networking and connections are pooled by the application server for us and despite building very high volume networked applications you reach the thread limit well before you reach the stream limit (with selectors both problems go away).
On the file system I am aware of some theoretical limitations but I have to say that the only time where I ran into a problem related to an unclosed file was in Mac OS 9 (which is really a special case). Most of the programmers I know have some code like: myMethod(new FileInputStream(file)); somewhere. Even if you don’t have any code like that, it’s probable that one of your libraries has code like that, and surprisingly it scales.

Don’t get me wrong, you should close files and not waste resources even if the OS has plenty but my point is that this is not something to get excited about even for resources other than memory. Current performance issues are no longer related to resource conservation, if you look in the Java message boards even with the relatively advanced developers you will rarely see complaints of resource hogging related to resources other than memory. Is it because these resources are harder to detect? Or maybe because people are less familiar with them?

3) I think the real advantage of laying everything on a GC is in my ability to tune the application later on for better performance. We had several applications that got a HUGE boost (20% – 30% difference) just by changing GC parameters.

The 4 years was a time span in which the problem existed; with 1 million lines of code and weird crashes that only occurred when building release versions (no debugging) there was no way we could detect this… We tried memory tools; the few that worked (crawled) didn’t produce anything so we just left it… There was no one who understood that rather complex piece of code developed by an aeronautics engineer for modeling flight and path envelopes (don’t ask). So finding this particular code was a needle in one huge haystack… I’d love to claim genius on finding that bug but I was lucky: the upgrade to a new version of Visual Studio made the application crash in debug mode, which allowed me to detect the problem… RAII would assume that we actually wrote the code; this is obviously not the case for most real world projects.

4) You can put the stream object on the thread local or even in the exception object and then rethrow the exception to handle it in a generic location. But really that’s a silly idea because you would have to catch the exception anyway, so you might as well handle it rather than propagate it forward. My point was to illustrate other possible solutions for propagating the exception rather than handling it in place.
Shai Almog Says:
August 4th, 2006 at 2:44 am
It just occurred to me now (after all these years…): why doesn’t a file stream close itself before it throws the exception? It knows it would fail, right? Right before the throw statement, include a close (technically it’s thrown in native but same thing).
So I looked a bit in the Mustang source code (an old source tree but still…) and it seems they didn’t do that, but it seems pretty trivial to add this (Mustang submission anyone?). The nice thing is that you can easily create an Input/Output stream that wraps any existing stream and provides this functionality.
adiGuba Says:
August 11th, 2006 at 6:07 pm
Hello.

Even if the finalize() method closes the stream, it’s better to close it immediately at the end of the task, because you cannot determine exactly (and easily) when the GC will finalize the stream…

I use this pattern to correctly close the stream, which has the advantage of using only one catch for all the stuff (close() included):

try
{
    FileInputStream f = new FileInputStream("somefile");
    try
    {
        // Do some stuff
    }
    finally
    {
        f.close();
    }
}
catch(IOException e)
{
    // Frankly, my dear…
}
Infernoz Says:
August 12th, 2006 at 12:54 am
“why doesn’t a file stream close itself before it throws the exception?”

Because a stream may contain state, like a line number, file offset, connection parameters etc., which you need to record when an exception occurs. Exceptions often don’t pass state or provide enough information; that is what a catch clause is for.

The naive comments on this thread suggest that some people here don’t have a clue how important logging is, or knowing why an issue occurred, e.g. via exceptions and object state. This requirement quite validly applies to close() too, so a boolean return value is not OK.
Sree Says:
August 14th, 2006 at 9:44 pm
Catching Uncaught Exceptions
http://www.roseindia.net/javatutorials/catching_uncaught_exceptions_in_jdk_5.shtml
