February 24, 2005

Ruby-style iterators in Java

Loop abstraction is probably the most common application of blocks in Ruby. The idea is that the programmer should not have to deal with the iteration code itself and can concentrate instead on what needs to be done at each step. It's typically a good idea.

This is what a standard pre-Java 5 iteration looks like:

Iterator i = collection.iterator();
while (i.hasNext()) {
     Object obj = i.next();
     ...
}

Or, for iterating over an array:

for (int i = 0; i < array.length; ++i) {
    Object obj = array[i];
    ...
}

These are clearly prone to programming errors. Forget to call i.next(), call it twice, use <= instead of <, or forget the ++, and all goes to hell.

Iteration has gotten significantly better in Java 5 with the introduction of the for-each construct, but it is still a long way from Ruby's internal iterators:

for (String str : listOfStrings) {
    ...
}

A little bit of nomenclature here. External iterators are those where the client code controls how the iteration takes place (e.g., the Java way). Internal iterators, on the other hand, abstract that logic and only require the client code to supply the action to perform at each step of the iteration (e.g., the Ruby way).

Take a look at the iterator idiom in Ruby:

array.each { |value|
     ...
}

Much cleaner, and no need for so much boilerplate, don't you think?

In the majority of cases, what a program really needs are internal iterators. From a code reuse perspective, why replicate the same construct over and over? But make no mistake, there are legitimate cases where external iterators are necessary, such as when looping over several collections in parallel (try doing that with internal iterators).
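To make the parallel case concrete, here is a minimal sketch of two lists being walked in lockstep with external iterators. The class and method names (ParallelIteration, pairUp) are made up for illustration and are not part of any library:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ParallelIteration {

    // Walks two lists in lockstep with external iterators and pairs up
    // their elements, stopping as soon as either list runs out.
    static List<String> pairUp(List<String> names, List<Integer> ages) {
        List<String> pairs = new ArrayList<String>();
        Iterator<String> nameIt = names.iterator();
        Iterator<Integer> ageIt = ages.iterator();
        while (nameIt.hasNext() && ageIt.hasNext()) {
            pairs.add(nameIt.next() + "=" + ageIt.next());
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob", "cy");
        List<Integer> ages = Arrays.asList(31, 27);
        System.out.println(pairUp(names, ages)); // prints [ann=31, bob=27]
    }
}
```

Because the client code owns both iterators, it decides when each one advances. An each-style method gives the caller no such control, which is why this case is awkward with internal iterators.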

Ruby provides a way of externalizing internal iterators. Using the Generator class, an Enumerable object (i.e., one that exposes an each method) can be transformed into an external iterator:

require 'generator'

generator = Generator.new [1,2,3,4,5,6,7]

while generator.next?
     puts generator.next
end


Internal iterators in Java

So, how about internal iterators for Java? The first thing we need to look at is how to encapsulate a chunk of code in a form that can be passed around (a la Ruby blocks).

The closest we can get are anonymous inner classes, but it's not even fair to compare them with Ruby blocks. The latter are a lot more flexible, without the clunky syntax. Here's how we could emulate a general-purpose block in Java 5:

interface Block {
     Object call(Object... items);
}

...

void methodThatTakesBlock(Block block)
{
     block.call("hello", "world");
}

...

Block block = new Block() {
     public Object call(Object... items) {
         return items[0].toString() + items[1].toString();
     }
};

methodThatTakesBlock(block);

First, you'll notice we're declaring a Block interface. This is needed in order to be able to invoke the "call" method on the block (it would be possible to do this via reflection, too, but that's another story).
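For the curious, the reflection route might look roughly like the sketch below. It assumes the block object exposes a public two-argument method named "call" and shares no interface with the caller. Concat and invokeCall are hypothetical names used only for this illustration:

```java
import java.lang.reflect.Method;

class ReflectiveBlockDemo {

    // A block-like object that deliberately implements no shared interface.
    static class Concat {
        public Object call(Object first, Object second) {
            return first.toString() + second.toString();
        }
    }

    // Looks up a public method named "call" taking two Objects and invokes it.
    static Object invokeCall(Object block, Object first, Object second) throws Exception {
        Method call = block.getClass().getMethod("call", Object.class, Object.class);
        call.setAccessible(true); // sidesteps access checks on non-public classes
        return call.invoke(block, first, second);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invokeCall(new Concat(), "hello", "world")); // prints helloworld
    }
}
```

The price is that a missing or misnamed method only shows up at run time as a NoSuchMethodException, whereas the interface version fails at compile time.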

The use of varargs helps avoid having to declare a block class or method for each possible combination of arguments and return values, but places the burden of extracting the arguments from the array on the implementer of the block.
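As a rough illustration of that burden, here is what a careful implementer ends up writing. The Block interface is the one shown above, nested here so that the example is self-contained. JOIN_TWO and the argument checks are assumptions made for this sketch:

```java
class VarargsBlockDemo {

    interface Block {
        Object call(Object... items);
    }

    // The block has to validate the argument count and cast every argument
    // itself, because the compiler only sees an Object[].
    static final Block JOIN_TWO = new Block() {
        public Object call(Object... items) {
            if (items.length != 2) {
                throw new IllegalArgumentException(
                        "expected 2 arguments, got " + items.length);
            }
            String first = (String) items[0];   // ClassCastException if not a String
            String second = (String) items[1];
            return first + " " + second;
        }
    };

    public static void main(String[] args) {
        System.out.println(JOIN_TWO.call("hello", "world")); // prints hello world
    }
}
```

None of these checks exist at compile time, so a caller passing the wrong number or type of arguments only finds out when the block runs.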

For what we're trying to achieve here, we're going to stick with non-vararg arguments. In fact, our block only needs one argument, namely, the element that's being iterated over at each step. Here it goes:

import java.util.ArrayList;
import java.util.Collection;

interface IterationCallback<E>
{
     void call(E element);
}

interface EnumerableCollection<E>
     extends Collection<E>
{
     void each(IterationCallback<E> callback);
}

class EnumerableArrayList<E>
     extends ArrayList<E>
     implements EnumerableCollection<E>
{

     public void each(IterationCallback<E> callback)
     {
         for (E element : this) {
             callback.call(element);
         }
     }
}

And here's how to use it:

EnumerableCollection<Integer> list = new EnumerableArrayList<Integer>();

list.add(1);
list.add(2);
list.add(3);
list.add(4);
list.add(5);

list.each(new IterationCallback<Integer>() {
     public void call(Integer element) {
         System.out.println(element);
     }
});

So, it is possible, but nobody said it would be pretty. Better language support would be needed to make internal iterators as developer-friendly as they are in Ruby. And we're not even considering the performance implications: compared with Java's iterator idiom, the code above instantiates an extra object (the callback) and makes an extra method call for every element. That could have a significant impact if used in the wrong place, such as a tight inner loop.
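To show where those costs sit, here is a small self-contained sketch. It uses a static helper instead of a collection subclass. The helper each, the Summer class and the sum method are all hypothetical names for this example. Note that the running total lives in a field, because an anonymous or inner class cannot assign to a local variable of the enclosing method the way a Ruby block can:

```java
import java.util.Arrays;
import java.util.List;

class InternalIterationDemo {

    interface IterationCallback<E> {
        void call(E element);
    }

    // Hypothetical helper: applies the callback to every element of any Iterable.
    static <E> void each(Iterable<E> items, IterationCallback<E> callback) {
        for (E element : items) {
            callback.call(element); // the extra method call, once per element
        }
    }

    // A callback that accumulates its result in a field.
    static class Summer implements IterationCallback<Integer> {
        int total = 0;

        public void call(Integer element) {
            total += element;
        }
    }

    static int sum(List<Integer> numbers) {
        Summer summer = new Summer(); // the extra object, once per loop
        each(numbers, summer);
        return summer.total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4, 5))); // prints 15
    }
}
```

The extra object is allocated once per loop rather than once per element, so in this sketch the per-element cost is the interface call.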

Comments:

Blogger Anthony said...

I still think Python beats the pants off of both Ruby and Java:

for item in list:
  print item

March 06, 2005 1:56 AM  