Tuesday, January 8, 2008

Is Java Dumb?

Slashdot reported an essay on Computer Science education that bemoans the average quality of CS graduates.  Of course, the first and most obvious points are not about tools.  But is there something wrong with Java?  Well, actually, yes there is.  It's not in the syntax of the Java Programming Language, and it's not in the Java Virtual Machine; it's in the style of the design and implementation of the class libraries.

The syntax of the Java Programming Language provides a high level, systemic or architectural layer for reasoning about the organization of application systems. 

The Java Virtual Machine delivers a general, high performance facility whose runtime has in recent years been seen, in some cases, to outstrip the performance of source equivalent applications written in C and C++.  This fact alone is the first and best honor for the JVM.  As a machine interface or Instruction Set Architecture (ISA), one can compare the JVM to other machine abstractions like the more recent development of the Dis virtual machine.  The JVM is pretty good.  Its weaknesses are best illustrated by the difficulty that's been had in a handful of cases implementing the JVM in hardware.  These efforts require pointers, that is, machine addresses for memory and devices, and on this and other issues the JVM has proved to be at cross purposes with hardware, contrary to remarks circa 1995 that the JVM was designed to be compatible with implementation in hardware.  That idea holds for the syntax and semantics of its ISA, but not in the more precise sense that includes the implications of its ISA.  The JVM is great because it has been and remains an independent and reasonable Instruction Set Architecture (no bad, messy or unpredictable instructions).  It is independent from the kind of issues prevalent in the Java class libraries.

The general style of programming in the class libraries, on the other hand, has been what it is since the first alpha source dump.  In having classes where there could have been interfaces, the Java class libraries obviously espouse a singular usage pattern: a vision of Java programming as beany scripting.  When the Java core APIs don't set an example of good style by using interfaces where applicable, the example is poor by any standard.  Evidently there exists an argument in favor of document driven, code generating IDEs.  The Java platform is many things to many people, and would have been best served in 1995 by an independent script language in which the beany platform was realized.  Everyone was too busy to realize the best possible future for us all, and JavaScript emerged as altogether something else.

The Java 1.0 classes java.io.OutputStream and java.io.InputStream are classes that should have been interfaces.  The classes do nothing an implementor wouldn't just as soon do anyway: implementing skip(long) over bytes in the input stream, or throwing an exception for a bad call to write(byte[],int,int) in the output stream.  Their reason for existence isn't providing functionality, or at least the benefit of that functionality is far outweighed by the benefit of their being interfaces.  Arguments in favor of their being classes aren't in the domain of their best engineering.  One can imagine that they're classes in the beany argument, so that programmers working from documentation can subclass in the singular usage model.  And one can imagine that if as many of the core classes were interfaces as would have been best in terms of their own engineering, then that core could be perceived as an open invitation to alternative implementations.  If InputStream and OutputStream were interfaces, then implementors like my own bbi and bbo would instantiate one class instead of two at runtime.
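A minimal sketch of the "one class instead of two" point, under stated assumptions: ByteSource, ByteSink and Pipe are hypothetical names invented for illustration. They are not part of the JDK and are not the bbi and bbo classes mentioned above. Because a Java class can extend only one class but implement many interfaces, a single object can serve as both ends of an in-memory stream only when the contracts are interfaces.

```java
import java.io.IOException;
import java.util.Arrays;

// Hypothetical interfaces (not in the JDK): stream contracts declared
// as interfaces instead of abstract classes.
interface ByteSource {
    int read() throws IOException;   // next byte as 0..255, or -1 when empty
}

interface ByteSink {
    void write(int b) throws IOException;
}

// One object acts as both the readable and the writeable end. With the real
// java.io.InputStream and java.io.OutputStream, both abstract classes, this
// takes two objects, since a class cannot extend both.
final class Pipe implements ByteSource, ByteSink {
    private byte[] buf = new byte[16];
    private int readPos = 0;
    private int writePos = 0;

    @Override
    public void write(int b) {
        if (writePos == buf.length) {
            buf = Arrays.copyOf(buf, buf.length * 2);  // grow when full
        }
        buf[writePos++] = (byte) b;
    }

    @Override
    public int read() {
        return (readPos < writePos) ? (buf[readPos++] & 0xFF) : -1;
    }
}

class PipeDemo {
    public static void main(String[] args) throws IOException {
        Pipe pipe = new Pipe();
        ByteSink out = pipe;     // the same instance seen as a sink...
        ByteSource in = pipe;    // ...and as a source
        out.write('h');
        out.write('i');
        System.out.println((char) in.read());  // prints h
        System.out.println((char) in.read());  // prints i
        System.out.println(in.read());         // prints -1
    }
}
```

The sketch leaves out bulk reads and writes, skipping and closing; it is only meant to show the shape of the argument.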

What if the 1.0 AWT classes had been interfaces?  The AWT is a well known problem child, with more quirks to hack around than predictable behavior.  The class based implementation as it stands writes these problems into compatibility cement.  That is as intended, in that the point of the Java platform is to have identical (write once run anywhere) behavior everywhere.  But for Sun's internal engineering, had the AWT been interfaces, they would have been in a better place to solve their problems as bugs rather than cement them into the history of the platform.

There are many problem cases among the Java class libraries, including my own most recent favorite, java.nio.ByteBuffer, which should have been an interface.  Like all good examples of classes that should have been interfaces, the class implementation is shallow and poor.  (Shallow but good has the favor of formalizing particular semantics, for example with respect to what exceptions are thrown and in which cases.)  The NIO ByteBuffer has three fields, implementing an in-memory readable and writeable byte buffer.  My own gnu.iou.bbuf is more interesting (in counterpoint here), implementing semantics in which the buffer is writeable following a read and readable following a write.  This read/write buffer is illustrated in the bbuf streams, bbi and bbo.  Plainly the author of the NIO ByteBuffer was not interested in diving into writing a read/write byte buffer.  And rightly so: the package is for exposing the kernel primitive "sendfile" operator present in most operating system kernels for five years or more.  And that's where an interface would have been A Good Thing.  Or at least a class that could be subclassed.
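A small sketch of why java.nio.ByteBuffer cannot be subclassed from outside its package: it is an abstract class that declares no public or protected constructor, so code outside java.nio has nothing to call from a subclass constructor. The ByteBufferProbe class and its helper method are hypothetical names for illustration. Only the reflection calls (getDeclaredConstructors, getModifiers and the java.lang.reflect.Modifier tests) are standard JDK API.

```java
import java.io.InputStream;
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;

class ByteBufferProbe {
    // True if the type declares at least one public or protected constructor,
    // which is what a subclass in another package would need to call.
    static boolean hasSubclassVisibleConstructor(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            int mods = c.getModifiers();
            if (Modifier.isPublic(mods) || Modifier.isProtected(mods)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("ByteBuffer is abstract: "
                + Modifier.isAbstract(ByteBuffer.class.getModifiers()));
        System.out.println("ByteBuffer has a constructor visible to outside subclasses: "
                + hasSubclassVisibleConstructor(ByteBuffer.class));
        // For contrast: InputStream is a class too, but an openly extensible one.
        System.out.println("InputStream has a constructor visible to outside subclasses: "
                + hasSubclassVisibleConstructor(InputStream.class));
    }
}
```

The practical consequence is that an alternative buffer, such as one with different read/write semantics, cannot be passed where the NIO APIs expect a ByteBuffer.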

The aggressive possessiveness inherent in the definition of the NIO package makes it inutile.  And here's why.  To expose native I/O primitives in an extensible way, the author of NIO has constructed an extensive framework for reading and writing files and bytes.  Most of the architecture of the NIO framework is oriented to reading and writing individual data values like numbers, for example the following list of classes.


These classes are representations of the 64-bit IEEE 754 "double" floating point value in the NIO framework.  The NIO package requires the creation of new objects containing new buffers for the I/O of each individual data value.  Any operation between NIO and in-memory data requires a new NIO buffer to interface to the NIO package.  This is not A Good Thing, because the objective of a high performance system is to cut the number of buffers in the total process pipeline.  Ideally to one.  Using NIO, achieving one in-memory data buffer is not possible, except possibly in the case of copying files to sockets.
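To make the cost concrete, here is a hedged sketch of reading one double out of a plain byte array, first through NIO and then directly. DoubleCodec and its methods are hypothetical names for illustration. ByteBuffer.wrap, ByteBuffer.getDouble(int), Double.longBitsToDouble and Double.doubleToRawLongBits are standard JDK calls. Note that wrap shares the underlying array and does not copy it, but it still allocates a new ByteBuffer object for every array handed to NIO.

```java
import java.nio.ByteBuffer;

class DoubleCodec {
    // Through NIO: the array must first be wrapped in a new ByteBuffer object.
    // A freshly wrapped buffer uses big-endian byte order by default.
    static double viaNio(byte[] data, int offset) {
        return ByteBuffer.wrap(data).getDouble(offset);
    }

    // Without NIO: assemble the eight big-endian bytes in place,
    // with no intermediate buffer object.
    static double direct(byte[] data, int offset) {
        long bits = 0L;
        for (int i = 0; i < 8; i++) {
            bits = (bits << 8) | (data[offset + i] & 0xFFL);
        }
        return Double.longBitsToDouble(bits);
    }

    // Store a double as eight big-endian bytes starting at offset.
    static void put(byte[] data, int offset, double value) {
        long bits = Double.doubleToRawLongBits(value);
        for (int i = 7; i >= 0; i--) {
            data[offset + i] = (byte) bits;
            bits >>>= 8;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        put(data, 8, 3.25);
        System.out.println(viaNio(data, 8));  // prints 3.25
        System.out.println(direct(data, 8));  // prints 3.25
    }
}
```

Both paths read the same bytes and agree on the value. The difference is only in how many objects sit between the application's data and the I/O call.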

So NIO is yet another problem case in the Java class libraries.  Not because it's not good at copying files to sockets; it excels at that.  But because an "SE" core framework was released that no one but a JRE/JDK can reach into.  Among the large number of Java APIs there are many, many cases of a strange separation between abstraction (for API users) and implementation (for API vendors).  This makes too little sense, as generally useful API implementations become commodities anyway.

For reasons like these and so many more of the same, this writer would say that there's a distinct lack of serious engineering discipline in the development of the JDK.  Everything's personal, and far too little is rational.  Discussion of the further implications of the example made by "Java" in its libraries is beyond the scope of this essay.  But certainly alternative languages and libraries have emerged and continue to emerge to redress the subject.
