The Present of D

I use D, as most of my friends know.  I’ve been using it for about five years (though I’ve known of it since 2003).  In those five years, I’ve seen the language and its community grow and change.  It’s had an enormous impact on me as a programmer, since it’s the first real language that I’ve gotten to know intimately.  It’s shaped the way I think about approaching and solving problems.  I think it’s an amazing language, for the most part.

But having watched D’s development over the past five years, I don’t honestly know where it’s going to be five years from now.  The development of the language, compiler, and libraries is so disjointed, incomplete, and unstable that I can’t recommend this language to others in good conscience.  And it’s a Goddamn shame, too, since the language is great!  Most of the issues lie in the practical aspects of implementation rather than in the theoretical aspects of the language’s design.  These practical problems are the kinds of things that you can only find out about by getting your hands dirty with the toolchain and by participating in the community, and I’ve decided to compile many of them together here.

I’ve seen many people say whether or not they think D has a future.  I honestly can’t say whether or not it does.  Having used it for so long, I’m of course biased to think “of course it has a future!  What would I do without it?”  But stepping back and looking at it objectively, the present looks pretty.. well, I’ll let you decide.

  • Compilers.
    • DMD is the reference compiler.  DMD is “source-available,” meaning it’s not open source in the hippie-dippie freedom sense, but all the source is available for you to compile, and if you so wish, modify, fix, and submit patches for.  DMD’s frontend (the part of the compiler responsible for parsing and semantically checking D code) is common to all three compilers.  DMD’s implementation is interesting at best.  It’s written in extremely unidiomatic C++ and has been historically rife with internal compiler errors due to segfaults and assertion failures.  Worse, its semantic analysis doesn’t completely do away with forward-reference problems as the D language specification promises, often forcing otherwise well-organized code to be reorganized in a less-than-optimal fashion, and at the worst, making some code uncompilable.  Worse still, the object format that DMD uses on Windows is antiquated, and the linker even more so, leading to some large object files and complex code simply killing the linker by exceeding its outdated limits.
    • GDC is a D frontend for GCC which used to be the community’s best effort at a fully open-source D compiler.  Sort of.  I guess it never really was, since it was headed by a single guy who was wary of accepting patches and contributions from others.  But the real problem is that this one guy seems to have dropped off the face of the Earth, meaning that GDC has had no meaningful development in almost two years.  Even the most recent code in the repository is fifteen releases behind the reference compiler at the time of writing, and there doesn’t seem to be any indication that this situation will change any time soon.  More and more D code is becoming uncompilable with GDC as it slides into obsolescence.
    • LDC is a D frontend for LLVM which is the community’s current best effort at a fully open-source D compiler, but it’s also shaping up to be much more than that.  LDC takes advantage of LLVM’s powerful optimization capabilities, including adding some D-specific optimization passes.  Being based on DMD’s frontend, it tends to have the same quirks, but the developers have been much more proactive about fixing longstanding DMD frontend bugs.  And best of all, it has a team of 4 or 5 core developers, as well as several other contributors, avoiding the “all eggs in one basket” issue that came up with GDC.  LDC isn’t without its share of problems, however.  The most egregious one – which, to be fair, really isn’t even their fault – is that exceptions are not supported on Windows, limiting it exclusively to POSIXen.
  • Versions.
    • Fun fact: there are two basically incompatible versions of D: D1, which was released at the beginning of 2007, and D2, which was forked later that year and which has been the main focus of development ever since.  D2 brings all manner of new and “interesting” features, the biggest of which include a system of const-correctness (somewhat like C and C++, but .. different), a concept of functional purity, and a frankly groundbreaking change whereby global variables now default to thread-local storage.  Furthermore, D2’s standard library is almost unrecognizable, having changed so much from D1.  The net result of all these (and other) differences is that D1 and D2 are, by now, very different languages.
    • So why does any of this matter?  Because it has essentially split the community in two.  There is D1, which is stable (or, depending on your view of Walter’s idea of stable, it’s stale instead), and for which a relatively large number of libraries have been written.  D1 is also the only version supported by GDC and LDC.  Then there’s D2, which is still a quickly-moving target, with some features not even fully implemented and possibly others on the way.  D2, however, is the version of the language that has been hyped by D’s developers.  What ends up happening is that newcomers find out that this awesome language that they’ve been reading about isn’t really done yet, has virtually no third-party library support, and has only one conforming compiler.  Then they find out that D1 is fairly well-supported, but who wants to learn a language that will be obsoleted by its successor just months from now?
  • Standard Libraries.
    • Another fun fact: there are, in effect, three standard libraries for D overall.  There are two for D1 – Phobos, the default, and Tango, a community-written alternative – and one for D2 – Phobos 2, which, as mentioned in the previous section, is for all intents and purposes completely unrelated to Phobos 1.  What this means is that not only do newcomers have to decide which version of the language they want to use, but if they pick D1, they also have to decide which mutually incompatible set of third-party libraries they want to use, since each third-party library is usually written to only use one standard library.  In practice, the situation isn’t quite that bad, since Tango has become almost the de facto standard library for D1, but there are still some Phobos-only libraries which you would be locked out from using if you chose that route.
    • If you use Phobos 1 with D1, prepare to be underwhelmed.  It provides only the most basic string, math, file, networking, and threading support, and not a lot else.  There are no standard synchronization primitives.  There is no library support for anything other than UTF-8 strings, despite D supporting UTF-8, UTF-16, and UTF-32 natively.  There are no basic algorithmic provisions, array manipulations, or even the simplest of templated containers.  It sports a quirky collection of ill-organized, ad-hoc, and in some cases undocumented modules, for things like obscure database formats you’ve never heard of.  It also has woefully incomplete sets of API headers for Windows, Linux, and OSX.  It has little to no concern for heap allocations, and there are no complexity guarantees specified anywhere.  It’s a very organically-grown library, and it has a lot of weeds as a result.
    • If you do go with Tango, your problems are not over!  Tango is not the default standard library, and has to be built and installed – somewhat awkwardly at that – to replace Phobos.  Many people have endless problems at this stage alone.  But even if you get it installed, you’ve now got to deal with a beta-quality library that’s just as fast-moving a target as D2 is!  Word is that most of the major library reorganization is over – thank God – but breaking changes are still inevitably introduced with disturbing frequency.  Sometimes it’s just a simple renaming of a method or class.  Sometimes it’s the confusing deprecation of a feature that you depended on.  Sometimes it’s a complete redesign of a core API from the ground up.  The documentation is often incomplete, missing, or confusing.  With a very small team of overworked contributors, the number of reported bugs keeps growing while the number of bugs fixed never seems to catch up.  I don’t know if what the Tango team has created is a comprehensive standard library or, given their limited means, a maintenance nightmare.
    • If you use D2, you’re currently limited to Phobos 2, which is being written by Andrei Alexandrescu.  Yes, the Modern C++ Design author.  If you’re at all familiar with his C++ code style, you should have some idea of what he’s doing with Phobos 2: templates everywhere.  Whether that’s a good thing or not is up to you.  I personally like it, but I think a lot of his code shows just how inexpressive and awkward most of D’s metaprogramming facilities really are.  I can only imagine how terse, powerful, and efficient Phobos 2 would be with proper AST macros.
  • Third-party Libraries.
    • There are precious few third-party libraries for D.  Dsource is the repository for most of them, but to be honest, it’s starting to look less like a repository and more like a graveyard.  The truth is most of the projects there are dead.  Now by no means am I blaming this on Dsource – Brad Anderson generously provides the server and bandwidth at no cost to its users, and it’s not his fault that all these projects are dead. There are only a few – MiniD (mine), Derelict, Arclib, gtkD, DWT, DFL, QtD, Blaze, and maybe DWin – which seem to be at least somewhat up-to-date.  Most of the others are either years out-of-date or never even got off the ground.  Some are ancient but still somewhat useable.  But shouldn’t it be a telling sign when you can count the number of actively-developed libraries on two hands?
    • If you use D2, ha!  I think there was one small library developed for it a few months ago.  Something like that.  Most people don’t feel like getting their hands dirty with D2 for fear of their work being invalidated by some Phobos 2 addition or language change.
  • Toolchain.
    • Ohhh, man.  Where should I even begin.
    • DMD is more or less your best bet for development on Windows, and the shortcomings of the OMF object format and OPTLINK linker it uses put a pretty definitive end to more complex template-based metaprogramming.  This is due to the resulting large objects which push the boundaries of both OMF and OPTLINK to their breaking points.  Such problems don’t exist with any other compiler or on any other platform, all of which use more capable formats and linkers.
    • DMD is limited to Windows, Linux, OSX, and FreeBSD.  Okay, not a bad selection of OSes.  But more seriously, it’s limited to producing 32-bit code.  Not a big problem for me, really, but 64-bit machines are already the norm and 64-bit OSes will be soon.
    • Both GDC and LDC can be built as cross-compilers, and I’ve had some experience with this (I built an x86-to-x64 GDC compiler, as well as an x86-to-ARMv5 LDC).  This seems to work pretty well, actually.  Of course, GDC is still ooold.
    • Again, LDC can only be reasonably used on non-Windows OSes since it doesn’t support exceptions on Windows.  There are also several bugs found in it every week, though the developers are very responsive and issues are resolved quickly.
    • Debugging is .. fun.
      • On Windows, DMD comes with WinDbg, and I honestly can’t tell whether or not it’s some kind of practical joke.  I think it’s the debugger from Visual Studio 4 or 5 separated into its own program.  It’s hilariously buggy and outdated.  To debug D programs on Windows, you can kinda use Visual Studio 6 (which has some problems understanding D’s types, exceptions, and some other random issues); there’s ddbg, a simple command-line debugger that works kinda OK but hasn’t been worked on in over a year; and there’s cv2pdb, which converts the CodeView debugging info that DMD outputs into the PDB format that any modern MS debugger uses, and it seems to work pretty well.
      • On Linux, DMD doesn’t output correct debug information.  This is a pretty major problem.  GDC and LDC both output fine debugging info, though most debuggers don’t know anything about D’s types.  GDB can be patched to understand them and to demangle D’s symbol names.  I’m not really a Linux dev, so I don’t know why so many people pop boners over GDB, but there you go.
    • IDEs?  I can’t tell you how many times I’ve been on the D IRC channel and had some noob come in and ask “hey where can i get an ide for d?”  Or even more presumptuously, “what is the standard D IDE?”  Like it or not, there is now a huge contingent of programmers out there who automatically assume that an IDE is part of the deal.  D’s current offerings mostly come in the form of plugins for existing IDEs.  Really the best one available now is Descent, an Eclipse plugin for D.  It’s got some pretty compelling features, like debugging runtime-evaluated code and templates.  That’s just too cool.  Others are Poseidon and support in Code::Blocks.  I think there’s also an addin for Visual Studio but given VS’s relatively rigid nature I don’t think it’s very seamless.
    • Build tools.  D’s module system makes it a natural candidate for automatic dependency detection, and some make-replacements have sprung up to that effect.  Bu[il]d, DSSS/Rebuild, and now xfBuild are the main contenders.  Some other minor ones include jake and di0xide (D0xD?  deeoxid?  I can’t remember what clever spelling it uses).  Bud hasn’t been updated in a long time, which is annoying, since there are some really irritating bugs.  DSSS/Rebuild haven’t been worked on in several months since the maintainer went to grad school and stopped having time.  xfBuild is very new.  Installing and using any of them can be kind of an adventure, just like anything else in the D toolchain.
  • Specification and implementability. (thanks Deewiant (Matti Niemenmaa) for suggesting this after posting)
    • The D specifications are.. messy.  To say the least.  They’re informally written, often with the expectation that the reader is intimately familiar with C++ and its implementations, which was fine when the specs were first written 8 or 10 years ago and the D community consisted almost entirely of such people.  But that’s changed drastically.  Many features are unclearly-specified or are described by a simple one-liner despite their complexity and possible implementation issues.  The spec pages are hard to search and poorly organized.  The snippets of grammar provided throughout the spec are sometimes contradictory or incomplete, and there is no single full-language grammar anywhere.  These are comments I hear about the spec all the time.
    • But this leads me to a much more important point: with such a shaky spec, the language is practically unimplementable at the moment.  There are three compilers which all share the same frontend code.  Wonder why?  Because it seems like half of the language’s behavior is tied up in the rat’s nest that is DMD’s frontend code.  Here are some questions concerning the language’s (and the compiler’s) behavior.
      • When the compiler’s behavior differs from the spec, which is right?  When the compiler does something that’s not mentioned in the spec, should that be assumed to be the correct behavior?  What if that “correct” behavior seems buggy, or incorrect, or useless?  Should other compilers attempt to emulate it?  Since the DMD frontend has so much trouble with forward references, should other compilers also emulate that (nonconforming!) “feature” so as to preserve compatibility with it?
      • How does the .stringof property work?  How are more complex types, like templated types, handled?  If you use .stringof on a constant variable, should you get a string of the variable name or of its value?  (fun: you actually get different behavior on this one in D1 and D2, though it’s never mentioned!)  What spacing and formatting, if any, should we expect in the resulting strings?  Why does getting the string of a function’s parameter type tuple include ‘ref’ and ‘out’ even though they are not type constructors and don’t seem to be accessible in any other way?  Similarly, why does getting the string of an aggregate’s .tupleof give the names of the fields?  Can we depend on other compilers duplicating this behavior or is this the result of how DMD’s internal code representation works?  Shouldn’t there be other, more canonical means of getting access to this information?  If there are, why are there two (or three or four) ways to do it?
      • What about cases of ambiguous parse trees?  Which parse is preferred, or should the compiler flag an error?  If C-style function pointer declarations are the only thing that cause these ambiguous parse trees, why are they allowed?  Why bother preserving C source compatibility at the declaration syntax level when you’re going to be running a tool to convert C headers over anyway (since the type names are completely different)?  Why not just let the tool convert the crappy C-style syntax into the much better D-style syntax?  Is conversion of C headers really so important as to preserve a piece of syntactic history that causes nothing but confusion and problems?
      • Why are there so many [contradictory, overlapping, inadequate] means of doing compile-time introspection and metaprogramming?  Why aren’t all the metaprogramming features aggregated in one place?  Why are amazingly useful metaprogramming features like .tupleof given an incomplete, incorrect, one-line description in the spec?  Are there any plans to unify these features?  Is there any justification for having multiple disjoint methods of acquiring information?
    • I think this is one of the main reasons why Dil, which started out so well and promising, seems to have hit a brick wall.  D was supposed to have been a language that was designed from the perspective of a compiler designer, and as such, was meant to be easy to implement.  It’s anything but that.  The spec provides so little useful implementation information, and the behavior of the DMD frontend seems so odd and unemulatable at times, that it seems almost impossible to make a new implementation of D, at least beyond the simplest and most basic features.

“Man, you can complain like no one else!”  I’ve done what I can to improve the situation.  I’ve developed my library, and provided input on language and library design, and made some (admittedly weak) attempts to help out on other projects.  All I’m doing with this post is pointing out what the current problem areas are in D’s current incarnation.  I don’t even pretend to have the time or skills to solve most of these problems.

“These are just implementation-specific issues.  A better implementation wouldn’t have these problems.”  Thanks for pointing out the obvious.  Where’s the better implementation?  D’s been in development for almost ten years.  The current situation is pretty shameful.

Let’s not get ahead of ourselves by asking whether or not D has a future.  Let’s fix the problems it has now.  And if we fix them, then D will at least have a fighting chance of having a future.