Types

There are three classes of types in D: basic types, derived types, and user-defined types.

Basic Types

Basic types are small values which generally map to one or maybe two CPU registers. All other types are ultimately composed of these basic values.

Basic types are subdivided into four classes: void, boolean, numeric, and character.

Void

void represents the absence of a type. As in most other C-style languages, it is used to indicate that a function does not return a value.

While it's not possible to have a value of type void, it is however possible to have both pointers to and arrays of void. A void pointer is simply an untyped pointer which can point to anything.

Boolean

bool is the boolean type in D. It is one byte in size and can only hold the values true and false. Boolean values can be implicitly converted to an integral type, with false becoming 0 and true becoming 1. Conversely, the integer literals 0 and 1 can be implicitly converted to bool.
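The conversions described above can be sketched as follows. This is a minimal illustration assuming a current D compiler; the helper name toInt is made up for the example.

```d
// Returns the integer a bool converts to; the conversion is implicit.
int toInt(bool b)
{
    int i = b; // false becomes 0, true becomes 1
    return i;
}

void main()
{
    assert(toInt(false) == 0);
    assert(toInt(true) == 1);

    bool off = 0; // the integer literals 0 and 1 convert implicitly to bool
    bool on = 1;
    assert(!off && on);

    static assert(bool.sizeof == 1); // one byte in size
}
```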

RFC: the original spec says that pretty much only logical operators can accept values of type bool, but then immediately afterwards says that bools can be implicitly converted to int, meaning that in effect, any arithmetic operator can accept bools. This makes it legal to write nonsense like "4 + true". Why exactly is bool implicitly convertible to int?

Numeric

There are several basic numeric types in D. First are the integral types:

Name/Keyword    Description
byte            8-bit signed
ubyte           8-bit unsigned
short           16-bit signed
ushort          16-bit unsigned
int             32-bit signed
uint            32-bit unsigned
long            64-bit signed
ulong           64-bit unsigned
cent            128-bit signed
ucent           128-bit unsigned

All values of these types default to 0. The size of each type is fixed and does not vary from platform to platform, unlike in C.
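The fixed sizes and zero defaults can be checked directly. The sketch below assumes a current D compiler; defaultOf is a hypothetical helper, not part of the language or its library.

```d
// Default-initializes a value of type T and returns it.
T defaultOf(T)()
{
    T value;
    return value;
}

void main()
{
    // Sizes are in bytes and are the same on every platform.
    static assert(byte.sizeof == 1 && ubyte.sizeof == 1);
    static assert(short.sizeof == 2 && ushort.sizeof == 2);
    static assert(int.sizeof == 4 && uint.sizeof == 4);
    static assert(long.sizeof == 8 && ulong.sizeof == 8);

    assert(defaultOf!int() == 0);
    assert(defaultOf!ulong() == 0);
}
```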

cent and ucent are not currently implemented and are reserved for future use. RFC: where did these names come from, anyway? And what's the plan for any other integer sizes should the need arise? More qualitative/goofy names like "quitelong"?

WANT: the compiler actually also defines size_t and ptrdiff_t internally as built-in types. However, these are not exposed as built-in types to the programmer, and are usually defined in object.d instead. In addition, it turns out that having a "platform-native integer" is a fairly useful concept; should these be given nicer names, like word and uword, and be made keywords?

Next are the floating-point types:

Name/Keyword    Description
float           Single-precision IEEE 754 float
double          Double-precision IEEE 754 float
real            Largest hardware-implemented IEEE 754 float, or double, whichever is larger

D uses IEEE 754-compliant floating-point arithmetic. All values of these types default to NaN, an invalid value.

real may be the same as double on some platforms, or it may be larger. For instance, on x86 CPUs, real is an 80-bit extended-precision number.
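A short sketch of the NaN default and the size relationship between the floating-point types, assuming a current D compiler. The helper isDefaultNaN is hypothetical; it relies on the IEEE 754 rule that NaN is the only value that compares unequal to itself.

```d
// True if a default-initialized T is NaN.
bool isDefaultNaN(T)()
{
    T value;
    return value != value; // only NaN is unequal to itself
}

void main()
{
    assert(isDefaultNaN!float());
    assert(isDefaultNaN!double());
    assert(isDefaultNaN!real());

    static assert(float.sizeof == 4);
    static assert(double.sizeof == 8);
    static assert(real.sizeof >= double.sizeof); // real is never smaller than double
}
```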

RFC: half- and quad-precision floats are specified by the new IEEE 754r spec. Should the keywords half and quad be reserved in advance?

Last are the complex and imaginary types:

Name/Keyword    Description
ifloat          Imaginary number represented as a float
idouble         Imaginary number represented as a double
ireal           Imaginary number represented as a real
cfloat          Complex number represented as a pair of floats
cdouble         Complex number represented as a pair of doubles
creal           Complex number represented as a pair of reals

The ifloat, idouble, and ireal types represent imaginary numbers. Float literals such as 4.3i are, for instance, of type idouble. The cfloat, cdouble, and creal types represent complex numbers; that is, they have both a real and an imaginary part.

All of these types default to NaN (and for complex types, both components do).

Character

There are three character types in D.

Name/Keyword    Description                  Defaults to
char            Single UTF-8 code unit       0xFF
wchar           Single UTF-16 code unit      U+FFFF
dchar           Single UTF-32 code unit      U+FFFF

char and wchar are not entirely useful on their own, as each may hold only part of a multi-unit encoding. They are usually used instead as array elements, where arrays of them represent strings. dchar, however, can hold any Unicode code point. The initialization value for wchar and dchar was chosen because U+FFFF is guaranteed never to be a Unicode character. char defaults to 0xFF because that byte never appears in valid UTF-8.
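The difference between code units and code points shows up in array lengths. The sketch below assumes a current D compiler, where string, wstring, and dstring are the library aliases for arrays of char, wchar, and dchar; utf8Units is a made-up helper.

```d
// Number of UTF-8 code units (not characters) in a string.
size_t utf8Units(string s)
{
    return s.length;
}

void main()
{
    char c;
    wchar w;
    dchar d;
    assert(c == 0xFF);   // default initializers
    assert(w == 0xFFFF);
    assert(d == 0xFFFF);

    // U+00E9 needs two UTF-8 code units but one UTF-16 or UTF-32 unit.
    assert(utf8Units("\u00E9") == 2);
    wstring ws = "\u00E9"w;
    dstring ds = "\u00E9"d;
    assert(ws.length == 1);
    assert(ds.length == 1);
}
```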

Derived Types

Derived types are created from other types. Syntactically, they are always written as a suffix on another type (except in C-style array/function pointer declarations). In the following sections, the metasyntactic type T is used as the type which the derived type is extending.

Pointers (T*)

A pointer is, in short, a memory address. Rather than holding a value of type T itself, it holds the address of a value of type T; it points to a T. More information on pointers is available in Pointers.

Fixed-Size Arrays (T[n])

A fixed-size array is an array of T whose length (the value n) is determined at compile time and cannot be changed. Fixed-size arrays are neither purely value types nor purely reference types, a compromise made for compatibility with C's fixed-size arrays. String literals in D are fixed-size arrays of characters. More information on fixed-size arrays is available in Arrays.

Dynamic Arrays (T[])

A dynamic array is an array of T whose length is determined at runtime and can be changed. Strings in D are just dynamic arrays of characters. More information on them is available in Arrays.
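As a rough illustration of runtime-sized arrays, assuming a current D compiler (appended is a hypothetical helper):

```d
// Returns a copy of the array with one more element appended.
int[] appended(int[] a, int value)
{
    return a ~ value; // concatenation allocates a new, longer array
}

void main()
{
    int[] a = [1, 2, 3];
    a ~= 4;                 // appending grows the array in place or reallocates
    assert(a.length == 4);

    a.length = 2;           // the length can also be assigned directly
    assert(a == [1, 2]);

    assert(appended(a, 9) == [1, 2, 9]);

    char[] s = "hello".dup; // a mutable string is a dynamic array of char
    s[0] = 'j';
    assert(s == "jello");
}
```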

Associative Arrays (T[U])

An associative array maps from one arbitrary type (the 'key' type, U) to another (the 'value' type, T). They're commonly called hashtables (somewhat inaccurately), tables, dictionaries, or maps in other languages. They are dynamic in size and sparsely populated. More information on them is available in Associative Arrays.
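A small sketch of an associative array with string keys and int values, assuming a current D compiler. The function countWords is invented for the example.

```d
// Counts how often each word occurs; the key type is string, the value type int.
int[string] countWords(string[] words)
{
    int[string] counts;
    foreach (w; words)
    {
        if (auto p = w in counts) // 'in' yields a pointer to the value, or null
            ++*p;
        else
            counts[w] = 1;        // assigning to a missing key inserts it
    }
    return counts;
}

void main()
{
    auto counts = countWords(["a", "b", "a"]);
    assert(counts["a"] == 2);
    assert(counts["b"] == 1);
    assert(counts.length == 2);
    assert(("z" in counts) is null);
}
```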

Function Types

Function types are strange. You can't declare a variable or an array of function types, and it's difficult to get a function type (there is no type syntax for them). The only really useful thing you can do with a function type is to have a pointer to it, as explained in the next section.

Function Pointers (T function(U...))

A function pointer is exactly that: the memory address of a function. T is the return type of the function, and U is zero or more parameter types. More information on them is available in Functions.

Delegates (T delegate(U...))

A delegate is similar to a function pointer, but it has an extra piece of data associated with it: the "context" in which the pointed-to function is to execute. This can be either an object, if the delegate refers to an object's member function, or a stack frame, if the delegate refers to a nested function. I'm sure I'm just confusing you now, so for more information, see Functions.
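The two kinds of delegate context, alongside a plain function pointer, might look like this. It is a sketch assuming a current D compiler; twice, Counter, and applyToTen are names invented for the example.

```d
int twice(int x) { return x * 2; }

class Counter
{
    int total;
    int add(int x) { total += x; return total; }
}

// Hypothetical helper: calls any matching delegate with the argument 10.
int applyToTen(int delegate(int) dg) { return dg(10); }

void main()
{
    int function(int) fp = &twice;  // plain address of a function, no context
    assert(fp(21) == 42);

    auto c = new Counter;
    int delegate(int) dg = &c.add;  // context is the object c
    assert(dg(5) == 5);
    assert(dg(5) == 10);

    int offset = 3;
    int addOffset(int x) { return x + offset; } // nested function
    assert(applyToTen(&addOffset) == 13);       // context is main's stack frame
}
```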

User-Defined Types

Last are types which are defined by the user. They are all composed of or derived from the previous two kinds of types. They are as follows:

Aliases

An alias is a renaming of an existing type to something else. The new name is just a shorthand for the original type, and no new type is actually created. This is similar to the typedef found in C and C++.

The 'alias' mechanism in D is actually more general than described here. For more information, see Aliases.
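A brief sketch showing that an alias introduces only a new name, not a new type. It assumes a current D compiler; Meters and describe are illustrative names.

```d
alias int Meters; // Meters is just another name for int

int describe(int x) { return 1; }
// A second overload taking Meters would be a duplicate definition,
// because Meters and int are the same type.

void main()
{
    static assert(is(Meters == int));

    Meters m = 5;
    int i = m;                // no conversion needed
    assert(i == 5);
    assert(describe(m) == 1); // resolves to the int overload
}
```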

Typedefs

A typedef takes an existing type and creates a new type which, while being implemented the same way, is semantically distinct from the original type, and can for instance distinguish function overloads.

typedef int X; // X is the new type
void foo(int x) { ... }
void foo(X x) { ... } // this is a legal overload

foo(5); // calls the first overload
X x;
foo(x); // calls the second overload

In addition, typedefs can supply a different initialization value from the original type.

typedef int X = 5;
X x; // initialized to 5 instead of 0

Enumerations

An enumeration is a set of integral constants. The syntax and behavior of enumerations is inherited mostly from C and C++, which might be a good or a bad thing. For more information, see Enums.

Structs and Unions

Structs and unions are how you define "plain old data" (POD), or value types, in D. They do not have inheritance or polymorphism, and provide guarantees on and options for data alignment and ordering. For more information, see Structs and Unions.

Classes

Classes are the basis of D's object-oriented capabilities. Unlike structs and unions, they are always accessed by reference; they may inherit; they may have polymorphic method calls; and they do not provide any control over data alignment or ordering. For more information, see Classes.
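The practical difference between a value type and a reference type can be sketched as follows, assuming a current D compiler. PointS, PointC, and the two helper functions are invented for the example.

```d
struct PointS { int x; } // value type
class PointC { int x; }  // reference type

// Returns the original's x after a copy of the struct has been modified.
int structAfterCopyEdit()
{
    PointS a;
    a.x = 1;
    PointS b = a; // copies the data
    b.x = 2;
    return a.x;
}

// Returns the original's x after a second reference has been modified.
int classAfterCopyEdit()
{
    auto a = new PointC;
    a.x = 1;
    PointC b = a; // copies the reference only
    b.x = 2;
    return a.x;
}

void main()
{
    assert(structAfterCopyEdit() == 1); // the original struct is untouched
    assert(classAfterCopyEdit() == 2);  // both names refer to one object
}
```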

Type Conversions

Since D has numerous types, you will sometimes need to convert from one type to another.

Base Types

Sometimes the term base type will be used. The base type of an enumeration is the type from which it was derived (see Enums for information). The base type of a typedef is the type which it renames.

Pointer Conversions

It is legal to cast between pointer and non-pointer types in D. However, the semantics of casting a pointer that refers to a garbage-collected object to a non-pointer type are undefined.

Implicit Conversions

A strong type system is useful for catching errors at compile time, but a type system that is too strong can feel like a straitjacket, forcing unnecessary verbosity and conversions in cases that should obviously "just work." Implicit conversions punch convenient holes in the type system in order to eliminate the necessity to perform explicit conversions in many cases.

Pointers

Any pointer type may be implicitly converted to void*.

Arrays

A fixed-size array may be implicitly converted to a dynamic array with the same element type, or with an element type that is a supertype of the element type.

A dynamic array may be implicitly converted to a dynamic array with an element type that is a supertype of its element type.

Both fixed-size and dynamic arrays are implicitly convertible to void[]. A void[] uses the smallest addressable unit as the size of a single array element (i.e. on most architectures, this would be a byte, and so a void[]'s length would be measured in bytes).
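The pointer and array conversions above can be sketched like this, assuming a current D compiler on a platform where the smallest addressable unit is a byte. byteLength is a hypothetical helper.

```d
// Length of the data when viewed as untyped memory.
size_t byteLength(void[] raw)
{
    return raw.length;
}

void main()
{
    int x = 7;
    int* p = &x;
    void* v = p;                   // any pointer converts implicitly to void*
    assert(*cast(int*)v == 7);     // going back requires an explicit cast

    int[3] fixed = [1, 2, 3];
    int[] dyn = fixed;             // fixed-size to dynamic, same element type
    assert(dyn.length == 3);

    assert(byteLength(dyn) == 12); // 3 ints * 4 bytes each
}
```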

Enums

See Enums for information on enum implicit conversion rules.

Integer Conversions

There are several integral types, so it's nice to have conversions between some of them. The process of converting smaller integral types to larger ones is called promotion. The following implicit conversions exist:

From      To
bool      int
byte      int
ubyte     int
short     int
ushort    int
char      int
wchar     int
dchar     uint
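The effect of promotion is visible in the static type of an expression. The sketch below assumes a current D compiler; widen is a made-up helper.

```d
// A short converts implicitly to int.
int widen(short s)
{
    return s;
}

void main()
{
    // Operands smaller than int are promoted to int before the operation.
    static assert(is(typeof(byte.init + byte.init) == int));
    static assert(is(typeof(short.init + ubyte.init) == int));
    static assert(is(typeof(char.init + char.init) == int));
    static assert(is(typeof(short.init | byte.init) == int));

    byte b = 100;
    auto sum = b + b;   // computed as int, so it does not wrap at 127
    assert(sum == 200);

    assert(widen(-5) == -5);
}
```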

RFC: should the other direction - implicit narrowing conversions - be a warning or an error? DMD treats them as warnings, but they're almost always a bug.

RFC: why are there implicit conversions from character types to ints? This has proven very tedious in some metaprogramming and function overloading situations with no apparent benefits.

When implicitly converting the product of integer promotion, the target type must be able to represent the value's bit pattern, otherwise it's an error. For example:

ubyte  u1 = cast(byte)-1;  // error, -1 cannot be represented in a ubyte
ushort u2 = cast(short)-1; // error, -1 cannot be represented in a ushort
uint   u3 = cast(int)-1;   // ok, -1 can be represented in a uint
ulong  u4 = cast(long)-1;  // ok, -1 can be represented in a ulong

RFC: WHAT. This seems like an extremely wonky rule. What is the justification? I'd say either all of these should be an error or all of them should be legal (preferably the former).

Arithmetic Conversions

When performing arithmetic operations on numeric types, the types sometimes do not match. There are implicit conversion rules that exist to simplify this situation.

In an arithmetic operation, first the types of the operands are converted to their base types if necessary. Then they are checked to both be of arithmetic types. Then the following rules are applied in order:

  1. If either operand is real, the other is converted to real.
  2. Else if either operand is double, the other is converted to double.
  3. Else if either operand is float, the other is converted to float.
  4. Else, perform integer promotions on each operand.
  5. If both operands are the same type, stop here.
  6. Else if both are signed or both are unsigned, the smaller type is converted to the larger.
  7. Else if the signed type is larger than the unsigned type, the unsigned type is converted to the signed type.
  8. Else, the signed type is converted to the unsigned type.
  9. If one or both of the operands was a typedef or enum after the above conversions,
    1. If both operands are the same type, the result will be that type.
    2. Else if one operand is a typedef or enum and the other is its base type, the result will be the base type.
    3. Else if the operands are both typedefs or enums but have the same base type, the result will be the base type.

If the above rules are applied and there is no result (i.e. no consistent type could be determined), it is an error.
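A few of the rules above, checked against the static type of mixed expressions. This is a sketch assuming a current D compiler; mixedSum is a hypothetical helper that illustrates rule 8.

```d
// int + uint: the signed operand is converted to unsigned (rule 8).
uint mixedSum(int i, uint u)
{
    return i + u;
}

void main()
{
    static assert(is(typeof(1.0 + 1.0L) == real));   // rule 1
    static assert(is(typeof(1.0f + 1.0) == double)); // rule 2
    static assert(is(typeof(1 + 1.0f) == float));    // rule 3
    static assert(is(typeof(short.init + 1) == int));// rules 4 and 5
    static assert(is(typeof(1 + 1L) == long));       // rule 6
    static assert(is(typeof(1L + 1u) == long));      // rule 7
    static assert(is(typeof(1 + 1u) == uint));       // rule 8

    // -2 becomes 0xFFFFFFFE as a uint; adding 1 gives 0xFFFFFFFF.
    assert(mixedSum(-2, 1) == uint.max);
}
```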

RFC: there is no documentation on the result types of bitwise operations, or on how bitwise operations are performed when mixing integer sizes (and of course the resulting types there). For instance, in DMD both "short | short" and "short | byte" result confusingly in int. Is this a result of the "perform integer promotions on each operand" step?

Integral types may be implicitly converted to floating point types, but the other direction requires an explicit conversion.

Complex floating-point types may not be implicitly converted to non-complex floating point types.

Imaginary types and non-imaginary types may not be implicitly converted to one another.

Upcasts

Class and interface references may be implicitly converted to base class and interface types.

Truth

D, like many other C-style languages, allows you to use non-booleans in the context of a truth value. This is used, for instance, in conditional statements. When you use a non-boolean value where a boolean is expected, the language does not perform a true implicit cast to bool; it does something similar but slightly different.

The following table shows which values are considered "false" for each type or class of types. All other values are considered "true":

Types                 False value
numeric, character    0
pointer               null
array                 a.ptr is null
associative array     cast(void*)aa is null
delegate              d.funcptr is null
class                 c is null

Any types not listed are an error to use as a condition.
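Some rows of the table can be exercised with a small helper. The sketch assumes a current D compiler and covers only numeric, character, pointer, and class values; truthy and Thing are invented names.

```d
class Thing {}

// Evaluates a value in a condition and reports the outcome.
bool truthy(T)(T value)
{
    return value ? true : false;
}

void main()
{
    assert(!truthy(0));
    assert(truthy(-3));
    assert(!truthy(0.0));
    assert(!truthy('\0'));
    assert(truthy('a'));

    int x;
    int* p = null;
    assert(!truthy(p));
    assert(truthy(&x));

    Thing t = null;
    assert(!truthy(t));
    assert(truthy(new Thing));
}
```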

RFC: this is incompletely documented in the original spec, and the behavior listed here is what DMD does. Bugzilla 1626 asks for clarification on the issue.

Explicit Conversions (Casts)

Sometimes you need a blunter instrument when converting from one type to another. A cast is just that. Casts allow you to convert many types to other types that implicit conversions would not allow. Casts also allow potentially unsafe or bug-inducing behavior to slip by the compiler, so they should be used only when necessary.

The cast syntax in D is "cast(T)x", where x is the expression to be cast, and T is the type to cast it to.

Numeric Casts

When casting between signed and unsigned integers of the same size, the bit pattern is preserved and reinterpreted. For instance, the uint 0xFFFFFFFF, when cast to int, would become -1.

When casting from a larger integer type to a smaller one, the upper bits are simply truncated.

When casting from a floating-point number to an integer, the type of cast performed is dependent on the current hardware floating-point unit state.
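The integer cases can be sketched as follows, assuming a current D compiler. low16 is a hypothetical helper; the floating-point case is left out because its result depends on the hardware state.

```d
// Keeps only the low 16 bits of an int.
short low16(int value)
{
    return cast(short)value;
}

void main()
{
    uint u = 0xFFFFFFFF;
    assert(cast(int)u == -1);          // same bits, reinterpreted as signed

    int wide = 0x12345678;
    assert(low16(wide) == 0x5678);     // upper 16 bits dropped
    assert(cast(ubyte)wide == 0x78);   // upper 24 bits dropped

    assert(cast(byte)200 == -56);      // 0xC8 read as a signed byte
}
```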

Pointer Casts

Any pointer type may be explicitly cast to any other pointer type.

Array Casts

When casting from one array type to another, the data is reinterpreted in-place as an array of the destination type. The only restriction is that the size of the array's data, in bytes, must be a whole multiple of the size of the destination element type. Some examples:

int[] x = [1, 2, 3];
short[] y = cast(short[])x;
// y's length is 6, since the original array was 3 * 4 = 12 bytes,
// divided by 2 bytes per short = 6 items.

short[] a = [1, 2, 3];
int[] b = cast(int[])a; // invalid, original array is 3 * 2 = 6 bytes, which
// is not a whole multiple of 4

RFC: Array casts are currently undocumented in the original spec. This is how the reference compiler implements them.

RFC: The current behavior is prone to endianness issues. How should this be dealt with?

RFC: Currently the compiler does not diagnose invalid static array casts; see Bugzilla 3133.

Downcasts

Class and interface references can be explicitly downcast to derived classes and interfaces. These kinds of casts are checked for validity at runtime using runtime type info. If the cast is illegal, the cast returns null; otherwise, it returns a reference of the destination type.
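A checked downcast might look like this. It is a sketch assuming a current D compiler; Animal, Dog, Cat, and isDog are invented for the example.

```d
class Animal {}
class Dog : Animal {}
class Cat : Animal {}

// True if the reference actually refers to a Dog.
bool isDog(Animal a)
{
    return cast(Dog)a !is null; // a failed downcast yields null
}

void main()
{
    Animal a = new Dog;      // implicit upcast
    Dog d = cast(Dog)a;      // explicit downcast, checked at runtime
    assert(d !is null);
    assert(cast(Cat)a is null);

    assert(isDog(a));
    assert(!isDog(new Cat));
}
```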

Boolean

Similar to truth values, many types can be explicitly cast to bool. The result of an explicit cast to boolean is almost identical to the truth value as defined in the earlier section. The only differences are with classes, structures, and unions. For these three types, it is first checked whether or not the type implements the opCast operator overload for bool. If one exists, it is used. Otherwise, if the value is a class, it falls back to the null check, as with the truth value; or if the value is a struct or union the result is always true.
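An opCast overload for bool could be sketched as below. This assumes the templated opCast form used by current D compilers, which may differ from the form this text was written against; Handle and isOpen are hypothetical names.

```d
struct Handle
{
    int fd = -1;

    // Consulted when a Handle is explicitly cast to bool.
    bool opCast(T : bool)() const
    {
        return fd >= 0;
    }
}

// Hypothetical helper that performs the explicit cast.
bool isOpen(Handle h)
{
    return cast(bool)h;
}

void main()
{
    Handle h;
    assert(!isOpen(h)); // fd is -1 by default

    h.fd = 3;
    assert(isOpen(h));
}
```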

RFC: For structs/unions, DMD considers the cast true if 'test %esp, %esp' yields a nonzero result, which should always be the case.