Classification of types

  • Built-in types have direct hardware support
  • Interesting examples:
    • complex numbers (pair of numbers)
    • fixed point - contains an implicit decimal point at programmer specified location (e.g., dollars and cents)

Numeric types

Have to decide on certain properties when designing a language, such as:

  • length of the numeric type, i.e., number of bytes
  • complex types - are these built in, standard library, unsupported?
  • rational numbers? - pair of integers
  • integers with arbitrary precision?

Common terms

discrete types (ordinal) are countable and have a well-defined notion of predecessor and successor (except first and last). These are “ordered”. scalar types are types such as discrete, real, rational, and complex.

Enumerations were first introduced in Pascal. Helped with readability and getting rid of some weird bugs. May be represented internally as integers, but not always compatible with integers (e.g., can’t set int x; to an enum in most languages). In C, enums aren’t type safe since they really are just a typedef for an int.

In OO languages, there is difficulty mixing enums with inheritance. In Java, enums are a class.

Subrange types

Values are contiguous subset of the values of a discrete base type.

type test_score = 0..100.
type weekday = (sun, mon, tue, wed, thu, fri, sat).
type workday = (mon..fri).

weekday is just an enum in Pascal. test_score is compatible with integers, since it’s built on integers.

Composite types

  • nonscalar types
  • Created by applying type constructor to one or more simple types

These include arrays, lists, files, etc.

Type checking

Enforcing the rules of a type system

Type equivalence

How to define equivalence when users can define new types? Two main ways: structural and name equivalence.

structural equivalence says two types are equivalent if they consist of the same components put together in the same way. Appears in C and ML. It’s principle problem is an inability to distinguish between types that the programmer may think of as distinct, but which happen by coincidence to have the same internal structure.

name equivalence is based on lexical occurence of type definitions. Most modern imperative languages use this. Appears in Java, C#, standard Pascal. Name equivalence is based on the assumption that if the programmer goes to the effort of writing two type definitions, then those definitions are probably mean to represent different types.

With aliasing constructs such as typedef, have to decide between loose and strict name equivalence. For loose name equivalence, aliases are equivalent in an otherwise name equivalent language. In strict name equivalence, aliased types are not equivalent.

Under strict name equivalence, a declaration

type A = B

is considered a definition. A’s type is different from B’s type. Under loose name equivalence it is merely a declaration; A shares the definition of B.

Type casting

Useful when:

  • Types are structurally equivalent but language is name equivalent
  • Types have different sets of values, but intersection is represented in the same way (e.g., subtype of integers and integers)
  • Types have different low-level representations, but there is some sensible correspondence between them (e.g., short and int)

Compiler generates run-time code for checking that the cast is valid. Some casts, e.g., converting from a float to an int, requires code to change the low-level representation of values. This can also cause a loss of fractional digits, and integer overflow can result.

Non-converting type cast does not alter underlying bits, merely reinterpreting them. Breaks the language’s type system! In C++, see reinterpret_cast<T>().