Composite types - Arrays

  • Can think of an array as a mapping from an index type (domain) to an element type (range).
  • Fortran has arbitrary indexing for arrays, e.g., can use a range of indices extending into the negative integers. C has 0 as a lower bound by default.
  • Fortran is an example of a language that supports slicing. Uses a:b notation akin to MATLAB.

Allocation in memory - Arrays

For global lifetime with static shape, the shape must be known at compile time so the compiler can allocate space for the array in static global memory.

Example is C,

int myArray[45];

If the array should only exist throughout the execution of a subroutine and the shape is known at compile time, the object can be allocated on the subroutine’s stack frame at runtime. In contrast, if the shape isn’t known until elaboration time, another level of indirection is needed to allocate on stack frame. Must introduce a dope vector, which is a runtime descriptor containing information about the shape of the array. The location of the dope vector is known at compile time and so the array itself can be allocated elsewhere on the stack.

For arbitrary lifetime and the shape not known until run time, declarations only create references and space allocation on heap under explicit control of programmer. To support dynamic shape, stack cannot really be used. Usually involves allocating a new array, copying contents, deallocating old array. Some languages allow Strings to dynamically change size, but not arbitrary arrays. Certain “nice” assumptions about Strings can be made, allowing for this.

Memory layout

  • Contiguous elements
  • Row-pointer layout - for multidimensional; array of pointers to rows
    • Downside is extra space is needed for pointers, and potential overhead added for scientific computing
    • Allows rows to be put anywhere
    • Nice for matrices whose rows are of different length
    • Shuffle around arrays without copying Fortran and MATLAB use column-major ordering for matrices, while everyone else uses row-major.

When traversing a multidimensional array in column-major language, need to loop over columns as outer loop and then rows as inner loop. This is to maximize # of cache hits.