2.5 Variables in Modula-2

As in the sections following the "hello world" example, the major new idea introduced in the example of section 2.3--that of the Modula-2 variable--now needs careful analysis. Variables as names for data representation abstractions were discussed in general terms in section 1.6.5, and the reader is asked to review that material before beginning this portion of the text.

Good programs are versatile, and give different results depending on the data being fed into them (even if the result is electronic indigestion). For that reason, programs must make extensive use of the computer's memory to store data and partial results, and to retain final answers pending the time when they will be printed out or otherwise communicated back to the real world. If, for each machine and each piece of code, the programmer had to write code to find available memory locations for data and then store things in those locations, the task would be unmanageable.

Fortunately, any good programming language has a facility to control all this work--all the programmer has to do is give a name to the pigeonhole where the data is to be stored, and the compiler automatically generates code to allocate the right amount of memory, give it the desired name, find it again whenever the name is subsequently used, and store values there whenever the program demands. In Modula-2 programs, it is necessary to give variables their name and type before the program code itself begins. For example, in the last program, the line

VAR
  base, exponent, counter, result: CARDINAL;

causes code to be generated to set aside enough memory to store the values of the four named variables. It also ensures that their type will be CARDINAL. In all subsequent references to these, code is generated to find the same location by its name. For instance, when the code for the line

  result  := result * base;

is executed, the locations named by result and base are looked up and the values retrieved. The two values are multiplied, and the answer is stored back in the location result. Likewise, the code for the line

  counter := counter + 1;

causes the number stored at the location named by counter to be increased by one. The following definitions are useful at this point:

A Modula-2 variable is a name for a memory location, the contents of which can be changed by a program.
The symbol := is called the assignment operator, and is the means by which the name on its left is given to the value on its right.

The difference may seem unimportant, but this last definition is technically more correct than saying that the value is assigned to the name, though the distinction is not always carefully maintained, even in this text.

NOTE: There should be a space before and a space after an assignment operator.

The assignment also illustrates the simplest kind of statement--the assignment statement. Its railroad diagram is shown in figure 2.5.

2.5.1 Simple Variable Types

In some computer notations, variables come into existence at the time they are first used in the program. Moreover, the type of contents the variable may have will be determined by the context in which it is used (these may, for instance, be alphabetic characters instead of numbers). Modula-2, on the other hand, is what is known as a strongly typed language. This means that every variable is of a particular kind or type and only values of the same type can be assigned to it.

A Modula-2 variable is given its name and type, and memory is set aside for its value, under a VAR heading. This is called declaring the variable. The declaration of a variable must include its type. The entire section of the program module preceding the word BEGIN, excluding any import statements, is called the declaration part.

This leads to a revised diagram of a program module. (Figure 2.6)

Once the type of a variable has been appropriately declared, the variable may be assigned only values of that type. For instance, only numbers such as -30, 0, 45, 62, etc. can be assigned to a variable of type INTEGER and only single characters such as 'c', '*', 'H', or '5' can be assigned to a variable of type CHAR (short for character). Some rules for type declarations must also be observed. For example, consider a declaration such as:

VAR
  firstNumber : INTEGER;
  ch : CHAR;

NOTES: 1. VAR is a reserved word which must be capitalized.

2. Each declaration of another type must be separated from the ones before with a semicolon.

3. There should be spaces before and after each colon and a new line should be started after the VAR and before any actual declarations. (This is not absolutely required, but it makes the program look neater.)

This declaration creates the identifiers firstNumber and ch, with their types fixed for the balance of the program. In subsequent statements, an assignment like firstNumber := 'M' or ch := 5 would be rejected by the compiler as being in "type conflict." That is, one is not allowed to give a name of one type to a value of another. If there are several variables of the same type to declare all at once, they may be separated by commas. Thus,

VAR
  counter, size, length, card : CARDINAL;

has the same effect as the declaration below and conserves space in the source file.

VAR
  counter : CARDINAL;
  size : CARDINAL;
  length : CARDINAL;
  card : CARDINAL;

The syntax of the declaration is diagrammed in Figure 2.7.

The values a variable can take on are limited not only by its type--CHAR, for instance, is restricted to single characters--but also by a predefined range. This is not a restriction imposed by Modula-2 as a programming notation, but by each individual computer and/or its operating system. In some older systems, the type CARDINAL, for instance, can range only from 0 through 65535. The similar type INTEGER on such systems ranges from -32768 through 32767. In newer systems, the CARDINAL range is 0 through 4294967295 and the INTEGER range is typically -2147483648 through 2147483647.

NOTES: 1. CARDINAL, INTEGER and CHAR are built-in identifiers. As detailed later (section 2.5.2), it is unwise to attempt to reuse these names for program identifiers.

2. These limits are typical of certain small computers and can in some circumstances be very restrictive. Some systems offer a much larger range for CARDINAL and INTEGER. (See chapter 15 for a fuller discussion of the reasons these particular numbers arise as limits.) The reader must obtain information for the actual system being used from the implementation manuals, or by experimentation.

3. Alternatively, many versions of Modula-2 have one or both of the nonstandard types named by the built-in identifiers LONGCARD and LONGINT. The range of these two data types is much greater than that of the regular types, as each has more memory set aside for its data.
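
As a minimal sketch of the experimentation suggested in note 2, the following module prints the limits for the system at hand. It assumes an ISO-style library providing the modules STextIO and SWholeIO (some systems supply a module such as InOut instead), and it uses the standard functions MAX and MIN, which have not yet been discussed in this text. The module name ShowRanges is invented for the illustration.

```modula-2
MODULE ShowRanges;
(* Illustration only: prints the limits of the whole-number types
   on whatever system compiles and runs this module. *)

FROM STextIO IMPORT WriteString, WriteLn;   (* assumed ISO-style library *)
FROM SWholeIO IMPORT WriteCard, WriteInt;   (* assumed ISO-style library *)

BEGIN
  WriteString ("CARDINAL ranges from 0 through ");
  WriteCard (MAX (CARDINAL), 1);
  WriteLn;

  WriteString ("INTEGER ranges from ");
  WriteInt (MIN (INTEGER), 1);
  WriteString (" through ");
  WriteInt (MAX (INTEGER), 1);
  WriteLn;
END ShowRanges.
```

The numbers printed will differ from one system to another, which is exactly the information the implementation manual would otherwise have to supply.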

Although one cannot assign INTEGER or CARDINAL variable names to CHAR values, or vice versa, it is possible to freely assign CARDINAL and INTEGER names to each other's values, but only in the overlapping part of their domains.

Example:

Assuming the smaller ranges given above for older systems (0 through 65535 for CARDINAL and -32768 through 32767 for INTEGER), suppose one had:

VAR
  firstInt, secondInt : INTEGER;
  firstCard, secondCard : CARDINAL;
  ck, ch  : CHAR;

Then the following are correct:

  firstInt := -50;
  secondInt := 4500;
  firstCard := 62500;
  ch := 'X';
  ck := "m";  (* single or double quotes allowed *)
  secondCard := secondInt;  (* This is okay. *)
  firstCard := firstCard + 3000;  (* This is almost too big for some older systems. *)

and, the following are incorrect:

  firstInt := firstCard; (* error when code is executed. *)
  secondCard := firstInt;  (* assigning negative to a cardinal. *)
  ch := 'ab';   (* This won't work; 2 characters. *)
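
Some of the legal assignments can be collected into a small complete module. This is only a sketch: it assumes that output is done with the ISO-style library modules STextIO and SWholeIO, and the module name AssignDemo and the shortened variable list are chosen just for this illustration. The values are kept small enough to fit the smaller ranges of older systems.

```modula-2
MODULE AssignDemo;
(* Illustration only: assignments that respect the declared types. *)

FROM STextIO IMPORT WriteString, WriteChar, WriteLn;  (* assumed library *)
FROM SWholeIO IMPORT WriteCard, WriteInt;             (* assumed library *)

VAR
  firstInt, secondInt : INTEGER;
  secondCard : CARDINAL;
  ch : CHAR;

BEGIN
  firstInt := -50;
  secondInt := 4500;
  secondCard := secondInt;  (* allowed: 4500 lies in both ranges *)
  ch := 'X';

  (* secondCard := firstInt;   would fail: -50 is not a CARDINAL value *)
  (* ch := 5;                  rejected by the compiler: type conflict *)

  WriteString ("firstInt = ");
  WriteInt (firstInt, 1);
  WriteLn;
  WriteString ("secondCard = ");
  WriteCard (secondCard, 1);
  WriteLn;
  WriteString ("ch = ");
  WriteChar (ch);
  WriteLn;
END AssignDemo.
```

Removing the comment markers from either of the two disabled lines is a quick way to see how a particular system reports each kind of error.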

NOTE: Creating a variable name and giving it a type in a VAR declaration does not also give it a value. Until the first assignment statement for that variable is executed, its value is undefined.

Assigning the first known value to a variable is called initializing it.

To illustrate the last point, consider the calculation section in the last major example:

    (* calculation section *)
  result := base;  (* initially, set the result to the base *)
  counter := 1; (* What if this initialization is omitted? *)
  WHILE counter < exponent
    DO
      result  := result *  base; (* multiply base enough times *)
      counter := counter + 1
    END;

Suppose that the line initializing counter were left out. When the code entered the WHILE loop, the value of counter would be whatever happened to be in that piece of the machine's memory before the program began to run. It may already be greater than or equal to exponent, in which case the loop will not be executed even once. It may be some value less than exponent but different from one, in which case a computation will be made, but the wrong number of multiplications will be performed. Assuming, say, that there are 65536 possible values for a CARDINAL, and counter has one of these chosen at random, the probability that the program would produce the correct result is 1/65536 or about 0.0000153. Since the range for CARDINAL is much larger on many systems, the actual situation is probably much worse than this. Notice that similar incorrect results can also be obtained by leaving out the line initializing result.

Moral: always initialize variables before expecting them to have any particular value.
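
To make the moral concrete, the calculation section can be placed in a small complete module in which every variable receives a value before it is read. This is a sketch rather than the program of section 2.3: fixed values stand in for keyboard input, the module name PowerDemo is invented, and the ISO-style library modules STextIO and SWholeIO are assumed.

```modula-2
MODULE PowerDemo;
(* Illustration only: the calculation section with every variable
   initialized before it is used. *)

FROM STextIO IMPORT WriteString, WriteLn;   (* assumed ISO-style library *)
FROM SWholeIO IMPORT WriteCard;             (* assumed ISO-style library *)

VAR
  base, exponent, counter, result : CARDINAL;

BEGIN
  base := 2;       (* fixed values stand in for keyboard input *)
  exponent := 10;

    (* calculation section *)
  result := base;  (* initially, set the result to the base *)
  counter := 1;    (* one factor of base is already in result *)
  WHILE counter < exponent
    DO
      result := result * base;
      counter := counter + 1
    END;

  WriteString ("The result is ");
  WriteCard (result, 1);
  WriteLn;
END PowerDemo.
```

With these values the loop body is executed nine times, so result finishes at 1024, which fits even the smaller CARDINAL range.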

2.5.2 Standard Identifiers

It is important to note that since the capitalized words CARDINAL, INTEGER, and CHAR name types, they too are identifiers. Like reserved words (VAR, IMPORT, MODULE, etc.), they are built into the Modula-2 notation rather than imported from a library.

A Modula-2 Standard Identifier is a name that is built into the notation. It must be written entirely in capital letters.

Though like reserved words in that they are built into the notation and must be capitalized, standard identifiers differ from reserved words in two important respects:

1. They are indeed names, rather than structural punctuation markers.

2. They may be reused for other purposes, as unwise as this may be.

Thus,

VAR
  IMPORT : CARDINAL;

is an error the compiler will report, but

VAR
  CARDINAL : INTEGER;

may be foolish, for it redefines the identifier CARDINAL as a variable, cutting off access to the standard Modula-2 type CARDINAL, but it is permitted.
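
The effect can be seen in a small sketch (the module name Shadow and the variable count are invented for the illustration). Once CARDINAL has been redeclared as a variable, it can be assigned to like any other INTEGER variable, but within this module it no longer names a type.

```modula-2
MODULE Shadow;
(* Illustration only: legal, but not recommended. *)

VAR
  CARDINAL : INTEGER;  (* hides the standard type CARDINAL *)
  (* count : CARDINAL;    would now be rejected: CARDINAL is a variable here *)

BEGIN
  CARDINAL := -5       (* an ordinary INTEGER assignment *)
END Shadow.
```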

