14.6 An Introduction to Table Indexing Methods

A number of other topics related to tables are somewhat more advanced, and will not be considered here in any more detail than will assist the reader in consulting further references. It should be noted, however, that as a table grows very large, the sequential access method implied by the linked structure becomes a hindrance. If the table is sorted, possible solutions include:

The maintenance of a table of starting points of separate linked lists is called the indexed sequential access method (ISAM), although this term is most commonly used specifically for the indexing of sequential files.
Any method of computing directly from the search key at run time an index into a table of data is called hashing; a table indexed in this way is called a hash table; and the function employed for the calculation is called a hash function.

Alphabetically ordered items are not spread evenly through the letters; they are clustered heavily in some parts. For instance, a telephone book may contain many last names "Kim," "Smith," "Wong," and "Friesen," but not very many "Sutcliffe," "Hacker," "Zinzelmeyer," "Szczepanczyk," or "Herod." Moreover, the nature of the clustering would not be the same in every country (not many named "Smith" in China; mostly "Kim" in Korea, and lots of "Sutcliffe" in Yorkshire). Thus, the development of effective and efficient hash functions to index large tables depends very much on the data at hand, and may change over time as the data is maintained. The reader is invited to consult an advanced work on these subjects.


Contents