Management Information and Program Evaluation Systems by Alan Walter Steiss

INFORMATION MANAGEMENT AND DECISION-SUPPORT SYSTEMS

Effective planning and control of any project or program requires relevant management information. Timely information is essential to understand the circumstances surrounding any issue and to evaluate alternative courses of action to resolve any problem. Information is the raw material of intelligence that triggers the recognition that decisions need to be made. Such incremental knowledge can reduce uncertainty in a particular problem situation.

Information Management Systems

Vast amounts of facts, numbers, and other data usually are processed in any organization. What constitutes information, however, depends on the problem at hand and the particular frame of reference of the manager. The American Accounting Association asserts that:

Accounting data, for example, can provide management information when arrayed appropriately in balance sheets and financial statements. Traditional accounting data may be relatively meaningless, however, if the objective is to evaluate the effectiveness of a new program. To achieve better management decisions, the information available must be both timely and pertinent.

The Objectives of an Information Management System (IMS)

As a concept, information management systems often are vaguely described and broadly misunderstood. Some people tend to confuse an information management system with electronic data-processing, thinking that the all-knowing computer will provide the answers to complex problems if and when we simply learn to press the right buttons. Many information management systems make effective use of modern data- and word-processing software and hardware. However, an IMS is much more than an electronic marvel--a "black box" to direct and control the operations of complex organizations.

First and foremost, an information management system is a process by which information is organized and communicated in a timely fashion to resolve organizational problems. Traditionally, information management systems have been developed as tools for operational management. Data are tracked in some detail to record and measure various aspects of an organization's day-to-day operations. Strategic decisions differ from operational decisions, however, along several dimensions. Therefore, the information necessary for strategic management varies from the more traditional IMS used for operational control.

The concept of IMS can best be understood by examining separately the three terms: information, management, and system. This understanding may be enhanced by taking these words in reverse order. A system is fundamentally a set of two or more elements joined together to attain a common objective.

A system often is made up of a number of smaller systems or subsystems, which, in turn, are composed of basic elements which define the purpose and capacity of the total system. Failure to penetrate beyond the surface is one reason why "systems" often are so misunderstood.

A properly functioning system is characterized by synergy. That is, all elements and subsystems work more effectively together in a system than if they were operating independently. The output of an integrated system may be expected to be far greater than the sum of the outputs of its component elements. To understand these output relationships, however, it is first necessary to identify and understand the elements and subsystems that serve as the components of the larger system.

For purposes of an IMS, management consists of the activities carried out by managers and decision-makers. Managers/decision-makers must plan, organize, implement, and control those operations within their realm of responsibility. They must continually develop, adapt, and implement strategic, tactical, and technical decisions to enhance the capacity of the organization to meet the demands that impinge upon it. The specific objective of an IMS is to communicate information for decision making in a synergistic fashion--where the whole becomes greater than the sum of the individual parts.

Information is different from data, and this distinction is very important. Data are facts and figures that are not currently being used in a decision process. Files, records, and reports not under immediate consideration are examples of data. By contrast, information consists of classified and interpreted data that are being used for decision-making. Thus, the "memory" of a management information system is a repository for information concerning past experiences, for programmed decisions, for information by which "right" decisions can be tested for acceptability, as well as for raw data.

Organizational "memory," like human memory, is characterized by a selective process--items are retained which may have some future application. And since the future is uncertain, organizations tend to retain more data than can possibly be used as information, thus complicating the retrieval process. Organizational memory also is dissociative and combinatorial--stored information can be reassembled into new patterns which meet the overall needs of the organization (and particular decision situations) more effectively.

IMS, DBMS, and Computers

Computers, through their ability to store, retrieve, and carry out rapid computations on data, have made possible the collection and dissemination of greater quantities of information more quickly and economically. Computerized databases provide the basic source of information for organizations in today's fast-paced decision environment. An IMS is composed of data bases and the software packages (computer programs) required to manage them. A data base is a collection of structured and related information stored in the computer system. Different software packages permit access and management of these data, along with the tools necessary to generate reports.

Organizational data may suffer from significant incompatibilities across different computing platforms (i.e., hardware and supporting software), however, and even within the personal computer environment. Multiple users must be able to share much of the same accurate, consistent, and up-to-date information in an efficient and secure manner, regardless of the purpose and origin of such information. The primary objective of a database management system (DBMS) is to facilitate this sharing function.

Together, these capabilities make up a Data Base Management System (DBMS). Bassler defines a DBMS as: "A software system that provides for a means of representing data, procedures for making changes in these data (adding to, subtracting from, and modifying), a method for making inquires of the data base and to process these raw data to produce information, and to provide all the necessary internal management functions to minimize the user effort to make the system responsive." [2]

A Data Base Management System should include: (a) a high-level, interactive query language facility; (b) an interactive financial modeling package that permits "what if" calculations to be made; (c) a package that supports modeling and simulation; (d) a statistical analysis package; (e) word-processing software; and possibly, (f) customized software related to specialized management needs. In the past, such systems--with collections of extensive and often expensive software packages--have been limited to large mainframe computers. This limitation is one major reason why management information systems have been used mainly for operational decisions and not for strategic management decisions.

Data sharing has been achieved to some degree through file servers housed in local area networks (LAN). Files are shipped from a DBMS residing centrally on the network to be processed locally. Whole files may be downloaded and selectively accessed. This approach can be inefficient, however, especially when only a few records are required by the requesting applications. Moreover, the integrity, security, concurrency, and recovery of such files can be difficult to manage under this approach.
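
The inefficiency described above can be illustrated with a minimal sketch. It is not drawn from any particular product: the record layout, the department names, and both function names are hypothetical, and the "network" is simulated by counting the records that would have to be shipped.

```python
# Hypothetical proposal-tracking file; field names and values are illustrative only.
RECORDS = [
    {"id": i, "dept": "Physics" if i % 50 == 0 else "Other"}
    for i in range(1000)
]

def file_server_fetch(dept):
    """File-server approach: ship the whole file, then select locally."""
    shipped = list(RECORDS)  # every record crosses the network
    wanted = [r for r in shipped if r["dept"] == dept]
    return wanted, len(shipped)

def database_server_fetch(dept):
    """Alternative: select where the data reside and ship only the matches."""
    wanted = [r for r in RECORDS if r["dept"] == dept]
    return wanted, len(wanted)

file_rows, file_shipped = file_server_fetch("Physics")
db_rows, db_shipped = database_server_fetch("Physics")
print(file_shipped, db_shipped)
```

Both calls return the same records; only the number of records transferred differs, which is the cost the file-server approach incurs when few records are needed.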

Similarly, the connection of microcomputers to other platforms has been limited to host links. Data files of different formats are transferred back and forth for processing and storage, accompanied by more or less explicit conversion problems and the resulting data redundancy.

Unfortunately, many popular so-called DBMS are not database management systems at all: They are programmable filers at the core, leaving most of the job of managing data to the users and providing relatively unproductive tools to assist in this undertaking. Except for the simpler data manipulations, the results often cannot be accessed directly; internal procedures must be created for the system to follow to obtain the desired results.

Much of the procedural detail consists of explicit references to internal storage structures, addressing mechanisms, and so on, which are irrelevant to logical database tasks. Thus, the user must become involved in machine complexities and performance considerations, which most people are ill-equipped to handle and should not have to bother with anyway. In short, traditional database access usually requires a considerable degree of programming skill.

Often technical personnel are required to mediate between end users and their data. The natural language of the end user differs from the procedural machine-oriented tools that traditional DBMS products provide. Therefore, communication between the user and the DBMS often is time-consuming, inefficient, and frequently ineffective. The development of procedural applications frequently is difficult and error-prone. A data base that tracks the research proposals and awards of a major university, for example, may require the attention of a programmer/systems analyst for 20 to 30 hours a week, not only to access the data for various administrative reports but to ensure that data consistency is maintained so that the information generated from this data base is consistent over time.

Without the systematic guidelines of a theoretical foundation, database products have been developed largely on an ad hoc basis. The consequence has been a proliferation of different solutions to a general set of problems. Most of the available products were originally designed to work in stand-alone mode. Furthermore, these products are proprietary, and despite some similarities, each one approaches the same data tasks in its own unique way. As a result, the user ends up having to fill the gaps with his or her own programs and often must accept disruptive revisions that may result in further programming requirements to deal with further incompatibilities.

Implementation details change for a variety of reasons, and when these changes occur, further maintenance burdens may be imposed on their applications. The ability to transfer or distribute data and applications may be limited because such details tend to vary across platforms.

Various attempts have been made to overcome these limitations within the constraints of the personal computer environment. In these approaches, however, the overall purpose of the data operations is not obvious to the database system, and thus, it cannot optimize them. In addition, there is neither information about its current state nor the intelligence on which to base optimizing decisions.

Issues of integrity, security, concurrency, and recovery must be properly addressed in the development of more effective database management systems. The power to ease-of-use ratio must be improved, and maintenance burdens must be minimized, while performance is maximized, especially over networks. Moreover, a variety of non-database software packages, which store and manage their own disparate data in different formats, must be more fully integrated into the DBMS.

The Relational Model

E. F. Codd, an IBM mathematician, developed a relational theory of data in 1969, which he proposed as a universal foundation for database systems. [3] His model is based on the mathematics of relations and first-order predicate logic--a rigorous definition of the "set operations" that a relational database should support for manipulation of tables. Codd's model covers the three primary aspects that any DBMS must address--structure, integrity, and manipulation.

As originally presented, the meaning and implications of Codd's relational model were largely misunderstood by others. Therefore, Codd supplemented his model with the now-famous Fidelity Rules to guide the implementation and evaluation of relational DBMS software. [4] The rules cover matters ranging from the database access that must be provided for users to issues of data security. Since then, the relational model has been refined, clarified, and extended in many ways, but the initial features and rules are still valid.

All relational databases share the following basic technology:

o A clear distinction is maintained in the database system between the logical views of the data presented to the user and the physical structure of the data stored in the system. The user does not need to understand the physical structure of the data in order to access and manage data in the database.

o The data structure is based on a simple logical that is easily understood by users who are not database technologists. Data are stored in tables, the rows of which must have unique storage addresses or ordering. Each cell of a table contains a single attribute value. Attributes in the same column are members of a set. Attributes in the same row are members of an ordered n-tuple. The n-tuples in the table form a relation (from which comes the term relational as applied to databases). Each table has one or more columns that contain the key to the table. The attributes in the key uniquely identify each relation. The DBMS--and not the user--must ensure that all database tables comply with these requirements. When they do, mathematical operations and strict logic can be applied to manipulate them.

o A high-level language is provided for accessing the sets (rows and columns) of the table and for joining (combining) tables that have a common set of attributes (one or more columns containing the same attributes). The SQL language has been standardized by the American National Standards Institute (ANSI) to fulfill this role. (Most vendors of relational databases also provide other methods for accessing data.)
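
The three features listed above can be sketched with Python's built-in sqlite3 module, used here only as a convenient stand-in for a relational DBMS. The table names, column names, and figures are hypothetical; they loosely follow the university proposals-and-awards example mentioned earlier.

```python
import sqlite3

# An in-memory database: the user never sees how rows are physically stored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE investigator (
    inv_id INTEGER PRIMARY KEY,   -- the key uniquely identifies each row
    name   TEXT NOT NULL
);
CREATE TABLE award (
    award_id INTEGER PRIMARY KEY,
    inv_id   INTEGER NOT NULL,    -- common attribute used for joining
    amount   INTEGER NOT NULL
);
INSERT INTO investigator VALUES (1, 'Adams'), (2, 'Baker');
INSERT INTO award VALUES (10, 1, 50000), (11, 1, 20000), (12, 2, 75000);
""")

# A high-level request: join the two tables on the shared column and
# total the awards for each investigator. No storage details appear.
rows = conn.execute("""
    SELECT i.name, SUM(a.amount)
    FROM investigator AS i
    JOIN award AS a ON a.inv_id = i.inv_id
    GROUP BY i.name
    ORDER BY i.name
""").fetchall()
print(rows)
```

The query names only tables and columns; how the rows are located and combined is left to the database engine.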

The characteristics of a relational database eliminate deficiencies of traditional databases and offer significant practical benefits. The tabular structure is simple and relatively "user friendly". It is general enough to represent most types of data; it is independent of any internal computer mechanisms; and it is flexible, because the user can readily restructure tables vertically, horizontally, or both ways, through either splitting or joining data. In fact, table manipulation always yields results that are tables themselves. By supporting a well-defined set of mathematical operations and some useful combinations--restrict, project, natural join, division, product, union, difference, and intersect--data access no longer needs to be limited by predetermined reporting procedures.
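
A few of the operations named above (restrict, project, and natural join) can be sketched in plain Python to show that each one takes tables and returns a table. This is an illustrative toy, not how a DBMS implements them; a table is modeled as a list of dictionaries, and all names and data are hypothetical.

```python
def restrict(rel, pred):
    """Keep only the rows that satisfy the predicate (a horizontal subset)."""
    return [row for row in rel if pred(row)]

def project(rel, cols):
    """Keep only the named columns (a vertical subset), dropping duplicate rows."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[c] for c in cols)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(cols, key)))
    return out

def natural_join(left, right):
    """Combine rows that agree on every column the two tables share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[c] == r[c] for c in common):
                out.append({**l, **r})
    return out

staff = [
    {"name": "Adams", "dept": "Physics"},
    {"name": "Baker", "dept": "History"},
    {"name": "Clark", "dept": "Physics"},
]
depts = [
    {"dept": "Physics", "college": "Science"},
    {"dept": "History", "college": "Arts"},
]

# Because every result is itself a table, operations can be chained freely.
joined = natural_join(staff, depts)
science = restrict(joined, lambda row: row["college"] == "Science")
result = project(science, ["dept", "college"])
print(result)
```

The chained call at the end is the point of the sketch: the output of one operation is always a valid input to the next.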

A data request can be specified in terms of the operations that must be performed on the tables within the database to derive the desired information (as a table). The DBMS then transparently translates these logical requests into an efficient internal-access strategy. A relational DBMS is built upon a catalog--a set of tables dynamically maintained by the system--and can use information about the database (e.g., statistics) in its catalog to optimize the logical operations.
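
As one concrete illustration of a catalog and of the translation into an access strategy, the sketch below uses SQLite through Python's sqlite3 module. SQLite keeps its catalog in a table named sqlite_master and reports its chosen strategy through EXPLAIN QUERY PLAN; other products expose these differently. The table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE award (award_id INTEGER PRIMARY KEY, inv_id INTEGER, amount INTEGER);
CREATE INDEX idx_award_inv ON award (inv_id);
""")

# The catalog is itself a table, maintained by the system as objects are created.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# The logical request says nothing about indexes; the engine decides how to
# satisfy it. The fourth column of each plan row holds the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM award WHERE inv_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan)

print(catalog)
print(plan_text)
```

The request is the same whether or not the index exists; only the engine's internal strategy changes.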

Normalization

A prominent aspect of relational database theory is the concept of normalization--how data should be organized in order to make the database as compact and as easy to manage as possible and to ensure that consistent results are produced. Normalization rules provide design guidelines (or schema) by specifying how a relational database should be divided into tables and how these tables should be linked together. The two major objectives of normalization are:

(1) Minimize the duplication of data.

(2) Minimize the number of attributes that must be updated when changes are made to the database, thereby making the maintenance of the data easier and reducing the possibility for errors.
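
The two objectives can be made concrete with a small, hypothetical example: an award table in which each row repeats the telephone number of its department. The sketch below splits that table in two and counts how many values must change when a number changes. All names and figures are invented for illustration.

```python
# Unnormalized: the department telephone number is repeated on every award row.
flat = [
    {"award_id": 10, "dept": "Physics", "dept_phone": "555-0100"},
    {"award_id": 11, "dept": "Physics", "dept_phone": "555-0100"},
    {"award_id": 12, "dept": "History", "dept_phone": "555-0199"},
]

def normalize(rows):
    """Split into an award table and a department table linked by 'dept'."""
    awards = [{"award_id": r["award_id"], "dept": r["dept"]} for r in rows]
    depts = {r["dept"]: r["dept_phone"] for r in rows}
    return awards, depts

def rejoin(awards, depts):
    """Rebuild the original rows by linking the two tables on 'dept'."""
    return [{**a, "dept_phone": depts[a["dept"]]} for a in awards]

awards, depts = normalize(flat)

# Values to update if the Physics number changes:
updates_flat = sum(1 for r in flat if r["dept"] == "Physics")  # one per award row
updates_normalized = 1                                         # one department entry

print(updates_flat, updates_normalized)
```

Nothing is lost in the split: rejoining the two tables reproduces the original rows, but each telephone number is now stored and updated in only one place.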

Codd initially defined three ways in which data in a database can be normalized. [5] Subsequently, two other approaches have been identified as normal forms. In order for a database to conform to the first normal form, attributes must be atomic; that is, an attribute must not be an n-tuple (row), and therefore, cannot be a set, list or, most importantly, a complex object or table. This restriction means that a table cannot be "nested" in a first normal form database--the nesting of parts must be eliminated by creating separate tables for each data set and by creating a relation in each table for the attributes that form the keys in the other tables. The second through fifth normal forms each define increasingly stringent conditions that must be met in order for the database to conform to that normal form. However, the more stringent requirements reduce the storage space needed in the database, the number of updates required, or both. Conformance with the first normal form often increases the amount of storage required, makes maintenance more difficult, and greatly increases the processing required since separate tables must be maintained and often must be joined to produce the desired information. Joins are highly compute-intensive operations.

When the first normal form constraint is removed and nested tables are allowed, the database is technically described as a non-first-normal form database and may be generally referred to as an extended or nested RDBMS. The extensions support new relational operations that are not possible with a first normal form database. Nested RDBMS simplify the logical structure of many databases by eliminating tables that record/map relationships among data tables (referred to as relationship relations tables). Implicit relationship relations exist whenever one table is nested within another. However, these implicit relationship relations are maintained automatically and very efficiently by the extended relational database. Explicit relationship relations, on the other hand, must be defined by the designer as additional tables in the database schema, with the additional complexity, maintenance, and potential for error that the first normal form definition entails.
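
The difference between a nested table and its first normal form equivalent can be sketched in plain Python. The proposal and budget-line data are hypothetical, and the two functions are illustrative only; they do not represent any particular extended RDBMS.

```python
# Nested form: each proposal row contains a table of budget lines.
nested = [
    {"proposal_id": 1, "title": "Solar cells", "budget_lines": [
        {"item": "Equipment", "cost": 12000},
        {"item": "Travel", "cost": 3000},
    ]},
    {"proposal_id": 2, "title": "Oral history", "budget_lines": [
        {"item": "Transcription", "cost": 5000},
    ]},
]

def to_first_normal_form(rows):
    """Un-nest into two flat tables; the relationship becomes an explicit key column."""
    proposals, lines = [], []
    for r in rows:
        proposals.append({"proposal_id": r["proposal_id"], "title": r["title"]})
        for line in r["budget_lines"]:
            lines.append({"proposal_id": r["proposal_id"], **line})
    return proposals, lines

def nest(proposals, lines):
    """Rebuild the nested form; this matching step is the join that 1NF requires."""
    return [
        {**p, "budget_lines": [
            {k: v for k, v in l.items() if k != "proposal_id"}
            for l in lines if l["proposal_id"] == p["proposal_id"]
        ]}
        for p in proposals
    ]

proposals, lines = to_first_normal_form(nested)
print(len(proposals), len(lines))
```

In the flat form the proposal_id column must be repeated on every budget line and matched again at query time, whereas in the nested form the relationship is implicit in the structure.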

The relational approach requires that strict and comprehensive integrity constraints be enforced in the database to ensure data accuracy and consistency. Thus, the user is relieved from having to develop or maintain integrity code in his or her specific applications. As a consequence, the relational DBMS offers a level of productivity and reliability superior to that of traditional database systems. In addition, the relational model also requires support of logical units of work (or multi-statement transactions), as well as self-recovery from operational failures that can corrupt the database.
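
A minimal sketch of both ideas, an integrity constraint enforced by the database and a multi-statement unit of work, is shown below using Python's sqlite3 module. The account table, its rule that balances may not be negative, and the transfer function are hypothetical examples rather than anything prescribed by the relational model itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    acct_id INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)  -- rule enforced by the DBMS
);
INSERT INTO account VALUES (1, 100), (2, 0);
""")

def transfer(conn, src, dst, amount):
    """Move funds as one unit of work: both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acct_id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acct_id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT acct_id, balance FROM account"))

rejected = transfer(conn, 1, 2, 500)   # would overdraw account 1
after_rejected = balances(conn)
accepted = transfer(conn, 1, 2, 40)
after_accepted = balances(conn)
print(rejected, after_rejected, accepted, after_accepted)
```

The application code contains no balance check of its own. The constraint lives in the database, and the failed transfer leaves both accounts exactly as they were.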

For the practical benefits of the relational model to materialize, the structure, integrity, and manipulative features must be incorporated in the DBMS engine. These features are highly interdependent, and the lack of any one feature affects the support of the others. It is not possible to provide all of the intended benefits by arbitrarily implementing only some of the features or by simply adding an interface to non-relational engines. The Fidelity Rules were devised to clarify this important point. A standard based on the relational model would yield the best of both worlds: The products that complied would offer both relational fidelity and standard compatibility. The underlying database functions would be the same for all products, regardless of whether they are stand-alone or multi-user or what kind of front-end tools and applications they have. In addition, front-end tools, such as spreadsheets and word processors, could then all operate on databases, not on disparate files.

The concrete expression of the relational model that has gained industry acceptance is Structured Query Language or SQL. SQL is now part of IBM's Systems Application Architecture (SAA) strategy. SQL is a language for interacting with relational databases, not a full application development language. First, this keeps the well-defined, set-oriented database foundation distinct from the less precise, procedural character of existing programming languages. Second, it avoids creating yet another general-purpose language that, by trying to be everything to everybody, becomes too complex to master and invites compromises. Third, it eschews the lengthy process that would be required to extend standard procedural languages, such as COBOL and FORTRAN, with relational database functions.

Centralized Data Processing Centers

Computers can help to achieve better management information if used to process properly designed information flows. Computers are not the automatic answer to the need for better information, however. In fact, undue preoccupation with how data will be processed and with the characteristics of the processing hardware and software often can inhibit the design of an effective management information system.

Hardware should be the last matter to be considered when thinking about an IMS. It is first necessary to decide what kind of information is needed--how soon, how much, and how often. Management information must include explicit attention to nonquantifiable inputs, as well as those that result from computerized data processing applications. The kind of equipment that will best serve these needs is a secondary, although important, consideration. Many early wrong notions about data processing can be dispelled by concentrating first on the information and communication requirements. In so doing, plans for computer hardware often shrink to a more realistic size.

The desirability of large centralized data processing centers depends more on the size and nature of the organization than on the purposes of an IMS. Many excellent information systems are serviced by relatively simple, local data processing operations, tailored to the particular needs of the users. With the further miniaturization and mass production of computer systems, the cost of mainframe capabilities has decreased dramatically. Through the introduction of more and more powerful desk-top computers, the power of the computer is now more readily available to resource managers in most organizations. With the advent of smaller hardware systems have come major breakthroughs in software support. Some hardware producers have combined several software packages into Decision Support Systems (DSS) which combine spreadsheets, word processing, a DBMS, and reporting functions. These systems can be very useful to the project manager.

An IMS goes beyond the objectives of centralized data collection and retrieval, however. As Kennevan suggests, an IMS is:

Endnotes

[1] As cited in Joel E. Ross, Modern Management and Information Systems (Reston, Va.: Reston Publishing Company, 1976), p. 133.

[2] Richard A. Bassler, "Data Bases, MIS and Data Base Management Systems," in Computer Science and Public Administration, compiled by Richard A. Bassler and Norman L. Enger (Alexandria, Va.: College Readings, 1976), p. 203.

[3] E. F. Codd, "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.

[4] E. F. Codd, "The Twelve Rules for Determining How Relational a DBMS Product Is." TRI Technical Report EFC-6/05-16-86 (San Jose, Ca.: The Relational Institute, 1986).

[5] E. F. Codd, "Relational Completeness of the Data Base Sublanguages," in Data Base Systems, edited by R. Rustin (Englewood Cliffs, N.J.: Prentice-Hall, 1972), pp. 65-98.

[6] Walter J. Kennevan, "Management Information Systems," in Management of Information Handling Systems, edited by Paul W. Howerton (Roselle Park, N.J.: Hayden Book Company, 1974).
