UMDL Ontology Concept Descriptions

This document is a forum for ongoing work on the University of Michigan Digital Library (UMDL) ontology. It defines the ontology concepts in stylized natural language. Other online sites related to the ontology include:

Last modification on 12/18/97:

Coming next:

Concept definitions by area
Generic Digital Library
	Content
		Work
			Stages of Realization
			Collection Metadata
			Genre
			Publishing Format
			Digital Format
		Ontogeny
		Containment
		Roles
		Authority
	Services
		Foundation
		Top-level services
		Infrastructure related to services
	Licenses

UMDL-Specific Digital Library

Objectives

We anticipate that the ontology will be used for three important purposes:

Strategy

The ontology will include at least two modules: one generic to many digital libraries, and another specific to the UMDL (including concepts to describe societies of agents organized as computational economies).

We will embed concepts, as appropriate, in ontologies developed outside UMDL.

Eventually, we may translate (semi-automatically) from structured natural language definitions to a knowledge interchange format such as KIF. For now, we translate directly to a representation language, such as Loom, as required for each application.

Notation

The following conventions are used:

Related Publications

These papers are not immediately available due to copyright restrictions, but you may request copies.

Generic Digital Library

Content

Work

Stages of Realization

Our concepts for the stages of work realization are adapted from a proposal by the International Federation of Library Associations and Institutions (IFLA proposal). They should prove very useful for our formalization of the licenses associated with content in the digital library. We have renamed some of the IFLA concepts (from WORK to CONCEPTION, from ITEM to DIGITIZATION), sharpened their definitions with the genre, PUBLISHING-FORMAT, and DIGITAL-FORMAT concepts, and have added the INSTANCE concept. We commonly use the word "work" to refer to content at an unspecified stage of realization, from CONCEPTION to INSTANCE; similarly, "genre" refers to the set of concepts under MODE.

Figure 1 - Stages of realization of work

Table 1 - A hierarchy of concepts for work realization
Concept        Definition                                      Examples of how to derive a new <CONCEPTION ... INSTANCE>
CONCEPTION     A concept, plan, or design for work.            Convert a novel into a screenplay.
EXPRESSION     Work with specified content.                    Translate to a new language.
MANIFESTATION  An expression packaged in a publishing format.  Publish a new edition.
DIGITIZATION   A manifestation encoded in a digital format.    Convert from Microsoft Word to ClarisWorks.
INSTANCE       A particular copy of a digitization.            Copy to a new location.

CONCEPTION. A concept, plan, or design for work, abstracted from any particular format. CONCEPTION has a UNIFORM-TITLE, a UNIFORM-NAME (for an AUTHOR), OTHER-DISTINGUISHING-CHARACTERISTIC, DESCRIPTION*, KEYWORD(s)*, TOPIC(s)*, DATE*, AUDIENCE*, HISTORICAL-CONTEXT, and CONCEPTUAL-LEVEL. (Note, no rights are associated with work at this level of abstraction). A CONCEPTION may be derived from another CONCEPTION. A CONCEPTION may include a set of CONCEPTIONs.

EXPRESSION. A CONCEPTION with specified content, in a genre described as a MODE. An EXPRESSION also has a TITLE*, OTHER-DISTINGUISHING-CHARACTERISTIC, DATE, LANGUAGE*, SUMMARY, HISTORICAL-CONTEXT, CRITICAL-RESPONSE, ROLEs for various rights-owners (e.g. author), USE-RESTRICTIONS, and a SIZE. An EXPRESSION may be derived from another EXPRESSION, or from a CONCEPTION if the precise source of its derivation isn't known. An EXPRESSION may include a set of EXPRESSIONs.

MANIFESTATION. An EXPRESSION packaged in a PUBLISHING-FORMAT. MANIFESTATIONs also have a TITLE* (typically inherited from the EXPRESSION, but may be a variant), and a NAME for an AUTHOR (that may be a pseudonym); also a unique IDENTIFIER, EDITION/ISSUE, PLACE-OF-PUBLICATION, SERIES-STATEMENT, PROVIDER(s)*, ROLEs for various rights-owners (e.g. publisher), TERMS-OF-AVAILABILITY, CONTACT*, CHRONOLOGICAL-COVERAGE*, and UPDATE-FREQUENCY*. A MANIFESTATION may be derived from another MANIFESTATION, or from an EXPRESSION if the precise source of its derivation isn't known. A MANIFESTATION may include a set of MANIFESTATIONs.

DIGITIZATION. A MANIFESTATION encoded in a DIGITAL-FORMAT. A DIGITIZATION has a unique IDENTIFIER, DATE, and PROVENANCE. A DIGITIZATION may be derived from another DIGITIZATION, or from a MANIFESTATION if the precise source of its derivation isn't known. A DIGITIZATION may include a set of DIGITIZATIONs.

INSTANCE. A particular copy of a DIGITIZATION. An INSTANCE has a unique IDENTIFIER, DATE, ADDRESS*, ACCESS-MECHANISMs*, ACCESS-RESTRICTIONS, EXHIBITION-HISTORY, CONDITION, and TREATMENT-HISTORY. An INSTANCE may be derived from another INSTANCE, or from a DIGITIZATION if the precise source of its derivation isn't known. An INSTANCE may include a set of INSTANCEs.

COLLECTION. An INSTANCE maintained by a CONTENT-PROVIDER.
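
The derivation rule shared by the five stage definitions above can be illustrated in code. The following is a minimal sketch, not part of the ontology itself: the names Stage, Work, label, and derived_from are hypothetical, and it checks only the stated rule that work may be derived from work at the same stage, or from the stage immediately above when the precise source isn't known.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Stages of realization, ordered from most abstract to most concrete."""
    CONCEPTION = 1
    EXPRESSION = 2
    MANIFESTATION = 3
    DIGITIZATION = 4
    INSTANCE = 5


@dataclass
class Work:
    """Hypothetical record for work at a single stage of realization."""
    label: str
    stage: Stage
    derived_from: Optional["Work"] = None

    def __post_init__(self) -> None:
        source = self.derived_from
        if source is None:
            return
        # A source must be at the same stage, or at the stage immediately
        # above it (for when the precise source of derivation isn't known).
        allowed = {self.stage, Stage(max(self.stage - 1, Stage.CONCEPTION))}
        if source.stage not in allowed:
            raise ValueError(
                f"{self.stage.name} cannot be derived from {source.stage.name}"
            )


conception = Work("a story", Stage.CONCEPTION)
expression = Work("the story as written", Stage.EXPRESSION, derived_from=conception)
translation = Work("a translation", Stage.EXPRESSION, derived_from=expression)
```

Attempting to derive an INSTANCE directly from a CONCEPTION raises ValueError in this sketch, since the two stages are not adjacent.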

Collection Metadata

NAME. Short text which identifies some Thing (but not necessarily uniquely).

DESCRIPTION. Natural language text that describes some Thing.

KEYWORD. A word used as a value in CONTENT-METADATA.

PROVIDER. A (publishing) Agent.

CONTENT-PROVIDER. A PROVIDER responsible for maintaining DIGITIZATIONs.

ACCESS-PROVIDER. A PROVIDER responsible for ACTOR(s) that perform SERVICEs that provide access to DIGITIZATIONs.

RIGHTS-OWNER. An Agent (organization or person) who holds some RIGHTS relating to a particular work or ACTOR.

CONTACT. A Person responsible for CONTENT-METADATA describing a COLLECTION.

TOPIC. Short text used as a value in CONTENT-METADATA.

AUDIENCE. A "typical" Person for whom work is intended... One of JUVENILE, YOUNG-ADULT, ADULT...

CONCEPTUAL-LEVEL. The degree of intellectual sophistication of work... One of INTRODUCTORY, INTERMEDIATE, ADVANCED, SCHOLARLY, GENERAL.

LANGUAGE. A natural language. One of ENGLISH, GERMAN, ...

CHRONOLOGICAL-COVERAGE. A range of publication dates for which INSTANCEs are available, including starting and ending Timepoints (definition is incomplete).

UPDATE-FREQUENCY. (?) One of CONTINUOUS, IRREGULAR, ...

ACCESS-MECHANISM(s). A computerized means that provides access to a COLLECTION.... SEARCH-ENGINES, SEARCH-LANGUAGES,.... (?)

Genre

Genre is perhaps the most slippery area in the ontology, for it encompasses the full variety of human expression that a digital library might provide. We use the word "genre" because it conveys the process of categorizing work by expectations for form and content that are established inductively through experience. Our use, however, is much more general than the common use of the word (for example, to denote a set of mystery novels inspired by Hitchcock); for that narrower sense we use the suffix "approach", as in MUSICAL-APPROACH.

To capture the complexity of "genre" we require a set of concepts, including MODE and all its children, as summarized in Figure 2. MODE has multiple co-existing sub-graphs rather than a simple tree. Lower-level concepts may inherit from each dimension: for example, a SONG is MUSIC and also lyrics (LINGUISTIC SYMBOLIC expression).

There are currently some gaps in Figure 2; we consider this area of the ontology to be provisional. Future work will examine the relationship between genre and publishing format: genre connotes an expectation of, or potential for, a particular publishing format, but not a requirement.

Figure 2 - Genre: Formats for Expression

MODE. A category of creative composition, loosely defined by a set of expectations for form, style, and to a lesser degree the content of the communication. A MODE has a VERACITY and a MODE-OF-PERCEPTION.

VERACITY. One of {FICTION, NON-FICTION}, distinguished by whether a work's content is presented as factually accurate.

MODE-OF-PERCEPTION. One or more of {SYMBOLIC, SOUND, VISUAL}, distinguished by the medium in which a work's content is encoded.

FICTION. A MODE with narrative proceeding from invention, with a FICTION-APPROACH and FICTION-STRUCTURE.

NON-FICTION. A MODE with...

SYMBOLIC. A MODE with content whose meaning is encoded in numbers, text, or some other system...

SOUND. A MODE with content encoded in sound, meant to be listened to...

VISUAL. A MODE with content encoded in images, meant to be seen...

FICTION-APPROACH. One or more of {MYSTERY, ROMANCE, SCIENCE-FICTION, HISTORICAL, ...}...

FICTION-STRUCTURE. One or more of {NOVEL, SCREENPLAY, PLAY, MOVIE, NOVELLA, SHORT-STORY, POEM, ...}...

DATA. A SYMBOLIC MODE with content encoded in numbers and text as structured attribute/value pairs...

NOTATION. A SYMBOLIC MODE with content encoded in symbols with meaning other than is standard for numbers or text...

LINGUISTIC. A SYMBOLIC MODE with content encoded in words...

MUSIC. A SOUND MODE with metrical composition having some MUSICAL-FORM, MUSICAL-ARRANGEMENT, and MUSICAL-APPROACH.

NOISE. A SOUND MODE without designed structure...

STATIC-VISUAL. A VISUAL MODE including a single image, with a STATIC-VISUAL-APPROACH and a STATIC-VISUAL-STRUCTURE.

SEQUENTIAL-VISUAL. A VISUAL MODE with many images,...

METADATA. DATA describing some aspect of the digital library with a frame (a set of Attribute Name/Type/Value slots).

CONTENT-METADATA. METADATA that describes work.

CONTEXT-METADATA. METADATA that describes tasks (as changes in world STATEs).

CONTROL-METADATA. METADATA that describes PREFERENCEs.

MUSICAL-NOTATION. A NOTATION MODE for representing music with little dots on horizontal lines...

MUSICAL-FORM. One or more of {SONG, SYMPHONY, CONCERTO, ...},...

MUSICAL-ARRANGEMENT. Has a KEY, HARMONY, and RHYTHM...

MUSICAL-APPROACH. One or more of {ROCK, JAZZ, CLASSICAL, ... },...

SONG. MUSIC with lyrics, thus a LINGUISTIC-STYLE and possibly TONGUE.

LINGUISTIC-STYLE. One of {PROSE, VERSE},...

PROSE. A LINGUISTIC SYMBOLIC MODE...

VERSE. A LINGUISTIC SYMBOLIC MODE which may rhyme, may have a meter, ...

TONGUE. A human language, one of {ENGLISH, HEBREW, SWAHILI, ...}, ...

STATIC-VISUAL-APPROACH. One or more of {ROMANTIC, SURREALIST, IMPRESSIONIST, ...},...

STATIC-VISUAL-STRUCTURE. One or more of {PAINTING, PHOTOGRAPH, DRAWING, HOLOGRAM, ...},...
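
The statement that MODE has multiple co-existing sub-graphs, so that a SONG is both MUSIC and LINGUISTIC, maps naturally onto multiple inheritance. The following is a minimal sketch under that assumption; the class layout and the helper modes_of_perception are hypothetical illustrations, not the Loom representation used by UMDL applications.

```python
class Mode:
    """Root of the genre graph (MODE)."""


class Symbolic(Mode):
    perception = "SYMBOLIC"


class Sound(Mode):
    perception = "SOUND"


class Linguistic(Symbolic):
    """Content encoded in words."""


class Music(Sound):
    """Metrical composition."""


class Song(Music, Linguistic):
    """MUSIC with lyrics: inherits from both the SOUND and SYMBOLIC sub-graphs."""


def modes_of_perception(mode: type) -> set:
    """Collect every MODE-OF-PERCEPTION declared along the class's ancestry."""
    return {c.perception for c in mode.__mro__ if "perception" in vars(c)}


print(sorted(modes_of_perception(Song)))  # ['SOUND', 'SYMBOLIC']
```

Because each dimension is declared once on its own sub-graph, a lower-level concept such as Song picks up one MODE-OF-PERCEPTION per parent without restating either.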

Publishing Format

PUBLISHING-FORMAT. A way to package information for distribution, loosely defined by a set of expectations for the organization and presentation of the communication.

LIVE. A PUBLISHING-FORMAT that is ephemeral, consumed at the point of creation.

PERFORMANCE. A LIVE PUBLISHING-FORMAT with SCHEDULED-PERFORMANCE-DATES, PERFORMERS, and a PRESENTER.

PHENOMENA. A LIVE PUBLISHING-FORMAT that communicates natural or historical events...

RECORDED. A PUBLISHING-FORMAT that is stored in the digital library for some Duration.

BOOK. A RECORDED PUBLISHING-FORMAT that may have a DEDICATION, FOREWORD, PREFACE, CHAPTERS, INDEX, ILLUSTRATIONs, and INSCRIPTIONs....

MONOGRAPH. A BOOK, non-fiction, ...

NOVEL. A BOOK, fiction, ...

MAGAZINE. A series of issues...a RECORDED PUBLISHING-FORMAT...

JOURNAL. A MAGAZINE whose content is refereed...

ESSAY. ?...a RECORDED PUBLISHING-FORMAT...

ILLUSTRATION. A PAINTING or DRAWING in a BOOK...

LP. A RECORDED PUBLISHING-FORMAT: a "long-playing" analog disk, usually fabricated of black vinyl, with approximately 40 minutes of music or other recorded sounds...

HYPERMEDIA. A RECORDED PUBLISHING-FORMAT structured as a cross-linked network of multiple elements with distinct MODE-OF-PERCEPTIONs.

Digital Format

DIGITAL-FORMAT. A specification for encoding information in digital form.

TEXT...

IMAGE...

VIDEO...

SOUND...

EXECUTABLE...

Ontogeny

An entity's ontogeny is the story of its origin and development. This section of the UMDL ontology defines the relationships that arise when one work transforms or extends previous work in some way. We distinguish between DERIVATION, which transforms the earlier work, and ADDITION, which extends it. Examples of DERIVATION include translations, abridgements, and summaries. Supplementary indices and sequels are examples of ADDITION. These categories are typically, but not necessarily, disjoint. For example, a REVISION may be both a DERIVATION (REVISED-EDITION) and an ADDITION (ENLARGED-EDITION). We call the earlier work in an ontogenic development the "preceding" work, and the derived or added work the "subsequent" work.

Every instance of an ontogenic relationship must be linked from a specific stage of realization of the preceding work, to some specific stage of realization of the subsequent work. Fortunately, the rule for identifying the correct stages of realization is very simple:

Figure 3 illustrates two situations. In the first, (a), the subsequent work is realized from the same CONCEPTION, but has a different EXPRESSION. An example would be a supplementary index: since page numbers are specific to a PUBLISHING-FORMAT, the link's origin is at the MANIFESTATION level. The second situation, (b), would be appropriate for a sequel, a new work that makes specific references to events in the preceding work that identify an EXPRESSION, but not a MANIFESTATION.

Figure 3 - Placing ontogenic relationships in the work hierarchy

ONTOGENY. A relationship between a preceding and a subsequent work (minimally, between a from-CONCEPTION and a to-CONCEPTION).

DERIVATION. An ONTOGENY relationship in which the subsequent work is a transformation of the preceding work.

ADDITION. An ONTOGENY relationship in which the subsequent work extends the preceding work.

(One or two layers of additional detail for ontogeny relationships are pending...)
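
Since DERIVATION and ADDITION are not necessarily disjoint, an ontogenic link can be thought of as carrying a set of kinds along with the stage of realization at each end. The following is a minimal sketch under that reading; the names OntogenyKind, Ontogeny, from_stage, and to_stage are hypothetical, and the stage values in the examples are illustrative assumptions rather than the result of the stage-selection rule.

```python
from dataclasses import dataclass
from enum import Flag, auto


class OntogenyKind(Flag):
    DERIVATION = auto()  # transforms the preceding work
    ADDITION = auto()    # extends the preceding work


@dataclass(frozen=True)
class Ontogeny:
    preceding: str       # label of the preceding work
    from_stage: str      # stage of realization on the preceding side
    subsequent: str      # label of the subsequent work
    to_stage: str        # stage of realization on the subsequent side
    kind: OntogenyKind


# A supplementary index depends on page numbers, so the link starts
# at the MANIFESTATION level (the to_stage here is an assumption).
index = Ontogeny("book", "MANIFESTATION", "supplementary index",
                 "EXPRESSION", OntogenyKind.ADDITION)

# A revised and enlarged edition is both a DERIVATION and an ADDITION
# (both stage values here are assumptions).
revision = Ontogeny("first edition", "MANIFESTATION", "second edition",
                    "MANIFESTATION",
                    OntogenyKind.DERIVATION | OntogenyKind.ADDITION)
```

Using a flag type rather than two separate subclasses keeps a REVISION that is both revised and enlarged as a single link.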

Containment

Works often contain other works. As with ONTOGENY relations, we need to specify the stage of realization of both the containing and the contained work. Again, the rules for doing this are surprisingly simple:

Figure 4 illustrates CONTAINS relations for an album that includes three concertos. Concerto1 and Concerto2 are existing works. They already have CONCEPTIONs independent of the album. Ideally, the contained work concepts (a MANIFESTATION and EXPRESSION, respectively) also have DERIVED-FROM relations from the original work.

Figure 4 - Containment relations for an album with three concertos

Contained works are represented using all of the work hierarchy concepts and ontogenic relations in exactly the same manner as works that are not contained. For example, in Figure 5, Concerto3 was originally created to be contained by the album. Subsequently, a new EXPRESSION is created. The new EXPRESSION shares Concerto3's CONCEPTION in the usual manner, and may have a DERIVED-FROM relation from Concerto3's original EXPRESSION.

Some special processing is required to keep track of the original contents of the album. The album's EXPRESSION concept inherits the CONTAINS link from its CONCEPTION to Concerto3's CONCEPTION, but this is not sufficient to identify which of the two EXPRESSIONs is actually on the album. It is straightforward, however, to include rules in the ontology that automatically infer CONTAINS relations on all sub-levels of the work hierarchy (as illustrated in the figure). This inference occurs when the CONTAINS relation is created: by the containment rules above, the contained work is guaranteed to have only one instance of each work sub-concept at that time. This simple approach also depends on the hierarchical structure of instances of work: every CONCEPTION can have multiple EXPRESSIONs, every EXPRESSION can have multiple MANIFESTATIONs, and so on, but an EXPRESSION cannot have multiple CONCEPTIONs, a MANIFESTATION cannot have multiple EXPRESSIONs, and so forth.

Figure 5 - Automatic inference of more specific containment relations permits subsequent addition of derived works
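
The inference described for Figure 5 can be sketched as follows. This is a minimal illustration only: it assumes each work is recorded as a mapping from stage name to a node identifier, and that links are inferred between nodes at the same stage. The function infer_contains and all identifiers are hypothetical.

```python
STAGES = ["CONCEPTION", "EXPRESSION", "MANIFESTATION", "DIGITIZATION", "INSTANCE"]


def infer_contains(container: dict, contained: dict) -> list:
    """Return (container_node, contained_node) pairs for every stage that
    both works have realized at the moment the CONTAINS relation is created."""
    links = []
    for stage in STAGES:
        if stage in container and stage in contained:
            links.append((container[stage], contained[stage]))
    return links


# State of the world when the album is created: Concerto3 has exactly
# one EXPRESSION, so the inferred link is unambiguous.
album = {"CONCEPTION": "album/C", "EXPRESSION": "album/E"}
concerto3 = {"CONCEPTION": "concerto3/C", "EXPRESSION": "concerto3/E1"}
links = infer_contains(album, concerto3)

# A later EXPRESSION of Concerto3 does not disturb the recorded links.
later_expressions = ["concerto3/E1", "concerto3/E2"]
```

Because the links are fixed at creation time, the album's EXPRESSION continues to point at concerto3/E1 even after concerto3/E2 appears.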

CONTAINS. A whole-part relationship in which one CONCEPTION includes another CONCEPTION. (Remember that EXPRESSIONs, MANIFESTATIONs and so on are also CONCEPTIONs).

Roles

ROLE. A CREATOR's contribution to the realization of work, a CONCEPTION.

CREATOR. An Agent (person or organization), identified by a NAME.

AUTHOR.... a ROLE.

COMPOSER.... a ROLE.

EDITOR.... a ROLE for an EXPRESSION.

TRANSLATOR.... a ROLE for an EXPRESSION.

PUBLISHER.... a ROLE for a MANIFESTATION.

PRODUCER.... a ROLE for a MANIFESTATION.

PERFORMER.... a ROLE for a MANIFESTATION.

DIGITIZER.... a ROLE for a DIGITIZATION.

OWNER.... a ROLE for an INSTANCE.

LICENSEE.... a ROLE for an INSTANCE.

ARCHIVER.... a ROLE for an INSTANCE.

DISTRIBUTOR.... a ROLE for an INSTANCE.

Authority

UNIFORM-STRING. A controlled NAME -- usage is via an authority file. A UNIFORM-STRING may have one or more variant NAMEs.

UNIFORM-NAME. A UNIFORM-STRING for an Agent.

UNIFORM-TITLE. A UNIFORM-STRING and a TITLE.
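
An authority file that maps variant NAMEs to a single UNIFORM-STRING can be sketched as a simple lookup table. This is a minimal illustration, not the UMDL implementation: the class AuthorityFile and its methods are hypothetical, and case-insensitive matching is an assumption.

```python
from typing import Dict, Iterable, Optional


class AuthorityFile:
    """Hypothetical authority file: variant NAMEs resolve to one UNIFORM-STRING."""

    def __init__(self) -> None:
        self._by_variant: Dict[str, str] = {}

    def add(self, uniform: str, variants: Iterable[str] = ()) -> None:
        # The uniform string resolves to itself, as does each variant.
        for name in (uniform, *variants):
            self._by_variant[name.casefold()] = uniform

    def resolve(self, name: str) -> Optional[str]:
        """Return the controlled form of a name, or None if it is unknown."""
        return self._by_variant.get(name.casefold())


names = AuthorityFile()
names.add("Twain, Mark", variants=["Samuel Clemens", "Mark Twain"])
print(names.resolve("samuel clemens"))  # Twain, Mark
```

A UNIFORM-NAME or UNIFORM-TITLE would use the same lookup, differing only in what kind of Thing the controlled string identifies.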

Services

Foundation

SERVICE. An ACTION, by an ACTOR and for one or more ACTORs, performed in accordance with an AGREEMENT.

ACTION. A modification of the world, from an input STATE to an output STATE. An ACTION may be primitive, or may be composed of one or more alternative ACTION-SEQUENCEs.

ACTION-SEQUENCE. A partial order of ACTIONs.

ACTOR. Either a Person, Organization, or computational AGENT. ACTORs are capable of ACTION, and they act according to PREFERENCEs.

PREFERENCE. A function which expresses an ACTOR's preferences between world STATEs. PREFERENCEs may be explicitly encoded as a utility function or a set of discrete goals, or may be left implicit in an ACTOR's behavior. (An open and difficult issue will be the degree to which we need to explicitly represent the conditionality of preferences on task contexts, and the roles actors play within task contexts!).

STATE. A set of Attributes with Values that partially describe the state of the world at some Timepoint.
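
The foundation concepts above fit together as follows: an ACTION maps one STATE to another, an ACTION-SEQUENCE is a partial order over ACTIONs, and a PREFERENCE can be encoded as a utility function over STATEs. The sketch below illustrates this with Python's standard graphlib module. The actions pay and deliver, the attribute names, and the utility weights are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set

State = Dict[str, Any]             # Attributes with Values at some Timepoint
Action = Callable[[State], State]  # maps an input STATE to an output STATE


def pay(state: State) -> State:
    return {**state, "balance": state["balance"] - 5}


def deliver(state: State) -> State:
    return {**state, "delivered": True}


ACTIONS: Dict[str, Action] = {"pay": pay, "deliver": deliver}

# Partial order: each action maps to the set of actions that must precede it.
PRECEDES: Dict[str, Set[str]] = {"deliver": {"pay"}}


def run(state: State, actions: Dict[str, Action],
        order: Dict[str, Set[str]]) -> State:
    """Apply the actions in any total order consistent with the partial order."""
    for name in TopologicalSorter(order).static_order():
        state = actions[name](state)
    return state


def utility(state: State) -> int:
    """An explicitly encoded PREFERENCE: higher values are preferred."""
    return (10 if state.get("delivered") else 0) + state["balance"]


start = {"balance": 20, "delivered": False}
final = run(start, ACTIONS, PRECEDES)
```

Here the actor prefers the final STATE to the starting one because its utility is higher. An implicit PREFERENCE would have no such function and could only be observed through behavior.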

Top-level services

DELIVER. A SERVICE that copies some Thing from one ADDRESS to another ADDRESS.

DELIVER-CONTENT.

DELIVER-INTELLECTUAL-WORK.

DELIVER-CONTRACT.

DELIVER-MONEY.

QUERY. A SERVICE that formulates and submits a request for a RECOMMENDation. (Associated services may include QUERY-FORMULATION, QUERY-SUBMISSION, and perhaps QUERY-STORAGE).

STANDING-QUERY. A QUERY that requests notification of newly available work, subject to a CONTRACT.

RECOMMEND. A SERVICE that answers a QUERY with a set of options.

RECOMMEND-COLLECTIONS. A RECOMMEND SERVICE that suggests COLLECTIONs.

RECOMMEND-WAYS-TO-INQUIRE. A RECOMMEND SERVICE that suggests other tools to use.

RECOMMEND-WAYS-TO-ALTER-QUERY. A RECOMMEND SERVICE that suggests revisions to the QUERY.

REGISTRATION. A SERVICE that produces METADATA and enters it in a REGISTRY.

AGENT-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a computational entity.

COLLECTION-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a GOOD.

SELF-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a person.

TRANSLATE. A SERVICE whose ACTION transforms content to a different but semantically equivalent form.

Infrastructure related to services

REGISTRY. A set of METADATA, including an ADDRESS for each entry.

ADDRESS. Data that can be used to locate some Thing.
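
A REGISTRY and the REGISTRATION service that fills it can be sketched as a mapping from ADDRESS to a METADATA frame. This is a minimal illustration under assumed names: the Registry class, its methods, the attribute names, and the address strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Registry:
    """Hypothetical REGISTRY: one METADATA frame per ADDRESS."""
    entries: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def register(self, address: str, metadata: Dict[str, Any]) -> None:
        """REGISTRATION: enter a copy of the METADATA under its ADDRESS."""
        self.entries[address] = dict(metadata)

    def find(self, **criteria: Any) -> List[str]:
        """Return the ADDRESSes whose METADATA matches every criterion."""
        return [
            address
            for address, metadata in self.entries.items()
            if all(metadata.get(k) == v for k, v in criteria.items())
        ]


registry = Registry()
registry.register("address-1", {"kind": "collection", "topic": "music"})
registry.register("address-2", {"kind": "agent", "service": "QUERY"})
print(registry.find(kind="collection"))  # ['address-1']
```

The three REGISTRATION services above would differ only in what the METADATA describes, not in how the entry is stored.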

Licenses

The definitions of LICENSE and its related concepts are based on the memo Intellectual Property Rights Specification produced by the UMDL intellectual property group.

AGREEMENT. A relationship between two or more ACTORs. May be explicit or implicit, requested or unsolicited.

CONTRACT. An AGREEMENT that is stipulated explicitly as a set of LICENSEs.

LICENSE. Authorization to provide a set of SERVICEs, granted by an ACTOR (the owner) to an ACTOR or class of ACTORs (holders). A LICENSE has a Duration (starting and ending Timepoints). Transfer of a LICENSE may be permitted if there is an associated REDISTRIBUTION-LICENSE.

USAGE-LICENSE. A LICENSE that applies to some class of work (authorizing SERVICEs whose ACTIONs modify attributes of the world's STATE that are associated with the work).

DERIVATION-LICENSE. A LICENSE to use work (EXPRESSIONs) in some way to produce new work.

REDISTRIBUTION-LICENSE. A LICENSE to convert a LICENSE of a holding ACTOR to a (potentially modified) LICENSE held by a member of some class of license-buyer ACTORs. Potential license-buyers may include all ACTORs, none, or some union of ACTOR classes. When a holder redistributes a LICENSE, that holder is no longer a holder (is this always true!?). (Note that modification of LICENSEs as part of transfer provides a way to represent exclusivity).
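
The LICENSE concepts above can be illustrated with a small data structure. The following is a minimal sketch under several simplifying assumptions, none of which come from the memo: a LICENSE has a single holder rather than a class of ACTORs, Timepoints are integers, and modification during redistribution can only narrow the set of SERVICEs. All names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class License:
    owner: str
    holder: str                    # simplification: one holder, not a class
    services: FrozenSet[str]
    start: int                     # Duration as integer Timepoints (assumption)
    end: int
    redistributable: bool = False  # stands in for a REDISTRIBUTION-LICENSE

    def authorizes(self, actor: str, service: str, timepoint: int) -> bool:
        return (
            actor == self.holder
            and service in self.services
            and self.start <= timepoint <= self.end
        )


def redistribute(lic: License, buyer: str,
                 services: Optional[Iterable[str]] = None) -> License:
    """Convert the holder's LICENSE into a (possibly narrowed) one for the buyer.
    The returned LICENSE replaces the old one, so the former holder no longer holds it."""
    if not lic.redistributable:
        raise PermissionError("no associated REDISTRIBUTION-LICENSE")
    kept = lic.services if services is None else lic.services & frozenset(services)
    return replace(lic, holder=buyer, services=kept)


original = License("publisher", "library", frozenset({"DELIVER", "TRANSLATE"}),
                   start=0, end=100, redistributable=True)
resold = redistribute(original, "patron", services=["DELIVER"])
```

Narrowing the services on transfer is one way to represent the modification of LICENSEs mentioned above. Whether the former holder always loses the LICENSE remains the open question noted in the definition.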

UMDL-Specific Digital Library (descriptions pending)

AGENT (computational), GOOD, AUCTION, TASK-PLANNER, other types of AGENTs, UMDL-AGREEMENT (sets of services that are OK or prohibited), ...
The University of Michigan Digital Library.
The Beethoven Project.
The Service Classifier Agent.
To mail Peter Weinstein.
To mail the ontology group.