This document is a forum for ongoing work on the University of Michigan Digital Library (UMDL) ontology. It defines the ontology concepts in stylized natural language. Other online sites related to the ontology include:
Coming next:
Generic Digital Library Content Work Stages of Realization Collection Metadata Genre Publishing Format Digital Format Ontogeny Containment Roles Authority Services Foundation Top-level services Infrastructure related to services Licenses UMDL-Specific Digital Library
We will embed concepts, as appropriate, in ontologies developed outside UMDL.
Eventually, we may translate (semi-automatically) from structured natural language definitions to a knowledge interchange format such as KIF. For now, we translate directly to a representation language, such as Loom, as required for each application.
(These papers are not immediately available due to copyright restrictions, but you may request copies.)
Our concepts for the stages of work realization are adapted from a proposal by the International Federation of Library Associations and Institutions (IFLA proposal). They should prove very useful for our formalization of the licenses associated with content in the digital library. We have renamed some of the IFLA concepts (from WORK to CONCEPTION, from ITEM to DIGITIZATION), sharpened their definitions with the genre, PUBLISHING-FORMAT, and DIGITAL-FORMAT concepts, and have added the INSTANCE concept. We commonly use the word "work" to refer to content at an unspecified stage of realization, from CONCEPTION to INSTANCE; similarly, "genre" refers to the set of concepts under MODE.
Figure 1 - Stages of realization of work
Concept | Definition | Examples for how to derive a new <CONCEPTION ... INSTANCE> |
---|---|---|
CONCEPTION | A concept, plan, or design for work. | Convert a novel into a screenplay. |
EXPRESSION | Work with specified content. | Translate to a new language. |
MANIFESTATION | An expression packaged in a publishing format. | Publish a new edition. |
DIGITIZATION | A manifestation encoded in a digital format. | Convert from Microsoft Word to ClarisWorks. |
INSTANCE | A particular copy of a digitization. | Copy to a new location. |
CONCEPTION. A concept, plan, or design for work, abstracted from any particular format. CONCEPTION has a UNIFORM-TITLE, a UNIFORM-NAME (for an AUTHOR), OTHER-DISTINGUISHING-CHARACTERISTIC, DESCRIPTION*, KEYWORD(s)*, TOPIC(s)*, DATE*, AUDIENCE*, HISTORICAL-CONTEXT, and CONCEPTUAL-LEVEL. (Note, no rights are associated with work at this level of abstraction). A CONCEPTION may be derived from another CONCEPTION. A CONCEPTION may include a set of CONCEPTIONs.
EXPRESSION. A CONCEPTION with specified content, in a genre described as a MODE. An EXPRESSION also has a TITLE*, OTHER-DISTINGUISHING-CHARACTERISTIC, DATE, LANGUAGE*, SUMMARY, HISTORICAL-CONTEXT, CRITICAL-RESPONSE, ROLEs for various rights-owners (e.g. author), USE-RESTRICTIONS, and a SIZE. An EXPRESSION may be derived from another EXPRESSION, or from a CONCEPTION if the precise source of its derivation isn't known. An EXPRESSION may include a set of EXPRESSIONs.
MANIFESTATION. An EXPRESSION packaged in a PUBLISHING-FORMAT. MANIFESTATIONs also have a TITLE* (typically inherited from the EXPRESSION, but may be a variant), and a NAME for an AUTHOR (that may be a pseudonym); also a unique IDENTIFIER, EDITION/ISSUE, PLACE-OF-PUBLICATION, SERIES-STATEMENT, PROVIDER(s)*, ROLEs for various rights-owners (e.g. publisher), TERMS-OF-AVAILABILITY, CONTACT*, CHRONOLOGICAL-COVERAGE*, and UPDATE-FREQUENCY*. A MANIFESTATION may be derived from another MANIFESTATION, or from an EXPRESSION if the precise source of its derivation isn't known. A MANIFESTATION may include a set of MANIFESTATIONs.
DIGITIZATION. A MANIFESTATION encoded in a DIGITAL-FORMAT. A DIGITIZATION has a unique IDENTIFIER, DATE, and PROVENANCE. A DIGITIZATION may be derived from another DIGITIZATION, or from a MANIFESTATION if the precise source of its derivation isn't known. A DIGITIZATION may include a set of DIGITIZATIONs.
INSTANCE. A particular copy of a DIGITIZATION. An INSTANCE has a unique IDENTIFIER, DATE, ADDRESS*, ACCESS-MECHANISMs*, ACCESS-RESTRICTIONS, EXHIBITION-HISTORY, CONDITION, and TREATMENT-HISTORY. An INSTANCE may be derived from another INSTANCE, or from a DIGITIZATION if the precise source of its derivation isn't known. An INSTANCE may include a set of INSTANCEs.
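The derivation and inclusion rules in the five definitions above follow one pattern, which can be sketched in Python. This is only an illustrative sketch, not the UMDL representation (the ontology is translated to languages such as Loom); the `Stage`, `Work`, `derive_from`, and `include` names are assumptions made for the example, and the many descriptive attributes (TITLE, DATE, and so on) are reduced to a single title string.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Stage(IntEnum):
    """Stages of realization, ordered from most abstract to most concrete."""
    CONCEPTION = 1
    EXPRESSION = 2
    MANIFESTATION = 3
    DIGITIZATION = 4
    INSTANCE = 5


@dataclass
class Work:
    """Hypothetical stand-in for work at one stage of realization."""
    title: str
    stage: Stage
    derived_from: Optional["Work"] = None
    parts: List["Work"] = field(default_factory=list)

    def derive_from(self, source: "Work") -> None:
        """Record a derivation, accepting only the sources the definitions allow."""
        same_stage = source.stage == self.stage
        # Fallback when the precise source is not known: the stage one level up.
        one_stage_up = source.stage == self.stage - 1
        if not (same_stage or one_stage_up):
            raise ValueError(
                f"{self.stage.name} cannot be derived from {source.stage.name}"
            )
        self.derived_from = source

    def include(self, part: "Work") -> None:
        """A work may include a set of works at its own stage."""
        if part.stage != self.stage:
            raise ValueError("included work must be at the same stage")
        self.parts.append(part)


novel = Work("Novel", Stage.EXPRESSION)
translation = Work("Novel (translated)", Stage.EXPRESSION)
translation.derive_from(novel)   # EXPRESSION derived from another EXPRESSION

edition = Work("Paperback edition", Stage.MANIFESTATION)
edition.derive_from(novel)       # precise source unknown: fall back to the EXPRESSION
```

Ordering the stages as integers keeps the "same stage, or one stage up" check to a single comparison; a CONCEPTION has no stage above it, so it can only be derived from another CONCEPTION.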
COLLECTION. An INSTANCE maintained by a CONTENT-PROVIDER.
DESCRIPTION. Natural language text that describes some Thing.
KEYWORD. A word used as a value in CONTENT-METADATA.
PROVIDER. A (publishing) Agent.
CONTENT-PROVIDER. A PROVIDER responsible for maintaining DIGITIZATIONs.
ACCESS-PROVIDER. A PROVIDER responsible for ACTOR(s) that perform SERVICEs that provide access to DIGITIZATIONs.
RIGHTS-OWNER. An Agent (organization or person) who holds some RIGHTS relating to a particular work or ACTOR.
CONTACT. A Person responsible for CONTENT-METADATA describing a COLLECTION.
TOPIC. Short text used as values in CONTENT-METADATA.
AUDIENCE. A "typical" Person for whom work is intended... One of JUVENILE, YOUNG-ADULT, ADULT...
CONCEPTUAL-LEVEL. The degree of intellectual sophistication of work... One of INTRODUCTORY, INTERMEDIATE, ADVANCED, SCHOLARLY, GENERAL.
LANGUAGE. A natural language. One of ENGLISH, GERMAN, ...
CHRONOLOGICAL-COVERAGE. A range of publication dates for which INSTANCEs are available, including starting and ending Timepoints (definition is incomplete).
UPDATE-FREQUENCY. (?) One of CONTINUOUS, IRREGULAR, ...
ACCESS-MECHANISM(s). A computerized means that provides access to a COLLECTION.... SEARCH-ENGINES, SEARCH-LANGUAGES,.... (?)
To capture the complexity of "genre" we require a set of concepts, including MODE and all its children, as summarized in Figure 2. MODE has multiple co-existing sub-graphs rather than a simple tree. Lower-level concepts may inherit from each dimension: for example, a SONG is MUSIC and also lyrics (LINGUISTIC SYMBOLIC expression).
There are currently some gaps in Figure 2; we consider this area of the ontology to be provisional. Future work will examine the relationship between genre and publishing format: a genre connotes an expectation of, or potential for, a particular publishing format, but not a requirement.
Figure 2 - Genre: Formats for Expression
MODE. A category of creative composition, loosely defined by a set of expectations for form, style, and to a lesser degree the content of the communication. A MODE has a VERACITY and a MODE-OF-PERCEPTION.
VERACITY. One of {FICTION, NON-FICTION}, distinguished by whether a work's content is presented as factually accurate.
MODE-OF-PERCEPTION. One or more of {SYMBOLIC, SOUND, VISUAL}, distinguished by the medium in which a work's content is encoded.
FICTION. A MODE with narrative proceeding from invention, with a FICTION-APPROACH and FICTION-STRUCTURE.
NON-FICTION. A MODE with...
SYMBOLIC. A MODE with content whose meaning is encoded in numbers, text, or some other system...
SOUND. A MODE with content encoded in sound, meant to be listened to...
VISUAL. A MODE with content encoded in images, meant to be seen...
FICTION-APPROACH. One or more of {MYSTERY, ROMANCE, SCIENCE-FICTION, HISTORICAL, ...}...
FICTION-STRUCTURE. One or more of {NOVEL, SCREENPLAY, PLAY, MOVIE, NOVELLA, SHORT-STORY, POEM, ...}...
DATA. A SYMBOLIC MODE with content encoded in numbers and text as structured attribute/value pairs...
NOTATION. A SYMBOLIC MODE with content encoded in symbols with meaning other than is standard for numbers or text...
LINGUISTIC. A SYMBOLIC MODE with content encoded in words...
MUSIC. A SOUND MODE with metrical composition having some MUSICAL-FORM, MUSICAL-ARRANGEMENT, and MUSICAL-APPROACH.
NOISE. A SOUND MODE without designed structure...
STATIC-VISUAL. A VISUAL MODE including a single image, with a STATIC-VISUAL-APPROACH and a STATIC-VISUAL-STRUCTURE.
SEQUENTIAL-VISUAL. A VISUAL MODE with many images,...
METADATA. DATA describing some aspect of the digital library with a frame (a set of Attribute Name/Type/Value slots).
CONTENT-METADATA. METADATA that describes work.
CONTEXT-METADATA. METADATA that describes tasks (as changes in world STATEs).
CONTROL-METADATA. METADATA that describes PREFERENCEs.
MUSICAL-NOTATION. A NOTATION MODE for representing music with little dots on horizontal lines...
MUSICAL-FORM. One or more of {SONG, SYMPHONY, CONCERTO, ...},...
MUSICAL-ARRANGEMENT. Has a KEY, HARMONY, and RHYTHM...
MUSICAL-APPROACH. One or more of {ROCK, JAZZ, CLASSICAL, ... },...
SONG. MUSIC with lyrics, thus having a LINGUISTIC-STYLE and possibly a TONGUE.
LINGUISTIC-STYLE. One of {PROSE, VERSE},...
PROSE. A LINGUISTIC SYMBOLIC MODE...
VERSE. A LINGUISTIC SYMBOLIC MODE which may rhyme, may have a meter, ...
TONGUE. A human language, one of {english, hebrew, swahili, ...}, ...
STATIC-VISUAL-APPROACH. One or more of {ROMANTIC, SURREALIST, IMPRESSIONIST, ...},...
STATIC-VISUAL-STRUCTURE. One or more of {PAINTING, PHOTOGRAPH, DRAWING, HOLOGRAM, ...},...
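The genre definitions above form overlapping sub-graphs rather than a tree: a SONG inherits from both MUSIC and LINGUISTIC. A minimal Python sketch of that idea uses multiple inheritance. The class names mirror the concepts, but the class layout and the `modes_of_perception` helper are assumptions for illustration, and only a few of the concepts are included.

```python
class Mode:
    """MODE: a category of creative composition."""


class Symbolic(Mode):
    """Content encoded in numbers, text, or some other system."""


class Sound(Mode):
    """Content encoded in sound."""


class Visual(Mode):
    """Content encoded in images."""


class Linguistic(Symbolic):
    """A SYMBOLIC MODE with content encoded in words."""


class Music(Sound):
    """A SOUND MODE with metrical composition."""


class Song(Music, Linguistic):
    """MUSIC with lyrics: inherits along both the SOUND and SYMBOLIC dimensions."""


def modes_of_perception(mode_cls):
    """Return the MODE-OF-PERCEPTION values that a MODE class inherits."""
    roots = {Symbolic: "SYMBOLIC", Sound: "SOUND", Visual: "VISUAL"}
    return {name for root, name in roots.items() if issubclass(mode_cls, root)}


song_modes = modes_of_perception(Song)    # both SOUND and SYMBOLIC
music_modes = modes_of_perception(Music)  # SOUND only
```

A language with single inheritance would need a different encoding (for example, a set of MODE tags per work); the sketch only shows that one concept can sit under more than one dimension at once.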
LIVE. A PUBLISHING-FORMAT that is ephemeral, consumed at the point of creation.
PERFORMANCE. A LIVE PUBLISHING-FORMAT with SCHEDULED-PERFORMANCE-DATES, PERFORMERS, and a PRESENTER.
PHENOMENA. A LIVE PUBLISHING-FORMAT that communicates natural or historical events...
RECORDED. A PUBLISHING-FORMAT that is stored in the digital library for some Duration.
BOOK. A RECORDED PUBLISHING-FORMAT that may have a DEDICATION, FOREWORD, PREFACE, CHAPTERS, INDEX, ILLUSTRATIONs, and INSCRIPTIONs....
MONOGRAPH. A BOOK, non-fiction, ...
NOVEL. A BOOK, fiction, ...
MAGAZINE. A series of issues...a RECORDED PUBLISHING-FORMAT...
JOURNAL. A MAGAZINE whose content is refereed...
ESSAY. ?...a RECORDED PUBLISHING-FORMAT...
ILLUSTRATION. A PAINTING or DRAWING in a BOOK...
LP. A RECORDED PUBLISHING-FORMAT: a "long-playing" analog disk, usually fabricated of black vinyl, with approximately 40 minutes of music or other recorded sounds...
HYPERMEDIA. A RECORDED PUBLISHING-FORMAT structured as a cross-linked network of multiple elements with distinct MODE-OF-PERCEPTIONs.
TEXT...
IMAGE...
VIDEO...
SOUND...
EXECUTABLE...
Every instance of an ontogenic relationship must be linked from a specific stage of realization of the preceding work, to some specific stage of realization of the subsequent work. Fortunately, the rule for identifying the correct stages of realization is very simple:
Figure 3 illustrates two situations. In the first, (a), the subsequent work is realized from the same CONCEPTION, but has a different EXPRESSION. An example would be a supplementary index: since page numbers are specific to a PUBLISHING-FORMAT, the link's origin is at the MANIFESTATION level. The second situation, (b), would be appropriate for a sequel, a new work that makes specific references to events in the preceding work that identify an EXPRESSION, but not a MANIFESTATION.
Figure 3 - Placing ontogenic relationships in the work hierarchy
ONTOGENY. A relationship between a preceding and a subsequent work (minimally, between a from-CONCEPTION and a to-CONCEPTION).
DERIVATION. An ONTOGENY relationship in which the subsequent work is a transformation of the preceding work.
ADDITION. An ONTOGENY relationship in which the subsequent work extends the preceding work.
(One or two layers of additional detail for ontogeny relationships are pending...)
Figure 4 illustrates CONTAINS relations for an album that includes three concertos. Concerto1 and Concerto2 are existing works. They already have CONCEPTIONs independent of the album. Ideally, the contained work concepts (a MANIFESTATION and EXPRESSION, respectively) also have DERIVED-FROM relations from the original work.
Figure 4 - Containment relations for an album with three concertos
Contained works are represented using all of the work hierarchy concepts and ontogenic relations in exactly the same manner as works that are not contained. For example, in Figure 5, Concerto3 was originally created to be contained by the album. Subsequently, a new EXPRESSION is created. The new album shares Concerto3's CONCEPTION in the usual manner, and may have a DERIVED-FROM relation from Concerto3's original EXPRESSION.
Some special processing is required to keep track of the original contents of the album. The album's EXPRESSION concept inherits the CONTAINS link from its CONCEPTION to Concerto3's CONCEPTION, but this is not sufficient to identify which of the two EXPRESSIONs is actually on the album. It is straightforward, however, to include rules in the ontology that automatically infer CONTAINS relations on all sub-levels of the work hierarchy (as illustrated in the figure). This inference occurs at the time the CONTAINS relation is created: by the containment rules above, the contained work is guaranteed to have only one instance of each work sub-concept at that time. This simple approach also depends on the hierarchical structure of instances of work: every CONCEPTION can have multiple EXPRESSIONs, every EXPRESSION can have multiple MANIFESTATIONs, and so on. It is not possible, however, for an EXPRESSION to have multiple CONCEPTIONs, and so forth.
Figure 5 - Automatic inference of more specific containment relations permits subsequent addition of derived works
CONTAINS. A whole-part relationship; when one CONCEPTION includes another CONCEPTION. (Remember that EXPRESSIONs, MANIFESTATIONs and so on are also CONCEPTIONs).
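The inference of CONTAINS relations on the sub-levels of the work hierarchy, described above for the album example, can be sketched as follows. This is a rough illustration under stated assumptions: `Node`, `realize`, and `add_contains` are hypothetical names, each node stands for work at one stage, and the sketch simply stops descending when either side does not have exactly one realization at the next level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Node:
    """Hypothetical node for work at one stage of realization."""
    name: str
    realizations: List["Node"] = field(default_factory=list)  # next stage down
    contains: List["Node"] = field(default_factory=list)

    def realize(self, name: str) -> "Node":
        """Create a realization of this node at the next, more concrete stage."""
        child = Node(name)
        self.realizations.append(child)
        return child


def add_contains(container: Node, part: Node) -> None:
    """Create a CONTAINS link, then infer the same link on each sub-level.

    The inference runs when the link is created, while each work still has
    a single realization per level, so the pairing is unambiguous.
    """
    container.contains.append(part)
    if len(container.realizations) == 1 and len(part.realizations) == 1:
        add_contains(container.realizations[0], part.realizations[0])


album = Node("Album CONCEPTION")
album_expr = album.realize("Album EXPRESSION")
concerto3 = Node("Concerto3 CONCEPTION")
concerto3_expr = concerto3.realize("Concerto3 original EXPRESSION")

add_contains(album, concerto3)

# A later EXPRESSION of Concerto3 does not disturb the album's recorded contents.
concerto3_new = concerto3.realize("Concerto3 new EXPRESSION")
```

Because the EXPRESSION-level link is recorded when the relation is created, the album still points at the original EXPRESSION after the new one is added.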
CREATOR. An Agent (person or organization), identified by a NAME.
AUTHOR.... a ROLE.
COMPOSER.... a ROLE.
EDITOR.... a ROLE for an EXPRESSION.
TRANSLATOR.... a ROLE for an EXPRESSION.
PUBLISHER.... a ROLE for a MANIFESTATION.
PRODUCER.... a ROLE for a MANIFESTATION.
PERFORMER.... a ROLE for a MANIFESTATION.
DIGITIZER.... a ROLE for a DIGITIZATION.
OWNER.... a ROLE for an INSTANCE.
LICENSEE.... a ROLE for an INSTANCE.
ARCHIVER.... a ROLE for an INSTANCE.
DISTRIBUTOR.... a ROLE for an INSTANCE.
UNIFORM-NAME. A UNIFORM-STRING for an Agent.
UNIFORM-TITLE. A UNIFORM-STRING and a TITLE.
ACTION. A modification of the world, from an input STATE to an output STATE. An ACTION may be primitive, or may be composed of one or more alternative ACTION-SEQUENCEs.
ACTION-SEQUENCE. A partial order of ACTIONs.
ACTOR. Either a Person, Organization, or computational AGENT. ACTORs are capable of ACTION, and they act according to PREFERENCEs.
PREFERENCE. A function which expresses an ACTOR's preferences between world STATEs. PREFERENCEs may be explicitly encoded as a utility function or a set of discrete goals, or may be left implicit in an ACTOR's behavior. (An open and difficult issue will be the degree to which we need to explicitly represent the conditionality of preferences on task contexts, and the roles actors play within task contexts!).
STATE. A set of Attributes with Values that partially describe the state of the world at some Timepoint.
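The foundation concepts above can be made concrete with a small Python sketch. It assumes a STATE is a dictionary of attribute values, an ACTION is a function from STATE to STATE, and a PREFERENCE is an explicit utility function; an ACTION-SEQUENCE is simplified to a plain list (one linearization of the partial order). The function names and the `balance` and `delivered` attributes are invented for the example.

```python
from typing import Callable, Dict, List

State = Dict[str, object]             # attributes with values at some timepoint
Action = Callable[[State], State]     # maps an input STATE to an output STATE
Preference = Callable[[State], float] # utility-function form of a PREFERENCE


def run_sequence(state: State, actions: List[Action]) -> State:
    """Apply each ACTION in order, returning the final output STATE."""
    for action in actions:
        state = action(state)
    return state


def choose(state: State, alternatives: List[List[Action]],
           preference: Preference) -> List[Action]:
    """Pick the alternative ACTION-SEQUENCE whose output STATE is preferred."""
    return max(alternatives, key=lambda seq: preference(run_sequence(state, seq)))


def pay(state: State) -> State:
    """Hypothetical ACTION: spend 5 units of money."""
    return {**state, "balance": state["balance"] - 5}


def deliver(state: State) -> State:
    """Hypothetical ACTION: mark content as delivered."""
    return {**state, "delivered": True}


def utility(state: State) -> float:
    """Hypothetical PREFERENCE: delivered content is worth 10 units of money."""
    return (10.0 if state["delivered"] else 0.0) + state["balance"]


start = {"balance": 20, "delivered": False}
best = choose(start, [[], [pay, deliver]], utility)
```

Each ACTION returns a new dictionary rather than mutating its input, so the input STATE remains available for comparing alternatives.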
DELIVER-CONTENT.
DELIVER-INTELLECTUAL-WORK.
DELIVER-CONTRACT.
DELIVER-MONEY.
QUERY. A SERVICE that formulates and submits a request for a RECOMMENDation. (Associated services may include QUERY-FORMULATION, QUERY-SUBMISSION, and perhaps QUERY-STORAGE).
STANDING-QUERY. A QUERY that requests notification of newly available work, subject to a CONTRACT.
RECOMMEND. A SERVICE that answers a QUERY with a set of options.
RECOMMEND-COLLECTIONS. A RECOMMEND SERVICE that suggests COLLECTIONs.
RECOMMEND-WAYS-TO-INQUIRE. A RECOMMEND SERVICE that suggests other tools to use.
RECOMMEND-WAYS-TO-ALTER-QUERY. A RECOMMEND SERVICE that suggests revisions to the QUERY.
REGISTRATION. A SERVICE that produces METADATA and enters it in a REGISTRY.
AGENT-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a computational entity.
COLLECTION-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a GOOD.
SELF-REGISTRATION. A REGISTRATION SERVICE where the METADATA describes a person.
TRANSLATE. A SERVICE whose ACTION transforms content to a different but semantically equivalent form.
REGISTRY. A set of METADATA, including an ADDRESS for each entry.
ADDRESS. Data that can be used to locate some Thing.
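As a rough sketch of how REGISTRATION, REGISTRY, and ADDRESS fit together, the Python below stores METADATA as attribute/value pairs alongside an ADDRESS for each entry. The `Entry` and `Registry` classes, the `register` and `lookup` methods, and the sample addresses and attributes are all assumptions for illustration; the source does not prescribe an interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One REGISTRY entry: an ADDRESS plus the METADATA registered for it."""
    address: str
    metadata: Dict[str, str]


@dataclass
class Registry:
    """A set of METADATA, including an ADDRESS for each entry."""
    entries: List[Entry] = field(default_factory=list)

    def register(self, address: str, metadata: Dict[str, str]) -> Entry:
        """REGISTRATION: record METADATA and enter it in the REGISTRY."""
        entry = Entry(address, dict(metadata))
        self.entries.append(entry)
        return entry

    def lookup(self, **criteria: str) -> List[str]:
        """Return the ADDRESS of every entry whose METADATA matches all criteria."""
        return [
            entry.address
            for entry in self.entries
            if all(entry.metadata.get(k) == v for k, v in criteria.items())
        ]


registry = Registry()
registry.register("collection-1", {"kind": "collection", "topic": "astronomy"})
registry.register("agent-7", {"kind": "agent", "service": "TRANSLATE"})
astronomy_addresses = registry.lookup(kind="collection", topic="astronomy")
```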
AGREEMENT. A relationship between two or more ACTORs. May be explicit or implicit, requested or unsolicited.
CONTRACT. An AGREEMENT that is stipulated explicitly as a set of LICENSEs.
LICENSE. Authorization to provide a set of SERVICEs, granted by an ACTOR (the owner) to an ACTOR or class of ACTORs (holders). A LICENSE has a Duration (starting and ending Timepoints). Transfer of a LICENSE may be permitted if there is an associated REDISTRIBUTION-LICENSE.
USAGE-LICENSE. A LICENSE that applies to some class of work (authorizing SERVICEs whose ACTIONs modify attributes of the world's STATE that are associated with the work).
DERIVATION-LICENSE. A LICENSE to use work (EXPRESSIONs) in some way to produce new work.
REDISTRIBUTION-LICENSE. A LICENSE to convert a LICENSE of a holding ACTOR to a (potentially modified) LICENSE held by a member of some class of license-buyer ACTORs. Potential license-buyers may include all ACTORs, none, or some union of ACTOR classes. When a holder redistributes a LICENSE, that holder is no longer a holder (is this always true!?). (Note that modification of LICENSEs as part of transfer provides a way to represent exclusivity).
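The license definitions above can be illustrated with a short Python sketch. It is a simplification under several assumptions: holders are single named ACTORs rather than classes, the associated REDISTRIBUTION-LICENSE is reduced to a boolean flag, and redistribution returns a replacement license for the buyer, following the source's tentative reading that the original holder stops being a holder. All names (`License`, `permits`, `redistribute`) are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class License:
    """Authorization to provide a set of SERVICEs for a Duration."""
    owner: str
    holder: str
    services: FrozenSet[str]
    start: date
    end: date
    redistributable: bool = False  # stands in for a REDISTRIBUTION-LICENSE

    def permits(self, actor: str, service: str, on: date) -> bool:
        """True if `actor` holds this LICENSE for `service` on the given date."""
        return (
            actor == self.holder
            and service in self.services
            and self.start <= on <= self.end
        )


def redistribute(lic: License, buyer: str,
                 services: Optional[FrozenSet[str]] = None) -> License:
    """Convert a holder's LICENSE into a (possibly narrowed) LICENSE for a buyer."""
    if not lic.redistributable:
        raise PermissionError("no associated REDISTRIBUTION-LICENSE")
    # The modified license may only narrow the original set of SERVICEs.
    narrowed = lic.services if services is None else lic.services & services
    return replace(lic, holder=buyer, services=narrowed)


original = License(
    owner="publisher",
    holder="library",
    services=frozenset({"DELIVER-CONTENT", "TRANSLATE"}),
    start=date(1996, 1, 1),
    end=date(1996, 12, 31),
    redistributable=True,
)
sublicense = redistribute(original, "reader", frozenset({"DELIVER-CONTENT"}))
```

Narrowing the set of SERVICEs during transfer is one way to model the "potentially modified" LICENSE; representing exclusivity would need additional bookkeeping that this sketch omits.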