Technology for the Rest of Us:
What Every Librarian Should Understand about the Technologies that Affect Us

XML Glossary

Ron Gilmour



ASCII

(American Standard Code for Information Interchange)

A character encoding of 128 characters, including uppercase and lowercase Latin letters, digits, punctuation, and control codes; often referred to as "plain text."

Attribute

A name-value pair that is associated with an element. It is located after the element name in the opening tag.

Comment

A bit of code intended only for human readers of the source document and not meant to be processed by parsing software.

CSS

(Cascading Style Sheets)

A system of specifying presentation by providing formatting rules to elements.

Data-typing

Providing information about what kind of data is contained in a given data field, element, or attribute (e.g., textual data, numeric data, a US zip code).

DBMS

(Database Management System)

A software system designed for the storage, retrieval, and manipulation of data, generally in a binary (non-ASCII) form.

div

An HTML element that indicates a section (division) of a document; as a block-level element it begins on a new line and is followed by a line break, but it has no other presentational effects.

DTD

(Document Type Definition)

A file that contains declarations of elements, attributes, and entities and rules for how they are to be combined in a document.

Dublin Core

A set of standard metadata elements (using that word in a loose, non-XML specific sense) often used in describing electronic resources.

EAD

(Encoded Archival Description)

An SGML/XML application for encoding archival finding aids.

Element

The basic unit of XML markup, consisting of an opening tag, closing tag, and any content that comes between them.
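As a minimal sketch of how an element and its attribute look to software, the following uses Python's standard xml.etree.ElementTree module. The "book" record, its "id" attribute, and the title text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: a "book" element with one attribute and one child element.
xml_text = '<book id="b1"><title>Sample Title</title></book>'

book = ET.fromstring(xml_text)

element_name = book.tag                 # the name found in the opening and closing tags
attributes = dict(book.attrib)          # name-value pairs from the opening tag
title_text = book.find("title").text    # content between the child's opening and closing tags
```

Here the opening tag is `<book id="b1">`, the closing tag is `</book>`, and everything between them is the element's content.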

Entity

A name assigned to a piece of data by declaration in a DTD.

Entity reference

A string that "calls" or refers to an entity; it begins with an ampersand and ends with a semicolon (for example, &amp;).
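A short sketch of entity expansion, assuming Python's standard xml.etree.ElementTree parser: the entity name "loc" and its replacement text are hypothetical and declared in an internal DTD subset, while &amp; is one of the entities predefined by XML itself.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE declares a hypothetical entity named "loc".
# The element content then refers to it with &loc; and to the
# predefined ampersand entity with &amp;.
source = (
    '<!DOCTYPE note [<!ENTITY loc "Library of Congress">]>'
    '<note>&loc; &amp; partners</note>'
)

note = ET.fromstring(source)
expanded_text = note.text   # both references are replaced by the parser
```

After parsing, the application sees only the replacement text, not the references.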

FOP

(Formatting Objects Processor)

A program from the Apache XML Project that can create PDF files from XSLFO documents.

GML

(Generalized Markup Language)

A meta-language developed in 1969 that is ancestral to SGML and XML.

HTML

(Hypertext Markup Language)

An SGML application (reformulated in XML syntax as XHTML) used for text and image presentation on the World Wide Web.

li

HTML element for a list item; used as a child element of either an ordered list (ol) or unordered list (ul) element.

MARC XML

A standard, proposed by the Library of Congress, for reformulating MARC to conform to XML syntax.

Markup language

"A set of rules for representing data and encoding structures that surround the data." (Ray, 2003)

Metadata

Data about data; in the database field, this term often refers to things like number and types of fields in a table; in the library world, it often refers to data about electronic resources.

Meta-language

A standard that specifies how specific markup languages may be created. GML, SGML, and XML are examples of meta-languages.

METS

(Metadata Encoding & Transmission Standard)

A very general metadata format for use with electronic resources, developed as an initiative of the Digital Library Federation and maintained by the Library of Congress.

MODS

(Metadata Object Description Schema)

An XML format developed by the Library of Congress for bibliographic metadata; its elements are a subset of those available in MARC.

Namespace

A particular set of elements and attributes identified by a URI and labeled using a designated namespace prefix and a colon in front of the element or attribute name.
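To make the prefix-and-URI relationship concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. The "record" wrapper element and the title text are invented; the URI is the one used for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

DC_URI = "http://purl.org/dc/elements/1.1/"

# The xmlns:dc attribute binds the prefix "dc" to the namespace URI.
record = ET.fromstring(
    '<record xmlns:dc="' + DC_URI + '">'
    '<dc:title>Sample Title</dc:title>'
    '</record>'
)

# Internally the parser replaces the prefix with the full URI in braces.
qualified_tag = record[0].tag

# When searching, a prefix-to-URI mapping lets us use the short form again.
found_title = record.find("dc:title", {"dc": DC_URI}).text
```

The prefix itself is only a label; it is the URI that identifies the namespace.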

OAI

(Open Archives Initiative)

An initiative best known for its "harvesting protocol" (OAI-PMH, the Protocol for Metadata Harvesting), which is designed to collect metadata from participating institutions with the goal of facilitating open access to scholarly information.

ol

HTML element representing an "ordered list"; contains li elements.

Parser

Software that reads and "digests" XML code so that the information contained in the XML is made accessible to other software. Parsers are classified as "validating" or "non-validating" based on whether or not they check the document they are processing against its DTD or XML schema. Examples of parsers include MSXML and Xerces.

PCDATA

(Parsed Character Data)

A type of element content indicating that the element may contain only textual data and may not contain any child elements.

Presentational

Referring to how a document appears when viewed with a particular reader application.

PI

(Processing Instruction)

An XML construct used to pass instructions to some processing software. One common use of the PI is to specify a stylesheet to be applied to an XML document.
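As an illustration of the stylesheet use mentioned above, this sketch reads an xml-stylesheet processing instruction with Python's standard xml.dom.minidom module. The file name "catalog.xsl" and the "catalog" element are hypothetical.

```python
from xml.dom import minidom

# A document whose first construct is a PI naming a (hypothetical) stylesheet.
source = '<?xml-stylesheet type="text/xsl" href="catalog.xsl"?><catalog/>'

doc = minidom.parseString(source)

# Pick out the processing instruction node among the document's children.
pi = next(
    node for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
)

pi_target = pi.target   # names the software or purpose the instruction is meant for
pi_data = pi.data       # the instruction itself, passed along uninterpreted
```

The parser does not act on the instruction; it simply hands the target and data to whatever software cares about them.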

Schema

In the broadest sense, a set of rules dictating the structure and allowed elements and attributes in an XML document. Sometimes used more restrictively to indicate native XML ways of expressing these rules (in contrast to a DTD).

Semantic

Relating to meaning (as opposed to appearance).

SGML

(Standard Generalized Markup Language)

The descendant of GML that was adopted as an international standard (ISO 8879) by the International Organization for Standardization in 1986; "parent language" of XML.

span

An HTML element designating a particular inline area of text, with no predefined presentational features attached to it.

Tag

An element name enclosed in angle brackets; an opening tag and a closing tag together delimit the content of an element.

TEI

(Text Encoding Initiative)

An SGML/XML-based markup language used to facilitate the scholarly analysis of texts by computers.

Template

The basic unit of an XSLT stylesheet; a "recipe" for the code that should be produced when a particular element (designated by the "match" attribute) is encountered by the XSLT processor.

ul

HTML element representing an "unordered list"; contains li elements.

Unicode

A character set that attempts to include characters from all the world's major scripts.
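The difference between ASCII and Unicode can be sketched with Python's built-in string functions. The two sample characters are arbitrary choices for illustration.

```python
# "A" falls within ASCII's 128 characters; the Greek capital omega does not.
latin_a = "A"
omega = "\u03a9"

latin_a_code = ord(latin_a)         # numeric code of the character
omega_code = ord(omega)

latin_a_is_ascii = latin_a_code < 128
omega_is_ascii = omega_code < 128

# Unicode text is stored as bytes using an encoding such as UTF-8.
omega_utf8 = omega.encode("utf-8")

# Trying to store the omega as ASCII fails, because ASCII has no code for it.
try:
    omega.encode("ascii")
    ascii_encoding_failed = False
except UnicodeEncodeError:
    ascii_encoding_failed = True
```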

URI

(Uniform Resource Identifier)

Strings of characters that identify resources on the Internet. URLs are a common type of URI.

URL

(Uniform Resource Locator)

The address of an object available via the Internet.

Valid

Conforming to a particular DTD or XML schema.

W3C

(World Wide Web Consortium)

The standards body that created XML and that develops and maintains other web standards, including HTML.

Well-formed

Conforming to the fundamental syntactic rules of XML, but not necessarily conforming to any particular DTD or XML schema.
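A minimal sketch of a well-formedness check, assuming Python's standard xml.etree.ElementTree parser. The function name is_well_formed is our own. This parser is non-validating, so the check says nothing about whether a document is valid against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text obeys XML's basic syntactic rules."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser stops at the first syntax violation it encounters.
        return False
    return True


properly_nested = is_well_formed("<a><b></b></a>")
badly_nested = is_well_formed("<a><b></a></b>")      # tags overlap
unclosed = is_well_formed("<a><b></b>")              # root element never closed
```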

WML

(Wireless Markup Language)

Simple markup language for the presentation of data on cellular phones and PDAs.

Xalan

An XSLT processing program from the Apache XML Project.

Xerces

An open-source validating XML parser from the Apache XML Project.

XHTML

(Extensible Hypertext Markup Language)

A reformulation of HTML to conform to XML syntax.

XML

(Extensible Markup Language)

A simplified version of SGML designed for use on the World Wide Web. XML is a meta-language, not a specific markup language.

XML MARC

A standard, proposed by the Lane Medical Library at Stanford University Medical Center, for reformulating MARC to conform to XML syntax.

XOBIS

"Light" version of Stanford's XML MARC specification.

XPath

(XML Path Language)

A system for addressing a particular part of an XML document; used extensively in XSLT.
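As a rough illustration of addressing part of a document, the sketch below uses the limited subset of XPath supported by Python's standard xml.etree.ElementTree module. The catalog, its "lang" attribute, and the titles are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog with three books in two languages.
catalog = ET.fromstring(
    "<catalog>"
    "<book lang='en'><title>First Title</title></book>"
    "<book lang='fr'><title>Second Title</title></book>"
    "<book lang='en'><title>Third Title</title></book>"
    "</catalog>"
)

# Path: book children of the root whose lang attribute is 'en', then their title children.
path = "./book[@lang='en']/title"
english_titles = [title.text for title in catalog.findall(path)]
```

A full XPath processor, such as the one inside an XSLT engine, accepts many more expressions than this subset.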

XSL

(Extensible Stylesheet Language)

A set of standards that allow presentation instructions to be encoded in XML and applied to XML documents.

XSLFO

(XSL Formatting Objects)

The part of XSL that addresses presentational issues such as colors, margins, and fonts.

XSLT

(XSL Transformations)

A programming language for writing instructions for the transformation of an XML document into some other format, usually either another flavor of XML or HTML.