What is SGML?

SGML (Standard Generalized Markup Language) is a set of formal rules for defining markup schemes: which delimiters separate markup from text data, which names (used in tags) are allowed in which contexts, and so on. The HTML specs contain a formal definition written in the SGML formalism (hence the claim that "HTML is an SGML application"). An SGML parser is a program that can read this formal definition and use it to parse a tagged document, verifying in the process that the markup conforms to that definition.
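To make the idea of "checking a document against a formal definition" concrete, here is a toy sketch in Python. It is not an SGML parser and does not read real DTD syntax: the ALLOWED_CHILDREN table, the element names in it, and the validate() function are all made up for illustration. It only shows the general shape of the job, where a definition says which names are allowed in which contexts and a program checks a tagged document against it.

```python
import re

# Hypothetical, drastically simplified "definition": which element names
# may appear directly inside which parent. A real SGML definition also
# covers delimiters, attributes, ordering, tag omission, and much more.
ALLOWED_CHILDREN = {
    "#root": {"doc"},
    "doc": {"title", "para"},
    "title": set(),
    "para": {"em"},
    "em": set(),
}

# Matches simple start and end tags without attributes, e.g. <para> or </para>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


def validate(text):
    """Return a list of error messages; an empty list means the markup conforms."""
    errors = []
    stack = ["#root"]  # currently open elements, innermost last
    for match in TAG.finditer(text):
        closing, name = match.group(1), match.group(2).lower()
        if closing:
            # An end tag must close the innermost open element.
            if len(stack) > 1 and stack[-1] == name:
                stack.pop()
            else:
                errors.append(f"unexpected </{name}>")
        else:
            parent = stack[-1]
            if name not in ALLOWED_CHILDREN:
                errors.append(f"unknown element <{name}>")
            elif name not in ALLOWED_CHILDREN.get(parent, set()):
                errors.append(f"<{name}> not allowed inside <{parent}>")
            stack.append(name)
    # Anything still open at the end was never closed.
    for name in reversed(stack[1:]):
        errors.append(f"missing </{name}>")
    return errors


if __name__ == "__main__":
    print(validate("<doc><title>Hi</title><para>Some <em>text</em></para></doc>"))
    print(validate("<doc><em>misplaced</em></doc>"))
```

The first call prints an empty list because every element appears where the table allows it; the second reports that <em> is not allowed directly inside <doc>. A real SGML parser does the same kind of check, but against the full formal definition rather than a hand-written table.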

SGML is an international standard (ISO 8879), and a parser for it by James Clark (called the SP library) is considered by many to be as close to a reference implementation as you're likely to find. This free software is the guts of the on-line validation service at the W3C.

