Document Structuring Conventions
Document Structuring Conventions, or DSC, is a set of comment-based standards for PostScript that primarily specifies how to structure a PostScript file and how to expose that structure in machine-readable form. A PostScript file that conforms to DSC is called a conforming document.
The need for a structuring convention arises because PostScript is a Turing-complete programming language. There is thus no guaranteed method, short of actually printing the document, to determine how many pages long a given document is, how large a given page is, or how to skip to a particular page. The addition of structure, with DSC comments exposing that structure, allows, for example, an intelligent print spooler to rearrange the pages for printing, or a page layout program to find the bounding box of a PostScript file used as a graphic image. Collectively, any such program that takes PostScript files as input data is called a document manager.
In order for a PostScript print file to properly distill to PDF using Adobe tools, it should conform to basic DSC standards.
Some DSC comments serve a second function: they instruct the document manager to perform certain actions, such as inserting a font or other PostScript code (collectively called resources) into the file. DSC comments that serve this second function are more akin to preprocessing directives and are not purely comments. Documents using those kinds of DSC comments require a functioning document manager to come out as intended; sending them directly to a printer will not work.
DSC is the basis for encapsulated PostScript; EPS files are conforming documents with further restrictions.
The set of DSC comments can be expanded by a mechanism called the Open Structuring Conventions, which, together with the EPS specification, form the basis of early versions of the Adobe Illustrator Artwork file format.
DSC at a glance
The basic premise of DSC is the separation of prolog (static definitions) and script (code that affects job-specific printed output), plus the disallowing of certain PostScript operators deemed inappropriate for page descriptions. This ensures a basic level of predictability in the PostScript code, thus forming the basis of document manageability.
An optional, additional layer of document manageability is provided by separating the script into a document setup section, zero or more functionally independent pages, and an optional trailer (cleanup code). (“Zero pages” in DSC usually means “one page without the use of the PostScript ‘showpage’ operator”.) The functional independence between pages, together with the disallowing of further PostScript operators in the pages section, forms the basis for page independence, which allows pages to be reordered and to be accessed independently and at random.
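The page-level structure described above can be illustrated with a minimal Python sketch of what a document manager might do. The function name split_sections and the sample document are assumptions made for illustration; the section keywords in the sample (%%BeginProlog, %%EndProlog, %%Page:, %%Trailer, %%EOF) are taken from the DSC specification rather than from the text above. The sketch assumes a flat, page-independent document with no embedded documents.

```python
def split_sections(text):
    """Split a conforming document into (preamble, pages, trailer) line lists.

    Assumes a flat document: everything before the first %%Page: comment is
    treated as preamble (header, prolog, setup), and each %%Page: comment
    starts a new page that runs until the next page or the trailer.
    """
    preamble, pages, trailer = [], [], []
    current = preamble
    for line in text.splitlines():
        if line.startswith("%%Page:"):
            pages.append([])          # a new page begins at this comment
            current = pages[-1]
        elif line.startswith("%%Trailer"):
            current = trailer         # everything from here on is cleanup
        current.append(line)
    return preamble, pages, trailer


# Hypothetical two-page document used only to exercise the function.
SAMPLE = """%!PS-Adobe-3.0
%%Pages: 2
%%EndComments
%%BeginProlog
/inch { 72 mul } def
%%EndProlog
%%Page: 1 1
1 inch 1 inch moveto
showpage
%%Page: 2 2
2 inch 2 inch moveto
showpage
%%Trailer
%%EOF
"""

preamble, pages, trailer = split_sections(SAMPLE)

# Because the pages are independent, they can be emitted in another order.
reversed_lines = (
    preamble
    + [line for page in reversed(pages) for line in page]
    + trailer
)
```

A real document manager would also renumber the ordinals in the %%Page: comments and update the header after reordering; the sketch leaves those steps out.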
This imposed structure is then exposed by delimiting the PostScript file with DSC comments, which normally begin with two percent signs followed by a keyword. Some keywords need to be followed by a colon, an optional space character, and then a series of arguments.
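As a rough illustration of this syntax, the following Python sketch splits a single line into a keyword and its argument text. The function name parse_dsc_comment is hypothetical, and the sketch does not handle continuation lines or quoted arguments; it only separates the parts named in the paragraph above.

```python
def parse_dsc_comment(line):
    """Return (keyword, arguments) for a DSC comment line, else None.

    arguments is the raw text after the colon, or None when the keyword
    takes no arguments.
    """
    line = line.rstrip("\r\n")
    if not line.startswith("%%"):
        return None                       # ordinary PostScript or plain comment
    keyword, colon, arguments = line[2:].partition(":")
    if not colon:
        return keyword.strip(), None      # keyword without arguments
    return keyword, arguments.strip()     # drop the optional space after ':'
```

For example, parse_dsc_comment("%%Pages: 1") yields the keyword Pages with the argument text 1, while a line of ordinary PostScript code yields None.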
Finally, the document is marked as conforming by starting it with a comment starting with “%!PS-Adobe-” followed by the DSC version number.
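A document manager could check this first line before trusting any other DSC comment. The following Python sketch is one way to do so; the function name dsc_version and the assumption that the version is written as digits, a period, and digits are illustrative choices.

```python
import re

# Matches the conformance marker at the very start of the first line.
_VERSION_LINE = re.compile(r"^%!PS-Adobe-(\d+\.\d+)")


def dsc_version(first_line):
    """Return the DSC version string from the first line, or None."""
    match = _VERSION_LINE.match(first_line)
    return match.group(1) if match else None
```

A file whose first line is only "%!PS" would return None here, signalling that the file does not claim to be a conforming document.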
Sections of reusable PostScript code can be modularized into procsets (procedure sets, corresponding to function libraries in other programming languages), in order to ease the generation of PostScript code. Procsets and other PostScript resources (for example, fonts) can be omitted from the PostScript file itself, and externally referenced by a directive-like DSC comment; such external referencing, however, can only work with a document manager that understands such DSC comments.
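The external referencing described above can be sketched in Python as a substitution pass run by a document manager. This is a simplified sketch under stated assumptions: the function name expand_resources, the in-memory library dictionary, and the procset name demo-lib are all hypothetical, and the %%IncludeResource:, %%BeginResource:, and %%EndResource keywords are taken from the DSC 3.0 specification rather than from the text above.

```python
def expand_resources(lines, library):
    """Replace %%IncludeResource: directives with the resource's code.

    library maps a resource description (the text after the colon) to a
    list of PostScript lines. Directives for unknown resources are left
    untouched so that a later stage can still resolve them.
    """
    output = []
    for line in lines:
        if line.startswith("%%IncludeResource:"):
            key = line.split(":", 1)[1].strip()
            if key in library:
                # Wrap the inserted code so the result stays structured.
                output.append("%%BeginResource: " + key)
                output.extend(library[key])
                output.append("%%EndResource")
                continue
        output.append(line)
    return output


# Hypothetical resource library and document fragment.
library = {"procset demo-lib 1.0 0": ["/inch { 72 mul } def"]}
fragment = [
    "%%BeginSetup",
    "%%IncludeResource: procset demo-lib 1.0 0",
    "%%EndSetup",
]
expanded = expand_resources(fragment, library)
```

If the fragment were sent to a printer without this pass, the directive would be treated as an ordinary comment and the procedure would never be defined.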
DSC version 3.0 was released on September 25, 1992. The specification states, "Even though the DSC comments are a layer of communication beyond the PostScript language and do not affect the final output, their use is considered to be good PostScript language programming style." Thus, most PostScript-producing programs output DSC-conformant comments along with the code, although some such programs do not actually produce conforming documents.
Example
A DSC-conforming document (this one generated by dvips) might begin:
%!PS-Adobe-2.0
%%Creator: dvips(k) 5.95a Copyright 2005 Radical Eye Software
%%Title: texput.dvi
%%Pages: 1
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentPaperSizes: Letter
%%EndComments
which has the following meaning:
- %!PS-Adobe-2.0 marks the document as conforming to version 2.0 of the DSC
- %%Creator identifies the PostScript-producing program as dvips 5.95a
- %%Title identifies the document title
- %%Pages tells the document manager that the document consists of one page
- %%PageOrder tells the document manager that pages are independent (i.e., not in Special ordering) and appear in ascending order in the document; in this example, since the document only consists of one page, this information is not usually relevant, but will be needed if additional pages are to be inserted by a document manager
- %%BoundingBox tells the document manager the coordinates, measured in PostScript points, of the bounding box for all the pages taken together; 0 0 612 792 gives the lower-left and upper-right corners of a US Letter–sized page
- %%DocumentPaperSizes tells the document manager what kind of paper sizes are used in the whole document; in this example only one size is used, namely the US Letter size
- %%EndComments marks the end of the header comments
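To show how a document manager might use such a header, the following Python sketch reads the example above into a dictionary and derives the page count and page size from it. The function name read_header is hypothetical, and the sketch assumes that every value is given directly in the header rather than deferred to the end of the document.

```python
HEADER = """%!PS-Adobe-2.0
%%Creator: dvips(k) 5.95a Copyright 2005 Radical Eye Software
%%Title: texput.dvi
%%Pages: 1
%%PageOrder: Ascend
%%BoundingBox: 0 0 612 792
%%DocumentPaperSizes: Letter
%%EndComments
"""


def read_header(text):
    """Collect 'keyword: value' header comments up to %%EndComments."""
    header = {}
    for line in text.splitlines():
        if line.startswith("%%EndComments"):
            break                          # the header ends here
        if not line.startswith("%%"):
            continue                       # skips the %!PS-Adobe- line
        keyword, colon, value = line[2:].partition(":")
        if colon:
            header[keyword] = value.strip()
    return header


header = read_header(HEADER)
page_count = int(header["Pages"])

# Bounding box corners are in PostScript points (72 points per inch).
llx, lly, urx, ury = (int(n) for n in header["BoundingBox"].split())
width_in = (urx - llx) / 72
height_in = (ury - lly) / 72
```

With the example header, the computed size is 8.5 by 11 inches, which matches the US Letter size named in %%DocumentPaperSizes.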
External links
- PostScript Language Document Structuring Conventions Specification (25 September 1992)