CMS Pipelines
CMS Pipelines is a feature of the VM/CMS operating system that allows the user to create and use a pipeline. The programs in a pipeline operate on a sequential stream of records. A program writes records that are read by the next program in the pipeline. Any program can be combined with any other because reading and writing are done through a device-independent interface.
Paradigm | Dataflow programming |
---|---|
Designed by | John P. Hartmann (IBM) |
Developer | IBM |
First appeared | 1986 |
Stable release | 1.1.12/0012 / 2020-06-03 |
Platform | IBM z Systems |
OS | z/VM 7.1 |
Website | http://vm.marist.edu/~pipeline |
Influenced by | Pipeline (Unix) |
Overview
CMS Pipelines provides a CMS command, PIPE. The argument string to the PIPE command is the pipeline specification. PIPE selects programs to run and chains them together in a pipeline to pump data through.
Because CMS programs and utilities do not provide a device-independent stdin and stdout interface, CMS Pipelines has a built-in library of programs that can be called in a pipeline specification. These built-in programs interface to the operating system and perform many utility functions.
Data on CMS is structured in logical records rather than a stream of bytes. For textual data a line of text corresponds to a logical record. In CMS Pipelines the data is passed between the stages as logical records.
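The record-oriented hand-off can be illustrated with a minimal Python sketch. This is an analogy only, not CMS Pipelines code: the functions `locate` and `upper` are hypothetical stand-ins named after the behaviour they imitate, and Python generators are assumed here as a rough model of stages that receive and emit one whole record at a time.

```python
from typing import Iterable, Iterator


def locate(records: Iterable[str], needle: str) -> Iterator[str]:
    """Pass on only the records that contain `needle` (hypothetical stand-in)."""
    for record in records:
        if needle in record:
            yield record


def upper(records: Iterable[str]) -> Iterator[str]:
    """Translate each record to upper case (hypothetical stand-in)."""
    for record in records:
        yield record.upper()


# Each list element is one logical record; no stage scans for line separators.
records = ["Hello there", "goodbye", "Hello again"]
result = list(upper(locate(records, "Hello")))
```

Because every stage is handed a complete record, none of the functions needs to buffer input while searching for an end-of-line character.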
CMS Pipelines users issue pipeline commands from the terminal or in EXEC procedures. Users can write programs in REXX that can be used in addition to the built-in programs.
Example
A simple example reads a disk file and separates records containing the string "Hello" from those that do not. The selected records are modified by appending the string "World!" to each of them; the other records are translated to upper case. The two streams are then combined and the records are written to a new output file.
PIPE (end ?) < input txt
      | a: locate /Hello/
      | insert / World!/ after
      | i: faninany
      | > newfile txt a
      ? a:
      | xlate upper
      | i:
In this example, the < stage reads the input disk file and passes the records to the next stage in the pipeline. The locate stage separates the input stream into two output streams. The primary output of locate (records containing Hello) passes the records to the insert stage. The insert stage modifies the input records as specified in its arguments and passes them to its output. The output is connected to faninany, which combines records from all input streams to form a single output stream. The output is written to the new disk file.
The secondary output of locate (marked by the second occurrence of the a: label) contains the records that did not meet the selection criterion. These records are translated to upper case (by the xlate stage) and passed to the secondary input stream of faninany (marked by the second occurrence of the i: label).
The pipeline topology in this example consists of two connected pipelines. The end character (the ? in this example) separates the individual pipelines in the pipeline set. Records read from the input file pass through either of the two routes of the pipeline topology. Because neither of the routes contains stages that need to buffer records, CMS Pipelines ensures that records arrive at faninany in the order in which they passed through locate.
The example pipeline is presented in 'portrait form' with the individual stages on separate lines. When a pipeline is typed as a CMS command, all stages are written on a single line.
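The record flow of the example can be modelled in a few lines of Python. This is a hedged sketch of the behaviour described above, not CMS Pipelines code: the function name `run_example` and the sample records are assumptions made for illustration, and a single loop stands in for the two routes, which is only valid because neither route buffers records.

```python
def run_example(records):
    """Hypothetical model of the two routes between locate and faninany."""
    output = []  # stands in for the new output file
    for record in records:  # the < stage delivers one record at a time
        if "Hello" in record:
            # Primary output of locate, then: insert / World!/ after
            output.append(record + " World!")
        else:
            # Secondary output of locate, then: xlate upper
            output.append(record.upper())
        # faninany passes on whichever record arrives, so input order is kept
    return output


sample = ["Hello", "nothing here", "Hello again"]
merged = run_example(sample)
```

In this model `merged` holds the records in their original order, with the "Hello" records extended and the others in upper case, which mirrors the ordering guarantee described for the unbuffered routes.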
Features
The concept of a simple pipeline is extended in these ways:
- A program can define a subroutine pipeline to perform a function on all or part of its input data.
- A network of intersecting pipelines can be defined. Programs can be in several pipelines concurrently, which gives the program access to multiple data streams.
- Data passed from one stage to the next is structured as records. This allows stages to operate on a single record without the need for arbitrary buffering of data to scan for special characters that separate the individual lines.
- Stages normally access the input record in locate mode, and produce the output records before consuming the input record. This lock-step approach not only avoids copying the data from one buffer to the next; it also makes it possible to predict the flow of records in multi-stream pipelines.
- A program can dynamically redefine the pipeline topology. It can replace itself with another pipeline, it can insert a pipeline segment before or after itself, or both. A program can use data in the pipeline to build pipeline specifications.
CMS Pipelines offers several features to improve the robustness of programs:
- A syntax error in the overall pipeline structure or in any one program causes the entire pipeline to be suppressed.
- Startup of the programs in the pipeline and allocation of resources is coordinated by the CMS Pipelines dispatcher. Individual programs can participate in that coordination to ensure irreversible actions are postponed to a point where all programs in the pipelines have had a chance to verify the arguments and are ready to process data. When the pipeline is terminated, the dispatcher ensures resources are released again.
- Errors that occur while data flows in the pipeline can be detected by all participating programs. For example, a disk file might not be replaced in such circumstances.
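The all-or-nothing startup described in the list above can be approximated with a short Python sketch. It is an illustration under stated assumptions rather than a description of the CMS Pipelines dispatcher: the registry `STAGES`, the stage names `upper` and `save`, and the function `run` are all hypothetical, and the only check modelled is whether each stage name is known.

```python
written = []  # stands in for an irreversible action such as replacing a file


def to_upper(record):
    return record.upper()


def save(record):
    written.append(record)
    return record


# Hypothetical registry of available stages.
STAGES = {"upper": to_upper, "save": save}


def run(spec, records):
    """Check every stage in the specification before any record is processed."""
    names = [name.strip() for name in spec.split("|")]
    unknown = [name for name in names if name not in STAGES]
    if unknown:
        # The whole pipeline is suppressed; no stage has run yet.
        raise ValueError(f"unknown stage(s): {unknown}")
    for record in records:
        for name in names:
            record = STAGES[name](record)


run("upper | save", ["a", "b"])

rejected = False
try:
    run("save | uper", ["c"])  # misspelled second stage
except ValueError:
    rejected = True
```

In the second call the valid `save` stage comes before the misspelled one, yet nothing is written, because validation of the whole specification happens before any data moves.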
History
John Hartmann, of IBM Denmark, started development of CMS Pipelines in 1980.[1] It was marketed by IBM as a separate product during the 1980s and integrated into VM/ESA in late 1991. With each release of VM, the CMS Pipelines code was upgraded as well, until it was functionally frozen at the 1.1.10 level in VM/ESA 2.3 in 1997. Since then, the latest level of CMS Pipelines has been available for download from the CMS Pipelines homepage for users who wish to explore new function.
The current level of CMS Pipelines has again been included in z/VM releases since z/VM 6.4, which became available on November 11, 2016.
An implementation of CMS Pipelines for TSO was released in 1995 as BatchPipeWorks in the BatchPipes/MVS product. The up-to-date TSO implementation was available as a Service Offering from IBM Denmark until 2010.
Both versions are maintained from a single source code base and commonly referred to as CMS/TSO Pipelines. The specification is available in the Author's Edition.[2]
References
- VM and the VM Community, Melinda Varian
- CMS/TSO Pipelines Author's Edition