Cover of the ADL Reference Manual

1990-1991: DDM & DD&C

A key objective of DDM Architecture was transparency, the ability to redirect an application program to the files on a remote system without having to change or recompile the application program. This had been largely achieved for the interfaces used by programs to create, manage and access remote files; could data transparency now be achieved for the encodings and mappings of data flowing between programs and files on different computers? Data transparency had not been considered a problem in the first levels of DDM Architecture because all of the computers implementing DDM had been developed by the IBM Rochester laboratory. But DDM products for additional computer systems brought this problem to the forefront and required DDM architecture support for data description and conversion (DD&C).

John Hufford from the IBM Santa Teresa laboratory (near San Jose, CA) had gotten funding for what he called Distributed File Management (DFM). He wanted to provide DDM architecture connectivity between IBM Multiple Virtual Storage (MVS) and Virtual Storage Extended (VSE) mainframe computers and other IBM systems. These operating systems were used by many large customers, and untold gigabytes of data were stored in their Virtual Storage Access Method (VSAM) record-oriented files. DDM connectivity would unlock that data and make it available to computers in remote locations.

Two things had held back DFM. One was the lack of mainframe support for Systems Network Architecture (SNA) Advanced Program-to-Program Communication (APPC). Yes, it was available when running the Customer Information Control System (CICS), but it wasn't available to other MVS and VSE applications until APPC was implemented to support DRDA. Only then could DFM be considered.

The second problem was more difficult to resolve, namely the conversion of data as it flowed between computers. This was necessary because different types of computers encode data in different ways. For example, IBM mainframes encode character data in EBCDIC and floating point numbers in hexadecimal, while IBM Personal Computers encode character data in ASCII and floating point numbers according to IEEE standards. Conversions could be performed, but only if the encodings of the communicating systems were known in advance. For true transparency, it would be necessary to obtain descriptions of data as it existed and descriptions of data as it was needed so that conversions could be performed automatically.
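
To make the encoding differences concrete, here is a minimal Python sketch of the two conversions just mentioned. It assumes EBCDIC code page 037 (Python's `cp037` codec) for the character data and the 32-bit IBM hexadecimal floating point layout (1 sign bit, 7-bit excess-64 exponent of 16, 24-bit fraction); the function names are hypothetical and are not part of DDM or DFM.

```python
import struct

def ibm_hex_float_to_python(raw: bytes) -> float:
    """Decode a 4-byte big-endian IBM hexadecimal float into a Python float."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F      # power of 16, stored in excess-64 form
    fraction = word & 0x00FFFFFF        # 24-bit fraction, no hidden bit
    return sign * (fraction / 2**24) * 16.0 ** (exponent - 64)

def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> bytes:
    """Re-encode EBCDIC text as ASCII; raises if a character has no ASCII form."""
    return raw.decode(codepage).encode("ascii")

# A mainframe-style value converted for a PC: EBCDIC text and a hex float.
text = ebcdic_to_ascii(bytes.fromhex("C8C5D3D3D6"))
number = ibm_hex_float_to_python(bytes.fromhex("C276A000"))
ieee_bytes = struct.pack("<f", number)  # little-endian IEEE 754 single
```

Both functions only work because the source encoding is hard-coded, which is exactly the limitation described above: without a description of the data travelling with it, the converter has to know both encodings in advance.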

This was, at its core, a problem with the way in which the third-generation programming languages (3GLs) on MVS and VSE (RPG, COBOL, and PL/I) created the records they wrote to VSAM files. They each mapped records defined by language declaration statements onto computer memory as efficiently as possible, aligning fields on optimal bit, byte or word boundaries. They then wrote these memory-mapped records directly to VSAM files. However, the only descriptions of these records resided in 3GL declaration statements and in the mapping algorithms the compilers used. A common practice was to put these 3GL declaration statements in a separate file and then include the file at compile time in whatever programs needed to read or write records, but this only worked for a single 3GL when compiled on the same type of computer. An application written in any other language for any other type of computer was out of luck.

In contrast, file records on the System/38 (and its AS/400, iSeries and Power Systems successors) were described by means of Data Description Specifications (DDS) external to the programs that accessed them. The System/38 3GL compilers all used the DDS to generate appropriate record declaration statements. Further, the mapping of DDS described records to memory was the same for all of these languages. In this way, many inter-language issues disappeared, but how could data descriptions from existing mainframe 3GL programs be used? John Hufford also wanted a way to externally describe VSAM files. He asked the DDM architecture team to tackle this problem and I took the lead.

Conceptually, this was not all that complicated. Define a representation domain, a domain for short, as the ways in which data can be described, mapped and encoded in memory by a single programming language compiled for a single type of computer. Given a data description appropriate to one domain, and a second data description appropriate to a different domain, a program could be generated to convert the described data from one domain to the other.

What made this effort difficult is that existing record descriptions were specified in a variety of 3GL language declarations embedded in the code of individual programs. While it was possible to identify the various domains to be initially provided, such as MVS COBOL, System/38 DDS, or PC/DOS C++, it was not reasonable to expect each domain to interpret and appropriately map the declaration statements of all other domains. What was needed was an intermediate language that all domains could interpret. They could then map their own language's declarations to or from the intermediate language.

I named this intermediate description language A Data Language (ADL) as a complement to one of my favorite programming languages, A Programming Language (APL), though there was no relationship between them. An ADL program consists of two ADL DECLARE statements, one for a program's view of a record and a second for a file's view of the same record, plus ADL PLAN statements that specify which fields of the program's view are to be converted and assigned to the file's view, or vice versa. The ADL compiler used this information to generate a conversion program.
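
The shape of such a program can be sketched in Python. This is not ADL syntax and not the ADL compiler: the `Field`, `View` and `compile_plan` names are hypothetical, only fixed-length character and integer fields are handled, and the plan is reduced to a list of (source field, target field) pairs. It is only meant to show how two views plus a plan are enough to produce a conversion routine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    name: str
    kind: str      # "char" or "int"
    length: int    # size in bytes

@dataclass(frozen=True)
class View:
    fields: tuple     # Field objects in record order
    codepage: str     # Python codec name, e.g. "cp037" or "ascii"
    byte_order: str   # "big" or "little"

def decode(view, record):
    """Split a raw record into Python values according to one view."""
    values, offset = {}, 0
    for f in view.fields:
        raw = record[offset:offset + f.length]
        offset += f.length
        if f.kind == "char":
            values[f.name] = raw.decode(view.codepage)
        else:
            values[f.name] = int.from_bytes(raw, view.byte_order, signed=True)
    return values

def encode(view, values):
    """Lay Python values out as a raw record according to one view."""
    out = bytearray()
    for f in view.fields:
        value = values[f.name]
        if f.kind == "char":
            blank = " ".encode(view.codepage)
            out += value.encode(view.codepage).ljust(f.length, blank)[:f.length]
        else:
            out += value.to_bytes(f.length, view.byte_order, signed=True)
    return bytes(out)

def compile_plan(src, dst, plan):
    """Return a converter; every field of dst must appear as a plan target."""
    def convert(record):
        values = decode(src, record)
        return encode(dst, {target: values[source] for source, target in plan})
    return convert

# File's view: EBCDIC, big-endian. Program's view: ASCII, little-endian.
file_view = View((Field("NAME", "char", 5), Field("QTY", "int", 2)), "cp037", "big")
program_view = View((Field("qty", "int", 4), Field("name", "char", 8)), "ascii", "little")
to_program = compile_plan(file_view, program_view, [("NAME", "name"), ("QTY", "qty")])

record = "HELLO".encode("cp037") + (258).to_bytes(2, "big")
converted = to_program(record)
```

Swapping the two views and reversing the pairs in the plan gives the converter for the opposite direction, which corresponds to the "or vice versa" above.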

The definition of ADL was the work of many months. I was assisted by Koko Yamaguchi from IBM's Santa Teresa laboratory and by Elaine Patry, Marsha Brown and Ken Sissors from IBM's Manassas, VA laboratory. We formally defined ADL's grammar using Backus-Naur Form (BNF), and filled in all of the details about the many ways of encoding text and numbers with full consideration for all the known ways fields could be aligned in memory in fixed and varying configurations. This was accompanied by an appendix that provided precise specifications as to how all valid data conversions were to be performed.

However, it was never my intention that the textual form of ADL descriptions would be carried in file metadata or that it would be directly used by the ADL compiler. Instead, I defined DDM objects isomorphic to ADL statements to be used as an encoded form of ADL because it was much easier to interpret, store, transmit and compile into conversion programs. It was this encoded DDM form that I had proposed for DRDA use and which Systems Application Architecture (SAA) management had rejected without any technical assessment. To say I was frustrated is an understatement; they had opted for a poor substitute. I could live with their decision regarding DRDA, but this was not what DFM wanted, so I continued to work on ADL and its DDM encodings.

An important part of DD&C architecture was the concept of domain managers, polymorphic programs (that respond to the same interfaces) created for each representation domain. Each of these programs had three responsibilities. The first was to parse the data declaration statements of its language and generate the corresponding DDM encodings of ADL. The second was to generate declaration statements of its language from DDM encodings of ADL. And the third was to translate the DDM encodings of ADL received from a different representation domain into the DDM encodings of ADL in its own representation domain.

Together, a suite of domain managers would have the flexibility needed to work with the data declaration statements of all of their domains and thereby achieve the data transparency that was a fundamental objective of DDM Architecture. For example, if an AS/400 RPG program needed to access a file created by an MVS COBOL program, the MVS COBOL domain manager could parse the COBOL declaration statements and generate DDM encodings of ADL. The AS/400 RPG domain manager could then translate them to DDM encodings of ADL appropriate to AS/400 DDS, and then generate the corresponding DDS declaration statements, which could be used by the AS/400 RPG compiler. The two ADL views of the data could then be fed into the ADL compiler to generate a program to perform data conversions.
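
The three responsibilities and the COBOL-to-DDS walk-through can be sketched as a common Python interface. Everything here is an illustrative assumption: plain dictionaries stand in for the DDM encodings of ADL, the toy COBOL parser understands only `PIC X(n)` fields, and the DDS-like output is a simplified made-up format rather than real column-positioned DDS.

```python
import re
from abc import ABC, abstractmethod

class DomainManager(ABC):
    """Same interface for every domain; dicts stand in for encoded ADL."""
    domain = "unspecified"

    @abstractmethod
    def parse(self, declarations: str) -> dict:
        """Responsibility 1: language declarations -> encoded description."""

    @abstractmethod
    def generate(self, description: dict) -> str:
        """Responsibility 2: encoded description -> language declarations."""

    def translate(self, description: dict) -> dict:
        """Responsibility 3: re-express another domain's description here."""
        fields = [(self.map_name(n), kind, size)
                  for n, kind, size in description["fields"]]
        return {"domain": self.domain, "fields": fields}

    def map_name(self, name: str) -> str:
        return name

class ToyCobolManager(DomainManager):
    domain = "MVS COBOL"
    _line = re.compile(r"^\s*\d+\s+([A-Z0-9-]+)\s+PIC\s+X\((\d+)\)\.\s*$")

    def parse(self, declarations):
        fields = []
        for line in declarations.splitlines():
            match = self._line.match(line)
            if match:  # anything other than PIC X(n) is silently skipped
                fields.append((match.group(1), "char", int(match.group(2))))
        return {"domain": self.domain, "fields": fields}

    def generate(self, description):
        return "\n".join(f"05 {n} PIC X({size})."
                         for n, _, size in description["fields"])

class ToyDdsManager(DomainManager):
    domain = "AS/400 DDS"

    def map_name(self, name):
        # Illustrative renaming rule: drop hyphens, keep at most 10 characters.
        return name.replace("-", "")[:10]

    def parse(self, declarations):
        fields = []
        for line in declarations.splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[1].endswith("A"):
                fields.append((parts[0], "char", int(parts[1][:-1])))
        return {"domain": self.domain, "fields": fields}

    def generate(self, description):
        return "\n".join(f"{n} {size}A" for n, _, size in description["fields"])

# Walk-through: COBOL declarations -> file view -> DDS view -> DDS-like text.
cobol_source = "05 CUST-NAME PIC X(20).\n05 CUST-CITY PIC X(15)."
file_view = ToyCobolManager().parse(cobol_source)
dds_manager = ToyDdsManager()
program_view = dds_manager.translate(file_view)
dds_text = dds_manager.generate(program_view)
```

Because every manager answers to the same three calls, adding a domain means writing one new class rather than teaching each existing domain about the newcomer's declaration syntax.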

There was only one little problem. The owners of the programming languages in the IBM Toronto laboratory refused to work with us to create the necessary domain managers. Months of negotiations led nowhere. Their primary excuse was that most data conversions violated the integrity of the data declarations in their languages. As examples, conversions between IEEE and hexadecimal forms of floating point numbers inevitably lose precision, and there are character set differences between ASCII and EBCDIC encodings that cause losses during conversions. The Toronto language people were correct, but was it really their right to decide what losses their customers would find acceptable? I was not able to get past their excuses and gain their support. In hindsight, I should have engaged the Toronto owners of the 3GLs from the very beginning of the DD&C effort and gotten their commitment to develop the necessary domain managers. I just hadn't seen them as stakeholders in the architecture design process.

For me, wrapping up the DD&C effort consisted of publishing the ADL specifications and writing an article for the IBM Systems Journal. I then left the DDM Architecture team for a position in the IBM Rochester laboratory. I assumed that without the necessary domain managers DD&C was a failure, in spite of the tremendous effort we had expended on it. But that was not completely true. The DFM team used ADL in a more direct fashion than I had imagined, without domain managers. Where data conversion was necessary, they required application programmers to manually create ADL programs that DFM could call when VSAM files were accessed by remote programs. The DDM objective of data transparency was sacrificed, but at least ADL, the kernel of DD&C Architecture, was realized as part of a real product. Actually two products, because roughly the same approach was used by the IBM 4680 Store System DDM product.

Cited references

  1. R. A. Demers and K. Yamaguchi, Data Description and Conversion Architecture, IBM Systems Journal 31, No. 3, 488-515, 1992.
  2. Distributed Data Management Architecture: Specifications for A Data Language Level 1, SC21-8286, IBM Corporation, 1992.
  3. 4680 DDM User's Guide, IBM Corporation, 1991.

