LEO III and PROGRAMMING TECHNIQUES

1 INTRODUCTION

This document has been produced by people who programmed the LEO III computer in the 1960s to describe some of the techniques and methods which were in use at the time. The objective is to provide information to help people understand how the LEO III was used for data processing using the hardware and software available at the time. These techniques and methods are now obsolete and hence are probably not understood by those who did not experience computing in the 1960s.

This document is part of the project to emulate and run programs written for LEO III. Original reference manuals are available and should be consulted for technical information on the hardware and software. They came in 5 volumes:-

Volume 1 deals with the hardware and includes the machine code instruction set

Volume 2 is a reference manual for CLEO, the high level language, similar to COBOL

Volume 3 is a reference manual for Intercode, the assembler language

Volume 4 describes the Master programme, which was the resident programme to control loading and time slicing for user programmes and the actual input and output of data

Volume 5 describes the standard programmes which were available such as sort, merge, file print, etc.

Cross references to these manuals in this document will be in the form VOL n s.t, where n is the volume number and s.t the section number within the volume.

2 OVERVIEW

2.1 The Computer and Its Users

The LEO III computer was physically large, occupying an air conditioned room. There was no electronic communication between the computer and equipment outside of the computer room. Any information the user wanted to input to the computer had to be converted to a form the computer could read and be physically moved into the computer room. Typically this took the form of paper tape or punched cards. (See section 3 for more information)

Any information the user required from the computer was printed and delivered to the user. The printing was put onto continuous fan folded, sprocket fed stationery. Most printing was on plain paper and kept in this form by the user, but pre-printed stationery could be used (e.g. invoices or pay slips), although this may have required further handling to remove the sprocket holes and burst the output into single sheets. Multi-part stationery, interleaved with carbon paper, or chemically treated, was available for additional copies. (See section 9 for more information)

As far as the user was concerned, there was no direct interaction with the computer. They wrote their input on forms, sent it to be punched onto paper tape or cards and waited (sometimes hours) for the printouts to be returned to them. This applied to both programmers and end users.

The other users of the computer were the operators who actually decided which programs to run, and then loaded any input data and magnetic tapes and appropriate stationery for printing. They were the only people who had any direct contact with the computer.

2.2 Mass Data Storage

The only form of mass storage of data was magnetic tape (MT). This consisted of spools of tape typically 2400 feet in length. The programmes were also held on MT. (see VOL 1 12)

Magnetic tape could only be processed by reading or writing it sequentially along its length. Under normal conditions, a tape could be used for reading or writing but not both at the same time. These constraints controlled the way programmes were designed. For instance, to update a simple file containing a list of part numbers and descriptions with additions and deletions would require the old file on a MT to be read and copied to the new file on another MT, while deleting the unwanted and inserting the new. This would only work if everything, including the changes, was in part number order. (See sections 6 to 8 for more information)

A MT could be divided into "lumps" by writing a special type of information called "Marks". This enabled the MT to be aligned quickly at a mark by the hardware without having to read all the information on the MT. Marks could be given an identifier so that the MT could be aligned to a specified mark. Marks were often used to provide restart facilities for long processing runs.

The physical input to and output from MT was done in units called blocks. A block could contain more than one logical record. The Intercode programmer was responsible for packing and unpacking the records into blocks and for controlling the reading and writing of the physical blocks. A record could be of fixed or variable length. The CLEO programmer dealt only with logical records and the compiler packed and unpacked them into blocks.
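
To illustrate in modern terms what the Intercode programmer had to do by hand, here is a minimal Python sketch of packing length prefixed records into fixed size blocks and unpacking them again. (Python is used purely as descriptive notation; the block size, the length prefix and the zero padding are illustrative assumptions, not the actual LEO III layout.)

    BLOCK_SIZE = 128  # illustrative block size in words, not the actual LEO III value

    def pack_records(records):
        """Pack variable length records into fixed size blocks, each record
        prefixed by its length in words so it can be unpacked again.
        Records are assumed to fit within one block."""
        blocks, current = [], []
        for rec in records:
            item = [len(rec)] + rec                            # length prefix, then the record
            if len(current) + len(item) > BLOCK_SIZE:
                current += [0] * (BLOCK_SIZE - len(current))   # pad the block with zeros
                blocks.append(current)
                current = []
            current += item
        if current:
            current += [0] * (BLOCK_SIZE - len(current))
            blocks.append(current)
        return blocks

    def unpack_records(blocks):
        """The reverse operation: walk each block, extracting the length
        prefixed records and stopping at the zero padding."""
        for block in blocks:
            i = 0
            while i < len(block) and block[i] > 0:
                length = block[i]
                yield block[i + 1:i + 1 + length]
                i += 1 + length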

A MT could be physically protected from being written to by removal of a plastic ring around the mounting hole in the centre of the spool. This allowed a micro switch to be activated on the tape drive which prevented the drive from writing to the tape. The ring was known as the Write Permit Ring (WPR) as its presence allowed the tape to be written on. It was normal for the operators to remove these rings after dismounting a MT from the tape drive. The WPRs were ideal for playing hoop-la.

All MTs for a site were given a unique 5 digit serial number which was held as part of the tape information and physically written on the outside of the spool for visual identification. The operators would, as required, input to the Master programme a list of tape serial numbers which were available to be used for writing on. This was known as the Released Tapes Index (RTI). When a programme requested to open a MT to be written to, the Master programme would check that the MT it was attempting to use was in the RTI. The combination of WPR and RTI gave 2 ways of preventing erroneous overwriting of MTs. Work tapes were identified by having a zero serial number; they were used as temporary tapes during the run of a programme and would become re-usable after the programme was unloaded.
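
A minimal sketch of the two protections, again in modern notation (the serial numbers and function names are invented for illustration; this is not the actual Master programme logic):-

    # The serial numbers below are invented; work tapes had serial number zero.
    released_tapes_index = {10234, 10567, 10901}   # entered by the operators as required

    def open_for_writing(serial_number, write_permit_ring_fitted):
        """Both protections had to pass before a tape could be overwritten."""
        # Hardware check: the micro switch sensed the Write Permit Ring.
        if not write_permit_ring_fitted:
            raise IOError("drive will not write: no Write Permit Ring fitted")
        # Software check: the Master programme consulted the RTI.
        if serial_number != 0 and serial_number not in released_tapes_index:
            raise IOError("tape %05d is not in the Released Tapes Index" % serial_number)
        return "tape %05d opened for writing" % serial_number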

2.3 Not Available to LEO III

For a user of LEO III the following were not available:-

Any direct connection to the computer - no screens, no keyboards. This prevented any direct interaction between the user and the computer, such as validation of data on entry or display of results.

Disks. This prevented programmers using random access methods of locating or updating information held on mass storage devices.

Large amounts of memory. The amount of memory (known as core store) was very limited and allowed only relatively small amounts of data to be held for direct processing.

Images. Images were not available - only characters of text and numbers could be used. The character set did, however, include single characters representing 10 and 11 pence for printing and calculating pre-decimalisation sterling amounts.

Choice of fonts. The printers were capable of printing only one font in one size with no variable attributes such as colour, bold or italic. The font was Courier with a fixed pitch of 10 characters per inch and 6 lines per inch. Lower case letters were not available. For the programmer this had the advantage of simplifying printing, but had the disadvantage of limited design choices for the end user.

3 PUNCHED CARDS, PAPER TAPE AND DATA PREPARATION

3.1 Data Input Media

The vast majority of data submitted to the computer was first punched onto paper tape or punched cards. There were other devices such as AUTOLECTOR (a device for reading hand written marks on pre-printed forms - see VOL 1 10.7) and cheque readers (see VOL 3 12.3) which could provide direct input to the computer, neither of which will be considered further.

Punched cards were the standard 80 column 12 row Hollerith or IBM cards which had been in use for many years with tabulators and associated equipment.

Paper tape (PT) was usually 7 channel tape with a sprocket hole and odd parity (i.e. the number of holes punched in a row was always an odd number, except for blank tape which was ignored). On reading into the computer, the sprocket holes were not used to move the tape, but formed a timing pulse. (See VOL 1 10.1)

Comparing cards and PT, cards held data in fixed columns and were obviously limited to a maximum of 80 characters. PT normally held data in variable length fields separated by a "number end" character, with the record terminated by a "block end" character. This was similar to the CSV format now available as output from spreadsheets, with number end replacing the comma and block end replacing the line feed. PT was more often chosen as the input medium as it offered greater flexibility than the 80 character limit of cards. Cards were used where an installation was still using tabulators for other purposes.
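
The CSV analogy can be made concrete with a short Python sketch, using # and | purely as stand ins for the actual number end and block end punchings:-

    NUMBER_END = '#'   # stand in for the actual number end punching
    BLOCK_END = '|'    # stand in for the actual block end punching

    def read_record(tape):
        """Split one PT record into its variable length fields, much as a
        CSV reader splits a line on commas, returning the rest of the tape."""
        record, rest = tape.split(BLOCK_END, 1)
        return record.split(NUMBER_END), rest

    fields, rest = read_record("1234#SMITH J#40|next record|")
    # fields == ['1234', 'SMITH J', '40'] and rest == 'next record|'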

PT could be produced as a by-product of some other operation such as the use of accounting machines, or readers for Kimball Tags (a very small pre-punched card typically attached to clothing and detached at point of sale), and LECTOR (similar to AUTOLECTOR but not connected to the computer). These will not be considered further.

3.2 Data Forms

Information to be submitted to the computer was written on to forms designed specially for the application. Each site may have had its own standards and conventions, but the following were fairly commonly used:-

All alpha characters were written in upper case (there was no lower case).

To distinguish letter O from number zero, and letter S from number 5, the alpha characters were written with a bar over the top. Number one was written without a top serif and letter I was written with top and bottom bars.

To indicate multiple spaces in variable fields, either ] or an S in a circle was used, repeated as necessary.

Hexadecimal values 10 to 15 were written with a bar over the top. This applied to pre-decimal sterling pence greater than 9, which were treated as a single character.

The first field in a form was usually a "data type" to define the type of form. Often these were negative numbers.

Control totals and other checking information was usually included within the forms. (See section 5 for more information)

All the forms for a run of a computer job were put together. They may have been completed by various people in the organisation. Standard forms common to all sites and containing "sentinels" were added at the front and end to identify the programme to process the data. (See VOL 3 12.2.4) The front standard form had a single block (the start sentinel) consisting of 15 characters:-

/////pppppAn101 Where ppppp is the 5 digit programme number
and An is the file name consisting of one alpha and one numeric character

The end standard form had 2 blocks, the first consisting of 15 characters:-

/////pppppAn300 Where ppppp and An are as described above.

The second block was usually just a block containing zero, but could be anything as it was just used to satisfy the advance read for input buffering and never processed.

For PT input of a large set of forms, the set could be divided into smaller sub sets each equivalent to one reel of paper tape by adding an end of reel form, and a start of reel form. The end of reel form was the same as the end form except the final 300 was replaced with 200. The start of reel form was the same as the start sentinel except the 2 digit reel number replaced the last 2 digits of the start sentinel. The reels had to be numbered sequentially.
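
The sentinel layouts described above can be summarised in a short Python sketch (a modern illustration of the 15 character blocks; the example programme number and file name are invented):-

    def start_sentinel(programme_no, file_name, reel_no=None):
        """The 15 character start block: ///// then ppppp then An then 101,
        or 1 followed by the 2 digit reel number for a multi reel set."""
        tail = "101" if reel_no is None else "1%02d" % reel_no
        return "/////" + "%05d" % programme_no + file_name + tail

    def end_sentinel(programme_no, file_name, end_of_reel=False):
        """The 15 character end block: 300 at end of file, 200 at end of reel."""
        return "/////" + "%05d" % programme_no + file_name + ("200" if end_of_reel else "300")

    assert start_sentinel(7002, "A1") == "/////07002A1101"
    assert end_sentinel(7002, "A1") == "/////07002A1300"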

The forms were then sent to the data preparation department, accompanied by a control sheet giving details of who submitted the job and other information which the site required for running the job. The control sheets varied widely from one site to another.

The data preparation department consisted of many people. In the 1960s, before feminism and political correctness were invented, the staff were all female and referred to as "punch girls". They were highly skilled at punching accurately and quickly. The principle employed for both PT and cards was for one "girl" to punch the data, and a second "girl" to verify it by punching it again and having it compared with the PT or cards produced by the first "girl".

3.3 Data Preparation of Paper Tape

Depending on the work load and priority of the job, a sub set of forms making up a reel may have been divided into smaller parts to allocate it to more "girls", but it would be physically spliced back to form a single reel before being passed on to the computer room.

Data forms for PT consisted of a series of boxes, each box representing a field. The left edge of the first box and the right edge of the last box of the form had double vertical lines to indicate the start and end of the form (i.e. the record). Negative numbers were written with the minus sign to the right. There could be more than one record on a sheet. The order of punching was, unless specifically instructed otherwise, to start at the top left and work down and from left to right. Number ends terminated each box, except the last, which was followed by a block end. Any trailing empty boxes (fields) were ignored (i.e. the block end was punched after the last non blank field).

A PT punch work station consisted of a key board and a paper tape punch. It was possible to back space the PT and erase a mis-punch by punching holes in all 7 channels of the character position. This would be ignored at the next stage.

When the forms had been punched, they were passed to the verification work station which was similar to a punch work station with the addition of a PT reader into which the PT from the punch work station was fed. A different coloured tape was used for punching and for verifying to ensure they did not get mixed up. The verifier keyed in the forms again and at each key stroke it was compared with the PT being read. If the same, the character was punched onto the output PT. If different, the verifier had to investigate by examining the input PT. (No VDUs to help with this process). The data prep. "girls" were very good at reading PT. The verifier then carefully punched the correct information and, by use of the key board, could re-align the input tape by effectively ignoring or replacing an input character or by inserting an extra character onto the output PT.

When all verification was complete, any split reels were spliced together. This was done by carefully laying the two ends over each other, ensuring the sprocket holes aligned and making one clean cut with a pair of sharp scissors through both PT at a slight angle to form a non vertical join. The two angled ends were butted together and a special adhesive patch applied which was the same width as the PT and had sprocket holes in it.

The tapes were wound onto cardboard cores and physically labelled by writing on the blank tape forming the lead in, the identity of the job and reel number. The tapes were secured by an elastic band around the circumference.

When the job was complete, the verified PTs were passed to the computer room with the job control sheet, and the unverified punched tapes of a different colour were disposed of. The data forms were returned to the originating users.

The small bits of PT removed to form the holes in the PT were known as "chad". Their use as confetti was discouraged due to the difficulty in clearing the litter and they could be injurious if they entered the eye.

A hand PT punch was occasionally used to make very limited changes to a PT. The hand punch consisted of a metal base with locating pins to accurately mount the PT on the base. A hinged metal flap could then be closed over the PT. The hinge had one row of 7 holes in it, and by pushing a metal stylus through them, any combination of the 7 positions could be punched into the PT below the hinge. Deletion of a character was easy as it just required all 7 holes to be present. Insertion of a character probably required the PT to be carefully cut at the point of insertion, a piece of blank tape (with sprocket holes) to be spliced in, and then the character to be punched into the blank tape using the stylus.

The operators had a teletype which punched PT and printed a hard copy; this was like a typewriter with a PT punch attached. It was not available for general use.

3.4 Data Preparation of Punched Cards

Data forms for cards were usually on landscape paper to allow the 80 columns of the card to be represented. The actual card column numbers were usually shown. The fields were separated by vertical lines, and within a field the individual character positions corresponding to a card column were indicated. Pre-decimal sterling 10 and 11 pence occupied only one card column. For negative numbers, the left most column of the numeric field was used for the sign. If this was also to be used for a non zero digit, then it was written as ???? to represent an over punch in that row. (See VOL 1 10.2.3 and VOL 1 Appendix A2).

The data forms were divided into sub-sets for punching and distributed amongst the "girls".

A card punch was a fairly sophisticated electro-mechanical machine which had been developed over many years for pre-computer use in conjunction with tabulators and other card equipment. It could be set up with "tab stops" to correspond to the fields on a card in order that the operator could easily move to the next field. To make use of this required the data forms to be "batched" so that those of the same format were together - another limitation compared with PT.

Once a set of forms had been punched, the pack of cards was either secured by an elastic band, or put into a metal box with a sliding fence, and passed to be verified. A card verifier machine was similar to a card punch with the addition of a card reader into which the un-verified cards were read. The verifier keyed in the forms again and each key stroke was compared with the corresponding column of the card being read. { what happens next I am vague about - I think if OK input card passed to output hopper, if not ???? } Some verifier machines were fitted with a printer which would print along the top edge of the card to show the content.

When verification was complete, the sub-sets were put together in the empty card boxes used to supply the blank cards, labelled and sequentially numbered, and passed to the computer room with the job control sheet. The data forms were returned to the originating users.

Hand card punches were occasionally used for very small volume work. This consisted of a slide into which one card could be loaded to pass under the punch head which contained 12 keys corresponding with the 12 rows of a card. Any of the 12 keys could be pressed simultaneously to punch a set of holes. On release of the keys, the card advanced one row ready for the next row to be punched. Verification was done either by visual inspection, or punching twice and comparing the two cards by holding up against a strong light. It was possible to carefully replace a chad in a hole punched in error, but was not recommended as a reliable method.



4 OVERVIEW OF TYPES OF PROGRAMME

4.1 Design Constraints

The hardware of LEO III imposed constraints on the design of programmes, the major one being the use of MT and its serial processing. Since the amount of memory was extremely small and no random access devices such as disks were available, virtually all programmes consisted of processing one or more MTs by reading or writing from start to end of a set of data held on MT. The number of MTs used by one program had to be minimised in order that multi-processing of more than one programme could be achieved with the limited number of MT drives available on the computer. For a typical medium sized installation, 8 MT drives might be available.

Another constraint was the relatively slow speed of PT and card readers, and line printers, and the requirement to make efficient use of them for multi-processing. It should be noted that the volume of printing produced was significant as this provided the only means of showing the users the results and what was being held on the MTs.

Most data processing jobs were made up of a number of programmes run sequentially. The programmes themselves typically had characteristics which belonged to one of 5 types of programme which are outlined below and described in more detail in sections 5 to 9. These 5 types of programme would be found on every LEO site doing data processing, but variations in detail occurred depending on the design requirements of the job and local standards set for the site. To aid understanding, first consider a typical, but simplified, data processing job.

4.2 Payroll Processing

Weekly and/or monthly payroll processing was a job very often run on LEO III. For each employee, details are held of their semi permanent information such as employee number, name, department, rate of pay and tax code, and the latest financial information such as gross pay to date and tax to date.

Each week (and similarly monthly) details of new employees, changes to existing employees and notification of leavers is input and applied, together with details of hours worked in the week for each employee. For each employee, the gross pay for the week is calculated and the gross pay to date updated in order to calculate the tax for the week. The gross and tax to date are updated in the employee's details. Payslips are printed and various financial and management reports printed.

4.3 Types of Programmes

4.3.1 Update Programme. This is the programme which updates the information which is kept from one run to the next, and usually at the same time processes the "transactions". It normally uses 4 MTs, two of which are referred to as the "Brought Forward file" (B/F) and "Carried Forward file" (C/F). In the payroll example above, the details of each employee as at the end of the previous week are held on the B/F file in employee number sequence. The transactions (i.e. the changes and time sheets) are held on the third MT, also in employee number order. In this simplified example, there would be one record per employee on the B/F and none, one or more records per employee on the transaction file.

The update programme reads the B/F and transaction files one record at time and using the employee number, matches the transaction record with its B/F record and processes the transaction. When all transactions for an employee have been processed, an employee details record is written to the C/F file containing the updated information. At the same time, information is written to the fourth MT for further processing by another programme. The C/F file becomes the B/F file for the next week's run.

The 4 tape update is at the heart of most data processing jobs. It is described in more detail in section 7.

4.3.2 Sort Programme. For the update programme to function, it requires its transaction file to be in order. LEO III had standard software programmes to provide this function, driven by parameters to define how the records on a file were to be sorted. The sorting was achieved by reading the unsorted MT and forming strings, which were sets of records in ascending key order, written to one of two temporary MTs until all the input was read. Next, by reading 2 strings from different MTs, a single, bigger string of records in order was formed and written to MT. This process was repeated and gradually the strings were merged and made bigger until finally all records were in a single string, which became the sorted version of the data.

See section 6 for more details.

4.3.3 Data Vet Programme. The transactions for the update were created on PT or cards. The data vet programme read the PT or cards and wrote the information to MT for sorting. Validation checks were applied to the data, but it was not possible to validate against any other files. In the payroll example, the employee records on the transactions could be in any order and could not be checked to exist as a valid employee on the B/F file because that would require a serial read and then a rewind of the B/F for each transaction.

Errors of a catastrophic nature would cause the whole job to be abandoned and sent back to the user for correction. Other errors caused the transaction to be rejected and a report printed for the user to deal with later and if necessary submit corrections for the next run of the job.

One objective of data vet programmes was to process the transactions within the time it took to read the PT or cards, so that it maximised the use of these slow devices.

See section 5 for more details.

4.3.4 Extract and Analysis Programme. This type of programme was used to take the information held on a MT and extract from it and analyse it into a form that the user wanted. In the payroll example, at the end of the tax year, returns were produced for the tax office by analysing the C/F file produced at the end of the tax year and extracting information for each employee who had not ceased employment with the company.

The output from this type of programme was either another MT used for further processing or printed output. Where several different analyses were to be produced from the same data (e.g. a detailed list and a summary list), they would be produced during a single pass through the source data file.

See section 8 for further information.

4.3.5 Print Programme. Line printers were relatively slow devices and could suffer from paper jams and other paper problems which would necessitate a reprint of, at best, a part of the results. The volume of printing at a site was quite high, as previously described, owing to this being the only form of result the user could use. Therefore it was desirable to make efficient use of the printers, which would be few in number at a site due to their high cost. A typical medium size site would have only two printers.

The above factors resulted in most printing being done through a standard print programme. (See section 9.4). The programme which needed to produce printed output created the line of print with all the spaces edited in, then wrote this image of the print line to MT. The standard print programme then read the MT and printed each line. Because the standard print programme was doing little processing, it could drive the printer at near maximum speed.

Another advantage was that the programme generating the printing could be designed to use as many "printers" as required. For example, an extract programme could produce a list of employees who had terminated employment, another list of employees who had worked overtime in the week and another list showing employees who did not submit a time sheet in a week. Each of these three lists would be produced while reading once sequentially through the file and deciding to which category the employee belonged. Each print image record written to the output MT would have a "result type" attached to it. The standard print programme would then read the MT and print one result type, then rewind the MT to print the next result type and so on. The fourth MT in an update was sometimes used as input to the standard print programme, depending on the design of the set of programmes.

The standard print programme had facilities to reprint any damaged pages and other functions. These would have to be provided by the programmer if he or she wrote a programme which printed directly to a printer. Direct printing was only done where the overall time to run the programme was relatively short.

See section 9 for further information.

5 DATA VET PROGRAMMES

5.1 Overview

The purpose of a data vet programme was to read data submitted on paper tape or punched cards, validate it, and write accepted data to magnetic tape (MT) or reject and produce a printed report on invalid data.

The major constraint with validating data was that it could not be cross checked against reference files or other existing data previously created and held on MT. To do this would require a serial read through the MT and a rewind, for each lookup - a totally impractical solution requiring many hours of processing.

Most validation was done when a card or a block of PT was input. Owing to the limited amount of memory, it was not feasible to read a sequence of cards or PT blocks and vet them all before output to MT, unless the number of cards or PT blocks was known to be small. For example, to insert an employee into a payroll might require a maximum of 6 cards or blocks, so this could be handled as a unit, but an invoice might consist of an invoice heading, invoice lines and invoice totals, each of which was a card or block. As the number of lines could be large, the invoice could not be handled as a unit; each line had to be read and output to MT before the next could be read.

The validation performed by the data vet programme was essentially to check that each piece of data looked reasonable on its own. This could be supplemented by submitting extra information to provide additional checks.

The consequences of finding erroneous data varied with its seriousness and with the design requirements of the whole job. The data vet run might be abandoned for manual investigation, the offending card or PT block might be rejected, a series of cards or blocks might be rejected, or an erroneous item might be replaced and then accepted. Generally only erroneous data was reported upon, with sufficient information to identify it.

5.2 Extra Information Submitted for Validation Purposes

The cycle of events between the user deciding what information to submit and it being input to the data vet programme consists of:-

User decides on information

User writes the information onto a form

Form is punched and verified to create paper tape or punched cards

Errors can occur during these three stages and between them.

To alleviate some of the problems, the following could be used.

5.2.1 Batch Control

A set of forms could be treated as a batch by adding a batch heading and a batch trailer form. The batch header would probably have a batch number and possibly an indication of the type of data to be contained in the batch. The batch trailer would probably have a count of the number of forms in the batch and one or more batch totals.

Using batch control would add extra time when assembling the data forms, so its use had to be balanced against the additional benefits that could be gained in the quality of input.

A batch heading could also be used to hold some item of data which applied to all the forms in the batch to avoid repeating it (e.g. the department number of an employee for a set of time sheets).

5.2.2 Check Totals

Within a form or a group of logically related forms, one or more check totals could be submitted. Again time to prepare against benefit had to be considered.

5.2.3 Run Heading

It was usual for the first form to be a run heading form which identified this run from other runs of the same data vet programme.

5.3 Basic Checks

All items were subject to a range check to ensure they were within the permitted range. Within a card or block, the range may have depended on another item. Items which had only a small number of permissible values could be checked against a list hard coded into the programme.

All numeric items were subject to a radix check. (e.g. only digits 0 to 9 for decimal items, only 0 or 1 for the tens of shillings, valid date format)

Alpha items might be checked to contain only A to Z or space.

Some items contained a check digit (the most popular being modulus eleven) which could be recalculated to detect erroneous characters in that item. This was normally only used on critical items such as part numbers or employee numbers to ensure transcription or punching errors did not produce a valid number.
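
A modulus eleven check digit can be illustrated with the following Python sketch. The descending weighting shown is a commonly used scheme; individual sites may have varied the details, so treat this as an illustration rather than the LEO III standard:-

    def mod11_check_digit(digits):
        """Compute a modulus eleven check digit using descending weights
        (n+1 down to 2); a result of 10 would normally be disallowed."""
        weights = range(len(digits) + 1, 1, -1)
        total = sum(w * int(d) for w, d in zip(weights, digits))
        return (11 - total % 11) % 11

    def is_valid(number):
        """Check a number whose final digit is its check digit, detecting
        transcription and punching errors such as transposed digits."""
        expected = mod11_check_digit(number[:-1])
        return expected < 10 and str(expected) == number[-1]

    assert is_valid("123455")        # 5 is the correct check digit for 12345
    assert not is_valid("213455")    # a transposition is caught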

5.4 Other Checks

5.4.1 Run Heading

The run heading was checked to be the first card or block. It usually contained a run number which was checked against the run number submitted on the job control sheet and entered into the computer by the operator as part of loading the programme. This check was primarily to ensure that the correct set of PT or cards were being used and not that from a previous run of the job. Errors in the run heading were treated as disastrous and the job was abandoned for investigation.

5.4.2 Batch Checking

Errors in the batch heading might cause the whole batch to be rejected.

If the batch heading was acceptable, errors in the records in the batch might cause one or more of them to be rejected. These rejected records needed to be included when checking the batch count and totals, unless the item being accumulated to a batch total was itself of invalid radix.

Errors in the batch totals were reported for further investigation later and often valid batch totals were printed as proof of acceptance. The errors could only be detected after the information in the batch had been written to the MT. As errors in batch totals could be due to several causes, and would take a while to investigate, the remainder of the job was allowed to proceed. After investigation, it may have been necessary to submit correcting data next time.
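
The batch checking logic can be sketched as follows (the record, header and trailer layouts are invented for illustration):-

    def check_batch(header, records, trailer, report):
        """Compare the count and total punched on the batch trailer with the
        values accumulated from the records actually read. Rejected records
        are still accumulated, unless the totalled item itself failed its
        radix check (field names here are purely illustrative)."""
        count = len(records)
        total = sum(r["amount"] for r in records if r["amount_radix_ok"])
        if count != trailer["form_count"]:
            report("BATCH %s FORM COUNT %d EXPECTED %d"
                   % (header["batch_no"], count, trailer["form_count"]))
        if total != trailer["batch_total"]:
            report("BATCH %s TOTAL %d EXPECTED %d"
                   % (header["batch_no"], total, trailer["batch_total"]))
        # errors are reported for later investigation; the run carries on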

5.4.3 Check Totals

All check total errors were reported, but the effect could vary.

A total error within a single card or block, or within a small set treated as a unit, may have caused rejection of the card or block or of the whole unit.

A total error within a larger set was similar to a batch total error in that, by the time the error was detected, data had already been written to MT.

5.4.4 Sequence Checking

In most circumstances, the general order of data input to the data vet was irrelevant, apart from the run heading being first, but there were circumstances where sequence checking was required.

Where a batch heading defined the type of data in the batch, data of the wrong type would be rejected.

Where the data was logically related (e.g. invoice heading, lines, totals), the sequence was checked. An invoice heading error could cause the rejection of the whole invoice. Following an accepted invoice heading, only lines or a total were expected, but another invoice heading could occur if the total was missing. The accepted invoice lines were written to MT as they were processed. If the update programme was written on the assumption that an invoice heading was followed by at least one line and then a total record, then the vet programme needed to ensure this occurred. This could be achieved by delaying the writing of the invoice heading until at least one line had been written, and ensuring that a total was written even if it was absent from the input data, as sketched below. This illustrates some of the complexity which the limitations of LEO imposed on the designers and programmers.
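
A Python sketch of the delayed write technique (record shapes, names and the write_to_mt and reject routines are illustrative):-

    def vet_invoices(records, write_to_mt, reject):
        """Guarantee the update programme sees heading, at least one line,
        total - by holding back the heading until a line arrives and forcing
        a total record out if one is missing from the input."""
        held = None      # heading accepted but held back until its first line arrives
        open_ = False    # heading and at least one line written; total still due
        for rec in records:
            if rec["type"] == "heading":
                if held is not None:
                    reject(held)                   # previous heading had no lines at all
                if open_:
                    write_to_mt({"type": "total"}) # previous total missing: force one out
                    open_ = False
                held = rec
            elif rec["type"] == "line":
                if held is not None:
                    write_to_mt(held)              # first line arrived: heading now safe
                    held, open_ = None, True
                if open_:
                    write_to_mt(rec)
                else:
                    reject(rec)                    # line with no accepted heading
            elif rec["type"] == "total":
                if held is not None:               # heading with no lines: reject invoice
                    reject(held)
                    reject(rec)
                    held = None
                elif open_:
                    write_to_mt(rec)
                    open_ = False
                else:
                    reject(rec)                    # stray total with no invoice open
        if held is not None:
            reject(held)                           # heading with no lines at end of input
        if open_:
            write_to_mt({"type": "total"})         # total missing at end of input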

Sequence checking within a small set treated as a unit, is simpler as the whole set can be accepted or not.

5.5 MT Output

The accepted data was written to a MT for processing by the next programme. Reformatting and re-arranging of the data was done to create records designed to meet the requirements of the subsequent programmes.

Control totals and counts created whilst vetting the data were written to the MT. These would be re-calculated by the next programme and compared as part of the reconciliation checks between what the vet wrote and what the next programme read.

5.6 Printed Report

All data vet programmes produced a printed report to tell the user the result of the vetting. It should be noted that the report was the only way of advising the user what had happened. The report was usually printed "off line" (see section 9) so that a paper wreck did not stop the data vet run and did not use a printer inefficiently.

Like all printed reports, the page headings would identify the job, its run number and date of production. In general the report was an exception report listing rejected and questionable data, but it could also show important information such as actual and expected batch totals. The control totals calculated for reconciliation purposes would be printed at the end, perhaps with some other statistical information. (e.g. number of records input, accepted, rejected)

For rejections, it was important that sufficient details were printed to enable the user to identify the data form being reported upon, and the item and reason for the error. The actual value of the reported item was printed, as this may not have been the same as the user wrote on the data form. (Punch and verify error rates were low, but not zero.) The severity of the error might be indicated.

Without a proof list of the input data, some errors were difficult to find. (e.g. a form not punched in a batch of forms) The data vet batch controls might identify a form was missing but the reports would not inform the user which one was missing. The user would probably have to examine results of subsequent programmes and check each form in the batch to find which had not been processed.

If it was deemed necessary to produce a proof list of accepted data, this would be produced by another programme, either analysing and printing the MT output from the data vet, or by the data vet producing a second output MT for subsequent printing of the proof list.

5.7 Restart Run

For data vets with a large volume of input data, facilities were provided to restart the run part way through after it had stopped prematurely. (e.g. due to the paper tape being damaged whilst being read, or a box of cards being dropped).

A restart point would be created at a convenient point which allowed run time of 15 to 30 minutes between restart points. The restart point would have to coincide with the beginning of a reel of PT or a box of cards.

Following a premature end, the data vet run could be restarted from the last restart point successfully processed. The programme would re-align the output MT to overwrite the data following the restart point, and the PT or cards from the restart point onwards would be read and processed.

A restart point would normally make use of the "mark" facility for MT (see section 2.2) to align the MT. It would also be necessary to write the control totals so far onto the output MT in order to avoid re-calculating them, which would require reading the whole of the data again.

5.8 Re-submission Run

Some data vet programmes allowed for a resubmission run to allow extra data to be processed. (e.g. to allow critically important data which had been rejected to be re-input before the next programme was run).

The data vet programme would align the MT from the previous run at the end of its data, and then add the re-submitted data onto the MT.

6 SORT PROGRAMMES

The purpose of a sort programme was to sequence the data into the required order for serial processing by the next programme. LEO had two standard programmes for performing this process: the 3 tape sort (programme 07002) and the 4 tape sort (programme 07003). See VOL 5 part 3 for details. They performed the same task but by different methods of merging the strings of records, which were in order. The 3 tape sort used 3 tapes during the merge phase and used the Fibonacci series to control the number of strings on the tapes. The 4 tape sort used 4 tapes during the merge phase, merging 2 tapes onto one, then merging the result with the 4th tape. The 4 tape sort was more efficient but it used an extra tape drive.
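
The principle of forming and merging strings can be sketched in Python. This simplified sketch uses a balanced two way merge rather than the Fibonacci distribution of the 3 tape sort; the in-memory limit and the list-of-strings representation of a tape are illustrative only:-

    import heapq   # heapq.merge combines two strings already in key order

    MEMORY = 4     # illustrative limit on the records that fit in core at once

    def form_strings(unsorted):
        """Pass 1: read the unsorted tape, writing strings (runs of records
        in ascending key order) alternately to two work tapes."""
        work = [[], []]
        for i in range(0, len(unsorted), MEMORY):
            work[(i // MEMORY) % 2].append(sorted(unsorted[i:i + MEMORY]))
        return work

    def tape_sort(unsorted):
        """Repeatedly merge pairs of strings, making them fewer and longer,
        until a single string - the sorted file - remains."""
        tape1, tape2 = form_strings(unsorted)
        while len(tape1) + len(tape2) > 1:
            merged = [list(heapq.merge(a, b)) for a, b in zip(tape1, tape2)]
            merged += tape1[len(tape2):] + tape2[len(tape1):]   # odd string left over
            half = (len(merged) + 1) // 2
            tape1, tape2 = merged[:half], merged[half:]
        return (tape1 + tape2)[0] if tape1 + tape2 else []

    assert tape_sort([5, 3, 8, 1, 9, 2, 7, 4, 6]) == list(range(1, 10))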

The format of the data required the sort key to be at the start of each record. It could be fixed format or variable. The length of the sort key, whether fixed or variable, and other information such as the name of the input and sorted output file were provided as run time parameters. Sorting could only be done to ascending order.

Optionally, instead of sorting on the whole of the sort key as presented, it could be sub-divided into sub-keys and then sorted on a different arrangement of the sub-keys, thus allowing an input file to be sorted to more than one order for use by different programmes.

7 UPDATE PROGRAMMES

7.1 Overview

With the physical limitations of MT, which restricted processing to sequential reading or writing along the length of the tape, it was not possible to replace parts of the information on a tape. Therefore, where any information held on MT was to be updated, it had to be done by reading the old MT, applying the changes and writing to a new MT. The old MT was referred to as the Brought Forward (B/F) and the new MT as the Carried Forward (C/F). Thus the C/F from one run became the B/F of the next run; the pair usually being referred to as the main or master files.

The information on the main files was arranged to be in order and the changes which were to be applied were held on another MT, usually called the current data (C/D), which was sorted to the same order as the main file.

The processing therefore consisted of reading the B/F and C/D in parallel and comparing the key order on both records to determine the action to be taken and then writing a record to the C/F. The actions were in principle:-
Insert a new C/F record
Amend an existing B/F to produce an updated C/F
Ignore an existing B/F (i.e. a deletion)
Copy the B/F to C/F (i.e. no change required)

The processing was rather like merging the B/F and C/D to form a C/F.
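
The heart of the update can be sketched as a merge loop in Python. The key comparison and the four actions follow the list above; the record shapes, action names and the apply_changes routine are illustrative assumptions, not a real LEO programme:-

    def apply_changes(bf_record, transaction):
        """Hypothetical application specific processing of one transaction
        against its matching B/F record (e.g. calculating this week's pay)."""
        bf_record.update(transaction["record"])
        return bf_record

    def update(bf_file, cd_file, write_cf, reject):
        """Read the B/F and C/D in parallel, comparing keys to decide the
        action, and write the C/F. Both inputs must be in ascending key order."""
        bf, cd = iter(bf_file), iter(cd_file)
        b, c = next(bf, None), next(cd, None)
        while b is not None or c is not None:
            if c is None or (b is not None and b["key"] < c["key"]):
                write_cf(b)                 # no transactions for this key: copy B/F to C/F
                b = next(bf, None)
            elif b is None or c["key"] < b["key"]:
                if c["action"] == "insert":
                    write_cf(c["record"])   # insertion of a new C/F record
                else:
                    reject(c)               # amend or delete with no matching B/F
                c = next(cd, None)
            else:                           # keys equal: transaction matches a B/F record
                if c["action"] == "delete":
                    b = next(bf, None)      # ignore the B/F record (deletion)
                elif c["action"] == "amend":
                    b = apply_changes(b, c) # written later, once all its transactions are done
                else:
                    reject(c)               # insertion for a key which already exists
                c = next(cd, None)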

7.2 Update with Processing

In most cases, updating was combined with processing. Consider a parts file which holds details such as description and price for each part, identified by its part number. The time taken to apply a small number of changes such as insert new parts or change a description, was determined by the time it took to copy the whole of the B/F to C/F. For a large file with a low hit ratio, this could be significant. If other processing relating to the parts file could be done at the same time (e.g. obtaining the description to put on orders), this did not significantly add to the processing time of the programme.

7.3 Multi Level Main File

In the payroll example considered in section 4.2, there was only one record per employee. In practice there were likely to be several, starting with an employee heading which held the fairly static information such as name, job title etc. and followed by other records such as details of regular deductions, and the this week and to date financial figures. There might be optional records such as bank details if not paid by cash.

In addition, employees might be arranged by departments, in which case there would be a department heading record and probably a department end record.

All the records would have the same key structure but with suitable values to ensure they were in the appropriate sequence. (e.g. a department heading would probably have a zero employee number)

The multilevel nature of the main file increased the complexity in order to ensure the structure of the file remained consistent. (e.g. if a new department was inserted, its equivalent department end record would be inserted after any new employees were inserted into the new department).

An exercise was often set for a trainee programmer to flowchart a multi level update and it was very unusual for anyone to produce a logically correct chart without a couple of attempts.

7.4 Validation of the Current Data

Because no validation of the C/D against the main file could be done in the data vet, it had to be done during the update before applying it. The C/D can be considered as belonging to one of three types:-

Insertion of a new C/F
Deletion of a B/F
Update of a B/F (this includes processing of C/D against the main file)

Insertions were checked to be for a key which did not exist on the B/F and to be within a level which did exist. Deletions and updates were checked to be for a B/F which did exist.

In addition there could be other checks specific to the application. Any invalid data was rejected and reported, usually off line (see section 9). This report may also contain reports of accepted critical data for manual checking later.

7.5 Results of Updating

The result of performing an update was usually either a set of printed reports, or a MT for further processing. The printed reports would usually be written to a MT for later printing to avoid tying up a printer. (See section 9) So in either case, the programme was making use of 4 MTs which was usually considered the maximum which should be used by one programme to allow for efficient multi processing.

The requirements of the application would determine whether a print file was produced or MT for input for processing by another programme.

7.6 Reconciliation and Control Totals

To prove that the update had worked successfully, reconciliation and control totals were usually produced for printing. In principle this was two ways of deriving the same number to show the programme had functioned correctly. Failures indicated either a hardware problem or a problem with the software.

Control totals were usually applied to the C/D and reconciliation to the main files.

The simplest control was a count of the number of records. The programme producing the MT would count the number of records output and record this in a file end record. The programme reading the MT would count the number of records it read and compare it with the control total in the file end record. Obviously the two should agree.

A hash total was the sum of one or more specific critical items, again summed by the writing and reading programmes and compared. This was used to ensure critical items had not been corrupted, for example the employee numbers in a payroll system.

Reconciliations were produced to prove that the processing of the main files had been done correctly. The simplest was of record counts, where,

No. on B/F + Insertions - Deletions = No. on C/F

If the sum on the left hand side did not equal the count on the right hand side, this indicated a failure. Each of the items on the left hand side would also be both derived and counted for comparison. The B/F count compared with the control on the file end record of the B/F which was written by the previous run of the update. The insertion and deletion counts would be compared with control totals written by the preceding programme, less any rejections by the update itself.
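
A minimal sketch of the two checks (the figures and item name are invented for illustration):-

    def reconciles(bf_count, insertions, deletions, cf_count):
        """No. on B/F + Insertions - Deletions = No. on C/F, with each side
        of the equation derived independently."""
        return bf_count + insertions - deletions == cf_count

    def hash_total(records, item):
        """Sum a critical item (e.g. the employee number) across a file; the
        writing programme records it in the file end record and the reading
        programme re-computes and compares it."""
        return sum(rec[item] for rec in records)

    # e.g. a B/F of 1000 employees with 25 starters and 10 leavers should
    # yield a C/F of 1015 records:
    assert reconciles(1000, 25, 10, 1015)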

Reconciliation using the values of critical items could also be done. For example on a simple current account banking application, the balance on each account could be used.

Balance B/F + sum of credits - sum of debits = Balance C/F

Often these totals could also be compared with manually produced figures.

7.7 Restarts

For an update which would run for some time (e.g. more than half an hour), restart points would be built in so that, in the event of a failure, the run could be restarted part way through instead of the whole run being re-run. The restart point would be at a suitable logical point in the data structure of the main files. (e.g. at the end of a department)

The restart points would probably make use of the "mark" facility of MT (see section 2.2) to allow quick alignment of the MTs. It would probably be necessary to record the control totals so far in order that the reconciliation and control checks at the end of the run would work following a restart.

8 EXTRACT AND ANALYSIS PROGRAMMES

8.1 Overview

Extract and analysis programmes were primarily used to read one MT and process it to produce printed results, or another MT for further processing, or both.

Of the 5 types of programme described in this document, extract and analysis programmes are the ones which would have the greatest variety in their structure, depending on the design requirements of the application. The previously mentioned techniques of multiple printing via MT, reconciliations and restarts would be used where appropriate. Often programmes of this type were used between two updates, to pre-process the data output from the first update ready for input to the second update, possibly after sorting to a new sequence.

9 PRINT PROGRAMMES

9.1 Overview

Large volumes of print were produced as this was the only means of providing information to the users, there being no access via screens to the information. Therefore it was necessary to ensure printing did not become a bottleneck which limited the throughput of the computer.

9.2 Physical Characteristics of the Printer and Stationery

Printing was restricted to a small set of 62 characters including space (see VOL 1 Appendix A4), all of which were of one size, 10 to the inch horizontally and 6 to the inch vertically, in Courier font. No lower case letters were available. The maximum number of characters to a line was either 120 or 160 depending on the model of printer. Although restrictive from the design point of view, it made it easier to programme as there was little choice as to what could be done (e.g. no choice of fonts, size, colour, italic, bold etc.)

All printing was done onto continuous fan folded stationery which had holes along both edges which engaged with toothed wheels on the printer to drive the paper through the printer. (known as tractor feed)

The printers were large, noisy electro-mechanical machines which worked by hammer action on the paper to cause the ink ribbon to be sandwiched between the paper and the metal type. A line of print was produced in a single transfer to the printer within the programme. A cylinder the width of the printer contained 120 or 160 sets of characters, each set arranged around the circumference of the cylinder and containing the whole 62 character set. Thus each character appeared as a row across the cylinder face. As the cylinder rotated, a hammer hit the appropriate character as it came in front of the hammer. In this way all the print positions requiring a given character were struck at the same instant, the complete line being built up during one rotation of the cylinder. If the timing of the hammers was slightly out of phase, then a particular character would appear high or low on the printed line.

The printer ribbon was as wide as the print cylinder. Because of the physical nature of the print mechanism, multi part paper could be used to print more than one copy at a time (up to 4 copies, with the bottom copy being of rather poor quality). Multi part stationery either contained a thin carbon sheet between the sheets of paper, or was chemically treated paper which under pressure from the hammer would transfer the impression to the lower sheets. The latter tended to fade if left in the light for too long and it had a peculiar odour. In both cases, the multi-part set had to be "decollated" to separate it into the individual copies, and in the former case the carbon paper removed. This was usually done by machine. Another machine could be used to burst the continuous stationery into individual sheets. (e.g. for invoices to be posted to customers)

The mechanism to control the vertical movement of the paper, in units of 1/6th of an inch, was a combination of software and hardware. A printer had a "format tape", which was a wide paper tape loop with 8 tracks for holes to be punched in it. Each sprocket hole position in the format tape corresponded to one 1/6th inch movement of the paper. The format tape advance and paper advance were synchronised. The programmer controlled vertical movement by defining how far the format tape should move before the line was printed. This was done by identifying the hole in the format tape at which it would next stop. More than one hole could be identified, and the format tape would then stop at the first of the identified holes found. This required the programmer to know the arrangement of holes on the format tape and the current vertical position on the page, in order to define the next hole on the format tape to be identified. This information was contained in the first word of the block output for printing. (See VOL 3 13.6)
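
The effect of the format tape can be sketched as follows. The channel assignments and the 66 line page are invented for illustration; the principle is searching forward for the next punching in any of the wanted channels:-

    # Illustrative format tape: one entry per 1/6 inch line position down the
    # page, each entry being the set of channels punched at that position.
    # Here channel 1 marks the head of form, channel 2 every body line and
    # channel 3 the total line of an invented 66 line page.
    FORMAT_TAPE = [{1, 2}] + [{2}] * 58 + [{3}] + [set()] * 6

    def advance_to(position, wanted_channels):
        """Move the paper (and the synchronised format tape loop) forward
        until a punching appears in any one of the wanted channels."""
        position = (position + 1) % len(FORMAT_TAPE)
        while not (FORMAT_TAPE[position] & wanted_channels):
            position = (position + 1) % len(FORMAT_TAPE)
        return position

    assert advance_to(10, {3}) == 59    # skip straight to the total line
    assert advance_to(59, {1}) == 0     # and on to the head of the next form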

9.3 Design Considerations

Although the nominal print speed was 1000 lines per minute, this was relatively slow compared with processing speed. To make efficient use of the printer and to allow for multi processing, the aim was to print as near maximum print speed as possible. This was usually achieved by writing edited lines of print to a MT in one programme and then having a small programme which did the actual printing by reading the MT. This was usually referred to as "off line" printing. The small programme would use little processing power and was ideal for multi processing.

Print programmes had to provide facilities for aligning the stationery, especially where pre-printed stationery was being used, and for detecting and dealing with a low paper condition. Because of the possibility of paper wrecks during printing, facilities for reprinting selected pages or restarting the print from any point were required.

Because of the amount of printing being produced for subsequent distribution, it was good practice to ensure all plain stationery printing had sufficient identification. This was usually achieved by including in the page heading a title, the date of the run, the run number and page number.

9.4 Standard off Line Print Programme (06060)

A standard off line print programme was provided. (See VOL 5 part 3 section 9). This read a MT of edited line images in a specified format and printed them. The advantages of this method of printing were that it provided a standard interface between the operators who controlled the printer and the programmer, for such things as reprints and restarts, and that the programmer did not have to programme these facilities.

Another advantage was that it effectively allowed a programme to produce output for multiple printers while processing data. Each different output to be printed would be assigned a number (result type) which was appended to the edited line when written to the MT by the processing programme. The standard print programme would then make multiple passes down the MT to print the lines for each result type in turn.
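
In outline, the multiple pass behaviour can be illustrated as follows (a modern Python sketch, not the actual 06060 logic; in practice the result types and their order would be specified to the programme):-

    def print_all_result_types(mt_records, result_types, print_line):
        """One serial pass down the MT per result type, 'rewinding' between
        passes, so that each result type comes out as one continuous report."""
        for wanted in result_types:          # e.g. (1, 2, 3)
            for rec in mt_records:           # read the whole tape again from the start
                if rec["result_type"] == wanted:
                    print_line(rec["line_image"])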