BCPL – a brief description

by Marius Marinescu / 26 March

 

BCPL (“Basic Combined Programming Language”) is a procedural, imperative, and structured programming language. Originally intended for writing compilers for other languages, BCPL is no longer in common use. Its influence is still felt, however, because a stripped-down and syntactically changed version of BCPL, called B, was the language on which the C programming language was based. BCPL introduced several features found in many modern programming languages, including the use of curly braces to delimit code blocks.

 

BCPL was designed by Martin Richards of the University of Cambridge in 1966 in response to difficulties with its predecessor, CPL, which was created during the early 1960s. The language was first described in a paper presented to the 1969 Spring Joint Computer Conference.

 

BCPL was designed so that small and simple compilers could be written for it; reputedly some compilers could be run in 16 kilobytes. Further, the original compiler, itself written in BCPL, was easily portable. BCPL was thus a popular choice for bootstrapping a system. A major reason for the compiler’s portability lay in its structure. It was split into two parts: the front end parsed the source and generated O-code, an intermediate language. The back end took the O-code and translated it into the machine code for the target machine. Only one fifth of the compiler’s code needed to be rewritten to support a new machine, a task that usually took between 2 and 5 man-months. Soon afterwards this structure became fairly common practice, but the Richards BCPL compiler was the first to define a virtual machine for this purpose.
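The front end / back end split can be sketched in C. This is only an illustration under stated assumptions: the opcodes, the Instr type and the function names are hypothetical and are not the real O-code instruction set, and the "back end" here simply interprets the intermediate code rather than translating it to machine code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack-machine opcodes; NOT the real O-code set. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    long   arg;   /* used only by OP_PUSH */
} Instr;

/* Front-end stand-in: emits intermediate code for (a + b) * c.
   A real front end would derive this by parsing source text. */
size_t emit_sum_times(Instr *out, long a, long b, long c)
{
    size_t n = 0;
    out[n++] = (Instr){ OP_PUSH, a };
    out[n++] = (Instr){ OP_PUSH, b };
    out[n++] = (Instr){ OP_ADD,  0 };
    out[n++] = (Instr){ OP_PUSH, c };
    out[n++] = (Instr){ OP_MUL,  0 };
    out[n++] = (Instr){ OP_HALT, 0 };
    return n;
}

/* Back-end stand-in: the only part that would change per machine.
   It interprets n instructions and returns the final stack value. */
long run(const Instr *code, size_t n)
{
    long   stack[16];
    size_t sp = 0;

    for (size_t pc = 0; pc < n; pc++) {
        switch (code[pc].op) {
        case OP_PUSH: assert(sp < 16); stack[sp++] = code[pc].arg;       break;
        case OP_ADD:  assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  assert(sp >= 2); sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: assert(sp == 1); return stack[0];
        default:      assert(0 && "unknown opcode"); return 0;
        }
    }
    assert(0 && "missing OP_HALT");
    return 0;
}
```

Calling emit_sum_times into a small Instr array and passing the result to run evaluates the expression. Porting to a new target would mean replacing only run, which is the property the paragraph above describes.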

 

The language is unusual in having only one data type: the word, a fixed number of bits, usually chosen to align with the architecture’s machine word and of adequate capacity to represent any valid storage address. For many machines of the time, this data type was a 16-bit word. This choice later proved to be a significant problem when BCPL was used on machines in which the smallest addressable item was not a word but a byte, or on machines with larger word sizes such as 32-bit or 64-bit.

 

The interpretation of any value was determined by the operators applied to it. (For example, “+” added two values together, treating them as integers; “!” indirected through a value, effectively treating it as a pointer.) For this to work, the implementation provided no type checking. Hungarian notation is said to have been developed to help programmers avoid inadvertent type errors.
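A rough C model of this idea follows. It assumes a 32-bit word and a tiny word-addressed store; the names word, mem, plus, rv, store and subscript are illustrative only and are not part of BCPL. The same value is an integer under plus and an address under rv.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t word;            /* assumption: one 32-bit word type */

enum { MEM_WORDS = 64 };
static word mem[MEM_WORDS];      /* a small word-addressed store */

/* Models "+": both operands are treated as integers. */
word plus(word a, word b) { return a + b; }

/* Models monadic "!": the operand is treated as an address. */
word rv(word addr)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    return mem[addr];
}

/* Helper for setting up the store (not a BCPL operator). */
void store(word addr, word value)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    mem[addr] = value;
}

/* Models the dyadic form v!i, which BCPL defines as !(v + i). */
word subscript(word v, word i) { return rv(plus(v, i)); }
```

After store(10, 7) and store(11, 35), the value 10 is just a number: plus(10, 1) gives the integer 11, while subscript(10, 1) uses that same 11 as an address and fetches 35. Nothing in the value records which use was intended, which is why naming conventions mattered.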

The mismatch between BCPL’s word orientation and byte-oriented hardware was addressed in several ways. One was by providing standard library routines for packing and unpacking words into byte strings. Later, two language features were added: the bit-field selection operator and the infix byte indirection operator (denoted by “%”).
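Byte access on top of a word vector can be sketched in C as below. The 32-bit word size and the choice to take byte 0 from the low-order end of the first word are assumptions made for illustration; get_byte and put_byte are hypothetical names standing in for reading and assigning through “%”.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t word;                     /* assumption: 32-bit words */
enum { BYTES_PER_WORD = sizeof(word) };

/* Rough analogue of reading v%i: byte i of the word vector v.
   Byte 0 sits at the low-order end of word 0 (an assumption). */
unsigned get_byte(const word *v, unsigned i)
{
    unsigned shift = 8u * (i % BYTES_PER_WORD);
    return (v[i / BYTES_PER_WORD] >> shift) & 0xFFu;
}

/* Rough analogue of the assignment v%i := b. */
void put_byte(word *v, unsigned i, unsigned b)
{
    assert(b <= 0xFFu);
    unsigned shift = 8u * (i % BYTES_PER_WORD);
    word *w = &v[i / BYTES_PER_WORD];
    /* Clear the target byte, then OR in the new value. */
    *w = (*w & ~((word)0xFFu << shift)) | ((word)b << shift);
}
```

The shifting and masking shown here is the kind of work the library packing routines had to do by hand before a byte operator existed.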

 

BCPL handles bindings spanning separate compilation units in a unique way. There are no user-declarable global variables; instead there is a global vector, similar to “blank common” in Fortran.

 

All data shared between different compilation units comprises scalars and pointers to vectors stored in a pre-arranged place in the global vector. Thus the header files (files included during compilation using the “GET” directive) become the primary means of synchronizing global data between compilation units, containing “GLOBAL” directives that present lists of symbolic names, each paired with a number that associates the name with the corresponding numerically addressed word in the global vector. As well as variables, the global vector contains bindings for external procedures. This makes dynamic loading of compilation units very simple to achieve. Instead of relying on the link loader of the underlying implementation, effectively BCPL gives the programmer control of the linking process.
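The arrangement can be approximated in C as follows. This is a sketch, not BCPL: the slot numbers, the names G_COUNT and G_BUMP, and the two "units" are hypothetical, and a union is used for each slot because portable C separates integers from function pointers, whereas BCPL would hold both in plain words.

```c
#include <assert.h>

/* One slot of the global vector: a value or a procedure. */
typedef union {
    long value;
    long (*proc)(long);
} Slot;

/* --- Shared header, in the role of a file pulled in with GET ---
   Each name is paired with its agreed slot number (hypothetical). */
enum { G_COUNT = 150, G_BUMP = 151, G_SIZE = 200 };
Slot globals[G_SIZE];

/* --- Unit A: defines a procedure and installs it in its slot --- */
static long bump(long by)
{
    globals[G_COUNT].value += by;
    return globals[G_COUNT].value;
}

void unit_a_init(void)
{
    globals[G_COUNT].value = 0;
    globals[G_BUMP].proc   = bump;
}

/* --- Unit B: knows only the slot numbers, not unit A's symbols --- */
long unit_b_step(void)
{
    assert(globals[G_BUMP].proc != 0);   /* unit A must be loaded */
    return globals[G_BUMP].proc(5);
}
```

Unit B never names bump directly; it finds the procedure through slot G_BUMP at run time. Loading a unit therefore amounts to filling in its slots, which is the sense in which linking is under the programmer's control.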

 

The global vector also made it very simple to replace or augment standard library routines. A program could save the pointer from the global vector to the original routine and replace it with a pointer to an alternative version. The alternative might call the original as part of its processing. This could be used as a quick ad hoc debugging aid.
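The save-and-replace trick looks roughly like this in C. All names here (globals, G_DOUBLE, library_double, traced_double, call_count) are hypothetical, and the table of function pointers is a simplified stand-in for the global vector.

```c
#include <assert.h>

typedef long (*Routine)(long);

enum { G_DOUBLE = 7, G_SIZE = 16 };   /* hypothetical slot number */
Routine globals[G_SIZE];              /* stand-in for the global vector */
long call_count;                      /* data gathered by the wrapper */

/* The "standard library" routine originally bound to the slot. */
static long library_double(long x) { return 2 * x; }

/* The wrapper keeps the original pointer so it can call through. */
static Routine saved_double;

static long traced_double(long x)
{
    call_count++;                     /* ad hoc debugging aid */
    return saved_double(x);
}

void install_library(void) { globals[G_DOUBLE] = library_double; }

/* Save the current binding, then replace it with the wrapper. */
void install_trace(void)
{
    assert(globals[G_DOUBLE] != 0);
    saved_double      = globals[G_DOUBLE];
    globals[G_DOUBLE] = traced_double;
}

/* Client code always calls through the slot, so it picks up
   whichever routine is currently installed. */
long client(long x) { return globals[G_DOUBLE](x); }
```

Because client goes through the slot on every call, install_trace takes effect without recompiling or relinking the client, and the wrapper still delegates to the original routine.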

 

BCPL was the first brace programming language. The braces survived the syntactic changes made in B and C and have become a common means of delimiting blocks of statements in program source code. In practice, on the limited keyboards of the day, source programs often used the sequences $( and $) instead of the symbols { and }. The single-line // comments of BCPL, which were not adopted by C, reappeared in C++ and later in C99.

 

BCPL is reputedly the language in which the original “hello world” program was written. The first MUD was also written in BCPL.

Several operating systems were written partially or wholly in BCPL, for example TRIPOS and the earliest versions of AmigaDOS. BCPL was also the initial language used on the seminal Xerox PARC Alto, often regarded as the first modern personal computer; among many other influential projects, the ground-breaking Bravo document preparation system was written in BCPL.

 

By 1970, implementations existed for the Honeywell 635 and 645, the IBM 360, the TX-2, the CDC 6400, the Univac 1108, the PDP-9, the KDF 9 and the Atlas 2. In 1979, implementations of BCPL existed for at least 25 architectures; the language gradually fell out of favor as C became popular on non-Unix systems.

 

Martin Richards maintains a modern version of BCPL on his website, last updated in 2018. It can be set up to run on various systems, including Linux, FreeBSD, Mac OS X and Raspberry Pi. The latest distribution includes graphics and sound libraries, and there is a comprehensive manual in PDF format. He continues to program in BCPL, including for his research on automated musical score following.