Educational Application

So why would anyone in his right mind create a whole new programming language? Isn't BASIC or C++ good enough already?

Well, almost. You see, recently I got my hands on two vintage Hewlett-Packard handheld computers, an HP-95LX and an HP-200LX, and I promptly fell in love with both. Both teeny DOS-based machines are surprisingly useful, to the extent that even their little keyboards are eminently practical. (HP keyboard quality helps.) And both lack a decent programming language.

 

That is not to say that you cannot write code for these HP handhelds using any compiler that can produce DOS programs, such as Microsoft's Visual C++ 1.52c (the last 16-bit version of that development system). But you'll never actually install Visual C++ on an HP-95LX. Apart from the fact that this machine doesn't run Windows, it also lacks the substantial number of MIPS and megabytes that this development system requires in order to run.

How about BASIC? Well, if you're lucky and you can find your old floppy disks, you may be able to locate an old copy of GW-BASIC or some other "lean-and-mean" text-based BASIC interpreter. But an interpreted language is inherently inefficient, and, well, requires the interpreter to run. OK, so maybe I could find a BASIC compiler somewhere if I really looked hard, but who the devil wants to write programs in BASIC anyway?

No, I was looking for something much simpler and more efficient. So like any good programmer, I decided to stop looking and start doing. Having built my own computer not too long ago, I felt it was time to put down my screwdriver and embark on another project: building a programming language and compiler from scratch. A compiler that actually fits on a single sheet of paper. (Okay, I admit, I DID have to use a very small font for that. But it's still readable, even if you need to put on your reading glasses!)

What I had in mind was a C-like language, but not quite C itself. Just like C, W would have functions and compound statements; local and global variables; pointers and expressions. Unlike C, however, W would be a keyword-less and typeless language.

Typeless and keyword-less, you ask? What for? Well, I admit to having been influenced by both C's predecessor, BCPL (itself a typeless language), and by Richard Bartle's MUDDLE, an object-oriented programming language designed especially for writing Multi-User Dungeon (MUD) games. A language without data types and keywords has a clean elegance, something I was trying to reproduce in W.

This language has only one data type: a 16-bit word (hence the name, W.) Every symbol, be it the name of a function, a local, or a global variable, is in fact just a substitute for a 16-bit quantity that may be a value or an address in memory. This restriction may seem too much at first; how can a language like this, for instance, effectively handle character strings? As it turns out, it's not nearly as difficult to do as it may appear. W may be elegant but it's also practical. As these pages demonstrate, it is a language that is sufficiently powerful to compile its own compiler, and produce usable code.

A First Example

Ever since Kernighan and Ritchie's "bible" on the C programming language first saw the light of day, it has been traditional to introduce a programming language through a simple program that just prints "Hello, World!" on the computer screen. Here is this program in W:

write := 0x8B55, 0x8BEC, 0x085E, 0x4E8B, 0x8B04, 0x0656, 0x00B8,
	 0xCD40, 0x7321, 0x3102, 0x8BC0, 0x5DE5, 0x90C3

_() :=
{
	write(1, "Hello, World!\r\n", 15)
}

Short as this program might be, it already demonstrates a few key characteristics of W.

First, W is a language without a standard library. Interfacing with the operating system is the programmer's responsibility. In the present case, we wish to use the operating system to print a 15-byte message on standard output. For MS-DOS, one possible implementation in machine language would call the appropriate Interrupt 21h function (function 40h, which writes CX bytes from the buffer at DS:DX to the file handle in BX) as follows:

0100 55            PUSH    BP
0101 8BEC          MOV     BP,SP
0103 8B5E08        MOV     BX,[BP+08]
0106 8B4E04        MOV     CX,[BP+04]
0109 8B5606        MOV     DX,[BP+06]
010C B80040        MOV     AX,4000
010F CD21          INT     21
0111 7302          JNB     0115
0113 31C0          XOR     AX,AX
0115 8BE5          MOV     SP,BP
0117 5D            POP     BP
0118 C3            RET
0119 90            NOP

It is the bytes of this machine code subroutine that are assigned to the symbol write in the W program above.
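To see how the listing turns into the word values in the W source, here is a minimal Python sketch (not part of W or its compiler) that groups the 26 machine code bytes into 16-bit little-endian words, the byte order used by the x86. The helper name pack_words is made up for this illustration.

```python
import struct

# Machine code bytes copied from the disassembly listing (26 bytes,
# including the trailing NOP).
MACHINE_CODE = bytes.fromhex(
    "55 8BEC 8B5E08 8B4E04 8B5606 B80040 CD21 7302 31C0 8BE5 5D C3 90"
)

# The 13 words assigned to the symbol write in the W program.
W_WORDS = [
    0x8B55, 0x8BEC, 0x085E, 0x4E8B, 0x8B04, 0x0656, 0x00B8,
    0xCD40, 0x7321, 0x3102, 0x8BC0, 0x5DE5, 0x90C3,
]

def pack_words(code):
    """Group an even-length byte string into 16-bit little-endian words."""
    if len(code) % 2 != 0:
        raise ValueError("byte count must be even")
    return [word for (word,) in struct.iter_unpack("<H", code)]

words = pack_words(MACHINE_CODE)
print(", ".join(f"0x{word:04X}" for word in words))
```

The low byte of each word comes first in memory, which is why the opening bytes 55 8B appear as 0x8B55. The routine proper is 25 bytes long; the trailing NOP brings the total to an even 26 bytes, so it fills exactly 13 words.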

Second, a W program at the topmost level essentially consists of declarative statements in the following form:

symbol := definition

Both the symbol write and the function _() are defined in this fashion.

Third, each W program must contain a function named _(), which is where program execution will begin.

Compiling this program with the W compiler produces the following output:

C:\>w hello

Address map (global symbols):
=============================
0120 (code) _
013B (heap) write

The compiled result is an MS-DOS executable, hello.com, a file 100 bytes in length.
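The numbers in the address map can be checked against the file length with a bit of arithmetic. The Python sketch below assumes that the addresses in the map are offsets within the loaded program segment; DOS loads a .COM image at offset 0100 hex, right after the 256-byte program segment prefix. What occupies the bytes before _ and after write is not stated on this page, so the labels in the code are only descriptive.

```python
COM_LOAD_OFFSET = 0x100  # a .COM image is loaded right after the 256-byte PSP
FILE_LENGTH = 100        # size of hello.com in bytes
ADDR_MAIN = 0x120        # address of _ from the address map
ADDR_WRITE = 0x13B       # address of write from the address map
WRITE_SIZE = 26          # 13 words of machine code assigned to write
MESSAGE = b"Hello, World!\r\n"

image_end = COM_LOAD_OFFSET + FILE_LENGTH  # first offset past the image

# Sizes in bytes of each region, derived only from the numbers above.
layout = {
    "before _": ADDR_MAIN - COM_LOAD_OFFSET,
    "_": ADDR_WRITE - ADDR_MAIN,
    "write": WRITE_SIZE,
    "after write": image_end - (ADDR_WRITE + WRITE_SIZE),
}

for name, size in layout.items():
    print(f"{name:12} {size:3} bytes")
print(f"{'total':12} {sum(layout.values()):3} bytes")
```

The four regions come to 32, 27, 26 and 15 bytes, which add up to the 100 bytes of the file. The final 15 bytes happen to match the length of the message, which suggests (but does not prove) that the string literal is stored after write.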

The following pages provide a more in-depth introduction to W, from both a user's and a compiler programmer's perspective.

If you wish to download the W compiler and experiment with it yourself, feel free to do so by clicking this link.