Skip to main content
shopping_basket Basket 0
Log in

Old Microprocessors, new Microcontrollers and Reverse Poles

FIGnition: an Arduino-format single-board computer, based on the ATmega328 MCU

FIGnition: an Arduino-format single-board computer, based on the ATmega328 MCU and programmed in Forth that just needs a video monitor and power supply to work. It's an excellent tool for education, but my own designs for a Forth computer were aimed at the embedded-applications development engineer.

I began my project to create a version of the vintage computer language Forth for a modern microcontroller, the Microchip dsPIC33, several years ago. My starting point was LUT-Forth, code I wrote in 1982 for the popular Zilog Z80 microprocessor as used in the Sinclair ZX81 and Spectrum at the time. At first glance, it seems like comparing apples with oranges to set an old technology 8-bit microprocessor (MPU) up against a modern 16-bit microcontroller (MCU). The MCU is of course loaded with on-chip peripheral hardware such as sophisticated timer/counters, serial bus controllers and the like. The inclusion of these functions together with integral non-volatile program memory and data SRAM are what turned a basic MPU into an MCU. Modern MPUs have lots of extra hardware too, principally Memory Management Units (MMU) and Graphics Processor Units (GPU) to reflect their use in general-purpose computers running operating systems like Windows or Linux.

Both my Forth compiler/interpreters were designed to generate code for embedded applications. In 1982 the Z80 was ubiquitous, being found inside the first ‘home’ computers and, before the IBM PC arrived, in desktop/laboratory machines with 8in. floppy disk drives and an operating system called CP/M. Microcontroller chips were a bit thin on the ground – apart from the Intel 8051 – so any embedded project needed memory, serial and/or parallel I/O interface chips and peripheral devices such as analogue-digital converters (ADCs and DACs).

Why Forth?

The Forth programming language was invented because:

  • The mainframe high-level language compilers such as FORTRAN, ALGOL and COBOL produced code that just wouldn’t fit into the maximum 64KB memory space of the early MPU chips. The compilers were certainly far too big to run on a micro and you needed a mainframe or minicomputer to compile your programs into executable machine code.
  • A BASIC interpreter would just about fit into the available memory space along with the user’s source program and didn’t need an operating system, but it ran far too slowly and was only really suitable for education. That’s why all those early home computers ran BASIC!
  • Fast operation is needed for hardware control or embedded applications.

A Forth implementation contains both interpreter and compiler, and yet takes up only a few Kbytes of program space. Like BASIC, it doesn’t need an operating system to be present. Unlike BASIC, it has a compiler that converts the source code to a linked-list executing independently of the interpreter. As a result, it’s a whole lot faster. Forth is also fast because it’s simple and the source code which it has to interpret or compile-then-run is structured to suit a computer brain, not a human one. It all comes down to the order of the operators and operands in each line of source code.

Get your operators and operands in the right order

The traditional way of recording a series of numerical operations is in the form of an equation, with placeholders for actual numbers called variables. So, we write down an equation for the addition of two numbers in this form:

A = B + C where A will be assigned the result of adding the contents of B and C.

This is a mathematical expression because the operands (B and C) are variables, as is the result A. It becomes arithmetic when actual numbers are substituted in place of the variable names. It’s also an example of infix notation because the operator (+) is placed between the two operands. Programming languages like BASIC and Python both use this convention.

Nearly all pocket calculators use it too, albeit with numbers not variables. For example:

6 x 4 = Pressing the ‘=’ key causes the operation (x) to occur and assigns the result to the display.

The infix convention is great for human brains because that nice ‘equals’ sign makes it easy for us to follow what’s happening. The problem is that a computer wastes valuable time converting a command line in human-friendly infix syntax to one it can ‘understand’. Rearranging the order of operators and operands removes this unnecessary overhead:

  • The ‘equals’ sign becomes redundant and
  • Parentheses to ensure operations are executed in the right order are also redundant.

So, first up is a prefix or ‘Polish’ notation, where the operator comes before the operands like this:

B + C becomes + B C where the + operator ‘knows’ to wait for the two operands B and C before executing. In computer terms, + and B are ‘pushed’ onto a Last-In Last-Out (LIFO) stack until C is input, then the operation is carried out leaving the result as the top item on the stack. No equals sign is needed. Another more complex example:

(B – C) x D becomes x – B C D Note how the operator x is ‘stacked’ until it has two operands to work on, which only occurs when – B C has yielded a result. No parentheses are needed, at the expense of slightly incomprehensible syntax!

Finally, we get to postfix or ‘Reverse Polish’ notation (RPN), where amazingly, the operands come before the operator:

(B – C) x D becomes B C – D x or even D B C – x Both made possible by that LIFO stack. This time though, operands are pushed onto the stack until an operator is input which then works on the last two stack entries. So, B – C is executed first and the result left on top of the stack, followed by (result of B – C) x D. Easy isn’t it? Postfix is better than Infix or Prefix because operators do not have to be saved temporarily on the stack.

The bottom line is: the extra effort involved in using a programming syntax that the processor more readily ‘understands’ pays dividends in terms of speed and efficiency. In other words, feed it data and instructions in the right order to avoid wasting time with temporary storage. That’s what RPN does for you.

Comparing Forths

Back in the early 1980s versions were being written unofficially for just about every microprocessor in existence – and their variants. Even now, adherents to the Forth cause are churning out code for the latest devices including those with ARM Cortex-M processor cores. In 1982 a set of ‘benchmark’ programs was devised: sixteen short pieces of code for testing the execution speed of basic Forth instructions. Each one was embedded in a loop which repeated up to 100,000 times so that a stopwatch could be used for timing! The Forth source code for these benchmarks can be found under Downloads below. Benchmarking FORTHdsPIC is made a lot easier and more accurate by using one of the dsPIC33’s on-chip interrupt-driven 32-bit timers. The Forth code for the updated benchmark set can be downloaded from the project page.

Benchmark results

Each test repeated 10,000 times for stopwatch timing (!)
Magnifier overhead subtracted from BM2 to BM16 results

Benchmark Test





Speed  Factor


BM1 Magnifier 0.70 0.006 117


6.25 0.042 148
BM3 Literal 9.25 0.068 136
BM4 Variable 9.50 0.108 88
BM5 Variable-Save 15.75 0.161 97
BM6 Variable-Fetch 12.75 0.137 93


9.25 0.108 87
BM8 DUP 12.25 0.095 129
BM9 Increment 12.25 0.094 130
BM10 Test > 16.25 0.128 127
BM11 Test < 16.25 0.128 127
BM12 BEGIN-WHILE-RPT 17.25 0.164 105
BM13 BEGIN-UNTIL 15.25 0.147 103
BM14 Link Search 7.2 0.040 180
BM15 SP Arithmetic 9.25 0.027 342
BM16 DP Arithmetic 10.75 0.028 384

The comparison table shows just how much faster the 70MHz dsPIC33E is than the 4MHz Z80A. No surprise there then. But there are other factors:

  • The Z80 is only an 8-bit machine. However, it does possess some instructions that treat its six 8-bit working registers as three 16-bit register pairs.
  • The Z80 chip is an Intel 8080 ‘on steroids’, with more registers and a significantly enhanced instruction set. LUT-Forth only uses the 8080 instructions and register set, so it would probably run a lot faster if the full range of Z80 features had been used!
  • Most dsPIC33 instructions take a single 70MHz clock cycle to execute (with the exception of BRAnches and CALLs). All Z80 instructions require multiple 4MHz clock cycles.
  • The dsPIC33 is in a different league when it comes to the Single-Precision and Double-Precision arithmetic timings. That’s because it has multiply and divide hardware with single-cycle, signed and unsigned mixed-precision multiply instructions.
  • The dsPIC33 is very much a stack-oriented processor: any Forth implementation benefits from its multiple hardware stacks and an addressing mode that allows most instructions to access them directly.


Forth has always been considered fast and compact in its use of memory, making it ideal for small microprocessors running embedded applications. Its use of RPN puts a lot of people off because it’s seen as ‘difficult’. But Forth code is considered to be very reliable and less likely to contain bugs. This may be due to the programmer taking greater care when generating the code. Being able to test out new word definitions in interpreter mode before compiling them also helps. Code reliability and speed may explain why so many NASA and ESA robotic spacecraft have been programmed in Forth, often running on a special radiation-hardened microcontroller, the Harris/Intersil RTX2010.

The Z80 MPU lives on in the form of the Z84C00xx (169-8229) , the Z80180xx (177-7104) and the eZ80F91 (165-9325) .

If you're stuck for something to do, follow my posts on Twitter. I link to interesting articles on new electronics and related technologies, retweeting posts I spot about robots, space exploration and other issues.

Engineer, PhD, lecturer, freelance technical writer, blogger & tweeter interested in robots, AI, planetary explorers and all things electronic. STEM ambassador. Designed, built and programmed my first microcomputer in 1976. Still learning, still building, still coding today.


May 15, 2021 10:25

@Bill, do you collect Forth memorabilia?
From my exploits in this area I have an old SmartPACs Forth board serail driven, is that of any interest to you? see attached From the date codes it's early 80's.
I'm not sure how to pm you but raise a support ticket and they will forward my details to discuss.

0 Votes

May 11, 2021 10:18

Great memories of using Forth on Z8 microcontrollers to control pre-press equipment in the late 80s. I've commented on DS before when Forth has featured in an article and hearing other engineers experiences of those days is always special. I still miss "blowing" compiled code into UVEprom before running on the target hardware. Uploading via USB may be quicker but its just not the same...

0 Votes

May 10, 2021 10:16

This is brilliant! My first 'proper' coding exercise in assembler for Z80 on ZX Spectrum was to implement Forth! I was only 15 then. I still remember waking up in the middle of the night with realisation why my first add Forth command was doing odd thing around numbers at the edge of 8bit - 256/512/1024, etc... I forgot to use adc for 'high' bytes! ;)

0 Votes

May 11, 2021 06:57

@clicky Thanks! Ah, the joy of 'bare-metal' programming... That's the trouble with high-level languages: you're isolated from things like the Arithmetic Carry bit. On the other hand, programming at machine level results in a much greater understanding of the hardware. Did you get your Forth working in the end?

April 9, 2021 07:09

Also in the category of Z80 MPUs is the Rabbit 6000 Microprocessor module (RS Stock No. 761-7374) which can operate up to 200 MHz, (at, I think, 2 clocks per instruction) and adds additional instructions and memory management beyond those in the Z80180. It also has a plethora of built-in peripherals.
(I believe the Rabbit 6000 and Zilog's eZ80F91 which clocks at 50MHz are both in the not recommended for new design status, but it is still pretty amazing what a long life the underlying architecture/family has had!)

0 Votes

April 8, 2021 15:33

I learned Forth back in the early 1980s. Not long after that, I worked on a project adding a GPIB (now known as IEEE-488) interface to a multichannel data analyzer. User needs varied - some would be using the analyzer in automated test environments, and needed the output in raw formats that could achieve the highest throughput across the interface, to keep up with data from multiple channels simultaneously. At other times users needed more human-readable decoded and filtered data output. I ended up borrowing Forth's method of compiling source into a linked list of calls and native code/data. I defined a tiny language for managing the captured data sequencing and formatting the captured data. The user could then use that language to define their desired output and formatting. The interface compiled their request into the series of calls to generate that output. That compiling approach (rather than continually re-interpreting the formatting commands) maximized the throughput capabilities of the analyzer, which was based on an 8-bit 6809 processor.
This is a benefit of learning more than one language (computer or human). They each have their own way of seeing the world, sometimes bringing with them insights into how to better address a problem, even if the solution ends up being implemented in a different language than the insight came from.

May 11, 2021 06:58

Eh, similar here, but at that times on the other side of the iron curtain. Forth was for me the only option to control IEC 625 aka IEEE488 with Czech version of PDP-11, called JPR-12R. Basic language did 6 digit precision, while we needed to measure temperature characteristics of 100 MHz crystals with 1 Hz precision. Forth and application program was loaded from punched tape.

DesignSpark Electrical Logolinkedin