Clicker 2, FORTHdsPIC and more Bare-Metal


View of the Clicker 2 development system with a ‘jacked-up’ Parallax robot being used to test the revised FORTHdsPIC code. The robot’s on-board regulators provide the motor power and tacho O/C pull-ups via a separate 7.5V power supply plugged into the Arduino interface Vin socket. Everything else is USB-powered.

In a blog post a short while ago I described how I ported my dsPIC-based embedded Forth language compiler/interpreter to a new development board, the Clicker 2 from MikroElektronika. The FORTHdsPIC project started a few years ago to update a version of Forth I’d written for the Z80 microprocessor back in the 1980s. That earlier version, LUTForth, enabled the microprocessor-controlled instruments I was developing at the time to be programmed efficiently in a high-level language. Nowadays we refer to these as ‘embedded’ systems. But why Forth, and why go to all the trouble of creating it with assembler-language (“Bare-Metal”) programming? Well, for many years Forth was the language of choice for embedded real-time control applications: robots and spacecraft, for example. It takes up little space, under 8Kbytes in the case of FORTHdsPIC, and is really fast. As for the bare-metal programming, it’s a bit like rock climbing without a safety line. Alright, coding mistakes aren’t usually fatal unless a driverless car or an aircraft is involved, but you may need the patience of a crash investigator to track those bugs down.

The main reason for the change to the dsPIC33E (‘E’) part on the Clicker 2 board was to get access to a lot more pins, more timers and a higher clock speed. Here are some more of the fun and games I’ve had with the Clicker 2 port.

It’s all in the Timing

The new chip will run at 70MHz, nearly doubling the processor speed from the old chip’s 40MIPS to 70MIPS. This meant that a lot of timing constants within the code had to be increased by a factor of 7/4. The critical settings that had to be changed first included those affecting the system clock generator and terminal communication UART baud rate generator. With those two areas covered FORTHdsPIC booted up and announced itself in the terminal emulator running on my laptop. All basic Forth words were available and working.
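
As a rough illustration of the 7/4 scaling, the Python sketch below recalculates a UART baud rate divisor and a generic cycle-count constant for the two clock speeds. It assumes the standard-speed (BRGH = 0) relationship from the dsPIC33 family reference manuals, UxBRG = Fcy/(16 × baud) − 1, and a 115200 baud terminal link. The baud rate and the function names are illustrative choices, not values taken from the FORTHdsPIC source.

```python
OLD_FCY_HZ = 40_000_000   # instruction clock of the old part (40 MIPS)
NEW_FCY_HZ = 70_000_000   # instruction clock of the new part (70 MIPS)


def uart_brg(fcy_hz, baud, brgh=False):
    """Return the nearest baud rate divisor for a given instruction clock.

    Assumes UxBRG = Fcy / (N * baud) - 1, with N = 16 in standard mode
    and N = 4 in high-speed mode.
    """
    n = 4 if brgh else 16
    return round(fcy_hz / (n * baud)) - 1


def actual_baud(fcy_hz, brg, brgh=False):
    """Baud rate actually produced by an integer divisor."""
    n = 4 if brgh else 16
    return fcy_hz / (n * (brg + 1))


def scale_constant(old_value, old_fcy_hz=OLD_FCY_HZ, new_fcy_hz=NEW_FCY_HZ):
    """Scale a cycle-count constant so that it spans the same real time."""
    return round(old_value * new_fcy_hz / old_fcy_hz)


for fcy in (OLD_FCY_HZ, NEW_FCY_HZ):
    brg = uart_brg(fcy, 115200)
    print(fcy, brg, round(actual_baud(fcy, brg)))
```

With these assumptions the divisor for 115200 baud moves from 21 at 40MHz to 37 at 70MHz, and a delay constant of 1000 cycles becomes 1750.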

Flushed with success, I set about modifying the code that allows Forth programs to be stored in the non-volatile Flash memory. Without that working, it could hardly be called an embedded system. Changes were necessary because Microchip had added new features to the Flash programming hardware. At this point, I should say that I’d assumed the new part would be functionally the same as the old one apart from the speed increase and more of everything. That was a mistake. Yes, there are more general-purpose timers, but other basic things like the UARTs, Input Capture (for reading wheel tachometers) and Output Compare (for generating PWM waveforms) have acquired lots of extra features with more control registers to set up. My original code would require extensive modification.

Getting the Flash programmer to work took hours of frustrating effort. I had consulted the relevant section of the user manual and taken into account the various register changes, but still no joy. I got to the point where I decided the internal programming voltage circuit must have been damaged and I obtained a replacement board. The new board was no better. When I eventually spotted the problem, I at first cursed myself for being so careless, and then Microchip for not updating the assembler code example in the manual. The C code example had been changed, but not the assembler. The problem is that the Flash timing in the 40MHz part only requires a couple of ‘NOP’ (No Operation) instructions after the programming ‘trigger’ instruction. The 70MHz part needs the same wait period, but of course, the NOP instructions take just over half as long to execute (Fig.1a). It was easily solved by using a loop that only exits when a Ready flag is set (Fig.1b). The moral of this sorry tale is:

  • Always read the manual thoroughly.
  • Don’t believe everything you read in manuals.
  • Never use fixed timing delays (padding) in your code.
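
The difference between padding with NOPs and polling a Ready flag can be shown with a small host-side simulation in Python. Everything in it is a stand-in: `FakeFlash`, the 50ns busy period and the `max_polls` safety limit are made up for illustration and are not the dsPIC33E’s registers or timings. The real fix is a short assembler loop that tests a hardware flag.

```python
CYCLE_NS_40 = 1000 / 40   # one instruction cycle at 40 MIPS, in nanoseconds
CYCLE_NS_70 = 1000 / 70   # one instruction cycle at 70 MIPS, in nanoseconds


class FakeFlash:
    """Toy Flash controller that stays busy for a fixed real time."""

    def __init__(self, busy_ns):
        self.remaining_ns = busy_ns   # hypothetical busy period after the trigger

    def tick(self, cycle_ns):
        """Advance simulated time by one instruction cycle."""
        self.remaining_ns = max(0.0, self.remaining_ns - cycle_ns)

    @property
    def ready(self):
        return self.remaining_ns == 0


def fixed_delay(flash, nops, cycle_ns):
    """Pad with a fixed number of NOPs, then report whether Flash has finished."""
    for _ in range(nops):
        flash.tick(cycle_ns)
    return flash.ready


def poll_ready(flash, cycle_ns, max_polls=1000):
    """Loop until the Ready flag is set; return cycles waited, or None on timeout."""
    for polls in range(max_polls):
        if flash.ready:
            return polls
        flash.tick(cycle_ns)
    return None


print(fixed_delay(FakeFlash(50), 2, CYCLE_NS_40))   # two NOPs cover 50ns at 40 MIPS
print(fixed_delay(FakeFlash(50), 2, CYCLE_NS_70))   # the same two NOPs fall short at 70 MIPS
print(poll_ready(FakeFlash(50), CYCLE_NS_70))       # polling waits as long as needed
```

The fixed delay that works at 40 MIPS silently fails at 70 MIPS, while the polling loop adapts to whatever the clock speed happens to be.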


The Trouble with Tachometers

I was never really satisfied with the wheel rotation speed measurement code I outlined in my blog post on robot mobility control. Because of a shortage of timers, the two tachometer (Input Capture) channels had to share the same GP Timer 3. This caused two irritating limitations:

  • The timer had to free-run and interval measurements were taken by subtracting the ‘captured’ timer value on a tacho pulse interrupt from the previous one.
  • At very slow wheel speed, the interval between tacho pulses exceeds the 16-bit range of the timer and erroneous data is returned.


If each channel had its own timer then it could be restarted from zero after each capture, eliminating the need for any subtractions. Moreover, if the timer reached 0xFFFF and rolled over, it would generate an interrupt which could be used to set a flag indicating invalid data. The dsPIC33E has a load more timers. Problem solved? Yes and No. The new Input Capture hardware features a specially dedicated timer for each channel. Yay! Unfortunately, it can’t be cleared by software and there is no overflow interrupt available. Cue howls of frustration. Luckily, I spotted a solution to the overflow problem before I treated the board to a rather more literal interpretation of the term ‘boot-up’.

Despite having its own dedicated timer, each IC channel still needs a GP Timer unit to provide its clock signal. All it actually uses are the GP Timer’s input prescaler and gating circuits. The timer-counter itself is unused and just sits there counting away for no purpose. Until, that is, I realised that because it shares its clock and can be cleared by software, it provides a perfect ‘watchdog’, generating an overflow interrupt whenever the IC timer receives more than 65535 clock pulses between adjacent tacho pulses. The capture data subtraction is still needed, but at least the first few dodgy tacho data samples from rest are all replaced with 0xFFFF, which helps the PID controller lock on more quickly.
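
The interval logic described here can be modelled on a host in a few lines of Python. This is a sketch under assumptions, not the FORTHdsPIC source: the class and method names are hypothetical, the two methods stand in for the Input Capture and GP Timer overflow interrupt handlers, and treating the first capture after an overflow as invalid is a choice made for the illustration.

```python
TIMER_MASK = 0xFFFF   # 16-bit free-running Input Capture timer
INVALID = 0xFFFF      # value reported when the interval is out of range


class TachoChannel:
    """Host-side model of one tachometer channel."""

    def __init__(self):
        self.previous = 0        # last captured timer value
        self.overflowed = True   # no trustworthy reference edge yet (wheel at rest)
        self.interval = INVALID

    def on_watchdog_overflow(self):
        """GP Timer rolled over: more than 65535 clocks since the last pulse."""
        self.overflowed = True
        self.interval = INVALID

    def on_capture(self, captured):
        """Tacho pulse: `captured` is the IC timer value latched by hardware."""
        if self.overflowed:
            self.interval = INVALID   # previous edge is too old to trust
            self.overflowed = False
        else:
            # Modulo-2^16 subtraction stays correct across a timer wrap.
            self.interval = (captured - self.previous) & TIMER_MASK
        self.previous = captured
        # On the real chip the GP Timer count would be cleared by software here.
        return self.interval
```

For example, captures at 0xFFF0 and then 0x0010 give an interval of 32 counts even though the timer wrapped in between, whereas any capture following a watchdog overflow reports 0xFFFF.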

Other Changes and Additions

The picture below (Fig.2) provides a guide to the functions available on this new version 0.8 of FORTHdsPIC based on a Clicker 2 board. There are four PWM servomotor channels plus two more with full PWM-programmability: Pulse Frequency from 50 to 10000Hz, and Duty Cycle from 0 to 100%. These can be used for the speed control of two PMDC motors via suitable driver circuits.

Although not fully coded yet, I have added drivers for a second UART driving the Aux Tx/Rx pins on the mikroBUS Click socket 2. The plan is to use this for a wireless channel providing a remote-control facility for a mobile robot. MikroElektronika has a huge range of wireless Click boards covering Bluetooth, WiFi, LoRa and other, simpler protocols. I think I’ll go for a basic 434/868MHz ISM-band ‘cable-replacement’ type, at least initially. My reasons:

  • Driver software is simple for peer-to-peer remote-control communication.
  • The control unit will be like that used for R/C model vehicles, using joysticks. Do you ever see professionals trying to drive an AUV or UAV from a touch-screen on a phone or tablet?
  • Wide channel bandwidth is not needed, because only low data rates will be used.
  • I have a spare Clicker 2 board for the remote! (see above)


Of course, should I want to connect it to an IoT network based on LoRaWAN, for example, (almost) all I have to do is change the Click board.

Is it really faster?

The new board ought to be able to run Forth code 75% faster than the old one on the basis of the clock speed alone. Only one way to find out: run some benchmarks. Developers of new versions of Forth have always had the benefit of a simple set of benchmark routines, each designed to exercise different features of the language. I have the results for the 40MHz part and was looking forward to seeing that big improvement. But they showed only a 20% increase in speed. Why? After a bit of head-scratching I went back to the manuals, in particular the dsPIC33E instruction set, which tells you how many clock cycles each instruction takes to execute. And there it was: all conditional branch instructions now take four cycles instead of two if the condition is true. The unconditional branch also takes four, with subroutine and interrupt returns doubling to six cycles. These changes are undoubtedly the cause of the less-than-impressive benchmark performance. To be fair, the tests involve an awful lot of branching relative to the execution of normal single-cycle instructions. A ‘real’ program may well perform much better. Still, it just goes to show that reading manuals is very important, including all the fine print!
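
A back-of-envelope model shows how doubled branch costs can swallow most of a 75% clock gain. It assumes, purely for illustration, that some fraction of the old chip’s execution cycles went on instructions whose cycle count doubles on the new chip, and that everything else is unchanged. That fraction is a free parameter in this Python sketch, not a measured figure, and the function names are hypothetical.

```python
def speedup(branch_fraction, old_mips=40.0, new_mips=70.0, penalty=2.0):
    """Overall speedup when a fraction of the old cycles now costs `penalty` times more."""
    new_cycles = (1.0 - branch_fraction) + penalty * branch_fraction
    return (new_mips / old_mips) / new_cycles


def fraction_for(target_speedup, old_mips=40.0, new_mips=70.0, penalty=2.0):
    """Invert the model: the branch fraction that yields a given speedup."""
    new_cycles = (new_mips / old_mips) / target_speedup
    return (new_cycles - 1.0) / (penalty - 1.0)


print(speedup(0.0))                  # no doubled instructions: the full clock gain
print(round(fraction_for(1.2), 3))   # fraction implied by a 20% improvement
```

Under this simple model, a 20% improvement corresponds to roughly 46% of the old chip’s cycles being spent in taken branches, jumps and returns, which fits benchmarks that are heavy on branching.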


I’ve attached the fully-annotated source code for version 0.8 below, together with the .hex (machine code) file which can be Flashed straight into the Clicker 2 board using MikroElektronika’s free programming suite.

If you're stuck for something to do, follow my posts on Twitter. I link to interesting articles on new electronics and related technologies, retweeting posts I spot about robots, space exploration, and other issues.

Engineer, PhD, lecturer, freelance technical writer, blogger & tweeter interested in robots, AI, planetary explorers and all things electronic. STEM ambassador. Designed, built and programmed my first microcomputer in 1976. Still learning, still building, still coding today.