
RockSynth Part 1: Writing a MIDI Controlled Synthesizer on the ROCK 4 C+

Since audio programming and music production are my fortes, I thought it would be really cool to test out my Okdo ROCK 4 C+ in that context. Instead of forking out hundreds of euro for a synthesizer, wouldn't it be cool to make our own by plonking an SBC on a MIDI keyboard and coding a digital synth to run on it? That's what this article series will cover. It's quite a big project, so I'll split it into three parts - this first part covers the preliminary setup of getting an audio program running on the board using C++, plus a bit of theory; in the second we'll make our synth actually playable; and in the final part we'll add some pizzazz with effects.

Hardware

In terms of the hardware I'm using for this project, I've got

  • an Okdo ROCK 4 C+ (230-6199) - but any ROCK board that's got a DAC (either 3.5mm jack or USB) will do.
  • a USB MIDI keyboard - any will do but mine has eight knobs for MIDI CC that I plan on using. If you don't have a MIDI keyboard you could hook up a computer keyboard to play the keys too.
  • a pair of headphones or audio interface to hear the audio output.

Software Prerequisites

For this project, I set up the board in headless mode with Ubuntu Server using the instructions here. This required me to do some Things™ before I was good to go.

Andrew Back's article about building an audio player on the same board was very helpful with the setup here. They point out that after getting the OS and SSH set up, running the following commands will fix errors about apt signing keys:

$ wget -O - apt.radxa.com/focal-stable/public.key | sudo apt-key add -
$ sudo apt update && sudo apt dist-upgrade

And furthermore, we can fix the user permissions with:

$ sudo chown -R rock:rock /home/rock

I mostly followed my setup for C++ development from one of my previous articles; however, since we're on Ubuntu Server and not Debian, I needed to do something different to get a version of gcc new enough to use C++20. Following the instructions here, we can either run the following commands:

$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt update

or add the following lines to the /etc/apt/sources.list file:

deb https://ppa.launchpadcontent.net/ubuntu-toolchain-r/test/ubuntu focal main 
deb-src https://ppa.launchpadcontent.net/ubuntu-toolchain-r/test/ubuntu focal main

After this, we can run

$ sudo apt update
$ sudo apt install gcc-13 g++-13

to get nice new(ish) versions of gcc and g++. At this point, you should also make sure you have CMake and Git up to date using apt. With this, I was then able to build Neovim and set up my config as I also described in my previous article, but you should use whichever editor you like.

I'm going to be using the RtAudio library for audio I/O. I installed it system-wide with CMake by running:

$ git clone https://github.com/thestk/rtaudio.git
$ cd rtaudio
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DRTAUDIO_API_ALSA=ON
$ cmake --build . --config Release
$ sudo cmake --install .

I also had to make sure my audio was working. This was... a process. Andrew Back's article, as I mentioned, goes through this well, covering the use of alsamixer and speaker-test. In my case, I also had to follow the steps in this post to get it working reliably, since PulseAudio has some quirks with what it considers to be your default device. Once I had speaker-test reliably putting pink noise through my speakers, I was ready to start the actual project.

Setting up the Project

I used the C++ project setup script I made in my previous article to create a new project called “rocksynth”. However, because I could only get version 3.16 of CMake on this version of Ubuntu Server, I had to make two changes to the CMakeLists.txt: setting the minimum required CMake version to 3.16 and the C++ standard to 20.

cmake_minimum_required(VERSION 3.16)
…
set_property(TARGET rocksynth PROPERTY CXX_STANDARD 20)
…

The whole project structure is as follows:

rocksynth
├── CMakeLists.txt
├── .gitignore
└── src
    └── main.cpp

With the .gitignore file looking like this. Since we installed RtAudio system-wide, we can use CMake to find the library and link to it. The full CMakeLists.txt now looks like this:

cmake_minimum_required(VERSION 3.16)
project(rocksynth)

include(FetchContent)

set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

FetchContent_Declare(fmt
    GIT_REPOSITORY https://github.com/fmtlib/fmt.git
    GIT_TAG master
    GIT_SHALLOW ON
)
FetchContent_MakeAvailable(fmt)

find_package(RtAudio REQUIRED)

add_executable(rocksynth src/main.cpp)
set_property(TARGET rocksynth PROPERTY CXX_STANDARD 20)
if (CMAKE_BUILD_TYPE MATCHES Debug)
    target_compile_definitions(rocksynth PRIVATE ROCKSYNTH_DEBUG)
endif()

target_compile_options(rocksynth PRIVATE -Wall -Wextra -Wpedantic -Werror)
target_link_libraries(rocksynth PRIVATE fmt::fmt RtAudio::rtaudio)

Let’s test the audio output with RtAudio. I followed something similar to the basic setup from their documentation into my main.cpp file and tried to put some white noise through the speakers:

#include <fmt/core.h>

#include <rtaudio/RtAudio.h>

#include <chrono>
#include <cstdint>
#include <random>
#include <thread>

static std::random_device s_rd;
static std::mt19937 s_gen{s_rd()};
static std::uniform_real_distribution<float> s_dist(-1.0f, 1.0f);

int audioCallback(
    void* outBuffer,
    [[maybe_unused]] void* inBuffer,
    uint32_t bufferFrames,
    [[maybe_unused]] double streamTime,
    RtAudioStreamStatus status,
    [[maybe_unused]] void* userData
)
{
    float* buffer = static_cast<float*>(outBuffer);

    if (status) {
        fmt::print(stderr, "Stream underflow\n");
        return 0;
    }

    
    // Samples are interleaved, we can fill both channels with white noise (random numbers)
    for (size_t sample = 0; sample < bufferFrames; sample++) {
        for (size_t channel = 0; channel < 2; channel++) {
            *buffer++ = s_dist(s_gen) * 0.5f;
        }
    }

    return 0;
}

int main([[maybe_unused]] int argc, [[maybe_unused]] const char* argv[])
{
    // Object that will handle I/O
    RtAudio dac;

    
    // Reusable lambda we can use to clean up on error/exit
    auto cleanup = [&] {
        if (dac.isStreamOpen()) {
            dac.closeStream();
        }
    };

    auto deviceIds = dac.getDeviceIds();
    if (deviceIds.empty()) {
        fmt::print(stderr, "No audio devices found.\n");
        std::exit(1);
    }

    
    // Set up stream parameters
    auto defaultDevice = dac.getDefaultOutputDevice();
    auto deviceInfo = dac.getDeviceInfo(defaultDevice);
    fmt::print("Using audio output device: {}\n", deviceInfo.name);

    RtAudio::StreamParameters streamParameters {
        .deviceId = defaultDevice,
        .nChannels = 2,
        .firstChannel = 0
    };

    uint32_t sr = 44100;
    uint32_t bufferSize = 256; 
    
    // openStream() takes a pointer to the buffer size because it will change it if
    // that buffer size is not allowed.
    // Try and open a stream with our desired parameters.
    if (dac.openStream(&streamParameters, nullptr, RTAUDIO_FLOAT32, sr, &bufferSize, &audioCallback)) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        std::exit(1);
    }

    if (dac.startStream()) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        cleanup();
        std::exit(1);
    }

    
    // For now, just sleep the main thread for 10 seconds so we can hear the audio.
    using namespace std::chrono_literals;
    std::this_thread::sleep_for(10s);

    if (dac.isStreamRunning()) {
        dac.stopStream();
    }

    cleanup();
}

I built and ran this with these commands (in the project directory):

$ mkdir build
$ cmake -B build -S . -DCMAKE_BUILD_TYPE=Release
$ cmake --build build
$ ./build/rocksynth

If you don’t hear anything, you may have to do some more audio configuration. I had to open up the alsamixer and toggle the “Playback Path” of my selected sound card a bit.

If that’s working, let’s write an oscillator as a little proof of concept, and that will be enough code for this part. I made a new sub-folder in the source directory for synth-specific things called “Synth”, and the Oscillator.hpp and Oscillator.cpp files.

// Oscillator.hpp
#ifndef OSCILLATOR_HPP
#define OSCILLATOR_HPP

#include <cstdint>

class Oscillator
{
public:
    enum class Shape
    {
        Sine,
        Saw,
        Pulse,
        Triangle,
    };
    
    void prepare(uint32_t sampleRate) noexcept;
    float getNextSample();

    void setFrequency(float frequency) noexcept;
    void setPulseWidth(float pulseWidth) noexcept;
    void setShape(Shape shape) noexcept;

private:
    uint32_t m_sampleRate{44100}; // sensible default until prepare() is called
    float m_phase{0.0f};
    float m_frequency{1.0f};
    float m_pulseWidth{0.5f};
    Shape m_shape{Shape::Sine};
};

#endif

// Oscillator.cpp
#include "Oscillator.hpp"

#include <fmt/core.h>

#include <cmath>
#include <numbers>

void Oscillator::prepare(uint32_t sampleRate) noexcept
{
    m_sampleRate = sampleRate;
}

float Oscillator::getNextSample()
{
    if (m_phase >= 1.0f) {
        m_phase = 0.0f;
    }
    m_phase += 1.0f / (m_sampleRate / m_frequency);

    switch (m_shape) {
        case Shape::Sine:
            return std::sin(m_phase * std::numbers::pi_v<float> * 2.0f);
        case Shape::Saw:
            return 2.0f * (m_phase - std::floor(m_phase + 0.5f));
        case Shape::Pulse:
            return m_phase <= m_pulseWidth ? 1.0f : -1.0f;
        case Shape::Triangle:
            return 4.0f * std::abs(m_phase - std::floor(m_phase + 0.75f) + 0.25f) - 1.0f;
        default: {
            fmt::print(stderr, "Invalid shape\n");
            return 0.0f;
        }
    }
}

void Oscillator::setFrequency(float frequency) noexcept
{
    m_frequency = frequency;
}

void Oscillator::setPulseWidth(float pulseWidth) noexcept
{
    m_pulseWidth = pulseWidth;
}

void Oscillator::setShape(Shape shape) noexcept
{
    m_shape = shape;
}

The Oscillator class encapsulates the parameters it needs to produce the right sound. The prepare() function is a common pattern in audio programming that we’ll be using again - essentially, any audio processor is given relevant information about the system (sample rate, buffer size) before any audio is processed, so it can do its preliminary setup. In the case of our oscillator, we need to know the sample rate so that we can calculate the period length of the wave at the current frequency. I’m using the general definitions for these waveforms from Wikipedia (except for the pulse wave, which is simply 1 or -1 on either side of the pulse width).

Sine wave:

$$y(t) = A\sin(2\pi f t + \varphi) = A\sin(\omega t + \varphi)$$

Sawtooth wave:

$$x(t) = 2\left(t - \left\lfloor t + \tfrac{1}{2}\right\rfloor\right),\quad t - \tfrac{1}{2} \notin \mathbb{Z}$$

Triangle wave:

$$x(t) = 4\left\lvert t - \left\lfloor t + \tfrac{3}{4}\right\rfloor + \tfrac{1}{4}\right\rvert - 1$$

These functions are in terms of t, which I’ve represented as “phase” in the code, or simply the point in the period of the wave cycle between 0 and 1.

Ok! We have an oscillator - let's try playing it through the speakers. I updated the main file to look like this:

#include "Synth/Oscillator.hpp"

#include <fmt/core.h>

#include <rtaudio/RtAudio.h>

#include <chrono>
#include <cstdint>
#include <thread>

static Oscillator s_oscillator;

int audioCallback(
        void* outBuffer,
        [[maybe_unused]] void* inBuffer,
        uint32_t bufferFrames,
        [[maybe_unused]] double streamTime,
        RtAudioStreamStatus status,
        [[maybe_unused]] void* userData
)
{
    float* buffer = static_cast<float*>(outBuffer);

    if (status) {
        fmt::print(stderr, "Stream underflow\n");
        return 0;
    }

    // Put the same sample in each channel
    for (size_t sample = 0; sample < bufferFrames; sample++) {
        auto value = s_oscillator.getNextSample();
        for (size_t channel = 0; channel < 2; channel++) {
            *buffer++ = value * 0.5f;
        }
    }

    return 0;
}

int main([[maybe_unused]] int argc, [[maybe_unused]] const char* argv[])
{
    RtAudio dac;

    auto cleanup = [&] {
        if (dac.isStreamOpen()) {
            dac.closeStream();
        }
    };

    auto deviceIds = dac.getDeviceIds();
    if (deviceIds.empty()) {
        fmt::print(stderr, "No audio devices found.\n");
        std::exit(1);
    }

    auto defaultDevice = dac.getDefaultOutputDevice();
    auto deviceInfo = dac.getDeviceInfo(defaultDevice);
    fmt::print("Using audio output device: {}\n", deviceInfo.name);
    
    RtAudio::StreamParameters streamParameters {
        .deviceId = defaultDevice,
        .nChannels = 2,
        .firstChannel = 0
    };

    uint32_t sr = 44100;
    uint32_t bufferSize = 256; 
    
    // Set up Oscillator's parameters - try different waveforms and frequencies!
    s_oscillator.prepare(sr);
    s_oscillator.setShape(Oscillator::Shape::Saw);
    s_oscillator.setFrequency(220.0f);

    if (dac.openStream(&streamParameters, nullptr, RTAUDIO_FLOAT32, sr, &bufferSize, &audioCallback)) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        std::exit(1);
    }

    if (dac.startStream()) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        cleanup();
        std::exit(1);
    }

    using namespace std::chrono_literals;
    std::this_thread::sleep_for(10s);

    if (dac.isStreamRunning()) {
        dac.stopStream();
    }

    cleanup();
}

We also need to tell CMake to compile the new Oscillator.cpp source file by updating the add_executable() call in CMakeLists.txt:

add_executable(rocksynth src/main.cpp src/Synth/Oscillator.cpp)

After recompiling and running the program again, you should hear a beautiful, buzzy saw wave.

Next Steps

The next two parts of this article series will be quite code-heavy, since we need to add interaction and a lot more features to the synth. So, to prepare for that, let's go through some theory on the structures of a synth and an audio program.

We've already seen the core structure of our audio program - we are given a buffer of samples by the operating system (in our case this is handled internally by RtAudio) that we need to fill with the audio we produce in real time. If our program runs too slowly and we can't fill the buffer before the operating system wants the next one, we get a "buffer underrun", leading to pops and drop-outs. For this reason, performance is absolutely critical - this is why we use high-performance, compiled languages like C++, C, or even Rust for real-time audio software. To keep performance up, certain things should generally be avoided on the audio thread of a program, including excessive heap allocations and locking mutexes.

The rest of our program will branch out from our audioCallback() function in the main file, which is called for each buffer on a separate audio thread initialised internally by RtAudio. We'll have some other class instance that contains all of our audio processors, with its own callback that takes the buffer to fill and passes it down the processing chain as needed.

As for the synth, feel free to make changes for your own version if you're following along, but my idea is to have an 8-voice polyphonic, dual-oscillator synth with a VCF (voltage-controlled filter). A "voice" is a single audio path with all of the elements required to play monophonic notes - by having many of these and activating them in turn as we play more notes, we get polyphony. A single voice in our synth will look something like this:

Structure of a single synthesizer voice

Each voice needs its own ADSR (attack, decay, sustain, release) envelope and filter so that each note we play can have its own envelope applied. And with eight of these, we can send MIDI events to them individually and mix their outputs at the end, with a full layout something like this:

Full structure of synthesizer

We'll refer back to these diagrams as we flesh out the synth. In the next part, we'll get MIDI input running, set up a full synth voice, and start playing real notes. Thanks for reading, and leave a comment if you run into any difficulties!

Articles in this series:

Part 1 - Getting RtAudio running on the Okdo ROCK 4 C+, plus theory and structure (this article)
Part 2 - Fleshing out the synth and making it actually playable and controllable
Part 3 - Making the basic synth musical with the addition of a filter and chorus effect

I'm a software engineer and third-level teacher specialising in audio and front end. In my spare time, I work on projects related to microcontrollers, electronics, and programming language design.
