
RockSynth Part 2: Writing a MIDI Controlled Synthesizer on the ROCK 4 C+

In the previous entry of this article series, we got the RtAudio real-time audio I/O C++ library installed and running on the Okdo ROCK 4 C+ (230-6199), got an oscillator buzzing through the speakers, and talked about the theory and structure of our synth program going forward. In this part, it's time to flesh out the synth and make it actually playable and controllable. This article will be much more code-heavy, since there are a lot of classes to introduce, but it's going to be a lot of fun, and the GitHub repository I'm using is available here.

The Synth Voice

A "voice" in a synth is a complete audio path capable of playing monophonic notes. Each voice will usually contain its own oscillator(s), envelope(s), filter(s), and maybe more so that they can act independently of one another in a polyphonic situation. We have an oscillator already from the previous article, so now it's time to build out the rest of the synth voice based on this diagram I introduced:

Structure of a single synthesizer voice

I'm going to leave the VCF (voltage-controlled filter) until the final part, so for now we can make an ADSR (attack, decay, sustain, release) envelope and a voice to encapsulate it all.

ADSR

An ADSR envelope is an essential part of subtractive synthesis - instead of our oscillators being limited to on or off, we can shape the volume of the sound over time. It looks like this:

Diagram of an ADSR envelope

I made two new files in the "Synth" subdirectory of the source folder, Adsr.hpp and Adsr.cpp. The header looks like this:

#ifndef ADSR_HPP
#define ADSR_HPP

#include <cmath>
#include <cstdint>

class Adsr
{
public:
    enum class Phase
    {
        Idle,
        Attack,
        Decay,
        Sustain,
        Release,
    };

    void prepare(uint32_t sampleRate) noexcept;
    [[nodiscard]] float getNextValue() noexcept;

    void noteOn() noexcept;
    void noteOff() noexcept;

    template<Phase ParamType> requires (ParamType != Phase::Idle)
    void setParam(float value) noexcept
    {
        if constexpr (ParamType == Phase::Attack) {
            m_attackTime = std::fmax(value, 0.001f);
        } else if constexpr (ParamType == Phase::Decay) {
            m_decayTime = std::fmax(value, 0.001f);
        } else if constexpr (ParamType == Phase::Sustain) {
            m_sustainLevel = value;
        } else if constexpr (ParamType == Phase::Release) {
            m_releaseTime = std::fmax(value, 0.001f);
        }
    }

    [[nodiscard]] Phase getCurrentPhase() const noexcept;

private:
    Phase m_state{Phase::Idle};
    
    float m_sampleRate{44100.0f}; // sensible default; overwritten in prepare()
    float m_sampleCounter{0.0f};

    float m_attackTime{0.0f}, m_decayTime{0.0f}, m_releaseTime{0.0f};
    float m_sustainLevel{1.0f};

    float m_maxLevel{0.0f};
};

#endif

And the corresponding C++ file looks like this:

#include "Adsr.hpp"

#include <cmath>

void Adsr::prepare(uint32_t sampleRate) noexcept
{
    m_sampleRate = static_cast<float>(sampleRate);
}

float Adsr::getNextValue() noexcept
{
    switch (m_state) {
        case Phase::Attack: {
            // time in seconds the attack phase has been active
            auto t = m_sampleCounter++ / m_sampleRate;
            // get this as a proportion between 0 - 1
            // this will be the maximum level reached during the attack phase
            m_maxLevel = t / m_attackTime;

            if (m_maxLevel >= 1.0f) {
                // if we reached max amplitude of 1, the attack phase is over
                // move to decay on the next call
                m_state = Phase::Decay;
                m_sampleCounter = 0;
            }

            return m_maxLevel;
        }
        case Phase::Decay: {
            // time in seconds the decay phase has been active
            auto t = m_sampleCounter++ / m_sampleRate;
            // invert the decay level since volume should be decreasing
            auto decayLevel = 1.0f - t / m_decayTime;
            // add this to the sustain level, but scaled within the headroom above sustain level
            // treat this as the new maximum level reached in case release is triggered early
            m_maxLevel = m_sustainLevel + decayLevel * (1.0f - m_sustainLevel);
            
            if (decayLevel <= 0.0f) {
                // decay phase done, move to sustain on the next call
                m_state = Phase::Sustain;
                m_sampleCounter = 0;
            }

            return m_maxLevel;
        }
        case Phase::Sustain:
            return m_sustainLevel;
        case Phase::Release: {
            // time in seconds the release phase has been active
            auto t = m_sampleCounter++ / m_sampleRate;
            // invert release level since volume is decreasing
            auto level = 1.0f - t / m_releaseTime;

            if (level <= 0.0f) {
                // release phase done, go back to idle
                m_state = Phase::Idle;
            }

            // scale the maximum level reached in case of early note off
            return level * m_maxLevel;
        }
        case Phase::Idle:
        default:
            return 0.0f;
    }
}

void Adsr::noteOn() noexcept
{
    // reset everything when a new note is activated
    m_sampleCounter = 0;
    m_state = Phase::Attack;
}

void Adsr::noteOff() noexcept
{
    // don't do anything if the whole process has already finished
    if (m_state != Phase::Release && m_state != Phase::Idle) {
        m_sampleCounter = 0;
        m_state = Phase::Release;
    }
}

Adsr::Phase Adsr::getCurrentPhase() const noexcept
{
    return m_state;
}

There are probably smarter ways to write an ADSR, but this simple approach works. It follows the same pattern we saw in the oscillator before: a prepare() call for setup, then a per-sample getter (getNextValue() here, getNextSample() there). The setter function for the envelope's parameters is a template with a fancy-shmancy C++20 requires clause, just to cut down on boilerplate.

Side note: I usually prefer to define class methods in a separate .cpp file, so that changing the implementation doesn't force a recompile of the header and, in turn, every other file that includes it. However, templated class methods need to be defined in the header without any further tomfoolery.

The getNextValue() function works by counting how many samples have passed in the current phase of the envelope and converting that to seconds (this is why we also need to know the sample rate of the system). From that elapsed time we can calculate the amplitude the envelope should be at. The decay and release phases are scaled to account for the sustain level, and the release phase also considers the maximum level reached during the attack or decay phase, in case the note off arrived before either of those phases had finished.

Audio Buffers

Before we put the whole synth voice together, we should come up with an easier way to deal with our audio buffers than the raw pointer to interleaved samples given to us by RtAudio so far. We'll make an audio buffer class to own a block of samples and give simplified access to specific channels and samples. I made an "Audio" subfolder in the source directory, and created AudioBuffer.hpp and AudioBuffer.cpp. The header is as follows:

#ifndef AUDIO_BUFFER_HPP
#define AUDIO_BUFFER_HPP

#include <cstdlib>

class AudioBuffer
{
public:
    // Constructor
    AudioBuffer(size_t numChannels, size_t bufferSize);

    // Copy + Move constructors
    AudioBuffer(const AudioBuffer& other);
    AudioBuffer(AudioBuffer&& other) noexcept;

    // Copy + Move assign operators
    AudioBuffer& operator=(const AudioBuffer& other);
    AudioBuffer& operator=(AudioBuffer&& other) noexcept;

    // Destructor
    ~AudioBuffer();

    void setSample(size_t channel, size_t sample, float value) noexcept;
    void addSample(size_t channel, size_t sample, float value) noexcept;
    [[nodiscard]] float getSample(size_t channel, size_t sample) const noexcept;

    [[nodiscard]] size_t numChannels() const noexcept;
    [[nodiscard]] size_t bufferSize() const noexcept;

private:
    size_t m_numChannels;
    size_t m_bufferSize;
    float* m_data;
};

#endif

And the C++ file:

#include "AudioBuffer.hpp"

#include <cstring>

AudioBuffer::AudioBuffer(size_t numChannels, size_t bufferSize)
    : m_numChannels{numChannels}
    , m_bufferSize{bufferSize}
    , m_data{new float[m_numChannels * m_bufferSize]}
{
    std::memset(m_data, 0, sizeof(float) * m_numChannels * m_bufferSize);
}

AudioBuffer::AudioBuffer(const AudioBuffer& other)
    : m_numChannels{other.numChannels()}
    , m_bufferSize{other.bufferSize()}
    , m_data{new float[m_numChannels * m_bufferSize]}
{
    std::memcpy(m_data, other.m_data, sizeof(float) * m_numChannels * m_bufferSize);
}

AudioBuffer::AudioBuffer(AudioBuffer&& other) noexcept
    : m_numChannels{other.numChannels()}
    , m_bufferSize{other.bufferSize()}
    , m_data{other.m_data}
{
    other.m_numChannels = 0;
    other.m_bufferSize = 0;
    other.m_data = nullptr;
}

AudioBuffer& AudioBuffer::operator=(const AudioBuffer& other)
{
    if (&other != this) {
        delete[] m_data;
        m_numChannels = other.numChannels();
        m_bufferSize = other.bufferSize();
        m_data = new float[m_numChannels * m_bufferSize];
        std::memcpy(m_data, other.m_data, sizeof(float) * m_numChannels * m_bufferSize);
    }

    return *this;
}

AudioBuffer& AudioBuffer::operator=(AudioBuffer&& other) noexcept
{
    if (&other != this) {
        delete[] m_data;
        m_numChannels = other.numChannels();
        m_bufferSize = other.bufferSize();
        m_data = other.m_data;
        other.m_numChannels = 0;
        other.m_bufferSize = 0;
        other.m_data = nullptr;
    }

    return *this;
}

AudioBuffer::~AudioBuffer()
{
    delete[] m_data;
}

void AudioBuffer::setSample(size_t channel, size_t sample, float value) noexcept
{
    m_data[channel * m_bufferSize + sample] = value;
}

void AudioBuffer::addSample(size_t channel, size_t sample, float value) noexcept
{
    m_data[channel * m_bufferSize + sample] += value;
}

float AudioBuffer::getSample(size_t channel, size_t sample) const noexcept
{
    return m_data[channel * m_bufferSize + sample];
}

size_t AudioBuffer::numChannels() const noexcept
{
    return m_numChannels;
}

size_t AudioBuffer::bufferSize() const noexcept
{
    return m_bufferSize;
}

This class also holds the samples for each channel contiguously in one array, but it does so privately and provides simple-to-understand public functions to access or modify the samples. None of the other classes we've written so far have needed user-defined constructors or destructors. This one does, because the array must be deallocated in the destructor to avoid a memory leak, so it follows the rule of five. Copying an audio buffer allocates a new array of its own with the sample values copied over, whereas moving one just transfers ownership of the pointer.

Now that we have a nice buffer class, I'm also going to create an interface that can be implemented by any class that needs to deal with buffers in real-time. Also in the "Audio" subfolder, I made AudioProcessor.hpp:

#ifndef AUDIO_PROCESSOR_HPP
#define AUDIO_PROCESSOR_HPP

#include "AudioBuffer.hpp"

#include <cstdint>

class AudioProcessor
{
public:
    virtual ~AudioProcessor() = default;

    virtual void prepare(uint32_t sampleRate) = 0;
    virtual void process(AudioBuffer& bufferToFill) = 0;
};

#endif

We can then inherit from this abstract class and provide implementations for these functions - the prepare() function we've seen before to do any setup before audio processing starts, and a process() function that takes a reference to a buffer to be filled with samples. By making these functions pure virtual, the interface works as a sort of "contract" that every implementing class will provide the same API. The advantage is that we can hold pointers to objects of many different derived types in the same collection as an AudioProcessor*.

Putting the Whole Synth Voice Together

Finally, with these classes done, we can put together the synth voice that encapsulates all these components. I made SynthVoice.hpp and SynthVoice.cpp in the "Synth" subfolder, with the header like this:

#ifndef SYNTH_VOICE_HPP
#define SYNTH_VOICE_HPP

#include "Adsr.hpp"
#include "../Audio/AudioProcessor.hpp"
#include "Oscillator.hpp"

#include <array>

class SynthVoice : public AudioProcessor
{
public:
    ~SynthVoice() override = default;

    void prepare(uint32_t sampleRate) override;
    void process(AudioBuffer& bufferToFill) override;

    void noteOn(uint8_t midiNote, uint8_t velocity) noexcept;
    void noteOff() noexcept;

    template<size_t OscNumber> requires (OscNumber < 2)
    void setPulseWidth(float pulseWidth) noexcept
    {
        m_oscillators[OscNumber].setPulseWidth(pulseWidth);
    }

    template<size_t OscNumber> requires (OscNumber < 2)
    void setShape(Oscillator::Shape shape) noexcept
    {
        m_oscillators[OscNumber].setShape(shape);
    }

    template<size_t OscNumber> requires (OscNumber < 2)
    void setVolume(float volume) noexcept
    {
        m_oscillatorVolumes[OscNumber] = volume;
    }

    template<Adsr::Phase ParamType> requires (ParamType != Adsr::Phase::Idle)
    void setAdsrParam(float value) noexcept
    {
        m_adsr.setParam<ParamType>(value);
    }

    [[nodiscard]] uint8_t getCurrentNote() const noexcept;
    [[nodiscard]] bool getIsNotePlaying() const noexcept;

private:
    uint8_t m_currentNote{0};
    float m_currentVelocity{0.0f};

    Adsr m_adsr;
    std::array<Oscillator, 2> m_oscillators;
    std::array<float, 2> m_oscillatorVolumes{0.5f, 0.5f};
};

#endif

And the C++ file containing this:

#include "SynthVoice.hpp"

#include <cmath>

static float mtof(uint8_t midiNote)
{
    return std::pow(2.0f, (static_cast<float>(midiNote) - 69.0f) / 12.0f) * 440.0f;
}

void SynthVoice::prepare(uint32_t sampleRate)
{
    m_adsr.prepare(sampleRate);
    for (auto& osc : m_oscillators) {
        osc.prepare(sampleRate);
    }
}

void SynthVoice::process(AudioBuffer& bufferToFill)
{
    for (size_t sample = 0; sample < bufferToFill.bufferSize(); sample++) {
        auto adsrLevel = m_adsr.getNextValue();
        auto oscOutput = 0.0f;
        for (size_t i = 0; i < 2; i++) {
            oscOutput += m_oscillators[i].getNextSample() * adsrLevel * m_currentVelocity * m_oscillatorVolumes[i];
        }

        for (size_t channel = 0; channel < bufferToFill.numChannels(); channel++) {
            bufferToFill.addSample(channel, sample, oscOutput);
        }
    }
}

void SynthVoice::noteOn(uint8_t midiNote, uint8_t velocity) noexcept
{
    m_currentNote = midiNote;
    m_currentVelocity = static_cast<float>(velocity) / 127.0f;

    for (auto& osc : m_oscillators) {
        osc.setFrequency(mtof(midiNote));
    }
    m_adsr.noteOn();
}

void SynthVoice::noteOff() noexcept
{
    m_adsr.noteOff();
}

uint8_t SynthVoice::getCurrentNote() const noexcept
{
    return m_currentNote;
}

bool SynthVoice::getIsNotePlaying() const noexcept
{
    return m_adsr.getCurrentPhase() != Adsr::Phase::Idle;
}

The voice holds our two oscillators and an ADSR. The oscillators live in an std::array to keep them contiguous within the class, alongside an array of floats holding the volume for each corresponding oscillator. The voice exposes the required setter functions for the oscillators and the envelope, also templated to cut down on boilerplate and constrained to make sure we don't do anything that's going to crash and burn.

This class implements our AudioProcessor interface. The prepare() function just calls the corresponding functions on all of its components, while the process() function runs the envelope and the oscillators, accumulates their output, and adds the result to every channel of the buffer. We also provide calls for MIDI note ons and offs, which keep track of the note number and velocity and pass the event on to the oscillators and envelope.

The Synth

Ok, time to put the actual synth together. In the last article, I used this diagram:

Full structure of a synthesizer

For now, this is quite simple to represent as a class: it holds eight synth voices, exposes their public functions, and passes on MIDI events. The new files Synth.hpp and Synth.cpp in the "Synth" subfolder are as follows:

#ifndef SYNTH_HPP
#define SYNTH_HPP

#include "../Audio/AudioProcessor.hpp"
#include "SynthVoice.hpp"

#include <array>
#include <queue>

class Synth : public AudioProcessor
{
public:
    ~Synth() override = default;

    void prepare(uint32_t sampleRate) override;
    void process(AudioBuffer& bufferToFill) override;

    void noteOn(uint8_t midiNote, uint8_t velocity) noexcept;
    void noteOff(uint8_t midiNote) noexcept;

    template<size_t OscNumber> requires (OscNumber < 2)
    void setPulseWidth(float pulseWidth) noexcept
    {
        for (auto& voice : m_voices) {
            voice.setPulseWidth<OscNumber>(pulseWidth);
        }
    }

    template<size_t OscNumber> requires (OscNumber < 2)
    void setShape(Oscillator::Shape shape) noexcept
    {
        for (auto& voice : m_voices) {
            voice.setShape<OscNumber>(shape);
        }
    }

    template<size_t OscNumber> requires (OscNumber < 2)
    void setVolume(float volume) noexcept
    {
        for (auto& voice : m_voices) {
            voice.setVolume<OscNumber>(volume);
        }
    }

    template<Adsr::Phase ParamType> requires (ParamType != Adsr::Phase::Idle)
    void setAdsrParam(float value) noexcept
    {
        for (auto& voice : m_voices) {
            voice.setAdsrParam<ParamType>(value);
        }
    }

private:
    std::queue<size_t> m_usedVoices;
    size_t m_lastVoiceIndex{0};
    std::array<SynthVoice, 8> m_voices;
};

#endif

And the corresponding Synth.cpp:

#include "Synth.hpp"

void Synth::prepare(uint32_t sampleRate)
{
    for (auto& voice : m_voices) {
        voice.prepare(sampleRate);
    }
}

void Synth::process(AudioBuffer& bufferToFill)
{
    for (auto& voice : m_voices) {
        voice.process(bufferToFill);
    }
}

void Synth::noteOn(uint8_t midiNote, uint8_t velocity) noexcept
{
    for (size_t i = 0; i < m_voices.size(); i++) {
        if (!m_voices[i].getIsNotePlaying()) {
            m_voices[i].noteOn(midiNote, velocity);
            m_usedVoices.push(i);
            return;
        }
    }

    auto oldestVoice = m_usedVoices.front();
    m_usedVoices.pop();
    m_voices[oldestVoice].noteOn(midiNote, velocity);
    // re-queue the stolen voice so it's tracked as the most recently used
    m_usedVoices.push(oldestVoice);
}

void Synth::noteOff(uint8_t midiNote) noexcept
{
    for (auto& voice : m_voices) {
        if (voice.getCurrentNote() == midiNote) {
            voice.noteOff();
        }
    }
}

In the noteOn() function, we find the first free voice to use, and if all of them are still playing, we just take over the oldest one that was used - we keep track of which voices were used in which order with a queue. In noteOff(), we find whichever voices had been playing that note and tell them to stop - remember that calling noteOff() on an ADSR that's already releasing or idle won't do anything, so it's ok to do this on many different voices if you just spam the same MIDI note. It's also an implementation of AudioProcessor, and all it has to do is call prepare() and process() on each of its voices.

MIDI Input

That's every new class we have to write today, and I promise we're nearly ready to play some notes; we just need a way to get MIDI input. I opted to use the RtMidi library, from the same developers as RtAudio.

To build RtMidi, I discovered I needed a newer version of CMake than the one available through apt on Ubuntu 20.04. You could compile CMake from source, but I got impatient and decided to cheat. I downloaded the latest binaries and just copied them to the relevant folders with the following commands:

$ cd /tmp
$ wget https://github.com/Kitware/CMake/releases/download/v3.27.1/cmake-3.27.1-linux-aarch64.tar.gz
$ tar -zxvf cmake-3.27.1-linux-aarch64.tar.gz
$ sudo cp cmake-3.27.1-linux-aarch64/bin/cmake /usr/bin/
$ sudo cp -r cmake-3.27.1-linux-aarch64/share/cmake-3.27/ /usr/share/

Then, I could install RtMidi from source:

$ git clone https://github.com/thestk/rtmidi.git
$ cd rtmidi
$ mkdir build
$ cmake -B build -S . -DCMAKE_BUILD_TYPE=Release -DRTMIDI_API_ALSA=ON
$ cmake --build build
$ sudo cmake --install build

Back in our project, we can now link to RtMidi the same way as we do with RtAudio in CMakeLists.txt:

...
find_package(RtAudio REQUIRED)
find_package(RtMidi REQUIRED)
...
target_link_libraries(rocksynth PRIVATE fmt::fmt RtAudio::rtaudio RtMidi::rtmidi)

While we're here, we also need to add all the new C++ files we made to the executable:

add_executable(
    rocksynth
    src/Audio/AudioBuffer.cpp
    src/main.cpp
    src/Synth/Adsr.cpp
    src/Synth/Oscillator.cpp
    src/Synth/Synth.cpp
    src/Synth/SynthVoice.cpp
)

If that compiles, we can get MIDI input running.

Back in main.cpp, I got rid of the static Oscillator instance we had made in the previous article and all the calls to it. Instead, we now have a Synth and RtMidiIn instance:

...
#include <rtmidi/RtMidi.h>
...

static Synth s_synth;
static RtMidiIn s_midiIn;

...

RtMidi will find any available MIDI devices, and we open them by index - in my case, index 0 was the system's built-in MIDI Through port, and the USB MIDI keyboard I wanted to use was at index 1. I set up a way to display these and choose which one to use via a command-line argument:

int main([[maybe_unused]] int argc, [[maybe_unused]] const char* argv[])
{
    uint32_t midiDeviceNum = 0;
    if (argc > 1) {
        try {
            midiDeviceNum = std::stoi(argv[1]);
        } catch (const std::exception& e) {
            fmt::print(stderr, "Expected integer for MIDI device number: {}\n", e.what());
        }
    }

    auto numMidiPorts = s_midiIn.getPortCount();
    if (numMidiPorts == 0) {
        fmt::print("No MIDI input devices found.\n\n");
    } else {
        for (uint32_t i = 0; i < numMidiPorts; i++) {
            fmt::print("Found MIDI device #{}: {}\n", i, s_midiIn.getPortName(i));
        }
        s_midiIn.openPort(midiDeviceNum);
        fmt::print("Using MIDI device: {}\n", s_midiIn.getPortName(midiDeviceNum));
    }

...

So when I run the program, I run it with:

$ ./build/rocksynth 1

We can handle MIDI messages on the audio thread, avoiding data races, by using RtMidi's queued messages (a full RtMidi tutorial is available here). At the top of the audioCallback() function, I added this:

int audioCallback(
        void* outBuffer,
        [[maybe_unused]] void* inBuffer,
        uint32_t bufferFrames,
        [[maybe_unused]] double streamTime,
        RtAudioStreamStatus status,
        [[maybe_unused]] void* userData
)
{
    std::vector<uint8_t> midiMessage;
    while (true) {
        [[maybe_unused]] double timeStamp = s_midiIn.getMessage(&midiMessage);
        if (midiMessage.empty()) {
            break; // no messages left in the queue
        }
        // handle the message here (we'll fill this in next)
    }
...

If there is a message waiting, RtMidi will fill our vector with the bytes of the message. For a more extensive overview of the MIDI message format, check out this resource, but in essence:

  1. The first byte is the status byte, which tells us the type of MIDI message. The top 4 bits will contain the unique identifier of the type of message, with the bottom 4 bits being the MIDI channel number. For example, a note on message will have the bit values 1001nnnn and a note off message will have the bits 1000nnnn, where nnnn is the channel number. 
  2. The second byte is the first data byte, which may contain data specific to the type of message. In the case of a note on or note off message, this will be the MIDI note number.
  3. The third byte is the second data byte, which also may contain data specific to the type of message. In the case of a note on or off message, this will be the velocity of the note.

Armed with this knowledge, we can start handling note on and off messages:

static constexpr uint8_t s_noteOnStatusByte        = 0b10010000;
static constexpr uint8_t s_noteOffStatusByte       = 0b10000000;

int audioCallback(
        void* outBuffer,
        [[maybe_unused]] void* inBuffer,
        uint32_t bufferFrames,
        [[maybe_unused]] double streamTime,
        RtAudioStreamStatus status,
        [[maybe_unused]] void* userData
)
{
    std::vector<uint8_t> midiMessage;
    while (true) {
        [[maybe_unused]] double timeStamp = s_midiIn.getMessage(&midiMessage);
        if (!midiMessage.empty()) {
#ifdef ROCKSYNTH_DEBUG
            fmt::print("MIDI message with timestamp {}\n", timeStamp);
            for (size_t i = 0; i < midiMessage.size(); i++) {
                fmt::print("Byte {} = {:#b}\n", i, (int)midiMessage[i]);
            }
#endif

            auto statusByte = midiMessage[0] & 0b11110000;
            switch (statusByte) {
                case s_noteOnStatusByte: {
                    uint8_t note = midiMessage[1];
                    uint8_t velocity = midiMessage[2];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Note on: note number = {}, velocity = {}\n\n", note, velocity);
#endif
                    s_synth.noteOn(note, velocity);
                    break;
                }
                case s_noteOffStatusByte: {
                    uint8_t note = midiMessage[1];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Note off: note number = {}\n\n", note);
#endif
                    s_synth.noteOff(note);
                    break;
                }
            }
        } else {
            break;
        }
    }
...

We create a vector of bytes to be filled, and call getMessage() on the RtMidiIn object until it no longer gives any queued messages. If we have a message, we can check those top bits from the status byte to see if it's a note on or off, then get the relevant values from the data bytes, and trigger the note events in our synth.

We also need to update the main process loop in this audio callback to get the audio from the synth. The whole function should now look like this:

int audioCallback(
        void* outBuffer,
        [[maybe_unused]] void* inBuffer,
        uint32_t bufferFrames,
        [[maybe_unused]] double streamTime,
        RtAudioStreamStatus status,
        [[maybe_unused]] void* userData
)
{
    std::vector<uint8_t> midiMessage;
    while (true) {
        [[maybe_unused]] double timeStamp = s_midiIn.getMessage(&midiMessage);
        if (!midiMessage.empty()) {
#ifdef ROCKSYNTH_DEBUG
            fmt::print("MIDI message with timestamp {}\n", timeStamp);
            for (size_t i = 0; i < midiMessage.size(); i++) {
                fmt::print("Byte {} = {:#b}\n", i, (int)midiMessage[i]);
            }
#endif

            auto statusByte = midiMessage[0] & 0b11110000;
            switch (statusByte) {
                case s_noteOnStatusByte: {
                    uint8_t note = midiMessage[1];
                    uint8_t velocity = midiMessage[2];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Note on: note number = {}, velocity = {}\n\n", note, velocity);
#endif
                    s_synth.noteOn(note, velocity);
                    break;
                }
                case s_noteOffStatusByte: {
                    uint8_t note = midiMessage[1];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Note off: note number = {}\n\n", note);
#endif
                    s_synth.noteOff(note);
                    break;
                }
            }
        } else {
            break;
        }
    }

    float* buffer = (float*)outBuffer;

    if (status) {
        fmt::print(stderr, "Stream underflow\n");
        return 0;
    }

    AudioBuffer bufferToFill{2, static_cast<size_t>(bufferFrames)};
    s_synth.process(bufferToFill);

    for (size_t sample = 0; sample < bufferFrames; sample++) {
        for (size_t channel = 0; channel < 2; channel++) {
            *buffer++ = bufferToFill.getSample(channel, sample);
        }
    }

    return 0;
}

We create one of our audio buffers to be passed on to the synth for it to fill, and then copy over the samples into the array given to us by RtAudio.

Now, back in main(), we can prepare the synth with some parameters before opening the stream on our RtAudio object, keep the stream running until we close the program, and recompile to give it a try!

...
    s_synth.prepare(sr);

    s_synth.setShape<0>(Oscillator::Shape::Saw);
    s_synth.setShape<1>(Oscillator::Shape::Pulse);

    s_synth.setPulseWidth<1>(0.4f);

    s_synth.setAdsrParam<Adsr::Phase::Attack>(0.1f);
    s_synth.setAdsrParam<Adsr::Phase::Decay>(1.0f);
    s_synth.setAdsrParam<Adsr::Phase::Sustain>(0.3f);
    s_synth.setAdsrParam<Adsr::Phase::Release>(1.0f);

    if (dac.openStream(&streamParameters, nullptr, RTAUDIO_FLOAT32, sr, &bufferSize, &audioCallback)) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        std::exit(1);
    }

    if (dac.startStream()) {
        fmt::print(stderr, "{}\n", dac.getErrorText());
        cleanup();
        std::exit(1);
    }

    std::cin.get(); // wait for key input to stop

    if (dac.isStreamRunning()) {
        dac.stopStream();
    }

    cleanup();
}

I'm not a pianist.

Ok, brilliant! We have a playable synth. The last thing we'll do in this part is take advantage of MIDI control changes to control the parameters of the synth, so we don't have to manually update the fields and recompile for a different sound. The status byte value for a control change is 1011nnnn (with nnnn being the MIDI channel number), the first data byte is the controller number, and the second data byte is the value. I wasn't sure which controller number was assigned to each of the knobs on my MIDI keyboard, so I added the control change status byte value as a constant and this case to the switch statement checking the status byte:

static constexpr uint8_t s_noteOnStatusByte        = 0b10010000;
static constexpr uint8_t s_noteOffStatusByte       = 0b10000000;
static constexpr uint8_t s_controlChangeStatusByte = 0b10110000;

int audioCallback(
...
            switch (statusByte) {
                case s_controlChangeStatusByte: {
                    uint8_t controllerNumber = midiMessage[1];
                    uint8_t value = midiMessage[2];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Control Change: controller number = {}, value = {}\n\n", controllerNumber, value);
#endif
                    break;
                }
...

I built in debug mode so I could see the output:

$ cmake -B build -S . -DCMAKE_BUILD_TYPE=Debug
$ cmake --build build

After twisting all the knobs, I found that in my case the eight controllers were numbered 14 - 21, and I could also change page to get higher numbers. Yours could very well be different. With this, I assigned them to some of my synth's parameters:

...
                case s_controlChangeStatusByte: {
                    uint8_t controllerNumber = midiMessage[1];
                    uint8_t value = midiMessage[2];
#ifdef ROCKSYNTH_DEBUG
                    fmt::print("Control Change: controller number = {}, value = {}\n\n", controllerNumber, value);
#endif
                    switch (controllerNumber) {
                        case 14:
                            // max of 5 seconds on timed ADSR params
                            s_synth.setAdsrParam<Adsr::Phase::Attack>(static_cast<float>(value) / 127.0f * 5.0f);
                            break;
                        case 15:
                            s_synth.setAdsrParam<Adsr::Phase::Decay>(static_cast<float>(value) / 127.0f * 5.0f);
                            break;
                        case 16:
                            // scale sustain to be 0 - 1
                            s_synth.setAdsrParam<Adsr::Phase::Sustain>(static_cast<float>(value) / 127.0f);
                            break;
                        case 17:
                            s_synth.setAdsrParam<Adsr::Phase::Release>(static_cast<float>(value) / 127.0f * 5.0f);
                            break;
                        case 18:
                            // scale this to be 0 - 3
                            s_synth.setShape<0>(static_cast<Oscillator::Shape>(value / 42));
                            break;
                        case 19:
                            s_synth.setShape<1>(static_cast<Oscillator::Shape>(value / 42));
                            break;
                        case 20:
                            // scale pulsewidth to be 0 - 1
                            s_synth.setPulseWidth<0>(static_cast<float>(value) / 127.0f);
                            break;
                        case 21:
                            s_synth.setPulseWidth<1>(static_cast<float>(value) / 127.0f);
                            break;
                        case 22:
                            s_synth.setVolume<0>(static_cast<float>(value) / 127.0f);
                            break;
                        case 23:
                            s_synth.setVolume<1>(static_cast<float>(value) / 127.0f);
                            break;
                    }
                    break;
                }
...

And now I could fully control the sound while playing. At this point, I discovered a bug in my ADSR: if you change the sustain level while a note is sustaining, the level of the held note changes, but when the release phase starts there's a jump in volume, since the release was still scaling from the previous sustain level. A simple fix in Adsr::setParam():

...
    template<Phase ParamType> requires (ParamType != Phase::Idle)
    void setParam(float value) noexcept
    {
        if constexpr (ParamType == Phase::Attack) {
            m_attackTime = std::fmax(value, 0.001f);
        } else if constexpr (ParamType == Phase::Decay) {
            m_decayTime = std::fmax(value, 0.001f);
        } else if constexpr (ParamType == Phase::Sustain) {
            if (m_state == Phase::Sustain) {
                // update the held max level so the release phase starts at the right volume
                m_maxLevel = value;
            }
            m_sustainLevel = value;
        } else if constexpr (ParamType == Phase::Release) {
            m_releaseTime = std::fmax(value, 0.001f);
        }
    }
...

Conclusion

This article was a lot longer and more code-heavy than I anticipated, but it's been extremely fun, and I've been pleasantly surprised at how well all of this runs on a ROCK. In the next and final part, we'll push the limits a little further by slotting that VCF into the synth voice, and then we'll have ourselves a REAL synth. As always, leave a comment if you had any difficulty following along, and thanks for reading! Remember that the GitHub repository I'm using is available here.

Articles in this series:

Part 1 - Getting RtAudio running on the Okdo ROCK 4 C+, plus the theory and structure of the synth.
Part 2 - Fleshing out the synth to make it playable and controllable (this article).
Part 3 - Making the basic synth musical with the addition of a filter and chorus effect.

I'm a software engineer and third-level teacher specialising in audio and front end. In my spare time, I work on projects related to microcontrollers, electronics, and programming language design.
