Smart pointers in C++
"Smart pointers" sounds like a cool name, but what makes these pointers smart? Let's try to understand why we need our pointers to be smart by looking at the following code:
juce::AudioProcessorValueTreeState::ParameterLayout SimpleEQAudioProcessor::createParameterLayout()
{
juce::AudioProcessorValueTreeState::ParameterLayout layout;
layout.add(std::make_unique<juce::AudioParameterFloat>("LowCut Freq", "LowCut Freq", juce::NormalisableRange<float>(20.f, 20000.f, 1.f, 1.f), 20.f));
layout.add(std::make_unique<juce::AudioParameterFloat>("HighCut Freq", "HighCut Freq", juce::NormalisableRange<float>(20.f, 20000.f, 1.f, 1.f), 20000.f));
layout.add(std::make_unique<juce::AudioParameterFloat>("Peak Freq", "Peak Freq", juce::NormalisableRange<float>(20.f, 20000.f, 1.f, 1.f), 750.f));
layout.add(std::make_unique<juce::AudioParameterFloat>("Peak Gain", "Peak Gain", juce::NormalisableRange<float>(-24.f, 24.f, 1.f, 1.f), 0.f));
layout.add(std::make_unique<juce::AudioParameterFloat>("Peak Quality", "Peak Quality", juce::NormalisableRange<float>(0.1f, 10.f, 0.05f, 1.f), 1.f));
juce::StringArray filterResponseChoices;
for (int i = 0; i < 4; ++i) {
juce::String choice;
choice << (12 + i * 12);
choice << " db/Oct";
filterResponseChoices.add(choice);
}
layout.add(std::make_unique<juce::AudioParameterChoice>("LowCut Slope", "LowCut Slope", filterResponseChoices, 0));
layout.add(std::make_unique<juce::AudioParameterChoice>("HighCut Slope", "HighCut Slope", filterResponseChoices, 0));
return layout;
}
This is a small snippet of code written with JUCE, a popular C++ framework for audio processing. In this example I am building a simple EQ app with 7 parameters that control how the audio is equalised. These parameters back the GUI controls that let the user set the amount of EQ applied, and they are then mapped to the DSP parameters that do the processing. Once I create the parameters I pass them to JUCE's AudioProcessorValueTreeState::ParameterLayout, which collects them for the plugin. Passing the parameters to the ParameterLayout class is where pointers come into the picture. There are 3 ways to pass data in C++:
- Pass by Value: In this method the variable's value is copied and the copy is passed to the function. If you make any changes to the copy, the original is not affected. This is a simple way to pass data around, but it is not always efficient, since every copy costs time and memory.
- Pass by Reference: In this method you pass a reference to the original object. A reference is like an alias for the original, so any change made through the reference also modifies the original value.
- Pass by Pointer: A pointer is a variable that holds the memory address of another object. In this method we pass around the address of the value and, just like pass by reference, any modification done through the pointer updates the original value.
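Here is a minimal sketch of the three styles side by side. The function names are hypothetical and exist only for this illustration; nothing here comes from JUCE.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this illustration.
// By value: the function works on its own private copy.
float setByValue(float gain)      { gain = 2.0f; return gain; }
// By reference: the function works on the caller's variable through an alias.
void setByReference(float& gain)  { gain = 2.0f; }
// By pointer: the function receives an address and writes through it.
void setByPointer(float* gain)    { if (gain != nullptr) *gain = 3.0f; }

// Returns true when each passing style behaves as described in the list above.
bool passingStylesBehaveAsDescribed()
{
    float gain = 1.0f;

    setByValue(gain);
    const bool valueLeftOriginalAlone = (gain == 1.0f);

    setByReference(gain);
    const bool referenceChangedOriginal = (gain == 2.0f);

    setByPointer(&gain);
    const bool pointerChangedOriginal = (gain == 3.0f);

    return valueLeftOriginalAlone && referenceChangedOriginal && pointerChangedOriginal;
}
```

Calling passingStylesBehaveAsDescribed() from main should return true: only the reference and pointer versions touch the caller's variable.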
Now the first question is: why use pointers to pass the parameters I am creating to the ParameterLayout class, and not regular variables passed by value? One reason is cost. Every pass by value makes a copy, and a parameter object carries more than a single float: it holds its ID, name, range and default value. Audio processing is a real-time workload that moves large amounts of data, so needless copies are a habit worth avoiding even in a small EQ app. In a language like C++, which gives you direct control over memory, copying objects around when you do not have to is not a smart thing to do.
Another issue with this approach is object slicing. Slicing happens when you assign an object of a derived class to an object of its base class, thereby losing part of the information: only the base class part gets copied and the derived class parts are sliced off. If JUCE's AudioProcessorValueTreeState::ParameterLayout were to accept parameters by value through their base class, it would lose the information stored in derived classes such as AudioParameterFloat and AudioParameterChoice.
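A small sketch of slicing, assuming two hypothetical classes (Parameter and FloatParameter) that stand in for a base parameter type and a derived one. They are not the JUCE classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a base parameter type and a derived one.
struct Parameter
{
    virtual ~Parameter() = default;
    virtual std::string kind() const { return "base"; }
};

struct FloatParameter : Parameter
{
    float defaultValue = 750.0f;  // lives only in the derived part
    std::string kind() const override { return "float"; }
};

// Taking the base class by value copies only the Parameter part: the object is sliced.
std::string kindByValue(Parameter p) { return p.kind(); }

// Taking a reference makes no copy, so the object keeps its derived type.
std::string kindByReference(const Parameter& p) { return p.kind(); }
```

Given a FloatParameter f, kindByValue(f) reports "base" while kindByReference(f) reports "float". The derived override and defaultValue are simply not part of the sliced copy.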
Ok, so we cannot pass the data by value, then why not references? Because ParameterLayout would lose access to the data: the parameters need to live longer than the function that creates them. Let's assume AudioProcessorValueTreeState::ParameterLayout had a method that expects a reference, so that we could pass a local variable from our function SimpleEQAudioProcessor::createParameterLayout():
void addByReference(RangedAudioParameter& parameter); // Hypothetical
juce::AudioProcessorValueTreeState::ParameterLayout SimpleEQAudioProcessor::createParameterLayout()
{
juce::AudioProcessorValueTreeState::ParameterLayout layout;
juce::AudioParameterFloat lowCutParam(...); // Local variable
layout.addByReference(lowCutParam); // Hypothetical
return layout; // lowCutParam goes out of scope and is destroyed
}
- Function starts ==> local parameters created on stack
- References stored ==> layout holds references to stack objects
- Function returns ==> stack objects destroyed!
- layout survives ==> references now point to garbage memory
- Later usage ==> crash or data corruption
As soon as the function returns, the local variable is destroyed and its stack memory is released. The reference stored in the layout is now a dangling reference; using it is undefined behaviour and can crash the application or corrupt data. So, using pass by reference in such scenarios would be a complete disaster.
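To see the lifetime problem without actually touching a dangling reference, here is a sketch that only logs when a local object is constructed and destroyed. TrackedParameter and eventLog are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records events so the order can be inspected after the function has returned.
std::vector<std::string>& eventLog()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a parameter object.
struct TrackedParameter
{
    TrackedParameter()  { eventLog().push_back("constructed"); }
    ~TrackedParameter() { eventLog().push_back("destroyed"); }
};

void createLocalParameter()
{
    TrackedParameter lowCutParam;                 // lives on the stack
    eventLog().push_back("function returning");
}                                                 // lowCutParam is destroyed here
```

After createLocalParameter() returns, the log reads "constructed", "function returning", "destroyed". Any reference to lowCutParam kept beyond that point would refer to an object that no longer exists.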
No pass by value and no pass by reference! This is where pointers come to the rescue. What we need in such situations is for our variables to live in a space where they are not destroyed once the function that creates them returns. This space is called heap memory. In C terms, the heap is the portion of memory where dynamically allocated memory resides. The good thing about heap memory is that an allocation stays valid until it is explicitly freed or the program terminates. So AudioProcessorValueTreeState::ParameterLayout can still access the variables after SimpleEQAudioProcessor::createParameterLayout() has returned, as long as they were allocated on the heap.
To allocate space in heap memory we use the new keyword in C++.
juce::AudioProcessorValueTreeState::ParameterLayout SimpleEQAudioProcessor::createParameterLayout()
{
juce::AudioProcessorValueTreeState::ParameterLayout layout;
// create raw pointer (hypothetical: assumes add() accepted raw pointers)
layout.add(new juce::AudioParameterFloat("LowCut Freq", ...));
return layout;
}
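A minimal sketch of the same idea outside JUCE: an object allocated with new outlives the function that created it, and someone has to delete it later. Both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical factory: the float lives on the heap, so it outlives this function.
float* makeCutoff(float hz)
{
    return new float(hz);   // the caller now has to remember to delete this
}

// Reads the value and then releases the heap memory by hand.
float readAndRelease(float* cutoff)
{
    const float value = *cutoff;  // still valid: the heap object survived makeCutoff()
    delete cutoff;                // manual clean-up; forgetting this line leaks the float
    return value;
}
```

readAndRelease(makeCutoff(20.0f)) returns 20.0f. The value was still there after makeCutoff() returned, but only because nobody had deleted it yet.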
Wooohoo! We have solved all the issues for passing data around in C++ and can use pointers to pass them around efficiently!
Yes, pointers solve the efficiency problem but now we have memory
management nightmares! Let me explain what I mean.
Remember, memory allocated on the heap stays there unless someone frees it or the program terminates. If a piece of code does not free heap memory it no longer needs, it causes a memory leak. Memory is a finite resource, so it needs to be used carefully. Now imagine your app with poorly handled memory is running on someone's computer. While it runs, it keeps asking the operating system for heap space and never gives it back. Its footprint in RAM keeps growing, which can slow the application down or eventually crash it. Long-running programs, e.g. web servers and embedded systems, can exhaust system memory this way. For audio applications, careless allocation can also cause memory fragmentation and lead to audio dropouts. When you repeatedly allocate memory of different sizes, you end up creating a swiss cheese pattern in memory, i.e. lots of small free blocks scattered around and no large contiguous blocks available. Imagine this Simple EQ app running in a DAW for hours in a studio. If your EQ plugin leaks memory the whole time, it might eventually crash when trying to allocate memory for basic operations like creating new parameter objects or updating the GUI, forcing the user to restart the DAW session.
When you use the new keyword to create a raw pointer, you have to manually deallocate that memory with delete, or you risk memory leaks in your application. You might think that shouldn't be an issue: if I create a raw pointer, I will make sure I deallocate the memory too. Unfortunately, the systems we work with, especially in C++, are not so simple. Let's try to understand that with an example.
juce::AudioProcessorValueTreeState::ParameterLayout SimpleEQAudioProcessor::createParameterLayout()
{
juce::AudioProcessorValueTreeState::ParameterLayout layout;
// raw pointers
layout.add(new juce::AudioParameterFloat("LowCut Freq", "LowCut Freq", ...));
layout.add(new juce::AudioParameterFloat("HighCut Freq", "HighCut Freq", ...));
layout.add(new juce::AudioParameterFloat("Peak Freq", "Peak Freq", ...));
// ... rest of the parameters
return layout;
}
Let's assume JUCE's ParameterLayout expected raw pointers, and I create the audio parameters with new and pass them to JUCE. Who owns the raw pointer then? Does JUCE take ownership of the pointer, or does my code? When should the parameter be deleted? If you're working in a big team managing a complex system and you write a module using raw pointers, do you own the pointer, or does the consumer of the module? Such ambiguities can lead to bugs and memory leaks. What if you delete the pointer while some other part of the program still needs it? What if both you and the consumer of the module delete the same pointer? Exceptions add one more way for things to go wrong:
ParameterLayout SimpleEQAudioProcessor::createParameterLayout()
{
ParameterLayout layout;
layout.add(new AudioParameterFloat(...)); // memory allocated
layout.add(new AudioParameterFloat(...)); // memory allocated
someOperation(); // throws exception
return layout;
}
If the method someOperation throws an exception, the function exits early and the newly allocated memory never gets deleted, leading to a memory leak. In a big piece of software many classes inherit from a base class, so who takes the responsibility to deallocate the memory: the parent or the child class?
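The exception case can be made visible with a counter. This is a sketch under my own assumptions: Counted is a hypothetical class that tracks how many instances are alive, and someOperation always throws.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource that counts how many instances are currently alive.
struct Counted
{
    static int& alive() { static int n = 0; return n; }
    Counted()  { ++alive(); }
    ~Counted() { --alive(); }
};

void someOperation() { throw std::runtime_error("failed"); }

void buildWithRawPointer()
{
    Counted* param = new Counted();  // memory allocated
    someOperation();                 // throws, so the delete below never runs
    delete param;
}

// Returns the number of objects still alive after the exception was handled.
int leakedAfterException()
{
    try { buildWithRawPointer(); }
    catch (const std::runtime_error&) {}
    return Counted::alive();
}
```

leakedAfterException() returns 1: the exception skipped the delete, and nothing else in the program knows the object exists.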
Addressing these issues with raw pointers is why the concept of
smart pointers was introduced in C++.
The smart pointers we use today (std::unique_ptr, std::shared_ptr and std::weak_ptr) were standardised in C++11, building on earlier attempts such as C++98's std::auto_ptr. Smart pointers are crucial for the Resource Acquisition Is Initialisation (RAII) programming idiom of C++. If this name does not make sense to you then you're not alone. The naming has been criticised and an alternative name, Scope-Bound Resource Management, has been suggested, which might give you a clue about what RAII means. The motive of RAII is to ensure that all resources for an object are created and made ready for use in one line of code. The aim of RAII is to give ownership of any heap-allocated resource to a stack-allocated object whose destructor contains the code to free the resource. Smart pointers help you keep your programs free of memory and resource leaks. Ok, but how?
void useSmartPointer()
{
// Resource is a hypothetical class used only for illustration.
// Create the smart pointer on the stack and pass it the raw pointer.
std::unique_ptr<Resource> resource(new Resource(L"My Resource", L"In Heap"));
// Use the resource through the smart pointer
auto full = resource->is_full;
//...
} // resource is deleted automatically here.
The resource is created in heap memory and then passed to unique_ptr, which creates a smart pointer on the stack. When the function returns, the smart pointer goes out of scope. Its destructor is called, and the destructor uses the delete keyword to deallocate the memory. A smart pointer is a class template that takes care of deleting the memory it holds.
template<typename T>
class unique_ptr { // simplified version of the class
private:
T* ptr;
public:
~unique_ptr() { // destructor called automatically
delete ptr; // deletes the heap object
ptr = nullptr;
}
};
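The effect of that destructor can be observed with the real std::unique_ptr and a counter. As a sketch, Counted is a hypothetical class that tracks how many instances are alive, and the function deliberately throws to show that clean-up also happens during stack unwinding.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are currently alive.
struct Counted
{
    static int& alive() { static int n = 0; return n; }
    Counted()  { ++alive(); }
    ~Counted() { --alive(); }
};

void buildWithUniquePtr()
{
    std::unique_ptr<Counted> param(new Counted());  // the owner lives on the stack
    throw std::runtime_error("failed");             // unwinding runs param's destructor
}

// Returns the number of objects still alive after the exception was handled.
int aliveAfterException()
{
    try { buildWithUniquePtr(); }
    catch (const std::runtime_error&) {}
    return Counted::alive();
}
```

aliveAfterException() returns 0. No delete appears anywhere in the code, yet the heap object is gone by the time the exception is caught.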
There is no separate garbage collector running in the background in C++; memory is managed through the standard scoping rules, which keeps the runtime fast and predictable.
Going back to the Simple EQ example, the code looks like the first snippet, but you'll notice that I am not constructing a unique_ptr directly; I am calling make_unique instead. A unique_ptr cannot be shared: it cannot be copied to another unique_ptr, and it cannot be passed by value to a function unless it is moved. Moving a unique_ptr transfers ownership of the resource to another unique_ptr, and the original no longer owns it.
make_unique, available since C++14, is a factory function that creates an object and wraps it in a std::unique_ptr. It allocates memory on the heap, constructs the object with the provided arguments, wraps the raw pointer in a std::unique_ptr and returns it. Well, why use make_unique? This is from the original proposal to introduce make_unique in C++:
make_unique's presence in the Standard Library will have several wonderful consequences. It will be possible to teach users "never say new/delete /new[]/delete[]" without disclaimers. Additionally, make_unique shares two advantages with make_shared (excluding the third advantage, increased efficiency). First, unique_ptr<LongTypeName> up(new LongTypeName(args)) must mention LongTypeName twice, while auto up = make_unique<LongTypeName>(args) mentions it once. Second, make_unique prevents the unspecified-evaluation-order leak triggered by expressions like foo(unique_ptr<X>(new X), unique_ptr<Y>(new Y)). (Following the advice "never say new" is simpler than "never say new, unless you immediately give it to a named unique_ptr".)
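The first two points make sense, but let's understand the last point. C++ does not specify the order in which function arguments are evaluated, and before C++17 the compiler was even allowed to interleave the evaluation of one argument with another.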
void processParameters(std::unique_ptr<AudioParameterFloat> param1, std::unique_ptr<AudioParameterChoice> param2);
// this call is dangerous:
processParameters(
std::unique_ptr<AudioParameterFloat>(new AudioParameterFloat(...)), // allocation 1
std::unique_ptr<AudioParameterChoice>(new AudioParameterChoice(...)) // allocation 2
);
- new AudioParameterFloat(...) ==> Memory allocated for Float
- new AudioParameterChoice(...) ==> Throws an exception (e.g. the allocation or the constructor fails)
- unique_ptr<Float> construction ==> Never happens, so nothing owns the Float
- Memory leak
With make_unique, allocation and wrapping happen together inside a single function call, so nothing else can run in between and this leak is not possible. C++17 tightened the evaluation rules so that arguments can no longer be interleaved like this, but "never say new" is still the simpler habit.
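To tie this back to the layout.add(std::make_unique<...>(...)) calls in the first snippet, here is a sketch of a function that takes ownership of a unique_ptr. Parameter and Layout are hypothetical stand-ins that I made up for this illustration, not the JUCE classes, and the code assumes C++14 or later for make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not JUCE classes.
struct Parameter
{
    explicit Parameter(float v) : defaultValue(v) {}
    float defaultValue;
};

class Layout
{
public:
    // Taking the unique_ptr by value makes the ownership transfer explicit.
    void add(std::unique_ptr<Parameter> p) { params.push_back(std::move(p)); }
    std::size_t size() const { return params.size(); }

private:
    std::vector<std::unique_ptr<Parameter>> params;  // the layout owns the parameters
};

bool ownershipMovesIntoLayout()
{
    Layout layout;
    layout.add(std::make_unique<Parameter>(20.0f));  // temporary: moved in implicitly

    auto peak = std::make_unique<Parameter>(750.0f);
    // layout.add(peak);                             // would not compile: unique_ptr cannot be copied
    layout.add(std::move(peak));                     // explicit transfer of ownership

    // A moved-from unique_ptr is empty; the layout now owns both parameters.
    return layout.size() == 2 && peak == nullptr;
}
```

ownershipMovesIntoLayout() returns true. When the Layout object is destroyed, the vector destroys each unique_ptr, and each unique_ptr deletes its Parameter, so there is no delete to forget.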
So smart pointers offer a better mental model of handling memory
allocation and deallocation. Hence in a lower level language like
C++ pointers need to be smart!