Object Recycler Icon Project home gitorious.org

A reverse tracing garbage collector for C++

Introduction

The Reference Binding Object Recycler (REBOR) is an implementation of an exact (precise) concurrent reverse tracing garbage collector for C++.

Its usage is simple:

#include <recycler>

_<int> pInt = _new<int>(42);

printf("The answer is %d", *pInt);

This example creates a new garbage-collected object (on the heap) and initializes it with 42. Then it is printed out.

"_" is the name of a template class! The reason for the name was to keep the focus on the program and its classes and not the references (smart pointers). They should be as inconspicuous as possible though I could not get rid of the angle brackets.

The project is still in an early development state (it started as a feasibility study), but I already used it successfully in several projects.

With java and .NET being so prominent I don't think it is necessary to repeat the advantages of garbage collection. Still, I may point out the disadvantages again:

This implementation is not intended to solve these problems though it tries to keep them as small as possible.

The reason why objects are not immediately destroyed when they have no more references is that finding orphan object clusters can take a long time and would block the execution of the program.

An object "cluster" consists of all objects referencing themselves directly or indirectly via other objects (circular references). Orphan means that there is no way to access the cluster from the program any more.

If you find this garbage collector useful or worth being improved I'm glad for any comment.

Operating systems

This version of the garbage collector works on Microsoft Windows and UNIX (Linux) operating systems. MinGW is not recommended because it doesn't support thread local storage.

Using the garbage collector

As there should be only one recycler (garbage collector) instance the project is set up as dynamic link library.

The library can be build with qmake (via the Qt Creator project file rebor.pro) or with MS Visual Studio 2010 (rebor.sln).

Binding to the library is as usual.

To use the recycler first include the header recycler. Then you can directly start to create objects. Any type can be used, primitives or classes/structures.

Here is a coding example with a user defined class:

#include <recycler>

class MyClass
{
public:
   MyClass() {}
   ~MyClass() {}

   int value() const { return _value; }
   void setValue(int value) { _value = value; }

private:
   int _value;
};

void foo()
{
   _<MyClass> myClass = _new<MyClass>();

   myClass->setValue(42);
}

Restrictions:
There is only one thing to avoid: Mixing the standard new operator and the Recycler _new<> template functions. Always use _new<>() for objects containing references unless you carefully design pure hierarchic object chains. This restriction also applies to Lists and Maps! If you need lists or maps of objects use rcy::List or rcy::Map.

Recycler (GC) characteristics

Using the smart pointers with other libraries

Theoretically the member references (_<> or o<> in earlier versions) could be used in any library if it wouldn't be for the necessity to establish a link between the reference and the containing object!

Hence I provided some collection classes similar to the standard C++ library (rebor_collections.h) including an optimized forEach macro that accepts template arguments.

Performance

Warning:

Advantages

Disadvantages

Background

Why another garbage collector?

In my research studies I found some literature about garbage collection (see footnote; for an introduction into garbage collection I would recommend the Wikipedia article). But actually implemented garbage collectors for C++ I found 4 in 2011 and I'm not speaking of GCs solely based on reference counting (a GC should be able to cope with circular references):

The last one (librtgc) is very similar to my implementation but with less features and abandoned 2003 at Version 0.2.

The other garbage collectors are either conservative (not precise), based on special memory allocation or not concurrent.

I came to the conclusion that apart from the difficulty to implement a precise GC due to the lack of reflection in C++ there seems to be little need for garbage collection in C++. Experienced developers can handle objects with other techniques, for example with simple reference counting or the parent-child pattern (parents deleting their children when they are destroyed).

But developers new to C++ (coming from java or .NET) quite miss garbage collection and have great difficulty to change their custom to use the new operator everywhere, necessary or not.

What I wanted is a simple but effective GC for all platforms. I found myself challenged to prove that it is possible to write a GC purely in C++ without OS or memory hacks, having full compliance with C++ (using standard new operator, multiple inheritance etc.) and no side effects to existing C++ code (if there are no GC objects no computer resources are ussed).

Though the actual version of the GC abandoned using the standard C++ new operator for simplicity of usage.

How the garbage collector was developed

My approach was to start with reference counting by smart pointers (a template class which behaves like a pointer), which left the problem of circular references.

The key problem with circular references is reflection. The GC needs to know the references (pointers) an object has to other objects to resolve circular references.

My solution was to tell the references (smart pointers) who their containing object is!

How the containing object were assigned to the references changed several times in the past years as I got new ideas. I called the different methods "generations", because they always meant a step forward in usability of the framework. Further down I give an overview.

The first and second generation distinguished two types of references (smart pointers):

In the first version (1st generation) the member references (o<>) got their containing object as parameter in their constructor:

class Test
{
public:
   Test() : _ref(this) {}
   ~Test() {}

private:
   o<MyClass> _ref;
};

This was a bit awkward and dangerous as the application crashes when the initialization of the o<> smart pointers was missing.

In the second attempt (2nd generation) I used thread local storage to assign the containing objects to the o<> references, thus being able to use the default constructor of o<>.

In the third version I distinguished externally controlled references and references controlled by objects implicitly. That left only one reference type: _<>.

Alternative versions

The development of the garbage collector took place in steps. Some time after a version was finished and tested I got new ideas how it could be made more comfortable. I called these steps "generations".

1st generation

In this version the smart pointers must be initialized with their containing objects in the constructor (via this).

2nd generation

Here TLS and overloaded new and delete operators are used to attach the smart pointers to their objects.

Objects controlled by the GC must be derived from AutoDelete and use the smart pointers _<> and o<> as in the following example:

class Test : public AutoDelete
{
public:
   Test() {}
   virtual ~Test() {}
   
   void setReference(_<Test> ref) { _myReference = ref; }
   
private:
   o<Test> _myReference;
};

void main()
{
   _<Test> test = new Test;
   test->setReference(test);
   // Simplest form of circular reference (a reference to self)
}

There are some important rules to follow:

3rd generation

The third generation is the actual version of the framework.

History

Version 1.0

Bug fixes:

Version 0.9

Features:

Version 0.8

Features:

Bug fixes:

Version 0.7

Recycler 3rd generation (multiple platforms).

Version 0.5

Recycler 2nd generation (only Microsoft Windows). See project AutomaticMemoryManagement_2G.

References

About the author

I am a German "Diplom-Physiker" (graduate physicist) working in Germany and Switzerland as software developer since 1997.