Saturday, August 1, 2009

Some thoughts on metaprogramming, reflection, and templates

The thought struck me recently that C++ templates really are a downright awful metaprogramming system. Don't get me wrong, they are very powerful and I definitely use them, but recently I've realized that whatever power they have is solely due to enabling metaprogramming, and there are numerous other ways of approaching metaprogramming that actually make sense and are more powerful. We use templates in C++ because that's all we have, but they are an ugly, ugly feature of the language. It would be much better to combine full reflection (like Java or C#) with the capability to invoke reflective code at compile time, to get all the performance benefits of C++. Templates do allow you to invoke code at compile time, but through a horribly obfuscated functional style that is completely out of sync with the imperative style of C++. I can see how templates probably evolved into such a mess: starting as a simple extension of the language that allowed a programmer to bind a whole set of function instantiations at compile time, then someone realizing that it's Turing complete, and finally resulting in a metaprogramming abomination that never should have been.

Look at some typical simple real-world metaprogramming cases. For example, take a generic container, like std::vector, where you want to have a type-specialized function such as a copy routine that uses copy constructors for complex types, but uses an optimized memcpy routine for types where that is equivalent to invoking the copy constructor. For simple types, this is quite easy to do with C++ templates. But using it with more complex user-defined structs requires a type function such as IsMemCopyable, which can determine whether the copy constructor is equivalent to a memcpy. Abstractly, this is simple: the type is mem-copyable if it has a default copy constructor and all of its members are mem-copyable. However, it's anything but simple to implement with templates, requiring all kinds of ugly functional code.

Now keep in mind I haven't used Java in many years, and then only briefly; I'm not familiar with its reflection, and I know almost nothing of C#, although I understand both have reflection. In my ideal C++-with-reflection language, you could do this very simply and naturally with an imperative meta-function using reflection, instead of templates (maybe this is like C#, but I digress):

struct vector {
    generic* start, end;
    generic* begin() { return start; }
    generic* end()   { return end; }
    int size()       { return end - start; }

    // class constructor: invoked as vector(sometype), returns a type
    type vector(type datatype) { start::type = end::type = datatype*; }
};

void SmartCopy(vector& output, vector& input) {
    if ( IsMemCopyable( typeof( *input.begin() ) ) ) {
        memcpy(output.begin(), input.begin(), input.size());
    } else {
        for_each(output, input) { output[i] = input[i]; }
    }
}

bool IsMemCopyable(type dtype) {
    bool copyable = (dtype.CopyConstructor == memcpy);
    for_each(dtype.members) {
        copyable &= IsMemCopyable(dtype.members[i]);
    }
    return copyable;
}
The idea is that using reflection, you can unify compile-time and run-time metaprogramming into the same framework, with compile-time metaprogramming just becoming an important optimization. In my pseudo-C++ syntax, reflection is accessible through type variables, which actually represent types themselves: PODs, structs, classes. Generic types are specified with the 'generic' keyword, instead of templates. Classes can be constructed simply through functions, and I added a new kind of constructor, a class constructor, which returns a type. This allows full metaprogramming, but all your metafunctions are still written in the same imperative language. Most importantly, the metafunctions are accessible at runtime, but can be evaluated at compile time as well, as an optimization. For example, to construct a vector instantiation, you would do so explicitly, by invoking a function:

vector(float) myfloats;

Here vector(float) actually calls a function which returns a type, which is more natural than templates. This type constructor for vector assigns the actual types of the two data pointers, and is the largest deviation from C++:

type vector(type datatype) {start::type = end::type = datatype*;}

Everything has a ::type, which can be set and manipulated just like any other data. Also, anything can be made a pointer or reference by adding the appropriate * or &.

if ( IsMemCopyable( typeof( *input.begin() ) ) ) {

There the * is used to get past the pointer returned by begin() to the underlying data.
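The closest real C++ gets to this kind of first-class type surgery is the trait metafunctions in <type_traits>, where pointers are stripped and re-added by name rather than by * and ::type assignment. A rough sketch of the same pointer-stripping idea:

```cpp
#include <type_traits>

// In the hypothetical language, *input.begin() strips one level of
// pointer to reach the element type. In real C++, the equivalent
// manipulation is done with trait metafunctions:
using Elem    = std::remove_pointer<float*>::type;  // float
using ElemPtr = std::add_pointer<Elem>::type;       // float* again

static_assert(std::is_same<Elem, float>::value,    "pointer stripped");
static_assert(std::is_same<ElemPtr, float*>::value, "pointer re-added");
```

The difference in spirit is that these run only at compile time; the post's point is that the same operations should also be available on types held in ordinary variables at run time.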

When the compiler sees a static instantiation, such as:
vector(float) myfloats;

It knows that the type generated by vector's type constructor is static and it can optimize the whole thing, compiling a particular instantiation of vector, just as in C++ templates. However, you could also do:

type dynamictype = figure_out_a_type();
vector(dynamictype) mystuff;

Where dynamictype is a type not known at compile time, which could be determined by other functions, loaded from disk, or whatever. It's interesting to note that in this particular example, the unspecialized version is not all that much slower, as the branch in the copy function is invoked only once per copy, not once per copy constructor.
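That cost claim can be made concrete with a sketch of the fully dynamic path, where the trait is an ordinary runtime value (the function and parameter names here are hypothetical): the mem-copyable check is hoisted out of the element loop, so it costs one branch per call.

```cpp
#include <cstddef>
#include <cstring>

// Runtime-dispatched copy: 'mem_copyable' plays the role of
// IsMemCopyable evaluated on a type discovered at run time, and
// 'copy_one' stands in for the type's copy constructor.
// The branch is taken once per call, not once per element.
void copy_elements(void* out, const void* in, std::size_t count,
                   std::size_t elem_size, bool mem_copyable,
                   void (*copy_one)(void*, const void*)) {
    if (mem_copyable) {
        std::memcpy(out, in, count * elem_size);
    } else {
        char* d = static_cast<char*>(out);
        const char* s = static_cast<const char*>(in);
        for (std::size_t i = 0; i < count; ++i)
            copy_one(d + i * elem_size, s + i * elem_size);
    }
}
```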

My little example is somewhat contrived and admittedly simple, but the power of reflective metaprogramming can make formerly complex big-systems tasks much simpler. Take for example the construction of a game's world editor.

The world editor of a modern game engine is a very complex beast, but at its heart is a simple concept: it exposes a user interface to all of the game's data, as well as tools to manipulate and process that data, which crunch it into an optimized form that must be streamed from disk into the game's memory and thereafter parsed, decompressed, or what have you. Reflection allows the automated generation of GUI components from your code itself. Consider a simple example where you want to add dynamic light volumes to an engine. You may have something like this:

struct ConeLight {
    HDRcolorRGB intensity_;
    BoundedFloat(0,180) angleWidth_;
    WorldPosition pos_;
    Direction dir_;
    TextureRef cookie_;
    static HelpComment description_ = "A cone-shaped light with a projected texture.";
};

The editor could then automatically connect a GUI for creating and manipulating ConeLights just based on analysis of the type. The presence of a WorldPosition member would allow it to be placed in the world, the Direction member would allow a rotational control, and the intensity would use an HDR color picker control. The BoundedFloat is actually a type constructor function, which sets custom min and max static members. The cookie_ member (a projected texture) would automatically have a texture locator control and would know about asset dependencies, and so on. Furthermore, custom annotations are possible through the static members. Complex data processing, compression, disk packing and storage, and so on could happen automatically, without having to write any custom code for each data type.
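Without language support, this is typically approximated with a hand-written member table. Here is a toy sketch (all names hypothetical) of the kind of mapping an editor generator would perform, with the member descriptions that built-in reflection would otherwise supply for free:

```cpp
#include <string>
#include <vector>

// Hand-rolled stand-in for reflection: each member is described by a
// (type name, field name) pair that a reflective compiler would provide.
struct MemberInfo {
    std::string type_name;
    std::string field_name;
};

// Map a member's type to the editor control a GUI generator would emit.
std::string control_for(const MemberInfo& m) {
    if (m.type_name == "WorldPosition") return "placement gizmo";
    if (m.type_name == "Direction")     return "rotation control";
    if (m.type_name == "HDRcolorRGB")   return "HDR color picker";
    if (m.type_name == "TextureRef")    return "texture locator";
    return "generic field editor";
}

// Walk the member list and emit one control description per field.
std::vector<std::string> build_editor(const std::vector<MemberInfo>& members) {
    std::vector<std::string> controls;
    for (const MemberInfo& m : members)
        controls.push_back(m.field_name + ": " + control_for(m));
    return controls;
}
```

With real reflection, the MemberInfo table would come from the compiler rather than being maintained by hand, which is the entire point of the post.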

This isn't revolutionary; in fact our game editor and generic database system are based on similar principles. The difference is they are built on a complex, custom infrastructure that has to parse specially formatted C++ and Lua code to generate everything. I imagine most big game editors have some similar custom reflection system. It's just a shame, though, because it would be so much easier and more powerful if built into the language.

Just to show how powerful metaprogramming could be, let's go a step further and tackle the potentially hairy task of a graphics pipeline, from art assets down to the GPU command buffer. For our purposes, art packages expose several special asset structures, namely geometry, textures, and shaders. Materials, segments, meshes and the like are just custom structures built out of these core concepts. On the other hand, a GPU command buffer is typically built out of fundamental render calls which look something like this (again somewhat simplified):

error GPUDrawPrimitive(VertexShader* vshader, PixelShader* pshader, Primitive* prim, vector(Sampler) samplers, vector(float4) vconstants, vector(float4) pconstants);

Let's start with a simpler example, that of a 2D screen-pass effect (which, these days, encompasses a lot).

Since this hypothetical reflexive C language could also feature JIT compilation, it could function as our scripting language as well; the effect could be coded completely in the editor or art package if desired.

struct RainEffect : public FullScreenEffect {
    function(RainPShader) pshader;

    float4 RainPShader(RenderContext rcontext, Sampler(wrap) fallingRain,
                       Sampler(wrap) surfaceRain, AnimFloat density, AnimFloat speed) {
        // ... do pixel shader stuff
    }
};

// where the RenderContext is the typical global collection of stuff
struct RenderContext {
    Sampler(clamp) Zbuffer;
    Sampler(clamp) HDRframebuffer;
    float curtime;
    // etc ...
};
The 'function' keyword specifies a function object, much like a type object, with the parameters as members. The function is statically bound to RainPShader in this example. The GUI can display the appropriate interface for this shader, and it can be controlled from the editor by inspecting the parameters, including those of the function object. The base class FullScreenEffect has the quad geometry and the other glue stuff. The pixel shader itself would be written in this reflexive C language, with a straightforward metaprogram to convert it into HLSL/Cg and compile as needed for the platform.

Now here is the interesting part: all the code required to actually render this effect on the GPU can be generated automatically from the parameter type information embedded in the RainPShader function object. The generation of the appropriate GPUDrawPrimitive function instance is thus just another metaprogram task, which uses reflection to pack all the samplers into the appropriate state, set the textures, pack all the float4s and floats into registers, and so on. For a screen effect, invoking this translator function automatically wouldn't be too much of a performance hit, but for lower-level draw calls you'd want to instantiate (optimize) it offline for the particular platform.
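The packing step can be sketched as a single walk over the reflected parameter list, assigning sampler and constant-register slots in declaration order. This is a toy model under assumed names (real drivers juggle alignment, register files, and state blocks):

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for the reflected parameter list of a shader
// function: each parameter is either a sampler or N float4 constants.
enum class ParamKind { Sampler, Float4 };

struct Param {
    ParamKind kind;
    std::size_t float4_count;  // used when kind == Float4
};

struct Binding {
    std::size_t sampler_slot;   // valid for samplers
    std::size_t register_base;  // valid for constants
};

// Assign slots in declaration order, the way a generated
// GPUDrawPrimitive wrapper might.
std::vector<Binding> assign_slots(const std::vector<Param>& params) {
    std::vector<Binding> out;
    std::size_t next_sampler = 0, next_register = 0;
    for (const Param& p : params) {
        Binding b{0, 0};
        if (p.kind == ParamKind::Sampler)
            b.sampler_slot = next_sampler++;
        else {
            b.register_base = next_register;
            next_register += p.float4_count;
        }
        out.push_back(b);
    }
    return out;
}
```

Run once offline, the resulting table is exactly the "particular instantiation" the post describes; run at load time, it is the dynamic fallback.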

I use that example because I actually created a very similar automatic draw-call generator for 2D screen effects, but all done through templates. It ended up looking more like how CUDA is implemented, and also allowed compilation of the code as HLSL or C++ for debugging. It was doable, but involved a lot of ugly templates and macros. I built that system to simplify procedural surface operators for geometry image terrain.

But anyway, you get the idea now, and going from a screen effect you could then tackle 3D geometry and make a completely generic, data driven art pipeline, all based on reflective functions that parse data and translate or reorganize it. Some art pipelines are actually built on this principle already, but oh my wouldn't it be easier in a more advanced, reflective language.

1 comment:

DEADC0DE said...

You're wrong.

Reflection won't enable you to do anything, as you will still dynamically dispatch; using a branch is not different from using a virtual table. Even in JIT-compiled languages, like Java and C#, it's unlikely that the JIT will optimize that out. That's why C# has proper generics, and Java has something in between C++ templates and the C# stuff.

You may want to have some more stuff: reflection and code generation. But then you will need a JIT too, and in the end you'll end up designing C# again, so just use it if you want it.

The problem of templates is that they were born as generics, but they were implemented in a cheap and stupid way that left them open, incidentally, to other uses. They weren't designed to be a metaprogramming system like lisp macros or so.