Part Seven: Matrices

In part six, we changed the name of the uniform variable that modified the vertices’ Y axis to make it more descriptive, added a uniform variable that modified the vertices’ X axis, and added a line to the shader to use that new X axis uniform variable in a cos() function call.

As a result, we created a rainbow square that followed a circular path around the screen instead of just bouncing up and down. But what if we wanted it to change scale too, or rotate? We might also want some vertices to move in one direction and others to move in another, creating a little perspective in the scene. How would we do that? Would we just continue to add uniform values?

We might be able to do that, but it wouldn’t be a good idea. There are a couple of good reasons why our current architecture is less than optimal.

First of all, there are a limited number of uniform variables that can be passed into a shader, and that limit is implementation-specific. On a desktop system, you may have access to more uniforms in your shaders than on a smaller device like the iPhone. That is not a problem if you start development on an iPhone, but if you write an awesome OpenGL application for the Mac or PC with a lot of shader variables and then decide to port it to the iPhone, your shaders may suddenly begin to fail.

According to the Khronos group’s ‘The OpenGL ES Shading Language’ specification 1.0.17, the minimum number of uniform vectors a vertex shader must support is 128, and the minimum number of uniform vectors that a fragment shader must support is 16.

Attribute support is even more limited, with a minimum of 8 attributes in a vertex shader. When I say ‘a minimum of 8’, that means that the implementation of OpenGL ES is free to support more than 8, but it must support at least 8.

The second reason our current architecture could use some improvement is performance. Since this vertex shader will be running once for every vertex defined in the model(s) being drawn, it should do as little processing as possible to run at the fastest speed possible.

If you think about what our sin() and cos() functions are doing in the vertex shader, they really don’t need to be there. They are fed a uniform value that doesn’t change during the entire render pass, so they return the same results every time the shader runs for that frame.

Right now it doesn’t seem like a big deal because we only have 4 vertices, but what if we had 4 million? One solution would be to calculate the sin() and cos() values in the main program and pass those in to the shader instead of having the shader do the calculations, since the program calling the shaders would only have to perform the calculations once regardless of the number of vertices being processed.

That is part of the solution, but what if we could address that first problem at the same time? We can, through the use of matrices. As used by OpenGL, a matrix is a 4 by 4 grid of numbers. The values in particular positions of that grid describe scaling, translation, and rotation, and multiplying vertex data by the matrix applies those transformations quickly and efficiently.

Instead of continuing to add uniform variables to my shader, I can pack all of that information into a matrix and just pass that in for everything.
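To make that concrete, here is a small sketch of how translation values can sit inside a 4 by 4 matrix and be applied to a vertex with one multiplication. The Mat4 type and the mat4_translation() and mat4_transform() helpers are hypothetical names made up for illustration. The column-major storage order is the layout OpenGL expects for matrices.

```c
/* A 4x4 matrix stored in column-major order, the layout OpenGL expects:
   the element at (row, column) lives at index column * 4 + row. */
typedef struct {
    float m[16];
} Mat4;

/* Build a translation matrix: an identity matrix with the offsets
   placed in the fourth column (indices 12, 13, and 14). */
Mat4 mat4_translation(float tx, float ty, float tz)
{
    Mat4 r = {{ 1.0f, 0.0f, 0.0f, 0.0f,
                0.0f, 1.0f, 0.0f, 0.0f,
                0.0f, 0.0f, 1.0f, 0.0f,
                tx,   ty,   tz,   1.0f }};
    return r;
}

/* Multiply a matrix by a vertex (x, y, z, w). This is the same kind of
   work a vertex shader does when it multiplies a matrix by a position. */
void mat4_transform(const Mat4 *a, const float in[4], float out[4])
{
    int row, col;
    for (row = 0; row < 4; ++row) {
        out[row] = 0.0f;
        for (col = 0; col < 4; ++col) {
            out[row] += a->m[col * 4 + row] * in[col];
        }
    }
}
```

Transforming the vertex (0.5, 0.5, 0.0, 1.0) by mat4_translation(0.25f, -0.5f, 0.0f) moves it to (0.75, 0.0, 0.0, 1.0). Note that the w component must be 1.0 for the translation column to take effect.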

Under OpenGL ES 1.1, OpenGL took care of matrix management for you, allowing you to change modes and apply changes to the appropriate matrices through the use of convenience functions.

Let’s look at that OpenGL ES 1.1 code from our sample program again.

        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glTranslatef(0.0f, (GLfloat)(sinf(transY)/2.0f), 0.0f);
        transY += 0.075f;
       
        glVertexPointer(2, GL_FLOAT, 0, squareVertices);
        glEnableClientState(GL_VERTEX_ARRAY);
        glColorPointer(4, GL_UNSIGNED_BYTE, 0, squareColors);
        glEnableClientState(GL_COLOR_ARRAY);

Remember this code? The first glMatrixMode() function made the projection matrix active and the following glLoadIdentity() function loaded an identity matrix into it.

Then the next glMatrixMode() function call made the modelview matrix active and was also followed by a glLoadIdentity() function.

What’s with the glLoadIdentity() function calls?

As a programmer, when you allocate memory to use in your programs, you typically memset() the memory chunk to zeros so you don’t accidentally get any trash data in there that might mess up your code later.

The glLoadIdentity() function serves a similar purpose: it loads an identity matrix so that any changes you make to the matrix will work properly when multiplied against another matrix. In addition, any matrix multiplied against an identity matrix resolves to itself. So even if you don’t make changes to the matrix, as is the case for the GL_PROJECTION matrix here, nothing will change when OpenGL uses that matrix to apply the projection during the fixed pipeline render step (remember, this is OpenGL ES 1.1 code we’re looking at now).

We’ll go deeper into matrices soon, but because I know you’re wondering, here is what an identity matrix looks like.

 

1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
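The "resolves to itself" property is easy to check in code. The following is a rough sketch with hypothetical names (Mat4, mat4_identity(), mat4_multiply()), not anything OpenGL provides. It builds the identity matrix shown above and multiplies another matrix by it.

```c
#include <string.h>

/* 4x4 matrix in column-major order: (row, column) is at column * 4 + row. */
typedef struct {
    float m[16];
} Mat4;

/* Zero the matrix, then put ones on the diagonal. */
Mat4 mat4_identity(void)
{
    Mat4 r;
    memset(&r, 0, sizeof r);
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}

/* Return the matrix product a * b. */
Mat4 mat4_multiply(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    int row, col, k;
    for (col = 0; col < 4; ++col) {
        for (row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (k = 0; k < 4; ++k) {
                sum += a->m[k * 4 + row] * b->m[col * 4 + k];
            }
            r.m[col * 4 + row] = sum;
        }
    }
    return r;
}
```

Multiplying any matrix by the result of mat4_identity(), on either side, gives back the original matrix unchanged.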

 

Let’s look at that OpenGL ES 1.1 code again.

        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glTranslatef(0.0f, (GLfloat)(sinf(transY)/2.0f), 0.0f);
        transY += 0.075f;

This logic makes the modelview matrix active, loads an identity matrix, then uses the glTranslatef() function to set a translation value for the Y coordinate. That’s just OpenGL speak for ‘tells the render engine how far to move the Y coordinate of the vertex when it’s drawn’. If this value is positive, it’ll move the vertex that far up the Y axis, and if it’s negative, it’ll move it that far down the Y axis.

The last line of code in this snippet just increments the transY variable for next time we run through here.

But what about OpenGL ES 2.0? None of these functions exist any longer. Why not?

In OpenGL ES 2.0, the parts of the fixed pipeline that used the data from these function calls have been replaced by the shaders, so now we have to feed all of this information into the new programmable pipeline (the shaders) ourselves.

Additionally, the projection matrix and modelview matrix can be combined, which is what the ES 1.1 fixed pipeline was doing anyway, and fed into the vertex shader as a model-view-projection matrix.
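Combining the two matrices is a single matrix multiplication done on the CPU. The sketch below uses hypothetical helper names (Mat4, mat4_multiply(), mat4_scale(), mat4_translation(), build_mvp()), and the simple scale matrix is only a stand-in for a real projection matrix, chosen to keep the numbers easy to follow.

```c
/* 4x4 matrix in column-major order: (row, column) is at column * 4 + row. */
typedef struct {
    float m[16];
} Mat4;

/* Return a * b. Applied to a vertex, b acts first and a acts second. */
Mat4 mat4_multiply(const Mat4 *a, const Mat4 *b)
{
    Mat4 r;
    int row, col, k;
    for (col = 0; col < 4; ++col) {
        for (row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (k = 0; k < 4; ++k) {
                sum += a->m[k * 4 + row] * b->m[col * 4 + k];
            }
            r.m[col * 4 + row] = sum;
        }
    }
    return r;
}

/* Stand-in for a projection matrix (assumption): a plain scale. */
Mat4 mat4_scale(float sx, float sy, float sz)
{
    Mat4 r = {{ sx,   0.0f, 0.0f, 0.0f,
                0.0f, sy,   0.0f, 0.0f,
                0.0f, 0.0f, sz,   0.0f,
                0.0f, 0.0f, 0.0f, 1.0f }};
    return r;
}

/* Modelview stand-in: move the model by (tx, ty, tz). */
Mat4 mat4_translation(float tx, float ty, float tz)
{
    Mat4 r = {{ 1.0f, 0.0f, 0.0f, 0.0f,
                0.0f, 1.0f, 0.0f, 0.0f,
                0.0f, 0.0f, 1.0f, 0.0f,
                tx,   ty,   tz,   1.0f }};
    return r;
}

/* Combine once per frame: projection * modelview, in that order. */
Mat4 build_mvp(const Mat4 *projection, const Mat4 *modelview)
{
    return mat4_multiply(projection, modelview);
}
```

The order of the multiplication matters: swapping the two arguments generally produces a different matrix. The combined result could then be sent to the shader with a call such as glUniformMatrix4fv(location, 1, GL_FALSE, mvp.m), where location is whatever uniform location your program looked up.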

At a high level, here’s how the functionality of a model-view-projection matrix breaks down.

The ‘model’ part means the matrix will contain the data needed to manipulate your model at render time. It may have values to scale the model up or down, rotate it on the X, Y, or Z axis (or any combination), and move it around in OpenGL rendering space.

The ‘view’ part means the matrix will be adjusted to take the position of the ‘camera’ into account, or the point in OpenGL space from which the observer would be looking in. Since there isn’t a real ‘camera’ to manipulate, what the view matrix really does is adjust the positions of everything else to make it look like it would if it were being viewed from that phantom ‘camera’ location.

The ‘projection’ part means that the matrix will be adjusted to take the camera ‘lens’ into account, and will adjust for the observer’s field of view. In the real world, large objects appear to get smaller as they move farther away, and disappear when they leave your field of vision. Having the right values in your projection matrix will accomplish all of that in your rendered scene for you.

We can calculate all of that information in our program, put it in a single matrix, and feed it into the shader through a uniform. After that, no matter how many vertices the shader processes, it only has to do one computation per vertex, a matrix multiplication operation, and all of that information is applied to the vertex being processed.

Graphics cards are very good at that kind of math work, so performance will be great.

The best way to understand is by doing, so let’s convert our sample program from using separate X and Y uniform variables to a single model-view-projection matrix with model translation values loaded into it.
