World, View and Projection Matrix Unveiled

To understand how geometry is displayed on your computer screen, be it 2d or 3d, one needs to understand the math people came up with to simplify our lives.

Basically, every computer program that displays some kind of geometry or object on your screen probably uses these tricks. In 3d, without them, we would still have only very simple graphics.

We shall not write tens of pages explaining how this math actually works; instead, we shall present it visually. All modern graphics hardware, game engines, DirectX and OpenGL already provide helper functions for working with matrices. Once you understand what the matrices are, you will simply use these functions to get the desired effect. You can also check out the matrix examples on this page for an easier understanding.

**Note: all drawings use an X and Z scale, as if we were looking at the scene from above. 2d presentations are used for the sake of simplicity.**

The Matrix

After wasting a big part of the Amazon forests and a substantial amount of graphite (I believe some coffee was also involved in the process), mathematicians came up with a simple, handy, yet genius formula for effectively enslaving all kinds of objects into simulated computer worlds. The purpose was to move objects around, rotate them, stretch them, and still be able to do that very fast, even if an object consists of ten or a hundred thousand vertexes. The project was a big success, and the solution is shared for all humanity to benefit from and enjoy. Behold! The **4×4 homogeneous matrix:**
As you see, it consists of 16 floating-point (real) numbers. Every section of it serves a different purpose. Data filled into the corresponding fields creates a matrix that you may use on any object. Don’t worry, you don’t have to fill in the data yourself. All the platforms have ways of creating matrices.

The point is, when you multiply a 3d point in space with a matrix, the result is the vertex transformed by the values of the matrix.

While explaining this, I will mention the **identity matrix**. It is filled with default data, so when you multiply vertexes with this matrix, nothing happens. Still, it is important, because if you need a **transform matrix**, you should first create an identity matrix (or have one filled with identity values), and then scale, rotate and move it. It also depends on what platform you are using: some automatically fill in the identity values, some do not. Check your documentation.

The identity matrix looks like this:

It may confuse you that the rotation part is filled with 1s on the diagonal and zeros everywhere else. The reason is that the rotation part holds the directions of the three axes that will rotate your input accordingly: row one is the X axis, row two is the Y axis and row three is the Z axis. In the given example, row one points in the positive X direction, row two in the positive Y direction (up), and row three in the direction we are looking at: these match the axes of your scene, so no rotation is applied. The other fields are zero, because we do not modify anything.
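
The layout described above can be sketched in a few lines of plain Python. This is an illustration only, not any platform’s API: it assumes the row-vector convention (a point is the row `[x, y, z, 1]`, multiplied as point x matrix) and assumes that the fourth row holds the translation. The helper name `transform_point` is made up for this example.

```python
# Identity matrix in the row-per-axis layout described above.
IDENTITY = [
    [1.0, 0.0, 0.0, 0.0],  # row one:   X axis direction
    [0.0, 1.0, 0.0, 0.0],  # row two:   Y axis direction (up)
    [0.0, 0.0, 1.0, 0.0],  # row three: Z axis direction (look)
    [0.0, 0.0, 0.0, 1.0],  # row four:  translation (assumed layout), none here
]


def transform_point(point, matrix):
    """Multiply a 3d point, taken as the row [x, y, z, 1], with a 4x4 matrix."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    return out[:3]


vertex = (2.0, 3.0, 4.0)
unchanged = transform_point(vertex, IDENTITY)  # same coordinates as vertex

# Filling the fourth row moves the point instead of leaving it alone.
moved_matrix = [row[:] for row in IDENTITY]
moved_matrix[3] = [10.0, 0.0, 10.0, 1.0]
moved = transform_point(vertex, moved_matrix)  # shifted by 10 on X and Z
```

Your platform’s own helper functions do the same job; check its documentation for which convention it uses, since some platforms multiply in the opposite order.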

If you are using a commercial game engine, a loaded or created object usually already receives the matrix of its parent, so you do not have to think about these a lot. If you are programming in DirectX or OpenGL, or you are creating your own graphics engine, you need to handle the matrices yourself.

World Matrix

When you create an object in your favorite 3d art creation program, the object itself is in object space. If you place it at coordinates e.g. x=10, z=10, these numbers are put in the object’s world transformation matrix. The object itself is *never moved*. Now, if you rotate it clockwise by 30 degrees, that rotation, you guessed it, will also be put in the object’s **transformation matrix**. I probably need not say that scaling is handled the same way. Before being displayed, the created object is multiplied with its transformation matrix (also known as the **world matrix**) and then presented.

To be more clear, let’s take a look at the next picture:

What I just explained is VERY handy! Every transformation you do while creating the object, or later in your 3d engine, ends up in the *world transformation matrix*, and the object itself *never* moves. This is far better than moving/rotating/stretching every point of the object in the scene, especially if it has a lot of vertexes. Also, since the precision of the floating-point values used in 3d math is not perfect, after many operations the object would become distorted, as errors are introduced with every operation you do. This never happens with the world transform matrix, as the object never actually moves. And, yes: *there is no spoon*.
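
As a rough sketch of the idea, the plain Python below builds a world matrix from a scale, a 30 degree rotation and a move to x=10, z=10, then applies it to vertexes that stay untouched in object space. The row-vector convention, the scale, then rotate, then move order and all helper names are assumptions made for this example; your platform’s helpers may differ.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices (a x b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(point, matrix):
    """Multiply the row [x, y, z, 1] with a 4x4 matrix, return x, y, z."""
    row = [point[0], point[1], point[2], 1.0]
    return [sum(row[k] * matrix[k][c] for k in range(4)) for c in range(3)]


def scaling(sx, sy, sz):
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_y(angle):
    """Rotation around the Y (up) axis, angle in radians. In this layout a
    positive angle turns +X towards -Z."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


# The vertexes as modelled: they stay in object space and are never edited.
object_vertexes = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

# Every transformation goes into the world matrix instead.
world = mat_mul(mat_mul(scaling(2.0, 2.0, 2.0), rotation_y(math.radians(30.0))),
                translation(10.0, 0.0, 10.0))

# World-space positions are computed only when the object is presented.
world_vertexes = [transform_point(v, world) for v in object_vertexes]
```

Changing the rotation or position later only means rebuilding `world`; `object_vertexes` keeps its original, exact values.
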

View Matrix

The **view matrix** is also loosely known as the **camera matrix**. It transforms the entire world space into camera space. Similarly to how the world matrix puts the object from object space into world space, the view matrix transforms the space so that the view space’s right points in the X direction, its top points in the Y direction, and the look direction becomes the positive Z axis (if it’s a left-handed matrix).

I’ll add that the view matrix is actually the inverse of the camera’s own world matrix (the camera matrix in the strict sense), because it transforms the entire world back into the camera’s local coordinate system.
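
The inverse relationship can be sketched in plain Python. The example assumes the row-vector layout, a left-handed system and a camera whose three axes are unit length and perpendicular to each other (no scaling), which makes the inverse cheap to write down. The names `camera_matrix` and `view_from_camera` are made up for this illustration.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices (a x b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(point, matrix):
    """Multiply the row [x, y, z, 1] with a 4x4 matrix, return x, y, z."""
    row = [point[0], point[1], point[2], 1.0]
    return [sum(row[k] * matrix[k][c] for k in range(4)) for c in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def camera_matrix(right, up, look, position):
    """World matrix of the camera: its three axes plus its position."""
    return [[*right, 0.0], [*up, 0.0], [*look, 0.0], [*position, 1.0]]


def view_from_camera(cam):
    """Invert a camera matrix with orthonormal axes: transpose the rotation
    part and move the world by the negated camera position."""
    right, up, look, pos = (cam[i][:3] for i in range(4))
    return [[right[0], up[0], look[0], 0.0],
            [right[1], up[1], look[1], 0.0],
            [right[2], up[2], look[2], 0.0],
            [-dot(pos, right), -dot(pos, up), -dot(pos, look), 1.0]]


# A camera standing at x=10, z=10 and looking along the positive X axis.
cam = camera_matrix(right=(0.0, 0.0, -1.0), up=(0.0, 1.0, 0.0),
                    look=(1.0, 0.0, 0.0), position=(10.0, 0.0, 10.0))
view = view_from_camera(cam)

# A world point five units straight ahead of the camera ends up on the
# positive Z axis of camera space.
in_camera_space = transform_point((15.0, 0.0, 10.0), view)

# Camera matrix times view matrix gives the identity matrix back.
round_trip = mat_mul(cam, view)
```

For a camera matrix that also contains scaling, a general 4x4 inverse would be needed instead of this shortcut.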

Now, you may think that with the world and view transform matrices applied you can already present your object on your screen, which is true, but there would be no perspective. The view matrix only transforms the objects into the camera’s space. What we additionally need is the:

Projection Matrix

The **projection matrix** is based on the near and far view distances, the camera’s angle of view and the aspect ratio of your screen resolution. When you create a camera, you just feed it this data, and everything is a go. Please consult your platform’s user manual to find the command for creating it.

On the left is our object in world and view space, in which case we have no perspective: all the objects would be projected to the screen in a parallel fashion. On the right side, we introduce the projection matrix, which takes care of our perspective: the closer the object, the larger it becomes, and vice versa. The X and the Y axes (Y is not visible here, for the sake of simplification) become the surface of the computer display, and the Z axis is used to modify the size of the object. The white lines are the left and the right sides of the display, the near clipping plane is the closest and the far clipping plane is the furthest plane limiting our projection space.
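
To make the effect concrete, here is a small Python sketch of one common perspective layout: left-handed, row-vector, with depth mapped to the range 0 to 1 between the near and far planes. That choice of convention is an assumption, and other platforms use different ranges and signs. The helper names `perspective_fov_lh` and `project` are made up for this example.

```python
import math


def perspective_fov_lh(fov_y, aspect, z_near, z_far):
    """Perspective matrix from the vertical angle of view (radians), the
    width/height ratio of the screen and the near/far distances."""
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    x_scale = y_scale / aspect
    q = z_far / (z_far - z_near)
    return [[x_scale, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, q, 1.0],            # copies the view-space Z into w
            [0.0, 0.0, -z_near * q, 0.0]]


def project(point, matrix):
    """Multiply the row [x, y, z, 1] with the matrix, then divide by w.
    The divide is what makes distant objects smaller."""
    row = [point[0], point[1], point[2], 1.0]
    x, y, z, w = (sum(row[k] * matrix[k][c] for k in range(4)) for c in range(4))
    return (x / w, y / w, z / w)


projection = perspective_fov_lh(math.radians(90.0), 16.0 / 9.0, 1.0, 100.0)

# Two camera-space points with the same X offset at different distances.
close_point = project((1.0, 0.0, 2.0), projection)
distant_point = project((1.0, 0.0, 8.0), projection)
# close_point lands further from the screen centre than distant_point.

# Points on the near and far clipping planes get depth 0 and 1 respectively.
depth_at_near = project((0.0, 0.0, 1.0), projection)[2]
depth_at_far = project((0.0, 0.0, 100.0), projection)[2]
```

The X and Y results are in a normalized range around the screen centre; turning them into pixels is a separate viewport step that depends on the platform.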

Conclusion

In order to display an object in 3d, we concluded that we need three matrices: *world*, *view* and *projection*. Another handy thing is that we can multiply all these matrices just once and create a combined World-View-Projection matrix. Now we only have to multiply the vertexes of the object with this combined matrix, and we already have a position on the screen! Simple!

The matrix is produced by:

matrix4x4 World_View_Projection_Matrix = *World* **x** *View* **x** *Projection*

**It is important to keep that order of multiplication**, otherwise it will get messy. Here x means matrix multiplication. Refer to your platform documentation for details about matrix operators.
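
A short Python sketch can show both claims: the combined matrix gives the same result as applying the three matrices one by one, and a different multiplication order gives a different result. It assumes the row-vector, left-handed layout, and every helper name in it is made up for this illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices (a x b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(row, matrix):
    """Multiply a 4-component row [x, y, z, w] with a 4x4 matrix."""
    return [sum(row[k] * matrix[k][c] for k in range(4)) for c in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


def perspective_fov_lh(fov_y, aspect, z_near, z_far):
    y_scale = 1.0 / math.tan(fov_y / 2.0)
    q = z_far / (z_far - z_near)
    return [[y_scale / aspect, 0.0, 0.0, 0.0],
            [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, q, 1.0],
            [0.0, 0.0, -z_near * q, 0.0]]


world = translation(10.0, 0.0, 10.0)       # object placed at x=10, z=10
view = translation(0.0, 0.0, 5.0)          # inverse of a camera at z=-5, default axes
projection = perspective_fov_lh(math.radians(90.0), 1.0, 1.0, 100.0)

# Multiply the three matrices once...
wvp = mat_mul(mat_mul(world, view), projection)

# ...then every vertex needs only a single multiplication.
vertex = [1.0, 0.0, 0.0, 1.0]
combined = transform(vertex, wvp)

# Same result as applying world, view and projection one after another.
step_by_step = transform(transform(transform(vertex, world), view), projection)

# Reversing the order produces a different, useless result.
wrong_order = transform(vertex, mat_mul(mat_mul(projection, view), world))
```

The saving grows with the vertex count: the two matrix multiplications happen once per object, while the single vertex multiplication happens once per vertex.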

To use vertex/geometry/pixel shaders, one needs to understand the subject presented here, and that is what gives you all the power in computer graphics: you can customize and enhance it further!

Happy coding!