World, View and Projection Matrix Unveiled
To understand how geometry is displayed on your computer screen, be it 2d or 3d, one needs to understand the math people came up with to simplify our lives.
Basically, every computer program that displays some kind of geometry/object on your screen probably uses these tricks. In 3d, without them, we would still have only very simple graphics.
We shall not write tens of pages explaining how this math actually works; instead, we shall present it visually. All modern graphics hardware, game engines, DirectX and OpenGL already have built-in helper functions for working with matrices. Once you understand what these matrices are, you will simply use those functions to get the desired effect. You can also check out the matrix examples here on this page for an easier understanding.
Note: all drawings use the X and Z axes, as if we were looking at the scene from above. 2d presentations are used for the sake of simplicity.
After wasting a big part of the Amazon forests and a substantial amount of graphite (I believe some coffee was also involved in the process), mathematicians came up with a simple, handy, yet genius formula for effectively enslaving all kinds of objects into simulated computer worlds. The purpose was to move objects around, rotate them, stretch them, and still be able to do that very fast, even if the objects consist of tens or hundreds of thousands of vertexes. The project was a big success, and the solution is shared for all humanity to benefit from and enjoy. Behold! The 4×4 homogeneous matrix:
As you see, it consists of 16 floating point (real) numbers. Every section of it serves a different purpose. Data filled into the corresponding fields creates a matrix that you may use on any object. Don’t worry, you don’t have to fill in the data yourself. All the platforms have ways of creating matrices.
The point is, when you multiply a 3d point in space by a matrix, the result is that vertex transformed by the values of the matrix.
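To make that multiplication less abstract, here is a minimal sketch in plain Python. It assumes the row-vector convention (the point is written as [x, y, z, 1] and the translation sits in the bottom row of the matrix); your platform may use the transposed layout, so treat the names and the layout here as illustrative assumptions only.

```python
def transform_point(point, matrix):
    """Multiply a 3d point, written as the row vector [x, y, z, 1], by a 4x4 matrix."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]
    return (out[0], out[1], out[2])


# Hypothetical example matrix: moves everything by 10 on X and 5 on Z.
# Assumed layout: the translation lives in the bottom row.
move_matrix = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 0.0, 5.0, 1.0],
]

moved = transform_point((1.0, 2.0, 3.0), move_matrix)
print(moved)  # (11.0, 2.0, 8.0)
```

The same function works for any 4×4 matrix: only the 16 numbers change, not the code that applies them.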
While explaining this, I will mention the identity matrix. It is filled with default data, so when you multiply vertexes with this matrix, nothing will happen. Still, this is important, because if you need a transformation matrix, you should first create an identity matrix (or have one filled with identity values), and then scale, rotate and move it. It also depends on what platform you are using: some automatically fill in the identity values, some do not. Check your documentation.
The identity matrix looks like this:
It may confuse you that the rotation part is filled diagonally with 1's and the rest are zeros. The reason is that the rotation part holds the directions of the 3 axes that will rotate your input accordingly: row one is the X axis, row two is the Y axis and row three is the Z axis. In the given example, row one points in the positive X direction, row two in the positive Y direction (up), and row three in the direction we are looking at. These match the axes of your scene, so no rotation is applied. The other fields are zero, because we do not modify anything.
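The "rows are axes" idea can be tried out with a few lines of Python. This is only a sketch under the same row-vector assumption as the description above; the helper names are made up for this example, and the direction of a positive rotation depends on your platform's handedness.

```python
import math

IDENTITY = [
    [1.0, 0.0, 0.0, 0.0],  # row one: X axis points to +X
    [0.0, 1.0, 0.0, 0.0],  # row two: Y axis points to +Y (up)
    [0.0, 0.0, 1.0, 0.0],  # row three: Z axis points to +Z (look direction)
    [0.0, 0.0, 0.0, 1.0],  # last row: no movement
]


def transform_point(point, matrix):
    """Multiply the row vector [x, y, z, 1] by a 4x4 matrix."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]
    return (out[0], out[1], out[2])


def rotation_about_y(degrees):
    """Rotation about the Y axis: rows one and three hold the turned X and Z axes."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [
        [c, 0.0, -s, 0.0],     # where the X axis ends up
        [0.0, 1.0, 0.0, 0.0],  # the Y axis is untouched
        [s, 0.0, c, 0.0],      # where the Z axis ends up
        [0.0, 0.0, 0.0, 1.0],
    ]


unchanged = transform_point((3.0, 4.0, 5.0), IDENTITY)
turned = transform_point((1.0, 0.0, 0.0), rotation_about_y(90.0))
print(unchanged)  # (3.0, 4.0, 5.0)
print(turned)     # roughly (0, 0, -1), up to floating point noise
```

Note that the point (1, 0, 0) lands exactly where row one of the rotation matrix points, which is the whole trick.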
If you are using a commercial game engine, a loaded or newly created object usually already receives the matrix of its parent, so you do not have to think about these a lot. If you are programming in DirectX or OpenGL, or you are creating your own graphics engine, you need to handle the matrices yourself.
When you create an object in your favorite 3d art creation program, the object itself is in object space. If you created it at coordinates e.g. x=10, z=10, these numbers are put in the object’s world transformation matrix. The object itself is not moved. Now, if you rotate it clockwise by 30 degrees, that rotation, you guessed it, will also be put in the object’s world transformation matrix. I probably need not say that scaling is handled the same way. Before being displayed, the created object is multiplied with its transformation matrix (also known as the world matrix) and then presented.
To make this clearer, let’s take a look at the next picture:
What I just explained is VERY handy! Every transformation you do while creating the object, or later in your 3d engine, is stored in the world transformation matrix, and the object itself never moves. This is far better than moving/rotating/stretching every point of the object on the scene, especially if it has a lot of vertexes. Also, since the precision of the floating point values used in 3d math is not perfect, after a lot of operations the object might (would) become distorted, as errors are introduced with every operation you do. This never happens with the world transform matrix, as the object never actually moves. And, yes, there is no spoon.
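Here is a rough Python sketch of that idea, using the x=10, z=10 placement and the 30 degree rotation from the text. The helper names, the scale factor of 2 and the scale, rotate, move order are assumptions made for the example (row-vector convention again); your platform's helpers will look different.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(point, matrix):
    """Multiply the row vector [x, y, z, 1] by a 4x4 matrix."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]
    return (out[0], out[1], out[2])


def scaling(sx, sy, sz):
    return [[sx, 0.0, 0.0, 0.0], [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0], [0.0, 0.0, 0.0, 1.0]]


def rotation_about_y(degrees):
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, -s, 0.0], [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


# The vertexes stay in object space forever; only the matrix changes.
object_vertices = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

# Start from scale, then rotate, then move (the order described earlier).
world = mat_mul(mat_mul(scaling(2.0, 2.0, 2.0), rotation_about_y(30.0)),
                translation(10.0, 0.0, 10.0))

# Computed only when it is time to display the object.
world_vertices = [transform_point(v, world) for v in object_vertices]
print(world[3])           # the bottom row holds the placement: x=10, z=10
print(world_vertices[0])  # where the first vertex appears in the world
```

Moving the object again means changing a few numbers in `world`; the list `object_vertices` is never rewritten, so rounding errors cannot pile up in it.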
The view matrix transforms the entire world space into camera space. Just as the world matrix puts the object from object space into world space, the view matrix transforms the space so that the view space’s right points in the X direction, its top points in the Y direction, and the look direction becomes the positive Z axis (if it’s a left-handed matrix).
I’ll add that the view matrix is actually the inverse of the camera matrix, because it transforms the entire world back into the camera’s local coordinate system.
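The "inverse of the camera matrix" remark can be checked with a small Python sketch. It assumes a camera that is only rotated about Y and moved (no scaling), in which case the inverse can be built by transposing the rotation part and negating the rotated position. The function names are invented for this example and are not any platform's API.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(point, matrix):
    """Multiply the row vector [x, y, z, 1] by a 4x4 matrix."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]
    return (out[0], out[1], out[2])


def camera_matrix(position, yaw_degrees):
    """Place a camera in the world: turn it about Y, then move it to position."""
    a = math.radians(yaw_degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, -s, 0.0], [0.0, 1.0, 0.0, 0.0], [s, 0.0, c, 0.0],
            [position[0], position[1], position[2], 1.0]]


def rigid_inverse(m):
    """Invert a rotation-plus-translation matrix (assumes no scaling)."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]  # transposed rotation
    t = m[3][:3]
    new_t = [-sum(t[k] * rot_t[k][j] for k in range(3)) for j in range(3)]
    return [rot_t[0] + [0.0], rot_t[1] + [0.0], rot_t[2] + [0.0], new_t + [1.0]]


camera = camera_matrix((0.0, 0.0, -5.0), 0.0)  # camera 5 units behind the origin
view = rigid_inverse(camera)

# The world origin, seen from the camera, sits about 5 units ahead on +Z.
in_view_space = transform_point((0.0, 0.0, 0.0), view)
print(in_view_space)

# Camera matrix followed by view matrix cancels out to the identity.
round_trip = mat_mul(camera_matrix((3.0, 1.0, -5.0), 40.0),
                     rigid_inverse(camera_matrix((3.0, 1.0, -5.0), 40.0)))
```

In practice you would normally call your platform's look-at helper rather than inverting by hand; the sketch is only meant to show why the two matrices undo each other.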
Now, you may think that with the world and view transform matrices applied, you can already present your object on your screen. That is true, but there would be no perspective. The view matrix only transforms the objects into the camera’s space. What we additionally need is the projection matrix.
The projection matrix is based on the near and far view distances, the camera’s angle of view and the aspect ratio of your screen resolution. When you create a camera, you just feed it this data, and everything is a go. Please consult your platform’s user manual to find the command for creating it.
On the left is our object in world and view space, in which case we have no perspective: all the objects would be projected onto the screen in a parallel fashion. On the right side, we introduce the projection matrix, which takes care of our perspective: the closer the object, the larger it becomes, and vice versa. The X and Y axes (Y is not visible here, for the sake of simplicity) become the surface of the computer display, and the Z axis is used to modify the size of the object. The white lines are the left and right sides of the display, near clipping is the closest plane and far clipping is the furthest plane limiting our projection space.
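To see the "closer means larger" effect in numbers, here is a hedged Python sketch of one common perspective layout, modelled on the left-handed, row-vector form used by Direct3D-style helpers. The function names and the 90 degree / 16:9 / 1..100 settings are assumptions picked for the example; OpenGL-style matrices differ in layout and depth range.

```python
import math


def perspective_lh(fov_y_degrees, aspect, z_near, z_far):
    """Left-handed perspective matrix for row vectors (assumed layout)."""
    y_scale = 1.0 / math.tan(math.radians(fov_y_degrees) / 2.0)
    x_scale = y_scale / aspect
    depth = z_far / (z_far - z_near)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, depth, 1.0],            # copies view-space Z into w
        [0.0, 0.0, -z_near * depth, 0.0],
    ]


def project(point, matrix):
    """Transform a view-space point and divide by w (the perspective divide)."""
    row = [point[0], point[1], point[2], 1.0]
    x, y, z, w = (sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4))
    return (x / w, y / w, z / w)


projection = perspective_lh(90.0, 16.0 / 9.0, 1.0, 100.0)

# Two points with the same X offset, one close to the camera and one far away.
near_point = project((1.0, 0.0, 2.0), projection)
far_point = project((1.0, 0.0, 10.0), projection)
print(near_point[0], far_point[0])  # the closer point lands further from the center

# Depth runs from about 0 at the near plane to about 1 at the far plane.
depth_at_near = project((0.0, 0.0, 1.0), projection)[2]
depth_at_far = project((0.0, 0.0, 100.0), projection)[2]
```

The division by w is what actually shrinks distant objects; the matrix only prepares the numbers so that w ends up holding the distance from the camera.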
In order to display an object in 3d, we concluded that we need to have three matrices: world, view and projection. Another handy thing is that we can multiply all these matrices together only once and create a combined World-View-Projection matrix. Now we only have to multiply the vertexes of the object with this combined matrix, and we already have a position on the screen! Simple!
The matrix is produced by:
matrix4x4 World_View_Projection_Matrix = World_Matrix x View_Matrix x Projection_Matrix
It is important to keep that order of multiplication, otherwise it will get messy.
In this case x means matrix multiply. Refer to your platform documentation for details about matrix operators.
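The following Python sketch shows the combined matrix at work, and what "messy" means when the order is swapped. It assumes the row-vector convention (vertex times world times view times projection); the matrices and helper names are made up for the example, and column-vector platforms write the same product in the opposite order.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform(row, matrix):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(row[k] * matrix[k][j] for k in range(4)) for j in range(4)]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


def perspective_lh(fov_y_degrees, aspect, z_near, z_far):
    """Left-handed perspective matrix for row vectors (assumed layout)."""
    y_scale = 1.0 / math.tan(math.radians(fov_y_degrees) / 2.0)
    depth = z_far / (z_far - z_near)
    return [[y_scale / aspect, 0.0, 0.0, 0.0], [0.0, y_scale, 0.0, 0.0],
            [0.0, 0.0, depth, 1.0], [0.0, 0.0, -z_near * depth, 0.0]]


world = translation(10.0, 0.0, 10.0)  # object placed at x=10, z=10
view = translation(0.0, 0.0, 5.0)     # inverse of a camera sitting at z=-5
projection = perspective_lh(90.0, 16.0 / 9.0, 1.0, 100.0)

# Multiply the three matrices once...
world_view_projection = mat_mul(mat_mul(world, view), projection)

# ...then every vertex needs only one multiplication.
vertex = [1.0, 2.0, 0.0, 1.0]
combined = transform(vertex, world_view_projection)
step_by_step = transform(transform(transform(vertex, world), view), projection)

# Swapping the order gives a different (and useless) result.
wrong_order = transform(vertex, mat_mul(mat_mul(projection, view), world))

x, y, z, w = combined
screen = (x / w, y / w, z / w)  # perspective divide
print(screen)
```

With thousands of vertexes per object, doing the two matrix-matrix multiplications once per object instead of three vertex multiplications per vertex is where the savings come from.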
To use vertex/geometry/pixel shaders, one needs to understand the subject presented here. That understanding is what gives you all the power in computer graphics: you can customize and enhance it further!