Wednesday, April 25, 2012


World, View and Projection Matrix Unveiled

To understand how geometry is displayed on your computer screen, be it 2D or 3D, one needs to understand the math people came up with to simplify our lives.

Basically, every computer program that displays some kind of geometry or object on your screen probably uses these tricks. Without them, 3D graphics would still be very simple.
We will not write tens of pages explaining how this math actually works; instead, we will present it visually. Modern graphics hardware, game engines, DirectX and OpenGL all provide helper functions for working with matrices. Once you understand what the matrices are, you simply use these functions to get the desired effect. You can also check out the matrix examples on this page for an easier understanding.
Note: all drawings use the X and Z axes, as if we were looking at the scene from above. The 2D presentation is used for the sake of simplicity.
The Matrix

After wasting a big part of the Amazon forests and a substantial amount of graphite (I believe some coffee was also involved in the process), mathematicians came up with a simple, handy, yet ingenious formula for effectively enslaving all kinds of objects in simulated computer worlds. The purpose was to move objects around, rotate them and stretch them, and still be able to do it very fast, even if the objects consist of tens or hundreds of thousands of vertices. The project was a big success, and the solution is shared for all humanity to benefit from and enjoy. Behold! The 4×4 homogeneous matrix:
The Matrix 4x4 presentation
As you can see, it consists of 16 floating-point (real) numbers. Each section of it serves a different purpose. Filling the corresponding fields with data creates a matrix that you can apply to any object. Don't worry, you don't have to fill in the data yourself: all platforms have ways of creating matrices.
The point is, when you multiply a 3D point in space (a vertex) by a matrix, the result is the vertex transformed by the values of the matrix.
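As a minimal sketch of that multiplication, the snippet below transforms one point by a 4×4 matrix. It assumes the row-vector layout (the point is a row multiplied from the left, and the translation sits in the fourth row), which matches the row-based description used in this article; other platforms may use the transposed layout. The function name `transform_point` and the sample values are made up for illustration.

```python
def transform_point(point, matrix):
    """Multiply the row vector (x, y, z, 1) by a 4x4 matrix; return (x, y, z)."""
    x, y, z = point
    v = (x, y, z, 1.0)  # the extra 1 lets the fourth row act as a translation
    out = [sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    return (out[0], out[1], out[2])


# Hypothetical translation by (10, 0, 10); in this layout the offsets sit in row four.
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 0.0, 10.0, 1.0],
]

moved = transform_point((1.0, 2.0, 3.0), translate)
print(moved)  # (11.0, 2.0, 13.0)
```

In practice you would call your platform's own matrix and vector types instead of nested lists; the arithmetic is the same.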
While explaining this, I will mention the identity matrix. It is filled with default data, so when you multiply vertices by this matrix, nothing happens. Still, it is important, because if you need a transform matrix, you should first create an identity matrix (or have one filled with identity values), and then scale, rotate and move it. It also depends on which platform you are using: some fill in the identity values automatically, some do not. Check your documentation.
The identity matrix looks like this:
The Identity Matrix
It may confuse you that the rotation part is filled with 1s on the diagonal and zeros elsewhere. The reason is that the rotation part holds the directions of the three axes that will rotate your input: row one is the X axis, row two is the Y axis and row three is the Z axis. In the given example, row one points in the positive X direction, row two in the positive Y direction (up), and row three in the direction we are looking: these match the axes of your scene, so no rotation is applied. The other fields are zero (apart from the 1 in the bottom-right corner) because we do not move or otherwise modify anything.
If you are using a commercial game engine, a loaded or created object usually already receives the matrix of its parent, so you do not have to think about these much. If you are programming in DirectX or OpenGL, or creating your own graphics engine, you need to handle the matrices yourself.
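The "rows are axes" idea can be checked with a small sketch. It assumes the same row-vector layout as the article's description; `matrix_from_axes` and `transform_point` are hypothetical helper names, not functions from any particular graphics API.

```python
def transform_point(point, matrix):
    """Multiply the row vector (x, y, z, 1) by a 4x4 matrix; return (x, y, z)."""
    x, y, z = point
    v = (x, y, z, 1.0)
    out = [sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    return (out[0], out[1], out[2])


def matrix_from_axes(x_axis, y_axis, z_axis):
    """Build a rotation-only matrix whose first three rows are the given axes."""
    return [
        list(x_axis) + [0.0],
        list(y_axis) + [0.0],
        list(z_axis) + [0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


p = (2.0, 3.0, 4.0)

# Axes that match the scene axes give the identity matrix: nothing changes.
identity = matrix_from_axes((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
unchanged = transform_point(p, identity)

# A quarter turn about Y: the X axis now points along -Z, the Z axis along +X.
turned = matrix_from_axes((0.0, 0.0, -1.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
rotated = transform_point(p, turned)

print(unchanged)  # (2.0, 3.0, 4.0)
print(rotated)    # (4.0, 3.0, -2.0)
```

Feeding the point (1, 0, 0) through `turned` returns exactly its first row, which is what "row one is the X axis" means.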
World Matrix

When you create an object in your favorite 3D art creation program, the object itself is in object space. If you place it at coordinates such as x=10, z=10, these numbers are stored in the object's world transformation matrix. The object itself is never moved. Now, if you rotate it clockwise by 30 degrees, that rotation, you guessed it, is also stored in the object's transformation matrix. It probably goes without saying that scaling is handled the same way. Before being displayed, the object is multiplied by its transformation matrix (also known as the world matrix) and then presented.
To make this clearer, let's take a look at the next picture:
Object and World Transform/Space
What I just explained is VERY handy! Every transformation you apply while creating the object, or later in your 3D engine, goes into the world transformation matrix, and the object itself never moves. This is far better than moving, rotating or stretching every point of the object in the scene, especially if it has a lot of vertices. Also, because the precision of the floating-point values used in 3D math is not perfect, every operation applied directly to the vertices would introduce a small error, and after many operations the object would become distorted. This does not happen to the object's vertices with the world transform matrix, as the object never actually moves. And yes, there is no spoon. :)
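Here is a rough sketch of a world matrix built from a scale, a rotation and a move, then applied to vertices that stay untouched in object space. It assumes the row-vector layout (so the order is scale, then rotate, then translate, read left to right) and a rotation about the Y axis; whether a positive angle looks clockwise from above depends on your platform's handedness. All helper names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform_point(point, matrix):
    """Multiply the row vector (x, y, z, 1) by a 4x4 matrix; return (x, y, z)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    return (out[0], out[1], out[2])


def scaling(s):
    return [[s, 0.0, 0.0, 0.0], [0.0, s, 0.0, 0.0],
            [0.0, 0.0, s, 0.0], [0.0, 0.0, 0.0, 1.0]]


def rotation_y(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0.0, -s, 0.0], [0.0, 1.0, 0.0, 0.0],
            [s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]


def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]


# Object-space vertices: these are never modified.
object_vertices = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

# World matrix: double the size, turn 30 degrees about Y, place at x=10, z=10.
world = mat_mul(mat_mul(scaling(2.0), rotation_y(30.0)),
                translation(10.0, 0.0, 10.0))

# World-space positions are computed on demand from the untouched originals.
world_vertices = [transform_point(v, world) for v in object_vertices]
print(world_vertices)
```

Changing the object's placement later only means rebuilding `world`; `object_vertices` keeps its original, exact values.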
View Matrix

The view matrix is also known as the camera matrix. It transforms the entire world space into camera space. Just as the world matrix takes the object from object space to world space, the view matrix transforms the space so that the camera's right direction points along the X axis, its top points along the Y axis, and its look direction becomes the positive Z axis (if it's a left-handed matrix).
Object Space and World Space
I'll add that the view matrix is actually the inverse of the camera matrix, because it transforms the entire world back into the camera's local coordinate system.
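The inverse relationship can be illustrated with a short sketch. It assumes a left-handed, row-vector "look at" construction similar to what Direct3D-style helpers produce; the function names and the camera placement are made up for the example.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)


def camera_axes(eye, target, up):
    forward = normalize(sub(target, eye))   # look direction, becomes +Z
    right = normalize(cross(up, forward))   # becomes +X
    top = cross(forward, right)             # becomes +Y
    return right, top, forward


def camera_matrix(eye, target, up):
    """The camera's own world matrix: rows are its axes, then its position."""
    r, t, f = camera_axes(eye, target, up)
    return [list(r) + [0.0], list(t) + [0.0], list(f) + [0.0], list(eye) + [1.0]]


def view_matrix(eye, target, up):
    """Inverse of camera_matrix: transposed axes plus the negated, rotated position."""
    r, t, f = camera_axes(eye, target, up)
    return [[r[0], t[0], f[0], 0.0],
            [r[1], t[1], f[1], 0.0],
            [r[2], t[2], f[2], 0.0],
            [-dot(r, eye), -dot(t, eye), -dot(f, eye), 1.0]]


def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform_point(point, matrix):
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    return (out[0], out[1], out[2])


# Hypothetical camera standing at x=10 and looking at the world origin.
eye, target, up = (10.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
view = view_matrix(eye, target, up)

origin_in_view = transform_point(target, view)   # 10 units ahead on +Z
eye_in_view = transform_point(eye, view)         # the camera sits at the origin
should_be_identity = mat_mul(camera_matrix(eye, target, up), view)
print(origin_in_view, eye_in_view)
```

Multiplying the camera matrix by the view matrix gives the identity matrix (up to rounding), which is what "inverse" means here.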
Now, you may think that with the world and view matrices applied, you can already present your object on the screen. That is true, but there would be no perspective. The view matrix only transforms the objects into the camera's space. What we additionally need is the:
Projection Matrix

The projection matrix is based on the near and far view distances, the camera's angle of view and the aspect ratio of your screen resolution. When you create a camera, you just feed it this data, and everything is good to go. Please consult your platform's user manual for the command that creates it.
View and Projection Matrix
On the left is our object in world and view space, in which case we have no perspective: all objects would be projected onto the screen in parallel. On the right, we introduce the projection matrix, which takes care of perspective: the closer the object, the larger it becomes, and vice versa. The X and Y axes (Y is not visible here, for the sake of simplicity) become the surface of the computer display, and the Z axis is used to modify the size of the object. The white lines are the left and right sides of the display, the near clipping plane is the closest plane and the far clipping plane is the furthest plane limiting our projection space.
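To make the perspective effect concrete, here is a sketch of one common form of the projection matrix. It assumes a left-handed, row-vector layout with depth mapped to the range 0 to 1 (the Direct3D-style convention; OpenGL-style matrices differ), and the division by w after the multiplication is what produces the "closer is larger" effect. The function names and sample values are illustrative only.

```python
import math


def perspective_fov_lh(fov_y_degrees, aspect, z_near, z_far):
    """Left-handed perspective matrix from view angle, aspect ratio and clip planes."""
    y_scale = 1.0 / math.tan(math.radians(fov_y_degrees) / 2.0)
    x_scale = y_scale / aspect
    q = z_far / (z_far - z_near)
    return [
        [x_scale, 0.0, 0.0, 0.0],
        [0.0, y_scale, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],            # the 1 copies view-space z into w
        [0.0, 0.0, -z_near * q, 0.0],
    ]


def project(point, matrix):
    """Transform a view-space point and apply the perspective divide by w."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4)]
    w = out[3]
    return (out[0] / w, out[1] / w, out[2] / w)


# Hypothetical camera: 90 degree vertical view angle, 16:9 screen, clip planes 1 and 100.
projection = perspective_fov_lh(90.0, 16.0 / 9.0, 1.0, 100.0)

# Same offset from the view axis, different distances from the camera.
near_point = project((1.0, 1.0, 2.0), projection)
far_point = project((1.0, 1.0, 20.0), projection)

# Points on the clipping planes map to depth 0 and depth 1 respectively.
on_near_plane = project((0.0, 0.0, 1.0), projection)
on_far_plane = project((0.0, 0.0, 100.0), projection)

print(near_point, far_point)
```

The nearer point lands further from the center of the screen than the distant one, so an object made of such points appears larger when it is close.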
Conclusion
To display an object in 3D, we concluded that we need three matrices: world, view and projection. Another handy thing is that we can multiply all these matrices just once and create a combined World-View-Projection matrix. Now we only have to multiply the object's vertices by this combined matrix, and we already have a position on the screen! Simple!
The matrix is produced by:
matrix4x4 World_View_Projection_Matrix = World x View x Projection
It is important to keep that order of multiplication, otherwise it will get messy. :) In this case, x means matrix multiplication. Refer to your platform documentation for details about matrix operators.
To use vertex, geometry or pixel shaders, one needs to understand the subject presented here. That is what gives you all the power in computer graphics: you can customize and enhance it further!
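The following sketch checks both claims on a single vertex: the combined matrix gives the same result as applying world, view and projection one after another, and reversing the order gives something else. It assumes the row-vector layout, where the matrices are applied left to right; the three sample matrices are hypothetical stand-ins, not output from a real engine.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform(v, matrix):
    """Multiply a homogeneous row vector (x, y, z, w) by a 4x4 matrix."""
    return tuple(sum(v[k] * matrix[k][col] for k in range(4)) for col in range(4))


# World: scale by 2, then move 3 units along X.
world = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0],
         [0.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 1.0]]
# View: camera 5 units behind the origin, so the world shifts +5 along Z.
view = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 5.0, 1.0]]
# Projection: simple perspective with near plane 1 and far plane 100.
q = 100.0 / 99.0
projection = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, q, 1.0], [0.0, 0.0, -q, 0.0]]

vertex = (1.0, 1.0, 0.0, 1.0)

# One matrix multiply per step, repeated for every vertex...
step_by_step = transform(transform(transform(vertex, world), view), projection)

# ...or combine the matrices once and do a single multiply per vertex.
world_view_projection = mat_mul(mat_mul(world, view), projection)
combined = transform(vertex, world_view_projection)

# Reversing the order produces a different (wrong) matrix.
wrong_order = transform(vertex, mat_mul(mat_mul(projection, view), world))

print(combined)
print(wrong_order)
```

With three matrices the saving is two matrix-vector multiplies per vertex, which adds up quickly for objects with many vertices.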
Happy coding!

AAM

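The idea behind AAM can be traced back to the snake method proposed by Kass et al. in 1987, which was mainly used for boundary detection and image segmentation. The method uses a continuous closed curve made up of n control points as the snake model, and an energy function as the measure of how well the model matches. The model is first placed around the estimated position of the target object, and the energy function is then minimized through repeated iteration; when the internal and external energies reach equilibrium, the boundary and features of the target object are obtained. In 1989, Yuille et al. proposed using parameterized deformable templates in place of the snake model, and the concept of the deformable template laid the theoretical foundation for AAM. The ASM algorithm proposed by Cootes et al. in 1995 is the direct predecessor of AAM. ASM uses parameterized sampled shapes to build the object's shape model, uses PCA to build a motion model of the control points that describe the shape, and finally uses a set of parameters to control the positions of the shape's control points so as to approximate the shape of the current object. Because this method relies on the object's shape alone, its accuracy is not high. In 1998, Cootes et al. first proposed AAM on the basis of the ASM algorithm. Unlike ASM, it uses not only the object's shape information but also its texture information.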