C. Vectors and Matrices
Vectors
Before I go into what a vector is, I’ll first tell you what it isn’t. Generally, you divide physical quantities into scalars and vectors. A scalar gives the magnitude of a quantity. It’s a single number, like the ones you use every day. Mass, energy and volume are examples of scalars. A vector is something with both a magnitude and a direction, and is usually represented by multiple numbers: one for every dimension. Position, momentum and force are prime examples. Also, note that velocity is a vector, while speed is not. 50 kph is not a vector; 50 kph down Highway 60 is. A vector is written as a bold character, usually lowercase, and either as a set of numbers enclosed by parentheses, u = (1, 4, 9), or as an M×1 column. And yes, I do mean a column, not a row; we’ll see why when we get to matrices.
(C.1)  $$\mathbf{u} \equiv \left[\begin{array}{c} u_1 \\ \vdots \\ u_m \end{array}\right] \equiv (u_1, \cdots, u_m) \not\equiv \left[\begin{array}{ccc} u_1 & \cdots & u_m \end{array}\right]$$
Fig C.1: the difference between vectors and points.
If you have a coordinate system, vectors are usually used to represent a spatial point in that system, with the vectors’ elements as the coordinates. However, there is a crucial difference between points and vectors. Points are always related to an origin, while vectors are independent of any origin. Fig C.1 on the right illustrates this. You have points P and Q, and vectors u, v, w. Vectors u and v are equal (they have equal lengths and directions). However, while u and the point it points to (P) have the same coordinates, this isn’t true for v and Q. In fact, Q = u + w. And, to be even more precise, Q = O + u + w, which explicitly states the origin (O) in the equation.
Vector operations
Vector operations are similar to scalar operations, but the multidimensionality does add some complications, especially in the case of multiplications. Note that there are no less than three ways of vector multiplication, so pay attention. On the right you can see examples of vector addition and scalar-vector multiplication, with u = (8, 3), v = (4, 4). With the definitions of the operations given below, you should be able to find the other vectors.
Vector-vector addition and subtraction
Fig C.2: vector addition and scalar-vector multiplication.
When it comes to addition and subtraction, both operands must be M-dimensional vectors. The result is another vector, also M-dimensional, whose elements are the sums or differences of the operands’ elements: with w = u + v we have w_{i} = u_{i} + v_{i}.
(C.2)  $$\mathbf{w} = \mathbf{u} + \mathbf{v} \equiv \left[\begin{array}{c} u_1 + v_1 \\ \vdots \\ u_m + v_m \end{array}\right]$$
Scalar-vector multiplication
This is the first of the vector multiplications. If you have a scalar c and a vector u, the elements of the resultant vector after scalar-vector multiplication are the original elements, each multiplied by the scalar. So if v = c·u, then v_{i} = c·u_{i}. Note that u and v lie on the same line – only the length is different. Also, note that subtraction can be written as w = u − v = u + (−1)·v.
(C.3)  $$\mathbf{v} = c\,\mathbf{u} \equiv \left[\begin{array}{c} c\cdot u_1 \\ \vdots \\ c\cdot u_m \end{array}\right]$$
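In code, both operations are one-liners per element. Below is a minimal sketch in C; the VEC2 struct and the function names (vec_add, vec_scale) are purely illustrative, not part of any particular library.

// Hypothetical 2D vector type, just for illustration.
typedef struct VEC2 { float x, y; } VEC2;

// w = u + v : element-wise addition (eq C.2).
VEC2 vec_add(VEC2 u, VEC2 v)
{
    VEC2 w = { u.x + v.x, u.y + v.y };
    return w;
}

// v = c*u : each element multiplied by the scalar (eq C.3).
VEC2 vec_scale(float c, VEC2 u)
{
    VEC2 v = { c*u.x, c*u.y };
    return v;
}

// Subtraction, written as u + (-1)*v:
//   VEC2 w = vec_add(u, vec_scale(-1.0f, v));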
The dot product (aka scalar product)
The second vector multiplication is the dot product, which has two vectors as input, but a scalar as its output. The notation for this is c = u · v, where u and v are vectors and c is the resultant scalar. Note that the operator is written as a dot, which gives this type of multiplication its name. To take the dot product, multiply the elements of both vectors pairwise and add them all together. In other words:
(C.4)  $$c = \mathbf{u}\cdot\mathbf{v} = \sum_i u_i\cdot v_i = u_1\cdot v_1 + \cdots + u_m\cdot v_m$$
Now, this may seem like a silly operation to have, but it’s actually very useful. For one thing, the length of a vector is calculated via the dot product with itself. But you can also find the projection of one vector onto another with the dot product, which is invaluable when you try to decompose vectors in terms of other vectors or determine the base vectors of an M-dimensional space (do what to the whaaat?!? Don’t worry, I’ll explain later). One of the most common uses of the dot product is finding the angle between two vectors. If you have vectors u and v, with ‖u‖ and ‖v‖ their lengths and α the angle between the two, the cosine can be found via
Fig C.3: dot product.
(C.5)  $$\cos(\alpha) = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$$
Why does this work? Well, you can prove it in a number of ways, but here’s the most elegant (thanks Ash for reminding me). Remember that the square of the length of a vector is given by the dot product with itself. This means that ‖v − u‖^{2} = ‖v‖^{2} + ‖u‖^{2} − 2·u·v. From the cosine rule for the triangle in fig C.3, we also have ‖v − u‖^{2} = ‖v‖^{2} + ‖u‖^{2} − 2·‖v‖·‖u‖·cos(α). Combined, these relations immediately result in eq C.5. And people say math is hard.
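Written out, the two relations and their combination look like this (the same argument, just in equation form):

$$\|\mathbf{v}-\mathbf{u}\|^2 = (\mathbf{v}-\mathbf{u})\cdot(\mathbf{v}-\mathbf{u}) = \|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - 2\,\mathbf{u}\cdot\mathbf{v}$$
$$\|\mathbf{v}-\mathbf{u}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{u}\|^2 - 2\,\|\mathbf{u}\|\,\|\mathbf{v}\|\cos(\alpha) \quad \text{(cosine rule)}$$
$$\Rightarrow\ \mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos(\alpha)$$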
By the way, not only can you find the angle with this, but it also provides a very simple way to see if something’s behind you or not. If u is the looking direction and v the vector to an object, u · v is negative if the angle is more than 90°. It’s also useful for field-of-view checking, and to see if vectors are perpendicular, as then u · v = 0. You also find the dot product by the truckload in physics, when you do things like force decomposition and path integrals over force to find the potential energy. Basically, every time you find a cosine in an equation in physics, it’s probably the result of a dot product.
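As a sketch of what that looks like in code, reusing the hypothetical VEC2 type from earlier (none of these names come from an actual library):

// c = u . v (eq C.4), 2D case.
float vec_dot(VEC2 u, VEC2 v)
{
    return u.x*v.x + u.y*v.y;
}

// Rough behind-me check: a negative dot product means the angle
// between the look direction and the vector to the object is over 90 degrees.
int is_behind(VEC2 look, VEC2 to_object)
{
    return vec_dot(look, to_object) < 0.0f;
}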
The cross product (aka vector product)
The cross product is a special kind of product that only works in 3D space. The cross product takes two vectors u and v and gives the vector perpendicular to both, w, as a result. The length of w is the area spanned by the two operand vectors. The notation for it is this: w = u × v, which is why it’s called the cross product. The elements of w are w_{i} = ε_{ijk}·u_{j}·v_{k} (summed over j and k), where ε_{ijk} is the Levi-Civita symbol (+1 for even permutations of i,j,k, −1 for odd permutations, and 0 if any of the indices are equal). Since you’ve probably never even seen this thing (for your sanity, keep it that way), it’s written down in full in eq C.6.
Fig C.4: cross product.
(C.6)  $$\mathbf{w} = \mathbf{u} \times \mathbf{v} \equiv \left[\begin{array}{c} u_y v_z - u_z v_y \\ u_z v_x - u_x v_z \\ u_x v_y - u_y v_x \end{array}\right]$$
In fig C.4 you can see a picture of what the cross product does; it’s a 3D picture, so you have to use your imagination a bit. Vectors u and v define a parallelogram (in yellow). The cross-product vector w is perpendicular to both of these, a fact that follows from u·w = 0 and v·w = 0. The length of w is the area A of this parallelogram, and if you remember your area calculations, you’ll realize that
(C.7)  $$A = \|\mathbf{u} \times \mathbf{v}\| = \|\mathbf{u}\|\cdot\|\mathbf{v}\|\cdot\sin(\alpha)$$
meaning that you can find the sine of the angle between two vectors with the cross product. Note that the cross product is anti-commutative! That means that u × v = −v × u. Notice the minus sign? This actually brings up a good point: for the plane defined by u and v, the normal vector to this plane is pointing up; but how do you determine what ‘up’ is? What I usually do is take a normal 3D coordinate system (like the one in the lower-right part of fig C.4), put the x-axis on u, rotate till the y-axis is along v (or closest to it), and then w will be along the z-axis. Eq C.6 has all of this sorted out already. I do need a right-handed system for this, though; a left-handed one messes up my mind so bad.
Now, when the vectors are parallel, u × v = 0, which means that w is the null vector 0. It also means that if u is your view direction, the object with vector v is dead center in your sights. However, if u is the velocity of a rocket and v is the relative vector to you, prepare to respawn. Basically, whereas the dot product tells you whether an object is in front or behind (along the tangent), the cross product gives you the offset from center (the normal). Very useful if you ever want to implement something like red shells (and by that I mean the original SMK red shells, not the wussy instant-homing shells in the later Mario Karts, booo!!). The cross product also appears abundantly in physics in things like angular momentum (L = r × p) and magnetic induction.
That was the 3D case, but the cross product is also useful in 2D. Everything works exactly the same, except that you only need the z-component of w.
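In code it could look like the sketch below, again with hypothetical vector types (a VEC3 for 3D, plus the VEC2 from before for the 2D case):

// Hypothetical 3D vector type.
typedef struct VEC3 { float x, y, z; } VEC3;

// w = u x v (eq C.6).
VEC3 vec_cross(VEC3 u, VEC3 v)
{
    VEC3 w =
    {
        u.y*v.z - u.z*v.y,
        u.z*v.x - u.x*v.z,
        u.x*v.y - u.y*v.x
    };
    return w;
}

// 2D case: only the z-component is needed. Its sign tells you
// on which side of u the vector v lies.
float vec_cross_z(VEC2 u, VEC2 v)
{
    return u.x*v.y - u.y*v.x;
}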
The norm (or length)
I’ve used this already a couple of times but never actually defined what the length of a vector is. The norm of vector u, written ‖u‖, is defined as the square root of the dot product with itself, see eq C.8. The age-old Pythagorean Theorem is just the special case for 2D.
The length or norm of a vector is a useful thing to have around. Actually, you often start with the length and use the sine and cosine to decompose the vector into x and y components. A good example of this is speed. One other thing where the length plays a role is in the creation of unit vectors, which have length 1. Many calculations require the length in some way, but if that’s 1, you won’t have to worry about it anymore. To create a unit vector, simply divide it by its length: û = u / ‖u‖.
(C.8)  $$\|\mathbf{u}\| = \sqrt{\mathbf{u}\cdot\mathbf{u}} = \left(\sum_i u_i^2\right)^{\frac{1}{2}}$$
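A sketch of the norm and unit-vector creation in C, building on the earlier hypothetical VEC2 and vec_scale (note the guard against zero-length vectors):

#include <math.h>

// ||u|| = sqrt(u . u) (eq C.8), 2D case.
float vec_length(VEC2 u)
{
    return sqrtf(u.x*u.x + u.y*u.y);
}

// u_hat = u / ||u||. Returns u unchanged if it is the null vector.
VEC2 vec_normalize(VEC2 u)
{
    float len = vec_length(u);
    if (len == 0.0f)
        return u;
    return vec_scale(1.0f/len, u);
}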
Algebraic properties of vectors
What follows is a list of algebraic properties of vectors. Most will seem obvious, but you need to see them at least once. Taken directly from my linear algebra textbook: let u, v, w be M-dimensional vectors and c and d scalars, then:
u + v = v + u  Commutativity 
(u + v) + w = u + (v + w)  Associativity 
u + 0 = 0 + u = u  
u + (−u) = −u + u = 0  where −u denotes (−1)u 
c·(u + v) = c·u + c·v  Distributivity 
(c + d)·u = c·u + d·u  Distributivity 
c·(d·u) = (c·d)·u  Associativity 
1·u = u 
And on the products:
u · (v + w) = u · v + u · w  
u · (c·v) = (c·u) · v = c·(u · v)  
u × v = −(v × u)  Anti-commutativity 
u × (v + w) = u × v + u × w  
(u + v) × w = u × w + v × w  
u × (c·v) = c·u × v = (c·u) × v  
u · (u × v) = 0  
u · (v × w) = (u × v) · w  Triple scalar product; gives the volume of the parallelepiped defined by u, v, w. 
u × (v × w) = v(u · w) − w(u · v)  Triple vector product 
Matrices
In a nutshell, a matrix is a two-dimensional grid of numbers. Matrices were initially used as shorthand for solving a system of linear equations. For example, the system in the variables x, y, z:
(C.9a)  $$\begin{array}{rl} x - 2y + z &= 0 \\ 2y - 8z &= 8 \\ -4x + 5y + 9z &= -9 \end{array}$$
can be written down more succinctly using matrices as:
(C.9b)  $$\left[\begin{array}{ccc} 1 & -2 & 1 \\ 0 & 2 & -8 \\ -4 & 5 & 9 \end{array}\right]$$
or
(C.9c)  $$\left[\begin{array}{cccc} 1 & -2 & 1 & 0 \\ 0 & 2 & -8 & 8 \\ -4 & 5 & 9 & -9 \end{array}\right]$$
Eq C.9b is called the coefficient matrix, in which only the coefficients of the variables are written down. The augmented matrix (eq C.9c) also contains the right-hand side of the system of equations. Note that the variables themselves are nowhere in sight, which is more or less the point. Mathematicians are the laziest persons in the world, and if there’s a shorthand to be exploited, they will use it. If there isn’t, they’ll make one up.
Anyway, a matrix can be divided into rows, which run horizontally, and columns, which run vertically. A matrix is indicated by its size: an M×N matrix has M rows and N columns. Note that the number of rows comes first; this is in contrast to image sizes, where the width is usually given first. Yeah I know, that sucks, but there’s not a lot I can do about that. The coefficient matrix of eq C.9b is a 3×3 matrix, and the augmented matrix of eq C.9c is 3×4. The whole matrix itself is usually indicated by a bold capital letter; the columns of a matrix are simply vectors (which were M×1 columns, remember?) and will be denoted as such, with a single index for the column number; the elements of the matrix will be indicated by a lowercase (italic) letter with a double index.
(C.10)  $$\mathbf{A} = \left[\begin{array}{ccc} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{array}\right] \equiv \left[\begin{array}{ccc} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{array}\right]$$
Most computer languages also have the concept of matrices, only they don’t always agree on how the things are ordered. Indexing in Visual Basic and C, for example, is row-based, just like eq C.10 is. Fortran, on the other hand, is column-based, so the indices need to be reversed. Thanks to C’s pointer type, you can also access a matrix as a flat array.
mat(i, j) // VB matrix
mat[i][j] // C matrix
mat[i*N+j] // C matrix as a flat array (row-major, N = number of columns)
mat(j, i) // Fortran matrix
Let’s return to eq C.9 for a while. If we use x = (x, y, z), b = (0, 8, −9), and A for the coefficient matrix, we can rewrite eq C.9a as
(C.9d)  $$\mathbf{a}_1\cdot x + \mathbf{a}_2\cdot y + \mathbf{a}_3\cdot z = \mathbf{b} = \mathbf{A}\cdot\mathbf{x}$$
I’ve used the column-vector notation on the left of b, and the full matrix notation on the right. You will do well to remember this form of equation, as we’ll see it later on as well. And yes, that’s a matrix multiplication on the right-hand side there. Although I haven’t given a proper definition of it yet, this should give you some hints.
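To make the A·x = b reading concrete, here is a small sketch that applies a 3×3 coefficient matrix to a coordinate vector; it’s a plain row-times-column loop. Assuming the coefficients of eq C.9 above, plugging in the system’s solution (29, 16, 3) reproduces b = (0, 8, −9). (The function name is made up for illustration.)

// b = A*x for a 3x3 system: b_i = sum_j a[i][j]*x[j].
void mat33_mul_vec(const float a[3][3], const float x[3], float b[3])
{
    for (int i = 0; i < 3; i++)
    {
        b[i] = 0.0f;
        for (int j = 0; j < 3; j++)
            b[i] += a[i][j]*x[j];
    }
}

// Coefficient matrix of eq C.9b, the solution vector, and the result:
//   float A[3][3] = { { 1,-2, 1 }, { 0, 2,-8 }, { -4, 5, 9 } };
//   float x[3] = { 29, 16, 3 }, b[3];
//   mat33_mul_vec(A, x, b);   // b = (0, 8, -9)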
Matrix operations
Transpose
To transpose a matrix is to mirror it across the diagonal. It’s a handy thing to have around at times. The notation for the transpose is a superscript uppercase ‘T’, for example B = A^{T}. If A is an M×N matrix, its transpose B will be N×M, with the elements b_{ij} = a_{ji}. Like I said, mirror it across the diagonal. The diagonal itself will, of course, be unaltered.
Matrix addition
Matrix addition is much like vector addition, but in 2 dimensions. If A, B, C are all M×N matrices and C = A + B, then the elements of C are c_{ij} = a_{ij} + b_{ij}. Subtraction is no different, of course.
Matrix multiplication
Aahhh, and now things are getting interesting. There are a number of rules to matrix multiplication, which makes it quite tricky. For our multiplication, we will use C = A · B. The thing is that the number of columns of the first operand (A) must equal the number of rows of the second (B). So if A is a p×q matrix, B should be a q×r matrix. The size of C will then be p×r. Now, the elements of C are given by
(C.11)  $$\begin{array}{c}{c}_{ij}\equiv \sum _{k}^{}{a}_{ik}\cdot {b}_{kj}\end{array}$$ 
In other words, you take row i of A and column j of B and take their dot product; k in eq C.11 is the summation index for this dot product. This is also the reason why the number of columns of A and the number of rows of B must be equal; if not, you’ll have a loose end at either vector. Another way of looking at it is this: the whole of A forms the coefficient matrix of a linear system, similar to that of eq C.9b. The columns of B are all vectors of variables which, when processed by the linear system, give the columns of C:
(C.12)  $$\begin{array}{c}\mathbf{\text{C}}=\mathbf{\text{A}}\cdot \mathbf{\text{B}}\equiv \mathbf{\text{A}}\left[\begin{array}{ccc}{\mathbf{\text{b}}}_{1}& \cdots & {\mathbf{\text{b}}}_{r}\end{array}\right]\equiv \left[\begin{array}{ccc}\mathbf{\text{A}}\cdot {\mathbf{\text{b}}}_{1}& \cdots & \mathbf{\text{A}}\cdot {\mathbf{\text{b}}}_{r}\end{array}\right]\end{array}$$ 
The value of this way of looking at it will become clear when I discuss coordinate transformations. Also, like I said, for matrix multiplication you take the dot product of a row of A and a column of B. Since a vector is basically an M×1 matrix, the normal dot product is actually a special case of matrix multiplication. The only thing is that you have to take the transpose of the first vector:
(C.13)  $$\begin{array}{c}c=\mathbf{\text{u}}\cdot \mathbf{\text{v}}={\left[\begin{array}{c}\mathbf{\text{u}}\end{array}\right]}^{T}\cdot \left[\begin{array}{c}\mathbf{\text{v}}\end{array}\right]\end{array}$$ 
There’s a wealth of other things you can do with matrix multiplication, but I’ll leave it with the following two notes. First, the operation is not commutative! What that means is that A · B ≠ B · A. You may have guessed that from the row-column requirement, but even if those do match up it is still not commutative. My affine sprite demo kind of shows this: a rotate-then-scale does not give the same results as a scale-then-rotate (which is probably what you wanted). Only in very special cases is A · B equal to B · A.
The other note is that matrix multiplication is expensive. You have to do a dot product (q multiplications) for each element of C, which leads to p·q·r multiplications in total. For square N×N matrices that’s an O(N³) operation, one of the nastiest ones around. OK, so for 2×2 matrices it doesn’t amount to much, but when you deal with 27×18 matrices (like I do for work), this becomes a problem. Fortunately there are ways of cutting down on the number of calculations, but that’s beyond the scope of this tutorial.
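For reference, here is what the general multiplication of eq C.11 looks like as a sketch in C; the p·q·r cost is plain to see in the three nested loops. Row-major flat arrays are assumed, and the function name is made up.

// C = A*B, with A a p x q matrix and B a q x r matrix (eq C.11).
// Matrices are stored row-major as flat arrays: a[i*q + j] is a_ij.
void mat_mul(int p, int q, int r, const float *a, const float *b, float *c)
{
    for (int i = 0; i < p; i++)
        for (int j = 0; j < r; j++)
        {
            float sum = 0.0f;
            for (int k = 0; k < q; k++)   // dot product of row i of A and column j of B
                sum += a[i*q + k]*b[k*r + j];
            c[i*r + j] = sum;
        }
}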
Determinant
The determinant is a scalar that you get when you combine the elements of a square matrix (of size N×N) in a certain way. I’ve looked everywhere for a nice, clear-cut definition of the determinant, but with very little luck. It has a number of uses, but it is most often used as a simple check to see whether the equations of a system (or a set of vectors) are linearly independent, and thus whether the coefficient matrix is invertible. The mathematical definition of the determinant of an N×N matrix A is a recurrence equation and looks like this.
(C.14)  $$\det\mathbf{A} = |\mathbf{A}| = \sum_j (-1)^{1+j}\, a_{1j}\, \det\mathbf{A}_{1j}$$
I could explain this in more detail, but there’s actually little point in doing that. I’ll just give the formulae for the 2×2 and 3×3 case. Actually, I’ve already done so: in the cross product. If you have matrix A = [a_{1} a_{2} a_{3}], then |A| = a_{1} · (a_{2} × a_{3}). For a 2×2 matrix B = [b_{1} b_{2}] it’s b_{11}·b_{22} − b_{12}·b_{21}, which in fact also uses the cross product. This is not a mere coincidence. Part of what the determinant is used for is determining whether a matrix can be inverted. Basically, if |A| = 0, then there is no inverse matrix. Now, remember that the cross product is involved in the calculation of the area spanned by two vectors. This can only be 0 if the vectors are collinear. And linear independence is one of the key requirements for having an inverse matrix. Also, notice the notation for the determinant: det A = |A|. Looks a bit like the norm of a vector, doesn’t it? Well, the related cross product is related to the area spanned between vectors, so I guess it makes sense then.
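A sketch of the 2×2 and 3×3 cases in C; the 3×3 version is just the expansion along the first row from eq C.14, which gives the same number as the scalar triple product of the columns. (Function names are illustrative only.)

// 2x2 determinant: b11*b22 - b12*b21.
float det2(const float b[2][2])
{
    return b[0][0]*b[1][1] - b[0][1]*b[1][0];
}

// 3x3 determinant, expanded along the first row (eq C.14).
float det3(const float a[3][3])
{
    return a[0][0]*(a[1][1]*a[2][2] - a[1][2]*a[2][1])
         - a[0][1]*(a[1][0]*a[2][2] - a[1][2]*a[2][0])
         + a[0][2]*(a[1][0]*a[2][1] - a[1][1]*a[2][0]);
}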
Matrix inversion
Going back to eq C.9 (yet again), we have a system of equations with variables x = (x, y, z) and matrix A such that A · x = b. Well, that’s nice and all, but most of the time it’s x that’s unknown, not b. What we need isn’t the way from x to b (which is A), but its inverse. What we need is x = A^{−1} · b. A^{−1} is the notation for the inverse of a matrix. The basic definition of it is A · A^{−1} = I, where I is the identity matrix, which has 1s on its diagonal and 0s everywhere else. There are a number of ways of calculating an inverse. There’s trial-and-error, of course (don’t even think about it!), but also the way one usually solves linear systems: through row reduction. Since I haven’t mentioned how to do that, I’ll resort to just giving a formula for one, namely the 2×2 case:
(C.15)  $$\mathbf{A} = \left[\begin{array}{cc} a & b \\ c & d \end{array}\right], \qquad \mathbf{A}^{-1} \equiv \frac{1}{ad - bc}\left[\begin{array}{cc} d & -b \\ -c & a \end{array}\right]$$
This is the simplest case of an inverse. And, yup, that’s a determinant in the denominator. You can see what happens if that thing’s zero. Now, some other things you need to know about matrix inverses. Only square matrices have a chance of being invertible; you can use the determinant to see if it’s actually possible. Furthermore, the inverse of the inverse is the original matrix again. There’s more, of course (oh gawd is there more), but this will have to do for now.
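Here is eq C.15 as a sketch in C, with the determinant check built in (the name and return convention are just one way of doing it):

// Inverse of a 2x2 matrix m = [a b; c d] (eq C.15).
// Returns 0 and leaves 'inv' untouched if the determinant is zero.
int mat22_inverse(const float m[2][2], float inv[2][2])
{
    float det = m[0][0]*m[1][1] - m[0][1]*m[1][0];
    if (det == 0.0f)
        return 0;               // singular: no inverse exists

    float s = 1.0f/det;
    inv[0][0] =  s*m[1][1];   inv[0][1] = -s*m[0][1];
    inv[1][0] = -s*m[1][0];   inv[1][1] =  s*m[0][0];
    return 1;
}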
Algebraic properties of matrices
A and B are M×N matrices; C is N×P; D, E and F are N×N; e_{i} are the column vectors of E; c is a scalar.
A + B = B + A 
c·(A + B) = cB + cA 
A·I = I·A = A 
A·C = C·A only if M=P, and then only under very special conditions 
If E·F = I, then E^{−1} = F and F^{−1} = E 
(A^{T})^{T} = A 
(A·C)^{T} = C^{T} · A^{T} 
(A·C)^{−1} = C^{−1} · A^{−1} 
If e_{i} · e_{j} = δ_{ij}, then E^{−1} = E^{T} (in other words, if the column vectors are unit vectors and mutually perpendicular, the inverse is the transpose.) 
Spaces, bases, coordinate transformations
The collection of all possible vectors is called a vector space. The number of dimensions is given by the number of elements of the vectors (or was it the other way around?). A 2D space has vectors with 2 elements, 3D vectors have 3, etc. Now, usually, the elements of a vector tell you where in the space you are, but there’s more to it than that. For a fully defined position you need:
- a base
- an origin
- coordinates
The vectors you’re used to cover the coordinates part, but without the other two, coordinates mean nothing; they’re just numbers. A set of coordinates like (2, 1) means as little as, say, a speed of 1. You need a frame of reference for them to mean anything. For physical quantities, that means units (like km/h or miles/h or m/s; see what a difference that makes for speed?); for spaces, that means a base and an origin.
Coordinate systems
Fig C.5a: a standard coordinate system S. Point P is given by coordinates (3, 2).
Fig C.5b: a sheared coordinate system S′. Point P is given by coordinates (1, 2).
Fig C.5a shows the 2D Cartesian coordinate system you’re probably familiar with. You have a horizontal x-axis (i = (1, 0)) and a vertical y-axis (j = (0, 1)). And I have a point P in it. If you follow the gridlines, you’ll see that x = 3 and y = 2, so P = (3, 2), right? Well, yes. And no. In my opinion, mostly no.
The thing is that a point in space has no real coordinates; it’s just there. The coordinates depend on your frame of reference, which is basically arbitrary. To illustrate this, take a look at fig C.5b. In this picture I have a coordinate system S′, which still has a horizontal x-axis (u = (1, 0)), but the y-axis (v = (1, 1)) is sheared by 45°. And in this system, point P is given by coordinates (1, 2), and not (3, 2). If you use the coordinates of one system directly in another system, bad things happen.
Two questions now emerge: why would anyone use a different set of coordinates, and how do we convert between two systems? I’ll cover the latter in the rest of this article. As for the former, while a Cartesian system is highly useful, there are many instances where real (or virtual) world calculations are complicated immensely when you stick to it. For one thing, describing planetary orbits or things involving magnetism is considerably easier in spherical or cylindrical coordinates. For another, in texture mapping you have a texture with texels which need to be applied to surfaces that in nearly all cases do not align nicely with your world coordinates. The affine transformations are perfect examples of this. So, yeah, using non-Cartesian coordinates can be very useful indeed.
Building a coordinate base
Stating that there are other coordinate systems besides the Cartesian one is nice and all, but how does one really use them? Well, very easily, actually. Consider what you are really doing when you’re using coordinates in a Cartesian system. Look at fig C.5a again. Suppose you’re given a coordinate set, like (x, y) = (3, 2). To find its location, you move 3 along the x-axis, 2 along the y-axis, and you have your point P. Now, in system S′ (fig C.5b) we have (x′, y′) = (1, 2), but the procedure we used in S doesn’t work here since we don’t have a y-axis. However, we do have vectors u and v. Now if you move 1 along u and 2 along v, we’re at point P again. Turning back to system S, the x and y axes are really vectors i and j, respectively, so we’ve been using the same procedure in both systems after all. Basically, what we do is:
(C.16a)  $$\begin{array}{c}P=\mathbf{\text{i}}\cdot x\text{}+\text{}\mathbf{\text{j}}\cdot y\end{array}$$ 
(C.16b)  $$\begin{array}{c}P=\mathbf{\text{u}}\cdot {x}^{\prime}\text{}+\text{}\mathbf{\text{v}}\cdot {y}^{\prime}\end{array}$$ 
Now, if you’ve paid attention, you should recognize the structure of these equations. Yes, we’ve seen them before, in eq C.9d. If we rewrite our vectors and coordinates, to matrices and vectors, we get
$$\mathbf{M} = \left[\begin{array}{cc}\mathbf{i} & \mathbf{j}\end{array}\right] = \left[\begin{array}{cc}1 & 0\\ 0 & 1\end{array}\right], \quad \mathbf{x} = \left[\begin{array}{c}x\\ y\end{array}\right]; \qquad \mathbf{M}' = \left[\begin{array}{cc}\mathbf{u} & \mathbf{v}\end{array}\right] = \left[\begin{array}{cc}1 & 1\\ 0 & 1\end{array}\right], \quad \mathbf{x}' = \left[\begin{array}{c}x'\\ y'\end{array}\right]$$
(C.16c)  $$P = \mathbf{M}\cdot\mathbf{x} = \mathbf{M}'\cdot\mathbf{x}'$$
Vectors x and x′ contain the coordinates, just like they always have. What’s new is that we have now defined the coordinate systems in the form of matrices M and M′. The vectors that the matrices are made of are the base vectors of the coordinate system. Of course, since the base vectors of system S are the standard unit vectors, the matrix that they form is the identity matrix (M = I), which can be safely omitted (and usually is), but don’t forget it’s there behind the curtains. Actually, there’s one more thing that’s usually implicitly added to the equation, namely the origin O. The standard origin is the null vector, but it need not be.
Eq C.17 is the full equation for the definition of a point. O is the origin of the coordinate system, M defines the base vectors, x is a coordinate set in that base, starting at the origin. Note that each of these is completely arbitrary; the M and x in the preceding discussion are just examples of these.
(C.17)  $$\begin{array}{c}P=O+\mathbf{\text{M}}\cdot \mathbf{\text{x}}\end{array}$$ 
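As a sketch (using the hypothetical VEC2 from earlier), here is eq C.17 for a 2D base, applied to the sheared system S′ of fig C.5b: coordinates (1, 2) in S′ come out as the standard coordinates (3, 2) of P.

// P = O + M*x for a 2D base (eq C.17).
// m0 and m1 are the base (column) vectors, o is the origin,
// x holds the coordinates within that base.
VEC2 point_from_coords(VEC2 o, VEC2 m0, VEC2 m1, VEC2 x)
{
    VEC2 p;
    p.x = o.x + m0.x*x.x + m1.x*x.y;
    p.y = o.y + m0.y*x.x + m1.y*x.y;
    return p;
}

// System S' of fig C.5b: u=(1,0), v=(1,1), origin at the null vector.
//   VEC2 o = {0,0}, u = {1,0}, v = {1,1}, xp = {1,2};
//   VEC2 P = point_from_coords(o, u, v, xp);   // P = (3, 2)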
Last notes
It really is best to think of points in terms of eq C.17 (that is, an origin, a base matrix, and a coordinate vector), rather than merely a set of coordinates. You’ll find that this technique can be applied to an awful lot of problems, and having a general description for them simplifies solving those problems. For example, rotating and scaling of sprites and backgrounds is nothing more than a change of coordinate systems. There’s no magic involved in pa-pd; they’re just the matrix that defines the screen→texture space transformation.
Be very careful that you understand what does what when dealing with coordinate system changes. When transforming between two systems, it is very easy to write down the exact inverse of what you meant to do. For example, given the systems S and S’ of the previous paragraph, we see that x = M · x’, that is M transforms from S’ to S. But the base vectors of M are inside system S, so you may be tempted to think it transforms from S to S’. Which it doesn’t. A similar thing goes on with the P matrix that the GBA uses. The base vectors of this matrix lie inside texture space (see fig 5 in the affine page), meaning that the transformation it does goes from screen to texture space and not the other way around.
The base matrix need not be square; you can use any M×N matrix. This corresponds to a conversion from N dimensions to M dimensions. For example, if M=3 and N=2 (i.e., two 3D vectors), you would have a flat plane inside a 3D world. If N>M, you’d have a projection.