Linear Algebra

Physical Vectors

To gain some intuition on we'll start by focusing on physical vectors — vectors that can be visualized geometrically. These are the vectors often seen in physics. Later, we'll extend this concrete foundation to more general and abstract concepts.
definition. A physical vector is a mathematical object with a magnitude and a direction.
definition. A vector quantity is a quantity that has both a magnitude and a direction.
- Examples: Displacement, velocity, acceleration, etc.
definition. A physical scalar is a mathematical object with a magnitude but no direction.
- Examples: Mass, temperature, pressure, energy, etc.
We denote physical vectors with a lower case letter and an over-right arrow. For example: $\overrightarrow{v}.$
In later sections, we will omit the use of this over-right arrow.

Resultants

Suppose an object moves from point ${A}$ to ${B,}$ then from ${B}$ to ${C.}$
We can represent its overall displacement with two successive vectors, ${\overrightarrow{a}}$ (a vector from ${A}$ to ${B}$ ) and ${\overrightarrow{b}}$ (a vector from ${B}$ to ${C}$ ).
The object's net displacement is a single displacement vector from ${A}$ to ${C,}$ call it ${\overrightarrow{s}.}$
We call ${\overrightarrow{s}}$ the vector sum or resultant of ${\overrightarrow{a}}$ and ${\overrightarrow{b}.}$ $\overrightarrow{s} = \overrightarrow{a} + \overrightarrow{b}.$
Some properties of vector addition to take note of:
1. Commutativity. The order of addition does not matter. $\overrightarrow{a} + \overrightarrow{b} = \overrightarrow{b} + \overrightarrow{a}$
2. Associativity. Adding two or more vectors, we can group the vectors however way we'd like. $(\overrightarrow{a} + \overrightarrow{b}) + \overrightarrow{c} = \overrightarrow{a} + (\overrightarrow{b} + \overrightarrow{c}).$

Vector Negation & Difference

The vector ${- \overrightarrow{a}}$ is a vector with the same magnitude as ${\overrightarrow{b}}$ but with opposite direction.
Thus: $\overrightarrow{b} + (-\overrightarrow{b}) = 0.$
We use this property to define the vector difference. $\overrightarrow{d} = \overrightarrow{a} - \overrightarrow{b} = \overrightarrow{a} + (-\overrightarrow{b}).$

Vector Components

A component of a vector is the vector's projection on an axis.
To find the projection of a vector along an axis geometrically, we draw perpendicular lines from the vector's ends to the axis, as shown below.
We can also find the projection algebraically: $\begin{aligned} a_x &= a \cos \theta \\ a_y &= a \sin \theta \end{aligned}$ where ${\theta}$ $θ$ is the angle that the vector ${\overrightarrow{a}}$ $a$ makes with the positive direction of the ${x}$ $x$ -axis, and ${a}$ $a$ is the magnitude of ${\overrightarrow{a}.}$ $a .$
- From the equations above, we have: $\begin{aligned} a &= \sqrt{a_x^2 + a_y^2} \\[1.5em] \tan \theta &= \dfrac{a_y}{a_x} \end{aligned}$
The vector's projection on an ${x}$ -axis is its ${x}$ -component.
The vector's projection on a ${y}$ -axis is its ${y}$ -component.
The vector's projection on a ${z}$ -axis (for 3D vectors) is its ${z}$ -component.
The process of finding the components of a vector is called vector resolution.

Unit Vectors

A unit vector is a vector with a magnitude of ${1}$ and points in a particular direction.
The unit vector's sole purpose is to specify a direction.

We use special symbols for these vectors:

Symbol	Meaning
${\hat{i}}$	Unit vector in the positive direction of the ${x}$ -axis.
${\hat{j}}$	Unit vector in the positive direction of the ${y}$ -axis.
${\hat{k}}$	Unit vector in the positive direction of the ${z}$ -axis.

Unit vectors can be used to express vectors. For example: $\begin{aligned} \overrightarrow{a} &= a_x \hat{i} + a_y \hat{j} \\ \end{aligned}$ Here, the quantities ${a_x \hat{i}}$ and ${a_y \hat{j}}$ are called the vector components of ${\overrightarrow{a}.}$ The quantities ${a_x}$ and ${a_y}$ are called the scalar components (or simply components) of ${\overrightarrow{a}.}$

Adding Vectors by Components

We can add multiple vectors by adding their respective components.
example. Given $\overrightarrow{r} = \overrightarrow{a} + \overrightarrow{b},$ we have: $\begin{aligned} r_x &= a_x + b_x \\ r_y &= a_y + b_y \\ r_z &= a_z + b_z. \end{aligned}$

Vector Multiplication

There are three types of vector multiplication:
1. The scalar product,
2. dot product, and
3. cross product.

Scalar Product

definition. To muliply a vector ${\overrightarrow{a}}$ by a scalar ${\lambda,}$ we multiply each component of the vector by ${\lambda.}$ $\begin{aligned} \overrightarrow{v} &= \lambda(v_x \hat{i} + v_y \hat{j} + v_z \hat{k}) \\ &= \lambda v_x \hat{i} + \lambda v_y \hat{j} + \lambda v_z \hat{k}. \end{aligned}$

Dot Product

definition. Given vectors ${\overrightarrow{a}}$ and ${\overrightarrow{b},}$ the dot product, denoted ${\overrightarrow{a} \cdot \overrightarrow{b},}$ is defined to be $\overrightarrow{a} \cdot \overrightarrow{b} = ab \cos \theta$ where ${a}$ is the magnitude of ${\overrightarrow{a},}$ ${b}$ is the magnitude of ${\overrightarrow{b},}$ and ${\theta}$ is the angle between the directions of ${\overrightarrow{a}}$ and ${\overrightarrow{b}.}$
Dot products are commutative: $\overrightarrow{a} \cdot \overrightarrow{b} = \overrightarrow{b} \cdot \overrightarrow{a}.$
When two vectros are in unit-vector notation, we write their dot product as: $\overrightarrow{a} \cdot \overrightarrow{b} = (a_x \hat{i} + a_y \hat{j} + a_z \hat{k}) \cdot (b_x \hat{i} + b_y \hat{j} + b_z \hat{k}).$

Cross Product

definition. Given vectors ${\overrightarrow{a}}$ and ${\overrightarrow{b},}$ the cross product of ${\overrightarrow{a}}$ and ${\overrightarrow{b},}$ denoted ${\overrightarrow{a} \times \overrightarrow{b},}$ is defined to be: $\overrightarrow{a} \times \overrightarrow{b} = a b \sin \theta,$ where ${\theta}$ is the smaller of the two angles between ${\overrightarrow{a}}$ and ${\overrightarrow{b}.}$

Real Vectors

For now, we'll focus only on vectors in ${\reals^n.}$ That is, vectors with ${n}$ components, all of which are real numbers.
definition. The zero vector in ${\reals^n}$ is the vector ${\bold{0}^n,}$ whose components are all zero. $\bold{0}^n = \begin{bmatrix} 0_1 \\ 0_2 \\ \vdots \\ 0_n \end{bmatrix}.$
The zero vector is special for a few reasons:
1. Given a vector ${\bold{v} \in \reals^n,}$ adding the zero vector ${\bold{0}^n}$ will always return ${\bold{v}.}$ Because of this result, we call the zero vector ${\bold{0}^n}$ the additive identity of ${\reals^n.}$
2. The zero vector is the only vector with a magnitude of zero.
3. The zero vector is the only vector with no direction.

Scalar Multiplication

Scalar multiplication is straightforward: We multiply each component of the vector by the given scalar ${k.}$ Geometrically (in ${\reals^2}$ or ${\reals^3}$ ), if ${k \gt 0,}$ this has the effect of making the vector shorter or longer.
definition. Let ${\bold{v}}$ be a vector in ${\reals^n,}$ and let ${k \in \reals}$ be a scalar. Then: $k \bold{v} = \begin{bmatrix} k\bold{v}_1 \\ k\bold{v}_2 \\ \vdots \\ k\bold{v}_n \end{bmatrix}.$
example. $3 \begin{bmatrix} 5 \\ 8 \\ 7 \end{bmatrix} = \begin{bmatrix} 3 \times 5 \\ 3 \times 8 \\ 3 \times 7 \end{bmatrix} = \begin{bmatrix} 15 \\ 24 \\ 21 \end{bmatrix}.$ Since ${k}$ can be any real number, scalar multiplication's definition implies that we can negate a vector and perform element-wise division.
definition. Given a vector ${\bold{v},}$ the negation of ${\bold{v},}$ denoted ${-\bold{v},}$ is: $-\bold{v} = \begin{bmatrix} -1 \times \bold{v}_1 \\ -1 \times \bold{v}_2 \\ \vdots \\ -1 \times \bold{v}_n \end{bmatrix} = \begin{bmatrix} -\bold{v}_1 \\ -\bold{v}_2 \\ \vdots \\ -\bold{v}_n \end{bmatrix} .$ Geometrically, negating a vector ${\bold{v}}$ "switches" its direction.

Linear Equations

definition. A linear equation with variables ${x_1, x_2, \ldots, x_n}$ is an equation of the form $a_1 x_1 + a_2 x_2 + \ldots + a_n x_n = b$ where ${a_1, a_2, \ldots, a_n}$ are constants.
If ${x_1 = c_1, x_2 = c_2, \ldots, x_n = c_n}$ satisfy the equation ${a_1x_1 + a_2x_2 + \ldots + a_nx_n = b,}$ we say that ${c_1, c_2, \ldots, c_n}$ is a solution to the equation.
example. Consider ${x_1 + 2x_2 - 3x_3 = 8.}$ One solution to the equation is ${x_1 = 3, x_2 = 4, x_3 = 1,}$ since ${3 + 2(4) - 3(1) = 8.}$ Another solution is ${x_1 = -1, x_2 = 3, x_3 = -1,}$ since ${-1 + 2(3) - 3(-1) = 8.}$ There are, in fact, an infinite number of solutions to the equation. We can express this infinite set of solutions as follows: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 8 - 2s + 3t \\ s \\ t \end{bmatrix} ~~~~ (s, t \in \reals).$
The set of all solutions to a linear equation is called the linear equation's solution space.

Systems of Linear Equations

Many problems in mathematics can be reduced to systems of linear equations. Linear algebra provides us the tools for solving these systems.
definition. The system
$\begin{aligned} a_{1,1}x_1 &+ &\ldots &+ a_{1,n}x_n = b_1 \\ &\vdots & &\vdots \\ a_{m,1}x_1 &+ &\ldots &+ a_{m,n}x_n = b_n \end{aligned}$
is called an ${(m,n)}$ (or ${m \times n}$ ) system of linear equations, or linear system, where ${a_{i,j}, b_{i} \in \reals}$ and each ${x_i}$ is a real number called an unknown of the system.
Given a system of linear equations ${S,}$ each ${n}$ -tuple ${(x_1, \ldots, x_n) \in \reals^n}$ that satisfies ${S}$ is called a solution of ${S.}$ In a system with two unknowns ${x_1}$ and ${x_2,}$ each linear equation defines a line on the ${x_1}$ - ${x_2}$ plane.
example. Consider the system
$4x_1 + 4x_2 = 5 \\ 2x_1 - 4x_2 = 1$
Note that we chose the variables ${x_1}$ and ${x_2}$ arbitrarily. We could've just used ${x}$ and ${y:}$
$4x + 4y = 5 \\ 2x - 4y = 1.$
Notice that these are simple linear equations. Solving for ${y:}$
$y = -x + \dfrac{5}{4} \\[1em] y = \dfrac{x}{2} - \dfrac{1}{4}.$
The point where these lines intersect is a solution to the system. In this case, the point ${(1, \frac{1}{4}).}$ Thus, the system above has the solution space of ${\lbrace (1, \frac{1}{4}) \rbrace.}$ Notice the graph for this system. What are the coordinates of the point where the two lines meet?

An ${(m,n)}$ system of linear equations is inconsistent if it has no solution (i.e., a solution space of ${\varnothing}$ ). The system is consistent if it has at least one solution. An ${(m,n)}$ system of linear equations is underspecified if it has more unknowns than equations ${(n \gt m).}$ If the system has more equations than unknowns ${(m \gt n),}$ then the system is overspecified. If an underspecified system of linear equations is consistent, then it will have infinitely many solutions.

Homogeneous Linear Systems

Given the linear system $\begin{aligned} a_{1,1} x_1 + a_{1,2}& x_2 + \ldots a_{1,n} x_n = b_1 \\ a_{2,1} x_1 + a_{2,2}& x_2 + \ldots a_{2,n} x_n = b_2 \\ a_{3,1} x_1 + a_{3,2}& x_2 + \ldots a_{3,n} x_n = b_3 \\ &\vdots \\ a_{m,1} x_1 + a_{m,2}& x_2 + \ldots a_{m,n} x_n = b_m \end{aligned}$ If ${b_i = 0}$ for all ${i = 1,2,3,\ldots,n,}$ we say that the system is homogeneous. Homogeneous linear systems have a special property: If a system is homogeneous, then ${x_1 = x_2 = \ldots = x_n = 0}$ is always a solution (but not necessarily the only solution).
Every homogeneous linear system is consistent because all such systems have the solution ${x_1 = 0, x_2 = 0, \ldots, x_n = 0.}$ $x_{1} = 0, x_{2} = 0, \dots, x_{n} = 0.$ This solution is called the trivial solution. All other solutions (if there are any) are called nontrivial solutions. Because homogeneous systems always have the trivial solution, there are only two possibilities for their solutions:
1. The system has only one solution (the trivial solution), or
2. the system has infinitely many solutions, including the trivial solution.
The one case where a homogeneous system always has nontrivial solutions is the underspecified homogenous linear system — a homogeneous linear system with more unknowns than equations.
theorem. If a homogeneous linear system has ${n}$ unknowns, and if the reduced row echelon form of its augmented matrix has ${r}$ nonzero rows, then the system has ${n - r}$ free variables.
theorem. A homogeneous linear system with more unknowns than equations has infinitely many solutions.

Augmented Matrices

An ${(m,n)}$ linear system can be written as a rectangular array of ${m}$ by ${(n + 1)}$ numbers by leaving out the ${+}$ and ${=}$ signs and the variables. We call this an augmented matrix. The general system
$\begin{aligned} a_{1,1} x_1 + a_{1,2}& x_2 + \ldots a_{1,n} x_n = b_1 \\ a_{2,1} x_1 + a_{2,2}& x_2 + \ldots a_{2,n} x_n = b_2 \\ a_{3,1} x_1 + a_{3,2}& x_2 + \ldots a_{3,n} x_n = b_3 \\ &\vdots \\ a_{m,1} x_1 + a_{m,2}& x_2 + \ldots a_{m,n} x_n = b_m \end{aligned}$
has the augmented matrix
$\left\lbrack\hspace{-12pt}\begin{array}{ccccc|c} &a_{1,1} & a_{1,2} & \ldots & a_{1,n} & b_1 \\ &a_{2,1} & a_{2,2} & \ldots & a_{2,n} & b_2 \\ &a_{3,1} & a_{3,2} & \ldots & a_{3,n} & b_3 \\ &\vdots &\vdots &\vdots &\vdots &\vdots \\ &a_{m,1} & a_{m,2} & \ldots & a_{m,n} & b_m \\ \end{array}\hspace{-3pt}\right\rbrack.$

Gaussian Elimination

We can solve a linear system (find its solution space) through Gaussian elimination. The method of Gaussian elmination employs three operations:
1. Swapping: Interchange the equations.
2. Scaling: Replace an equation by any nonzero multiple of itself.
3. Pivoting: Replace an equation by a multiple of itself plus a multiple of another equation.
example. Consider the system
$\begin{cases} 2x + y = 17 \\ x - 3y = 17 \end{cases}.$
Swapping:
$\begin{cases} \textcolor{teal}{2x + y = 17} \\ x - 3y = 17 \end{cases} \rightarrow \begin{cases} x - 3y = 17 \\ \textcolor{teal}{2x + y = 17} \end{cases}.$
Scaling by the first equation ${-3}$ :
$\begin{cases} -3(2x + y = -1) \\ x - 3y = 17 \end{cases} \rightarrow \begin{cases} -6x - 3y = 3 \\ x - 3y = 17 \end{cases}$
Pivoting (replace the second equation by "two times the second equation minus the first equation"):
1. Two times the second equation: $2(x - 3y = 17) \\ \downarrow \\ 2x - 6y = 34$
2. Minus the first equation: $\begin{align*} 2x - 6y &= 34 \\ -(2x + y &= -1) \\ \hline -7y &= 35 \end{align*}$
3. Result: $\begin{cases} 2x + y = -1 \\ -7y = 35 \end{cases}.$

Row Reduction

Since linear systems can be represented with augmented matrices, we can frame Gaussian elimination in terms of augmented matrices:
1. Swapping: Interchange rows.
2. Scaling: Replace any row by a non-zero multiple of itself.
3. Pivoting: Replace any row by itself plus a multiple of another row.
example. Using the augmented matrix for our previous system:
$\begin{cases} 2x + y = 17 \\ x - 3y = 17 \end{cases} \rightarrow \left\lbrack\hspace{-5pt}\begin{array}{cc|c} 2 & 1 & -1 \\ 1 & -3 & 17 \end{array}\hspace{-5pt}\right\rbrack.$
Swapping rows ${1}$ and ${2}$ (we can express this notationally with ${R_1 \harr R_2}$ ):
$\left\lbrack\hspace{-5pt}\begin{array}{cc|c} 2 & 1 & -1 \\ 1 & -3 & 17 \end{array}\hspace{-5pt}\right\rbrack \rightarrow \left\lbrack\hspace{-5pt}\begin{array}{cc|c} 1 & -3 & 17 \\ 2 & 1 & -1 \end{array}\hspace{-5pt}\right\rbrack.$
Multiply row ${1}$ by ${-3}$ (scaling, expressed notationally as ${-3R_1 \rightarrow R_1}$ ):
$\left\lbrack\hspace{-5pt}\begin{array}{cc|c} 2 & 1 & -1 \\ 1 & -3 & 17 \end{array}\hspace{-5pt}\right\rbrack \rightarrow \left\lbrack\hspace{-5pt}\begin{array}{cc|c} -6 & -3 & 3 \\ 1 & -3 & 17 \end{array}\hspace{-5pt}\right\rbrack.$
Replace row ${2}$ by " ${2}$ times row ${2}$ minus row ${1}$ (a pivot, expressed notationally as ${2R_2 - R_1 \rightarrow R_2}$ ):
$\left\lbrack\hspace{-5pt}\begin{array}{cc|c} 2 & 1 & -1 \\ 1 & -3 & 17 \end{array}\hspace{-5pt}\right\rbrack \rightarrow \left\lbrack\hspace{-5pt}\begin{array}{cc|c} 2 & 1 & -1 \\ 0 & -7 & 35 \end{array}\hspace{-5pt}\right\rbrack.$
The goal of Gaussian elimination, as applied to augmented matrices, is row reduction: We use the elementary row operations to eliminate variables from selected rows of the augmented matrix until we obtain a solution space.

Graphical Solutions

For linear systems in ${\reals^2}$ and ${\reals^3,}$ we can quickly see solutions with the aid of plotting.

Linear Systems in ℝ²

A linear equation of the form ${ax + by = c,}$ where ${a,b,c}$ are real constants, graphically corresponds to a line in the Cartesian plane. Given two such lines (i.e., a ${2 \times 2}$ linear system), there are three possible cases which may occur:

Intersecting Lines	Parallel Lines	Coincident Lines
One point of intersection; there is a unique solution.	No points of intersection; there are no solutions.	Infinitely many points of intersection; there are infinitely many solutions.

Linear Systems in ℝ³

In ${\reals^3,}$ parallel planes indicate there are no solutions, since there is no common intersection.
exmaple. The system

$\begin{cases} 2x + 3y - z = -5 \\ 2x + 3y - z = -2 \\ 2x + 3y - z = 3 \end{cases}$

has no solution. Examining the plot above, there is no point of intersection since the resulting planes are parallel. The fact that planes intersect, however, does not imply that there is a solution.
Two parallel planes with no common intersection also indicates that the system has no solutions.
example. The system

$\begin{cases} x + 2y + z = -2 \\ x + 2y + z = -\frac{1}{4} \\ 2x - y + z = 1 \end{cases}$

has no solution. While there are infinitely many points of intersection, there isn't a common intersection.
If the planes share a common intersection, then either the intersection is a line or a point. If it's a line, then there are infinitely many solutions, since the line comprises infinitely many points.
example. The system

has infinitely many solutions since its common intersection is a line.
If, however, the common intersection is a point, then there is exactly one solution.
example. The system

$\begin{cases} 3x + y + 2z = 0 \\ 4x - y + 3z = -1 \\ x + y - z = -1 \end{cases}$

has exactly one solution, since the equations' corresponding planes intersect at a single, common point of intersection. Nevertheless, the following fact always applies, regardless of what linear system we're in:
theorem. Every system of linear equations has zero, one, or infinitely many solutions. There are no other possibilities.

Echelon Forms

Consider the following system, with its corresponding augmented matrix and plot.

$\begin{cases} x + y + 2z = 9 \\ 2x + 4y - 3z = 1 \\ 3x + 6y - 5z = 0 \end{cases}$ $\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{array}\hspace{-5pt}\right\rbrack.$

What is the solution space for this system? We will use the elementary row operations.

Add ${-2}$ times the first equation to the second equation.
$\begin{cases} x + y + 2z = 9 \\ 2y - 7z = -17 \\ 3x + 6y - 5z = 0 \end{cases}$
Add ${-2}$ times the first row to the second row.
$\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 0 & 2 & -7 & -17 \\ 3 & 6 & -5 & 0 \end{array}\hspace{-5pt}\right\rbrack.$

Add ${-3}$ times the first equation to the third equation.

\begin{cases} x + y + 2z = 9 \\ 2y - 7z = -17 \\ 3y - 11z = -27 \end{cases}

Add ${-2}$ times the first row to the second row.

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 0 & 2 & -7 & -17 \\ 0 & 3 & -11 & -27 \end{array}\hspace{-5pt}\right\rbrack.

Multiply the second equation by ${\frac{1}{2}.}$

\begin{cases} x + y + 2z = 9 \\ y - \frac{7}{2}z = -\frac{17}{2} \\ 3y - 11z = -27 \end{cases}

Multiply the second row by ${\frac{1}{2}.}$

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \\ 0 & 3 & -11 & -27 \end{array}\hspace{-5pt}\right\rbrack.

Add ${-3}$ times the second equation to the third equation.

\begin{cases} x + y + 2z = 9 \\ y - \frac{7}{2}z = -\frac{17}{2} \\ -\frac{1}{2}z = -\frac{3}{2} \end{cases}

Add ${-3}$ times the second row to the third row.

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \\ 0 & 0 & -\frac{1}{2} & -\frac{3}{2} \end{array}\hspace{-5pt}\right\rbrack.

Multiply the third equation by ${-2.}$

\begin{cases} x + y + 2z = 9 \\ y - \frac{7}{2}z = -\frac{17}{2} \\ z = 3 \end{cases}

Multiply the third row by ${-2.}$

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 1 & 2 & 9 \\ 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \\ 0 & 0 & 1 & 3 \end{array}\hspace{-5pt}\right\rbrack.

Add ${-1}$ times the second equation to the first equation.

\begin{cases} x + \frac{11}{2} z = \frac{35}{2} \\ y - \frac{7}{2}z = -\frac{17}{2} \\ z = 3 \end{cases}

Add ${-1}$ times the second row to the first row.

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 0 & \frac{11}{2} & \frac{35}{2} \\ 0 & 1 & -\frac{7}{2} & -\frac{17}{2} \\ 0 & 0 & 1 & 3 \end{array}\hspace{-5pt}\right\rbrack.

Add ${-\frac{11}{2}}$ times the third equation to the first equation and ${\frac{7}{2}}$ times the third equation to the second equation.

\begin{cases} x = 1 \\ y = 2 \\ z = 3 \end{cases}

Add ${-\frac{11}{2}}$ times the third row to the first row and ${\frac{7}{2}}$ times the third row to the second row.

\left\lbrack\hspace{-5pt}\begin{array}{ccc|c} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{array}\hspace{-5pt}\right\rbrack.

We see that the solution space comprises exactly one solution:

\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.

If we rewrite the augmented matrix as a plain, old matrix $\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{bmatrix}$ we get a particularly special kind of matrix — a matrix in reduced row echelon form.
definition. A matrix ${M}$ $M$ is in reduced row echelon form if, and only if, the following properties hold for ${M:}$ $M :$
1. If a row does not consist entirely of zeros, then the first nonzero number in the row is ${1.}$ This is called the leading 1.
2. If there are any rows that consist entirely of zeros, then they are grouped together at the bottom of the matrix.
3. In any two successive rows that do not consist entirely of zeros, the leading ${1}$ in the lower row occurs farther to the right than the leading ${1}$ in the higher row.
4. Each column that contains a leading ${1}$ has zeros everywhere else in that column.
Variables that correspond to the leading ${1}$ s are called leading variables, and all others free variables.
Matrices that satisfy the first three properties are said to be in row echelon form.
To be in reduced row echelon form, the matrix must satisfy all four properties.

Parametric Solutions

Consider the augmented matrix below (vertical separator omitted): $\begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -4 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$
The last row maps to the equation ${0x + 0y + 0z = 0.}$ This is trivial and can be discarded since it holds no restrictions on ${x,}$ ${y,}$ or ${z.}$ Thus: $\begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -4 & 2 \\ \end{bmatrix} \rightarrow \begin{cases} x + 3z = -1 \\ y - 4z = 2 \end{cases} .$
Above, we see that the leading variables are ${x}$ and ${y,}$ with ${z}$ a free variable. If we solve for the leading variables in terms of the free variables, we get: $x = -1 - 3z \\ y = 2 + 4z.$ It's now apparent that ${z}$ can be treated as a parameter. So we can express the solution space with the parametric equations $\begin{cases} x = -1 - 3t \\ y = 2 + 4t \\ z = t \end{cases} ~~~ t \in \reals.$

Gauss-Jordan Elimination

In the preceding examples, we saw that we can solve a linear system by casting its augmented matrix into reduced row echelon form. But how did we know which elementary operations to perform? Our choices seemed arbitrary, if not the result of trial and error. We now introduce an algorithm for reducing any matrix to reduced row echelon form — Gauss-Jordan Elimination. To illustrate, we will reduce this matrix:
$\begin{bmatrix} 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -10 & 6 & 12 & 28 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix}.$
Step 1. Find the leftmost column that does not consist entirely of zeros.

$\begin{bmatrix} \color{red}{0} & 0 & -2 & 0 & 7 & 12 \\ \color{red}{2} & 4 & -10 & 6 & 12 & 28 \\ \color{red}{2} & 4 & -5 & 6 & -5 & -1 \end{bmatrix}.$
The leftmost column not entirely of zeros is colored red.

Step 2. Swap the top row with another row, if necessary, to bring a nonzero entry to the top of the column found in step ${1.}$

$\begin{bmatrix} 2 & 4 & -10 & 6 & 12 & 28 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix}.$
We swapped the first and second rows.

Step 3. If the entry on top of the column found in step ${1}$ is ${a,}$ multiply the first row by ${1/a}$ to introduce a leading ${1.}$

$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix}.$
We multiplied the first row by ${\frac{1}{2}.}$

Step 4. Add multiples of the top row to the rows below so that all entries below the leading ${1}$ become zeros.

$\begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix}.$
We added ${-2}$ times the first row of the preceding matrix to the third row.

Step 5. Now cover (ignore) the top row in the matrix. With the remaining rows (the submatrix) repeat the process again starting from step ${1.}$

$\begin{bmatrix} \color{lightgrey}{1} & \color{lightgrey}{2} & \color{lightgrey}{-5} & \color{lightgrey}{3} & \color{lightgrey}{6} & \color{lightgrey}{14} \\ 0 & 0 & \color{red}{-2} & 0 & 7 & 12 \\ 0 & 0 & \color{red}{5} & 0 & -17 & -29 \end{bmatrix}.$
Leftmost nonzero column is colored red.

$\begin{bmatrix} \color{lightgrey}{1} & \color{lightgrey}{2} & \color{lightgrey}{-5} & \color{lightgrey}{3} & \color{lightgrey}{6} & \color{lightgrey}{14} \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix}.$
We multiply the first row in the submatrix by ${-\frac{1}{2}}$ to introduce a leading ${1.}$

$\begin{bmatrix} \color{lightgrey}{1} & \color{lightgrey}{2} & \color{lightgrey}{-5} & \color{lightgrey}{3} & \color{lightgrey}{6} & \color{lightgrey}{14} \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 \end{bmatrix}.$
We add ${-5}$ times the first row of the submatrix to the second row of the submatrix to introduce a zero below the leading ${1.}$

$\begin{bmatrix} \color{lightgrey}{1} & \color{lightgrey}{2} & \color{lightgrey}{-5} & \color{lightgrey}{3} & \color{lightgrey}{6} & \color{lightgrey}{14} \\ \color{lightgrey}{0} & \color{lightgrey}{0} & \color{lightgrey}{1} & \color{lightgrey}{0} & \color{lightgrey}{-\frac{7}{2}} & \color{lightgrey}{-6} \\ 0 & 0 & 0 & 0 & \color{red}{\frac{1}{2}} & 1 \end{bmatrix}.$
We cover the top row of the submatrix and return again to step ${1.}$ The leftmost nonzero column is colored red.

$\begin{bmatrix} \color{lightgrey}{1} & \color{lightgrey}{2} & \color{lightgrey}{-5} & \color{lightgrey}{3} & \color{lightgrey}{6} & \color{lightgrey}{14} \\ \color{lightgrey}{0} & \color{lightgrey}{0} & \color{lightgrey}{1} & \color{lightgrey}{0} & \color{lightgrey}{-\frac{7}{2}} & \color{lightgrey}{-6} \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}.$
The first (and only) row in the new submatrix is multiplied by ${2}$ to introduce a leading ${1.}$

$\begin{bmatrix} {1} & {2} & {-5} & {3} & {6} & {14} \\ {0} & {0} & {1} & {0} & {-\frac{7}{2}} & {-6} \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}.$
The matrix is now in row echelon form. To find the reduced row echelon form, proceed to step ${6.}$

Step 6. Beginning with the last nonzero row and working upward, add suitable multiples of each row to the rows above to introduce zeros above the leading ${1s.}$

$\begin{bmatrix} {1} & {2} & {-5} & {3} & {6} & {14} \\ {0} & {0} & {1} & {0} & {0} & {1} \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}.$
We added ${\frac{7}{2}}$ times the third row of the preceding matrix to the second row.

$\begin{bmatrix} {1} & {2} & {-5} & {3} & {0} & {2} \\ {0} & {0} & {1} & {0} & {0} & {1} \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}.$
We added ${-6}$ times the third row to the first row.

$\begin{bmatrix} {1} & {2} & {0} & {3} & {0} & {7} \\ {0} & {0} & {1} & {0} & {0} & {1} \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}.$
We added ${5}$ times the second row to the first row. The matrix is now in reduced row echelon form.

Matrices & Linear Equations

We can represent a system of linear equations with matrices. We place the coefficients ${a_{i,j}}$ into vectors,
$\begin{bmatrix} a_{1,1} \\ \vdots \\ a_{m,1} \end{bmatrix} x_1 + \begin{bmatrix} a_{1,2} \\ \vdots \\ a_{m,2} \end{bmatrix} x_2 + \ldots + \begin{bmatrix} a_{1,n} \\ \vdots \\ a_{m,n} \end{bmatrix} x_n = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix},$
then place the vectors into matrices
$\begin{aligned} \begin{bmatrix} a_{1,1} &\ldots &a_{1,n} \\ \vdots &~ \vdots &\vdots \\ a_{m,1} &\ldots &a_{m,n} \end{bmatrix} \end{aligned} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}.$

Matrices

A matrix can be thought of as an array of arrays. They allow us to compactly represent a system of linear equations.
definition. Let ${m,n \in \N.}$ A real-valued ${(m,n)}$ matrix ${\bold{A}}$ is an ${m \cdot n}$ tuple of elements ${a_{i,j},}$ ${i = 1, \ldots, m,}$ ${j = 1, \ldots, n.}$ This tuple is ordered as a rectangle of ${m}$ rows and ${n}$ columns: $\bold{A} = \begin{bmatrix} a_{1,1} & a_{1,2} & \ldots & a_{1,n} \\ a_{2,1} & a_{2,2} & \ldots & a_{2,n} \\ \vdots & \vdots & \vdots & \vdots \\ a_{m,1} & a_{m,2} & \ldots & a_{m,n} \end{bmatrix}, ~~ a_{i,j} \in \R.$ We call ${i}$ the row index, and ${j}$ the column index.
Some additional terminology: A ${(1,n)}$ matrix is called a row vector, and a ${(m,1)}$ matrix is called a column vector.
example. In programming, we can represent a row vector with a simple array:
```
let R = [1, 2, 3, 4];
```

example. A matrix would be an array of arrays:

let M = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9],
];

example. A column vector would be an array of one-element arrays:
```
let C = [[1], [2], [3]];
```

Matrix Equality

definition. Two matrices are equal if and only if they have the same size and their corresponding entries are equal.

Matrix Addition

To add two matrices, we add each element of one matrix to the corresponding element (the element of the same indices) in the other matrix (element-wise sum).
definition. Let ${\bold{A} \in \R^{m \times n},}$ and let ${\bold{B} \in \R^{m \times n}.}$ Then the sum ${\bold{A} + \bold{B}}$ is defined as: $\bold{A} + \bold{B} := \begin{bmatrix} a_{1,1} + b_{1,1} & \ldots & a_{1,n} + b_{1,n} \\ \vdots & \vdots & \vdots \\ a_{m,1} + b_{m,1} & \ldots & a_{m,n} + b_{m,n} \end{bmatrix} \in \reals^{m \times n}.$
example. $\begin{aligned} \begin{bmatrix} 3 & 0 & 3 \\ 5 & 7 & 4 \\ 2 & 1 & 8 \end{bmatrix} + \begin{bmatrix} 1 & 2 & 5 \\ 4 & 1 & 4 \\ 3 & 0 & 1 \end{bmatrix} &= \begin{bmatrix} 3 + 1 & 0 + 2 & 3 + 5 \\ 5 + 4 & 7 + 1 & 4 + 4 \\ 2 + 3 & 1 + 0 & 8 + 1 \end{bmatrix} \\ &= \begin{bmatrix} 4 & 2 & 8 \\ 9 & 8 & 8 \\ 5 & 1 & 9 \end{bmatrix} \end{aligned}.$
From the definition, we see that matrices of different sizes cannot be added.

Matrix Multiplication

There are three forms of matrix multiplication: scalar multiplication, the Hadamard product, and the row-column dot product.

Hadamard Product

The Hadamard product is simply element-wise multiplication.
definition. Let ${\bold{A} \in \R^{m \times n},}$ and let ${\bold{B} \in \R^{m \times n}.}$ Then the Hadamard product ${\bold{A} \odot \bold{B}}$ is defined as: $\bold{A} \odot \bold{B} := \begin{bmatrix} a_{1,1}b_{1,1} & \ldots & a_{1,n}b_{1,n} \\ \vdots & \vdots & \vdots \\ a_{m,1}b_{m,1} & \ldots & a_{m,n}b_{m,n} \end{bmatrix} \in \reals^{m \times n}.$
Like the matrix sum, the Hadamard product is defined only if ${\bold{A}}$ and ${\bold{B}}$ have the same dimensions.

example. $\begin{aligned} \begin{bmatrix} 2 & 7 & 5 \\ 9 & 4 & 3 \\ 1 & 3 & 8 \end{bmatrix} + \begin{bmatrix} 7 & 3 & 5 \\ 9 & 8 & 4 \\ 1 & 6 & 3 \end{bmatrix} &= \begin{bmatrix} 2 \times 7 & 7 \times 3 & 5 \times 5 \\ 9 \times 9 & 4 \times 8 & 3 \times 4 \\ 1 \times 1 & 3 \times 6 & 8 \times 3 \end{bmatrix} \\ &= \begin{bmatrix} 14 & 21 & 25 \\ 81 & 32 & 12 \\ 1 & 18 & 24 \end{bmatrix} \end{aligned}.$

Scalar Multiplication

When we multiply a scalar to a matrix, we multiply each element of the matrix with the scalar.

Row-Column Dot Product

With matrix sums, we simply perform element-wise addition. Matrix multiplication, however, looks a little different:
definition. Given matrices ${\bold{A} \in \reals^{m \times n}}$ and ${\bold{B} \in \reals^{n \times k},}$ the elements ${c_{i,j}}$ of the product ${\bold{C} = \bold{A}\bold{B} \in \reals^{m \times k}}$ are computed as:
$c_{i,j} = \sum_{\ell = 1}^{n} a_{i,\ell} b_{i,\ell,} \\[1em] i = 1, \ldots, m \\ j = 1, \ldots, k.$
That is, to compute the element ${c_{i,j},}$ multiply the ${i^{\text{th}}}$ row of ${\bold{A}}$ with the ${j^{\text{th}}}$ column of ${\bold{B}}$ and sum them.
example. To compute the row-column dot product of
$\underbrace{ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}}_{\bold{A}} \cdot \underbrace{ \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12 \end{bmatrix}}_{\bold{B}},$
we start by computing the dot product of the first row of ${A}$ and the first column of ${B:}$
$\begin{aligned} [1,2,3] \cdot [7,9,11] &= (1 \times 7) + (2 \times 9) + (3 \times 11) \\ &= 58. \end{aligned}$
This gives us the first row, first column entry:
$\begin{bmatrix} 58 & ~ \\ ~ & ~ \end{bmatrix}.$
Next we compute the first row, second column dot product:
$\begin{aligned} [1,2,3] \cdot [8,10,12] &= (1 \times 8) + (2 \times 10) + (3 \times 12) \\ &= 64. \end{aligned}$
This gives us the first row, second column entry:
$\begin{bmatrix} 58 & 64 \\ ~ & ~ \end{bmatrix}.$
Now the second row, first column:
$\begin{aligned} [4,5,6] \cdot [7,9,11] &= (4 \times 7) + (5 \times 9) + (6 \times 11) \\ &= 139. \end{aligned}$
Thus, we now have:
$\begin{bmatrix} 58 & 64 \\ 139 & ~ \end{bmatrix}.$
Finally, compute the second row, second column:
$\begin{aligned} [4,5,6] \cdot [8,10,12] &= (4 \times 8) + (5 \times 10) + (6 \times 12) \\ &= 154. \end{aligned}$
The result:
$\begin{bmatrix} 58 & 64 \\ 139 & 154 \end{bmatrix}.$
Notice: Given a matrix ${\bold{A}}$ and a matrix ${\bold{B},}$ the row-column product ${\bold{A} \cdot \bold{B}}$ is defined only if the number of columns of ${\bold{A}}$ is equal to the number of rows of ${\bold{B}.}$ In general, given an ${m \times n}$ matrix ${\bold{A}}$ and an ${n \times k}$ matrix ${\bold{B},}$ ${\bold{A}\cdot\bold{B}}$ is defined only if the ${n}$ s are the same. The result ${\bold{A} \cdot \bold{B}}$ is an ${m \times k}$ matrix.
Why do we multiply matrices like this? A small bakery sells three types of cake slices: Coffee is ${\text{\textdollar} 3}$ a slice; chocolate is ${\text{\textdollar}4}$ a slice; and carrot is ${\text{\textdollar} 2}$ a slice. The bakery maintains a ledger of how many that've sold. Here is a snippet for the past four days:

Mon. Tue. Wed. Thu.
Coffee 13 9 7 15
Chocolate 8 7 4 6
Carrot 6 4 0 3

And here are the prices per slice:

Price
Coffee 3
Chocolate 4
Carrot 2

How much revenue did the bakery generate on Monday? We multiply the number of slices sold for that type with the type's price per slice, perform that operation for each type, and sum the products:
$(3 \times 13) + (4 \times 8) + (2 \times 6) = 83.$
To compute the revenue for each day, we compute the row-column dot product:
$[3 ~~ 4 ~~ 2] \begin{bmatrix} 13 & 9 & 7 & 15 \\ 8 & 7 & 4 & 6 \\ 6 & 4 & 0 & 3 \end{bmatrix} = [83 ~~ 63 ~~ 37 ~~ 75].$
In other words, to find the revenue, we must match each price to each quantity. As it turns out, this operation (matching entries in one table to the entries in another, multiplying them, and then summing the results) occurs over and over again in pure and applied mathematics. So much so that we've defined it as its own operation, the row-column dot product, or simply, the matrix product.

	Mon.	Tue.	Wed.	Thu.
Coffee	13	9	7	15
Chocolate	8	7	4	6
Carrot	6	4	0	3

	Price
Coffee	3
Chocolate	4
Carrot	2

⚠ In real arithmetic, we know that ${ab = ba}$ (the commutative law of multiplication). This law does not hold for matrix multiplication (it is very rarely the case that ${AB = BA;}$ in fact, you have better luck assuming ${AB \neq BA}$ ). The test ${AB = BA}$ can fail for possible reasons:

${AB}$ is defined, but ${BA}$ is not.
${AB}$ and ${BA}$ are both defined, but they are of different orders.
${AB}$ and ${BA}$ are both defined, ${AB}$ and ${BA}$ have the same orders, but the product's entries are different.

Matrices & Linear Combinations

definition. If ${A_1, A_2, \ldots, A_r}$ are matrices of the same size, and if ${c_1, c_2, \ldots, c_r}$ are scalars, then an expression of the form
$c_1 A_1 + c_2 A_2 + \ldots + c_r A_r$
is called a linear combination of ${A_1, A_2, \ldots, A_r}$ with coefficients ${c_1, c_2, \ldots, c_r.}$

Transpose of a Matrix

If ${M}$ is any ${m \times n}$ matrix, then the transpose of ${M,}$ denoted ${M^T,}$ is defined to be the ${n \times m}$ matrix that results from interchanging the rows and columns of ${M.}$
example. If
$M = \begin{bmatrix} 2 & 3 \\ 1 & 4 \\ 5 & 6 \end{bmatrix},$
then
$M^T = \begin{bmatrix} 2 & 1 & 5 \\ 3 & 4 & 6 \end{bmatrix}.$

Trace of a Matrix

If ${M}$ is a square matrix, then the trace of ${M,}$ denoted by ${\text{tr}(M),}$ is the sum of the entries on the main diagonal of ${M.}$ If ${M}$ is not a square matrix, then ${\text{tr}(M)}$ is undefined.
example. If
$A = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix},$
then
$\text{tr}(A) = a_{1,1} + a_{2,2} + a_{3,3}.$

Properties of Matrix Algebra

Let ${A}$ $A$ and ${B}$ $B$ be matrices and ${a}$ $a$ and ${b}$ $b$ be scalars. Then:
1. ${A + B = B + A.}$
2. ${A + (B + C) = (A + B) + C.}$
3. ${A(BC) = (AB)C.}$
4. ${A(B + C) = AB + AC.}$
5. ${(B + C)A = BA + CA.}$
6. ${A(B - C) = AB - AC.}$
7. ${(B - C)A = BA - CA.}$
8. ${a(B + C) = aB + aC.}$
9. ${a(B - C) = aB - aC.}$
10. ${(a + b)C = aC + bC.}$
11. ${(a - b)C = aC - bC.}$
12. ${a(bC) = (ab)C.}$
13. ${a(BC) = (aB)C = B(aC).}$