Matrices — Multiplication, Determinants, and Inverses
What matrices are, how matrix multiplication works, what determinants measure, and how to invert a matrix.
What a Matrix Is
A matrix is a rectangular array of numbers arranged in rows and columns. An m×n matrix has m rows and n columns.
A = [1 2 3]    (2×3 matrix)
    [4 5 6]

B = [7 8]      (3×2 matrix)
    [9 0]
    [1 2]
The entry in row i, column j is written Aᵢⱼ.
Matrices represent: systems of linear equations, linear transformations, data tables, graphs, and more. They’re the central object of linear algebra.
Addition and Scalar Multiplication
Addition: add corresponding entries. Matrices must have the same dimensions.
[1 2]   [5 6]   [ 6  8]
[3 4] + [7 8] = [10 12]
Scalar multiplication: multiply every entry by the scalar.
    [1 2]   [3  6]
3 × [3 4] = [9 12]
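Both operations act entrywise, so they take only a few lines of plain Python (a sketch using nested lists as matrices, no libraries assumed):

```python
def mat_add(A, B):
    """Entrywise sum; A and B must have the same dimensions."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * x for x in row] for row in A]

print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]
print(scalar_mul(3, [[1, 2], [3, 4]]))              # [[3, 6], [9, 12]]
```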
Matrix Multiplication
Multiplying A (m×n) by B (n×p) gives C (m×p). The inner dimensions must match.
The entry Cᵢⱼ is the dot product of row i of A with column j of B:
Cᵢⱼ = Σₖ Aᵢₖ Bₖⱼ
[1 2]   [5 6]   [1×5+2×7  1×6+2×8]   [19 22]
[3 4] × [7 8] = [3×5+4×7  3×6+4×8] = [43 50]
Matrix multiplication is not commutative: AB ≠ BA in general (and BA may not even be defined).
It is associative: (AB)C = A(BC).
Distributive: A(B + C) = AB + AC.
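The formula Cᵢⱼ = Σₖ Aᵢₖ Bₖⱼ translates directly into code. A sketch in plain Python, which also demonstrates the non-commutativity noted above:

```python
def mat_mul(A, B):
    """C[i][j] = sum over k of A[i][k] * B[k][j]; inner dimensions must match."""
    n = len(B)      # rows of B must equal columns of A
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_mul(A, B))  # [[19, 22], [43, 50]]
print(mat_mul(B, A))  # [[23, 34], [31, 46]] — AB ≠ BA
```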
Special Matrices
Identity matrix I: square matrix with 1s on the diagonal, 0s elsewhere. Acts as the multiplicative identity: AI = IA = A.
I₂ = [1 0]    I₃ = [1 0 0]
     [0 1]         [0 1 0]
                   [0 0 1]
Zero matrix: all entries zero. Additive identity.
Diagonal matrix: non-zero entries only on the main diagonal.
Transpose: flip rows and columns. (Aᵀ)ᵢⱼ = Aⱼᵢ.

[1 2 3]ᵀ   [1 4]
[4 5 6]  = [2 5]
           [3 6]

Symmetric matrix: A = Aᵀ (equal to its transpose), i.e. Aᵢⱼ = Aⱼᵢ.
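The transpose is a one-liner in Python: `zip(*A)` pairs up the i-th entries of every row, which is exactly the columns of A (a sketch, continuing with nested lists as matrices):

```python
def transpose(A):
    """(Aᵀ)[i][j] = A[j][i] — rows become columns."""
    return [list(col) for col in zip(*A)]

print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]

# A symmetric matrix equals its own transpose:
S = [[1, 2], [2, 5]]
print(transpose(S) == S)  # True
```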
The Determinant
The determinant is a single number computed from a square matrix. It encodes key information about the matrix.
2×2:
det [a b] = ad − bc
    [c d]
3×3 (cofactor expansion along first row):
det [a b c]
    [d e f] = a(ei − fh) − b(di − fg) + c(dh − eg)
    [g h i]
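Both formulas can be written out directly (a sketch in plain Python; the 3×3 version mirrors the cofactor expansion above term by term):

```python
def det2(M):
    """Determinant of a 2×2 matrix [[a, b], [c, d]]: ad − bc."""
    (a, b), (c, d) = M
    return a * d - b * c

def det3(M):
    """Cofactor expansion along the first row of a 3×3 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

print(det2([[1, 2], [3, 4]]))                   # -2
print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 0 — the rows are linearly dependent
```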
What the determinant means
Geometrically: the absolute value of the determinant is the scale factor for areas (2D) or volumes (3D) under the transformation the matrix represents.
- |det(A)| = 2: the transformation doubles areas
- |det(A)| = 0.5: halves areas
- det(A) = 0: the transformation collapses space into a lower dimension (all area/volume becomes zero)
Sign: det > 0 means orientation is preserved (no flip); det < 0 means orientation is reversed.
Key properties
det(AB) = det(A) × det(B)
det(Aᵀ) = det(A)
det(I) = 1
det(cA) = cⁿ det(A) for an n×n matrix A
Row operations and determinants:
- Swap two rows: det changes sign
- Multiply a row by c: det multiplies by c
- Add a multiple of one row to another: det unchanged
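These properties are easy to check on concrete matrices. A small sketch verifying the product rule and the row-swap rule for 2×2 matrices (values chosen for illustration):

```python
def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

A, B = [[1, 2], [3, 4]], [[2, 0], [1, 3]]   # det(A) = -2, det(B) = 6
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]

print(det2(AB) == det2(A) * det2(B))        # True: det(AB) = det(A)·det(B)
print(det2([[3, 4], [1, 2]]) == -det2(A))   # True: swapping rows flips the sign
```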
Singular vs Invertible
A square matrix is singular (non-invertible) if det(A) = 0. It’s invertible (non-singular) if det(A) ≠ 0.
Equivalent conditions for a matrix to be invertible:
- det(A) ≠ 0
- The columns are linearly independent
- Ax = 0 has only the trivial solution x = 0
- A has full rank
- The transformation is bijective (one-to-one and onto)
All these conditions are equivalent — they’re different ways of saying the same thing.
The Inverse
The inverse A⁻¹ of a square matrix A satisfies:
A A⁻¹ = A⁻¹ A = I
2×2 inverse:
A = [a b],   A⁻¹ = 1/(ad−bc) × [ d  −b]
    [c d]                      [−c   a]
Swap the diagonal, negate the off-diagonal, divide by the determinant.
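That recipe, with a guard for the singular case where ad − bc = 0, looks like this in plain Python (a sketch):

```python
def inv2(M):
    """Inverse of a 2×2 matrix; raises ValueError if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (det = 0), no inverse exists")
    # swap the diagonal, negate the off-diagonal, divide by the determinant
    return [[d / det, -b / det], [-c / det, a / det]]

print(inv2([[1, 2], [3, 4]]))  # [[-2.0, 1.0], [1.5, -0.5]]
```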
Why it matters: the matrix equation Ax = b has solution x = A⁻¹b (when A is invertible). This is the matrix version of solving a linear system.
In practice: computing A⁻¹ explicitly is expensive. For solving Ax = b, Gaussian elimination is faster. The inverse is conceptually important but computationally avoided.
Gaussian Elimination
The standard algorithm for solving linear systems. Transform the augmented matrix [A|b] using row operations into row echelon form, then back-substitute.
Solve: x + 2y = 5
3x + 4y = 11
Augmented: [1 2 |  5]
           [3 4 | 11]

R2 → R2 − 3R1:

[1  2 |  5]
[0 −2 | −4]

R2 → R2 / −2:

[1 2 | 5]
[0 1 | 2]
Back-substitute: y = 2, x = 5 − 2(2) = 1
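The hand procedure above generalises to any square system. A sketch in plain Python, with partial pivoting added (picking the largest available pivot for numerical stability, which the small hand calculation didn't need):

```python
def solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting (square A)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A|b]
    for col in range(n):
        # pivot: pick the remaining row with the largest entry in this column
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):        # eliminate entries below the pivot
            f = M[r][col] / M[col][col]
            M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):         # back-substitution, bottom row up
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

print(solve([[1, 2], [3, 4]], [5, 11]))  # ≈ [1.0, 2.0]
```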
Row echelon form: zeros below the diagonal. Reduced row echelon form (RREF): zeros above and below each leading 1. From RREF the solution can be read off directly.
Rank
The rank of a matrix is the number of linearly independent rows (= number of linearly independent columns). It’s the “true” dimensionality of the matrix.
- Full rank: rank = min(m, n) — maximum possible
- Rank deficient: rank < min(m, n) — columns (or rows) are linearly dependent
For a square n×n matrix:
- Rank n: invertible
- Rank < n: singular
Rank-nullity theorem: rank(A) + nullity(A) = n, where nullity is the dimension of the null space (solutions to Ax = 0).
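Rank can be computed by the same elimination process: reduce to row echelon form and count the non-zero rows. A sketch in plain Python (the tolerance guards against floating-point noise being mistaken for a pivot):

```python
def rank(A, tol=1e-9):
    """Rank = number of non-zero rows after Gaussian elimination."""
    M = [row[:] for row in A]
    r = 0  # index of the next pivot row
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if abs(M[i][col]) > tol), None)
        if piv is None:
            continue                       # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):     # eliminate below the pivot
            f = M[i][col] / M[r][col]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

print(rank([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 2 — the rows are dependent
```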
Matrix as Data
In machine learning, data is a matrix: n rows (observations), p columns (features). Every operation — normalisation, dimensionality reduction, regression — is matrix algebra.
Linear regression in matrix form: β̂ = (XᵀX)⁻¹Xᵀy. One formula, any number of features. The normal equations emerge directly from minimising squared error, and the solution is a matrix inverse.
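On a hypothetical tiny dataset the normal equations reduce to a 2×2 solve. A sketch in plain Python (the data is made up for illustration and chosen to fit the line y = 1 + 2x exactly):

```python
# Fit y ≈ β0 + β1·x via β̂ = (XᵀX)⁻¹ Xᵀy on n = 4 observations.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]  # design matrix: intercept column + feature
y = [1, 3, 5, 7]                      # constructed so that y = 1 + 2x exactly

XtX = [[sum(X[k][i] * X[k][j] for k in range(4)) for j in range(2)]
       for i in range(2)]
Xty = [sum(X[k][i] * y[k] for k in range(4)) for i in range(2)]

# Apply the 2×2 inverse formula to solve (XᵀX) β = Xᵀy.
(a, b), (c, d) = XtX
det = a * d - b * c
beta = [(d * Xty[0] - b * Xty[1]) / det,
        (-c * Xty[0] + a * Xty[1]) / det]
print(beta)  # [1.0, 2.0] — intercept 1, slope 2
```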
PCA, SVD, neural network weight updates — all matrix operations. Understanding matrices is understanding why these algorithms work.