Matrix | Ali Gulum

Matrices in Machine Learning

Matrices are one of the most practical tools in a machine learning engineer's toolkit. When you're working with a real dataset, say 1000 records each with 10 features, a matrix gives you a clean, structured way to represent and manipulate that data. In this case, you'd define it as a 1000-row, 10-column matrix, where each row is a data point and each column is a feature. From there, accessing, transforming, or feeding that data into an algorithm becomes straightforward, and most ML frameworks are built around this kind of matrix representation under the hood.
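As a minimal sketch of that layout, assuming plain Python lists and randomly generated placeholder values (the variable names and the data are illustrative, not a real dataset):

```python
import random

# Hypothetical dataset: 1000 records, each with 10 features.
# The values are random placeholders, used only to give the matrix a shape.
NUM_RECORDS, NUM_FEATURES = 1000, 10
random.seed(0)
data = [[random.random() for _ in range(NUM_FEATURES)] for _ in range(NUM_RECORDS)]

rows, cols = len(data), len(data[0])       # 1000 rows, 10 columns
first_record = data[0]                     # one row = one data point
first_feature = [row[0] for row in data]   # one column = one feature across all records
```

Reading a row gives you everything known about one record, while reading a column gives you one feature across the whole dataset.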

Matrix Addition

Adding two matrices is one of the more straightforward matrix operations, but it comes with one firm rule: both matrices must be exactly the same size, with the same number of rows and the same number of columns. You can't add a 2x2 matrix to a 3x2 matrix; the dimensions have to match.

The operation itself is simple. You add each element in the first matrix to the element that sits in the same position (same row, same column) in the second matrix. Every element maps directly to its counterpart, and you compute the sum pair by pair until you've covered the entire matrix.

Taking a concrete example:

5 + 6 = 11    4 + 2 = 6
8 + 9 = 17    3 + 5 = 8

The result is a new matrix of the same dimensions, where each value is the sum of the corresponding elements from the two original matrices.
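The same addition can be sketched in plain Python. The operands below are read off the worked example (first matrix 5, 4 / 8, 3 and second matrix 6, 2 / 9, 5); the function name mat_add is an illustrative choice, not a library call:

```python
def mat_add(a, b):
    """Element-wise sum of two matrices given as lists of rows."""
    # Both matrices must have the same number of rows and columns.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

matrix1 = [[5, 4], [8, 3]]
matrix2 = [[6, 2], [9, 5]]
total = mat_add(matrix1, matrix2)  # [[11, 6], [17, 8]]
```

Passing a 2x2 and a 3x2 matrix to this sketch raises a ValueError, which mirrors the size rule.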

Matrix Subtraction

Matrix subtraction follows the same rules as addition: both matrices must be identical in size, with matching rows and columns. The operation is element-wise in exactly the same way, except that each element in the second matrix is subtracted from the element sitting in the same position in the first matrix.

A clean way to think about it is as the addition of a negative matrix. Instead of subtracting Matrix2 from Matrix1, you're effectively computing Matrix1 + (−Matrix2), where every element in Matrix2 has been flipped to its negative. The result is the same either way.

Taking a concrete example:

5 - 6 = -1    4 - 2 = 2
8 - 9 = -1    3 - 5 = -2

As with addition, the output is a new matrix of the same dimensions, where each value is the difference between the corresponding elements of the two original matrices.
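Here is a small sketch of both views of subtraction, direct and as the addition of a negative matrix. It uses plain Python lists, the operands are taken from the example, and mat_sub is an illustrative name:

```python
def mat_sub(a, b):
    """Element-wise difference a - b of two same-sized matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same dimensions")
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

matrix1 = [[5, 4], [8, 3]]
matrix2 = [[6, 2], [9, 5]]

difference = mat_sub(matrix1, matrix2)  # [[-1, 2], [-1, -2]]

# Same result computed as Matrix1 + (-Matrix2).
negative_matrix2 = [[-y for y in row] for row in matrix2]
via_negation = [[x + y for x, y in zip(ra, rb)]
                for ra, rb in zip(matrix1, negative_matrix2)]
```

Both difference and via_negation hold the same values, which is the equivalence described above.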

Matrix Multiplication

Matrix multiplication comes in two distinct forms, and it's important not to confuse them.

Multiplying by a constant (Scalar Multiplication)

The simpler of the two. When multiplying a matrix by a constant, known as scalar multiplication, you simply multiply every element in the matrix by that constant. No special rules, no dimension requirements.

2 x 5 = 10    2 x 4 = 8
2 x 8 = 16    2 x 3 = 6
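In code, scalar multiplication is a single pass over the elements. This is a sketch with plain Python lists, and scalar_mul is an illustrative name:

```python
def scalar_mul(k, m):
    """Multiply every element of matrix m by the constant k."""
    return [[k * x for x in row] for row in m]

matrix = [[5, 4], [8, 3]]
doubled = scalar_mul(2, matrix)  # [[10, 8], [16, 6]]
```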

Multiplying a Matrix by Another Matrix (Dot Product)

This is where it gets more involved. Multiplying two matrices is built out of dot products, and it only works when the number of columns in the first matrix equals the number of rows in the second. The rule is: take a row of the first matrix and a column of the second, multiply them element by element, then sum the results. That sum becomes a single element in the output matrix, at the position given by that row and that column.

(5x1) + (8x3) + (3x2) = 35 → first row, first column of the result

(5x3) + (8x4) + (3x5) = 62 → first row, second column of the result

You then repeat this process for every row of the first matrix against every column of the second, building up the result matrix one element at a time.
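The row-by-column procedure can be sketched as follows. Only the first row of the first matrix (5, 8, 3) appears in the example, so the sketch uses a one-row matrix rather than guessing the other rows; mat_mul is an illustrative name:

```python
def mat_mul(a, b):
    """Matrix product of a (m x n) and b (n x p), both given as lists of rows."""
    # Each row of a must be as long as b has rows.
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of the first matrix must equal rows of the second")
    columns_of_b = list(zip(*b))
    # Each output element is the dot product of one row of a with one column of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]

a = [[5, 8, 3]]                 # 1 x 3: the first row from the example
b = [[1, 3], [3, 4], [2, 5]]    # 3 x 2
product = mat_mul(a, b)         # [[35, 62]]
```

The result has as many rows as the first matrix and as many columns as the second, so a 1x3 matrix times a 3x2 matrix gives a 1x2 matrix.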

Matrix Inverse

The inverse of a matrix works on the same principle as the reciprocal of a number. Just as the reciprocal of 5 is 1/5, the inverse of matrix A is written as A⁻¹. Multiplying a matrix by its inverse gives you the identity matrix, the matrix counterpart of the number 1, just as 5 x 1/5 = 1.

One important caveat: not every matrix has an inverse. Only square matrices can have one, and a square matrix is invertible only if its determinant is non-zero.

Determinant of a 2x2 Matrix

For a 2x2 matrix with top row a, b and bottom row c, d, the determinant is simply:

det(A) = (a x d) - (b x c)
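A short sketch of the formula, assuming the matrix is stored as [[a, b], [c, d]]. The function name det_2x2 and the second (singular) matrix are illustrative:

```python
def det_2x2(m):
    """Determinant of a 2x2 matrix stored as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

det_invertible = det_2x2([[5, 4], [8, 3]])  # 5*3 - 4*8 = -17
det_singular = det_2x2([[2, 4], [1, 2]])    # 2*2 - 4*1 = 0
```

The first matrix has a non-zero determinant, so it has an inverse. The second does not, because its determinant is zero.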

Inverse of a 2x2 Matrix

To find the inverse of a 2x2 matrix, swap the positions of a and d, put negatives in front of b and c, and divide everything by the determinant. If the determinant is zero, the inverse doesn't exist.
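The recipe can be sketched like this. It assumes integer entries and uses Python's fractions module so the result stays exact; inverse_2x2 is an illustrative name:

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix [[a, b], [c, d]], returned as exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("determinant is zero, so the inverse does not exist")
    # Swap a and d, negate b and c, divide every entry by the determinant.
    return [[d / det, -b / det], [-c / det, a / det]]

matrix = [[5, 4], [8, 3]]
inverse = inverse_2x2(matrix)

# Multiplying the matrix by its inverse should give the 2x2 identity matrix.
check = [[sum(matrix[i][k] * inverse[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
```

Here check comes out as the identity matrix (1, 0 / 0, 1), which confirms that the computed values really are the inverse.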

Matrix Division

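Strictly speaking, matrix division doesn't exist. Instead of dividing by a matrix, you multiply by its inverse, which plays the same role that multiplying by 1/5 plays when dividing by 5. Order matters here, because A x B⁻¹ and B⁻¹ x A are generally different.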
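To make the idea concrete, here is a sketch that "divides" one 2x2 matrix by another by multiplying with the inverse. The matrix b is a hypothetical choice (twice a), picked so the answer is easy to check by eye, and the helper names are illustrative:

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix [[a, b], [c, d]], returned as exact fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("determinant is zero, so the inverse does not exist")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(a, b):
    """Matrix product of a and b, both given as lists of rows."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]

a = [[5, 4], [8, 3]]
b = [[10, 8], [16, 6]]                  # hypothetical: every element is 2 x a

quotient = mat_mul(b, inverse_2x2(a))   # "b divided by a" -> [[2, 0], [0, 2]]
```

Since b is 2 x a, multiplying b by the inverse of a gives 2 times the identity matrix, much as 10 / 5 = 2 for ordinary numbers.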

Matrix Negation

Finding the negative of a matrix is straightforward: multiply every element by -1.

-(5) = -5    -(8) = -8
-(4) = -4    -(3) = -3
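As a one-line sketch in plain Python, with the matrix read off the example and negate as an illustrative name:

```python
def negate(m):
    """Return the negative of matrix m by flipping the sign of every element."""
    return [[-x for x in row] for row in m]

matrix = [[5, 8], [4, 3]]
negated = negate(matrix)  # [[-5, -8], [-4, -3]]
```

Negating twice returns the original matrix.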

Transposing a Matrix

Transposing a matrix means swapping its rows and columns: what was a row becomes a column, and what was a column becomes a row. The transpose of matrix A is written as Aᵀ. It's one of the most frequently used operations in machine learning, particularly when working with dot products and neural network weight calculations.
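A minimal sketch of the transpose using Python's built-in zip. The 3x2 matrix is the second operand from the multiplication example, and transpose is an illustrative name:

```python
def transpose(m):
    """Swap rows and columns: element [i][j] moves to position [j][i]."""
    return [list(column) for column in zip(*m)]

a = [[1, 3], [3, 4], [2, 5]]   # 3 rows, 2 columns
a_t = transpose(a)             # [[1, 3, 2], [3, 4, 5]]: 2 rows, 3 columns
```

A matrix with 3 rows and 2 columns becomes one with 2 rows and 3 columns, and transposing a second time gives back the original.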