The Mathematics behind machine learning — Part I

Naina Chowdhary Vallurupalli
Sep 30, 2023


In the ever-evolving landscape of machine learning, a powerful world is woven into the fabric of mathematics. At the heart of this transformative field lies a deep, symbiotic relationship with mathematical principles. From the algorithms that underpin predictive models to the optimization techniques that fine-tune neural networks, mathematics is the cornerstone of machine learning's remarkable advances. This connection empowers data scientists and machine learning practitioners to translate complex mathematical concepts into practical, data-driven solutions that drive innovation across industries. In this synergy of mathematics and machine learning, the boundaries of what is possible keep expanding, paving the way for new discoveries and applications in our data-driven world. Here is my take on some of the mathematics behind the magical world of ML.

Mathematics connects the world, from picking up a pen on the table to adding up a grocery bill. Every day, and in every aspect of our lives, we are surrounded by mathematics.

With the growing saga of data science and the constant buzz around words like Machine Learning, Artificial Intelligence, Deep Learning, and Gen AI, let's take a step back and look at the microscopic link connecting this huge web: the mathematics behind it.

Concept 1: Linear Algebra

Linear algebra is the bedrock of machine learning. It provides the mathematical underpinnings to solve the equations we use to build models.

Initially, it looked to me like a battle of scalars and vectors, but once I understood its significance I started appreciating its usage.

In simple terms, a scalar is a single numeric value, such as a real number or a complex number. Scalars are typically denoted with lowercase letters; formally, a scalar is an element of the field over which a vector space is defined. In computing terminology, the term scalar is synonymous with variable: a storage location, paired with a symbolic name, that holds a single value.

A vector [here in the context of linear algebra] is a fundamental mathematical object that represents a quantity with both magnitude and direction. Vectors describe many physical quantities in science and engineering, and they are essential for understanding and solving problems involving linear transformations and systems of linear equations. The number of elements in a vector is called the order of the vector, and a vector of order n can represent a point in n-dimensional space. In the spatial sense, the Euclidean distance from the origin to the point represented by the vector gives the length of the vector.

It is often written as x = [x1, x2, x3, …, xn].
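To make that concrete, here is a minimal NumPy sketch (the numbers are purely illustrative) of a scalar, a vector, and the vector's Euclidean length:

```python
import numpy as np

# A scalar is a single numeric value.
temperature = 21.5

# A vector of order 4 (four elements), representing a point in 4-dimensional space.
x = np.array([2.0, -1.0, 3.0, 0.5])

# The length (Euclidean norm) of the vector: the distance from the origin
# to the point the vector represents.
length = np.linalg.norm(x)   # sqrt(2^2 + (-1)^2 + 3^2 + 0.5^2)
print(length)                # ≈ 3.775
```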

A matrix is a group of vectors that all have the same dimension, that is, the same number of columns. In simple terms, it is a two-dimensional array with rows and columns.

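As a quick NumPy sketch (values illustrative), a matrix is just row vectors of equal dimension stacked into a 2-D array:

```python
import numpy as np

# A matrix: three row vectors, each of the same dimension (3 columns),
# stacked into a two-dimensional array.
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(A.shape)   # (3, 3): rows x columns
print(A[1])      # the second row vector: [4 5 6]
```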

A tensor is a multi-dimensional array of numbers with specific transformation properties under coordinate transformations; we can look at a vector as a special case of a tensor. With tensors, the rows extend along the y-axis and the columns along the x-axis; each axis is a dimension, and tensors can have additional dimensions beyond these two. Tensors also have a rank: a scalar has a rank of 0, a vector has a rank of 1, and a matrix has a rank of 2. Anything of rank 3 or above is usually just called a tensor.

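Here is a small NumPy sketch of the rank idea (the arrays are made up for illustration); ndim reports the number of axes:

```python
import numpy as np

scalar = np.array(5.0)            # rank 0: no axes
vector = np.array([1.0, 2.0])     # rank 1: one axis
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])   # rank 2: two axes

# A rank-3 tensor: two stacked 2x2 matrices.
tensor = np.array([[[1, 2], [3, 4]],
                   [[5, 6], [7, 8]]])

# ndim reports the rank (number of axes) of each array.
for t in (scalar, vector, matrix, tensor):
    print(t.ndim)   # 0, 1, 2, 3
```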

A hyperplane is a subspace of one dimension less than its ambient space: in a 3-dimensional space a hyperplane is a 2-dimensional plane, and in a 2-dimensional space it is a 1-dimensional line. Visually, it can be thought of as a mathematical structure that divides an n-dimensional space into separate parts, which makes it useful for classification. Optimizing the hyperplane's parameters is an important part of linear modelling.
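As a rough sketch of how a hyperplane separates points for classification, here is a tiny example; the parameters w and b below are made up for illustration, not learned from data:

```python
import numpy as np

# A hyperplane in n-dimensional space: all points x with w @ x + b == 0.
w = np.array([1.0, -2.0, 0.5])   # normal vector of the hyperplane (3-D space)
b = -1.0                         # offset from the origin

def side(x):
    """Return which side of the hyperplane the point x falls on."""
    return np.sign(w @ x + b)    # +1, -1, or 0 (exactly on the plane)

print(side(np.array([3.0, 0.0, 0.0])))   # +1: one side
print(side(np.array([0.0, 2.0, 0.0])))   # -1: the other side
```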

The dot product, also called the 'scalar product' or 'inner product', takes two vectors of the same length and returns a single number: we match up the entries of the two vectors, multiply them pairwise, and sum the products. The raw dot product reflects how large the individual elements in each vector are; when the vectors are normalized, so that only their relative values matter, the dot product becomes a measure of how similar they are. Mathematically, the dot product of normalized vectors is the cosine similarity.
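A short NumPy sketch of both ideas, with illustrative vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

# Dot product: multiply matching entries, then sum the products.
print(np.dot(a, b))   # 1*2 + 2*4 + 3*6 = 28.0

# Normalize each vector to unit length; the dot product of the
# normalized vectors is the cosine similarity.
cos_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
print(cos_sim)        # 1.0: the vectors point in the same direction
```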

The element-wise product, also called the Hadamard product, takes two vectors (or matrices) of the same shape and produces a result of the same shape, with each pair of corresponding elements multiplied together. It is denoted A ⊙ B, where

C[i][j] = A[i][j] * B[i][j].

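For example, with illustrative matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

# Element-wise (Hadamard) product: C[i][j] = A[i][j] * B[i][j].
C = A * B
print(C)
# [[ 10  40]
#  [ 90 160]]
```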

The outer product is also known as the tensor product of two input vectors. Each element of the column vector is multiplied by every element of the row vector, creating one row of the resultant matrix per column-vector element.
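A quick NumPy sketch with made-up vectors:

```python
import numpy as np

col = np.array([1, 2, 3])   # treated as a column vector
row = np.array([10, 20])    # treated as a row vector

# Each element of the column vector multiplies every element of the
# row vector, producing one row of the resulting 3x2 matrix.
print(np.outer(col, row))
# [[10 20]
#  [20 40]
#  [30 60]]
```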

Vectors play such a pivotal role because a key requirement of machine learning is to take each data type, such as text, time-series, image, audio, or video, and represent it as a vector.
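As one deliberately tiny, hypothetical illustration, here is text turned into a vector using a bag-of-words count over a made-up vocabulary:

```python
import numpy as np

# A toy bag-of-words representation: the vocabulary and sentence here
# are made up purely for illustration.
vocabulary = ["math", "is", "everywhere", "fun"]
sentence = "math is fun fun".split()

# Count how often each vocabulary word appears in the sentence.
vector = np.array([sentence.count(word) for word in vocabulary])
print(vector)   # [1 1 0 2]
```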

