# Machine Learning (in Elixir) - Intro

## Find out what Machine Learning is about and what it looks like in the Elixir world

This post is part of my journey in learning Machine Learning (ML). I'm a student of the decent ML course "Machine Learning with JavaScript". You may wonder: *Why not Python?* Well, to make things more interesting - it won't even be JS!

I decided to go with Elixir and its TensorFlow-like library - Nx! The foundations are the same across all languages and libraries. There's a lot of math, and that's a bit scary. But the journey and the results are so exciting! So don't worry, and let's start the ML journey!

## The goal of Machine Learning

This might sound trivial, but have you ever thought about what ML is really about? The ultimate goal of an ML model is to **guess the result for a given input**. There are plenty of real-life examples from many different fields, for instance:

- What's the predicted **gender** (result) for the given **height** and **weight** (input)?
- Tell me what **number** (result) is on **the image** (input)
- Is this **email** (input) **spam** (result)?
- AI, tell me **everything you know** (result) about the **Elixir programming language** (input)

I'd like to highlight the word **guess** here. ML problems are often complex, and you will almost never get 100% certainty that the answer is correct. Roughly speaking, accuracy above **80%** is good enough, but it strongly depends on the problem being solved.

In the ML world, we call the **input** a set of **features**. The **result** is called a **label**. An individual set of features is called an **example**. So, for the first example of predicting gender based on height and weight, it would look something like this:

```elixir
features = [
  # [height (cm), weight (kg)]
  [180, 75], # 1st example
  [159, 56], # 2nd example
  ...
]

labels = [
  # [gender: 0 - man, 1 - woman]
  [0], # for the 1st example
  [1], # for the 2nd example
  ...
]
```

As you can see, order matters in both directions. Individual features should be placed in specific columns (vertical order). And remember that labels are linked to a particular set of features (horizontal order).

Notice that in the example above we represent gender as a number - that's because ML speaks in numbers. Transforming non-numerical values into numbers is called **encoding**.
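As a quick illustration, here's a minimal sketch of such encoding in plain Elixir. The `Encoding` module, the `encode/2` helper, and the gender mapping are my own illustrative names, not part of any library:

```elixir
# A hypothetical helper that encodes categorical values as numbers,
# using a simple value-to-integer mapping.
defmodule Encoding do
  # Turns a list of categorical values into a list of one-element
  # label rows, like the `labels` structure shown above.
  def encode(values, mapping) do
    Enum.map(values, fn value -> [Map.fetch!(mapping, value)] end)
  end
end

gender_mapping = %{"man" => 0, "woman" => 1}
labels = Encoding.encode(["man", "woman"], gender_mapping)
# labels == [[0], [1]]
```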

## When Machine Learns...

To predict results correctly you need a pre-trained **ML model**. A model is an algorithm - a super math formula. For most problems it's too tough for humans to determine, so we incorporate machines - our computers.

### How humans solve math problems

When we solve a math problem, it's about applying a math formula and calculating its factors; then we can calculate the result. Let's dig deeper into one of the simplest and most useful functions - the linear function.

`y = ax + b`

`y` is the result and `x` is the input. `a` and `b` are factors we call **weights** in ML. If you have at least two **examples** (*x-y* pairs), you can figure out the weights (*a* and *b*), and with them you can calculate any *y* for any given *x*.

That's simple, isn't it? We don't need ML and all the hard stuff at all, right? Well, in the math field or an ideal world - **yes**! But the real world is a bit more complicated...
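To see this "ideal world" case in code: given exactly two example points, we can compute the weights directly. The `LinearFit` module below is just an illustrative sketch, not from any library:

```elixir
# Solving y = ax + b exactly from two (x, y) examples - no ML needed.
defmodule LinearFit do
  # Returns the weights {a, b} of the line passing through both points.
  def solve({x1, y1}, {x2, y2}) do
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    {a, b}
  end

  # With the weights figured out, predicting y for any x is trivial.
  def predict({a, b}, x), do: a * x + b
end

weights = LinearFit.solve({1, 5}, {3, 11})
# weights == {3.0, 2.0}, i.e. y = 3x + 2
LinearFit.predict(weights, 10)
# => 32.0
```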

### How machines do it (in real life)

Math formulas alone won't solve many real problems, because those problems are too complex. But with meaningful data (features with labels), we can apply some math operations and let the machine figure out the **weights** that can then be used for **predictions**.

Let's suppose that we'd like to predict a person's **weight** (label) based on **height** (input). To simplify the task, let's consider **only women**. We can assume that this relationship is somehow linear - larger height = larger weight.

Would a linear function work for this? Nope. But linear regression will. I'll describe it in more detail in the next post. For now, let's assume it's a kind of "average linear function". This will work for real data.

We'll use this dataset for our quick analysis. I generated the graph below in Apple Numbers, so you can see what the data looks like in practice and what the calculated equation is.

Numbers calculated that, for the given data and `y = ax + b`, `a=0.0578` and `b=95.853` (see the top left corner). It calculated the weights (`a` and `b`)! We may say that the "machine learned" and figured it out itself.

The dots are spread out all over the `y` (weight) axis, and only a few of them are close to the line representing the linear function we'd use for predictions. It looks like the accuracy is pretty bad. Why?

Is it because the height-weight relationship isn't linear? Long story short: **height alone is insufficient to accurately determine weight**. It makes sense - in our dataset, women who are 172cm tall weigh anywhere between 62 and 116kg. We can't help it.

### Data quality and quantity matters

In machine learning, **good quality data** (without invalid, fake, or inappropriate values) and the **number of samples** (the more, the better) **are essential for increasing the accuracy of predictions**.

The dataset we used also contains an *index* column, with values ranging from 1-5, indicating whether the weight is relatively good (3), too low (1), or too high (5). Let's reevaluate the height-weight relationship using linear regression in a spreadsheet, but only for rows with an index of 3.

Now `a=0.6232` and `b=-40.047`. And as you can see, the dots are much closer to the prediction line. Much better!

### Training and testing

In the weight-height example, we used Apple Numbers to calculate the *weights* (remember? `a` and `b`). In day-to-day work, we'd write code that loads the data and **does calculations based on it**. We call this step **training**. It's like saying: *"Okay, here's the data, calculate the weights so the accuracy is good enough"*.

We tested the accuracy manually by checking the graphs. It's always cool to plot a graph and do a sanity check. But in practice, we write a math formula that measures the error for some features and labels. Then we use the trained formula to predict the known *labels* for the given test *features*.

To make this work, we split the features-labels data we have into **training** and **test** sets - usually 90% for training and 10% for testing.
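A minimal sketch of such a split in plain Elixir (the `Split` module and its 90/10 default are my own illustrative assumptions; real projects usually shuffle the data first):

```elixir
# Splits paired features and labels into training and test sets.
defmodule Split do
  # train_ratio is the fraction of examples that go into the training set.
  def train_test(features, labels, train_ratio \\ 0.9) do
    train_count = floor(length(features) * train_ratio)

    {train_x, test_x} = Enum.split(features, train_count)
    {train_y, test_y} = Enum.split(labels, train_count)

    %{train: {train_x, train_y}, test: {test_x, test_y}}
  end
end

features = Enum.map(1..10, fn i -> [i] end)
labels = Enum.map(1..10, fn i -> [i * 2] end)

%{train: {train_x, _}, test: {test_x, _}} = Split.train_test(features, labels)
# length(train_x) == 9, length(test_x) == 1
```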

## The quick analysis conclusions

- **Quality and quantity of the data used for ML are essential to get good accuracy**
- Providing **meaningful features** significantly increases the accuracy. The analysis worked poorly with just height; after involving the index, it gave much better results. Imagine how providing BMI, body fat percentage, or waist size could affect the accuracy
- Even with super polished data, **it's still predicting** - you'll almost never get 100% accuracy
- Bonus: a spreadsheet app may also do some simple ML-ish stuff for you

## Other Machine Learning Glossary

### Tensor - (un)necessary wrapper?

The foundational building brick is a **tensor**. It's a data structure that looks like a number or, more often, an array. Tensors are created from plain values by a dedicated ML library, like Nx for Elixir.

Check tensor types based on **dimensions** in the table below. The most common tensors in ML are 2D tensors - **matrices**.

| Dimension | Type | Example |
|-----------|------|---------|
| 0 | Scalar | `123` |
| 1 | Vector | `[1, 2, 3]` |
| 2 | Matrix | `[[1], [2], [3]]` |
| n | n-dimensional tensor | `[[[1], [2]], [[3], [4]]]` |

And let's check out what the tensors look like in Nx.

```elixir
> scalar = Nx.tensor(1.0)
#Nx.Tensor<
  f32
  1.0
>
> vector = Nx.tensor([1, 2, 3])
#Nx.Tensor<
  s64[3]
  [1, 2, 3]
>
> matrix = Nx.tensor([[1], [2], [3]])
#Nx.Tensor<
  s64[3][1]
  [
    [1],
    [2],
    [3]
  ]
>
```

The `Nx.tensor/1` function returns an `Nx.Tensor` struct, which looks a bit... dull. It doesn't seem to be a big deal compared to plain numbers or lists. But **IT IS a big deal!** Why? For these two reasons:

- You can easily **perform any of many ML operations** from the Nx library on tensors
- **Nx is optimized for "crunching numbers"** and is much faster than using plain numbers or lists. You can also use your GPU for calculations, which speeds things up even more

Let's make it clear: You **can** do Machine Learning using plain numbers and lists. But it's much more painful and less performant than learning and using Nx.
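For a taste of what that looks like (assuming Nx is added as a dependency, e.g. `{:nx, "~> 0.7"}` in `mix.exs` - the version here is just an example), whole-tensor math is a single call:

```elixir
heights = Nx.tensor([[180], [159]])

# Element-wise operations work on the whole tensor at once...
Nx.add(heights, 10)

# ...and so do aggregations, like the mean of all elements (169.5).
Nx.mean(heights)
```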

We use tensors for all the data in our ML project: *features*, *labels*, and *weights*.

### A closer look at Nx tensor

Okay, now you know that the tensor is essential in ML. Let's take a closer look at how it works under the hood, using the `matrix` tensor as an example.

```elixir
> matrix = Nx.tensor([[1], [2], [3]])
> Map.from_struct(matrix)
%{
  data: %Nx.BinaryBackend{
    state: <<1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
      0, 0>>
  },
  type: {:s, 64},
  names: [nil, nil],
  shape: {3, 1}
}
```

Now you can see that the data is stored as a binary - that's where the performance comes from. Each tensor has a **type** for the stored numbers. In the example above it's `{:s, 64}`, or `s64`, which is a *signed 64-bit integer*. You can specify the type explicitly when creating a tensor.
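For example (assuming the same Nx setup as before), passing the `:type` option with the shorthand `:f32` - the tuple form `{:f, 32}` works too:

```elixir
> Nx.tensor([1, 2, 3], type: :f32)
#Nx.Tensor<
  f32[3]
  [1.0, 2.0, 3.0]
>
```

Note how the integers were cast to floats to match the requested type.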

Another attribute, `names`, stores names for the axes. Names work like aliases and are optional.

The **shape** is very important. It informs you about the **size of each axis**. `{3, 1}` means the tensor has 3 rows and 1 column. Matrices and shapes are used a lot, so keep this mantra in mind: *"row-column, row-column, row-column, ..."*

The shape is **essential for many math operations** that are performed on tensors, like matrix multiplication. Believe me, tensors are going to be transformed a lot - concatenated, reshaped, multiplied, split, etc. But more on this in the next post.

## Conclusion - Humanly on Machine Learning

I know there was a lot of "talking" and only a few lines of code, but I think it's useful groundwork before diving deeper. I hope this post encourages you to take a closer look at ML - and maybe even with Nx and Elixir..?

But before the end, let's wrap up some concepts:

- The main purpose of Machine Learning (from an end-user perspective) is to **predict some information** based on the given data
- You can try to predict (better or worse) almost anything based on any data, as long as you are able to create an ML model for it (from a data-engineer perspective)
- The better, more meaningful, and larger the data you provide, the better results you'll get
- You will (almost) never achieve 100% accuracy in your predictions; roughly speaking, accuracy above 80% is considered good enough, but it depends on the particular case
- Although it's possible to solve ML problems using the plain data structures provided by a programming language, it's definitely worth learning a dedicated library like PyTorch or TensorFlow (Python), TensorFlow.js (JavaScript), or the mentioned Nx (Elixir)
- Nx and Elixir work great with numbers and ML