The OG of Machine Learning: Linear Regression (With Implementation from Scratch)
I think this has to be the first algorithm most of us ever learned (it certainly was for me), and today we'll talk about Linear Regression in depth.
An example to start with? Yep. The main characters in our story are employees at a company. They work a lot with advanced, cutting-edge computer vision techniques and hire graduates every year. A fresh graduate, Rahul, was hired, but he wasn't happy with his pay. He complained that everyone in the company was getting paid more than him. The company explained to him that as he grows in experience and expands his skillset, he will be paid more. To drive the point home, they showed him a graph of their employees' salaries against experience. Rahul was now satisfied.
But who cares about Rahul? :p Sorry Rahul, but if you look at the graph, the way salary increases seems very… linear, right?
Linear Regression is a supervised machine learning technique that predicts a continuous target by modelling a linear relationship between it and one or more independent features.
Before we move on to the working of the algorithm, it helps to keep a few assumptions in mind. These points tell us whether we are fully leveraging the power of linear regression: if any of them fails to hold, the model isn't the best it can be and you can do better. (There's a quick way to eyeball these checks with a few plots, sketched right after the list.)
- The dependent variable (whatever you're trying to predict) should form a linear relationship with the independent variables (the features you use to predict it).
- The residuals (errors) should follow a normal distribution, the familiar bell curve.
- Multicollinearity is a game-killer: independent features that are strongly correlated with each other make the individual slopes unreliable, so avoid it at all costs.
- The errors must have the same variance everywhere (homoscedasticity), i.e. the spread of the residuals shouldn't grow or shrink as the prediction changes.
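You don't have to take these on faith; a couple of quick plots usually tell you whether they hold. Here's a minimal sketch of what such checks could look like, using made-up data and a made-up set of predictions (the names X, y and predictions here are placeholders, not anything we build later in this post):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data and predictions, purely for illustration --
# swap in your own features, targets and model output.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # three independent features
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.5, size=200)
predictions = 3 * X[:, 0] + 2 * X[:, 1]            # stand-in for a model's output

residuals = y - predictions

# Normality: the histogram of the residuals should look roughly bell-shaped.
plt.hist(residuals, bins=20)
plt.title("Residuals (want: roughly a bell curve)")
plt.show()

# Equal variance: residuals vs. predictions should show no funnel or pattern.
plt.scatter(predictions, residuals, s=10)
plt.axhline(0, color="black")
plt.title("Residuals vs. predictions (want: an even band around 0)")
plt.show()

# Multicollinearity: features strongly correlated with each other are a red flag.
print(np.corrcoef(X, rowvar=False))                # off-diagonal values near +/-1 are bad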
Now, we mainly have two types of Linear Regression.
Simple Linear Regression: this is where you predict something using a single independent feature.
The equation looks something like this: y = mx + c
Multiple Linear Regression: this is where you predict something using multiple independent features. This is usually what you come across in real-world problems.
Equation: y = m1x1 + m2x2 + m3x3 + … + mnxn + c (each feature gets its own slope m, and c is the intercept)
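Written with numpy, the multiple-feature case is just a dot product between a feature matrix and a vector of slopes, plus the intercept. A tiny sketch with made-up numbers, only to show the shape of the computation:

import numpy as np

# Made-up numbers purely for illustration.
X = np.array([[1.0, 2.0, 3.0],       # one row per sample, one column per feature
              [4.0, 5.0, 6.0]])
m = np.array([0.5, -1.0, 2.0])       # one slope per feature
c = 1.5                              # the intercept

y = np.dot(X, m) + c                 # y = m1*x1 + m2*x2 + m3*x3 + c, for every row at once
print(y)                             # -> [ 6.   10.5]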
In the previous graph, did you notice the straight line running through the points? That line is what we are trying to find with linear regression. The points are simply your data scattered around the plot, and the line you end up with is called the best fit line. But the question is: how do you find it?
I'll explain it in two steps:
1. Line formation: take the plot and simply draw a random line through it. Now measure the vertical distance from each point to your line, square each distance, and add them all up. We square them so that the negative errors (points below the line) do not cancel out the positive ones (points above it). This total is called your squared sum of errors!
2. Iterate: now take the same plot and draw another line, slightly tilted this time. The rest stays the same as in step 1.
Let's say I did this 5 times, so I'll have 5 different squared-error totals, right? The line with the smallest total is the best line, since it fits the data points most closely, and that's why it's called the best fit! By general intuition too, we want the option with the least error :)
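Here's a tiny sketch of that idea on made-up points: try a handful of candidate slopes, compute the squared sum of errors for each, and keep the one with the smallest total. (The real algorithm below won't guess blindly like this, but it makes "least error = best fit" concrete.)

import numpy as np

# Made-up points scattered around a roughly linear trend (y is close to 2x).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

candidate_slopes = [0.5, 1.0, 1.5, 2.0, 2.5]   # five different lines (intercept fixed at 0 for simplicity)

best_slope, best_sse = None, float("inf")
for m in candidate_slopes:
    predictions = m * x                        # the line y = m * x
    errors = y - predictions
    sse = np.sum(errors ** 2)                  # square so negatives don't cancel positives
    print(f"slope {m}: squared sum of errors = {sse:.2f}")
    if sse < best_sse:
        best_slope, best_sse = m, sse

print("best fit among the candidates: slope =", best_slope)   # slope 2.0 wins here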
Doing this by hand, line by line, would take forever, though. In the code below we let gradient descent do the nudging for us: start with a line, measure the error, and repeatedly adjust the slope and intercept in the direction that reduces it. Time to coooooooooooooode!
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split


class LinearRegression:  # our model: fits y = X.w + b using gradient descent
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate      # step size for each gradient descent update
        self.n_iters = n_iters       # how many update steps to run
        self.weights = None          # the slopes (one per feature), set in fit()
        self.bias = None             # the intercept, set in fit()

    def fit(self, X, y):
        rows, columns = X.shape
        self.weights = np.zeros(columns)   # start with a flat line: all slopes 0...
        self.bias = 0                      # ...and intercept 0
        for _ in range(self.n_iters):
            # predictions with the current line
            y_predicted = np.dot(X, self.weights) + self.bias
            # gradients of the mean squared error w.r.t. weights and bias
            dw = (1 / rows) * np.dot(X.T, (y_predicted - y))
            db = (1 / rows) * np.sum(y_predicted - y)
            # nudge the line in the direction that reduces the error
            self.weights = self.weights - self.lr * dw
            self.bias = self.bias - self.lr * db

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias


if __name__ == '__main__':
    def mse(true, pred):
        return np.mean((true - pred) ** 2)

    # artificial data: 200 samples, 1 feature, with some noise added
    X, y = datasets.make_regression(
        n_samples=200, n_features=1, noise=20
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    regressor = LinearRegression(learning_rate=0.01, n_iters=1000)
    regressor.fit(X_train, y_train)
    predictions = regressor.predict(X_test)
    err = mse(y_test, predictions)
    print("MSE:", err)

    # plot the training points, the test points and the fitted line
    y_pred_line = regressor.predict(X)
    cmap = plt.get_cmap("viridis")
    fig = plt.figure(figsize=(8, 6))
    m1 = plt.scatter(X_train, y_train, color=cmap(0.9), s=10, label="Training data")
    m2 = plt.scatter(X_test, y_test, color=cmap(0.5), s=10, label="Test data")
    plt.plot(X, y_pred_line, color="black", linewidth=2, label="Prediction")
    plt.legend()
    plt.show()
There's your nicely fitted line. Congratulations, you have just built your first model from scratch!
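If you'd like to sanity-check the from-scratch version, one option is to fit scikit-learn's built-in model on the same split and compare the errors; with enough iterations, the two should land close together. A rough sketch, assuming X_train, X_test, y_train, y_test, mse and err from the code above are still in scope (the import is renamed so it doesn't clash with our own LinearRegression class):

from sklearn.linear_model import LinearRegression as SklearnLR

sk_model = SklearnLR()
sk_model.fit(X_train, y_train)
sk_predictions = sk_model.predict(X_test)

print("sklearn MSE:      ", mse(y_test, sk_predictions))
print("from-scratch MSE: ", err)                          # computed by our own model earlier
print("sklearn slope and intercept:", sk_model.coef_, sk_model.intercept_)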