Learn Linear Regression in Machine Learning

3 min read · Aug 16, 2021

What is Linear Regression

Linear Regression is a supervised Machine Learning approach for solving regression problems. Rather than trying to sort data into separate categories, regression is used to forecast values within a continuous range. Learning a linear regression model involves estimating the coefficient values for the independent input fields which, together with the intercept, define the fitted line.
The known parameters are used to create a consistent, continuous slope that is then used to forecast the unknown outcome.

Because the representation is a linear equation, making a prediction is as simple as solving the equation for a specific set of inputs.

For example, suppose you want to predict weight y from height x1. The simple linear regression model is

y = M0 + M1 * x1

where M0 is the intercept and M1 is the slope.
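To make the equation concrete, here is a minimal sketch of fitting y = M0 + M1 * x1 by ordinary least squares with NumPy. The height and weight numbers are made up for illustration, and the function name `fit_simple_linear_regression` is hypothetical, not part of any library.

```python
import numpy as np

# Hypothetical heights (cm) and weights (kg), invented for this example.
heights = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
weights = np.array([50.0, 58.0, 66.0, 74.0, 82.0])


def fit_simple_linear_regression(x, y):
    """Return (M0, M1) that minimize the squared error of y = M0 + M1 * x."""
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    m1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    m0 = y_mean - m1 * x_mean
    return m0, m1


m0, m1 = fit_simple_linear_regression(heights, weights)


def predict(x):
    """Solve the fitted equation for a given input."""
    return m0 + m1 * x


print("M0 (intercept):", m0)
print("M1 (slope):", m1)
print("predicted weight at 175 cm:", predict(175.0))
```

Once M0 and M1 are estimated, predicting is just plugging a new height into the equation, which is what `predict` does.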

Unlike simple linear regression, multiple linear regression uses several explanatory variables to predict the outcome of a response variable.

A multiple linear regression model looks like this:

Y=a+b1X1+b2X2+b3X3+…+btXt

Here, Y is the variable that you are trying to predict, the X's are the variables that you are using to predict Y, a is the intercept, and the b's are the regression coefficients: each one shows how much a change in a given X predicts a change in Y, everything else being equal.
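The multiple regression equation can be fitted the same way. The sketch below uses `numpy.linalg.lstsq` on a small made-up dataset with two explanatory variables; the data is generated from known values of a, b1, and b2 so the recovered coefficients are easy to check. This illustrates the general idea only and is not how Turi Create fits its models internally.

```python
import numpy as np

# Hypothetical data: 6 observations of 2 explanatory variables (X1, X2).
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 6.0],
    [6.0, 5.0],
])

# Targets generated exactly as Y = 1 + 2*X1 + 3*X2 (assumed for illustration).
Y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first fitted value is the intercept a.
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X_design @ coeffs = Y.
coeffs, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
a, b = coeffs[0], coeffs[1:]

print("intercept a:", a)
print("coefficients b1, b2:", b)
```

Each entry of `b` is the predicted change in Y for a one-unit change in the matching X while the other X is held fixed.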

Evaluating a Regression Model

R squared or Coefficient of Determination
The most commonly used metric for model evaluation in regression analysis is R squared. It can be defined as the ratio of explained variation to total variation. R squared usually lies between 0 and 1; the closer the value is to 1, the better the model fits the data.
Mean Squared Error (MSE)
Another common evaluation metric is mean squared error, which is the mean of the squared differences between actual and predicted values.
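Root Mean Squared Error (RMSE)
The square root of MSE, which expresses the typical difference between actual and predicted values in the same units as the target. Like MSE, RMSE penalizes large mistakes heavily, because the errors are squared before they are averaged.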
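The three metrics above can be computed in a few lines. This is a minimal sketch using NumPy; the helper name `regression_metrics` and the sample values are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import numpy as np


def regression_metrics(actual, predicted):
    """Compute R squared, MSE, and RMSE for paired actual/predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    residuals = actual - predicted
    mse = np.mean(residuals ** 2)   # mean of squared differences
    rmse = np.sqrt(mse)             # same units as the target

    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((actual - actual.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot                       # explained share

    return {"r2": r2, "mse": mse, "rmse": rmse}


# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Here R squared is written as one minus the ratio of unexplained to total variation, which equals the explained share for a least-squares fit that includes an intercept.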

Coding Practice

Now let's try a real-world example: predicting house prices.

First, get the SFrame dataset from this link:
https://drive.google.com/drive/folders/1mTrC013w0BjJlpe3rVdkJmB1pbIy60ON?usp=sharing

!pip install turicreate

Next, mount Google Drive and load the data:

import turicreate
from google.colab import drive

drive.mount('/content/drive')
sales = turicreate.SFrame("/content/drive/MyDrive/week2/sframe/home_data.sframe")
sales

sales.show()

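# Plot price against living area for a slice of the data
turicreate.show(sales[1:1000]['sqft_living'], sales[1:1000]['price'], xlabel="square feet", ylabel="price")

# Split into 80% training data and 20% test data
train_data, test_data = sales.random_split(.8, seed=458)

# Fit a model that uses only living area to predict price
sqft_model = turicreate.linear_regression.create(train_data, target="price", features=['sqft_living'])

# Compare the average test price with the model's error on the test set
test_data["price"].mean()
sqft_model.evaluate(test_data)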

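import matplotlib.pyplot as plt
%matplotlib inline

# Actual prices as dots, model predictions as a line
plt.plot(test_data['sqft_living'], test_data['price'], '.',
         test_data['sqft_living'], sqft_model.predict(test_data), '-')

# Inspect the fitted intercept and slope
sqft_model.coefficients

# Features for a second model with more inputs
house_features = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot', 'floors', 'zipcode']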

turicreate.plot(sales['zipcode'], sales['price'])
mymodel_featured = turicreate.linear_regression.create(train_data, target='price', features=house_features)

Testing Model