Generating Sample Data:
Let us now generate some sample data using our previously developed functions. First, build a polynomial of the form:
p(x) = x + x^2 - 0.2x^3
Next, create zero-mean Gaussian noise with a standard deviation of 0.7:
e(x) ~ N(0, 0.7^2)
The resulting equation is the sum of the polynomial and the noise:
y(x) = p(x) + e(x)
In Python we will use NumPy's np.random.normal(mean, std, n_samples) and simply add the result to the evaluated polynomial. Note that there is nothing special about this choice of polynomial; it is arbitrary and serves only for the sake of presentation.
"""
Polynomials
@author: Abdullah Alnuaimi
"""
import numpy as np
import matplotlib.pyplot as plt


def fit_poly(a, k):
    '''Returns a function A(x) of the dot product (A=V.a).'''
    # For every sample n in x, build the list of terms a_i * n**k_i.
    A = lambda x, a=a, k=k: [[a*n**k for a, k in zip(a, k)] for n in x]
    return A


def evaluate_poly(x, A):
    '''Evaluates A=V.a, stores it in matrix form, and
    returns a list y(x)=[A0,..An] together with the matrix A(x).'''
    y = [sum(i) for i in A(x)]
    return y, A(x)


############################## Main #########################################
# y(x)=x+x^2-0.2x^3
coefficients = [0, 1, 1, -.2]  # polynomial coefficients, lowest degree first
degree = [0, 1, 2, 3]
A = fit_poly(coefficients, degree)  # returns A(x0)...A(xn)
# Evaluate the function; returns p(x)
x = np.linspace(0, 5, 100)
p, _ = evaluate_poly(x, A)
# Fix the random seed and add Gaussian noise to the data.
np.random.seed(seed=1337)
e = np.random.normal(0, .7, len(x))
y = p + e  # list + ndarray is added element-wise and yields an ndarray
# Plotting
plt.scatter(x, y, label='y(x)=x+x^2-0.2x^3+e(x)', marker='.')
plt.plot(x, p, label='p(x)=x+x^2-0.2x^3', color='red')
plt.xlabel('x')
plt.ylabel('y(x)')
plt.legend()
plt.grid()
#np.savetxt('data.csv', (x,y), delimiter=',')
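For comparison, the same data can be produced without the helper functions. The following is only a minimal sketch, assuming NumPy's numpy.polynomial.polynomial.polyval and the np.random.default_rng generator; neither is used in the original script, and the seeded generator yields a different noise sequence than np.random.seed(1337) does.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Coefficients in ascending order of degree: 0 + 1*x + 1*x^2 - 0.2*x^3
coefficients = [0.0, 1.0, 1.0, -0.2]

x = np.linspace(0, 5, 100)
p = P.polyval(x, coefficients)  # evaluates sum(c[i] * x**i) for every x

# Assumption: a seeded Generator instead of the global np.random.seed state.
# The noise values therefore differ from those in the script above.
rng = np.random.default_rng(1337)
e = rng.normal(loc=0.0, scale=0.7, size=x.shape)

y = p + e  # noisy samples of the polynomial
```

Here p and y are NumPy arrays rather than Python lists, so they can be passed straight to plt.plot and plt.scatter.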
Now we are ready to take on the task of linear regression. TL;DR: if you don't care about generating the data for this tutorial, you can just download it as a CSV file and follow along in the next section.
GitHub repository:
https://github.com/b00033811/ml-uae
The file that contains the output is Linear_Regression/data.csv
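A rough sketch of reading such a file back with NumPy is given below. It assumes the layout written by the commented-out np.savetxt('data.csv', (x,y), delimiter=',') call, that is, two comma-separated rows with x first and y second; the file in the repository may be laid out differently. To stay runnable without a download, the sketch first writes stand-in data to a temporary path, which is hypothetical.

```python
import os
import tempfile

import numpy as np

# Stand-in data so the sketch runs without downloading anything.
x = np.linspace(0, 5, 100)
y = x + x**2 - 0.2 * x**3

# Hypothetical local path; replace with the downloaded data.csv.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
np.savetxt(path, (x, y), delimiter=",")  # writes two rows: x, then y

# loadtxt returns a (2, 100) array; unpacking splits it into the two rows.
x_loaded, y_loaded = np.loadtxt(path, delimiter=",")
```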