
Math 4570 Matrix methods for DA and ML

Homework 6.

Use Python or MATLAB for the matrix calculations. Do not use the scikit-learn or statsmodels libraries for the calculations.

Use Mathematica or https://www.wolframalpha.com/help for the calculation of integrals.

Question 1.  Find a least squares approximation to the function $e^{-x}$ by a linear function $a + bx$ on the interval $[1, 2]$. Use the inner product $\langle f, g \rangle = \int_1^2 f(x)g(x)\,dx$.
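For reference, a sketch of the general setup (the generic normal equations, not this problem's specific values): the least squares approximation is the orthogonal projection of $f(x) = e^{-x}$ onto $\mathrm{Span}\{1, x\}$, so the coefficients $a, b$ solve

$$
\begin{pmatrix} \langle 1, 1 \rangle & \langle x, 1 \rangle \\ \langle 1, x \rangle & \langle x, x \rangle \end{pmatrix}
\begin{pmatrix} a \\ b \end{pmatrix}
=
\begin{pmatrix} \langle f, 1 \rangle \\ \langle f, x \rangle \end{pmatrix}.
$$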

Question 2.  Find a least squares approximation for the function $\sin x$ by a quadratic function $a + bx + cx^2$ on the interval $[0, \pi]$. Use the inner product $\langle f, g \rangle = \int_0^\pi f(x)g(x)\,dx$.

Question 3.  For any two continuous functions $f(x)$ and $g(x)$, define the inner product

$\langle f, g \rangle = \int_0^\pi f(x)g(x)\,dx.$

(1) Find an orthogonal basis for the inner product space $P = \mathrm{Span}\{1,\, 2x + 3x^2\}$.

(2) Find the least squares approximation to the function $f(x) = \sin x$ by a quadratic function $a + bx + cx^2$ on the interval $[0, \pi]$.

(You may need: $\int_0^\pi \sin(x)\,dx = 2$, $\int_0^\pi x\sin(x)\,dx = \pi$, $\int_0^\pi x^2\sin(x)\,dx = \pi^2 - 4$, $\int_0^\pi x^3\sin(x)\,dx = \pi(\pi^2 - 6)$.)
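For part (1), a sketch of the usual route is one Gram–Schmidt step, stated here with placeholder names $p_1, p_2$ for the two given spanning functions:

$$
v_1 = p_1, \qquad v_2 = p_2 - \frac{\langle p_2, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1,
$$

so that $\{v_1, v_2\}$ is an orthogonal basis of $\mathrm{Span}\{p_1, p_2\}$.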

Question 4.  Let $X \in \mathbb{R}^{n \times d}$ and $\vec{b} \in \mathbb{R}^n$, and let $J(\theta) = \|X\theta - \vec{b}\|^2$. Here the norm $\|\cdot\|$ is the standard $\ell_2$-norm defined by the dot product. You can use any results in the lecture notes.

(1) Calculate the gradient of the function $J(\theta)$.

(2) Calculate the Hessian matrix of $J(\theta)$.

(3) Write down the update formula for approximating $\arg\min_\theta J(\theta)$ using Gradient Descent, using $\alpha$ for the learning rate.

(4) Write down the update formula for approximating $\arg\min_\theta J(\theta)$ using Newton's method.

(5) Find $\arg\min_\theta J(\theta)$.
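As a sanity check for parts (3) and (4), here is a minimal numpy sketch of the two update rules, assuming the gradient from part (1) takes the usual least squares form $\nabla J(\theta) = 2X^T(X\theta - \vec{b})$ with Hessian $2X^TX$; the toy data below is made up purely for illustration:

    import numpy as np

    # Toy data, made up for illustration only.
    X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 2.0, 2.5])

    def grad_J(theta):
        # Assumed gradient from part (1): 2 X^T (X theta - b).
        return 2 * X.T.dot(X.dot(theta) - b)

    # Gradient Descent: theta <- theta - alpha * grad J(theta).
    theta = np.zeros(2)
    alpha = 0.05
    for _ in range(1000):
        theta = theta - alpha * grad_J(theta)

    # Newton's method: theta <- theta - H^{-1} grad J(theta), with H = 2 X^T X.
    # Since J is quadratic, a single Newton step lands on the minimizer.
    theta_newton = np.zeros(2)
    H = 2 * X.T.dot(X)
    theta_newton = theta_newton - np.linalg.solve(H, grad_J(theta_newton))

    print(theta, theta_newton)  # should agree up to GD convergence error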

Question 5.  Consider the following dataset:

    x(i):  0    0.2  0.4  0.6  0.8  1    1.2  1.4
    y(i):  5.1  6.4  6.1  8.2  9.5  8.6  12   14.8

You may use Python (with only the numpy library) for the matrix calculations.

(1) Use the Method of Least Squares to fit a linear model $f(x) = \theta_0 + \theta_1 x$ to this dataset.

(2) Use the Method of Least Squares to fit a quadratic model $g(x) = \theta_0 + \theta_1 x + \theta_2 x^2$ to this dataset.

(3) Calculate and compare the RSS cost $\mathrm{RSS}(\theta) = \|X\theta - \vec{b}\|^2$ for the above linear fit and quadratic fit.

(4) Plot the graph of the data together with the linear fit and the quadratic fit.
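A minimal numpy sketch for parts (1)-(4), assuming the normal equations approach from lecture (the variable names here are my own):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([0, 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4])
    y = np.array([5.1, 6.4, 6.1, 8.2, 9.5, 8.6, 12, 14.8])

    # Design matrices for the linear and quadratic models.
    X_lin = np.column_stack([np.ones_like(x), x])
    X_quad = np.column_stack([np.ones_like(x), x, x**2])

    # Solve the normal equations X^T X theta = X^T y.
    theta_lin = np.linalg.solve(X_lin.T @ X_lin, X_lin.T @ y)
    theta_quad = np.linalg.solve(X_quad.T @ X_quad, X_quad.T @ y)

    # RSS cost ||X theta - y||^2 for each fit.
    rss_lin = np.sum((X_lin @ theta_lin - y) ** 2)
    rss_quad = np.sum((X_quad @ theta_quad - y) ** 2)
    print(theta_lin, rss_lin)
    print(theta_quad, rss_quad)

    # Part (4): plot the data with both fitted curves.
    xs = np.linspace(0, 1.4, 100)
    plt.scatter(x, y, label="data")
    plt.plot(xs, theta_lin[0] + theta_lin[1]*xs, label="linear fit")
    plt.plot(xs, theta_quad[0] + theta_quad[1]*xs + theta_quad[2]*xs**2,
             label="quadratic fit")
    plt.legend()
    plt.show()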

Question 6.  Let $X$ be the data matrix with mean zero, $Y$ be the label vector, and $\theta = (\theta_1, \dots, \theta_d)^T$ be the parameter vector. Suppose the data follows the linear model $h_\theta(x) = \theta^T x = \theta_1 x_1 + \cdots + \theta_d x_d$. (Zero mean of $X$ implies $\theta_0 = 0$.)

Ridge regression changes the loss function from RSS by adding a term that penalizes the parameters $\theta$ if they get too large: for any positive number $\lambda$, the Ridge loss function is

$\mathrm{Ridge}_\lambda(\theta) = (Y - X\theta)^T (Y - X\theta) + \lambda\, \theta^T \theta.$

(1) Find an expression for the location of the critical point of $\mathrm{Ridge}_\lambda(\theta)$ by solving the gradient equation $\nabla_\theta\, \mathrm{Ridge}_\lambda(\theta) = 0$.

    x1:  -2     -1     0      1     2
    x2:  -3.94  -1.94  -0.14  2.16  3.86
    y:   -3.8   -1.8   0.2    1.7   3.7

a) Fit a linear model $y = \theta_1 x_1 + \theta_2 x_2$ to this dataset when the loss is $\mathrm{RSS} = \|X\theta - Y\|^2$. You should report the best fit function and the RSS cost value. Use Python (with only the numpy library).

b) Fit a linear function to this dataset when the loss is the Ridge loss $J(\theta) = \|X\theta - Y\|^2 + \lambda(\theta_1^2 + \theta_2^2)$ with $\lambda = 1$ and with $\lambda = 10$. You should report the best fit function and the RSS cost value.
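A sketch for a) and b), assuming the critical point from part (1) takes the standard ridge form $(X^TX + \lambda I)\theta = X^TY$ (with $\lambda = 0$ recovering the plain RSS fit of part a); treat it as an outline to check against your own derivation:

    import numpy as np

    # Data from the table above: columns x1, x2 and labels y.
    X = np.array([[-2, -3.94], [-1, -1.94], [0, -0.14], [1, 2.16], [2, 3.86]])
    Y = np.array([-3.8, -1.8, 0.2, 1.7, 3.7])

    def ridge_fit(X, Y, lam):
        # Assumes the critical point solves (X^T X + lam I) theta = X^T Y.
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

    def rss(X, Y, theta):
        # Report the plain RSS cost even for the ridge fits.
        return np.sum((X @ theta - Y) ** 2)

    for lam in [0, 1, 10]:  # lam = 0 is the ordinary least squares fit of a)
        theta = ridge_fit(X, Y, lam)
        print(lam, theta, rss(X, Y, theta))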

Question 7.  Logistic Regression. Consider the categorical learning problem consisting of a data set with two labels:

Label 1:

    X1:   3.81   0.23   3.05   0.68   2.67
    X2:  -0.55   3.37   3.53   1.84   2.74

Label 2:

    X1:  -2.04  -0.72  -2.46  -3.51  -2.05
    X2:  -1.25  -3.35  -1.31   0.13  -2.82

(1) Use gradient descent to find the logistic regression model

$p(Y = 1 \mid x) = \frac{1}{1 + e^{-\theta^T x}}$

and the boundary. (Plot the boundary; only use numpy and Matplotlib.)

Hint for code:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def grad_cost(theta, x, y):
        z = x.dot(theta)
        gradcost = (1 / len(x)) * np.matmul(x.T, (sigmoid(z) - y))
        return gradcost

Define a Gradient Descent function with the number of iterations and the learning rate alpha:

    def GradientDescent(x, y, theta, alpha, iteration):
        for i in range(iteration):
            theta_new = theta - alpha * grad_cost(theta, x, y)
            theta = theta_new
        return theta_new
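A sketch of how the hint functions fit together for this dataset (the array construction and the 0/1 coding of the labels are my choices; the coding below appears consistent with the sample $\theta$ given next, and swapping which label is coded 1 simply flips the sign of $\theta$):

    # Stack both classes; here Label 1 -> y = 0 and Label 2 -> y = 1.
    x1 = np.array([3.81, 0.23, 3.05, 0.68, 2.67,
                   -2.04, -0.72, -2.46, -3.51, -2.05])
    x2 = np.array([-0.55, 3.37, 3.53, 1.84, 2.74,
                   -1.25, -3.35, -1.31, 0.13, -2.82])
    y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

    # Design matrix with a leading column of ones for the intercept theta_0.
    x = np.c_[np.ones(len(x1)), x1, x2]

    theta = GradientDescent(x, y, np.zeros(3), alpha=0.02, iteration=1000)
    print(theta)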

The resulting $\theta$ depends on your initial value $\theta^{(0)}$, the number of iterations, and the learning rate $\alpha$. With $\theta^{(0)} = \vec{0}$, $\alpha = 0.02$, and 1000 iterations, we get our $\theta$:

[-0.04617983, -1.37920924, -1.25274956]

(Your answer may be very different from this. But after dividing by $\theta_2$, the answers should be similar, or the boundary graphs should be similar.)

If you want, you can also record the cross-entropy cost values and plot them. The cross-entropy function can be defined as:

    def CELoss(x, y, theta):
        z = x.dot(theta)
        CE = np.sum(np.matmul(y.T, np.log(sigmoid(z)))
                    + np.matmul((np.ones(y.shape) - y).T,
                                np.log(np.ones(sigmoid(z).shape) - sigmoid(z))))
        return -(1 / len(x)) * CE

The boundary $\theta_0 + \theta_1 X_1 + \theta_2 X_2 = 0$ can be plotted using plt.plot(X1, (-X1 * theta1 - theta0) / theta2, color="blue"). (Here, you only need to plot two points for X1, i.e., the min and the max.)

(2) Try the quadratic Logistic Regression method for this question and obtain a quadratic boundary. (bonus) (Hint: this means using the new features $X_1$, $X_2$, $X_1^2$, $X_1 X_2$, $X_2^2$.)

Remark:  You may get the polynomial features by basic coding: numpy.c_[x, x1*x1, x1*x2, x2*x2] adds the extra columns. If you are allowed to use scikit-learn in labs, you can use sklearn.preprocessing (see CVBootstrap.ipynb in the lecture notes):

    from sklearn.preprocessing import PolynomialFeatures

    # Quadratic
    poly = PolynomialFeatures(degree=2)
    x_poly = poly.fit_transform(x)

Graphing: You may use the following code to draw the graph:

    X, Y = np.meshgrid(np.arange(-4, 4, 0.05), np.arange(-4, 4, 0.05))
    plt.contour(X, Y,
                -0.01614066 - 1.33955452*X - 1.23265001*Y
                + 0.02176921*X*X + 0.20651087*X*Y - 0.11120619*Y*Y,
                [0])
    plt.show()

The same drawing in different ranges.

Question 8.  Consider the classification problem consisting of a data set with two labels:

Label 0:

    X1:  0.2  0.6  2    2.6  3.1  3.8
    X2:  3.4  1.8  2    2.7  3.5  1.5

Label 1:

    X1:  -0.7  -2.1  -2.5  -3    -3.9
    X2:  -2.9  -2.8  -1.3  -2    -1.5

(1) Find the logistic function $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$. (You can use either Scikit-learn or gradient descent.)

(2) Find the formula for the line forming the decision boundary.

(3) Find the probabilities $P(y = 0 \mid x)$ and $P(y = 1 \mid x)$ at the test point $x = (0, 0)^T$ for the logistic model in the above question.

(4) What is the predicted label for the point $x = (0, 0)^T$?

(5) Plot the graph for the data and the boundary.
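For parts (3) and (4), once you have a fitted $\theta$, the probabilities at the test point follow directly from the model. A sketch (the $\theta$ values below are placeholders, not the answer):

    import numpy as np

    def sigmoid(t):
        return 1 / (1 + np.exp(-t))

    theta = np.array([0.1, -1.2, -1.0])  # placeholder values, not the answer
    x_test = np.array([1.0, 0.0, 0.0])   # intercept 1, then X1 = 0, X2 = 0

    p1 = sigmoid(theta.dot(x_test))      # P(y = 1 | x)
    p0 = 1 - p1                          # P(y = 0 | x)
    print(p0, p1, "predicted label:", int(p1 >= 0.5))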





