代做ESE 605-001: Modern Convex Optimization Spring 2020 Final exam代写Python语言
- 首页 >> CSESE 605-001: Modern Convex Optimization
Spring 2020
Final exam
Date: May 2020
This is a 24 hour take-home inal. Please turn it in on Gradescope within 24 hours of your respective start times. The exam is open textbook, open notes, and open homework, and you may cite equations from the book, notes, and homework. You may also refer to online documentation for CVX* and your programming language of choice. Any other outside or online resources are strictly prohibited.
There are six problems on this exam. All problems have equal weight, but we will only count the best ive out of six questions towards your inal exam grade. Some are (quite) straightforward. Others, not so much. Be sure you are using the most recent version of CVX, CVXPY, or Convex.jl.
You may not discuss the exam with anyone until May 7th, 6PM EST, after everyone has taken the exam. The only exception is that you can ask the teaching staf for clariication, via e-mail – please do not use Piazza so as to avoid accidentally sending a public post that was meant to be private. We have made the exam as unambiguous and clear as possible, so we are unlikely to say much.
Assemble your solutions in order (problem 1, problem 2, problem 3, . . . ), starting a new page for each problem. Put everything associated with each problem (e.g., text, code, plots) together; do not attach code or plots at the end of the inal.
We will deduct points from long, needlessly complex solutions, even if they are correct. Our solutions are not long, so if you ind that your solution to a problem goes on and on for many pages, you should try to igure out a simpler one. You do not have to LATEXyour solutions, but we do expect neat, legible exams from everyone.
When a problem involves computation you must give all of the following: a clear discussion and justii- cation of exactly what you did, the source code that produces the result, and the inal numerical results or plots. Remember, the more information and explanation that you give us regarding your thought process, the easier it is for us to give you partial marks!
Files containing problem data can be found as ProbX.mat and ProbX.* iles here:
https://www.dropbox.com/sh/hpykg2hwvvjbaly/AAClHOs_qe9hfDa5R7DpELlTa?dl=0
Please respect Penn’s Academic Integrity Code: although we allow you to work on homework assignments in small groups, you cannot discuss the inal with anyone until everyone has taken it.
Check your email often during the exam, just in case we need to send out an important announcement.
Some of the data iles generate random data (with a ixed seed), which are not necessarily the same for Matlab, Python, and Julia.
By taking this exam you are agreeing to respect Penn’s Academic Integrity Code. Good luck!
1. Optimal spacecraft landing. We consider the problem of optimizing the thrust proile for a spacecraft to carry out a landing at a target position. The spacecraft dynamics are
where m > 0 is the spacecraft mass, e3 = [0, 0, 1]T, p(t) 2 R3 is the spacecraft position, with 0 the target landing position and p3 (t) representing height, f(t) 2 R3 is the thrust force, and g > 0 is the gravitational acceleration. (For simplicity we assume that the spacecraft mass is constant. This is not always a good assumption, since the mass decreases with fuel use. We will also ignore any atmospheric friction.) We must have p(Ttd ) = 0 and˙(p)(Ttd ) = 0, where Ttd is the touchdown time. The spacecraft must remain in a region given by
where α > 0 is a given minimum glide slope. The initial position p(0) and velocity ˙(p)(0) are given.
The thrust force f(t) is obtained from a single rocket engine on the spacecraft, with a given maximum thrust; an attitude control system rotates the spacecraft to achieve any desired direction of thrust. The thrust force is therefore characterized by the constraint If(t)I2 Fmax. The fuel use rate is proportional to the thrust force magnitude, so the total fuel use is
where > 0 is the fuel consumption coefficient. The thrust force is discretized in time,i.e., it is constant over consecutive time periods of length h > 0, with f(t) = fk for t 2 [(k - 1)h, kh), for k = 1, . . . , K, where Ttd = Kh. Therefore we have vk+1 = vk + (h/m)fk - hge3 , pk+1 = pk + (h/2)(vk + vk+1), where pk denotes p((k - 1)h), and vk denotes˙(p)((k - 1)h). We will work with this discrete-time model.
For simplicity, we will impose the glide slope constraint only at the times t = 0, h, 2h, ..., Kh.
(a) Minimum fuel descent. Explain how to ind the thrust proile f1 , . . . , fK that minimizes fuel consumption, given the touchdown time Ttd = Kh and discretization time h.
(b) Minimum time descent. Explain how to ind the thrust proile that minimizes the touch-down time, i.e., K, with hixed and given. Your method can involve solving several convex optimization problems.
(c) Carry out the methods described in parts (a) and (b) above on the problem instance with data given in Prob1.mat. Report the optimal total fuel consumption for part (a), and the minimum touchdown time for part (b). Helper functions for plotting code (commented out) can be found in Prob1.* to help you visualize your solution. Use the code to plot the spacecraft trajectory and thrust proiles you obtained for parts (a) and (b).
Hint: in Julia, the plot will come out rotated.
Remarks. If you’d like to see the ideas of this problem in action, watch these videos:
. http://www.youtube.com/watch?v=2t15vP1PyoA
. https://www.youtube.com/watch?v=orUjSkc2pG0
. https://www.youtube.com/watch?v=1B6oiLNyKKI
. https://www.youtube.com/watch?v=ZCBE8ocOkAQ
2. Maximizing algebraic connectivity of a graph. Let G = (V, E) be a weighted undirected graph with n = jVj nodes, m = jEj edges, and weights w1 , . . . , wm 2 R+ on the edges. If edge k connects nodes i and j, then deine ak 2 Rn as (ak )i = 1, (ak )j = — 1, with other entries zero. The weighted Laplacian (matrix) of the graph is deined as
where A = [a1 · · · am ] 2 Rnxm is the incidence matrix of the graph. Nonnegativity of the weights implies L > 0.
Denote the eigenvalues of the Laplacian L as
which are functions of w. The minimum eigenvalue λ1 is always zero, while the second smallest eigenvalue λ2 is called the algebraic connectivity of G and is a measure of the connectedness of a graph: The larger λ2 is, the better connected the graph is. It is often used, for example, in analyzing the robustness of computer networks.
Though not relevant for the rest of the problem, we mention a few other examples of how the algebraic connectivity can be used. These results, which relate graph-theoretic properties of G to properties of the spectrum of L, belong to a ield called spectral graph theory. For example, λ2 > 0 if and only if the graph is connected. The eigenvector v2 associated with λ2 is often called the Fiedler vector and is widely used in a graph partitioning technique called spectral partitioning, which assigns nodes to one of two groups based on the sign of the relevant component in v2. Finally, λ2 is also closely related to a quantity called the isoperimetric number or Cheeger constant of G, which measures the degree to which a graph has a bottleneck.
The problem is to choose the edge weights w 2 Rm+subject to some linear inequalities (and the
nonnegativity constraint) so as to maximize the algebraic connectivity:
with variable w 2 Rm . The problem data are A (which gives the graph topology), and F and g (which describe the constraints on the weights).
(a) Describe how to solve this problem using convex optimization.
(b) Numerical example. Solve the problem instance given in Prob2.mat, which uses F = 1T and g = 1 (so the problem is to allocate a total weight of 1 to the edges of the graph). Compare the algebraic connectivity for the graph obtained with the optimal weights w? to the one obtained with wunif = (1/m)1 (i.e., a uniform allocation of weight to the edges). You will ind that the optimal weight vector w? has some zero entries (which due to the inite precision of the solver, will appear as small weight values); you may want to round small values (say, those under 10-4) of w? to exactly zero. Briely comment on the following (incorrect) intuition: “The more edges a graph has, the more connected it is, so the optimal weight assignment should make use of all available edges.”
3. Newton method for approximate total variation de-noising. Total variation de-noising is based on the bi-criterion problem with the two objectives
Here xcor 2 Rn is the (given) corrupted signal, x 2 Rn is the de-noised signal to be computed, and φtv is the total variation function. This bi-criterion problem can be formulated as an SOCP, or, by squaring the irst objective, as a QP. In this problem we consider a method used to approximately formulate the total variation de-noising problem as an unconstrained problem with twice diferentiable objective, for which Newton’s method can be used.
We irst observe that the Pareto optimal points for the bi-criterion total variation de-noising problem can found as the minimizers of the function
where µ ≥ 0 is parameter. (Note that the Euclidean norm term has been squared here, and so is twice diferentiable.) In approximate total variation de-noising, we substitute a twice diferentiable approximation of the total variation function,
for the total variation function φtv . Here > 0 is parameter that controls the level of approximation. In approximate total variation de-noising, we use Newton’s method to minimize
(The parameters µ > 0 and > 0 are given.)
(a) Find expressions for the gradient and Hessian of ψ .
Hint. Deine the funciton f : R ! R
and F : R(n-1) ! R as F (u1 , . . . , un-1 ) = f (ui ). Also deine the forward diference matrix D as
These deinitions should come in handy in answering part (a).
(b) Explain how you would exploit the structure of the Hessian to compute the Newton direction for ψ e伍ciently. (Your explanation can be brief.)
(c) Implement Newton’s method for approximate total variation de-noising. Get the corrupted signal xcor from the ile Prob3.mat, and compute the de-noised signal x* , using parameters = 0.001, µ = 50 (which are also in the ile). Use line search parameters α = 0.01, β = 0.5, initial point x(0) = 0, and stopping criterion λ2 /2 10-8 . Plot the Newton decrement versus iteration, to verify asymptotic quadratic convergence. Plot the inal smoothed signal x* , along with the corrupted one xcor.
Hint. Make sure to use sparse matrix representations in your code to speed up the Newton step solves! The functions spdiags and speye may come in handy in Matlab; scipy.sparse.diags, scipy.sparse.identity, .tocsr(), and scipy.sparse.linalg.spsolve may come in handy in Python; spdiagm in the SparseArrays module may come in handy in Julia.
4. Maximum likelihood estimation of an increasing nonnegative signal. We wish to estimate a scalar signal x(t), for t = 1, 2, . . . , N , which is known to be nonnegative and monotonically nondecreasing:
This occurs in many practical problems. For example, x(t) might be a measure of wear or deterioration, that can only get worse, or stay the same, as time t increases. We are also given that x(t) = 0 for t 0.
We are given a noise-corrupted moving average of x over a horizon k passed through a ilter h, via:
where v(t) are independent Ⅵ (0, 1) random variables.
(a) Show how to formulate the problem of inding the maximum likelihood estimate of x, given y , taking into account the prior assumption that x is nonnegative and monotonically nondecreasing, as a convex optimization problem. Be sure to indicate what the problem variables are, and what the problem data are.
(b) We now consider a speciic instance of the problem, with problem data (i.e., N, k, h, and y) given in the ile Prob4.mat. (This ile contains the true signal xtrue, which of course you cannot use in creating your estimate.) Find the maximum likelihood estimate ˆ(x)ml , and plot it, along with the true signal. Also ind and plot the maximum likelihood estimate ˆ(x)ml ;free not taking into account the signal nonnegativity and monotonicity.
Hints.
. Matlab: The function conv (convolution) is overloaded to work with CVX.
. Python: Numpy has a function convolve which performs convolution. CVXPY has conv which does the same thing for variables.
. Julia: The function conv is overloaded to work with Convex.jl.
5. Sparse index tracking. The (weekly, say) return of n stocks is given by a random variable r 2 Rn , with mean and covariance E(r - )(r - )T = Σ > 0. An index (such as S&P 500 or Wilshire 5000) is a weighted sum of these returns, given by z = cTr, where c 2 R . (For example, the vector c is nonzero only for the stocks in the index, and the coefficients ci might be proportional to some measure of market capitalization of stock i.) We will assume that the index weights c 2 Rn , as well as the return mean and covariance and Σ, are known and ixed.
Our goal is to ind a sparse weight vector w 2 Rn , which can include negative entries (meaning, short positions), so that the RMS index tracking error, deined as
does not exceed 0.10 (i.e., 10%). Of course, taking w = c results in E = 0, but we are interested in inding a weight vector with (we hope) many fewer nonzero entries than c has.
Remark. This is the idea behind an index fund: You ind a sparse portfolio that replicates or tracks the return of the index (within some error tolerance). Acquiring (and rebalancing) the sparse tracking portfolio will incur smaller transactions costs than trading in the full index.
(a) Propose a (simple) heuristic method based on convex optimization for inding a sparse weight vector w that satisies E 0.10.
(b) Carry out your method on the problem instance given in Prob5.mat. Give card(w), the number of nonzero entries in w. (To evaluate card(w), use sum(abs(w)>0.01), which treats weight components smaller than 0.01 as zero.) (You might want to compare the index weights and the weights you ind by typing [c w]. No need to print or turn in the resulting output, though.)
6. Optimal generator dispatch. In the generator dispatch problem, we schedule the electrical output power of a set of generators over some time interval, to minimize the total cost of generation while exactly meeting the (assumed known) electrical demand. One challenge in this problem is that the generators have dynamic constraints, which couple their output powers over time. For example, every generator has a maximum rate at which its power can be increased or decreased.
We label the generators i = 1, . . . , n, and the time periods t = 1, . . . , T. We let pi,t denote the (nonnegative) power output of generator i at time interval t. The (positive) electrical demand in period t is dt. The total generated power in each period must equal the demand:
Each generator has a minimum and maximum allowed output power:
The cost of operating generator i at power output u is φi (u), where φi is an increasing strictly convex function. (Assuming the cost is mostly fuel cost, convexity of φi says that the thermal efficiency of the generator decreases as its output power increases.) We will assume these cost functions are quadratic: φi (u) = αiu + βiu2 , with αi and βi positive.
Each generator has a maximum ramp-rate, which limits the amount its power output can change over one time period:
In addition, changing the power output of generator i from ut to ut+1 incurs an additional cost ψi (ut+1 - ut ), where ψi is a convex function. (This cost can be a real one, due to increased fuel use during a change of power, or a ictitious one that accounts for the increased maintenance cost or decreased lifetime caused by frequent or large changes in power output.) We will use the power change cost functions ψi (v) = ijvj, where i are positive.
Power plants with large capacity (i.e., Pi(max)) are typically more efficient (i.e., have smaller αi , βi ),
but have smaller ramp-rate limits, and higher costs associated with changing power levels. Small gas- turbine plants (‘peakers’) are less efficient, have less capacity, but their power levels can be rapidly changed.
The total cost of operating the generators is
Choosing the generator output schedules to minimize C, while respecting the constraints described above, is a convex optimization problem. The problem data are dt (the demands), the generator power
limits Pi(min) and Pi(max), the ramp-rate limits Ri , and the cost function parameters αi , βi , and i. We
will assume that problem is feasible, and that pt(i) are the (unique) optimal output powers.
(a) Price decomposition. Show that there exist power prices Q1 , ..., QT such that the original problem decomposes into n decoupled subproblems
The objective here is the portion of the objective for generator i, minus the revenue generated by the sale of power at the prices Qt. Note that this problem involves only generator i; it can be solved independently of the other generators (once the prices are known). How would you ind the prices Qt?
You do not have to give a full formal proof; but you must explain your argument fully. You are welcome to use results from the text book.
(b) Solve the generator dispatch problem with the data given in Prob6.mat, which gives (fake, but not unreasonable) demand data for 2 days, at 15 minute intervals. Plot the demand, optimal generator powers, and prices. Comment on anything you see in your solution that might at irst seem odd. Using the prices found, solve the problems in part (a) for the generators separately, to
be sure they give the optimal powers (up to some small numerical errors).
Hint. These documentation pages may come in handy:
. Matlab: http://cvxr.com/cvx/doc/basics.html#dual-variables
. Python: https://www.cvxpy.org/tutorial/advanced/index.html#dual-variables . Julia: https://www.juliaopt.org/Convex.jl/stable/advanced/#Dual-Variables-1
Remark. While beyond the scope of this course, we mention that there are very simple price update mechanisms that adjust the prices in such away that when the generators independently schedule themselves using the prices (as described above), we end up with the total power generated in each period matching the demand, i.e., the optimal solution of the whole (coupled) problem. This gives a decentralized method for generator dispatch.