PSTAT 131 Final Project

Submission Contents

You should submit a .zip file that contains the following:

- the data set(s) of your choice, which should be suitable for a machine learning project

- an .Rmd (R Markdown) file containing your project, in the form of a written report

- the knitted .html or .pdf file containing your project

- any .R files (R scripts) containing work on your project. The degree of organization of these can vary, but they should at least have meaningful file titles, like "eda.R" or "missing_data_analyses.R," etc.

- any raw data files. Exceptions can be made. For instance, if your data files are very large (many megabytes), you don't have to submit them. If your data is proprietary or confidential, you don't have to submit it.

- a code book. This should take the form of a document (either .doc, .html, .pdf, or .txt) that, at minimum, identifies and defines each column in your final data set. If a variable takes on different values (for example, 1 = "single," 2 = "married," etc.), those values should be defined in the code book.
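As a rough illustration of the code book item above, the following base R sketch builds a table with one row per column and writes it to a .txt file. The data frame `survey`, the helper `make_codebook()`, and the output file name are all hypothetical and not part of the assignment; the definitions still have to be written by hand.

```r
# Hypothetical toy data set for illustration; replace with your final data.
survey <- data.frame(
  age = c(34, 51, 27),
  marital_status = c(1, 2, 1)
)

# Hypothetical helper: one row per column, with its storage type,
# a hand-written definition, and the meaning of any coded values.
make_codebook <- function(df, definitions, value_labels = list()) {
  vars <- names(df)
  labels <- vapply(vars, function(v) {
    if (is.null(value_labels[[v]])) "" else value_labels[[v]]
  }, character(1))
  data.frame(
    variable = vars,
    type = vapply(df, function(col) class(col)[1], character(1)),
    definition = unname(definitions[vars]),
    values = unname(labels),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

codebook <- make_codebook(
  survey,
  definitions = c(age = "Respondent age in years",
                  marital_status = "Marital status (coded)"),
  value_labels = list(marital_status = '1 = "single", 2 = "married"')
)

# Write the code book as a tab-separated .txt file (one of the accepted formats).
# tempdir() is used here only so the sketch runs anywhere.
write.table(codebook, file = file.path(tempdir(), "codebook.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
```

Keeping the code book as a data frame makes it easy to check that every column of the final data set has an entry before you export it.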

If the .zip file is too large to submit via Canvas, you may submit it to the instructor (me) personally via email, either as an attachment or via Google Drive, etc.

Report Contents

Your final project report should be written similarly to a paper, with figures, code, and results included throughout to illustrate your points and findings. Text should be included to guide the reader. I recommend reading through the example projects to get an idea of this layout, and referencing the project rubric for more information. Specifically, your report must contain:

- An introduction section: Describes the data, the research questions, provides any background readers need to understand your project, etc.

- A conclusion section: Discusses the outcome(s) of models you fit. Which models performed well, which performed poorly? Were you surprised by model performance? Next steps? General conclusions?

- A table of contents

- A section for exploratory data analysis: This should contain at least 3 to 5 visualizations and/or tables, along with their interpretation and discussion. At minimum, you should create a univariate visualization of the outcome(s) and a bivariate or multivariate visualization of the relationship(s) between the outcome and select predictors. Part of an EDA involves asking questions about your data and exploring your data to find the answers.

- A section discussing data splitting and cross-validation: Describe your process of splitting data into training, test, and/or validation sets. Describe the process of cross-validation. Remember to write for a general audience. Act as if your project will be read by people new to machine learning.

- A section discussing model fitting: Describe the types of models you fit, their parameter values, and the results.

- Model selection and performance: A table and/or graph describing the performance of your best-fitting model on testing data. Describe your best-fitting model however you choose, and the quality of its predictions, etc.
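To make the data splitting, cross-validation, and test-set evaluation items above concrete, here is a minimal base R sketch. It is only one way to do this, under assumptions that are not part of the assignment: the data are simulated, the 80/20 split and k = 5 folds are arbitrary choices, and a simple linear model with RMSE stands in for whatever models and metrics your project actually uses.

```r
set.seed(131)  # fix the random seed so the split is reproducible

# Simulated data for illustration only; replace with your own data set.
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# 1. Hold out 20% of the rows as a test set; it is not touched until the end.
test_idx <- sample(n, size = 0.2 * n)
train <- dat[-test_idx, ]
test <- dat[test_idx, ]

# 2. Assign each training row to one of k folds at random.
k <- 5
fold <- sample(rep(1:k, length.out = nrow(train)))

# 3. For each fold: fit on the other k - 1 folds, assess on the held-out fold.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
cv_rmse <- sapply(1:k, function(i) {
  fit <- lm(y ~ x, data = train[fold != i, ])
  rmse(train$y[fold == i], predict(fit, newdata = train[fold == i, ]))
})
mean(cv_rmse)  # cross-validated estimate of prediction error

# 4. Refit on all training data and evaluate once on the test set.
final_fit <- lm(y ~ x, data = train)
test_rmse <- rmse(test$y, predict(final_fit, newdata = test))
```

The key design point is that model comparison happens only on the cross-validation results from the training set, and the test set is used a single time to report the performance of the chosen model.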



