PSTAT 131 Final Project

Submission Contents

You should submit a .zip file that contains the following:

- your data set(s) of choice, which should be suitable for a machine learning project

- an .Rmd (R Markdown) file containing your project, in the form of a written report

- the knitted .html or .pdf file containing your project

- any .R files (R scripts) containing work on your project. The degree of organization of these can vary, but they should at least have meaningful file titles, like "eda.R" or "missing_data_analyses.R," etc.

- any raw data files. Exceptions can be made. For instance, if your data files are huge in terms of megabytes, you don't have to submit them. If your data is proprietary or confidential, you don't have to submit it.

- a code book. This should take the form of a document (either .doc, .html, .pdf, or .txt) that, at minimum, identifies and defines each column in your final data set. If a variable takes on different values (for example, 1 = "single," 2 = "married," etc.), those values should be defined in the code book.
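
As an illustration of the code book item above, here is a minimal sketch in base R that builds a code book as a table with one row per column and writes it to a .txt file. The data frame `survey`, its columns, the definitions, and the file name `codebook.txt` are all hypothetical placeholders, not part of the assignment; a hand-written document is equally acceptable.

```r
# Hypothetical example data; replace with your own final data set.
survey <- data.frame(
  age = c(34, 51, 27),
  marital_status = c(1, 2, 1)
)

# One row per column: name, storage type, definition, and coded values (if any).
codebook <- data.frame(
  variable = names(survey),
  type = vapply(survey, function(col) class(col)[1], character(1),
                USE.NAMES = FALSE),
  definition = c(
    "Respondent age in years",       # hypothetical definition
    "Marital status of respondent"   # hypothetical definition
  ),
  values = c(
    NA,                              # no coded values for age
    "1 = single; 2 = married"
  ),
  stringsAsFactors = FALSE
)

# Write a tab-separated plain-text code book to include in the .zip file.
write.table(codebook, file = "codebook.txt", sep = "\t",
            row.names = FALSE, quote = FALSE, na = "")
```

Generating the variable names from `names(survey)` keeps the code book in sync with the data set, while the definitions and value labels still have to be written by hand.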

If the .zip file is too large to submit via Canvas, you may submit it to the instructor (me) personally via email, either as an attachment or via Google Drive, etc.

Report Contents

Your final project report should be written similarly to a paper, with figures, code, and results included throughout to illustrate your points and findings. Text should be included to guide the reader. I recommend reading through the example projects to get an idea of this layout, and referencing the project rubric for more information. Specifically, your report must contain:

- An introduction section: Describes the data, the research questions, provides any background readers need to understand your project, etc.

- A conclusion section: Discusses the outcome(s) of models you fit. Which models performed well, which performed poorly? Were you surprised by model performance? Next steps? General conclusions?

- A table of contents

- A section for exploratory data analysis: This should contain at least 3 to 5 visualizations and/or tables and their interpretation/discussion. At minimum, you should create a univariate visualization of the outcome(s), a bivariate or multivariate visualization of the relationship(s) between the outcome and select predictors, etc. Part of an EDA involves asking questions about your data and exploring your data to find the answers.

- A section discussing data splitting and cross-validation: Describe your process of splitting data into training, test, and/or validation sets. Describe the process of cross-validation. Remember to write for a general audience. Act as if your project will be read by people new to machine learning.

- A section discussing model fitting: Describe the types of models you fit, their parameter values, and the results.

- Model selection and performance: A table and/or graph describing the performance of your best-fitting model on testing data. Describe your best-fitting model however you choose, and the quality of its predictions, etc.
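
The data splitting, cross-validation, and test-set evaluation steps listed above can be sketched in base R as follows. This is only an illustration under stated assumptions: the simulated data, the 80/20 split, the choice of 5 folds, the linear model, and RMSE as the metric are all hypothetical choices, and your project may use different packages, models, and metrics.

```r
set.seed(131)  # fix the random seed so the split is reproducible

# Hypothetical data: 100 rows, one predictor, one numeric outcome.
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# Hold out 20% of rows as a test set; the rest is used for training.
test_idx <- sample(seq_len(n), size = round(0.2 * n))
train <- dat[-test_idx, ]
test <- dat[test_idx, ]

# Assign each training row to one of k folds at random.
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = nrow(train)))

# For each fold: fit on the other k - 1 folds, then compute RMSE
# on the held-out fold.
cv_rmse <- vapply(seq_len(k), function(fold) {
  fit <- lm(y ~ x, data = train[fold_id != fold, ])
  held_out <- train[fold_id == fold, ]
  pred <- predict(fit, newdata = held_out)
  sqrt(mean((held_out$y - pred)^2))
}, numeric(1))

# Only after choosing a model: refit on all training data and
# evaluate once on the test set.
final_fit <- lm(y ~ x, data = train)
test_rmse <- sqrt(mean((test$y - predict(final_fit, newdata = test))^2))

# A small performance table of the kind the report could include.
performance <- data.frame(
  model = "linear regression",
  cv_rmse = mean(cv_rmse),
  test_rmse = test_rmse
)
print(performance)
```

The test set is touched only once, at the very end, so that the reported test error is an honest estimate of how the chosen model performs on new data.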