Group Assignment
Course: Business Data Analytics
Objective
The main objective of this assignment is to provide you with hands-on experience in analyzing a research question of your choice and writing a report on it using the techniques in R and business data analytics methods you have learned from the course. This assignment consists of five tasks as outlined below:
Task 1: Join a Group
This assignment must be submitted as a group project. Individual submissions will not be accepted. You are required to register in a group by going to MyUni → People → Group Assignment and selecting a group to join. By joining a group, you agree to work collaboratively with other group members as a team. It is expected that each group member contributes to the group assignment. Failure to contribute may result in the removal of your name from the final submission, leading to a zero mark for the assignment. Any disputes among group members should be first resolved within the group and then discussed with me later.
Note that you can have team members from different tutorials.
Task 2: Pick a Question
Your next task is to choose a research question that can be answered using a specific dataset. For this assignment, we will focus on the data available from the Gapminder website. You need to be familiar with this website and the data it provides. On the website, you will find a range of topics that prompt research questions. You can use these questions as inspiration or develop a research question of your own. Your report should clearly state your research question/hypothesis, explain why it is interesting to explore, and provide background information on the topic if necessary. Once you have chosen your question, one group member must enter it into the Google Sheet Form provided on the Group Assignment page on MyUni. Entries will be considered on a first-come, first-served basis, and we will notify you whether your question is preliminarily approved or not. To improve your chances of approval, enter your question quickly and ensure that it is not similar to others on the form. Upon receiving approval, you will have the opportunity to further refine your central question and highlight its importance in your report submission.
The deadline for submitting the Google Sheet Form is May 12, 2024, by 23:59.
Task 3: Data Collection and Wrangling
Once you have chosen your research question, you need to extract relevant data from the Gapminder webpage and wrangle it to answer your question. Data wrangling is an essential step to ensure that the transformed data is clean, organized, and has the desired format ready for analysis. Depending on your research topic, you may need to extract a few variables, and your task is to ensure that data on those variables are correctly merged. All data wrangling work must be done in R.
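As a rough illustration of the merging step, the sketch below reshapes two small wide tables into long format and joins them on country and year. It assumes the downloaded indicators come in a wide layout (one row per country, one column per year); the data frames, their values, and the helper `to_long` are hypothetical placeholders, not real Gapminder figures or a prescribed method.

```r
# Toy stand-ins for two downloaded indicators (made-up values, assumed wide
# layout: one row per country, one column per year).
life_wide <- data.frame(
  country = c("Australia", "Brazil", "Chad"),
  `2000` = c(79.7, 70.1, 47.6),
  `2010` = c(82.0, 73.3, 50.5),
  check.names = FALSE
)
gdp_wide <- data.frame(
  country = c("Australia", "Brazil", "India"),
  `2000` = c(35000, 11000, 2500),
  `2010` = c(41000, 14000, 4400),
  check.names = FALSE
)

# Hypothetical helper: reshape a wide table into long format with the
# columns country, year and <value_name>.
to_long <- function(wide, value_name) {
  year_cols <- setdiff(names(wide), "country")
  long <- data.frame(
    country = rep(wide$country, times = length(year_cols)),
    year = rep(as.integer(year_cols), each = nrow(wide)),
    value = unlist(wide[year_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  names(long)[names(long) == "value"] <- value_name
  long
}

life_long <- to_long(life_wide, "life_expectancy")
gdp_long <- to_long(gdp_wide, "gdp_per_capita")

# Inner join: keeps only the country-year pairs present in both tables.
merged <- merge(life_long, gdp_long, by = c("country", "year"))

# Sanity checks on the merge: unique keys and no missing values.
stopifnot(!any(duplicated(merged[c("country", "year")])))
stopifnot(!anyNA(merged))
print(merged)
```

An inner join silently drops countries that appear in only one table (Chad and India here), so it is worth comparing row counts before and after merging.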
Task 4: Data Analytics
After collecting and sorting your data, you need to analyze it to provide an answer to your research question. You should take advantage of the techniques covered in the course, such as basic visual and descriptive analytics on data samples, unsupervised machine learning (e.g., cluster analysis), or supervised learning (e.g., regression analysis). This task must also be done in R.
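The techniques named above could be combined along the following lines. This is only a sketch on simulated data: the variables `log_gdp` and `life_exp`, the choice of three clusters, and the linear model are assumptions made for illustration, and your own question may call for different methods.

```r
# Simulated country-level data for illustration only (not real Gapminder values).
set.seed(1)
n <- 60
log_gdp <- runif(n, min = 6, max = 11)
life_exp <- 30 + 5 * log_gdp + rnorm(n, sd = 3)
countries <- data.frame(log_gdp = log_gdp, life_exp = life_exp)

# Descriptive analytics: five-number summary and mean of each variable.
print(summary(countries))

# Unsupervised learning: k-means on standardised variables.
# The number of clusters (3) is an arbitrary choice for this sketch.
scaled <- scale(countries)
clusters <- kmeans(scaled, centers = 3, nstart = 20)
countries$cluster <- factor(clusters$cluster)
print(table(countries$cluster))

# Supervised learning: simple linear regression of life expectancy on log GDP.
fit <- lm(life_exp ~ log_gdp, data = countries)
print(summary(fit)$coefficients)
```

Standardising with `scale()` before clustering stops the variable with the larger numeric range from dominating the distance calculation.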
Task 5: Report Writing
The final step is writing a report explaining and summarizing the results of Tasks 2-4. The report should be well-presented and well-written. More guidance on how to write a report can be found in the resources provided by the Writing Centre at the University. The report should start with an introduction section that explains the motivation for choosing the research question, its novelty, and importance. The next section should briefly describe the process of data collection and wrangling. The analysis section should provide a discussion of the appropriate methodology used to perform data analysis, followed by the discussion and interpretation of results. The report should include a conclusion section.
In addition to the aforementioned, the report should provide a discussion of the following:
• The distribution of your data.
• At least one graph to illustrate your results.
• At least one statistical methodology (such as cluster analysis, regression analysis or any other technique you think is appropriate). You will need to explain the choice of method.
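One possible way to describe a distribution and illustrate it with a graph is sketched below. The `income` variable is simulated and purely hypothetical; the log transformation is shown only as one common option for right-skewed data, not as a requirement.

```r
# Simulated sample for illustration only (not real Gapminder data).
set.seed(42)
income <- rlnorm(200, meanlog = 9, sdlog = 1)

# Numeric description of the distribution.
print(summary(income))
print(quantile(income, probs = c(0.05, 0.25, 0.5, 0.75, 0.95)))

# A mean well above the median suggests a right-skewed distribution.
skewed_right <- mean(income) > median(income)
print(skewed_right)

# Histograms on the raw and log scales, side by side.
par(mfrow = c(1, 2))
hist(income, main = "Raw scale", xlab = "Income (hypothetical units)")
hist(log(income), main = "Log scale", xlab = "log(income)")
par(mfrow = c(1, 1))
```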
See below the marking rubric for this group assignment.
Submission Requirements:
All of the tasks described above must be done and written in an R-markdown file whose output is knitted into an HTML document. To be clear, you are required to submit two separate files for this assignment:
1. An R-markdown file.
2. The R-markdown output knitted as an HTML document.
The following items need to be included in the R-markdown file and output:
• Names of all group members in the “author” section.
• Adequate reference of the data being used. For example, if you use the data Expenditure per student, primary (% of GDP per person), you need to cite the data source as: https://www.gapminder.org/Education/Schooling cost/Expenditure per student, primary (% of GDP per person).
• Code for data wrangling and for generating the tables and graphs that are used in the final report. This requires you to show all of your analytics workflow (the R code) in the R markdown output. By default, the echo argument should be TRUE. If you cannot see the actual R code, please include the following code chunk at the beginning of your R markdown file:
```{r setup}
knitr::opts_chunk$set(echo = TRUE)
```
• Sufficient comments on code chunks and functions. This requires you to use the hashtag (#) to comment on some function calls or a whole chunk of code so that one can easily read through your code.
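To see how the items above fit together, the sketch below writes a minimal R Markdown source to a temporary file and knits it to HTML from the R console. The title, author names, chunk labels, and file name are hypothetical placeholders, and the render step assumes the rmarkdown package and pandoc are installed (it is skipped otherwise). Knitting with the Knit button in RStudio achieves the same result.

```r
# Build a minimal R Markdown source as text. All names below are placeholders.
fence <- strrep("`", 3)  # three backticks, the chunk delimiter
rmd_lines <- c(
  "---",
  "title: \"Group Assignment\"",
  "author: \"Member One, Member Two, Member Three\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r setup}"),
  "knitr::opts_chunk$set(echo = TRUE)  # show the R code in the HTML output",
  fence,
  "",
  paste0(fence, "{r example-summary}"),
  "# Summarise a built-in dataset (stand-in for your wrangled data)",
  "summary(cars)",
  fence
)

# Write the source file to a temporary location.
rmd_path <- file.path(tempdir(), "report.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit to HTML only if rmarkdown and pandoc are available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, output_format = "html_document",
                                 quiet = TRUE)
  print(html_path)
} else {
  message("rmarkdown or pandoc not available; skipping the render step")
}
```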