MATH1005 Project 1

Semester 2 2019

Aim

  • The aim of the Projects is to give you an authentic experience of producing reproducible statistical reports using real data. They are purposely open-ended to expose you to the joys and challenges of problem solving with data.

  • The set of Projects is cumulative, allowing you to develop and consolidate 5 vital graduate qualities: statistical thinking, computational skills (hard skills), curiosity, communication and collaboration (soft skills).

1 Project 1: Exploring data of your choice

Find data of your own choice. Investigate your own research questions using numerical and graphical summaries. Present a 3 minute (max) report, and field questions from your tutor and peers.

Submission: Submit a .html file, produced from a .Rmd file, with SIDs and details of your Lab class at the top.

This project is designed to be completed in a group.
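As a rough illustration of the submission format, the sketch below writes a minimal .Rmd skeleton and knits it to .html. The file name, title, SIDs and lab details are placeholders invented for this example, not the template provided for the unit. Knitting is attempted only if the rmarkdown package and pandoc happen to be installed.

```r
# Three backticks, built programmatically so this listing stays readable
fence <- strrep("`", 3)

# Hypothetical report skeleton: title, SIDs and lab details are placeholders
rmd_lines <- c(
  "---",
  'title: "Project 1"',
  'author: "SIDs: 000000000, 111111111 (Lab: day / time / room)"',
  "output: html_document",
  "---",
  "",
  "# Executive Summary",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
)

# Write the skeleton to a temporary location
rmd_path <- file.path(tempdir(), "project1.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit to .html only when rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

In RStudio, the Knit button does the same job as the `rmarkdown::render()` call.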

2 Guide to Project 1

  • The task essentially asks you to (1) source your own tidy dataset, and then (2) investigate different variables posed as Research Questions.

  • In this project, we give you very full marking criteria to teach you a framework for approaching the subsequent projects.

Keywords

In this context, an Executive Summary is a “clear, interesting summary of main insights from the report”.

Length

The report should be precise and concise, following the given word-counts.

Learning in Groups

  • This project is designed to be completed in a team, as learning group work is a Sydney graduate quality and very prized by employers. Group work requires many skills, including flexibility, negotiation and compromise. A good group doesn’t just coordinate or cooperate, but learns to collaborate.

  • Your group must consist of a maximum of 4 students from the same Lab class.

    • Groups will be allocated in your lab class by your tutor.

    • Your tutor will record your group number on the class attendance roll, and this information will be used in Canvas.

    • If for some reason you miss your first 3 lab classes (eg join the unit late), then your tutor will allocate you a Group number for a solo group, so you will need to submit the work individually. You will forfeit the marks associated with groups in the Communication of Presentation.

  • While a group project is submitted once as a group, each individual will need to separately submit the associated Group Reflection Quiz, in which you reflect on how you contributed to your group. To avoid academic misconduct, you need to be honest in that quiz, and if you didn’t contribute to the project then you will need to do the subsequent projects on your own. Students who don’t fill out the quiz will get 0 for the project.

Finding Data

  • The web is full of incredible data. However, it takes time to find tidy data and check its integrity. This process of searching for and assessing data is part of the project.

  • If you want to investigate a particular research question, then it often helps to search by the area of interest (eg breast cancer) and type of file (eg .csv or .xls).

  • There are many excellent data depositories, for example see

  • Data from the ABS can be hard to use as it often has summarised data, not raw data.

  • If you use data from Kaggle, you must do your own original coding, or you will get 0.

  • You shouldn’t use data that is provided in the lectures or labs or in RStudio, as finding data is part of this project. If you do so, you can’t get the marks for IDA in the Marking Criteria (nor the Research Questions, if this overlaps with what has already been covered in class).

Presentation

  • You do not have to discuss everything in your report during the presentation. Focus on whatever you think is most interesting for your peers to hear. You will be judged on the quality of your statistical thought, not quantity, as you only have 3 minutes (maximum).

  • You will also be marked on the cohesiveness of the presentation (ie how well it all fits together). You all need to be involved.

  • Before you start your presentation, you should introduce your group and topic:

    • Our group is …

    • Our topic is …

Tips

  1. Don’t make the analysis harder than it needs to be. Pick a dataset that fulfills all the criteria but don’t make it massively difficult! Examples of tricky data include datasets with a lot of missing observations and heavily formatted Excel sheets (such as ABS data). This project should be fun and allow you to apply all your R skills to real data :)

  2. Make sure you know how to use R Markdown, including how to read in data and knit the .html.

  • A template .Rmd file is already provided for you on the projects page!

  • There are different commands for reading in .xlsx/.csv/.txt files! Look on the Video Resources for extra information on how to read data into R.

  3. You will be presenting your work in your lab class. You may upload a separate presentation if you like, but it must be uploaded at the same time as your report. You cannot walk into your lab class with a USB for your presentation. Alternatively, you can present straight from the .html report itself! Both can work very well. Make sure you practice your presentations, as it is very obvious when people have not timed them. Your tutor will not give you a warning - you will be cut off at 3min sharp.

  4. Do not wait until the last minute to upload your report! There have been occasional issues with .html files taking a long time to upload in Canvas in the past. You should contact IT/Canvas directly if you are having any issues submitting.
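The tips mention that different file types need different commands to read them in. The sketch below shows one common option for each type. The example data frame and the file name `mydata.xlsx` are made up for illustration, and reading Excel files is assumed to use the separate readxl package, which may not be installed on your machine.

```r
# Hypothetical example data, written to temporary files so the sketch runs anywhere
example <- data.frame(
  id     = 1:3,
  height = c(160.5, 172.0, 181.2),
  group  = c("a", "b", "a")
)
csv_path <- tempfile(fileext = ".csv")
txt_path <- tempfile(fileext = ".txt")
write.csv(example, csv_path, row.names = FALSE)
write.table(example, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# .csv: comma-separated values
from_csv <- read.csv(csv_path)

# .txt: here a tab-delimited file with a header row
from_txt <- read.delim(txt_path)

# .xlsx: needs the readxl package; "mydata.xlsx" is a hypothetical file name
xlsx_path <- "mydata.xlsx"
if (requireNamespace("readxl", quietly = TRUE) && file.exists(xlsx_path)) {
  from_xlsx <- readxl::read_excel(xlsx_path)
}

# Check what was read in before going any further
str(from_csv)
head(from_txt)
```

Whichever command you use, it is worth checking the result with `str()` or `head()` straight away, since a wrong separator or a missing header row usually shows up there.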

3 Marking Criteria

See Canvas Assignments for the full Marking Criteria.

Written Report [By Group]

Quality required for each criterion:

  • Executive Summary [Max 100 words]

    • Clear, interesting summary of main insights from the report.

  • IDA [Max 200 words]

    • Complexity of Data & Classification of Variables: 4 or more variables in data. Classification of variables shown in R Output, and assessed and changed (if needed).

    • Option 1: IDA (for sourced data): Origin of data cited and critically assessed, including reliability and limitations.

    • Option 2: IDA (for survey data): Assesses survey design, including potential bias or issues of ethics [include survey link, or survey at end of report]. Survey involves 20+ participants.

  • Exploring Data [Max 800 words]

    • Research Question 1: Insightful question, appropriately investigated using numerical and/or graphical summaries, with results explained in context.

    • Research Question 2: Insightful question, appropriately investigated using numerical and/or graphical summaries, with results explained in context. Uses regression (model produced and assessed).

  • Communication of Written Report

    • .html from .Rmd, with all SIDs listed at the top with details (day/time/room) of the same Lab class. Clear use of structure and language, carefully edited with no mistakes.

  • Communication of Presentation

    • Engaging, interesting, well-paced, error free content, well coordinated as team.
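The criteria for IDA and Exploring Data could be approached along the following lines. This is only a sketch: the data frame `survey` and its four variables are simulated purely for illustration (your own project needs real data that you have sourced), and the research questions shown are hypothetical.

```r
set.seed(1)

# Simulated stand-in for a real dataset (hypothetical variables)
n <- 50
survey <- data.frame(
  hours_study = round(runif(n, 0, 20), 1),
  year        = sample(1:3, n, replace = TRUE),
  campus      = sample(c("Main", "West"), n, replace = TRUE)
)
survey$mark <- 40 + 2 * survey$hours_study + rnorm(n, sd = 5)

# IDA: show how R has classified each variable, then change it if needed
str(survey)
survey$year   <- factor(survey$year)    # coded 1-3, but really categorical
survey$campus <- factor(survey$campus)
str(survey)

# Research Question 1 (hypothetical): do marks differ between campuses?
print(summary(survey$mark))
print(tapply(survey$mark, survey$campus, median))
boxplot(mark ~ campus, data = survey, xlab = "Campus", ylab = "Mark")

# Research Question 2 (hypothetical): is study time linearly related to marks?
fit <- lm(mark ~ hours_study, data = survey)
print(coef(fit))
print(summary(fit)$r.squared)

# Produce the model on a scatter plot, then assess it with a residual plot
plot(mark ~ hours_study, data = survey, xlab = "Hours of study", ylab = "Mark")
abline(fit)
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

A residual plot with no obvious pattern around zero suggests a linear model is reasonable. Whatever the output, the criteria ask for the results to be explained in the context of the data, not just printed.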
