Photo by William Bout on Unsplash

Overview

For your Personal Data Project (PDP), you will choose a data set and related research questions on a topic that interests you or relates to your major or profession.

Check out the Module 8 Overview page for more info on deliverables.

You will use the techniques we used in the previous labs to answer your research question. You will have templates/worksheets for each of the two Rehearse labs where you can develop your ideas and complete the analyses.

Then you will prepare a reproducible report in Word or PDF format using the same process we have previously used. We recommend knitting to Word, since that process is less finicky than knitting to PDF for some code chunks. You will create the report using the Student Name worksheet again in the Remix/Report session.

Additionally, you will create a video presentation using the free version of Screencast-O-Matic or another tool of your choice. Your video presentation could be a walk-through of your report (the Word or PDF file), or you could use PowerPoint or a similar presentation tool. Use just a few slides, since the target length is 3 minutes, with a maximum of 5 minutes.

Research Questions

Your project should be designed to answer your research question using the process we have used for the last three labs: confidence intervals and hypothesis tests. Recall that we compared one mean to a standard or assumed value. We compared two or more means to each other using t-tests and ANOVA. We compared proportions to an assumed value and to each other. We looked for correlation and performed a regression using two quantitative variables.
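As a refresher, the theory-based version of the first comparison above (one mean against an assumed value) can be written out as follows. This is only a sketch under the usual assumptions (a random sample that is roughly normal or reasonably large). The symbols are notation chosen here, not taken from the labs, and the labs may have used simulation-based versions of the same ideas instead.

```latex
% Test statistic for H_0: \mu = \mu_0 (one mean vs. an assumed value)
\[
  t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \mathrm{df} = n - 1
\]

% Matching confidence interval for \mu at confidence level 1 - \alpha
\[
  \bar{x} \;\pm\; t^{*}_{n-1} \, \frac{s}{\sqrt{n}}
\]
```

Here \bar{x} is the sample mean, s is the sample standard deviation, n is the sample size, \mu_0 is the assumed value, and t^{*}_{n-1} is the critical value from the t distribution with n - 1 degrees of freedom for the chosen confidence level.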

To aid your thinking, the description of each data set includes examples of questions that data set should be able to answer.

Data Sets

Each data set is linked to a source/location. A snippet of the data is shown as well, and related research questions are listed.

Code Chunk List

Basics

At this link we have provided a list of the most commonly used individual code chunks in this course. You will need to carefully edit them and may need to refer back to previous labs to refresh your memory on how they work.

Data Wrangling Tips and Tricks

This section is adapted from a ModernDive resource created by Dr. Jenny Smetzer. It contains several very useful chunks and processes for handling common student data wrangling issues:

Example problems

The next two links have many example research question types and problem solutions based on Downey’s Infer Process. All the larger code chunks you need are here. Again, you will need to edit them carefully for your data frames and variables.

This site contains problem solutions to the types of hypothesis tests we encounter in this course. It also has code chunk examples showing how to find appropriate confidence intervals for typical research questions.

Presentation Guide

You may use the free version of Screencast-O-Matic.

Creative Commons License
This work was created by Dawn Wright with support of an Excelsior College team, Santhosh Abraham and Mike Pennella.

It is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Last Compiled 2022-12-13