Welcome to the week 5 assignment. It is in a completely different format than you are probably used to, but as a master's-level economist it is important to be extremely versatile and self-sufficient, and to have a natural curiosity for finding answers independently.
You can start by saving a version of this website if you'd like. This ensures the information will not change during the week you have to complete the assignment.
Good news everyone! Your data is available in the above section. The bad news: you cannot click on these files if you wish to receive credit for the assignment.
File descriptions:
wsteps.dta - steps dataset 1
bsteps.dta - steps dataset 2
id-steps.dta - ID-Step-Feet dataset
Use the copy statement to save each of these files into your current working directory. Once they are saved, use wsteps.dta to bring it into Stata.
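As a rough sketch of what the copy step can look like: the base URL below is a placeholder, not the real location of the course files, so substitute the address the files actually live at.

```stata
* Hypothetical base URL (an assumption): replace with the real file location.
local base "https://example.com/week5"

* copy downloads each file into the current working directory;
* the replace option overwrites an earlier download with the same name.
foreach f in wsteps bsteps id-steps {
    copy "`base'/`f'.dta" "`f'.dta", replace
}

* Load the first steps dataset into memory.
use "wsteps.dta", clear
```

Running pwd beforehand shows which directory the files will land in.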
Now create a variable StDate which implements the date function to create a Stata date variable.
Format your new variable in the standard Date Format so dates turn from numbers to the 01jan1900 format.
Generate four new variables using the date functions, creating a variable for Year, Month, Day, and DayofWeek.
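The date steps above can be sketched on a toy dataset. The variable names RawDate and Steps, and the month/day/year string layout, are assumptions for illustration; check how the date is actually stored in wsteps.dta before borrowing any of this.

```stata
clear
* Toy data: RawDate is a hypothetical string date in month/day/year order.
input str10 RawDate Steps
"07/04/2016" 120
"07/05/2016" 340
"08/01/2016" 95
end

* date() turns the string into a Stata date (days since 01jan1960).
generate StDate = date(RawDate, "MDY")
* %td displays the number as a readable date such as 04jul2016.
format StDate %td

* Pull the components out of the Stata date.
generate Year      = year(StDate)
generate Month     = month(StDate)
generate Day       = day(StDate)
generate DayofWeek = dow(StDate)

list
```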
Now collapse the data so that the TimePeriod goes away. In other words, get the sum of the steps by the new date variables and the FitbitId.
Use the month-day-year function, which takes three arguments (Month, Day, and Year), to create a new date variable (the original was lost in the collapse). Don't try to redo the collapse command; you must make this variable.
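Here is a minimal sketch of that kind of collapse on made-up data. The name Steps for the step count is an assumption; the by() variables follow the names used in this assignment.

```stata
clear
* Toy data: two time periods on one day for participant 1, plus two other rows.
input FitbitId Year Month Day DayofWeek TimePeriod Steps
1 2016 7 4 1 1 100
1 2016 7 4 1 2 250
1 2016 7 5 2 1 300
2 2016 7 4 1 1 80
end

* Sum Steps within each participant-day; anything not listed
* (here TimePeriod) is dropped from the result.
collapse (sum) Steps, by(FitbitId Year Month Day DayofWeek)

list
```

Note that collapse keeps only the by() variables and the statistics you ask for, which is why the original date variable disappears.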
Now, say we forgot what each of the numbers means in the Day of Week variable. Is 0 Sunday, Monday, Tuesday? Thursday? Pretend that we forgot how to use the useful help command to figure out which one of these is correct. Format the new date variable %tdDayname and look at the corresponding Day of Week numeric codes.
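The rebuild-and-inspect step might look something like the sketch below. The toy rows and the name NewDate are assumptions for illustration only.

```stata
clear
* Toy data standing in for the collapsed dataset.
input Year Month Day DayofWeek
2016 7 3 0
2016 7 4 1
2016 7 5 2
end

* mdy() takes month, day, year (in that order) and returns a Stata date.
generate NewDate = mdy(Month, Day, Year)

* %tdDayname displays only the name of the weekday.
format NewDate %tdDayname

* Compare the weekday names with the numeric codes side by side.
list NewDate DayofWeek
```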
The Merge
We're interested in finding how many miles each participant traveled, and in order to do so, we must find their average stride length. Fortunately, id-steps.dta contains stride length, but it also contains many id variables.
Rather than searching for the right variable and hard-coding each one, we need to implement a merge statement in order to match this data. For this instance, we only want to keep observations that match.
Once you accomplish this, generate a variable Miles which is the miles traveled per day.
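A hedged sketch of the merge and the miles calculation follows. The key FitbitId, the stride variable StrideFeet (assumed to be feet per step), and all the numbers are made up; the real names in id-steps.dta may differ. The one fixed fact is that a mile is 5,280 feet.

```stata
clear
* Toy lookup table: one row per participant (hypothetical names and values).
input FitbitId StrideFeet
1 2.5
2 2
3 3
end
tempfile strides
save `strides'

clear
* Toy daily step totals; participant 3 has no rows here.
input FitbitId Day Steps
1 4 5280
1 5 10560
2 4 2640
end

* Many daily rows map to one stride row per participant (m:1).
* keep(match) drops observations found in only one of the two files.
merge m:1 FitbitId using `strides', keep(match) nogenerate

* Steps times feet per step gives feet; divide by 5280 for miles.
generate Miles = Steps * StrideFeet / 5280

list
```

The nogenerate option simply skips creating _merge; leave it off if you want to inspect the match results first.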
Time to use your final collapse statement. You're going to want to end up with two variables, WAvgMiles and WSumMiles, which represent the mean of miles and the sum of miles by Year, Month, and DayofWeek.
Save your data as whatever you want. Make sure you remember this dataset.
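To illustrate computing two statistics in one collapse, here is a small sketch on invented numbers. The file name wsummary.dta is a placeholder, since the name is yours to choose.

```stata
clear
* Toy daily miles (made-up values).
input Year Month DayofWeek Miles
2016 7 1 2
2016 7 1 4
2016 7 2 3
2016 8 1 5
end

* newname=oldname lets one source variable feed several statistics.
collapse (mean) WAvgMiles=Miles (sum) WSumMiles=Miles, by(Year Month DayofWeek)

list

* Hypothetical file name; any name works as long as you remember it.
save "wsummary.dta", replace
```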
Part II - BSteps
Do everything above again, for bsteps. Save this resulting dataset as well. (Final variables: Month Year DayofWeek BAvgMiles BSumMiles)
Part III - Final Presentation
Merge the resulting dataset from wsteps and bsteps into one dataset. This should contain Year Month DayofWeek and the W* variables and B* variables.
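Since each summary file should have one row per Year, Month, and DayofWeek combination, a one-to-one merge is a natural fit. A sketch with toy values:

```stata
clear
* Toy B summary (made-up values).
input Year Month DayofWeek BAvgMiles BSumMiles
2016 7 1 2.5 5
2016 7 2 3 3
end
tempfile bsummary
save `bsummary'

clear
* Toy W summary (made-up values).
input Year Month DayofWeek WAvgMiles WSumMiles
2016 7 1 3 6
2016 7 2 4 4
end

* The three variables together identify a row uniquely in both files.
merge 1:1 Year Month DayofWeek using `bsummary'

* _merge shows which rows matched and which came from one file only.
tabulate _merge
```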
Label the WAvgMiles variable "Will" and the BAvgMiles variable "Belen".
Use the label define command to create a value label MonthLabel. This will label 7 "July" and 8 "August". (Don't forget the replace option.)
Label the values of the Month variable with the MonthLabel label.
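The three labeling steps can be sketched as follows, on a toy dataset that uses the variable names from this assignment.

```stata
clear
* Toy merged data (made-up values).
input Month WAvgMiles BAvgMiles
7 3 2.5
8 4 3
end

* Variable labels describe the variable itself.
label variable WAvgMiles "Will"
label variable BAvgMiles "Belen"

* A value label maps numeric codes to text; replace allows the
* definition to be rerun without an "already defined" error.
label define MonthLabel 7 "July" 8 "August", replace

* Attach the value label to the Month variable.
label values Month MonthLabel

list
```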
Finally, create a line graph of the average miles for both Belen and Will over DayofWeek by the Month variable. It should contain the title "Average Miles Per Week". Once this has rendered, export your graph as a png file.
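One way the graph and export could look, sketched on invented numbers. The file name avg_miles.png is a placeholder. Placing title() inside by() is a design choice here: it puts one overall title on the combined graph, whereas title() outside by() would be applied to each panel.

```stata
clear
* Toy summary data (made-up values).
input Month DayofWeek WAvgMiles BAvgMiles
7 0 3.1 2.8
7 1 2.5 3.0
7 2 2.9 2.7
8 0 3.4 3.2
8 1 2.2 2.6
8 2 3.0 2.9
end

* Two y variables, one x variable; sort draws the lines in x order.
* by(Month) produces one panel per month.
twoway line WAvgMiles BAvgMiles DayofWeek, sort ///
    by(Month, title("Average Miles Per Week"))

* Hypothetical file name; the .png suffix selects the export format.
graph export "avg_miles.png", replace
```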