I have been messing around with the Pew voter data and have been unable to access the underlying ZIP codes in the data set, which I need in order to attach a latitude and longitude to each respondent and map their location and affiliations … Continue reading
I am going to work with a little bit of voter data from this election cycle, courtesy of the Pew Research Center, a nonpartisan think tank that allows downloads of its proprietary data for academic and public use. The April 2016 Politics and … Continue reading
Building tables in R is a simple process that is also extremely flexible. Using the NHANES data set as a further example, we can build a table out of two variables that we previously created: > table(age.category,BMI.category) BMI.category age.category underweight normal overweight … Continue reading
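As a rough illustration of the two-way table described above, here is a minimal sketch. The NHANES-derived variables are not reproduced here, so the two vectors below are small, made-up stand-ins that only borrow the names `age.category` and `BMI.category`; the age bands are assumptions, not the categories used in the post.

```r
# Hypothetical toy data standing in for the NHANES-derived variables
age.category <- factor(c("18-39", "40-59", "18-39", "60+", "40-59", "60+"))
BMI.category <- factor(
  c("normal", "overweight", "underweight", "normal", "normal", "overweight"),
  levels = c("underweight", "normal", "overweight")
)

# Cross-tabulate the two categorical variables
tab <- table(age.category, BMI.category)
print(tab)

# Row proportions: the share of each BMI category within an age group
print(prop.table(tab, margin = 1))

# Append row and column totals to the counts
print(addmargins(tab))
```

Setting the factor levels explicitly keeps the BMI columns in a meaningful order rather than alphabetical order.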
The original charting of the NHANES data used basic frequency and density plotting using histograms and scatter plots. ggplot2 is a package for flexibly visualizing all kinds of data. > install.packages("ggplot2") The downloaded binary packages are in /var/folders/nl/4z5wsxpn3cngl9tp9y17r5sm0000gn/T//RtmpwJmKSM/downloaded_packages > library("ggplot2", lib.loc="/Library/Frameworks/R.framework/Versions/3.0/Resources/library") > library(ggplot2) … Continue reading
Importing and cleaning data are mandatory steps prior to running any type of analytics. We should always generate a priori hypotheses based on the evidence, literature, and logic that we have available to us (e.g., guessing that a strong and … Continue reading
If we link back to the data set that I was working with earlier today, we left off with a cleaned data set, and a newly created continuous variable: BMI. We have two unique gender variables (1: Male; 2: Female), … Continue reading
As you may know, my experience with data analytics is in behavioral health and general health care on the periphery of academia. IBM’s SPSS has always been the primary program that I have used to run analytics and as a … Continue reading
Given the data set Cars93 in R, how do we break down some of the variables and start determining where some of the salient differences are in our data? In R, open the data set with the following prompt: >data() … Continue reading
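One possible way to start breaking down Cars93 is sketched below. It assumes the copy of Cars93 that ships with the MASS package; the particular columns and summaries chosen here are illustrative and not necessarily the ones the post goes on to use.

```r
# Cars93 ships with the MASS package (a recommended package bundled with R)
library(MASS)
data(Cars93)

# Overall shape, then the types of a few columns of interest
print(dim(Cars93))
str(Cars93[, c("Type", "Price", "MPG.city", "Origin")])

# Mean price by vehicle type
price.by.type <- aggregate(Price ~ Type, data = Cars93, FUN = mean)
print(price.by.type)

# Cross-tabulate vehicle type against origin (USA vs. non-USA)
type.by.origin <- table(Cars93$Type, Cars93$Origin)
print(type.by.origin)
```

Grouped means and cross-tabulations like these are a quick first pass for spotting which categorical variables separate the numeric ones.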
This is a stacked graph of FBI crime statistics compiled from 1960 to 2012. I think stacked graphs can be an interesting way to look at large longitudinal data sets.
I have included the (M)ANOVA PDF as a separate post so that the information can be read a little more easily. This paper includes a brief summary of how the ANOVA test works to partition variance and goes on to talk about … Continue reading