Assume that we want to compare spending on mobile apps between the “android” user group and the “ios” user group.
Let’s conduct the analysis. First, download the “buy_activity.dta” file from the following address.
https://www.dropbox.com/s/uf0wftbjshpqo3v/buy_activity.dta?dl=0
Second, merge the “buy_activity.dta” data with the “user.dta” data in Datasets 1. Note that the “user.dta” file in Datasets 3 is fairly large (862 MB), so the merge will take some time.
a) How long did it take to merge them? Please refer to the following R code.
# Start the clock!
ptm <- proc.time()
# Load and merge the two data sets (read.dta() comes from the foreign package)
library(foreign)
buy_activity <- read.dta("buy_activity.dta")
user <- read.dta("user.dta")
user_buy_history<-merge(buy_activity, user, by="user_no", all = FALSE)
# Stop the clock!
proc.time() - ptm
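To see what the merge and the timing do before running them on the full files, a minimal sketch on toy data may help. The data frames `buy_toy` and `user_toy` and all their values are made up for illustration; only the column names `user_no`, `purch_amount_total`, and `platform_type` come from the code in this exercise. It uses `system.time()`, which is an alternative to subtracting two `proc.time()` readings by hand.

```r
# Toy stand-ins for the two tables; every value here is invented for illustration.
buy_toy <- data.frame(
  user_no = c(1, 2, 2, 4),
  purch_amount_total = c(10, 5, 7, 3)
)
user_toy <- data.frame(
  user_no = c(1, 2, 3),
  platform_type = c("android", "ios", "android")
)

# system.time() times a single expression; the assignment still takes effect.
elapsed <- system.time(
  merged_toy <- merge(buy_toy, user_toy, by = "user_no", all = FALSE)
)

print(merged_toy)          # all = FALSE keeps only user_no values found in both tables
print(elapsed["elapsed"])  # wall-clock seconds spent on the merge
```

With `all = FALSE` the merge is an inner join, so the purchase by user 4 (no matching user record) and user 3 (no purchases) are dropped from `merged_toy`.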
b) From the experience in (a), why do we need to learn distributed computing skills such as Hadoop?
c) Compare the total spending between the two groups. Please refer to the following R code. Was your initial guess right?
t.test(
  user_buy_history$purch_amount_total[user_buy_history$platform_type == "android"],
  user_buy_history$purch_amount_total[user_buy_history$platform_type == "ios"]
)
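As a rough guide to reading the output of this test, here is a small sketch on simulated data. The data frame `toy`, the group sizes, and the means and standard deviations are all assumptions chosen for illustration and say nothing about the real data set. Note that `t.test()` compares the mean of `purch_amount_total` per user between the two groups, and that with two vectors it runs Welch's unequal-variance test by default.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated spending for 50 users per platform; the numbers are invented.
toy <- data.frame(
  platform_type = rep(c("android", "ios"), each = 50),
  purch_amount_total = c(rnorm(50, mean = 10, sd = 3),
                         rnorm(50, mean = 12, sd = 3))
)

android <- toy$purch_amount_total[toy$platform_type == "android"]
ios <- toy$purch_amount_total[toy$platform_type == "ios"]

# Two-sample test; var.equal = FALSE is the default, giving Welch's test.
result <- t.test(android, ios)

print(result$estimate)  # sample mean of each group
print(result$p.value)   # two-sided p-value for a difference in means
print(result$conf.int)  # 95% confidence interval for (mean android - mean ios)
```

A small p-value would suggest the two group means differ; the sign of the confidence interval shows which group spends more on average.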