Android Smartphone Analysis in R [Code + Video]

In this post, we’ll learn how to analyse your Android smartphone usage data.

Steps:

  1. Download your My Activity data from Google Takeout - https://takeout.google.com/ (select the JSON format instead of the default HTML format)

  2. When the download is ready, save the zip file and unzip it. MyActivity.json sits in the innermost folder of the archive

  3. Create a new R project in RStudio and put the MyActivity.json file in the project folder

  4. Follow the code below along with this video

Youtube - https://www.youtube.com/watch?v=fv0idLNWfqg
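If you prefer to do steps 2 and 3 from R, here is a minimal sketch. It assumes the archive was saved as takeout.zip in the project folder (an assumed name, your download will be called something else), and find_activity_json() is a small illustrative helper written for this post, not a function from any package.

```r
# Hypothetical helper: search a folder tree for files named MyActivity.json.
find_activity_json <- function(dir = ".") {
  list.files(dir, pattern = "^MyActivity\\.json$",
             recursive = TRUE, full.names = TRUE)
}

# "takeout.zip" is an assumed file name; change it to match your download.
if (file.exists("takeout.zip")) {
  unzip("takeout.zip", exdir = "takeout")
  print(find_activity_json("takeout"))
}
```

If the search returns more than one path, your export probably covers several products, so pick the file that belongs to Android.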

Loading libraries

library(jsonlite)
library(tidyverse)
library(patchwork)

Reading the input JSON with the jsonlite package

android <- jsonlite::fromJSON("MyActivity.json")
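To see what fromJSON() does here without loading your own export, the sketch below parses a tiny JSON array of two records. The records are made up and only mimic the shape of the Takeout file; the point is that an array of objects comes back as a data frame with one row per object.

```r
library(jsonlite)

# Two made-up records shaped like the Takeout export (values are illustrative).
sample_json <- '[
  {"header": "App A", "title": "Used App A",
   "time": "2020-05-06T15:50:53.817Z", "products": ["Android"]},
  {"header": "App B", "title": "Used App B",
   "time": "2020-05-06T15:47:53.613Z", "products": ["Android"]}
]'

# An array of JSON objects is simplified into a data frame.
sample_df <- fromJSON(sample_json)

# Inspect the column types; note that time is still plain text at this point.
str(sample_df)
```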

Glimpse of the dataframe

glimpse(android)
## Observations: 37,773
## Variables: 5
## $ header   <chr> "Sense Home Launcher-News,Theme", "Google Chrome: Fast & S...
## $ title    <chr> "Used Sense Home Launcher-News,Theme", "Used Google Chrome...
## $ titleUrl <chr> "https://play.google.com/store/apps/details?id=com.htc.lau...
## $ time     <chr> "2020-05-06T15:50:53.817Z", "2020-05-06T15:47:53.613Z", "2...
## $ products <list> ["Android", "Android", "Android", "Android", "Android", "...

Data Preprocessing - Date Time

# Timestamps end in Z (UTC); the locale's time zone shows them in Indian local time
android$time <- parse_datetime(android$time, locale = locale(tz = "Asia/Calcutta"))
summary(android$time)
##                  Min.               1st Qu.                Median 
## "2017-01-06 16:08:01" "2019-07-21 18:30:18" "2019-10-14 19:53:11" 
##                  Mean               3rd Qu.                  Max. 
## "2019-10-04 14:01:15" "2020-01-17 17:10:47" "2020-05-06 21:20:53"
android %>% 
  mutate(date = lubridate::date(time),
         year = lubridate::year(time)) -> android
android_latest <- android %>% 
  filter(year %in% c(2019,2020))
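The time zone matters for the date column: the raw timestamps are in UTC, so an event late in the UTC evening falls on the next calendar day in India. Here is a small sketch of that effect with a made-up timestamp. It uses lubridate's ymd_hms() and with_tz() instead of parse_datetime(), and the zone name Asia/Kolkata, which is the current name for the same zone as Asia/Calcutta.

```r
library(lubridate)

# A made-up UTC timestamp late in the evening.
utc_time <- ymd_hms("2020-05-06T20:00:00Z", tz = "UTC")

# The same instant, displayed in Indian Standard Time (UTC+05:30).
local_time <- with_tz(utc_time, "Asia/Kolkata")

date(utc_time)    # calendar date in UTC
date(local_time)  # calendar date in IST, one day later for this timestamp
year(local_time)
```

So daily counts are grouped by your local day only when the conversion happens before the date is extracted, which is the order used above.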

Number of Unique Apps

android_latest %>% 
  count(header, sort = TRUE)  %>% 
  head(20) %>% 
  mutate(header = fct_reorder(header, n)) %>% 
  ggplot() + geom_col(aes(y = header, x = n)) +
  theme_minimal() +
  labs(title = "Most used Apps - Overall",
       subtitle = "Android Smartphone usage",
       caption = "Data:Google Takeout")
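The chart ranks apps by how many events were logged for them, which is not the same as time spent in the app. If you also want the plain number of distinct apps, here is a rough sketch on a toy log (the app names are made up). It also shows what fct_reorder() does to the factor levels, which is what sorts the bars.

```r
library(dplyr)
library(forcats)

# Toy usage log; app names are made up.
toy <- tibble(header = c("App A", "App B", "App A", "App C", "App A", "App B"))

# Number of distinct apps in the log.
n_apps <- n_distinct(toy$header)

# One row per app with its event count, most frequent first.
app_counts <- toy %>%
  count(header, sort = TRUE) %>%
  mutate(header = fct_reorder(header, n))

n_apps
app_counts
levels(app_counts$header)  # levels run from least to most used
```

On your own data, n_distinct(android_latest$header) should give the same kind of count.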

Apps Comparison

android_latest %>% 
  filter(year == 2019) %>%            # year is numeric, so compare with a number
  count(header, sort = TRUE) %>%      # events per app, most used first
  head(10) %>% 
  mutate(header = fct_reorder(header, n)) %>% 
  ggplot() + geom_col(aes(y = header, x = n)) +
  theme_minimal() +
  labs(title = "Most used Apps - 2019",
       subtitle = "Android Smartphone usage",
       caption = "Data:Google Takeout") -> p2019
android_latest %>% 
  filter(year == 2020) %>%            # year is numeric, so compare with a number
  count(header, sort = TRUE) %>%      # events per app, most used first
  head(10) %>% 
  mutate(header = fct_reorder(header, n)) %>% 
  ggplot() + geom_col(aes(y = header, x = n)) +
  theme_minimal() +
  labs(title = "Most used Apps - 2020",
       subtitle = "Android Smartphone usage",
       caption = "Data:Google Takeout") -> p2020
p2019 / p2020
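The two chunks above differ only in the year, so the data step could be wrapped in a small function. This is just a sketch: top_apps() is a hypothetical helper, not something from the video, and the toy data stands in for android_latest.

```r
library(dplyr)

# Hypothetical helper: the k most frequent apps for one year.
top_apps <- function(data, target_year, k = 10) {
  data %>%
    filter(year == target_year) %>%
    count(header, sort = TRUE) %>%
    head(k)
}

# Toy data standing in for android_latest; app names are made up.
toy <- tibble(
  year   = c(2019, 2019, 2019, 2020, 2020, 2020),
  header = c("App A", "App A", "App B", "App B", "App C", "App C")
)

top_apps(toy, 2019, k = 1)
top_apps(toy, 2020, k = 1)
```

The result of top_apps(android_latest, 2019) can then be piped into the same mutate() and ggplot() steps as before.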

Usage Timeline

android_latest %>%
  count(date) %>% 
  ggplot() + geom_line(aes(date, n))
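One thing to keep in mind with count(date): days without any logged activity get no row at all, so the line simply joins the neighbouring days. If you would rather show those days as zero, here is a possible sketch using tidyr's complete() on toy daily counts (the numbers are made up).

```r
library(dplyr)
library(tidyr)

# Toy daily counts with a gap on 2020-01-03 (values are made up).
daily <- tibble(
  date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-04")),
  n    = c(5L, 3L, 7L)
)

# Every calendar day between the first and last observed date.
all_days <- seq(min(daily$date), max(daily$date), by = "day")

# Add a row with n = 0 for each day that has no recorded activity.
daily_filled <- daily %>%
  complete(date = all_days, fill = list(n = 0L))

daily_filled
```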

If you liked this tutorial, please subscribe to my YouTube channel for more R programming videos and share your feedback. It would be a great help!

 