# Extract Top Reddit Posts of #rstats in 3 lines of R Code

## using jsonlite

This post is kept deliberately minimal to show how simple this hack is in R (it could, of course, be just as simple in other languages). It also makes the point that R has use cases beyond statistics and data mining.

### Objective

The rstats subreddit is one of the popular sources of R-related information and discussion on the internet. We want to extract its top posts.

### Data Format

Luckily for us, Reddit offers a JSON file for every subreddit (and for every post), and we’ll use that here.

subreddit url: "https://www.reddit.com/r/rstats/"
subreddit json: "https://www.reddit.com/r/rstats/.json"

### jsonlite @ Action

The package that will help us in this endeavor is jsonlite (by Jeroen Ooms and team), which parses JSON files and feeds. Its fromJSON() function parses a JSON file and stores the result in a list object. From there, we can pull out the information we need: the title and URL of each post in the subreddit.
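To make the shape of that list object concrete, here is a minimal offline sketch. The inline JSON is a made-up, heavily trimmed stand-in for a subreddit listing (the titles and example.com URLs are hypothetical); only the nesting of data, children, data, title and url mirrors what the real feed is navigated with in this post. With its default settings, fromJSON() simplifies an array of objects into a data frame, which is why the posts come out as rows.

```r
library(jsonlite)

# Hypothetical, trimmed stand-in for the subreddit JSON (not real Reddit data)
sample_json <- '{
  "data": {
    "children": [
      {"data": {"title": "First post",  "url": "https://example.com/1"}},
      {"data": {"title": "Second post", "url": "https://example.com/2"}}
    ]
  }
}'

# fromJSON() accepts a JSON string as well as a URL or a file path
parsed <- fromJSON(sample_json)

# The array of post objects is simplified into a data frame, one row per post
posts <- parsed$data$children$data
str(posts)
```

The same chain of `$` accessors is what reaches the posts in the real feed.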

library(jsonlite)

reddit <- fromJSON("https://www.reddit.com/r/rstats/.json")

(top10 <- reddit$data$children$data[1:10, c("title", "url")])
##                                                                                                   title
## 1                                                                         How does one fit a plm model?
## 3                                                          Finding "Optimal" Target Inventory for Parts
## 4  Can you change the limits on a scale in ggplot based on the data based to ggplot? Explanation inside
## 5                                                                       Error: Need Finite 'ylim'values
## 6                                             Why Machine Learning Beats Econometrics in the Real World
## 7                                                                             Help with reshape() Error
## 8                                                         R &amp; stats illustrations by @allison_horst
## 9                                                                                        Time Series Qn
## 10                                                         Flexdashboard runtime shiny renderPlot issue
##                                                                                                 url
## 8                                               https://github.com/allisonhorst/stats-illustrations
## 10    https://www.reddit.com/r/rstats/comments/cqd1u9/flexdashboard_runtime_shiny_renderplot_issue/
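One caveat with indexing by 1:10 is that a data frame with fewer than ten rows gets padded with NA rows. The sketch below wraps the extraction in a small helper that uses head() instead, so a short listing simply yields fewer rows. The helper name extract_top() and the tiny demo listing are assumptions made for illustration; they are not part of jsonlite or of Reddit's feed.

```r
library(jsonlite)

# Hypothetical helper: title and url for at most n posts from a parsed listing
extract_top <- function(listing, n = 10) {
  posts <- listing$data$children$data
  # head() returns fewer rows for a short listing instead of padding with NA
  head(posts[, c("title", "url")], n)
}

# Offline demonstration with a made-up two-post listing
demo <- fromJSON('{"data": {"children": [
  {"data": {"title": "A", "url": "https://example.com/a", "score": 1}},
  {"data": {"title": "B", "url": "https://example.com/b", "score": 2}}
]}}')

top <- extract_top(demo, n = 10)
print(top)
```

Against the live feed, the call would presumably be extract_top(reddit), with reddit being the object returned by fromJSON() above.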

### 3-lines

• Load the jsonlite package
• Retrieve and parse the JSON file
• Extract the relevant information from the list object

### Summary

While this post is primarily intended to demonstrate how simple JSON parsing is with R and jsonlite, the same script could also be automated to email or send a notification about the top 10 posts of the rstats subreddit at a scheduled interval.
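As a rough sketch of the first step toward such a notification, the snippet below turns a title/url data frame into a plain-text digest. The helper name format_digest() and the sample rows are hypothetical. Actually sending the message and scheduling the run (for example, calling the script with Rscript from a scheduler such as cron) are left out, since they depend on the tools available to you.

```r
# Hypothetical helper: one numbered entry per post, with the URL on its own line
format_digest <- function(posts) {
  sprintf("%d. %s\n   %s", seq_len(nrow(posts)), posts$title, posts$url)
}

# Made-up stand-in for the top10 data frame built earlier
posts <- data.frame(
  title = c("First post", "Second post"),
  url = c("https://example.com/1", "https://example.com/2"),
  stringsAsFactors = FALSE
)

digest <- format_digest(posts)

# Print the digest; a mail or chat client would take this text as the message body
cat(digest, sep = "\n")
```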