tips and tricks

One-line Code using viridis for How to change the color scale in ggplot plots

This is a small code snippet that shows how to change the color scale of a ggplot.

Continuous Scale
Package: viridis
Function: scale_fill_viridis_c() (since the fill is a continuous value)

    library(dplyr)
    library(ggplot2)
    library(viridis)

    mtcars %>%
      # Move the row names (car names) into a regular column
      tibble::rownames_to_column('Car') %>%
      # Split the car name into Brand and Model; extra words stay in Model
      tidyr::separate('Car', c('Brand', 'Model'), remove = FALSE,
                      extra = 'merge', fill = 'right') %>%
      group_by(Brand) %>%
      summarize(avg_mpg = mean(mpg)) %>%
      ggplot() +
      # Order the bars by avg_mpg and map the same value to the fill color
      geom_bar(aes(reorder(Brand, avg_mpg), avg_mpg, fill = avg_mpg), stat = 'identity') +
      scale_fill_viridis_c() +
      theme_minimal() +
      theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
      labs(title = 'How to arrange Ggplot Bar plot', x = 'Brand', y = 'Average mpg')

Discrete Scale
Package: viridis
Function: scale_fill_viridis_d() (since the fill is a discrete value)
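The excerpt stops before showing the discrete case, so here is a minimal sketch of what it could look like. It assumes the cylinder count in mtcars is used as the discrete fill, which is an illustrative choice and not necessarily the original post's. As far as I know, scale_fill_viridis_d() ships with ggplot2 itself (version 3.0.0 and later), so only ggplot2 is loaded here.

```r
library(ggplot2)

# cyl is numeric in mtcars; wrapping it in factor() makes it a discrete
# variable, which is what scale_fill_viridis_d() expects.
p_discrete <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(cyl))) +
  geom_bar() +                 # counts the cars in each cylinder group
  scale_fill_viridis_d() +     # discrete viridis palette
  theme_minimal() +
  labs(x = 'Cylinders', y = 'Number of cars', fill = 'Cylinders')

p_discrete
```

If the fill variable is left numeric, ggplot2 treats it as continuous and the `_d` scale should fail, so the factor() conversion is the part that matters.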

How to arrange ggplot (barplot) bars in ascending or descending order

A bar plot made with ggplot2 often shows up in no meaningful order because, by default, ggplot2 arranges the bars alphabetically (more precisely, by factor level order, which is alphabetical for character columns). Most of the time, it makes more sense to arrange the bars by the y-axis value they represent rather than alphabetically. Month-wise time series or high-medium-low bars are examples where an alphabetically sorted bar chart wouldn’t make sense and would in fact hinder data communication.
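A minimal sketch of the two usual fixes, using a made-up data frame (the `sales` data and its column names are hypothetical): reorder() sorts the bars by the plotted value, while an explicit factor() fixes a natural order such as low-medium-high.

```r
library(ggplot2)

# Hypothetical toy data; alphabetical order would be high, low, medium
sales <- data.frame(
  level = c('high', 'low', 'medium'),
  value = c(12, 30, 21)
)

# Ascending: reorder() sorts the categories by the value column
p_asc <- ggplot(sales, aes(x = reorder(level, value), y = value)) +
  geom_col()

# Descending: negate the value inside reorder()
p_desc <- ggplot(sales, aes(x = reorder(level, -value), y = value)) +
  geom_col()

# Fixed, meaningful order: set the factor levels explicitly
sales$level_f <- factor(sales$level, levels = c('low', 'medium', 'high'))
p_fixed <- ggplot(sales, aes(x = level_f, y = value)) +
  geom_col()
```

reorder() suits value-driven rankings; explicit factor levels suit categories with an inherent order, such as months.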

3 tidyverse tricks for most commonly used Excel Features

In this post, we’re simply going to see 5 tricks that could help improve your tooling using {tidyverse}.

Create a difference variable between the current value and the next value

This is also known as lead and lag. Especially in a time series dataset, this variable becomes very important in feature engineering. In Excel, this is simply done by creating a new formula field, subtracting the current cell from the next cell (or the previous cell from the current cell), and dragging the cell formula down to the last cell.
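The tidyverse counterpart of that Excel formula can be sketched with dplyr's lead() and lag(). The `sales` tibble and its column names below are made up for illustration.

```r
library(dplyr)

# Hypothetical monthly series
sales <- tibble(
  month = 1:5,
  value = c(10, 12, 9, 15, 20)
)

sales_diff <- sales %>%
  mutate(
    diff_next = lead(value) - value,  # next value minus current value
    diff_prev = value - lag(value)    # current value minus previous value
  )

sales_diff
```

The last row of diff_next and the first row of diff_prev are NA, since there is no next or previous value to subtract. This mirrors the empty cell at the edge of a dragged Excel formula.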

Extract Top Reddit Posts of #rstats in 3 lines of R Code

This post is kept (literally) minimal to demonstrate how simple this hack is in R (of course, it could be simple in other languages too). It also makes the point that R has use cases beyond statistics and data mining.

Objective

The rstats subreddit is one of the popular sources of R-related information and discussion on the internet. We’re trying to extract the top posts of the rstats subreddit.

Data Format

Lucky for us, Reddit offers a JSON file for every subreddit (and every post), and we’ll use that here.
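The excerpt ends before the code, so the following is only a sketch of how such an extraction might look with jsonlite, not the original post's three lines. To stay runnable offline it parses a tiny hand-written sample that mimics the listing layout (`data` > `children` > `data`). The sample posts, the endpoint URL in the comment, and the chosen columns are all assumptions.

```r
library(jsonlite)

# Hand-written sample mimicking the assumed shape of a subreddit listing.
# For live data, the assumed endpoint would be something like
#   fromJSON('https://www.reddit.com/r/rstats/top.json?t=all')
# (Reddit may rate-limit or reject requests without a custom User-Agent).
sample_json <- '{
  "kind": "Listing",
  "data": {
    "children": [
      {"kind": "t3", "data": {"title": "Post A", "score": 120}},
      {"kind": "t3", "data": {"title": "Post B", "score": 95}}
    ]
  }
}'

listing <- fromJSON(sample_json)

# Each child wraps one post; jsonlite simplifies them into a data frame
posts <- listing$data$children$data

# Keep the title and score, highest score first
top_posts <- posts[order(-posts$score), c('title', 'score')]
top_posts
```

Swapping sample_json for the live URL should yield the same nested structure, with many more columns per post.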