Simple, Beginner-Friendly Web Scraping in R with {ralger}

Web scraping by nature requires a fair amount of know-how, from finding the right CSS selector to correctly parsing the scraped content. While there are many R packages for this (and Python packages, for that matter), {ralger} does a wonderful job of abstracting away the complicated parts and providing a simple, easy-to-use, beginner-friendly web scraping package. {ralger} has simple functions to quickly scrape title text (H1, H2, H3), tables, URLs, and images from a given web page.
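To make the two manual steps concrete (picking a CSS selector, then parsing what it matches), here is a minimal sketch of the lower-level approach using {rvest} directly rather than {ralger}. The inline HTML, the selector, and the movie names are all hypothetical stand-ins so that the snippet runs without a network connection; it is not taken from any real page.

```r
library(rvest)

# Hypothetical inline page standing in for a real site, so the sketch runs offline.
page <- minimal_html('
  <h1>Top Movies</h1>
  <table>
    <tr><th>Title</th><th>Year</th></tr>
    <tr><td class="titleColumn"><a href="/title/a">Movie A</a></td><td>1994</td></tr>
    <tr><td class="titleColumn"><a href="/title/b">Movie B</a></td><td>1972</td></tr>
  </table>
')

# Step 1: pick the CSS selector that matches the elements of interest.
title_nodes <- html_elements(page, "td.titleColumn > a")

# Step 2: parse the matched nodes into plain R values.
titles <- html_text2(title_nodes)       # visible text of each link
urls <- html_attr(title_nodes, "href")  # target of each link
movies <- html_table(page)[[1]]         # first table on the page as a data frame
```

Each {ralger} function used later in this post takes a link and does this kind of selecting and parsing in a single call.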


Web Scraping in R Code


Below is an example of how to scrape the IMDb website (for educational purposes) in R with {ralger}.



library(ralger)

# URL of the IMDb page to scrape
link <- ""

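# CSS selector matching the title link in every row of the table
node <- "#main > div > span > div > div > div.lister > table > tbody > tr:nth-child(n) > td.titleColumn > a"

# Extract the text of every element matched by the selector
extract <- scrap(link, node)

# Collect the URLs of the images on the page
img_links <- images_preview(link)

# Scrape the HTML table on the page into a data frame
imdb250 <- table_scrap(link)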

# URL of the second page to scrape
link <- ""

my_nodes <- c(
  ".lister-item-header a", # The title
  ".text-muted.unbold", # The year of release
  ".ratings-imdb-rating strong" # The rating
)

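# Column names, in the same order as the selectors in my_nodes
col_names <- c("title", "year", "rating")

# Scrape all three selectors at once into a single tidy data frame
df_rank <- tidy_scrap(link = link, nodes = my_nodes, colnames = col_names)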
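Scraped values usually arrive as text, so a small clean-up step is often useful before analysis. The sketch below assumes the columns come back as character strings and that the year is wrapped in parentheses, for example "(1994)"; both are assumptions about the scraped page, not something guaranteed by {ralger}. The data frame `df_rank_demo` and its rows are hypothetical stand-ins for `df_rank`, used only so the snippet runs offline.

```r
# Hypothetical rows standing in for df_rank; the values are illustrative only.
df_rank_demo <- data.frame(
  title = c("Movie A", "Movie B"),
  year = c("(1994)", "(I) (2019)"),
  rating = c("9.2", "8.5"),
  stringsAsFactors = FALSE
)

# Keep the last run of four digits in each year string and convert it to integer.
df_rank_demo$year <- as.integer(
  sub(".*([0-9]{4}).*", "\\1", df_rank_demo$year)
)

# Convert the rating text to a number so it can be sorted and summarised.
df_rank_demo$rating <- as.numeric(df_rank_demo$rating)
```

With numeric columns in place, ordinary operations such as `df_rank_demo[order(-df_rank_demo$rating), ]` work as expected.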

