Almost every R user knows about popular packages like dplyr and ggplot2. But with 10,000+ packages on CRAN and yet more on GitHub, it's not always easy to unearth libraries with great R functions. One of the best ways to find cool, new-to-you R code is to see what other useRs have discovered. So, I'm sharing a few of my discoveries -- and hope you'll share some of yours in return (contact info below).
Choose a ColorBrewer palette from an interactive app. Need a color scheme for a map or app? ColorBrewer is well known as a source for pre-configured palettes, and the RColorBrewer package imports those into R. But it's not always easy to remember what's available. The tmaptools package's palette_explorer creates an interactive application that shows you the possibilities.
First, install tmaptools with install.packages("tmaptools"), then load tmaptools with library("tmaptools") and run palette_explorer() (or, don't load tmaptools and run tmaptools::palette_explorer()). You'll see all available palettes as in the image above, as well as sliders to adjust options like number of colors. There's also info about basic syntax for using a color scheme below each group of palettes.
palette_explorer also needs shiny and shinyjs packages installed in order to generate the interactive app.
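Once you've settled on a palette in the explorer, pulling its colors into R might look like the sketch below. It assumes the RColorBrewer package is installed, and the "Blues" palette with five colors is only an illustrative choice, not something the explorer prescribes.

```r
# Sketch: fetch five colors from the ColorBrewer "Blues" palette.
# Assumes RColorBrewer is installed; palette name and size are illustrative.
if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  blues <- RColorBrewer::brewer.pal(n = 5, name = "Blues")
  print(blues)  # a character vector of hex color codes
}
```

The resulting vector can be passed to any argument that accepts colors, such as col in base graphics.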
Create character vectors without quotation marks. It can be a bit annoying to manually turn Firefox, Chrome, Edge, Safari, Internet Explorer, Opera into the c("Firefox", "Chrome", "Edge", "Safari", "Internet Explorer", "Opera") format R needs to use such text as a vector of character strings.
That's what the Hmisc package's Cs function was designed to do. After loading the Hmisc package,
Cs(Firefox, Chrome, Edge, Safari, Opera)
will evaluate the same as
c("Firefox", "Chrome", "Edge", "Safari", "Opera")
One caveat: each item has to be a syntactically valid R name, so a multi-word entry such as Internet Explorer will cause a syntax error if typed bare.
If you've ever manually added quotation marks to a lengthy string of words, you'll appreciate the elegance.
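For the curious, the idea behind quote-free vectors can be sketched in base R. The helper bare_c below is hypothetical (it is not Hmisc's code, just an illustration of capturing unevaluated argument names), and the strsplit() line shows one way to handle items that contain spaces.

```r
# Hypothetical helper, not part of Hmisc: turn bare names into strings
# by capturing the arguments before R tries to evaluate them.
bare_c <- function(...) {
  args <- as.list(substitute(list(...)))[-1]  # drop the leading `list` symbol
  vapply(args, as.character, character(1))
}

browsers <- bare_c(Firefox, Chrome, Edge, Safari, Opera)
print(browsers)

# Multi-word items are not valid bare names, so split one quoted string instead.
raw <- "Firefox, Chrome, Edge, Safari, Internet Explorer, Opera"
browsers_all <- trimws(strsplit(raw, ",", fixed = TRUE)[[1]])
print(browsers_all)
```

The second approach needs only one pair of quotation marks no matter how many items are in the list.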
RStudio bonus: If you use RStudio, there's another option for sleek vector-string creation. Security pro Bob Rudis created an RStudio add-in that takes selected comma-separated text and adds the necessary quotes and c(). Install it with devtools::install_github("hrbrmstr/hrbraddins") (which means you need the devtools package as well), and you'll see Bare Combine as an option in the RStudio Tools > Addins menu.
You can run it from that Addins menu, but selecting text and then leaving your coding window to go to the Tools > Addins menu to select Bare Combine doesn't necessarily feel less cumbersome than typing a few quotation marks. Much better to create a custom keyboard shortcut for the addin.
You can do that by going to Tools > Modify Keyboard Shortcuts. Scroll down until you see Bare Combine in the Addins section -- or search for Bare Combine in the filter box. Double click in the shortcut area and type the keystroke(s) you want to assign to the addin (I used
Now, any time you want to turn comma-separated plain text into an R vector of character strings, you can highlight the text and use your keyboard shortcut.
By the way, RStudio add-ins are mostly just plain R. If you like having keyboard shortcuts for R tasks like this, it might be worth learning the syntax.
DT::datatable(mydf) creates an interactive HTML table; DT::datatable(mydf, filter = "top") adds a filter box above each column.
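A minimal sketch of the filtered table, assuming the DT package is installed. Here the built-in mtcars data set stands in for mydf, which is otherwise whatever data frame you want to display.

```r
# Sketch: build an interactive table with a filter box per column.
# Assumes DT is installed; mtcars is a stand-in for your own data frame.
if (requireNamespace("DT", quietly = TRUE)) {
  mydf <- head(mtcars, 10)
  widget <- DT::datatable(mydf, filter = "top")
  # In an interactive session, print(widget) displays the table
  # in the RStudio viewer or a browser.
}
```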
Easy file conversions. rio is one of my favorite R packages. Instead of remembering which functions to use for importing what types of files (read.csv? read.table? read_excel?), rio vastly simplifies the process with one import function for a couple of dozen file formats. As long as the file extension is a format that rio recognizes, it will appropriately import from files such as .csv, .json, .xlsx and .html (tables). Same for rio's export command if you'd like to save to a particular file format. But rio has a third major function: convert, which will import and export in a single step. Have a million-row Excel file you need to save as a CSV? An HTML table you'd like to save as JSON? Use a syntax like convert("myfile.xlsx", "myfile.csv"), where the first argument is your existing file and the second is your desired file with the desired extension, and your file will be created.
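Here is a small round trip showing export, convert and import together. It's a sketch that assumes rio is installed; the temporary file paths and the CSV-to-TSV pairing are chosen only so the example runs anywhere without extra packages.

```r
# Sketch: export a data frame, convert it to another format, read it back.
# Assumes rio is installed; file paths are throwaway temp files.
if (requireNamespace("rio", quietly = TRUE)) {
  csv_path <- tempfile(fileext = ".csv")
  tsv_path <- tempfile(fileext = ".tsv")

  rio::export(head(mtcars), csv_path)  # format inferred from the .csv extension
  rio::convert(csv_path, tsv_path)     # import + export in one step
  converted <- rio::import(tsv_path)   # format inferred from the .tsv extension

  print(dim(converted))
}
```

In each call the file extension alone tells rio which reader or writer to use.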
Copy and paste from R to your clipboard. rio bonus: You can copy between your clipboard and R with rio. Send some data from a small R variable to your clipboard with export(myRobject, "clipboard"). Importing from the clipboard should work as well, although I've had mixed success with that.
Import large files quickly - and perhaps save space. I'm working on a project this week that involves a spreadsheet with more than 600,000 rows and 40 columns. Reading it into R took around 25 to 30 seconds -- doable once, but annoying when I had to do it multiple times. The feather binary file format is not only readable by both R and Python, but is considerably faster to read and write. rio handles feather files, or you can use read_feather from the feather package.
For saving space as well as speed, the fst package looks to be an excellent choice because it offers compression. In my testing, write.fst(mydf, "myfile.fst", 100) -- maximum compression -- was just as speedy as no compression, and it took about one-third the space of the original spreadsheet. feather, meanwhile, took up almost double the spreadsheet disk space.
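If you'd like to run a similar comparison on your own machine, something along these lines should work. This is a sketch, assuming the fst package is installed: it uses write_fst() and read_fst(), the function names in recent fst releases, and a made-up data frame rather than the spreadsheet described above, so timings and file sizes will differ.

```r
# Sketch: time an fst write/read round trip at maximum compression.
# Assumes fst is installed; mydf here is synthetic illustration data.
if (requireNamespace("fst", quietly = TRUE)) {
  n <- 1e5
  mydf <- data.frame(
    id    = seq_len(n),
    group = sample(letters, n, replace = TRUE),
    value = rnorm(n)
  )
  fst_path <- tempfile(fileext = ".fst")

  write_time <- system.time(fst::write_fst(mydf, fst_path, compress = 100))
  read_time  <- system.time(roundtrip <- fst::read_fst(fst_path))

  print(write_time["elapsed"])
  print(read_time["elapsed"])
  print(file.size(fst_path))  # size on disk in bytes
}
```

Changing compress to 0 and rerunning gives a rough sense of the speed and size trade-off for your own data.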
A few additional favorites from readers and social media:
More with quotes. In response to the Cs() function that adds quotes, Kwan Lowe touted the usefulness of noquote(), which displays character strings without the surrounding quotes. noquote() is part of base R. Another useful function comes from the varhandle package, which is aimed at making it easier to wrangle variables: unfactor(), which aims to detect the "real" class of an R data frame column of factors and then turn it into either numeric or character variables.
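To see why converting factors takes some care, here is a small base R sketch. It doesn't use varhandle at all; the manual conversion is the standard base R idiom for a factor whose labels are really numbers, and the vectors are made up for illustration.

```r
# A factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

codes  <- as.numeric(f)                # internal level codes: 1 2 1
values <- as.numeric(as.character(f))  # the numbers the labels spell out: 10 20 10
print(codes)
print(values)

# noquote() is in base R: it changes how strings print, not what they contain
print(noquote(c("alpha", "beta")))     # [1] alpha beta
```

Calling as.numeric() directly on a factor is a classic source of silently wrong numbers, which is the problem a helper like unfactor() is meant to spare you.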
table() alternative. Need to calculate frequencies of variables in a data frame? "I'm a huge fan of xtabs()," Timothy Teravainen posted at Google+ in response to this blog. "It's in base R, but I sadly went years without knowing about it."
The format is xtabs(~df$col1 + df$col2), which will return a frequency table with col1 as the rows and col2 as the columns.
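As a quick illustration with a built-in data set, the sketch below counts cars in mtcars by cylinder count and number of gears. The data = argument is an alternative spelling of the same formula that avoids repeating the data frame name; the choice of mtcars and its columns is mine, not the reader's.

```r
# Frequency table: cylinders as rows, gears as columns
tab <- xtabs(~ cyl + gear, data = mtcars)
print(tab)

# Same counts using the df$col form
same <- xtabs(~ mtcars$cyl + mtcars$gear)
```

The data = form also gives tidier dimension names (cyl and gear rather than mtcars$cyl and mtcars$gear).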
Text searching. Finally, if you've been using regular expressions to search for text that starts or ends with a certain character string, there's an easier way. "startsWith() and endsWith() -- did I really not know these?" tweeted data scientist Jonathan Carroll. "That's it, I'm sitting down and reading through dox for every #rstats function."
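A short comparison with the regular-expression route, using made-up file names. Both functions are in base R (version 3.3.0 and later) and are vectorized, so they return one TRUE or FALSE per element.

```r
files <- c("report_2017.csv", "report_2018.xlsx", "notes.csv")

starts <- startsWith(files, "report")  # TRUE TRUE FALSE
ends   <- endsWith(files, ".csv")      # TRUE FALSE TRUE
print(starts)
print(ends)

# Regular-expression equivalents, for comparison
starts_re <- grepl("^report", files)
ends_re   <- grepl("\\.csv$", files)

# Use the result to subset
print(files[ends])
```

Because the search string is matched literally, there's no need to escape characters like the period, as the regex version has to.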
For more on useful R functions, see Great R packages for data import, wrangling and visualization.