Twitter's Favorite Lesser Known Packages
At the December 2018 NYC R-Ladies meetup (yes, this post has been sitting in my drafts for over a year), a group started talking about how a few tiny functions in a lesser-known package can provide you with serious magic. The problem is finding those packages and functions! With so many amazing packages on CRAN and GitHub, how do you even begin to search? One way: ask all your Twitter followers what they think. Twitter did not disappoint, so here are some amazing packages and functions you might want to learn about.
The suggestions seemed to fall into a few buckets: making tasks you do all the time easier (cleaning data, summaries), dealing with data types that aren't easy to work with (factors, strings, etc.), visualizations, and so much more.
Data Tasks
My favorite lesser-known package is janitor by Sam Firke. It has basic functions to clean and prep messy data files. Most of them are fairly easy to replicate with dplyr, but why write the same thing over and over when janitor does it for you?
Mine are, from janitor...
1. clean_names
2. get_dupes
3. remove_empty #rstats
— Erin Grand (@astroeringrand) December 11, 2018
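Here is a minimal sketch of those three janitor functions in action. The `messy` data frame and its column names are made up for illustration, so treat this as a rough demo rather than a recipe.

```r
library(janitor)

# Hypothetical messy data frame, invented for this example
messy <- data.frame(
  "First Name"  = c("Ana", "Ben", "Ana", NA),
  "Total Score" = c(10, 12, 10, NA),
  "Empty Col"   = c(NA, NA, NA, NA),
  check.names = FALSE
)

# 1. clean_names(): turn awkward column names into snake_case
cleaned <- clean_names(messy)

# 2. remove_empty(): drop rows and columns that are entirely NA
cleaned <- remove_empty(cleaned, which = c("rows", "cols"))

# 3. get_dupes(): return the duplicated rows plus a dupe_count column
dupes <- get_dupes(cleaned, first_name, total_score)

names(cleaned)
dupes
```

The dplyr equivalent of each step is not hard to write, but these one-liners are much easier to remember.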
skimr, suggested by Fernando Flores, started at an rOpenSci unconf and provides a better summary function. It creates both a tidy version of the summary table to work with and a visual version to inspect, which is super useful for investigating data issues.
Couldn't choose just one package, so here we go:
skimr::skim
covr::report
DT::JS
— Fernando Flores (@ds_floresf) December 11, 2018
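As a quick sketch of skimr, here is `skim()` on the built-in `iris` data. The dataset is my pick for illustration, not part of the original suggestion.

```r
library(skimr)

# skim() returns a tidy summary with one row per variable
summary_tbl <- skim(iris)

# Printing gives the visual version to inspect
summary_tbl

# The same object can be treated as a regular table to work with
missing_by_variable <- as.data.frame(summary_tbl)[, c("skim_variable", "n_missing")]
missing_by_variable
```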
Data Types
The tidyverse packages for dealing with specific data types are not nearly as widely used as they could be; forcats, lubridate, glue, and stringr can help solve so many problems with factors, dates, and strings.
From forcats:
1. fct_infreq
2. fct_rev
3. fct_drop
— Emily Zabor (@zabormetrics) December 14, 2018
forcats::fct_lump https://t.co/2BboLbdzuS
glue::glue and glue::glue_data https://t.co/Bxt20MQGi2
Cheated and use 2x packages.
— Thomas Mock (@thomas_mock) December 11, 2018
you stole mine! this is kind of cheating but from lubridate: year(), month(), day()
— Luuuda (@ludmila_janda) December 11, 2018
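To give a feel for the functions mentioned in these tweets, here is a small sketch. The `pets` factor is made up for illustration, and `mtcars` is just a convenient built-in dataset.

```r
library(forcats)
library(glue)
library(lubridate)

# Hypothetical factor, invented for this example
pets <- factor(c("dog", "cat", "dog", "fish", "dog", "cat", "newt"))

by_freq  <- fct_infreq(pets)                # order levels from most to least common
reversed <- fct_rev(by_freq)                # flip the level order
lumped   <- fct_lump(pets, n = 2)           # keep the 2 most common levels, lump the rest into "Other"
no_newt  <- fct_drop(pets[pets != "newt"])  # drop levels that no longer appear in the data

# glue(): paste values into a string with {} placeholders
pkg <- "forcats"
sentence <- glue("My favorite package is {pkg}")

# glue_data(): same idea, with placeholders looked up in a data frame
car_labels <- glue_data(head(mtcars, 2), "{mpg} mpg, {cyl} cylinders")

# lubridate: pull the pieces out of a date
tweet_date <- ymd("2018-12-11")
date_parts <- c(year(tweet_date), month(tweet_date), day(tweet_date))
```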
Plotting Support
A few of the recommendations focused on visualizations and plotting. Key shout-outs go to naniar and patchwork. naniar helps you visualize your missing values, and patchwork lets you combine plots.
From two packages, super handy at first steps after loading dataset:
naniar::gg_miss_var
summarytools::descr
summarytools::freq
— Radoslaw Panczak (@RPanczak) December 12, 2018
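A rough sketch of naniar and patchwork working together, using the built-in `airquality` data (my choice, since it has missing values; it is not from the original tweets):

```r
library(ggplot2)
library(naniar)
library(patchwork)

# naniar: one bar per variable showing how many values are missing
missing_plot <- gg_miss_var(airquality)

# An ordinary ggplot to sit next to it (rows with NA are dropped quietly)
scatter <- ggplot(airquality, aes(x = Temp, y = Ozone)) +
  geom_point(na.rm = TRUE)

# patchwork: `+` places the two plots side by side
combined <- missing_plot + scatter
combined
```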
Other
There were a ton of other excellent package suggestions.
The magrittr package has many useful operators outside of the normal %>% pipe.
I was going to say %<>% , %<>% , and %<>% from magrittr - I use it all the time now thanks to @robinson_es - but now I'm browsing other magrittr functions and the aliases like extract() etc would be v handy when piping
— Sarah R (@srhrnkn) December 12, 2018
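For anyone who has not met these yet, here is a minimal sketch of the assignment pipe and the extract aliases. The values are invented for illustration.

```r
library(magrittr)

# %<>% pipes x into the function and assigns the result back to x
x <- c(9, 16, 25)
x %<>% sqrt()   # same as x <- x %>% sqrt()

# extract2() and extract() are pipe-friendly aliases for `[[` and `[`
scores <- list(a = 1:3, b = 4:6)
second_b <- scores %>%
  extract2("b") %>%   # scores[["b"]]
  extract(2)          # ...[2]
```

One thing to watch for: other packages also export a function called `extract` (tidyr, for example), so loading order matters.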
If you work with spatial data at all, the sf package is a must.
The sf package cleared my skin, cleaned my home & cured my anxiety
— Brooke Watson (@brookLYNevery1) December 11, 2018
I added the `conflicted` package to my RProfile this summer, and I really love that it warns me about possible name conflicts _before_ I run into problems pic.twitter.com/46Y88gexP9
— Irene Steves (@i_steves) January 25, 2019
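A small sketch of what that looks like in practice, assuming dplyr is the package you actually want. As I understand it, conflicted turns an ambiguous call into an error until you state a preference.

```r
library(conflicted)
library(dplyr)

# Without a stated preference, calling filter() here stops with an error
# that lists dplyr::filter and stats::filter as candidates.
conflict_prefer("filter", "dplyr")

# Now filter() unambiguously means dplyr::filter()
four_cyl <- filter(mtcars, cyl == 4)
nrow(four_cyl)
```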
What is your favorite lesser-known package or function? Sound off in the comments (or find me on Twitter).