#acl HlpLabGroup:read,write,delete,revert,admin All:read
#format wiki
#language en
#pragma section-numbers 2

= R Packages =

== From CRAN ==

Install these packages from CRAN (using the package manager GUI in R/RStudio, or `install.packages()`). If you're using the GUI, always check the "install dependencies" box.

=== Hadleyverse ===

Hadley Wickham has done more than just about anyone to make R more powerful, expressive, and easy to use for common data analysis tasks. These are just the packages that are most often useful for the kind of stuff we do, but if there's a task you are frustrated by in R, Hadley's probably written a package to make it easier.

 * `ggplot2` — Data visualization using the grammar of graphics.
 * `dplyr` — Data manipulation pipelines made easy. Noticeably distinct from its spiritual predecessor `plyr`. `dplyr` and `plyr` conflict, so don't load both at the same time.
 * `tidyr` — Data cleaning and [[http://blog.rstudio.org/2014/07/22/introducing-tidyr/|tidying]], including reshaping from wide to long (`gather`) and long to wide (`spread`) (replaces `reshape`/`reshape2`). Also has very useful functions like `separate`, for splitting up columns with values like 'beach_b_10' into separate columns with 'beach', 'b', and '10'.
 * `devtools` — Automate common package development workflows. Most useful for [[http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/|writing custom packages]] but also provides idiot-proof installation of packages from source on GitHub/Bitbucket or arbitrary URLs via `devtools::install_github` etc. (see below).
 * `stringr` and `lubridate` — Process strings and dates/times with less pain.

=== Everything else ===

 * `knitr` — Literate programming for R. Mix code with text (markdown or LaTeX), and `knitr::knit` will run the code, format the output all purdy, and generate an HTML/PDF report. See [[LabmeetingSP13w13|notes from previous lab meeting on using knitr]].
 * `lme4` — Mixed effects modeling.
 * `multcomp` — Simultaneous tests and confidence intervals for multiple comparisons.
 * `gsubfn` — More powerful string replacement.
 * `hexbin` — Tired of your boring old square bins? Try some exciting hexbins! Now with two extra sides!
 * `languageR` — Lots of language-specific datasets and code to go along with Baayen's book, "Analyzing Linguistic Data: A practical introduction to statistics".

=== For heavy Bayesian lifting ===

You don't need these unless you want to do any kind of Bayesian modeling.

 * `MCMCglmm` — Does what it says on the tin: Bayesian inference via MCMC for generalized linear mixed models. Much more flexible and powerful than `lme4`, but with a steep learning curve.
 * `rstan` — [[http://mc-stan.org/rstan.html|R interface]] to the [[http://mc-stan.org/|Stan modeling language]]. Good for very efficient sampling of hierarchical models. Doesn't exactly supersede JAGS/BUGS, which are often easier to use and more appropriate for simple models or models where you need to sample categorical variables. '''Note: this package builds a TON of stuff from source and takes a long time to install.'''
 * `glmer2stan` and `rethinking` from [[https://github.com/rmcelreath]]. The first compiles glmer-style mixed model formulas into Stan code (see [[https://hlplab.wordpress.com/2013/12/13/going-full-bayesian-with-mixed-effects-regression-models/|this blog post]]). The second is a more mature and flexible (and actively developed) package (and textbook) that includes `map2stan()` for compiling graphical model-style model specifications (like from JAGS/BUGS) into Stan code. Both need to be installed from GitHub, so use `devtools::install_github('rmcelreath/rethinking')` etc. (as discussed below).
 * `DPpackage` — Functions for Bayesian inference via simulation in nonparametric/semiparametric models (e.g. the eponymous Dirichlet Process or "DP").
 * `mvtnorm` — Multivariate normal and t distribution, probability, and sampling functions.
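
To make the `tidyr` functions listed under Hadleyverse a bit more concrete, here is a minimal sketch of a wide-to-long-to-wide round trip. The data frame and its column names (`subject`, `rt`, `word`, `onset`, `vot`) are made up for illustration. Newer versions of `tidyr` also offer `pivot_longer`/`pivot_wider` for the same job, but `gather` and `spread` still work.

{{{#!highlight r numbers=disable
library(tidyr)

# Hypothetical wide-format data: one row per subject, one column per condition.
wide <- data.frame(
  subject = c("s1", "s2"),
  beach_b_10 = c(510, 480),
  peach_p_50 = c(530, 495)
)

# gather(): wide -> long. The condition columns collapse into key/value pairs.
long <- gather(wide, key = "condition", value = "rt", -subject)

# separate(): split values like 'beach_b_10' into 'beach', 'b', and '10'.
long <- separate(long, condition, into = c("word", "onset", "vot"), sep = "_")

# spread(): long -> wide again, this time with one column per word.
back_to_wide <- spread(long[, c("subject", "word", "rt")], key = word, value = rt)
}}}

After `gather`, each subject has one row per condition; after `spread`, each subject is back to a single row.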

A quicker way to install most of these at once is to copy and paste the following line at your R prompt:

{{{#!highlight r numbers=disable
install.packages(c("tidyverse","knitr","devtools","DPpackage","gsubfn","hexbin","languageR","lme4","MCMCglmm","multcomp","ez"))
}}}

`tidyverse` includes: broom, dplyr, forcats, ggplot2, haven, httr, hms, jsonlite, lubridate, magrittr, modelr, purrr, readr, readxl, stringr, tibble, rvest, tidyr, xml2

== From github (source) ==

Sometimes a package isn't available on CRAN (usually temporarily) as a binary for your platform. Or it's not on CRAN at all, but is hosted on GitHub or Bitbucket or something. In both of these cases, you'll need to install from source.

=== One-time set up: developer tools ===

If you need to build a package from source, make sure you have the developer tools for your OS installed. For MacOS, they are available on the App Store if you have the most up-to-date version of MacOS, or from the [[https://developer.apple.com/downloads/index.action?q=xcode|developer site]] (where you'll need to register for a free account first) if you have anything less than the most up-to-date version. I ''think'' you only need to install the "Command Line Tools", not Xcode itself.

You may also need Fortran, which you can install very easily using [[http://brew.sh/|homebrew]], e.g., `brew install gcc`, which includes `gfortran` (recommended), or from the [[http://cran.r-project.org/bin/macosx/tools/|MacOS tools]] page on CRAN. [[http://scicomp.stackexchange.com/a/2470|This StackExchange answer]] is a good discussion of the pros and cons of various ways to install Fortran on MacOS.

The last thing you'll need is to install `devtools` with `install.packages('devtools')` in R.

=== Installing packages from source ===

Let's say I want to install `dplyr` from the GitHub source. I google it and find that it's hosted at [[http://github.com/hadley/dplyr]]. Then, in R:

{{{#!highlight r numbers=disable
library(devtools)
# The argument is 'user/repo', taken from the GitHub URL.
devtools::install_github('hadley/dplyr')
}}}

Piece of cake.
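
Building from source can take a while, so it may be worth checking first whether the package is already installed and recent enough. The helper below is a sketch using only base R; `needs_install` is a made-up name, not part of `devtools`, and the version string is just an example.

{{{#!highlight r numbers=disable
# Hypothetical helper: TRUE if `pkg` is missing or older than `min_version`.
needs_install <- function(pkg, min_version = NULL) {
  # requireNamespace() returns FALSE (without an error) if the package is absent.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  # No minimum requested: any installed version will do.
  if (is.null(min_version)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) < package_version(min_version)
}

if (needs_install("dplyr", min_version = "0.4.1")) {
  message("dplyr is missing or out of date; install it as shown above.")
}
}}}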

`devtools` includes a whole family of functions for installing source from pretty much anywhere you might find it. If, for instance, you want to install from the source archive on CRAN (e.g., [[http://cran.r-project.org/src/contrib/dplyr_0.4.1.tar.gz]]), you can use the `install_url` command:

{{{#!highlight r numbers=disable
library(devtools)
devtools::install_url('http://cran.r-project.org/src/contrib/dplyr_0.4.1.tar.gz')
}}}

== From Bioconductor ==

'''I have no idea whether this is still necessary but I'm leaving it here for posterity's sake — Dave'''

When you install the packages above, you may get this warning: `Warning: dependencies ‘marray’, ‘affy’, ‘Biobase’, ‘Rgraphviz’, ‘’ are not available`. To fix it, install the standard packages from [[http://www.bioconductor.org/docs/install/|Bioconductor]] by doing the following at the R prompt:

{{{#!highlight r numbers=disable
source("http://bioconductor.org/biocLite.R")
biocLite(lib='/Library/Frameworks/R.framework/Resources/library/')
}}}

adjusting `lib` as appropriate for your OS. The example above is for Mac OS X. For Windows use:

{{{#!highlight r numbers=disable
biocLite(lib='C:\\Program Files\\R\\R-2.11.1\\library')
}}}

adjusting the version number as appropriate. (N.B. You must have Administrator privileges to install anything under `C:/Program Files/`.)
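
On current versions of R, the `biocLite` script has been replaced by the `BiocManager` package from CRAN. If the commands above fail, something like the following sketch may work instead. The wrapper name `install_bioc` is made up, and this assumes a reasonably recent R (3.5 or later).

{{{#!highlight r numbers=disable
# Hypothetical wrapper around BiocManager (assumes R >= 3.5).
install_bioc <- function(pkgs) {
  # BiocManager itself lives on CRAN, so install it the usual way if missing.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkgs)
}

# Example call (downloads from the network, so it is left commented out):
# install_bioc(c("marray", "affy", "Biobase", "Rgraphviz"))
}}}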