Welcome to R!


  • R is an incredibly powerful tool; knowing how to use it is an incredibly valuable skill. However, learning it takes the same diligence as learning a human language.
  • R is a very capable calculator; you can use it to do math.
  • R, like human languages, has punctuation marks (operators) that have specific meanings and usage rules.
  • R “sentences” are called commands. They include inputs as well as instructions for operations we want R to perform for us.
  • Script files are text files that allow us to write and store code and annotations and then “teleport” this code to the Console when we’re ready.
  • “Nouns” in R are called objects. These are impermanent until we use assignment commands to name them, in which case they persist until we overwrite them or close R.
  • Our environment is everything we have named since starting our R session.
  • There are rules about what we can and can’t (and should and shouldn’t) name objects in R.
  • R is case-sensitive, so capital and lowercase letters are distinct.
  • It’s good to follow a consistent naming convention for objects.
  • Objects in R take many different shapes, and the values they store can be of many different types (the “adjectives” of the R language).
  • In R, “verbs” are called functions. Functions take inputs, perform operations, and produce outputs. Some functions do math; others might create new objects.
  • Functions have slots for specific inputs. These slots are called parameters, and they have names. The inputs we provide to these slots are called arguments. Arguments are matched to parameters by position unless we name them, so they should be given in the expected order (or by name), and they should be of the expected types.
  • Some function inputs are optional; others are required. Optional inputs are like R’s “adverbs” in that they often control how R performs a specific operation.
  • If we want help with a function, we can use the ? operator to open its help page.
  • The = symbol has many different uses in R, which can be confusing. As such, consider using <- for assignment.
  • Logical tests are like “questions” in R. There are many different logical operators to ask a variety of questions.
  • Use the square bracket operators [ ] to peek inside objects in indexing commands. Indexing commands can also be used to update or subset objects, and their format differs for 1D vs. 2D object types.
  • Installing packages is necessary to have access to them, but even then, packages must be turned on (loaded with library()) in each R session to use their features.
  • Your working directory is the folder R assumes it should interact with on your computer when loading/saving files.
  • R Project folders are handy for keeping organized when working on a large, important, or complex project.
  • Loading files in R typically requires a read function; saving files typically requires a write function.
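
As a rough illustration of several points above (assignment, functions with named arguments, logical tests, and indexing), here is a minimal sketch in base R. The object names and values (plant_heights, plots, and so on) are invented for this example and do not come from the lesson itself.

```r
# Assignment: naming an object makes it persist in our environment.
plant_heights <- c(12, 30, 8, 22)   # a 1D object (a numeric vector)

# A function call with named arguments: `x` is required by round(),
# while `digits` is optional and controls how the rounding is done.
mean_height <- round(x = mean(plant_heights), digits = 1)

# A logical test "asks a question" of every value in the vector.
is_tall <- plant_heights > 20

# Indexing a 1D object takes a single set of positions (or TRUE/FALSE values).
tall_plants <- plant_heights[is_tall]

# Indexing a 2D object takes the form [rows, columns].
plots <- data.frame(id = 1:4, height = plant_heights)
second_height <- plots[2, "height"]

# Indexing can also be used to update a value in place.
plots[2, "height"] <- 31
```

Running ?round in the Console would open the help page that lists the parameters used above.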

Exploring the Tidyverse, a modern R "dialect"


  • Use the dplyr package to manipulate data frames in efficient, clear, and intuitive ways.
  • Use select() to retain specific columns when creating subsetted data frames.
  • Use filter() to create a subset by rows, using logical tests to determine whether each row should be kept or dropped.
  • Use group_by() and summarize() to generate summaries of categorical groups within a data set.
  • Use mutate() to create new variables using old ones as inputs.
  • Use rename() to rename columns.
  • Use arrange() to sort your data set by one or more columns.
  • Use pipes (%>%) to string dplyr verbs together into “paragraphs.”
  • Remember that order matters when stringing together dplyr verbs!
  • Every ggplot graph requires a ggplot() call, a data set, some mapped aesthetics, and one or more geometries. Mixing and matching data and their types, aesthetics, and geometries can result in a near-infinite number of different base graphs.

  • Mapping an aesthetic means linking a visual component of your graph, such as colors or a specific axis, to either a column of data in your data set or to a constant value. To do the former, you must use the aes() function. To do the latter, you can use aes(), but you don’t need to.

  • Aesthetics can be mapped (and data sets provided) globally within the ggplot() call, in which case they will apply to all graph components, or locally within individual geom_*() calls, in which case they will apply only to that layer.

  • If you want to adjust the appearance of any aesthetic, use the appropriate scale_*() family function.

  • If you want to adjust the appearance of any text box, line, or rectangle, use the theme() function, the proper parameter, and the appropriate element_*() function.

  • In ggplot commands, the + operator is used to (literally) add additional components or layers to a graph. If multiple layers are added, the layers added later in the command will appear on top of layers specified earlier in the command and may cover them up. If multiple, conflicting specifications are given for a property in the same ggplot command, whichever specification is given later will “win out.”

  • Craft a single theme() command you can use to provide consistent base styling for every one of your graphs!

  • Use faceting to automatically create a series of sub-panels, one for each member of a grouping variable, that will share the same aesthetics and design properties.

  • Use the ggsave() function to programmatically save ggplots as image files with the desired resolution, size, and file type.

  • Use the tidyr package to reshape the organization of your data.
  • Use pivot_longer() to go towards a “longer” layout.
  • Use pivot_wider() to go towards a “wider” layout.
  • Recognize that “longness” and “wideness” form a continuum and that your data may not be as “long” or as “wide” as they could be.
  • Recognize that there are advantages and disadvantages to every data layout.
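
The points above can be sketched together in one short script, assuming the dplyr, tidyr, and ggplot2 packages are installed. The plants data frame and all of its column names are hypothetical, made up only to give the verbs something to act on.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical data: one row per measured plant.
plants <- data.frame(
  site    = c("A", "A", "B", "B", "B"),
  species = c("oak", "elm", "oak", "oak", "elm"),
  height  = c(12, 30, 8, 22, 15)
)

# A "paragraph" of dplyr verbs strung together with pipes.
# Order matters: the rows are filtered before the groups are summarized.
site_summary <- plants %>%
  filter(height > 10) %>%                  # keep rows that pass a logical test
  mutate(height_m = height / 100) %>%      # create a new column from an old one
  group_by(site, species) %>%
  summarize(mean_height_m = mean(height_m), .groups = "drop") %>%
  arrange(site, species)                   # sort by one or more columns

# Reshape toward a "wider" layout: one column per species.
wide_summary <- site_summary %>%
  pivot_wider(names_from = species, values_from = mean_height_m)

# A minimal graph: a ggplot() call, a data set, globally mapped aesthetics,
# and one geometry with a locally mapped aesthetic (color) and a constant (size).
height_plot <- ggplot(plants, aes(x = site, y = height)) +
  geom_point(aes(color = species), size = 3) +
  facet_wrap(~ species) +                  # one sub-panel per species
  theme(legend.position = "bottom")
```

To write the graph to disk, a call along the lines of ggsave("heights.png", plot = height_plot, width = 6, height = 4, dpi = 300) could follow; the file name and dimensions here are placeholders.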

Control Flow--if() and for()


  • Use if and else to have R choose on the fly which operations it should perform.
  • Use for to repeat operations many times.
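
A small sketch of both ideas working together, using made-up temperature values: the for loop repeats an operation once per element, and if/else picks which operation to run each time.

```r
temperatures <- c(-3, 12, 25)

# Pre-allocate a character vector to hold one description per temperature.
descriptions <- character(length(temperatures))

for (i in seq_along(temperatures)) {
  if (temperatures[i] < 0) {
    descriptions[i] <- "freezing"
  } else if (temperatures[i] < 20) {
    descriptions[i] <- "mild"
  } else {
    descriptions[i] <- "warm"
  }
}
```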

Vectorization


  • Use vectorized operations instead of loops when the same operation applies to every element of an object.
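
To make the contrast concrete, here is the same unit conversion written both ways. The values are invented for illustration; the two approaches should give the same result, but the vectorized version is shorter and typically faster.

```r
lengths_cm <- c(10, 25, 40)

# Loop version: convert one element at a time.
lengths_in_loop <- numeric(length(lengths_cm))
for (i in seq_along(lengths_cm)) {
  lengths_in_loop[i] <- lengths_cm[i] / 2.54
}

# Vectorized version: the division is applied to every element at once.
lengths_in_vec <- lengths_cm / 2.54
```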

Functions Explained


  • Use function to define a new function in R.
  • Use parameters to pass values into functions.
  • Use stopifnot() to flexibly check function arguments in R.
  • Load functions into programs using source().
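
A minimal sketch of these points, using a hypothetical helper named fahr_to_celsius (the name and the temperature conversion are assumptions chosen for illustration):

```r
# Define a new function with one parameter, temp_f.
fahr_to_celsius <- function(temp_f) {
  # Stop with an error unless the argument is numeric.
  stopifnot(is.numeric(temp_f))
  # The value of the last expression is returned.
  (temp_f - 32) * 5 / 9
}

boiling_c <- fahr_to_celsius(212)
```

If this definition were saved in a script file (say, a file called functions.R, a placeholder name), another script could load it with source("functions.R") before calling the function.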