expand {tidyr} | R Documentation |
expand() is often useful in conjunction with left_join() if you want to convert implicit missing values to explicit missing values. Or you can use it in conjunction with anti_join() to figure out which combinations are missing.
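The two joins described above can be sketched as follows. This is a minimal illustration, not part of the original page: the data frame `sales` and its columns are hypothetical, and `tibble()` is assumed to be available through dplyr.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the (2012, qtr 2) combination is implicitly missing
sales <- tibble(
  year   = c(2011, 2011, 2012),
  qtr    = c(1, 2, 1),
  amount = c(10, 20, 30)
)

# All combinations of year and qtr, including those absent from the data
combos <- sales %>% expand(year, qtr)

# left_join() turns the implicit missing value into an explicit NA row
explicit <- combos %>% left_join(sales, by = c("year", "qtr"))

# anti_join() keeps only the combinations that are missing from the data
absent <- combos %>% anti_join(sales, by = c("year", "qtr"))
```

Here `explicit` has one row per combination, with `amount` set to NA where no observation exists, while `absent` holds only the unobserved combination.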
expand(data, ...)
crossing(...)
nesting(...)
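data | A data frame.
... | Specification of columns to expand. To find all unique combinations of x, y and z, including those not found in the data, supply each variable as a separate argument. To find only the combinations that occur in the data, use nesting(): expand(df, nesting(x, y, z)). You can combine the two forms. For example, expand(experiment, nesting(name, trt), rep) produces a row for every name and treatment pair that occurs in the data, at each value of rep. To fill in values that are missing altogether, use expressions like year = 2010:2012 or year = full_seq(year, 1).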
crossing() is similar to expand.grid(), but it never converts strings to factors, returns a tbl_df without additional attributes, and first factors vary slowest. nesting() is the complement to crossing(): it only keeps combinations of all variables that appear in the data.
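The differences described above can be checked with a short sketch. The vectors `x` and `y` are hypothetical inputs chosen for illustration, and the comparison assumes the default `stringsAsFactors = TRUE` of expand.grid().

```r
library(tidyr)

# Hypothetical inputs for illustration
x <- c("a", "b")
y <- 1:2

# expand.grid(): the first argument varies fastest, and strings
# become factors under its default settings
grid <- expand.grid(x = x, y = y)

# crossing(): the first argument varies slowest, and strings stay character
cross <- crossing(x = x, y = y)

# nesting(): only the combinations that occur together in the inputs
nested <- nesting(x = x, y = y)
```

Comparing `grid$x` with `cross$x` shows the difference in ordering and column type, and `nested` contains only the pairs that appear side by side in the inputs.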
See complete() for a common application of expand(): completing a data frame with missing combinations.
See expand_() for a version that uses regular evaluation and is suitable for programming with.
library(dplyr)
library(tidyr)

# All possible combinations of vs & cyl, even those that aren't
# present in the data
expand(mtcars, vs, cyl)

# Only combinations of vs and cyl that appear in the data
expand(mtcars, nesting(vs, cyl))

# Implicit missings ---------------------------------------------------------
df <- data_frame(
  year   = c(2010, 2010, 2010, 2010, 2012, 2012, 2012),
  qtr    = c(   1,    2,    3,    4,    1,    2,    3),
  return = rnorm(7)
)
df %>% expand(year, qtr)
df %>% expand(year = 2010:2012, qtr)
df %>% expand(year = full_seq(year, 1), qtr)
df %>% complete(year = full_seq(year, 1), qtr)

# Nesting -------------------------------------------------------------------
# Each person was given one of two treatments, repeated three times
# But some of the replications haven't happened yet, so we have
# incomplete data:
experiment <- data_frame(
  name = rep(c("Alex", "Robert", "Sam"), c(3, 2, 1)),
  trt  = rep(c("a", "b", "a"), c(3, 2, 1)),
  rep = c(1, 2, 3, 1, 2, 1),
  measurement_1 = runif(6),
  measurement_2 = runif(6)
)

# We can figure out the complete set of data with expand()
# Each person only gets one treatment, so we nest name and trt together:
all <- experiment %>% expand(nesting(name, trt), rep)
all

# We can use anti_join to figure out which observations are missing
all %>% anti_join(experiment)

# And use right_join to add in the appropriate missing values to the
# original data (every row of `all` is kept, so absent rows become NA)
experiment %>% right_join(all)

# Or use the complete() short-hand
experiment %>% complete(nesting(name, trt), rep)