sample {dplyr} | R Documentation |
This is a wrapper around sample.int to make it easy to select random rows from a table. It currently only works for local tbls.
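As a rough illustration of what "a wrapper around sample.int" means, the sketch below draws rows from a data frame using base R only. The helper name sample_rows is hypothetical and is not part of dplyr; the real functions also handle grouped tbls and evaluate the weight expression for you, which this sketch does not.

```r
# Hypothetical helper (not dplyr's implementation): draw `size` rows from a
# data frame by sampling row indices with sample.int().
sample_rows <- function(df, size, replace = FALSE, weight = NULL) {
  # sample.int() returns `size` indices drawn from 1..nrow(df);
  # `prob` takes optional per-row sampling weights.
  idx <- sample.int(nrow(df), size = size, replace = replace, prob = weight)
  df[idx, , drop = FALSE]
}

set.seed(1)                        # only to make the draw repeatable
picked <- sample_rows(mtcars, 10)  # 10 distinct rows of mtcars
nrow(picked)
```

Because the draw above is without replacement, asking for more rows than the table holds is an error in sample.int; pass replace = TRUE to oversample.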
sample_n(tbl, size, replace = FALSE, weight = NULL, .env = parent.frame())
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = parent.frame())
tbl | A tbl of data. |
size | For sample_n, the number of rows to select. For sample_frac, the fraction of rows to select. If tbl is grouped, size applies to each group. |
replace | Sample with or without replacement? |
weight | Sampling weights. This expression is evaluated in the context of the data frame. It must return a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. |
.env | Environment in which to look for non-data names used in weight. |
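The weight argument can be pictured with a small base R sketch: evaluate an expression against the data, check that it yields one non-negative number per row, and standardise it to sum to 1 before sampling. This only mirrors the description above under those assumptions; it is not dplyr's internal code, and the variable names w, p and idx are illustrative.

```r
# Evaluate a weight expression in the context of the data frame.
w <- with(mtcars, 1 / mpg)

# The weights must be non-negative and have one value per input row.
stopifnot(length(w) == nrow(mtcars), all(w >= 0))

# Standardise the weights so that they sum to 1.
p <- w / sum(w)

set.seed(42)                                        # repeatable draw
idx <- sample.int(nrow(mtcars), size = 5, prob = p) # weighted, no replacement
mtcars[idx, c("mpg", "cyl")]
```

With 1 / mpg as the weight, rows with low fuel economy are more likely to be drawn than rows with high mpg.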
library(dplyr)

by_cyl <- mtcars %>% group_by(cyl)

# Sample fixed number per group
sample_n(mtcars, 10)
sample_n(mtcars, 50, replace = TRUE)
sample_n(mtcars, 10, weight = mpg)

sample_n(by_cyl, 3)
sample_n(by_cyl, 10, replace = TRUE)
sample_n(by_cyl, 3, weight = mpg / mean(mpg))

# Sample fixed fraction per group
# Default is to sample all data = randomly resample rows
sample_frac(mtcars)

sample_frac(mtcars, 0.1)
sample_frac(mtcars, 1.5, replace = TRUE)
sample_frac(mtcars, 0.1, weight = 1 / mpg)

sample_frac(by_cyl, 0.2)
sample_frac(by_cyl, 1, replace = TRUE)
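For readers who want to see what sampling a fixed number per group amounts to, here is a base R sketch that splits row indices by cyl and draws three per group. It is an assumption-laden illustration of the idea, not how dplyr implements grouped sampling; rows_by_cyl, picked and per_group are names made up for this example.

```r
set.seed(7)  # repeatable draw

# One integer vector of row indices per value of cyl.
rows_by_cyl <- split(seq_len(nrow(mtcars)), mtcars$cyl)

# Draw 3 row indices within each group. Indexing through sample.int() avoids
# the sample(x) pitfall where a length-one numeric x is treated as 1:x.
picked <- lapply(rows_by_cyl, function(rows) {
  rows[sample.int(length(rows), size = 3)]
})

# Combine the per-group picks into one data frame.
per_group <- mtcars[unlist(picked), ]
table(per_group$cyl)
```

Each of the three cyl groups in mtcars contributes exactly three rows, so per_group has nine rows in total.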