rmr.sample {rmr2} | R Documentation |
Sample large data sets
rmr.sample(input, output = NULL, method = c("any", "Bernoulli"), ...)
input |
The data set to be sampled as a file path or |
output |
Where to store the result. See |
method |
One of "any" or "Bernoulli". "any" returns some records, optimized for speed, but with no statistical guarantees. "Bernoulli" implements independent sampling according to the Bernoulli distribution. |
... |
Additional arguments to fully specify the sample; they depend on the method selected. If it is "any", then the size of the desired sample should be provided as the argument |
The sampled data. See mapreduce for details.
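
The difference between the two methods can be illustrated with a minimal base R sketch. It does not call rmr2 and is not the package's implementation. The helper names any_sample and bernoulli_sample, the parameters size and prob, and the records vector are all hypothetical and chosen only for illustration.

```r
# Conceptual sketch in base R; does not call rmr2 or Hadoop.
set.seed(42)  # makes this illustration reproducible

records <- seq_len(1000)  # hypothetical in-memory data set

# "any"-style sampling: return some records quickly, with no
# statistical guarantees. This sketch just takes the first `size` records.
any_sample <- function(x, size) head(x, size)

# "Bernoulli"-style sampling: keep each record independently
# with probability `prob`, so the sample size is random.
bernoulli_sample <- function(x, prob) x[runif(length(x)) < prob]

a <- any_sample(records, 10)
b <- bernoulli_sample(records, 0.1)

length(a)  # fixed sample size
length(b)  # varies from run to run around prob * length(records)
```

The "any"-style result has a fixed size but is biased toward whichever records come first, whereas the Bernoulli-style result is an unbiased independent sample whose size is only known in expectation.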