spark_apply {sparklyr} | R Documentation |
Applies an R function to a Spark object (typically, a Spark DataFrame).
spark_apply(x, f, columns = colnames(x), memory = TRUE, group_by = NULL, packages = TRUE, ...)
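As a minimal sketch of the call shape above, the following defines a partition-level function, checks it on an ordinary local data frame, and wraps the Spark call in a function so that nothing connects to Spark until it is invoked. The names `add_squared` and `run_on_spark` are hypothetical, chosen for this illustration. The sketch assumes sparklyr and dplyr are installed and that a local Spark installation is available when `run_on_spark()` is called.

```r
# Partition-level transform: receives a data frame, returns a data frame.
# `add_squared` is a hypothetical name used only in this sketch.
add_squared <- function(df) {
  df$squared <- df$id^2
  df
}

# Check the function locally on an ordinary data frame first.
local_result <- add_squared(data.frame(id = 1:3))

# Wrapped in a function so that Spark is only touched when this is called.
# Assumes a local Spark installation (master = "local").
run_on_spark <- function() {
  sc <- sparklyr::spark_connect(master = "local")
  on.exit(sparklyr::spark_disconnect(sc))

  # sdf_len() creates a Spark DataFrame with a single `id` column.
  sdf <- sparklyr::sdf_len(sc, 10)

  # Name both output columns explicitly, since f adds one to the input.
  result <- sparklyr::spark_apply(sdf, add_squared, columns = c("id", "squared"))

  # Collect before the connection is closed by on.exit().
  dplyr::collect(result)
}
```

Testing the function on a local data frame before handing it to `spark_apply` keeps errors in `f` easy to diagnose, since failures raised on Spark workers are harder to read.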
x |
An object (usually a |
f |
A function that transforms a data frame partition into a data frame.
The function |
columns |
A vector of column names or a named vector of column types for the transformed object. Defaults to the names from the original object and adds indexed column names when not enough columns are specified. |
memory |
Boolean; should the table be cached into memory? |
group_by |
Column name used to group by data frame partitions. |
packages |
Boolean to distribute … For clusters using Livy or Yarn cluster mode, … For offline clusters where … |
... |
Optional arguments; currently unused. |
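The `group_by` argument described above can be illustrated with a hedged sketch: a summary function is applied once per group, first emulated locally with `split()` and `lapply()`, then passed to `spark_apply` inside a wrapper that is defined but not run. The names `summarise_group`, `run_grouped_on_spark`, and the table name `mtcars_sketch` are hypothetical. The local emulation only approximates the grouping semantics, and how the grouping column appears in the Spark result may vary between sparklyr versions.

```r
# Per-group summary: one data frame per group in, a one-row data frame out.
# `summarise_group` is a hypothetical name used only in this sketch.
summarise_group <- function(df) {
  data.frame(n = nrow(df), mean_mpg = mean(df$mpg))
}

# Local emulation of grouping by `cyl`: split, apply, then bind the rows.
groups <- split(mtcars, mtcars$cyl)
local_summary <- do.call(rbind, lapply(groups, summarise_group))

# Wrapped in a function so that Spark is only touched when this is called.
# Assumes sparklyr and dplyr are installed and a local Spark is available.
run_grouped_on_spark <- function() {
  sc <- sparklyr::spark_connect(master = "local")
  on.exit(sparklyr::spark_disconnect(sc))

  # Copy the built-in mtcars data set to Spark under a hypothetical name.
  cars_tbl <- dplyr::copy_to(sc, mtcars, "mtcars_sketch", overwrite = TRUE)

  # Apply the summary once per distinct value of `cyl`.
  result <- sparklyr::spark_apply(cars_tbl, summarise_group, group_by = "cyl")

  # Collect before the connection is closed by on.exit().
  dplyr::collect(result)
}
```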