equijoin {rmr2} | R Documentation |
A generalized form of equijoin, a hybrid between its SQL counterparts and mapreduce.
equijoin(
  left.input = NULL,
  right.input = NULL,
  input = NULL,
  output = NULL,
  input.format = "native",
  output.format = "native",
  outer = c("", "left", "right", "full"),
  map.left = to.map(identity),
  map.right = to.map(identity),
  reduce = reduce.default)
left.input |
The left side input to the join. |
right.input |
The right side input to the join. |
input |
The only input in case of a self join. Mutually exclusive with the previous two. |
output |
Where to write the output. |
input.format |
Input format specification, see |
output.format |
Output format specification, see |
outer |
Whether to perform an outer join and, if so, which of the usual three types: "left", "right" or "full". The default, "", performs no outer join. |
map.left |
Function to apply to each record from the left input; it follows the same conventions as any map function. The returned keys become the join keys. |
map.right |
Function to apply to each record from the right input; it follows the same conventions as any map function. The returned keys become the join keys. |
reduce |
Function to be applied, key by key, on the values associated with that key. Those values are in the arguments |
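The interplay of the map functions, the join keys and the per-key reduce can be pictured with a small in-memory sketch in base R. This is only an illustration of the idea, not how rmr2 implements equijoin: toy_equijoin, count_pairs and all of their argument names are hypothetical, and the data is made up.

```r
# Toy, in-memory illustration of the equijoin idea; NOT rmr2's implementation.
# toy_equijoin and every argument name below are hypothetical.
toy_equijoin <- function(left, right, key_left, key_right, reduce,
                         outer = c("", "left", "right", "full")) {
  outer <- match.arg(outer)
  # "Map" step: compute a join key per record and group records by that key.
  left_groups  <- split(left,  key_left(left))
  right_groups <- split(right, key_right(right))
  lk <- names(left_groups)
  rk <- names(right_groups)
  # Keys present on both sides always survive; outer joins add unmatched keys.
  keys <- intersect(lk, rk)
  if (outer %in% c("left", "full"))  keys <- union(keys, setdiff(lk, rk))
  if (outer %in% c("right", "full")) keys <- union(keys, setdiff(rk, lk))
  keys <- sort(keys)
  # "Reduce" step: one call per key, with the records from each side
  # (NULL when that side has no record for the key).
  out <- lapply(keys, function(k) reduce(k, left_groups[[k]], right_groups[[k]]))
  names(out) <- keys
  out
}

people <- data.frame(id = c(1, 2, 3), name = c("ann", "bob", "cy"))
orders <- data.frame(customer = c(2, 2, 4), amount = c(5, 7, 9))

# Hypothetical reduce: count the records found on each side for a key.
count_pairs <- function(k, left_vals, right_vals) {
  c(left = NROW(left_vals), right = NROW(right_vals))
}

res_inner <- toy_equijoin(people, orders,
                          function(d) d$id, function(d) d$customer, count_pairs)
res_left  <- toy_equijoin(people, orders,
                          function(d) d$id, function(d) d$customer, count_pairs,
                          outer = "left")
res_full  <- toy_equijoin(people, orders,
                          function(d) d$id, function(d) d$customer, count_pairs,
                          outer = "full")
```

In this sketch the inner join keeps only key "2", the left join keeps every key of people, and the full join keeps the union of keys from both tables.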
If output is specified, returns output itself. Otherwise, a big.data.object.
Unlike mapreduce, equijoin does not work with multiple inputs.
from.dfs(
  equijoin(
    left.input = to.dfs(keyval(1:10, 1:10^2)),
    right.input = to.dfs(keyval(1:10, 1:10^3))))
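A smaller variant that may be easier to inspect by hand is sketched below. It assumes that rmr2 is installed and that its "local" backend, selected through rmr.options, is acceptable for experimentation without a Hadoop cluster; the guard simply skips the example when the package is not available. The input values are made up for illustration.

```r
# Assumes the rmr2 package is installed; otherwise the example is skipped.
if (requireNamespace("rmr2", quietly = TRUE)) {
  library(rmr2)
  # Assumption: run with the local backend instead of a Hadoop cluster.
  rmr.options(backend = "local")
  # Two small inputs whose keys overlap only partially (2 and 3).
  left   <- to.dfs(keyval(1:3, c(10, 20, 30)))
  right  <- to.dfs(keyval(2:4, c(200, 300, 400)))
  # Default map and reduce functions, no outer join.
  joined <- from.dfs(equijoin(left.input = left, right.input = right))
  print(keys(joined))
  print(values(joined))
}
```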