Aggregate functions {PivotalR} | R Documentation |
An aggregate function is a function where the values of multiple rows are grouped together as input to calculate a single value of more significant meaning or measurement. The aggregate functions included are mean, sum, count, max, min, standard deviation, and variance. Also included is a function to compute the mean value of each column and a function to compute the sum of each column.
## S4 method for signature 'db.obj' mean(x, ...) ## S4 method for signature 'db.obj' sum(x, ..., na.rm = FALSE) ## S4 method for signature 'db.obj' count(x) ## S4 method for signature 'db.obj' max(x, ..., na.rm = FALSE) ## S4 method for signature 'db.obj' min(x, ..., na.rm = FALSE) ## S4 method for signature 'db.obj' sd(x) ## S4 method for signature 'db.obj' var(x) ## S4 method for signature 'db.obj' colMeans(x, na.rm = FALSE, dims = 1, ...) ## S4 method for signature 'db.obj' colSums(x, na.rm = FALSE, dims = 1, ...) colAgg(x) db.array(x, ...)
x |
A For |
... |
further arguments passed to or from other methods This is currently not implemented. |
na.rm |
logical. Should missing values (including 'NaN') be removed? This is currently not implemented. |
dims |
integer: Which dimensions are regarded as 'rows' or 'columns' to sum over. This is currently not implemented and the default behavior is to sum over columns |
For aggregate functions: mean
, sum
, count
,
max
, min
, sd
, and var
, the signature
x
must be a reference to a single column in a table.
For aggregate functions: colMeans
, colSums
, and colAgg
the signature
x
can be a db.obj
referencing to a single column or a single
table, or can be a db.Rquery
referencing to multiple columns in a table.
For mean
, a db.Rquery
which is a SQL query to extract the average of a column of a table. Actually, it can work on multiple columns, so it is the same as colMeans
.
For sum
, a db.Rquery
which is a SQL query to extract the sum of a column of a table. Actually, it can work on multiple columns, so it is the same as colSums
.
For count
, a db.Rquery
which is a SQL query to extract the count of a column of a table.
For max
, a db.Rquery
which is a SQL query to extract the max of a column of a table.
For min
, a db.Rquery
which is a SQL query to extract the min of a column of a table.
For sd
, a db.Rquery
which is a SQL query to extract the standard deviation of a column of a table.
For var
, a db.Rquery
which is a SQL query to extract the variance of a column of a table.
For colMeans
, a db.Rquery
which is a SQL query to extract the mean of multiple columns of a table.
For colSums
, a db.Rquery
which is a SQL query to extract the sum of multiple columns of a table.
For colAgg
, a db.Rquery
which is a SQL query to retreive
the column values as an array aggregate.
For db.array
, a db.Rquery
which is a SQL query which
combine all columns into an array.
Author: Predictive Analytics Team at Pivotal Inc.
Maintainer: Frank McQuillan, fmcquillan@pivotal.io
by,db.obj-method
is usually used together with
aggregate functions.
## Not run: ## get the help for a method ## help("mean,db.obj-method") ## set up the database connection ## Assume that .port is port number and .dbname is the database name cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE) ## ---------------------------------------------------------------------- ## create a table from the example data.frame "abalone" delete("abalone", conn.id = cid) x <- as.db.data.frame(abalone, "abalone", conn.id = cid, verbose = FALSE) ## get the mean of a column mean(x$diameter) ## get the sum of a column sum(x$height) ## get the number of entries in a column count(x$id) ## get the max value of a column max(x$diameter) ## get the min value of a column min(x$diameter) ## get the standard deviation of the values in column sd(x$diameter) ## get the variance of the values in column var(x$diameter) ## get the mean of all columns in the table colMeans(x) ## get the sum of all columns in the table colSums(x) ## get the array aggregate of a specific column in the table colAgg(x$diameter) ## get the array aggregate of all columns in the table colAgg(x) ## put everything into an array plus a constant 1 as the first element db.array(1, x[,3:5], x[,6:7], x[,8:10]) ## ---------------------------------------------------------------------- db.disconnect(cid, verbose = FALSE) ## End(Not run)