src_impala {implyr} | R Documentation |
src_impala creates a SQL backend to dplyr for Apache Impala, the massively parallel processing query engine for Apache Hadoop.
src_impala can work with any DBI-compatible interface that provides connectivity to Impala. Currently, two packages that can provide this connectivity are odbc and RJDBC.
src_impala(drv, ..., auto_disconnect = TRUE)
drv | an object that inherits from
... | arguments passed to the underlying Impala database connection method
auto_disconnect | Should the connection to Impala be automatically closed when the object returned by this function is deleted? Pass
An object with class src_impala, src_sql, src
Impala ODBC driver, Impala JDBC driver
# Using ODBC connectivity:

## Not run: 
library(odbc)
drv <- odbc::odbc()
impala <- src_impala(
  drv = drv,
  driver = "Cloudera ODBC Driver for Impala",
  host = "host",
  port = 21050,
  database = "default",
  uid = "username",
  pwd = "password"
)
## End(Not run)

# Using JDBC connectivity:

## Not run: 
library(RJDBC)
Sys.setenv(JAVA_HOME = "/path/to/java/home/")
impala_classpath <- list.files(
  path = "/path/to/jdbc/driver",
  pattern = "\\.jar$",
  full.names = TRUE
)
.jinit(classpath = impala_classpath)
drv <- JDBC(
  driverClass = "com.cloudera.impala.jdbc41.Driver",
  classPath = impala_classpath,
  identifier.quote = "`"
)
impala <- src_impala(
  drv,
  "jdbc:impala://host:21050",
  "username",
  "password"
)
## End(Not run)
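
Once a connection exists, the returned src_impala object can be used like any other dplyr source. The following is a minimal sketch of that workflow with auto_disconnect = FALSE, under several assumptions: the helper name count_impala_rows and its table_name argument are hypothetical, the connection arguments are those of the ODBC example, and implyr is assumed to supply a dbDisconnect() method for src_impala objects. It is not run here because it needs a reachable Impala server.

```r
# Sketch only: needs the implyr, dplyr, odbc, and DBI packages and a reachable
# Impala server, so the helper is defined here but never called.
# count_impala_rows() and its arguments are hypothetical names for illustration.
count_impala_rows <- function(table_name, ...) {
  # `...` takes the same connection arguments as the ODBC example
  # (driver, host, port, database, uid, pwd).
  impala <- implyr::src_impala(
    drv = odbc::odbc(),
    ...,
    auto_disconnect = FALSE
  )
  # auto_disconnect is off, so close the connection when the function exits.
  # Assumption: implyr supplies a dbDisconnect() method for src_impala objects.
  on.exit(DBI::dbDisconnect(impala), add = TRUE)

  # Build a lazy reference to the table; the count is computed by Impala and
  # only the one-row result is collected into R.
  remote_table <- dplyr::tbl(impala, table_name)
  dplyr::collect(dplyr::count(remote_table))
}
```

A call might look like count_impala_rows("flights", driver = "Cloudera ODBC Driver for Impala", host = "host", port = 21050, database = "default", uid = "username", pwd = "password"), where "flights" is a hypothetical table name. Leaving auto_disconnect at its default of TRUE would make the explicit on.exit() cleanup unnecessary.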