write_fst {fst} | R Documentation
Read and write data frames from and to a fast-storage (fst) file.
Allows for compression and (file level) random access of stored data, even for compressed datasets.
Multiple threads are used to obtain high (de-)serialization speeds, but all background threads are re-joined before write_fst and read_fst return (reads and writes are stable).
When using a data.table object for x, the key (if any) is preserved, allowing storage of sorted data.
Methods read_fst and write_fst are equivalent to read.fst and write.fst (but the former syntax is preferred).
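The key preservation described above can be checked with a short round trip. This is a minimal sketch, assuming the data.table package is installed; the table contents and column names are made up for illustration, and as.data.table = TRUE is used on the read side so that a key can be attached to the result.

```r
library(fst)
library(data.table)

# Hypothetical keyed table; column names are illustrative only
dt <- data.table(id = c(3L, 1L, 2L), value = c("c", "a", "b"))
setkey(dt, id)  # sorts the rows by id and marks id as the key

path <- tempfile(fileext = ".fst")
write_fst(dt, path)

# Read back as a data.table and inspect the key
dt2 <- read_fst(path, as.data.table = TRUE)
key(dt2)
```

Reading with the default as.data.table = FALSE returns a plain data frame, which has no key attribute to inspect.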
write_fst(x, path, compress = 50, uniform_encoding = TRUE)
read_fst(path, columns = NULL, from = 1, to = NULL, as.data.table = FALSE, old_format = FALSE)
write.fst(x, path, compress = 50, uniform_encoding = TRUE)
read.fst(path, columns = NULL, from = 1, to = NULL, as.data.table = FALSE, old_format = FALSE)
x | A data frame to write to disk.
path | Path to the fst file.
compress | Value in the range 0 to 100, indicating the amount of compression to use. Lower values mean larger file sizes. The default compression is 50.
uniform_encoding | If
columns | Column names to read. The default is to read all columns.
from | Read data starting from this row number.
to | Read data up until this row number. The default is to read to the last row of the stored dataset.
as.data.table | If TRUE, the result will be returned as a data.table.
old_format | Must be FALSE; the old fst file format is deprecated and can only be read and converted with fst package versions 0.8.0 to 0.8.10.
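To see how the compress argument trades file size against compression effort, the same data frame can be written at several settings and the resulting sizes compared. This is a rough sketch: the three levels are arbitrary picks within the documented 0 to 100 range, and the sizes you get depend on the data, so none are stated here.

```r
library(fst)

# Sample data, same shape as the dataset used in the examples
x <- data.frame(A = 1:10000,
                B = sample(c(TRUE, FALSE, NA), 10000, replace = TRUE))

# Arbitrary compression settings to compare
levels <- c(0, 50, 100)

# Write x once per setting and record the file size in bytes
sizes <- sapply(levels, function(level) {
  path <- tempfile(fileext = ".fst")
  write_fst(x, path, compress = level)
  file.size(path)
})
names(sizes) <- levels
sizes
```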
read_fst returns a data frame with the selected columns and rows. write_fst writes x to a fst file and invisibly returns x (so you can use this function in a pipeline).
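Because write_fst returns x invisibly, it can sit in the middle of a pipeline without interrupting it. The sketch below assumes R 4.1 or later for the native |> operator; the data frame and the downstream nrow() step are illustrative only.

```r
library(fst)

x <- data.frame(A = 1:100, B = rnorm(100))
path <- tempfile(fileext = ".fst")

# The return value is the input data frame, returned invisibly
res <- withVisible(write_fst(x, path))
res$visible            # FALSE when the value is returned invisibly
identical(res$value, x)

# Write to disk and keep working with the same data in one pipeline
n <- x |> write_fst(path) |> nrow()
```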
# Sample dataset
x <- data.frame(A = 1:10000, B = sample(c(TRUE, FALSE, NA), 10000, replace = TRUE))

# Default compression
write_fst(x, "dataset.fst")  # filesize: 17 KB
y <- read_fst("dataset.fst")  # read fst file

# Maximum compression
write_fst(x, "dataset.fst", 100)  # filesize: 4 KB
y <- read_fst("dataset.fst")  # read fst file

# Random access
y <- read_fst("dataset.fst", "B")  # read selection of columns
y <- read_fst("dataset.fst", "A", 100, 200)  # read selection of columns and rows
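As a sanity check on random access, a partial read can be compared with the matching slice of the in-memory data frame. This sketch uses a temporary file rather than "dataset.fst", and it assumes that both from and to are inclusive row numbers, which is worth verifying against your installed fst version.

```r
library(fst)

x <- data.frame(A = 1:10000,
                B = sample(c(TRUE, FALSE, NA), 10000, replace = TRUE))
path <- tempfile(fileext = ".fst")
write_fst(x, path)

# Read only column A, starting at row 100 and ending at row 200
y <- read_fst(path, columns = "A", from = 100, to = 200)

names(y)   # only the requested column is returned
nrow(y)    # number of rows actually read

# Compare with the in-memory slice (assumes an inclusive from/to range)
identical(y$A, x$A[100:200])
```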