Introduction - Fread.jl
This package lets you use the excellent fread function from R's {data.table}
to read CSVs.
Installation
using Pkg
Pkg.add(url = "https://github.com/xiaodaigh/Fread.jl")
Install R packages
Make sure that {data.table} and {feather} are installed in your R installation, e.g. by running the following in an R session:
install.packages(c("data.table", "feather"))
Usage
To use the default parameters of fread
using Fread
a = fread(path_to_csv)
To customise the parameters, pass them by name as keyword arguments (arg = value),
e.g.
using Fread
a = fread(path_to_csv, sep="|", nrows = 50)
Convert CSVs to feather or parquet
You can use this package to convert CSVs to feather and parquet files
using Fread
csv_to_feather(path_to_csv, outpath)
csv_to_parquet(path_to_csv, outpath)
How does it work internally?
The function fread does three things:
1. Reads the CSV using data.table::fread
2. Saves the resulting data.frame in feather format
3. Loads the feather file into Julia as a DataFrame
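The first two steps happen on the R side. As a rough sketch only, the R code for them could be assembled from Julia as shown below. The helpers r_quote and r_script are hypothetical names made up for this illustration and are not part of Fread.jl; the package's actual internals may differ. The only R calls assumed are data.table::fread and feather::write_feather.

```julia
# Hypothetical helper (not part of Fread.jl): quote a Julia string as an
# R string literal, escaping backslashes and double quotes.
r_quote(s::AbstractString) =
    "\"" * replace(replace(s, "\\" => "\\\\"), "\"" => "\\\"") * "\""

# Hypothetical helper (not part of Fread.jl): build the R code for
# step 1 (read the CSV) and step 2 (save it in feather format).
function r_script(path_to_csv::AbstractString, outpath::AbstractString)
    return string(
        "df <- data.table::fread(", r_quote(path_to_csv), ")\n",
        "feather::write_feather(df, ", r_quote(outpath), ")\n",
    )
end

print(r_script("data.csv", "data.csv.feather"))
```

Step 3 would then read the feather file at outpath back into Julia.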
Step 2 creates a feather file. You can set its location with a second positional argument, e.g.
fread(path_to_csv, "path/to/out.feather")
By default, the feather output path is path_to_csv*".feather",
i.e. the input path with the .feather extension appended.
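To make the default concrete, here is a minimal sketch of how the output path could be resolved. The names default_feather_path and resolve_outpath are hypothetical and exist only for this illustration; they are not functions exported by Fread.jl.

```julia
# Hypothetical helper mirroring the documented default:
# append ".feather" to the input path (the ".csv" part is kept).
default_feather_path(path_to_csv::AbstractString) = path_to_csv * ".feather"

# Hypothetical helper: use the caller's output path when one is given,
# otherwise fall back to the default.
resolve_outpath(path_to_csv, outpath = nothing) =
    outpath === nothing ? default_feather_path(path_to_csv) : outpath

println(resolve_outpath("data/sales.csv"))                     # data/sales.csv.feather
println(resolve_outpath("data/sales.csv", "tmp/out.feather"))  # tmp/out.feather
```

Note that the extension is appended rather than substituted, so the intermediate file sits next to the CSV under a name that still contains .csv.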
Why?
Because data.table::fread is fast! It is often much faster than native pure-Julia solutions at the moment.