ARFFFiles.ARFFFiles — Module
Module for loading and saving ARFF files.
See ARFFFiles.load and ARFFFiles.save.
ARFFFiles.ARFFAttribute — Type
ARFFFiles.ARFFHeader — Type
ARFFHeader
Represents the header information in an ARFF file.
It has these fields:
- relation: the @relation name.
- attributes: vector of each @attribute as an ARFFAttribute.
ARFFFiles.ARFFReader — Type
ARFFReader
An object holding an IO stream of an ARFF file, used to access its data.
Header information is in the header field, of type ARFFHeader.
It has the following functionality:
- nextrow(r) returns the next row of data as a NamedTuple{names, types}, or nothing if everything has been read.
- read(r, [n]) reads up to n rows as a vector.
- read!(xs, r) reads up to length(xs) rows into the given vector, returning the number of rows read.
- close(r) closes the underlying IO stream, unless it was created with own=false.
- eof(r) tests whether the IO stream is at the end.
- Iteration yields rows of r.
- It satisfies the Tables.jl interface, so e.g. DataFrame(r) reads the data into a DataFrame.
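As a rough illustration of the reader functionality listed above, the following sketch streams rows from an in-memory ARFF document. The relation name, attribute names and data are made up for the example; only loadstreaming, nextrow, the header field, iteration and close come from the documentation.

```julia
using ARFFFiles

# A tiny ARFF document held in memory (hypothetical contents).
arff = """
@relation demo
@attribute x numeric
@attribute y numeric
@data
1,2
3,4
5,6
"""

# Wrap the text in an IO stream and open a reader on it.
r = ARFFFiles.loadstreaming(IOBuffer(arff))

# Header information is available before any data is read.
println("relation: ", r.header.relation)

# nextrow returns a NamedTuple, or nothing once the data is exhausted.
first_row = ARFFFiles.nextrow(r)
println(first_row)

# Iteration continues with the rows that have not been read yet.
for row in r
    println(row.x, " ", row.y)
end

close(r)
```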
ARFFFiles.ARFFType — Type
ARFFType
Abstract type of ARFF types. Concrete subtypes are ARFFNumericType, ARFFStringType, ARFFDateType and ARFFNominalType.
ARFFFiles.load — Method
load(file, ...)
load(f, file, ...)
The first form loads the entire ARFF file as a table. It is equivalent to load(readcolumns, file, ...).
The second form is equivalent to f(loadstreaming(file, ...)) but ensures that the file is closed afterwards.
See loadstreaming for the available keyword parameters.
For example load(DataFrame, file) loads the file as a DataFrame. Replace DataFrame with your favourite table type.
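The two forms of load might be used as in the sketch below. It writes a small ARFF file to a temporary directory first so that it is self-contained; the file name and contents are hypothetical, and DataFrames.jl is assumed to be installed.

```julia
using ARFFFiles, DataFrames

# Write a small ARFF file to a temporary location (hypothetical contents).
path = joinpath(mktempdir(), "demo.arff")
write(path, """
@relation demo
@attribute x numeric
@attribute y numeric
@data
1,2
3,4
""")

# First form: read the whole file as a columnar table.
tbl = ARFFFiles.load(path)

# Second form with f = DataFrame: read the whole file as a DataFrame.
df = ARFFFiles.load(DataFrame, path)

# Second form with a do-block: f receives the reader, and the file is
# closed once the block returns.
relation = ARFFFiles.load(path) do r
    r.header.relation
end
```

The do-block variant is convenient when only part of the file is needed, since nothing is left open if the block throws or returns early.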
ARFFFiles.load_header — Method
load_header(file, ...)
Equivalent to load(r->r.header, file, ...), which loads just the header from the given file as an ARFFHeader.
ARFFFiles.loadchunks — Method
loadchunks(file, ...)
loadchunks(f, file, ...)
The first form opens the ARFF file and returns an iterator over chunks of the file. It is equivalent to Tables.partitions(loadstreaming(file, ...)).
The second form is equivalent to f(loadchunks(file, ...)) but ensures that the file is closed afterwards.
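A minimal sketch of chunked reading with the second form, assuming each chunk is a Tables.jl-compatible table (as the equivalence with Tables.partitions suggests). The file name and contents are hypothetical; with a file this small there will most likely be a single chunk.

```julia
using ARFFFiles, Tables

# Write a small ARFF file to a temporary location (hypothetical contents).
path = joinpath(mktempdir(), "demo.arff")
write(path, """
@relation demo
@attribute x numeric
@attribute y numeric
@data
1,2
3,4
5,6
""")

# Count the rows chunk by chunk; the file is closed when the block returns.
total = ARFFFiles.loadchunks(path) do chunks
    n = 0
    for chunk in chunks
        cols = Tables.columntable(chunk)  # NamedTuple of column vectors
        n += length(first(cols))          # number of rows in this chunk
    end
    n
end

println("rows read: ", total)
```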
ARFFFiles.loadstreaming — Function
loadstreaming(io::IO, own=false; [missingcols=true], [missingnan=false], [categorical=true], [chunkbytes=2^26])
loadstreaming(filename::AbstractString; ...)
Returns an ARFFReader object for reading the given ARFF file one record at a time.
Option missingcols specifies which columns can contain missing data. It can be :auto (columns with missing values are automatically detected, the default), :all or true (all columns), :none or false (no columns), a set or vector of column names, or a function taking a column name and returning true or false. Note that :auto does not apply if the table is being read in a streaming fashion, in which case it behaves like :all.
Option missingnan specifies whether to convert missing values in numeric columns to NaN. This is equivalent to excluding these columns from missingcols.
Option categorical specifies whether nominal columns are converted to CategoricalValue (true) or String (false).
Option chunkbytes specifies approximately how many bytes to read per chunk when iterating over chunks or rows.
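The keyword options could be combined as in the following sketch, which reads a file containing a nominal column as plain strings and declares that no column contains missing data. The file name, attribute names and data are hypothetical.

```julia
using ARFFFiles

# Write a small ARFF file with one numeric and one nominal column
# (hypothetical contents, no missing values).
path = joinpath(mktempdir(), "demo.arff")
write(path, """
@relation demo
@attribute size numeric
@attribute colour {red,green}
@data
1.5,red
2.5,green
""")

# missingcols=:none: no column may contain missing data.
# categorical=false: nominal columns are read as String.
r = ARFFFiles.loadstreaming(path; missingcols=:none, categorical=false)
rows = read(r)   # read all remaining rows as a vector
close(r)

for row in rows
    println(row.size, " ", row.colour)
end
```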
ARFFFiles.nextrow — Method
nextrow(r::ARFFReader{names, types}) :: Union{Nothing, NamedTuple{names, types}}
The next row of data from the given ARFFReader, or nothing if everything has been read.
ARFFFiles.parse_javadateformat — Method
parse_javadateformat(java::AbstractString)
Convert the given Java date format string to the equivalent Julia DateFormat.
See https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
Only the following format characters are currently supported: y → y (year), M → m (month), d → d (day), H → H (hour), m → M (minute), s → S (second) and S → s (millisecond).
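To make the character mapping concrete, the sketch below converts a Java-style pattern and uses the result to parse a timestamp, next to the Julia format one would write by hand from the table above. The pattern and timestamp are made up, and the example assumes that separator characters such as "-", " " and ":" are carried over unchanged.

```julia
using ARFFFiles, Dates

# Java pattern: MM is month and mm is minute, the reverse of Julia's convention.
java_pattern = "yyyy-MM-dd HH:mm:ss"
fmt = ARFFFiles.parse_javadateformat(java_pattern)

# The Julia format written by hand using the mapping above.
manual = dateformat"yyyy-mm-dd HH:MM:SS"

# Both formats can be passed to the DateTime parser from the Dates stdlib.
stamp = "2020-01-31 12:34:56"
dt = DateTime(stamp, fmt)
dt_manual = DateTime(stamp, manual)

println(dt)
println(dt_manual)
```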
ARFFFiles.readcolumns — Method
readcolumns(r::ARFFReader, maxbytes=nothing)
Read the data from r into a columnar table.
By default the entire table is read. If maxbytes is given, approximately this many bytes of the input stream are read instead, allowing the table to be read in chunks.
The same can be achieved by iterating over Tables.partitions(r).
ARFFFiles.save — Method
save(file, table; relation="data", comment=...)
Save the Tables.jl-compatible table in ARFF format to file, which must be an IO stream or file.
The relation name is relation. The given comment is written at the top of the file.
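A short sketch of saving a table and reading it back. A NamedTuple of vectors is used as the table because it satisfies the Tables.jl interface without extra packages; the column names, values and file name are hypothetical.

```julia
using ARFFFiles

# A NamedTuple of equal-length vectors is a Tables.jl-compatible table.
table = (x = [1.0, 2.0, 3.0], label = ["a", "b", "c"])

# Save to a temporary file with a custom relation name.
path = joinpath(mktempdir(), "out.arff")
ARFFFiles.save(path, table; relation="demo")

# Inspect the generated ARFF text.
print(read(path, String))

# Read just the header back to check the relation name.
header = ARFFFiles.load_header(path)
println(header.relation)
```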