ZipArchives.ZipArchives - Module

ZipArchives

Reading Zip archives

Archives can be read from any AbstractVector{UInt8} containing the data of a zip archive.

For example, if you download this repo as a ".zip" from GitHub (https://github.com/JuliaIO/ZipArchives.jl/archive/refs/heads/main.zip), you can read the README in Julia.

using ZipArchives: ZipReader, zip_names, zip_readentry
using Downloads: download
# Download the archive into memory as a Vector{UInt8}.
data = take!(download("https://github.com/JuliaIO/ZipArchives.jl/archive/refs/heads/main.zip", IOBuffer()));
archive = ZipReader(data)
# List the entry names, then print the README as a String.
zip_names(archive)
zip_readentry(archive, "ZipArchives.jl-main/README.md", String) |> print

Writing Zip archives

using ZipArchives: ZipWriter, zip_newfile
using Test: @test_throws
filename = tempname()
ZipWriter(filename) do w
    # Entry names containing a backslash are rejected, so this throws an ArgumentError.
    @test_throws ArgumentError zip_newfile(w, "test\\test1.txt")
    zip_newfile(w, "test/test1.txt")
    write(w, "I am data inside test1.txt in the zip file")

    zip_newfile(w, "test/empty.txt")

    zip_newfile(w, "test/test2.txt")
    write(w, "I am data inside test2.txt in the zip file")
end
ZipArchives.ZipReader - Method
struct ZipReader{T<:AbstractVector{UInt8}}
ZipReader(buffer::AbstractVector{UInt8})

Create a reader for a zip archive in buffer.

The array must not be modified while being read.

zip_nentries(r::ZipReader)::Int returns the number of entries in the archive.

zip_names(r::ZipReader)::Vector{String} returns the names of all the entries in the archive.

Entries are indexed from 1 to zip_nentries(r). The following functions return information about an entry in the archive:

  1. zip_name(r::ZipReader, i::Integer)::String
  2. zip_uncompressed_size(r::ZipReader, i::Integer)::UInt64

zip_test_entry(r::ZipReader, i::Integer)::Nothing checks if an entry is valid and has a good checksum.

zip_openentry and zip_readentry can be used to read data from an entry.

A ZipReader object does not need to be closed, and cannot be closed.

Multi threading

The returned ZipReader object can safely be used from multiple threads; however, the streams returned by zip_openentry should only be accessed by one thread at a time.
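
As a rough sketch of how these accessors fit together, the following builds a tiny archive in memory and then walks its entries. The entry names and contents are made up for illustration, and the sketch assumes an IOBuffer is an acceptable io for ZipWriter.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_newfile, zip_nentries,
    zip_names, zip_name, zip_uncompressed_size, zip_test_entry

# Build a small archive in memory (hypothetical entry names and contents).
io = IOBuffer()
ZipWriter(io) do w
    zip_newfile(w, "a.txt")
    write(w, "hello")
    zip_newfile(w, "dir/b.txt")
    write(w, "world!")
end
data = take!(io)

# Read it back; `data` must not be modified while the reader is in use.
r = ZipReader(data)
for i in 1:zip_nentries(r)
    zip_test_entry(r, i)  # throws if the entry is invalid or fails its checksum
    println(zip_name(r, i), ": ", zip_uncompressed_size(r, i), " bytes")
end
```

No close call is needed for `r`, in line with the note above that a ZipReader cannot be closed.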

ZipArchives.ZipWriter - Method
mutable struct ZipWriter{S<:IO} <: IO
ZipWriter(io::IO; zip_kwargs...)::ZipWriter{typeof(io)}
ZipWriter(f::Function, io::IO; zip_kwargs...)::ZipWriter{typeof(io)}

Create a zip archive writer on io.

These methods also work with a filename::AbstractString instead of an io::IO.

In that case, all passed keyword arguments will be used for Base.open in addition to write=true.

io must not be modified before the ZipWriter is closed (except using the wrapping ZipWriter).

The ZipWriter becomes a writable IO after a call to zip_newfile:

zip_newfile(w::ZipWriter, name::AbstractString; newfile_kwargs...)

Any writes to the ZipWriter will write to the last specified new file.

If zip_newfile is called while ZipWriter is writable, the previous file is committed to the archive. There is no way to edit previously written data.

An alternative to zip_newfile is zip_writefile:

zip_writefile(w::ZipWriter, name::AbstractString, data::AbstractVector{UInt8})

This writes a vector of data directly to a file entry in w. Unlike zip_newfile, zip_writefile doesn't require io to be seekable.

Base.close on a ZipWriter will only close the wrapped io if zip_kwargs has own_io=true or the ZipWriter was created from a filename.
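
The write sequence described above might look like the following sketch. It assumes an in-memory IOBuffer as io; all entry names and contents are hypothetical.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_newfile, zip_writefile,
    zip_names, zip_readentry

io = IOBuffer()
w = ZipWriter(io)
try
    # Write a complete entry in one call; `w` is not writable afterwards.
    zip_writefile(w, "raw.bin", UInt8[0x01, 0x02, 0x03])

    # Start a streamed entry; writes go to the most recently started file.
    zip_newfile(w, "notes/first.txt")
    write(w, "first ")
    write(w, "entry")

    # Starting another entry commits notes/first.txt.
    zip_newfile(w, "notes/second.txt")
    write(w, "second entry")
finally
    # Finishes the archive. `io` is left open because own_io was not set.
    close(w)
end

r = ZipReader(take!(io))
println(zip_names(r))
```

The do-block form `ZipWriter(io) do w ... end` is a shorter alternative when the explicit try/finally is not needed.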

Multi threading

A single ZipWriter instance doesn't allow mutations or writes from multiple threads at the same time.

Appending

ZipWriter assumes io is empty. Trying to write to an io with existing data will result in an invalid archive.

If you want to add entries to an existing zip archive, use zip_append_archive.

Optional Keywords

  • check_names::Bool=true: Makes a best-effort attempt to error if a new entry name isn't valid on Windows or already exists in the archive under a case-insensitive comparison.
ZipArchives.parse_central_directory - Method
parse_central_directory(io::IO)::Tuple{Vector{EntryInfo}, Vector{UInt8}, Int64}

io must be readable and seekable, and is assumed not to change while this function runs.

Return the entries, the raw data of the central directory, and the offset in io of the start of the central directory as a named tuple. (;entries, central_dir_buffer, central_dir_offset)

The central directory is after all entry data.
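
A minimal sketch of calling this function on an in-memory archive follows. It assumes the function is reachable as `ZipArchives.parse_central_directory`, and it destructures the result positionally so that it works whether a plain tuple or a named tuple is returned. The entry names are hypothetical.

```julia
using ZipArchives: ZipArchives, ZipWriter, zip_writefile

io = IOBuffer()
ZipWriter(io) do w
    zip_writefile(w, "a.txt", Vector{UInt8}("hello"))
    zip_writefile(w, "b.txt", Vector{UInt8}("world"))
end

# Assumption: rewinding first is harmless even if the function seeks on its own.
seekstart(io)
entries, central_dir_buffer, central_dir_offset =
    ZipArchives.parse_central_directory(io)

println(length(entries), " entries; central directory starts at byte ",
    central_dir_offset)
```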

ZipArchives.zip_abortfile - Method
zip_abortfile(w::ZipWriter)

Close any open entry making w not writable.

The open entry is not added to the list of entries so will be ignored when the zip archive is read.
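
As an illustration, the sketch below starts an entry, abandons it, and then writes a different one. The entry names are hypothetical, and an IOBuffer stands in for a real file.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_newfile, zip_abortfile,
    zip_names, zip_readentry

io = IOBuffer()
ZipWriter(io) do w
    zip_newfile(w, "draft.txt")
    write(w, "half-written data")
    zip_abortfile(w)  # draft.txt is dropped from the list of entries

    zip_newfile(w, "final.txt")
    write(w, "kept")
end

r = ZipReader(take!(io))
println(zip_names(r))  # only the committed entry is listed
```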

ZipArchives.zip_append_archive - Method
zip_append_archive(io::IO; trunc_footer=true, zip_kwargs=(;))::ZipWriter

Return a ZipWriter that will add entries to the existing zip archive in io.

This also works with a filename::AbstractString instead of an io::IO. In that case, all passed keyword arguments will be used for Base.open in addition to read=true, write=true.

If io doesn't have a valid zip archive footer already, this function will error.

If trunc_footer=true the no longer needed zip archive footer at the end of io will be truncated. Otherwise, it will be left as is.

zip_kwargs will be forwarded to ZipWriter
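
A possible usage sketch with the filename form: create an archive in a temporary file, reopen it for appending, and read the result. The path and entry names are placeholders.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_append_archive, zip_newfile,
    zip_names

path = tempname()  # hypothetical scratch file

# Create the initial archive.
ZipWriter(path) do w
    zip_newfile(w, "one.txt")
    write(w, "1")
end

# Reopen it and add a second entry after the existing one.
w = zip_append_archive(path)
try
    zip_newfile(w, "two.txt")
    write(w, "2")
finally
    close(w)  # writes the new archive footer
end

r = ZipReader(read(path))
println(zip_names(r))
```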

ZipArchives.zip_comment - Method
zip_comment(x::HasEntries, i::Integer)::String

Return the comment attached to entry i

ZipArchives.zip_compressed_size - Method
zip_compressed_size(x::HasEntries, i::Integer)::UInt64

Return the marked compressed size of entry i in number of bytes.

Note: if the zip file was corrupted, this might be wrong.
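
The metadata accessors can be tried on a small in-memory archive, as in this sketch. The entry name, contents, and comment are invented; the entry is written with zip_writefile, which stores data uncompressed, so both sizes should agree here.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_writefile, zip_comment,
    zip_compressed_size, zip_uncompressed_size

io = IOBuffer()
ZipWriter(io) do w
    zip_writefile(w, "a.txt", Vector{UInt8}("hello"); comment="greeting")
end
r = ZipReader(take!(io))

println("comment:      ", zip_comment(r, 1))
println("compressed:   ", zip_compressed_size(r, 1))
println("uncompressed: ", zip_uncompressed_size(r, 1))
```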

ZipArchives.zip_definitely_utf8 - Method
zip_definitely_utf8(x::HasEntries, i::Integer)::Bool

Return true if the name of entry i is marked as UTF-8 or is ASCII.

Otherwise, the name should probably be treated as a sequence of bytes.

This package will never attempt to transcode filenames.

ZipArchives.zip_findlast_entry - Method
zip_findlast_entry(x::HasEntries, s::AbstractString)::Union{Nothing, Int}

Return the index of the last entry with name s or nothing if not found.
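
A short sketch of looking up entries by name, using made-up entry names. The `nothing` case has to be handled before the index is used.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_writefile, zip_findlast_entry,
    zip_name

io = IOBuffer()
ZipWriter(io) do w
    zip_writefile(w, "a.txt", Vector{UInt8}("A"))
    zip_writefile(w, "b.txt", Vector{UInt8}("B"))
end
r = ZipReader(take!(io))

i = zip_findlast_entry(r, "b.txt")
if i !== nothing
    println(zip_name(r, i), " is entry ", i)
end

# A name that is not in the archive yields `nothing`.
missing_index = zip_findlast_entry(r, "missing.txt")
```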

ZipArchives.zip_isdir - Method
zip_isdir(x::HasEntries, s::AbstractString)::Bool

Return whether s is an implicit or explicit directory in x.

ZipArchives.zip_isdir - Method
zip_isdir(x::HasEntries, i::Integer)::Bool

Return whether entry i is a directory.

ZipArchives.zip_mkdir - Method
zip_mkdir(w::ZipWriter, name::AbstractString)

Write a directory entry named name.

name should end in "/". If not, a "/" will be appended.

This is only needed to add an empty directory.
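
The sketch below contrasts an explicit directory entry with a directory that exists only implicitly through a file path. Names are hypothetical, and the string lookup without a trailing "/" is an assumption about how zip_isdir treats its argument.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_mkdir, zip_writefile,
    zip_names, zip_isdir

io = IOBuffer()
ZipWriter(io) do w
    zip_mkdir(w, "empty")  # stored as "empty/" since a "/" is appended
    zip_writefile(w, "docs/readme.txt", Vector{UInt8}("hi"))
end
r = ZipReader(take!(io))

println(zip_names(r))
println(zip_isdir(r, 1))       # explicit directory entry
println(zip_isdir(r, 2))       # regular file entry
println(zip_isdir(r, "docs"))  # implicit directory (assumed lookup form)
```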

ZipArchives.zip_name - Method
zip_name(x::HasEntries, i::Integer)::String

Return the name of entry i.

i can range from 1:zip_nentries(x)

ZipArchives.zip_name_collision - Method
zip_name_collision(w::ZipWriter, new_name::AbstractString)::Bool

Return true if new_name exactly matches an existing committed entry name.
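
For instance, a name can be checked before an entry is written. This sketch uses zip_writefile so that the first entry is already committed when the check runs; the names are invented.

```julia
using ZipArchives: ZipWriter, zip_writefile, zip_name_collision

io = IOBuffer()
w = ZipWriter(io)
zip_writefile(w, "a.txt", Vector{UInt8}("A"))

taken = zip_name_collision(w, "a.txt")  # name of a committed entry
free = zip_name_collision(w, "b.txt")   # name not used so far
println("a.txt taken: ", taken, ", b.txt taken: ", free)

close(w)
```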

ZipArchives.zip_newfile - Method
zip_newfile(w::ZipWriter, name::AbstractString; 
    compress::Bool=false,
)

Start a new file entry named name.

This will commit any currently open entry and make w writable for file entry name.

The underlying IO in w must be seekable to use this function. If it is not, see zip_writefile.

Optional Keywords

  • comment::String="": Entry comment, ncodeunits(comment) ≤ typemax(UInt16).
  • compress::Bool=false: If false no compression is used and other compression options are ignored.
  • compression_level::Int=-1: 1 is fastest, 9 is smallest file size. 0 is no compression, and -1 is a good compromise between speed and file size.
  • compression_method=Deflate: Currently only Deflate and Store are supported.
  • executable::Union{Nothing,Bool}=nothing: Set to true to mark file as executable. Defaults to false.
  • external_attrs::Union{Nothing,UInt32}=nothing: Manually override the external file attributes: See https://unix.stackexchange.com/questions/14705/the-zip-formats-external-file-attribute
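
A sketch of the compression keywords in use: a highly repetitive payload (invented for the example) is written with compress=true, then read back to compare sizes.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_newfile, zip_readentry,
    zip_compressed_size, zip_uncompressed_size

payload = repeat("abc", 10_000)  # 30,000 bytes of repetitive text

io = IOBuffer()
ZipWriter(io) do w
    zip_newfile(w, "big.txt";
        compress=true,
        compression_level=9,        # smallest file size
        comment="repetitive text",
    )
    write(w, payload)
end

r = ZipReader(take!(io))
println(zip_compressed_size(r, 1), " bytes compressed, ",
    zip_uncompressed_size(r, 1), " bytes uncompressed")
```
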
ZipArchives.zip_openentry - Method
zip_openentry(r::ZipReader, i::Union{AbstractString, Integer})
zip_openentry(f::Function, r::ZipReader, i::Union{AbstractString, Integer})

Open entry i from r as a readable IO.

If i is a string open the last entry with the exact matching name.

Make sure to close the returned stream when done reading, if not using the do block method.

The stream returned by this function should only be accessed by one thread at a time.

See also zip_readentry.
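
Both calling styles are sketched below on a made-up "log.txt" entry: the do-block form, which handles closing, and the plain form with an explicit close.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_writefile, zip_openentry

io = IOBuffer()
ZipWriter(io) do w
    zip_writefile(w, "log.txt", Vector{UInt8}("line one\nline two\n"))
end
r = ZipReader(take!(io))

# Do-block form: the stream is closed when the block finishes.
lines = String[]
zip_openentry(r, "log.txt") do entry
    for line in eachline(entry)
        push!(lines, line)
    end
end

# Plain form: read only the first few bytes, then close the stream manually.
stream = zip_openentry(r, 1)
first_bytes = try
    read(stream, 4)
finally
    close(stream)
end
```

Streaming like this avoids loading the whole entry at once, which is the main reason to prefer zip_openentry over zip_readentry for large entries.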

ZipArchives.zip_readentry - Method
zip_readentry(r::ZipReader, i::Union{AbstractString, Integer}, args...; kwargs...)

Read the contents of entry i in r.

If i is a string read the last entry with the exact matching name.

args...; kwargs... are passed on to read after the entry i in zip reader r is opened with zip_openentry

If args... is empty or String, this will also error if the checksum doesn't match.

See also zip_openentry.

ZipArchives.zip_symlink - Method
zip_symlink(w::ZipWriter, target::AbstractString, link::AbstractString)

Create a symbolic link to target with the name link.

This is not supported by most zip extractors, and will error unless check_names is set to false for the ZipWriter.

ZipArchives.zip_test_entry - Method
zip_test_entry(x::ZipReader, i::Integer)::Nothing

Throw an error if entry i has an issue. Otherwise, return nothing.

This will also read the entry and check the crc32 matches.
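
To illustrate the checksum check, this sketch deliberately corrupts one payload byte in a copy of an archive. The entry is stored uncompressed via zip_writefile, so its bytes can be located directly in the raw data. The entry name and contents are hypothetical, and the exact exception type is not assumed.

```julia
using ZipArchives: ZipReader, ZipWriter, zip_writefile, zip_test_entry

io = IOBuffer()
ZipWriter(io) do w
    zip_writefile(w, "a.txt", Vector{UInt8}("hello world"))
end
good = take!(io)

# Flip one bit of the stored payload in a copy of the archive.
bad = copy(good)
idx = first(findfirst(Vector{UInt8}("hello world"), bad))
bad[idx] = xor(bad[idx], 0x01)

ok = zip_test_entry(ZipReader(good), 1)  # intact entry: returns nothing

# The corrupted copy is expected to raise an error somewhere in this block.
detected = try
    zip_test_entry(ZipReader(bad), 1)
    false
catch
    true
end
println("corruption detected: ", detected)
```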

ZipArchives.zip_uncompressed_size - Method
zip_uncompressed_size(x::HasEntries, i::Integer)::UInt64

Return the marked uncompressed size of entry i in number of bytes.

Note: if the zip file was corrupted, this might be wrong.

ZipArchives.zip_writefile - Method
zip_writefile(w::ZipWriter, name::AbstractString, data::AbstractVector{UInt8})

Write data as a file entry named name.

Unlike zip_newfile, the underlying IO only needs to implement Base.unsafe_write and Base.isopen. w isn't writable afterwards. The written data will not be compressed.

See also zip_newfile.

Optional Keywords

  • comment::String="": Entry comment, ncodeunits(comment) ≤ typemax(UInt16).
  • executable::Union{Nothing,Bool}=nothing: Set to true to mark file as executable. Defaults to false.
  • external_attrs::Union{Nothing,UInt32}=nothing: Manually set the external file attributes: See https://unix.stackexchange.com/questions/14705/the-zip-formats-external-file-attribute