Official DuckDB Julia Package
DuckDB is a high-performance in-process analytical database system. It is designed to be fast, reliable and easy to use. For more information on the goals of DuckDB, please refer to the Why DuckDB page on our website.
The DuckDB Julia package provides a high-performance front-end for DuckDB. Much like SQLite, DuckDB runs in-process within the Julia client, and provides a DBInterface front-end.
The package also supports multi-threaded execution, using Julia threads/tasks for this purpose. If you wish to run queries in parallel, you must launch Julia with multi-threading support (e.g., by setting the JULIA_NUM_THREADS environment variable).
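As a quick sanity check, the snippet below reports how many threads the current Julia session was started with. It uses only Julia's standard `Threads.nthreads()` and makes no DuckDB calls; the launch commands in the comments are illustrative (the `--threads` flag is a Julia command-line option that is not mentioned above).

```julia
# Start Julia with several threads first, for example:
#   JULIA_NUM_THREADS=4 julia
#   julia --threads 4        (alternative command-line flag)

# Number of threads available to this Julia session.
n = Threads.nthreads()
println("Julia is running with $n thread(s)")

if n == 1
    println("Only one thread is available; relaunch Julia with more threads to run queries in parallel.")
end
```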
Installation
pkg> add DuckDB
julia> using DuckDB
Basics
# create a new in-memory database
con = DBInterface.connect(DuckDB.DB, ":memory:")
# create a table
DBInterface.execute(con, "CREATE TABLE integers(i INTEGER)")
# insert data using a prepared statement
stmt = DBInterface.prepare(con, "INSERT INTO integers VALUES(?)")
DBInterface.execute(stmt, [42])
# query the database
results = DBInterface.execute(con, "SELECT 42 a")
print(results)
Scanning DataFrames
The DuckDB Julia package also provides support for querying Julia DataFrames. Note that the DataFrames are directly read by DuckDB - they are not inserted or copied into the database itself.
If you wish to load data from a DataFrame into a DuckDB table you can run a CREATE TABLE AS or INSERT INTO query.
using DuckDB
using DataFrames
# create a new in-memory database
con = DBInterface.connect(DuckDB.DB)
# create a DataFrame
df = DataFrame(a = [1, 2, 3], b = [42, 84, 42])
# register it as a view in the database
DuckDB.register_data_frame(con, df, "my_df")
# run a SQL query over the DataFrame
results = DBInterface.execute(con, "SELECT * FROM my_df")
print(results)
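The CREATE TABLE AS and INSERT INTO route mentioned above might look like the following minimal sketch. The table name `my_table` is a hypothetical name chosen for illustration, and converting the result with `DataFrame(...)` assumes the query result can be consumed as a Tables.jl-compatible source.

```julia
using DuckDB
using DataFrames

con = DBInterface.connect(DuckDB.DB)
df = DataFrame(a = [1, 2, 3], b = [42, 84, 42])
DuckDB.register_data_frame(con, df, "my_df")

# Copy the DataFrame's rows into a real DuckDB table
# ("my_table" is a hypothetical name used only in this example).
DBInterface.execute(con, "CREATE TABLE my_table AS SELECT * FROM my_df")

# Append the same rows a second time using INSERT INTO.
DBInterface.execute(con, "INSERT INTO my_table SELECT * FROM my_df")

# Count the rows now stored in the table
# (assumes the result converts to a DataFrame via Tables.jl).
counts = DataFrame(DBInterface.execute(con, "SELECT count(*) AS n FROM my_table"))
print(counts)
```

Unlike the registered view, `my_table` holds its own copy of the data, so later changes to `df` are not reflected in it.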
Original Julia Connector
Credits to kimmolinna for the original DuckDB Julia connector.
Contributing to the Julia Package
Formatting
The format script must be run after making any change. This can be done by running the following command from within the root directory of the project:
julia tools/juliapkg/scripts/format.jl
Testing
You can run the tests using the test.sh script:
./test.sh
Specific test files can be run by adding the name of the file as an argument:
./test.sh test_connection.jl
In order to run against a locally compiled version of DuckDB, you can set the JULIA_DUCKDB_LIBRARY environment variable, e.g.:
export JULIA_DUCKDB_LIBRARY="`pwd`/../../build/debug/src/libduckdb.dylib"
Note that Julia pre-compilation caching might get in the way of changes to this variable taking effect. You can clear these caches using the following command:
rm -rf ~/.julia/compiled
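When it is unclear whether the variable is reaching Julia at all, a small check like the one below can help. This is only a sketch: `duckdb_library_status` is a hypothetical helper, not part of the DuckDB package, and it only inspects the environment and the file system. It does not tell you which library the package actually loaded.

```julia
# Hypothetical helper (not part of DuckDB.jl): reports whether
# JULIA_DUCKDB_LIBRARY is set and whether it points to an existing file.
function duckdb_library_status(env = ENV)
    path = get(env, "JULIA_DUCKDB_LIBRARY", nothing)
    if path === nothing
        return :unset          # variable is not defined
    elseif !isfile(path)
        return :missing_file   # variable is set, but no file exists at that path
    else
        return :ok             # variable is set and the file exists
    end
end

println("JULIA_DUCKDB_LIBRARY status: ", duckdb_library_status())
```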
Submitting a New Package
The DuckDB Julia package depends on the DuckDB_jll package, which can be updated by sending a PR to Yggdrasil.
After the DuckDB_jll package is updated, the DuckDB package can be updated by incrementing the version number (and dependency version numbers) in Project.toml, followed by adding a comment containing the text @JuliaRegistrator register subdir=tools/juliapkg to the commit.