The Rust SDK for building DuckDB loadable extensions — no C++ required.

What is quack-rs?

quack-rs is a production-grade Rust SDK that makes building DuckDB loadable extensions straightforward and safe. It wraps the DuckDB C Extension API — the same API used by official DuckDB extensions — and eliminates every known FFI pitfall so you can focus on writing extension logic in pure Rust.

DuckDB's own documentation acknowledges the gap:

"Writing a Rust-based DuckDB extension requires writing glue code in C++ and will force you to build through DuckDB's CMake & C++ based extension template. We understand that this is not ideal and acknowledge the fact that Rust developers prefer to work on pure Rust codebases."

— DuckDB Community Extensions FAQ

quack-rs closes that gap. No C++. No CMake. No glue code.

What you can build

Extension type	quack-rs support
Scalar functions	✅ `ScalarFunctionBuilder`
Overloaded scalars	✅ `ScalarFunctionSetBuilder`
Aggregate functions	✅ `AggregateFunctionBuilder`
Overloaded aggregates	✅ `AggregateFunctionSetBuilder`
Table functions	✅ `TableFunctionBuilder` (raw) + `TypedTableFunctionBuilder<S>` (closure-based, typed scan state)
Cast / TRY_CAST functions	✅ `CastFunctionBuilder`
Replacement scans	✅ `ReplacementScanBuilder`
SQL macros (scalar)	✅ `SqlMacro::scalar`
SQL macros (table)	✅ `SqlMacro::table`
Copy functions (`COPY TO`)	✅ `CopyFunctionBuilder` (requires `duckdb-1-5`)

Note: Window functions have no counterpart in DuckDB's public C Extension API and cannot be implemented from Rust (or any language) via that API. See Known Limitations.

Why does this exist?

quack-rs was extracted from duckdb-behavioral, a production DuckDB community extension. Building that extension revealed 16 undocumented pitfalls in DuckDB's Rust FFI surface — struct layouts, callback contracts, and initialization sequences that aren't covered anywhere in the DuckDB documentation or libduckdb-sys docs.

Three of those pitfalls caused extension-breaking bugs that passed 435 unit tests before being caught by end-to-end tests:

A SEGFAULT on load (wrong entry point sequence)
6 of 7 functions silently not registered (undocumented function-set naming rule)
Wrong aggregate results under parallel plans (combine callback not propagating configuration fields to fresh target states)

quack-rs makes each of these impossible through type-safe builders and safe wrappers. The full catalog is documented in the Pitfall Reference.

Key features

Zero C++ — no CMakeLists.txt, no header files, no glue code
All C API function types — scalar, aggregate, table, cast, replacement scan, SQL macro, copy function (duckdb-1-5)
Panic-free FFI — init_extension never panics; errors surface via Result
RAII memory management — LogicalType and FfiState<T> prevent leaks and double-frees
Type-safe builders — ScalarFunctionBuilder, AggregateFunctionBuilder, TableFunctionBuilder, CastFunctionBuilder, ReplacementScanBuilder
SQL macros — register CREATE MACRO statements without any FFI callbacks
Testable state — AggregateTestHarness<T> tests aggregate logic without a live DuckDB
Scaffold generator — produces a submission-ready community extension project from code
16 pitfalls documented — every known DuckDB Rust FFI pitfall, with symptoms and fixes

New to DuckDB extensions? → Start with Quick Start

Adding quack-rs to an existing project? → See Installation

Writing your first function? → See Scalar Functions or Aggregate Functions

Want SQL macros without FFI callbacks? → See SQL Macros

Submitting a community extension? → See Community Extensions

Something broke? → See Pitfall Catalog

Quick Start

This page gets you from zero to a working DuckDB extension in three steps.

Prerequisites

Rust ≥ 1.87.0 (MSRV) — install via rustup
DuckDB CLI (for testing the built extension) — download

Step 1 — Add quack-rs to your extension

In your extension's Cargo.toml:

[dependencies]
quack-rs = "0.13"
libduckdb-sys = { version = ">=1.4.4, <2", features = ["loadable-extension"] }

[lib]
name = "my_extension"       # must match your extension name — see Pitfall P1
crate-type = ["cdylib", "rlib"]

[profile.release]
panic = "abort"             # required — panics across FFI are undefined behavior
lto = true
opt-level = 3
codegen-units = 1
strip = true

Start fresh? Use the scaffold generator to generate a complete, submission-ready project from code.

Step 2 — Write the extension

#![allow(unused)]
fn main() {
// src/lib.rs
use quack_rs::entry_point;
use quack_rs::error::ExtensionError;
use quack_rs::scalar::ScalarFunctionBuilder;
use quack_rs::types::TypeId;
use quack_rs::vector::{VectorReader, VectorWriter};
use libduckdb_sys::{duckdb_connection, duckdb_function_info, duckdb_data_chunk, duckdb_vector};

/// Scalar function: double_it(BIGINT) → BIGINT
unsafe extern "C" fn double_it(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    // SAFETY: input is a valid data chunk provided by DuckDB.
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };
    let row_count = reader.row_count();

    for row in 0..row_count {
        if unsafe { !reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let value = unsafe { reader.read_i64(row) };
        unsafe { writer.write_i64(row, value * 2) };
    }
}

fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        ScalarFunctionBuilder::new("double_it")
            .param(TypeId::BigInt)
            .returns(TypeId::BigInt)
            .function(double_it)
            .register(con)?;
    }
    Ok(())
}

entry_point!(my_extension_init_c_api, |con| register(con));
}

Step 3 — Build and test

# Build the extension
cargo build --release

# Load in DuckDB CLI
duckdb -cmd "LOAD './target/release/libmy_extension.so'; SELECT double_it(21);"
# ┌───────────────┐
# │ double_it(21) │
# │     int64     │
# ├───────────────┤
# │            42 │
# └───────────────┘

macOS: use .dylib extension. Windows: use .dll.

What's next?

Learn how DuckDB calls your extension: Extension Anatomy
Add an aggregate function: Aggregate Functions
Add SQL macros without any callbacks: SQL Macros
Generate a complete community extension project: Project Scaffold

Installation

Adding quack-rs to an existing extension

Add the following to your extension's Cargo.toml:

[dependencies]
quack-rs = "0.13"
libduckdb-sys = { version = ">=1.4.4, <2", features = ["loadable-extension"] }

Why >=1.4.4, <2? DuckDB 1.4.x and 1.5.x expose the same C API version (v1.2.0), so quack-rs supports both with a single bounded range. The <2 upper bound prevents silent adoption of a future major release whose C API may change in breaking ways — making any such upgrade an explicit, auditable decision. See Extension Anatomy.

Required Cargo.toml settings

Every DuckDB extension requires specific Cargo settings to link and behave correctly:

[lib]
name = "my_extension"       # ← must match extension name exactly (Pitfall P1)
crate-type = ["cdylib", "rlib"]
#             ^^^^^^  cdylib produces the .so/.dylib/.dll DuckDB loads
#                      rlib  allows unit tests and documentation to work

[profile.release]
panic = "abort"             # REQUIRED — panics across FFI are undefined behavior
lto = true                  # recommended — reduces binary size, improves performance
opt-level = 3               # recommended
codegen-units = 1           # recommended — enables full LTO
strip = true                # recommended — reduces binary size

Why `panic = "abort"`?

Rust's default panic behavior unwinds the stack. When a panic crosses an FFI boundary into DuckDB's C++ code, the result is undefined behavior — DuckDB may crash, corrupt memory, or silently produce wrong results. The panic = "abort" setting converts panics into immediate process termination, which is far safer.

quack-rs itself never panics in FFI callbacks, but this setting protects you if a dependency or your own code panics.

Minimum Supported Rust Version

quack-rs requires Rust ≥ 1.87.0.

This MSRV is required for:

&raw mut expr syntax for creating raw pointers without references (sound and stable since 1.84.0)
const extern fn support

Install or update via:

rustup update stable
rustup default stable

Verify:

rustc --version   # must be ≥ 1.87.0

Development dependencies

For testing with a live DuckDB instance (example-extension tests only):

[dev-dependencies]
duckdb = { version = ">=1.4.4, <2", features = ["bundled"] }

Important: you cannot call any duckdb_* function in a cargo test process when using the loadable-extension feature. See Testing Guide for the full explanation.

Starting a new extension from scratch

Use the scaffold generator to produce a complete project with all required files pre-configured. This is the fastest and most reliable way to start a new extension.

Your First Extension

This page walks through hello-ext, the complete reference example bundled with quack-rs. It registers four functions that together cover every major pattern:

SQL	Kind	Signature
`word_count(text)`	Aggregate	`VARCHAR → BIGINT`
`first_word(text)`	Scalar	`VARCHAR → VARCHAR`
`generate_series_ext(n)`	Table	`BIGINT → TABLE(value BIGINT)`
`CAST(VARCHAR AS INTEGER)`	Cast	`VARCHAR → INTEGER`

Full source: examples/hello-ext/src/lib.rs

Build and try it

cargo build --release --manifest-path examples/hello-ext/Cargo.toml

Then in the DuckDB CLI:

LOAD './examples/hello-ext/target/release/libhello_ext.so';

-- Aggregate: total words across all rows
SELECT word_count(sentence) FROM (
    VALUES ('hello world'), ('one two three'), (NULL)
) t(sentence);
-- → 5  (2 + 3; NULL contributes 0)

-- Scalar: first word of each row
SELECT first_word(sentence) FROM (
    VALUES ('hello world'), ('  padded  '), (''), (NULL)
) t(sentence);
-- → 'hello', 'padded', '', NULL

Overview

An extension has four parts:

State struct — holds data accumulated during aggregation (aggregate only)
Callbacks — update, combine, finalize, state_size, state_init, state_destroy (aggregate) or a single function callback (scalar)
Registration — wire callbacks to DuckDB via AggregateFunctionBuilder / ScalarFunctionBuilder
Entry point — DuckDB's initialization hook, generated by entry_point!

Part 1 — Aggregate function: `word_count`

An aggregate function accumulates state across many rows and emits one result per group.

1a. The state struct

#![allow(unused)]
fn main() {
#[derive(Default, Debug)]
struct WordCountState {
    count: i64,
}

impl AggregateState for WordCountState {}
}

AggregateState is a marker trait — no methods required. FfiState<WordCountState> wraps it in a heap-allocated Box<T> behind a raw pointer and manages the full lifecycle (init, combine, destroy).

1b. `state_size` and `state_init`

These two callbacks are always identical boilerplate — delegate to FfiState:

#![allow(unused)]
fn main() {
unsafe extern "C" fn wc_state_size(_info: duckdb_function_info) -> idx_t {
    FfiState::<WordCountState>::size_callback(_info)
}

unsafe extern "C" fn wc_state_init(info: duckdb_function_info, state: duckdb_aggregate_state) {
    unsafe { FfiState::<WordCountState>::init_callback(info, state) };
}
}

size_callback returns size_of::<*mut WordCountState>() — DuckDB allocates a pointer-slot per group. init_callback runs Box::new(WordCountState::default()) and writes the pointer into that slot.

1c. `update` — accumulate one batch

#![allow(unused)]
fn main() {
unsafe extern "C" fn wc_update(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    states: *mut duckdb_aggregate_state,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let row_count = reader.row_count();

    for row in 0..row_count {
        if !unsafe { reader.is_valid(row) } {
            continue; // NULL input → skip (contributes 0 words)
        }
        let s = unsafe { reader.read_str(row) };
        let words = count_words(s);

        let state_ptr = unsafe { *states.add(row) };
        if let Some(st) = unsafe { FfiState::<WordCountState>::with_state_mut(state_ptr) } {
            st.count += words;
        }
    }
}
}

Key points:

Check is_valid(row) before reading — never dereference an invalid (NULL) row
VectorReader::new(chunk, col) gives column col from the chunk
count_words is pure Rust — no unsafe, easy to unit-test separately

1d. `combine` — merge parallel results

Pitfall L1: DuckDB creates fresh zero-initialized target states before calling combine. You must copy all fields — not just the result field. In an aggregate with config fields (e.g., a histogram with a bin_width) you must also copy those, or results will be silently corrupted.

#![allow(unused)]
fn main() {
unsafe extern "C" fn wc_combine(
    _info: duckdb_function_info,
    source: *mut duckdb_aggregate_state,
    target: *mut duckdb_aggregate_state,
    count: idx_t,
) {
    for i in 0..count as usize {
        let src_ptr = unsafe { *source.add(i) };
        let tgt_ptr = unsafe { *target.add(i) };
        let src = unsafe { FfiState::<WordCountState>::with_state(src_ptr) };
        let tgt = unsafe { FfiState::<WordCountState>::with_state_mut(tgt_ptr) };
        if let (Some(s), Some(t)) = (src, tgt) {
            t.count += s.count;
            // If you add fields to WordCountState, combine them here too.
        }
    }
}
}

1e. `finalize` — write output

#![allow(unused)]
fn main() {
unsafe extern "C" fn wc_finalize(
    _info: duckdb_function_info,
    source: *mut duckdb_aggregate_state,
    result: duckdb_vector,
    count: idx_t,
    offset: idx_t,
) {
    let mut writer = unsafe { VectorWriter::new(result) };

    for i in 0..count as usize {
        let state_ptr = unsafe { *source.add(i) };
        match unsafe { FfiState::<WordCountState>::with_state(state_ptr) } {
            Some(st) => unsafe { writer.write_i64(offset as usize + i, st.count) },
            None     => unsafe { writer.set_null(offset as usize + i) },
        }
    }
}
}

offset is DuckDB's output row offset — always use offset as usize + i, not just i.

1f. `state_destroy`

#![allow(unused)]
fn main() {
unsafe extern "C" fn wc_state_destroy(
    states: *mut duckdb_aggregate_state,
    count: idx_t,
) {
    unsafe { FfiState::<WordCountState>::destroy_callback(states, count) };
}
}

destroy_callback calls Box::from_raw and nulls each pointer, preventing double-free.

Part 2 — Scalar function: `first_word`

A scalar function processes one data chunk and returns one output value per row. The callback receives the full chunk and an output vector (not per-row state pointers).

Key rule: always propagate NULL

If the input row is NULL, write NULL to output — never read from an invalid row.

#![allow(unused)]
fn main() {
unsafe extern "C" fn first_word_scalar(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };
    let row_count = reader.row_count();

    for row in 0..row_count {
        if !unsafe { reader.is_valid(row) } {
            unsafe { writer.set_null(row) }; // NULL in → NULL out
            continue;
        }
        let s = unsafe { reader.read_str(row) };
        unsafe { writer.write_varchar(row, first_word(s)) };
    }
}
}

The pure logic:

#![allow(unused)]
fn main() {
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}
}

Note: set_null internally calls duckdb_vector_ensure_validity_writable before writing the null flag — this is required by DuckDB and handled for you by VectorWriter.

Part 3 — Registration

#![allow(unused)]
fn main() {
unsafe fn register(con: libduckdb_sys::duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        AggregateFunctionBuilder::new("word_count")
            .param(TypeId::Varchar)
            .returns(TypeId::BigInt)
            .state_size(wc_state_size)
            .init(wc_state_init)
            .update(wc_update)
            .combine(wc_combine)
            .finalize(wc_finalize)
            .destructor(wc_state_destroy)
            .register(con)?;

        ScalarFunctionBuilder::new("first_word")
            .param(TypeId::Varchar)
            .returns(TypeId::Varchar)
            .function(first_word_scalar)
            .register(con)?;
    }
    Ok(())
}
}

Both builders call the DuckDB C API internally. register returns Err if DuckDB reports a failure — this propagates to the entry point and is surfaced to the user.

Part 4 — Entry point

#![allow(unused)]
fn main() {
quack_rs::entry_point!(hello_ext_init_c_api, |con| unsafe { register(con) });
}

This one line emits:

#![allow(unused)]
fn main() {
#[no_mangle]
pub unsafe extern "C" fn hello_ext_init_c_api(
    info: duckdb_extension_info,
    access: *const duckdb_extension_access,
) -> bool {
    unsafe {
        quack_rs::entry_point::init_extension(
            info, access, quack_rs::DUCKDB_API_VERSION,
            |con| unsafe { register(con) },
        )
    }
}
}

Pass the full symbol name — hello_ext_init_c_api here. DuckDB looks up this exact symbol when loading the extension. See The Entry Point for the full initialization sequence.

Unit tests (no DuckDB process needed)

Test pure logic directly:

#![allow(unused)]
fn main() {
#[test]
fn count_words_whitespace_variants() {
    assert_eq!(count_words("  hello  world  "), 2);
    assert_eq!(count_words("\t\nhello\tworld\n"), 2);
    assert_eq!(count_words("   "), 0); // all whitespace → 0
}

#[test]
fn first_word_empty_and_whitespace() {
    assert_eq!(first_word(""), "");
    assert_eq!(first_word("   "), "");
}
}

Test aggregate state with AggregateTestHarness:

#![allow(unused)]
fn main() {
#[test]
fn word_count_null_rows_are_skipped() {
    // NULL rows: the callback skips them (no update call)
    let mut h = AggregateTestHarness::<WordCountState>::new();
    h.update(|s| s.count += count_words("hello"));
    // NULL row omitted — models callback skip
    h.update(|s| s.count += count_words("world"));
    assert_eq!(h.finalize().count, 2);
}

#[test]
fn word_count_combine() {
    let mut h1 = AggregateTestHarness::<WordCountState>::new();
    h1.update(|s| s.count += count_words("hello world")); // 2

    let mut h2 = AggregateTestHarness::<WordCountState>::new();
    h2.update(|s| s.count += count_words("one two three four")); // 4

    h2.combine(&h1, |src, tgt| tgt.count += src.count);
    assert_eq!(h2.finalize().count, 6);
}
}

Run all tests with:

cargo test --manifest-path examples/hello-ext/Cargo.toml

See the Testing Guide for the full test strategy.

Project Scaffold

quack_rs::scaffold::generate_scaffold generates a complete, submission-ready DuckDB community extension project from a single function call. No manual file creation, no copy-pasting templates.

What it generates

my_extension/
├── Cargo.toml                          # cdylib crate, pinned deps, release profile
├── Makefile                            # delegates to cargo + extension-ci-tools
├── extension_config.cmake              # required by extension-ci-tools
├── src/
│   ├── lib.rs                          # entry point template
│   └── wasm_lib.rs                     # WASM staticlib shim
├── description.yml                     # community extension metadata
├── test/
│   └── sql/
│       └── my_extension.test           # SQLLogicTest skeleton
├── .github/
│   └── workflows/
│       └── extension-ci.yml            # cross-platform CI workflow
├── .gitmodules                         # extension-ci-tools submodule
├── .gitignore
└── .cargo/
    └── config.toml                     # Windows CRT static linking

Usage

use quack_rs::scaffold::{ScaffoldConfig, generate_scaffold};
use std::path::Path;

fn main() {
    let config = ScaffoldConfig {
        name: "my_extension".to_string(),
        description: "My DuckDB extension".to_string(),
        version: "0.1.0".to_string(),
        license: "MIT".to_string(),
        maintainer: "Your Name".to_string(),
        github_repo: "yourorg/duckdb-my-extension".to_string(),
        excluded_platforms: vec![],
    };

    let files = generate_scaffold(&config).expect("scaffold generation failed");

    for file in &files {
        let path = Path::new(&file.path);
        if let Some(parent) = path.parent() {
            std::fs::create_dir_all(parent).unwrap();
        }
        std::fs::write(path, &file.content).unwrap();
        println!("created {}", file.path);
    }
}

ScaffoldConfig fields

Field	Type	Description
`name`	`String`	Extension name — must match `[lib] name` in Cargo.toml and `description.yml`
`description`	`String`	One-line description for `description.yml`
`version`	`String`	Semver or git hash — validated by `validate_extension_version`
`license`	`String`	SPDX license identifier (e.g., `"MIT"`, `"Apache-2.0"`)
`maintainer`	`String`	Your name or org, listed in `description.yml`
`github_repo`	`String`	`"owner/repo"` format
`excluded_platforms`	`Vec<String>`	Platforms to skip (e.g., `["wasm_mvp", "wasm_eh"]`)

Name validation

Extension names must satisfy all of:

Match ^[a-z][a-z0-9_-]*$
Not exceed 64 characters
Be globally unique on community-extensions.duckdb.org

Use vendor-prefixed names to avoid collisions: myorg_analytics, not analytics.

The scaffold generator validates the name before generating any files and returns an error if it violates the rules.

After scaffolding

cd my_extension
git init
git submodule add https://github.com/duckdb/extension-ci-tools.git extension-ci-tools
git submodule update --init --recursive
make configure
make release

Then add your function logic in src/lib.rs, write your SQLLogicTests in test/sql/my_extension.test, and push to GitHub — CI runs automatically.

Excluded platforms

Some extensions cannot be built for all platforms (e.g., extensions that depend on platform-specific system libraries, or WASM environments that lack threading).

#![allow(unused)]
fn main() {
ScaffoldConfig {
    excluded_platforms: vec![
        "wasm_mvp".to_string(),
        "wasm_eh".to_string(),
        "wasm_threads".to_string(),
    ],
    // ...
}
}

Validate individual platform names with quack_rs::validate::validate_platform, or a semicolon-delimited string (as used in description.yml) with quack_rs::validate::validate_excluded_platforms_str.

Extension Anatomy

A DuckDB loadable extension is a shared library (.so / .dylib / .dll) that DuckDB loads at runtime. Understanding what DuckDB expects makes every other part of quack-rs click.

The initialization sequence

When DuckDB loads your extension, it:

Opens the shared library and looks up the symbol {name}_init_c_api
Calls that function with an info handle and a pointer to function dispatch pointers
Your function must: a. Call duckdb_rs_extension_api_init(info, access, api_version) to initialize the dispatch table b. Get the duckdb_database handle via access.get_database(info) c. Open a duckdb_connection via duckdb_connect d. Register functions on that connection e. Disconnect f. Return true (success) or false (failure)

quack_rs::entry_point::init_extension performs all of this correctly. The entry_point! macro generates the required #[no_mangle] extern "C" symbol:

#![allow(unused)]
fn main() {
entry_point!(my_extension_init_c_api, |con| register(con));
// emits: #[no_mangle] pub unsafe extern "C" fn my_extension_init_c_api(...)
}

Symbol naming

The symbol name must be {extension_name}_init_c_api — all lowercase, underscores only. If the symbol is missing or misnamed, DuckDB fails to load the extension.

Extension name: "word_count_ext"
Required symbol: word_count_ext_init_c_api

Pass the full symbol name to entry_point!. This keeps the exported name explicit and visible at the call site — no hidden identifier manipulation at compile time.

The `loadable-extension` feature

libduckdb-sys with features = ["loadable-extension"] changes how DuckDB API functions work fundamentally:

Without feature:  duckdb_query(...)  →  calls linked libduckdb directly
With feature:     duckdb_query(...)  →  dispatches through an AtomicPtr table

The AtomicPtr table starts as null. DuckDB fills it in by calling duckdb_rs_extension_api_init. This means:

Any call before duckdb_rs_extension_api_init panics with "DuckDB API not initialized"
In cargo test, you cannot call any duckdb_* function — the table is never initialized

This is why quack-rs uses AggregateTestHarness for testing: it simulates the aggregate lifecycle in pure Rust, with zero DuckDB API calls.

Dependency model

graph TD
    EXT["your-extension"]
    QR["quack-rs"]
    LDS["libduckdb-sys >=1.4.4, <2<br/>{loadable-extension}<br/>(headers only — no linked library)"]

    EXT --> QR
    EXT --> LDS
    QR  --> LDS

The loadable-extension feature produces a shared library that does not statically link DuckDB. Instead, it receives DuckDB's function pointers at load time. This is the correct model for extensions: you run inside DuckDB's process, using its memory and threading.

Version support

libduckdb-sys = ">=1.4.4, <2" — the bounded range is intentional.

DuckDB 1.4.x and 1.5.x both expose C API version v1.2.0 (the version string embedded in duckdb_rs_extension_api_init). quack-rs has been E2E tested against both releases. Using a range rather than an exact pin means:

Extension authors can choose their DuckDB target (pin to =1.4.4 or =1.5.0 in their own Cargo.toml) and resolve cleanly against quack-rs
quack-rs itself doesn't force a DuckDB downgrade on users

The <2 upper bound is equally intentional: it prevents silent adoption of a future major release that may introduce breaking C API changes. Upgrading beyond the 1.x band requires an explicit quack-rs release that audits the new C API surface.

For your own extension's Cargo.toml: pin libduckdb-sys to the exact DuckDB version you build and test against (e.g., =1.5.0). Your extension binary will only load in the DuckDB version it was compiled for regardless — the range only matters for quack-rs itself as a library dependency.

Binary compatibility

Extension binaries are tied to a specific DuckDB version and platform. Key facts:

An extension compiled for DuckDB 1.4.4 will not load in DuckDB 1.5.0
DuckDB verifies binary compatibility at load time and refuses mismatched binaries
Official DuckDB extensions are cryptographically signed; community extensions are not
To load unsigned extensions: SET allow_unsigned_extensions = true (development only)
The community extension CI provides automated cross-platform builds for each DuckDB release

The Entry Point

Every DuckDB extension must export a single C-callable symbol that DuckDB invokes at load time. quack-rs provides two ways to create it.

Option A: `entry_point_v2!` with `Connection` (recommended)

Added in v0.4.0.

The entry_point_v2! macro gives your closure a &Connection instead of a raw duckdb_connection. The Connection type implements the Registrar trait, which provides ergonomic methods for registering every function type:

#![allow(unused)]
fn main() {
use quack_rs::entry_point_v2;
use quack_rs::connection::{Connection, Registrar};
use quack_rs::error::ExtensionError;

unsafe fn register(con: &Connection) -> Result<(), ExtensionError> {
    unsafe {
        con.register_scalar(/* ScalarFunctionBuilder */)?;
        con.register_aggregate(/* AggregateFunctionBuilder */)?;
        con.register_table(/* TableFunctionBuilder */)?;
        con.register_cast(/* CastFunctionBuilder */)?;
        con.register_scalar_set(/* ScalarFunctionSetBuilder */)?;
        con.register_aggregate_set(/* AggregateFunctionSetBuilder */)?;
        con.register_sql_macro(/* SqlMacro */)?;
        con.register_replacement_scan(/* callback, data, destructor */);
        // con.register_copy_function(/* CopyFunctionBuilder */)?;  // requires duckdb-1-5
    }
    Ok(())
}

entry_point_v2!(my_extension_init_c_api, |con| unsafe { register(con) });
}

This emits:

#![allow(unused)]
fn main() {
#[no_mangle]
pub unsafe extern "C" fn my_extension_init_c_api(
    info: duckdb_extension_info,
    access: *const duckdb_extension_access,
) -> bool {
    unsafe {
        quack_rs::entry_point::init_extension_v2(
            info, access, quack_rs::DUCKDB_API_VERSION,
            |con| unsafe { register(con) },
        )
    }
}
}

Pass the full symbol name to the macro. The symbol {name}_init_c_api must match the name field in description.yml and the [lib] name in Cargo.toml.

Why `Connection` over raw `duckdb_connection`?

Feature	`entry_point!` (raw)	`entry_point_v2!` (Connection)
Receives	`duckdb_connection`	`&Connection`
Registration	Call builders' `.register(con)`	Call `con.register_*()`
Type safety	Raw pointer	Wrapper with lifetime
Future-proofing	Tied to C pointer	Can evolve without breaking extensions

Option B: The `entry_point!` macro

The original macro passes a raw duckdb_connection to your closure. It works identically but requires you to pass the connection to each builder's .register():

#![allow(unused)]
fn main() {
use quack_rs::entry_point;
use quack_rs::error::ExtensionError;

fn register(con: libduckdb_sys::duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        // register your functions here
        Ok(())
    }
}

entry_point!(my_extension_init_c_api, |con| register(con));
}

Option C: Manual entry point

If you need full control (e.g., multiple registration functions, conditional logic):

#![allow(unused)]
fn main() {
use quack_rs::entry_point::init_extension;
use libduckdb_sys::{duckdb_extension_info, duckdb_extension_access};

#[no_mangle]
pub unsafe extern "C" fn my_extension_init_c_api(
    info: duckdb_extension_info,
    access: *const duckdb_extension_access,
) -> bool {
    unsafe {
        init_extension(info, access, quack_rs::DUCKDB_API_VERSION, |con| {
            register_scalar_functions(con)?;
            register_aggregate_functions(con)?;
            register_sql_macros(con)?;
            Ok(())
        })
    }
}
}

What `init_extension` does

flowchart TD
    A["**1. duckdb_rs_extension_api_init**(info, access, version)<br/>Fills the global AtomicPtr dispatch table"]
    B["**2. access.get_database**(info)<br/>Returns the duckdb_database handle"]
    C["**3. duckdb_connect**(db, &amp;mut con)<br/>Opens a connection for function registration"]
    D["**4. register**(con) ← your closure"]
    E["**5. duckdb_disconnect**(&amp;mut con)<br/>Always runs, even if registration failed"]
    F{Error?}
    G["return **true**"]
    H["return **false**<br/>error reported via access.set_error"]

    A --> B --> C --> D --> E --> F
    F -->|no| G
    F -->|yes| H

    style G fill:#1c3b1c,stroke:#4a9e4a,color:#c8ecc8
    style H fill:#3b1c1c,stroke:#9e4a4a,color:#ecc8c8

Errors from step 4 are reported back to DuckDB via access.set_error and the function returns false. DuckDB then surfaces the error message to the user.

The C API version constant

#![allow(unused)]
fn main() {
pub const DUCKDB_API_VERSION: &str = "v1.2.0";
}

Pitfall P2: This is the C API version, not the DuckDB release version. DuckDB 1.4.x, 1.5.0, and 1.5.1 all use C API version v1.2.0. Passing the wrong string causes the metadata script to fail or produce incorrect metadata. See Pitfall P2.

No panics in the entry point

init_extension never panics. All error paths use Result and ?. If your registration closure returns Err, the error message is reported to DuckDB via access.set_error and the extension fails to load gracefully.

Never use unwrap() or expect() in FFI callbacks. See Pitfall L3.

Error Handling

quack-rs uses a single error type throughout: ExtensionError.

`ExtensionError`

#![allow(unused)]
fn main() {
use quack_rs::error::{ExtensionError, ExtResult};

// From a string literal
let e = ExtensionError::from("something went wrong");

// From a format string
let e = ExtensionError::new(format!("failed to register '{}': code {}", name, code));

// Wrapping another error
let e = ExtensionError::from_error(some_std_error);
}

ExtensionError implements:

std::error::Error
Display, Debug, Clone, PartialEq, Eq
From<&str>, From<String>, From<Box<dyn Error>>
From<std::io::Error>, From<std::ffi::NulError>, From<std::fmt::Error>

The From<std::io::Error> impl is especially useful for extensions that allocate runtime resources (e.g., tokio) during initialization — the ? operator works directly without .map_err():

#![allow(unused)]
fn main() {
fn register_all(con: &Connection) -> Result<(), ExtensionError> {
    let _rt = tokio::runtime::Runtime::new()?; // ← io::Error → ExtensionError
    // ... register functions ...
    Ok(())
}
}

`ExtResult<T>`

A type alias for Result<T, ExtensionError>, used throughout the SDK:

#![allow(unused)]
fn main() {
pub type ExtResult<T> = Result<T, ExtensionError>;
}

Propagating errors with `?`

In your registration function:

#![allow(unused)]
fn main() {
fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        ScalarFunctionBuilder::new("my_fn")
            .param(TypeId::BigInt)
            .returns(TypeId::BigInt)
            .function(my_fn)
            .register(con)?;   // ← ? propagates registration errors

        SqlMacro::scalar("my_macro", &["x"], "x + 1")?
            .register(con)?;

        Ok(())
    }
}
}

If any registration call fails, ? returns the error from register, which init_extension then reports to DuckDB via access.set_error.

Error reporting to DuckDB

init_extension converts ExtensionError to a CString for the DuckDB error callback:

#![allow(unused)]
fn main() {
pub fn to_c_string(&self) -> CString {
    // Truncates at the first null byte if message contains one
    CString::new(self.message.as_bytes()).unwrap_or_else(...)
}
}

DuckDB surfaces this string to the user as the extension load error.

No panics, ever

The cardinal rule of DuckDB extension development:

Never unwrap(), expect(), or panic!() in any code path that DuckDB may call.

Rust panics that cross FFI boundaries are undefined behavior. With panic = "abort" in the release profile, a panic terminates the process — which is safer than UB, but still unacceptable in production.

Safe patterns

#![allow(unused)]
fn main() {
// ✅ Use Option methods
if let Some(s) = FfiState::<MyState>::with_state_mut(state_ptr) {
    s.count += 1;
}

// ✅ Use Result and ?
let value = some_fallible_call()?;

// ✅ Use unwrap_or / unwrap_or_else / map
let count = maybe_count.unwrap_or(0);

// ❌ Never in FFI callbacks
let s = FfiState::<MyState>::with_state_mut(state_ptr).unwrap(); // undefined behavior
}

In `init_extension`

init_extension wraps everything in match and reports errors via set_error — it can never panic regardless of what your registration closure returns.

Type System

quack-rs provides TypeId and LogicalType to bridge Rust types and DuckDB column types.

`TypeId`

TypeId is an ergonomic enum covering DuckDB's column types (the GEOMETRY and VARIANT types added in DuckDB 1.5.x are exposed behind the duckdb-1-5-3 feature — see Known Limitations):

#![allow(unused)]
fn main() {
use quack_rs::types::TypeId;

TypeId::Boolean
TypeId::TinyInt     // i8
TypeId::SmallInt    // i16
TypeId::Integer     // i32
TypeId::BigInt      // i64
TypeId::UTinyInt    // u8
TypeId::USmallInt   // u16
TypeId::UInteger    // u32
TypeId::UBigInt     // u64
TypeId::HugeInt     // i128
TypeId::UHugeInt    // u128
TypeId::Float       // f32
TypeId::Double      // f64
TypeId::Timestamp
TypeId::TimestampTz
TypeId::TimestampS
TypeId::TimestampMs
TypeId::TimestampNs
TypeId::Date
TypeId::Time
TypeId::TimeTz
TypeId::Interval
TypeId::Varchar
TypeId::Blob
TypeId::Decimal
TypeId::Enum
TypeId::List
TypeId::Struct
TypeId::Map
TypeId::Uuid
TypeId::Union
TypeId::Bit
TypeId::Array
TypeId::TimeNs      // duckdb-1-5
TypeId::Any              // duckdb-1-5
TypeId::Varint           // duckdb-1-5
TypeId::SqlNull          // duckdb-1-5
TypeId::IntegerLiteral   // duckdb-1-5
TypeId::StringLiteral    // duckdb-1-5
TypeId::Geometry         // duckdb-1-5-3
TypeId::Variant          // duckdb-1-5-3
}

TypeId is Copy, Clone, Debug, PartialEq, Eq, and Display.

SQL name

#![allow(unused)]
fn main() {
assert_eq!(TypeId::BigInt.sql_name(), "BIGINT");
assert_eq!(TypeId::Varchar.sql_name(), "VARCHAR");
assert_eq!(format!("{}", TypeId::Timestamp), "TIMESTAMP");
}

DuckDB constant

TypeId::to_duckdb_type() returns the DUCKDB_TYPE_* integer constant from libduckdb-sys. You rarely need this directly — it's called internally by LogicalType::new.

Reverse conversion

TypeId::from_duckdb_type(raw) converts a raw DUCKDB_TYPE constant back into a TypeId. Panics if the value does not match any known constant.

#![allow(unused)]
fn main() {
use quack_rs::types::TypeId;

let type_id = TypeId::from_duckdb_type(libduckdb_sys::DUCKDB_TYPE_DUCKDB_TYPE_BIGINT);
assert_eq!(type_id, TypeId::BigInt);
}

`LogicalType`

LogicalType is a RAII wrapper around DuckDB's duckdb_logical_type. It is used internally by the function builders.

#![allow(unused)]
fn main() {
use quack_rs::types::{LogicalType, TypeId};

let lt = LogicalType::new(TypeId::Varchar);
// lt.as_raw() returns the duckdb_logical_type pointer
// Drop calls duckdb_destroy_logical_type automatically
}

Pitfall L7: duckdb_create_logical_type allocates memory that must be freed with duckdb_destroy_logical_type. LogicalType's Drop implementation does this automatically, preventing the memory leak that occurs when calling the DuckDB C API directly. See Pitfall L7.

You almost never need to create LogicalType directly. The function builders (ScalarFunctionBuilder, AggregateFunctionBuilder) create and destroy them internally.

Constructors

Constructor	Creates
`LogicalType::new(type_id)`	Simple type from a `TypeId`
`LogicalType::from_raw(ptr)`	Takes ownership of a raw `duckdb_logical_type` handle (unsafe)
`LogicalType::decimal(width, scale)`	`DECIMAL(width, scale)`
`LogicalType::list(element_type)`	`LIST<element_type>` from a `TypeId`
`LogicalType::list_from_logical(element)`	`LIST<element>` from an existing `LogicalType`
`LogicalType::map(key, value)`	`MAP<key, value>` from `TypeId`s
`LogicalType::map_from_logical(key, value)`	`MAP<key, value>` from existing `LogicalType`s
`LogicalType::struct_type(fields)`	`STRUCT` from `&[(&str, TypeId)]`
`LogicalType::struct_type_from_logical(fields)`	`STRUCT` from `&[(&str, LogicalType)]`
`LogicalType::union_type(members)`	`UNION` from `&[(&str, TypeId)]`
`LogicalType::union_type_from_logical(members)`	`UNION` from `&[(&str, LogicalType)]`
`LogicalType::enum_type(members)`	`ENUM` from `&[&str]`
`LogicalType::array(element_type, size)`	`ARRAY<element_type>[size]` from a `TypeId`
`LogicalType::array_from_logical(element, size)`	`ARRAY<element>[size]` from an existing `LogicalType`

Introspection methods

All introspection methods are unsafe (require a valid DuckDB runtime handle).

Method	Returns	Applicable to
`get_type_id()`	`TypeId`	Any
`get_alias()`	`Option<String>`	Any
`set_alias(alias)`	`()`	Any
`decimal_width()`	`u8`	`DECIMAL`
`decimal_scale()`	`u8`	`DECIMAL`
`decimal_internal_type()`	`TypeId`	`DECIMAL`
`enum_internal_type()`	`TypeId`	`ENUM`
`enum_dictionary_size()`	`u32`	`ENUM`
`enum_dictionary_value(index)`	`String`	`ENUM`
`list_child_type()`	`LogicalType`	`LIST`
`map_key_type()`	`LogicalType`	`MAP`
`map_value_type()`	`LogicalType`	`MAP`
`struct_child_count()`	`u64`	`STRUCT`
`struct_child_name(index)`	`String`	`STRUCT`
`struct_child_type(index)`	`LogicalType`	`STRUCT`
`union_member_count()`	`u64`	`UNION`
`union_member_name(index)`	`String`	`UNION`
`union_member_type(index)`	`LogicalType`	`UNION`
`array_size()`	`u64`	`ARRAY`
`array_child_type()`	`LogicalType`	`ARRAY`

Rust type ↔ DuckDB type mapping

When reading from or writing to vectors, use the corresponding VectorReader/VectorWriter method:

DuckDB type	`TypeId`	Reader method	Writer method
`BOOLEAN`	`Boolean`	`read_bool`	`write_bool`
`TINYINT`	`TinyInt`	`read_i8`	`write_i8`
`SMALLINT`	`SmallInt`	`read_i16`	`write_i16`
`INTEGER`	`Integer`	`read_i32`	`write_i32`
`BIGINT`	`BigInt`	`read_i64`	`write_i64`
`UTINYINT`	`UTinyInt`	`read_u8`	`write_u8`
`USMALLINT`	`USmallInt`	`read_u16`	`write_u16`
`UINTEGER`	`UInteger`	`read_u32`	`write_u32`
`UBIGINT`	`UBigInt`	`read_u64`	`write_u64`
`FLOAT`	`Float`	`read_f32`	`write_f32`
`DOUBLE`	`Double`	`read_f64`	`write_f64`
`VARCHAR`	`Varchar`	`read_str`	`write_varchar`
`INTERVAL`	`Interval`	`read_interval`	`write_interval`

NULLs are handled separately — see NULL Handling & Strings.

Scalar Functions

Scalar functions transform a batch of input rows into a corresponding batch of output values. They are the most common DuckDB extension pattern — equivalent to SQL's built-in functions like length(), upper(), or sin().

Function signature

DuckDB calls your scalar function once per data chunk (not once per row). The signature is:

#![allow(unused)]
fn main() {
unsafe extern "C" fn my_fn(
    info: duckdb_function_info,     // function metadata (rarely needed)
    input: duckdb_data_chunk,       // input data — one or more columns
    output: duckdb_vector,          // output vector — one value per input row
)
}

Inside the function, you:

Create a VectorReader for each input column
Create a VectorWriter for the output
Loop over rows, checking for NULLs and transforming values

Registration

#![allow(unused)]
fn main() {
use quack_rs::scalar::ScalarFunctionBuilder;
use quack_rs::types::TypeId;

unsafe fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        ScalarFunctionBuilder::new("my_fn")
            .param(TypeId::BigInt)      // first parameter type
            .param(TypeId::BigInt)      // second parameter type (if any)
            .returns(TypeId::BigInt)    // return type
            .function(my_fn)            // callback
            .register(con)?;
    }
    Ok(())
}
}

The builder validates that returns and function are set before calling duckdb_register_scalar_function. If DuckDB reports failure, register returns Err.

Validated registration

For user-configurable function names (e.g., from a config file), use try_new:

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::try_new(name)?   // validates name before building
    .param(TypeId::Varchar)
    .returns(TypeId::Varchar)
    .function(my_fn)
    .register(con)?;
}

try_new validates the name against DuckDB naming rules: [a-z_][a-z0-9_]*, max 256 characters. new panics on invalid names (suitable for compile-time-known names only).

Complete example: `double_it(BIGINT) → BIGINT`

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorReader, VectorWriter};
use libduckdb_sys::{duckdb_function_info, duckdb_data_chunk, duckdb_vector};

unsafe extern "C" fn double_it(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    // SAFETY: DuckDB provides valid chunk and vector pointers.
    let reader = unsafe { VectorReader::new(input, 0) };   // column 0
    let mut writer = unsafe { VectorWriter::new(output) };
    let row_count = reader.row_count();

    for row in 0..row_count {
        if unsafe { !reader.is_valid(row) } {
            // NULL input → NULL output
            // SAFETY: row < row_count, writer is valid.
            unsafe { writer.set_null(row) };
            continue;
        }
        let value = unsafe { reader.read_i64(row) };
        unsafe { writer.write_i64(row, value * 2) };
    }
}
}

Multi-parameter example: `add(BIGINT, BIGINT) → BIGINT`

#![allow(unused)]
fn main() {
unsafe extern "C" fn add(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let col0 = unsafe { VectorReader::new(input, 0) };  // first param
    let col1 = unsafe { VectorReader::new(input, 1) };  // second param
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..col0.row_count() {
        if unsafe { !col0.is_valid(row) || !col1.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let a = unsafe { col0.read_i64(row) };
        let b = unsafe { col1.read_i64(row) };
        unsafe { writer.write_i64(row, a + b) };
    }
}
}

VARCHAR example: `shout(VARCHAR) → VARCHAR`

#![allow(unused)]
fn main() {
unsafe extern "C" fn shout(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..reader.row_count() {
        if unsafe { !reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let s = unsafe { reader.read_str(row) };
        let upper = s.to_uppercase();
        unsafe { writer.write_varchar(row, &upper) };
    }
}
}

Overloading with Function Sets

If your function accepts different parameter types or arities, use ScalarFunctionSetBuilder to register multiple overloads under a single name:

#![allow(unused)]
fn main() {
use quack_rs::scalar::{ScalarFunctionSetBuilder, ScalarOverloadBuilder};
use quack_rs::types::TypeId;

unsafe fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        ScalarFunctionSetBuilder::new("my_add")
            .overload(
                ScalarOverloadBuilder::new()
                    .param(TypeId::Integer).param(TypeId::Integer)
                    .returns(TypeId::Integer)
                    .function(add_ints)
            )
            .overload(
                ScalarOverloadBuilder::new()
                    .param(TypeId::Double).param(TypeId::Double)
                    .returns(TypeId::Double)
                    .function(add_doubles)
            )
            .register(con)?;
    }
    Ok(())
}
}

Like AggregateFunctionSetBuilder, this builder calls duckdb_scalar_function_set_name on every individual function before adding it to the set (Pitfall L6).

NULL Handling

By default, DuckDB returns NULL if any argument is NULL — your function callback is never called for those rows. If you need to handle NULLs explicitly (e.g., for a COALESCE-like function), set SpecialNullHandling:

#![allow(unused)]
fn main() {
use quack_rs::types::NullHandling;

ScalarFunctionBuilder::new("coalesce_custom")
    .param(TypeId::BigInt)
    .returns(TypeId::BigInt)
    .null_handling(NullHandling::SpecialNullHandling)
    .function(my_coalesce_fn)
    .register(con)?;
}

With SpecialNullHandling, your callback must check VectorReader::is_valid(row) and handle NULLs yourself.

Complex parameter and return types

For scalar functions that accept or return parameterized types like LIST(BIGINT), use param_logical and returns_logical:

#![allow(unused)]
fn main() {
use quack_rs::scalar::ScalarFunctionBuilder;
use quack_rs::types::{LogicalType, TypeId};

ScalarFunctionBuilder::new("flatten_list")
    .param_logical(LogicalType::list(TypeId::BigInt))  // LIST(BIGINT) input
    .returns(TypeId::BigInt)
    .function(flatten_list_fn)
    .register(con)?;
}

These methods are also available on ScalarOverloadBuilder for function sets:

#![allow(unused)]
fn main() {
ScalarOverloadBuilder::new()
    .param(TypeId::Varchar)
    .returns_logical(LogicalType::list(TypeId::Timestamp))  // LIST(TIMESTAMP) output
    .function(my_fn)
}

Key points

VectorReader::new(input, column_index) — the column index is zero-based
Always check is_valid(row) before reading — skipping this reads garbage for NULL rows
set_null must be called for NULL outputs — it calls ensure_validity_writable automatically (Pitfall L4)
read_bool returns bool — handles DuckDB's non-0/1 boolean bytes correctly (Pitfall L5)
read_str handles both inline and pointer string formats automatically (Pitfall P7)

DuckDB 1.5.0 Additions (`duckdb-1-5`)

The following ScalarFunctionBuilder methods are available when the duckdb-1-5 feature is enabled:

`varargs(type_id: TypeId)`

Declares that the function accepts a variable number of trailing arguments, all of the given TypeId. Maps to duckdb_scalar_function_set_varargs.

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("concat_all")
    .varargs(TypeId::Varchar)
    .returns(TypeId::Varchar)
    .function(concat_all_fn)
    .register(con)?;
}

`varargs_logical(logical_type: LogicalType)`

Like varargs, but accepts a LogicalType for parameterized variadic arguments. Maps to duckdb_scalar_function_set_varargs.

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("merge_lists")
    .varargs_logical(LogicalType::list(TypeId::BigInt))
    .returns_logical(LogicalType::list(TypeId::BigInt))
    .function(merge_lists_fn)
    .register(con)?;
}

`volatile()`

Marks the function as volatile, meaning DuckDB will not cache or reuse its results across calls with the same arguments. Maps to duckdb_scalar_function_set_volatile.

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("random_int")
    .returns(TypeId::Integer)
    .volatile()
    .function(random_int_fn)
    .register(con)?;
}

`bind(bind_fn)`

Sets a custom bind callback that runs at plan time. Use this to inspect argument types and set the return type dynamically. Maps to duckdb_scalar_function_set_bind.

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("dynamic_return")
    .varargs(TypeId::Varchar)
    .returns(TypeId::Varchar)   // default; overridden in bind
    .bind(my_bind_fn)
    .function(dynamic_return_fn)
    .register(con)?;
}

`init(init_fn)`

Sets a local-init callback invoked once per thread before execution begins. Use this to allocate per-thread state. Maps to duckdb_scalar_function_set_init.

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("stateful_fn")
    .param(TypeId::BigInt)
    .returns(TypeId::BigInt)
    .init(my_init_fn)
    .function(stateful_fn)
    .register(con)?;
}

Extra info

Attach arbitrary data to a scalar function using extra_info. This is useful for parameterising the function behaviour (e.g., a locale or configuration struct). The method is available on both ScalarFunctionBuilder and ScalarOverloadBuilder.

#![allow(unused)]
fn main() {
use std::os::raw::c_void;

let config = Box::into_raw(Box::new("en_US".to_string())).cast::<c_void>();
unsafe {
    ScalarFunctionBuilder::new("locale_upper")
        .param(TypeId::Varchar)
        .returns(TypeId::Varchar)
        .extra_info(config, Some(my_destroy))
        .function(locale_upper_fn)
        .register(con)?;
}
}

Inside the callback, retrieve the extra info with ScalarFunctionInfo::get_extra_info().

`ScalarFunctionInfo`

ScalarFunctionInfo wraps the duckdb_function_info handle provided to a scalar function callback. It exposes:

get_extra_info() -> *mut c_void — retrieves the extra-info pointer set during registration
set_error(message) — reports an error, causing DuckDB to abort the query

#![allow(unused)]
fn main() {
use quack_rs::scalar::ScalarFunctionInfo;

unsafe extern "C" fn my_fn(
    info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let info = unsafe { ScalarFunctionInfo::new(info) };
    let extra = unsafe { info.get_extra_info() };
    // ... use extra info, or report errors via info.set_error("...") ...
}
}

With the duckdb-1-5 feature, ScalarFunctionInfo also provides:

get_bind_data() -> *mut c_void — retrieves bind data set during the bind callback
get_state() -> *mut c_void — retrieves per-thread state set during the init callback

`ScalarBindInfo` (`duckdb-1-5`)

ScalarBindInfo wraps the duckdb_bind_info handle provided to a scalar function bind callback. It exposes:

argument_count() -> u64 — number of arguments
get_argument(index) -> duckdb_expression — argument expression at index
get_extra_info() -> *mut c_void — the extra-info pointer from registration
set_bind_data(data, destroy) — stores per-query data retrievable during execution
set_error(message) — reports an error
get_client_context() -> ClientContext — access to the connection's catalog and config

`ScalarInitInfo` (`duckdb-1-5`)

ScalarInitInfo wraps the duckdb_init_info handle provided to a scalar function init callback. It exposes:

get_extra_info() -> *mut c_void — the extra-info pointer from registration
get_bind_data() -> *mut c_void — the bind data from the bind callback
set_state(state, destroy) — stores per-thread state retrievable during execution
set_error(message) — reports an error
get_client_context() -> ClientContext — access to the connection's catalog and config

Aggregate Functions

Aggregate functions reduce multiple rows into a single value per group — like SUM(), COUNT(), or AVG(). DuckDB supports parallel aggregation, which introduces a combine step that merges partial results from parallel workers.

The aggregate lifecycle

flowchart TD
    REG["**Registration**<br/>AggregateFunctionBuilder<br/>→ duckdb_register_aggregate_function"]

    REG     --> SIZE
    SIZE    --> INIT
    INIT    --> UPDATE
    UPDATE  --> COMBINE
    COMBINE --> FINAL
    FINAL   --> DESTROY

    SIZE["**state_size**()<br/>How many bytes to allocate per group?"]
    INIT["**state_init**(state)<br/>Initialize a fresh state"]
    UPDATE["**update**(chunk, states[])<br/>Process one input batch"]
    COMBINE["**combine**(src[], tgt[], count)<br/>Merge partial results from parallel workers<br/>⚠️ Pitfall L1: target starts fresh — copy ALL config fields"]
    FINAL["**finalize**(states[], out, count)<br/>Write results to output vector"]
    DESTROY["**state_destroy**(states[], count)<br/>Free memory"]

    style COMBINE fill:#fff3cd,stroke:#e6ac00,color:#333

DuckDB may call combine multiple times as it merges results from parallel segments. Target states in combine are always fresh (zero-initialized via state_init).

Registration

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionBuilder;
use quack_rs::types::TypeId;

unsafe fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        AggregateFunctionBuilder::new("my_agg")
            .param(TypeId::Varchar)       // input type(s)
            .returns(TypeId::BigInt)      // output type
            .state_size(state_size)
            .init(state_init)
            .update(update)
            .combine(combine)
            .finalize(finalize)
            .destructor(state_destroy)
            .register(con)?;
    }
    Ok(())
}
}

The five core callbacks (state_size, init, update, combine, finalize) must be set before register — the builder will return an error if any are missing. The destructor callback is optional but strongly recommended when your state allocates heap memory (e.g., when using FfiState<T>).

Callback signatures

`state_size`

#![allow(unused)]
fn main() {
unsafe extern "C" fn state_size(_info: duckdb_function_info) -> idx_t {
    FfiState::<MyState>::size_callback(_info)
}
}

Returns the size DuckDB must allocate per group. This is always size_of::<*mut MyState>() — a pointer, since FfiState<T> stores a Box<T> pointer in the allocated slot.

`state_init`

#![allow(unused)]
fn main() {
unsafe extern "C" fn state_init(info: duckdb_function_info, state: duckdb_aggregate_state) {
    unsafe { FfiState::<MyState>::init_callback(info, state) };
}
}

Allocates a Box<MyState> (using MyState::default()) and writes its raw pointer into the DuckDB-allocated state slot.

`update`

#![allow(unused)]
fn main() {
unsafe extern "C" fn update(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    states: *mut duckdb_aggregate_state,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let row_count = reader.row_count();

    for row in 0..row_count {
        if unsafe { !reader.is_valid(row) } { continue; }
        let value = unsafe { reader.read_i64(row) };

        let state_ptr = unsafe { *states.add(row) };
        if let Some(st) = unsafe { FfiState::<MyState>::with_state_mut(state_ptr) } {
            st.accumulate(value);
        }
    }
}
}

states[i] corresponds to chunk row i. Each state belongs to one group.

`combine`

#![allow(unused)]
fn main() {
unsafe extern "C" fn combine(
    _info: duckdb_function_info,
    source: *mut duckdb_aggregate_state,
    target: *mut duckdb_aggregate_state,
    count: idx_t,
) {
    for i in 0..count as usize {
        let src = unsafe { FfiState::<MyState>::with_state(*source.add(i)) };
        let tgt = unsafe { FfiState::<MyState>::with_state_mut(*target.add(i)) };
        if let (Some(s), Some(t)) = (src, tgt) {
            // ⚠️  MUST copy ALL fields — see Pitfall L1
            t.config_field = s.config_field;   // configuration
            t.accumulator  += s.accumulator;    // data
        }
    }
}
}

Pitfall L1 — critical: Target states are fresh T::default() values. You must copy every field, including configuration fields set during update. Forgetting even one config field produces silently wrong results. See Pitfall L1.

`finalize`

#![allow(unused)]
fn main() {
unsafe extern "C" fn finalize(
    _info: duckdb_function_info,
    source: *mut duckdb_aggregate_state,
    result: duckdb_vector,
    count: idx_t,
    offset: idx_t,
) {
    let mut writer = unsafe { VectorWriter::new(result) };
    for i in 0..count as usize {
        let state_ptr = unsafe { *source.add(i) };
        match unsafe { FfiState::<MyState>::with_state(state_ptr) } {
            Some(st) => unsafe { writer.write_i64(offset as usize + i, st.result()) },
            None     => unsafe { writer.set_null(offset as usize + i) },
        }
    }
}
}

The offset parameter is non-zero when DuckDB is writing into a portion of a larger vector. Always add it to your index.

`state_destroy`

#![allow(unused)]
fn main() {
unsafe extern "C" fn state_destroy(states: *mut duckdb_aggregate_state, count: idx_t) {
    unsafe { FfiState::<WordCountState>::destroy_callback(states, count) };
}
}

destroy_callback calls Box::from_raw for each state and then nulls the pointer, preventing double-free. See Pitfall L2.

Complex parameter and return types

For functions that accept or return parameterized types like LIST(BIGINT), MAP(VARCHAR, INTEGER), or STRUCT(...), use param_logical and returns_logical instead of param and returns:

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionBuilder;
use quack_rs::types::{LogicalType, TypeId};

unsafe fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        AggregateFunctionBuilder::new("retention")
            .param(TypeId::Boolean)
            .param(TypeId::Boolean)
            .returns_logical(LogicalType::list(TypeId::Boolean))  // LIST(BOOLEAN)
            .state_size(state_size)
            .init(state_init)
            .update(update)
            .combine(combine)
            .finalize(finalize)
            .destructor(state_destroy)
            .register(con)?;
    }
    Ok(())
}
}

param_logical and param can be interleaved — the parameter position is determined by the total number of calls made so far:

#![allow(unused)]
fn main() {
AggregateFunctionBuilder::new("my_func")
    .param(TypeId::Varchar)                          // position 0: VARCHAR
    .param_logical(LogicalType::list(TypeId::BigInt)) // position 1: LIST(BIGINT)
    .param(TypeId::Integer)                           // position 2: INTEGER
    .returns(TypeId::BigInt)
    // ...
}

If both returns and returns_logical are called, the logical type takes precedence.

Extra info

Attach arbitrary data to an aggregate function using extra_info. This is useful for parameterising the function behaviour (e.g., passing configuration):

#![allow(unused)]
fn main() {
use std::os::raw::c_void;

let config = Box::into_raw(Box::new(42u64)).cast::<c_void>();
unsafe {
    AggregateFunctionBuilder::new("my_agg")
        .param(TypeId::BigInt)
        .returns(TypeId::BigInt)
        .extra_info(config, Some(my_destroy))
        .state_size(state_size)
        .init(state_init)
        .update(update)
        .combine(combine)
        .finalize(finalize)
        .destructor(state_destroy)
        .register(con)?;
}
}

Inside callbacks, retrieve the extra info with AggregateFunctionInfo::get_extra_info().

`AggregateFunctionInfo`

AggregateFunctionInfo wraps the duckdb_function_info handle provided to aggregate function callbacks (update, combine, finalize, etc.). It exposes:

get_extra_info() -> *mut c_void — retrieves the extra-info pointer set during registration
set_error(message) — reports an error, causing DuckDB to abort the query

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionInfo;

unsafe extern "C" fn update(
    info: duckdb_function_info,
    input: duckdb_data_chunk,
    states: *mut duckdb_aggregate_state,
) {
    let info = unsafe { AggregateFunctionInfo::new(info) };
    let extra = unsafe { info.get_extra_info() };
    // ... use extra info, or report errors via info.set_error("...") ...
}
}

Next steps

State Management — FfiState<T>, AggregateState, and lifecycle details
Overloading with Function Sets — register multiple signatures under one name

State Management

FfiState<T> manages the lifecycle of aggregate state — allocation, initialization, access, and destruction — so you never write raw pointer code for state management.

`AggregateState` trait

Any type that is Default + Send + 'static can be used as aggregate state by implementing the AggregateState marker trait:

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateState;

#[derive(Default, Debug)]
struct MyState {
    config: usize,    // set in update, must be propagated in combine
    total: i64,       // accumulated data
}

impl AggregateState for MyState {}
}

AggregateState has no required methods. The Default bound is used in state_init to create fresh states.

`FfiState<T>`

FfiState<T> is a #[repr(C)] struct containing a single raw pointer:

#![allow(unused)]
fn main() {
#[repr(C)]
pub struct FfiState<T> {
    inner: *mut T,
}
}

This matches DuckDB's expectation: DuckDB allocates state_size() bytes per group, and your state lives in a Box<T> heap allocation whose pointer is stored in that space.

Memory layout

DuckDB-allocated slot (state_size bytes = sizeof(*mut T)):
  [ inner: *mut T ]  ──→  Box<T>  (on the Rust heap)

Lifecycle callbacks

#![allow(unused)]
fn main() {
// state_size: DuckDB calls this once to know how many bytes to allocate per group
FfiState::<MyState>::size_callback(_info)
// Returns: size_of::<*mut MyState>()

// state_init: DuckDB calls this once per group after allocating the slot
FfiState::<MyState>::init_callback(info, state)
// Effect: writes Box::into_raw(Box::new(MyState::default())) into the slot

// state_destroy: DuckDB calls this after finalize for every group
FfiState::<MyState>::destroy_callback(states, count)
// Effect: for each state: drop(Box::from_raw(inner)); inner = null
}

Accessing state in callbacks

#![allow(unused)]
fn main() {
// Immutable access (in finalize, combine source):
if let Some(st) = FfiState::<MyState>::with_state(state_ptr) {
    let value = st.total;
}

// Mutable access (in update, combine target):
if let Some(st) = FfiState::<MyState>::with_state_mut(state_ptr) {
    st.total += delta;
}
}

Both methods return Option<&T> / Option<&mut T>. They return None if inner is null (which happens after destroy_callback or if initialization failed). Using Option rather than panicking on null is what keeps the extension panic-free.

The double-free problem — solved

Without quack-rs, a naive destructor looks like:

#![allow(unused)]
fn main() {
// ❌ Naive — causes double-free if DuckDB calls destroy twice
unsafe extern "C" fn destroy(states: *mut duckdb_aggregate_state, count: idx_t) {
    for i in 0..count as usize {
        let ffi = &mut *(*states.add(i) as *mut FfiState<MyState>);
        drop(Box::from_raw(ffi.inner));   // inner is now dangling — crash on second call
    }
}
}

FfiState::destroy_callback does:

#![allow(unused)]
fn main() {
// After drop(Box::from_raw(ffi.inner)):
ffi.inner = std::ptr::null_mut();   // ← prevents double-free
}

If DuckDB calls destroy again, with_state returns None and the loop body is a no-op.

Testing state logic without DuckDB

AggregateTestHarness<S> simulates the DuckDB aggregate lifecycle in pure Rust:

#![allow(unused)]
fn main() {
use quack_rs::testing::AggregateTestHarness;

#[test]
fn combine_propagates_config() {
    let mut source = AggregateTestHarness::<MyState>::new();
    source.update(|s| {
        s.config = 5;    // config field set during update
        s.total += 100;
    });

    let mut target = AggregateTestHarness::<MyState>::new();
    target.combine(&source, |src, tgt| {
        tgt.config = src.config;   // must propagate config — Pitfall L1
        tgt.total  += src.total;
    });

    let result = target.finalize();
    assert_eq!(result.config, 5, "config must be propagated in combine");
    assert_eq!(result.total, 100);
}
}

See the Testing Guide for the full test strategy.

Overloading with Function Sets

DuckDB supports multiple signatures for the same function name via function sets. This is how you implement variadic aggregates like retention(c1, c2, ..., c32).

Note: For scalar function overloads, see ScalarFunctionSetBuilder.

When to use function sets

Use AggregateFunctionSetBuilder when you need:

Multiple type signatures for the same function name (e.g., my_agg(INT) and my_agg(BIGINT))
Variadic arity under one name (e.g., retention(2 columns), retention(3 columns), ...)

For a single signature, use AggregateFunctionBuilder directly.

Registration

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionSetBuilder;
use quack_rs::types::TypeId;

unsafe fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        AggregateFunctionSetBuilder::new("retention")
            .returns(TypeId::Varchar)
            .overloads(2..=3, |n, builder| {
                // Each overload gets `n` BOOLEAN parameters
                let b = (0..n).fold(builder, |b, _| b.param(TypeId::Boolean));
                b.state_size(state_size)
                    .init(state_init)
                    .update(update)
                    .combine(combine)
                    .finalize(finalize)
                    .destructor(state_destroy)
            })
            .register(con)?;
    }
    Ok(())
}
}

The overloads method accepts a RangeInclusive<usize> and a closure that receives the arity n and a fresh OverloadBuilder. The builder sets the function name on each individual member internally.

The silent name bug — solved

Pitfall L6: When using a function set, the name must be set on each individual duckdb_aggregate_function via duckdb_aggregate_function_set_name, not just on the set. If any member lacks a name, it is silently not registered — no error is returned.

This is completely undocumented. It was discovered by reading DuckDB's C++ test code at test/api/capi/test_capi_aggregate_functions.cpp. In duckdb-behavioral, 6 of 7 functions failed to register silently due to this bug.

AggregateFunctionSetBuilder enforces that each member has its name set internally when the overloads closure builds each function.

See Pitfall L6.

Complex return types

If all overloads share a complex return type, use returns_logical on the set builder:

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionSetBuilder;
use quack_rs::types::{LogicalType, TypeId};

AggregateFunctionSetBuilder::new("retention")
    .returns_logical(LogicalType::list(TypeId::Boolean))  // LIST(BOOLEAN) for all overloads
    .overloads(2..=32, |n, builder| {
        (0..n).fold(builder, |b, _| b.param(TypeId::Boolean))
            .state_size(state_size)
            .init(state_init)
            .update(update)
            .combine(combine)
            .finalize(finalize)
            .destructor(destroy)
    })
    .register(con)?;
}

Individual overloads can also use param_logical for complex parameter types:

#![allow(unused)]
fn main() {
.overloads(2..=8, |n, builder| {
    builder
        .param(TypeId::Interval)
        .param_logical(LogicalType::list(TypeId::Timestamp)) // LIST(TIMESTAMP) parameter
        // ...
})
}

Why not varargs?

DuckDB's C API does not provide duckdb_aggregate_function_set_varargs. For true variadic aggregates, you must register N overloads — one for each supported arity. Function sets make this tractable.

Note: As of DuckDB 1.5.0, scalar functions now support varargs directly via ScalarFunctionBuilder::varargs() (requires the duckdb-1-5 feature). This limitation still applies to aggregate functions, which have no varargs counterpart in the C API.

ADR-002 in the architecture docs explains this design decision in detail.

Table Functions

Table functions implement the SELECT * FROM my_function(args) pattern — they return a result set rather than a scalar value. DuckDB table functions have three lifecycle callbacks: bind, init, and scan.

quack-rs provides two layers for registering table functions:

TypedTableFunctionBuilder<S> (recommended for new extensions) — closure-based API that hides bind/init/scan trampolines behind two safe Rust closures and carries a typed scan state from bind into scan for you.
TableFunctionBuilder — the underlying raw builder used by TypedTableFunctionBuilder internally. Reach for it when you need fine-grained control: local_init-driven parallel scans, projection pushdown with column filtering, or callback shapes that don't fit the "produce state in bind, mutate it in scan" model.

Both builders are backed by the helper types BindInfo, InitInfo, FunctionInfo, FfiBindData<T>, FfiInitData<T>, and FfiLocalInitData<T>.

Lifecycle

Phase	Callback	Called when	Typical work
bind	`bind_fn`	Query is planned	Extract parameters; register output columns; store config in bind data
init	`init_fn`	Execution starts	Allocate per-scan state (cursor, row index, etc.)
scan	`scan_fn`	Each output batch	Fill `duckdb_data_chunk` with rows; call `duckdb_data_chunk_set_size`

The scan callback is called repeatedly until it writes 0 rows in a batch, signalling end-of-results.

Closure-based typed state (`with_state`)

For the common "take parameters at bind, stream rows until exhausted" pattern, TypedTableFunctionBuilder<S> replaces all three callback trampolines with two closures:

#![allow(unused)]
fn main() {
use quack_rs::prelude::*;

struct State {
    remaining: u64,
}

fn register(reg: &impl Registrar) -> ExtResult<()> {
    let builder = TableFunctionBuilder::new("count_down")
        .param(TypeId::BigInt)
        // 1. bind closure: declare the output schema, read parameters,
        //    return the initial scan state.
        .with_state::<State, _>(|bind| {
            bind.add_result_column("n", TypeId::BigInt);
            let raw = unsafe { bind.get_parameter_value(0) };
            Ok(State { remaining: raw.as_i64_or(0).max(0) as u64 })
        })
        // 2. scan closure: mutate state, write rows, set chunk size.
        .scan(|state, chunk| {
            if state.remaining == 0 {
                unsafe { chunk.set_size(0) };
                return Ok(());
            }
            let mut writer = unsafe { chunk.writer(0) };
            unsafe { writer.write_i64(0, state.remaining as i64) };
            state.remaining -= 1;
            unsafe { chunk.set_size(1) };
            Ok(())
        })
        .build()?;
    unsafe { reg.register_table(builder) }
}
}

What you get for free

No hand-written unsafe extern "C" fn trampolines. TypedTableFunctionBuilder generates them internally.
Typed scan state. The bind closure returns S; the scan closure receives &mut S. State is moved from the bind phase into init data for you — no manual FfiBindData / FfiInitData shuffling.
Panic safety. User closures run inside catch_unwind. Panics surface as duckdb_bind/init/function_set_error, and the scan forces chunk size to zero so the query terminates cleanly instead of unwinding across the FFI boundary.
Error propagation. Return Err(ExtensionError::new("...")) from either closure to report a SQL error to DuckDB.

Trade-offs and threading

S must be Send + 'static. Sync is not required, so TypedTableFunctionBuilder forces scans to run on a single worker by calling InitInfo::set_max_threads(1) internally.
Extensions that need multi-worker parallelism (local_init + thread-local buffers) should use the raw TableFunctionBuilder directly.
TypedTableFunctionBuilder::build() returns a fully configured TableFunctionBuilder, so you can still pass it through any Registrar — including MockRegistrar for unit tests.

Builder API

#![allow(unused)]
fn main() {
use quack_rs::table::{TableFunctionBuilder, BindInfo, FfiBindData, FfiInitData};
use quack_rs::types::TypeId;

TableFunctionBuilder::new("my_function")
    .param(TypeId::BigInt)                 // positional parameter types
    .bind(my_bind_callback)               // declare output columns inside bind
    .init(my_init_callback)
    .scan(my_scan_callback)
    .register(con)?;
}

Output columns are declared inside the bind callback using BindInfo::add_result_column, not on the builder itself.

State management

Bind data

Bind data persists from the bind phase through all scan batches. Use FfiBindData<T> to allocate it safely:

#![allow(unused)]
fn main() {
struct MyBindData {
    limit: i64,
}

unsafe extern "C" fn my_bind(info: duckdb_bind_info) {
    let n = unsafe { duckdb_get_int64(duckdb_bind_get_parameter(info, 0)) };
    unsafe { FfiBindData::<MyBindData>::set(info, MyBindData { limit: n }) };
}
}

FfiBindData::set stores the value and registers a destructor so DuckDB frees it at the right time — no Box::into_raw / Box::from_raw needed.

Init (scan) state

Per-scan state (e.g., a current row index) uses FfiInitData<T>:

#![allow(unused)]
fn main() {
struct MyScanState {
    pos: i64,
}

unsafe extern "C" fn my_init(info: duckdb_init_info) {
    unsafe { FfiInitData::<MyScanState>::set(info, MyScanState { pos: 0 }) };
}
}

Complete example: `generate_series_ext`

The hello-ext example registers generate_series_ext(n BIGINT) which emits integers 0 .. n-1. See examples/hello-ext/src/lib.rs for the full source.

#![allow(unused)]
fn main() {
// Bind: extract `n`, register one output column
unsafe extern "C" fn gs_bind(info: duckdb_bind_info) {
    let bind_info = unsafe { BindInfo::new(info) };
    // Value is RAII — automatically destroyed when dropped
    let n = unsafe { bind_info.get_parameter_value(0) }.as_i64();

    bind_info.add_result_column("value", TypeId::BigInt);
    unsafe { FfiBindData::<GsBindData>::set(info, GsBindData { total: n }) };
}

// Init: zero-initialise the scan cursor
unsafe extern "C" fn gs_init(info: duckdb_init_info) {
    unsafe { FfiInitData::<GsScanState>::set(info, GsScanState { pos: 0 }) };
}

// Scan: emit a batch of rows using DataChunk wrapper
unsafe extern "C" fn gs_scan(info: duckdb_function_info, output: duckdb_data_chunk) {
    let bind = unsafe { FfiBindData::<GsBindData>::get_from_function(info) }.unwrap();
    let state = unsafe { FfiInitData::<GsScanState>::get_mut(info) }.unwrap();

    let remaining = bind.total - state.pos;
    let batch = remaining.min(2048).max(0) as usize;

    let chunk = unsafe { DataChunk::from_raw(output) };
    let mut writer = unsafe { chunk.writer(0) };
    for i in 0..batch {
        unsafe { writer.write_i64(i, state.pos + i as i64) };
    }
    unsafe { chunk.set_size(batch) };
    state.pos += batch as i64;
}
}

Registration

#![allow(unused)]
fn main() {
TableFunctionBuilder::new("generate_series_ext")
    .param(TypeId::BigInt)
    .bind(gs_bind)
    .init(gs_init)
    .scan(gs_scan)
    .register(con)?;
}

Advanced features

Named parameters

Named parameters let callers pass optional arguments by name (e.g., step := 10):

#![allow(unused)]
fn main() {
TableFunctionBuilder::new("gen_series_v2")
    .param(TypeId::BigInt)                    // positional: n
    .named_param("step", TypeId::BigInt)      // named: step := <value>
    .bind(gs_v2_bind)
    .init(gs_v2_init)
    .scan(gs_v2_scan)
    .register(con)?;
}

In the bind callback, read the named parameter with duckdb_bind_get_named_parameter(info, c"step".as_ptr()).

Local init (per-thread state)

For multi-threaded table functions, use local_init to allocate per-thread state:

#![allow(unused)]
fn main() {
TableFunctionBuilder::new("gen_series_v2")
    .param(TypeId::BigInt)
    .bind(gs_v2_bind)
    .init(gs_v2_init)
    .local_init(gs_v2_local_init)            // per-thread state allocation
    .scan(gs_v2_scan)
    .register(con)?;
}

The local init callback receives duckdb_init_info and can use FfiLocalInitData<T>::set to store per-thread state.

Thread control

Use InitInfo::set_max_threads in the global init callback to tell DuckDB how many threads can scan concurrently:

#![allow(unused)]
fn main() {
unsafe extern "C" fn gs_v2_init(info: duckdb_init_info) {
    let init_info = unsafe { InitInfo::new(info) };
    unsafe { init_info.set_max_threads(1) };
    unsafe { FfiInitData::<MyState>::set(info, MyState { pos: 0 }) };
}
}

Projection pushdown

Enable projection pushdown to let DuckDB skip unrequested columns:

#![allow(unused)]
fn main() {
TableFunctionBuilder::new("my_func")
    .projection_pushdown(true)
    // ...
}

Caution: When projection pushdown is enabled, your scan callback must check which columns DuckDB actually needs using InitInfo::projected_column_count and InitInfo::projected_column_index. Writing to non-projected columns causes crashes.

See examples/hello-ext/src/lib.rs for a complete example using named_param, local_init, and set_max_threads.

Complex parameter types

For parameterised types that TypeId cannot express (e.g. LIST(BIGINT), MAP(VARCHAR, INTEGER), STRUCT(...)), use param_logical and named_param_logical:

#![allow(unused)]
fn main() {
use quack_rs::types::LogicalType;

TableFunctionBuilder::new("read_data")
    .param_logical(LogicalType::list(TypeId::Varchar))        // positional LIST param
    .named_param_logical("options", LogicalType::map(          // named MAP param
        TypeId::Varchar, TypeId::Varchar,
    ))
    .bind(bind_fn)
    .init(init_fn)
    .scan(scan_fn)
    .register(con)?;
}

BindInfo helpers

BindInfo wraps duckdb_bind_info and exposes these methods:

Method	Description
`add_result_column(name, TypeId)`	Declares an output column
`add_result_column_with_type(name, &LogicalType)`	Output column with complex type
`set_cardinality(rows, is_exact)`	Cardinality hint for the optimizer
`set_error(message)`	Report a bind-time error
`parameter_count()`	Number of positional parameters
`get_parameter(index)`	Returns a positional parameter value (`duckdb_value`)
`get_named_parameter(name)`	Returns a named parameter value (`duckdb_value`)
`get_extra_info()`	Returns the extra-info pointer set on the function
`get_client_context()`	Returns a `ClientContext` (requires `duckdb-1-5` feature)

InitInfo helpers

InitInfo wraps duckdb_init_info:

Method	Description
`projected_column_count()`	Number of projected columns (with pushdown)
`projected_column_index(idx)`	Output column index at projection position
`set_max_threads(n)`	Maximum parallel scan threads
`set_error(message)`	Report an init-time error
`get_extra_info()`	Returns the extra-info pointer set on the function

FunctionInfo helpers

FunctionInfo wraps duckdb_function_info (scan callbacks):

Method	Description
`set_error(message)`	Report a scan-time error
`get_extra_info()`	Returns the extra-info pointer set on the function

Extra info

Use TableFunctionBuilder::extra_info to attach function-level data that is accessible from all callbacks (bind, init, and scan) via get_extra_info().

Verified output (DuckDB 1.4.4 and 1.5.0)

SELECT * FROM generate_series_ext(5);
-- 0
-- 1
-- 2
-- 3
-- 4

SELECT value * value AS sq FROM generate_series_ext(4);
-- 0
-- 1
-- 4
-- 9

Replacement Scans

A replacement scan lets users write:

SELECT * FROM 'myfile.myformat'

and have DuckDB automatically invoke your extension's table-valued scan instead of trying to open the path as a built-in file type. This is how DuckDB's built-in CSV, Parquet, and JSON readers work.

quack-rs provides ReplacementScanBuilder (a static registration helper) and ReplacementScanInfo (an ergonomic wrapper for callbacks).

Registration API

Unlike the other builders in quack-rs, ReplacementScanBuilder uses a single static call because the DuckDB C API takes all arguments at once:

#![allow(unused)]
fn main() {
use quack_rs::replacement_scan::ReplacementScanBuilder;

// Low-level: pass raw extra_data and an optional delete callback.
unsafe {
    ReplacementScanBuilder::register(
        db,                            // duckdb_database
        my_scan_callback,              // ReplacementScanFn
        std::ptr::null_mut(),          // extra_data (or a raw pointer)
        None,                          // delete_callback
    );
}

// Ergonomic: pass owned Rust data; boxing and destructor are handled for you.
unsafe {
    ReplacementScanBuilder::register_with_data(db, my_scan_callback, my_state);
}
}

Note: Replacement scans are registered on a database handle (duckdb_database), not a connection. Register them before opening connections.

Callback signature

The raw callback receives duckdb_replacement_scan_info, but you can wrap it with ReplacementScanInfo for ergonomic, safe access:

#![allow(unused)]
fn main() {
use quack_rs::replacement_scan::ReplacementScanInfo;

unsafe extern "C" fn my_scan_callback(
    info: duckdb_replacement_scan_info,
    table_name: *const ::std::os::raw::c_char,
    _data: *mut ::std::os::raw::c_void,
) {
    let path = unsafe { std::ffi::CStr::from_ptr(table_name) }
        .to_str()
        .unwrap_or("");

    if !path.ends_with(".myformat") {
        return; // pass — DuckDB will try other handlers
    }

    // Use ReplacementScanInfo for ergonomic access
    unsafe {
        ReplacementScanInfo::new(info)
            .set_function("read_myformat")
            .add_varchar_parameter(path);
    }
}
}

`ReplacementScanInfo` methods

Method	Description
`set_function(name)`	Redirect to the named table function
`add_varchar_parameter(value)`	Add a VARCHAR parameter to the redirected call
`add_i64_parameter(value)`	Add a BIGINT (i64) parameter (v0.11.0+)
`add_bool_parameter(value)`	Add a BOOLEAN parameter (v0.11.0+)
`add_parameter_raw(duckdb_value)`	Add any typed `duckdb_value` parameter (v0.11.0+)
`set_error(message)`	Report an error (aborts this replacement scan)

When to use replacement scans vs table functions

Scenario	Use
`SELECT * FROM my_function('file.ext')`	Table function
`SELECT * FROM 'file.ext'` (bare path)	Replacement scan → delegates to a table function
File type auto-detection	Replacement scan

Most extensions implement both: a table function that does the actual work, and a replacement scan that detects the file extension and transparently routes bare-path queries to the table function.

Cast Functions

Cast functions let your extension define how DuckDB converts values from one type to another. Once registered, both explicit CAST(x AS T) syntax and (optionally) implicit coercions will use your callback.

When to use cast functions

Your extension introduces a new logical type and needs CAST to/from standard types.
You want to override DuckDB's built-in cast behaviour for a specific type pair.
You need to control implicit cast priority relative to other registered casts.

Registering a cast

#![allow(unused)]
fn main() {
use quack_rs::cast::{CastFunctionBuilder, CastFunctionInfo, CastMode};
use quack_rs::types::TypeId;
use quack_rs::vector::{VectorReader, VectorWriter};
use libduckdb_sys::{duckdb_function_info, duckdb_vector, idx_t};

unsafe extern "C" fn varchar_to_int(
    info: duckdb_function_info,
    count: idx_t,
    input: duckdb_vector,
    output: duckdb_vector,
) -> bool {
    let cast_info = unsafe { CastFunctionInfo::new(info) };
    let reader = unsafe { VectorReader::from_vector(input, count as usize) };
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..count as usize {
        if !unsafe { reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let s = unsafe { reader.read_str(row) };
        match s.parse::<i32>() {
            Ok(v) => unsafe { writer.write_i32(row, v) },
            Err(e) => {
                let msg = format!("cannot cast {:?} to INTEGER: {e}", s);
                if cast_info.cast_mode() == CastMode::Try {
                    // TRY_CAST: write NULL and record a per-row error
                    unsafe { cast_info.set_row_error(&msg, row as idx_t, output) };
                    unsafe { writer.set_null(row) };
                } else {
                    // Regular CAST: abort the whole query
                    unsafe { cast_info.set_error(&msg) };
                    return false;
                }
            }
        }
    }
    true
}

fn register(con: libduckdb_sys::duckdb_connection)
    -> Result<(), quack_rs::error::ExtensionError>
{
    unsafe {
        CastFunctionBuilder::new(TypeId::Varchar, TypeId::Integer)
            .function(varchar_to_int)
            .register(con)
    }
}
}

Implicit casts

Provide an implicit_cost to allow DuckDB to use the cast automatically in expressions where the types do not match:

#![allow(unused)]
fn main() {
use quack_rs::cast::CastFunctionBuilder;
use quack_rs::types::TypeId;
use libduckdb_sys::{duckdb_function_info, duckdb_vector, idx_t};
unsafe extern "C" fn my_cast(_: duckdb_function_info, _: idx_t, _: duckdb_vector, _: duckdb_vector) -> bool { true }
fn register(con: libduckdb_sys::duckdb_connection) -> Result<(), quack_rs::error::ExtensionError> {
unsafe {
    CastFunctionBuilder::new(TypeId::Varchar, TypeId::Integer)
        .function(my_cast)
        .implicit_cost(100) // lower = higher priority
        .register(con)
}
}
}

Extra info

Attach arbitrary data to a cast function using extra_info. This is useful for parameterising the cast behaviour (e.g., a rounding mode):

#![allow(unused)]
fn main() {
use quack_rs::cast::CastFunctionBuilder;
use quack_rs::types::TypeId;
use libduckdb_sys::{duckdb_function_info, duckdb_vector, idx_t};
use std::os::raw::c_void;
unsafe extern "C" fn my_cast(_: duckdb_function_info, _: idx_t, _: duckdb_vector, _: duckdb_vector) -> bool { true }
unsafe extern "C" fn my_destroy(_: *mut c_void) {}
fn register(con: libduckdb_sys::duckdb_connection) -> Result<(), quack_rs::error::ExtensionError> {
let mode = Box::into_raw(Box::new("round".to_string())).cast::<c_void>();
unsafe {
    CastFunctionBuilder::new(TypeId::Double, TypeId::BigInt)
        .function(my_cast)
        .implicit_cost(100)
        .extra_info(mode, Some(my_destroy))
        .register(con)
}
}
}

Inside the cast callback, retrieve the extra info with CastFunctionInfo::get_extra_info().

TRY_CAST vs CAST

Inside your callback, check [CastFunctionInfo::cast_mode()] to distinguish between the two modes:

Mode	User wrote	Expected behaviour on error
`CastMode::Normal`	`CAST(x AS T)`	Call `set_error` and return `false`
`CastMode::Try`	`TRY_CAST(x AS T)`	Call `set_row_error`, write `NULL`, continue

Working example

The examples/hello-ext extension registers two cast functions:

CAST(VARCHAR AS INTEGER) / TRY_CAST(VARCHAR AS INTEGER) — basic cast
CAST(DOUBLE AS BIGINT) — with implicit_cost(100) and extra_info for rounding mode

See examples/hello-ext/src/lib.rs for complete, copy-paste-ready references.

Complex source and target types

For casts involving complex types like DECIMAL(18, 3) or LIST(VARCHAR), use the new_logical constructor instead of new:

#![allow(unused)]
fn main() {
use quack_rs::cast::CastFunctionBuilder;
use quack_rs::types::{LogicalType, TypeId};
use libduckdb_sys::{duckdb_function_info, duckdb_vector, idx_t};
unsafe extern "C" fn my_cast(_: duckdb_function_info, _: idx_t, _: duckdb_vector, _: duckdb_vector) -> bool { true }
fn register(con: libduckdb_sys::duckdb_connection) -> Result<(), quack_rs::error::ExtensionError> {
unsafe {
    CastFunctionBuilder::new_logical(
        LogicalType::list(TypeId::Varchar),   // LIST(VARCHAR) source
        LogicalType::list(TypeId::Integer),   // LIST(INTEGER) target
    )
    .function(my_cast)
    .register(con)
}
}
}

The source() and target() accessor methods return Option<TypeId> — they return None when the type was set via new_logical (since a LogicalType cannot always be expressed as a simple TypeId).

API reference

[CastFunctionBuilder][quack_rs::cast::CastFunctionBuilder] — the main builder
[CastFunctionInfo][quack_rs::cast::CastFunctionInfo] — info handle inside callbacks
[CastMode][quack_rs::cast::CastMode] — Normal vs Try cast mode

NULL Handling

By default, DuckDB automatically propagates NULLs: if any argument to a function is NULL, the result is NULL without your function callback being called. This matches the SQL standard and works well for most functions.

However, some functions need to handle NULLs explicitly. For example:

COALESCE — returns the first non-NULL argument
IS_NULL / IS_NOT_NULL — tests whether the value is NULL
Custom aggregates that need to count NULLs

`NullHandling` enum

#![allow(unused)]
fn main() {
use quack_rs::types::NullHandling;

// Default: DuckDB auto-returns NULL for any NULL input
NullHandling::DefaultNullHandling

// Special: DuckDB passes NULLs to your callback
NullHandling::SpecialNullHandling
}

Scalar functions

#![allow(unused)]
fn main() {
use quack_rs::scalar::ScalarFunctionBuilder;
use quack_rs::types::{TypeId, NullHandling};

ScalarFunctionBuilder::new("my_coalesce")
    .param(TypeId::BigInt)
    .param(TypeId::BigInt)
    .returns(TypeId::BigInt)
    .null_handling(NullHandling::SpecialNullHandling)
    .function(my_coalesce_fn)
    .register(con)?;
}

With SpecialNullHandling, your callback must check VectorReader::is_valid(row) for each input column and handle NULLs yourself.

Aggregate functions

#![allow(unused)]
fn main() {
use quack_rs::aggregate::AggregateFunctionBuilder;
use quack_rs::types::{TypeId, NullHandling};

AggregateFunctionBuilder::new("count_with_nulls")
    .param(TypeId::BigInt)
    .returns(TypeId::BigInt)
    .null_handling(NullHandling::SpecialNullHandling)
    .state_size(my_state_size)
    .init(my_init)
    .update(my_update)   // will be called even for NULL rows
    .combine(my_combine)
    .finalize(my_finalize)
    .register(con)?;
}

When to use special NULL handling

Use case	NULL handling
Most scalar/aggregate functions	`DefaultNullHandling` (the default)
Functions that need to see NULLs	`SpecialNullHandling`
`COALESCE`-like functions	`SpecialNullHandling`
NULL-counting aggregates	`SpecialNullHandling`

If you don't call .null_handling(), the default (DefaultNullHandling) is used automatically.

SQL Macros

SQL macros let you package reusable SQL expressions and queries as named DuckDB functions — no FFI callbacks required. quack-rs makes this pure Rust: you define the macro body as a string and call .register(con).

Two macro types

Type	SQL generated	Returns
Scalar	`CREATE OR REPLACE MACRO name(params) AS (expression)`	one value per row
Table	`CREATE OR REPLACE MACRO name(params) AS TABLE query`	a result set

Scalar macros

A scalar macro wraps a SQL expression. Think of it as a parameterized SQL alias:

#![allow(unused)]
fn main() {
use quack_rs::sql_macro::SqlMacro;

fn register(con: duckdb_connection) -> Result<(), ExtensionError> {
    unsafe {
        // clamp(x, lo, hi) → greatest(lo, least(hi, x))
        SqlMacro::scalar("clamp", &["x", "lo", "hi"], "greatest(lo, least(hi, x))")?
            .register(con)?;

        // pi() → 3.14159265358979
        SqlMacro::scalar("pi", &[], "3.14159265358979")?
            .register(con)?;

        // safe_div(a, b) → CASE WHEN b = 0 THEN NULL ELSE a / b END
        SqlMacro::scalar(
            "safe_div",
            &["a", "b"],
            "CASE WHEN b = 0 THEN NULL ELSE a / b END",
        )?
        .register(con)?;
    }
    Ok(())
}
}

Use in DuckDB:

SELECT clamp(rating, 1, 5) FROM reviews;
SELECT safe_div(revenue, orders) FROM monthly_stats;

Table macros

A table macro wraps a SQL query that returns rows:

#![allow(unused)]
fn main() {
unsafe {
    // active_users(tbl) → SELECT * FROM tbl WHERE active = true
    SqlMacro::table(
        "active_users",
        &["tbl"],
        "SELECT * FROM tbl WHERE active = true",
    )?
    .register(con)?;

    // recent_orders(days) → last N days of orders
    SqlMacro::table(
        "recent_orders",
        &["days"],
        "SELECT * FROM orders WHERE order_date >= current_date - INTERVAL (days) DAY",
    )?
    .register(con)?;
}
}

Use in DuckDB:

SELECT * FROM active_users(users);
SELECT count(*) FROM recent_orders(7);

Inspecting the generated SQL

to_sql() returns the CREATE OR REPLACE MACRO statement without requiring a live connection. Use it for logging, debugging, or assertions in tests:

#![allow(unused)]
fn main() {
let m = SqlMacro::scalar("add", &["a", "b"], "a + b")?;
assert_eq!(
    m.to_sql(),
    "CREATE OR REPLACE MACRO add(a, b) AS (a + b)"
);

let t = SqlMacro::table("active_users", &["tbl"], "SELECT * FROM tbl WHERE active = true")?;
assert_eq!(
    t.to_sql(),
    "CREATE OR REPLACE MACRO active_users(tbl) AS TABLE SELECT * FROM tbl WHERE active = true"
);
}

Name and parameter validation

Macro names and parameter names are validated against the same rules as function names:

Must match [a-z_][a-z0-9_]*
Not exceed 256 characters
No null bytes

#![allow(unused)]
fn main() {
SqlMacro::scalar("MyMacro", &[], "1")   // ❌ Err — uppercase
SqlMacro::scalar("my-macro", &[], "1") // ❌ Err — hyphen
SqlMacro::scalar("f", &["X"], "1")     // ❌ Err — uppercase param
SqlMacro::scalar("f", &["_x"], "1")    // ✅ Ok  — underscore prefix allowed
}

SQL injection safety

Macro and parameter names are restricted to [a-z_][a-z0-9_]*, preventing SQL injection at the identifier level. They are interpolated literally (no quoting required, since the character set is already safe).

The body (expression or query) is your own extension code — it is included verbatim. Never build macro bodies from untrusted user input.

How it works under the hood

SqlMacro::register executes the CREATE OR REPLACE MACRO statement via duckdb_query:

#![allow(unused)]
fn main() {
pub unsafe fn register(self, con: duckdb_connection) -> Result<(), ExtensionError> {
    let sql = self.to_sql();
    unsafe { execute_sql(con, &sql) }
}
}

execute_sql zero-initializes a duckdb_result, calls duckdb_query, extracts any error message via duckdb_result_error, and always calls duckdb_destroy_result — even on failure.

Choosing between macros and scalar functions

Scenario	Use
Logic expressible in SQL	SQL macro — simpler, no FFI
Logic needs Rust code (algorithms, external crates, etc.)	Scalar function
Best performance for simple expressions	SQL macro (no FFI overhead)
Type-specific overloads	Scalar function with multiple registrations
Returning a table	SQL table macro

Copy Functions

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

Copy functions let you implement custom COPY TO file format handlers. When a user runs COPY table TO 'file.xyz' (FORMAT my_format), DuckDB invokes your extension's bind, init, sink, and finalize callbacks.

Lifecycle

Bind — called once. Inspect output columns, configure the export.
Global init — called once. Open the output file, allocate global state.
Sink — called once per data chunk. Write rows to the output.
Finalize — called once. Flush buffers, close the file.

Builder API

#![allow(unused)]
fn main() {
use quack_rs::copy_function::CopyFunctionBuilder;

let builder = CopyFunctionBuilder::try_new("my_format")?
    .bind(my_bind_fn)
    .global_init(my_global_init_fn)
    .sink(my_sink_fn)
    .finalize(my_finalize_fn);

// Register on a connection (inside entry_point_v2! callback):
// unsafe { builder.register(con)?; }
Ok::<(), quack_rs::error::ExtensionError>(())
}

Callback signatures

Phase	Signature
Bind	`unsafe extern "C" fn(info: duckdb_copy_function_bind_info)`
Global init	`unsafe extern "C" fn(info: duckdb_copy_function_global_init_info)`
Sink	`unsafe extern "C" fn(info: duckdb_copy_function_sink_info, chunk: duckdb_data_chunk)`
Finalize	`unsafe extern "C" fn(info: duckdb_copy_function_finalize_info)`

Callback info wrappers

Each phase provides an ergonomic wrapper type around its raw info handle. Wrap the handle at the top of your callback to access helper methods:

`CopyBindInfo`

Method	Description
`column_count()`	Number of output columns
`column_type(index)`	`LogicalType` of the column at `index`
`get_extra_info()`	Extra-info pointer set on the copy function
`set_bind_data(data, destroy)`	Store bind data and its destructor
`set_error(message)`	Report a bind-time error
`get_client_context()`	Returns a `ClientContext` for catalog/config access

`CopyGlobalInitInfo`

Method	Description
`get_bind_data()`	Retrieve the bind data pointer
`get_extra_info()`	Extra-info pointer set on the copy function
`get_file_path()`	Output file path for the COPY operation
`set_global_state(state, destroy)`	Store global state and its destructor
`set_error(message)`	Report an init-time error
`get_client_context()`	Returns a `ClientContext`

`CopySinkInfo`

Method	Description
`get_bind_data()`	Retrieve the bind data pointer
`get_extra_info()`	Extra-info pointer set on the copy function
`get_global_state()`	Retrieve the global state pointer
`set_error(message)`	Report a sink-time error
`get_client_context()`	Returns a `ClientContext`

`CopyFinalizeInfo`

Method	Description
`get_bind_data()`	Retrieve the bind data pointer
`get_extra_info()`	Extra-info pointer set on the copy function
`get_global_state()`	Retrieve the global state pointer
`set_error(message)`	Report a finalize-time error
`get_client_context()`	Returns a `ClientContext`

All four wrappers are re-exported from quack_rs::copy_function:

#![allow(unused)]
fn main() {
use quack_rs::copy_function::{CopyBindInfo, CopyGlobalInitInfo, CopySinkInfo, CopyFinalizeInfo};
}

config_option — register custom settings for your format
client_context — access the file system and catalog from callbacks
table_description — inspect table metadata
catalog — look up catalog entries

Reading & Writing Vectors

DuckDB passes data to and from your extension as vectors — columnar arrays of typed values, with a separate NULL bitmap. VectorReader and VectorWriter provide safe, typed access to these vectors.

`VectorReader`

Construction

#![allow(unused)]
fn main() {
// In a scalar function callback:
let reader = unsafe { VectorReader::new(input, column_index) };

// In an aggregate update callback:
let reader = unsafe { VectorReader::new(input, 0) };   // first column
}

VectorReader::new takes the duckdb_data_chunk and a zero-based column index. The reader borrows the chunk — it must not outlive the callback.

Row count

#![allow(unused)]
fn main() {
let n = reader.row_count();   // number of rows in this chunk
}

Chunk sizes vary. Always loop from 0..reader.row_count(), never assume a fixed size.

NULL check

#![allow(unused)]
fn main() {
if unsafe { !reader.is_valid(row) } {
    // row is NULL — skip or propagate NULL to output
    unsafe { writer.set_null(row) };
    continue;
}
}

Always check is_valid before reading. Reading from a NULL row returns garbage data.

Reading values

#![allow(unused)]
fn main() {
let i: i8  = unsafe { reader.read_i8(row) };
let i: i16 = unsafe { reader.read_i16(row) };
let i: i32 = unsafe { reader.read_i32(row) };
let i: i64 = unsafe { reader.read_i64(row) };
let u: u8  = unsafe { reader.read_u8(row) };
let u: u16 = unsafe { reader.read_u16(row) };
let u: u32 = unsafe { reader.read_u32(row) };
let u: u64 = unsafe { reader.read_u64(row) };
let f: f32 = unsafe { reader.read_f32(row) };
let f: f64 = unsafe { reader.read_f64(row) };
let b: bool = unsafe { reader.read_bool(row) };   // safe: uses u8 != 0
let s: &str = unsafe { reader.read_str(row) };    // handles inline + pointer format
let iv = unsafe { reader.read_interval(row) };    // returns DuckInterval

// Temporal and binary types (v0.10.0+):
let d: i32 = unsafe { reader.read_date(row) };      // days since epoch
let ts: i64 = unsafe { reader.read_timestamp(row) }; // microseconds since epoch
let t: i64 = unsafe { reader.read_time(row) };       // microseconds since midnight
let blob: &[u8] = unsafe { reader.read_blob(row) };  // binary data
let uuid: i128 = unsafe { reader.read_uuid(row) };   // UUID as i128
}

`VectorWriter`

Construction

#![allow(unused)]
fn main() {
// In a scalar function callback:
let mut writer = unsafe { VectorWriter::new(output) };

// In an aggregate finalize callback:
let mut writer = unsafe { VectorWriter::new(result) };
}

Writing values

#![allow(unused)]
fn main() {
unsafe { writer.write_i8(row, value) };
unsafe { writer.write_i16(row, value) };
unsafe { writer.write_i32(row, value) };
unsafe { writer.write_i64(row, value) };
unsafe { writer.write_u8(row, value) };
unsafe { writer.write_u16(row, value) };
unsafe { writer.write_u32(row, value) };
unsafe { writer.write_u64(row, value) };
unsafe { writer.write_f32(row, value) };
unsafe { writer.write_f64(row, value) };
unsafe { writer.write_bool(row, value) };
unsafe { writer.write_varchar(row, s) };   // &str (also available as write_str)
unsafe { writer.write_str(row, s) };       // alias for write_varchar
unsafe { writer.write_interval(row, interval) };  // DuckInterval

// Temporal and binary types (v0.10.0+):
unsafe { writer.write_date(row, days_since_epoch) };
unsafe { writer.write_timestamp(row, micros_since_epoch) };
unsafe { writer.write_time(row, micros_since_midnight) };
unsafe { writer.write_blob(row, &bytes) };
unsafe { writer.write_uuid(row, uuid_i128) };
}

Writing NULL

#![allow(unused)]
fn main() {
unsafe { writer.set_null(row) };
}

Pitfall L4: set_null calls duckdb_vector_ensure_validity_writable automatically before accessing the validity bitmap. Calling duckdb_vector_get_validity without this prerequisite returns an uninitialized pointer → SEGFAULT. VectorWriter::set_null handles this correctly. See Pitfall L4.

Clearing NULL (v0.11.0+)

To undo a previous set_null call and mark a row as valid again:

#![allow(unused)]
fn main() {
unsafe { writer.set_valid(row) };
}

set_valid also calls ensure_validity_writable automatically.

`DataChunk`

DataChunk wraps a duckdb_data_chunk handle, providing ergonomic access to vectors and metadata without raw FFI calls:

#![allow(unused)]
fn main() {
use quack_rs::data_chunk::DataChunk;

unsafe extern "C" fn my_scan(info: duckdb_function_info, output: duckdb_data_chunk) {
    let chunk = unsafe { DataChunk::from_raw(output) };
    let mut writer = unsafe { chunk.writer(0) };    // VectorWriter for column 0
    unsafe { writer.write_i64(0, 42) };
    unsafe { chunk.set_size(1) };                   // set output row count
}
}

Methods:

size() — current row count
set_size(n) — set row count (0 = end of stream)
column_count() — number of columns
vector(col) — raw duckdb_vector handle
writer(col) — VectorWriter for a column
reader(col) — VectorReader for a column
struct_writer(col, field_count) — StructWriter for a STRUCT output column
struct_reader(col, field_count) — StructReader for a STRUCT input column
struct_field_reader(col, field) — VectorReader for a specific STRUCT field
into_chunk_writer() — convert to ChunkWriter with auto set_size on drop

`StructWriter` / `StructReader`

For STRUCT columns with many fields, creating individual VectorWriter/VectorReader instances for each field is verbose. StructWriter and StructReader pre-create all field writers/readers at construction:

#![allow(unused)]
fn main() {
// Writing a 5-field STRUCT output:
let mut sw = unsafe { chunk.struct_writer(0, 5) };
unsafe {
    sw.write_bool(row, 0, result.success);
    sw.write_varchar(row, 1, &result.data);
    sw.write_i64(row, 2, result.count);
    sw.write_date(row, 3, result.day);
    sw.write_blob(row, 4, &result.payload);
}

// Reading a 3-field STRUCT input:
let sr = unsafe { chunk.struct_reader(0, 3) };
for row in 0..chunk.size() {
    let name = unsafe { sr.read_str(row, 0) };
    let age = unsafe { sr.read_i32(row, 1) };
    let active = unsafe { sr.read_bool(row, 2) };
}
}

`ChunkWriter`

ChunkWriter wraps an output duckdb_data_chunk and tracks rows. It automatically calls set_size on drop, preventing the common off-by-one bug:

#![allow(unused)]
fn main() {
let mut cw = unsafe { DataChunk::from_raw(output).into_chunk_writer() };
while let Some(row) = cw.next_row() {
    unsafe { cw.writer(0).write_varchar(row, &data[row].name) };
    unsafe { cw.writer(1).write_i64(row, data[row].value) };
    if cw.is_full() { break; }
}
// set_size called automatically when `cw` is dropped
}

`ValidityBitmap`

For advanced NULL handling beyond VectorWriter::set_null, use ValidityBitmap directly:

#![allow(unused)]
fn main() {
use quack_rs::vector::ValidityBitmap;

// Writing NULLs:
let mut bitmap = unsafe { ValidityBitmap::ensure_writable(some_vector) };
unsafe { bitmap.set_row_invalid(row as u64) };   // mark as NULL
unsafe { bitmap.set_row_valid(row as u64) };     // mark as non-NULL

// Reading NULLs:
let bitmap = unsafe { ValidityBitmap::get_read_only(some_vector) };
let is_valid = unsafe { bitmap.row_is_valid(row as u64) };
}

ValidityBitmap is available in the prelude: use quack_rs::prelude::*.

Utility functions

The quack_rs::vector module provides two utility functions:

#![allow(unused)]
fn main() {
use quack_rs::vector::{vector_size, vector_get_column_type};

// Returns the default vector size used by DuckDB (typically 2048).
let size: u64 = vector_size();

// Returns the LogicalType of a vector (unsafe — requires a valid duckdb_vector).
let lt = unsafe { vector_get_column_type(some_vector) };
}

Memory layout details

DuckDB stores vector data as flat arrays. VectorReader and VectorWriter compute element addresses as base_ptr + row * stride:

[value0][value1][value2]...[valueN]   ← typed array
[validity bitmap]                      ← separate bit array, 1 bit per row

The validity bitmap is lazily allocated — it may be null if no NULLs have been written. This is why ensure_validity_writable must be called before any get_validity call that follows a write path.

Complete scalar function pattern

#![allow(unused)]
fn main() {
unsafe extern "C" fn my_scalar(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..reader.row_count() {
        if unsafe { !reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let value = unsafe { reader.read_i64(row) };
        unsafe { writer.write_i64(row, transform(value)) };
    }
}
}

Values & Parameter Extraction

When a table function receives bind-time parameters, DuckDB passes them as duckdb_value handles. These handles are heap-allocated and must be destroyed after use. The Value wrapper handles this automatically via RAII.

The problem

Without Value, every parameter extraction requires three raw FFI calls and careful manual cleanup:

#![allow(unused)]
fn main() {
// Before: raw FFI — easy to leak memory
let param = duckdb_bind_get_parameter(info, 0);
let n = duckdb_get_int64(param);
duckdb_destroy_value(&mut { param });  // forget this → memory leak
}

The solution: `Value`

Value wraps a duckdb_value handle and calls duckdb_destroy_value on drop:

#![allow(unused)]
fn main() {
use quack_rs::table::BindInfo;

unsafe extern "C" fn my_bind(info: duckdb_bind_info) {
    let bind_info = unsafe { BindInfo::new(info) };

    // Value is RAII — automatically destroyed when dropped
    let n = unsafe { bind_info.get_parameter_value(0) }.as_i64();

    // Named parameters work the same way
    let path = unsafe { bind_info.get_named_parameter_value("path") }
        .as_str()
        .unwrap_or_default();
}
}

Typed extraction methods

Method	DuckDB type	Rust type
`as_str()`	VARCHAR	`Result<String, ExtensionError>`
`as_i8()`	TINYINT	`i8`
`as_i16()`	SMALLINT	`i16`
`as_i32()`	INTEGER	`i32`
`as_i64()`	BIGINT	`i64`
`as_i128()`	HUGEINT	`i128`
`as_u8()`	UTINYINT	`u8`
`as_u16()`	USMALLINT	`u16`
`as_u32()`	UINTEGER	`u32`
`as_u64()`	UBIGINT	`u64`
`as_f32()`	FLOAT	`f32`
`as_f64()`	DOUBLE	`f64`
`as_bool()`	BOOLEAN	`bool`

DuckDB will attempt to cast the value to the requested type. If the cast fails, numeric methods return 0 / 0.0 / false; as_str() returns an error.

Null-safe convenience variants

Every extraction method has an _or(default) variant that returns default when the value handle is null:

#![allow(unused)]
fn main() {
let timeout = val.as_i64_or(30);       // default 30 if NULL
let host = val.as_str_or("localhost");  // default "localhost" if NULL
let port = val.as_u16_or(5432);        // default 5432 if NULL
}

Checking for NULL

#![allow(unused)]
fn main() {
let val = unsafe { bind_info.get_parameter_value(0) };
if val.is_null() {
    // parameter was NULL or not provided
}
}

Escape hatch

If you need the raw handle for an API not yet wrapped:

#![allow(unused)]
fn main() {
let val = unsafe { bind_info.get_parameter_value(0) };
let raw: duckdb_value = val.into_raw();  // takes ownership, no auto-destroy
// ... use raw handle ...
// caller must call duckdb_destroy_value manually
}

`DataChunk`

Scan callbacks receive a duckdb_data_chunk for output. The DataChunk wrapper provides ergonomic access:

#![allow(unused)]
fn main() {
use quack_rs::data_chunk::DataChunk;

unsafe extern "C" fn my_scan(info: duckdb_function_info, output: duckdb_data_chunk) {
    let chunk = unsafe { DataChunk::from_raw(output) };

    // Get a writer for column 0
    let mut writer = unsafe { chunk.writer(0) };
    unsafe { writer.write_i64(0, 42) };

    // Set the output row count (0 = end of stream)
    unsafe { chunk.set_size(1) };
}
}

Methods

Method	Description
`size()`	Current row count
`set_size(n)`	Set row count (0 signals end of stream)
`column_count()`	Number of columns
`vector(col)`	Raw `duckdb_vector` handle
`writer(col)`	`VectorWriter` for a column
`reader(col)`	`VectorReader` for a column

DataChunk is non-owning — it does not destroy the chunk on drop. DuckDB manages the chunk's lifetime.

Complex Types: STRUCT, LIST, MAP, ARRAY

DuckDB's complex types — STRUCT, LIST, MAP, and ARRAY — are stored as nested vectors. quack-rs provides four helper types in vector::complex to access the child vectors without manual offset arithmetic.

Overview

DuckDB type	Storage	quack-rs helper
`STRUCT{a T, b U, …}`	Parent vector + N child vectors (one per field)	`StructVector`
`LIST<T>`	Parent vector holds `{offset, length}` per row; flat child vector holds elements	`ListVector`
`MAP<K, V>`	Stored as `LIST<STRUCT{key K, value V}>`	`MapVector`
`ARRAY<T>[N]`	Fixed-size array; single child vector	`ArrayVector`

Reading complex types (input vectors)

STRUCT

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorReader, complex::StructVector};

// Inside a scan or finalize callback:
// parent_vec comes from duckdb_data_chunk_get_vector(chunk, col_idx)
let x_reader = unsafe { StructVector::field_reader(parent_vec, 0, row_count) };
let y_reader = unsafe { StructVector::field_reader(parent_vec, 1, row_count) };

for row in 0..row_count {
    if unsafe { x_reader.is_valid(row) } {
        let x: f64 = unsafe { x_reader.read_f64(row) };
        let y: f64 = unsafe { y_reader.read_f64(row) };
        // process (x, y) …
    }
}
}

LIST

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorReader, complex::ListVector};

let total_elements = unsafe { ListVector::get_size(list_vec) };
let elem_reader = unsafe { ListVector::child_reader(list_vec, total_elements) };

for row in 0..row_count {
    let entry = unsafe { ListVector::get_entry(list_vec, row) };
    for i in 0..entry.length as usize {
        let elem_idx = entry.offset as usize + i;
        if unsafe { elem_reader.is_valid(elem_idx) } {
            let val: i64 = unsafe { elem_reader.read_i64(elem_idx) };
            // process val …
        }
    }
}
}

MAP

MAP is LIST<STRUCT{key, value}>. Access keys and values via the inner struct:

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorReader, complex::MapVector};

let total = unsafe { MapVector::total_entry_count(map_vec) };
let key_reader   = unsafe { VectorReader::from_vector(MapVector::keys(map_vec), total) };
let value_reader = unsafe { VectorReader::from_vector(MapVector::values(map_vec), total) };

for row in 0..row_count {
    let entry = unsafe { MapVector::get_entry(map_vec, row) };
    for i in 0..entry.length as usize {
        let idx = entry.offset as usize + i;
        let k = unsafe { key_reader.read_str(idx) };
        let v: i64 = unsafe { value_reader.read_i64(idx) };
        // process (k, v) …
    }
}
}

Writing complex types (output vectors)

STRUCT

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorWriter, complex::StructVector};

let mut x_writer = unsafe { StructVector::field_writer(out_vec, 0) };
let mut y_writer = unsafe { StructVector::field_writer(out_vec, 1) };

for row in 0..batch_size {
    unsafe { x_writer.write_f64(row, x_values[row]) };
    unsafe { y_writer.write_f64(row, y_values[row]) };
}
}

Nested complex types inside STRUCT (v0.11.0+)

When a STRUCT field is itself a LIST, MAP, or ARRAY, use child_vector() on StructWriter or StructReader to get the raw vector handle for complex operations:

#![allow(unused)]
fn main() {
use quack_rs::vector::{StructWriter, complex::ListVector};

// STRUCT(name VARCHAR, services LIST<VARCHAR>, message VARCHAR)
let mut sw = unsafe { StructWriter::new(struct_vec, 3) };

// Write scalar fields normally
unsafe { sw.write_varchar(row, 0, "hello") };
unsafe { sw.write_varchar(row, 2, "ok") };

// For the LIST field at index 1, get the raw vector
let list_vec = sw.child_vector(1);
unsafe { ListVector::reserve(list_vec, 10) };
unsafe { ListVector::set_entry(list_vec, row, 0, 3) };
let mut elem_writer = unsafe { ListVector::child_writer(list_vec) };
unsafe { elem_writer.write_varchar(0, "a") };
unsafe { elem_writer.write_varchar(1, "b") };
unsafe { elem_writer.write_varchar(2, "c") };
unsafe { ListVector::set_size(list_vec, 3) };
}

LIST

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorWriter, complex::ListVector};

let total_elements: usize = rows.iter().map(|r| r.len()).sum();
unsafe { ListVector::reserve(list_vec, total_elements) };

let mut child_writer = unsafe { ListVector::child_writer(list_vec) };
let mut offset = 0usize;
for (row, elements) in rows.iter().enumerate() {
    for (i, &val) in elements.iter().enumerate() {
        unsafe { child_writer.write_i64(offset + i, val) };
    }
    unsafe { ListVector::set_entry(list_vec, row, offset as u64, elements.len() as u64) };
    offset += elements.len();
}
unsafe { ListVector::set_size(list_vec, total_elements) };
}

MAP

The MAP write workflow is identical to LIST, but keys and values are written into the two struct child vectors:

#![allow(unused)]
fn main() {
use quack_rs::vector::{VectorWriter, complex::MapVector};

unsafe { MapVector::reserve(map_vec, total_pairs) };

let mut key_writer   = unsafe { VectorWriter::from_vector(MapVector::keys(map_vec)) };
let mut val_writer   = unsafe { VectorWriter::from_vector(MapVector::values(map_vec)) };
let mut offset = 0usize;
for (row, pairs) in all_pairs.iter().enumerate() {
    for (i, (k, v)) in pairs.iter().enumerate() {
        unsafe { key_writer.write_varchar(offset + i, k) };
        unsafe { val_writer.write_i64(offset + i, *v) };
    }
    unsafe { MapVector::set_entry(map_vec, row, offset as u64, pairs.len() as u64) };
    offset += pairs.len();
}
unsafe { MapVector::set_size(map_vec, total_pairs) };
}

Constructing complex logical types

Use LogicalType constructors to define complex column types. Each constructor has a variant that accepts TypeId values (for simple element types) and a _from_logical variant (for nested complex types):

Constructor	`_from_logical` variant	Creates
`LogicalType::list(TypeId)`	`list_from_logical(&LogicalType)`	`LIST<T>`
`LogicalType::map(TypeId, TypeId)`	`map_from_logical(&LogicalType, &LogicalType)`	`MAP<K, V>`
`LogicalType::struct_type(&[(&str, TypeId)])`	`struct_type_from_logical(&[(&str, LogicalType)])`	`STRUCT{...}`
`LogicalType::union_type(&[(&str, TypeId)])`	`union_type_from_logical(&[(&str, LogicalType)])`	`UNION(...)`
`LogicalType::array(TypeId, u64)`	`array_from_logical(&LogicalType, u64)`	`ARRAY<T>[N]`
`LogicalType::enum_type(&[&str])`	—	`ENUM(...)`
`LogicalType::decimal(u8, u8)`	—	`DECIMAL(w, s)`

API reference

All helpers are in quack_rs::vector::complex (re-exported from quack_rs::prelude).

`StructVector`

Method	Description
`get_child(vec, field_idx)`	Returns the raw child vector for field `field_idx`
`field_reader(vec, field_idx, row_count)`	Creates a `VectorReader` for a STRUCT field
`field_writer(vec, field_idx)`	Creates a `VectorWriter` for a STRUCT field

`StructWriter` / `StructReader` complex field access (v0.11.0+)

Method	Description
`StructWriter::child_vector(field_idx)`	Returns the raw `duckdb_vector` for a complex child field (LIST, MAP, ARRAY)
`StructReader::child_vector(field_idx)`	Same for reading nested complex fields

`ListVector`

Method	Description
`get_child(vec)`	Returns the flat element child vector
`get_size(vec)`	Total number of elements across all rows
`set_size(vec, n)`	Sets the number of elements after writing
`reserve(vec, capacity)`	Reserves capacity in the child vector
`get_entry(vec, row)`	Returns `{offset, length}` for a row (reading)
`set_entry(vec, row, offset, length)`	Sets `{offset, length}` for a row (writing)
`child_reader(vec, count)`	Creates a `VectorReader` for the element vector
`child_writer(vec)`	Creates a `VectorWriter` for the element vector

`MapVector`

Method	Description
`struct_child(vec)`	Returns the inner STRUCT vector
`keys(vec)`	Returns the key vector (STRUCT field 0)
`values(vec)`	Returns the value vector (STRUCT field 1)
`total_entry_count(vec)`	Total key-value pairs
`reserve(vec, n)`	Reserves capacity
`set_size(vec, n)`	Sets total entry count after writing
`get_entry(vec, row)`	Returns `{offset, length}` for a row (reading)
`set_entry(vec, row, offset, length)`	Sets `{offset, length}` for a row (writing)

`ArrayVector`

Method	Description
`get_child(vec)`	Returns the child vector of a fixed-size ARRAY vector

NULL Handling & Strings

This page covers two topics that are handled together in practice: checking for NULL before reading, and reading VARCHAR values from DuckDB vectors.

NULL checks

Every row in a DuckDB vector may be NULL. Always check validity before reading:

#![allow(unused)]
fn main() {
for row in 0..reader.row_count() {
    if unsafe { !reader.is_valid(row) } {
        // Propagate NULL to output
        unsafe { writer.set_null(row) };
        continue;
    }
    // Safe to read
    let value = unsafe { reader.read_str(row) };
}
}

Reading from a NULL row returns garbage data — the vector's data buffer is not zeroed at NULL positions. There is no bounds check or error; you get random bytes from the data buffer.

Writing NULL

#![allow(unused)]
fn main() {
unsafe { writer.set_null(row) };
}

Pitfall L4: VectorWriter::set_null calls duckdb_vector_ensure_validity_writable before accessing the validity bitmap. Calling duckdb_vector_get_validity without this prerequisite returns an uninitialized pointer → SEGFAULT. Never write NULL manually; always use set_null. See Pitfall L4.

Clearing NULL (v0.11.0+)

To mark a row as valid after a previous set_null:

#![allow(unused)]
fn main() {
unsafe { writer.set_valid(row) };
}

VARCHAR reading

Read VARCHAR columns with VectorReader::read_str:

#![allow(unused)]
fn main() {
let s: &str = unsafe { reader.read_str(row) };
}

The returned &str borrows from the DuckDB vector — it must not outlive the callback. Do not store it in a struct; clone it to a String if you need to keep it.

The `duckdb_string_t` format

Pitfall P7 — The duckdb_string_t format is not documented in the Rust bindings. This is the internalized knowledge encoded in quack-rs.

DuckDB stores VARCHAR values in a 16-byte duckdb_string_t struct with two representations, selected at runtime based on string length:

Format	Condition	Layout
Inline	length ≤ 12	`[len: u32][data: [u8; 12]]`
Pointer	length > 12	`[len: u32][prefix: [u8; 4]][ptr: *const u8][unused: u32]`

VectorReader::read_str and the underlying read_duck_string function handle both formats transparently. You never need to inspect the raw struct.

Empty strings vs NULL

An empty string ("") and NULL are distinct values:

#![allow(unused)]
fn main() {
// NULL: is_valid returns false
// Empty string: is_valid returns true, read_str returns ""
if unsafe { !reader.is_valid(row) } {
    // This is NULL
} else {
    let s = unsafe { reader.read_str(row) };
    if s.is_empty() {
        // This is an empty string, not NULL
    }
}
}

Writing VARCHAR

#![allow(unused)]
fn main() {
unsafe { writer.write_varchar(row, my_str) };  // &str
}

write_varchar copies the string bytes into DuckDB's managed storage. The &str reference is no longer needed after the call returns.

Complete NULL-safe VARCHAR pattern

#![allow(unused)]
fn main() {
unsafe extern "C" fn my_scalar(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..reader.row_count() {
        if unsafe { !reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let s = unsafe { reader.read_str(row) };
        let upper = s.to_uppercase();
        unsafe { writer.write_varchar(row, &upper) };
    }
}
}

DuckStringView

For advanced use cases where you need access to the raw string bytes or the inline/pointer distinction, quack_rs::vector::string::DuckStringView is available:

#![allow(unused)]
fn main() {
use quack_rs::vector::string::{DuckStringView, DUCK_STRING_SIZE};

// From raw 16-byte data (inside a vector callback)
let raw: &[u8; 16] = unsafe { &*data.add(idx * DUCK_STRING_SIZE).cast() };
let view = DuckStringView::from_bytes(raw);

println!("length: {}", view.len());
println!("is_empty: {}", view.is_empty());
if let Some(s) = view.as_str() {
    println!("content: {s}");
}
}

In practice, prefer reader.read_str(row) — DuckStringView is only needed when you have a raw pointer and want to avoid creating a full VectorReader.

Constants

Constant	Value	Meaning
`DUCK_STRING_SIZE`	`16`	Size of one `duckdb_string_t` in bytes
`DUCK_STRING_INLINE_MAX_LEN`	`12`	Max length stored inline (no heap ptr)

INTERVAL Type

DuckDB's INTERVAL type represents a duration with three independent components: months, days, and sub-day microseconds. The quack_rs::interval module provides the DuckInterval struct and safe conversion utilities.

Why a custom struct?

Pitfall P8 — The INTERVAL struct layout and its conversion semantics are not documented in the Rust bindings. This module encodes that knowledge.

DuckDB's C duckdb_interval struct is 16 bytes with this exact layout:

offset 0:  months (i32)  — calendar months
offset 4:  days   (i32)  — calendar days
offset 8:  micros (i64)  — sub-day microseconds
total:     16 bytes

DuckInterval is #[repr(C)] with the same field order and is verified at compile time to be exactly 16 bytes.

Reading INTERVAL values

#![allow(unused)]
fn main() {
let iv: DuckInterval = unsafe { reader.read_interval(row) };
println!("{} months, {} days, {} µs", iv.months, iv.days, iv.micros);
}

VectorReader::read_interval handles the raw pointer arithmetic and alignment using read_interval_at internally.

DuckInterval fields

#![allow(unused)]
fn main() {
use quack_rs::interval::DuckInterval;

let iv = DuckInterval {
    months: 1,    // 1 calendar month
    days: 15,     // 15 calendar days
    micros: 3600_000_000,  // 1 hour in microseconds
};
}

Fields are public and can be constructed directly.

Zero interval

#![allow(unused)]
fn main() {
let zero = DuckInterval::zero();    // { months: 0, days: 0, micros: 0 }
let zero = DuckInterval::default(); // same
}

Converting to microseconds

Intervals are not directly comparable because months and days have variable lengths in wall-clock time. When you need a single numeric value, convert to microseconds using the DuckDB approximation: 1 month = 30 days.

Checked conversion (returns `Option`)

#![allow(unused)]
fn main() {
use quack_rs::interval::interval_to_micros;

let iv = DuckInterval { months: 0, days: 1, micros: 500_000 };
match interval_to_micros(iv) {
    Some(us) => println!("{us} microseconds"),
    None => println!("overflow"),
}

// Method form:
let us: Option<i64> = iv.to_micros();
}

Returns None if the result would overflow i64. This can happen with extreme values (e.g., months: i32::MAX).

Saturating conversion (never panics)

#![allow(unused)]
fn main() {
use quack_rs::interval::interval_to_micros_saturating;

let iv = DuckInterval { months: i32::MAX, days: i32::MAX, micros: i64::MAX };
let us: i64 = interval_to_micros_saturating(iv); // i64::MAX

// Method form:
let us: i64 = iv.to_micros_saturating();
}

Use the saturating form in FFI callbacks where panics are not allowed.

Conversion constants

Constant	Value	Meaning
`MICROS_PER_DAY`	`86_400_000_000`	Microseconds in 24 hours
`MICROS_PER_MONTH`	`2_592_000_000_000`	Microseconds in 30 days

#![allow(unused)]
fn main() {
use quack_rs::interval::{MICROS_PER_DAY, MICROS_PER_MONTH};

assert_eq!(MICROS_PER_DAY, 86_400 * 1_000_000);
assert_eq!(MICROS_PER_MONTH, 30 * MICROS_PER_DAY);
}

Low-level: `read_interval_at`

If you have a raw data pointer (e.g., from duckdb_vector_get_data), you can read an interval directly:

#![allow(unused)]
fn main() {
use quack_rs::interval::read_interval_at;

// SAFETY: data is a valid DuckDB INTERVAL vector data pointer, idx is in bounds.
let iv = unsafe { read_interval_at(data_ptr, row_idx) };
}

In practice you should use VectorReader::read_interval(row) instead, which handles all safety invariants.

Complete example: aggregate over INTERVAL

#![allow(unused)]
fn main() {
#[derive(Default)]
struct TotalDurationState {
    total_micros: i64,
}
impl AggregateState for TotalDurationState {}

unsafe extern "C" fn update(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    states: *mut duckdb_aggregate_state,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    for row in 0..reader.row_count() {
        if unsafe { !reader.is_valid(row) } { continue; }
        let iv = unsafe { reader.read_interval(row) };
        let us = iv.to_micros_saturating();
        let state_ptr = unsafe { *states.add(row) };
        if let Some(st) = unsafe { FfiState::<TotalDurationState>::with_state_mut(state_ptr) } {
            st.total_micros = st.total_micros.saturating_add(us);
        }
    }
}
}

Memory layout verification

DuckInterval includes a compile-time assertion that validates its size and alignment against DuckDB's C struct. If the assertion fails, the crate will not compile — catching any future mismatch at build time rather than runtime.

Structured Errors

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

ErrorData is an RAII wrapper around DuckDB's duckdb_error_data handle — the structured error type returned by several DuckDB 1.5 C API surfaces. Unlike a bare error string, an ErrorData carries both a human-readable message and a machine-readable category (DuckDbErrorType), so your extension can branch on the kind of failure (for example, distinguishing Io from OutOfMemory).

It is the common currency of the other 1.5 modules: Expression::fold, the virtual file system, and the appender all report failures as an ErrorData.

Inspecting an error

#![allow(unused)]
fn main() {
use quack_rs::error_data::{DuckDbErrorType, ErrorData};

fn handle(err: ErrorData) {
if err.has_error() {
    match err.error_type() {
        DuckDbErrorType::Io => eprintln!("I/O failure"),
        DuckDbErrorType::OutOfMemory => eprintln!("out of memory"),
        other => eprintln!("{other:?}: {}", err.message().unwrap_or_default()),
    }
}
}
}

Constructing an error

Build a structured error to hand back to DuckDB (for example from a callback):

#![allow(unused)]
fn main() {
use quack_rs::error_data::{DuckDbErrorType, ErrorData};

let err = ErrorData::new(DuckDbErrorType::InvalidInput, "row index out of range");
assert!(err.has_error());
assert_eq!(err.error_type(), DuckDbErrorType::InvalidInput);
}

Propagating with `?`

into_extension_error converts an ErrorData into the SDK's ExtensionError, so a structured DuckDB error can flow through the ? operator in your registration or callback logic:

#![allow(unused)]
fn main() {
use quack_rs::error::ExtensionError;
use quack_rs::file_system::{FileOpenOptions, FileSystem};
use quack_rs::client_context::ClientContext;

fn read_header(ctx: &ClientContext) -> Result<(), ExtensionError> {
let fs = FileSystem::from_client_context(ctx)
    .ok_or_else(|| ExtensionError::new("no file system"))?;
let handle = fs
    .open(c"data.bin", &FileOpenOptions::read_only())
    .map_err(ErrorData::into_extension_error)?;
let _ = handle;
Ok(())
}
}

API

Item	Description
`ErrorData::new(error_type, message)`	Construct a structured error
`ErrorData::from_raw(raw)` (unsafe)	Take ownership of a raw `duckdb_error_data`
`has_error()`	`true` if the handle represents an actual error
`error_type()`	The `DuckDbErrorType` category
`message()`	`Option<String>` — the error text
`into_extension_error()`	Consume into an `ExtensionError`
`is_null()` / `as_raw()` / `into_raw()`	Handle inspection / escape hatches

DuckDbErrorType is a #[non_exhaustive] enum mirroring duckdb_error_type (Io, OutOfMemory, Conversion, Catalog, Constraint, Permission, …). Unknown or future categories map to DuckDbErrorType::Invalid.

UTF-8 validation

The free function check_valid_utf8 exposes DuckDB's own UTF-8 validator, which is stricter than Rust's in some cases. Use it to validate externally-sourced bytes before handing them to DuckDB string APIs:

#![allow(unused)]
fn main() {
use quack_rs::error_data::check_valid_utf8;

fn demo(bytes: &[u8]) {
match check_valid_utf8(bytes) {
    Ok(()) => { /* safe to pass to DuckDB */ }
    Err(err) => eprintln!("invalid UTF-8: {}", err.message().unwrap_or_default()),
}
}
}

Ownership

ErrorData calls duckdb_destroy_error_data on drop. When you receive one from a fallible 1.5 API, it owns the handle — just let it drop, or call into_extension_error() / into_raw() to move the data out.

Bound Expressions — Expression::fold returns ErrorData
Virtual File System — file operations return ErrorData
Bulk Appender — Appender::error_data returns ErrorData
Error Handling — the SDK's primary ExtensionError type

Bound Expressions

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

Expression is an RAII wrapper around DuckDB's duckdb_expression handle. You obtain one from a scalar function's bind callback via ScalarBindInfo::argument, which lets the bind phase inspect each argument's static type and — when the argument is a constant — fold it to a concrete Value.

This is the canonical way to write scalar functions whose behaviour depends on a constant argument (a format string, a precision, a regex) that should be validated or pre-computed once at bind time rather than on every row.

Folding a constant argument at bind time

#![allow(unused)]
fn main() {
use quack_rs::scalar::ScalarBindInfo;
use libduckdb_sys::duckdb_bind_info;

unsafe extern "C" fn my_bind(info: duckdb_bind_info) {
    let bind = unsafe { ScalarBindInfo::new(info) };

    if let Some(arg) = unsafe { bind.argument(0) } {
        // Inspect the argument's static return type.
        let _ty = arg.return_type();

        // If the argument is constant, evaluate it once here instead of
        // recomputing it for every row in the execute callback.
        if arg.is_foldable() {
            let ctx = unsafe { bind.get_client_context() };
            match arg.fold(&ctx) {
                Ok(value) => {
                    // Stash `value` as bind data for the execute phase.
                    let _ = value;
                }
                Err(err) => bind.set_error(&err.message().unwrap_or_default()),
            }
        }
    }
}
}

API

Method	Description
`return_type()`	`Option<LogicalType>` — the expression's static type
`is_foldable()`	`true` if the expression is constant and can be `fold`ed
`fold(&client_context)`	`Result<Value,` `ErrorData>` — evaluate a constant expression
`from_raw(raw)` (unsafe) / `as_raw()` / `is_null()`	Handle inspection / escape hatches

fold only succeeds when is_foldable returns true; otherwise it returns a structured ErrorData.

Obtaining an `Expression`

ScalarBindInfo (the wrapper around a scalar bind callback's duckdb_bind_info) provides two accessors:

Method	Returns
`argument(index)` (unsafe)	`Option<Expression>` — RAII, the ergonomic path
`get_argument(index)` (unsafe)	raw `duckdb_expression` — escape hatch

Use argument_count() to bound the index.

Ownership

Expression calls duckdb_destroy_expression on drop. The handle returned by argument() is owned by the caller, so the wrapper cleans it up automatically.

Scalar Functions — registering the function whose bind callback yields these expressions
Values & Parameter Extraction — working with the Value produced by fold
Structured Errors — the ErrorData returned on failure

Bulk Appender

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

Appender is an RAII wrapper around DuckDB's appender — the fastest way to bulk-insert rows into an existing table. It pairs the core appender lifecycle (create, append a DataChunk, flush, close) with the DuckDB 1.5 additions: structured ErrorData reporting, reverting buffered rows, and appending a column's DEFAULT value.

Appending data chunks

#![allow(unused)]
fn main() {
use quack_rs::appender::Appender;
use quack_rs::data_chunk::DataChunk;
use libduckdb_sys::duckdb_connection;

unsafe fn load(con: duckdb_connection, chunks: &[DataChunk]) -> Result<(), quack_rs::error_data::ErrorData> {
// SAFETY: `con` is a valid, open connection (e.g. from an entry point).
let appender = unsafe { Appender::new(con, None, c"events") }?;

for chunk in chunks {
    appender.append_chunk(chunk)?;
}

// Flush and surface any final error explicitly (see "Drop" below).
appender.close()?;
Ok(())
}
}

Pass a schema (or fully-qualified catalog + schema) when the default schema is not what you want:

#![allow(unused)]
fn main() {
use quack_rs::appender::Appender;
use libduckdb_sys::duckdb_connection;

unsafe fn demo(con: duckdb_connection) -> Result<(), quack_rs::error_data::ErrorData> {
let a = unsafe { Appender::new(con, Some(c"main"), c"events") }?;
let b = unsafe { Appender::with_catalog(con, Some(c"mydb"), Some(c"main"), c"events") }?;
let _ = (a, b);
Ok(())
}
}

Recovering from a failed flush

If a flush fails (for example a constraint violation), the rows buffered since the last successful flush can be discarded with clear, letting you continue without re-appending already-committed rows:

#![allow(unused)]
fn main() {
use quack_rs::appender::Appender;
use libduckdb_sys::duckdb_connection;
unsafe fn demo(appender: &Appender) {
if let Err(err) = appender.flush() {
    eprintln!("flush failed: {}", err.message().unwrap_or_default());
    let _ = appender.clear(); // drop the offending buffered rows
}
}
}

API

Method	Description
`Appender::new(con, schema, table)` (unsafe)	Create for `table` in `schema` (`None` = default)
`Appender::with_catalog(con, catalog, schema, table)` (unsafe)	Create fully qualified
`append_chunk(&chunk)`	Append an entire `DataChunk`
`append_default_to_chunk(&chunk, col, row)`	Write a column's `DEFAULT` into a chunk cell
`flush()`	Flush buffered rows without closing
`close()`	Flush and close (no further appends)
`clear()`	Discard buffered, unflushed rows
`error_data()`	Structured `ErrorData` from the last failed operation

Every fallible method returns Result<(), ErrorData> (or Result<Self, ErrorData> for the constructors).

Safety

new and with_catalog are unsafe: you must pass a valid, open duckdb_connection (such as the one provided to your extension's entry point).

Drop

Dropping an Appender flushes and destroys it, but the final flush's result is ignored. To observe an error from the last batch, call close() (or a final flush()) explicitly before the appender goes out of scope.

Reading & Writing Vectors — building the DataChunks you append
Structured Errors — the ErrorData returned on failure
The Entry Point — where you obtain a connection

Virtual File System

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

This module exposes DuckDB's virtual file system (VFS) to your extension, so a custom table function, replacement scan, or copy function can read and write files through the same abstraction DuckDB uses internally. That means transparently honouring httpfs (s3://, http://), in-memory files, and any other registered file system — instead of reaching for std::fs and only ever seeing local disk.

Obtaining a `FileSystem`

A FileSystem comes from a ClientContext, which most function callbacks can hand you (for example via BindInfo::get_client_context() or ScalarBindInfo::get_client_context()).

Reading a file

#![allow(unused)]
fn main() {
use quack_rs::client_context::ClientContext;
use quack_rs::file_system::{FileOpenOptions, FileSystem};

fn read_all(ctx: &ClientContext) -> Option<Vec<u8>> {
let fs = FileSystem::from_client_context(ctx)?;
let handle = fs.open(c"s3://bucket/data.csv", &FileOpenOptions::read_only()).ok()?;

let size = handle.size().max(0) as usize;
let mut buf = vec![0u8; size];
let n = handle.read(&mut buf).ok()?;
buf.truncate(n);
Some(buf)
}
}

Writing a file

#![allow(unused)]
fn main() {
use quack_rs::client_context::ClientContext;
use quack_rs::file_system::{FileOpenOptions, FileSystem};

fn write_report(ctx: &ClientContext, bytes: &[u8]) -> Result<(), quack_rs::error_data::ErrorData> {
let fs = FileSystem::from_client_context(ctx).expect("file system");
let handle = fs.open(c"report.bin", &FileOpenOptions::write_create())?;
handle.write(bytes)?;
handle.sync()?;
handle.close()?;
Ok(())
}
}

Open options and flags

FileOpenOptions describes how a file is opened. Two convenience constructors cover the common cases; use set_flag for anything else.

Constructor / method	Effect
`FileOpenOptions::read_only()`	Open for reading
`FileOpenOptions::write_create()`	Open for writing, creating if absent
`FileOpenOptions::new()`	Empty; configure with `set_flag`
`set_flag(flag, value)`	Toggle an individual `FileFlag`; returns `true` on success

FileFlag variants: Read, Write, Create, CreateNew, Append.

`FileHandle` operations

Method	Returns	Description
`read(&mut buf)`	`Result<usize, ErrorData>`	Read up to `buf.len()` bytes (0 = EOF)
`write(&buf)`	`Result<usize, ErrorData>`	Write up to `buf.len()` bytes
`seek(position)`	`Result<(), ErrorData>`	Seek to an absolute byte offset
`tell()`	`i64`	Current byte offset
`size()`	`i64`	Total file size in bytes
`sync()`	`Result<(), ErrorData>`	Flush buffered writes to durable storage
`close()`	`Result<(), ErrorData>`	Close the file
`error_data()`	`ErrorData`	Structured error from the last failed operation

FileSystem exposes open(path, options) and error_data(). Both FileSystem and FileHandle are RAII: they are destroyed (and the handle closed) on drop.

Replacement Scans — SELECT * FROM 'file.xyz' handlers that read through this file system
Copy Functions — COPY TO handlers that write through it
Structured Errors — the ErrorData returned on failure

Selection Vectors

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

A SelectionVector is a list of row indices used to logically reorder or filter a data vector without copying its payload — the building block behind DuckDB's zero-copy filtering. Extensions that implement custom filtering or reordering in vectorized callbacks can allocate one, fill in the indices, and hand its raw handle to the relevant DuckDB vector operations.

This is an advanced, low-level primitive; most extensions never need it.

Allocating and filling

#![allow(unused)]
fn main() {
use quack_rs::selection_vector::SelectionVector;

// Select source rows 3, 1, 4, 1, 5 (in that order) — note repeats are allowed.
let mut sel = SelectionVector::new(5);
sel.as_mut_slice().copy_from_slice(&[3, 1, 4, 1, 5]);

assert_eq!(sel.len(), 5);
assert_eq!(sel.as_slice(), &[3, 1, 4, 1, 5]);
}

The indices are 32-bit (sel_t / u32) and are uninitialised after new — fill them via as_mut_slice() before use.

API

Method	Description
`SelectionVector::new(size)`	Allocate a vector of `size` indices
`len()` / `is_empty()`	Number of indices
`as_slice()`	`&[u32]` — read the indices
`as_mut_slice()`	`&mut [u32]` — fill the indices
`as_raw()`	The raw `duckdb_selection_vector` handle for DuckDB vector ops

SelectionVector is RAII: it is destroyed on drop.

Reading & Writing Vectors — the data vectors a selection vector reorders or filters
Complex Types — STRUCT / LIST / MAP / ARRAY vectors

Instance Cache

Requires the duckdb-1-5 feature flag (DuckDB 1.5.0+).

An InstanceCache lets multiple connections share a single underlying DuckDB instance for a given database path. Opening the same path twice through the cache returns handles backed by the same instance, which avoids the "database is already open in another instance" conflict and saves the cost of re-initialising the database.

This is primarily useful for extensions or host integrations that open secondary databases on behalf of a query.

Opening through the cache

#![allow(unused)]
fn main() {
use quack_rs::instance_cache::InstanceCache;

fn demo() -> Result<(), quack_rs::error::ExtensionError> {
let cache = InstanceCache::new();

// Returns a duckdb_database the caller OWNS and must close with duckdb_close.
let db = cache.get_or_create(c"analytics.db", None)?;
let _ = db;
Ok(())
}
}

Pass a DbConfig to control how a freshly created instance is configured; it is ignored when an instance already exists for the path:

#![allow(unused)]
fn main() {
use quack_rs::instance_cache::InstanceCache;
use quack_rs::config::DbConfig;

fn demo() -> Result<(), quack_rs::error::ExtensionError> {
let cache = InstanceCache::new();
let config = DbConfig::new()?;
// configure `config` as needed...
let db = cache.get_or_create(c"analytics.db", Some(&config))?;
let _ = db;
Ok(())
}
}

API

Method	Description
`InstanceCache::new()`	Create a new, empty cache
`get_or_create(path, config)`	Open `path`, creating the instance if needed
`as_raw()`	The raw `duckdb_instance_cache` handle

get_or_create returns Result<duckdb_database, ExtensionError>; on failure the error carries DuckDB's message.

Ownership

InstanceCache is RAII and destroys the cache on drop. The duckdb_database returned by get_or_create is, however, owned by the caller — you must close it with duckdb_close when finished. The cache keeps the underlying instance alive so that subsequent opens of the same path are cheap.

config — DbConfig, the RAII configuration builder accepted here

TLS Configuration

Extensions that make outbound HTTPS connections (e.g., fetching remote data, calling REST APIs) need a way to inject TLS configuration — client certificates for mTLS, custom CA bundles, or restricted cipher suites.

The tls module provides the [TlsConfigProvider] trait so that extensions can supply their TLS setup through a uniform interface, regardless of which TLS library they use (rustls, native-tls, etc.).

Design

The trait is type-erased via Arc<dyn Any + Send + Sync> so that quack-rs does not depend on any specific TLS library. The implementing extension downcasts the returned Arc to its concrete config type.

Implementing a TLS Provider

#![allow(unused)]
fn main() {
use quack_rs::tls::{TlsConfigProvider, TlsVersion};
use quack_rs::error::ExtensionError;
use std::any::Any;
use std::sync::Arc;

struct MyTlsProvider {
    // In practice: Arc<rustls::ClientConfig>
    config: Arc<String>,
    mtls_enabled: bool,
}

impl TlsConfigProvider for MyTlsProvider {
    fn client_config(&self) -> Result<Arc<dyn Any + Send + Sync>, ExtensionError> {
        Ok(self.config.clone())
    }

    fn provider_name(&self) -> &str { "my-extension-tls" }
    fn config_type_name(&self) -> &str { "rustls::ClientConfig" }

    fn min_tls_version(&self) -> TlsVersion {
        TlsVersion::Tls12  // Minimum recommended
    }

    fn supports_mtls(&self) -> bool { self.mtls_enabled }

    fn accepts_invalid_certs(&self) -> bool {
        false  // MUST default to false
    }
}
}

Security Requirements

Implementations must:

Return false from accepts_invalid_certs() unless explicitly configured otherwise by the user. Certificate validation bypass (CWE-295) should never be the default.
Return TlsVersion::Tls12 or higher from min_tls_version(). TLS 1.0 and 1.1 are deprecated per RFC 8996.
Emit an ExtensionWarning via WarningCollector when certificate validation is disabled or when using a TLS version below 1.2.

Auditing a Provider

The audit_tls_provider() function checks common misconfigurations:

#![allow(unused)]
fn main() {
use quack_rs::tls::audit_tls_provider;
use quack_rs::warning::WarningCollector;

// let warnings = audit_tls_provider(&my_provider);
// let collector = WarningCollector::new();
// for w in warnings {
//     collector.emit(w);
// }
}

It detects:

Certificate verification bypass (CWE-295) — emits TLS_NO_VERIFY warning
Deprecated TLS versions (CWE-327) — emits TLS_DEPRECATED_VERSION warning

Downcasting Safely

Never use .unwrap() or .expect() when downcasting in FFI callback contexts (see Pitfall L3). Always handle the None case gracefully:

#![allow(unused)]
fn main() {
use std::any::Any;
use std::sync::Arc;
use quack_rs::error::ExtensionError;

fn use_config(config: Arc<dyn Any + Send + Sync>) -> Result<(), ExtensionError> {
    // let rustls_config = config.downcast_ref::<rustls::ClientConfig>()
    //     .ok_or(ExtensionError::new("expected rustls::ClientConfig"))?;
    Ok(())
}
}

Structured Warnings

Extensions that access external resources (network, files, credentials) should emit structured warnings when potentially unsafe operations occur. The warning module provides a consistent, thread-safe API for collecting and surfacing security warnings.

Core Types

`ExtensionWarning`

A structured warning with:

Field	Type	Description
`code`	`&'static str`	Machine-readable code (e.g., `"TLS_NO_VERIFY"`)
`severity`	`WarningSeverity`	Info / Low / Medium / High / Critical
`message`	`String`	Human-readable description
`cwe`	`Option<u32>`	Optional CWE identifier

`WarningSeverity`

Five levels mirroring common security advisory severity:

Info — no security impact, but worth noting
Low — minimal security impact
Medium — potential security concern
High — significant security risk
Critical — immediate action recommended

`WarningCollector`

A thread-safe collector backed by Mutex<Vec<ExtensionWarning>>. Safe to share across threads via Arc<WarningCollector>.

Usage

#![allow(unused)]
fn main() {
use quack_rs::warning::{ExtensionWarning, WarningSeverity, WarningCollector};

let collector = WarningCollector::new();

// Emit a warning when detecting an insecure configuration
collector.emit(ExtensionWarning {
    code: "TLS_NO_VERIFY",
    severity: WarningSeverity::High,
    message: "TLS certificate verification is disabled".into(),
    cwe: Some(295),
});

// Check warnings
assert_eq!(collector.len(), 1);
assert!(!collector.is_empty());

// Read without clearing
let snapshot = collector.snapshot();
assert_eq!(snapshot.len(), 1);
assert_eq!(collector.len(), 1);  // still there

// Consume all warnings
let warnings = collector.drain();
assert_eq!(warnings.len(), 1);
assert!(collector.is_empty());  // now empty
}

Display Format

Warnings format as [SEVERITY] CODE: message (CWE-nnn):

[HIGH] TLS_NO_VERIFY: TLS certificate verification is disabled (CWE-295)
[MEDIUM] TLS_DEPRECATED_VERSION: TLS provider allows deprecated TLS 1.0 (CWE-327)

Integration with TLS Auditing

The audit_tls_provider() function returns Vec<ExtensionWarning> that can be fed directly into a WarningCollector:

#![allow(unused)]
fn main() {
use quack_rs::tls::audit_tls_provider;
use quack_rs::warning::WarningCollector;

// let warnings = audit_tls_provider(&my_tls_provider);
// let collector = WarningCollector::new();
// for w in warnings {
//     collector.emit(w);
// }
}

Best Practices

Create a single WarningCollector per extension (typically in global or bind-data state)
Use snapshot() for read-only diagnostics; use drain() when consuming warnings for output
Always include CWE identifiers for security-related warnings
Surface collected warnings through a table function (e.g., SELECT * FROM __extension_warnings())

Secrets Management

Extensions that access external services (HTTP APIs, databases, cloud storage) commonly need credentials. DuckDB provides a native secrets API via CREATE SECRET, and the secrets module defines the Rust-side traits and types that extensions implement to bridge into that system.

Core Types

`SecretEntry`

A single secret entry with metadata and key-value fields. Designed to minimize accidental credential leakage:

Debug redacts field values — only keys are shown, values replaced with "[REDACTED]"
Drop zeroizes sensitive data — all field values overwritten with zeros using std::ptr::write_volatile before deallocation
No PartialEq — prevents accidental non-constant-time comparisons of secret material
Clone is explicit — callers are aware they are duplicating sensitive material in memory

`SecretsManager`

The trait extensions implement to provide secret lookup:

#![allow(unused)]
fn main() {
use quack_rs::secrets::{SecretEntry, SecretsManager};

struct MySecrets {
    entries: Vec<SecretEntry>,
}

impl SecretsManager for MySecrets {
    fn get_secret(&self, name: &str, secret_type: &str) -> Option<SecretEntry> {
        self.entries.iter()
            .find(|e| e.name() == name && e.secret_type() == secret_type)
            .cloned()
    }

    fn list_secrets(&self, secret_type: Option<&str>) -> Vec<SecretEntry> {
        self.entries.iter()
            .filter(|e| secret_type.is_none() || secret_type == Some(e.secret_type()))
            .cloned()
            .collect()
    }

    fn remove_secret(&self, _name: &str, _secret_type: &str) -> bool {
        false // read-only example
    }
}
}

Building Secret Entries

Use the builder pattern:

#![allow(unused)]
fn main() {
use quack_rs::secrets::SecretEntry;

let entry = SecretEntry::new("my_api_key", "bearer")
    .with_provider("config")
    .with_scope("https://api.example.com")
    .with_field("token", "sk-abc123")
    .with_field("refresh_token", "xyz789");

assert_eq!(entry.name(), "my_api_key");
assert_eq!(entry.secret_type(), "bearer");
assert_eq!(entry.get_field("token"), Some("sk-abc123"));
}

Safe Diagnostics

Use field_keys() for logging without leaking secrets:

#![allow(unused)]
fn main() {
use quack_rs::secrets::SecretEntry;

let entry = SecretEntry::new("key", "s3")
    .with_field("access_key", "AKIA...")
    .with_field("secret_key", "wJalr...");

// Safe for logging — returns keys only, no values
let keys = entry.field_keys();
// keys: ["access_key", "secret_key"]
}

Debug Output

The Debug implementation redacts all sensitive values:

SecretEntry {
    name: "api_key",
    secret_type: "bearer",
    provider: "config",
    scope: "",
    fields: {"token": "[REDACTED]"}
}

Security Best Practices

Never log secret field values — use field_keys() for diagnostics
Drop clones promptly — minimize the window during which sensitive data resides in memory
Implement remove_secret with zeroization — don't just remove the reference; zeroize the data before deallocation
Thread safety — SecretsManager implementations must be Send + Sync as DuckDB may invoke callbacks concurrently

Testing Guide

quack-rs provides a two-tier testing strategy: pure-Rust unit tests for business logic (no DuckDB required), and SQLLogicTest E2E tests that run inside an actual DuckDB process.

Architectural limitation: the `loadable-extension` dispatch wall

This is the most important thing to understand before writing tests.

DuckDB loadable extensions use libduckdb-sys with features = ["loadable-extension"]. This intentionally does not link the DuckDB runtime into the extension binary. Instead, every DuckDB C API call (duckdb_vector_get_data, duckdb_create_logical_type, etc.) goes through a lazy dispatch table — a global struct of AtomicPtr<fn> pointers initialized only when DuckDB calls duckdb_rs_extension_api_init at extension-load time.

In cargo test, no DuckDB process loads your extension. The dispatch table is never initialized, and the first call to any DuckDB C API function panics:

DuckDB API not initialized

What this breaks

API	Why it fails
`VectorReader::new`	calls `duckdb_vector_get_data`
`VectorWriter::new`	calls `duckdb_vector_get_data`
`Connection::register_*`	calls DuckDB registration C API
`LogicalType::new`	calls `duckdb_create_logical_type`
`LogicalType::drop`	calls `duckdb_destroy_logical_type`
`BindInfo::add_result_column`	calls `duckdb_bind_add_result_column`

What still works in `cargo test`

API	Why it works
`AggregateTestHarness`	pure Rust, zero DuckDB dependency
`MockVectorWriter` / `MockVectorReader`	in-memory buffers, zero DuckDB dependency
`MockRegistrar`	records registrations without calling C API
`SqlMacro::to_sql()`	generates SQL strings, no DuckDB needed
`interval_to_micros`	pure arithmetic
`validate` / `scaffold`	pure Rust
`InMemoryDb`	uses bundled DuckDB via `duckdb` crate (`bundled-test` feature)

Mock types for callback logic

When your scalar or table function callback reads inputs and writes outputs, extract that logic into a pure-Rust function. Then test it with MockVectorReader (input) and MockVectorWriter (output):

#![allow(unused)]
fn main() {
use quack_rs::testing::{MockVectorReader, MockVectorWriter};

// Pure Rust logic — extracted from the FFI callback
fn compute_upper(reader: &MockVectorReader, writer: &mut MockVectorWriter) {
    for i in 0..reader.row_count() {
        if reader.is_valid(i) {
            let s = reader.try_get_str(i).unwrap_or("");
            writer.write_varchar(i, &s.to_uppercase());
        } else {
            writer.set_null(i);
        }
    }
}

#[test]
fn test_compute_upper() {
    let reader = MockVectorReader::from_strs([Some("hello"), None, Some("world")]);
    let mut writer = MockVectorWriter::new(3);
    compute_upper(&reader, &mut writer);

    assert_eq!(writer.try_get_str(0), Some("HELLO"));
    assert!(writer.is_null(1));
    assert_eq!(writer.try_get_str(2), Some("WORLD"));
}
}

The real FFI callback becomes a thin wrapper:

#![allow(unused)]
fn main() {
unsafe extern "C" fn my_scalar(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    // Real DuckDB wrappers — only used in production, not in cargo test
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };
    // TODO: adapt mock-compatible logic to real readers/writers
}
}

Testing registration with `MockRegistrar`

MockRegistrar implements the Registrar trait without calling any DuckDB C API. Use it to verify your registration function registers the right set of functions:

#![allow(unused)]
fn main() {
use quack_rs::connection::Registrar;
use quack_rs::testing::MockRegistrar;
use quack_rs::scalar::ScalarFunctionBuilder;
use quack_rs::types::TypeId;
use quack_rs::error::ExtensionError;

fn register_all(reg: &impl Registrar) -> Result<(), ExtensionError> {
    let upper = ScalarFunctionBuilder::new("upper_ext")
        .param(TypeId::Varchar)
        .returns(TypeId::Varchar);
    let lower = ScalarFunctionBuilder::new("lower_ext")
        .param(TypeId::Varchar)
        .returns(TypeId::Varchar);
    unsafe {
        reg.register_scalar(upper)?;
        reg.register_scalar(lower)?;
    }
    Ok(())
}

#[test]
fn test_register_all() {
    let mock = MockRegistrar::new();
    register_all(&mock).unwrap();
    assert_eq!(mock.total_registrations(), 2);
    assert!(mock.has_scalar("upper_ext"));
    assert!(mock.has_scalar("lower_ext"));
}
}

Limitation: MockRegistrar cannot be used with builders that hold LogicalType values (created via .returns_logical() or .param_logical()), because LogicalType::drop calls duckdb_destroy_logical_type. Use TypeId parameters with MockRegistrar.

SQL-level testing with `InMemoryDb` (`bundled-test` feature)

For SQL-level assertions — verifying that a SQL macro produces the correct output, or that a CREATE TABLE + INSERT + SELECT pipeline works — enable the bundled-test Cargo feature. This provides InMemoryDb, which wraps the duckdb crate's bundled DuckDB and automatically initialises the loadable-extension dispatch table before opening a connection (see Pitfall P9).

Two features expose InMemoryDb; pick the one that fits your build-time budget:

# Zero-config but slow: compile libduckdb from C++ source (~5-10 min cold).
[dev-dependencies]
quack-rs = { version = "0.13", features = ["bundled-test"] }

# Fast: link against a pre-built libduckdb. Set DUCKDB_DOWNLOAD_LIB=1 at build
# time and libduckdb-sys downloads the upstream release zip (~40 MB, cached
# under target/); or set DUCKDB_LIB_DIR=/path/to/libduckdb if you already have
# one extracted (requires libduckdb-sys >= 1.10503 for header auto-discovery).
[dev-dependencies]
quack-rs = { version = "0.13", features = ["bundled-test-prebuilt"] }

Both keep duckdb out of a plain cargo test and out of your published crate's dependency tree — it is pulled in only when one of these features is on.

If your tests need to LOAD your own locally-built .duckdb_extension artifact, use InMemoryDb::open_unsigned instead of open() — the allow_unsigned_extensions config option is startup-only and can't be set via SET after the connection has opened.

#![allow(unused)]
fn main() {
#[cfg(feature = "bundled-test")]
use quack_rs::testing::InMemoryDb;
use quack_rs::sql_macro::SqlMacro;

#[test]
fn test_clamp_macro_sql() {
    let db = InMemoryDb::open().unwrap();

    // Generate and execute the CREATE MACRO SQL
    let m = SqlMacro::scalar("clamp", &["x", "lo", "hi"], "greatest(lo, least(hi, x))").unwrap();
    db.execute_batch(&m.to_sql()).unwrap();

    // Verify correct output
    let result: i64 = db.query_one("SELECT clamp(5, 1, 10)").unwrap();
    assert_eq!(result, 5);

    let clamped: i64 = db.query_one("SELECT clamp(15, 1, 10)").unwrap();
    assert_eq!(clamped, 10);
}
}

Note: InMemoryDb cannot test your FFI callbacks (VectorReader, VectorWriter) because those still route through the loadable-extension dispatch. Use InMemoryDb for SQL logic and mocks for callback logic.

Why two tiers?

Pitfall P3 — Unit tests are insufficient. 435 unit tests passed in duckdb-behavioral while the extension had three critical bugs: a SEGFAULT on load, 6 of 7 functions not registering, and wrong results from a combine bug. E2E tests caught all three.

Test tier	What it catches	What it misses
Unit tests	Logic bugs in state structs	FFI wiring, registration failures, SEGFAULT
E2E tests	Everything above + FFI integration	Nothing (it's real DuckDB)

Both tiers are required. Unit tests give fast, deterministic feedback. E2E tests prove the extension actually works inside DuckDB.

Unit tests with `AggregateTestHarness`

AggregateTestHarness<S> simulates the DuckDB aggregate lifecycle in pure Rust without any DuckDB dependency:

flowchart LR
    N["new()"] --> U["update() × N"]
    U --> C["combine() *(optional)*"]
    C --> F["finalize()"]

Basic usage

#![allow(unused)]
fn main() {
use quack_rs::testing::AggregateTestHarness;
use quack_rs::aggregate::AggregateState;

#[derive(Default, Debug, PartialEq)]
struct SumState { total: i64 }
impl AggregateState for SumState {}

#[test]
fn test_sum() {
    let mut h = AggregateTestHarness::<SumState>::new();
    h.update(|s| s.total += 10);
    h.update(|s| s.total += 20);
    h.update(|s| s.total += 5);
    assert_eq!(h.finalize().total, 35);
}
}

Convenience: `aggregate`

For testing over a collection of inputs:

#![allow(unused)]
fn main() {
#[test]
fn test_word_count() {
    let result = AggregateTestHarness::<WordCountState>::aggregate(
        ["hello world", "one", "two three four", ""],
        |s, text| s.count += count_words(text),
    );
    assert_eq!(result.count, 6);  // 2 + 1 + 3 + 0
}
}

Testing `combine` (Pitfall L1)

DuckDB creates fresh zero-initialized target states and calls combine to merge into them. You MUST propagate ALL fields — including configuration fields — not just accumulated data. Test this explicitly:

#![allow(unused)]
fn main() {
#[test]
fn combine_propagates_config() {
    let mut h1 = AggregateTestHarness::<MyState>::new();
    h1.update(|s| {
        s.window_size = 3600;  // config field
        s.count += 5;          // data field
    });

    // h2 simulates a fresh zero-initialized state created by DuckDB
    let mut h2 = AggregateTestHarness::<MyState>::new();

    h2.combine(&h1, |src, tgt| {
        tgt.window_size = src.window_size;  // MUST propagate config
        tgt.count += src.count;
    });

    let result = h2.finalize();
    assert_eq!(result.window_size, 3600);  // Would be 0 if forgotten
    assert_eq!(result.count, 5);
}
}

Inspecting intermediate state

#![allow(unused)]
fn main() {
let mut h = AggregateTestHarness::<SumState>::new();
h.update(|s| s.total += 5);
assert_eq!(h.state().total, 5);   // borrow without consuming
h.update(|s| s.total += 3);
assert_eq!(h.state().total, 8);
}

Resetting

#![allow(unused)]
fn main() {
let mut h = AggregateTestHarness::<SumState>::new();
h.update(|s| s.total = 999);
h.reset();
assert_eq!(h.state().total, 0);  // back to S::default()
}

Pre-populating state

#![allow(unused)]
fn main() {
let initial = MyState { window_size: 3600, count: 0 };
let h = AggregateTestHarness::with_state(initial);
}

Unit tests for scalar functions

Scalar logic is pure Rust — test it directly:

#![allow(unused)]
fn main() {
// From examples/hello-ext/src/lib.rs — scalar function logic
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

#[test]
fn first_word_basic() {
    assert_eq!(first_word("hello world"), "hello");
    assert_eq!(first_word("  padded  "), "padded");
    assert_eq!(first_word(""), "");
    assert_eq!(first_word("   "), "");
}
}

Unit tests for SQL macros

SqlMacro::to_sql() is pure Rust — no DuckDB connection needed:

#![allow(unused)]
fn main() {
use quack_rs::sql_macro::SqlMacro;

#[test]
fn scalar_macro_sql() {
    let m = SqlMacro::scalar("double_it", &["x"], "x * 2").unwrap();
    assert_eq!(m.to_sql(),
        "CREATE OR REPLACE MACRO double_it(x) AS (x * 2)");
}

#[test]
fn table_macro_sql() {
    let m = SqlMacro::table("recent", &["n"], "SELECT * FROM events LIMIT n").unwrap();
    assert_eq!(m.to_sql(),
        "CREATE OR REPLACE MACRO recent(n) AS TABLE SELECT * FROM events LIMIT n");
}
}

E2E testing with SQLLogicTest

Community extensions are tested using DuckDB's SQLLogicTest format. This format runs SQL directly in DuckDB and verifies output line-by-line.

File location

test/sql/my_extension.test

Format

# my_extension tests

require my_extension

statement ok
LOAD my_extension;

query I
SELECT my_function('hello world');
----
2

Directives:

Directive	Meaning
`require`	Skip test if extension not available
`statement ok`	SQL must succeed
`statement error`	SQL must fail
`query I`	Query returning one INTEGER column
`query II`	Query returning two columns
`query T`	Query returning one TEXT column
`----`	Expected output follows

Installing DuckDB (1.4.4, 1.5.0, or 1.5.1)

A live DuckDB CLI is required for E2E testing. Install it via curl (no system package manager needed). DuckDB 1.4.4, 1.5.0, or 1.5.1 all work — they use the same C API version (v1.2.0). We recommend 1.5.1 for critical WAL and ART index fixes:

# DuckDB 1.5.1 (recommended)
curl -fsSL https://github.com/duckdb/duckdb/releases/download/v1.5.1/duckdb_cli-linux-amd64.zip \
    -o /tmp/duckdb.zip \
    && unzip -o /tmp/duckdb.zip -d /tmp/ \
    && chmod +x /tmp/duckdb \
    && /tmp/duckdb --version
# → v1.5.1

For macOS, replace linux-amd64 with osx-universal. For Windows, use windows-amd64 and unzip to a directory on %PATH%.

Running E2E tests

# Build the extension
cargo build --release

# Package with metadata footer (required by DuckDB's extension loader)
cargo run --bin append_metadata -- \
    target/release/libmy_extension.so \
    /tmp/my_extension.duckdb_extension \
    --abi-type C_STRUCT \
    --extension-version v0.1.0 \
    --duckdb-version v1.2.0 \
    --platform linux_amd64

# Load it in DuckDB CLI (-unsigned allows loading without a signed certificate)
/tmp/duckdb -unsigned -c "
SET allow_extensions_metadata_mismatch=true;
LOAD '/tmp/my_extension.duckdb_extension';
SELECT my_function('hello world');
"

The community extension CI runs SQLLogicTest automatically. Each function must have at least one test:

# Test NULL handling
query I
SELECT my_function(NULL);
----
NULL

# Test empty input
query I
SELECT my_function('');
----
0

# Test normal case
query I
SELECT my_function('hello world');
----
2

Pitfall P5 — SQLLogicTest does exact string matching. Copy expected values directly from DuckDB CLI output. NULL is represented as NULL (uppercase). Floats must match to the number of decimal places DuckDB outputs.

Property-based testing with `proptest`

The proptest crate is well-suited for testing aggregate logic over arbitrary inputs:

#![allow(unused)]
fn main() {
use proptest::prelude::*;

proptest! {
    #[test]
    fn saturating_never_panics(months: i32, days: i32, micros: i64) {
        let iv = DuckInterval { months, days, micros };
        // Must not panic for any input
        let _ = interval_to_micros_saturating(iv);
    }
}
}

quack-rs's own test suite uses proptest for interval conversion and aggregate harness properties.

What to test

Scenario	Unit	E2E
NULL input → NULL output		✓
Empty string	✓	✓
Unicode strings	✓
Numeric edge cases (0, MAX, MIN)	✓
Combine propagates config	✓
Multi-group aggregation		✓
Function registration success		✓
Extension loads without crash		✓
SQL macro produces correct output	✓ (to_sql)	✓

Dev dependencies

[dev-dependencies]
quack-rs = { version = "0.13", features = [] }
proptest = "1"

The testing module is compiled unconditionally (not #[cfg(test)]) so it is available as a dev-dependency to downstream crates.

Community Extensions

DuckDB's community extension ecosystem allows anyone to publish a loadable extension that DuckDB users can install with a single SQL command. This page covers everything you need to submit and maintain a community extension built with quack-rs.

Prerequisites

A working extension that passes local E2E tests
A GitHub repository (the community build runs from it)
All functions tested with SQLLogicTest format
A globally unique extension name

Scaffolding a new project

quack_rs::scaffold::generate_scaffold generates all required files from a single function call:

#![allow(unused)]
fn main() {
use quack_rs::scaffold::{ScaffoldConfig, generate_scaffold};

let config = ScaffoldConfig {
    name: "my_extension".to_string(),
    description: "Does something useful".to_string(),
    version: "0.1.0".to_string(),
    license: "MIT".to_string(),
    maintainer: "Your Name".to_string(),
    github_repo: "yourorg/duckdb-my-extension".to_string(),
    excluded_platforms: vec![],
};

let files = generate_scaffold(&config).expect("scaffold failed");
for file in &files {
    std::fs::create_dir_all(std::path::Path::new(&file.path).parent().unwrap()).unwrap();
    std::fs::write(&file.path, &file.content).unwrap();
}
}

This generates:

my_extension/
├── Cargo.toml
├── Makefile
├── extension_config.cmake
├── src/lib.rs
├── src/wasm_lib.rs
├── description.yml
├── test/sql/my_extension.test
├── .github/workflows/extension-ci.yml
├── .gitmodules
├── .gitignore
└── .cargo/config.toml

`description.yml`

Required fields for community submission:

extension:
  name: my_extension
  description: One-line description of what your extension does
  version: 0.1.0
  language: Rust
  build: cargo
  license: MIT
  requires_toolchains: rust;python3
  excluded_platforms: ""   # or "wasm_mvp;wasm_eh;wasm_threads"
  maintainers:
    - Your Name

repo:
  github: yourorg/duckdb-my-extension
  ref: main

Use quack_rs::validate to pre-validate fields before submission:

#![allow(unused)]
fn main() {
use quack_rs::validate::{
    validate_extension_name,
    validate_extension_version,
    validate_spdx_license,
    validate_excluded_platforms_str,
};

validate_extension_name("my_extension")?;
validate_extension_version("0.1.0")?;
validate_spdx_license("MIT")?;
validate_excluded_platforms_str("wasm_mvp;wasm_eh")?;
}

Naming rules

Extension names must satisfy all of the following:

Match ^[a-z][a-z0-9_-]*$ (lowercase, digits, hyphens, underscores)
Not exceed 64 characters
Be globally unique across the entire DuckDB community extensions ecosystem

Check existing names at community-extensions.duckdb.org before choosing. Use vendor-prefixed names to avoid collisions:

myorg_analytics   ✓
analytics         ✗  (likely taken or too generic)

Pitfall P1 — The [lib] name in Cargo.toml MUST exactly match the extension name. If your crate name is duckdb-my-ext (producing libduckdb_my_ext.so) but description.yml says name: my_ext, the community build fails with FileNotFoundError.

Versioning

Format	Example	Meaning
7+ hex chars	`690bfc5`	Unstable — no guarantees
`0.y.z`	`0.1.0`	Pre-release — working toward stability
`x.y.z` (x > 0)	`1.0.0`	Stable — full semver guarantees

Use validate_extension_version to accept all three formats, and classify_extension_version to determine the stability tier:

#![allow(unused)]
fn main() {
use quack_rs::validate::semver::classify_extension_version;

match classify_extension_version("0.1.0")? {
    ExtensionStability::Unstable => println!("git hash"),
    ExtensionStability::PreRelease => println!("0.y.z"),
    ExtensionStability::Stable => println!("x.y.z, x>0"),
}
}

Platform targets

Community extensions are built for:

Platform	Description
`linux_amd64`	Linux x86_64
`linux_amd64_gcc4`	Linux x86_64 (GCC 4 ABI)
`linux_arm64`	Linux AArch64
`osx_amd64`	macOS x86_64
`osx_arm64`	macOS Apple Silicon
`windows_amd64`	Windows x86_64
`windows_amd64_mingw`	Windows x86_64 (MinGW)
`windows_arm64`	Windows AArch64
`wasm_mvp`	WebAssembly (MVP)
`wasm_eh`	WebAssembly (exception handling)
`wasm_threads`	WebAssembly (threads)

If your extension cannot be built for a platform (e.g., it uses a platform-specific system library), add it to excluded_platforms:

#![allow(unused)]
fn main() {
ScaffoldConfig {
    excluded_platforms: vec![
        "wasm_mvp".to_string(),
        "wasm_eh".to_string(),
        "wasm_threads".to_string(),
    ],
    // ...
}
}

Validate individual platform names with validate_platform:

#![allow(unused)]
fn main() {
use quack_rs::validate::validate_platform;
validate_platform("linux_amd64")?;  // Ok
validate_platform("invalid")?;       // Err
}

`Cargo.toml` requirements

[package]
name = "my_extension"
version = "0.1.0"
edition = "2021"

[lib]
name = "my_extension"       # Must match description.yml `name`
crate-type = ["cdylib", "rlib"]

[dependencies]
quack-rs = "0.13"
libduckdb-sys = { version = ">=1.4.4, <2", features = ["loadable-extension"] }

[profile.release]
panic = "abort"              # Required — no stack unwinding in FFI
opt-level = 3
lto = "thin"
strip = "symbols"

Pitfall ADR-1 — Do NOT use the duckdb crate's bundled feature. A loadable extension must link against the DuckDB that loads it, not bundle its own copy. libduckdb-sys with loadable-extension provides lazy function pointers populated by DuckDB at load time.

Release profile check

The validate_release_profile validator checks that your release profile is correctly configured:

#![allow(unused)]
fn main() {
use quack_rs::validate::validate_release_profile;

// Pass all four release profile settings from your Cargo.toml
validate_release_profile("abort", "true", "3", "1")?;   // Ok
validate_release_profile("unwind", "true", "3", "1")?;   // Err — panics across FFI are UB
}

CI workflow

The scaffold generates .github/workflows/extension-ci.yml which:

Runs on push and pull request
Checks, lints, and tests in Rust (all platforms)
Calls extension-ci-tools to build the .duckdb_extension artifact
Runs SQLLogicTest integration tests

After scaffolding:

cd my_extension
git init
git submodule add https://github.com/duckdb/extension-ci-tools.git extension-ci-tools
git submodule update --init --recursive
make configure
make release

Pitfall P4 — The extension-ci-tools submodule must be initialized. make configure fails if the submodule is missing.

Submitting to the community registry

Create a pull request against the community-extensions repository
Add your description.yml under extensions/my_extension/description.yml
CI runs automatically to verify the build
Once approved, users can install your extension:

INSTALL my_extension FROM community;
LOAD my_extension;

Binary compatibility

Extension binaries are tied to a specific DuckDB version. When DuckDB releases a new version:

New binaries must be built against that version
Old binaries will be refused by the new DuckDB runtime
The community build pipeline re-builds all extensions for each DuckDB release

Pin libduckdb-sys with = (exact version) to ensure you always build against the exact version you intend. The quack_rs::DUCKDB_API_VERSION constant ("v1.2.0") is passed to init_extension and must match the C API version of your pinned libduckdb-sys.

Pitfall P2 — The -dv flag to append_extension_metadata.py must be the C API version (v1.2.0), not the DuckDB release version (v1.4.4). Use quack_rs::DUCKDB_API_VERSION to avoid hardcoding this.

Security considerations

Community extensions are not vetted for security by the DuckDB team:

Never panic across FFI boundaries (panic = "abort" enforces this)
Validate user inputs at system boundaries (extension entry point is the boundary)
Do not include secrets, API keys, or credentials in your binary
Dynamic SQL in SQL macros must not construct queries from unsanitized user data

Pitfall Catalog

All known DuckDB Rust FFI pitfalls, discovered while building duckdb-behavioral, a production DuckDB community extension. Every future developer who builds a Rust DuckDB extension will hit the majority of these. quack-rs makes most of them impossible.

L1: COMBINE must propagate ALL config fields

Status: Testable with AggregateTestHarness.

Symptom: Aggregate function returns wrong results. No error, no crash.

Root cause: DuckDB's segment tree creates fresh zero-initialized target states via state_init, then calls combine to merge source states into them. If your combine only propagates data fields (count, sum) but omits configuration fields (window_size, mode), the configuration will be zero at finalize time, silently corrupting results.

This bug passed 435 unit tests before being caught by E2E tests.

Fix:

#![allow(unused)]
fn main() {
unsafe extern "C" fn combine(
    _info: duckdb_function_info,
    source: *mut duckdb_aggregate_state,
    target: *mut duckdb_aggregate_state,
    count: idx_t,
) {
    for i in 0..count as usize {
        let src_ptr = unsafe { *source.add(i) };
        let tgt_ptr = unsafe { *target.add(i) };
        if let (Some(src), Some(tgt)) = (
            FfiState::<MyState>::with_state(src_ptr),
            FfiState::<MyState>::with_state_mut(tgt_ptr),
        ) {
            tgt.window_size = src.window_size;  // config — MUST copy
            tgt.mode = src.mode;                // config — MUST copy
            tgt.count += src.count;             // data — accumulate
        }
    }
}
}

Test this with AggregateTestHarness::combine — see Testing Guide.

L2: State destroy double-free

Status: Made impossible by FfiState<T>.

Symptom: Crash or memory corruption on extension unload.

Root cause: If state_destroy frees the inner Box but does not null the pointer, a second state_destroy call (common in error paths) frees already-freed memory → undefined behavior.

Fix: FfiState<T>::destroy_callback nulls inner after freeing. Use it instead of writing your own destructor:

#![allow(unused)]
fn main() {
unsafe extern "C" fn state_destroy(states: *mut duckdb_aggregate_state, count: idx_t) {
    unsafe { FfiState::<MyState>::destroy_callback(states, count) };
}
}

L3: No panic across FFI boundaries

Status: Made impossible by init_extension and panic = "abort".

Symptom: Extension causes DuckDB to crash or behave unpredictably.

Root cause: panic!() and .unwrap() in unsafe extern "C" functions is undefined behavior. Panics cannot unwind across FFI boundaries in Rust.

Fix: Use Result and ? inside init_extension. Never use unwrap() in FFI callbacks. FfiState::with_state_mut returns Option, not Result, so callers use if let:

#![allow(unused)]
fn main() {
// Safe pattern — no unwrap in FFI callback
if let Some(st) = unsafe { FfiState::<MyState>::with_state_mut(state_ptr) } {
    st.count += 1;
}

// Dangerous — never do this in an FFI callback
let st = unsafe { FfiState::<MyState>::with_state_mut(state_ptr) }.unwrap(); // UB if None
}

The scaffold-generated Cargo.toml sets panic = "abort" in the release profile, which terminates the process instead of unwinding — still bad, but not undefined behavior.

L4: `ensure_validity_writable` is required before NULL output

Status: Made impossible by VectorWriter::set_null.

Symptom: SEGFAULT when writing NULL values to the output vector.

Root cause: duckdb_vector_get_validity returns an uninitialized pointer if duckdb_vector_ensure_validity_writable has not been called first. Writing to an uninitialized address → SEGFAULT.

Fix: Always call duckdb_vector_ensure_validity_writable before accessing the validity bitmap on the write path. VectorWriter::set_null does this automatically:

#![allow(unused)]
fn main() {
// Correct — handled by set_null
unsafe { writer.set_null(row) };

// Wrong — validity bitmap may not be allocated yet
// let validity = duckdb_vector_get_validity(output);
// set_bit(validity, row, false);  // SEGFAULT
}

L5: Boolean reading must use `u8 != 0`, not `*const bool`

Status: Made impossible by VectorReader::read_bool.

Symptom: Undefined behavior; Rust requires bool to be exactly 0 or 1.

Root cause: DuckDB's C API does not guarantee that boolean values in vectors are exactly 0 or 1. Values of 2, 255, etc. cast to Rust bool is undefined behavior.

Fix: Read as u8 and compare with != 0. VectorReader::read_bool always does this:

#![allow(unused)]
fn main() {
let b: bool = unsafe { reader.read_bool(row) };  // safe: uses u8 != 0 internally
}

L6: Function set name must be set on EACH member

Status: Made impossible by AggregateFunctionSetBuilder.

Symptom: Functions are silently not registered. No error returned.

Root cause: When using duckdb_register_aggregate_function_set, the function name must be set on EACH individual duckdb_aggregate_function using duckdb_aggregate_function_set_name, not just on the set.

This is completely undocumented. Discovered by reading DuckDB's C++ test code at test/api/capi/test_capi_aggregate_functions.cpp.

In duckdb-behavioral, 6 of 7 functions failed to register silently due to this bug.

Fix: AggregateFunctionSetBuilder calls duckdb_aggregate_function_set_name on every individual function before adding it to the set. Use it instead of managing the set manually.

L7: LogicalType memory leak

Status: Made impossible by LogicalType RAII wrapper.

Symptom: Memory leak proportional to number of registered functions.

Root cause: duckdb_create_logical_type allocates memory that must be freed with duckdb_destroy_logical_type. Forgetting leaks memory.

Fix: LogicalType implements Drop and calls duckdb_destroy_logical_type automatically when it goes out of scope.

P1: Library name must match extension name

Status: Must be configured in Cargo.toml. Scaffold handles this.

Symptom: Community build fails with FileNotFoundError.

Root cause: The community build expects lib{extension_name}.so. If the Cargo crate name produces a different .so filename, the build fails.

Fix: Set name explicitly in [lib]:

[lib]
name = "my_extension"   # Must match description.yml `name: my_extension`
crate-type = ["cdylib", "rlib"]

P2: Metadata version is C API version, not DuckDB version

Status: DUCKDB_API_VERSION constant encodes the correct value.

Symptom: Metadata script fails or produces incorrect metadata.

Root cause: The -dv flag to append_extension_metadata.py must be the C API version (v1.2.0), not the DuckDB release version (v1.4.4). These are different strings.

Fix: Use quack_rs::DUCKDB_API_VERSION ("v1.2.0") in init_extension, and use the same version with append_extension_metadata.py -dv v1.2.0.

P3: E2E testing is mandatory

Status: Documented. See Testing Guide.

Symptom: All unit tests pass but the extension is completely broken.

Root cause: Unit tests cannot detect SEGFAULTs on load, silent registration failures, or wrong results from combine bugs.

Fix: Always run E2E tests using an actual DuckDB binary. The scaffold generates a complete SQLLogicTest skeleton.

P4: `extension-ci-tools` submodule must be initialized

Status: Build-time check.

Symptom: make configure or make release fails.

Fix:

git submodule update --init --recursive

P5: SQLLogicTest expected values must match exactly

Status: Test-authoring care required.

Symptom: Tests fail in CI but pass locally (or vice versa).

Root cause: SQLLogicTest does exact string matching. Output format (decimal places, NULL representation, column separators) must match character-for-character.

Fix: Generate expected values by running the SQL in DuckDB CLI and copying the output. NULL is NULL (uppercase). Integers have no decimal places.

P6: `duckdb_register_aggregate_function_set` silently fails

Status: Builder returns Err. Also see L6.

Symptom: Function appears registered but is not found in SQL.

Root cause: The return value of duckdb_register_aggregate_function_set is often ignored. When it returns DuckDBError, the function set is not registered.

Fix: The builder checks the return value and propagates it as Err.

P7: `duckdb_string_t` format is undocumented

Status: Handled by VectorReader::read_str and DuckStringView.

Symptom: VARCHAR reading produces garbage, empty strings, or crashes.

Root cause: DuckDB stores strings in a 16-byte struct with two formats (inline ≤ 12 bytes, pointer > 12 bytes) that are not documented in libduckdb-sys.

Fix: Use VectorReader::read_str(row). See NULL Handling & Strings.

P8: `INTERVAL` struct layout is undocumented

Status: Handled by DuckInterval and read_interval_at.

Symptom: Interval calculations produce wrong results or crashes.

Root cause: DuckDB's INTERVAL is { months: i32, days: i32, micros: i64 } (16 bytes total). This is not documented in libduckdb-sys. Month conversion uses 1 month = 30 days (DuckDB's approximation).

Fix: Use VectorReader::read_interval(row) and DuckInterval. See INTERVAL Type.

P9: `loadable-extension` dispatch table uninitialised in `cargo test`

Status: Fixed. InMemoryDb::open() initialises the dispatch table automatically.

Symptom: All three InMemoryDb unit tests panic at runtime:

thread 'testing::in_memory_db::tests::in_memory_db_opens' panicked at
'DuckDB API not initialized or DuckDB feature omitted'

This failure appears only when running cargo test --features bundled-test. Regular cargo test (no feature) does not exercise this code path, so CI can miss it entirely.

Root cause: Cargo's feature-unification merges loadable-extension (from the main libduckdb-sys dependency) and bundled-full (pulled in by the duckdb crate's features = ["bundled"]) into a single libduckdb-sys build with both features active. In loadable-extension mode every DuckDB C API call is routed through an AtomicPtr<fn> dispatch table, which is normally populated at extension-load time when DuckDB calls duckdb_rs_extension_api_init. In cargo test, no DuckDB host process loads the extension, so the table stays uninitialised and every call panics.

Discovery: This was triggered by the crates.io release workflow (which runs --all-features) failing on macOS. Regular CI (--no-default-features, --all-targets) never compiled the bundled-test path, so the bug was hidden during development and code review.

Fix (implemented in quack-rs 0.6.0):

src/testing/bundled_api_init.cpp — a thin C++ shim that wraps DuckDB's internal CreateAPIv1() (from duckdb/main/capi/extension_api.hpp) as a C-linkage symbol:
```
#include "duckdb/main/capi/extension_api.hpp"
extern "C" duckdb_ext_api_v1 quack_rs_create_api_v1() {
    return CreateAPIv1();
}
```
build.rs — compiles the shim (via the cc crate) only when the bundled-test feature is active, locating the DuckDB headers from the libduckdb-sys build output directory.
InMemoryDb::open() — calls init_dispatch_table_once() before opening the connection. That function calls quack_rs_create_api_v1() once and feeds the result through duckdb_rs_extension_api_init, populating all 459 AtomicPtr slots in the dispatch table. A std::sync::Once guard makes it safe to call from any number of threads and test cases.
CI test-bundled job — runs cargo test --all-targets --features bundled-test on Linux, macOS, and Windows on every PR, so this class of failure is caught before release.

ABI compatibility note: DuckDB's duckdb_ext_api_v1 struct is defined identically in both the public duckdb_extension.h (used by libduckdb-sys bindgen) and the internal extension_api.hpp (used by CreateAPIv1()). Both include the DUCKDB_EXTENSION_API_VERSION_UNSTABLE fields. CreateAPIv1() sets all 459 fields. The Rust and C++ structs are produced from the same DuckDB release and therefore stay in sync.

Risk table (using DuckDB's internal C++ API):

Risk	Mitigation
`extension_api.hpp` is renamed or moved	`build.rs` fails with a clear compile error
`CreateAPIv1()` is renamed	Same — C++ compile error
`duckdb_ext_api_v1` gains new fields	`CreateAPIv1()` fills new fields too
`duckdb_ext_api_v1` field order changes	Both structs from same DuckDB release, stay in sync
`libduckdb-sys` drops `loadable-extension` dispatch	Problem disappears; `Once` guard becomes cheap no-op

Summary

Pitfall	SDK status	Your action
L1: combine config fields	Testable	Test with `AggregateTestHarness::combine`
L2: state double-free	Prevented	Use `FfiState::destroy_callback`
L3: panic across FFI	Prevented	Use `init_extension`, no `unwrap` in callbacks
L4: validity bitmap SEGFAULT	Prevented	Use `VectorWriter::set_null`
L5: bool UB	Prevented	Use `VectorReader::read_bool`
L6: function set name	Prevented	Use `AggregateFunctionSetBuilder`
L7: LogicalType leak	Prevented	Use `LogicalType` (RAII)
P1: lib name mismatch	Scaffold	Set `[lib] name` in `Cargo.toml`
P2: API version string	Constant	Use `DUCKDB_API_VERSION`
P3: unit tests insufficient	Documented	Write SQLLogicTest E2E tests
P4: submodule not initialized	Build-time	`git submodule update --init`
P5: SQLLogicTest exact match	Documented	Copy output from DuckDB CLI
P6: register set silent fail	Prevented	Builder returns `Err`
P7: VARCHAR format undocumented	Prevented	Use `VectorReader::read_str`
P8: INTERVAL layout undocumented	Prevented	Use `DuckInterval`
P9: dispatch table uninitialised	Fixed	`InMemoryDb::open()` initialises it via C++ shim

TypeId Reference

quack_rs::types::TypeId is an ergonomic enum of all DuckDB column types supported by the builder APIs. It wraps the DUCKDB_TYPE_* integer constants from libduckdb-sys and provides safe, named variants.

Full variant table

Variant	SQL name	libduckdb-sys constant	Notes
`TypeId::Boolean`	`BOOLEAN`	`DUCKDB_TYPE_BOOLEAN`	true/false stored as u8
`TypeId::TinyInt`	`TINYINT`	`DUCKDB_TYPE_TINYINT`	8-bit signed
`TypeId::SmallInt`	`SMALLINT`	`DUCKDB_TYPE_SMALLINT`	16-bit signed
`TypeId::Integer`	`INTEGER`	`DUCKDB_TYPE_INTEGER`	32-bit signed
`TypeId::BigInt`	`BIGINT`	`DUCKDB_TYPE_BIGINT`	64-bit signed
`TypeId::UTinyInt`	`UTINYINT`	`DUCKDB_TYPE_UTINYINT`	8-bit unsigned
`TypeId::USmallInt`	`USMALLINT`	`DUCKDB_TYPE_USMALLINT`	16-bit unsigned
`TypeId::UInteger`	`UINTEGER`	`DUCKDB_TYPE_UINTEGER`	32-bit unsigned
`TypeId::UBigInt`	`UBIGINT`	`DUCKDB_TYPE_UBIGINT`	64-bit unsigned
`TypeId::HugeInt`	`HUGEINT`	`DUCKDB_TYPE_HUGEINT`	128-bit signed
`TypeId::Float`	`FLOAT`	`DUCKDB_TYPE_FLOAT`	32-bit IEEE 754
`TypeId::Double`	`DOUBLE`	`DUCKDB_TYPE_DOUBLE`	64-bit IEEE 754
`TypeId::Timestamp`	`TIMESTAMP`	`DUCKDB_TYPE_TIMESTAMP`	µs since Unix epoch
`TypeId::TimestampTz`	`TIMESTAMPTZ`	`DUCKDB_TYPE_TIMESTAMP_TZ`	timezone-aware timestamp
`TypeId::Date`	`DATE`	`DUCKDB_TYPE_DATE`	days since epoch
`TypeId::Time`	`TIME`	`DUCKDB_TYPE_TIME`	µs since midnight
`TypeId::Interval`	`INTERVAL`	`DUCKDB_TYPE_INTERVAL`	months + days + µs
`TypeId::Varchar`	`VARCHAR`	`DUCKDB_TYPE_VARCHAR`	UTF-8 string
`TypeId::Blob`	`BLOB`	`DUCKDB_TYPE_BLOB`	binary data
`TypeId::Decimal`	`DECIMAL`	`DUCKDB_TYPE_DECIMAL`	fixed-point decimal
`TypeId::TimestampS`	`TIMESTAMP_S`	`DUCKDB_TYPE_TIMESTAMP_S`	seconds since epoch
`TypeId::TimestampMs`	`TIMESTAMP_MS`	`DUCKDB_TYPE_TIMESTAMP_MS`	milliseconds since epoch
`TypeId::TimestampNs`	`TIMESTAMP_NS`	`DUCKDB_TYPE_TIMESTAMP_NS`	nanoseconds since epoch
`TypeId::Enum`	`ENUM`	`DUCKDB_TYPE_ENUM`	enumeration type
`TypeId::List`	`LIST`	`DUCKDB_TYPE_LIST`	variable-length list
`TypeId::Struct`	`STRUCT`	`DUCKDB_TYPE_STRUCT`	named fields (row type)
`TypeId::Map`	`MAP`	`DUCKDB_TYPE_MAP`	key-value pairs
`TypeId::Uuid`	`UUID`	`DUCKDB_TYPE_UUID`	128-bit UUID
`TypeId::Union`	`UNION`	`DUCKDB_TYPE_UNION`	tagged union of types
`TypeId::Bit`	`BIT`	`DUCKDB_TYPE_BIT`	bitstring
`TypeId::TimeTz`	`TIMETZ`	`DUCKDB_TYPE_TIME_TZ`	timezone-aware time
`TypeId::UHugeInt`	`UHUGEINT`	`DUCKDB_TYPE_UHUGEINT`	128-bit unsigned
`TypeId::Array`	`ARRAY`	`DUCKDB_TYPE_ARRAY`	fixed-length array
`TypeId::TimeNs`	`TIME_NS`	`DUCKDB_TYPE_TIME_NS`	nanosecond-precision time (`duckdb-1-5`)
`TypeId::Any`	`ANY`	`DUCKDB_TYPE_ANY`	wildcard for function signatures (`duckdb-1-5`)
`TypeId::Varint`	`VARINT`	`DUCKDB_TYPE_BIGNUM`	variable-length integer (`duckdb-1-5`)
`TypeId::SqlNull`	`SQLNULL`	`DUCKDB_TYPE_SQLNULL`	explicit SQL NULL type (`duckdb-1-5`)
`TypeId::IntegerLiteral`	`INTEGER_LITERAL`	`DUCKDB_TYPE_INTEGER_LITERAL`	unresolved integer literal (`duckdb-1-5`)
`TypeId::StringLiteral`	`STRING_LITERAL`	`DUCKDB_TYPE_STRING_LITERAL`	unresolved string literal (`duckdb-1-5`)
`TypeId::Geometry`	`GEOMETRY`	`DUCKDB_TYPE_GEOMETRY`	spatial geometry value (`duckdb-1-5-3`)
`TypeId::Variant`	`VARIANT`	`DUCKDB_TYPE_VARIANT`	self-describing nested value, e.g. Iceberg v3 (`duckdb-1-5-3`)

Feature gate for Geometry / Variant: DUCKDB_TYPE_GEOMETRY (40) and DUCKDB_TYPE_VARIANT (41) require the duckdb-1-5-3 feature, which layers on top of duckdb-1-5 and needs libduckdb-sys >= 1.10503.1 (DuckDB 1.5.3). They sit behind a separate feature because these type-enum values postdate the duckdb-1-5 feature's 1.5.0 floor (VARIANT was added in DuckDB 1.5.3); gating them this way avoids breaking consumers pinned to libduckdb-sys 1.5.0–1.5.2. See Known Limitations.

Methods

`to_duckdb_type() → DUCKDB_TYPE`

Converts to the raw C API integer constant. Used internally by the builder APIs.

#![allow(unused)]
fn main() {
use quack_rs::types::TypeId;

let raw: libduckdb_sys::DUCKDB_TYPE = TypeId::BigInt.to_duckdb_type();
}

`from_duckdb_type(raw) → TypeId`

Converts a raw DUCKDB_TYPE constant back into a TypeId. Recognizes every variant available in the active feature set, including the duckdb-1-5 values (TIME_NS, ANY, VARINT, SQLNULL, INTEGER_LITERAL, STRING_LITERAL) and the duckdb-1-5-3 values (GEOMETRY, VARIANT) when those features are enabled. Panics if the value does not correspond to any variant available in the current feature configuration.

#![allow(unused)]
fn main() {
use quack_rs::types::TypeId;

let type_id = TypeId::from_duckdb_type(libduckdb_sys::DUCKDB_TYPE_DUCKDB_TYPE_BIGINT);
assert_eq!(type_id, TypeId::BigInt);
}

`sql_name() → &'static str`

Returns the SQL type name as a static string.

#![allow(unused)]
fn main() {
assert_eq!(TypeId::BigInt.sql_name(), "BIGINT");
assert_eq!(TypeId::Varchar.sql_name(), "VARCHAR");
assert_eq!(TypeId::TimestampTz.sql_name(), "TIMESTAMPTZ");
}

`Display`

TypeId implements Display, which outputs the SQL name:

#![allow(unused)]
fn main() {
println!("{}", TypeId::Interval);  // prints: INTERVAL
let s = format!("{}", TypeId::UBigInt); // "UBIGINT"
}

VectorReader/VectorWriter mapping

The read and write methods on VectorReader/VectorWriter map to TypeId variants as follows:

TypeId	Read method	Write method	Rust type
`Boolean`	`read_bool`	`write_bool`	`bool`
`TinyInt`	`read_i8`	`write_i8`	`i8`
`SmallInt`	`read_i16`	`write_i16`	`i16`
`Integer`	`read_i32`	`write_i32`	`i32`
`BigInt`	`read_i64`	`write_i64`	`i64`
`UTinyInt`	`read_u8`	`write_u8`	`u8`
`USmallInt`	`read_u16`	`write_u16`	`u16`
`UInteger`	`read_u32`	`write_u32`	`u32`
`UBigInt`	`read_u64`	`write_u64`	`u64`
`Float`	`read_f32`	`write_f32`	`f32`
`Double`	`read_f64`	`write_f64`	`f64`
`Varchar`	`read_str`	`write_varchar`	`&str`
`Interval`	`read_interval`	`write_interval`	`DuckInterval`

HugeInt, Blob, List, Struct, Map, Uuid, Date, Time, Timestamp, TimestampTz, Decimal, TimestampS, TimestampMs, TimestampNs, Enum, Union, Bit, TimeTz, UHugeInt, Array, TimeNs, Any, Varint, SqlNull, IntegerLiteral, StringLiteral, Geometry, Variant do not yet have dedicated read/write helpers. Access these via the raw data pointer from duckdb_vector_get_data.

Properties

TypeId implements Debug, Clone, Copy, PartialEq, Eq, and Hash, making it usable as map keys, set elements, and in match expressions:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use quack_rs::types::TypeId;

let mut type_names: HashMap<TypeId, &str> = HashMap::new();
type_names.insert(TypeId::BigInt, "count");
type_names.insert(TypeId::Varchar, "label");
}

`#[non_exhaustive]`

TypeId is marked #[non_exhaustive]. This means future DuckDB versions may add new variants without it being a breaking change. If you match on TypeId, include a wildcard arm:

#![allow(unused)]
fn main() {
match type_id {
    TypeId::BigInt => { /* ... */ }
    TypeId::Varchar => { /* ... */ }
    _ => { /* handle future types */ }
}
}

`LogicalType`

For types that require runtime parameters (such as DECIMAL(p, s) or parameterized LIST), use quack_rs::types::LogicalType:

#![allow(unused)]
fn main() {
use quack_rs::types::{LogicalType, TypeId};

let lt = LogicalType::new(TypeId::BigInt);
// or use the From impl:
let lt: LogicalType = TypeId::BigInt.into();
// LogicalType implements Drop → calls duckdb_destroy_logical_type automatically
}

LogicalType wraps duckdb_logical_type with RAII cleanup, preventing the memory leak described in Pitfall L7.

Constructors

Constructor	Creates
`new(type_id)`	Simple type from a `TypeId`
`from_raw(ptr)`	Takes ownership of a raw handle (unsafe)
`decimal(width, scale)`	`DECIMAL(width, scale)`
`list(element_type)`	`LIST<T>` from a `TypeId`
`list_from_logical(element)`	`LIST<T>` from an existing `LogicalType`
`map(key, value)`	`MAP<K, V>` from `TypeId`s
`map_from_logical(key, value)`	`MAP<K, V>` from existing `LogicalType`s
`struct_type(fields)`	`STRUCT` from `&[(&str, TypeId)]`
`struct_type_from_logical(fields)`	`STRUCT` from `&[(&str, LogicalType)]`
`union_type(members)`	`UNION` from `&[(&str, TypeId)]`
`union_type_from_logical(members)`	`UNION` from `&[(&str, LogicalType)]`
`enum_type(members)`	`ENUM` from `&[&str]`
`array(element_type, size)`	`ARRAY<T>[size]` from a `TypeId`
`array_from_logical(element, size)`	`ARRAY<T>[size]` from an existing `LogicalType`

Introspection methods

All introspection methods are unsafe (require a valid DuckDB runtime handle):

get_type_id, get_alias, set_alias, decimal_width, decimal_scale, decimal_internal_type, enum_internal_type, enum_dictionary_size, enum_dictionary_value, list_child_type, map_key_type, map_value_type, struct_child_count, struct_child_name, struct_child_type, union_member_count, union_member_name, union_member_type, array_size, array_child_type.

See Type System for the full introspection table.

Known Limitations

Window functions are not available

DuckDB window functions (OVER (...) clauses) are implemented entirely in DuckDB's C++ layer and have no counterpart in the public C extension API.

This is not a gap in quack-rs or in libduckdb-sys — the relevant symbol (duckdb_create_window_function) simply does not exist in the C API:

Symbol	C API (1.4.x)?	C API (1.5.0+)?	C++ API?
`duckdb_create_window_function`	No	No	Yes
`duckdb_create_copy_function`	No	Yes	Yes
`duckdb_create_scalar_function`	Yes	Yes	Yes
`duckdb_create_aggregate_function`	Yes	Yes	Yes
`duckdb_create_table_function`	Yes	Yes	Yes
`duckdb_create_cast_function`	Yes	Yes	Yes

What this means for your extension:

If your extension needs window-function semantics, you can approximate them with aggregate functions in most cases (DuckDB will push down the window logic). True custom window operator registration requires writing a C++ extension.

If DuckDB exposes window registration in a future C API version, quack-rs will add wrappers in the corresponding release.

COPY functions (resolved in DuckDB 1.5.0)

DuckDB 1.5.0 added duckdb_create_copy_function and related symbols to the public C extension API. quack-rs wraps these in the copy_function module behind the duckdb-1-5 feature flag. See CopyFunctionBuilder for usage.

This was previously listed as a known limitation (no C API counterpart prior to 1.5.0).

Callback accessor wrappers (resolved)

quack-rs now wraps all major callback accessor functions — the C API functions used inside your callbacks to retrieve arguments, set errors, access bind data, etc.

Category	Wrapper type	Available
Scalar function execution	`ScalarFunctionInfo`	Always
Scalar function bind	`ScalarBindInfo`	`duckdb-1-5`
Scalar function init	`ScalarInitInfo`	`duckdb-1-5`
Aggregate function callbacks	`AggregateFunctionInfo`	Always
Table function bind	`BindInfo`	Always
Table function init	`InitInfo`	Always
Table function scan	`FunctionInfo`	Always
Cast function callbacks	`CastFunctionInfo`	Always
Copy function bind	`CopyBindInfo`	`duckdb-1-5`
Copy function global init	`CopyGlobalInitInfo`	`duckdb-1-5`
Copy function sink	`CopySinkInfo`	`duckdb-1-5`
Copy function finalize	`CopyFinalizeInfo`	`duckdb-1-5`

All callback accessor functions are now wrapped, including get_client_context on all callback types (returns a [ClientContext][crate::client_context::ClientContext]).

Complex type creation (resolved)

LogicalType now provides constructors for all complex parameterized types:

Method	Type created
`LogicalType::decimal(width, scale)`	`DECIMAL(p, s)`
`LogicalType::enum_type(members)`	`ENUM('a', 'b', ...)`
`LogicalType::array(child, size)`	`type[N]`
`LogicalType::union_type(members)`	`UNION(a INT, b VARCHAR)`
`LogicalType::list(child)`	`LIST(type)`
`LogicalType::struct_type(fields)`	`STRUCT(...)`
`LogicalType::map(key, value)`	`MAP(K, V)`

All constructors have _from_logical variants for nested complex types. Introspection methods (get_type_id, list_child_type, struct_child_count, decimal_width, etc.) are also available.

VARIANT and GEOMETRY types (resolved — exposed behind `duckdb-1-5-3`)

DuckDB v1.5.1 introduced the VARIANT type for Iceberg v3 support. As of DuckDB 1.5.3 it is present in the C type enum as DUCKDB_TYPE_VARIANT (41), and the GEOMETRY type (DUCKDB_TYPE_GEOMETRY, 40) is present as well.

quack-rs exposes these as TypeId::Variant and TypeId::Geometry, gated behind the duckdb-1-5-3 feature. That feature layers on top of duckdb-1-5 and requires libduckdb-sys >= 1.10503.1 (DuckDB 1.5.3). The separate gate exists because these type-enum values postdate the duckdb-1-5 feature's 1.5.0 floor (VARIANT only landed in 1.5.3); keeping them out of duckdb-1-5 preserves compatibility for consumers pinned to libduckdb-sys 1.5.0–1.5.2.

[dependencies]
quack-rs = { version = "0.13", features = ["duckdb-1-5-3"] }

Neither type yet has dedicated VectorReader/VectorWriter helpers; access their data via the raw pointer from duckdb_vector_get_data when needed.

Changelog

All notable changes to quack-rs, mirrored from CHANGELOG.md.

The format follows Keep a Changelog. quack-rs adheres to Semantic Versioning.

Unreleased

[0.13.0] — 2026-05-24

Added

New safe wrappers for the DuckDB 1.5.0+ C extension API, all gated behind the duckdb-1-5 feature, plus a new duckdb-1-5-3 feature that surfaces the two DuckDB 1.5.3 type-enum values. DuckDB 1.5.3's C extension function-pointer API (version v1.2.0) is unchanged from 1.5.2; the one new C addition — the DUCKDB_TYPE_VARIANT (41) type-enum value — is now exposed as TypeId::Variant behind the duckdb-1-5-3 feature (see below). So the additions below mostly expose 1.5.x capabilities the SDK had not previously wrapped rather than anything new to 1.5.3 specifically.

error_data module — ErrorData, an RAII wrapper over duckdb_error_data (the structured error type returned by several 1.5 APIs). Carries a DuckDbErrorType category and a message, and converts into ExtensionError. Adds the free function check_valid_utf8, exposing DuckDB's own UTF-8 validator.
expression module — Expression, an RAII wrapper over duckdb_expression, with return_type, is_foldable, and fold. This closes a real gap: ScalarBindInfo already returned a raw, unusable duckdb_expression from get_argument; the new ScalarBindInfo::argument returns a safe Expression, so bind callbacks can inspect argument types and pre-fold constant arguments once at bind time.
file_system module — FileSystem, FileHandle, FileOpenOptions, and FileFlag: read and write files through DuckDB's virtual file system (honouring httpfs, in-memory files, and other registered file systems) instead of reaching for std::fs.
appender module — Appender: bulk row insertion (create, append a DataChunk, flush, close) plus the 1.5 additions clear (revert buffered rows), error_data (structured errors), and append_default_to_chunk.
selection_vector module — SelectionVector: allocate and fill zero-copy row-index selection vectors.
instance_cache module — InstanceCache: share one underlying database instance across repeated opens of the same path.
Value gains display_string (canonical string rendering of any value, via duckdb_value_to_string) and TIME_NS accessors Value::time_ns / Value::as_time_ns (pairing with the existing TypeId::TimeNs).
Catalog gains type_name (the catalog's storage type, e.g. "duckdb" or a storage extension's name).
All new public types are re-exported from the prelude behind the duckdb-1-5 feature.
duckdb-1-5-3 feature + TypeId::Variant / TypeId::Geometry — a new feature flag (duckdb-1-5-3, which implies duckdb-1-5) exposes the DUCKDB_TYPE_VARIANT (41, added in DuckDB 1.5.3) and DUCKDB_TYPE_GEOMETRY (40) type-enum values as TypeId::Variant and TypeId::Geometry, with full to_duckdb_type / from_duckdb_type / sql_name / Display coverage. It is a separate gate because these constants postdate the duckdb-1-5 feature's 1.5.0 floor and require libduckdb-sys >= 1.10503.1; keeping them out of duckdb-1-5 preserves compatibility for consumers pinned to libduckdb-sys 1.5.0–1.5.2.
ErrorData is now a first-class error type — implements std::fmt::Display and std::error::Error, gains a structured Debug impl, and converts into ExtensionError via From (alongside the existing into_extension_error) so it propagates through ?. DuckDbErrorType now implements Display (backed by a new pub const fn as_str).
TableDescription::as_raw() — exposes the raw handle, matching the accessor convention of the other 1.5 wrappers.

Changed

duckdb / libduckdb-sys 1.10502.0 → 1.10503.1 (DuckDB 1.5.2 → 1.5.3) in both the workspace and examples/hello-ext Cargo.lock. DuckDB 1.5.3 is a bugfix release (announcement); since the >=1.4.4, <2 constraint already permitted it, the bundled fixes are picked up purely by the lock-file update with no source changes required for the bump itself.
cc → 1.2.62 in both Cargo.lock files — workspace (1.2.61 → 1.2.62, folding in Dependabot PR #89, the patch-updates group) and examples/hello-ext (1.2.57 → 1.2.62, re-syncing the example lock's older cc). Build-dependency; no API impact.
MSRV corrected to 1.87.0. The crate declared rust-version = "1.84.1", but libduckdb-sys (1.5.x line, a non-optional dependency) is edition = "2024" / rust-version = "1.85.1" — so quack-rs has in fact required Rust ≥ 1.85.1 since before this release (cargo +1.84.1 check cannot even parse the manifest). The declared MSRV, the CI MSRV job (now explicitly pinned with toolchain: "1.87.0" so it genuinely gates instead of silently falling back to the rust-toolchain.toml stable channel), the release matrix, and all docs/badges are updated to 1.87.0 — a small headroom margin above the 1.85.1 floor.

Fixed

TypeId::from_duckdb_type no longer panics on the duckdb-1-5 type-enum values. It previously recognised only the base (1.4) values and panic!ed on everything else — including the duckdb-1-5 values (TIME_NS, ANY, BIGNUM/VARINT, SQLNULL, INTEGER_LITERAL, STRING_LITERAL). Because the public LogicalType::get_type_id() calls it, inspecting such a type inside a bind callback could panic across the FFI boundary (Pitfall L3). It now maps every variant available in the active feature set (plus the duckdb-1-5-3 GEOMETRY / VARIANT values when that feature is enabled).
TableDescription's Drop now null-checks the handle before destroying it, matching every other RAII wrapper in the crate.

Documentation

New book section "DuckDB 1.5+ APIs" — dedicated guide pages for the error_data, expression, appender, file_system, selection_vector, and instance_cache modules, wired into SUMMARY.md.
Refreshed the reference docs (docs/architecture.md, docs/ffi-reference.md, the TypeId reference, CONTRIBUTING.md/book source trees) to cover the new modules, and updated the VARIANT/GEOMETRY entries in Known Limitations, concepts/types.md, and the TypeId reference to document the new duckdb-1-5-3 gate (previously tracked as a follow-up).
Added // SAFETY: comments to previously-undocumented unsafe blocks in the get_client_context accessors (scalar, copy_function) and TableDescription::create, and SPDX headers to benches/interval_bench.rs and the test submodule files.
Corrected the README install note (it claimed v0.11.0 was the latest published crate; v0.12.1 was in fact already on crates.io) and bumped install-example version references throughout the README, book, and scaffold template to 0.13.

CI

docs.rs now builds with duckdb-1-5-3 ([package.metadata.docs.rs]), so the feature-gated modules and new TypeId variants render on docs.rs and the README's docs.rs links resolve (previously docs.rs built the empty default feature set and omitted them).
CI exercises the duckdb-1-5-3 feature — check / test / clippy for duckdb-1-5-3 alongside duckdb-1-5, with the Clippy (beta) and doc jobs on duckdb-1-5-3.
Fixed the Nightly CI job silently running stable (the SHA-pinned dtolnay/rust-toolchain step lacked with: toolchain: nightly).
Mutation testing scoped to testable code — DuckDB FFI-wrapper modules whose methods require a live runtime (tests bundled-test-gated or absent) are excluded from cargo mutants, since their mutants can't be killed by unit tests. Extends the existing exclusion pattern to the 1.5.x wrappers (expression, file_system, appender, selection_vector, instance_cache, table_description, and the scalar/copy *Info accessors). Pure-logic code (e.g. DuckDbErrorType, TypeId conversions) stays in scope; the mutants feature set is bumped to duckdb-1-5-3.

0.12.1 — 2026-05-01

Security

Closes nine GitHub Dependabot alerts (two High, seven Low) split across the workspace Cargo.lock and examples/hello-ext/Cargo.lock.

rustls-webpki 0.103.10 → 0.103.13 picks up fixes for three RustSec advisories reachable via the bundled DuckDB build's transitive reqwest → rustls chain: RUSTSEC-2026-0098 (URI name constraints silently ignored), RUSTSEC-2026-0103 (wildcard name constraints accepted), RUSTSEC-2026-0104 (DoS panic on malformed CRL BIT STRING). None of these paths are exercised by quack-rs itself, but the advisories trip cargo deny for downstream consumers, so the patch bump removes friction.
rand 0.9.2 → 0.9.4 / 0.8.5 → 0.8.6 picks up the fix for RUSTSEC-2026-0097 (ThreadRng Stacked-Borrows UB when a custom global logger reentered rand::rng() during reseed). Patched on every line: 0.8.6+, 0.9.3+, 0.10.1+.

Changed

Workspace lockfile: cc 1.2.59 → 1.2.61 (build-dep), duckdb / libduckdb-sys 1.10501.0 → 1.10502.0, rand 0.8.5 → 0.8.6, rand 0.9.2 → 0.9.4.
examples/hello-ext lockfile: libduckdb-sys 1.10501.0 → 1.10502.0, rand 0.9.2 → 0.9.4, rustls-webpki 0.103.10 → 0.103.13.

CI

GitHub Actions pin updates: actions/cache v5.0.4 → v5.0.5, actions/upload-artifact v7.0.0 → v7.0.1, actions/upload-pages-artifact v4.0.0 → v5.0.0 (all SHA-pinned).
New informational Clippy (beta) job runs the same clippy invocation on the beta toolchain (continue-on-error), so lint promotions surface ~6 weeks before they reach stable.

Fixed

WarningCollector::len: rewrite map(|w| w.len()).unwrap_or(0) as map_or(0, |w| w.len()) to satisfy clippy::map_unwrap_or, which graduated to stable clippy in Rust 1.95.0.
WarningCollector::snapshot: same defensive rewrite for the sibling map(|w| w.clone()).unwrap_or_default() call site.

0.12.0 — 2026-04-09

Added

TypedTableFunctionBuilder<S> — closure-based table functions with typed scan state
- Entry point: TableFunctionBuilder::with_state::<S, _>(|bind| Ok(S { ... })).scan(|state, chunk| { ... Ok(()) }).build()?
- bind closure: &BindInfo -> Result<S, ExtensionError> — declares output schema, reads parameters, returns the initial scan state
- scan closure: &mut S, &DataChunk -> Result<(), ExtensionError> — writes rows; set chunk size to zero to signal end-of-stream
- Eliminates hand-rolled unsafe extern "C" fn bind/init/scan trampolines in FFI-heavy extensions
- Panics in user closures are caught via catch_unwind and reported through duckdb_*_set_error
- S: Send + 'static; scans are serialised (set_max_threads(1)) — use the raw builder + local_init for parallel scans
- Re-exported from quack_rs::prelude
ExtensionError ergonomics — From<std::io::Error>, From<std::ffi::NulError>, From<std::fmt::Error> for direct ? operator usage in register_all()
tls module — TlsConfigProvider trait for type-erased TLS client configuration injection (no external deps)
warning module — ExtensionWarning, WarningSeverity, WarningCollector for structured security warnings with CWE codes
secrets module — SecretsManager trait and SecretEntry for bridging DuckDB's native CREATE SECRET storage
StructWriter::child_list_vector() — semantic alias for LIST-typed struct fields
Prelude additions — TlsConfigProvider, ExtensionWarning, WarningSeverity, WarningCollector, SecretEntry, SecretsManager

0.11.0 — 2026-03-30

Added

StructWriter::child_vector() / StructReader::child_vector() — raw child vector access for nested complex types (LIST, MAP, ARRAY) inside STRUCT fields
ChunkWriter::vector() — raw vector access for complex column types
ChunkWriter::column_count() — column count without needing DataChunk
VectorWriter::set_valid() / StructWriter::set_valid() — undo set_null(), mark row as non-NULL
ReplacementScanInfo::add_parameter_raw() — non-VARCHAR replacement scan parameters
ReplacementScanInfo::add_i64_parameter() / add_bool_parameter() — typed convenience methods

Changed

table_scan_callback! now reports panic messages to DuckDB via duckdb_function_set_error (previously silent)

0.10.0 — 2026-03-29

Added

StructWriter — batched typed writer for STRUCT output vectors; eliminates repeated duckdb_struct_vector_get_child calls
StructReader — batched typed reader for STRUCT input vectors; read-side counterpart to StructWriter
ChunkWriter — auto-sizing chunk writer for scan callbacks; calls set_size on Drop
scalar_callback! / table_scan_callback! macros — panic-safe extern "C" callback wrappers using catch_unwind
Value integer extraction — as_i8(), as_i16(), as_u8(), as_u16(), as_u32(), as_u64(), as_i128() + null-safe _or(default) variants for all types
Temporal/binary vector methods — read_date/write_date, read_timestamp/write_timestamp, read_time/write_time, read_blob/write_blob, read_uuid/write_uuid on VectorReader/VectorWriter/StructReader/StructWriter
DataChunk bridges — struct_writer(), struct_reader(), struct_field_reader(), into_chunk_writer()
Mock type completeness — 8 missing try_get_* methods, 10 missing from_* constructors, Blob variant, uuid/date/timestamp/time aliases
Prelude — StructReader, StructWriter, ChunkWriter re-exported

Changed

TableDescription::column_type() returns Option<LogicalType> (RAII) instead of raw handle
Version references updated to "0.10"

Fixed

13 expect() calls in FFI callback contexts replaced with non-panicking str_to_cstring()
9 non-idiomatic &mut { expr } patterns replaced with &raw mut

0.9.0 — 2026-03-29

Added

Value RAII wrapper — owned wrapper around duckdb_value with as_str(), as_i64(), as_i32(), as_f64(), as_f32(), as_bool() and automatic Drop cleanup
DataChunk wrapper — ergonomic wrapper around duckdb_data_chunk with reader(col), writer(col), size(), set_size(n), column_count(), vector(col)
VectorWriter::write_str() — alias for write_varchar for discoverability
BindInfo::get_parameter_value() / get_named_parameter_value() — return owned Value instead of raw duckdb_value
MapVector reader/writer helpers — key_writer(), value_writer(), key_reader(), value_reader()
MockVectorWriter::write_str() — alias matching VectorWriter API
Prelude additions — Value, DataChunk, ValidityBitmap

Changed

Version references updated across all docs to "0.9"

0.8.0 — 2026-03-28

Added

LogicalType::from_raw(ptr) — construct from raw handle
Complex type constructors — decimal, array, array_from_logical, union_type, union_type_from_logical, enum_type
_from_logical variants — struct_type_from_logical, list_from_logical, map_from_logical for nested complex types
20 introspection methods on LogicalType — get_type_id, get_alias, set_alias, decimal/enum/list/map/struct/union/array child access
TypeId::from_duckdb_type() — reverse conversion from raw C enum
extra_info on ScalarFunctionBuilder, ScalarOverloadBuilder, AggregateFunctionBuilder
param_logical / named_param_logical on TableFunctionBuilder
CastFunctionBuilder::new_logical() for complex source/target types
Callback info wrappers — ScalarFunctionInfo, ScalarBindInfo (duckdb-1-5), ScalarInitInfo (duckdb-1-5), AggregateFunctionInfo, CopyBindInfo (duckdb-1-5), CopyGlobalInitInfo (duckdb-1-5), CopySinkInfo (duckdb-1-5), CopyFinalizeInfo (duckdb-1-5)
get_client_context() on all callback info types
BindInfo — get_parameter, get_named_parameter, get_extra_info, get_client_context
InitInfo / FunctionInfo — get_extra_info
ArrayVector helper with get_child()
vector_size() and vector_get_column_type() utilities
Prelude — StructVector, ListVector, MapVector, ArrayVector, ScalarFunctionInfo, AggregateFunctionInfo

Changed

Breaking: CastFunctionBuilder::source() / target() return Option<TypeId> (was TypeId)
Breaking: CastRecord::source / target fields changed to Option<TypeId>

0.7.1 — 2026-03-27

Added

TypeId::Any — wildcard type for function overload resolution (duckdb-1-5)
TypeId::Varint — variable-length arbitrary-precision integer (duckdb-1-5)
TypeId::SqlNull — explicit SQL NULL type for bare NULL literals (duckdb-1-5)
TypeId::IntegerLiteral — integer literal type for overload resolution (duckdb-1-5)
TypeId::StringLiteral — string literal type for overload resolution (duckdb-1-5)
MockVectorReader/MockVectorWriter tests — 12 new tests for untested constructors and getters
DuckDB v1.5.1 evaluation — see docs/duckdb-v1.5.1-evaluation.md

Fixed

ARM64 / aarch64 build — use c_char instead of i8 for cross-platform pointer casts

Changed

DuckDB v1.5.1 compatibility — documentation updated to explicitly cover v1.5.1. C API version unchanged (v1.2.0). Recommend upgrading DuckDB runtime for WAL corruption and ART index fixes.

0.7.0 — 2026-03-22

Added

duckdb-1-5 feature modules — the duckdb-1-5 feature flag is no longer a placeholder. When enabled, it gates five new modules wrapping DuckDB 1.5.0 C Extension API additions:
- catalog — catalog entry lookup (CatalogEntry, Catalog, CatalogEntryType)
- client_context — client context access (ClientContext) for retrieving catalogs, config options, and connection IDs from within registered function callbacks
- config_option — extension-defined configuration options (ConfigOptionBuilder, ConfigOptionScope) registered via SET/RESET/current_setting()
- copy_function — custom COPY TO handlers (CopyFunctionBuilder) with bind → global init → sink → finalize lifecycle
- table_description — table metadata queries (TableDescription) for column count, names, and logical types
TypeId::TimeNs — new TIME_NS column type variant for nanosecond- precision time of day (DuckDB 1.5.0+, requires duckdb-1-5 feature)
ScalarFunctionBuilder::varargs() / varargs_logical() — mark a scalar function as accepting variadic arguments (requires duckdb-1-5)
ScalarFunctionBuilder::volatile() — mark a scalar function as volatile (re-evaluated for every row even with constant arguments, requires duckdb-1-5)
ScalarFunctionBuilder::bind() — set a bind callback invoked once during query planning for per-query state allocation (requires duckdb-1-5)
ScalarFunctionBuilder::init() — set an init callback invoked once per thread for per-thread local state allocation (requires duckdb-1-5)

Changed

DuckDB 1.5.0 support — upgraded default libduckdb-sys from 1.4.4 to 1.10500.0 (DuckDB 1.5.0) and duckdb from 1.4.4 to 1.10500.0. The version range ">=1.4.4, <2" in Cargo.toml is unchanged, preserving backward compatibility with DuckDB 1.4.x.
CI action updates — Swatinem/rust-cache v2.8.2→v2.9.1, actions/download-artifact v8.0.0→v8.0.1, actions/cache 5.0.3→5.0.4, codecov/codecov-action 5.4.3→5.5.3.

Fixed

COPY format handlers — previously listed as a known limitation (no C API counterpart). DuckDB 1.5.0 adds duckdb_create_copy_function and related symbols; the new copy_function module wraps them behind duckdb-1-5.

0.6.0 — 2026-03-12

Added

InMemoryDb dispatch table initialisation — InMemoryDb::open() now correctly initialises the loadable-extension dispatch table from bundled DuckDB symbols before opening a connection. Previously, every call panicked with "DuckDB API not initialized" when the bundled-test feature was enabled in cargo test. See Pitfall P9 for the full technical analysis.
src/testing/bundled_api_init.cpp — thin C++ shim exposing DuckDB's internal CreateAPIv1() as a C-linkage symbol, compiled at build time via the cc crate. Populates all 459 AtomicPtr dispatch table slots with real bundled DuckDB function pointers.
build.rs — Cargo build script that locates the libduckdb-sys include path and compiles the C++ shim when the bundled-test feature is active.
CI: test-bundled job — new CI job runs cargo test --all-targets --features bundled-test on Linux, macOS, and Windows on every PR, closing the gap that allowed this failure to reach the release workflow undetected.
Pitfall P9 documented — full analysis in LESSONS.md and the Pitfall Catalog: root cause, CreateAPIv1() solution, ABI compatibility details, risks of the internal C++ API, and a mitigation table.

Fixed

InMemoryDb::open() no longer panics under cargo test --features bundled-test. This was broken from the initial 0.5.1 release.

Changed

bundled-test feature documentation updated to describe dispatch table initialisation accurately.

0.5.1 — 2026-03-12

Added

Testing primitives (quack_rs::testing) — MockVectorWriter, MockVectorReader, MockDuckValue, MockRegistrar, CastRecord.
bundled-test Cargo feature — enables InMemoryDb for SQL-level assertions in cargo test. (Note: InMemoryDb::open() was broken in this release and fixed in 0.6.0.)
InMemoryDb — wraps duckdb::Connection for SQL-level integration tests; available behind the bundled-test feature.
Builder introspection accessors — name() on all function builders; source()/target() on CastFunctionBuilder.

Security

Bump quinn-proto 0.11.13 → 0.11.14 (addresses RUSTSEC advisory).

0.5.0 — 2026-03-10

Added

param_logical(LogicalType) on all builders — register parameters with complex parameterized types (LIST(BIGINT), MAP(VARCHAR, INTEGER), STRUCT(...)) that TypeId alone cannot express. Available on AggregateFunctionBuilder, AggregateFunctionSetBuilder::OverloadBuilder, ScalarFunctionBuilder, and ScalarOverloadBuilder. Parameters added via param() and param_logical() are interleaved by position, so the order you call them is the order DuckDB sees them.
returns_logical(LogicalType) on all builders — set a complex parameterized return type. When both returns(TypeId) and returns_logical(LogicalType) are called, the logical type takes precedence. Available on AggregateFunctionBuilder, AggregateFunctionSetBuilder, ScalarFunctionBuilder, and ScalarOverloadBuilder. This eliminates the need for raw FFI when returning LIST(BOOLEAN), LIST(TIMESTAMP), MAP(K, V), or any other parameterized type.
null_handling(NullHandling) on set overload builders — per-overload NULL handling configuration for AggregateFunctionSetBuilder::OverloadBuilder and ScalarOverloadBuilder. Previously only available on single-function builders.

Notes

Upstream fix: duckdb-loadable-macros panic-at-FFI-boundary — the safe entry-point pattern developed in quack-rs (using ? / ok_or_else throughout instead of .unwrap()) was contributed upstream as duckdb/duckdb-rs#696 and merged 2026-03-09. All users of the duckdb_entrypoint_c_api! macro from duckdb-loadable-macros will receive this fix in the next duckdb-rs release. quack-rs users have always been protected via the safe entry_point! / entry_point_v2! macros provided by this crate.

0.4.0 — 2026-03-09

Added

Connection and Registrar trait — version-agnostic extension registration facade. Connection wraps the duckdb_connection and duckdb_database handles provided at initialization time. The Registrar trait provides uniform methods for registering all extension components (scalar, scalar set, aggregate, aggregate set, table, SQL macro, cast), making registration code interchangeable across DuckDB 1.4.x and 1.5.x.
init_extension_v2 — new entry point helper that passes &Connection to the registration callback instead of a raw duckdb_connection. Prefer this over init_extension for new extensions.
entry_point_v2! macro — companion macro to entry_point! that generates the #[no_mangle] unsafe extern "C" entry point using init_extension_v2.
duckdb-1-5 cargo feature — placeholder feature flag for DuckDB 1.5.0-specific C API wrappers. Currently empty; will be populated when libduckdb-sys 1.5.0 is published on crates.io.

Changed

DuckDB version support broadened to 1.4.x and 1.5.x — the libduckdb-sys dependency requirement was relaxed from an exact pin (=1.4.4) to a range (>=1.4.4, <2). DuckDB v1.5.0 does not change the C API version string (v1.2.0); the existing DUCKDB_API_VERSION constant remains correct for both releases. Extension authors can pin their own libduckdb-sys to either =1.4.4 or =1.5.0 and resolve cleanly against quack-rs. The scaffold template and CI workflow template were updated to default to DuckDB v1.5.0.

0.3.0 — 2026-03-08

Added

TableFunctionBuilder — type-safe builder for registering DuckDB table functions (SELECT * FROM my_function(args)). Covers the full bind/init/scan lifecycle with ergonomic callbacks; BindInfo, FfiBindData<T>, and FfiInitData<T> eliminate all raw pointer manipulation. Verified end-to-end against DuckDB 1.4.4. See Table Functions.
ReplacementScanBuilder — builder for registering DuckDB replacement scans (SELECT * FROM 'file.xyz' patterns). 4-method chain handles callback registration, path extraction, and bind-info population. See Replacement Scans.
StructVector, ListVector, MapVector — safe wrappers for reading and writing nested-type vectors. Eliminate manual offset arithmetic and raw pointer casts over child vector handles. Re-exported from quack_rs::vector::complex. See Complex Types.
CastFunctionBuilder — type-safe builder for registering custom type cast functions. Covers explicit CAST(x AS T) and implicit coercions (optional implicit_cost). CastFunctionInfo exposes cast_mode(), set_error(), and set_row_error() inside callbacks for correct TRY_CAST / CAST error handling. See Cast Functions.
DbConfig — RAII wrapper for duckdb_config. Builder-style .set(name, value)? chain with automatic duckdb_destroy_config on drop and flag_count() / get_flag(index) for enumerating all available options. See quack_rs::config.
ScalarFunctionSetBuilder — builder for registering scalar function overload sets, mirroring AggregateFunctionSetBuilder.
NullHandling enum and .null_handling() builder method — configurable NULL propagation for scalar and aggregate functions.
TypeId variants — Decimal, Struct, Map, UHugeInt, TimeTz, TimestampS, TimestampMs, TimestampNs, Array, Enum, Union, Bit.
From<TypeId> for LogicalType — idiomatic conversion from TypeId.
#[must_use] on builder structs — compile-time warning if a builder is constructed but never consumed.
VectorWriter::write_interval — writes INTERVAL values to output vectors.
append_metadata binary — native Rust replacement for the Python metadata script. Install with cargo install quack-rs --bin append_metadata.
hello-ext cast demo — the example extension now registers CAST(VARCHAR AS INTEGER) and TRY_CAST(VARCHAR AS INTEGER) using CastFunctionBuilder, demonstrating both error modes with five unit tests.
prelude additions — TableFunctionBuilder, BindInfo, FfiBindData, FfiInitData, ReplacementScanBuilder, StructVector, ListVector, MapVector, CastFunctionBuilder, CastFunctionInfo, CastMode added to quack_rs::prelude.

Not implemented (upstream C API gap)

Window functions and COPY format handlers are absent from DuckDB's public C extension API and cannot be wrapped. See Known Limitations.

Fixed

hello-ext gs_bind callback — replaced incorrect duckdb_value_int64(param) with duckdb_get_int64(param). All 11 live SQL tests now pass against DuckDB 1.4.4.

Changed

Bump criterion dev-dependency from 0.5 to 0.8.
Bump Swatinem/rust-cache GitHub Action from v2.7.5 to v2.8.2.
Bump dtolnay/rust-toolchain CI pin from v2.7.5 to latest SHA.
Bump actions/attest-build-provenance from v2 to v4.
Bump actions/configure-pages to latest SHA (d5606572…).
Bump actions/upload-pages-artifact from v3.0.1 to v4.0.0.

0.2.0 — 2026-03-07

Added

validate::description_yml module — parse and validate a complete description.yml metadata file end-to-end. Includes:
- DescriptionYml struct — structured representation of all required and optional fields
- parse_description_yml(content: &str) — parse and validate in one step
- validate_description_yml_str(content: &str) — pass/fail validation
- validate_rust_extension(desc: &DescriptionYml) — enforce Rust-specific fields (language: Rust, build: cargo, requires_toolchains includes rust)
- 25+ unit tests covering all required fields, optional fields, error paths, and edge cases
prelude module — ergonomic glob-import for the most commonly used items. use quack_rs::prelude::*; brings in all builder types, state traits, vector helpers, types, error handling, and the API version constant. Reduces boilerplate for extension authors.
Scaffold: extension_config.cmake generation — the scaffold generator now produces extension_config.cmake, which is referenced by the EXT_CONFIG variable in the Makefile and required by extension-ci-tools for CI integration.
Scaffold: SQLLogicTest skeleton — generate_scaffold now produces test/sql/{name}.test, a ready-to-fill SQLLogicTest file with require directive, format comments, and example query/result blocks. E2E tests are required for community extension submission (Pitfall P3).
Scaffold: GitHub Actions CI workflow — generate_scaffold now produces .github/workflows/extension-ci.yml, a complete cross-platform CI workflow that builds and tests the extension on Linux, macOS, and Windows against a real DuckDB binary.
validate::validate_excluded_platforms_str — validates the excluded_platforms field from description.yml as a semicolon-delimited string (e.g., "wasm_mvp;wasm_eh;wasm_threads"). Splits on ; and validates each token. An empty string is valid (no exclusions).
validate::validate_excluded_platforms — re-exported at the validate module level (previously only accessible as validate::platform::validate_excluded_platforms).
validate::semver::classify_extension_version — returns ExtensionStability (Unstable/PreRelease/Stable) classifying the tier a version falls into.
validate::semver::ExtensionStability — enum for DuckDB extension version stability tiers (Unstable, PreRelease, Stable) with Display implementation.
scalar module — ScalarFunctionBuilder for registering scalar functions with the DuckDB C Extension API. Includes try_new with name validation, param, returns, function setters, and register. Full unit tests included.
entry_point! macro — generates the required #[no_mangle] extern "C" entry point with zero boilerplate from an identifier and registration closure.
VectorWriter::write_varchar — writes VARCHAR string values to output vectors using duckdb_vector_assign_string_element_len (handles both inline and pointer formats).
VectorWriter::write_bool — writes BOOLEAN values as a single byte.
VectorWriter::write_u16 — writes USMALLINT values.
VectorWriter::write_i16 — writes SMALLINT values.
VectorReader::read_interval — reads INTERVAL values from input vectors via the correct 16-byte layout helper.
CI: Windows testing — the CI matrix now includes windows-latest in the test job, covering all three major platforms (Linux, macOS, Windows).
CI: example-check job — CI now checks, lints, and tests examples/hello-ext as part of every PR, ensuring the example extension always compiles and its tests pass.
validate::validate_release_profile — checks Cargo release profile settings for loadable-extension correctness. Validates panic, lto, opt-level, and codegen-units.

Fixed

MSRV documentation now consistently states 1.84.1 across README.md, CONTRIBUTING.md, and Cargo.toml (previously README.md stated 1.80).

0.1.0 — 2025-05-01

Added

Initial release
entry_point module: init_extension helper for correct extension initialization
aggregate module: AggregateFunctionBuilder, AggregateFunctionSetBuilder
aggregate::state module: AggregateState trait, FfiState<T> wrapper
aggregate::callbacks module: type aliases for all 6 aggregate callback signatures
vector module: VectorReader, VectorWriter, ValidityBitmap, DuckStringView
types module: TypeId enum (33 variants), LogicalType RAII wrapper
interval module: DuckInterval, interval_to_micros, read_interval_at
error module: ExtensionError, ExtResult<T>
testing module: AggregateTestHarness<S> for pure-Rust aggregate testing
scaffold module: generate_scaffold for generating complete extension projects
sql_macro module: SqlMacro for registering SQL macros without FFI callbacks
Complete hello-ext example extension
Documentation of all 15 DuckDB Rust FFI pitfalls (LESSONS.md)
CI pipeline: check, test, clippy, fmt, doc, msrv, bench-compile
SECURITY.md vulnerability disclosure policy

FAQ

Frequently asked questions about quack-rs and building DuckDB extensions in Rust.

General

What is quack-rs?

quack-rs is a Rust SDK for building DuckDB loadable extensions using DuckDB's pure C Extension API. It provides safe, ergonomic builders for registering scalar functions, aggregate functions, table functions, cast functions, replacement scans, SQL macros, and copy functions (via the duckdb-1-5 feature), along with helpers for reading and writing DuckDB vectors, and utilities for publishing community extensions.

Why does this exist?

Building a DuckDB extension in Rust requires solving a set of undocumented FFI problems that every developer discovers independently. quack-rs encodes solutions to all 16 known pitfalls so you don't have to rediscover them. See the Pitfall Catalog.

What DuckDB version does quack-rs target?

quack-rs requires libduckdb-sys = ">=1.4.4, <2" (DuckDB 1.4.x and 1.5.x). The C API version string passed to the dispatch-table initializer is "v1.2.0", available as quack_rs::DUCKDB_API_VERSION. Both DuckDB 1.4.x and 1.5.x use the same C API version. These are two distinct version identifiers — the crate version and the C API protocol version.

What is the minimum supported Rust version (MSRV)?

Rust 1.87.0 or later. This is enforced in Cargo.toml with rust-version = "1.87.0".

Is quack-rs production-ready?

Yes. It was extracted from duckdb-behavioral, a production DuckDB community extension. All 16 pitfalls it solves were discovered in production.

Functions

Can I expose SQL macros as an extension?

Yes, without any C++ wrapper code. Use quack_rs::sql_macro::SqlMacro:

#![allow(unused)]
fn main() {
use quack_rs::sql_macro::SqlMacro;

// Scalar macro
let m = SqlMacro::scalar("double_it", &["x"], "x * 2")?;
unsafe { m.register(con) }?;

// Table macro
let m = SqlMacro::table("recent_events", &["n"],
    "SELECT * FROM events ORDER BY ts DESC LIMIT n")?;
unsafe { m.register(con) }?;
}

Can I register multiple overloads of the same function?

Yes, using AggregateFunctionSetBuilder (for aggregates) or ScalarFunctionSetBuilder (for scalars). Both support complex parameter types via param_logical(LogicalType) and complex return types via returns_logical(LogicalType). See Overloading with Function Sets.

Can I register multiple functions in one extension?

Yes. The init_extension closure receives a duckdb_connection and can call as many register_* functions as needed:

#![allow(unused)]
fn main() {
quack_rs::entry_point::init_extension(info, access, DUCKDB_API_VERSION, |con| {
    unsafe { register_word_count(con) }?;
    unsafe { register_sentence_count(con) }?;
    unsafe {
        SqlMacro::scalar("double_it", &["x"], "x * 2")?
            .register(con)?;
    }
    Ok(())
})
}

Can I use the `duckdb` crate instead of `libduckdb-sys`?

No. The duckdb crate's bundled feature embeds its own copy of DuckDB. A loadable extension must link against the DuckDB that loads it, not bundle a separate copy. Use libduckdb-sys with the loadable-extension feature.

Can I have a scalar function with no parameters?

Yes. Pass an empty slice to param:

#![allow(unused)]
fn main() {
ScalarFunctionBuilder::new("current_quack")
    .returns(TypeId::Varchar)
    .function(quack_callback)
    .register(con)?;
}

Testing

Do I need a DuckDB instance to run unit tests?

No. AggregateTestHarness simulates the aggregate lifecycle in pure Rust without any DuckDB dependency. You can run cargo test without loading a DuckDB binary.

My unit tests all pass but the extension crashes. Why?

Unit tests cannot detect FFI wiring bugs. See Pitfall P3 and the Testing Guide. Always run E2E tests by loading the extension into an actual DuckDB process.

How do I test SQL macros?

SqlMacro::to_sql() is pure Rust and requires no DuckDB connection:

#![allow(unused)]
fn main() {
let m = SqlMacro::scalar("triple", &["x"], "x * 3").unwrap();
assert_eq!(m.to_sql(), "CREATE OR REPLACE MACRO triple(x) AS (x * 3)");
}

For E2E testing, include the macro in your SQLLogicTest file:

query I
SELECT double_it(21);
----
42

Publishing

How do I publish to the DuckDB community extensions registry?

Scaffold your project with generate_scaffold
Push to GitHub
Submit a pull request to the community-extensions repo with your description.yml

See Community Extensions for the full workflow.

My extension name is taken. What should I do?

Use a vendor-prefixed name: myorg_analytics instead of analytics. Extension names must be globally unique across the entire DuckDB ecosystem. Check community-extensions.duckdb.org first.

Do I need to set up CI manually?

No. generate_scaffold produces .github/workflows/extension-ci.yml which builds and tests your extension on Linux, macOS, and Windows automatically.

Can my extension be installed with `INSTALL ... FROM community`?

Yes, once your pull request is merged into the community-extensions repository. Until then, users load the .duckdb_extension binary directly:

LOAD './path/to/libmy_extension.duckdb_extension';

Troubleshooting

My aggregate returns wrong results with no error.

The most common cause is Pitfall L1: your combine callback is not propagating all configuration fields. See Pitfall L1 and test with AggregateTestHarness::combine.

I'm getting a SEGFAULT when writing NULL.

You are likely calling duckdb_vector_get_validity without first calling duckdb_vector_ensure_validity_writable. Use VectorWriter::set_null instead. See Pitfall L4.

My function is not found in SQL after `LOAD`.

Most likely cause: the function was not registered (Pitfall L6 — function set name not set on each member), or the entry point symbol name does not match the extension name. The symbol must be {extension_name}_init_c_api (all lowercase, underscores).

`make configure` fails with a missing file error.

The extension-ci-tools submodule is not initialized:

git submodule update --init --recursive

My SQLLogicTest fails in CI but passes locally.

SQLLogicTest does exact string matching. The most common issue is a difference in NULL representation, decimal places, or line endings. Run the query in the same DuckDB version used by CI and copy the output verbatim.

How do I read a VARCHAR that is longer than 12 bytes?

VectorReader::read_str handles both the inline (≤ 12 bytes) and pointer (> 12 bytes) formats automatically. No special handling needed.

What happens if I read from a NULL row?

You get garbage data from the vector's data buffer. Always check is_valid before reading. See NULL Handling & Strings.

Architecture

Why use `libduckdb-sys` with `loadable-extension` instead of the `duckdb` crate?

The duckdb crate is designed for embedding DuckDB, not for extending it. Its bundled feature includes a statically linked DuckDB binary, which conflicts with the DuckDB runtime that loads your extension. libduckdb-sys with loadable-extension provides lazy-initialized function pointers that are populated by DuckDB at extension load time.

Why not use `duckdb-loadable-macros`?

duckdb-loadable-macros relies on extract_raw_connection which uses the internal Rc<RefCell<InnerConnection>> layout. This is fragile and causes SEGFAULTs when the layout changes between duckdb crate versions. init_extension uses the correct C API entry sequence directly.

Why is `panic = "abort"` required?

Panics cannot unwind across FFI boundaries in Rust. A panic in an unsafe extern "C" callback is undefined behavior. panic = "abort" converts panics to process termination, which is still bad but not undefined behavior. Always use Result and ? in your callbacks instead.

Can I use async Rust in my extension?

Not directly in FFI callbacks. DuckDB's callbacks are synchronous C functions. You can run a Tokio or async-std runtime and block on async tasks inside callbacks (using Runtime::block_on), but the callbacks themselves must return synchronously.

How does `FfiState<T>` prevent double-free?

FfiState<T> stores the Box<T> as a raw pointer in inner. When destroy_callback is called, it reconstitutes the Box (which drops T and frees memory) and then sets inner to null. A second call to destroy_callback on the same state sees a null inner and returns without freeing.

Contributing

quack-rs is an open source project. Contributions of all kinds are welcome: bug reports, documentation improvements, new pitfall discoveries, and code.

Development prerequisites

Tool	Version	Purpose
Rust	≥ 1.87.0 (MSRV)	Compiler
`rustfmt`	stable	Formatting
`clippy`	stable	Linting
`cargo-msrv`	latest	MSRV verification

Install the Rust toolchain via rustup.rs.

Building

# Build the library
cargo build

# Build in release mode (enables LTO + strip)
cargo build --release

# Build the hello-ext example extension
cargo build --release --manifest-path examples/hello-ext/Cargo.toml

Quality gates

All of the following must pass before merging any pull request:

# Tests — zero failures, zero ignored
cargo test

# Integration tests
cargo test --test integration_test

# Linting — zero warnings (warnings are errors)
cargo clippy --all-targets -- -D warnings

# Formatting
cargo fmt -- --check

# Documentation — zero broken links or missing docs
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps

# MSRV — must compile on Rust 1.87.0 (excludes benches; matches CI)
cargo +1.87.0 check

These same checks run in CI on every push and pull request.

Test strategy

Unit tests

Unit tests live in #[cfg(test)] modules within each source file. They test pure-Rust logic that does not require a live DuckDB instance.

Important constraint: libduckdb-sys with features = ["loadable-extension"] makes all DuckDB C API functions go through lazy AtomicPtr dispatch. These pointers are only populated when duckdb_rs_extension_api_init is called from within a real DuckDB extension load. Calling any duckdb_* function in a unit test will panic. Move such tests to integration tests or example-extension tests.

Integration tests

tests/integration_test.rs contains pure-Rust tests that cross module boundaries — testing interval with AggregateTestHarness, verifying FfiState lifecycle, and so on. These still cannot call duckdb_* functions.

Property-based tests

Selected modules include proptest-based tests:

interval.rs — overflow edge cases across the full i32/i64 range
testing/harness.rs — sum associativity, identity element for AggregateState

Example-extension tests

examples/hello-ext/ contains #[cfg(test)] unit tests for the pure logic (count_words). Full E2E testing (loading the .so into DuckDB) is left to consumers.

Code standards

Safety documentation

Every unsafe block must have a // SAFETY: comment explaining:

Which invariant the caller guarantees
Why the operation is valid given that invariant

#![allow(unused)]
fn main() {
// SAFETY: `states` is a valid array of `count` pointers, each initialized
// by `init_callback`. We are the only owner of `inner` at this point.
unsafe { drop(Box::from_raw(ffi.inner)) };
}

No panics across FFI

unwrap(), expect(), and panic!() are forbidden in any function that may be called by DuckDB (callbacks and entry points). Use Option/Result and ? throughout.

Clippy lint policy

The crate enables pedantic, nursery, and cargo lint groups. All warnings are treated as errors in CI. Lints are suppressed only where they produce false positives for SDK API patterns:

[lints.clippy]
module_name_repetitions = "allow"  # e.g., AggregateFunctionBuilder
must_use_candidate = "allow"       # builder methods
missing_errors_doc = "allow"       # unsafe extern "C" callbacks
return_self_not_must_use = "allow" # builder pattern

Documentation

Every public item must have a doc comment. Follow these conventions:

First line: short summary (noun phrase, no trailing period)
# Safety: mandatory on every unsafe fn
# Panics: mandatory if the function can panic
# Errors: mandatory on functions returning Result
# Example: encouraged on public types and key methods

Repository structure

quack-rs/
├── src/
│   ├── lib.rs                     # Crate root; module declarations; DUCKDB_API_VERSION
│   ├── entry_point.rs             # init_extension() / init_extension_v2() + entry_point! / entry_point_v2!
│   ├── connection.rs              # Connection facade + Registrar trait (version-agnostic registration)
│   ├── config.rs                  # DbConfig — RAII wrapper for duckdb_config
│   ├── error.rs                   # ExtensionError, ExtResult<T>
│   ├── interval.rs                # DuckInterval, interval_to_micros
│   ├── sql_macro.rs               # SqlMacro — CREATE MACRO without FFI callbacks
│   ├── aggregate/
│   │   ├── mod.rs
│   │   ├── builder/               # Builder types for aggregate function registration
│   │   │   ├── mod.rs             # Module doc + re-exports
│   │   │   ├── single.rs          # AggregateFunctionBuilder (single-signature)
│   │   │   ├── set.rs             # AggregateFunctionSetBuilder, OverloadBuilder
│   │   │   └── tests.rs           # Unit tests
│   │   ├── info.rs                # AggregateFunctionInfo
│   │   ├── callbacks.rs           # Callback type aliases
│   │   └── state.rs               # AggregateState trait, FfiState<T>
│   ├── scalar/
│   │   ├── mod.rs
│   │   ├── info.rs                # ScalarFunctionInfo, ScalarBindInfo, ScalarInitInfo
│   │   └── builder/               # Builder types for scalar function registration
│   │       ├── mod.rs             # Module doc + re-exports
│   │       ├── single.rs          # ScalarFn type alias, ScalarFunctionBuilder
│   │       ├── set.rs             # ScalarFunctionSetBuilder, ScalarOverloadBuilder
│   │       └── tests.rs           # Unit tests
│   ├── catalog.rs                 # Catalog access helpers (requires `duckdb-1-5`)
│   ├── cast/
│   │   ├── mod.rs                 # Re-exports
│   │   └── builder.rs             # CastFunctionBuilder, CastFunctionInfo, CastMode
│   ├── client_context.rs          # ClientContext wrapper (requires `duckdb-1-5`)
│   ├── config_option.rs           # ConfigOption registration (requires `duckdb-1-5`)
│   ├── copy_function/
│   │   ├── mod.rs                 # CopyFunctionBuilder (requires `duckdb-1-5`)
│   │   └── info.rs                # CopyBindInfo, CopySinkInfo, etc.
│   ├── appender.rs                # Appender — bulk row insertion (requires `duckdb-1-5`)
│   ├── error_data.rs              # ErrorData, DuckDbErrorType — structured errors (requires `duckdb-1-5`)
│   ├── expression.rs              # Expression — bound expr inspection/folding (requires `duckdb-1-5`)
│   ├── file_system.rs             # FileSystem, FileHandle — DuckDB virtual file system (requires `duckdb-1-5`)
│   ├── instance_cache.rs          # InstanceCache — shared DB instance cache (requires `duckdb-1-5`)
│   ├── selection_vector.rs        # SelectionVector — zero-copy row-index vectors (requires `duckdb-1-5`)
│   ├── replacement_scan/
│   │   └── mod.rs                 # ReplacementScanBuilder — SELECT * FROM 'file.xyz' patterns
│   ├── types/
│   │   ├── mod.rs
│   │   ├── type_id.rs             # TypeId enum (33 base + 6 with duckdb-1-5)
│   │   └── logical_type.rs        # LogicalType RAII wrapper
│   ├── vector/
│   │   ├── mod.rs
│   │   ├── reader.rs              # VectorReader
│   │   ├── writer.rs              # VectorWriter
│   │   ├── validity.rs            # ValidityBitmap
│   │   ├── string.rs              # DuckStringView, read_duck_string
│   │   └── complex.rs             # StructVector, ListVector, MapVector, ArrayVector
│   ├── validate/
│   │   ├── mod.rs
│   │   ├── description_yml/       # Parse and validate description.yml metadata
│   │   │   ├── mod.rs             # Module doc + re-exports
│   │   │   ├── model.rs           # DescriptionYml struct
│   │   │   ├── parser.rs          # parse_description_yml and helpers
│   │   │   ├── validator.rs       # validate_description_yml_str, validate_rust_extension
│   │   │   └── tests.rs           # Unit tests
│   │   ├── extension_name.rs
│   │   ├── function_name.rs
│   │   ├── platform.rs
│   │   ├── release_profile.rs
│   │   ├── semver.rs
│   │   └── spdx.rs
│   ├── scaffold/
│   │   ├── mod.rs                 # ScaffoldConfig, GeneratedFile, generate_scaffold
│   │   ├── templates.rs           # Template generators for scaffold files (pub(super))
│   │   └── tests.rs               # Unit tests
│   ├── table_description.rs       # TableDescription wrapper (requires `duckdb-1-5`)
│   ├── table/
│   │   ├── mod.rs
│   │   ├── builder.rs             # TableFunctionBuilder, BindFn/InitFn/ScanFn aliases
│   │   ├── info.rs                # BindInfo, InitInfo, FunctionInfo
│   │   ├── bind_data.rs           # FfiBindData<T>
│   │   └── init_data.rs           # FfiInitData<T>, FfiLocalInitData<T>
│   └── testing/
│       ├── mod.rs
│       ├── harness.rs             # AggregateTestHarness<S>
│       ├── mock_vector.rs         # MockVectorReader, MockVectorWriter, MockDuckValue
│       ├── mock_registrar.rs      # MockRegistrar, CastRecord
│       └── in_memory_db.rs        # InMemoryDb (requires `bundled-test`)
├── tests/
│   └── integration_test.rs
├── benches/
│   └── interval_bench.rs          # Criterion benchmarks
├── examples/
│   └── hello-ext/                 # Reference example: word_count (aggregate) + first_word (scalar)
├── book/                          # mdBook documentation source
│   ├── src/                       # Markdown pages (this site)
│   └── theme/custom.css
├── .github/workflows/ci.yml       # CI pipeline
├── .github/workflows/docs.yml     # GitHub Pages deployment
├── CONTRIBUTING.md
├── LESSONS.md                     # The 16 DuckDB Rust FFI pitfalls
├── CHANGELOG.md
└── README.md

Releasing

quack-rs uses libduckdb-sys = ">=1.4.4, <2" — a bounded range covering DuckDB 1.4.x and 1.5.x, whose C API (v1.2.0) is stable across both releases. The <2 upper bound prevents silent adoption of a future major release that may change the C API. Before broadening the range to a new major band:

Read the DuckDB changelog for C API changes
Check the new C API version string (used in duckdb_rs_extension_api_init)
Update DUCKDB_API_VERSION in src/lib.rs if the C API version changed
Audit all callback signatures against the new bindgen.rs output
Update the range bounds in Cargo.toml (runtime and dev-deps)

Versions follow Semantic Versioning. Breaking changes to the public API require a major version bump.

Reporting issues

Use GitHub Issues. For security vulnerabilities, see SECURITY.md for responsible disclosure policy.

License

quack-rs is licensed under the MIT License. Contributions are accepted under the same license. By submitting a pull request, you agree to license your contribution under MIT.