Table Functions
Table functions implement the SELECT * FROM my_function(args) pattern — they
return a result set rather than a scalar value. DuckDB table functions have three
lifecycle callbacks: bind, init, and scan.
quack-rs provides TableFunctionBuilder plus the helper types BindInfo,
InitInfo, FunctionInfo, FfiBindData<T>, FfiInitData<T>, and
FfiLocalInitData<T> to eliminate the raw FFI boilerplate.
Lifecycle
| Phase | Callback | Called when | Typical work |
|---|---|---|---|
| bind | bind_fn | Query is planned | Extract parameters; register output columns; store config in bind data |
| init | init_fn | Execution starts | Allocate per-scan state (cursor, row index, etc.) |
| scan | scan_fn | Each output batch | Fill duckdb_data_chunk with rows; call duckdb_data_chunk_set_size |
The scan callback is called repeatedly until it writes 0 rows in a batch, signalling end-of-results.
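The three phases can be modelled without any FFI at all. The following is a minimal sketch in plain Rust, not quack-rs or DuckDB code: every name in it (`BindData`, `ScanState`, `BATCH_CAPACITY`, `bind`, `init`, `scan`, `run`) is hypothetical, and `run` merely plays the role of the engine. It shows how bind data stays fixed, how the scan state advances, and how an empty batch ends the loop.

```rust
// Hypothetical stand-ins for illustration only; none of these are quack-rs or DuckDB APIs.

/// Configuration captured once at bind time.
struct BindData {
    total: i64,
}

/// Cursor created at init time and advanced by every scan call.
struct ScanState {
    pos: i64,
}

/// Maximum rows per batch in this sketch (assumed to match a 2048-row chunk).
const BATCH_CAPACITY: i64 = 2048;

/// Bind: validate the argument and store it.
fn bind(n: i64) -> BindData {
    BindData { total: n.max(0) }
}

/// Init: start the cursor at the first row.
fn init() -> ScanState {
    ScanState { pos: 0 }
}

/// Scan: fill `out` with the next batch and return how many rows were written.
fn scan(bind: &BindData, state: &mut ScanState, out: &mut Vec<i64>) -> usize {
    out.clear();
    let batch = (bind.total - state.pos).clamp(0, BATCH_CAPACITY);
    out.extend(state.pos..state.pos + batch);
    state.pos += batch;
    batch as usize
}

/// Driver: plays the role of the engine, calling `scan` until a batch is empty.
/// Returns all emitted rows and the number of scan calls made.
fn run(n: i64) -> (Vec<i64>, usize) {
    let bind_data = bind(n);
    let mut state = init();
    let mut chunk = Vec::new();
    let mut rows = Vec::new();
    let mut scan_calls = 0;
    loop {
        let written = scan(&bind_data, &mut state, &mut chunk);
        scan_calls += 1;
        if written == 0 {
            break; // an empty batch signals end-of-results
        }
        rows.extend_from_slice(&chunk);
    }
    (rows, scan_calls)
}
```

Under these assumptions, `run(5)` makes two scan calls: one that emits five rows and a final one that emits none. A scan callback should therefore expect to be called once more after it has produced its last row.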
Builder API
```rust
use quack_rs::table::{TableFunctionBuilder, BindInfo, FfiBindData, FfiInitData};
use quack_rs::types::TypeId;

TableFunctionBuilder::new("my_function")
    .param(TypeId::BigInt)    // positional parameter types
    .bind(my_bind_callback)   // declare output columns inside bind
    .init(my_init_callback)
    .scan(my_scan_callback)
    .register(con)?;
```
Output columns are declared inside the bind callback using BindInfo::add_result_column,
not on the builder itself.
State management
Bind data
Bind data persists from the bind phase through all scan batches. Use
FfiBindData<T> to allocate it safely:
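```rust
struct MyBindData {
    limit: i64,
}

unsafe extern "C" fn my_bind(info: duckdb_bind_info) {
    // Read the first positional parameter, then release the value handle
    // (values returned by duckdb_bind_get_parameter must be destroyed).
    let mut param = unsafe { duckdb_bind_get_parameter(info, 0) };
    let n = unsafe { duckdb_get_int64(param) };
    unsafe { duckdb_destroy_value(&mut param) };

    unsafe { FfiBindData::<MyBindData>::set(info, MyBindData { limit: n }) };
}
```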
FfiBindData::set stores the value and registers a destructor so DuckDB frees
it at the right time — no Box::into_raw / Box::from_raw needed.
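For readers wondering what this kind of helper saves them from, the manual pattern looks roughly like the sketch below. It is an illustration in plain Rust and makes no claim about how quack-rs implements `FfiBindData`. The names `into_ffi`, `get`, `drop_boxed`, `round_trip`, and `drop_count` are hypothetical. A value is boxed, handed out as an opaque pointer together with a C-ABI destructor, and later reclaimed exactly once.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times bind data has been dropped, so the sketch can be checked.
static DROPS: AtomicUsize = AtomicUsize::new(0);

fn drop_count() -> usize {
    DROPS.load(Ordering::SeqCst)
}

struct MyBindData {
    limit: i64,
}

impl Drop for MyBindData {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Destructor with the C ABI shape `void (*)(void *)` that foreign code can call.
unsafe extern "C" fn drop_boxed<T>(ptr: *mut c_void) {
    if !ptr.is_null() {
        // SAFETY: `ptr` was produced by `Box::into_raw` in `into_ffi` for the same `T`.
        drop(unsafe { Box::from_raw(ptr.cast::<T>()) });
    }
}

/// Hypothetical helper: give up ownership of `value` as a pointer plus its destructor.
fn into_ffi<T>(value: T) -> (*mut c_void, unsafe extern "C" fn(*mut c_void)) {
    let raw = Box::into_raw(Box::new(value)).cast::<c_void>();
    (raw, drop_boxed::<T> as unsafe extern "C" fn(*mut c_void))
}

/// Hypothetical helper: borrow the value back from the opaque pointer.
///
/// # Safety
/// `ptr` must come from `into_ffi::<T>` and must not have been destroyed yet.
unsafe fn get<'a, T>(ptr: *mut c_void) -> Option<&'a T> {
    unsafe { ptr.cast::<T>().as_ref() }
}

/// Stores bind data, reads it back, then destroys it the way an engine would.
fn round_trip(limit: i64) -> i64 {
    let (ptr, destroy) = into_ffi(MyBindData { limit });
    let seen = unsafe { get::<MyBindData>(ptr) }.map(|d| d.limit).unwrap_or(-1);
    unsafe { destroy(ptr) }; // called once, when the owner is finished with the data
    seen
}
```

The destructor must be monomorphised for the same `T` that was boxed, which is the kind of detail a typed wrapper can enforce on the caller's behalf.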
Init (scan) state
Per-scan state (e.g., a current row index) uses FfiInitData<T>:
```rust
struct MyScanState {
    pos: i64,
}

unsafe extern "C" fn my_init(info: duckdb_init_info) {
    unsafe { FfiInitData::<MyScanState>::set(info, MyScanState { pos: 0 }) };
}
```
Complete example: generate_series_ext
The hello-ext example registers generate_series_ext(n BIGINT) which emits
integers 0 .. n-1. See examples/hello-ext/src/lib.rs for the full source.
```rust
// State types; the field types are inferred from how the callbacks below use them.
struct GsBindData {
    total: i64,
}

struct GsScanState {
    pos: i64,
}

// Bind: extract `n`, register one output column
unsafe extern "C" fn gs_bind(info: duckdb_bind_info) {
    let mut param = unsafe { duckdb_bind_get_parameter(info, 0) };
    let n = unsafe { duckdb_get_int64(param) };
    unsafe { duckdb_destroy_value(&mut param) };

    let out_type = LogicalType::new(TypeId::BigInt);
    unsafe { duckdb_bind_add_result_column(info, c"value".as_ptr(), out_type.as_raw()) };

    unsafe { FfiBindData::<GsBindData>::set(info, GsBindData { total: n }) };
}

// Init: zero-initialise the scan cursor
unsafe extern "C" fn gs_init(info: duckdb_init_info) {
    unsafe { FfiInitData::<GsScanState>::set(info, GsScanState { pos: 0 }) };
}

// Scan: emit a batch of rows
unsafe extern "C" fn gs_scan(info: duckdb_function_info, output: duckdb_data_chunk) {
    let bind = unsafe { FfiBindData::<GsBindData>::get_from_function(info) }.unwrap();
    let state = unsafe { FfiInitData::<GsScanState>::get_mut(info) }.unwrap();

    // Emit at most 2048 rows per call; a negative `n` yields an empty batch.
    let remaining = bind.total - state.pos;
    let batch = remaining.min(2048).max(0) as usize;

    let mut writer = unsafe { VectorWriter::new(duckdb_data_chunk_get_vector(output, 0)) };
    for i in 0..batch {
        unsafe { writer.write_i64(i, state.pos + i as i64) };
    }
    unsafe { duckdb_data_chunk_set_size(output, batch as idx_t) };
    state.pos += batch as i64;
}
```
Registration
```rust
TableFunctionBuilder::new("generate_series_ext")
    .param(TypeId::BigInt)
    .bind(gs_bind)
    .init(gs_init)
    .scan(gs_scan)
    .register(con)?;
```
Advanced features
Named parameters
Named parameters let callers pass optional arguments by name (e.g., step := 10):
```rust
TableFunctionBuilder::new("gen_series_v2")
    .param(TypeId::BigInt)                // positional: n
    .named_param("step", TypeId::BigInt)  // named: step := <value>
    .bind(gs_v2_bind)
    .init(gs_v2_init)
    .scan(gs_v2_scan)
    .register(con)?;
```
In the bind callback, read the named parameter with
duckdb_bind_get_named_parameter(info, c"step".as_ptr()).
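Because a named parameter is optional, the bind callback has to decide what happens when it is absent or invalid. The sketch below models only that decision in plain Rust and touches no DuckDB API. The default of 1, the rejection of non-positive steps, and the output semantics (0, step, 2*step, ... below n) are all assumptions made for illustration, not the documented behaviour of `gen_series_v2`. The names `SeriesConfig`, `resolve_config`, and `values` are hypothetical.

```rust
/// Hypothetical bind-time configuration for a stepped series.
#[derive(Debug, PartialEq)]
struct SeriesConfig {
    n: i64,
    step: i64,
}

/// Resolve the optional `step` argument.
/// `None` models "the caller did not pass step := ...".
fn resolve_config(n: i64, step: Option<i64>) -> Result<SeriesConfig, String> {
    let step = step.unwrap_or(1); // assumed default when `step` is omitted
    if step <= 0 {
        // A real bind callback would report this through its error channel
        // (for example a set_error method) rather than returning a Result.
        return Err(format!("step must be positive, got {step}"));
    }
    Ok(SeriesConfig { n, step })
}

/// Values the assumed semantics would produce: 0, step, 2*step, ... below n.
fn values(cfg: &SeriesConfig) -> Vec<i64> {
    (0..cfg.n).step_by(cfg.step as usize).collect()
}
```

Validating at bind time keeps the scan callback free of error handling for bad arguments, since the query fails before execution starts.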
Local init (per-thread state)
For multi-threaded table functions, use local_init to allocate per-thread state:
```rust
TableFunctionBuilder::new("gen_series_v2")
    .param(TypeId::BigInt)
    .bind(gs_v2_bind)
    .init(gs_v2_init)
    .local_init(gs_v2_local_init)  // per-thread state allocation
    .scan(gs_v2_scan)
    .register(con)?;
```
The local init callback receives duckdb_init_info and can use
FfiLocalInitData<T>::set to store per-thread state.
Thread control
Use InitInfo::set_max_threads in the global init callback to tell DuckDB how
many threads can scan concurrently:
```rust
unsafe extern "C" fn gs_v2_init(info: duckdb_init_info) {
    let init_info = unsafe { InitInfo::new(info) };
    unsafe { init_info.set_max_threads(1) };
    unsafe { FfiInitData::<MyState>::set(info, MyState { pos: 0 }) };
}
```
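Limiting the scan to one thread is the simplest way to keep a plain `pos` cursor correct. If more threads were allowed, the shared cursor would have to be claimed atomically and per-thread bookkeeping kept separately. The following sketch models that split with standard-library threads only. It is not quack-rs code, and `GlobalState`, `LocalState`, `claim`, and `parallel_scan` are hypothetical names standing in for the global init data, the local init data, and the scan callback.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::thread;

/// Shared by all scanning threads (the role of the global init data).
struct GlobalState {
    next: AtomicI64, // first row not yet claimed by any thread
    total: i64,
}

/// Owned by one thread (the role of the local init data).
struct LocalState {
    rows_emitted: usize,
    sum: i64,
}

const BATCH: i64 = 2048;

/// Claim the next batch atomically; returns `None` once all rows are taken.
fn claim(global: &GlobalState) -> Option<Range<i64>> {
    let start = global.next.fetch_add(BATCH, Ordering::Relaxed);
    if start >= global.total {
        return None;
    }
    Some(start..(start + BATCH).min(global.total))
}

/// Run `threads` workers to completion and merge their local results
/// into (total rows emitted, sum of emitted values).
fn parallel_scan(total: i64, threads: usize) -> (usize, i64) {
    let global = Arc::new(GlobalState { next: AtomicI64::new(0), total });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let global = Arc::clone(&global);
            thread::spawn(move || {
                let mut local = LocalState { rows_emitted: 0, sum: 0 };
                while let Some(range) = claim(&global) {
                    for value in range {
                        local.rows_emitted += 1;
                        local.sum += value;
                    }
                }
                local
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .fold((0, 0), |(rows, sum), l| (rows + l.rows_emitted, sum + l.sum))
}
```

Each row is claimed by exactly one worker, so the merged totals are the same for any thread count, although the order in which batches are produced is not deterministic.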
Projection pushdown
Enable projection pushdown to let DuckDB skip unrequested columns:
```rust
TableFunctionBuilder::new("my_func")
    .projection_pushdown(true)
    // ...
```
Caution: When projection pushdown is enabled, your scan callback must write only the columns DuckDB actually requested, as reported by
InitInfo::projected_column_count and InitInfo::projected_column_index. Writing to non-projected columns causes crashes.
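The bookkeeping this implies can be sketched in plain Rust. The model below assumes that, with pushdown enabled, the output holds one vector per projected column and that position `i` in the projection names the source column to write there. `Projection`, `source_cell`, and `scan_projected` are hypothetical names, and the three-column source table is invented for the example.

```rust
use std::ops::Range;

/// Hypothetical model of a projection: `columns[i]` is the source column index
/// that output position `i` should carry.
struct Projection {
    columns: Vec<usize>,
}

/// Invented source table with three columns: id, square, label.
fn source_cell(row: i64, column: usize) -> String {
    match column {
        0 => row.to_string(),
        1 => (row * row).to_string(),
        2 => format!("row-{row}"),
        other => panic!("unknown column {other}"),
    }
}

/// Fill only the projected output vectors.
/// The result has exactly `projection.columns.len()` vectors, never three.
fn scan_projected(projection: &Projection, rows: Range<i64>) -> Vec<Vec<String>> {
    let mut output: Vec<Vec<String>> = vec![Vec::new(); projection.columns.len()];
    for row in rows {
        for (out_idx, &src_idx) in projection.columns.iter().enumerate() {
            output[out_idx].push(source_cell(row, src_idx));
        }
    }
    output
}
```

For a query that selects only `label` and `id`, in that order, the projection would be `[2, 0]` and the scan would write two vectors. Indexing the output by source column number instead of projection position is the out-of-bounds write the caution above warns about.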
See examples/hello-ext/src/lib.rs for a complete example using named_param,
local_init, and set_max_threads.
Complex parameter types
For parameterised types that TypeId cannot express (e.g. LIST(BIGINT),
MAP(VARCHAR, INTEGER), STRUCT(...)), use param_logical and
named_param_logical:
```rust
use quack_rs::types::LogicalType;

TableFunctionBuilder::new("read_data")
    .param_logical(LogicalType::list(TypeId::Varchar))  // positional LIST param
    .named_param_logical("options", LogicalType::map(   // named MAP param
        TypeId::Varchar,
        TypeId::Varchar,
    ))
    .bind(bind_fn)
    .init(init_fn)
    .scan(scan_fn)
    .register(con)?;
```
BindInfo helpers
BindInfo wraps duckdb_bind_info and exposes these methods:
| Method | Description |
|---|---|
| add_result_column(name, TypeId) | Declares an output column |
| add_result_column_with_type(name, &LogicalType) | Output column with complex type |
| set_cardinality(rows, is_exact) | Cardinality hint for the optimizer |
| set_error(message) | Report a bind-time error |
| parameter_count() | Number of positional parameters |
| get_parameter(index) | Returns a positional parameter value (duckdb_value) |
| get_named_parameter(name) | Returns a named parameter value (duckdb_value) |
| get_extra_info() | Returns the extra-info pointer set on the function |
| get_client_context() | Returns a ClientContext (requires duckdb-1-5 feature) |
InitInfo helpers
InitInfo wraps duckdb_init_info:
| Method | Description |
|---|---|
| projected_column_count() | Number of projected columns (with pushdown) |
| projected_column_index(idx) | Output column index at projection position |
| set_max_threads(n) | Maximum parallel scan threads |
| set_error(message) | Report an init-time error |
| get_extra_info() | Returns the extra-info pointer set on the function |
FunctionInfo helpers
FunctionInfo wraps duckdb_function_info (scan callbacks):
| Method | Description |
|---|---|
| set_error(message) | Report a scan-time error |
| get_extra_info() | Returns the extra-info pointer set on the function |
Extra info
Use TableFunctionBuilder::extra_info to attach function-level data that is
accessible from all callbacks (bind, init, and scan) via get_extra_info().
Verified output (DuckDB 1.4.4 and 1.5.0)
```sql
SELECT * FROM generate_series_ext(5);
-- 0
-- 1
-- 2
-- 3
-- 4

SELECT value * value AS sq FROM generate_series_ext(4);
-- 0
-- 1
-- 4
-- 9
```
See also
- table module documentation
- replacement_scan: for file-path-triggered table scans
- hello-ext README