# Reading & Writing Vectors

DuckDB passes data to and from your extension as vectors: columnar arrays of typed
values with a separate NULL bitmap. `VectorReader` and `VectorWriter` provide safe,
typed access to these vectors.
## VectorReader

### Construction

```rust
// In a scalar function callback:
let reader = unsafe { VectorReader::new(input, column_index) };

// In an aggregate update callback:
let reader = unsafe { VectorReader::new(input, 0) }; // first column
```
`VectorReader::new` takes the `duckdb_data_chunk` and a zero-based column index. The
reader borrows the chunk; it must not outlive the callback.
### Row count

```rust
let n = reader.row_count(); // number of rows in this chunk
```
Chunk sizes vary. Always loop over `0..reader.row_count()`; never assume a fixed size.
### NULL check

```rust
if unsafe { !reader.is_valid(row) } {
    // row is NULL: skip, or propagate NULL to the output
    unsafe { writer.set_null(row) };
    continue;
}
```

Always check `is_valid` before reading. Reading from a NULL row returns garbage data.
### Reading values

```rust
let i: i8   = unsafe { reader.read_i8(row) };
let i: i16  = unsafe { reader.read_i16(row) };
let i: i32  = unsafe { reader.read_i32(row) };
let i: i64  = unsafe { reader.read_i64(row) };

let u: u8   = unsafe { reader.read_u8(row) };
let u: u16  = unsafe { reader.read_u16(row) };
let u: u32  = unsafe { reader.read_u32(row) };
let u: u64  = unsafe { reader.read_u64(row) };

let f: f32  = unsafe { reader.read_f32(row) };
let f: f64  = unsafe { reader.read_f64(row) };

let b: bool = unsafe { reader.read_bool(row) };     // safe: uses u8 != 0
let s: &str = unsafe { reader.read_str(row) };      // handles inline + pointer format
let iv      = unsafe { reader.read_interval(row) }; // returns DuckInterval

// Temporal and binary types (v0.10.0+):
let d: i32      = unsafe { reader.read_date(row) };      // days since epoch
let ts: i64     = unsafe { reader.read_timestamp(row) }; // microseconds since epoch
let t: i64      = unsafe { reader.read_time(row) };      // microseconds since midnight
let blob: &[u8] = unsafe { reader.read_blob(row) };      // binary data
let uuid: i128  = unsafe { reader.read_uuid(row) };      // UUID as i128
```
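The "inline + pointer format" that `read_str` handles refers to DuckDB's two string representations: values of 12 bytes or fewer are stored inline in the 16-byte string struct, while longer values carry a pointer to out-of-line data. A minimal safe model of that branch (the `DuckStr` enum and helpers below are illustrative stand-ins, not the actual `duckdb_string_t` layout or the quack_rs API):

```rust
// Illustrative model of DuckDB's string_t: short strings live inline,
// long strings point to out-of-line data.
const INLINE_MAX: usize = 12;

enum DuckStr<'a> {
    Inline { len: u8, bytes: [u8; INLINE_MAX] },
    Pointer { data: &'a [u8] },
}

fn make(s: &str) -> DuckStr<'_> {
    if s.len() <= INLINE_MAX {
        let mut bytes = [0u8; INLINE_MAX];
        bytes[..s.len()].copy_from_slice(s.as_bytes());
        DuckStr::Inline { len: s.len() as u8, bytes }
    } else {
        DuckStr::Pointer { data: s.as_bytes() }
    }
}

// What read_str does conceptually: pick the right storage, then decode UTF-8.
fn read_str<'a>(v: &'a DuckStr<'a>) -> &'a str {
    match v {
        DuckStr::Inline { len, bytes } => {
            std::str::from_utf8(&bytes[..*len as usize]).unwrap()
        }
        DuckStr::Pointer { data } => std::str::from_utf8(data).unwrap(),
    }
}

fn main() {
    assert_eq!(read_str(&make("duck")), "duck"); // inline path
    assert_eq!(
        read_str(&make("a string longer than twelve")),
        "a string longer than twelve"
    ); // pointer path
}
```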
## VectorWriter

### Construction

```rust
// In a scalar function callback:
let mut writer = unsafe { VectorWriter::new(output) };

// In an aggregate finalize callback:
let mut writer = unsafe { VectorWriter::new(result) };
```
### Writing values

```rust
unsafe { writer.write_i8(row, value) };
unsafe { writer.write_i16(row, value) };
unsafe { writer.write_i32(row, value) };
unsafe { writer.write_i64(row, value) };
unsafe { writer.write_u8(row, value) };
unsafe { writer.write_u16(row, value) };
unsafe { writer.write_u32(row, value) };
unsafe { writer.write_u64(row, value) };
unsafe { writer.write_f32(row, value) };
unsafe { writer.write_f64(row, value) };
unsafe { writer.write_bool(row, value) };

unsafe { writer.write_varchar(row, s) };         // &str (also available as write_str)
unsafe { writer.write_str(row, s) };             // alias for write_varchar
unsafe { writer.write_interval(row, interval) }; // DuckInterval

// Temporal and binary types (v0.10.0+):
unsafe { writer.write_date(row, days_since_epoch) };
unsafe { writer.write_timestamp(row, micros_since_epoch) };
unsafe { writer.write_time(row, micros_since_midnight) };
unsafe { writer.write_blob(row, &bytes) };
unsafe { writer.write_uuid(row, uuid_i128) };
```
### Writing NULL

```rust
unsafe { writer.set_null(row) };
```
> **Pitfall L4:** `set_null` calls `duckdb_vector_ensure_validity_writable` automatically
> before accessing the validity bitmap. Calling `duckdb_vector_get_validity` without this
> prerequisite returns an uninitialized pointer → SEGFAULT. `VectorWriter::set_null`
> handles this correctly. See Pitfall L4.
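The hazard is easier to see in a toy model of the lazily allocated bitmap: a raw get can come back null until the ensure step has materialized it. Everything below is an illustrative stand-in, not the quack_rs or DuckDB types:

```rust
// Toy model of DuckDB's lazily allocated validity bitmap.
struct ToyVector {
    validity: Option<Vec<u64>>, // one bit per row, 64 rows per word
    capacity: usize,
}

impl ToyVector {
    fn new(capacity: usize) -> Self {
        ToyVector { validity: None, capacity }
    }

    // Like duckdb_vector_get_validity: "null" until materialized.
    fn get_validity(&mut self) -> Option<&mut [u64]> {
        self.validity.as_deref_mut()
    }

    // Like duckdb_vector_ensure_validity_writable: allocate all-valid bits.
    fn ensure_validity_writable(&mut self) {
        if self.validity.is_none() {
            let words = (self.capacity + 63) / 64;
            self.validity = Some(vec![u64::MAX; words]);
        }
    }

    // Like VectorWriter::set_null: ensure first, then clear the row's bit.
    fn set_null(&mut self, row: usize) {
        self.ensure_validity_writable();
        let bits = self.validity.as_mut().unwrap();
        bits[row / 64] &= !(1u64 << (row % 64));
    }
}

fn main() {
    let mut v = ToyVector::new(128);
    assert!(v.get_validity().is_none()); // raw get before ensure: "null"
    v.set_null(70);
    let bits = v.get_validity().unwrap();
    assert_eq!(bits[1] & (1u64 << 6), 0); // row 70 is bit 6 of word 1
    assert_ne!(bits[0] & 1, 0);           // row 0 still valid
}
```

Dereferencing the "null" case in real FFI code is exactly the SEGFAULT described above, which is why the ensure call must come first.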
### Clearing NULL (v0.11.0+)

To undo a previous `set_null` call and mark a row as valid again:

```rust
unsafe { writer.set_valid(row) };
```

`set_valid` also calls `ensure_validity_writable` automatically.
## DataChunk

`DataChunk` wraps a `duckdb_data_chunk` handle, providing ergonomic access to
vectors and metadata without raw FFI calls:

```rust
use quack_rs::data_chunk::DataChunk;

unsafe extern "C" fn my_scan(_info: duckdb_function_info, output: duckdb_data_chunk) {
    let chunk = unsafe { DataChunk::from_raw(output) };
    let mut writer = unsafe { chunk.writer(0) }; // VectorWriter for column 0
    unsafe { writer.write_i64(0, 42) };
    unsafe { chunk.set_size(1) }; // set output row count
}
```
Methods:

- `size()`: current row count
- `set_size(n)`: set row count (0 = end of stream)
- `column_count()`: number of columns
- `vector(col)`: raw `duckdb_vector` handle
- `writer(col)`: `VectorWriter` for a column
- `reader(col)`: `VectorReader` for a column
- `struct_writer(col, field_count)`: `StructWriter` for a STRUCT output column
- `struct_reader(col, field_count)`: `StructReader` for a STRUCT input column
- `struct_field_reader(col, field)`: `VectorReader` for a specific STRUCT field
- `into_chunk_writer()`: convert to `ChunkWriter` with automatic `set_size` on drop
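The `set_size(n)` note ("0 = end of stream") is the scan termination protocol: DuckDB calls a table function's scan callback repeatedly, and the function signals completion by producing an empty chunk. A minimal stand-alone model of that loop (the `Source` type is ours, not part of quack_rs):

```rust
// Toy model of the scan protocol: the engine pulls chunks until size 0.
struct Source {
    remaining: u64,
}

impl Source {
    // Stand-in for a scan callback: fill up to `capacity` rows, and
    // report 0 (what the real callback passes to set_size) when exhausted.
    fn scan(&mut self, capacity: u64) -> u64 {
        let n = self.remaining.min(capacity);
        self.remaining -= n;
        n
    }
}

fn main() {
    let mut src = Source { remaining: 5000 };
    let mut sizes = Vec::new();
    loop {
        let n = src.scan(2048); // DuckDB's default vector size
        sizes.push(n);
        if n == 0 {
            break; // empty chunk: end of stream
        }
    }
    assert_eq!(sizes, vec![2048, 2048, 904, 0]);
}
```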
## StructWriter / StructReader

For STRUCT columns with many fields, creating individual `VectorWriter`/`VectorReader`
instances for each field is verbose. `StructWriter` and `StructReader` pre-create all
field writers/readers at construction:

```rust
// Writing a 5-field STRUCT output:
let mut sw = unsafe { chunk.struct_writer(0, 5) };
unsafe {
    sw.write_bool(row, 0, result.success);
    sw.write_varchar(row, 1, &result.data);
    sw.write_i64(row, 2, result.count);
    sw.write_date(row, 3, result.day);
    sw.write_blob(row, 4, &result.payload);
}

// Reading a 3-field STRUCT input:
let sr = unsafe { chunk.struct_reader(0, 3) };
for row in 0..chunk.size() {
    let name = unsafe { sr.read_str(row, 0) };
    let age = unsafe { sr.read_i32(row, 1) };
    let active = unsafe { sr.read_bool(row, 2) };
}
```
## ChunkWriter

`ChunkWriter` wraps an output `duckdb_data_chunk` and tracks the row count. It
automatically calls `set_size` on drop, preventing the common bug of forgetting
to set the output size:
```rust
let mut cw = unsafe { DataChunk::from_raw(output).into_chunk_writer() };
while let Some(row) = cw.next_row() {
    unsafe { cw.writer(0).write_varchar(row, &data[row].name) };
    unsafe { cw.writer(1).write_i64(row, data[row].value) };
    if cw.is_full() {
        break;
    }
}
// set_size called automatically when `cw` is dropped
```
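The auto-`set_size` behavior is the usual Rust drop-guard (RAII) pattern. A stand-alone sketch of the idea, with a mock chunk in place of the FFI handle (all names here are ours, not the quack_rs API):

```rust
use std::cell::Cell;

// Stand-in for the output chunk: records the size that was committed.
struct MockChunk {
    committed_size: Cell<Option<u64>>,
}

// Minimal ChunkWriter-style guard: counts rows, commits on drop.
struct Guard<'a> {
    chunk: &'a MockChunk,
    rows: u64,
    capacity: u64,
}

impl Guard<'_> {
    fn next_row(&mut self) -> Option<u64> {
        if self.rows < self.capacity {
            let row = self.rows;
            self.rows += 1;
            Some(row)
        } else {
            None
        }
    }
}

impl Drop for Guard<'_> {
    // The set_size equivalent happens here, so the caller cannot forget it
    // on any exit path (break, ?, early return, panic unwind).
    fn drop(&mut self) {
        self.chunk.committed_size.set(Some(self.rows));
    }
}

fn main() {
    let chunk = MockChunk { committed_size: Cell::new(None) };
    {
        let mut cw = Guard { chunk: &chunk, rows: 0, capacity: 2048 };
        for _ in 0..3 {
            cw.next_row();
        }
    } // guard dropped: size committed automatically
    assert_eq!(chunk.committed_size.get(), Some(3));
}
```

Tying the commit to `Drop` is what makes the size correct on every exit path, not just the happy one.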
## ValidityBitmap

For advanced NULL handling beyond `VectorWriter::set_null`, use `ValidityBitmap`
directly:

```rust
use quack_rs::vector::ValidityBitmap;

// Writing NULLs:
let mut bitmap = unsafe { ValidityBitmap::ensure_writable(some_vector) };
unsafe { bitmap.set_row_invalid(row as u64) }; // mark as NULL
unsafe { bitmap.set_row_valid(row as u64) };   // mark as non-NULL

// Reading NULLs:
let bitmap = unsafe { ValidityBitmap::get_read_only(some_vector) };
let is_valid = unsafe { bitmap.row_is_valid(row as u64) };
```
`ValidityBitmap` is available in the prelude: `use quack_rs::prelude::*;`.
## Utility functions

The `quack_rs::vector` module provides two utility functions:

```rust
use quack_rs::vector::{vector_size, vector_get_column_type};

// Returns the default vector size used by DuckDB (typically 2048).
let size: u64 = vector_size();

// Returns the LogicalType of a vector (unsafe: requires a valid duckdb_vector).
let lt = unsafe { vector_get_column_type(some_vector) };
```
## Memory layout details

DuckDB stores vector data as flat arrays. `VectorReader` and `VectorWriter` compute
element addresses as `base_ptr + row * stride`:

```text
[value0][value1][value2]...[valueN]   ← typed array
[validity bitmap]                     ← separate bit array, 1 bit per row
```
The validity bitmap is lazily allocated; it may be null if no NULLs have been written.
This is why `ensure_validity_writable` must be called before any `get_validity` call
that follows a write path.
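Both lookups are plain arithmetic, which a self-contained sketch can make concrete (the helper names are ours; the bit packing assumes DuckDB's 64-bit validity words, one bit per row):

```rust
// Element offset for a fixed-width column: base + row * stride.
fn element_offset(row: usize, stride: usize) -> usize {
    row * stride
}

// Validity bitmap indexing: 1 bit per row, packed into 64-bit words.
fn validity_word_and_bit(row: usize) -> (usize, u32) {
    (row / 64, (row % 64) as u32)
}

fn row_is_valid(bitmap: &[u64], row: usize) -> bool {
    let (word, bit) = validity_word_and_bit(row);
    bitmap[word] & (1u64 << bit) != 0
}

fn main() {
    // An i64 column has an 8-byte stride, so row 3 sits at byte offset 24.
    assert_eq!(element_offset(3, std::mem::size_of::<i64>()), 24);

    // Mark row 70 NULL by clearing bit 6 of word 1.
    let mut bitmap = vec![u64::MAX; 32]; // enough for 2048 rows
    let (word, bit) = validity_word_and_bit(70);
    bitmap[word] &= !(1u64 << bit);
    assert!(!row_is_valid(&bitmap, 70));
    assert!(row_is_valid(&bitmap, 71));
}
```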
## Complete scalar function pattern

```rust
unsafe extern "C" fn my_scalar(
    _info: duckdb_function_info,
    input: duckdb_data_chunk,
    output: duckdb_vector,
) {
    let reader = unsafe { VectorReader::new(input, 0) };
    let mut writer = unsafe { VectorWriter::new(output) };

    for row in 0..reader.row_count() {
        if unsafe { !reader.is_valid(row) } {
            unsafe { writer.set_null(row) };
            continue;
        }
        let value = unsafe { reader.read_i64(row) };
        unsafe { writer.write_i64(row, transform(value)) };
    }
}
```