Develop Database Extensions Using PGRX
This document explains how to develop database extensions using Rust and the PGRX framework. PGRX is a Rust framework for developing extensions for Apache Cloudberry, offering a safe and efficient development experience.
For the core features of PGRX, see PGRX Core Features. For usage notes and caveats, see Considerations and Best Practices for PGRX.
Requirements for Development Environment
- Make sure that your OS is Debian/Ubuntu or RHEL/CentOS.
- Make sure that your Apache Cloudberry cluster is compiled from source code, not installed from an RPM package.
Basic Software Environment
Required software:
- Rust toolchain (rustc, cargo, and rustfmt) - install via https://rustup.rs
- Git
- libclang 11 or higher (for bindgen)
- GCC 7 or higher
PostgreSQL Dependencies
Install required PostgreSQL dependencies for your OS:
For Debian/Ubuntu:
sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison \
    libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config
For RHEL/CentOS:
sudo yum install -y bison-devel readline-devel zlib-devel openssl-devel wget ccache
sudo yum groupinstall -y 'Development Tools'
After installing the dependencies, you can start developing extensions.
Quick Start for PGRX
This section introduces the process of quickly developing extensions using PGRX, including:
- Setting up and installing PGRX
- Creating an extension
- Installing and using the extension
Set up and Install PGRX
PGRX is maintained by PgCentral Foundation, Inc. This guide uses a forked version of PGRX that has better compatibility with Cloudberry. The fork is contributed by community members and customized for Cloudberry, but note that it is not maintained as an official Cloudberry project.
- Set the environment variable for the path to Apache Cloudberry's pg_config, where <pg_config_path> is the path in your Apache Cloudberry cluster (for example, /usr/local/cloudberry-db/bin/pg_config):
    export PGRX_PG_CONFIG_PATH=<pg_config_path>
- Build the PGRX framework:
  - Clone the Apache Cloudberry-compatible pgrx repository:
      git clone https://github.com/cloudberry-contrib/pgrx
      cd pgrx
  - Build the code with the pg14 and cbdb features enabled:
      cargo build --features "pg14, cbdb"
- Install the Apache Cloudberry-compatible cargo-pgrx tool:
    cargo install --path cargo-pgrx/
- Initialize the environment with your database kernel version:
    cargo pgrx init --pg14=`which pg_config`
Create an Extension
- Generate an extension template. This example creates an extension named my_extension:
    cargo pgrx new my_extension
    cd my_extension
  The created directory structure is as follows:
    .
    ├── Cargo.toml
    ├── my_extension.control
    ├── sql
    └── src
        ├── bin
        │   └── pgrx_embed.rs
        └── lib.rs
- Modify the dependencies in Cargo.toml to use your local PGRX:
  - Change pgrx = "0.12.7" under [dependencies] to point to the pgrx directory in your local PGRX repository. For example:
      [dependencies]
      pgrx = { path = "/home/gpadmin/pgrx/pgrx/", features = ["pg14", "cbdb"] }
  - Add pgrx-pg-sys under [dependencies] to point to the pgrx-pg-sys directory in your local PGRX repository. For example:
      [dependencies]
      pgrx-pg-sys = { path = "/home/gpadmin/pgrx/pgrx-pg-sys/", features = ["pg14", "cbdb"] }
  - Change pgrx-tests = "0.12.7" under [dev-dependencies] to point to the pgrx-tests directory in your local PGRX repository:
      [dev-dependencies]
      pgrx-tests = { path = "/home/gpadmin/pgrx/pgrx-tests/" }
- Append the extension name my_extension to the workspace.members array of the Cargo.toml file in the root directory of your local PGRX repository. For example:
    vi /home/gpadmin/pgrx/Cargo.toml

    [workspace]
    resolver = "2"
    members = [
        "cargo-pgrx",
        "pgrx",
        "pgrx-macros",
        "pgrx-pg-config",
        "pgrx-pg-sys",
        "pgrx-sql-entity-graph",
        "pgrx-tests",
        "pgrx-bindgen",
        "my_extension"
    ]
- Grant the current system user permissions on the Apache Cloudberry directory. For example, if the current user is gpadmin and the Apache Cloudberry directory is /usr/local/cloudberrydb:
    sudo chown -R gpadmin:gpadmin /usr/local/cloudberrydb
Install and Use the Extension
- Install the extension:
    cargo pgrx install
- To use the extension in the database, connect to the database and execute the following statements:
    CREATE EXTENSION my_extension;
    -- Test the example function
    SELECT hello_my_extension();
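The function called above comes from the template's src/lib.rs. The following is a minimal sketch of what that file typically contains, assuming the fork keeps the upstream 0.12 template unchanged. The PGRX-specific lines are shown as comments so that the snippet compiles on its own with plain rustc. In a real extension they are active code.

```rust
// Sketch of src/lib.rs as generated by `cargo pgrx new my_extension`.
// Assumption: the Cloudberry fork uses the same template as upstream PGRX.

// use pgrx::prelude::*;          // brings #[pg_extern] and friends into scope
// ::pgrx::pg_module_magic!();    // marks the library as a loadable extension

// #[pg_extern]                   // exposes the function to SQL under the same name
fn hello_my_extension() -> &'static str {
    "Hello, my_extension"
}
```

With the attribute enabled, SELECT hello_my_extension(); would return the string as a text value.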
PGRX Type Mapping
The table below lists the complete mapping of Apache Cloudberry (PostgreSQL) data types to Rust types:
| Database data type | Rust type (as Option<T>) |
|---|---|
| bytea | Vec<u8> or &[u8] (zero-copy) |
| text | String or &str (zero-copy) |
| varchar | String or &str (zero-copy) or char |
| "char" | i8 | 
| smallint | i16 | 
| integer | i32 | 
| bigint | i64 | 
| oid | u32 | 
| real | f32 | 
| double precision | f64 | 
| bool | bool | 
| json | pgrx::Json(serde_json::Value) | 
| jsonb | pgrx::JsonB(serde_json::Value) | 
| date | pgrx::Date | 
| time | pgrx::Time | 
| timestamp | pgrx::Timestamp | 
| time with time zone | pgrx::TimeWithTimeZone | 
| timestamp with time zone | pgrx::TimestampWithTimeZone | 
| anyarray | pgrx::AnyArray | 
| anyelement | pgrx::AnyElement | 
| box | pgrx::pg_sys::BOX | 
| point | pgrx::pg_sys::Point | 
| tid | pgrx::pg_sys::ItemPointerData | 
| cstring | &core::ffi::CStr | 
| numeric | pgrx::Numeric<P, S> or pgrx::AnyNumeric |
| void | () | 
| ARRAY[]::<type> | Vec<Option<T>> or pgrx::Array<T> (zero-copy) |
| int4range | pgrx::Range<i32> | 
| int8range | pgrx::Range<i64> | 
| numrange | pgrx::Range<Numeric<P, S>> or pgrx::Range<AnyRange> |
| daterange | pgrx::Range<pgrx::Date> | 
| tsrange | pgrx::Range<pgrx::Timestamp> | 
| tstzrange | pgrx::Range<pgrx::TimestampWithTimeZone> | 
| NULL | Option::None | 
| internal | pgrx::PgBox<T> (where T can be any Rust/Postgres struct) |
| uuid | pgrx::Uuid([u8; 16]) | 
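To illustrate the Option<T> column of the table, here is a minimal sketch of function signatures in the shape a #[pg_extern] function would take. The function names add_one and shout are hypothetical, and the attribute is shown as a comment so that the snippet compiles without the pgrx crate.

```rust
// Sketch only: in an extension, each function would carry #[pg_extern].
// `add_one` and `shout` are hypothetical names used for illustration.

// #[pg_extern]
/// integer -> integer. SQL NULL arrives as None and None is returned as NULL.
/// In this sketch, overflow also yields None instead of panicking.
fn add_one(value: Option<i32>) -> Option<i32> {
    value.and_then(|v| v.checked_add(1))
}

// #[pg_extern]
/// text -> text. A borrowed &str input maps to the zero-copy form in the table,
/// and the owned String result maps back to text.
fn shout(input: Option<&str>) -> Option<String> {
    input.map(|s| s.to_uppercase())
}
```

Taking Option<T> as an argument means the function itself decides how to treat NULL input, instead of leaving that to the caller.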
Custom Type Conversions
You can implement additional type conversions in the following ways:
- Implement the IntoDatum and FromDatum traits.
- Use #[derive(PostgresType)] and #[derive(PostgresEnum)] for automatic type conversions.
Type Mapping Details
PGRX converts text and varchar to &str or String, and therefore assumes that the database encoding is UTF-8 compatible. If an encoding other than UTF-8 is detected, PGRX triggers a panic to alert the developer. Because UTF-8 validation has a performance cost, do not rely on this check as a safeguard.
The default encoding for PostgreSQL servers is SQL_ASCII, which guarantees neither ASCII nor UTF-8 (Apache Cloudberry will accept but ignore non-ASCII bytes). For best results, always use UTF-8 encoding with PGRX and explicitly set the database encoding when creating the database.
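The effect of UTF-8 validation can be sketched with the Rust standard library alone. This is an illustration of the general technique, not PGRX's actual conversion code, and the helper name bytes_to_str is hypothetical.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: interpret the raw bytes of a text value as &str.
/// `std::str::from_utf8` scans every byte, which is where the validation
/// cost mentioned above comes from.
fn bytes_to_str(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}

/// Hypothetical helper: report whether a byte sequence would pass validation.
fn is_valid_utf8(raw: &[u8]) -> bool {
    bytes_to_str(raw).is_ok()
}
```

For example, the bytes of "café" encoded as UTF-8 pass validation, while the same word encoded as Latin-1 (where é is the single byte 0xE9) is rejected. This is the kind of input a database with a non-UTF-8 encoding can produce.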
PGRX Core Features
Complete Management for Development Environment
cargo-pgrx provides a complete set of command-line tools:
- cargo pgrx new: Quickly creates a new extension.
- cargo pgrx init: Installs or registers an Apache Cloudberry (PostgreSQL) instance.
- cargo pgrx run: Interactively tests the extension in psql (or pgcli).
- cargo pgrx test: Performs unit tests across multiple Apache Cloudberry (PostgreSQL) versions.
- cargo pgrx package: Creates an extension installation package.
Automatic Schema Generation
- Lets you implement extensions entirely in Rust.
- Automatically maps various Rust types to Apache Cloudberry (PostgreSQL) types.
- Automatically generates the SQL schema (can also be generated manually using cargo pgrx schema).
- Uses extension_sql! and extension_sql_file! to include custom SQL.
Safety First
- Converts Rust's panic! into an Apache Cloudberry/PostgreSQL ERROR, which aborts the transaction, not the process.
- Memory management follows Rust's drop semantics, including in the presence of panic! and elog(ERROR).
- Uses the #[pg_guard] procedural macro to ensure safety.
- Apache Cloudberry's Datum is represented as Option<T> where T: FromDatum, with NULL values safely represented as Option::<T>::None.
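The idea of turning a panic into an error value at a boundary, instead of letting it take down the process, can be sketched with the standard library. This is an analogy under stated assumptions, not PGRX's implementation: run_guarded and checked_div are hypothetical names, and the Err value stands in for a database ERROR.

```rust
use std::panic::{self, UnwindSafe};

/// Hypothetical stand-in for a guard at the boundary between Rust and the
/// database: run `f` and convert a panic into an Err instead of unwinding
/// any further.
fn run_guarded<F, T>(f: F) -> Result<T, String>
where
    F: FnOnce() -> T + UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}

/// Hypothetical function that panics on invalid input.
fn checked_div(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}
```

Calling run_guarded(|| checked_div(1, 0)) returns an Err carrying the panic message while the calling code keeps running, which mirrors how a failed statement aborts only its transaction.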
UDF Support
- Uses the #[pg_extern] annotation to expose functions to Apache Cloudberry.
- Returns pgrx::iter::SetOfIterator<'a, T> to implement RETURNS SETOF.
- Returns pgrx::iter::TableIterator<'a, T> to implement RETURNS TABLE (...).
- Uses #[pg_trigger] to create trigger functions.
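Set-returning and table-returning functions are built on ordinary Rust iterators. The sketch below shows only the iterator logic in plain Rust so that it compiles without the pgrx crate. In an extension, the assumption is that each iterator would be wrapped with SetOfIterator::new(...) or TableIterator::new(...) and the function annotated with #[pg_extern]. The function names are hypothetical.

```rust
/// Would back RETURNS SETOF bigint: one value per output row.
/// Sketch only; a non-positive step is rejected up front.
fn generate_series(start: i64, finish: i64, step: i64) -> impl Iterator<Item = i64> {
    assert!(step > 0, "step must be positive");
    (start..=finish).step_by(step as usize)
}

/// Would back RETURNS TABLE (idx integer, word text): one tuple per output
/// row, with one tuple field per column.
fn split_words(input: &str) -> impl Iterator<Item = (i32, String)> + '_ {
    input
        .split_whitespace()
        .enumerate()
        .map(|(i, word)| (i as i32 + 1, word.to_string()))
}
```

Because the rows come from an iterator, they are produced lazily and do not have to be collected into a Vec first.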
Simple Custom Types
- Uses #[derive(PostgresType)] to treat Rust structs as Apache Cloudberry types.
  - By default, CBOR encoding is used for storage, and JSON is used as the human-readable format.
  - Supports custom in-memory, on-disk, and human-readable formats.
- Uses #[derive(PostgresEnum)] to treat Rust enums as Apache Cloudberry (PostgreSQL) enums.
- Supports composite types via the pgrx::composite_type!("Sample") macro.
Server Programming Interface (SPI)
- Provides safe access to SPI.
- Transparently returns owned Datums from the SPI context.
Advanced Features
- Safely accesses Apache Cloudberry's memory context system via pgrx::PgMemoryContexts.
- Supports executor/planner/transaction/subtransaction hooks.
- Safely handles Apache Cloudberry pointers using pgrx::PgBox<T> (similar to alloc::boxed::Box<T>).
- Protects extern "C" Rust functions that are passed to Apache Cloudberry using the #[pg_guard] procedural macro.
- Accesses Apache Cloudberry's logging system via eprintln!-like macros.
- Directly (unsafe) accesses Apache Cloudberry internals via the pgrx::pg_sys module.
Considerations and Best Practices for PGRX
Thread support:
- Apache Cloudberry strictly follows a single-threaded model.
- Custom threads cannot call internal database functions.
- The interaction method for asynchronous contexts is still under exploration.
Encoding requirements:
- It is recommended to use UTF-8 encoding.
- The default server encoding is SQL_ASCII.
- It is recommended to explicitly set the encoding when creating the database.
Debugging and Development Tips
- Use cargo pgrx test for unit testing.
- Use #[pg_guard] to ensure memory safety.
- For custom types, use appropriate serialization methods.
Learning Resources for PGRX
The following resources can help you gain a deeper understanding of PGRX:
- Learn about all available cargo-pgrx subcommands and options: cargo-pgrx command details
- Learn how to define and use custom data types: custom type examples
- Explore how to implement custom operators: operator functions and operator classes/families
- Learn how to use shared memory: shared memory support
- Browse example code implementations: more example code