AllocDB API
Scope
This document defines the transport-neutral API exposed by
crates/allocdb-node::api.
It is intentionally not an HTTP or RPC spec. Instead, it first fixes:
- request and response shapes
- binary wire encoding
- definite vs indefinite submission failures
- strict-read behavior for resource and reservation queries
- metrics exposure
Network transport, authentication, and multi-node routing are out of scope for this stage.
Current Rust Entry Points
The API is available through:
- allocdb_node::SingleNodeEngine::handle_api_request
- allocdb_node::SingleNodeEngine::handle_api_bytes
- allocdb_node::encode_request
- allocdb_node::decode_request
- allocdb_node::encode_response
- allocdb_node::decode_response
The public request encoder is fallible. It returns length_too_large if a variable-length field
cannot fit in the binary wire length prefix.
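The length check behind length_too_large can be sketched as follows. This is a minimal illustration, not the crate's encoder: the u32 prefix width is an assumption (this document does not state the real width), and EncodeError, length_prefix, and put_bytes are hypothetical names.

```rust
use std::convert::TryFrom;

/// Hypothetical codec error; only the `length_too_large` name comes from this document.
#[derive(Debug, PartialEq, Eq)]
pub enum EncodeError {
    LengthTooLarge { len: usize },
}

/// Builds the little-endian length prefix for a variable-length field.
/// ASSUMPTION: the prefix is a u32; the real prefix width is not specified here.
pub fn length_prefix(len: usize) -> Result<[u8; 4], EncodeError> {
    u32::try_from(len)
        .map(u32::to_le_bytes)
        .map_err(|_| EncodeError::LengthTooLarge { len })
}

/// Appends `payload` to `out` as prefix + bytes.
/// Nothing is written if the length does not fit in the prefix.
pub fn put_bytes(out: &mut Vec<u8>, payload: &[u8]) -> Result<(), EncodeError> {
    let prefix = length_prefix(payload.len())?;
    out.extend_from_slice(&prefix);
    out.extend_from_slice(payload);
    Ok(())
}
```

Checking the length before writing any bytes keeps a failed encode from leaving a partial frame in the output buffer.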
handle_api_bytes is the transport-facing path:
encoded request -> decode -> engine -> encode -> encoded response
Request Types
The current request set is:
- submit
- get_resource
- get_reservation
- get_metrics
- tick_expirations
Submit
submit {
request_slot : u64
payload : bytes
}
payload is the existing encoded core ClientRequest payload:
operation_id : u128
client_id : u128
command : ...
The API deliberately wraps that payload instead of redefining command encoding a second time.
Get Resource
get_resource {
required_lsn : u64 | none
resource_id : u128
}
If required_lsn is present, the node enforces the strict-read fence before serving the read.
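A minimal sketch of the fence check, under the assumption that the fence passes once the node has applied an LSN at or beyond required_lsn. FenceCheck and check_fence are hypothetical names, and modelling last_applied_lsn as an Option (none before anything has been applied) is also an assumption.

```rust
/// Outcome of the strict-read fence check.
/// The failure variant mirrors `fence_not_applied(required_lsn, last_applied_lsn)`.
#[derive(Debug, PartialEq, Eq)]
pub enum FenceCheck {
    Pass,
    FenceNotApplied {
        required_lsn: u64,
        last_applied_lsn: Option<u64>,
    },
}

/// ASSUMPTION: a read may be served when `last_applied_lsn >= required_lsn`.
/// A request without `required_lsn` is not fenced at all.
pub fn check_fence(required_lsn: Option<u64>, last_applied_lsn: Option<u64>) -> FenceCheck {
    match required_lsn {
        None => FenceCheck::Pass,
        Some(required) => match last_applied_lsn {
            Some(applied) if applied >= required => FenceCheck::Pass,
            _ => FenceCheck::FenceNotApplied {
                required_lsn: required,
                last_applied_lsn,
            },
        },
    }
}
```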
Get Reservation
get_reservation {
required_lsn : u64 | none
current_slot : u64
reservation_id : u128
}
current_slot is required because reservation retirement is defined in logical slots.
Get Metrics
get_metrics {
current_wall_clock_slot : u64
}
The caller provides wall-clock progress so the engine can derive logical_slot_lag and
expiration_backlog deterministically.
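The point of passing the slot in is that the derivation becomes a pure function of its inputs, with no clock read inside the engine. A sketch under stated assumptions: the formulas below (lag as the distance from last_request_slot, backlog as the count of deadlines at or before the supplied slot) are guesses at the definitions, and DerivedMetrics and derive_metrics are hypothetical names.

```rust
/// The two metrics the text says are derived from caller-supplied wall-clock progress.
#[derive(Debug, PartialEq, Eq)]
pub struct DerivedMetrics {
    pub logical_slot_lag: u64,
    pub expiration_backlog: usize,
}

/// Pure derivation: the same inputs always give the same metrics.
/// ASSUMPTION: lag = wall-clock slot minus last request slot (clamped at zero),
/// and a reservation counts as backlog when its deadline slot is <= the wall-clock slot.
pub fn derive_metrics(
    current_wall_clock_slot: u64,
    last_request_slot: u64,
    pending_deadline_slots: &[u64],
) -> DerivedMetrics {
    let logical_slot_lag = current_wall_clock_slot.saturating_sub(last_request_slot);
    let expiration_backlog = pending_deadline_slots
        .iter()
        .filter(|&&deadline| deadline <= current_wall_clock_slot)
        .count();
    DerivedMetrics {
        logical_slot_lag,
        expiration_backlog,
    }
}
```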
Tick Expirations
tick_expirations {
current_wall_clock_slot : u64
}
This request drives the bounded expiration scheduler for one maintenance tick. The node inspects
due reservations, commits up to MAX_EXPIRATIONS_PER_TICK internal expire commands through the
WAL, and applies them through the normal executor path.
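The bounded selection step can be sketched as below. It covers only the choice of which reservations to expire in one tick, not the WAL commit or executor path. The constant's value, the Pending type, the select_due name, the `<=` due test, and the slice-order selection are all assumptions for illustration.

```rust
/// ASSUMPTION: placeholder value; the real bound is defined by the engine.
pub const MAX_EXPIRATIONS_PER_TICK: usize = 4;

/// Hypothetical view of a reservation that has not yet expired.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pending {
    pub reservation_id: u128,
    pub deadline_slot: u64,
}

/// Picks the reservations to expire in this tick: those that are due,
/// in slice order, capped at MAX_EXPIRATIONS_PER_TICK.
pub fn select_due(pending: &[Pending], current_wall_clock_slot: u64) -> Vec<u128> {
    pending
        .iter()
        .filter(|p| p.deadline_slot <= current_wall_clock_slot)
        .take(MAX_EXPIRATIONS_PER_TICK)
        .map(|p| p.reservation_id)
        .collect()
}
```

Anything left over after the cap stays due and would be picked up by a later tick, which is what keeps the work per tick bounded.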
Submission Responses
Submission responses are split into two classes:
- committed
- rejected
Committed
committed {
applied_lsn : u64
result_code : enum
reservation_id : u128 | none
deadline_slot : u64 | none
from_retry_cache : bool
}
This is the committed state-machine result. A committed command may still return domain failures
such as resource_busy or holder_mismatch; those are allocator outcomes, not submission-layer
failures.
Rejected
rejected {
category : enum { definite_failure, indefinite }
code : enum
}
Current submission failure codes are:
- invalid_request(buffer_too_short | invalid_command_tag(tag) | invalid_layout)
- slot_overflow(kind, request_slot, delta)
- command_too_large(encoded_len, max_command_bytes)
- lsn_exhausted(last_applied_lsn)
- overloaded(queue_depth, queue_capacity)
- engine_halted
- storage_failure
The required category mapping is:
- definite failure:
  - invalid_request
  - slot_overflow
  - command_too_large
  - lsn_exhausted
  - overloaded
- indefinite:
  - engine_halted
  - storage_failure
This distinction is mandatory. The API must not flatten indefinite write outcomes into ordinary hard failures.
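The mapping above can be written as an exhaustive match, so that adding a new failure code forces an explicit category decision at compile time. This is a sketch: the Rust type and variant names are hypothetical, and the per-code payload fields are omitted for brevity.

```rust
/// The two rejection categories from the `rejected` envelope.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    DefiniteFailure,
    Indefinite,
}

/// The current failure codes, without their payload fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureCode {
    InvalidRequest,
    SlotOverflow,
    CommandTooLarge,
    LsnExhausted,
    Overloaded,
    EngineHalted,
    StorageFailure,
}

/// Category mapping as listed in this document.
/// No wildcard arm: a new code will not compile until it is categorized.
pub fn category(code: FailureCode) -> Category {
    match code {
        FailureCode::InvalidRequest
        | FailureCode::SlotOverflow
        | FailureCode::CommandTooLarge
        | FailureCode::LsnExhausted
        | FailureCode::Overloaded => Category::DefiniteFailure,
        FailureCode::EngineHalted | FailureCode::StorageFailure => Category::Indefinite,
    }
}
```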
Read Responses
Resource Query
get_resource returns one of:
- found(resource_view)
- not_found
- engine_halted
- fence_not_applied(required_lsn, last_applied_lsn)
Current resource_view fields are:
resource_id : u128
state : enum { available, reserved, confirmed }
current_reservation_id : u128 | none
version : u64
Reservation Query
get_reservation returns one of:
- found(reservation_view)
- not_found
- retired
- engine_halted
- fence_not_applied(required_lsn, last_applied_lsn)
Current reservation_view fields are:
reservation_id : u128
resource_id : u128
holder_id : u128
state : enum { reserved, confirmed, released, expired }
created_lsn : u64
deadline_slot : u64
released_lsn : u64 | none
retire_after_slot : u64 | none
retired is distinct from not_found because bounded history is part of the product contract.
The live reservation record remains queryable until retire_after_slot; after that, the engine
may drop the full record but must keep returning retired for shard-local reservation_id
values at or below its retired watermark. This watermark is conservative: once full history is
gone, the API no longer distinguishes an aged-out reservation from an older shard-local
reservation_id below that watermark.
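One way to picture the found / retired / not_found split is the lookup below. It is a sketch under stated assumptions, not the engine's data structure: ReservationHistory, LiveRecord, and Lookup are hypothetical, and treating a record as retired only once current_slot is strictly greater than retire_after_slot is an assumption about the boundary.

```rust
use std::collections::HashMap;

/// Hypothetical subset of a live reservation record.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LiveRecord {
    pub reservation_id: u128,
    pub retire_after_slot: Option<u64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Lookup {
    Found(LiveRecord),
    Retired,
    NotFound,
}

/// Live records plus a conservative watermark for ids whose history was dropped.
pub struct ReservationHistory {
    pub live: HashMap<u128, LiveRecord>,
    pub retired_watermark: Option<u128>,
}

impl ReservationHistory {
    pub fn lookup(&self, reservation_id: u128, current_slot: u64) -> Lookup {
        if let Some(record) = self.live.get(&reservation_id) {
            // ASSUMPTION: the record stays queryable through retire_after_slot inclusive.
            return match record.retire_after_slot {
                Some(limit) if current_slot > limit => Lookup::Retired,
                _ => Lookup::Found(*record),
            };
        }
        // No full record left: every id at or below the watermark reports retired.
        match self.retired_watermark {
            Some(watermark) if reservation_id <= watermark => Lookup::Retired,
            _ => Lookup::NotFound,
        }
    }
}
```

The second match is where the conservatism shows: an id at or below the watermark reports retired whether or not it ever named an aged-out reservation.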
If the live engine has halted after a WAL-path ambiguity, reads must fail closed with
engine_halted until recovery rebuilds memory from durable state.
Expiration Tick Responses
tick_expirations returns one of:
- applied(processed_count, last_applied_lsn)
- rejected(category, code)
applied {
processed_count : u32
last_applied_lsn : u64 | none
}
processed_count is the number of internal expire commands committed in this maintenance tick.
last_applied_lsn is absent when no due expiration was processed.
rejected(category, code) reuses the same failure envelope as submit. The current tick path can
return the same indefinite failures as submit, and it can also return definite failures such as
slot_overflow if the derived expiration retirement slot would exceed u64::MAX.
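The overflow case can be sketched with checked arithmetic. SlotOverflow and derive_slot are hypothetical names, and the kind field listed for slot_overflow is omitted because this document does not enumerate its values.

```rust
/// Hypothetical error carrying the inputs that overflowed.
#[derive(Debug, PartialEq, Eq)]
pub struct SlotOverflow {
    pub request_slot: u64,
    pub delta: u64,
}

/// Derives a later slot (for example a retirement slot) from a base slot and a delta,
/// returning an error instead of wrapping when the sum would exceed u64::MAX.
pub fn derive_slot(request_slot: u64, delta: u64) -> Result<u64, SlotOverflow> {
    request_slot
        .checked_add(delta)
        .ok_or(SlotOverflow { request_slot, delta })
}
```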
Metrics Response
get_metrics returns:
metrics {
  queue_depth
  queue_capacity
  accepting_writes
  recovery {
    startup_kind
    loaded_snapshot_lsn
    replayed_wal_frame_count
    replayed_wal_last_lsn
    active_snapshot_lsn
  }
  core {
    last_applied_lsn
    last_request_slot
    logical_slot_lag
    expiration_backlog
    operation_table_used
    operation_table_capacity
    operation_table_utilization_pct
  }
}
loaded_snapshot_lsn remains optional. An empty durable snapshot reports startup_kind = snapshot_only with loaded_snapshot_lsn = none.
Wire Discipline
The current binary encoding is:
- little-endian
- fixed-width integers
- explicit request and response tags
- length-prefixed byte payloads where variable length is required
- no generic serializer dependency
Malformed outer API frames are codec failures. Malformed submit.payload values are submission
failures with category definite_failure and code invalid_request(...).
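To make the discipline concrete, here is a hand-rolled round trip for a get_resource frame. It is illustrative only: the tag value, the one-byte presence flag for the optional required_lsn, the field order on the wire, and the DecodeError type are all assumptions, not the crate's actual layout.

```rust
/// ASSUMPTION: placeholder tag value for the get_resource request.
const TAG_GET_RESOURCE: u8 = 2;

#[derive(Debug, PartialEq, Eq)]
pub struct GetResource {
    pub required_lsn: Option<u64>,
    pub resource_id: u128,
}

/// Hypothetical codec failure for malformed outer frames.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    BufferTooShort,
    InvalidTag(u8),
    InvalidLayout,
}

pub fn encode_get_resource(req: &GetResource) -> Vec<u8> {
    let mut out = vec![TAG_GET_RESOURCE];
    match req.required_lsn {
        // ASSUMPTION: optional fields use a 0/1 presence byte.
        Some(lsn) => {
            out.push(1);
            out.extend_from_slice(&lsn.to_le_bytes());
        }
        None => out.push(0),
    }
    out.extend_from_slice(&req.resource_id.to_le_bytes());
    out
}

/// Splits exactly N bytes off the front of `input`, advancing it.
fn take<const N: usize>(input: &mut &[u8]) -> Result<[u8; N], DecodeError> {
    if input.len() < N {
        return Err(DecodeError::BufferTooShort);
    }
    let (head, rest) = input.split_at(N);
    *input = rest;
    let mut buf = [0u8; N];
    buf.copy_from_slice(head);
    Ok(buf)
}

pub fn decode_get_resource(mut input: &[u8]) -> Result<GetResource, DecodeError> {
    let [tag] = take::<1>(&mut input)?;
    if tag != TAG_GET_RESOURCE {
        return Err(DecodeError::InvalidTag(tag));
    }
    let [flag] = take::<1>(&mut input)?;
    let required_lsn = match flag {
        0 => None,
        1 => Some(u64::from_le_bytes(take::<8>(&mut input)?)),
        _ => return Err(DecodeError::InvalidLayout),
    };
    let resource_id = u128::from_le_bytes(take::<16>(&mut input)?);
    // Trailing bytes are rejected rather than silently ignored.
    if !input.is_empty() {
        return Err(DecodeError::InvalidLayout);
    }
    Ok(GetResource { required_lsn, resource_id })
}
```

Fixed-width little-endian fields let the decoder consume a frame with plain slice splits, which is consistent with the stated goal of avoiding a generic serializer dependency.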
Non-Goals
The current API does not yet specify:
- HTTP or gRPC transport
- authentication or authorization
- batch submission
- streaming reads
- replication-aware routing
- client SDK ergonomics