Aranya Product MVP Spec

Introduction

This is a product specification for version 1 of the standalone Aranya daemon and user library. The goal is to provide a commercial off-the-shelf solution for integrating Aranya. Customers should be able to download the Aranya daemon and user library, set up a team, and begin using Aranya.

Much of this document extends the Aranya Beta spec, which was implemented as the beta version of the product. To capture all details related to the MVP in one place, relevant information from the beta spec has been carried over into this document.

Primary Goals:

  1. Provide a low-friction solution customers can use to better secure their infrastructure.
  2. Easily set up a team and onboard devices.
  3. Implement a default policy that works well enough for a wide variety of situations that customers can relate to.
  4. Provide stable and backwards compatible APIs that allow devices to interact with Aranya.
  5. Expose an API for point-to-point high performance encrypted communication using IP for transport.

Secondary internal goals:

  1. Design for the ability to swap out policies in a future version.
  2. Design for future improvements, like additional data planes and configurable roles.

A glossary is available in Appendix C.

Usage Requirements

Aranya is a decentralized message delivery platform with authorization built in. Below are some basic requirements for running this version of the product:

System Requirements:

  • We will target x86, ARM, and ARM64 running Linux.
    • Must run on Mac for development reasons.
    • ARM (arm32) support is in progress; additional work is required to add it to CI. https://github.com/aranya-project/aranya-docs/pull/24/files#r1917143524
  • We will assume there is IPv4 or IPv6 connectivity.

Measure the following values to estimate system requirements and include results in this spec.

  • Memory usage
  • Disk requirements:
    • Storage amount
    • Storage device write speed
    • Storage device seek speed

For a full and up-to-date list, start with this issue: https://github.com/aranya-project/aranya/issues/62

Architecture

There will be two subsystems as part of the product:

  1. The daemon
  2. The client library

The daemon is a standalone process that runs an Aranya instance and exposes the control plane API via IPC. It handles setting up all the dependencies that Aranya needs including storage, policy, and network communication. The daemon will periodically sync with registered peers and handle commands as they are synced. The current design does not require that effects are stored on disk. Other implementations have required effects be stored on disk in case the program is terminated after the effect has been emitted, but before it has been processed by the daemon or user application. This extends to the shared memory; there is no requirement it be preserved between program restarts.

The client library exists inside the device’s process and is used to interact with the daemon. The client library also includes a light wrapper around QUIC Channels, allowing the device to encrypt data and send messages to peers. QUIC channels will provide networking and mostly live in the client library. Channel setup will require communication with the daemon. Current plans for language bindings are to initially focus on the C API. The Rust client library is a bonus that we gain from the Rust -> C compilation. The C API should include autogenerated docs.

The daemon is able to participate in multiple teams using the same root identity by using some scheme of generating leaf keybundles per team. This also removes the one-keybundle-per-process requirement. For MVP we can lay the groundwork for the multi-keybundle approach, and better management of identities can be added post-MVP or as time allows for MVP.

daemon subsystems with detail

Config

On startup, the daemon requires a path to the working directory. That directory contains a configuration file with the path to the Unix domain socket that should be created, along with other values. The daemon working directory will be structured so that it is easy to understand and contains the required state information. The directory structure will have subdirectories (example: storage, config, etc.) to better organize the content.
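
A minimal sketch of how such a working directory might be modeled in Rust. The subdirectory names follow the examples given above (storage, config); the WorkDirLayout type and the daemon.toml file name are hypothetical and are not defined by this spec.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical layout of the daemon working directory.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorkDirLayout {
    /// Path passed to the daemon on startup.
    pub root: PathBuf,
    /// Configuration file holding the Unix domain socket path and other values.
    /// The file name is an assumption made for illustration.
    pub config_file: PathBuf,
    /// Directory for graph storage and other persistent state.
    pub storage_dir: PathBuf,
}

impl WorkDirLayout {
    /// Derives all paths from the working directory root without touching the disk.
    pub fn new(root: impl AsRef<Path>) -> Self {
        let root = root.as_ref().to_path_buf();
        Self {
            config_file: root.join("config").join("daemon.toml"),
            storage_dir: root.join("storage"),
            root,
        }
    }

    /// Creates any missing subdirectories. Calling it again on an existing
    /// layout is harmless, which helps when recovering from a partial
    /// initialization.
    pub fn create_dirs(&self) -> std::io::Result<()> {
        if let Some(config_dir) = self.config_file.parent() {
            std::fs::create_dir_all(config_dir)?;
        }
        std::fs::create_dir_all(&self.storage_dir)
    }
}
```

Deriving every path from a single root keeps the startup argument list short, which matches the intent that only the working directory path is required on startup.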

All other config values are provided when a client context is initialized and passed to the daemon over the Unix Domain Socket. This approach allows the device API to drive the configuration of the daemon, and can help reduce errors in config mismatch by minimizing the number of duplicate config values. The daemon will need to validate the configuration files and be able to handle cases of partial initialization.

The daemon will expose a simple API that clients can use to request connection information for data planes like AQC. The client will be able to request the AQC config from the daemon and use that directly. Other configuration values can also be added to this mechanism. After MVP, additional APIs can be exposed to change those configured values.

A config object will be used to instantiate a graph, i.e. in the action to perform the Init command and the API call for adding an existing graph (see CreateTeam and AddTeam APIs below). The config object will be versioned and extensible, and should not break API compatibility.

Components

The system is made up of multiple components:

The “client API” is the top-level component and contains local operations such as enabling/disabling syncing or initializing a new device.

The Access Control Plane is the top-level control plane, enabling IDAM operations and other on-graph operations. The Access Control Plane is used to manage keys, address assignment, roles, and labels as set out in the policy, which for the MVP is written in version 2.0 of the policy language.

For the MVP, Aranya Quic Channels (AQC) will be built to provide a simple API for sending and receiving messages using a modified Quic transport. AQC contains its own control plane for control messages, as well as the main data plane for moving data between devices. See Aranya Quic Channels API.

Component structure:

  • Local Client API: syncing, local device management
    • Access control plane (IDAM control plane): IDAM lifecycle
      • Quic Channels
        • Quic Channels control plane
        • Quic Channels data plane
      • … additional planes in future versions

Client APIs

The client APIs are local-only API endpoints that do not create commands on the graph. They are mostly used to manage the local state. Depending on the language, the endpoint names may use a different format that is more idiomatic to that language, such as snake_case for C.

  • ClientInit(client_config) -> client - creates a client connection to the daemon IPC.
  • GetKeyBundle() -> keybundle - returns the current device’s public key bundle.
  • GetDeviceId() -> device_id - returns the device’s device ID.
  • AddTeam(add_team_config) -> bool - add an existing team to the local device store with a specified team configuration. Not an Aranya action/command. Add team can accept either a raw IKM or wrapped PSK seed depending on the mode provided in the team config.
  • RemoveTeam(team_id) -> bool - remove a team from the local device store. Not an Aranya action/command.
  • SerializeKeyBundle(scheme, keybundle) -> bytes - serialize a keybundle to a given format/scheme. Ideally, this can be used to serialize to either human readable or machine readable formats.
  • DeserializeKeyBundle(scheme, bytes) -> keybundle - deserialize a keybundle from a given format/scheme.
  • SerializeId(id) -> bytes - serialize an ID to the standard base58 format. https://github.com/aranya-project/aranya-docs/pull/24/files#r1915516900
  • DeserializeId(bytes) -> id - deserialize an ID from the standard base58 format. https://github.com/aranya-project/aranya-docs/pull/24/files#r1915516900
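
As an illustration of SerializeId and DeserializeId, the following is a minimal std-only base58 sketch. It assumes the Bitcoin-style alphabet (the spec only says "standard base58"), treats an ID as an arbitrary byte slice, and returns a String rather than raw bytes for readability. The function names serialize_id and deserialize_id are hypothetical and do not describe the actual client library.

```rust
/// Bitcoin-style base58 alphabet. This choice is an assumption; the spec
/// only refers to "the standard base58 format".
const ALPHABET: &[u8; 58] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Encodes raw ID bytes as base58 text. Leading zero bytes become leading '1's.
pub fn serialize_id(id: &[u8]) -> String {
    let zeros = id.iter().take_while(|&&b| b == 0).count();
    // Base58 digits of the remaining bytes, least significant digit first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in &id[zeros..] {
        let mut carry = byte as u32;
        for digit in digits.iter_mut() {
            carry += (*digit as u32) << 8;
            *digit = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }
    let mut out = "1".repeat(zeros);
    out.extend(digits.iter().rev().map(|&d| ALPHABET[d as usize] as char));
    out
}

/// Decodes base58 text back into ID bytes. Returns None if the text contains
/// a character outside the alphabet.
pub fn deserialize_id(text: &str) -> Option<Vec<u8>> {
    let zeros = text.bytes().take_while(|&c| c == b'1').count();
    // Decoded bytes, least significant byte first.
    let mut bytes: Vec<u8> = Vec::new();
    for c in text.bytes().skip(zeros) {
        let mut carry = ALPHABET.iter().position(|&a| a == c)? as u32;
        for b in bytes.iter_mut() {
            carry += (*b as u32) * 58;
            *b = (carry & 0xff) as u8;
            carry >>= 8;
        }
        while carry > 0 {
            bytes.push((carry & 0xff) as u8);
            carry >>= 8;
        }
    }
    let mut out = vec![0u8; zeros];
    out.extend(bytes.iter().rev());
    Some(out)
}
```

Base58 omits the easily confused characters 0, O, I, and l, which is why it suits IDs that people may need to copy by hand.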

Config

Client Config

  • Daemon IPC unix domain socket path
  • AQC config (network address to bind to)

Create Team Config

The create team config contains information needed to configure a created team in Aranya. The QUIC syncer config is a field of the create team config.

struct CreateTeamConfig {
  quic_sync: Option<CreateTeamQuicSyncConfig>,
}

Add Team Config

The add team config contains information needed to configure an added team in Aranya. The QUIC syncer config is a field of the add team config.

A device needs to know the ID of the team when adding it to local device storage.

struct AddTeamConfig {
  team_id: TeamId,
  quic_sync: Option<AddTeamQuicSyncConfig>,
}

QUIC Syncer Config

To configure the QUIC syncer for a team, a PSK seed is needed to bootstrap the rustls PSK used to secure the sync protocol for a team.

There are 3 mutually exclusive modes for configuring the team PSK for the QUIC syncer (represented as an enum). SeedMode:

  • GeneratePskSeed - Default and most secure option. Aranya generates the PSK seed internally and returns a wrapped PSK seed.
  • WrappedPskSeed(peer_enc_pk, encrypted_psk, encap_key) - Encrypted PSK seed passed in as input. Key is authenticated using the sender’s public encryption key.
  • IKM(ikm) - Provides raw input key material to derive a PSK seed.

CreateTeam(...) accepts one of the PSK modes as input and returns the PSK seed bytes. If GeneratePskSeed mode is specified, the input key material used to derive the PSK seed is generated internally, which is the preferred and most secure option. Specifying the IKM mode uses the provided raw IKM to derive the PSK seed. WrappedPskSeed is not a valid mode for this operation.

AddTeam(...) accepts the PSK bytes from the WrappedPskSeed or IKM modes. GeneratePskSeed is not a valid mode for this operation.
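
The mode restrictions above can be sketched as an enum plus a validity check. The variant names come from this spec; the byte-buffer field types, the TeamOp and InvalidSeedMode types, and the check_seed_mode function are assumptions made for illustration.

```rust
/// The three mutually exclusive PSK seed modes named in the spec.
/// Field types are assumed to be plain byte buffers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SeedMode {
    /// Aranya generates the PSK seed internally (default).
    GeneratePskSeed,
    /// An encrypted PSK seed received from a peer.
    WrappedPskSeed {
        peer_enc_pk: Vec<u8>,
        encrypted_psk: Vec<u8>,
        encap_key: Vec<u8>,
    },
    /// Raw input key material used to derive the PSK seed.
    Ikm(Vec<u8>),
}

/// Which API call the mode is being passed to (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TeamOp {
    CreateTeam,
    AddTeam,
}

/// Error describing a mode that the operation does not accept.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InvalidSeedMode {
    pub op: TeamOp,
    pub mode: &'static str,
}

/// Rejects the combinations the spec marks as invalid:
/// WrappedPskSeed for CreateTeam and GeneratePskSeed for AddTeam.
pub fn check_seed_mode(op: TeamOp, mode: &SeedMode) -> Result<(), InvalidSeedMode> {
    match (op, mode) {
        (TeamOp::CreateTeam, SeedMode::WrappedPskSeed { .. }) => {
            Err(InvalidSeedMode { op, mode: "WrappedPskSeed" })
        }
        (TeamOp::AddTeam, SeedMode::GeneratePskSeed) => {
            Err(InvalidSeedMode { op, mode: "GeneratePskSeed" })
        }
        _ => Ok(()),
    }
}
```

An alternative design would give CreateTeam and AddTeam separate mode enums so that invalid combinations cannot be expressed at all; the runtime check is shown here only because the spec describes a single SeedMode enum.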

Sync Peer Config

Fields for configuring a sync peer:

  • interval - time to wait between syncs
  • sync_now - whether to sync immediately

Sync API

The IDAM control plane will be managed by the daemon process, which will be accessed through the APIs provided in IDAM Control Plane API via an IPC mechanism. The daemon will be responsible for syncing state with peers. To enable this, the APIs below will be provided to add and remove peers to sync with.

Unix Domain Sockets and shm will be used for IPC between the daemon and client library. Additional IPC mechanisms may be explored in the future.

  • AddSyncPeer(address, team_id, sync_config) -> bool - add a peer to start syncing with at a specific rate. Syncing should support DNS resolution in the case a domain name is used. Syncs immediately the first time. The config object includes details for authorization (potentially authentication too) and options for syncing by pushing or pulling.
  • RemoveSyncPeer(address, team_id) -> bool - remove a sync peer associated with the given address, team_id.
  • SyncNow(address, team_id, Option<sync_config>) -> bool - trigger an immediate sync with the peer. If a sync config is provided, use that. If no config arg is provided, fall back to the config used when the peer was added via AddSyncPeer. If the peer was not added, use a default config or error. https://github.com/aranya-project/aranya-docs/pull/24/files#r1917188406
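
The config fallback described for SyncNow can be sketched as follows. The SyncPeerConfig fields come from the Sync Peer Config section; the SyncPeers registry, the string-typed address and team ID, and the method names are assumptions. The sketch deliberately leaves the "default config or error" decision for unknown peers to the caller, since the spec has not settled it.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Fields listed under "Sync Peer Config".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SyncPeerConfig {
    /// Time to wait between syncs.
    pub interval: Duration,
    /// Whether to sync immediately.
    pub sync_now: bool,
}

/// Hypothetical in-memory registry of sync peers keyed by (address, team_id).
#[derive(Default)]
pub struct SyncPeers {
    peers: HashMap<(String, String), SyncPeerConfig>,
}

impl SyncPeers {
    /// Registers (or replaces) a peer together with its sync config.
    pub fn add_sync_peer(&mut self, address: &str, team_id: &str, config: SyncPeerConfig) {
        self.peers
            .insert((address.to_string(), team_id.to_string()), config);
    }

    /// Removes a peer; returns false if it was not registered.
    pub fn remove_sync_peer(&mut self, address: &str, team_id: &str) -> bool {
        self.peers
            .remove(&(address.to_string(), team_id.to_string()))
            .is_some()
    }

    /// Picks the config for SyncNow: the explicit one if given, otherwise the
    /// one stored by add_sync_peer. Returns None for an unknown peer so the
    /// caller can choose between a default config and an error.
    pub fn resolve_sync_now_config(
        &self,
        address: &str,
        team_id: &str,
        explicit: Option<SyncPeerConfig>,
    ) -> Option<SyncPeerConfig> {
        explicit.or_else(|| {
            self.peers
                .get(&(address.to_string(), team_id.to_string()))
                .cloned()
        })
    }
}
```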

Onboarding API

Onboarding is easy to implement; moving keys between devices is left to the integration.

  1. Create Device (NewDevice)
  2. Get Device Key (the current device’s KeyBundle, which is given to the admin)
  3. Give KeyBundle to admin on team (integration problem)
  4. Admin does AddDevice(device_key_bundle)
  5. Get team_id from admin (integration detail)
  6. Add team_id to client (AddTeam)
  7. Sync with device on team (AddSyncPeer)
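
The onboarding steps above can be walked through with a small in-memory mock. Nothing here talks to a real daemon: KeyBundle, Team, Device, and onboard are hypothetical stand-ins that only mirror the order of the calls, and the out-of-band exchanges (steps 3 and 5) are reduced to passing values between objects.

```rust
use std::collections::HashSet;

/// Stand-in for a device's public key bundle.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct KeyBundle(pub String);

/// Admin-side view of a team (mock).
pub struct Team {
    pub team_id: String,
    devices: HashSet<KeyBundle>,
}

impl Team {
    pub fn new(team_id: &str) -> Self {
        Self { team_id: team_id.to_string(), devices: HashSet::new() }
    }

    /// Step 4: the admin adds the device's key bundle to the team.
    pub fn add_device(&mut self, key_bundle: KeyBundle) -> bool {
        self.devices.insert(key_bundle)
    }

    pub fn has_device(&self, key_bundle: &KeyBundle) -> bool {
        self.devices.contains(key_bundle)
    }
}

/// Device-side local state (mock).
pub struct Device {
    key_bundle: KeyBundle,
    teams: HashSet<String>,
    sync_peers: HashSet<(String, String)>,
}

impl Device {
    /// Step 1: create the device.
    pub fn new_device(name: &str) -> Self {
        Self {
            key_bundle: KeyBundle(format!("keybundle-of-{}", name)),
            teams: HashSet::new(),
            sync_peers: HashSet::new(),
        }
    }

    /// Step 2: get the key bundle to hand to the admin.
    pub fn get_key_bundle(&self) -> KeyBundle {
        self.key_bundle.clone()
    }

    /// Step 6: add the team to the local device store.
    pub fn add_team(&mut self, team_id: &str) -> bool {
        self.teams.insert(team_id.to_string())
    }

    /// Step 7: register a sync peer; in this mock it fails for unknown teams.
    pub fn add_sync_peer(&mut self, address: &str, team_id: &str) -> bool {
        self.teams.contains(team_id)
            && self.sync_peers.insert((address.to_string(), team_id.to_string()))
    }
}

/// Runs steps 2 through 7 for a freshly created device.
pub fn onboard(device: &mut Device, team: &mut Team, peer_address: &str) -> bool {
    let key_bundle = device.get_key_bundle(); // steps 2 and 3
    if !team.add_device(key_bundle) {
        return false; // step 4 failed: device already on the team
    }
    let team_id = team.team_id.clone(); // step 5
    device.add_team(&team_id) && device.add_sync_peer(peer_address, &team_id)
}
```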

IDAM Control Plane API

The IDAM control plane is for managing identity and authorization by interacting with the graph. Each endpoint creates one or more commands on the graph. The first command in the graph, aka the Init command, contains the system’s policy that defines the IDAM control plane for bootstrapping.

  • InitTeamConfig(Option<seed>) -> team_config - Initialize a TeamConfig object with a QUIC syncer PSK seed.
  • CreateTeam(owner_keybundle, create_team_config) -> team_id - initialize the graph, creating the team with the author as the owner. Configures the team based on the team config. Includes policy for bootstrapping. Accepts one of the PSK modes as input. If GeneratePskSeed mode is specified, a PSK seed is generated internally, which is the preferred and most secure option.
  • Rand() -> random_bytes - generate random bytes from CSPRNG. Can be used to generate a raw PSK IKM for the QUIC syncer.
  • EncryptPskSeedForPeer(team_id, keybundle) -> wrapped_seed - encrypts a QUIC syncer PSK seed for another peer device using the peer’s public encryption key. Returns wrapped PSK seed type containing the team ID and encrypted PSK seed. The team ID is included in this type so only a single serialized type needs to be transmitted to the peer before it can invoke AddTeam().
  • CloseTeam(team_id) -> bool - close the team and stop all operations on the graph.
  • AddDeviceToTeam(team_id, keybundle) -> bool - add a device to the team with the default role.
  • RemoveDeviceFromTeam(team_id, device_id) -> bool - remove a device from the team.
  • AssignRole(team_id, device_id, role) -> bool - assign a role to a device.
  • RevokeRole(team_id, device_id, role) -> bool - remove a role from a device.
  • DefineRole(role_name) -> bool - define a new role type that can be assigned to devices.
  • ReplaceCommandRole(command_name, authoring_role) -> bool - allow a particular role type to author the specified command type.
  • ReplaceEntityClass(team_id, device_id, entity_class) -> bool - replace the entity class associated to a device.

Aranya Channels API

AQC uses a modified Quic transport implementation that supports the ability to use custom cryptography and has latency-based congestion control (see s2n-quic: https://github.com/aws/s2n-quic). More information on AQC, including the list of AQC-specific APIs, can be found in the AQC spec:

  • Draft version: https://github.com/aranya-project/aranya-docs/blob/2-quic-channels/docs/quic-channels.md
  • Eventually where the final spec will live: https://github.com/aranya-project/aranya-docs/blob/main/docs/quic-channels.md

The Quic Channels plane is split into two sub-planes: the Quic channels control plane and the Quic channels data plane. The AQC control plane is responsible for any Aranya commands or ephemeral commands, while the AQC data plane contains only data-related APIs.

Embedded devices that implement a subset of the Aranya library should still be able to sync with clients that have the full product integrated. This compatibility is planned for Post-MVP.

Both peer devices must be granted permission to use a label prior to creating an AQC channel with each other:

  • CreateLabel(team_id, label) - create a label
  • DeleteLabel(team_id, label) - delete a label
  • AssignLabel(team_id, device_id, label) - assign a label to a device
  • RevokeLabel(team_id, device_id, label) - revoke a label from a device
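
The rule that both peers must hold a label before a channel can be created can be sketched with an in-memory table. In the product this state lives on the graph and is enforced by policy; the LabelTable type, its string-typed IDs, and the can_create_channel helper are hypothetical, and the team_id parameter is omitted for brevity.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-in for label state within a single team.
#[derive(Default)]
pub struct LabelTable {
    defined: HashSet<String>,
    assigned: HashMap<String, HashSet<String>>, // device_id -> labels
}

impl LabelTable {
    /// CreateLabel: returns false if the label already exists.
    pub fn create_label(&mut self, label: &str) -> bool {
        self.defined.insert(label.to_string())
    }

    /// DeleteLabel: also drops every assignment of the label (an assumption).
    pub fn delete_label(&mut self, label: &str) -> bool {
        let existed = self.defined.remove(label);
        for labels in self.assigned.values_mut() {
            labels.remove(label);
        }
        existed
    }

    /// AssignLabel: only labels that have been created can be assigned.
    pub fn assign_label(&mut self, device_id: &str, label: &str) -> bool {
        if !self.defined.contains(label) {
            return false;
        }
        self.assigned
            .entry(device_id.to_string())
            .or_default()
            .insert(label.to_string())
    }

    /// RevokeLabel: returns false if the device did not hold the label.
    pub fn revoke_label(&mut self, device_id: &str, label: &str) -> bool {
        self.assigned
            .get_mut(device_id)
            .map_or(false, |labels| labels.remove(label))
    }

    fn has_label(&self, device_id: &str, label: &str) -> bool {
        self.assigned
            .get(device_id)
            .map_or(false, |labels| labels.contains(label))
    }

    /// A channel on `label` is allowed only if both peers hold the label.
    pub fn can_create_channel(&self, device_a: &str, device_b: &str, label: &str) -> bool {
        self.has_label(device_a, label) && self.has_label(device_b, label)
    }
}
```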

AQC API

  • SetAqcNetIdentifier(team_id, device_id, net_identifier) - associate a network address with a device for use with AQC. If the address already exists for this device, it is replaced with the new address. Capable of resolving addresses via DNS. For use with CreateChannel and receiving messages. Accepts a DNS name, an IPv4 address, or an IPv6 address. The current implementation uses a bidi map, so reverse lookup is possible.
  • UnsetAqcNetIdentifier(team_id, device_id, net_identifier) - disassociate an AQC network address from a device.
  • CreateAqcBidiChannel(team_id, peer_net_ident, label) -> channel - create a bidirectional AQC channel with the given peer.
  • CreateAqcUniChannel(team_id, peer_net_ident, label) -> channel - create a unidirectional AQC channel with the given peer.
  • ReceiveAqcChannel() -> channel - receive the next available unidirectional or bidirectional AQC channel.
  • TryReceiveAqcChannel() -> channel - non-blocking version of ReceiveAqcChannel().
  • ReceiveAqcStream(channel) -> stream - receive the next available AQC stream for an AQC channel.
  • CreateAqcBidiStream(channel) -> stream - create a new bidirectional stream on an existing bidirectional AQC channel.
  • CreateAqcUniStream(channel) -> stream - create a new unidirectional stream on an existing bidirectional or unidirectional AQC channel.
  • SendAqcStreamData(stream, data) -> bool - send data via an AQC stream.
  • ReceiveAqcStreamData(stream) -> Option<data> - block until data is received via an AQC stream.
  • TryReceiveAqcStreamData(stream) -> Option<data> - attempt to receive data from an AQC stream.
  • DeleteAqcBidiChannel(channel) - delete a bidirectional AQC channel.
  • DeleteAqcUniChannel(channel) - delete a unidirectional AQC channel.
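
The difference between the blocking and non-blocking receive calls can be illustrated with an analogy built on std::sync::mpsc. This is not AQC: there is no QUIC, encryption, or networking involved, and StreamTx, StreamRx, and stream_pair are hypothetical names. The sketch only shows the intended calling semantics of ReceiveAqcStreamData versus TryReceiveAqcStreamData.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

/// Sending half of a mock stream.
pub struct StreamTx {
    tx: Sender<Vec<u8>>,
}

/// Receiving half of a mock stream.
pub struct StreamRx {
    rx: Receiver<Vec<u8>>,
}

/// Creates a connected pair of mock stream halves.
pub fn stream_pair() -> (StreamTx, StreamRx) {
    let (tx, rx) = mpsc::channel();
    (StreamTx { tx }, StreamRx { rx })
}

impl StreamTx {
    /// Analogous to SendAqcStreamData: returns false if the receiver is gone.
    pub fn send_data(&self, data: &[u8]) -> bool {
        self.tx.send(data.to_vec()).is_ok()
    }
}

impl StreamRx {
    /// Analogous to ReceiveAqcStreamData: blocks until data arrives.
    /// Returns None once the sending half has been dropped and the queue is empty.
    pub fn receive_data(&self) -> Option<Vec<u8>> {
        self.rx.recv().ok()
    }

    /// Analogous to TryReceiveAqcStreamData: returns immediately.
    /// Returns None when no data is queued or the sender is gone.
    pub fn try_receive_data(&self) -> Option<Vec<u8>> {
        self.rx.try_recv().ok()
    }
}
```

A polling loop would call try_receive_data and do other work when it returns None, while a dedicated reader thread can simply call receive_data in a loop until it returns None.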

Graph Querying APIs

FactDB queries over the current perspective of the graph should be possible through ephemeral commands in the policy that will return query results in their emitted effects. These APIs are likely to be moved to nice-to-have or Post-MVP, but are currently planned for MVP.

  • QueryRoleAssignment(device_id) -> Role
  • QueryDeviceKeybundle(device_id) -> Keybundle
  • QueryAqcNetworkId(device_id) -> network_str
  • QueryAqcLabelAssignments(device_id) -> Vec<label>
  • QueryAqcLabelExists(device_id) -> Vec<label>

Roles & Permissions

There will be 4 default roles with the following set of permissions for each. The MVP will also include an expansion of the role system, allowing the device to create custom roles and reassign permissions for specific commands to custom roles. There will be a CLI for managing permissions.

owner

  • all permissions excluding sending data on channel labels
  • create/close team
  • add/remove devices
  • elevate/revoke permissions for devices up to and including owner
  • define/undefine channel labels
  • assign/remove addresses/names for Aranya channels
  • assign/revoke channel labels

admin

  • elevate/revoke permissions for devices up to a max level of operator
  • define/undefine channel labels
  • assign/remove addresses/names for Aranya channels
  • cannot send data on channel labels

operator

  • add (new) / remove devices in team
  • assign member role to devices in team
  • assign/revoke channel labels
  • assign/remove addresses/names for channel
  • cannot send data on channel labels

member

  • use Aranya channel

Notes:

  • Devices can always remove and demote themselves
  • Devices can only remove or demote another device of equal role when their entity class is higher.
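
The two notes above can be sketched as a rank check. This covers only the notes, not the per-role permission lists, so a real policy would combine it with those permissions. The numeric entity_class, the DeviceInfo type, and the treatment of unequal roles (a higher role passes the check, a lower role does not) are assumptions made for illustration.

```rust
use std::cmp::Ordering;

/// Default roles, declared from lowest to highest so the derived ordering
/// matches the hierarchy described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Member,
    Operator,
    Admin,
    Owner,
}

/// Hypothetical summary of a device for the purpose of this check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeviceInfo {
    pub id: u32,
    pub role: Role,
    /// Assumed to be a comparable number where larger means higher.
    pub entity_class: u32,
}

/// Rank precondition for removing or demoting `target`:
/// - a device can always act on itself;
/// - between equal roles, the actor needs a higher entity class;
/// - a higher role passes and a lower role fails (assumption, not in the spec).
pub fn rank_allows_remove_or_demote(actor: &DeviceInfo, target: &DeviceInfo) -> bool {
    if actor.id == target.id {
        return true;
    }
    match actor.role.cmp(&target.role) {
        Ordering::Greater => true,
        Ordering::Equal => actor.entity_class > target.entity_class,
        Ordering::Less => false,
    }
}
```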

Documentation

API documentation must be provided for the client API covering the functions and behavior of each API call. Most likely, this will take the form of a doxygen-like web page. Developers can use this to look up language agnostic functions for operating the client API. The documentation should also include tutorials and a quickstart to get developers up and running with the product as soon as possible. Documentation should also be provided for the daemon so that developers and sysadmins can understand the requirements and operations of the daemon.

Appendix

Appendix A: TODOs needed to complete MVP updates

This is a list of things that need to be tracked and accomplished for the MVP. Scope is subject to change. This list is unordered within the sections.

Urgent

High

  • Measure system requirements and record values in this spec. https://github.com/aranya-project/aranya/issues/62
  • Maybe also include some measures on compile time and build requirements
  • Quic channels spec and implementation
  • ARM32 support for CI and releases
  • Expose bytes of IDs to end user

Normal

  • Update existing code and docs to use “device” instead of “user”.
  • Ensure “graph” is used in core, while “team” is used in product.
  • Add “ephemeral” marker in command metadata.
  • Update AddSyncPeer to sync immediately
  • Add config object with just version field to CreateTeam and AddTeam APIs.
  • Implement SyncNow
  • Add config object to AddSyncPeer and include the rate parameter in it.
  • Might want to add max # of bytes to sync, timeout for number of secs it stays open, etc.
  • Custom policy roles spec and implementation
  • Allow a single keybundle to participate in multiple teams via some sort of root-identity-key (or other mechanism)
  • Fact DB queries via session commands
  • Standardize C API names to follow naming scheme.
  • Update key agreement commands to have a field for the sender’s graph head ID. https://github.com/aranya-project/aranya-docs/pull/24#discussion_r1937642908

Low

  • Set up CI to measure resource usage
  • Implement FactDB query APIs
  • CLI for managing permissions
  • Update default policy to have one role type authoring each command

Nice to have

This list encompasses anything moved to Post-MVP but would be nice to get into the MVP if we have resources.

  • Implement Finalization
  • Implement AwaitCommand and IsPresent (see Post-MVP spec).

Uncategorized

  • Update spec diagrams.
  • Outline and apply daemon working directory changes.
  • Establish best terminology for reference (e.g. “device store”?).
  • Improve polling mechanism for Quic channels

Appendix B: Naming Schemes

The C API will use the following naming scheme:

(libraryPrefix)_((receiver)_)?(functionName)

Where

  • libraryPrefix is a global prefix for the whole library (ex: Aranya)
  • receiver is a thing we are calling a function on (see below for example)
  • functionName is the name of the function to call.

For example, an API endpoint with the name FooClientBar breaks down as follows:

  • Foo is the library.
  • Client is the receiver.
  • Bar is the function name.

FooClientBar calls the function Bar on an instance of Client (passed as an argument) provided by the library Foo.
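
A minimal sketch of composing a C API name from the pattern above. The c_api_name helper is hypothetical, and passing lowercase parts is an assumption based on the earlier remark that C endpoints use snake_case; the spec itself does not state the resulting C symbol for FooClientBar.

```rust
/// Builds a name following (libraryPrefix)_((receiver)_)?(functionName).
/// The receiver part is optional, matching the `?` in the pattern.
pub fn c_api_name(library_prefix: &str, receiver: Option<&str>, function_name: &str) -> String {
    match receiver {
        Some(receiver) => format!("{}_{}_{}", library_prefix, receiver, function_name),
        None => format!("{}_{}", library_prefix, function_name),
    }
}
```

Under these assumptions, the FooClientBar example corresponds to c_api_name("foo", Some("client"), "bar"), and a function without a receiver simply omits the middle part.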

Verb pairs

  • Add/Remove
  • Assign/Revoke
  • Create/Delete
  • Set/Unset

Additional Notes

  • All Rust APIs use Result unless stated otherwise
  • C APIs return the AranyaError type, and use parameters for return values unless some other pattern is required.

Appendix C: Glossary

  • Aranya Quic Channels (AQC) - An integration of Aranya with QUIC to provide a secure and integrated transport. See QUIC channels spec. TODO(declan): link
  • Aranya - the main library that drives the control plane and policy execution.
  • daemon - a long-lived process, typically running in the background, that handles commands and keeps state.
  • device - a computer, sometimes associated with a user but can also be independent. In this model, we consider devices instead of users directly to accommodate autonomous entities.
  • IPC - inter-process communication.
  • policy - an Aranya policy, containing the logic and rules of the system.
  • sync - a request to synchronize the commands on the control plane. Syncs are currently pull only, so the device that requests a sync receives commands from the requestee.
  • team - a group of devices with an associated policy.
  • user - a person who may operate a device.