
Aranya Product MVP Spec

Introduction

This is a product specification for version 1 of the standalone Aranya daemon and user library. The goal is to provide a commercial off-the-shelf solution for integrating Aranya. Customers should be able to download the Aranya daemon and user library, set up a team, and begin using Aranya.

Much of this document extends the Aranya Beta spec, which was implemented as the beta version of the product. To capture all details related to the MVP in one place, relevant information from the beta spec has been carried into this document.

Primary Goals:

  1. Provide a low friction solution customers can use to better secure their infrastructure.
  2. Easily set up a team and onboard devices.
  3. Implement a default policy that works well enough for a wide variety of situations that customers can relate to.
  4. Provide stable and backwards compatible APIs that allow devices to interact with Aranya.
  5. Expose an API for point-to-point high performance encrypted communication using IP for transport.

Secondary internal goals:

  1. Design for the ability to swap out policies in a future version.
  2. Design for future improvements, like additional data planes and configurable roles.

A glossary is available in Appendix C.

Usage Requirements

Aranya is a decentralized message delivery platform with authorization built in. Below are some basic requirements for running this version of the product:

System Requirements:

  • We will target x86, ARM, and ARM64 running Linux.
    • Must run on Mac for development reasons.
    • ARM (arm32) support is in progress; additional work is required to add it to CI. https://github.com/aranya-project/aranya-docs/pull/24/files#r1917143524
  • We will assume there is IPv4 or IPv6 connectivity.

Measure the following values to estimate system requirements and include results in this spec.

  • Memory usage
  • Disk requirements:
    • Storage amount
    • Storage device write speed
    • Storage device seek speed

For a full and up to date list, start with this issue: https://github.com/aranya-project/aranya/issues/62

Architecture

There will be two subsystems as part of the product:

  1. The daemon
  2. The client library

The daemon is a standalone process that runs an Aranya instance and exposes the control plane API via IPC. It handles setting up all the dependencies that Aranya needs including storage, policy, and network communication. The daemon will periodically sync with registered peers and handle commands as they are synced. The current design does not require that effects are stored on disk. Other implementations have required effects be stored on disk in case the program is terminated after the effect has been emitted, but before it has been processed by the daemon or user application. This extends to the shared memory; there is no requirement it be preserved between program restarts.

The client library exists inside the device’s process and is used to interact with the daemon. The client library also includes a light wrapper around QUIC Channels (and similarly for AFC), allowing the device to encrypt data and send messages to peers. QUIC channels will provide networking and mostly live in the client library. Channel setup will require communication with the daemon. Current plans for language bindings are to initially focus on the C API. The Rust client library is a bonus that we gain from the Rust -> C compilation. The C API should include autogenerated docs.

The daemon is able to participate in multiple teams using the same root identity by using some scheme of generating leaf keybundles per team. This also removes the one-keybundle-per-process requirement. For MVP we can lay the groundwork for the multi-keybundle approach, and better management of identities can be added post-MVP or as time allows for MVP.

Daemon subsystems in detail

Config

On startup, the daemon requires a path to the working directory. That directory contains a configuration file with the path to the Unix domain socket that should be created, along with other values. The daemon working directory will be structured so that it is easy to understand and contains the required state information. The directory will have subdirectories (for example: storage, config, etc.) to better organize the content.

All other config values are provided when a client context is initialized and passed to the daemon over the Unix Domain Socket. This approach allows the device API to drive the configuration of the daemon, and can help reduce errors in config mismatch by minimizing the number of duplicate config values. The daemon will need to validate the configuration files and be able to handle cases of partial initialization.

The daemon will expose a simple API that clients can use to request connection information for data planes like AFC. The client will be able to request the AFC config from the daemon and use that directly. Other configuration values can also be added to this mechanism. After MVP, additional APIs can be exposed to change those configured values.

A config object will be used to instantiate a graph, i.e. in the action to perform the Init command and the API call for adding an existing graph (see CreateTeam and AddTeam APIs below). The config object will be versioned and extensible, and should not break API compatibility.
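A minimal sketch of what a versioned, extensible config object could look like. The names `TeamConfig`, `TeamConfigBuilder`, and `CURRENT_VERSION` are hypothetical and chosen for illustration; the spec only requires a version field and forward-compatible extension.

```rust
/// Highest config version this sketch understands (hypothetical value).
pub const CURRENT_VERSION: u32 = 1;

/// Hypothetical config passed to CreateTeam / AddTeam.
/// `non_exhaustive` lets later versions add fields without breaking
/// downstream crates that read this struct.
#[derive(Debug, Clone, PartialEq)]
#[non_exhaustive]
pub struct TeamConfig {
    pub version: u32,
}

/// Callers go through a builder, so new optional fields can be added
/// later without changing existing call sites.
#[derive(Debug, Default)]
pub struct TeamConfigBuilder {
    version: Option<u32>,
}

impl TeamConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Pin an explicit version; defaults to CURRENT_VERSION if unset.
    pub fn version(mut self, version: u32) -> Self {
        self.version = Some(version);
        self
    }

    /// Validate and produce the config, rejecting unknown versions.
    pub fn build(self) -> Result<TeamConfig, String> {
        let version = self.version.unwrap_or(CURRENT_VERSION);
        if version == 0 || version > CURRENT_VERSION {
            return Err(format!("unsupported config version {version}"));
        }
        Ok(TeamConfig { version })
    }
}
```

Under these assumptions, `TeamConfigBuilder::new().build()` yields a config at the current version, while a version the library does not know is rejected up front rather than partially applied.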

Components

The system is made up of multiple components:

The “client API” is the top level component and contains local operations such as enabling/disabling syncing or initializing a new device.

The Access Control Plane is the top level control plane, enabling IDAM operations and other on-graph operations. It is used to manage keys, address assignment, roles, and labels as set out in the policy, which for the MVP is written in version 2.0 of the policy language.

For the MVP, Aranya Quic Channels (AQC) will be built to provide a simple API for sending and receiving messages using a modified Quic transport. The Aranya Quic Channels contains its own control plane for control messages, as well as the main data plane for moving data between devices. This is similar to Aranya Fast Channels (AFC) from the beta version but with a different underlying transport (AFC uses TCP). See Aranya Quic Channels API.

Component structure:

  • Local Client API: syncing, local device management
    • Access control plane (IDAM control plane): IDAM lifecycle
      • Quic Channels
        • Quic Channels control plane
        • Quic Channels data plane
      • AFC (build flag required to enable API)
        • AFC control plane
        • AFC data plane
      • … additional planes in future versions

Rust features will be used for some features like the raw AFC interface. By default, AFC will not be included in the user facing API unless a specific build flag is present. The goal of this choice is to better signal which APIs are best suited for common use.
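A minimal sketch of how a Cargo feature could keep AFC out of the default API surface. The feature name `afc` and the module and function names here are assumptions for illustration, not the product's actual layout.

```rust
/// Always-available channel API (stand-in for Quic Channels).
pub mod aqc {
    pub fn name() -> &'static str {
        "aqc"
    }
}

/// Only compiled when the crate is built with the hypothetical
/// `afc` feature, e.g. `cargo build --features afc`.
#[cfg(feature = "afc")]
pub mod afc {
    pub fn name() -> &'static str {
        "afc"
    }
}

/// Lists the data planes compiled into this build.
#[allow(unused_mut)] // `planes` is only mutated when `afc` is enabled
pub fn enabled_planes() -> Vec<&'static str> {
    let mut planes = vec![aqc::name()];
    #[cfg(feature = "afc")]
    planes.push(afc::name());
    planes
}
```

In a default build of this sketch, code that refers to the `afc` module fails to compile, which is one way to signal that the API is opt-in.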

Client APIs

The client APIs are local-only API endpoints that do not create commands on the graph. They are mostly used to manage the local state. Depending on the language, the endpoints may be a different format that is more idiomatic to that language such as snake_case for C.

  • Connect(daemon_sock, afc_shm_path, max_chans, afc_listen_addr) -> client - creates a client connection to the daemon.
  • GetKeyBundle() -> keybundle - returns the current device’s public key bundle.
  • GetDeviceId() -> device_id - returns the device’s device ID.
  • AddTeam(team_id, config) -> bool - add an existing team to the local device store. Not an Aranya action/command.
  • RemoveTeam(team_id) -> bool - remove a team from the local device store. Not an Aranya action/command.
  • SerializeKeyBundle(scheme, keybundle) -> bytes - serialize a keybundle to a given format/scheme. Ideally, this can be used to serialize to either human readable or machine readable formats.
  • DeserializeKeyBundle(scheme, bytes) -> keybundle - deserialize a keybundle from a given format/scheme.
  • SerializeId(id) -> bytes - serialize an ID to the standard base58 format. https://github.com/aranya-project/aranya-docs/pull/24/files#r1915516900
  • DeserializeId(bytes) -> id - deserialize an ID from the standard base58 format. https://github.com/aranya-project/aranya-docs/pull/24/files#r1915516900
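The ID serialization pair can be sketched as a plain base58 encoder and decoder. This assumes the Bitcoin base58 alphabet and no checksum; the spec does not say which variant "standard base58" refers to, and the function names below are illustrative snake_case forms of SerializeId and DeserializeId.

```rust
/// Bitcoin-style base58 alphabet (assumed; omits 0, O, I and l).
const ALPHABET: &[u8; 58] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Encode raw ID bytes as base58 text.
pub fn serialize_id(id: &[u8]) -> String {
    // Each leading zero byte is represented by one leading '1'.
    let zeros = id.iter().take_while(|&&b| b == 0).count();
    // Base-58 digits of the remaining bytes, least significant first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in &id[zeros..] {
        let mut carry = byte as u32;
        for d in digits.iter_mut() {
            carry += (*d as u32) << 8;
            *d = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }
    let mut out = String::with_capacity(zeros + digits.len());
    out.extend(std::iter::repeat('1').take(zeros));
    out.extend(digits.iter().rev().map(|&d| ALPHABET[d as usize] as char));
    out
}

/// Decode base58 text back into ID bytes.
/// Returns None if the text contains a character outside the alphabet.
pub fn deserialize_id(text: &str) -> Option<Vec<u8>> {
    // Each leading '1' stands for one leading zero byte.
    let ones = text.bytes().take_while(|&c| c == b'1').count();
    // Decoded bytes, least significant first.
    let mut bytes: Vec<u8> = Vec::new();
    for c in text.bytes().skip(ones) {
        let mut carry = ALPHABET.iter().position(|&a| a == c)? as u32;
        for b in bytes.iter_mut() {
            carry += (*b as u32) * 58;
            *b = (carry & 0xff) as u8;
            carry >>= 8;
        }
        while carry > 0 {
            bytes.push((carry & 0xff) as u8);
            carry >>= 8;
        }
    }
    let mut out = vec![0u8; ones];
    out.extend(bytes.iter().rev());
    Some(out)
}
```

Round-tripping an ID through `serialize_id` and `deserialize_id` returns the original bytes, including any leading zero bytes.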

Sync API

The IDAM control plane will be managed by the daemon process, which is accessed over IPC using the APIs described in IDAM Control Plane API. The daemon will be responsible for syncing state with peers. To enable this, the APIs below will be provided to add and remove peers to sync with.

Unix Domain Sockets and shm will be used for IPC between the daemon and client library. Additional IPC mechanisms may be explored in the future.

  • AddSyncPeer(address, team_id, config) -> bool - add a peer to start syncing with at a specific rate. Syncing should support DNS resolution in the case a domain name is used. Syncs immediately the first time. The config object includes details for authorization (potentially authentication too) and options for syncing by pushing or pulling.
  • RemoveSyncPeer(address, team_id) -> bool - remove a sync peer associated with the given address, team_id.
  • SyncNow(address, team_id, Option<config>) -> bool - trigger an immediate sync for the peer. If a sync config is provided, use that. If no config arg is provided, fall back to the config used when the peer was added via AddSyncPeer. If the peer was not added, use a default config or error. https://github.com/aranya-project/aranya-docs/pull/24/files#r1917188406
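The config fallback described for SyncNow can be sketched with an in-memory peer table. All type and method names below are hypothetical, `SyncConfig` carries only an interval for brevity, and this sketch picks the "error" branch for unknown peers, where the spec leaves the choice between a default config and an error open.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical per-peer sync settings.
#[derive(Debug, Clone, PartialEq)]
pub struct SyncConfig {
    pub interval: Duration,
}

#[derive(Debug, PartialEq)]
pub enum SyncError {
    /// SyncNow was called without a config for a peer never added.
    UnknownPeer,
}

/// Registered sync peers, keyed by (address, team_id).
#[derive(Default)]
pub struct SyncPeers {
    peers: HashMap<(String, String), SyncConfig>,
}

impl SyncPeers {
    /// Returns true if the peer was newly added.
    pub fn add_sync_peer(&mut self, address: &str, team_id: &str, config: SyncConfig) -> bool {
        self.peers
            .insert((address.to_owned(), team_id.to_owned()), config)
            .is_none()
    }

    /// Returns true if a peer was removed.
    pub fn remove_sync_peer(&mut self, address: &str, team_id: &str) -> bool {
        self.peers
            .remove(&(address.to_owned(), team_id.to_owned()))
            .is_some()
    }

    /// Pick the config SyncNow should use: the explicit one if given,
    /// else the one stored by AddSyncPeer, else an error.
    pub fn config_for_sync_now(
        &self,
        address: &str,
        team_id: &str,
        config: Option<SyncConfig>,
    ) -> Result<SyncConfig, SyncError> {
        if let Some(explicit) = config {
            return Ok(explicit);
        }
        self.peers
            .get(&(address.to_owned(), team_id.to_owned()))
            .cloned()
            .ok_or(SyncError::UnknownPeer)
    }
}
```

An explicit config always wins in this sketch, so a caller can trigger a one-off sync with different settings without changing what was registered through AddSyncPeer.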

Onboarding API

Onboarding is easy to implement; moving keys between devices is handled by the integration.

  1. Create Device (NewDevice)
  2. Get Device Key (Current device KeyBundle (what you give to the admin))
  3. Give KeyBundle to admin on team (integration problem)
  4. Admin does AddDevice(device_key_bundle)
  5. Get team_id from admin (integration detail)
  6. Add team_id to client (AddTeam)
  7. Sync with device on team (AddSyncPeer)
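The steps above can be sketched with a small in-memory model. Nothing here talks to a real daemon: `KeyBundle`, `Team`, `Device`, and `onboard` are hypothetical stand-ins, and the out-of-band exchanges in steps 3 and 5 are reduced to passing values between two structs.

```rust
use std::collections::HashSet;

/// Stand-in for a device's public key bundle (hypothetical representation).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct KeyBundle(pub String);

/// Admin-side view of a team: its ID plus the enrolled key bundles.
pub struct Team {
    pub team_id: String,
    pub devices: HashSet<KeyBundle>,
}

impl Team {
    pub fn new(team_id: &str) -> Self {
        Team { team_id: team_id.to_owned(), devices: HashSet::new() }
    }

    /// Step 4: the admin adds the device; false if already enrolled.
    pub fn add_device(&mut self, key_bundle: KeyBundle) -> bool {
        self.devices.insert(key_bundle)
    }
}

/// Local state of the device being onboarded.
pub struct Device {
    pub key_bundle: KeyBundle,
    pub teams: Vec<String>,
    pub sync_peers: Vec<(String, String)>,
}

impl Device {
    /// Step 1: create the device (keys are faked from the name here).
    pub fn new(name: &str) -> Self {
        Device {
            key_bundle: KeyBundle(format!("keys-for-{name}")),
            teams: Vec::new(),
            sync_peers: Vec::new(),
        }
    }

    /// Step 2: the bundle the device hands to the admin.
    pub fn get_key_bundle(&self) -> KeyBundle {
        self.key_bundle.clone()
    }

    /// Step 6: record the team locally; false if already present.
    pub fn add_team(&mut self, team_id: &str) -> bool {
        if self.teams.iter().any(|t| t.as_str() == team_id) {
            return false;
        }
        self.teams.push(team_id.to_owned());
        true
    }

    /// Step 7: register a sync peer, but only for a known team.
    pub fn add_sync_peer(&mut self, address: &str, team_id: &str) -> bool {
        if !self.teams.iter().any(|t| t.as_str() == team_id) {
            return false;
        }
        self.sync_peers.push((address.to_owned(), team_id.to_owned()));
        true
    }
}

/// Runs steps 2 through 7 in order, stopping at the first failure.
pub fn onboard(team: &mut Team, device: &mut Device, peer_address: &str) -> bool {
    let key_bundle = device.get_key_bundle(); // steps 2 and 3
    if !team.add_device(key_bundle) {
        return false; // step 4 failed
    }
    let team_id = team.team_id.clone(); // step 5
    device.add_team(&team_id) && device.add_sync_peer(peer_address, &team_id)
}
```

The ordering matters in this model: the device cannot register a sync peer for a team it has not added locally, which mirrors AddTeam coming before AddSyncPeer in the list.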

IDAM Control Plane API

The IDAM control plane is for managing identity and authorization by interacting with the graph. Each endpoint creates one or more commands on the graph. The first command in the graph, aka the Init command, contains the system’s policy that defines the IDAM control plane for bootstrapping.

  • CreateTeam(owner_keybundle, config) -> team_id - initialize the graph, creating the team with the author as the owner. Includes policy for bootstrapping.
  • CloseTeam(team_id) -> bool - close the team and stop all operations on the graph.
  • AddDeviceToTeam(team_id, keybundle) -> bool - add a device to the team with the default role.
  • RemoveDeviceFromTeam(team_id, device_id) -> bool - remove a device from the team.
  • AssignRole(team_id, device_id, role) -> bool - assign a role to a device.
  • RevokeRole(team_id, device_id, role) -> bool - remove a role from a device.
  • DefineRole(role_name) -> bool - define a new role type that can be assigned to devices.
  • ReplaceCommandRole(command_name, authoring_role) -> bool - allow a particular role type to author the specified command type.
  • ReplaceEntityClass(team_id, device_id, entity_class) -> bool - replace the entity class associated to a device.

Aranya Channels API

The beta spec describes Aranya Fast Channels (AFC) which is replaced by Aranya Quic Channels (AQC) for the MVP. AFC is instead viewed as an experimental feature of the product which might still be preferred when running on embedded devices or when using a unidirectional transport. Refer to the beta spec and other existing documentation (AFC and AFC-Crypto) for more details on AFC.

AQC uses a modified Quic transport implementation that supports custom cryptography and has latency-based congestion control (see s2n-quic: https://github.com/aws/s2n-quic). More information on AQC, including the list of AQC-specific APIs, can be found in the AQC spec:

  • Draft version: https://github.com/aranya-project/aranya-docs/blob/2-quic-channels/docs/quic-channels.md
  • Eventually where the final spec will live: https://github.com/aranya-project/aranya-docs/blob/main/docs/quic-channels.md

Just like AFC, the Quic Channels plane is split into two sub-planes: the Quic channels control plane and the Quic channels data plane. The AQC control plane is responsible for any Aranya commands, including ephemeral commands, while the AQC data plane contains only data-related APIs.

Embedded devices that implement a subset of the Aranya library should still be able to sync with clients that have the full product integrated. AFC should also be compatible between subset implementations and the full implementation. This compatibility is planned for Post-MVP.

The following APIs are used both for AQC and AFC:

  • CreateLabel(team_id, label) - create a label
  • DeleteLabel(team_id, label) - delete a label
  • AssignLabel(team_id, device_id, label) - assign a label to a device
  • RevokeLabel(team_id, device_id, label) - revoke a label from a device

AFC API

The AFC APIs are being moved to a lower level in the API. They will still be available via a build flag to allow embedded devices and advanced users to access them.

  • SetAfcNetIdentifier(team_id, device_id, net_identifier) - associate a network address to a device for use with AFC. If the address already exists for this device, it is replaced with the new address. Capable of resolving addresses via DNS. For use with CreateChannel and receiving messages. Can take either DNS name, IPv4, or IPv6. Current implementation uses a bidi map, so we can reverse lookup.
  • UnsetAfcNetIdentifier(team_id, device_id, net_identifier) - disassociate a network address from a device.
  • CreateAfcBidiChannel(team_id, peer_net_ident, label) -> channel_id - create a bidirectional channel with the given peer.
  • DeleteAfcChannel(team_id, channel_id) - delete a channel.
  • PollAfcData(timeout) - blocks until new AFC data is available or the timeout elapses.
  • SendAfcData(channel_id, data) - send data on the given channel.
  • ReceiveAfcData() -> (data, metadata) - receive data from AFC.
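The bidirectional map mentioned for SetAfcNetIdentifier can be sketched as two hash maps kept in step. `NetIdMap` and its methods are hypothetical names, and the handling of an address already held by another device (the older mapping is dropped) is an assumption, since the spec only covers replacing a device's own address.

```rust
use std::collections::HashMap;

/// Device ID <-> network identifier, searchable in both directions.
#[derive(Default)]
pub struct NetIdMap {
    by_device: HashMap<String, String>,
    by_net_id: HashMap<String, String>,
}

impl NetIdMap {
    /// Associate `net_id` with `device_id`, replacing any address the
    /// device already had.
    pub fn set(&mut self, device_id: &str, net_id: &str) {
        // Drop the reverse entry for the device's previous address.
        if let Some(old) = self.by_device.insert(device_id.to_owned(), net_id.to_owned()) {
            self.by_net_id.remove(&old);
        }
        // Assumption: if another device held this address, it loses it.
        if let Some(prev) = self.by_net_id.insert(net_id.to_owned(), device_id.to_owned()) {
            if prev != device_id {
                self.by_device.remove(&prev);
            }
        }
    }

    /// Remove the association only if it matches exactly.
    pub fn unset(&mut self, device_id: &str, net_id: &str) -> bool {
        if self.by_device.get(device_id).map(String::as_str) != Some(net_id) {
            return false;
        }
        self.by_device.remove(device_id);
        self.by_net_id.remove(net_id);
        true
    }

    /// Forward lookup, as needed when creating a channel to a device.
    pub fn net_id_for(&self, device_id: &str) -> Option<&str> {
        self.by_device.get(device_id).map(String::as_str)
    }

    /// Reverse lookup, as needed to attribute a received message.
    pub fn device_for(&self, net_id: &str) -> Option<&str> {
        self.by_net_id.get(net_id).map(String::as_str)
    }
}
```

Keeping both directions in one structure means a replaced address stops resolving to the device as soon as `set` returns.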

Graph Querying APIs

FactDB queries over the current perspective of the graph should be possible through ephemeral commands in the policy that will return query results in their emitted effects. These APIs are likely to be moved to nice-to-have or Post-MVP, but are currently planned for MVP.

  • QueryRoleAssignment(device_id) -> Role
  • QueryDeviceKeybundle(device_id) -> Keybundle
  • QueryAfcNetworkId(device_id) -> network_str
  • QueryAfcLabelAssignments(device_id) -> Vec<label>
  • QueryAfcLabelExists(device_id) -> Vec<label>

Roles & Permissions

There will be 4 default roles with the following set of permissions for each. The MVP will also include an expansion of the role system, allowing the device to create custom roles and reassign permissions for specific commands to custom roles. There will be a CLI for managing permissions.

owner

  • all permissions excluding sending data on channel label
  • create/close team
  • add/remove devices
  • elevate/revoke permissions for devices up to and including owner
  • define/undefine channel labels
  • assign/remove addresses/names for Aranya channels
  • assign/revoke channel labels

admin

  • elevate/revoke permissions for devices up to a max level of operator
  • define/undefine channel labels
  • assign/remove addresses/names for Aranya channels
  • cannot send data on channel labels

operator

  • add (new) / rm device in team
  • assign member role to devices in team
  • assign/revoke channel labels
  • assign/remove addresses/names for channel
  • cannot send data on channel labels

member

  • use Aranya channel

Notes:

  • Devices can always remove and demote themselves
  • Devices can only remove or demote another device of equal role when their entity class is higher.
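A minimal sketch of the role ceilings and the two notes above. The type and function names are hypothetical, "higher" entity class is assumed to mean numerically larger, and cases the notes do not cover are returned as `None` rather than guessed.

```rust
/// The four default roles, declared from least to most privileged so
/// the derived ordering matches the hierarchy.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Member,
    Operator,
    Admin,
    Owner,
}

/// Highest role each default role may assign, per the lists above.
pub fn max_assignable(actor: Role) -> Option<Role> {
    match actor {
        Role::Owner => Some(Role::Owner),
        Role::Admin => Some(Role::Operator),
        Role::Operator => Some(Role::Member),
        Role::Member => None,
    }
}

/// True if `actor` may assign `new_role` to a device.
pub fn can_assign(actor: Role, new_role: Role) -> bool {
    max_assignable(actor).map_or(false, |max| new_role <= max)
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeviceInfo {
    pub id: u32,
    pub role: Role,
    /// Assumption: a larger number means a higher entity class.
    pub entity_class: u32,
}

/// Applies only the two notes: self-removal is always allowed, and
/// equal roles are decided by entity class. Any other pairing depends
/// on the per-role permissions and is reported as None here.
pub fn notes_allow_removal(actor: &DeviceInfo, target: &DeviceInfo) -> Option<bool> {
    if actor.id == target.id {
        return Some(true);
    }
    if actor.role == target.role {
        return Some(actor.entity_class > target.entity_class);
    }
    None
}
```

With these definitions an admin can promote a device to operator but not to admin, and two admins can only act on each other in the direction of the higher entity class.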

Documentation

API documentation must be provided for the client API covering the functions and behavior of each API call. Most likely, this will take the form of a doxygen-like web page. Developers can use this to look up language agnostic functions for operating the client API. The documentation should also include tutorials and a quickstart to get developers up and running with the product as soon as possible. Documentation should also be provided for the daemon so that developers and sysadmins can understand the requirements and operations of the daemon.

Appendix

Appendix A: TODOs needed to complete MVP updates

This is a list of things that need to be tracked and accomplished for the MVP. Scope is subject to change. This list is unordered within the sections.

Urgent

High

  • Measure system requirements and record values in this spec. https://github.com/aranya-project/aranya/issues/62
  • Maybe also include some measures on compile time and build requirements
  • Quic channels spec and implementation
  • ARM32 support for CI and releases
  • Expose bytes of IDs to end user

Normal

  • Update existing code and docs to use “device” instead of “user”.
  • Ensure “graph” is used in core, while “team” is used in product.
  • Add “ephemeral” marker in command metadata.
  • Take measures to deem AFC as experimental (e.g., updating docs)?
  • Update AddSyncPeer to sync immediately
  • Add config object with just version field to CreateTeam and AddTeam APIs.
  • Implement SyncNow
  • Add config object to AddSyncPeer and include the rate parameter in it.
  • Might want to add max # of bytes to sync, timeout for number of secs it stays open, etc.
  • Update existing AFC APIs to include “afc” in their name to differentiate from Quic channels
  • Custom policy roles spec and implementation
  • Allow a single keybundle to participate in multiple teams via some sort of root-identity-key (or other mechanism)
  • Fact DB queries via session commands
  • Standardize C API names to follow naming scheme.
  • Update key agreement commands to have a field for the sender’s graph head ID. https://github.com/aranya-project/aranya-docs/pull/24#discussion_r1937642908
  • Move AFC to behind build flag

Low

  • Set up CI to measure resource usage
  • Implement FactDB query APIs
  • CLI for managing permissions
  • Update default policy to have one role type authoring each command

Nice to have

This list encompasses anything moved to Post-MVP but would be nice to get into the MVP if we have resources.

  • Implement Finalization
  • Implement AwaitCommand and IsPresent (see Post-MVP spec).

Uncategorized

  • Update spec diagrams.
  • Outline and apply daemon working directory changes.
  • Establish best terminology for reference (e.g. “device store”?).
  • Improve polling mechanism for Quic channels

Appendix B: Naming Schemes

The C API will use the following naming scheme:

(libraryPrefix)_((receiver)_)?(functionName)

Where

  • libraryPrefix is a global prefix for the whole library (ex: Aranya)
  • receiver is a thing we are calling a function on (see below for example)
  • functionName is the name of the function to call.

For example, an API endpoint with the name FooClientBar breaks down as follows:

  • Foo is the library.
  • Client is the receiver.
  • Bar is the function name.

FooClientBar calls the function Bar on an instance of Client (passed as an argument) provided by the library Foo.
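The pattern can be sketched as a small formatter. The helper name `c_api_name` and the lowercase inputs in the usage note are illustrative assumptions; the spec fixes the order of the parts, and this sketch takes the underscores in the pattern literally.

```rust
/// Compose a C API symbol from the parts of the naming scheme:
/// (libraryPrefix)_((receiver)_)?(functionName)
pub fn c_api_name(library_prefix: &str, receiver: Option<&str>, function_name: &str) -> String {
    match receiver {
        // Function called on a receiver, e.g. a client instance.
        Some(receiver) => format!("{library_prefix}_{receiver}_{function_name}"),
        // Free function with no receiver.
        None => format!("{library_prefix}_{function_name}"),
    }
}
```

For instance, `c_api_name("aranya", Some("client"), "add_team")` joins the three parts with underscores, while passing `None` for the receiver omits the middle segment.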

Verb pairs

  • Add/Remove
  • Assign/Revoke
  • Create/Delete
  • Set/Unset

Additional Notes

  • All Rust APIs use Result unless stated otherwise
  • C APIs return the AranyaError type, and use parameters for return values unless some other pattern is required.
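A minimal sketch of how one Rust call that returns `Result` could be exposed in the C style described above, with the error as the return value and the result written through an out-parameter. The error variants, the 32-byte ID size, the `Client` layout, and the function name `aranya_get_device_id` are all assumptions made for illustration.

```rust
/// C-visible error code (variants are hypothetical).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AranyaError {
    Success = 0,
    InvalidArgument = 1,
}

/// C-visible device ID (the 32-byte size is an assumption).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AranyaDeviceId {
    pub bytes: [u8; 32],
}

/// Stand-in for the client handle.
#[repr(C)]
pub struct Client {
    pub device_id: [u8; 32],
}

impl Client {
    /// Rust-side API: returns a Result, as the notes above describe.
    pub fn get_device_id(&self) -> Result<[u8; 32], AranyaError> {
        Ok(self.device_id)
    }
}

/// C-side wrapper: the status is the return value and the device ID
/// is written to `out` only on success.
///
/// # Safety
/// `client` and `out` must each be null or point to valid memory.
pub unsafe extern "C" fn aranya_get_device_id(
    client: *const Client,
    out: *mut AranyaDeviceId,
) -> AranyaError {
    if client.is_null() || out.is_null() {
        return AranyaError::InvalidArgument;
    }
    // Safe to dereference: both pointers were checked for null above
    // and the caller guarantees they are otherwise valid.
    let client = unsafe { &*client };
    match client.get_device_id() {
        Ok(bytes) => {
            unsafe { out.write(AranyaDeviceId { bytes }) };
            AranyaError::Success
        }
        Err(err) => err,
    }
}
```

A real binding would also export the symbol for the linker and generate a matching C header; both are left out here to keep the sketch focused on the error and out-parameter convention.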

Appendix C: Glossary

  • AFC - the library used to do high performance encryption using keys managed by Aranya.
  • Aranya Quic Channels (maybe AQC) - An integration of Aranya with QUIC to provide a secure and integrated transport. See QUIC channels spec. TODO(declan): link
  • Aranya - the main library that drives the control plane and policy execution.
  • daemon - a long-lived process, typically running in the background, that handles commands and keeps state.
  • device - a computer, sometimes associated with a user but can also be independent. In this model, we consider devices instead of users directly to accommodate autonomous entities.
  • IPC - inter-process communication.
  • policy - an Aranya policy, containing the logic and rules of the system.
  • sync - a request to synchronize the commands on the control plane. Syncs are currently pull only, so the device that requests a sync receives commands from the requestee.
  • team - a group of devices with an associated policy.
  • user - a person who may operate a device.