Erebos protocol specification
Note
Specification is currently being written, so content is incomplete.
Introduction
Erebos is intended to be a fully decentralized communication and synchronization protocol. Here, “fully decentralized” means that, as opposed to federated services, erebos peers do not need to rely on any servers for their functionality. Even though some servers can help with certain tasks like peer discovery, neither a user nor their identity is bound to any such server. At the core of its design is a content-addressable filesystem using objects quite similar to those used in git, although git's usage, and thus its design, differ in some important aspects.
Erebos objects can be stored locally or sent among nodes, and are uniquely identified by the Blake2 hash of their canonical representation. Some object types can reference other objects using this hash, including, for example, references to a previous state when a modification is recorded (similar to how commits reference their parent in git). Whenever some objects represent a state that is shared and synchronized among multiple nodes, that state may be modified independently on different nodes. As synchronization can happen at arbitrary times and typically in the background, these modifications must be mergeable automatically, so the merged state needs to be properly defined without requiring any user interaction (as opposed to git, where merges sometimes require manual resolution of conflicts).
Object representation
This section describes the canonical representation of an erebos object. Each object is uniquely identified by the hash of this representation, but it does not necessarily need to be stored or transmitted in this form when a more efficient one is appropriate.
Even though most objects described here or given as examples have a textual representation, keep in mind that it is still a binary format and can contain arbitrary data.
The basic structure common to all types of objects is:
<type> 0x20 <data length> 0x0A <data>
There are currently two types of objects: blob (arbitrary binary data) and rec (record).
<data length> is an ASCII-encoded decimal representation of the length of <data>.
<data> is the actual object data, whose format depends on the object type.
Blob
A blob just contains arbitrary data without any structure relevant for the erebos protocol.
It is mainly intended to be used for small files like message attachments and similar.
For example, a blob object with hash 9331f492583a8f47f9bf21e50ad298e9b395aa4dfb989257e26c15109526ca3c:
blob 13
Hello world!
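The canonical form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `serialize_object` and `object_id` are hypothetical, the blob data is assumed to include the trailing newline (which is what makes the length 13), and BLAKE2b with a 32-byte digest is assumed because the hash above has 64 hexadecimal characters; the text itself only says "Blake2".

```python
import hashlib


def serialize_object(obj_type: bytes, data: bytes) -> bytes:
    """Build the canonical form: <type> 0x20 <data length> 0x0A <data>."""
    length = str(len(data)).encode("ascii")
    return obj_type + b" " + length + b"\n" + data


def object_id(canonical: bytes) -> str:
    """Hash the canonical form.

    Assumption: BLAKE2b with a 32-byte digest; the variant and digest
    size are not stated in the text.
    """
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()


# "Hello world!" plus a trailing newline is 13 bytes long.
blob = serialize_object(b"blob", b"Hello world!\n")
print(blob)             # b'blob 13\nHello world!\n'
print(object_id(blob))  # 64 hexadecimal characters
```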
Record
Record is a type used for most erebos-relevant structures. The <type> value is "rec" and <data> consists of items in the form:
<name> ':' <type> ' ' <value> '\n'
<name> is the name of the item. Custom record types should use lower-case letters and the '-' (dash) character only.
The core specification will define some names with uppercase letters.
However, implementations should accept and work correctly with an arbitrary binary value as the item <name>, delimited by the first ':' character.
If <value> contains a '\n' (newline) character, it is encoded as a pair of bytes '\n' '\t', that is, a tab character is appended to distinguish it from the newline ending the record item.
This means that <name> cannot start with a '\t' character.
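The newline escaping can be illustrated with a small encoder and parser. This is a minimal sketch, not a reference implementation: `encode_item` and `parse_items` are hypothetical names, and rejecting a ':' inside a name is an inference from the rule that the name is delimited by the first ':'.

```python
def encode_item(name: bytes, item_type: bytes, value: bytes) -> bytes:
    """Encode one item as <name> ':' <type> ' ' <value> '\\n'."""
    if b":" in name or name.startswith(b"\t"):
        raise ValueError("name must not contain ':' or start with a tab")
    # Every newline inside the value gets a tab appended after it.
    escaped = value.replace(b"\n", b"\n\t")
    return name + b":" + item_type + b" " + escaped + b"\n"


def parse_items(data: bytes) -> list:
    """Split record data into (name, type, value) tuples."""
    items = []
    pos = 0
    while pos < len(data):
        # An item ends at the first newline that is not followed by a tab.
        end = data.index(b"\n", pos)
        while data[end + 1:end + 2] == b"\t":
            end = data.index(b"\n", end + 1)
        line = data[pos:end].replace(b"\n\t", b"\n")
        name, _, rest = line.partition(b":")
        item_type, _, value = rest.partition(b" ")
        items.append((name, item_type, value))
        pos = end + 1
    return items


data = encode_item(b"text", b"t", b"first line\nsecond line") \
    + encode_item(b"count", b"i", b"2")
print(parse_items(data))
```

Encoding and then parsing returns the original values, including the embedded newline.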
The format of <value> depends on <type>:
<type> | description | <value> format
---|---|---
e | empty | none, the size should be zero
i | integer | ASCII-encoded decimal representation, optionally prefixed by '-' if negative
t | text | UTF-8 encoded string
b | binary data | hexadecimal representation of binary data
d | date | decimal representation of UNIX time, followed by ' ' (space) character, followed by time zone offset in the form [+-]HHMM
u | UUID | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx where each x is a hexadecimal character
r | reference | "blake2#" followed by hexadecimal representation of the hash of referenced object
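As an illustration of the value formats, the following sketch formats a few Python values. The function names are hypothetical, and two details are assumptions not stated in the table: UNIX time is written as whole seconds, and hexadecimal digits are lowercase.

```python
import uuid
from datetime import datetime, timedelta, timezone


def format_int(n: int) -> bytes:
    """Type i: ASCII decimal, with a leading '-' if negative."""
    return str(n).encode("ascii")


def format_binary(data: bytes) -> bytes:
    """Type b: hexadecimal representation (lowercase is an assumption)."""
    return data.hex().encode("ascii")


def format_date(dt: datetime) -> bytes:
    """Type d: UNIX time, a space, then the zone offset as [+-]HHMM."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("a timezone-aware datetime is required")
    minutes = int(offset.total_seconds()) // 60
    sign = "+" if minutes >= 0 else "-"
    hours, mins = divmod(abs(minutes), 60)
    # Assumption: whole seconds only.
    return f"{int(dt.timestamp())} {sign}{hours:02d}{mins:02d}".encode("ascii")


def format_uuid(u: uuid.UUID) -> bytes:
    """Type u: the usual 8-4-4-4-12 hexadecimal form."""
    return str(u).encode("ascii")


def format_ref(digest_hex: str) -> bytes:
    """Type r: "blake2#" followed by the hexadecimal hash."""
    return b"blake2#" + digest_hex.encode("ascii")


cet = timezone(timedelta(hours=1))
print(format_date(datetime(1970, 1, 1, 1, 0, 0, tzinfo=cet)))  # b'0 +0100'
```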
Data updates and history
Keys and signatures
Security of the protocol relies on public-key cryptography. For good security properties and small sizes of keys and signatures, the elliptic-curve Ed25519 scheme is used.
A public key is represented as a record with the following fields:
type:t | "ed25519"
pubkey:b | public key data
Ed25519 public keys are 256 bits (32 bytes) long, so the pubkey field value consists of 64 hexadecimal characters.
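A public key record could be assembled roughly as follows. This is a sketch under assumptions: `public_key_record` is a hypothetical helper, the fields are written in the order listed above, the text value is written without the quotation marks, and hexadecimal digits are lowercase.

```python
def public_key_record(pubkey: bytes) -> bytes:
    """Build the canonical "rec" object for an Ed25519 public key."""
    if len(pubkey) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes long")
    # Assumption: field order as listed, text value without quotes.
    data = b"type:t ed25519\n"
    data += b"pubkey:b " + pubkey.hex().encode("ascii") + b"\n"
    # Wrap the items in the common object header.
    return b"rec " + str(len(data)).encode("ascii") + b"\n" + data


record = public_key_record(bytes(32))  # all-zero key, for illustration only
print(record.decode("ascii"))
```

With a 32-byte key the record data is always 89 bytes long: 15 for the type item and 74 for the pubkey item.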
The corresponding private key is not stored as an erebos object, but in separate storage, which can be, for example, a secure key store.
A signature is represented as a record with:
key:r | reference to public key corresponding to the key used for the signature
sig:b | signature data
A signature in Ed25519 is 64 bytes long, so the sig field contains 128 hexadecimal characters.
The signature here is always a signature of the unique identifier (the Blake2 hash) of some other erebos object.
Finally, to put together the signature and the signed data, the following Signature structure is used:
SDATA:r | reference to signed data
sig:r | reference to signature object signing the SDATA digest; can be many
There can be multiple sig fields referencing signatures of the same data with different keys.
The signed object is valid only when all given signatures can be correctly validated with the associated public keys.
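The "all signatures must validate" rule can be sketched as below. Ed25519 verification itself is left to a caller-supplied function, since it is not part of the Python standard library. All names here (`Sig`, `signed_is_valid`, `verify`) are hypothetical, the exact byte string that gets signed is whatever the caller passes as `sdata_digest`, and treating an object with no signatures as invalid is an assumption.

```python
from typing import Callable, Iterable, NamedTuple


class Sig(NamedTuple):
    pubkey: bytes  # raw key bytes from the record referenced by "key"
    sig: bytes     # raw signature bytes from the "sig" field


# verify(pubkey, signature, message) -> True if the signature is correct
Verify = Callable[[bytes, bytes, bytes], bool]


def signed_is_valid(sdata_digest: bytes, sigs: Iterable[Sig],
                    verify: Verify) -> bool:
    """Return True only if every given signature validates."""
    sig_list = list(sigs)
    if not sig_list:
        # Assumption: an object without any signature is not valid.
        return False
    return all(
        len(s.pubkey) == 32
        and len(s.sig) == 64
        and verify(s.pubkey, s.sig, sdata_digest)
        for s in sig_list
    )
```

A single failing signature makes the whole signed object invalid, even if the other signatures are correct.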
Identity
As the focus of the protocol is decentralization, there is no central authority that would certify some kind of identity of users. Instead, a design similar to PGP with its web of trust is used, just with more focus on ease of use and with updates of identity information and keys being part of normal communication, not requiring explicit actions or key servers.
A single identity comprises a set of individual signed IdentityData objects with the following structure:
SPREV:r | * | reference to signed IdentityData representing previous version of the identity
name:t | ? | identity name, to be displayed in messages, contact lists, etc
owner:r | ? | reference to owner identity (see below)
key-id:r | | reference to the public identity key
key-msg:r | ? | reference to the public message key
Signed IdentityData means a Signature object whose SDATA member points to an IdentityData object.
Note that the only required field is the identity key (key-id).
Validation
A signed IdentityData object is valid iff:
- The Signature is valid and signed by all of the following keys:
  - identity key (key-id) of the IdentityData it points to, and
  - identity keys of all IdentityData referenced by SPREV, and
  - identity key of IdentityData referenced by owner.
- All the signed IdentityData objects referenced by SPREV and owner fields are valid.
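The validation rules above can be sketched as a recursive check over an in-memory model. The `SignedIdentity` class and its fields are hypothetical. The sketch assumes that the signatures have already been verified cryptographically, so `signed_by` holds only the keys whose signatures validated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SignedIdentity:
    key_id: str                 # identity key (key-id) of the IdentityData
    signed_by: Set[str]         # keys whose signatures validated
    prev: List["SignedIdentity"] = field(default_factory=list)  # SPREV
    owner: Optional["SignedIdentity"] = None                    # owner


def identity_is_valid(ident: SignedIdentity) -> bool:
    """Check the required signing keys, then recurse into references."""
    required = {ident.key_id}
    required.update(p.key_id for p in ident.prev)
    if ident.owner is not None:
        required.add(ident.owner.key_id)
    if not required <= ident.signed_by:
        return False
    referenced = list(ident.prev)
    if ident.owner is not None:
        referenced.append(ident.owner)
    return all(identity_is_valid(r) for r in referenced)


root = SignedIdentity("key-a", {"key-a"})
# An update that changes the identity key must also be signed by the old key.
update = SignedIdentity("key-b", {"key-a", "key-b"}, prev=[root])
print(identity_is_valid(update))  # True
```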
Creation and updates
Merging
Owner hierarchy
Network protocol
(version 0.1)
The erebos network protocol enables secure and reliable communication between two nodes. Each node is expected to possess an erebos identity, which is used for secure key exchange. Once secure communication is established, the protocol allows for sending individual packets, e.g. with short text messages or status information, as well as using multiple independent streams for bigger or continuous data.
Establishing connection
A connection between nodes starts with a 4-way handshake, during which the nodes exchange their identity information and derive a session key for secure communication. This phase uses plaintext packets, which start with a header consisting of an erebos record object, potentially followed by additional objects referenced from the header.
The plaintext header can contain the following fields:
ACK:r | acknowledgement of received packet
REJ:r | rejected packet, e.g. data or connection request
VER:t | network protocol version
ANN:n | announce own identity
INI:r | connection initiation
CKS:b | cookie set
CKE:b | cookie echo
REQ:r | request for data
RSP:r | response for data request
CRQ:r | secure channel request
CAC:r | secure channel accepted
Secure communication
Local discovery
Services
Storage
Local and shared state
Attaching devices
Attach service
UUID: 4995a5f9-2d4d-48e9-ad3b-0bf1c2a1be7f
Synchronization
Sync service
UUID: a4f538d0-4e50-4082-8e10-7e3ec2af175d
Contacts
Contact service
UUID: d9c37368-0da1-4280-93e9-d9bd9a198084
Contact state
UUID: 34fbb61e-6022-405f-b1b3-a5a1abecd25e
Direct messages
Direct message service
UUID: c702076c-4928-4415-8b6b-3e839eafcb0d
Direct message state
UUID: ee793681-5976-466a-b0f0-4e1907d3fade