1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
|
# Portable Storage Format
## Background
Monero makes use of a set of helper classes from a small library named
[epee](https://github.com/monero-project/monero/tree/master/contrib/epee). Part
of this library implements a networking protocol called
[Levin](https://github.com/monero-project/monero/blob/master/contrib/epee/include/net/levin_base.h),
which internally uses a storage format called [Portable
Storage](https://github.com/monero-project/monero/tree/master/contrib/epee/include/storages).
This format (amongst the rest of the
[epee](https://github.com/monero-project/monero/tree/master/contrib/epee)
library), is undocumented - or rather relies on the code itself to serve as the
documentation. Unfortunately, whilst the rest of the library is fairly
straightforward to decipher, the Portable Storage is less-so. Hence this
document.
## Preliminaries
### String and integer encoding
#### varint
Varints are used to pack integers in an portable and space optimized way. The
lowest 2 bits store the amount of bytes required, which means the largest value
integer that can be packed into 1 byte is 63 (6 bits).
| Lowest 2 bits | Size value | Value range |
|---------------|---------------|-----------------------------------|
| b00 | 1 byte | 0 to 63 |
| b01 | 2 bytes | 64 to 16383 |
| b10 | 4 bytes | 16384 to 1073741823 |
| b11 | 8 bytes | 1073741824 to 4611686018427387903 |
#### string
These are simply length (varint) prefixed char strings.
## Packet format
### Header
A packet starts with a header:
| Header | Type | Value |
|---------------|-----------|-----------------------|
| Signature | 8 bytes | 0x0111010101010201| |
| Version | byte | 0x01 |
### Section
Next we have a root object (or section as the library calls it). This is a map
of name-value pairs called [entries](#Entry). It starts with a count:
| Section | Type |
|---------------|-----------|
| Entry count | varint |
Which is followed by the section's name-value [entries](#Entry) sequentially:
### Entry
| Entry | Type |
|-------------------|-----------------------|
| Name | string<sup>1</sup> |
| Type | byte |
| Count<sup>2</sup> | varint |
| Value(s) | (type dependant data) |
<sup>1</sup> Note, the string used for the entry name is not prefixed with a
varint, it is prefixed with a single byte to specify the length of the name.
This means an entry name cannot be more that 255 chars, which seems a reasonable
restriction.
<sup>2</sup> Note, this is only present if the entry type has the array flag
(see below).
#### Entry types
The types defined are:
```cpp
#define SERIALIZE_TYPE_INT64 1
#define SERIALIZE_TYPE_INT32 2
#define SERIALIZE_TYPE_INT16 3
#define SERIALIZE_TYPE_INT8 4
#define SERIALIZE_TYPE_UINT64 5
#define SERIALIZE_TYPE_UINT32 6
#define SERIALIZE_TYPE_UINT16 7
#define SERIALIZE_TYPE_UINT8 8
#define SERIALIZE_TYPE_DUOBLE 9
#define SERIALIZE_TYPE_STRING 10
#define SERIALIZE_TYPE_BOOL 11
#define SERIALIZE_TYPE_OBJECT 12
#define SERIALIZE_TYPE_ARRAY 13
```
The entry type can be bitwise OR'ed with a flag:
```cpp
#define SERIALIZE_FLAG_ARRAY 0x80
```
This signals there are multiple *values* for the entry. When we are dealing with
an array, the next value is a varint specifying the array length followed by
the array item values. For example:
<p style="padding-left:1em; font:italic larger serif">name, type, count,
value<sub>1</sub>, value<sub>2</sub>,..., value<sub>n</sub></p>
#### Entry values
It's important to understand that entry *values* can be encoded any way in which
an implementation chooses. For example, the integers can be in either big or
little endian byte order.
Entry values which are objects (i.e. `SERIALIZE_TYPE_OBJECT`), are stored as
[sections](#Section).
Note, I have not yet seen the type `SERIALIZE_TYPE_ARRAY` in use. My assumption
is this would be used for *untyped* arrays and so subsequent entries could be of
any type.
## Monero specifics
### Entry values
#### Strings
These are prefixed with a varint to specify the string length.
#### Integers
These are stored little endian byte order.
#### Hashes, Keys, Blobs
These are stored as strings, `SERIALIZE_TYPE_STRING`.
#### STL containers (vector, list)
These can be arrays of standard integer types, strings or
`SERIALIZE_TYPE_OBJECT`'s for structs.
#### Links to some Monero struct definitions
- [Core RPC
definitions](https://github.com/monero-project/monero/blob/master/src/rpc/core_rpc_server_commands_defs.h)
- [CryptoNote protocol
definitions](https://github.com/monero-project/monero/blob/master/src/cryptonote_protocol/cryptonote_protocol_defs.h)
[//]: # ( vim: set tw=80: )
|