Module iroh_blobs::store::fs


redb-backed storage

Data can get into the store in two ways:

  1. import from local data
  2. sync from a remote

These two cases are very different. In the first case, we already have the complete data but do not yet know the hash. We compute the outboard and hash, and only then move or reference the data into the store.

The entry for the hash comes into existence already complete.

In the second case, we know the hash, but don’t have the data yet. We create a partial entry, and then request the data from the remote. This is the more complex case.

Partial entries always start as pure in-memory entries without a database entry. Only once we have received enough data do we convert them into a persistent partial entry. This is necessary because we cannot trust the size reported by the remote side before receiving data. It is also an optimization: for small blobs it is not worth creating a persistent partial entry.

A persistent partial entry is always stored as three files in the file system: the data file, the outboard file, and a sizes file that contains the most up-to-date information about the size of the data.

The redb database entry for a persistent partial entry does not contain any information about the size of the data until the size is exactly known. Updating this information on each write would be too costly.

Marking a partial entry as complete is done from the outside. At this point the size is taken as validated. Depending on the size, we decide whether to store the data and outboard inline or to keep storing them in external files.
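The life cycle described above can be sketched as a small state machine. This is a hypothetical model for illustration only: the type names, the threshold constant, and the transition functions are made up here and do not match the store's actual implementation.

```rust
/// Hypothetical sketch of the entry life cycle described above.
#[derive(Debug, PartialEq)]
enum EntryState {
    /// Pure in-memory partial entry, no database record yet.
    MemoryPartial { bytes_received: u64 },
    /// Partial entry persisted as data, outboard and sizes files.
    PersistentPartial { bytes_received: u64 },
    /// Completed entry with a validated size.
    Complete { size: u64 },
}

/// Threshold above which a partial entry is persisted (made-up value).
const PERSIST_THRESHOLD: u64 = 16 * 1024;

impl EntryState {
    /// Receiving data may promote an in-memory entry to a persistent one.
    fn on_data(self, chunk_len: u64) -> EntryState {
        match self {
            EntryState::MemoryPartial { bytes_received } => {
                let bytes_received = bytes_received + chunk_len;
                if bytes_received >= PERSIST_THRESHOLD {
                    EntryState::PersistentPartial { bytes_received }
                } else {
                    EntryState::MemoryPartial { bytes_received }
                }
            }
            EntryState::PersistentPartial { bytes_received } => EntryState::PersistentPartial {
                bytes_received: bytes_received + chunk_len,
            },
            complete => complete,
        }
    }

    /// Marking complete is driven from the outside; the size passed in
    /// is taken as validated at this point.
    fn mark_complete(self, validated_size: u64) -> EntryState {
        EntryState::Complete { size: validated_size }
    }
}
```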

Data can get out of the store in two ways:

  1. the data and outboard of both partial and complete entries can be read at any time and shared over the network. Only complete data will be shared; anything else leads to validation errors.

  2. entries can be exported to the file system. This currently only works for complete entries.

Tables:

  • blobs: a mapping from hash to rough entry state.
  • inline_data: the actual data for complete entries.
  • inline_outboard: the actual outboard for complete entries.
  • tags: a mapping from tag to hash.
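As a rough mental model, the tables above relate keys and values as in the following sketch. This is a simplified in-memory stand-in using maps, not the actual redb schema; the `Hash` alias and the string-valued entry state are placeholders.

```rust
use std::collections::HashMap;

/// Placeholder for the 32-byte BLAKE3 hash used as the primary key.
type Hash = [u8; 32];

/// Simplified in-memory model of the four tables described above.
/// The real redb schema stores richer per-entry state.
#[derive(Default)]
struct Tables {
    /// hash -> rough entry state ("complete" / "partial" here for brevity)
    blobs: HashMap<Hash, &'static str>,
    /// inline data for complete entries
    inline_data: HashMap<Hash, Vec<u8>>,
    /// inline outboard for complete entries
    inline_outboard: HashMap<Hash, Vec<u8>>,
    /// tag -> hash
    tags: HashMap<String, Hash>,
}
```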

Design:

The redb store is accessed in a single-threaded way by an actor that runs on its own std thread. Communication with this actor happens via a flume channel, with oneshot channels for the return values where needed.
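The actor pattern described above can be sketched as follows. This is a minimal illustration using std channels rather than flume, with a per-request reply sender standing in for a oneshot channel; the message and state types are invented for the example and do not mirror the store's actual messages.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Messages handled by the actor. A request that needs a return value
/// carries its own reply sender, playing the role of a oneshot channel.
enum Msg {
    Get {
        key: String,
        reply: mpsc::Sender<Option<String>>,
    },
    Put {
        key: String,
        value: String,
    },
    Stop,
}

/// Spawn the actor on its own std thread and return the handle used
/// to communicate with it.
fn spawn_actor() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Single-threaded state, owned exclusively by the actor.
        let mut state: HashMap<String, String> = HashMap::new();
        while let Ok(msg) = rx.recv() {
            match msg {
                Msg::Put { key, value } => {
                    state.insert(key, value);
                }
                Msg::Get { key, reply } => {
                    // The requester may have given up; ignore send errors.
                    let _ = reply.send(state.get(&key).cloned());
                }
                Msg::Stop => break,
            }
        }
    });
    tx
}
```

Because the actor owns the state exclusively and processes one message at a time, no locking is needed around the database handle.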

Errors:

ActorError is an enum containing errors that can happen inside the message handlers of the actor. This includes various redb-related errors and I/O errors when reading or writing non-inlined data or outboard files.

OuterError is an enum containing all the actor errors and, in addition, errors that occur when communicating with the actor.
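The two-layer relationship between these error types can be sketched like this. The variant names below are illustrative only; the real enums carry many more variants for the various redb storage, transaction, and table errors.

```rust
/// Hypothetical shape of the inner error type: errors from inside the
/// actor's message handlers.
#[derive(Debug)]
enum ActorError {
    /// I/O error while reading or writing non-inlined data or outboard files.
    Io(std::io::Error),
    // ... redb-related variants elided in this sketch
}

/// Hypothetical shape of the outer error type: everything ActorError
/// covers, plus failures to communicate with the actor at all.
#[derive(Debug)]
enum OuterError {
    /// An error from inside a message handler of the actor.
    Actor(ActorError),
    /// The actor is gone and the channel is closed.
    Send,
}

// Inner errors convert losslessly into the outer layer.
impl From<ActorError> for OuterError {
    fn from(e: ActorError) -> Self {
        OuterError::Actor(e)
    }
}
```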

Structs§

  • Options for transaction batching.
  • Options for inlining small complete data or outboards.
  • Options for the file store.
  • Options for directories used by the file store.
  • Storage that is using a redb database for small files and files for large files.

Type Aliases§

  • Use BaoFileHandle as the entry type for the map.