Database Configuration
LAPIS and SILO need a database_config.yaml.
Its main purpose is to define the database schema for the sequence metadata.
See the tutorial for an example,
or use our config generator to generate your own config.
More examples can be found in our tests.
The database config is considered static configuration that doesn’t change with data updates. This page contains the technical specification of the database config.
Top-Level Structure
The database_config.yaml permits the following top-level keys:
| Key | Type | Required | Description |
|---|---|---|---|
| schema | object | true | The schema object. |
| defaultNucleotideSequence | string | false | Name of the default nucleotide sequence segment. Only meaningful when there is more than one segment. |
| defaultAminoAcidSequence | string | false | Name of the default amino acid gene. |
| siloClientThreadCount | int | false | How many threads (connections) LAPIS uses to talk to SILO. |
The Schema Object
The schema object permits the following fields:
| Key | Type | Required | Description |
|---|---|---|---|
| instanceName | string | true | The name assigned to the instance. Used for display purposes. |
| metadata | array | true | A list of metadata objects describing the metadata fields available on the underlying sequence data. |
| primaryKey | string | true | The name of the metadata field that serves as the primary key. The value must match one of the entries in metadata. |
| features | array | false | A list of feature objects that enable additional query capabilities. Defaults to no features. |
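
As a rough illustration of how the top-level keys and the schema object fit together, the following is a minimal sketch of a database_config.yaml. All field names and values (example-pathogen, accession, segment1, S, the thread count of 4) are hypothetical placeholders, not defaults or recommendations.

```yaml
# Hypothetical database_config.yaml; every name and value here is illustrative.
schema:
  instanceName: example-pathogen
  primaryKey: accession            # must match a name listed under metadata
  metadata:
    - name: accession
      type: string
    - name: collectionDate
      type: date
    - name: country
      type: string
  features:                        # optional; omitting it enables no features
    - name: generalizedAdvancedQuery

# Optional top-level keys
defaultNucleotideSequence: segment1   # assumed segment name
defaultAminoAcidSequence: S           # assumed gene name
siloClientThreadCount: 4
```

Note that primaryKey refers to a metadata field by name, so renaming that field means updating both places.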
The Metadata Object
Each entry in schema.metadata describes a single metadata field. The following keys are permitted:
| Key | Type | Required | Description |
|---|---|---|---|
| name | string | true | The name of the metadata field. Must be unique within metadata. |
| type | enum | true | The type of the metadata field. See Metadata Types. |
| generateIndex | boolean | false | If true, SILO builds an index for this field so that filter queries become a trivial lookup. See Generating an Index. Only valid for fields of type string. |
| generateLineageIndex | string | false | If set, SILO treats the field as a lineage-indexed field belonging to the named lineage system. See Lineage-Indexed Fields. Only valid for fields of type string. |
| isPhyloTreeField | boolean | false | If true, marks the field as a phylogenetic tree field. Sequences can then be queried by their position in a tree (e.g. via mostRecentCommonAncestor). Only valid for fields of type string. |
Metadata Types
LAPIS supports the following metadata types:
- string: Arbitrary text values.
- int: Integer values.
- float: Floating-point values.
- boolean: true or false.
- date: Values must be valid dates in the format YYYY-MM-DD.
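
To make the type list concrete, here is a sketch of a metadata section that declares one field per type. The field names (accession, age, qcScore, isComplete, collectionDate) are invented for illustration; only the type values come from the list above.

```yaml
schema:
  # instanceName, primaryKey and features omitted for brevity
  metadata:
    - name: accession        # hypothetical field names throughout
      type: string
    - name: age
      type: int
    - name: qcScore
      type: float
    - name: isComplete
      type: boolean
    - name: collectionDate
      type: date             # data values must be formatted as YYYY-MM-DD, e.g. 2021-03-15
```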
Generating an Index
For string fields, setting generateIndex: true makes SILO precompute bitmaps for the field’s distinct values,
turning queries against the field into very fast lookups.
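
A minimal sketch of an indexed field follows. The field names are hypothetical; country is assumed here to be a field that is filtered on often.

```yaml
schema:
  # other schema keys omitted for brevity
  metadata:
    - name: country          # hypothetical field that is frequently filtered on
      type: string
      generateIndex: true    # only valid because the type is string
    - name: age
      type: int              # generateIndex is not valid on non-string fields
```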
Lineage-Indexed Fields
Setting generateLineageIndex: <systemName> on a string field tells SILO that the values form a hierarchy
(e.g. Pango lineages). The value of generateLineageIndex is the name of the lineage system — a SILO-side
definition that lists how the lineages relate to each other (parent/child relationships, aliases).
Multiple metadata fields can share the same lineage system.
The lineage definitions themselves are provided to SILO at preprocessing time and are not part of the LAPIS database config. See SILO’s documentation for how to supply lineage definitions.
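
The sketch below shows two metadata fields that share one lineage system. The system name pangoLineage and both field names are assumptions made for this example; the actual name has to match whatever lineage system is supplied to SILO.

```yaml
schema:
  # other schema keys omitted for brevity
  metadata:
    - name: pangoLineage                  # hypothetical field name
      type: string
      generateLineageIndex: pangoLineage  # assumed lineage system name
    - name: nextcladePangoLineage         # hypothetical second field
      type: string
      generateLineageIndex: pangoLineage  # reuses the same lineage system
```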
Phylogenetic Tree Fields
Setting isPhyloTreeField: true on a string field declares that the field stores identifiers in a phylogenetic tree
(for example node labels of an UShER tree). The tree itself is supplied to SILO at preprocessing time.
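
As an illustration, a phylogenetic tree field could be declared as follows. The field name usherTreeNodeId is made up for this sketch.

```yaml
schema:
  # other schema keys omitted for brevity
  metadata:
    - name: usherTreeNodeId    # hypothetical field holding tree node labels
      type: string
      isPhyloTreeField: true   # only valid on string fields
```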
Features
Each entry in schema.features enables a feature in LAPIS:
| Key | Type | Required | Description |
|---|---|---|---|
| name | string | true | The name of the feature. |
The following feature names are recognized. Any other value will cause LAPIS to fail on startup.
| Feature name | Description |
|---|---|
| sarsCoV2VariantQuery | Enables the SARS-CoV-2-specific variant query language, exposed via the variantQuery request parameter. This feature exists for CoV-Spectrum and is not recommended for other uses. |
| generalizedAdvancedQuery | Enables the generic advanced query language, exposed via the advancedQuery request parameter. Recommended for non-SARS-CoV-2 instances. |
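
For example, a non-SARS-CoV-2 instance might enable the generic advanced query language with a fragment like the one below. This is a sketch that shows only the features key; the rest of the schema is left out.

```yaml
schema:
  # instanceName, metadata and primaryKey omitted for brevity
  features:
    - name: generalizedAdvancedQuery   # one of the two recognized feature names
```

Because unrecognized names make LAPIS fail on startup, a typo in a feature name surfaces immediately rather than being silently ignored.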