YAML Selectors
Write resource selectors in YAML, save them with a human-friendly name, and reference them using the --selector flag.
Recording selectors in a top-level selectors.yml file offers several benefits:
- Legibility: complex selection criteria are composed of dictionaries and arrays
- Version control: selector definitions are stored in the same git repository as the dbt project
- Reusability: selectors can be referenced in multiple job definitions, and their definitions are extensible (via YAML anchors)
Selectors live in a top-level file named selectors.yml. Each must have a name and a definition, and can optionally define a description and a default flag.
selectors:
  - name: nodes_to_joy
    definition: ...
  - name: nodes_to_a_grecian_urn
    description: Attic shape with a fair attitude
    default: true
    definition: ...
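As a sketch of the reusability point above, a YAML anchor can label one definition so that another selector reuses it. The selector names and the nightly tag are hypothetical, and this assumes the file is read by a YAML loader that resolves anchors and aliases in the standard way.

```yaml
selectors:
  - name: nightly
    description: Everything tagged nightly
    # &nightly_definition labels this mapping for reuse below
    definition: &nightly_definition
      method: tag
      value: nightly

  - name: nightly_non_incremental
    description: Nightly resources, minus incremental models
    definition:
      union:
        # *nightly_definition expands to the mapping anchored above
        - *nightly_definition
        - exclude:
            - method: config.materialized
              value: incremental
```

The anchor must appear before any alias that refers to it, so shared definitions belong near the top of the file.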
Definitions
Each definition consists of one or more arguments, each of which can be one of the following:
- CLI-style: strings, representing CLI-style arguments
- Key-value: pairs in the form method: value
- Full YAML: fully specified dictionaries with items for method, value, operator-equivalent keywords, and support for exclude
Use union and intersection to organize multiple arguments.
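To illustrate how the three argument styles can sit side by side, here is a minimal sketch of a single definition that combines them with union and intersection. The selector name, tag, path, and materialization are hypothetical.

```yaml
selectors:
  - name: nightly_exports   # hypothetical name
    definition:
      union:
        # CLI-style string: nightly-tagged resources and their children
        - 'tag:nightly+'
        # intersection of a key-value pair and a full YAML dictionary
        - intersection:
            - path: models/export
            - method: config.materialized
              value: table
```

Read top-down, this selects anything matched by the CLI-style string, plus any resource under models/export that is also materialized as a table.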
CLI-style
definition:
  'tag:nightly'
This simple syntax supports use of the +, @, and * operators. It does not support exclude.
Key-value
definition:
  tag: nightly
This simple syntax does not support any operators or exclude.
Full YAML
This is the most thorough syntax, which can include graph and set operators.
definition:
  method: tag
  value: nightly

  # Optional keywords map to the `+` and `@` operators:

  children: true | false
  parents: true | false

  children_depth: 1    # if children: true, degrees to include
  parents_depth: 1     # if parents: true, degrees to include

  childrens_parents: true | false    # @ operator

  greedy: true | false    # include all tests selected indirectly? false by default
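As a concrete instance of the keywords above, the following sketch expresses roughly what the CLI argument 2+my_model+ does: the model, its ancestors up to two degrees away, and all of its descendants. The model name my_model is hypothetical.

```yaml
definition:
  method: fqn
  value: my_model        # hypothetical model name
  parents: true
  parents_depth: 2       # only two degrees of ancestors
  children: true         # all descendants (no children_depth given)
```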
The * operator to select all nodes can be written as:
definition:
  method: fqn
  value: "*"
Exclude
The exclude keyword is only supported by fully-qualified dictionaries. It may be passed as an argument to each dictionary, or as an item in a union. The following are equivalent:
- method: tag
  value: nightly
  exclude:
    - "@tag:daily"

- union:
    - method: tag
      value: nightly
    - exclude:
        - method: tag
          value: daily
          childrens_parents: true  # matches the @ operator in the first form
Note: The exclude argument in YAML selectors is subtly different from the --exclude CLI argument. Here, exclude always returns a set difference, and it is always applied last within its scope. This gets us more intricate subset definitions than what's available on the CLI, where we can only pass one "yeslist" (--select) and one "nolist" (--exclude).
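To make the scoping concrete, here is a sketch in which each dictionary carries its own exclude. It follows the rule stated above, that exclude is applied last within its scope. The selector name, tags, and package name are hypothetical.

```yaml
selectors:
  - name: nightly_or_finance   # hypothetical name
    definition:
      union:
        # scope 1: nightly resources, minus anything tagged deprecated
        - method: tag
          value: nightly
          exclude:
            - method: tag
              value: deprecated
        # scope 2: the finance package, minus incremental models
        - method: package
          value: finance
          exclude:
            - method: config.materialized
              value: incremental
```

Because each exclude is confined to its own dictionary, a deprecated model in the finance package is still selected through the second scope as long as it is not incremental. A single --select and --exclude pair on the CLI cannot express this.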
Greedy
As a general rule, dbt will indirectly select tests if they touch resources that you're selecting directly, but not tests that also touch unselected resources (e.g. a relationships test with one parent selected and one parent not selected). Starting in v0.21, you can optionally include those tests as well by setting greedy: true for a specific criterion:
- union:
    - method: fqn
      value: model_a
      greedy: true   # will include all tests that touch model_a
    - method: fqn
      value: model_b
      greedy: false  # default: will not include tests touching model_b
                     # if they have other unselected parents
In CLI-based selection, dbt will warn you about tests that aren't greedily included. Here, you're in "full control" mode: dbt will not warn you about which tests your YAML selector definition does or does not include. Remember that you can always use the list command to check.
See test selection examples for more details about greediness and indirect selection.
Example
Here is one way to represent the following command as a YAML selector, using CLI-style arguments:
$ dbt run --select @source:snowplow,tag:nightly models/export --exclude package:snowplow,config.materialized:incremental export_performance_timing
selectors:
  - name: nightly_diet_snowplow
    description: "Non-incremental Snowplow models that power nightly exports"
    definition:
      union:
        - intersection:
            - '@source:snowplow'
            - 'tag:nightly'
        - 'models/export'
        - exclude:
            - intersection:
                - 'package:snowplow'
                - 'config.materialized:incremental'
            - export_performance_timing
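The same selector can also be sketched in Full YAML. This version assumes that each CLI prefix maps to a method of the same name, that @ corresponds to childrens_parents: true, that models/export is matched with the path method, and that export_performance_timing is matched with the fqn method.

```yaml
selectors:
  - name: nightly_diet_snowplow
    description: "Non-incremental Snowplow models that power nightly exports"
    definition:
      union:
        - intersection:
            - method: source
              value: snowplow
              childrens_parents: true   # the @ operator
            - method: tag
              value: nightly
        - method: path
          value: models/export
        - exclude:
            - intersection:
                - method: package
                  value: snowplow
                - method: config.materialized
                  value: incremental
            - method: fqn
              value: export_performance_timing
```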
Then in our job definition:
$ dbt run --selector nightly_diet_snowplow
Default
Starting in v0.21, selectors may define a boolean default property. If a selector has default: true, dbt will use this selector's criteria when tasks do not define their own selection criteria.
Let's say we define a default selector that only selects resources defined in our root project:
selectors:
  - name: root_project_only
    description: >
      Only resources from the root project.
      Excludes resources defined in installed packages.
    default: true
    definition:
      method: package
      value: <my_root_project_name>
If I run an "unqualified" command, dbt will use the selection criteria defined in root_project_only; that is, dbt will only build / freshness check / generate compiled SQL for resources defined in my root project.
$ dbt build
$ dbt source freshness
$ dbt docs generate
If I run a command that defines its own selection criteria (via --select, --exclude, or --selector), dbt will ignore the default selector and use the flag criteria instead. It will not try to combine the two.
$ dbt run --select model_a
$ dbt run --exclude model_a
Only one selector may set default: true for a given invocation; otherwise, dbt will return an error. You may use a Jinja expression to adjust the value of default depending on the environment, however:
selectors:
  - name: default_for_dev
    default: "{{ target.name == 'dev' | as_bool }}"
    definition: ...
  - name: default_for_prod
    default: "{{ target.name == 'prod' | as_bool }}"
    definition: ...