The Discovery Pipeline window in the Catalog Settings tab provides a comprehensive view of the Discovery job configuration. It displays the product's default baseline configuration (retrieved from the product's plugins.discovery file) and the project-level rules.
The Baseline rule includes a list of the product's built-in plugins with their input parameters, data snapshot sample size and more.
The Discovery Pipeline window enables performing the following actions, described further in this article:
The overrides are saved in the project pluginsOverride.discovery file, which is created in the Project's Implementation/SharedObjects/Interfaces/Discovery/ folder.
This article describes the capabilities of the Discovery Pipeline window and explains how they can impact the Discovery job.
The Baseline rule is a default configuration, applied when running the Discovery job on any data platform. It includes a sample size definition, a global schema exclude list and a list of product plugins with their settings.
The Baseline rule is always enabled. It can be edited by checking the Override checkbox. The following changes can be applied to the Baseline rule:
Note that the Baseline rule overrides are automatically propagated to the project-level rules. For example, when a plugin is changed from 'inactive' to 'active' in the baseline, it will become 'active' in all project-level rules. A project rule, however, can override the Baseline rule.
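The propagation described above can be illustrated with a minimal sketch. The dictionaries, the plugin names and the effective_settings function are hypothetical and serve only to illustrate the inheritance logic; they do not reflect the actual format of the plugins.discovery or pluginsOverride.discovery files.

```python
# Illustrative model only: these structures are hypothetical and do not
# reflect the format of plugins.discovery or pluginsOverride.discovery.
from copy import deepcopy


def effective_settings(baseline, rule_overrides):
    """Return the settings a project rule ends up with.

    Baseline values are inherited unless the rule overrides them.
    """
    settings = deepcopy(baseline)
    for key, value in rule_overrides.items():
        if key == "plugins":
            # Plugins are overridden one by one; the rest keep the baseline state.
            settings["plugins"].update(value)
        else:
            settings[key] = value
    return settings


baseline = {
    "sample_size_pct": 10,
    "plugins": {"Plugin A": "active", "Plugin B": "inactive"},
}

# The baseline is overridden: Plugin B becomes active.
baseline["plugins"]["Plugin B"] = "active"

# A rule without its own overrides inherits the change.
inherited = effective_settings(baseline, {})

# A rule that overrides the plugin keeps its own value.
overridden = effective_settings(baseline, {"plugins": {"Plugin B": "inactive"}})

print(inherited["plugins"]["Plugin B"])   # active
print(overridden["plugins"]["Plugin B"])  # inactive
```

In this sketch, the rule-level value is applied last, which is why a project rule can still differ from the baseline after a baseline change has been propagated.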
The Baseline rule overrides can be reverted in one of the following ways:

The Discovery Pipeline window enables the user to refine the default configuration per the project's requirements.
Each rule must be attached to a data platform. Other parameters may become mandatory, depending on the rule type. The mandatory and optional parameters of each rule type are described further in this article.

Click on 'Add Rule +' to create a new rule.
Rules may be of three types:
For filter rule creation, set either 'Exclude Others' or 'Exclude This' in the Crawler Filter column. In this case, the Data Platform and Schema(s) fields are mandatory, while the Dataset(s) field is optional.
For override rule creation, the only mandatory actions are selecting a Data Platform and checking the Override checkbox. This rule will apply to the entire Data Platform. Populating the Schema(s) and Dataset(s) fields will make this rule more specific.
For creating a combined rule, which includes both a filter and the overrides, set Crawler Filter = 'Exclude Others' and check the Override checkbox.
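The conditions above can be summarized in a short sketch that classifies a rule by its Crawler Filter value and Override checkbox, and checks the mandatory fields. The Rule class, its field names and the classify function are hypothetical and are not part of the product. Rejecting a rule that has neither a filter nor an override is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rule:
    # Hypothetical field names that mirror the columns named in the text.
    data_platform: str
    schemas: List[str] = field(default_factory=list)
    datasets: List[str] = field(default_factory=list)
    crawler_filter: str = "No Filter"  # or "Exclude Others" / "Exclude This"
    override: bool = False


def classify(rule: Rule) -> str:
    """Return 'filter', 'override' or 'combined', or raise ValueError."""
    if not rule.data_platform:
        raise ValueError("a rule must be attached to a data platform")

    filtering = rule.crawler_filter in ("Exclude Others", "Exclude This")
    if filtering and not rule.schemas:
        raise ValueError("a filter rule needs at least one schema")

    # Only 'Exclude Others' can be combined with an override.
    if rule.crawler_filter == "Exclude This" and rule.override:
        raise ValueError("'Exclude This' cannot be combined with an override")

    if filtering and rule.override:
        return "combined"
    if filtering:
        return "filter"
    if rule.override:
        return "override"
    # Assumption: a rule with neither a filter nor an override has no effect.
    raise ValueError("rule defines neither a filter nor an override")


print(classify(Rule("sakilla_pg", ["crm"], crawler_filter="Exclude Others")))
print(classify(Rule("CRM_DB", override=True)))
```

The first call describes a filter rule and the second an override rule that applies to the entire data platform.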
The purpose of the 'Exclude Others' filter rule is to limit the Discovery process to the specified source entities.
Example
The image below presents a rule defined for the sakilla_pg data platform and the crm schema. The purpose of this rule is to limit the Discovery process to the crm schema only, since sakilla_pg includes multiple schemas that are irrelevant for the current run.

The purpose of the override rule is to override one or multiple baseline settings without filtering the data source.
The rule requires selecting a data platform and checking the Override checkbox.
The Crawler Filter should be set to 'No Filter' as the discovery should be executed on the entire data platform.
Note that when the Schema(s) and Dataset(s) fields are populated, the overrides are applied only to the specified schemas and datasets. This type of rule does not have any filtering effect.
Example
The image below presents an override rule defined for the CUSTOMER table in the main schema of the CRM_DB data platform.
The purpose of this rule is to override the Sample Size definition, increasing it to 25% (instead of the default 10% setting). This override applies only to the specified dataset, CUSTOMER. The discovery is executed on the entire CRM_DB data platform without any filters.
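The scope of the CUSTOMER override can be sketched as follows. The dictionary layout and the sample_size_pct function are hypothetical. The sketch considers a single override rule and does not model how multiple rules are combined.

```python
DEFAULT_SAMPLE_SIZE_PCT = 10  # the default setting quoted in the example

# Hypothetical representation of the override rule from the example.
customer_rule = {
    "data_platform": "CRM_DB",
    "schemas": ["main"],
    "datasets": ["CUSTOMER"],
    "sample_size_pct": 25,
}


def sample_size_pct(data_platform, schema, dataset, rule):
    """Return the sample size for one dataset under a single override rule."""
    matches = (
        data_platform == rule["data_platform"]
        # Empty Schema(s) or Dataset(s) lists mean the rule is not narrowed.
        and (not rule["schemas"] or schema in rule["schemas"])
        and (not rule["datasets"] or dataset in rule["datasets"])
    )
    return rule["sample_size_pct"] if matches else DEFAULT_SAMPLE_SIZE_PCT


print(sample_size_pct("CRM_DB", "main", "CUSTOMER", customer_rule))    # 25
print(sample_size_pct("CRM_DB", "main", "CASE_NOTES", customer_rule))  # 10
```

Both tables are still discovered, since the rule does not filter anything. Only the sample size differs.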

The purpose of the combined rule is to limit the Discovery process to the specified source entities and, at the same time, to override some of the baseline settings.
Example
The image below presents a rule defined for the sakilla_pg data platform and the crm schema. The purpose of this rule is to limit the Discovery process to the crm schema only, since sakilla_pg includes multiple schemas that are irrelevant for the current run. In addition to the filter, the rule also defines a baseline override by setting one of the inactive plugins to 'active'.

The purpose of the 'Exclude This' filter rule is to exclude the specified source entities during the Crawler run.
The rule requires selecting a data platform and populating at least one schema.
When one or multiple datasets are populated, these dataset(s) will be excluded.
This rule cannot be combined with the override action, as the Crawler will exclude the specified nodes.
Example
The image below presents a rule that excludes the CASE_NOTES table in the main schema of the CRM_DB data platform from the Discovery process. This means that discovery runs on all CRM_DB tables except CASE_NOTES.
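The effect of the two Crawler Filter values can be compared in a minimal sketch. The in_scope function and the dictionary layout are hypothetical, and the datasets passed in are assumed to belong to the rule's data platform. The sketch also assumes that an 'Exclude This' rule with no datasets populated excludes the listed schemas as a whole.

```python
def in_scope(schema, dataset, rule):
    """Decide whether the Crawler visits schema.dataset under one filter rule."""
    schema_hit = schema in rule["schemas"]
    # An empty dataset list means the rule addresses whole schemas (assumption).
    entity_hit = schema_hit and (not rule["datasets"] or dataset in rule["datasets"])

    if rule["crawler_filter"] == "Exclude Others":
        return entity_hit       # only the specified entities are crawled
    if rule["crawler_filter"] == "Exclude This":
        return not entity_hit   # everything except the specified entities
    return True                 # "No Filter"


exclude_case_notes = {
    "crawler_filter": "Exclude This",
    "schemas": ["main"],
    "datasets": ["CASE_NOTES"],
}
only_crm = {"crawler_filter": "Exclude Others", "schemas": ["crm"], "datasets": []}

print(in_scope("main", "CASE_NOTES", exclude_case_notes))  # False
print(in_scope("main", "CUSTOMER", exclude_case_notes))    # True
print(in_scope("crm", "any_table", only_crm))              # True
print(in_scope("other_schema", "any_table", only_crm))     # False
```

The names any_table and other_schema are placeholders used only to show an entity that falls outside the rule.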

Multiple rules can be defined for the same data platform. The purpose of creating multiple rules is to allow variations of the Discovery process execution for different elements. For example, one may need to set a larger sample size for some datasets or execute a specific plugin on a designated dataset or schema.
When multiple rules are defined for the same data platform, they adhere to the following hierarchy:
Example of rule combinations and hierarchy
The image below presents three rules defined for the AdventureWorks data platform:

When a new plugin is created in a project, it should be added to the Baseline rule in order to be included in the Discovery job execution. Once added to the baseline, the new plugin is automatically propagated to all existing rules and can have different settings in each.
For example, if a newly created plugin is applicable only to running Discovery on the CRM_DB, it should be added to the baseline as 'inactive'. Then, a rule for the CRM_DB should be created, where this plugin is set to 'active'.
The steps for adding a new plugin to the pipeline are:
Click the icon to open the Plugins context menu and choose Add Plugin.
The new plugin is always added to the end of the Plugins list. However, the plugin's execution order can be changed by dragging it to the desired position within the list.
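The ordering behavior can be pictured as operations on an ordered list, where the list order is the execution order. The plugin names and both functions below are placeholders for illustration and are not product APIs.

```python
def add_plugin(plugins, name):
    """A new plugin is always appended to the end of the list."""
    return plugins + [name]


def move_plugin(plugins, name, position):
    """Model dragging a plugin to a new position in the execution order."""
    if name not in plugins:
        raise ValueError(f"unknown plugin: {name}")
    reordered = [p for p in plugins if p != name]
    reordered.insert(position, name)
    return reordered


pipeline = ["Plugin A", "Plugin B", "Plugin C"]  # placeholder names
pipeline = add_plugin(pipeline, "My Project Plugin")
print(pipeline)  # ['Plugin A', 'Plugin B', 'Plugin C', 'My Project Plugin']

pipeline = move_plugin(pipeline, "My Project Plugin", 1)
print(pipeline)  # ['Plugin A', 'My Project Plugin', 'Plugin B', 'Plugin C']
```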
Note that the Delete selected option in the context menu is available only for the project plugins, as product plugins cannot be deleted. If a product plugin is not needed, it can be set to 'inactive' in the Baseline rule.