The Plugin Framework is an internal platform for running plugins. Each plugin is a piece of business logic that executes predefined rules in order to complement the Catalog schema.
The Plugin Framework is executed by the Discovery job after the Crawler completes. It runs over the Catalog schema and executes the plugins. If a plugin's rule returns 'true', it can result in a change to the Catalog schema, such as the creation or removal of Catalog elements. Each plugin calculates a score, which is a confidence level for the accuracy of the plugin's result. The score is calculated for each Catalog element.
The Data Discovery solution includes a constantly growing list of built-in plugins.
This article describes the main configuration parameters of the plugins.discovery file. The list of active plugins, their execution order and configuration parameters are described in the next article.
The plugins.discovery file is the configuration file of the Plugins Pipeline process. It is located in the Web Studio under the Implementation/SharedObjects/Interfaces/Discovery/ folder.
The plugins.discovery configuration file includes the settings of the Discovery job such as:
This file can be updated per your project's requirements. Once the plugins.discovery configuration file is updated, rerun the Discovery job to apply the changes to the Catalog.
The data sample is retrieved from the data source during the Discovery job run. The data is encrypted and is used by the various plugins during the job run. Once the plugins' execution has completed, the data sample is deleted.
The sample size is configured in the [sample_size] section of the plugins.discovery file as follows:
The [data_platforms] section of the plugins.discovery file enables setting, per data platform, an include list (include_list) or an exclude list (exclude_list).
Each list entry is either a schema name, <schema>, to be fully included (or excluded), or a specific table, given as <schema>.<table> or *.<table>. Multiple entries are comma-separated.
Example:
"data_platforms": {
    "AdventureWorks": {
        "include_list": ["Production"]
    },
    "SF_DB": {
        "exclude_list": ["SFORCE.APEXCLASS", "SFORCE.APEXLOG", "SFORCE.ASSETHISTORY"]
    }
}
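The above configuration defines the following rules: for the AdventureWorks data platform, only the Production schema is included; for the SF_DB data platform, everything is included except the three listed tables of the SFORCE schema.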
If the interface's driver supports wildcards (used in conjunction with the LIKE operator), they can be included in the <table> definition of the exclude or include lists. For example, in SQL LIKE patterns, as commonly supported through JDBC drivers, the % symbol represents zero or more characters. Thus, writing <schema>.<ABC%> matches the datasets whose names start with 'ABC'.
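The include and exclude rules above can be illustrated with a minimal Python sketch. It is not the framework's implementation: the function names (like_to_regex, entry_matches, is_selected) are hypothetical, matching is assumed to be case-sensitive, and the % wildcard is emulated with a regular expression, whereas the real matching depends on the interface's driver.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE-style pattern, where '%' matches any run of characters, to a regex."""
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$")

def entry_matches(entry, schema, table):
    """Check one list entry (<schema>, <schema>.<table> or *.<table>) against a dataset."""
    if "." not in entry:
        # A bare schema name covers every table in that schema.
        return entry == schema
    entry_schema, entry_table = entry.split(".", 1)
    if entry_schema != "*" and entry_schema != schema:
        return False
    return like_to_regex(entry_table).match(table) is not None

def is_selected(platform_cfg, schema, table):
    """Assumed semantics: keep a dataset if it passes the include list (when one
    is present) and is not matched by the exclude list (when one is present)."""
    include = platform_cfg.get("include_list")
    exclude = platform_cfg.get("exclude_list", [])
    if include is not None and not any(entry_matches(e, schema, table) for e in include):
        return False
    return not any(entry_matches(e, schema, table) for e in exclude)

if __name__ == "__main__":
    sf_db = {"exclude_list": ["SFORCE.APEXCLASS", "SFORCE.APEXLOG", "SFORCE.ASSETHISTORY"]}
    for table in ("APEXLOG", "ACCOUNT"):
        print(table, is_selected(sf_db, "SFORCE", table))
```

With this sketch, an entry such as SFORCE.ABC% matches both ABC and ABCDEF but not XABC, because % is treated as zero or more characters.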
Each plugin's definition in the plugins.discovery file includes a threshold, which is the score above which the plugin's result impacts the Catalog. For example, when the threshold is set to 0.4 and the rule receives a calculated score of 0.4 or below, the rule has no impact on the Catalog.
To have the Catalog show more results, lower the threshold below 0.4; to show fewer results, raise the threshold.
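The threshold rule can be sketched in a few lines of Python. This is only an illustration of the comparison described above; PluginResult and results_above_threshold are hypothetical names and do not come from the product.

```python
from dataclasses import dataclass

@dataclass
class PluginResult:
    element: str   # hypothetical identifier of a Catalog element
    score: float   # confidence level calculated by the plugin

def results_above_threshold(results, threshold=0.4):
    """Keep only results whose score is strictly above the threshold.
    A score equal to the threshold has no impact, as described in the text."""
    return [r for r in results if r.score > threshold]

if __name__ == "__main__":
    results = [
        PluginResult("customer.email", 0.9),
        PluginResult("customer.notes", 0.4),
        PluginResult("orders.ref", 0.35),
    ]
    for r in results_above_threshold(results):
        print(r.element, r.score)
```

Lowering the threshold argument, for example to 0.3, lets lower-scored results through, which mirrors the effect of lowering the threshold in the configuration file.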
The Plugin Framework supports the execution of custom plugins. To incorporate a custom plugin into the process, add it to the Plugins Pipeline configuration file.