Configure Data Loader for Feature Engineering
The Cloud Feature Engineering Pipeline (CFEP) computes aggregated features and performs complex data joins to support broader predictive analysis.
About feature engineering
The CFEP augments your datasets with engineered features and data joins. It automatically performs complex data selection and manipulation steps that otherwise require much more hands-on effort.
NOTE: Although the CFEP is enabled by default, it must be configured for your GPR account in Genesys Multicloud CX before you can use it. Contact your Genesys representative for more information.
After you have arranged to have the CFEP configured on the GPR Core Platform, follow the procedures in this topic to have Data Loader upload your data to the pipeline. The CFEP is available in Data Loader release 9.0.017.01 and higher.
With the CFEP enabled, Data Loader does the following:
- Automatically extracts user data from Genesys Info Mart.
- Uploads historical interaction data processed on specified virtual queues.
- Optionally, triggers execution of the CFEP job every upload period after all dataset chunks are uploaded to the GPR Core Platform.
Turn on feature engineering
Enable feature engineering
Feature engineering is enabled by default. However, you can turn off feature engineering separately for any datasets you do not want to have processed by the CFEP. To turn it off, open the Data Loader Application object in GAX and set the value of the use-cloud-feature-engineering configuration option to false. By default, this option is set to true.
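As a rough illustration of the default behavior, the following Python sketch reads an INI-style copy of the Data Loader options and reports which dataset sections would be processed by the CFEP. The INI-style layout, the sample option values, and the helper name cfep_enabled_sections are assumptions made for this example. Only the option name and its default value of true come from this topic.

```python
import configparser

# Hypothetical INI-style copy of two dataset sections (illustrative values only).
SAMPLE_OPTIONS = """
[dataset-interactions-gim]
use-cloud-feature-engineering = false

[dataset-agents-gim]
upload-dataset = true
"""


def cfep_enabled_sections(text):
    """Return the dataset sections for which feature engineering is on."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    enabled = []
    for section in parser.sections():
        if not section.startswith("dataset-"):
            continue
        # A missing option falls back to the documented default of true.
        if parser.getboolean(section, "use-cloud-feature-engineering", fallback=True):
            enabled.append(section)
    return enabled


print(cfep_enabled_sections(SAMPLE_OPTIONS))
```

In this sample, only the section that leaves the option unset is reported, because the other section explicitly sets it to false.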
Start feature engineering automatically
To have the CFEP automatically start processing each time Data Loader uploads fresh data, set the trigger-pipeline-execution option to true. If you prefer to start pipeline processing manually, you can trigger it by sending a request to the GPR API.
Configure data extraction from Genesys Info Mart
SQL Query Templates
The fields that Data Loader extracts from the Genesys Info Mart database for upload are defined by an SQL query file. Genesys provides default SQL query template files with the Data Loader IP to populate the interactions and Agent Profile datasets.
Genesys recommends that you use the standard SQL query templates to upload data to the CFEP. If you need custom SQL queries, review the information in Create your own SQL query for mandatory fields and other important guidelines.
Note: If you use the standard SQL query templates, do not configure the sql-query option for the interactions and Agent Profile datasets.
Upload User Data
Data Loader extracts user data stored in the Genesys Info Mart database, which is expected to follow user data mapping and propagation rules. See User Data Mapping in the Genesys Info Mart Deployment Guide for a discussion of this topic.
- If a user data key in the Info Mart database has only one configured rule, Data Loader automatically adds it to the interactions dataset.
- If you have configured multiple mappings for a user data key and have not specified which is the default rule, Data Loader adds each mapped user data value to a different column in the dataset. For example, suppose the key CustomData has two mappings configured, IRF_ROUTE and PARTY. With no default rule configured, Data Loader adds the following two columns to the dataset: CustomData_IRF_ROUTE and CustomData_PARTY.
- If one of the mapped user data values is more important than the others for predictor creation and model building, you can specify it as the default rule. See Configure user data with multiple mappings for instructions.
Note: If you do not have multiple mappings for user data in the Info Mart database, or if you are satisfied with Data Loader assigning each mapping to a separate column, you do not need to do any additional configuration.
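The column-naming behavior described above can be sketched in Python as follows. This is only a model of the rules stated in this topic, not Data Loader code. The function name user_data_columns is hypothetical, and the assumption that a key with a single rule produces a column named after the key is an inference, not something this topic states.

```python
def user_data_columns(key, rules, default_rule=None):
    """Return the dataset column names expected for one user data key."""
    if len(rules) == 1:
        # Assumption: a single configured rule yields one column named after the key.
        return [key]
    columns = []
    for rule in rules:
        if rule == default_rule:
            # The default rule supplies the column that carries the plain key name.
            columns.append(key)
        else:
            # Every other mapping gets its own <key>_<rule> column.
            columns.append(f"{key}_{rule}")
    return columns


print(user_data_columns("CustomData", ["IRF_ROUTE", "PARTY"]))
print(user_data_columns("CustomData", ["IRF_ROUTE", "PARTY"], default_rule="IRF_ROUTE"))
```

The first call models the case with no default rule. The second models the case in which IRF_ROUTE is specified as the default rule.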
Filter by VQ
To limit the amount of data that Data Loader uploads to the CFEP, specify the virtual queues (VQs) from which Data Loader should take historical interaction data. List the names of the desired VQs, separated by commas, in the vq-filter option.
Note: The length of the comma-separated list of VQ names should not exceed 4096 characters. Genesys always recommends that you use Genesys Administrator Extension (GAX) to configure options, but this recommendation is especially important if your list of VQ names is longer than 255 characters.
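Before you paste a long list into the vq-filter option, it can help to check its length. The following Python sketch builds the comma-separated value and checks it against the two thresholds mentioned in the note. The function name build_vq_filter and the sample VQ names are hypothetical.

```python
MAX_FILTER_LENGTH = 4096      # upper limit for the list, as stated in the note
GAX_ADVISED_ABOVE = 255       # beyond this length, configuring through GAX matters most


def build_vq_filter(vq_names):
    """Join VQ names with commas and check the result against the length limits.

    Returns the option value and a flag that is True when the value is
    longer than 255 characters.
    """
    cleaned = [name.strip() for name in vq_names if name.strip()]
    value = ",".join(cleaned)
    if len(value) > MAX_FILTER_LENGTH:
        raise ValueError(
            f"vq-filter value is {len(value)} characters; the limit is {MAX_FILTER_LENGTH}"
        )
    return value, len(value) > GAX_ADVISED_ABOVE


value, needs_gax = build_vq_filter(["VQ_Sales", " VQ_Support "])
print(value, needs_gax)
```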
Migrate datasets to FE
If you have already-configured datasets that you would now like the CFEP to process, use the following procedure.
Note: This procedure must be done by a user with the STAFF role.
- Disable interaction processing using GPR for the time it takes to perform the migration.
- From Genesys Administrator Extension (GAX), export your current Data Loader configuration and save it as a reference.
- Delete the existing Agent Profile schema from the GPR web application. See View the Agent Profile schema in the Predictive Routing Help for instructions.
Use GAX to update your Data Loader configuration:
- Set the upload-dataset and use-cloud-feature-engineering options to false for all the previously-uploaded datasets you plan to migrate.
- Remove the following configuration sections, if present. They are not used with the CFEP, which performs data aggregation automatically: [dataset-agents-gim-ext] and [schema-agents-gim-ext].
- Your configuration needs to include both the [dataset-interactions-gim] and [schema-interactions-gim] pair of configuration sections and a new pair of sections that include the same options and the same initial configuration settings.
- NOTE: Data Loader releases prior to 9.0.017.01 did not include [dataset-interactions-gim] and [schema-interactions-gim] in the default template. If you do not have those sections in your environment, use GAX to import the Data Loader Application Template for release 9.0.017.01 or higher, then continue with the following procedure.
- To create the required configuration sections, use GAX to perform the following steps:
- Rename the [dataset-interactions-gim] and [schema-interactions-gim] sections to [dataset-interactions-gim-temp] and [schema-interactions-gim-temp]. This prevents them from being overwritten in the following step.
- In GAX, import the DataLoader.cfg template file, which is located in the Data Loader Installation Package. For detailed information about importing Application templates, see Bulk Provisioning of Configuration Options in the Genesys Administrator Extension Help.
- Locate the newly-created [dataset-interactions-gim] and [schema-interactions-gim] sections, which were provisioned by default when you imported the .cfg file. Rename them for use with the CFEP. For example, the new names of these sections might be [dataset-interactions-fe] and [schema-interactions-fe]. All further configuration for the CFEP should be done using these sections.
- Return to your original sections, now named [dataset-interactions-gim-temp] and [schema-interactions-gim-temp]. Return their names to [dataset-interactions-gim] and [schema-interactions-gim]. These sections do not require any additional configuration.
- (Optional) To use multiple SQL queries to extract data from the Genesys Info Mart database into separate interactions datasets, configure additional [dataset-<name>] and [schema-<name>] configuration sections, one for each separate SQL query.
- NOTE: In this case you need to configure the [dataset-<name>].sql-query option for each additional dataset and provide the correct path to the associated SQL file location.
- Configure the date range for each dataset to be used with the CFEP by setting the desired values for the [dataset-<name>].start-date and [dataset-<name>].end-date options.
- Save the changes to your configuration.
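The rename, import, and rename-back sequence above can be easier to follow as a small simulation. The Python sketch below models the configuration sections as a plain dictionary. It does not call GAX or any Genesys API, and the function names and sample option values are hypothetical.

```python
def rename_section(sections, old, new):
    """Rename a section, refusing to overwrite an existing one."""
    if new in sections:
        raise KeyError(f"section [{new}] already exists")
    sections[new] = sections.pop(old)


def migrate_sections(existing, template, suffix="fe"):
    """Simulate the section renaming steps on a copy of the configuration."""
    sections = dict(existing)
    pair = ["dataset-interactions-gim", "schema-interactions-gim"]
    # Step 1: park the original sections under temporary names.
    for name in pair:
        rename_section(sections, name, name + "-temp")
    # Step 2: importing the template re-creates the default sections.
    for name in pair:
        sections[name] = dict(template[name])
    # Step 3: rename the newly created sections for use with the CFEP.
    for name in pair:
        rename_section(sections, name, name.replace("-gim", "-" + suffix))
    # Step 4: give the original sections their names back.
    for name in pair:
        rename_section(sections, name + "-temp", name)
    return sections


existing = {
    "dataset-interactions-gim": {"upload-dataset": "false"},
    "schema-interactions-gim": {"CustomData": "string"},
}
template = {
    "dataset-interactions-gim": {"upload-dataset": "true"},
    "schema-interactions-gim": {},
}
print(sorted(migrate_sections(existing, template)))
```

The result contains the original pair of sections with their settings intact, plus a new pair that carries the template defaults under the CFEP-specific names.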
After configuring your datasets in GAX, continue with the following steps:
- If you have configured multiple User Data Propagation rules for a single user data key in Genesys Info Mart, see the Configure user data rules section on this page.
- Install Data Loader release 9.0.017.01 or higher, following the instructions to Deploy Data Loader. The version you deploy must support the CFEP.
- In GAX, set the use-cloud-feature-engineering option to true for the [dataset-agents-gim] and [dataset-interactions-fe] configuration sections in the Data Loader Application object. If you are using additional datasets, make this change in the associated dataset configuration sections as well.
- Start Data Loader.
- In GAX, set the upload-dataset option to true for the [dataset-agents-gim] and [dataset-interactions-fe] configuration sections in the Data Loader Application object. If you are using additional datasets, make this change in the associated dataset configuration sections as well. This triggers the initial upload to the CFEP.
- After the CFEP job is complete, open the GPR web application and review the datasets created. If necessary, refer to View your uploaded data in the Predictive Routing Help for help finding and understanding the data displays. You can now use the uploaded data to create predictors and models.
- After you have verified the quality of the datasets produced by the CFEP, created predictors with the new datasets, and trained the new models, use GAX to remove your old datasets from the Data Loader Application configuration.
- Re-enable GPR interaction processing.
Configure user data rules
If you have used Genesys Info Mart user data mapping rules to have certain user data stored in multiple Info Mart database fields, you can choose to specify one of those as the main value for your dataset. To do so, create an option in the [schema-interactions-*] section of the Data Loader Application. The option name is the user data key name. The value should indicate the datatype and the parameter 'rule:<rule_name>'.
Example
- If the dataset is identified with the suffix "fe", the section name where you should create the option is [schema-interactions-fe].
- The option name should be the name of the user data key. In this example, the key name is CustomData.
- Set the option value using the following format: datatype, rule:<rule_name>. For example, your option value in this example might be: string,rule:IRF_ROUTE.
Data Loader adds a column named CustomData to the uploaded dataset, which contains the values from the Info Mart database column defined by the mapping rule IRF_ROUTE.
Note: Data from the other mapping rules is retained and added to the dataset, in case it might prove useful. The value for each rule is stored in a separate column named <user_data_keyname>_<rule_name>. For example, you might have a rule, PARTY, for the CustomData user data key. In that case, Data Loader adds a column named CustomData_PARTY containing the values for the other mapping.
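To check an option value before entering it in GAX, you could parse it with a short Python helper such as the one below. This is a minimal sketch that assumes the value always follows the datatype, rule:<rule_name> format shown above. The function name parse_user_data_option is hypothetical and is not part of Data Loader.

```python
def parse_user_data_option(value):
    """Split an option value such as 'string,rule:IRF_ROUTE' into its parts."""
    datatype, _, remainder = value.partition(",")
    datatype = datatype.strip()
    remainder = remainder.strip()
    if not datatype or not remainder.startswith("rule:"):
        raise ValueError(f"expected 'datatype,rule:<rule_name>', got {value!r}")
    rule_name = remainder[len("rule:"):].strip()
    if not rule_name:
        raise ValueError("the rule name is empty")
    return datatype, rule_name


datatype, rule_name = parse_user_data_option("string,rule:IRF_ROUTE")
print(datatype, rule_name)
```

The helper accepts the value with or without a space after the comma, because both forms appear in the example above.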