To open the real-time Disk_Deprecated collection agent configuration, click Build → New Configuration. Select Workflow from the Configurations dialog. When prompted to Select workflow type, select Realtime. Click Add agent and select Disk Deprecated from the Agent Selection dialog. To display the Agent Configuration dialog, double-click the agent icon, or right-click the icon and select Edit agent.


Disk_Deprecated Tab

The Disk_Deprecated tab contains settings related to the placement and handling of the source files to be collected by the agent.

Disk_Deprecated collection agent - Disk_Deprecated tab

File Information

Setting

Description

Directory

Enter the path of the source directory on the local file system of the EC, where the source files reside. The path can be absolute or relative to the $MZ_HOME environment variable.

While a file is being processed, it is temporarily stored in a subdirectory under DR_TMP_DIR in the specified directory.

Filename

Enter an expression that matches the source files on the local file system. Regular expressions according to Java syntax apply. For further information, see: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

Example.

To match all filenames beginning with ready_TTFILE, type:

ready_TTFILE.*
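
As a rough illustration of how such an expression selects files, the sketch below uses Python's `re` module as a stand-in for Java's `Pattern`; the two behave the same for a simple pattern like this one. It assumes that the expression must match the whole file name, which this page does not confirm, and the file names are invented for the example.

```python
import re

# Filename expression from the example above.
FILENAME_EXPRESSION = re.compile(r"ready_TTFILE.*")


def select_files(names):
    """Return the names matched in full by the expression (assumed semantics)."""
    return [name for name in names if FILENAME_EXPRESSION.fullmatch(name)]


# Hypothetical directory listing.
candidates = ["ready_TTFILE_001.dat", "TTFILE_002.dat", "tmp_ready_TTFILE_003.dat"]
selected = select_files(candidates)
```

Under the whole-name assumption, only the first candidate is selected.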


Note!

Collecting a file while it is open for writing in another application may cause loss of data. For this reason, it is recommended that you rename files after moving them to the source directory. The renamed files should include a suffix or prefix that is also included in the Filename expression.
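
One way a delivering application could follow this recommendation is sketched below. The `publish` helper, the `writing_` working prefix, and the `ready_` prefix are assumptions made for the example and are not part of the product; the point is only that the file gets a name matching the Filename expression after it has been completely written.

```python
import os
import tempfile


def publish(source_dir, name, payload, prefix="ready_"):
    """Write the file under a name the agent ignores, then rename it."""
    working = os.path.join(source_dir, "writing_" + name)
    final = os.path.join(source_dir, prefix + name)
    with open(working, "wb") as handle:
        handle.write(payload)
    # os.replace is atomic when both paths are on the same file system.
    os.replace(working, final)
    return final


with tempfile.TemporaryDirectory() as source_dir:
    published = os.path.basename(publish(source_dir, "TTFILE_001.dat", b"record"))
    listing = sorted(os.listdir(source_dir))
```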


Compression

Select Gzip to decompress files before collection and insertion into the workflow. If the collected files are not compressed, select No Compression.
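
The effect of this setting can be pictured with the minimal sketch below. The function name and the string values used for the setting are hypothetical, and the agent's actual implementation is not described on this page.

```python
import gzip


def decompress_if_needed(raw_bytes, compression):
    """Return the file content as it would enter the workflow."""
    if compression == "Gzip":
        return gzip.decompress(raw_bytes)
    # "No Compression": pass the bytes through unchanged.
    return raw_bytes


plain = b"a,b,c\n"
from_gzip = decompress_if_needed(gzip.compress(plain), "Gzip")
from_plain = decompress_if_needed(plain, "No Compression")
```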

Polling Interval (ms)

Enter the interval, in milliseconds, at which the source directory is to be scanned for new files.

File Reader Size

The agent moves files to the temporary directory for processing in bulk. Enter the maximum number of files to be included in each bulk move operation.

Read Size (b)

Enter the buffer size, in bytes, to be used for reading files.

When a decoder is selected, the agent produces FileSend UDRs that each contain one decoded UDR. When no decoder is selected, the FileSend UDRs contain bytearrays that are split according to the value of Read Size (b).

Note!

For performance reasons, it is recommended to set this value to a multiple of 1024.
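
The splitting that applies when no decoder is selected can be sketched as follows. This is an illustration of the principle only; the function name is hypothetical and the 4096-byte value is just one example of a multiple of 1024.

```python
def split_by_read_size(data, read_size=4096):
    """Split file content into consecutive chunks of at most read_size bytes."""
    if read_size <= 0:
        raise ValueError("read_size must be positive")
    return [data[i:i + read_size] for i in range(0, len(data), read_size)]


# A 10000-byte file read with a 4096-byte buffer.
chunks = split_by_read_size(b"x" * 10000, read_size=4096)
sizes = [len(chunk) for chunk in chunks]
```

Here the file yields two full chunks and one shorter final chunk.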


No of Thread(s)

Enter the number of worker threads that should be used for file collection. This field determines the number of files that the agent can process concurrently. You should typically set the number of threads to match the number of workflow threads. To ensure that the files are processed in timestamp order, set No of Thread(s) to 1 and also select the checkbox Sort Files.

Sort Files

Select this checkbox to sort the files that are moved to the temporary directory for processing. The files are sorted according to timestamp in ascending order. Due to multithreading, the files may not be processed in this order.
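
The interplay between Sort Files and File Reader Size can be sketched as below. The sketch assumes that the timestamp is a per-file modification time and that sorting is applied before the files are divided into bulk moves; neither detail is stated on this page, and the file names are invented.

```python
def plan_bulk_moves(files, file_reader_size, sort_files=True):
    """files is a list of (name, timestamp) pairs; returns lists of names per bulk move."""
    ordered = sorted(files, key=lambda item: item[1]) if sort_files else list(files)
    names = [name for name, _ in ordered]
    return [names[i:i + file_reader_size] for i in range(0, len(names), file_reader_size)]


# Hypothetical listing with timestamps in arbitrary units.
listing = [("c.dat", 300), ("a.dat", 100), ("b.dat", 200)]
moves = plan_bulk_moves(listing, file_reader_size=2)
```

With more than one worker thread, files from the same bulk move may still finish in a different order, which is why the strict ordering case requires No of Thread(s) set to 1.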

Timeout Handling

Timeout (ms)

When the agent routes the first partial data set to the workflow, a timeout counter starts. The counter is reset when the workflow acknowledges reception of the complete file. If the timeout is exceeded, the collected file is either moved to a user-specified directory or handled according to the After Collection strategy.

Enter the timeout value in milliseconds.

Move Files on Timeout

Select this checkbox to move timed out files to an automatically created subdirectory that you specify in the Directory setting.

Note!

When this checkbox is cleared and a timeout occurs, the After Collection strategy is applied.


Path

When Move Files on Timeout is selected, enter the target directory for files that are subject to timeout handling.
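
The timeout behavior described in this section can be summarized in a small sketch. The class, method, and return values are hypothetical names chosen for the illustration, and the path in the usage lines is invented.

```python
import time


class FileTimeout:
    """Tracks the time between the first routed data and the acknowledgement."""

    def __init__(self, timeout_ms, clock=time.monotonic):
        self.timeout_ms = timeout_ms
        self.clock = clock
        self.started = None

    def first_data_routed(self):
        self.started = self.clock()

    def acknowledged(self):
        # The workflow has received the complete file: reset the counter.
        self.started = None

    def expired(self):
        if self.started is None:
            return False
        return (self.clock() - self.started) * 1000 > self.timeout_ms


def action_on_timeout(move_files_on_timeout, path):
    """Choose what happens to a timed-out file."""
    if move_files_on_timeout:
        return ("move", path)
    return ("after_collection", None)


# Usage with a fake clock so that the example is deterministic.
now = [0.0]
timer = FileTimeout(timeout_ms=500, clock=lambda: now[0])
timer.first_data_routed()
now[0] = 0.6
timed_out = timer.expired()
decision = action_on_timeout(True, "/data/timeout")
```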

After Collection

Move/Rename

Select this radio button to move the source files to the directory specified in the Destination field, after the collection.

If you use the Prefix or Suffix fields, the file is renamed as well.

In the Destination field, enter the path of the directory on the local file system of the EC into which the source files should be moved after collection. The path can be absolute or relative to the $MZ_HOME environment variable.

Note!

It is possible to move collected files from one file system to another, but doing so has a negative impact on performance.


Prefix/Suffix

Enter the prefix and/or suffix to add to the beginning and/or end of the source file names after collection. These fields are available if you have selected Move or Rename.

Note!

If you have selected Rename, the source files are renamed in the current directory. Make sure not to assign a Prefix or Suffix that results in file names that match the regular expression in Filename; otherwise the files will be collected over and over again.
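
A quick way to check a planned Prefix or Suffix against the Filename expression is sketched below. It uses Python's `re` module in place of Java's regular expressions and assumes whole-name matching; the helper name and the sample values are hypothetical.

```python
import re


def would_be_recollected(filename_expression, name, prefix="", suffix=""):
    """Return True if the renamed file still matches the Filename expression."""
    renamed = prefix + name + suffix
    return re.fullmatch(filename_expression, renamed) is not None


expression = r"ready_TTFILE.*"
# A suffix keeps the beginning of the name intact, so the file matches again.
risky = would_be_recollected(expression, "ready_TTFILE_001.dat", suffix=".done")
# A prefix changes the beginning of the name, so the file no longer matches.
safe = would_be_recollected(expression, "ready_TTFILE_001.dat", prefix="done_")
```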


Search/Replace


Note!

To apply Search and Replace, select Move/Rename.

Search: Enter the part of the filename that you want to replace.

Replace: Enter the replacement text.

Search and Replace operate on entries in a way that is similar to the Unix sed utility.

The identified filenames are modified and forwarded to subsequent agents in the workflow.

This functionality also enables you to perform advanced filename modifications:

  • Use a regular expression in the Search entry to specify the part of the filename to extract.

  • In the Replace entry, enter characters and meta characters that define the pattern and content of the replacement text.

Note!

A regular expression that fails to match the original file name will abort the workflow.
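
The sed-like behavior can be approximated with the sketch below. It uses Python's `re.sub`, so the `\1` group reference is Python syntax; the meta characters that the agent accepts in Replace are not listed on this page. The function name and the sample file name are hypothetical, and raising an error stands in for the workflow abort.

```python
import re


def search_replace(filename, search, replace):
    """Apply a sed-style substitution to a file name."""
    if re.search(search, filename) is None:
        # Stand-in for the workflow abort that occurs when Search fails to match.
        raise ValueError("Search expression does not match " + filename)
    return re.sub(search, replace, filename, count=1)


# Keep the captured part of the name and change the extension.
new_name = search_replace(
    "ready_TTFILE_001.dat",
    r"ready_(TTFILE_\d+)\.dat",
    r"\1.collected",
)
```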

Remove

If this radio button is selected, the source files are removed from the source directory after collection.

Decoder Tab

The Decoder tab contains settings related to decoding of the collected data.

Disk_Deprecated collection agent - Decoder tab

Disk_Deprecated collection agent when MZ Tagged Format is selected - Decoder tab

Setting

Description

Decoder

Click Browse to select from a list of available decoders created in the Ultra Format Editor, as well as the default built-in decoders:

  • CSV Format

  • JSON Format

  • MZ Tagged Format

Different settings are available depending on the Decoder you select.

Full Decode

This option is only available when you have selected a decoder created in the Ultra Format Editor.

Select this option to fully decode the UDR before it is sent out from the decoder agent. This action may have a negative impact on performance, since not all fields may be accessed in the workflow, making decoding of all fields in the UDR unnecessary. If it is important that all decoding errors are detected, you must select this option.

If this checkbox is cleared (default), the amount of work needed for decoding is minimized using "lazy" decoding of field content. This means that the actual decoding work may be done later in the workflow, when the field values are accessed for the first time. Corrupt data (that is, data for which decoding fails) may not be detected during the decoding stage, but can cause a workflow to abort later in the process.
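
The difference between the two modes can be illustrated with a small sketch. The class below is a hypothetical model, not the product's decoder: fields are kept as raw text and converted to integers either up front (full decode) or on first access (lazy decode).

```python
class SketchUdr:
    """Toy UDR whose fields are decoded from text to integers."""

    def __init__(self, raw_fields, full_decode=False):
        self._raw = dict(raw_fields)
        self._decoded = {}
        if full_decode:
            # Full Decode: every field is decoded now, so errors surface here.
            for name in self._raw:
                self.get(name)

    def get(self, name):
        # Lazy decode: the work happens the first time a field is read.
        if name not in self._decoded:
            self._decoded[name] = int(self._raw[name])
        return self._decoded[name]


raw = {"duration": "42", "bytes": "4x2"}  # "bytes" is corrupt on purpose

lazy = SketchUdr(raw)             # succeeds, the corrupt field is not touched
duration = lazy.get("duration")   # decoding "duration" works

try:
    SketchUdr(raw, full_decode=True)
    detected_early = False
except ValueError:
    detected_early = True
```

In the lazy case, the corrupt field would only raise an error when some later step reads it, which corresponds to a workflow aborting after the decoding stage.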

MZ Tagged Specific Settings

Tagged UDR Type

Click the Add button to select from a list of the internal UDR formats stored in the Ultra and Code servers. The decoder reprocesses UDRs of the selected internal formats and sends them out.

If the compressed format is used, the decoder automatically detects this. 

JSON Specific Settings

UDR Type

Click Browse to select the UDR type you want the Decoder to send out. You can either select one of the predefined UDRs or the DynamicJson UDR, which allows you to add a field of type any for including payload.

Unmapped Fields

If you have selected DynamicJson as UDR Type, you can select the option data in this field to include the payload. If you have selected another UDR type that contains an any or map field, you can choose to put any unmapped fields into the field you select in this list. All fields of any or map type in the selected UDR type are available. If set to (None), any unmapped fields are lost.
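
The handling of unmapped fields can be pictured as follows. The sketch models a UDR as a Python dictionary; the function name, the field names, and the use of `None` for the (None) option are assumptions made for the illustration.

```python
import json


def decode_json(text, known_fields, unmapped_target="data"):
    """Map known JSON keys to UDR fields and collect the remaining keys."""
    record = json.loads(text)
    udr = {key: record[key] for key in known_fields if key in record}
    unmapped = {key: value for key, value in record.items() if key not in known_fields}
    if unmapped_target is not None:
        udr[unmapped_target] = unmapped
    # With unmapped_target set to None, the unmapped keys are dropped.
    return udr


text = '{"id": 1, "extra": "x"}'
kept = decode_json(text, known_fields=["id"], unmapped_target="data")
lost = decode_json(text, known_fields=["id"], unmapped_target=None)
```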

Schema Path

Enter the path to the JSON schema you want to use in this field.

CSV Specific Settings

UDR Type

Click Browse to select the UDR type you want the Decoder to send out. You can either select one of the predefined UDRs or the DynamicCsv UDR if the CSV format is not known.

Format

Select the CSV format you want to use: Unix, Mac, Windows, Excel, or Custom. If you select Custom to define your own format, the following four settings are enabled.

Delimiter

Enter the delimiter character(s) for the fields in the CSV.

Use Quote

Select this option if quotes are used in the CSV.

Quote

If Use Quote is selected, enter the type of quotes used in the CSV.

Line Break

Enter how line breaks are expressed in the CSV.
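
To show what the four custom settings control, the sketch below parses a small sample with Python's `csv` module. This is an analogy only: the function and its parameter names are hypothetical, Python's reader accepts a single delimiter character, and it recognizes the common line breaks on its own instead of taking a Line Break setting.

```python
import csv
import io


def read_custom_csv(text, delimiter=";", use_quote=True, quote="'"):
    """Parse CSV text with a custom delimiter and optional quoting."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=quote if use_quote else None,
        quoting=csv.QUOTE_MINIMAL if use_quote else csv.QUOTE_NONE,
    )
    return [row for row in reader]


# Semicolon-delimited sample with single quotes and Windows line breaks.
sample = "a;'b;c'\r\nd;e\r\n"
rows = read_custom_csv(sample)
unquoted_rows = read_custom_csv("a;b\nc;d\n", use_quote=False)
```

With quoting enabled, the delimiter inside the quoted value is kept as part of the field.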
