FTP Collection Agent (4.3)

The FTP collection agent collects files from a remote file system and inserts them into a workflow, using the standard FTP (RFC 959) protocol.

When activated, the collector establishes an FTP session towards the remote host. On failure, additional hosts are tried if so configured. On success, the source directory on the remote host is scanned for all files matching the current Filename settings, which are located in the Source tab. In addition, the Filename Sequence service may be used to further control the matching files. All files found will be fed one after the other into the workflow.

The agent also offers the possibility to decompress compressed (gzip) files after they have been collected, before they are inserted into the workflow. When all the files are successfully processed, the agent stops to await the next activation, scheduled or manually initiated.

The FTP collection agent supports IPv4 and IPv6 environments.

Part of the configuration may be done in the Filename Sequence and Sort Order service tabs, described in the sections Filename Sequence Tab and Sort Order Tab in Workflow Template (3.0).

Configuration

The FTP collection agent configuration consists of three tabs: Connection, Source and Advanced.

Connection Tab

The Connection tab contains configuration data that is relevant to a remote server.


Setting | Description

Host

Primary host name or IP address of the remote host to connect to. If a connection cannot be established to this host, the Additional Hosts specified in the Advanced tab are tried.

Username

Username for an account on the remote host, enabling the FTP session to log in.

Password

Password related to the Username.

Transfer Type

Data transfer type to be used during file retrieval.

  • Binary - agent uses binary transfer type. Default setting.

  • ASCII - agent uses ASCII transfer type.

File System Type

Type of file system on the remote host.

  • Unix - remote host using Unix file system. Default setting.

  • Windows NT - remote host using Windows NT file system.

  • VAX/VMS - remote host using VAX/VMS file system.

Collection Retries

Select this check box to enable repetitive attempts to connect and start a file transfer.

When this option is enabled, the agent will attempt to connect to the host as many times as is stated in the Max Retries field described below. If the connection fails, a new attempt will be made after the number of seconds entered in the Retry Interval (s) field described below.

Retry Interval (s)

Enter the time interval in seconds, between retries.

If a connection problem occurs, the actual time interval before the first attempt to reconnect will be the time set in the Timeout field in the Advanced tab plus the time set in the Retry Interval (s) field. For the remaining attempts, the actual time interval will be the number of seconds entered in this field.

Max Retries

Enter the maximum number of retries to connect.

In case more than one connection attempt has been made, the number of used retries will be reset as soon as a file transfer is completed successfully.

Note!

This number does not include the original connection attempt.
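
The timing rules for Collection Retries can be illustrated with a small calculation. The following is a minimal sketch only, not the agent's implementation: the class and method names are hypothetical, and the values used in main are assumed sample settings.

```java
import java.util.Arrays;

// Hypothetical helper that only illustrates the documented timing rules.
final class RetrySchedule {

    // Returns the wait, in seconds, before each reconnect attempt.
    // The original connection attempt is not counted in maxRetries.
    static long[] delays(long timeoutSeconds, long retryIntervalSeconds, int maxRetries) {
        long[] waits = new long[Math.max(0, maxRetries)];
        for (int i = 0; i < waits.length; i++) {
            // First retry: Timeout + Retry Interval (s); later retries: Retry Interval (s) only.
            waits[i] = (i == 0 ? timeoutSeconds : 0) + retryIntervalSeconds;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Assumed sample settings: Timeout = 30 s, Retry Interval = 10 s, Max Retries = 3.
        System.out.println(Arrays.toString(delays(30, 10, 3)));
    }
}
```

With these assumed settings, the first retry happens after 40 seconds and the two remaining retries after 10 seconds each, giving at most four connection attempts in total.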

RESTART Retries

Select this check box to enable the agent to send a RESTART command if the connection has been broken during a file transfer. The RESTART command contains information about where in the file you want to resume the file transfer.

Before selecting this option, ensure that the FTP server supports the RESTART command.

When this option is selected, the agent will attempt to re-establish the connection, and resume the file transfer from the point in the file stated in the RESTART command, as many times as is entered in the Max RESTARTS field described below. When a connection has been re-established, a RESTART command will be sent after the number of seconds entered in the Retry RESTARTS Interval (s) field described below.

Note!

The RESTART Retries settings will not work if you have selected to decompress the files in the Source tab, see the section below, Source Tab.

Note!

RESTART is not always supported for transfer type ASCII.

For further information about the RESTART command, see http://www.w3.org/Protocols/rfc959/.

Retry RESTARTS Interval (s)

Enter the time interval, in seconds, you want to wait before initiating a restart in this field. This time interval will be applied for all restart retries.

If a connection problem occurs, the actual time interval before the first attempt to send a RESTART command will be the time set in the Timeout field in the Advanced tab plus the time set in the Retry Interval (s) field. For the remaining attempts, the actual time interval will be the number of seconds entered in this field.

Max RESTARTS

Enter the maximum number of restarts per file you want to allow.

If more than one attempt to send the RESTART command has been made, the number of used retries will be reset as soon as a file transfer is completed successfully.

Source Tab

The Source tab contains configurations related to the remote host, source directories and source files. The following text describes the configuration options available when no custom strategy has been chosen.


Setting | Description

Collection Strategy

If more than one collection strategy is available in the system, a Collection Strategy drop-down list will also be visible. For further information about the nature of the collection strategy, see Collection Strategies (3.0).

Directory

Absolute pathname of the source directory on the remote host, where the source files reside. If the FTP server is of Unix type, the pathname may also be given relative to the home directory of the Username account.

Include Subfolders

Select this check box if you have subfolders in the source directory from which you want files to be collected.

Note!

Subfolders that are in the form of a link are not supported.

If you select Enable Sort Order in the Sort Order tab, the sort order selected will also apply to subfolders.

Filename

Name of the source files on the remote host. Regular expressions according to Java syntax apply. For further information, see http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html.


Example

To match all file names beginning with TTFILE, type: TTFILE.*
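
The effect of such an expression can be tried out with the standard java.util.regex classes. This is a minimal sketch that assumes the agent requires the expression to match the whole file name; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for trying out a Filename expression.
final class FilenameFilter {

    // True if the whole file name matches the regular expression.
    // Assumption: the agent matches the complete name, not just a part of it.
    static boolean isCollected(String filenameRegex, String fileName) {
        return Pattern.matches(filenameRegex, fileName);
    }

    public static void main(String[] args) {
        System.out.println(isCollected("TTFILE.*", "TTFILE_0001.dat"));
        System.out.println(isCollected("TTFILE.*", "OLD_TTFILE_0001.dat"));
    }
}
```

Under that assumption, TTFILE_0001.dat is accepted, while OLD_TTFILE_0001.dat is rejected because the name does not begin with TTFILE.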

Note!

When collecting files from VAX file systems, the names of the source files include both path and filename, which has to be considered when entering the regular expression.

Compression

Compression type of the source files. Determines if the agent will decompress the files before passing them on in the workflow.

  • No Compression - the agent will not decompress the files.

  • Gzip - the agent will decompress the files using gzip.
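
To illustrate what gzip decompression of a collected file amounts to, the following sketch uses the standard java.util.zip classes. It is not the agent's code: the class and method names are hypothetical, and the gzip helper exists only so that the example can produce its own input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration of the Gzip option.
final class GzipSketch {

    // Decompresses gzip data, as would be done before the content enters the workflow.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compresses data; used here only to create sample input.
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "sample record".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```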

Move to Temporary Directory

If enabled, the source files will be moved to the automatically created subdirectory DR_TMP_DIR in the source directory, before collection. This option supports safe collection when source files repeatedly use the same name.

Append Suffix to Filename

Enter the suffix that you want added to the file name prior to collecting it.

Important!

Before you execute your workflow, make sure that none of the file names in the collection directory include this suffix.

Inactive Source Warning (h)

If enabled, when the configured number of hours has passed without any file being available for collection, a warning message (event) will appear in the System Log and Event Area:

The source has been idle for more than <n> hours, the last
inserted file is <file>.

Move to

If enabled, the source files will be moved from the source directory (or from the directory DR_TMP_DIR if using Move to Temporary Directory), to the directory specified in the Destination field, after collection.

Note!

The Destination directory has to be located in the same file system as the collected files at the remote host. Also, absolute pathnames must be defined (relative pathnames cannot be used).

If a file with the same filename, but with a different content, already exists in the target directory, the workflow will abort.

If a file with the same file name, AND the same content, already exists in the target directory, this file will be overwritten and the workflow will not abort.

Rename

If enabled, the source files will be renamed after the collection, and remain (or be moved back from the directory DR_TMP_DIR if using Move to Temporary Directory) in the source directory from which they were collected.

Note!

When VAX/VMS is selected as File System Type, some issues must be considered. If a file is renamed after collection on a VAX/VMS system, the filename might become too long. In that case the following rules will apply:

A VAX/VMS filename consists of <file name>.<extension>;<version>, where the maximum number of characters for each part is:

  • <file name>: 39 characters

  • <extension>: 39 characters

  • <version>: 5 characters

If the new filename turns out to be longer than 39 characters, the agent will move part of the filename to the extension part. If the total sum of the filename and extension part exceeds 78 characters, the last characters are truncated from the extension.

  An example:

  A_VERY_LONG_FILENAME_WITH_MORE_THAN_39_CHARACTERS.DAT;5

  will be converted to:

  A_VERY_LONG_FILENAME_WITH_MORE_THAN_39_.CHARACTERSDAT;5
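
The VAX/VMS length rules above can be sketched as follows. This is an illustration of the described rules only, under the assumption that the overflow from the file name part is placed in front of the existing extension, as the example suggests; the class, method and constant names are hypothetical.

```java
// Hypothetical illustration of the VAX/VMS filename length rules.
final class VmsNameSketch {

    private static final int MAX_PART = 39;   // limit for <file name> and for <extension>
    private static final int MAX_TOTAL = 78;  // limit for <file name> plus <extension>

    static String fit(String name, String extension, String version) {
        if (name.length() > MAX_PART) {
            // Move the overflowing part of the name in front of the extension.
            extension = name.substring(MAX_PART) + extension;
            name = name.substring(0, MAX_PART);
        }
        if (name.length() + extension.length() > MAX_TOTAL) {
            // Truncate the last characters of the extension.
            extension = extension.substring(0, MAX_TOTAL - name.length());
        }
        return name + "." + extension + ";" + version;
    }

    public static void main(String[] args) {
        System.out.println(fit("A_VERY_LONG_FILENAME_WITH_MORE_THAN_39_CHARACTERS", "DAT", "5"));
    }
}
```

For the name in the example, the first 39 characters stay in the file name part and CHARACTERS is moved in front of DAT in the extension.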

Note!

Creating a new file on the FTP server with the same file name as the original file, but with another content, will cause the workflow to abort.

Creating a new file with the same file name AND the same content as the original file, will cause the file to be overwritten.

Remove

If enabled, the source files will be removed from the source directory (or from the directory DR_TMP_DIR, if using Move to Temporary Directory), after the collection.

Ignore

If enabled, the source files will remain in the source directory after the collection. This field is not available if Move to Temporary Directory is enabled.

Destination

Full pathname of the directory on the remote host into which the source files will be moved after the collection. This field is only available if Move to is enabled.

Prefix and Suffix

Prefix and/or suffix that will be added to the beginning and the end of the name of the source files, respectively, after the collection. These fields are only available if Move to or Rename is enabled.

Warning!

If Rename is enabled, the source files will be renamed in the current (source or DR_TMP_DIR) directory. Be sure not to assign a Prefix or Suffix that gives files new names that still match the Filename regular expression. That will cause the files to be collected over and over again.
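
A planned Prefix or Suffix can be checked against the Filename expression before the workflow is run. The sketch below assumes whole-name matching with Java regular expressions; the class and method names, and the sample file names, are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical check for the re-collection risk described in the warning.
final class RenameCheck {

    // True if the renamed file would still match the Filename expression
    // and could therefore be collected again.
    static boolean wouldBeRecollected(String filenameRegex, String prefix, String name, String suffix) {
        return Pattern.matches(filenameRegex, prefix + name + suffix);
    }

    public static void main(String[] args) {
        // A suffix keeps the TTFILE start of the name, so TTFILE.* still matches.
        System.out.println(wouldBeRecollected("TTFILE.*", "", "TTFILE01", ".done"));
        // A prefix changes the start of the name, so TTFILE.* no longer matches.
        System.out.println(wouldBeRecollected("TTFILE.*", "DONE_", "TTFILE01", ""));
    }
}
```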

Search and Replace

Select either the Move to or the Rename option to enable Search and Replace.

  • Search: Enter the part of the filename that you want to replace.

  • Replace: Enter the replacement text.

Search and Replace operate on your entries in a way that is similar to the Unix sed utility. The identified filenames are modified and forwarded to the following agent in the workflow.

This functionality enables you to perform advanced filename modifications, as well:

  • Use a regular expression in the Search entry to specify the part of the filename that you want to extract.

    Note!

    A regular expression that fails to match the original file name will abort the workflow.

  • In the Replace entry, enter characters and metacharacters that define the pattern and content of the replacement text.


Search and Replace Examples

To rename the file file1.new to file1.old, use:


  • Search: .new
  • Replace: .old


To rename the file JAN2011_file to file_DONE, use:

  • Search: ([A-Z]*[0-9]*)_([a-z]*)
  • Replace: $2_DONE


Note that the search value divides the file name into two parts by using parentheses. The replace value applies to the second part by using the place holder $2.
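
The two examples can be reproduced with the standard java.util.regex classes. This is a minimal sketch and not the agent's implementation: it assumes that only the first match is replaced, it models the workflow abort as an exception, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of Search and Replace using Java regular expressions.
final class SearchAndReplace {

    static String rename(String fileName, String search, String replace) {
        Matcher matcher = Pattern.compile(search).matcher(fileName);
        if (!matcher.find()) {
            // The documentation states that a failed match aborts the workflow;
            // an exception stands in for that here.
            throw new IllegalArgumentException("Search expression does not match: " + fileName);
        }
        // $1, $2, ... in the replacement refer to the parenthesized groups.
        return matcher.replaceFirst(replace);
    }

    public static void main(String[] args) {
        System.out.println(rename("file1.new", ".new", ".old"));
        System.out.println(rename("JAN2011_file", "([A-Z]*[0-9]*)_([a-z]*)", "$2_DONE"));
    }
}
```

Note that in the first example the dot in .new is a regular expression metacharacter that matches any character. Writing \.new restricts the match to a literal dot.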

Keep (days)

Number of days to keep moved or renamed source files on the remote host after the collection. In order to delete the source files, the workflow has to be executed (scheduled or manually) again, after the configured number of days.

Note!

A date tag is added to the filename, determining when the file may be removed. This field is only available if Move to or Rename is enabled.

Advanced Tab

The Advanced tab contains configurations related to the use of the FTP service.

For example, if the FTP server in use does not return the file listing in a well-defined format, the Disable File Detail Parsing option can be useful. For further information, see the setting description below.


Setting | Description

Command Port

The value in this field defines which port number the FTP service will use on the remote host.

Timeout (s)

The maximum time, in seconds, to wait for a response from the server. 0 (zero) means to wait forever.

Passive Mode (PASV)

Must be enabled if FTP passive mode is used for data connection.

In passive mode, the channel for data transfer between client and server is initiated by the client instead of by the server. This is useful when a firewall is situated between the client and the server.

Disable File Detail Parsing

Disables parsing of file detail information received from the FTP server. This enhances the compatibility with unusual FTP servers but disables some functionality.

If file detail parsing is disabled, file modification timestamps will not be available to the collector. The collector also cannot distinguish between directories and regular files, so subdirectories in the input directory must not match the Filename regular expression. The agent assumes that a file named DR_TMP_DIR is a directory, because a directory with that name is used when Move to Temporary Directory in the Source tab is enabled. Therefore, a regular file in the collection directory must not be named DR_TMP_DIR.

Note!

When collecting files from a VAX file system, this option has to be enabled.

Additional Hosts

List of additional host names or IP addresses that may be used to access the source directory from which the source files are collected. These hosts are tried, in sequence from top to bottom, if the agent fails to connect to the remote host set in the Connection tab.

Use the Add, Edit, Remove, Move up and Move down buttons to configure the order of the hosts in the list.
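
The order in which hosts are tried can be sketched as follows. This only illustrates the documented sequence (the Host from the Connection tab first, then the Additional Hosts from top to bottom). The class and method names are hypothetical, and the canConnect predicate stands in for a real FTP connection attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical illustration of the host failover order.
final class HostFailover {

    // Returns the first host that accepts a connection, or empty if none does.
    static Optional<String> firstReachable(String primaryHost,
                                           List<String> additionalHosts,
                                           Predicate<String> canConnect) {
        List<String> candidates = new ArrayList<>();
        candidates.add(primaryHost);          // Host in the Connection tab
        candidates.addAll(additionalHosts);   // Additional Hosts, top to bottom
        return candidates.stream().filter(canConnect).findFirst();
    }

    public static void main(String[] args) {
        // Assumed sample host names; only "ftp-backup-2" is reachable in this simulation.
        Optional<String> host = firstReachable(
                "ftp-primary",
                List.of("ftp-backup-1", "ftp-backup-2"),
                name -> name.equals("ftp-backup-2"));
        System.out.println(host.orElse("no host reachable"));
    }
}
```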