The File System profile is used for making file system-specific configurations, and is currently used by the Amazon S3 collection and forwarding agents.

Configuration

To create a new File System profile, click the New Configuration button in the upper left part of the Desktop window, and then select File System Profile from the menu. The configuration varies depending on the selected file system, and each file system is described separately below.

Menus

The contents of the menus in the menu bar may change depending on which configuration type has been opened in the currently displayed tab. The File System profile uses the standard menu items and buttons that are visible for all configurations, and these are described in 2.1 Menus and Buttons.

...

Item | Description

External References

Select this menu item to enable the use of External References in the File System profile configuration. This can be used to configure the following fields:

Amazon S3 file systems

  • Access Key
  • Secret Key
  • Bucket
  • Region
  • Advanced Properties

HDFS file systems

  • Host
  • Port
  • Advanced Properties

For further information, see 8.11.4 Using External Reference in Agent Profile Fields and 8.11 External Reference Profile.

Amazon S3

When you select Amazon S3 as the file system, two tabs are displayed: General and Advanced.

...

File System profile - Amazon S3 - General tab

General Tab

The following settings are available in the General tab in the File System profile (see screenshot above):

Setting | Description

File System Type

Select which file system type this profile should be applied to. Currently, only Amazon S3 is available.

Credentials from Environment

Select this check box to pick up the credentials from the environment instead of entering them in this profile. If this check box is selected, the Access Key and Secret Key fields are disabled.

Access Key

Enter the access key for the user who owns the Amazon S3 account in this field.

Secret Key

Enter the secret key for the stated access key in this field.

Region from Environment

Select this check box to pick up the region from the environment instead of entering it in this profile. If this check box is selected, the Region field is disabled.

Region

Enter the name of the Amazon S3 region in this field.

Bucket

Enter the name of the Amazon S3 bucket in this field.

Use Amazon Profile

Select this check box if you already have an Amazon Profile set up. This disables the fields above and lets you use the credentials defined in your chosen Amazon Profile.

Advanced Tab

In the Advanced tab, you can configure properties for the Amazon S3 File System client. 

...

For information on how to configure the properties for the Amazon S3 File System client, please refer to https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl.

HDFS

When you select HDFS as the file system, two tabs are displayed: General and Advanced.

...

File System profile - HDFS General tab

The General Tab

In the General tab you can find the following settings:

Field | Description

File System Type

Select which file system type this profile should be applied to. Currently, only Amazon S3 is available.

Version

Select a version of Hadoop from the drop-down box:

  • Non-HA - This version of Hadoop does not support high availability as it has only one NameNode.
  • HA - This version of Hadoop supports high availability.

This setting only applies when you have selected Distributed File System as the File System Type.

Host

Enter the IP address or hostname of the NameNode in this field. See the Apache Hadoop Project documentation for further information about the NameNode.

Port

Enter the port number of the NameNode in this field.
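
As a rough illustration of how the Host and Port values relate to a NameNode address, the sketch below builds an hdfs:// URI from them. The function name namenode_uri and the sample host and port are hypothetical, and the way the agent itself combines these fields may differ.

```python
def namenode_uri(host: str, port: int) -> str:
    """Build an hdfs:// URI from the Host and Port values of the profile."""
    host = host.strip()
    if not host:
        raise ValueError("Host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"Port out of range: {port}")
    return f"hdfs://{host}:{port}"


if __name__ == "__main__":
    # Hypothetical NameNode address, for illustration only.
    print(namenode_uri("namenode.example.com", 8020))
```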

The Advanced Tab

The Advanced tab contains Advanced Properties for the configuration of Kerberos authentication.

...

Kerberos is an authentication technology that uses a trusted third party to authenticate one service or user to another. Within Kerberos, this trusted third party is commonly referred to as the Key Distribution Center, or KDC. For HDFS, this means that the HDFS agent authenticates with the KDC using a user principal, which must be predefined in the KDC. The HDFS cluster must be set up to use Kerberos, and the KDC must contain service principals for the HDFS NameNodes. For information on how to set up an HDFS cluster with Kerberos, see the Hadoop Users Guide at http://www.hadoop.apache.org.

...

Property | Description

hadoop.security.authentication

Set the value to kerberos to activate Kerberos authentication.

Note!

Due to limitations in the Apache Hadoop client libraries, if you change this property, you may need to restart the ECs where workflows containing the HDFS agent are going to run.


dfs.namenode.kerberos.principal

This sets the service principal to use for the HDFS NameNode. This must be predefined in the KDC. The service principal is expected to be in the form nn/<host>@<REALM>, where <host> is the host where the service is running and <REALM> is the name (in uppercase) of the Kerberos realm.

java.security.krb5.kdc

This specifies the hostname of the Key Distribution Center.

java.security.krb5.realm

This sets the name of the Kerberos realm. Uppercase only.

dr.kerberos.client.keytabfile

This sets the keytab file to use for authentication. A keytab must be predefined using Kerberos tools. The keytab must be generated for the user principal in dr.kerberos.client.principal. This file path must be on a file system that can be reached from the EC process. The user that launches the EC must also have read permissions for this file.

dr.kerberos.client.principal

This sets the user principal that the HDFS agent authenticates as. This must be predefined in the KDC. User principals are expected to be in the form <user>@<REALM>, where <user> is typically a username and <REALM> is the name (in uppercase) of the Kerberos realm.

sun.security.krb5.debug

Set this value to true to activate debug output for Kerberos.
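
The formats described for these properties can be sanity-checked before the values are entered in the profile. The following is a minimal sketch, not part of the product: the function name check_kerberos_properties and the regular expressions are assumptions that only encode the forms nn/<host>@<REALM> and <user>@<REALM> and the uppercase-realm rule stated above.

```python
import re
from typing import Dict, List

# Illustrative patterns for the documented forms (assumptions, not product rules).
SERVICE_PRINCIPAL = re.compile(r"^nn/[^/@\s]+@(?P<realm>[^@\s]+)$")
USER_PRINCIPAL = re.compile(r"^[^@\s]+@(?P<realm>[^@\s]+)$")


def check_kerberos_properties(props: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    if props.get("hadoop.security.authentication") != "kerberos":
        problems.append("hadoop.security.authentication is not set to kerberos")

    realm = props.get("java.security.krb5.realm")
    if realm is not None and realm != realm.upper():
        problems.append("java.security.krb5.realm must be uppercase")

    checks = (
        ("dfs.namenode.kerberos.principal", SERVICE_PRINCIPAL, "nn/<host>@<REALM>"),
        ("dr.kerberos.client.principal", USER_PRINCIPAL, "<user>@<REALM>"),
    )
    for key, pattern, form in checks:
        value = props.get(key)
        if value is None:
            continue  # Only check properties that are present.
        match = pattern.match(value)
        if match is None:
            problems.append(f"{key} is not in the form {form}")
        elif match.group("realm") != match.group("realm").upper():
            problems.append(f"{key} has a realm that is not uppercase")

    return problems


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    sample = {
        "hadoop.security.authentication": "kerberos",
        "dfs.namenode.kerberos.principal": "nn/namenode.example.com@EXAMPLE.COM",
        "dr.kerberos.client.principal": "mzadmin@example.com",
    }
    for problem in check_kerberos_properties(sample):
        print(problem)
```

The check only looks at the textual form of the values. It does not contact the KDC or verify that the principals or the keytab actually exist.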

...

  1. Create a properties file containing the advanced configurations.

    Example - Properties file with advanced configurations

    ADV_PROP=hadoop.security.authentication\=kerberos\n\
     java.security.krb5.kdc\=kdc.example.com\n\
     dr.kerberos.client.principal\=mzadmin@EXAMPLE.COM\n\
     dr.kerberos.client.keytabfile\=/home/mzadmin/keytabs/ex.keytab



    Note! All "=" characters in the property value need to be escaped with a backslash, as in the example above.


  2. Create an External Reference profile that points to the properties file and contains a key pair, e.g. "ADV_PROP" and "ADV_PROP".

  3. In the workflow containing the agent, open the Workflow Properties and select the Enable External Reference check box.

  4. Click the Browse button and select your External Reference profile, and for the HDFS - Advanced Properties field, select either Default or Per Workflow.

  5. In the workflow instance table, right-click and select the Enable External Reference option, and enter the key for the properties file, e.g. ADV_PROP, if that is what you used in step 2 above.
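
The escaping used in the properties file in step 1 is easy to get wrong by hand. The sketch below generates such an entry from a dictionary of advanced properties. It assumes the layout shown in the example in step 1 (each "=" escaped with a backslash, and properties separated by a literal \n followed by a line-continuation backslash). The function names are hypothetical and not part of the product.

```python
from typing import Dict


def escape_equals(text: str) -> str:
    # Prefix every "=" with a backslash, as required for the property value.
    return text.replace("=", "\\=")


def build_external_reference_entry(key: str, properties: Dict[str, str]) -> str:
    """Render the advanced properties as a single properties-file entry."""
    pairs = [escape_equals(f"{name}={value}") for name, value in properties.items()]
    # Literal backslash-n, a continuation backslash, a newline, and a one-space indent.
    separator = "\\n\\\n "
    return f"{key}=" + separator.join(pairs)


if __name__ == "__main__":
    # Values taken from the example in step 1.
    advanced = {
        "hadoop.security.authentication": "kerberos",
        "java.security.krb5.kdc": "kdc.example.com",
        "dr.kerberos.client.principal": "mzadmin@EXAMPLE.COM",
        "dr.kerberos.client.keytabfile": "/home/mzadmin/keytabs/ex.keytab",
    }
    print(build_external_reference_entry("ADV_PROP", advanced))
```

The key passed to the function (ADV_PROP here) is the same key that is entered in the External Reference profile in step 2 and in the workflow instance table in step 5.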

