Because several different distributions of Hadoop are available, you may have to create your own mzp package containing the Hadoop jar files for your distribution, and commit this package into your system before you can start using the HDFS agents. This is required when you are using a distribution other than the one that is available at hadoop.apache.org. The included mzp package has been tested with Apache Hadoop version 2.7.3.

...

  1. Copy the set of jar files for the Hadoop version you want to use to the machine that the system is running on.

    The set of jar files comprises hadoop-auth, hadoop-common, hadoop-hdfs, commons-collections, and jets. If any of these files does not exist, or does not work, contact support.

    Depending on the file structure, the files may be located in different folders, but typically they will be located in a folder called hadoop, or hadoop-common, where the hadoop-common.jar file is placed in the root directory, and the rest of the jar files are placed in a subdirectory called /lib.
     

  2. Set a variable called FILES that lists all the jar files to include in the package.

    Example

    This example shows how this is done for the Cloudera Distribution of Hadoop 4.

    FILES="-exported 3.1.0 file=hadoop-auth-3.1.0.jar \
    -exported 3.1.0 file=hadoop-common-3.1.0.jar \
    -exported 3.1.0 file=hadoop-hdfs-3.1.0.jar \
    -exported 3.1.0 file=hadoop-aws-3.1.0.jar \
    -exported 3.1.0 file=hadoop-annotations-3.1.0.jar \
    file=hadoop-hdfs-client-3.1.0.jar \
    file=stax2-api-3.1.4.jar \
    file=commons-collections-3.2.2.jar \
    file=htrace-core4-4.1.0-incubating.jar \
    file=woodstox-core-5.0.3.jar \
    file=commons-configuration2-2.1.1.jar \
    file=httpclient-4.5.2.jar \
    file=commons-logging-1.1.3.jar \
    file=protobuf-java-2.5.0.jar \
    file=guava-11.0.2.jar \
    file=re2j-1.1.jar \
    file=aws-java-sdk-bundle-1.11.271.jar"



    Note

    These files are version specific, which means that the list in the example will not work for other versions of Hadoop.


  3. Create the mzp package:

    mzsh pcreate "Apache Hadoop" "<distribution>" \
    apache_hadoop_cdh4.mzp -level platform -osgi true $FILES


    Note

    It is important that the package is called exactly "Apache Hadoop".


    Example - Creating the mzp package

    This example shows what this could look like for the Cloudera Distribution of Hadoop 4.

    mzsh pcreate "Apache Hadoop" "CDH4.4" apache_hadoop_cdh4.mzp \
    -level platform $FILES



  4. Commit the new package:

    mzsh mzadmin/<password> pcommit apache_hadoop_<application>.mzp


  5. Restart the Platform and ECs:

    mzsh shutdown platform <ec> <ec>
    mzsh startup platform <ec> <ec> 


    Kerberos Use

    It is possible to use manually created Kerberos tickets by running the kinit command. The UserGroupInformation class can read them from the ticket cache. In this case, the tickets cannot be renewed automatically.

    We recommend that you let the system handle user logins and ticket renewal rather than creating tickets manually.
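
Steps 1 and 2 can be partly scripted. The following is a minimal sketch, not part of the product: the function names check_jars and build_files are hypothetical, and it assumes that all jar files have been copied into a single directory and that the package is created from that directory, so that bare file names are sufficient. It only generates plain file= entries. The -exported <version> prefixes shown in the example for step 2 still have to be added by hand.

```shell
#!/bin/sh
# Hypothetical helpers for steps 1 and 2; names and layout are assumptions.

# check_jars DIR NAME...
# Reports every NAME for which DIR holds no jar called NAME-<version>.jar.
# Returns 0 if all names were found, 1 otherwise.
check_jars() {
    dir=$1
    shift
    missing=0
    for name in "$@"; do
        found=0
        # Require a digit after the dash so that, for example,
        # hadoop-hdfs does not match hadoop-hdfs-client-3.1.0.jar.
        for f in "$dir"/"$name"-[0-9]*.jar; do
            if [ -e "$f" ]; then
                found=1
                break
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing jar: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# build_files DIR
# Prints one space-separated "file=<jar name>" entry per jar in DIR.
build_files() {
    dir=$1
    out=""
    for f in "$dir"/*.jar; do
        # Skip the unexpanded pattern when DIR contains no jars.
        [ -e "$f" ] || continue
        out="$out${out:+ }file=$(basename "$f")"
    done
    printf '%s\n' "$out"
}

# Example usage (the directory is a placeholder):
#   check_jars /path/to/jars hadoop-auth hadoop-common hadoop-hdfs commons-collections
#   FILES=$(build_files /path/to/jars)
```

Because the jar set is version specific, the names passed to check_jars should be adjusted to match the distribution in use.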