Connect to a remote, kerberized hadoop cluster

To use a remote hadoop cluster with kerberos authentication you will need to get a proper krb5.conf file (eg from your remote cluster /etc/kerb5.conf) and place the file /etc/krb5.conf on your client OSX machine. To use this configurations from your osx hadoop client change your .[z]profile to: export HADOOP_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf" export YARN_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf" With java 1.7 this should be sufficient to detect the default realm, the kdc and also any specific authentication options used by your site. Please make sure the kerberos configuration is already in place when you obtain your ticket with ...

December 2, 2014 · Dirk Duellmann

Basic set-up of hadoop on OSX yosemite

Why would I do this? An OSX laptop will not allow to do any larger scale data processing, but it may be convenient place to develop/debug hadoop scripts before running on a real cluster. For this you likely want to have a local hadoop “cluster” to play with, and use the local commands as client for an larger remote hadoop cluster. This post covers the local install and basic testing. A second post shows how to extend the setup for accessing /processing against a remote kerberized cluster. ...

November 18, 2014 · Dirk Duellmann