There are two reasons for that:
- GUI programs do not inherit the environment variables of your terminal/shell – unless you start them from a terminal session with
$ open /Applications/RStudio.app
- $HADOOP_OPTS / $YARN_OPTS are not evaluated by other programs even if the variables are present in their execution environment.
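The first point follows from how process environments work: a child process only sees variables that its parent exported, and a GUI program started from the Finder or Dock is not a child of your shell. A minimal sketch of that inheritance rule, where the helper name child_sees and the variable DEMO_OPTS are hypothetical and chosen only for illustration:

```shell
# Hypothetical helper: print what a freshly started child process
# sees for the variable whose name is passed as $1.
child_sees() {
    sh -c "printf '%s\n' \"\${$1}\""
}

# DEMO_OPTS is an illustrative name, not a variable Hadoop reads.
unset DEMO_OPTS
DEMO_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf"

# Set but not exported: the child prints an empty line.
child_sees DEMO_OPTS

# Exported: the child now prints the value.
export DEMO_OPTS
child_sees DEMO_OPTS
```

A program launched with open from the same terminal session is in the second situation, which is why that workaround helps.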
The first problem is well covered by various blog posts. The main difficulty is finding the correct procedure for your OS X version, since Apple has changed the mechanism several times over the years:
- using a .plist file in ~
- using a setenv statement line
- using the launchctl setenv command (from Yosemite)
Finding out which variable your GUI program or plugin actually evaluates may take some experimentation or a look at the source. For Java-based plugins, the variable _JAVA_OPTIONS, which is always evaluated, may be a starting point. For the RHadoop packages the more specific HADOOP_OPTS is already sufficient, so on Yosemite:
$ launchctl setenv HADOOP_OPTS "-Djava.security.krb5.conf=/etc/krb5.conf" # prefix command with sudo in case you want the setting for all users
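To check whether the setting took effect, the value can be read back with launchctl getenv. The following is a small sketch rather than a required step; the helper name gui_env is hypothetical, and the fallback branch only exists so that the snippet also runs on systems without launchctl:

```shell
# Hypothetical helper: show the value launchd currently holds
# for the variable named in $1.
gui_env() {
    if command -v launchctl >/dev/null 2>&1; then
        # Real launchctl subcommand: prints the stored value.
        launchctl getenv "$1"
    else
        # Not an OS X system, nothing to query.
        echo "launchctl not found"
    fi
}

gui_env HADOOP_OPTS
```

Programs that are already running keep their old environment, so RStudio has to be restarted before it sees the new value.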
If you need the setting only inside R/RStudio, you can simply set the environment variable in your R scripts before initialising the RHadoop packages.
# wrapper script: hadoop --config ~/remote-hadoop-conf
hadoop.command <- "~/scripts/remote-hadoop"
Sys.setenv(HADOOP_OPTS = "-Djava.security.krb5.conf=/etc/krb5.conf")
Sys.setenv(HADOOP_CMD = hadoop.command)

# load hdfs plugin for R
library(rhdfs)
hdfs.init()

# print remote hdfs root directory
print(hdfs.ls("/"))