
Installing Hadoop on Mac OSX Yosemite Tutorial

 dzh1121 2015-04-04
Install HomeBrew
Install Hadoop
Configuring Hadoop
SSH Localhost
Running Hadoop
Download Examples
Good to know

Install HomeBrew

Found here: http://brew.sh/ or simply paste this inside the terminal:

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
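Once the installer finishes, a quick sanity check with the standard Homebrew commands confirms that brew is on your PATH and set up correctly:

$ brew --version
$ brew doctor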

Install Hadoop

$ brew install hadoop

Hadoop will be installed in the following directory
/usr/local/Cellar/hadoop
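To confirm the install and see which version Homebrew picked (this tutorial assumes 2.6.0; substitute whatever version you actually received in the paths below):

$ ls /usr/local/Cellar/hadoop
> 2.6.0
$ hadoop version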

Configuring Hadoop

Edit hadoop-env.sh

The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hadoop-env.sh
where 2.6.0 is the hadoop version.
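Since the exact version directory varies, a small convenience (a sketch, assuming only a single Hadoop version is installed so the glob matches one directory) is to jump straight into the configuration directory:

$ cd /usr/local/Cellar/hadoop/*/libexec/etc/hadoop
$ vi hadoop-env.sh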

Find the line with

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

and change it to

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

(Author's note: should =true be appended to these two options? I'm still using the original line; I haven't changed it.) The empty values are intentional: setting java.security.krb5.realm and java.security.krb5.kdc to empty strings is the usual OS X workaround for the "Unable to load realm info from SCDynamicStore" warning, so no =true is needed.

Edit core-site.xml

The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
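The hadoop.tmp.dir location above does not exist yet. Hadoop will normally create it on demand, but it does no harm to create it up front (matching the value configured above):

$ mkdir -p /usr/local/Cellar/hadoop/hdfs/tmp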

Edit mapred-site.xml

The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/mapred-site.xml. In a stock 2.6.0 layout, only the template mapred-site.xml.template may exist, so you might need to create mapred-site.xml from it first.
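If the file is missing, copy the template (assuming the stock 2.6.0 layout):

$ cd /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop
$ cp mapred-site.xml.template mapred-site.xml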

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Edit hdfs-site.xml

The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/Cellar/hadoop/2.6.0/hadoop_data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/Cellar/hadoop/2.6.0/hadoop_data/hdfs/datanode</value>
  </property>
</configuration>

Edit yarn-site.xml

$ sudo vi yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

Author's note (to be confirmed): to fix the datanode always connecting to the namenode at IP 0.0.0.0, the following properties can also be added inside the <configuration> element above:

  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8031</value>
  </property>



To simplify life, edit your ~/.profile using vim and add the following two aliases:

alias hstart="/usr/local/Cellar/hadoop/2.6.0/sbin/start-dfs.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/start-yarn.sh"
alias hstop="/usr/local/Cellar/hadoop/2.6.0/sbin/stop-yarn.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/stop-dfs.sh"

and execute

$ source ~/.profile

in the terminal to update.

Before we can run Hadoop, we first need to format HDFS. Note that re-running this command later wipes the existing HDFS metadata, so only format once:

$ hdfs namenode -format

SSH Localhost

Nothing needs to be done here if you have already generated ssh keys. To verify, just check for the existence of the ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub files. If not, the keys can be generated using

$ ssh-keygen -t rsa
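If you prefer to skip the passphrase prompts, the key can also be generated non-interactively with standard ssh-keygen flags (an empty passphrase is a convenience trade-off):

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa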

Enable Remote Login
“System Preferences” -> “Sharing”. Check “Remote Login”.
Authorize SSH Keys
To allow your system to accept logins, we have to make it aware of the keys that will be used:

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
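If ssh localhost still prompts for a password after this, the usual culprit is file permissions; sshd ignores an authorized_keys file that is group- or world-writable (standard OpenSSH behavior):

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys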

Let’s try to log in. (No password should be required now.)

$ ssh localhost
> Last login: Fri Mar  6 20:30:53 2015
$ exit

Change owner of folders

Create the namenode and datanode directories configured in hdfs-site.xml, and make sure your user owns the Hadoop tree (Tiger is the author's username; substitute your own):

$ mkdir -p /usr/local/Cellar/hadoop/2.6.0/hadoop_data/hdfs/namenode
$ mkdir -p /usr/local/Cellar/hadoop/2.6.0/hadoop_data/hdfs/datanode
$ sudo chown -R Tiger /usr/local/Cellar/hadoop
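A username-independent variant of the chown (a sketch using the standard whoami command):

$ sudo chown -R $(whoami) /usr/local/Cellar/hadoop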

Running Hadoop

Now we can run Hadoop just by typing

$ hstart

and stopping using

$ hstop
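After hstart, you can check that the daemons actually came up with jps, which ships with the JDK; on a healthy single-node setup the list should include NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager:

$ jps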

(After this, everything else worked, but the resourcemanager did not start.)

$ sbin/yarn-daemon.sh start resourcemanager    (this script is in the Hadoop directory)
starting resourcemanager, logging to /usr/local/Cellar/hadoop/2.6.0/libexec/logs/yarn-Tiger-resourcemanager-MacMini.out
nohup: can't detach from console: No such file or directory

(But that directory clearly does exist? For reference: on a successful start, the nohup line does not appear.)
The solution is:

Download Examples

To run examples, Hadoop needs to be started.

Hadoop Examples 1.2.1 (Old)
Hadoop Examples 2.6.0 (Current)

Test them out using:

$ hadoop jar <path to the hadoop-examples file> pi 10 100
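For example, with the examples jar that ships inside the Homebrew install (path assumed from the stock 2.6.0 layout; adjust if your version differs):

$ hadoop jar /usr/local/Cellar/hadoop/2.6.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 100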

Good to know

We can access the Hadoop web interfaces by connecting to:

NameNode (HDFS overview): http://localhost:50070
ResourceManager (YARN; replaces the old JobTracker): http://localhost:8088
Specific Node Information (NodeManager): http://localhost:8042

The NameNode interface is the one to use for browsing the HDFS filesystem and any resulting output files.
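HDFS can also be inspected from the command line with the standard hdfs dfs commands, for example:

$ hdfs dfs -ls /
$ hdfs dfs -mkdir -p /user/$(whoami)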

[Image: HDFS viewer]

