我正在尝试运行Spring Boot YARN示例(在Windows上为https://spring.io/guides/gs/yarn- basic/)。在application.yml我更改fsUri并resourceManagerHost指向我的VM的主机192.168...。但是,当我尝试运行应用程序时,Exceprion出现了:
application.yml
fsUri
resourceManagerHost
192.168...
DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1508) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1284) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449) [2017-05-27 19:59:49.570] boot - 7728 INFO [Thread-5] --- DFSClient: Abandoning BP-646365587-10.0.2.15-1495898351938:blk_1073741830_1006 [2017-05-27 19:59:49.602] boot - 7728 INFO [Thread-5] --- DFSClient: Excluding datanode DatanodeInfoWithStorage[10.0.2.15:50010,DS-f909ec7a-8374-4cdd-9cfc-0e778810d98c,DISK] [2017-05-27 19:59:49.647] boot - 7728 WARN [Thread-5] --- DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /app/gs-yarn-basic/gs-yarn-basic-container-0.1.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
这意味着无法从我的主机访问DataNode。因此,我将其添加到hdfs-site.xml中
<property> <name>dfs.client.use.datanode.hostname</name> <value>true</value> <description>Whether clients should use datanode hostnames when connecting to datanodes. </description> </property>
但是它仍然抛出该异常。
我的VM上正在运行Hadoop 2.8.0。这是conf。文件:
core-site.xml
<configuration> <property> <name>fs.defaultFS</name> <value>hdfs://0.0.0.0:9000</value> </property> </configuration>
hdfs-site.xml
<configuration> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>/usr/local/hadoop/hadoop-2.8.0/data/namenode</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>/usr/local/hadoop/hadoop-2.8.0/data/datanode</value> </property> <property> <name>dfs.permissions.enabled</name> <value>false</value> </property> <property> <name>dfs.client.use.datanode.hostname</name> <value>true</value> <description>Whether clients should use datanode hostnames when connecting to datanodes. </description> </property> </configuration>
mapred-site.xml
<configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> </configuration>
yarn-site.xml
<configuration> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> <value>org.apache.hadoop.mapred.ShuffleHandler</value> </property> <property> <name>yarn.scheduler.maximum-allocation-mb</name> <value>8192</value> </property> <property> <name>yarn.nodemanager.resource.memory-mb</name> <value>8192</value> </property> <property> <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per- disk-percentage</name> <value>99</value> </property> </configuration>
您core- site.xml应该指向Namenode地址,但当前指向的0.0.0.0是本地计算机上的所有地址。这将导致模棱两可的结果,因为每台机器都应视为Namenode。
core- site.xml
Namenode
0.0.0.0
Namenode 只能是hadoop集群中的一个。
更换0.0.0.0用Namenode的ip还是hostname应该可以解决你所面临的问题。
ip
hostname