HDFS NFS startup error: "ERROR mount.MountdBase: Failed to start the TCP server... ChannelException: Failed to bind..."

vhmi4jdf — posted 2021-05-29 in Hadoop

Trying to use/start HDFS NFS following the docs (skipping the "stop the rpcbind service" step, and not starting the hadoop portmap service, since the OS is neither SLES 11 nor RHEL 6.2), but getting an error when trying to start the hdfs nfs3 service:

[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs status
Redirecting to /bin/systemctl status nfs.service
Unit nfs.service could not be found.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service nfs stop
Redirecting to /bin/systemctl stop nfs.service
Failed to stop nfs.service: Unit nfs.service not loaded.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-07-23 13:48:54 HST; 28s ago
  Process: 27337 ExecStart=/sbin/rpcbind -w $RPCBIND_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 27338 (rpcbind)
   CGroup: /system.slice/rpcbind.service
           └─27338 /sbin/rpcbind -w

Jul 23 13:48:54 HW02.ucera.local systemd[1]: Starting RPC bind service...
Jul 23 13:48:54 HW02.ucera.local systemd[1]: Started RPC bind service.
[root@HW02 ~]#
[root@HW02 ~]#
[root@HW02 ~]# hdfs nfs3
19/07/23 13:49:33 INFO nfs3.Nfs3Base: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting Nfs3
STARTUP_MSG:   host = HW02.ucera.local/172.18.4.47
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 3.1.1.3.1.0.0-78
STARTUP_MSG:   classpath = /usr/hdp/3.1.0.0-78/hadoop/conf:/usr/hdp/3.1.0.0-78/hadoop/lib/jersey-server-1.19.jar:/usr/hdp/3.1.0.0-78/hadoop/lib/ranger-hdfs-plugin-shim-1.2.0.3.1.0.0-78.jar:
...
<a bunch of other jars>
...
STARTUP_MSG:   build = git@github.com:hortonworks/hadoop.git -r e4f82af51faec922b4804d0232a637422ec29e64; compiled by 'jenkins' on 2018-12-06T12:26Z
STARTUP_MSG:   java = 1.8.0_112

************************************************************/

19/07/23 13:49:33 INFO nfs3.Nfs3Base: registered UNIX signal handlers for [TERM, HUP, INT]
19/07/23 13:49:33 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
19/07/23 13:49:33 INFO impl.MetricsSystemImpl: Nfs3 metrics system started
19/07/23 13:49:33 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:33 INFO security.ShellBasedIdMapping: Not doing static UID/GID mapping because '/etc/nfs.map' does not exist.
19/07/23 13:49:33 INFO nfs3.WriteManager: Stream timeout is 600000ms.
19/07/23 13:49:33 INFO nfs3.WriteManager: Maximum open streams is 256
19/07/23 13:49:33 INFO nfs3.OpenFileCtxCache: Maximum open streams is 256
19/07/23 13:49:34 INFO nfs3.DFSClientCache: Added export: / FileSystem URI: / with namenodeId: -1408097406
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Configured HDFS superuser is
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Delete current dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.RpcProgramNfs3: Create new dump directory /tmp/.hdfs-nfs
19/07/23 13:49:34 INFO nfs3.Nfs3Base: NFS server port set to: 2049
19/07/23 13:49:34 INFO oncrpc.RpcProgram: Will accept client connections from unprivileged ports
19/07/23 13:49:34 INFO mount.RpcProgramMountd: FS:hdfs adding export Path:/ with URI: hdfs://hw01.ucera.local:8020/
19/07/23 13:49:34 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
19/07/23 13:49:34 ERROR mount.MountdBase: Failed to start the TCP server.
org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
        at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
        at org.apache.hadoop.oncrpc.SimpleTcpServer.run(SimpleTcpServer.java:89)
        at org.apache.hadoop.mount.MountdBase.startTCPServer(MountdBase.java:83)
        at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:98)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:56)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:69)
        at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:79)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
...
...
19/07/23 13:49:34 INFO util.ExitUtil: Exiting with status 1: org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:4242
19/07/23 13:49:34 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at HW02.ucera.local/172.18.4.47

************************************************************/

Not sure how to interpret any of the errors seen here (and I have not manually installed any packages like nfs-utils; I assume Ambari installed everything needed when it initially set up the cluster).
Any debugging suggestions or solutions?

**Update:** Looking at the error more closely, I can see

Caused by: java.net.BindException: Address already in use

Checking what is already using that port, we see...

[root@HW02 ~]# netstat -ltnp | grep 4242
tcp        0      0 0.0.0.0:4242            0.0.0.0:*               LISTEN      98067/jsvc.exec

The jsvc.exec process appears to be related to running Java applications. Given that Hadoop runs on Java, I assume simply killing this process would be a bad idea. Should it not be on this port (since it interferes with the NFS gateway)? Not sure what to do here.
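As a side note, the PID/program pair can be pulled out of that netstat line mechanically. A minimal sketch, run here against a copy of the line above so it is self-contained (in practice you would pipe `netstat -ltnp | grep 4242` in instead):

```shell
# Sample line copied from the netstat output above
line='tcp        0      0 0.0.0.0:4242            0.0.0.0:*               LISTEN      98067/jsvc.exec'

# The last whitespace-separated field is "PID/program"; split it on "/"
pid=$(awk '{split($NF, a, "/"); print a[1]}' <<< "$line")
prog=$(awk '{split($NF, a, "/"); print a[2]}' <<< "$line")
echo "$pid $prog"   # → 98067 jsvc.exec
```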

Answer from 7kjnsjlb:

TL;DR: an NFS gateway service was already running (by default, apparently), and I believe it was blocking my hdfs nfs3 startup; the jsvc.exec process was (I assume) part of that already-running service.
What made me suspect this: when I shut down the cluster, that service stopped too, and it was using exactly the ports NFS needs. I confirmed it simply by following the verification steps in the docs and seeing that my output matched what was expected:

[root@HW02 ~]# rpcinfo -p hw02
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp   4242  mountd
    100005    2   udp   4242  mountd
    100005    3   udp   4242  mountd
    100005    1   tcp   4242  mountd
    100005    2   tcp   4242  mountd
    100005    3   tcp   4242  mountd
    100003    3   tcp   2049  nfs
[root@HW02 ~]# showmount -e hw02
Export list for hw02:
/ *
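That verification can also be scripted. A small sketch that checks the mountd and nfs registrations, run here against a sample of the rpcinfo output above so it runs anywhere (in practice you would use `rpcinfo_output=$(rpcinfo -p hw02)`):

```shell
# Sample of the rpcinfo output above; in practice: rpcinfo_output=$(rpcinfo -p hw02)
rpcinfo_output='100005    3   tcp   4242  mountd
100003    3   tcp   2049  nfs'

# The gateway looks healthy only if both mountd and nfs are registered
for svc in mountd nfs; do
    if grep -qw "$svc" <<< "$rpcinfo_output"; then
        echo "$svc registered"
    else
        echo "$svc MISSING"
    fi
done
```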

Another thing that told me the jsvc process was part of an already-running HDFS NFS service was checking the process info...

[root@HW02 ~]# ps -feww | grep jsvc
root      61106  59083  0 14:27 pts/2    00:00:00 grep --color=auto jsvc
root     163179      1  0 12:14 ?        00:00:00 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...
...
hdfs     163193 163179  0 12:14 ?        00:00:17 jsvc.exec -Dproc_nfs3 -outfile /var/log/hadoop/root/hadoop-hdfs-root-nfs3-HW02.ucera.local.out -errfile /var/log/hadoop/root/privileged-root-nfs3-HW02.ucera.local.err -pidfile /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid -nodetach -user hdfs -cp /usr/hdp/3.1.0.0-78/hadoop/conf:...

Seeing jsvc.exec -Dproc_nfs3 ... hinted that jsvc (apparently a tool for running Java applications as services on Linux) was being used to run the very nfs3 service I was trying to start.
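One way to confirm this programmatically is to compare the listening process's PID against the gateway's pidfile (the pidfile path comes from the ps output above; that the managed service writes its PID there is an assumption). A minimal sketch; a temporary file stands in for /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid so the comparison logic runs anywhere:

```shell
# Stand-in for /var/run/hadoop/root/hadoop-hdfs-root-nfs3.pid (assumption:
# the Ambari-managed service records its PID there, per the ps output above)
pidfile=$(mktemp)
echo 163179 > "$pidfile"

# In practice this PID would come from: netstat -ltnp | grep 4242
listener_pid=163179

if [ "$(cat "$pidfile")" -eq "$listener_pid" ]; then
    verdict="port is held by the managed NFS3 service"
else
    verdict="port is held by something else"
fi
echo "$verdict"
rm -f "$pidfile"
```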
For anyone else with this problem, note that I did not stop all the services the docs want you to stop (since I'm on CentOS 7):

[root@HW01 /]# service nfs status
Redirecting to /bin/systemctl status nfs.service
● nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[root@HW01 /]# service rpcbind status
Redirecting to /bin/systemctl status rpcbind.service
● rpcbind.service - RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-07-19 15:17:02 HST; 6 days ago
 Main PID: 2155 (rpcbind)
   CGroup: /system.slice/rpcbind.service
           └─2155 /sbin/rpcbind -w

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

Also note that I did not follow any of the configuration-file settings recommended in the docs (some of the properties the docs mention could not even be found in the Ambari-managed HDFS configs), so if anyone can explain why it still worked for me regardless, please do.

**Update:**

After talking with some people who have more experience with HDP (v3.1) than I do, the docs I linked for setting up NFS for HDFS may not be fully up to date (at least not when NFS is set up through the Ambari management UI, anyway)...
A cluster node can be made to act as an NFS gateway by checking it as an NFS node in the Ambari host-management UI:

The required configuration can then be set in the HDFS management UI like so...




That the HDFS NFS gateway is running can be confirmed under the Host > Summary > Components section in Ambari...
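For anyone who prefers the command line, the same start/stop operations should also be possible through Ambari's REST API (a hypothetical sketch, not from the original post; the Ambari URL, cluster name, and credentials below are placeholders):

```shell
# Placeholders: substitute your own Ambari server, cluster, and host names
AMBARI_URL="http://ambari-host:8080"
CLUSTER="mycluster"
HOST="hw02.ucera.local"

# Ambari exposes each service instance as a host component; the NFS gateway's
# component name is NFS_GATEWAY
ENDPOINT="$AMBARI_URL/api/v1/clusters/$CLUSTER/hosts/$HOST/host_components/NFS_GATEWAY"
echo "$ENDPOINT"

# To start the gateway (uncomment and use real credentials):
# curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
#   -d '{"HostRoles": {"state": "STARTED"}}' \
#   "$ENDPOINT"
```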
