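I have a 3-node setup running Marathon, a Mesos master, a Mesos slave, and ZooKeeper, with the HA configuration enabled. I tested deploying a simple hello application with mesos-execute and it worked as expected.
Everything looked fine at that point, so I connected to Marathon and deployed a simple application to test it: (echo "hello" >> /tmp/output.txt). However, the application gets stuck in the "Waiting" state.
What could be going wrong when deploying with Mesos resources?
Logs from the Mesos master: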
I0904 11:23:27.064332 19769 master.cpp:2813] Received SUBSCRIBE call for framework 'marathon' at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.064623 19769 master.cpp:2890] Subscribing framework marathon with checkpointing enabled and capabilities [ PARTITION_AWARE ]
I0904 11:23:27.064669 19769 master.cpp:6272] Updating info for framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.064697 19769 master.cpp:2994] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 failed over
I0904 11:23:27.065032 19770 hierarchical.cpp:342] Activated framework cb16118a-2257-4020-a907-63aa6294e11b-0000
I0904 11:23:27.065465 19770 master.cpp:7305] Sending 3 offers to framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:27.907865 19769 http.cpp:1115] HTTP GET for /files/read?_=1504517007920&jsonp=jQuery17109098185077823333_1504516979864&length=50000&offset=352538&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
I0904 11:23:28.916651 19768 http.cpp:1115] HTTP GET for /files/read?_=1504517008930&jsonp=jQuery17109098185077823333_1504516979865&length=50000&offset=353797&path=%2Fmaster%2Flog from 192.168.40.1:53525 with User-Agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'
E0904 11:23:30.071293 19775 process.cpp:2450] Failed to shutdown socket with fd 39, address 192.168.40.159:58072: Transport endpoint is not connected
I0904 11:23:30.073277 19768 master.cpp:1430] Framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 disconnected
I0904 11:23:30.073307 19768 master.cpp:3160] Deactivating framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073485 19768 master.cpp:3137] Disconnecting framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324
I0904 11:23:30.073496 19768 master.cpp:1445] Giving framework cb16118a-2257-4020-a907-63aa6294e11b-0000 (marathon) at scheduler-0340362b-0bb6-4fb8-8501-118d976e2cbd@192.168.40.156:36324 1weeks to failover
I0904 11:23:30.073519 19768 hierarchical.cpp:374] Deactivated framework cb16118a-2257-4020-a907-63aa6294e11b-0000
curl -XGET 'http://mesosphere2:8098/v2/queue?pretty' | jq
{
"queue": [
{
"count": 1,
"delay": {
"timeLeftSeconds": 0,
"overdue": true
},
"since": "2017-09-04T13:12:42.024Z",
"processedOffersSummary": {
"processedOffersCount": 12,
"unusedOffersCount": 12,
"lastUnusedOfferAt": "2017-09-04T13:14:52.554Z",
"rejectSummaryLastOffers": [
{
"reason": "UnfulfilledRole",
"declined": 3,
"processed": 3
},
{
"reason": "UnfulfilledConstraint",
"declined": 0,
"processed": 0
},
{
"reason": "NoCorrespondingReservationFound",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientCpus",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientMemory",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientDisk",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientGpus",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientPorts",
"declined": 0,
"processed": 0
}
],
"rejectSummaryLaunchAttempt": [
{
"reason": "UnfulfilledRole",
"declined": 12,
"processed": 12
},
{
"reason": "UnfulfilledConstraint",
"declined": 0,
"processed": 0
},
{
"reason": "NoCorrespondingReservationFound",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientCpus",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientMemory",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientDisk",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientGpus",
"declined": 0,
"processed": 0
},
{
"reason": "InsufficientPorts",
"declined": 0,
"processed": 0
}
]
},
"app": {
"id": "/test03",
"acceptedResourceRoles": [
"slave_public"
],
"backoffFactor": 1.15,
"backoffSeconds": 1,
"container": {
"type": "DOCKER",
"docker": {
"forcePullImage": false,
"image": "laghao/hello-marathon",
"network": "BRIDGE",
"parameters": [],
"portMappings": [
{
"containerPort": 80,
"hostPort": 80,
"labels": {},
"protocol": "tcp",
"servicePort": 10003
}
],
"privileged": false
},
"volumes": []
},
"cpus": 0.1,
"disk": 0,
"executor": "",
"instances": 1,
"labels": {},
"maxLaunchDelaySeconds": 3600,
"mem": 64,
"gpus": 0,
"portDefinitions": [
{
"port": 10003,
"name": "default",
"protocol": "tcp"
}
],
"requirePorts": false,
"upgradeStrategy": {
"maximumOverCapacity": 1,
"minimumHealthCapacity": 1
},
"version": "2017-09-04T13:12:41.993Z",
"versionInfo": {
"lastScalingAt": "2017-09-04T13:12:41.993Z",
"lastConfigChangeAt": "2017-09-04T13:12:41.993Z"
},
"killSelection": "YOUNGEST_FIRST",
"unreachableStrategy": {
"inactiveAfterSeconds": 300,
"expungeAfterSeconds": 600
}
}
}
]
}
1 Answer
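From the docs:
An app does not leave "Waiting"
This means that Marathon does not receive "Resource Offers" from Mesos that allow it to start tasks of this application. The simplest failure is that there are not sufficient resources available in the cluster, or that another framework hoards all of these resources. You can check the Mesos UI for available resources. Note that the required resources (such as CPU, Mem, Disk) must all be available on a single host.
If you do not find the solution yourself and you create a GitHub issue, please append the output of the Mesos /state endpoint to the bug report so that we can inspect the available cluster resources.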
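The single-host requirement quoted above can be illustrated with a minimal Python sketch. The agent entries below are made-up sample data, and the field names (`hostname`, `resources`, `used_resources`) are an assumption about how per-agent totals and usage might be represented; adapt them to whatever your cluster's state output actually contains.

```python
from typing import Dict, List


def hosts_that_fit(agents: List[dict], required: Dict[str, float]) -> List[str]:
    """Return hostnames whose free resources cover every requirement at once."""
    fitting = []
    for agent in agents:
        total = agent.get("resources", {})
        used = agent.get("used_resources", {})
        # Free amount per required resource on this one host.
        free = {name: total.get(name, 0) - used.get(name, 0) for name in required}
        if all(free[name] >= amount for name, amount in required.items()):
            fitting.append(agent["hostname"])
    return fitting


# Hypothetical agents, for illustration only.
agents = [
    {"hostname": "agent-a",
     "resources": {"cpus": 2, "mem": 1024, "disk": 5000},
     "used_resources": {"cpus": 2, "mem": 100, "disk": 0}},
    {"hostname": "agent-b",
     "resources": {"cpus": 1, "mem": 32, "disk": 5000},
     "used_resources": {"cpus": 0, "mem": 0, "disk": 0}},
    {"hostname": "agent-c",
     "resources": {"cpus": 1, "mem": 512, "disk": 5000},
     "used_resources": {"cpus": 0.5, "mem": 128, "disk": 0}},
]

# Requirements taken from the app definition in the question.
required = {"cpus": 0.1, "mem": 64, "disk": 0}

print(hosts_that_fit(agents, required))
```

With only agent-a and agent-b, the cluster as a whole has enough CPU and memory, but no single host has both, so the list would be empty. This sketch ignores roles and ports, which Marathon also takes into account.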
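In your case, the problem is a mismatch between the role the app requires and the role of the agents. You can see this from the UnfulfilledRole reject reason in the queue output. Marathon 1.4 introduced information about stalled deployments: you can query /v2/queue to get statistics on declined offers.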
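As a rough sketch of how the queue statistics can be read, the Python below pulls out, for each queued app, the roles it accepts and the reject reasons with a non-zero declined count. The sample is a trimmed copy of the /v2/queue output in the question; the function name `declined_reasons` is only illustrative. A likely reading of the result is that none of the offers carried a role listed in `acceptedResourceRoles`, but verify that against the roles your agents actually advertise.

```python
import json


def declined_reasons(queue_response: dict) -> dict:
    """Map app id -> accepted roles and the reject reasons with declined > 0."""
    result = {}
    for item in queue_response.get("queue", []):
        summary = item.get("processedOffersSummary", {})
        declined = {
            entry["reason"]: entry["declined"]
            for entry in summary.get("rejectSummaryLaunchAttempt", [])
            if entry["declined"] > 0
        }
        app = item["app"]
        result[app["id"]] = {
            "acceptedResourceRoles": app.get("acceptedResourceRoles", []),
            "declined": declined,
        }
    return result


# Trimmed from the /v2/queue output shown in the question.
sample = json.loads("""
{
  "queue": [
    {
      "processedOffersSummary": {
        "processedOffersCount": 12,
        "unusedOffersCount": 12,
        "rejectSummaryLaunchAttempt": [
          {"reason": "UnfulfilledRole", "declined": 12, "processed": 12},
          {"reason": "InsufficientCpus", "declined": 0, "processed": 0}
        ]
      },
      "app": {"id": "/test03", "acceptedResourceRoles": ["slave_public"]}
    }
  ]
}
""")

report = declined_reasons(sample)
print(json.dumps(report, indent=2))
```

Here all 12 processed offers were declined for UnfulfilledRole while the app accepts only the slave_public role, which matches the diagnosis above.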