kubernetes - Error deploying an EKS node group with Terraform

3htmauhk · posted 2022-12-17 in Kubernetes

I'm running into a problem deploying an EKS node group with Terraform. The error looks like a plugin issue, but I don't know how to fix it.
If I look at EC2 in the AWS console (web), I can see the cluster's instances, but the cluster reports this error.
The error shown in my pipeline:
Error: error waiting for EKS Node Group (UNIR-API-REST-CLUSTER-DEV:node_sping_boot) creation: NodeCreationFailure: Instances failed to join the kubernetes cluster. Resource IDs: [...]
  on EKS.tf line 17, in resource "aws_eks_node_group" "nodes":
  17: resource "aws_eks_node_group" "nodes"
2020-06-01T00:03:50.576Z [DEBUG] plugin: plugin process exited: path=/home/ubuntu/.jenkins/workspace/.../.terraform/plugins/linux_amd64/terraform-provider-aws_v2.64.0_x4 pid=13475
2020-06-01T00:03:50.576Z [DEBUG] plugin: plugin exited
And this error is shown in the AWS console:
Link
This is the Terraform code I use to create the project:

EKS.tf, which creates the cluster and the node group:

resource "aws_eks_cluster" "CLUSTER" {
  name     = "UNIR-API-REST-CLUSTER-${var.SUFFIX}"
  role_arn = "${aws_iam_role.eks_cluster_role.arn}"
  vpc_config {
    subnet_ids = [
      "${aws_subnet.unir_subnet_cluster_1.id}","${aws_subnet.unir_subnet_cluster_2.id}"
    ]
  }
  depends_on = [
    "aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy",
    "aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy",
    "aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly",
  ]
}

resource "aws_eks_node_group" "nodes" {
  cluster_name    = "${aws_eks_cluster.CLUSTER.name}"
  node_group_name = "node_sping_boot"
  node_role_arn   = "${aws_iam_role.eks_nodes_role.arn}"
  subnet_ids      = [
      "${aws_subnet.unir_subnet_cluster_1.id}","${aws_subnet.unir_subnet_cluster_2.id}"
  ]
  scaling_config {
    desired_size = 1
    max_size     = 5
    min_size     = 1
  }
# instance_types defaults to t3.medium
# Ensure that IAM Role permissions are created before and deleted after EKS Node Group handling.
# Otherwise, EKS will not be able to properly delete EC2 Instances and Elastic Network Interfaces.
  depends_on = [
    "aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy",
    "aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy",
    "aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly",
  ]
}

output "eks_cluster_endpoint" {
  value = "${aws_eks_cluster.CLUSTER.endpoint}"
}

output "eks_cluster_certificat_authority" {
    value = "${aws_eks_cluster.CLUSTER.certificate_authority}"
}

SecurityAndGroups.tf, with the IAM roles and policy attachments:

resource "aws_iam_role" "eks_cluster_role" {
  name = "eks-cluster-${var.SUFFIX}"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role" "eks_nodes_role" {
  name = "eks-node-${var.SUFFIX}"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = "${aws_iam_role.eks_cluster_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKSServicePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"
  role       = "${aws_iam_role.eks_cluster_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = "${aws_iam_role.eks_nodes_role.name}"
}

VPCAndRouting.tf, which creates the routing, VPC and subnets:

resource "aws_vpc" "unir_shop_vpc_dev" {
  cidr_block = "${var.NET_CIDR_BLOCK}"
  enable_dns_hostnames = true
  enable_dns_support = true
  tags = {
    Name = "UNIR-VPC-SHOP-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}
resource "aws_route_table" "route" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.unir_gat_shop_dev.id}"
  }
  tags = {
    Name = "UNIR-RoutePublic-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}
resource "aws_subnet" "unir_subnet_aplications" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_APLICATIONS}"
  availability_zone = "${var.ZONE_SUB}"
  depends_on = ["aws_internet_gateway.unir_gat_shop_dev"]
  map_public_ip_on_launch = true
  tags = {
    Name = "UNIR-SUBNET-APLICATIONS-${var.SUFFIX}"
    Environment = "${var.SUFFIX}"
  }
}

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  availability_zone = "${var.ZONE_SUB_CLUSTER_2}"
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }
}

resource "aws_subnet" "unir_subnet_cluster_2" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_2}"
  availability_zone = "${var.ZONE_SUB_CLUSTER_1}"
  map_public_ip_on_launch = true
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }

}

resource "aws_internet_gateway" "unir_gat_shop_dev" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  tags = {
    Environment = "${var.SUFFIX}"
    Name = "UNIR-publicGateway-${var.SUFFIX}"
  }
}

My variables:

SUFFIX="DEV"
ZONE="eu-west-1"
TERRAFORM_USER_ID=
TERRAFORM_USER_PASS=
ZONE_SUB="eu-west-1b"
ZONE_SUB_CLUSTER_1="eu-west-1a"
ZONE_SUB_CLUSTER_2="eu-west-1c"
NET_CIDR_BLOCK="172.15.0.0/24"
SUBNET_CIDR_APLICATIONS="172.15.0.0/27"
SUBNET_CIDR_CLUSTER_1="172.15.0.32/27"
SUBNET_CIDR_CLUSTER_2="172.15.0.64/27"
SUBNET_CIDR_CLUSTER_3="172.15.0.128/27"
SUBNET_CIDR_CLUSTER_4="172.15.0.160/27"
SUBNET_CIDR_CLUSTER_5="172.15.0.192/27"
SUBNET_CIDR_CLUSTER_6="172.15.0.224/27"
MONGO_SSH_KEY=
KIBANA_SSH_KEY=
CLUSTER_SSH_KEY=

Let me know if you need more logs.

jc3wubiy 1#

According to the AWS documentation:
If you receive the error "Instances failed to join the kubernetes cluster" in the AWS Management Console, ensure that either the cluster's private endpoint access is enabled, or that you have correctly configured CIDR blocks for public endpoint access. For more information, see Amazon EKS cluster endpoint access control.
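
A minimal sketch of what that could look like on your existing cluster resource (the endpoint settings and the CIDR value below are only an illustration, not a recommendation for your environment):

resource "aws_eks_cluster" "CLUSTER" {
  name     = "UNIR-API-REST-CLUSTER-${var.SUFFIX}"
  role_arn = "${aws_iam_role.eks_cluster_role.arn}"
  vpc_config {
    subnet_ids = [
      "${aws_subnet.unir_subnet_cluster_1.id}", "${aws_subnet.unir_subnet_cluster_2.id}"
    ]
    # enable the private endpoint so worker nodes inside the VPC can reach the control plane
    endpoint_private_access = true
    # keep (or restrict) public endpoint access as appropriate
    endpoint_public_access  = true
    public_access_cidrs     = ["0.0.0.0/0"]
  }
}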
I also noticed that you are swapping the subnets' availability zones:

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  availability_zone = "${var.ZONE_SUB_CLUSTER_2}"

You assign var.ZONE_SUB_CLUSTER_2 to unir_subnet_cluster_1 and var.ZONE_SUB_CLUSTER_1 to unir_subnet_cluster_2. This may be the cause of the misconfiguration.
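
If that swap is unintentional, the fix is simply to pair each subnet with its own zone variable, for example (a sketch based on your existing resources):

resource "aws_subnet" "unir_subnet_cluster_1" {
  vpc_id                  = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block              = "${var.SUBNET_CIDR_CLUSTER_1}"
  map_public_ip_on_launch = true
  # subnet 1 now uses zone variable 1
  availability_zone       = "${var.ZONE_SUB_CLUSTER_1}"
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }
}

resource "aws_subnet" "unir_subnet_cluster_2" {
  vpc_id                  = "${aws_vpc.unir_shop_vpc_dev.id}"
  cidr_block              = "${var.SUBNET_CIDR_CLUSTER_2}"
  map_public_ip_on_launch = true
  # subnet 2 now uses zone variable 2
  availability_zone       = "${var.ZONE_SUB_CLUSTER_2}"
  tags = {
    "kubernetes.io/cluster/UNIR-API-REST-CLUSTER-${var.SUFFIX}" = "shared"
  }
}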

ngynwnxp 2#

I had the same issue. It was resolved by creating another NAT gateway, placing it in a public subnet, and adding a route to the new NAT gateway in the route table of my private subnets.
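
Roughly, that setup looks like the following (all resource names here are illustrative and assume a public and a private subnet already exist; adapt them to your own VPC):

resource "aws_eip" "nat" {
  vpc = true
}

resource "aws_nat_gateway" "nat" {
  allocation_id = "${aws_eip.nat.id}"
  # the NAT gateway itself must live in a public subnet
  subnet_id     = "${aws_subnet.public.id}"
}

resource "aws_route_table" "private" {
  vpc_id = "${aws_vpc.main.id}"

  # send all outbound traffic from the private subnets through the NAT gateway
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = "${aws_nat_gateway.nat.id}"
  }
}

resource "aws_route_table_association" "private" {
  subnet_id      = "${aws_subnet.private.id}"
  route_table_id = "${aws_route_table.private.id}"
}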

piztneat 3#

There is also a quick way to troubleshoot this kind of issue: the "AWSSupport-TroubleshootEKSWorkerNode" runbook. This runbook is designed to help troubleshoot EKS worker nodes that fail to join an EKS cluster. Go to AWS Systems Manager -> Automation -> select the runbook -> execute it with the cluster name and instance ID.
It is very helpful for troubleshooting and provides a good summary at the end of the execution.
You can also refer to the documentation here.

hl0ma9xz 4#

As described here under "NodeCreationFailure", this error can have two causes:

NodeCreationFailure: Your launched instances are unable to register with your Amazon EKS cluster. Common causes of this failure are insufficient node IAM role permissions or lack of outbound internet access for the nodes.

Your nodes must be able to reach the internet using a public IP address in order to function properly.
In my case, the cluster was in private subnets, and the error went away after adding a route to the NAT gateway.
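
In Terraform terms, that fix is a single extra route on the private subnets' route table, something like this (assuming a NAT gateway and a private route table already exist under these illustrative names):

resource "aws_route" "private_to_nat" {
  route_table_id         = "${aws_route_table.private.id}"
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = "${aws_nat_gateway.nat.id}"
}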
