AI/ML

    How to Deploy Kimi K2 on AWS ECS or EKS (Kubernetes) - Full Step by Step Guide


    Option 1: Deploy Kimi K2 on AWS EKS (Elastic Kubernetes Service)

    Prerequisites:

    • AWS account

    • IAM role with EKS and EC2 permissions

    • GPU-enabled EC2 instance types (e.g., p3.2xlarge; note that Kimi K2 is a very large model, so production-scale serving typically requires larger multi-GPU instances)

    • Kubectl, eksctl, Helm installed

    • Docker & Git installed locally

    Step 1: Create an EKS Cluster with GPU Nodes

    eksctl create cluster \
      --name kimi-k2-cluster \
      --region us-east-1 \
      --nodegroup-name gpu-nodes \
      --node-type p3.2xlarge \
      --nodes 2 \
      --nodes-min 1 \
      --nodes-max 3 \
      --managed
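    Once the cluster is up, it is worth confirming that the GPU nodes joined and advertise the nvidia.com/gpu resource (the EKS-optimized accelerated AMI ships the NVIDIA device plugin; if the resource does not appear, the plugin may need to be installed separately). A quick check:

    ```shell
    # Confirm the nodes registered with the cluster
    kubectl get nodes

    # Print each node's allocatable GPU count (dots in the key must be escaped in jsonpath)
    kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
    ```

    If the second command prints an empty GPU column, the Deployment in Step 3 will stay Pending, since it requests nvidia.com/gpu: 1.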

     

    Step 2: Build and Push Docker Image for Kimi K2

    git clone https://huggingface.co/moonshotai/Kimi-K2-Instruct
    cd Kimi-K2-Instruct

     

    Dockerfile:

    FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
    RUN apt update && apt install -y python3 python3-pip git
    RUN pip3 install torch torchvision transformers accelerate huggingface_hub
    WORKDIR /app
    COPY . .
    CMD ["python3", "app.py"]

    Create an ECR repository, authenticate Docker with ECR, then build and push the image:

    aws ecr create-repository --repository-name kimi-k2
    aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
    docker build -t kimi-k2 .
    docker tag kimi-k2:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
    docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
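    The Dockerfile's CMD expects an app.py, which the cloned repository may not provide. Below is a minimal, hypothetical sketch of what such a serving script could look like, using FastAPI and Hugging Face transformers (the endpoint name, request schema, and the extra fastapi/uvicorn dependencies are all assumptions, not part of the official model repo). In practice, a model of Kimi K2's size is usually served with a dedicated inference engine such as vLLM across multiple GPUs; treat this only as a starting point.

    ```python
    # app.py - illustrative sketch only; requires `pip3 install fastapi uvicorn`
    # in addition to the packages installed in the Dockerfile.
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "moonshotai/Kimi-K2-Instruct"

    app = FastAPI()
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", trust_remote_code=True
    )

    class Prompt(BaseModel):
        text: str
        max_new_tokens: int = 256

    @app.post("/generate")
    def generate(req: Prompt):
        # Tokenize on the model's device and generate a completion
        inputs = tokenizer(req.text, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
        return {"completion": tokenizer.decode(output[0], skip_special_tokens=True)}

    if __name__ == "__main__":
        import uvicorn
        # Port 7860 matches the containerPort used in the manifests below
        uvicorn.run(app, host="0.0.0.0", port=7860)
    ```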

     

    Step 3: Create Kubernetes YAML Deployment Files

    deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: kimi-k2
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kimi-k2
      template:
        metadata:
          labels:
            app: kimi-k2
        spec:
          containers:
          - name: kimi-k2
            image: <your-ecr-repo-url>
            resources:
              limits:
                nvidia.com/gpu: 1
            ports:
            - containerPort: 7860

     

    service.yaml:

    apiVersion: v1
    kind: Service
    metadata:
      name: kimi-k2-service
    spec:
      type: LoadBalancer
      selector:
        app: kimi-k2
      ports:
        - protocol: TCP
          port: 80
          targetPort: 7860

    Apply both manifests:

    kubectl apply -f deployment.yaml
    kubectl apply -f service.yaml

     

    Step 4: Access Kimi K2 on Public IP

    kubectl get svc kimi-k2-service
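    The EXTERNAL-IP column shows the LoadBalancer address once AWS has provisioned it (this can take a few minutes). You can then call the service with curl; the /generate path and JSON body here assume the hypothetical app.py sketched earlier, so adjust them to whatever your serving script actually exposes:

    ```shell
    # Grab the LoadBalancer hostname once it is provisioned
    EXTERNAL=$(kubectl get svc kimi-k2-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

    # Port 80 on the Service forwards to containerPort 7860
    curl -X POST "http://$EXTERNAL/generate" \
      -H "Content-Type: application/json" \
      -d '{"text": "Hello, Kimi!"}'
    ```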

    Option 2: Deploy Kimi K2 on AWS ECS (Elastic Container Service)

    Prerequisites:

    • AWS CLI configured

    • IAM roles for ECS + ECR

    • Docker installed

    • ECS Fargate or EC2 cluster created

    • ECR repository created

    Step 1: Build Docker Image

    Same steps as above. Push image to Amazon ECR.

    Step 2: Create ECS Task Definition

    task-definition.json:

    {
      "family": "kimi-k2-task",
      "containerDefinitions": [
        {
          "name": "kimi-k2",
          "image": "<account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest",
          "memory": 30720,
          "cpu": 2048,
          "essential": true,
          "resourceRequirements": [
            { "type": "GPU", "value": "1" }
          ],
          "portMappings": [
            {
              "containerPort": 7860,
              "hostPort": 7860
            }
          ]
        }
      ],
      "requiresCompatibilities": ["EC2"],
      "networkMode": "bridge",
      "cpu": "2048",
      "memory": "30720"
    }

    The resourceRequirements entry reserves one GPU on the EC2 container instance, which is required for ECS to schedule the task onto a GPU host. Register the task definition:

    aws ecs register-task-definition --cli-input-json file://task-definition.json

     

    Step 3: Run Task on ECS Cluster

    aws ecs run-task \
      --cluster kimi-k2-cluster \
      --launch-type EC2 \
      --task-definition kimi-k2-task \
      --count 1
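    After launching, you can confirm the task reached the RUNNING state (the cluster name here matches the one used above; adjust it if yours differs):

    ```shell
    # Find the task that was just launched
    TASK_ARN=$(aws ecs list-tasks --cluster kimi-k2-cluster --query 'taskArns[0]' --output text)

    # Inspect its current lifecycle status (PENDING -> RUNNING)
    aws ecs describe-tasks --cluster kimi-k2-cluster --tasks "$TASK_ARN" \
      --query 'tasks[0].lastStatus'
    ```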

    Final Thoughts

    Deploying Kimi K2 on EKS or ECS gives you the power to scale open-source LLMs efficiently in the cloud.

    Kubernetes adds autoscaling, GPU scheduling, and production-grade LLM APIs, all while keeping you in control.

    Need enterprise-grade deployment or DevOps help? Contact our AI DevOps experts at OneClick IT Consultancy.
