Prerequisites:
AWS account
IAM role with EKS and EC2 permissions
GPU-enabled EC2 instance types (e.g., p3.2xlarge)
Kubectl, eksctl, Helm installed
Docker & Git installed locally
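Before starting, you can verify that all of the required tooling is on your PATH with a quick shell check:

```shell
#!/usr/bin/env bash
# Check that each required CLI tool is installed and report its status.
for tool in aws kubectl eksctl helm docker git; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "OK: $tool found"
  else
    echo "MISSING: $tool is not installed"
  fi
done
```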
Step 1: Create an EKS Cluster with GPU Nodes
eksctl create cluster \
--name kimi-k2-cluster \
--region us-east-1 \
--nodegroup-name gpu-nodes \
--node-type p3.2xlarge \
--nodes 2 \
--nodes-min 1 \
--nodes-max 3 \
--managed
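Once the cluster is up, the NVIDIA device plugin must be running before Kubernetes can schedule nvidia.com/gpu resources; depending on the node AMI it may not be preinstalled. A typical install looks like this (v0.14.1 is shown as an example version; check the NVIDIA k8s-device-plugin releases for the current one):

```shell
# Deploy the NVIDIA device plugin DaemonSet so the scheduler can see GPUs
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/deployments/static/nvidia-device-plugin.yml

# Confirm each node now advertises allocatable nvidia.com/gpu capacity
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
```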
Step 2: Build and Push Docker Image for Kimi K2
The model weights are stored with Git LFS, so enable it before cloning:
git lfs install
git clone https://huggingface.co/moonshotai/Kimi-K2-Instruct
cd Kimi-K2-Instruct
Dockerfile:
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
RUN apt-get update && apt-get install -y python3 python3-pip git
RUN pip3 install torch torchvision transformers accelerate huggingface_hub
WORKDIR /app
COPY . .
CMD ["python3", "app.py"]
Here app.py is your own inference server (for example, a Gradio or FastAPI app listening on port 7860); the Hugging Face repository contains only the model files, so add a serving script before building.
aws ecr create-repository --repository-name kimi-k2
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker build -t kimi-k2 .
docker tag kimi-k2:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
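To confirm the push succeeded, you can list the image tags now stored in the repository:

```shell
# List images in the kimi-k2 repository; the pushed tag should appear
aws ecr describe-images \
  --repository-name kimi-k2 \
  --region us-east-1 \
  --query 'imageDetails[*].imageTags' \
  --output table
```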
Step 3: Create Kubernetes YAML Deployment Files
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kimi-k2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kimi-k2
  template:
    metadata:
      labels:
        app: kimi-k2
    spec:
      containers:
        - name: kimi-k2
          image: <your-ecr-repo-url>
          resources:
            limits:
              nvidia.com/gpu: 1
          ports:
            - containerPort: 7860
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: kimi-k2-service
spec:
  type: LoadBalancer
  selector:
    app: kimi-k2
  ports:
    - protocol: TCP
      port: 80
      targetPort: 7860
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
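After applying both manifests, you can watch the rollout and inspect the pod before moving on:

```shell
# Wait for the Deployment to become available, then check the pod and its logs
kubectl rollout status deployment/kimi-k2
kubectl get pods -l app=kimi-k2
kubectl logs -l app=kimi-k2 --tail=50
```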
Step 4: Access Kimi K2 on Public IP
kubectl get svc kimi-k2-service
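The EXTERNAL-IP column may show <pending> for a minute or two while AWS provisions the load balancer. Once a hostname appears, you can probe the service on port 80 (this assumes your app.py serves HTTP on port 7860, as in the deployment above):

```shell
# Fetch the load balancer hostname and send a test request
EXTERNAL_HOST=$(kubectl get svc kimi-k2-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl -s "http://${EXTERNAL_HOST}/"
```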
Option 2: Deploy Kimi K2 on Amazon ECS
Prerequisites:
AWS CLI configured
IAM roles for ECS + ECR
Docker installed
ECS cluster with GPU-enabled EC2 container instances (Fargate does not support GPUs, so this guide uses the EC2 launch type)
ECR repository created
Step 1: Build Docker Image
Same steps as above. Push image to Amazon ECR.
Step 2: Create ECS Task Definition
task-definition.json:
{
  "family": "kimi-k2-task",
  "containerDefinitions": [
    {
      "name": "kimi-k2",
      "image": "<account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest",
      "memory": 30720,
      "cpu": 2048,
      "essential": true,
      "resourceRequirements": [
        {
          "type": "GPU",
          "value": "1"
        }
      ],
      "portMappings": [
        {
          "containerPort": 7860,
          "hostPort": 7860
        }
      ]
    }
  ],
  "requiresCompatibilities": ["EC2"],
  "networkMode": "bridge",
  "cpu": "2048",
  "memory": "30720"
}
aws ecs register-task-definition --cli-input-json file://task-definition.json
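You can confirm the registration took effect by describing the task definition:

```shell
# Should print "ACTIVE" for a successfully registered task definition
aws ecs describe-task-definition \
  --task-definition kimi-k2-task \
  --query 'taskDefinition.status'
```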
Step 3: Run Task on ECS Cluster
aws ecs run-task \
--cluster kimi-k2-cluster \
--launch-type EC2 \
--task-definition kimi-k2-task \
--count 1
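To verify the task started, list the running tasks and check the status of the first one:

```shell
# List running tasks in the cluster, then inspect the latest task's status
aws ecs list-tasks --cluster kimi-k2-cluster
aws ecs describe-tasks \
  --cluster kimi-k2-cluster \
  --tasks "$(aws ecs list-tasks --cluster kimi-k2-cluster --query 'taskArns[0]' --output text)" \
  --query 'tasks[0].lastStatus'
```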
Deploying Kimi K2 on EKS or ECS gives you the power to scale open-source LLMs efficiently in the cloud.
Kubernetes adds autoscaling, GPU scheduling, and production-grade LLM APIs, all while keeping you in control.
Need enterprise-grade deployment or DevOps help? Contact our AI DevOps experts at OneClick IT Consultancy.
Contact Us