


    How to Install Kimi K2 on AWS EC2 Using Docker (Complete Guide)

Kimi K2, a trillion-parameter open-source LLM from Moonshot AI, is making waves among developers seeking privacy, performance, and cost efficiency. It uses a mixture-of-experts design, so only about 32B of its parameters are active per token, but the full checkpoint is still enormous. With Docker on AWS EC2, you can deploy Kimi K2 with GPU acceleration for a high-performance, cloud-hosted AI setup.

  • This guide covers the complete process to deploy Kimi K2 on AWS using Docker with GPU acceleration.

  • You’ll learn how to set up EC2, configure Docker, install Kimi K2, and run the model efficiently.
  • These steps help you self-host Kimi K2 for chatbot development, coding assistants, APIs, and enterprise AI use cases.
Prerequisites

    1. AWS account with EC2 access

2. An EC2 GPU instance (recommended: g5.2xlarge, p3.2xlarge, or p4d)

    3. SSH access to the instance

    4. Security group with ports 22, 8000, 7860 open

5. An NVIDIA GPU with the NVIDIA drivers installed

    6. Docker & NVIDIA Container Toolkit

    Step 1: Launch a GPU-Enabled AWS EC2 Instance

Use the Deep Learning Base AMI (Ubuntu 20.04 with CUDA 12 pre-installed). Choose an instance type: g5.2xlarge, p3.2xlarge, or p4d. Add security group rules for SSH (22), HTTP (80), and custom TCP (8000, 7860). Launch the instance and connect via SSH.

    This ensures your EC2 instance is optimized for GPU workloads required to run the Kimi K2 model.
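If you prefer to script this step, the boto3 sketch below launches an equivalent instance. The AMI ID, key pair name, and security group ID are placeholders you must replace with your own values; boto3 and the run_instances call are standard, but the specific values here are assumptions.

import boto3

# Sketch: launch a GPU instance for Kimi K2 (placeholders must be replaced).
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",   # Deep Learning Base AMI ID in your region
    InstanceType="g5.2xlarge",
    KeyName="your-key-pair",           # your SSH key pair name
    SecurityGroupIds=["sg-xxxxxxxx"],  # must allow ports 22, 8000, 7860
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sda1",
        # Kimi K2 weights are very large, so provision a roomy root volume.
        "Ebs": {"VolumeSize": 1500, "VolumeType": "gp3"},
    }],
)
print("Launched:", response["Instances"][0]["InstanceId"])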

    Step 2: Install Docker & NVIDIA Container Toolkit on AWS EC2

sudo apt update
sudo apt install -y docker.io nvidia-driver-525
sudo systemctl start docker
sudo systemctl enable docker
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-docker2
sudo systemctl restart docker
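After Docker restarts, you can confirm that containers can see the GPU by running nvidia-smi inside a CUDA base image, for example: sudo docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu20.04 nvidia-smi. If the driver and container toolkit are set up correctly, this prints the usual GPU status table.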

    Step 3: Download the Kimi K2 Model from Hugging Face

sudo apt install git-lfs
git lfs install
git clone https://huggingface.co/moonshotai/Kimi-K2-Instruct
cd Kimi-K2-Instruct

This downloads the Kimi K2 model files to your EC2 instance so they can be loaded inside Docker. The full checkpoint is on the order of a terabyte, so make sure your EBS volume has enough free space before starting.

    Make sure you have accepted the model license at: https://huggingface.co/moonshotai/Kimi-K2-Instruct 
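As a scriptable alternative to git-lfs (a sketch, not part of the original guide), huggingface_hub's snapshot_download fetches the same repository and can resume interrupted downloads, which matters at this size:

from huggingface_hub import snapshot_download

# Sketch: download the Kimi K2 repository to a local directory.
# snapshot_download resumes partial downloads automatically.
snapshot_download(
    repo_id="moonshotai/Kimi-K2-Instruct",
    local_dir="Kimi-K2-Instruct",  # matches the git clone path above
)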

    Step 4: Create a Dockerfile to Run Kimi K2 in Docker

    Create a file named Dockerfile in your project:

FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
RUN apt update && apt install -y python3 python3-pip git
RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
RUN pip3 install transformers accelerate huggingface_hub
WORKDIR /app
COPY . .
CMD ["python3", "app.py"]
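The Dockerfile's CMD expects an app.py, which the steps above do not define. Below is a minimal sketch, assuming a transformers-based loader wrapped in a small Gradio UI on port 7860 (Gradio's default, and the port used in Steps 5 and 6); for this variant you would also add gradio to the pip3 install line in the Dockerfile. Keep in mind that Kimi K2's full checkpoint needs far more memory than a single small GPU provides, so treat this as a template rather than a guaranteed single-instance deployment.

# app.py - minimal sketch of the container entrypoint (assumed, not from
# the original article). Wraps the model in a small Gradio UI on port 7860.
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "moonshotai/Kimi-K2-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # Kimi K2 ships custom modeling code
    device_map="auto",       # shard across whatever GPUs/CPU RAM exist
    torch_dtype="auto",
)

def chat(prompt):
    # Build a chat-formatted input and generate a reply.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

gr.Interface(fn=chat, inputs="text", outputs="text").launch(
    server_name="0.0.0.0", server_port=7860
)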

    Step 5: Build and Run the Kimi K2 Docker Container

sudo docker build -t kimi-k2 .
sudo docker run --gpus all -p 7860:7860 kimi-k2
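One caveat (not from the original guide): COPY . . bakes everything in the build context, including the very large model directory, into the image. Adding Kimi-K2-Instruct to a .dockerignore file and bind-mounting the weights at run time keeps the image small, for example: sudo docker run --gpus all -p 7860:7860 -v $(pwd)/Kimi-K2-Instruct:/app/Kimi-K2-Instruct kimi-k2.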

    Step 6: Test the Kimi K2 API

    curl http://<your-ec2-ip>:7860

Or run an example Python script from within the container, such as the sketch below.
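A minimal Python check (a sketch; it assumes the container from Step 5 is running and reachable on port 7860):

import requests

# Sketch: confirm the service on port 7860 is responding.
response = requests.get("http://<your-ec2-ip>:7860")
print(response.status_code)  # 200 means the UI is being served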

    Optional: Deploy Kimi K2 Using vLLM (OpenAI-Compatible API)

pip install vllm
python3 -m vllm.entrypoints.openai.api_server --model moonshotai/Kimi-K2-Instruct
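Once the server is up (it listens on port 8000 by default), any OpenAI-compatible client can talk to it. Here is a sketch using the official openai Python package (pip install openai); the "EMPTY" API key is the usual convention for local vLLM servers:

from openai import OpenAI

# Sketch: query the local vLLM server through its OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[{"role": "user", "content": "Summarize what Kimi K2 is in one sentence."}],
)
print(response.choices[0].message.content)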

    Use Cases for Kimi K2 on AWS EC2

    Running Kimi K2 on a GPU-enabled EC2 instance unlocks a variety of real-world applications. Here are some examples:

    1. Travel Booking Automation

    Integrate Kimi K2 with your flight booking engine to automate customer interactions, answer queries about flights, and assist with reservations.

    2. Chatbots and Virtual Assistants

    Deploy Kimi K2 as a conversational agent on websites or apps, providing users instant assistance for bookings, cancellations, and travel recommendations.

    3. Semantic Search and Recommendation Systems

    Use Kimi K2 to power search engines that understand natural language queries, offering personalized flight suggestions, hotel bookings, and travel packages.

    4. Coding Assistants for Travel Tech

    Developers can leverage Kimi K2 to generate scripts, automate API calls, and optimize backend operations for travel platforms.

    5. Data Analysis and Summarization

    Analyze large datasets from travel APIs to provide actionable insights, fare comparisons, and travel trends to end-users.

    FAQs

    Q1. What is the best EC2 instance for running Kimi K2?

    GPU instances like g5.2xlarge, p3.2xlarge, or p4d work best.

    Q2. Can I run Kimi K2 without a GPU?

    It’s not recommended due to extremely slow inference times.

    Q3. Does Kimi K2 support an OpenAI-style API?

    Yes, using vLLM you can expose an OpenAI-compatible endpoint.

    Final Thoughts

By following this guide, you now have a cloud-hosted, GPU-accelerated, self-hosted Kimi K2 model running on AWS EC2 with Docker. This setup is perfect for building chatbots, coding assistants, semantic search engines, and more without relying on paid APIs or closed platforms.


Need enterprise-grade deployment or DevOps help? Contact our expert AI Squad at OneClick IT Consultancy and let's build something powerful together.
