AI/ML



    How to Install Kimi K2 on AWS EC2 Using Docker (Complete Guide)

    Kimi K2, a trillion-parameter open-source mixture-of-experts LLM from Moonshot AI, is making waves among developers seeking privacy, performance, and cost efficiency. With Docker on AWS EC2, you can deploy Kimi K2 with GPU acceleration for a high-performance, cloud-hosted AI setup.

    Prerequisites

    1. An AWS account with EC2 access

    2. A GPU-enabled EC2 instance (recommended: g5.2xlarge, p3.2xlarge, or p4d)

    3. SSH access to the instance

    4. A security group with ports 22, 8000, and 7860 open

    5. NVIDIA drivers installed on the instance

    6. Docker and the NVIDIA Container Toolkit

    Step 1: Launch a GPU-Enabled EC2 Instance

    Use the Deep Learning Base AMI (Ubuntu 20.04, CUDA 12 pre-installed). Choose an instance type such as g5.2xlarge, p3.2xlarge, or p4d; note that the full Kimi K2 checkpoint is far too large for a single-GPU instance, so the smaller types are practical only for quantized or heavily offloaded variants. Add security group rules for SSH (22), HTTP (80), and custom TCP (8000, 7860). Launch the instance and connect via SSH.
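    If you prefer scripting the security-group setup from this step instead of using the console, a boto3 sketch could look like the following. The group ID in the example is a placeholder, and AWS credentials must already be configured:

```python
# Optional sketch: open the ports from Step 1 with boto3 instead of the console.
# The security-group ID passed to open_ports() is a placeholder, not a real ID.

def ingress_permissions(ports=(22, 80, 8000, 7860)):
    """Build the IpPermissions entries for the ports this guide opens."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],  # tighten to your own IP in production
        }
        for port in ports
    ]

def open_ports(group_id):
    import boto3  # imported here so ingress_permissions() works without boto3 installed
    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=ingress_permissions()
    )

# Example: open_ports("sg-0123456789abcdef0")
```

    Opening 0.0.0.0/0 is convenient for testing but should be narrowed to your own IP range before any real use.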


    Step 2: Install Docker & NVIDIA Container Toolkit

    sudo apt update
    sudo apt install -y docker.io nvidia-driver-525
    sudo systemctl start docker
    sudo systemctl enable docker
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt update
    sudo apt install -y nvidia-docker2
    sudo systemctl restart docker

    Step 3: Clone the Model Repo from Hugging Face

    sudo apt install git-lfs
    git lfs install
    git clone https://huggingface.co/moonshotai/Kimi-K2-Instruct
    cd Kimi-K2-Instruct


    Make sure you have accepted the model license at: https://huggingface.co/moonshotai/Kimi-K2-Instruct 
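    As an alternative to git-lfs, the weights can be fetched with huggingface_hub (which the Dockerfile in the next step installs anyway); snapshot_download resumes interrupted transfers, which matters for a checkpoint this large. The local directory name below is just a suggestion:

```python
# Alternative download path using huggingface_hub instead of git-lfs.

def download_kimi(local_dir="Kimi-K2-Instruct"):
    """Download (or resume downloading) the model snapshot into local_dir."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub
    return snapshot_download(
        repo_id="moonshotai/Kimi-K2-Instruct",
        local_dir=local_dir,
    )

# Example: download_kimi()  # requires accepting the model license first
```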

    Step 4: Create a Dockerfile (if not available)

    Create a file named Dockerfile in your project:

    FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
    RUN apt update && apt install -y python3 python3-pip git
    RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
    RUN pip3 install transformers accelerate huggingface_hub
    WORKDIR /app
    COPY . .
    CMD ["python3", "app.py"]
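    The Dockerfile's CMD expects an app.py, which the model repo does not ship. A minimal sketch is given below; the endpoint shape, helper names, and generation settings are assumptions for illustration, not part of the official repo:

```python
# Hypothetical app.py sketch: a bare-bones JSON-over-HTTP wrapper around the model.
# Everything here (route, payload shape, max_new_tokens) is illustrative.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

MODEL_ID = "moonshotai/Kimi-K2-Instruct"
model = None
tokenizer = None

def build_messages(prompt):
    """Wrap a raw prompt in the chat format consumed by apply_chat_template."""
    return [{"role": "user", "content": prompt}]

def load_model():
    # Heavy imports and the weight load are kept out of module import time.
    global model, tokenizer
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, trust_remote_code=True,
        torch_dtype=torch.bfloat16, device_map="auto",
    )

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        prompt = json.loads(self.rfile.read(length) or b"{}").get("prompt", "")
        inputs = tokenizer.apply_chat_template(
            build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        out = model.generate(inputs, max_new_tokens=256)
        reply = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"response": reply}).encode())

def main(port=7860):
    load_model()
    HTTPServer(("0.0.0.0", port), ChatHandler).serve_forever()
```

    In this sketch the server is started explicitly (for example `python3 -c "import app; app.main()"`); if you keep the Dockerfile's `CMD ["python3", "app.py"]`, add an `if __name__ == "__main__": main()` line at the bottom.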

    Step 5: Build and Run the Docker Container

    sudo docker build -t kimi-k2 .
    sudo docker run --gpus all -p 7860:7860 kimi-k2

    Step 6: Test the Kimi K2 API

    curl http://<your-ec2-ip>:7860

    Or run an example Python script from within the container.
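    The curl check above can also be done from Python. A small readiness probe (pass whatever host and port you exposed) might look like:

```python
# Simple readiness probe, equivalent to the curl check above.
import urllib.error
import urllib.request

def is_up(url, timeout=5.0):
    """Return True if the endpoint answers at all (any HTTP status)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # server responded, just not with 200
    except (urllib.error.URLError, OSError):
        return False

# Example: is_up("http://<your-ec2-ip>:7860")
```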

    Optional: Use vLLM or OpenRouter Compatible API

    pip install vllm
    python3 -m vllm.entrypoints.openai.api_server --model moonshotai/Kimi-K2-Instruct
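    vLLM's api_server exposes an OpenAI-compatible /v1/chat/completions endpoint, listening on port 8000 by default, so a stdlib-only client sketch could be:

```python
# Minimal client for the vLLM OpenAI-compatible endpoint (default port 8000).
import json
import urllib.request

def build_chat_request(prompt, model="moonshotai/Kimi-K2-Instruct"):
    # Standard OpenAI chat-completions payload; vLLM accepts the same schema.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

def ask(base_url, prompt):
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (replace with your EC2 public IP):
# print(ask("http://<your-ec2-ip>:8000", "Hello, Kimi!"))
```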

    Final Thoughts

    By following this guide, you now have a self-hosted, GPU-accelerated Kimi K2 model running on AWS EC2 with Docker. This is perfect for building chatbots, coding assistants, semantic search engines, and more without relying on paid APIs or closed platforms.

    Need enterprise-grade deployment or DevOps help? Contact our expert AI Squad at OneClick IT Consultancy and let’s build something powerful together.
