AI/ML

    AWS ECS OpenThinker 7B Deployment - A Step by Step Guide


    Introduction

    Deploying OpenThinker 7B on AWS provides scalable, highly available hosting for the model. AWS offers several services that can be used to deploy LLMs, including Amazon ECS (Elastic Container Service), Amazon EC2 (Elastic Compute Cloud), and AWS Lambda.

    In this guide, we will focus on deploying OpenThinker 7B on AWS using Amazon ECS (Fargate), which allows for serverless containerized deployment.

    Key Benefits of Deploying OpenThinker 7B on AWS

    Scalability: Add tasks to handle high-demand traffic

    Cost-effectiveness: Pay only for the compute you use

    Managed Infrastructure: No need to manually manage servers

    Security: AWS IAM controls access to resources, and VPC security groups restrict network traffic

    Step 1: Prerequisites

    Before starting, ensure you have:

    • An AWS account
    • AWS Management Console access
    • Docker installed on your local machine
    • AWS CLI installed and configured (aws configure)
    • A pre-built Docker image of OpenThinker 7B (from previous steps)
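    The prerequisites above can be checked from a terminal before continuing. The following is a minimal sketch; the function names check_prereqs and show_account_id are illustrative and not part of any AWS tooling.

```shell
# Check that the tools this guide relies on are available on PATH.
check_prereqs() {
    missing=0
    for tool in aws docker; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Print the account ID the AWS CLI is currently authenticated as.
# A failure here usually means `aws configure` has not been run yet.
show_account_id() {
    aws sts get-caller-identity --query Account --output text
}
```

    Run check_prereqs first, then show_account_id to confirm that the CLI credentials point at the account you expect.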

    Step 2: Push the Docker Image to Amazon ECR (Elastic Container Registry)

    Create an ECR Repository

    1. Open the AWS Management Console
    2. Go to Amazon ECR → Click Create repository
    3. Enter Repository name (e.g., openthinker-7b)
    4. Select Private repository
    5. Click Create repository
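    The same repository can also be created from the AWS CLI. This is a sketch that assumes the us-east-1 region and the example repository name; the wrapper function create_ecr_repo is a hypothetical helper. Repositories created this way are private.

```shell
REPO_NAME="openthinker-7b"
AWS_REGION="us-east-1"   # assumption: replace with your region

# Create the private ECR repository and print its URI.
create_ecr_repo() {
    aws ecr create-repository \
        --repository-name "$REPO_NAME" \
        --region "$AWS_REGION" \
        --query 'repository.repositoryUri' \
        --output text
}
```

    Calling create_ecr_repo prints the repository URI, which is the value used later when tagging the image.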

    Authenticate Docker with AWS ECR

    Run the following command to log in to ECR (replace <aws_account_id> and <region> with your actual values):

    aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com

    Tag the Docker Image

    Retrieve your ECR repository URI from AWS ECR (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com/openthinker-7b) and tag your image:

    docker tag openthinker-7b:latest <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest

    Push the Image to ECR

    docker push <aws_account_id>.dkr.ecr.<region>.amazonaws.com/openthinker-7b:latest

    Once completed, the image will be stored in AWS ECR and can be used for ECS deployment.
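    To avoid retyping the account ID and region, the image URI can be composed once from variables, and the push can be confirmed from the CLI. A sketch, assuming the placeholder account ID 123456789012 and the us-east-1 region; verify_push is a hypothetical helper name.

```shell
AWS_ACCOUNT_ID="123456789012"   # placeholder: use your own account ID
AWS_REGION="us-east-1"          # assumption: replace with your region
REPO_NAME="openthinker-7b"
IMAGE_TAG="latest"

# Full image reference used by `docker tag` and `docker push`.
IMAGE_URI="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}:${IMAGE_TAG}"

# Print the push timestamp of the tagged image; fails if the tag is absent.
verify_push() {
    aws ecr describe-images \
        --region "$AWS_REGION" \
        --repository-name "$REPO_NAME" \
        --image-ids "imageTag=${IMAGE_TAG}" \
        --query 'imageDetails[0].imagePushedAt' \
        --output text
}
```

    With these variables set, the tag and push commands become docker tag openthinker-7b:latest "$IMAGE_URI" and docker push "$IMAGE_URI".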

    Step 3: Create an ECS Cluster

    We will use AWS Fargate to run the container without managing EC2 instances.

    1. Go to Amazon ECS → Click Create cluster
    2. Choose Networking only (AWS Fargate)
    3. Enter Cluster name (e.g., openthinker-cluster)
    4. Click Create
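    For a scripted setup, the cluster can be created with a single CLI call. A minimal sketch; create_cluster is a hypothetical wrapper name.

```shell
CLUSTER_NAME="openthinker-cluster"

# Create the ECS cluster and print its status (ACTIVE once ready).
create_cluster() {
    aws ecs create-cluster \
        --cluster-name "$CLUSTER_NAME" \
        --query 'cluster.status' \
        --output text
}
```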

    Step 4: Create a Task Definition for OpenThinker 7B

    1. Go to Amazon ECS → Click Task Definitions
    2. Click Create new task definition
    3. Select Fargate as the launch type
    4. Enter Task definition name (e.g., openthinker-7b-task)
    5. Set Task size:
      1. vCPU: 2 vCPUs
      2. Memory: 8GB RAM
    6. Click Add container and configure:
      1. Container name: openthinker-7b-container
      2. Image: Paste the ECR image URL from Step 2
      3. Port mappings: 11434 (same as the Docker container)
    7. Click Create
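    The console settings above correspond roughly to the task definition JSON sketched below (2 vCPUs is 2048 CPU units, 8 GB is 8192 MiB). The image URI and the execution role ARN are placeholders, and the role name ecsTaskExecutionRole is an assumption: Fargate needs a task execution role to pull the image from ECR, and your account may use a different one. The helper names write_task_def and register_task_def are illustrative.

```shell
TASK_FAMILY="openthinker-7b-task"
IMAGE_URI="123456789012.dkr.ecr.us-east-1.amazonaws.com/openthinker-7b:latest"   # placeholder
EXECUTION_ROLE_ARN="arn:aws:iam::123456789012:role/ecsTaskExecutionRole"         # assumption

# Write the task definition JSON to the file path given as $1.
write_task_def() {
    cat > "$1" <<EOF
{
  "family": "${TASK_FAMILY}",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "2048",
  "memory": "8192",
  "executionRoleArn": "${EXECUTION_ROLE_ARN}",
  "containerDefinitions": [
    {
      "name": "openthinker-7b-container",
      "image": "${IMAGE_URI}",
      "essential": true,
      "portMappings": [
        { "containerPort": 11434, "protocol": "tcp" }
      ]
    }
  ]
}
EOF
}

# Generate the JSON file, then register it with ECS.
register_task_def() {
    write_task_def "$1" &&
    aws ecs register-task-definition --cli-input-json "file://$1"
}
```

    For example, register_task_def task-def.json writes the file and registers a new revision of the task definition.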

    Step 5: Create an ECS Service

    1. Go to Amazon ECS → Click Services → Create
    2. Select Launch Type: Fargate
    3. Choose Cluster: openthinker-cluster
    4. Select Task definition: openthinker-7b-task
    5. Choose Service Name: openthinker-7b-service
    6. Set the Number of tasks to 1 (or more for scaling)
    7. Select Networking:
      • VPC: Choose an existing or new VPC
      • Subnets: Select public subnets
      • Security Group: Allow inbound TCP traffic on port 11434
    8. Click Deploy
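    The service settings above can be expressed as CLI calls. In this sketch the subnet and security group IDs are placeholders, assignPublicIp=ENABLED is an assumption made so the task receives a public IP in a public subnet, and the function names are hypothetical.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"
TASK_FAMILY="openthinker-7b-task"
SUBNET_ID="subnet-0123456789abcdef0"       # placeholder
SECURITY_GROUP_ID="sg-0123456789abcdef0"   # placeholder

# Open port 11434 to the CIDR range given as $1 (e.g. your own IP as x.x.x.x/32).
allow_inbound() {
    aws ec2 authorize-security-group-ingress \
        --group-id "$SECURITY_GROUP_ID" \
        --protocol tcp \
        --port 11434 \
        --cidr "$1"
}

# Create the Fargate service with one task.
create_service() {
    aws ecs create-service \
        --cluster "$CLUSTER_NAME" \
        --service-name "$SERVICE_NAME" \
        --task-definition "$TASK_FAMILY" \
        --desired-count 1 \
        --launch-type FARGATE \
        --network-configuration "awsvpcConfiguration={subnets=[$SUBNET_ID],securityGroups=[$SECURITY_GROUP_ID],assignPublicIp=ENABLED}"
}
```

    Passing a narrow CIDR range to allow_inbound keeps the model endpoint from being reachable by the whole internet.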

    Step 6: Verify Deployment

    Check Running Tasks

    Go to Amazon ECS → Select openthinker-cluster → Click Tasks

    Make sure the task status is RUNNING.
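    The same check can be done without the console. A sketch using the example cluster and service names from this guide; the wrapper names are illustrative.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"

# Block until the service reaches a steady state (or the waiter times out).
wait_until_stable() {
    aws ecs wait services-stable \
        --cluster "$CLUSTER_NAME" \
        --services "$SERVICE_NAME"
}

# Print the ARNs of the service's tasks whose desired status is RUNNING.
list_running_tasks() {
    aws ecs list-tasks \
        --cluster "$CLUSTER_NAME" \
        --service-name "$SERVICE_NAME" \
        --desired-status RUNNING \
        --query 'taskArns' \
        --output text
}
```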

    Get the Public IP Address

    If using a public subnet, navigate to:

    • ECS Service → Select Running Task
    • Look for Public IP
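    The public IP can also be looked up from the CLI by following the task to its network interface. This is a sketch that assumes the service runs a single task with one network interface; get_public_ip is a hypothetical helper name.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"

# Resolve task -> network interface -> public IP and print the IP.
get_public_ip() {
    task_arn=$(aws ecs list-tasks \
        --cluster "$CLUSTER_NAME" \
        --service-name "$SERVICE_NAME" \
        --query 'taskArns[0]' \
        --output text)

    eni_id=$(aws ecs describe-tasks \
        --cluster "$CLUSTER_NAME" \
        --tasks "$task_arn" \
        --query "tasks[0].attachments[0].details[?name=='networkInterfaceId'].value | [0]" \
        --output text)

    aws ec2 describe-network-interfaces \
        --network-interface-ids "$eni_id" \
        --query 'NetworkInterfaces[0].Association.PublicIp' \
        --output text
}
```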

    Run the following command to test the model:

    curl http://<public-ip>:11434

    Expected output:

    {"message": "Model is up and running"}
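    A large model can take a while to load after the task starts, so the first request may fail. The sketch below retries the request a few times; the function name wait_for_endpoint, the default of 10 attempts, and the RETRY_DELAY variable are illustrative choices, not part of the original setup.

```shell
# Poll a URL until it responds successfully or the attempts run out.
# $1 = URL, $2 = number of attempts (default 10).
wait_for_endpoint() {
    url="$1"
    attempts="${2:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl --silent --fail --max-time 5 "$url" >/dev/null; then
            echo "endpoint is up: $url"
            return 0
        fi
        sleep "${RETRY_DELAY:-5}"   # seconds between attempts
        i=$((i + 1))
    done
    echo "endpoint did not respond after $attempts attempts: $url" >&2
    return 1
}
```

    For example, wait_for_endpoint "http://<public-ip>:11434" returns once the container answers on port 11434.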

    Step 7: Scaling the Model (Optional)

    To handle high traffic, increase the number of tasks:

    1. Go to ECS Service
    2. Select openthinker-7b-service
    3. Click Update → Increase Desired Task Count
    4. Save and Deploy
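    The desired task count can also be changed from the CLI. A minimal sketch; scale_service is a hypothetical wrapper name.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"

# Set the service's desired task count to the number given as $1.
scale_service() {
    aws ecs update-service \
        --cluster "$CLUSTER_NAME" \
        --service "$SERVICE_NAME" \
        --desired-count "$1"
}
```

    For example, scale_service 3 asks ECS to run three copies of the task.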

    ECS Service Auto Scaling can also be enabled to adjust the number of tasks automatically.
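    Automatic scaling for an ECS service is configured through Application Auto Scaling. The sketch below registers the service as a scalable target and attaches a target-tracking policy on average CPU utilization. The capacity range of 1 to 4 tasks, the 70% target, and the policy name openthinker-cpu-target are assumptions chosen for illustration.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"
RESOURCE_ID="service/${CLUSTER_NAME}/${SERVICE_NAME}"

# Register the service for scaling, then add a CPU target-tracking policy.
enable_autoscaling() {
    aws application-autoscaling register-scalable-target \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id "$RESOURCE_ID" \
        --min-capacity 1 \
        --max-capacity 4 &&
    aws application-autoscaling put-scaling-policy \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id "$RESOURCE_ID" \
        --policy-name openthinker-cpu-target \
        --policy-type TargetTrackingScaling \
        --target-tracking-scaling-policy-configuration \
        '{"TargetValue": 70.0, "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"}}'
}
```

    CPU utilization may not be the best signal for an inference workload, so treat the metric choice as a starting point.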

     

    Step 8: Cleaning Up Resources (If Needed)

    To avoid unnecessary charges, delete the ECS resources when not in use:

    aws ecs delete-service --cluster openthinker-cluster --service openthinker-7b-service --force
    aws ecs delete-cluster --cluster openthinker-cluster
    aws ecr delete-repository --repository-name openthinker-7b --force
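    Deleting the cluster can fail while the service's tasks are still shutting down. The sketch below runs the cleanup commands in order and waits for the service to become inactive before removing the cluster; the cleanup function name is illustrative.

```shell
CLUSTER_NAME="openthinker-cluster"
SERVICE_NAME="openthinker-7b-service"
REPO_NAME="openthinker-7b"

# Tear down the service, cluster, and image repository, stopping at the first failure.
cleanup() {
    aws ecs delete-service --cluster "$CLUSTER_NAME" --service "$SERVICE_NAME" --force &&
    aws ecs wait services-inactive --cluster "$CLUSTER_NAME" --services "$SERVICE_NAME" &&
    aws ecs delete-cluster --cluster "$CLUSTER_NAME" &&
    aws ecr delete-repository --repository-name "$REPO_NAME" --force
}
```

    Note that the --force flag on delete-repository removes all images stored in the repository.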

    Conclusion

    Deploying OpenThinker 7B on AWS using ECS Fargate provides a fully managed, serverless environment with minimal setup and maintenance. By combining Amazon ECR, Amazon ECS, and AWS Fargate, you can run large language models without managing the underlying infrastructure.

    Experts in AI, ML, and automation at OneClick IT Consultancy

    AI Force

    AI Force at OneClick IT Consultancy pioneers artificial intelligence and machine learning solutions. We drive COE initiatives by developing intelligent automation, predictive analytics, and AI-driven applications that transform businesses.
