Amazon SageMaker AI Pricing: Detailed Breakdown and Ultimate Guide

December 19, 2024

Introduction

Machine learning (ML) is revolutionizing industries by enabling predictive analytics, automation, and personalized experiences. However, developing, training, and deploying ML models can be both complex and costly. AWS SageMaker offers a fully managed service to streamline these processes, but understanding its pricing structure is essential to maximize its benefits.

In this guide, we'll break down the Amazon SageMaker pricing components and provide practical examples to help you manage costs effectively while taking full advantage of its powerful features.

What is AWS SageMaker?

At its core, AWS SageMaker is a fully managed machine learning service that empowers developers and data scientists to quickly build, train, and deploy models at scale. It offers integrated tools such as Jupyter notebooks, distributed training, and real-time endpoints, all designed to simplify the ML lifecycle.

By handling the complexity of infrastructure management, SageMaker allows businesses to focus on innovation while keeping costs under control. However, its pricing structure has multiple components, each of which is covered in the sections below.

AWS SageMaker Pricing Breakdown

Free Tier for SageMaker AI

Amazon SageMaker AI provides a Free Tier to help you get started at no cost for the first two months. Below are the details of the free usage included each month:

  • Studio Notebooks and Notebook Instances: 250 hours on ml.t3.medium or ml.t2.medium instances.
  • RStudio on SageMaker: 250 hours on ml.t3.medium for the RSession app and a free ml.t3.medium instance for the RStudioServerPro app.
  • Data Wrangler: 25 hours on an ml.m5.4xlarge instance.
  • Feature Store: 10 million write units, 10 million read units, and 25 GB of storage (standard online store).
  • Training: 50 hours on ml.m4.xlarge or ml.m5.xlarge instances.
  • SageMaker with TensorBoard: 300 hours on an ml.r5.large instance.
  • Real-Time Inference: 125 hours on ml.m4.xlarge or ml.m5.xlarge instances.
  • Serverless Inference: 150,000 seconds of on-demand inference duration.
  • Canvas: 160 hours per month for session time.
  • HyperPod: 50 hours on an ml.m5.xlarge instance.

The SageMaker Free Tier is available starting from the first month when you create your SageMaker AI resource.
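To see how a monthly allowance offsets usage, here is a minimal Python sketch that subtracts the free hours listed above from the hours you actually used. The dictionary keys and the function name are illustrative, not AWS identifiers, and the sketch assumes the usage is on an eligible instance type and falls within the first two months.

```python
# Monthly free hours taken from the Free Tier list above.
# Keys are hypothetical labels chosen for this example.
FREE_TIER_HOURS = {
    "studio_notebooks": 250,
    "data_wrangler": 25,
    "training": 50,
    "tensorboard": 300,
    "real_time_inference": 125,
    "canvas": 160,
    "hyperpod": 50,
}


def billable_hours(component: str, used_hours: float) -> float:
    """Hours left to pay for after the monthly free allowance is used up."""
    free = FREE_TIER_HOURS[component]
    return max(0.0, float(used_hours) - free)


print(billable_hours("training", 65))           # 15.0
print(billable_hours("studio_notebooks", 180))  # 0.0
```

Only the hours beyond the allowance are then charged at the on-demand rate for the instance type.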

SageMaker Components Billed by Instance Type and Duration (On-Demand)

SageMaker components are priced based on the type of instance you select and the length of time you use them. These services utilize fully managed compute infrastructure, allowing you to concentrate on building, training, and deploying machine learning models without worrying about the underlying resources. Costs vary depending on the instance family—such as CPU, GPU, or memory-optimized—and the duration the instances are active.

Note: All pricing details are based on the US East (N. Virginia) region.
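For every component in this group the arithmetic is the same: hourly rate times hours times number of instances. The sketch below is one way to express that in Python. The function name is illustrative, and the example uses the ml.m5.large rate of $0.115/hour quoted in the "Popular Instance Types and Pricing" section of this guide. Actual invoices may differ because of billing granularity, storage, or data transfer, which this sketch ignores.

```python
from decimal import Decimal


def on_demand_cost(hourly_rate: str, hours: str, instance_count: int = 1) -> Decimal:
    """Return hourly_rate x hours x instance_count in USD, rounded to cents."""
    total = Decimal(hourly_rate) * Decimal(hours) * instance_count
    return total.quantize(Decimal("0.01"))


# Two ml.m5.large instances at $0.115/hour, each running for 40 hours.
print(on_demand_cost("0.115", "40", instance_count=2))  # 9.20
```

Passing rates as strings into Decimal avoids the small rounding errors that binary floats can introduce into currency totals.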

1. SageMaker Studio Classic

SageMaker Studio Classic provides a legacy IDE with one-step Jupyter notebooks, enabling interactive development and seamless collaboration without managing underlying compute resources. SageMaker Studio pricing is based on the selected instance type and duration of use.

2. SageMaker JupyterLab

SageMaker JupyterLab provides a fully managed, web-based interactive development environment for notebooks, code, and data, enabling quick launches and seamless development. Pricing is based on the selected instance type and duration of use.

3. SageMaker Code Editor

SageMaker Code Editor, built on Code-OSS (Visual Studio Code – Open Source), allows you to write, test, debug, and run analytics and ML code. It integrates with SageMaker Studio and supports extensions from the Open VSX registry.

4. SageMaker RStudio

SageMaker RStudio provides on-demand cloud computing resources to accelerate model development and enhance productivity. Pricing is based on the instance types used for running the RStudio Session app and RStudio Server Pro app.

5. SageMaker Notebook Instances

SageMaker Notebook Instances are compute instances running the Jupyter Notebook app, providing an environment for developing and testing ML models. Pricing is based on the selected instance type and duration of use.

6. SageMaker Processing

SageMaker Processing allows you to run pre-processing, post-processing, and model evaluation workloads on fully managed infrastructure. Pricing is based on the selected instance type and duration of use.

7. SageMaker with TensorBoard

SageMaker with TensorBoard offers a hosted environment to visualize and debug model convergence issues for SageMaker training jobs, enabling better insights into model performance.

8. SageMaker Training

SageMaker Training simplifies training, tuning, and debugging ML models with a fully managed infrastructure. SageMaker Training pricing is based on the instance type used during training. Built-in debugging rules are free, while custom rules incur charges based on the selected instance type and duration of use.

9. SageMaker Real-Time Inference (Endpoints)

SageMaker Hosting provides real-time inference for use cases requiring immediate predictions. SageMaker endpoint pricing is based on the chosen instance type. Built-in Model Monitor rules include 30 free hours of monitoring, after which charges apply based on usage duration. Custom rules incur separate charges based on the instance type and duration of use.

10. SageMaker Asynchronous Inference

SageMaker Asynchronous Inference processes large payloads and models with long inference times, queuing requests for near-real-time predictions. SageMaker inference pricing is based on the selected instance type and duration of use.

11. SageMaker Batch Transform

SageMaker Batch Transform enables running predictions on large or small datasets without managing real-time endpoints or splitting data into chunks. Pricing is based on the selected instance type and duration of use.

12. SageMaker JumpStart

SageMaker JumpStart provides one-click access to popular pre-built models and end-to-end ML solutions for common use cases. SageMaker JumpStart pricing doesn’t include additional charges; you pay only for the underlying Training and Inference instance hours used.

13. SageMaker HyperPod

SageMaker HyperPod accelerates foundation model (FM) development with resilient training, automatic fault recovery, and frequent checkpointing. It includes distributed training libraries to maximize cluster performance. SageMaker HyperPod pricing covers its usage but excludes charges for connected services like Amazon EKS, FSx for Lustre, and S3.

SageMaker Components with Alternative Pricing Models

The following SageMaker components include pricing mechanisms beyond traditional instance-based models. While some elements might still incorporate instance-based pricing, these services often introduce usage-based components, such as charges per request, per unit of data processed, or for completed optimization jobs. These models provide flexibility and cost efficiency for specific tasks like inference optimization, serverless inference, or data storage.

1. SageMaker Data Wrangler

SageMaker Data Wrangler simplifies and accelerates data aggregation, preparation, and visualization for machine learning, significantly reducing the time needed for these tasks. Pricing primarily depends on the instance type and duration used to cleanse, explore, and visualize data. For workflows using the SageMaker Canvas workspace, Canvas-specific pricing options apply (see Canvas pricing below for details).

2. SageMaker Feature Store

SageMaker Feature Store is a central repository for ingesting, storing, and serving ML features. Pricing includes charges for data storage, writes, and reads, with different rates for the standard online store and in-memory online store. Throughput can be billed on-demand or via provisioned capacity, depending on your needs.

Pricing:

Standard Online Store:

  • Storage: $0.45 per GB-month.
  • Write Requests (on-demand): $1.25 per million write units (1 KB each).
  • Read Requests (on-demand): $0.25 per million read units (4 KB each).

Provisioned Capacity:

  • Write Capacity Unit (WCU): $0.00065 per WCU-hour.
  • Read Capacity Unit (RCU): $0.00013 per RCU-hour.

In-Memory Online Store:

  • Storage: $0.233 per GB-hour (minimum 5 GiB charge per hour).
  • Write Requests: $0.0249 per million write units.
  • Read Requests: $0.0224 per million read units.
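As a rough illustration, the following Python sketch estimates a monthly bill for the standard online store with on-demand throughput, using the rates listed above. The function name and the example workload are hypothetical. The sketch also assumes that each request is rounded up to whole units (1 KB per write unit, 4 KB per read unit), which is an assumption about metering rather than something stated in this guide.

```python
import math
from decimal import Decimal

# Standard online store, on-demand rates from the list above (USD).
STORAGE_PER_GB_MONTH = Decimal("0.45")
WRITE_PER_MILLION_UNITS = Decimal("1.25")
READ_PER_MILLION_UNITS = Decimal("0.25")


def feature_store_monthly_cost(storage_gb, writes, reads, record_kb):
    """Estimate monthly cost for the standard online store (on-demand)."""
    # Assumption: each request is rounded up to whole units.
    write_units = writes * math.ceil(record_kb / 1)  # 1 KB per write unit
    read_units = reads * math.ceil(record_kb / 4)    # 4 KB per read unit

    storage = STORAGE_PER_GB_MONTH * Decimal(str(storage_gb))
    write_cost = WRITE_PER_MILLION_UNITS * write_units / 1_000_000
    read_cost = READ_PER_MILLION_UNITS * read_units / 1_000_000
    return storage + write_cost + read_cost


# 100 GB stored, 50 million writes and 200 million reads of 2 KB records.
print(feature_store_monthly_cost(100, 50_000_000, 200_000_000, 2))  # 220.00
```

In this example writes dominate the bill ($125 of $220) because a 2 KB record consumes two write units but only one read unit.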

3. SageMaker with MLflow

SageMaker with MLflow enables cost-effective tracking of ML experiments. Pricing is based on the compute instance size, the duration the MLflow Tracking Server runs, and charges for metadata storage.

Pricing:

MLflow Tracking Server Compute:

  • Small: $0.60 per hour
  • Medium: $1.04 per hour
  • Large: $1.91 per hour

MLflow Tracking Server Storage:

  • Storage Cost: $0.10 per GB-month

Charges are based on the size of the Tracking Server, the duration it runs, and the amount of metadata stored.
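A short Python sketch of that calculation, using the rates above. The function name is illustrative, and 730 hours is used as an approximation of one month of continuous running.

```python
from decimal import Decimal

# Hourly Tracking Server rates and storage rate from the lists above (USD).
TRACKING_SERVER_HOURLY = {
    "small": Decimal("0.60"),
    "medium": Decimal("1.04"),
    "large": Decimal("1.91"),
}
STORAGE_PER_GB_MONTH = Decimal("0.10")


def mlflow_monthly_cost(size: str, hours_running, metadata_gb) -> Decimal:
    """Compute plus metadata storage cost for one Tracking Server."""
    compute = TRACKING_SERVER_HOURLY[size] * Decimal(str(hours_running))
    storage = STORAGE_PER_GB_MONTH * Decimal(str(metadata_gb))
    return compute + storage


# A Small server left running all month (about 730 hours) with 5 GB of metadata.
print(mlflow_monthly_cost("small", 730, 5))  # 438.50
```

Compute is by far the larger term here, so running time usually matters much more than metadata volume.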

4. SageMaker Serverless Inference

SageMaker Serverless Inference allows you to deploy ML models without managing infrastructure. SageMaker Serverless Inference pricing is based on the compute capacity used (billed per millisecond) and the amount of data processed. Costs depend on the selected memory configuration, with an option to add Provisioned Concurrency for predictable performance.

Pricing:

On-Demand Serverless Inference:

  • Compute Pricing (per second, based on memory configuration):
    • 1024 MB: $0.0000200
    • 2048 MB: $0.0000400
    • 3072 MB: $0.0000600
  • Data Processing:
    • Data Processed IN: $0.016 per GB
    • Data Processed OUT: $0.016 per GB

Provisioned Concurrency:

Provisioned Concurrency ensures predictable performance by keeping endpoints warm for concurrent requests. Charges include compute usage and concurrency provisioning:

  • Provisioned Concurrency Usage (per second):
    • 1024 MB: $0.0000050
    • 2048 MB: $0.0000100
    • 3072 MB: $0.0000150
  • Inference Duration (per second):
    • 1024 MB: $0.0000117
    • 2048 MB: $0.0000233
    • 3072 MB: $0.0000350

SageMaker Serverless Inference is ideal for workloads with unpredictable or intermittent traffic, offering a cost-effective alternative to managing infrastructure.
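To make the on-demand model concrete, here is a minimal Python sketch that combines the per-second compute rate with the data processing charge. The rate table is keyed by memory size in MB, using the same 1024, 2048, and 3072 MB configurations listed for Provisioned Concurrency. The function name and the example workload are hypothetical, and the sketch assumes no rounding of individual request durations.

```python
from decimal import Decimal

# On-demand compute price per second, keyed by memory configuration in MB.
ON_DEMAND_PER_SECOND = {
    1024: Decimal("0.0000200"),
    2048: Decimal("0.0000400"),
    3072: Decimal("0.0000600"),
}
DATA_PER_GB = Decimal("0.016")  # same rate for data processed in and out


def serverless_on_demand_cost(memory_mb, requests, avg_duration_ms,
                              data_in_gb, data_out_gb) -> Decimal:
    """Estimate on-demand Serverless Inference cost for a workload."""
    total_seconds = (
        Decimal(requests) * Decimal(str(avg_duration_ms)) / Decimal(1000)
    )
    compute = total_seconds * ON_DEMAND_PER_SECOND[memory_mb]
    data = (Decimal(str(data_in_gb)) + Decimal(str(data_out_gb))) * DATA_PER_GB
    return compute + data


# 1 million requests at 200 ms each on 2048 MB, with 10 GB in and 2 GB out.
cost = serverless_on_demand_cost(2048, 1_000_000, 200, 10, 2)
print(cost.quantize(Decimal("0.001")))  # 8.192
```

Because you pay only for inference duration, an idle endpoint costs nothing under this model, which is why it suits intermittent traffic.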

5. SageMaker Profiler

SageMaker Profiler helps data scientists and engineers identify hardware performance bottlenecks by visualizing high-resolution CPU and GPU trace plots, reducing training time and cost. Currently, it supports profiling for ml.g4dn.12xlarge, ml.p3dn.24xlarge, and ml.p4d.24xlarge instance types.

Note: SageMaker Profiler is in preview and available free of charge to customers in supported regions, including the US East (Ohio, N. Virginia), US West (Oregon), Europe (Frankfurt, Ireland), and Israel (Tel Aviv).

6. SageMaker Inference Optimization Toolkit

The SageMaker Inference Optimization Toolkit simplifies the implementation of state-of-the-art (SOTA) optimization techniques, enabling improved cost performance for model inference. It allows you to run optimization jobs, benchmark models for performance and accuracy, and deploy optimized models to SageMaker endpoints for inference, saving significant developer time.

Pricing:

Optimization Instance Pricing (per hour):

  • ml.g5.48xlarge: $20.36
  • ml.g6.48xlarge: $16.688
  • ml.inf2.48xlarge: $15.58

The toolkit enables you to achieve state-of-the-art cost performance while saving months of development time.

Note: AWS SageMaker also offers Partner AI Applications that can accelerate ML development. You can explore their pricing and capabilities on the SageMaker Pricing Page.

7. SageMaker Canvas

Purpose: SageMaker Canvas empowers users to build, evaluate, and deploy production-ready machine learning models at scale without writing code. It streamlines the end-to-end ML lifecycle in a secure, collaborative environment, fostering transparency and governance through model versioning and access controls.

Pricing:

  1. Workspace Instance (Session-Hours):
    • Charges: $1.90 per hour.
    • Details: Billing starts when you launch the SageMaker Canvas application and ends when you log out or when an administrator terminates the session.
  2. Data Processing Charges:
    • For datasets larger than 5 GB, SageMaker Canvas utilizes Amazon EMR Serverless, billed for vCPU, memory, and storage consumed.
  3. Custom Model Training Charges:
    • Based on compute resources used (e.g., ml.m5.12xlarge, ml.c5.18xlarge).
  4. Ready-to-Use Model Charges:
    • When using ready-to-use models powered by Amazon AI services, additional charges apply as per their pricing.

Example: For a team of 5 users utilizing SageMaker Canvas for 10 hours each and processing 1,000,000 rows:

  • Workspace Instance Charges: 5 users x 10 hours/user x $1.90/hour = $95.00
  • Model Prediction Charges: 1,000,000 rows x $0.00025/row = $250.00
  • Total Monthly Cost: $95.00 + $250.00 = $345.00
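The Canvas example can be parameterized with a few lines of Python so you can plug in your own team size and row counts. This is a sketch only: the function name is illustrative, and the per-row prediction rate is the one used in the example above, so check it against the current pricing page before relying on it.

```python
from decimal import Decimal

SESSION_RATE_PER_HOUR = Decimal("1.90")  # workspace instance, per session-hour
PREDICTION_RATE_PER_ROW = Decimal("0.00025")  # rate used in the example above


def canvas_cost(users: int, hours_per_user, rows_predicted: int):
    """Return (session charges, prediction charges, total) in USD."""
    session = SESSION_RATE_PER_HOUR * users * Decimal(str(hours_per_user))
    prediction = PREDICTION_RATE_PER_ROW * rows_predicted
    return session, prediction, session + prediction


session, prediction, total = canvas_cost(5, 10, 1_000_000)
print(session.quantize(Decimal("0.01")))     # 95.00
print(prediction.quantize(Decimal("0.01")))  # 250.00
print(total.quantize(Decimal("0.01")))       # 345.00
```

Session charges scale with logged-in time rather than active work, so logging out of idle sessions directly reduces the first term.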

8. SageMaker Autopilot

Purpose:

Amazon SageMaker Autopilot automates the process of building, training, and tuning machine learning models, enabling users to create high-quality models without requiring deep ML expertise. It analyzes your data, selects the best algorithms, performs feature engineering, and optimizes hyperparameters to deliver an optimal model for your specific use case.

Features:

  • Automated Workflow: Autopilot handles everything from data preprocessing to algorithm selection and hyperparameter tuning.
  • Integration with SageMaker Canvas: Autopilot is integrated into SageMaker Canvas, providing a unified no-code environment for model development and deployment.
  • Built-In Visualizations: Autopilot offers visual insights into data transformations, model performance, and comparisons for easy evaluation.
  • Customizability: Users with coding expertise can access and fine-tune Autopilot workflows via AWS SDKs and APIs.

Pricing:

SageMaker Autopilot charges for the compute instances and duration used during training, feature engineering, and hyperparameter tuning. For inference, standard SageMaker endpoint hosting fees apply.

With SageMaker Autopilot, businesses can efficiently build production-ready machine learning models, accelerating innovation while reducing development time.

Example: For a user running an Autopilot job to train models using an ml.c5.xlarge instance for 3 hours and performing hyperparameter tuning on an ml.m5.2xlarge instance for 2 hours:

  • Training Job Charges: 3 hours x $0.204/hour = $0.612
  • Hyperparameter Tuning Charges: 2 hours x $0.460/hour = $0.92
  • Total Cost: $0.612 + $0.92 = $1.532

Popular Instance Types and Pricing

Amazon SageMaker provides a wide variety of instance types optimized for different machine-learning workloads. Below are some popular instance types and their pricing.

General Purpose Instances

  • ml.m5.large: $0.115/hour (2 vCPUs, 8 GiB memory)
    • Use Case: Suitable for lightweight training and hosting workloads.

Compute-Optimized Instances

  • ml.c5.xlarge: $0.204/hour (4 vCPUs, 8 GiB memory)
    • Use Case: Ideal for CPU-intensive tasks like data processing and training small models.

GPU Instances

  • ml.p3.2xlarge: $3.825/hour (1 NVIDIA V100 GPU, 8 vCPUs, 61 GiB memory)
    • Use Case: Best for training deep learning models and running inference on large datasets.

Memory-Optimized Instances

  • ml.r5.xlarge: $0.302/hour (4 vCPUs, 32 GiB memory)
    • Use Case: Suitable for memory-intensive workloads such as large datasets and feature engineering.

Inference-Optimized Instances

  • ml.g4dn.xlarge: $0.736/hour (1 NVIDIA T4 GPU, 4 vCPUs, 16 GiB memory)
    • Use Case: Optimized for real-time inference and cost-efficient deployments.

These instance options allow you to balance performance and cost based on your specific machine learning workloads. To explore more, refer to the AWS SageMaker AI Pricing Page.
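Balancing performance and cost means comparing total job cost, not hourly rate. The Python sketch below uses the rates listed above and picks the cheapest option for a job. The runtime estimates in the example are made up for illustration; in practice you would measure them with a short benchmark run.

```python
from decimal import Decimal

# Hourly on-demand rates from the list above (USD, US East (N. Virginia)).
HOURLY_RATE = {
    "ml.m5.large": Decimal("0.115"),
    "ml.c5.xlarge": Decimal("0.204"),
    "ml.p3.2xlarge": Decimal("3.825"),
    "ml.r5.xlarge": Decimal("0.302"),
    "ml.g4dn.xlarge": Decimal("0.736"),
}


def cheapest_option(estimated_hours: dict) -> tuple:
    """Return (instance type, cost) with the lowest total cost for a job.

    estimated_hours maps instance type -> estimated runtime in hours (as str).
    """
    costs = {
        name: HOURLY_RATE[name] * Decimal(hours)
        for name, hours in estimated_hours.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical runtimes for the same training job on three instance types.
runtimes = {"ml.c5.xlarge": "20", "ml.g4dn.xlarge": "4", "ml.p3.2xlarge": "1.5"}
print(cheapest_option(runtimes))  # ('ml.g4dn.xlarge', Decimal('2.944'))
```

With these assumed runtimes, neither the lowest hourly rate (ml.c5.xlarge, $4.08 total) nor the fastest instance (ml.p3.2xlarge, about $5.74 total) is the cheapest choice.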

Amazon SageMaker Savings Plans

Besides the on-demand pricing, there is also the option to use Savings Plans with SageMaker to reduce your costs by up to 64%.

What Are SageMaker Savings Plans?

SageMaker Savings Plans offer flexible pricing that automatically applies discounts to your eligible SageMaker usage. This includes services like SageMaker Studio notebooks, notebook instances, Processing jobs, Data Wrangler, Training, Real-Time Inference, and Batch Transform. The plans work across any instance family, size, or region, providing you with the flexibility to adapt your infrastructure without worrying about varying costs.

How Do Savings Plans Work?

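With a Savings Plan, you commit to a consistent amount of usage, measured in dollars per hour, for a one- or three-year term, and the discounted rate is applied automatically to eligible usage up to that commitment. For example, if you start with an ml.c5.xlarge CPU instance in US East (Ohio) and later switch to an ml.inf1 (AWS Inferentia) instance in US West (Oregon) for inference tasks, your Savings Plan continues to apply the discounted rate automatically. This ensures consistent savings regardless of changes in your instance types or regions.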

For more information, visit the AWS SageMaker Pricing Page or use the AWS SageMaker Pricing Calculator to estimate your potential savings.
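For a quick back-of-the-envelope estimate before opening the calculator, the Python sketch below applies a discount rate to a monthly on-demand spend. The function name is illustrative. The 64% figure is the upper bound quoted above, and the discount you actually receive is an input you must look up, since it is assumed here to vary by instance type, term, and payment option.

```python
from decimal import Decimal

MAX_DISCOUNT = Decimal("0.64")  # "up to 64%" as quoted in this guide


def savings_plan_estimate(on_demand_monthly: str, discount: str) -> dict:
    """Estimate discounted spend and savings for a given discount rate."""
    on_demand = Decimal(on_demand_monthly)
    rate = Decimal(discount)
    if not (Decimal("0") <= rate <= MAX_DISCOUNT):
        raise ValueError("discount must be between 0 and 0.64")
    discounted = on_demand * (1 - rate)
    return {"discounted": discounted, "saved": on_demand - discounted}


# $1,000/month of eligible on-demand usage at the maximum quoted discount.
estimate = savings_plan_estimate("1000", "0.64")
print(estimate["discounted"])  # 360.00
print(estimate["saved"])       # 640.00
```

Treat the result as a best case: it assumes all of the usage is eligible and fully covered by the plan.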

Conclusion

AWS SageMaker offers a comprehensive and flexible platform for building, training, and deploying machine learning models at scale. By understanding its pricing structure, including on-demand options and Savings Plans, businesses can effectively manage costs while leveraging SageMaker's powerful features. Whether you're initiating new ML projects or scaling existing ones, SageMaker provides the tools and flexibility needed to support your machine learning initiatives efficiently and economically.

Tip: Use the AWS SageMaker Pricing Calculator to accurately estimate your costs based on your specific workloads and usage.