New serverless customization in Amazon SageMaker AI accelerates model fine-tuning

By Dustin Ward

Today, I’m happy to announce new serverless customization in Amazon SageMaker AI for popular AI models, such as Amazon Nova, DeepSeek, GPT-OSS, Llama, and Qwen. The new customization capability provides an easy-to-use interface for the latest fine-tuning techniques, such as reinforcement learning, so you can shorten the AI model customization process from months to days. With…

Introducing checkpointless and elastic training on Amazon SageMaker HyperPod

By Dustin Ward

Today, we’re announcing two new AI model training features within Amazon SageMaker HyperPod: checkpointless training, an approach that reduces the need for traditional checkpoint-based recovery by enabling peer-to-peer state recovery, and elastic training, which lets AI workloads scale automatically based on resource availability.

Checkpointless training – Checkpointless training eliminates disruptive checkpoint-restart cycles, maintaining forward training…

Announcing replication support and Intelligent-Tiering for Amazon S3 Tables

By Dustin Ward

Today, we’re announcing two new capabilities for Amazon S3 Tables: support for the new Intelligent-Tiering storage class, which automatically optimizes costs based on access patterns, and replication support, which automatically maintains consistent Apache Iceberg table replicas across AWS Regions and accounts without manual synchronization. Organizations working with tabular data face two common challenges. First, they…

Build multi-step applications and AI workflows with AWS Lambda durable functions

By Dustin Ward

Modern applications increasingly require complex and long-running coordination between services, such as multi-step payment processing, AI agent orchestration, or approval processes awaiting human decisions. Traditionally, building these applications required significant effort to implement state management, handle failures, and integrate multiple infrastructure services. Starting today, you can use AWS Lambda durable functions to build reliable multi-step applications…