Crafting serverless streaming ETL jobs with AWS Glue

By Dustin Ward

Amazon Web Services Feed: Organizations across verticals have been building streaming-based extract, transform, and load (ETL) applications to more efficiently extract meaningful insights from their datasets. Although streaming ingest and stream processing frameworks have evolved over the past few years, there is now a surge in demand for…

Automating EMR workloads using AWS Step Functions

By Dustin Ward

Amazon EMR allows you to process vast amounts of data quickly and cost-effectively at scale. Using open-source tools such as Apache Spark, Apache Hive, and Presto, and coupled with the scalable storage of Amazon Simple Storage Service (Amazon S3), Amazon EMR gives analytical teams the…

Using speaker diarization for streaming transcription with Amazon Transcribe and Amazon Transcribe Medical

By Dustin Ward

Conversational audio data that requires transcription, such as phone calls, doctor visits, and online meetings, often has multiple speakers. In these use cases, it’s important to accurately label each speaker and associate them with the audio content delivered. For…

AWS Glue supports reading from self-managed Apache Kafka

By Dustin Ward

Streaming extract, transform, and load (ETL) jobs in AWS Glue can now ingest data from Apache Kafka clusters that you manage yourself. Previously, AWS Glue supported reading specifically from Amazon Managed Streaming for Apache Kafka (Amazon MSK). With this update, AWS Glue allows you…

Event-driven refresh of SPICE datasets in Amazon QuickSight

By Dustin Ward

Businesses are increasingly harnessing data to improve their outcomes. To enable this transformation to a data-driven business, customers are bringing together data from structured and unstructured sources into a data lake. Then they use business intelligence (BI) tools, such as Amazon QuickSight, to…