AWS re:Invent 2020 Week 2: Amazon Introduces SageMaker Clarify

essidsolutions

The second week of AWS re:Invent 2020 saw a host of machine learning and big data products. Let’s look at the major announcements from the second week of the event.

AWS re:Invent 2020, a cloud computing event hosted by AWS for the global cloud community, entered its second week with several launches and informative sessions. This week, the cloud behemoth focused on machine learning and big data. The company announced new SageMaker capabilities, a data lake for the healthcare industry, and a host of big data services. Let’s look at this week’s major announcements.

1. Amazon Announced New SageMaker Capabilities

The company has added new features to the Amazon SageMaker platform, a machine learning service for developers to automate and scale their end-to-end machine learning workflow.

  • Data Wrangler

Data Wrangler is a new service that automates data preparation and feature engineering (transforming data into features). With this, developers can prepare data for training machine learning models faster. It contains over 300 built-in data transformers that can help customers normalize, transform, and combine features without writing any code. Customers can preview and inspect these transformations in SageMaker Studio and reuse the engineered features in SageMaker Feature Store.
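To make "normalize, transform, and combine features" concrete, here is a minimal sketch in plain Python of the kind of transformation such a tool automates. It does not use Data Wrangler or any AWS API; the column names, sample values, and helper functions are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no signal; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_ratio(numerator: List[float], denominator: List[float]) -> List[float]:
    """Combine two columns into one ratio feature, guarding against division by zero."""
    return [n / d if d != 0 else 0.0 for n, d in zip(numerator, denominator)]


# Hypothetical raw columns, invented for this example.
raw: Dict[str, List[float]] = {
    "total_spend": [120.0, 300.0, 60.0, 480.0],
    "num_orders": [4.0, 10.0, 3.0, 12.0],
}

# Engineered features: one normalized column and one combined column.
features: Dict[str, List[float]] = {
    "total_spend_scaled": min_max_normalize(raw["total_spend"]),
    "avg_order_value": combine_ratio(raw["total_spend"], raw["num_orders"]),
}

print(features)
```

A no-code tool would apply equivalent steps through a visual interface; the sketch only shows what the resulting features look like.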

  • SageMaker Feature Store 

SageMaker Feature Store is a repository that enables developers to easily store, update, retrieve, and share machine learning features for training and inference. Since the feature store resides in SageMaker Studio, it provides single-digit millisecond latency for inference.

  • SageMaker Pipelines

SageMaker Pipelines is a workflow management and automation toolkit that will enable developers to log each step of an end-to-end machine learning workflow. Developers can use Pipelines to easily re-run an end-to-end workflow from SageMaker Studio using the same settings to get the same model or re-run the workflow with new data inputs for a new, updated model.

  • SageMaker Clarify

SageMaker Clarify addresses ongoing concerns about bias in machine learning models. It helps data scientists detect statistical bias across the machine learning workflow and explain the models’ predictions, so that teams can build fairer and more transparent models.
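As an illustration of what "statistical bias" can mean in a dataset, the sketch below computes two simple, generic measures in plain Python: how unevenly two groups are represented, and how much their rates of positive labels differ. This is an assumption-laden example, not Clarify's implementation; the article does not say which metrics the service computes, and the group names and labels here are invented.

```python
from typing import List


def class_imbalance(groups: List[str], advantaged: str) -> float:
    """(n_a - n_d) / (n_a + n_d): 0 means equal representation, +/-1 means one group only."""
    if not groups:
        raise ValueError("groups must not be empty")
    n_a = sum(1 for g in groups if g == advantaged)
    n_d = len(groups) - n_a
    return (n_a - n_d) / (n_a + n_d)


def positive_rate_gap(groups: List[str], labels: List[int], advantaged: str) -> float:
    """Positive-label rate of the advantaged group minus that of everyone else."""
    a = [y for g, y in zip(groups, labels) if g == advantaged]
    d = [y for g, y in zip(groups, labels) if g != advantaged]
    if not a or not d:
        raise ValueError("both groups need at least one sample")
    return sum(a) / len(a) - sum(d) / len(d)


# Hypothetical dataset: group membership and a binary outcome label (1 = positive).
groups = ["a", "a", "a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

ci = class_imbalance(groups, advantaged="a")
gap = positive_rate_gap(groups, labels, advantaged="a")
print(f"class imbalance: {ci:.3f}, positive-rate gap: {gap:.3f}")
```

Values far from zero on either measure suggest the training data may lead a model to treat the two groups differently, which is the kind of signal a bias-detection tool surfaces for review.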

  • Deep Profiling for SageMaker Debugger

Simplifying the machine learning development process, Amazon announced new deep profiling for SageMaker Debugger. This will enable data scientists to train models faster by automatically monitoring system resource utilization and providing alerts for training bottlenecks. The deep profiling works across frameworks, including PyTorch, Apache MXNet, and TensorFlow, and automatically collects training metrics from the scripts without any code changes. This enables developers to visualize how their system resources were used during model training.

  • Distributed Training on Amazon SageMaker

Another important feature added to Amazon SageMaker is distributed training, which enables developers to train complex deep learning models up to two times faster than current approaches. Aimed at large and complex models, it offers two capabilities, data parallelism and model parallelism, which split large datasets or models across multiple GPUs to train models faster.
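The idea behind data parallelism can be sketched in a few lines of plain Python. The example below is a single-process simulation under simplifying assumptions (a one-parameter linear model, equally sized shards, workers run one after another); it does not use SageMaker's libraries, and every function name in it is hypothetical. Each "worker" computes a gradient on its own shard of the data, the gradients are averaged, and all workers apply the same update.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def gradient(w: float, batch: List[Sample]) -> float:
    """Gradient of mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def shard(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Split data into equally sized contiguous shards, one per worker."""
    if len(data) % num_workers != 0:
        raise ValueError("this sketch assumes the data divides evenly")
    size = len(data) // num_workers
    return [data[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_step(w: float, data: List[Sample], num_workers: int, lr: float) -> float:
    """One training step: per-shard gradients, averaged, then a shared update."""
    # On real hardware each shard's gradient would be computed on a separate GPU.
    local_grads = [gradient(w, s) for s in shard(data, num_workers)]
    # Averaging stands in for the all-reduce step that synchronizes workers.
    avg_grad = sum(local_grads) / num_workers
    return w - lr * avg_grad


# Toy data generated from y = 2x, so training should recover w close to 2.
data: List[Sample] = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

w = 0.0
for _ in range(50):
    w = data_parallel_step(w, data, num_workers=2, lr=0.05)

print(f"learned weight: {w:.6f}")
```

With equally sized shards, the averaged gradient equals the gradient over the full batch, so the result matches single-machine training while the per-worker work shrinks. Model parallelism, by contrast, splits the model itself across devices and is not shown here.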

  • SageMaker Edge Manager

SageMaker Edge Manager is a machine learning management tool for developers to optimize, secure, monitor, and manage models deployed on fleets of edge devices. Customers can cryptographically sign their models and upload prediction data from their devices to SageMaker for monitoring and analysis. Furthermore, Edge Manager provides a dashboard within the SageMaker console that tracks and visually reports on the operation of the deployed models.

  • SageMaker JumpStart 

SageMaker JumpStart provides developers with a searchable interface with solutions, algorithms, and sample notebooks to help them quickly start their machine learning journey.

2. AWS Unveiled Amazon HealthLake

The cloud giant launched Amazon HealthLake, a fully managed service (in preview) that lets healthcare providers store, transform, and analyze their data in the cloud. The new service is compliant with the Health Insurance Portability and Accountability Act (HIPAA) and uses machine learning to extract meaningful insights from unstructured health data.

Healthcare providers can then query, search, and analyze the data to identify trends, spot anomalies, and make predictions about patient health. Existing customers include Cerner, Ciox Health, Konica Minolta Precision Medicine (KMPM), and Orion Health.

Swami Sivasubramanian, VP of Amazon machine learning for AWS, said, “With Amazon HealthLake, healthcare organizations can reduce the time it takes to transform health data in the cloud from weeks to minutes so that it can be analyzed securely, even at petabyte scale. This completely reinvents what’s possible with healthcare and brings us that much closer to everyone’s goal of providing patients with more personalized and predictive treatment for individuals and across entire populations.”

3. New Features Added to Amazon CodeGuru

Amazon CodeGuru, a developer tool, now has three new capabilities across its two main components, Reviewer and Profiler.

  • Python support for CodeGuru Reviewer and Profiler

CodeGuru now offers support for applications written in Python. Available in preview mode, it offers recommendations to improve and tune Python code, which will reduce infrastructure costs and improve application performance.

  • Security detectors for CodeGuru Reviewer

These detectors identify security vulnerabilities in Java code and offer remediations when an issue is detected. With these detectors in place, security engineers can focus on application-specific security best practices, and code reviewers can focus on other improvements.

  • Memory profiling for CodeGuru Profiler  

Memory profiling identifies memory leaks in an application and provides necessary methods to optimize its memory usage.

4. Amazon Lookout for Metrics

Amazon Lookout for Metrics is an anomaly detection service that uses machine learning to detect anomalies in metrics and monitor the health of a customer’s business. The service will help organizations detect defects early, save material costs, and improve customer experience. Existing customers include Digitata Networks, NextRoll, and Playrix.

Amazon Lookout for Metrics is now available in preview in the U.S., Asia Pacific, and Europe.

Alex Casalboni, a developer advocate at AWS, said, “Because Amazon Lookout for Metrics is a fully managed service, it takes care of the whole ML process so you can get started quickly and focus on your core business. And most importantly, the service improves model performance continually by incorporating your real-time feedback on the accuracy and relevance of the anomalies and root cause analysis.”
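For readers unfamiliar with metric anomaly detection, the sketch below shows the basic idea with a simple statistical rule in plain Python: flag a value when it sits far from the recent average. This is a deliberately simple stand-in, not the machine learning approach the service uses, and the function name, window size, threshold, and sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List


def find_anomalies(series: List[float], window: int = 5, threshold: float = 3.0) -> List[int]:
    """Return indices whose value is more than `threshold` standard deviations
    away from the mean of the preceding `window` values."""
    anomalies: List[int] = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # A perfectly flat history: any change at all counts as a deviation.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical daily order counts with one obvious spike at index 8.
daily_orders = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 250.0, 101.0, 100.0]

print(find_anomalies(daily_orders))
```

On this sample data the rule flags only the spike at index 8. A fixed rule like this needs manual tuning per metric, which is the kind of work a managed, feedback-driven service aims to remove.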

5. Amazon EMR on Amazon EKS Now Generally Available

Amazon Elastic MapReduce (EMR) is a big data platform that processes a vast amount of data quickly using open source tools, such as Apache Spark, Hive, HBase, Presto, and Flink. The cloud giant announced the general availability of Amazon EMR on Amazon EKS, which will allow customers to automate the provisioning and management of open-source tools on EKS. Customers can focus on running analytics workloads while Amazon EMR on Amazon EKS builds, configures, and manages containers.

The platform is now available in the U.S. and Europe.

6. AWS Announced General Availability of Automatic Table Optimizations (ATO) for Amazon Redshift

Automatic table optimizations (ATO) for Amazon Redshift is a new self-tuning capability that uses machine learning to automate optimization tasks such as setting sort and distribution keys without the need for administrator intervention. 

7. AWS Announced Amazon Redshift Data Sharing in Preview

The preview launch of the new Amazon Redshift data sharing will enable organizations to share live data across multiple Amazon Redshift clusters securely without the need to copy or move data. Data sharing provides live access to data so that customers are equipped with up-to-date data and consistent information. Existing customers include Warner Bros, Yelp, Fannie Mae, home24, Etleap, and Aginity.

The preview is available in the U.S., Asia Pacific, and Europe.

8. Amazon Braket Supports PennyLane

Amazon has made the PennyLane library available on Amazon Braket so that developers can build and run hybrid quantum-classical algorithms. This integration will allow developers to test and fine-tune algorithms faster and run them on any quantum computing hardware. PennyLane is also pre-installed in Braket notebooks, and developers can use parallel circuit execution to train quantum algorithms up to 10 times faster than on a single machine.

9. AWS Lake Formation Features Now in Preview Mode

AWS announced the preview of three AWS Lake Formation APIs: transactions, row-level security, and acceleration. Transactions will deliver concurrent updates and consistent query results. Row-level security will offer granular access control, and the acceleration capability will deliver better performance through inline filtering, aggregations, and automatic file compaction.

10. AWS Launched Amazon Forecast Weather Index 

AWS announced the release of the Amazon Forecast Weather Index to improve the forecasting accuracy of machine learning models using local weather information. The tool leverages machine learning and combines weather metrics from historical weather events and current forecasts to develop accurate forecasts. AWS customers can add 14-day weather forecasts for U.S. and Europe locations to create accurate demand forecasts.

11. AWS Audit Manager

AWS announced AWS Audit Manager, a new compliance service that audits AWS usage and simplifies how an organization assesses risk and compliance with regulations and industry standards. It has prebuilt frameworks that map AWS resources to the requirements of regulations and standards such as GDPR, PCI-DSS, and the CIS AWS Foundations Benchmark.

The service is available globally for all AWS customers.

12. New AWS Region To Launch in Melbourne, Australia

After launching new AWS regions in Italy and South Africa in 2020, the company announced a new AWS region in Melbourne, which will open in the second half of 2022. The new region will enable AWS customers to store backup data in geographically separated locations within Australia.

After ending the first week with new launches in cloud-native technologies, the second week of AWS re:Invent 2020 was filled with a slew of machine learning and big data launches. AWS plans to lead not only the cloud market but also AI and big data, which are set to be among the biggest technologies in the coming years.

What did you think of these announcements? Comment below or let us know on LinkedIn, Twitter, or Facebook. We’d love to hear from you!