From the course: AWS Certified Data Engineer Associate (DEA-C01) Cert Prep
RDS performance optimization
From the course: AWS Certified Data Engineer Associate (DEA-C01) Cert Prep
RDS performance optimization
- [Instructor] Relational databases rely on users to carefully design schemas and queries that are optimized for performance. Most performance bottlenecks and relational database are due to long running SQL queries that are tying up system resources. In this lesson, we'll discuss how to monitor, troubleshoot, and remedy performance issues with RDS and Aurora databases. You can enable performance insights for RDS and Aurora databases to get a quick picture of the load on your database and what specifically is the cause of performance issues. It graphically shows the number of sessions running at certain intervals, and the dashboard breaks down the database load by wait events, SQL queries, and SQL users. A wait event causes a SQL statement to wait for a specific event to happen before it can continue running. Every active session is either running on the CPU or waiting. When sessions aren't consuming CPU, they might be waiting for a memory buffer to become free, a data file to be read, or a log to be written to. The more time that a session waits for resources, the less time it runs on the CPU. Another cause of wait events is row locking, which is a normal characteristic of an asset compliant database. When a database transaction is occurring, that is modifying data like insert, update, or delete statements, the database engine will lock access to the rows until the transaction has been committed. This makes other queries wait before they can run, but it assures the returned results are consistent. If this is happening too frequently and causing performance issues, then you may need to adjust when updates are made or else batch them all together. Where wait events show bottlenecks, the top SQL tab of performance insights shows which queries are contributing the most to the database load. For example, many queries might be concurrently running on the database, but one query might be consuming 99% of the database load. In this case, the high load may indicate a problem with the query. Top users shows which database users are contributing the most to the database load. A handy feature of performance insights is to enable the max CPU line on the dashboard graph. This gives you a quick indication of whether requested CPU utilization is exceeding the maximum causing queries to have to wait for execution. Also on the dashboard, you can access the CloudWatch metrics of the database, including CPU utilization, the free memory, query latency, and lock time. When performance metrics exceed certain thresholds, RDS will actually make recommendations to you on how to troubleshoot. In addition to using RDS insights, another thing you should do is enable the logging of slow queries, which will log any queries that take longer than a set threshold to run. To remediate performance issues, usually the first place to start is with optimizing the SQL queries that are contributing to the database load. Relational databases perform well when queries make use of indexes, rather than full table scans. The explain command will show you the query plan and whether large table scans will be happening with your query. To quickly evaluate a query, you can look at a metric known as the query cost, which refers to how expensive the database engine thinks the query is in terms of overall utilization of the CPU memory, disc, et cetera. This example is for our Northwind database running on a Aurora MySQL instance. For MySQL, simple queries generally have a query cost of less than 1,000. First, we explain this query without setting primary keys on the tables, and as you can see, we get a huge query cost. But when we add primary keys to the orders and products table and a compound key on the orders details table, the query cost drops dramatically. That is because while our query needs to retrieve all of the rows in the orders table, the orders details and products tables are going to look up the needed rows in the index and not need to scan the tables. In general, any fields that are used in joins should be indexed in order to minimize table scans. Indexes also help with sorting in your queries because the index is sorted rather than the entire table, so you also want to index columns that you are sorting by.
Practice while you learn with exercise files
Download the files the instructor uses to teach the course. Follow along and learn by watching, listening and practicing.
Download courses and learn on the go
Watch courses on your mobile device without an internet connection. Download courses using your iOS or Android LinkedIn Learning app.
Contents
-
-
(Locked)
Introduction26s
-
(Locked)
AWS account overview3m 59s
-
(Locked)
Hands-on learning: Create your AWS Free Tier account4m 33s
-
(Locked)
Hands-on learning: Set up an account with a budget and alarm5m 18s
-
AWS IAM7m 50s
-
(Locked)
Hands-on learning: Create an IAM user and group4m 36s
-
(Locked)
IAM authentication and multi-factor authentication (MFA)3m 26s
-
(Locked)
Hands-on learning: Enable MFA2m 30s
-
(Locked)
Access control methods: RBAC and ABAC5m 22s
-
(Locked)
Hands-on learning: Switch IAM roles7m 25s
-
(Locked)
-
-
(Locked)
Introduction32s
-
(Locked)
Data pipelines overview2m 55s
-
(Locked)
The three Vs1m 54s
-
(Locked)
Types of data3m 27s
-
(Locked)
Batch vs. streaming1m 32s
-
(Locked)
OLAP vs. OTAP2m 11s
-
Data formats4m 10s
-
(Locked)
Data modeling3m 19s
-
(Locked)
Data warehouses1m 17s
-
(Locked)
Data lakes3m 1s
-
(Locked)
Design considerations1m 57s
-
(Locked)
State management and replayability1m 44s
-
(Locked)
Use cases1m 31s
-
(Locked)
Distributed computing2m 49s
-
(Locked)
Data pipelines on AWS1m 57s
-
(Locked)
-
-
(Locked)
Introduction41s
-
(Locked)
Using SQL with AWS2m 58s
-
(Locked)
Hands-on learning: Create a SQL environment2m 54s
-
(Locked)
SQL commands1m 30s
-
(Locked)
SQL data types2m 48s
-
(Locked)
Data definition language (DDL)1m 19s
-
(Locked)
Data query language (DQL)1m 39s
-
SQL operators2m 23s
-
(Locked)
Displaying results1m 32s
-
(Locked)
SQL functions2m 39s
-
(Locked)
SQL joins3m 41s
-
(Locked)
SQL constraints3m 20s
-
(Locked)
Data manipulation language2m 54s
-
(Locked)
Hands-on learning: Query a database11m 4s
-
(Locked)
-
-
(Locked)
Introduction38s
-
Data ingestion scenarios3m 5s
-
(Locked)
AWS Database Migration Service3m 22s
-
(Locked)
Schema conversion2m 52s
-
(Locked)
Change data capture1m 30s
-
(Locked)
AWS DataSync1m 2s
-
(Locked)
AWS Storage Gateway1m 10s
-
(Locked)
AWS Snow Family2m 27s
-
(Locked)
AWS Transfer Family2m 11s
-
(Locked)
AWS Glue59s
-
(Locked)
AWS Glue Data Catalog1m 17s
-
(Locked)
Hands-on learning: Create an AWS Glue Data Catalog3m 13s
-
(Locked)
AWS Glue Crawler45s
-
(Locked)
Hands-on learning: Create an AWS Glue Crawler2m 6s
-
(Locked)
Amazon Kinesis Data Streams9m 18s
-
(Locked)
Hands-on learning: Create a data stream3m 30s
-
(Locked)
Consumer throughput1m 21s
-
(Locked)
Reading data with AWS Lambda2m 28s
-
(Locked)
AWS Lambda concurrency2m 28s
-
(Locked)
Using Amazon EFS with AWS Lambda1m 25s
-
(Locked)
Hands-on learning: Create and use an AWS Lambda function3m 50s
-
(Locked)
Amazon Kinesis Client Library (KCL)2m 52s
-
(Locked)
Amazon MSK3m 19s
-
(Locked)
Amazon Data Firehose4m 9s
-
Hands-on learning: Configure an Amazon Data Firehose stream5m 33s
-
(Locked)
Amazon DynamoDB Streams2m 31s
-
(Locked)
AWS Glue Schema Registry1m 9s
-
(Locked)
Amazon AppFlow4m 41s
-
(Locked)
Consume data APIs2m 21s
-
(Locked)
Data exchange1m 31s
-
(Locked)
-
-
(Locked)
Introduction53s
-
(Locked)
Common scenarios4m 58s
-
(Locked)
Amazon EMR introduction3m 52s
-
(Locked)
Apache Hadoop1m 48s
-
(Locked)
Hadoop frameworks2m 18s
-
Apache Spark3m 12s
-
(Locked)
Amazon EMR architecture7m 48s
-
(Locked)
Hands-on learning: Launch an Amazon EMR cluster13m 7s
-
(Locked)
Amazon EMR serverless3m 16s
-
(Locked)
Amazon EMR on EKS1m 44s
-
(Locked)
Amazon EMR Studio1m 27s
-
(Locked)
Hands-on learning: Use Amazon EMR Studio5m 8s
-
(Locked)
Container services for batch processing8m 7s
-
(Locked)
AWS Batch2m 46s
-
(Locked)
Amazon Managed Service for Apache Flink2m 22s
-
(Locked)
AWS Glue DataBrew3m 1s
-
(Locked)
Hands-on learning: Create an AWS Glue DataBrew project5m 37s
-
(Locked)
-
-
(Locked)
Introduction56s
-
(Locked)
Storage platforms4m 14s
-
Aligning to access patterns8m 35s
-
(Locked)
Cost and performance comparisons3m 4s
-
(Locked)
Amazon RDS4m 47s
-
(Locked)
Amazon Aurora4m 15s
-
(Locked)
Hands-on learning: Create an Amazon RDS database7m 24s
-
(Locked)
Amazon DynamoDB8m 4s
-
(Locked)
DynamoDB partitions and primary keys5m 52s
-
(Locked)
Hands-on learning: Practice creating DynamoDB tables8m 24s
-
(Locked)
DynamoDB LSI and GSI3m 12s
-
(Locked)
Hands-on learning: Create an LSI and a GSI6m 5s
-
(Locked)
Hands-on learning: Add a time to live (TTL) to items5m 18s
-
(Locked)
Amazon S3 for data lakes5m 10s
-
(Locked)
Amazon S3 storage classes4m 39s
-
(Locked)
Amazon S3 lifecycle policies2m 29s
-
(Locked)
Amazon S3 versioning and replication2m 49s
-
(Locked)
Hands-on learning: Configure replication and lifecycle8m 3s
-
(Locked)
Amazon S3 performance optimization1m 54s
-
(Locked)
Server access logging1m 13s
-
(Locked)
Using Amazon S3 for staging data4m 39s
-
(Locked)
Amazon S3 access points1m 9s
-
(Locked)
Other database services4m 19s
-
(Locked)
-
-
(Locked)
Introduction45s
-
(Locked)
Analytics services2m 23s
-
(Locked)
Amazon Redshift5m 14s
-
(Locked)
Hands-on learning: Launch an Amazon Redshift cluster8m 22s
-
(Locked)
Amazon Redshift serverless2m 32s
-
(Locked)
Schema design for Amazon Redshift2m 36s
-
Loading data into Amazon Redshift6m 19s
-
(Locked)
Hands-on learning: Use the Amazon Redshift COPY command5m 19s
-
(Locked)
Unloading Amazon Redshift data1m 52s
-
(Locked)
Hands-on learning: Unload data to Amazon S33m 10s
-
(Locked)
Column compression2m 45s
-
(Locked)
Distribution styles5m 28s
-
(Locked)
Maintaining tables3m 41s
-
(Locked)
Amazon Redshift federated queries1m 55s
-
(Locked)
Amazon Redshift Spectrum2m 42s
-
Amazon Redshift materialized views3m 39s
-
(Locked)
Transform data with stored procedures4m 18s
-
(Locked)
Workload management1m 59s
-
(Locked)
Zero-ETL integrations3m 3s
-
(Locked)
Streaming ingestion2m 7s
-
(Locked)
Amazon Athena4m
-
(Locked)
Partitioning data3m 2s
-
(Locked)
Creating views2m 40s
-
(Locked)
Hands-on learning: Create and query tables using Athena4m 56s
-
(Locked)
AWS Lake Formation1m 58s
-
(Locked)
Hands-on learning: Create a data lake9m 9s
-
(Locked)
Amazon QuickSight4m 47s
-
(Locked)
Hands-on learning: Create a QuickSight dashboard5m 33s
-
Amazon OpenSearch7m 11s
-
(Locked)
-
-
(Locked)
Introduction37s
-
(Locked)
Serverless services and event-driven architecture4m 57s
-
(Locked)
Amazon Simple Queue Service (SQS)9m 18s
-
(Locked)
Amazon Simple Notification Service (SNS)3m 9s
-
(Locked)
AWS Step Functions2m 7s
-
(Locked)
Hands-on learning: Create a state machine for data analytics5m 14s
-
(Locked)
Amazon S3 event notifications1m 43s
-
(Locked)
Amazon EventBridge1m 15s
-
(Locked)
Hands-on learning: Create an event bus and rule5m 26s
-
(Locked)
Amazon Managed Workflows for Apache Airflow2m 15s
-
(Locked)
Hands-on learning: Orchestrate EMR tasks using Amazon MWAA9m 39s
-
(Locked)
AWS CloudFormation4m 16s
-
(Locked)
Hands-on learning: Create an AWS Glue Catalog with partitions8m 18s
-
(Locked)
AWS Serverless Application Model (SAM)4m 22s
-
(Locked)
Hands-on learning: Run a SAM app using CloudShell7m 54s
-
(Locked)
AWS Cloud Development Kit (CDK)1m 52s
-
(Locked)
CI/CD overview5m 59s
-
AWS CodeCommit and CodePipeline3m 59s
-
(Locked)
Hands-on learning: Install Git5m 49s
-
(Locked)
Hands-on learning: Deploy an AWS Glue job with CodePipeline11m 29s
-
(Locked)
-
-
(Locked)
Introduction44s
-
(Locked)
Amazon CloudWatch overview5m 24s
-
(Locked)
Hands-on learning: Create a custom metric10m 4s
-
(Locked)
Hands-on learning: Create an Amazon CloudWatch alarm2m 37s
-
(Locked)
Amazon CloudWatch Logs Insights1m 36s
-
(Locked)
Hands-on learning: Query Amazon CloudWatch logs4m 7s
-
(Locked)
DynamoDB capacity units5m 7s
-
(Locked)
DynamoDB performance and throttling2m 4s
-
RDS performance optimization4m 43s
-
(Locked)
RDS Proxy2m 1s
-
(Locked)
Amazon Redshift performance optimization3m 54s
-
(Locked)
-
-
(Locked)
Introduction43s
-
(Locked)
AWS Key Management Service (KMS)4m 19s
-
(Locked)
Virtual private cloud (VPC) overview8m 10s
-
(Locked)
Defining VPC Classless Interdomain Routing (CIDR) blocks6m 26s
-
(Locked)
Hands-on learning: Create a custom VPC10m 16s
-
(Locked)
Security groups and network ACLs10m 48s
-
(Locked)
Hands-on learning: Configure security groups and NACLs10m 11s
-
(Locked)
VPC peering5m 44s
-
(Locked)
Hands-on learning: Configure VPC peering9m 42s
-
VPC endpoints4m 27s
-
(Locked)
Hands-on learning: Create a VPC endpoint9m 27s
-
(Locked)
AWS Systems Manager Parameter Store2m 50s
-
(Locked)
AWS Secrets Manager2m 26s
-
(Locked)
Hands-on learning: Work with secrets6m 5s
-
(Locked)
AWS Config2m 47s
-
(Locked)
AWS CloudTrail4m 33s
-
(Locked)
Hands-on learning: Create a CloudTrail trail3m 59s
-
(Locked)
AWS CloudTrail Lake1m 43s
-
(Locked)
Hands-on learning: Query CloudTrail Lake events5m 18s
-
(Locked)
Amazon RDS security4m 6s
-
(Locked)
Amazon Redshift security5m 15s
-
(Locked)
Database audit logging1m 38s
-
(Locked)
AWS Audit Manager1m 35s
-
(Locked)
Amazon Macie3m 4s
-
(Locked)