From the course: Data Engineering on AWS: Data Cataloging, Processing, Analytics, and Visualization

Amazon Redshift Spectrum and performance tuning

- [Instructor] Redshift Spectrum is an AWS feature that allows you to run queries on data stored in your S3 buckets directly from your Amazon Redshift cluster. It essentially extends the functionality of Redshift beyond the data that you have loaded into your cluster, enabling you to access and analyze large amounts of data in a more cost-effective and flexible way. So how does Redshift Spectrum work? Essentially, the data stored in S3 is formatted like a Redshift table and cataloged with something like AWS Glue. That's the high-level explanation of what happens. Amazon Redshift Spectrum nodes are dedicated Amazon Redshift servers managed by AWS that are independent of customer queries and clusters. Because of this, Redshift Spectrum queries use much less of a cluster's processing capacity than other queries, because compute-intensive activity is pushed into the Spectrum nodes. Based on the demands of queries, Redshift…
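
To make the "formatted like a Redshift table and cataloged" idea concrete, here is a minimal Python sketch that only builds the SQL text you might run from a Redshift cluster: one statement that maps a Glue Data Catalog database to an external schema, and one that declares an external table over files in S3. Every name in it (the `spectrum_demo` schema, the `sales_catalog` database, the role ARN, the bucket path, and the columns) is a hypothetical placeholder and does not come from the course. The helper functions are illustrative, not part of any AWS library.

```python
# Sketch only: builds Redshift SQL strings; it does not connect to AWS.
# All identifiers, ARNs, and S3 paths below are hypothetical placeholders.

def external_schema_sql(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """SQL that exposes a Glue Data Catalog database as an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema: str, table: str, columns: dict, s3_location: str) -> str:
    """SQL that declares a table whose data stays in S3 as Parquet files."""
    column_list = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"{column_list}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


schema_stmt = external_schema_sql(
    schema="spectrum_demo",
    catalog_db="sales_catalog",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
)

table_stmt = external_table_sql(
    schema="spectrum_demo",
    table="sales",
    columns={"sale_id": "BIGINT", "sale_date": "DATE", "amount": "DECIMAL(10,2)"},
    s3_location="s3://example-bucket/sales/",
)

# Once both statements have been run, the external table can be queried
# like any other table, while the scan itself happens in the Spectrum layer.
query_stmt = "SELECT sale_date, SUM(amount) FROM spectrum_demo.sales GROUP BY sale_date;"

print(schema_stmt)
print(table_stmt)
print(query_stmt)
```

In practice you would send these statements to the cluster with whatever SQL client you already use. The sketch keeps them as plain strings so the shape of the schema and table definitions is easy to read.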
