Optimising Query Execution for Amazon Redshift Spectrum

The AWS Big Data Blog post “Twelve Best Practices for Amazon Redshift Spectrum” discusses a number of ways to optimise queries running on Redshift Spectrum.

Here I’m verifying ideas from that post, focusing in particular on using Redshift internals such as system views and query plans to confirm how changes […]

Processing IoT Data using Amazon Kinesis

In this post I’m using Kinesis Data Streams, Kinesis Data Firehose and Kinesis Data Analytics to process data received from an IoT sensor.

The inbound IoT data stream is written to Redshift, has rolling averages calculated over it, which are also written to Redshift, and is filtered to discover data that’s […]

Running ETL Jobs using Amazon Athena

Amazon Athena supports the ‘CREATE TABLE AS’ and ‘INSERT INTO’ statements, which can be used to transform data between a source and a destination.

In this post I’m using them to optimise the storage of data that is received into S3 as files of JSON objects. Specifically, I’m using CREATE TABLE AS (CTAS) to […]