Idle Redshift Cluster

Trend Cloud One™ – Conformity is a continuous assurance tool that provides peace of mind for your cloud infrastructure, delivering over 1000 automated best practice checks.

Risk Level: High (not acceptable risk)
Rule ID: RS-009

Identify any Amazon Redshift clusters that appear to be idle and delete them to help lower the cost of your monthly AWS bill. By default, a Redshift cluster is considered 'idle' when it meets both of the following criteria:

  • The average number of database connections has been less than 1 for the last 7 days.
  • The total number of ReadIOPS and WriteIOPS recorded per day for the last 7 days has been less than 20 on average.

The AWS CloudWatch metrics used to detect idle Redshift clusters are:

  • DatabaseConnections - the number of database connections made to a Redshift cluster (Units: Count).
  • ReadIOPS and WriteIOPS - the average number of disk I/O (Input/Output) operations per second (Units: Count/Second).
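
The two default conditions can be expressed as a small shell check. This is a minimal sketch, not the rule's actual implementation: the function name `is_idle` and its argument order are hypothetical, and it assumes the two averages have already been computed from the CloudWatch data.

```shell
# Hypothetical helper that applies the two default idle conditions.
#   $1 = average DatabaseConnections over the last 7 days
#   $2 = average daily total of ReadIOPS + WriteIOPS over the last 7 days
#   $3 = connections threshold (optional, defaults to 1)
#   $4 = IOPS threshold (optional, defaults to 20)
# Returns 0 (success) only when BOTH values are below their thresholds.
is_idle() {
  local avg_connections="$1"
  local avg_daily_iops="$2"
  local max_connections="${3:-1}"
  local max_iops="${4:-20}"
  # awk performs the floating-point comparison, which plain sh cannot do.
  awk -v c="$avg_connections" -v i="$avg_daily_iops" \
      -v mc="$max_connections" -v mi="$max_iops" \
      'BEGIN { exit !((c + 0 < mc + 0) && (i + 0 < mi + 0)) }'
}

if is_idle 0.2 12.5; then echo "idle"; fi          # both conditions are true
if ! is_idle 3.0 12.5; then echo "not idle"; fi    # connections are not below 1
```

The optional third and fourth arguments mirror the configurable thresholds mentioned in Note 3.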

This rule can help you work with the AWS Well-Architected Framework.

This rule resolution is part of the Conformity Security & Compliance tool for AWS.

Sustainability
Cost optimisation

Idle Redshift clusters are good candidates for deletion: removing them reduces your monthly AWS costs and avoids accumulating unnecessary usage charges.

Note 1: Backing up your Redshift clusters before termination is highly recommended because once these clusters are deleted, all their automated backups (snapshots) will be removed as well.
Note 2: Knowing the role and the owner of an AWS Redshift cluster before you take the decision to remove it from your account is very important. For this rule Cloud Conformity assumes that your Redshift clusters are tagged with 'Role' and 'Owner' tags which provide visibility into their usage profile and help you decide whether it's safe or not to terminate these resources.
Note 3: You can change the default thresholds for this rule on the Cloud Conformity console and set your own values for the number of database connections and for the total number of ReadIOPS and WriteIOPS, in order to define when a cluster is considered idle.
Note 4: If the Redshift cluster selected for the checkup is needed within your AWS environment, you can suppress (disable) the conformity rule check for the cluster from the Cloud Conformity console.


Audit

To identify any idle Redshift clusters currently provisioned within your AWS account, perform the following:

Using AWS Console

01 Log in to the AWS Management Console.

02 Navigate to the Redshift dashboard at https://console.aws.amazon.com/redshift/.

03 In the left navigation panel, under Redshift Dashboard, click Clusters.

04 Choose the Redshift cluster that you want to examine, then click its identifier link listed in the Cluster column.

05 On the cluster settings page, select the Performance tab to access the monitoring panel.

06 On the monitoring panel displayed for the selected cluster, perform the following actions:

  1. To verify the Redshift cluster Database Connections usage graph, follow the steps below:
    • From the Time Range dropdown list, select Last 1 Week.
    • From the Period list, select 1 Hour.
    • From the Statistic dropdown list, select Average.
    • And from the Metrics dropdown list, select DatabaseConnections.
    Once the monitoring data is loaded into the Database Connections usage graph, check the number of database connections for the last 7 days. If the average usage (count) has been less than 1, the selected Redshift cluster qualifies as an idle cluster candidate.
  2. To verify the cluster Read IOPS usage graph, follow the steps below:
    • From the Time Range dropdown list, select Last 1 Week.
    • From the Period list, select 1 Hour.
    • From the Statistic dropdown list, select Sum.
    • And from the Metrics dropdown list, select ReadIOPS.
    Once the monitoring data is loaded into the ReadIOPS usage graph, verify the total number of Read operations per second recorded in the last 7 days. If the total number of ReadIOPS has been less than 20, the selected Redshift cluster qualifies as an idle cluster candidate.
  3. To verify the cluster Write IOPS usage graph, follow the steps below:
    • From the Time Range dropdown list, select Last 1 Week.
    • From the Period list, select 1 Hour.
    • From the Statistic dropdown list, select Sum.
    • And from the Metrics dropdown list, select WriteIOPS.
    Once the monitoring data is loaded into the WriteIOPS usage graph, verify the total number of Write operations per second recorded in the last 7 days. If the total number of WriteIOPS has been less than 20, the selected Redshift cluster qualifies as an idle cluster candidate.

07 Now determine the selected cluster role and owner by checking the Role and Owner tags values assigned to the Redshift cluster in order to decide whether it's safe or not to terminate the resource. To check for the necessary tags, perform the following:

  1. Click the Manage Tags button from the dashboard top menu to open the panel that lists the cluster tags.
  2. On the Manage Tags panel, in the Applied Tags section, verify the requested tags and their values:
    • Check the Role tag value, available in the Value column, or any Role-like tag value that describes the usage profile of the cluster (e.g. redshift-test-cluster), in order to decide whether the resource can be terminated.
    • Check the Owner tag value, available in the Value column, or any Owner-like tag value that provides the contact information (name, email, phone number) of the resource owner, in order to get confirmation before terminating the selected Redshift cluster.
  3. If all conditions outlined at step no. 6 (1, 2 and 3) and step no. 7 are met, the selected Redshift cluster is considered "idle" and can be deleted in order to stop incurring charges for this resource.

08 Repeat steps no. 4 – 7 to verify the role, owner, DatabaseConnections, ReadIOPS and WriteIOPS metrics usage within the selected time frame for the rest of the Redshift clusters created in the current region.

09 Change the AWS region from the navigation bar and repeat the audit process for the other regions.

Using AWS CLI

01 Run the describe-clusters command (OSX/Linux/UNIX) using custom query filters to list the identifiers of all Redshift clusters currently available in the selected region:

aws redshift describe-clusters \
	--region us-east-1 \
	--output table \
	--query 'Clusters[*].ClusterIdentifier'

02 The command output should return a table with the requested cluster names:

------------------------
|   DescribeClusters   |
+----------------------+
|  cc-sandbox-cluster  |
|  cc-staging-cluster  |
|  cc-prod-cluster     |
+----------------------+

03 Run the get-metric-statistics command (OSX/Linux/UNIX) to get the statistics recorded by AWS CloudWatch for the DatabaseConnections metric, which represents the number of Redshift database connections in use. Change the --start-time (start recording date) and --end-time (stop recording date) parameter values to choose your own time frame, and set the --period parameter value to define the granularity, in seconds, of the returned datapoints. A period can be as short as one minute (60 seconds) or as long as one day (86400 seconds). The following command example returns the average number of database connections for the Redshift cluster named cc-sandbox-cluster over a 7-day period (set by the --start-time and --end-time parameters), using a 1-hour period as the granularity of the returned datapoints (set by the --period parameter):

aws cloudwatch get-metric-statistics \
	--region us-east-1 \
	--metric-name DatabaseConnections \
	--start-time 2016-10-04T18:22:45 \
	--end-time 2016-10-11T18:22:45 \
	--period 3600 \
	--namespace AWS/Redshift \
	--statistics Average \
	--dimensions Name=ClusterIdentifier,Value=cc-sandbox-cluster

04 The command output should return the DatabaseConnections usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2016-10-04T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        },
        {
            "Timestamp": "2016-10-04T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        },
        {
            "Timestamp": "2016-10-04T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        },

        ...

        {
            "Timestamp": "2016-10-11T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        },
        {
            "Timestamp": "2016-10-11T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        },
        {
            "Timestamp": "2016-10-11T18:22:45Z",
            "Average": 0.0,
            "Unit": "Count"
        }
    ],
    "Label": "DatabaseConnections"
}

If the average number of database connections has been less than 1 for the last 7 days, the selected Redshift cluster qualifies as an idle cluster candidate.
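
Rather than hard-coding the time frame and reading the datapoints by eye, the window and the overall average can be computed in the shell. The sketch below is illustrative only: `utc_days_ago` and `mean_of` are hypothetical helper names, the date fallback assumes either GNU or BSD/macOS `date`, and the AWS call is shown as a comment because it needs valid credentials.

```shell
# Prints a UTC timestamp (YYYY-MM-DDTHH:MM:SS) for N days ago; N defaults to 0.
# Tries the GNU date syntax first, then falls back to the BSD/macOS form.
utc_days_ago() {
  local days="${1:-0}"
  date -u -d "${days} days ago" +%Y-%m-%dT%H:%M:%S 2>/dev/null \
    || date -u -v-"${days}"d +%Y-%m-%dT%H:%M:%S
}

# Reads whitespace-separated numbers on stdin and prints their mean.
# Returns non-zero and prints nothing when no datapoints were supplied.
mean_of() {
  awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
       END { if (n == 0) exit 1; printf "%.4f\n", sum / n }'
}

START_TIME="$(utc_days_ago 7)"
END_TIME="$(utc_days_ago 0)"
echo "window: ${START_TIME} to ${END_TIME}"

# Example wiring (not executed here):
#   aws cloudwatch get-metric-statistics --region us-east-1 \
#     --namespace AWS/Redshift --metric-name DatabaseConnections \
#     --start-time "$START_TIME" --end-time "$END_TIME" --period 3600 \
#     --statistics Average \
#     --dimensions Name=ClusterIdentifier,Value=cc-sandbox-cluster \
#     --query 'Datapoints[].Average' --output text | mean_of

printf '0.0\t0.0\t0.5\t1.5\n' | mean_of   # prints 0.5000
```

The `--query` and `--output text` options reduce the JSON response to a plain list of numbers, so no JSON parser is needed.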

05 Run the get-metric-statistics command (OSX/Linux/UNIX) again to get the statistics recorded by AWS CloudWatch for the ReadIOPS metric, which represents the number of Read I/O operations per second. The following command example returns the total number of ReadIOPS used by the Redshift cluster named cc-sandbox-cluster over a 7-day period (set by the --start-time and --end-time parameters), using a 1-hour period as the granularity of the returned datapoints (set by the --period parameter):

aws cloudwatch get-metric-statistics \
	--region us-east-1 \
	--metric-name ReadIOPS \
	--start-time 2016-10-04T18:22:57 \
	--end-time 2016-10-11T18:22:57 \
	--period 3600 \
	--namespace AWS/Redshift \
	--statistics Sum \
	--dimensions Name=ClusterIdentifier,Value=cc-sandbox-cluster

06 The command output should return the ReadIOPS usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2016-10-04T18:22:57Z",
            "Sum": 3.0000539505765276,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-04T18:22:57Z",
            "Sum": 1.2000329652228976,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-04T18:22:57Z",
            "Sum": 1.9001483344299335,
            "Unit": "Count/Second"
        },

        ...

        {
            "Timestamp": "2016-10-11T18:22:57Z",
            "Sum": 3.0000557761644715,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-11T18:22:57Z",
            "Sum": 3.133804686450845,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-11T18:22:57Z",
            "Sum": 4.087927411198773,
            "Unit": "Count/Second"
        }
    ],
    "Label": "ReadIOPS"
}

If the total number of ReadIOPS has been less than 20 for the last 7 days, the selected Redshift cluster qualifies as an idle cluster candidate.

07 Run the get-metric-statistics command (OSX/Linux/UNIX) to get the statistics recorded by AWS CloudWatch for the WriteIOPS metric, which represents the number of Write I/O operations per second. The following command example returns the total number of WriteIOPS used by the Redshift cluster named cc-sandbox-cluster over a 7-day period (set by the --start-time and --end-time parameters), using a 1-hour period as the granularity of the returned datapoints (set by the --period parameter):

aws cloudwatch get-metric-statistics \
	--region us-east-1 \
	--metric-name WriteIOPS \
	--start-time 2016-10-04T18:23:10 \
	--end-time 2016-10-11T18:23:10 \
	--period 3600 \
	--namespace AWS/Redshift \
	--statistics Sum \
	--dimensions Name=ClusterIdentifier,Value=cc-sandbox-cluster

08 The command output should return the WriteIOPS usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2016-10-04T18:23:10Z",
            "Sum": 1.5608611980164495,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-04T18:23:10Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-04T18:23:10Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        },

        ...

        {
            "Timestamp": "2016-10-11T18:23:10Z",
            "Sum": 2.595462486617107,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-11T18:23:10Z",
            "Sum": 0.0,
            "Unit": "Count/Second"
        },
        {
            "Timestamp": "2016-10-11T18:23:10Z",
            "Sum": 1.566811457747086,
            "Unit": "Count/Second"
        }
    ],
    "Label": "WriteIOPS"
}

If the total number of WriteIOPS has been less than 20 for the last 7 days, the selected Redshift cluster qualifies as an idle cluster candidate.
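
The ReadIOPS and WriteIOPS datapoints can be combined into the single "average daily total" figure that the rule description compares against 20. The sketch below is one possible reading of that criterion, not Conformity's own calculation: it adds up every Sum datapoint from both metrics and divides by the number of days. The helper name `avg_daily_iops` is hypothetical.

```shell
# Reads Sum datapoints (ReadIOPS and WriteIOPS together, whitespace-separated)
# on stdin and prints the average daily total over the given number of days.
#   $1 = number of days covered by the datapoints (optional, defaults to 7)
avg_daily_iops() {
  awk -v days="${1:-7}" '
    { for (i = 1; i <= NF; i++) total += $i }
    END { printf "%.2f\n", total / days }'
}

# Example wiring (not executed here): run the ReadIOPS and WriteIOPS commands
# with --query 'Datapoints[].Sum' --output text and pipe both outputs in, e.g.
#   { read_iops_command; write_iops_command; } | avg_daily_iops 7

# Illustrative input: the first line stands for ReadIOPS sums, the second for WriteIOPS sums.
printf '10 20 5\n5 0 30\n' | avg_daily_iops 7   # prints 10.00
```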

09 Run the describe-tags command (OSX/Linux/UNIX) to describe the tags for the selected cluster:

aws redshift describe-tags \
	--region us-east-1 \
	--resource-name arn:aws:redshift:us-east-1:123456789012:cluster:cc-sandbox-cluster

10 The command output should return the tags (key-value pairs) applied to the cluster. The Role and Owner tags and their values can be used to determine the resource role within the environment and to contact its owner for more information, in order to decide whether the Redshift cluster can be terminated:

{
	"TaggedResources": [
    	{
        	"ResourceType": "cluster",
        	"ResourceName": "arn:aws:redshift:us-east-1:123456789012:cluster:cc-sandbox-cluster",
        	"Tag": {
            	"Value": "redshift-test-cluster",
            	"Key": "Role"
        	}
    	},
    	{
        	"ResourceType": "cluster",
        	"ResourceName": "arn:aws:redshift:us-east-1:123456789012:cluster:cc-sandbox-cluster",
        	"Tag": {
            	"Value": "db_ops@cloudconformity.com",
            	"Key": "Owner"
        	}
    	}
	]
}

If the data returned at steps no. 3 - 10 satisfies the conditions set by the conformity rule (cluster role, cluster owner, ReadIOPS + WriteIOPS, database connections), the selected cluster is considered "idle" and can be terminated in order to reduce AWS Redshift usage costs.
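
The presence of the Role and Owner tags can also be checked without reading the JSON by hand. The following is a rough sketch under stated assumptions: `missing_tags` is a hypothetical helper, tag keys are assumed to contain no whitespace, and the describe-tags call is shown as a comment because it requires AWS credentials.

```shell
# Reads tag keys (whitespace-separated) on stdin and prints the required keys
# that are absent. Returns 0 only when every required key is present.
#   $@ = required tag keys, e.g. Role Owner
missing_tags() {
  local keys missing required key found
  keys="$(cat)"
  missing=""
  for required in "$@"; do
    found=no
    for key in $keys; do
      if [ "$key" = "$required" ]; then found=yes; fi
    done
    if [ "$found" = no ]; then
      missing="${missing}${missing:+ }${required}"
    fi
  done
  if [ -n "$missing" ]; then
    echo "$missing"
    return 1
  fi
  return 0
}

# Example wiring (not executed here):
#   aws redshift describe-tags --region us-east-1 \
#     --resource-name arn:aws:redshift:us-east-1:123456789012:cluster:cc-sandbox-cluster \
#     --query 'TaggedResources[].Tag.Key' --output text | missing_tags Role Owner

if printf 'Role\tOwner\n' | missing_tags Role Owner; then
  echo "Role and Owner tags are present"
fi
```

A cluster that lacks either tag should be investigated further before deletion, since its role and owner cannot be confirmed from the tags alone.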

11 Repeat steps no. 3 - 10 to verify the role, owner, DatabaseConnections, ReadIOPS and WriteIOPS metrics usage within the specified time frame for the rest of the Redshift clusters created in the current region.

12 Change the AWS region by updating the --region command parameter value and repeat steps no. 1 - 11 to perform the audit process for other regions.

Remediation / Resolution

Option 1: terminate the idle clusters. To terminate (delete) any AWS Redshift clusters that are currently idle, perform the following actions:

Using AWS Console

01 Log in to the AWS Management Console.

02 Navigate to the Redshift dashboard at https://console.aws.amazon.com/redshift/.

03 In the left navigation panel, under Redshift Dashboard, click Clusters.

04 Choose the idle Redshift cluster that you want to delete, then click its identifier link listed in the Cluster column (see Audit section part I to identify the right resource).

05 On the cluster settings page, select the Configuration tab to access the configuration panel.

06 Click the Cluster dropdown button from the dashboard top menu and select Delete.

07 On the Delete Cluster confirmation page, select Yes next to Create snapshot and enter a unique name for your cluster snapshot (backup) in the Snapshot name box. Cloud Conformity strongly recommends taking a final snapshot of your cluster before termination because once the selected cluster is deleted its automated backups will no longer be available.

08 Click the Delete button to terminate the Redshift cluster.

09 Repeat steps no. 4 - 8 to delete any other idle Redshift clusters provisioned within the current region.

10 Change the AWS region from the navigation bar and repeat the process for other regions.

Using AWS CLI

01 Run delete-cluster command (OSX/Linux/UNIX) using the name of the resource as identifier to terminate the selected Redshift idle cluster (see Audit section part II to identify the right resource). Cloud Conformity strongly recommends taking a final snapshot of your cluster before you terminate it as all the automated backups are removed together with the cluster. The following command example deletes an Amazon Redshift cluster named cc-sandbox-cluster and creates a final snapshot of the resource (cc-sandbox-cluster-final-snapshot):

aws redshift delete-cluster \
	--region us-east-1 \
	--cluster-identifier cc-sandbox-cluster \
	--final-cluster-snapshot-identifier cc-sandbox-cluster-final-snapshot

02 The command output should return the metadata of the cluster selected for deletion:

{
    "Cluster": {
        "PubliclyAccessible": true,
        "MasterUsername": "ccclusterusr",
        "VpcSecurityGroups": [
            {
                "Status": "active",
                "VpcSecurityGroupId": "sg-061e2e7c"
            }
        ],
        "NumberOfNodes": 1,
        "PendingModifiedValues": {},
        "VpcId": "vpc-2fb56548",
        "ClusterVersion": "1.0",
        "Tags": [],
        "AutomatedSnapshotRetentionPeriod": 1,
        "ClusterParameterGroups": [
            {
                "ParameterGroupName": "default.redshift-1.0",
                "ParameterApplyStatus": "in-sync"
            }
        ],
        "DBName": "ccclusterdb",
        "PreferredMaintenanceWindow": "sat:07:00-sat:07:30",
        "Endpoint": {
            "Port": 5439,
            "Address": "cc-sandbox-cluster.cmfpsgvyjhf ... "
        },
        "IamRoles": [],
        "AllowVersionUpgrade": true,
        "ClusterCreateTime": "2016-10-05T16:48:16.086Z",
        "ClusterSubnetGroupName": "default",
        "ClusterSecurityGroups": [],
        "ClusterIdentifier": "cc-sandbox-cluster",
        "AvailabilityZone": "us-east-1a",
        "NodeType": "dc1.large",
        "Encrypted": false,
        "ClusterStatus": "final-snapshot"
    }
}

03 Repeat steps no. 1 and 2 to terminate any other idle Redshift clusters provisioned in the current region.

04 Change the AWS region by updating the --region command parameter value and repeat the entire process for other regions.
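
When several clusters are deleted, each final snapshot needs a unique name. The sketch below shows one possible naming scheme and prints the commands for review instead of running them, because deletion cannot be undone. The helper `final_snapshot_id` and the `-final-<timestamp>` suffix are assumptions made for illustration; `aws redshift wait cluster-deleted` is the AWS CLI waiter that blocks until the cluster is gone.

```shell
# Builds a final snapshot identifier from a cluster identifier and the current
# UTC time, e.g. cc-sandbox-cluster-final-20161014-182245 (hypothetical scheme).
final_snapshot_id() {
  printf '%s-final-%s\n' "$1" "$(date -u +%Y%m%d-%H%M%S)"
}

CLUSTER_ID="cc-sandbox-cluster"
REGION="us-east-1"
SNAPSHOT_ID="$(final_snapshot_id "$CLUSTER_ID")"

# Dry run: print the commands so they can be checked before being executed.
echo "aws redshift delete-cluster --region ${REGION} --cluster-identifier ${CLUSTER_ID} --final-cluster-snapshot-identifier ${SNAPSHOT_ID}"
echo "aws redshift wait cluster-deleted --region ${REGION} --cluster-identifier ${CLUSTER_ID}"
```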

Option 2: disable the rule check. If the selected idle Redshift cluster is needed (its role within your environment/application stack is important), you should turn off the conformity rule check for the cluster from the Cloud Conformity console.

Publication date Oct 14, 2016