How to estimate capacity consumption in Amazon Keyspaces
When you read or write data in Amazon Keyspaces, the amount of read/write request units (RRUs/WRUs) or read/write capacity units (RCUs/WCUs) your query consumes depends on the total amount of data Amazon Keyspaces has to process to run the query. In some cases, the data returned to the client can be a subset of the data that Amazon Keyspaces had to read to process the query. For conditional writes, Amazon Keyspaces consumes write capacity even if the conditional check fails.
To estimate the total amount of data being processed for a request, you have to consider the encoded size of a row and the total number of rows. This topic covers some examples of common scenarios and access patterns to show how Amazon Keyspaces processes queries and how that affects capacity consumption. You can follow the examples to estimate the capacity requirements of your tables and use Amazon CloudWatch to observe the read and write capacity consumption for these use cases.
For information on how to calculate the encoded size of rows in Amazon Keyspaces, see Calculating row size in Amazon Keyspaces.
Range queries
To look at the read capacity consumption of a range query, consider the following example table, which uses on-demand capacity mode.
 pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
 a   | b   | 1   | a   | b   | 50  | <any value that results in a row size larger than 4KB>
 a   | b   | 1   | a   | b   | 60  | value_1
 a   | b   | 1   | a   | b   | 70  | <any value that results in a row size larger than 4KB>
Now run the following query on this table.
SELECT * FROM amazon_keyspaces.example_table_1 WHERE pk1='a' AND pk2='b' AND pk3=1 AND ck1='a' AND ck2='b' AND ck3 > 50 AND ck3 < 70;
You receive the following result set from the query, and the read operation performed by Amazon Keyspaces consumes 2 RRUs in LOCAL_QUORUM consistency mode.
 pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
 a   | b   | 1   | a   | b   | 60  | value_1
Amazon Keyspaces consumes 2 RRUs to evaluate the rows with the values ck3=60 and ck3=70 to process the query. However, Amazon Keyspaces only returns the row where the WHERE condition specified in the query is true, which is the row with value ck3=60. To evaluate the range specified in the query, Amazon Keyspaces reads the row matching the upper bound of the range, in this case ck3=70, but doesn’t return that row in the result. The read capacity consumption is based on the data read when processing the query, not on the data returned.
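The arithmetic behind this example can be sketched in a few lines of Python. This is a rough estimate, not an official calculator: it assumes one RRU covers up to 4 KB of data read at LOCAL_QUORUM and that the sizes of all rows read are summed before rounding up. The function name and the row sizes (100 bytes and 5,000 bytes) are hypothetical values chosen to match the table above.

```python
import math

RRU_BYTES = 4 * 1024  # assumption: one RRU covers up to 4 KB read at LOCAL_QUORUM


def estimate_range_query_rrus(row_sizes_read):
    """Estimate RRUs from the encoded sizes (in bytes) of every row read.

    The list must include rows that are read but not returned,
    such as the row matching the upper bound of the range.
    """
    total_bytes = sum(row_sizes_read)
    return math.ceil(total_bytes / RRU_BYTES)


# Hypothetical sizes: ck3=60 is a small row, ck3=70 is larger than 4 KB.
rows_read = [100, 5000]
print(estimate_range_query_rrus(rows_read))  # 2
```

Only the ck3=60 row is returned, but the estimate has to include the ck3=70 row because it was read to evaluate the range.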
Limit queries
When processing a query that uses the LIMIT clause, Amazon Keyspaces reads rows up to the maximum page size when trying to match the condition specified in the query. If Amazon Keyspaces can't find sufficient matching data that meets the LIMIT value on the first page, one or more paginated calls could be needed. To continue reads on the next page, you can use a pagination token. The default page size is 1 MB. To consume less read capacity when using LIMIT clauses, you can reduce the page size. For more information about pagination, see Paginating results in Amazon Keyspaces.
For an example, let's look at the following query.
SELECT * FROM my_table WHERE partition_key=1234 LIMIT 1;
If you don’t set the page size, Amazon Keyspaces reads 1 MB of data even though it returns only 1 row to you. To have Amazon Keyspaces read only one row, you can set the page size to 1 for this query. In this case, Amazon Keyspaces reads only one row, provided you don’t have expired rows based on Time-to-live settings or client-side timestamps. To consume less read capacity, we recommend setting your page size equal to the LIMIT value to reduce the amount of data Amazon Keyspaces reads.
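To get a feel for how much the page size matters, the following Python sketch compares the two cases. It is an upper-bound estimate under stated assumptions: one RRU covers up to 4 KB read at LOCAL_QUORUM, the partition holds more data than one page, and all rows have the same encoded size. The function name, the 10 MB partition, and the 1 KB row size are hypothetical.

```python
import math

RRU_BYTES = 4 * 1024               # assumption: one RRU covers up to 4 KB at LOCAL_QUORUM
DEFAULT_PAGE_BYTES = 1024 * 1024   # default page size of 1 MB


def estimate_first_page_rrus(partition_bytes, row_bytes, page_size_rows=None):
    """Upper-bound RRU estimate for the first page of a LIMIT query.

    page_size_rows=None models the default 1 MB page; otherwise the read
    is capped at page_size_rows rows of row_bytes each.
    """
    bytes_read = min(partition_bytes, DEFAULT_PAGE_BYTES)
    if page_size_rows is not None:
        bytes_read = min(bytes_read, page_size_rows * row_bytes)
    return math.ceil(bytes_read / RRU_BYTES)


partition_bytes = 10 * 1024 * 1024  # hypothetical 10 MB partition
row_bytes = 1024                    # hypothetical 1 KB rows

print(estimate_first_page_rrus(partition_bytes, row_bytes))                    # 256
print(estimate_first_page_rrus(partition_bytes, row_bytes, page_size_rows=1))  # 1
```

Expired rows are not modeled here; if rows have expired, more data can be read than this estimate suggests.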
Table scans
Queries that result in full table scans, for example queries using the ALLOW FILTERING option, are another example of queries that process more reads than they return as results. As with range queries, the read capacity consumption is based on the data read, not the data returned.
For the table scan example we use the following example table in on-demand capacity mode.
 pk | ck | value
----+----+---------
 pk | 10 | <any value that results in a row size larger than 4KB>
 pk | 20 | value_1
 pk | 30 | <any value that results in a row size larger than 4KB>
Amazon Keyspaces creates a table in on-demand capacity mode with four partitions by default. In this example table, all the data is stored in one partition and the remaining three partitions are empty.
Now run the following query on the table.
SELECT * from amazon_keyspaces.example_table_2;
This query results in a table scan operation where Amazon Keyspaces scans all four partitions of the table and consumes 6 RRUs in LOCAL_QUORUM consistency mode. First, Amazon Keyspaces consumes 3 RRUs for reading the three rows with pk='pk'. Then, Amazon Keyspaces consumes the additional 3 RRUs for scanning the three empty partitions of the table. Because this query results in a table scan, Amazon Keyspaces scans all the partitions in the table, including partitions without data.
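The 6 RRUs can be reproduced with a short Python sketch. It assumes one RRU covers up to 4 KB read at LOCAL_QUORUM, that row sizes within a partition are summed before rounding up, and, following the example above, that each empty partition scanned costs 1 RRU. The function name and the row sizes are hypothetical.

```python
import math

RRU_BYTES = 4 * 1024  # assumption: one RRU covers up to 4 KB read at LOCAL_QUORUM


def estimate_scan_rrus(partitions):
    """Estimate RRUs for a full table scan.

    partitions is a list with one entry per table partition; each entry
    is a list of encoded row sizes in bytes (empty list = empty partition).
    """
    total = 0
    for row_sizes in partitions:
        if row_sizes:
            total += math.ceil(sum(row_sizes) / RRU_BYTES)
        else:
            total += 1  # per the example above: 1 RRU per empty partition scanned
    return total


# One partition with three rows (hypothetical sizes), three empty partitions.
table = [[5000, 100, 5000], [], [], []]
print(estimate_scan_rrus(table))  # 6
```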
Lightweight transactions
Lightweight transactions (LWT) allow you to perform conditional write operations against your table data. Conditional update operations are useful when inserting, updating and deleting records based on conditions that evaluate the current state.
In Amazon Keyspaces, all write operations require LOCAL_QUORUM consistency and there is no additional charge for using LWTs. The difference for LWTs is that when an LWT condition check results in FALSE, it still consumes write capacity units. The number of write capacity units consumed depends on the size of the row. If the row size is 2 KB, the failed conditional write consumes two write capacity units. If the row doesn’t currently exist in the table, the operation consumes one write capacity unit. By monitoring the ConditionalCheckFailed metric in CloudWatch, you can determine the capacity consumed by LWT condition check failures.
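The cost of a failed condition check can be estimated with the following Python sketch. It assumes one write capacity unit covers up to 1 KB of row data, which is consistent with the 2 KB example above; the function name is hypothetical.

```python
import math

WCU_BYTES = 1024  # assumption: one write capacity unit covers up to 1 KB


def failed_lwt_wcus(existing_row_bytes):
    """Estimate write capacity units consumed when an LWT condition check fails.

    Pass None if the row doesn't currently exist in the table.
    """
    if existing_row_bytes is None:
        return 1  # a missing row still costs one write capacity unit
    return max(1, math.ceil(existing_row_bytes / WCU_BYTES))


print(failed_lwt_wcus(2 * 1024))  # 2
print(failed_lwt_wcus(None))      # 1
```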
Estimate read and write capacity consumption with Amazon CloudWatch
To estimate and monitor read and write capacity consumption, you can use a CloudWatch dashboard. For more information about available metrics for Amazon Keyspaces, see Amazon Keyspaces metrics and dimensions.
To monitor read and write capacity units consumed by a specific statement with CloudWatch, you can follow these steps.
1. Create a new table with sample data.
2. Configure an Amazon Keyspaces CloudWatch dashboard for the table. To get started, you can use a dashboard template available on GitHub.
3. Run the CQL statement, for example using the ALLOW FILTERING option, and check the read capacity units consumed for the full table scan in the dashboard.
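As an alternative to a dashboard, the same metric can be read programmatically. The Python sketch below only builds the request parameters. It assumes the AWS/Cassandra namespace, the ConsumedReadCapacityUnits metric, and the Keyspace and TableName dimensions; verify these against the metrics and dimensions reference before relying on them. The function name and the keyspace and table names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def consumed_read_capacity_request(keyspace, table, minutes=15):
    """Build parameters for a CloudWatch GetMetricStatistics request.

    Namespace, metric, and dimension names are assumptions to verify
    against the Amazon Keyspaces metrics and dimensions reference.
    """
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Cassandra",
        "MetricName": "ConsumedReadCapacityUnits",
        "Dimensions": [
            {"Name": "Keyspace", "Value": keyspace},
            {"Name": "TableName", "Value": table},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,            # one datapoint per minute
        "Statistics": ["Sum"],   # total units consumed in each period
    }


params = consumed_read_capacity_request("amazon_keyspaces", "example_table_2")
print(params["MetricName"])
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("cloudwatch").get_metric_statistics(**params)` after running the statement you want to measure.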