Cost Optimization Tactics for Amazon DocumentDB
In cloud computing, cost optimization often goes hand in hand with better utilization and performance. If you are using Amazon DocumentDB, you're in the right place. We are going to show you how to cut costs and resource usage without compromising performance. Let's get started!
How does Amazon DocumentDB pricing work?
Understanding how Amazon DocumentDB charges you is the first step to saving money. Let's look at the main parts of your bill:
- Instance Pricing: This is what you pay for the compute instances in your cluster. You're billed per second, and the minimum time you'll be charged for is 10 minutes.
- Database I/O Operations: This is the number of storage input/output requests you make. You're billed for every million requests you make in a billing cycle.
- Database Storage: This part is about how much data you store in your cluster. You pay for each GB stored per month.
- Backup Storage: This includes storage for automatic backups of your database and any snapshots you take. You pay for each GiB stored per month.
- Other AWS Costs: You may also incur additional charges, such as data transfer between your applications and Amazon DocumentDB when they are in different Regions.
Next, we’ll go into each of these parts in more detail and show you ways to save money.
How to save 25% to 50% on instance costs in Amazon DocumentDB
Choosing the right instance size
The first step in reducing costs is selecting the appropriate instance size for your database clusters. It's vital to ensure that the instance size is a good fit for your workload. Your cluster should have enough memory to hold your application's working data and serve queries with low latency. Choosing the right instance size and number of instances for your cluster not only affects performance but can also help you reduce costs. To find the ideal instance size for your specific workload or cluster, consider using a third-party Amazon DocumentDB sizing calculator. This tool can guide you in making a smart choice. Note: The instance price for a database cluster is multiplied by the number of instances in the cluster.
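To make the note about instance count concrete, here is a minimal sketch of the arithmetic. The hourly price of $0.10 and the 730-hour month are assumptions for illustration only, not actual Amazon DocumentDB prices; check the pricing page for your instance class and Region.

```python
def monthly_instance_cost(hourly_price_usd: float, instance_count: int,
                          hours: float = 730.0) -> float:
    """Estimated monthly compute cost: per-instance price times instance count."""
    if instance_count < 1:
        raise ValueError("a cluster needs at least one instance")
    return hourly_price_usd * hours * instance_count


# Hypothetical price of $0.10 per instance-hour, for illustration only.
three_instances = monthly_instance_cost(0.10, 3)
one_instance = monthly_instance_cost(0.10, 1)
print(f"3 instances: ${three_instances:.2f}, 1 instance: ${one_instance:.2f}")
```

Under these assumptions, each instance you remove takes one full instance-price off the monthly compute bill.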
Save more by pausing instances
You can stop a cluster you are not using for up to 7 days at a time, which pauses its instances. This is particularly useful in testing environments over weekends. By stopping and starting clusters as needed, you avoid paying for compute while the resources sit idle. Services such as AWS Lambda and Amazon EventBridge can automate this process. This kind of automation can also be set up in three clicks with Cloudchipr. Currently, Cloudchipr supports automated Off Hours for RDS and EC2, and support for DocumentDB is coming soon.
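One way such automation could look is a small Lambda function triggered by two EventBridge schedules, one to stop and one to start. The sketch below assumes boto3's `docdb` client and its `stop_db_cluster` / `start_db_cluster` calls; the event shape and the cluster names are hypothetical, and IAM permissions and error handling are left out.

```python
def set_cluster_state(cluster_ids, action, client=None):
    """Stop or start each DocumentDB cluster; returns the IDs that were processed."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    if client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS access
        client = boto3.client("docdb")
    call = client.stop_db_cluster if action == "stop" else client.start_db_cluster
    processed = []
    for cluster_id in cluster_ids:
        call(DBClusterIdentifier=cluster_id)
        processed.append(cluster_id)
    return processed


def lambda_handler(event, context):
    # Hypothetical event shape: {"action": "stop", "clusters": ["dev-cluster"]},
    # sent by one EventBridge schedule in the evening and another in the morning.
    return set_cluster_state(event["clusters"], event["action"])
```

Passing the client in as a parameter keeps the function easy to test with a stand-in object before pointing it at real clusters.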
Understanding Amazon DocumentDB deployments: Single-AZ vs. Multi-AZ
Amazon DocumentDB offers two deployment options, each with its own Service Level Agreement (SLA):
- Single-AZ Deployments: Suitable for less critical work.
- Multi-AZ Deployments: Ideal for crucial applications, with a stronger uptime promise.
Please note that SLA commitments may not apply in cases beyond AWS's control or when operational guidelines are not adhered to. For a deeper understanding, refer to the Amazon DocumentDB Service Level Agreement.
Saving money with Single-Instance Durability
Amazon DocumentDB is designed to separate storage from compute. This setup means that even clusters with a single instance provide high durability: six copies of your data are kept across three Availability Zones.
Typically, production clusters have three or more instances for enhanced reliability. However, if high availability isn't a priority for you, as in development environments, you can opt for a single-instance cluster to save costs. The T family instances, such as db.t4g.medium and db.t3.medium, are good choices for development, testing, or small-scale production tasks.
Lowering database storage and I/O costs
Understanding data storage billing
The way Amazon DocumentDB charges for storage depends on how much space you use. The system will automatically make more room as your data increases, up to a maximum of 128 TiB. However, you're only billed for the space your data actually takes up in the cluster volume.
For a deeper understanding of pricing, refer to the detailed examples in the Amazon DocumentDB documentation.
Choosing the right storage configuration
With version 5.0 and later, Amazon DocumentDB offers two storage options for instance-based clusters:
- Amazon DocumentDB Standard Storage: This is a good choice if you think your I/O costs will be less than 25% of your total DocumentDB cluster expenses.
- Amazon DocumentDB I/O-Optimized Storage: Opt for this if you expect your I/O costs to be more than 25% of your overall DocumentDB cluster costs. It's more cost-effective for higher I/O demands.
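The 25% rule of thumb above can be expressed as a small check against your own bill. This is a sketch only: the function name is hypothetical and the cost figures are assumed to come from your billing data.

```python
def recommend_storage(io_cost_usd: float, total_cluster_cost_usd: float) -> str:
    """Apply the 25% rule: a high I/O share of the bill favors I/O-Optimized storage."""
    if total_cluster_cost_usd <= 0:
        raise ValueError("total cluster cost must be positive")
    io_share = io_cost_usd / total_cluster_cost_usd
    return "I/O-Optimized" if io_share > 0.25 else "Standard"


print(recommend_storage(io_cost_usd=100, total_cluster_cost_usd=1000))
print(recommend_storage(io_cost_usd=400, total_cluster_cost_usd=1000))
```

It is worth running this against several months of billing data, since a single unusual month can push the I/O share across the threshold.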
Important to Know:
You can change your existing database clusters to Amazon DocumentDB I/O-Optimized storage once every 30 days. If you want to switch back to Standard Storage, you can do that anytime. To keep track of when you can switch to I/O-Optimized storage again, use the describe-db-clusters command in the AWS CLI or look at the configuration page of your cluster in the AWS Management Console.
Now, let's explore how Amazon DocumentDB calculates I/Os
In Amazon DocumentDB, I/O (Input/Output) costs arise when transaction logs are sent to the storage layer. Because DocumentDB keeps compute and storage separate, all your data is stored in six copies across three Availability Zones, just as in Amazon Aurora. A key thing to remember is that I/Os are not counted multiple times for different instances in the same cluster.
Measuring I/Os in DocumentDB
I/Os in DocumentDB are measured in chunks: 8KB for reading and 4KB for writing. This means if you write anything from 1KB to 4KB, it counts as one I/O. But if you write 5KB, it counts as two I/Os. The same kind of rule applies to reading data.
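The chunk rule described above comes down to a ceiling division. The sketch below is a simplified model of that rule only; real I/O counts also depend on batching and caching, so treat the result as an estimate.

```python
READ_IO_BYTES = 8 * 1024   # reads are measured in 8 KB chunks
WRITE_IO_BYTES = 4 * 1024  # writes are measured in 4 KB chunks


def _chunks(size_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to cover size_bytes (ceiling division)."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    return -(-size_bytes // chunk_bytes)


def write_ios(size_bytes: int) -> int:
    return _chunks(size_bytes, WRITE_IO_BYTES)


def read_ios(size_bytes: int) -> int:
    return _chunks(size_bytes, READ_IO_BYTES)


print(write_ios(4 * 1024))  # a 4 KB write fits in one chunk
print(write_ios(5 * 1024))  # a 5 KB write spills into a second chunk
```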
Optimizing I/O usage
DocumentDB has a smart way of handling small, concurrent write operations. If these are smaller than 4KB, it batches them together. This approach optimizes I/O usage. Plus, unlike some other databases, DocumentDB doesn’t frequently move modified database pages to the storage layer. This method also helps to save on I/O costs.
Managing TTL index usage to reduce costs
In Amazon DocumentDB, every time the Time-To-Live (TTL) monitor removes documents, it performs I/O operations that add to your costs. Features like TTL and change streams create I/O charges when data is written, read, or deleted. If you have these features enabled but your application doesn't actually depend on them, turning them off can help you save money.
Be careful, though:
If you're using your system more and the TTL feature is deleting documents more often, your costs might go up because of more I/O usage. A good money-saving idea could be to not use a TTL index for deleting documents. Instead, put your documents in collections based on time and just get rid of these collections when you don't need them anymore. This way, you avoid the I/O costs that come with using TTL indexes and can be more cost-effective.
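As a rough sketch of the time-based collection approach, the helpers below name one collection per month and pick out the ones that have aged out. The `events` prefix, the monthly granularity, and both function names are assumptions for illustration.

```python
from datetime import date


def collection_for(day: date, prefix: str = "events") -> str:
    """Name of the monthly collection a document created on `day` belongs to."""
    return f"{prefix}_{day.year:04d}_{day.month:02d}"


def expired_collections(names, today: date, keep_months: int, prefix: str = "events"):
    """Collections older than the current month plus `keep_months` previous months."""
    cutoff = today.year * 12 + (today.month - 1) - keep_months
    expired = []
    for name in names:
        if not name.startswith(prefix + "_"):
            continue  # not one of our time-based collections
        try:
            year, month = (int(part) for part in name[len(prefix) + 1:].split("_"))
        except ValueError:
            continue  # unexpected suffix, leave it alone
        if year * 12 + (month - 1) < cutoff:
            expired.append(name)
    return sorted(expired)
```

With a driver such as pymongo, the names could come from `db.list_collection_names()` and each expired one could be removed with `db.drop_collection(name)`; dropping a whole collection replaces the many per-document deletes a TTL index would perform.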
Use caching to cut costs
A great way to reduce the load on your DocumentDB instances and save money is by using a caching service such as Amazon ElastiCache. By storing frequently used data in a cache, you reduce the number of times you need to read data from the database. This not only lowers your costs but also makes your system respond faster because of lower latency.
The BufferCacheHitRatio CloudWatch metric indicates the percentage of documents and indexes that are served from an instance's memory instead of the storage volume. Generally, you want the BufferCacheHitRatio value to be close to 100.
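The usual pattern here is cache-aside: check the cache first and only query the database on a miss. The sketch below uses an in-memory dictionary as a stand-in for a real cache such as ElastiCache, and `load_from_db` is a hypothetical callable that would wrap your DocumentDB query.

```python
import time


class CacheAside:
    """Read-through helper: serve from cache, fall back to the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical function that queries DocumentDB
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: no database read
        value = self._load(key)        # cache miss: one database read
        self._entries[key] = (now + self._ttl, value)
        return value
```

The time-to-live bounds how stale a cached document can get; a shorter value trades some of the savings for fresher reads.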
Document compression
Starting with version 5.0, documents in collections can be compressed. This can help you reduce both storage and I/O costs. You can turn on compression for each collection and see how well it's working by looking at metrics like the size of the compressed documents. DocumentDB uses the LZ4 algorithm, which is known for its efficiency in compression.
Important things to know about document compression:
- By default, document compression is not enabled.
- An existing collection cannot be compressed in place, but its documents can still be moved between formats. To store existing uncompressed documents in compressed format, copy them to a compression-enabled collection. To convert compressed documents to uncompressed format, copy them to a compression-disabled collection.
- This feature is only available in Amazon DocumentDB version 5.0 and later.
- DocumentDB only compresses documents that are 2KB or larger.
How to enable document compression
You can turn on document compression for a new collection in Amazon DocumentDB. To do this, use the db.createCollection() method when you're creating the collection. This step will enable compression right from the start.
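As a sketch of what the collection options might look like, the helper below builds the `storageEngine` document that is passed to `db.createCollection()`. The option shape reflects our understanding of the Amazon DocumentDB documentation and should be verified against it; the helper name is hypothetical.

```python
def compression_options(enable: bool = True) -> dict:
    """Collection options requesting document compression (shape is an assumption)."""
    return {
        "storageEngine": {
            "documentDB": {
                "compression": {"enable": enable},
            }
        }
    }


print(compression_options())
```

With pymongo, these options could be passed as keyword arguments, for example `db.create_collection("sample_collection", **compression_options())`.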
Monitoring document compression
To determine whether a collection has been compressed and to figure out its compression ratio, you can follow these steps.
Run either db.printCollectionStats() or db.collection.stats() in the mongo shell to see the compression statistics. This will give you information on both the uncompressed and compressed sizes of the collection, allowing you to assess how much storage you're saving through document compression. The fields below are described using the statistics of a collection called “sample_collection” as an example.
In this output:
- size reflects the total size of the document collection before compression.
- avgObjSize denotes the mean size of each document before compression, rounded to the nearest tenth. It's measured in bytes.
- storageSize indicates the amount of space the collection actually uses after compression, also in bytes.
- enabled under compression shows whether the compression feature is active.
To figure out the compression ratio, you simply divide the original size (size) by the post-compression size (storageSize). In our case, for the "sample_collection", this would be 3906.3 divided by 1953.1, showing a compression ratio of approximately 2:1.
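The same division can be done programmatically. This sketch uses only the two figures quoted above; the dictionary stands in for the real statistics output, which contains more fields.

```python
def compression_ratio(stats: dict) -> float:
    """Uncompressed size divided by the space actually used after compression."""
    return stats["size"] / stats["storageSize"]


# Figures quoted in the text for "sample_collection".
sample_stats = {"size": 3906.3, "storageSize": 1953.1}
print(f"compression ratio is roughly {compression_ratio(sample_stats):.1f}:1")
```

With pymongo, a comparable dictionary could be obtained from `db.command("collStats", "sample_collection")`.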
Reducing costs by efficient index management
The Importance of Indexing Your Queries
Using indexes for your queries in DocumentDB is a smart way to reduce I/O costs. When you index your queries, they usually need less I/O than if you were to scan your entire collection for data. This method still uses some CPU and I/O, but it's a lot less than what you'd need for a full collection scan.
Tips for further reducing I/O costs with indexes
To cut down on your I/O costs even more, especially those related to garbage collection, consider these steps:
- Delete any indexes that you're not using. Keeping unused indexes can add unnecessary I/O costs.
- Adjust your instances so that your indexes can fit in memory. Having indexes in memory rather than on storage can significantly reduce I/O operations.
By following these tips, you can make sure your DocumentDB usage is more efficient and cost-effective.
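To find candidates for deletion, one option is the `$indexStats` aggregation stage, which reports how often each index has been used. The sketch below filters documents of the shape that stage returns in MongoDB-compatible drivers (`name` and `accesses.ops`); treat the exact shape as an assumption and confirm it on your cluster.

```python
def unused_indexes(index_stats):
    """Names of indexes with zero recorded accesses, skipping the mandatory _id index."""
    names = []
    for entry in index_stats:
        if entry["name"] == "_id_":
            continue  # the _id index cannot be dropped
        if entry.get("accesses", {}).get("ops", 0) == 0:
            names.append(entry["name"])
    return names


# Illustrative data only; real input would come from the $indexStats stage.
sample = [
    {"name": "_id_", "accesses": {"ops": 0}},
    {"name": "email_1", "accesses": {"ops": 1520}},
    {"name": "legacy_flag_1", "accesses": {"ops": 0}},
]
print(unused_indexes(sample))
```

With pymongo, the input could be `list(collection.aggregate([{"$indexStats": {}}]))`. Usage counters typically restart with the instance, so check them on every instance and over a representative period before dropping anything.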
Managing backup storage to control costs
How Backup Storage Works
There are two types of backup storage in Amazon DocumentDB:
- Continuous Backups: These are the backups that happen regularly within your set backup retention time (the period you choose to keep backups).
- Manual Snapshots: These are the backups you make yourself and choose to keep beyond the backup retention time.
Free backup storage and extra costs
Amazon DocumentDB includes free backup storage equal to the size of your cluster's data storage (the backup retention period itself can be set to up to 35 days). For example, if you store 100 GB in your cluster, you also get 100 GB of backup storage at no cost. This covers your automated backups and any manual snapshots, as long as they are within your backup retention period.
A few important points to note:
- Backup storage allocation: Your free backup storage allowance is calculated per Region, and usage is the total of all backups in that Region.
- Retention period: If your backup retention period is set to one day, there's no extra charge for backup storage. This means you won't pay anything extra as long as you don't keep manual snapshots beyond this one-day period.
- Extra snapshots and Region transfers: Having more backups than your free allowance or copying snapshots to another AWS Region will use extra backup space and might lead to charges.
- Charges for extra storage: If you go over your free backup storage amount, you'll be charged for the excess. This is generally around $0.021 per GB each month, but the price varies by AWS Region.
Effective backup storage management
To handle your backup storage costs better, you might want to do a couple of things:
- Shorten the backup retention duration if you don't need to keep backups for a long time.
- Delete old manual snapshots that you don't need anymore.
Keeping track of your backup storage expenses
It's important to watch how much storage you're using for both continuous backups and manual snapshots. This way, you can decide if you need to reduce your backup retention time or get rid of unneeded snapshots.
Amazon CloudWatch provides useful metrics to help you monitor your backup storage usage:
- BackupRetentionPeriodStorageUsed: This shows the storage used for continuous backups right now. It's limited to the size of your cluster for the retention period. For example, if your cluster is 100 GiB and you have a two-day retention period, the max this metric can show is 200 GiB.
- SnapshotStorageUsed: This measures the storage for manual snapshots kept after the backup retention time. Only snapshots kept beyond the retention period count here. The size of each snapshot is the same as your cluster's size when the snapshot was taken.
- TotalBackupStorageBilled: This adds up the storage used for continuous backups and manual snapshots, minus one day of free backup storage. For example, for a 100 GiB cluster with one extra snapshot and a one-day retention period, the total billed would be 100 GiB (100 GiB of continuous backup plus 100 GiB of snapshot, minus the 100 GiB free allowance).
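The relationship between the three metrics can be sketched as a short calculation. The function name is hypothetical, and the $0.021 rate is the approximate figure quoted earlier, which varies by Region.

```python
def total_backup_storage_billed(continuous_gib: float, snapshot_gib: float,
                                cluster_gib: float) -> float:
    """Continuous backups plus retained snapshots, minus the free allowance."""
    return max(0.0, continuous_gib + snapshot_gib - cluster_gib)


# 100 GiB cluster, one-day retention, one manual snapshot kept beyond retention.
billed = total_backup_storage_billed(continuous_gib=100, snapshot_gib=100, cluster_gib=100)
estimated_cost = billed * 0.021  # approximate rate; check your Region's pricing
print(f"billed: {billed:.0f} GiB, estimated cost: ${estimated_cost:.2f} per month")
```

With no retained snapshot and a one-day retention period, the same calculation yields zero, matching the no-charge case described earlier.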
Reminder
This is something a lot of users forget: Even if you delete instances in a cluster, you'll still be billed for the storage and backup usage associated with that cluster. To completely stop all charges, you need to delete both your cluster and any manual snapshots you've taken.
Conclusion
Amazon DocumentDB costs can be tricky to navigate, but the tactics above cover the main levers: instances, storage and I/O, and backups. Remember, the secret to saving on Amazon DocumentDB is a balanced approach that fits your needs.