Beware of DynamoDB TTL

Bibhuti Poudyal
January 10, 2021
3 minutes min read
Its a very basic requirement for applications to delete data item from the database after it has no use. Certain information like verification code, one time passwords, gift coupons, activation keys etc. don’t need to be stored once used. Typically, a cron job can be scheduled to get rid of such data items at certain […]

Its a very basic requirement for applications to delete data item from the database after it has no use. Certain information like verification code, one time passwords, gift coupons, activation keys etc. don’t need to be stored once used. Typically, a cron job can be scheduled to get rid of such data items at certain intervals. But modern databases have this feature built onto the system.

Dynamodb provides a TTL field to achieve this goal. Dynamodb docs states that

Amazon DynamoDB Time to Live (TTL) allows you to define a per-item timestamp to determine when an item is no longer needed. Shortly after the date and time of the specified timestamp, DynamoDB deletes the item from your table without consuming any write throughput.

At first, it looks like a cool feature. Allows you to free up your database by automatically deleting the old records. Cool, isn’t it ?

But if you are someone(like me) who instantly gets amazed and implements the feature right away without going into depth, you are likely to face the problems I did.

Here are few things to be considered before implementing this feature on your application.

TTL data type and value

The TTL field must be a number DataType. And the value should be in seconds i.e. timestamp in Unix epoch time format in seconds. In case of any other format, the TTL processes ignore the item.

TTL precision

This feature can work only if your task requires hour precision, but not minute or seconds precision.

As Dynamodb docs states

DynamoDB typically deletes expired items within 48 hours of expiration. The exact duration within which an item truly gets deleted after expiration is specific to the nature of the workload and the size of the table. Items that have expired and not been deleted will still show up in reads, queries, and scans.

In my case I was working with verification codes, sent as SMS, which needed to expire after XX minutes. For this TTL didn’t work as it didn’t guarantee minute precision, So I had to manually store timestamp and check for expiration within the lambda function.

References

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html

BACK TO BLOG

Scalable schema for DynamoDB

Bibhuti Poudyal
January 10, 2021
4 minutes min read
DynamoDB is a NoSql database and models data in Table-Items-Attributes structure. It allows data-item to be schema less. It’s a huge advantage as it allows you to write data in any format as desired. Even if structure of your data changes eg: a new field is added/deleted at later point of time, it adapts to […]

DynamoDB is a NoSql database and models data in Table-Items-Attributes structure. It allows data-item to be schema less. It’s a huge advantage as it allows you to write data in any format as desired. Even if structure of your data changes eg: a new field is added/deleted at later point of time, it adapts to it well.

With freedom of schema flexibility, data format inconsistency comes alongside. Even if database provides the power of flexibility, the schema shouldn’t change to that point at which it affects the performance of the system. It becomes an overhead to write separate business logic to manage the data inconsistency. It’s a good idea to resolve this issue on database layer.

I came up with a simple, yet effective solution to solve this problem. It worked pretty well in my scenario, hope it works well in yours too. For demonstration purpose Nodejs runtime have been used for business logic.

Here is the dead-simple schema for all of your DynamoDB tables.

As you can see this schema provides all the required data and metadata fields. The data field will contain the core information and others are metadata.

Lets explain major fields one by one.

id

id is the primary key to uniquely identify each item in a table. As DynamoDB doesn’t provide automatic generation of this field, the developer is required to generate it manually. The uuidjs library seems to be the best fit.

v1(version 1) of uuid guarantees that it is always unique even if everyone is doing millions of database insertions/sec. An identifier has been pre-appended so that its more unique and easier to separate from others(its optional though).

Read more about the UUID algorithm here.

data

The data field is a map type attribute. Maps are enclosed in curly braces({ … }). It is similar to a JSON object. There are no restrictions on the data types, and the elements do not have to be of the same type.

example:

The structure is similar for both user and the product. Core data lies inside the data field.

Make sure that the value inside data field isn’t too inconsistent to manage. That inconsistency needs to be handed at either back-end or via front-end code. In case of very complicated inconsistency, the whole idea of this schema becomes useless.

timestamps

The timestamp field provides information on when the data was created and updated. For createdAt field, a timestamp needs to be inserted during data creation. For updatedAt field a simple function can be used to wrap the update logic. The function then automatically appends updatedAt field each time. Here’s a simple one which I wrote for my use

For implementing the soft delete functionality, you can use the deletedAt field. In order to track the item’s author createdBy field comes in handy.

This is my solution to tackle the minor changes in database schema while working with NoSql databases. I prefer to keep very little amount of data format inconsistency and normalize it on front-end during data rendering. If the amount is very high, its best to write a back-end script to normalize the data.

If you find any flaw, have some improvements or you have a different or more cleaner approach, please share in the comments below.