Scalable schema for DynamoDB

DynamoDB is a NoSql database and models data in Table-Items-Attributes structure. It allows data-item to be schema less. It’s a huge advantage as it allows you to write data in any format as desired. Even if structure of your data changes eg: a new field is added/deleted at later point of time, it adapts to it well.

With freedom of schema flexibility, data format inconsistency comes alongside. Even if database provides the power of flexibility, the schema shouldn’t change to that point at which it affects the performance of the system. It becomes an overhead to write separate business logic to manage the data inconsistency. It’s a good idea to resolve this issue on database layer.

I came up with a simple, yet effective solution to solve this problem. It worked pretty well in my scenario, hope it works well in yours too. For demonstration purpose Nodejs runtime have been used for business logic.

Here is the dead-simple schema for all of your DynamoDB tables.

As you can see this schema provides all the required data and metadata fields. The data field will contain the core information and others are metadata.

Lets explain major fields one by one.

id

id is the primary key to uniquely identify each item in a table. As DynamoDB doesn’t provide automatic generation of this field, the developer is required to generate it manually. The uuidjs library seems to be the best fit.

v1(version 1) of uuid guarantees that it is always unique even if everyone is doing millions of database insertions/sec. An identifier has been pre-appended so that its more unique and easier to separate from others(its optional though).

Read more about the UUID algorithm here.

data

The data field is a map type attribute. Maps are enclosed in curly braces({ … }). It is similar to a JSON object. There are no restrictions on the data types, and the elements do not have to be of the same type.

example:

The structure is similar for both user and the product. Core data lies inside the data field.

Make sure that the value inside data field isn’t too inconsistent to manage. That inconsistency needs to be handed at either back-end or via front-end code. In case of very complicated inconsistency, the whole idea of this schema becomes useless.

timestamps

The timestamp field provides information on when the data was created and updated. For createdAt field, a timestamp needs to be inserted during data creation. For updatedAt field a simple function can be used to wrap the update logic. The function then automatically appends updatedAt field each time. Here’s a simple one which I wrote for my use

For implementing the soft delete functionality, you can use the deletedAt field. In order to track the item’s author createdBy field comes in handy.

This is my solution to tackle the minor changes in database schema while working with NoSql databases. I prefer to keep very little amount of data format inconsistency and normalize it on front-end during data rendering. If the amount is very high, its best to write a back-end script to normalize the data.

If you find any flaw, have some improvements or you have a different or more cleaner approach, please share in the comments below.

Scalable schema for DynamoDB

SHARE ARTICLE

id

data

timestamps

Subscribe and get our latest Blog Posts right in your inbox