Migrating Cassandra to Amazon DynamoDB

Considering the migration of Apache Cassandra to Amazon DynamoDB? Learn how ABCloudz can help you get there and take advantage of a fully managed NoSQL database service.

Why migrate Apache Cassandra to Amazon DynamoDB?

Are you looking for greater flexibility, speed, and scale for your Apache Cassandra solutions? You can consider Amazon DynamoDB as an alternative for your existing workloads.

The ABCloudz migration team has already used the new data extraction agents to migrate an Apache Cassandra database to Amazon DynamoDB, so here we share our experience with you. We also talk about the unusual migration approach implemented in the AWS Schema Conversion Tool for this pair of databases.

Background on Apache Cassandra

Apache Cassandra is a free and open-source distributed NoSQL database management system. Cassandra is a wide column store database. Rows are organized into tables and the first component of a table’s primary key is the partition key.

Initially, Cassandra was developed to handle large amounts of data across many commodity servers. Since July 2008, when Facebook released Cassandra as open source, developers have created quite a few versions of the Cassandra database. The code behind these versions differs considerably, which causes many incompatibility issues when working with Cassandra workloads. So, you have to pay attention to the Cassandra database version and consider its limitations when you start a migration project.

Background on Amazon DynamoDB

Amazon DynamoDB is a fully managed proprietary NoSQL database service. DynamoDB allows you to create database tables that can store and retrieve any amount of data.

Our developers figured out that DynamoDB databases may serve almost any level of request traffic. You can easily scale up or scale down your DynamoDB tables’ throughput capacity without downtime.

Why choose Amazon DynamoDB instead of Apache Cassandra?

Sometimes Amazon DynamoDB can provide greater scale and performance than your existing Apache or DataStax Cassandra workloads. This very much depends on your database use cases. For example, DynamoDB works better with real-time bidding platforms, gaming applications, and recommendation engines. Cassandra has its own advantages too, but here we will consider the benefits of Amazon’s solution.

  • Performance. DynamoDB scans the data much faster, especially if you don’t have a primary key in your query.
  • Consistency. DynamoDB lets you request strongly consistent reads, while Cassandra can have issues with frequently updated data because of latency between distributed nodes.
  • Global reach. DynamoDB provides Global Tables for deploying multi-region, multi-master databases, so you don’t have to maintain your own replication solution.
  • Security. Protecting data at rest may be a major concern with Cassandra, while DynamoDB encrypts data at rest and in transit.

If you’re encountering similar issues, consider moving your workloads to Amazon DynamoDB! However, this may be a challenging task. Let’s discuss some of the key limitations we encountered while migrating Apache Cassandra to Amazon DynamoDB for our customer.

Workload challenges we see migrating Cassandra to Amazon DynamoDB

Despite the similarities in data models, Apache Cassandra and Amazon DynamoDB have some critical architectural differences. Generally speaking, Apache Cassandra is a column-oriented data store, while Amazon DynamoDB is a key-value and document-oriented store.

A Cassandra table consists of rows, and these rows may contain different numbers of columns. By contrast, DynamoDB treats rows as items and cells as attributes. Here you can define a schema for each item, rather than for the whole table.

Both Cassandra and DynamoDB databases require primary keys for your tables, and they both use partition keys to distribute your data. However, the meaning of partition is different in Cassandra and DynamoDB. In Cassandra, a partition is a set of rows with the same partition key. Therefore, Cassandra stores these rows on one node. In DynamoDB, a partition is a physical part of storage allocated for a particular chunk of a table.

Finally, we should note that Cassandra supports more data types than DynamoDB. So, while migrating Cassandra to Amazon DynamoDB, you need to choose the right type mapping. Below, you can find additional challenges that emerge when migrating Cassandra to Amazon DynamoDB.
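To make the type mapping idea concrete, here is a minimal sketch of a lookup from Cassandra (CQL) scalar type names to DynamoDB attribute type codes. The table and the helper name map_cql_type are our own illustration, not something AWS SCT exposes, and storing temporal and UUID values as strings is an assumption you may want to change. Collection types are deliberately left out.

```python
# Hypothetical mapping from CQL scalar type names to DynamoDB attribute
# type codes: S = string, N = number, B = binary, BOOL = boolean.
CQL_TO_DYNAMODB = {
    "ascii": "S", "text": "S", "varchar": "S",
    "uuid": "S", "timeuuid": "S", "inet": "S",
    # Assumption: temporal values are stored as ISO-8601 strings.
    "timestamp": "S", "date": "S", "time": "S",
    "tinyint": "N", "smallint": "N", "int": "N", "bigint": "N",
    "varint": "N", "float": "N", "double": "N", "decimal": "N",
    "counter": "N",
    "boolean": "BOOL",
    "blob": "B",
}


def map_cql_type(cql_type: str) -> str:
    """Return the DynamoDB type code chosen for a CQL scalar type name."""
    key = cql_type.strip().lower()
    if key not in CQL_TO_DYNAMODB:
        # Fail loudly so unmapped types are reviewed by a person.
        raise ValueError(f"no mapping defined for CQL type {cql_type!r}")
    return CQL_TO_DYNAMODB[key]
```

Raising an error for unknown types, instead of silently falling back to a string, keeps unexpected columns visible during the assessment phase.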




Application conversion

Up to 25% of a migration effort involves updating application code. So, what do I need to effectively update my application code to work with the new database?

Database migration projects include not only schema conversion and data migration. To work properly with your data in the new environment, you need to convert the application code to support the new database platform. In most cases, you can use AWS Schema Conversion Tool to make the application code compatible with the new target database. However, AWS SCT does not support application conversion for Cassandra to DynamoDB migrations. So, you will need to convert your application manually.
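As an illustration of what manual conversion can look like, the sketch below turns a simple CQL partition-key lookup into the parameters of a DynamoDB Query request. The function name and the example table are hypothetical, and the sketch handles only string and numeric keys. The parameter names (TableName, KeyConditionExpression, ExpressionAttributeNames, ExpressionAttributeValues) are those of the low-level DynamoDB Query operation.

```python
def cql_lookup_to_dynamodb_query(table, key_column, key_value):
    """Build Query parameters roughly equivalent to:
    SELECT * FROM <table> WHERE <key_column> = <key_value>
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(key_value, bool) or not isinstance(key_value, (str, int, float)):
        raise TypeError("this sketch supports only string and numeric keys")
    if isinstance(key_value, str):
        attribute_value = {"S": key_value}
    else:
        # DynamoDB transmits numbers as strings.
        attribute_value = {"N": str(key_value)}
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk",
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": key_column},
        "ExpressionAttributeValues": {":pk": attribute_value},
    }
```

Assuming the application uses boto3, the resulting dictionary could be passed as boto3.client("dynamodb").query(**params). Queries that filter on non-key columns or rely on CQL-specific features need a case-by-case redesign.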

Validating migration

After migrating data from one database to another, you want to make sure that all data migrated successfully. In other words, you need to verify your migration. How can I do that?

One of the most important challenges in any database migration project is validation or verification. Of course, you want to be sure that the data in the source and target databases are identical. However, comparing data in two NoSQL databases proves to be a very hard task.
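One simple way to approach validation is to compute a digest of every row on both sides and compare the digests by primary key. The sketch below assumes that rows from both databases have already been read and normalized into plain Python dictionaries with the same attribute names and scalar values. The function names are hypothetical, and this is not the method used by AWS SCT.

```python
import hashlib
import json


def row_digest(row: dict) -> str:
    """Hash a row in a way that does not depend on attribute order."""
    canonical = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_tables(source_rows, target_rows, key):
    """Report keys that are missing, unexpected, or different in the target."""
    source = {row[key]: row_digest(row) for row in source_rows}
    target = {row[key]: row_digest(row) for row in target_rows}
    return {
        "missing": sorted(k for k in source if k not in target),
        "extra": sorted(k for k in target if k not in source),
        "changed": sorted(
            k for k in source if k in target and source[k] != target[k]
        ),
    }
```

For large tables you would run the comparison per partition or on a sample, because holding every digest in memory does not scale.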

Amazon DynamoDB limitations

Amazon DynamoDB has some restrictions and limitations at its core. So, what do I need to consider to successfully migrate Cassandra workloads to Amazon’s cloud platform?

You need to examine DynamoDB’s limitations before the migration starts. For example, you need to design database items (rows) carefully because their size is limited to 400 KB. This limit covers both the binary length of each attribute name and the length of each attribute value, so long attribute names count towards the limit too.
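A rough pre-migration check for the item size limit can be sketched as follows. The estimate counts attribute names and values in bytes. Treating 400 KB as 400 * 1024 bytes and charging every number a worst-case 21 bytes are assumptions of this sketch, as are the function names, so treat the result as an approximation and not as DynamoDB's exact accounting.

```python
MAX_ITEM_BYTES = 400 * 1024  # assumption: 400 KB counted as 1024-byte units


def estimate_item_size(item: dict) -> int:
    """Approximate the stored size of an item with scalar attributes."""
    total = 0
    for name, value in item.items():
        # Attribute names count towards the limit.
        total += len(name.encode("utf-8"))
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        elif isinstance(value, (bytes, bytearray)):
            total += len(value)
        elif isinstance(value, bool) or value is None:
            total += 1
        elif isinstance(value, (int, float)):
            total += 21  # assumption: worst-case size for a number
        else:
            raise TypeError(f"unsupported value type for {name!r}")
    return total


def fits_in_item(item: dict) -> bool:
    """Return True if the estimated size is within the item size limit."""
    return estimate_item_size(item) <= MAX_ITEM_BYTES
```

Running such a check against the widest Cassandra rows before migration shows which rows need to be split across several items or moved to external storage.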

Another limitation is related to Cassandra’s collection types (set, list, and map). They do not map one-to-one onto DynamoDB attribute types: for example, a DynamoDB set can hold only strings, numbers, or binary values and cannot be empty. Moreover, you can’t use AWS Database Migration Service (DMS) to upload the data of these types to the Amazon cloud.

In addition to that, you should consider that in any AWS account the number of tables per region is capped by a service quota (256 by default at the time of writing, so check the current value). If you reach this limit, you can either restructure your database design or ask AWS Support for a service limit increase.

Getting started

ABCloudz has broad experience with moving customers’ workloads to the Amazon cloud. We have been engaged in hundreds of successful customer migrations from on-premises to AWS. Moreover, ABCloudz has hands-on experience with data extraction agents for an Apache Cassandra to Amazon DynamoDB migration. Find some more exciting details below.

Simplifying the process for migrating Cassandra to Amazon DynamoDB

AWS SCT, together with its data extraction agents, supports Apache Cassandra versions 2.0 and 3.0 as a source, with Amazon DynamoDB as a target.

What makes this process interesting is the clone datacenter task. To avoid interfering with production applications that use your Cassandra cluster, AWS SCT will create a clone datacenter and copy your production data into it. The clone datacenter acts as a staging area, so that AWS SCT can perform further migration activities using the clone rather than your production datacenter.

ABCloudz supports the AWS Database Migration Service (DMS) and AWS Schema Conversion Tool (SCT). We have helped hundreds of customers successfully migrate to the Amazon cloud. We use our 12-step migration methodology to help your organization streamline its database migration projects. Below you can find the basic ABCloudz offerings that can help you cost-effectively accelerate cloud adoption.
