Recently, the ABCloudz team migrated a customer’s database from Apache Cassandra to Amazon DynamoDB. As part of this project, we needed to run multiple tests to ensure the quality of our delivery. However, we faced a problem in setting up the test environment. Even though we were using AWS CloudFormation templates for the EC2 instances, it took over 30 minutes to set up the dev/test environment. In addition, our developers couldn’t run regression tests in parallel. So, we decided to leverage Docker to automate the environment setup.

In this blog post, we will share our experience of using Docker to automate a dev/test environment for a NoSQL database migration.

Original problem

Initially, we needed several virtual machines to run both source and target database applications and execute regression tests. In addition, our engineers spent approximately 30 minutes manually configuring every virtual machine. The image below illustrates the architecture of the original solution.

[Figure: architecture of the original solution]

 

Here is a summary of the key tasks that we needed to perform.

  1. Install the source Apache Cassandra database on one virtual machine.
  2. Create another virtual machine to deploy a new Cassandra node and copy the source data there.
  3. Set up the data replication from the source database to this temporary node.
  4. Set up another virtual machine after the data in this temporary node is ready for extraction.
  5. Install the data extraction agent and start it.
  6. Create log files, directories, and so on during the data extraction.
  7. Repeat steps 1 through 6 for the test environment.

As you can see, we needed to keep quite a few virtual machines up and running on AWS. However, the load on these virtual machines was not stable. At this point, we needed to think about isolating processes so that we could run different tests in parallel.

Creating a dev/test migration environment for Apache Cassandra to AWS using Docker

We created a Docker container with a local repository for the virtual machine we used to extract the Apache Cassandra source data. This container includes all the required applications, such as an Apache Cassandra node, Java, ssh, and sshfs. We also configured all the access settings in it. Here is the high-level architecture of the solution.

[Figure: architecture of the Docker-based solution]
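
To make the contents of such an image more concrete, here is a minimal sketch in Python that renders one possible Dockerfile for it. This is not the actual build file from the project: the official cassandra base image, the openssh-client and sshfs package names, and the setup_env.sh script are all assumptions made for illustration.

```python
from textwrap import dedent


def render_dockerfile(cassandra_tag: str = "3.11",
                      setup_script: str = "setup_env.sh") -> str:
    """Return Dockerfile text for a hypothetical extraction image.

    Assumptions (not from the original project): the base image is the
    official `cassandra` image, and `setup_script` is a script that
    configures access and mounts the required Cassandra nodes.
    """
    return dedent(f"""\
        # Official Cassandra image; it already ships a Java runtime.
        FROM cassandra:{cassandra_tag}

        # ssh and sshfs, used to reach and mount the Cassandra nodes.
        RUN apt-get update \\
            && apt-get install -y --no-install-recommends openssh-client sshfs \\
            && rm -rf /var/lib/apt/lists/*

        # Hypothetical script that prepares the environment on start.
        COPY {setup_script} /usr/local/bin/{setup_script}
        RUN chmod +x /usr/local/bin/{setup_script}

        ENTRYPOINT ["/usr/local/bin/{setup_script}"]
        """)


if __name__ == "__main__":
    print(render_dockerfile())
```

Writing the returned text to a file named Dockerfile and running docker build in the same directory would produce the image. Because the sketch replaces the base image's entry point, the hypothetical script would also be responsible for starting Cassandra itself.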

 

When we run this container, it executes the script that sets up the environment and mounts the required Cassandra nodes. The whole process now takes no more than 1 second! Even more importantly, when we need to run the test again, we simply stop the container and start it again. Different users can now create several containers and use them simultaneously. Also, different containers can use different Cassandra nodes.
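
As an illustration of how such parallel runs could be launched, the following Python sketch builds a docker run command for each user. It is a sketch under stated assumptions, not the project's actual tooling: the image name, the CASSANDRA_HOSTS environment variable, and the container naming scheme are hypothetical. The --cap-add SYS_ADMIN and --device /dev/fuse flags are included on the assumption that the sshfs mounts happen inside the container, which usually requires FUSE access.

```python
import uuid


def docker_run_command(image, user, cassandra_hosts, run_id=None):
    """Build an argument list for one isolated, disposable test container.

    `image`, the CASSANDRA_HOSTS variable, and the naming scheme are
    hypothetical; adjust them to the real environment.
    """
    # A unique suffix lets several users (or runs) coexist on one host.
    run_id = run_id or uuid.uuid4().hex[:8]
    return [
        "docker", "run",
        "--rm",                                    # remove the container on exit
        "--name", f"migration-test-{user}-{run_id}",
        "--cap-add", "SYS_ADMIN",                  # assumed: needed for sshfs/FUSE
        "--device", "/dev/fuse",
        "-e", "CASSANDRA_HOSTS=" + ",".join(cassandra_hosts),
        image,
    ]


if __name__ == "__main__":
    for user, hosts in {"alice": ["10.0.0.1"], "bob": ["10.0.0.2"]}.items():
        print(" ".join(docker_run_command("cassandra-extract:latest", user, hosts)))
```

Each returned list can be passed to subprocess.run(cmd, check=True). Since every container gets its own name and its own list of nodes, rerunning a test amounts to stopping one container and starting a fresh one without touching the others.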

Future plans

The existing solution fully met the project requirements. To refine the process further, we are working on the following improvements:

  • Create separate Docker images for different versions of Cassandra (this decreases the risk of accidental data changes during the test runs).
  • Create Docker images for various agents that can start working automatically (when needed, on request). This will allow several developers to work in parallel with the same Cassandra data center without interfering with each other.
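
One way the per-version images could be produced is sketched below in Python. This is an assumption about a possible approach, not a description of what was built: the repository name and tag scheme are hypothetical, and the sketch presumes a Dockerfile that declares ARG CASSANDRA_VERSION and uses it in its FROM line.

```python
def image_tag(repository, cassandra_version):
    """Return a tag such as 'cassandra-extract:cassandra-3.11' (hypothetical scheme)."""
    return f"{repository}:cassandra-{cassandra_version}"


def docker_build_commands(repository, cassandra_versions, context="."):
    """Build one `docker build` argument list per Cassandra version.

    Assumes the Dockerfile in `context` declares `ARG CASSANDRA_VERSION`.
    """
    return [
        [
            "docker", "build",
            "--build-arg", f"CASSANDRA_VERSION={version}",
            "-t", image_tag(repository, version),
            context,
        ]
        for version in cassandra_versions
    ]


if __name__ == "__main__":
    for cmd in docker_build_commands("cassandra-extract", ["3.11", "4.0"]):
        print(" ".join(cmd))
```

Keeping the version in the tag means a test run can pin the exact Cassandra version it expects instead of relying on whatever image was built last.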

Test Automation with Docker

This is how Docker helped us build a reliable test automation process. However, you can also consider using Docker for a variety of other tasks. In particular, Docker lets you minimize the usage of system resources, and you don’t need to support multiple versions of operating systems. Docker simplifies and speeds up the development and testing processes.

Benefits like these have made Docker one of the most important tools for developers. It’s difficult to overestimate the importance of Docker: we can compare the benefits of this container service with those of Git. And if you haven’t used Docker yet, now is the right time to give it a try.
