Recently, the ABCloudz team migrated a customer’s database from Apache Cassandra to Amazon DynamoDB. As part of this project, we needed to run multiple tests to ensure perfect quality for our delivery. However, we faced a problem in setting up the test environment. Even though we were using AWS CloudFormation templates for the EC2 instances, it was taking over 30 minutes to set up the dev/test environment. In addition, our developers couldn’t run regression tests in parallel. So, we decided to leverage Docker to automate the environment setup.
In this blog post, we will share our experience using Docker to automate the setup of a NoSQL database migration dev/test environment.
Original problem
Initially, we needed several virtual machines to run both the source and target database applications and to execute regression tests. The whole process was cumbersome and time-consuming: our engineers spent approximately 30 minutes manually configuring each virtual machine. The image below illustrates the architecture of the original solution.
Here is a summary of the key tasks that we needed to perform:
1. Install the source Apache Cassandra database on one virtual machine.
2. Create another VM to deploy a new Cassandra node and copy the source data there.
3. Set up data replication from the source database to this temporary node.
4. Set up another virtual machine after the data in the temporary node is ready for extraction.
5. Install the data extraction agent and start it.
6. During data extraction, create log files, directories, etc.
7. Repeat steps 1 through 6 for the test environment.
As you can see, we needed to keep several virtual machines up and running on AWS. However, the load on these virtual machines was not stable. At this point, we needed to think about isolating processes to run different tests in parallel.
Creating a dev/test migration environment for Apache Cassandra to AWS using Docker
We created a Docker container with a local repository for the virtual machine we used to extract the Apache Cassandra source data. This container includes all required applications, such as an Apache Cassandra node, Java, SSH, SSHFS, etc. We also set up all necessary access settings. The image below shows the high-level architecture of the solution.
When we run this container, it executes a script that sets up the environment and mounts the required Cassandra nodes. The entire process now takes no more than 1 second! Even more importantly, when we need to run the test again, we simply stop the container and start it again. Now, different users can create several containers and use them simultaneously, with each container mounting whichever Cassandra nodes it needs.
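To make the idea of one isolated container per developer more concrete, here is a minimal Python sketch that assembles a `docker run` command for each user. It is an illustration under stated assumptions, not the exact setup from our project: the image name `example/cassandra-extract:latest`, the `CASSANDRA_HOSTS` environment variable, the container naming scheme, and the IP addresses are all hypothetical. The `--device /dev/fuse` and `--cap-add SYS_ADMIN` options are included because SSHFS is FUSE-based and mounting inside a container typically needs them; your environment may require different settings.

```python
import shlex
import uuid


def build_run_command(image, cassandra_hosts, run_id=None):
    """Return the argv list for a `docker run` call that starts one test container.

    Every name used here (image, env variable, container prefix) is a
    hypothetical placeholder chosen for illustration.
    """
    # A unique suffix keeps container names from clashing when several
    # developers start containers on the same host.
    run_id = run_id or uuid.uuid4().hex[:8]
    return [
        "docker", "run",
        "--rm",                    # remove the container on stop, so every run starts clean
        "--detach",                # run in the background
        "--name", f"cassandra-extract-{run_id}",
        "--device", "/dev/fuse",   # SSHFS is FUSE-based and typically needs this device
        "--cap-add", "SYS_ADMIN",  # mounting inside a container typically needs this capability
        "--env", "CASSANDRA_HOSTS=" + ",".join(cassandra_hosts),
        image,
    ]


if __name__ == "__main__":
    # Print one command per developer; pass the list to subprocess.run() to execute it.
    for user in ("alice", "bob"):
        cmd = build_run_command(
            "example/cassandra-extract:latest",
            ["10.0.0.11", "10.0.0.12"],
            run_id=user,
        )
        print(shlex.join(cmd))
```

In this sketch the container's startup script would read `CASSANDRA_HOSTS` and mount the listed nodes. Because each container gets its own name and is removed when it stops, rerunning a test amounts to stopping the container and issuing the same command again.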
Future plans
The existing solution worked well and fully met the project requirements. To further improve the process, we are working on the following enhancements:
- Create separate Docker images for different versions of Cassandra. This will decrease the risk of accidental data changes during test runs.
- Create Docker images for various agents that can start working automatically when needed, on request. This will allow several developers to work in parallel with the same Cassandra data center without interference.
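The first enhancement, one image per Cassandra version, could be sketched roughly as follows. This is only an illustration: it assumes a Dockerfile that declares `ARG CASSANDRA_VERSION`, and the repository name `example/cassandra-extract`, the tag scheme, and the version numbers are hypothetical examples rather than the ones we plan to use.

```python
def image_tag(version, repo="example/cassandra-extract"):
    """Return a tag that encodes the Cassandra version (hypothetical naming scheme)."""
    return f"{repo}:cassandra-{version}"


def build_commands(versions, repo="example/cassandra-extract", context="."):
    """Return one `docker build` argv list per Cassandra version.

    Assumes the Dockerfile in `context` declares `ARG CASSANDRA_VERSION`
    and uses it to choose which Cassandra release to install.
    """
    return [
        [
            "docker", "build",
            "--build-arg", f"CASSANDRA_VERSION={version}",
            "--tag", image_tag(version, repo),
            context,
        ]
        for version in versions
    ]


if __name__ == "__main__":
    # Example version numbers only; substitute the versions you need to test.
    for cmd in build_commands(["3.11", "4.0"]):
        print(" ".join(cmd))
```

Tagging each image with its Cassandra version lets a test run pick the matching image explicitly, which is one way to keep runs against different versions from touching each other's data.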
Test automation with Docker
This is how Docker helped us build an efficient test automation process. However, you can also use Docker for a variety of other tasks. For instance, Docker can help minimize the usage of system resources, and you won’t need to maintain multiple versions of operating systems. Docker simplifies and speeds up the development and testing processes.
Docker has become one of the most essential tools for any developer. It’s hard to overstate its importance. We can compare the benefits of this revolutionary container service to those of Git. If you haven’t used Docker yet, now is the right time to give it a try.