Microsoft Data Warehouse Solutions

See how ABCloudz can help you get the most out of the Microsoft Data Warehouse platform technologies like SQL Server, Azure Synapse Analytics, Azure Data Lake, and more.

Why you should consider data warehouse solutions from Microsoft

Microsoft offers the most comprehensive logical data warehouse solution for on-premises and the cloud. In the case of the cloud, we are talking about Microsoft Azure and Office 365 with integration of services like Power BI, PowerApps, Flow, SharePoint and other software-as-a-service productivity applications.

In our experience, SQL Server 2017 Data Warehouse features provide the best performance, security, and high availability compared to other popular databases on the market. Here is how Microsoft promotes their advantage over other platforms:

Advantages for using Microsoft Azure Synapse Analytics solutions featuring SQL Server 2017 on-premises

In addition to using SQL Server 2017, Microsoft offers the Analytics Platform System with appliance offerings from HP, Dell, and QCT for massively paralleled processing power that can scale out to 6 petabytes of data.

Likewise, Microsoft Azure data services featuring Azure Synapse Analytics (previously Azure SQL Data Warehouse) and Azure Data Lake Analytics provides great on-demand scaling with competitive pricing. More importantly, you have excellent integration with other services like the Microsoft AI platform and open source solutions.

Microsoft Azure Synapse Analytics services advantage

Everything you hear from vendors like Microsoft is that their platform provides the freedom of choice. The problem is when you have too many choices!

Choices within your data center

Within your data center, you have several choices focused around your IT infrastructure. For example,

  • Servers:
    • Size: Do you go with bare-metal servers or virtualization with VMware or Hyper-V using a large host server?
      • Tip: Most of the time, we recommend going with a virtual environment to avoid over-provisioning of CPU cores and memory.
    • Memory: How much memory is enough? SQL Server will attempt to use all available memory to cache data or use it for In-Memory OLTP and In-Memory Columnstore.
      • Tip: Get as much memory as the server supports. All the database vendors use per-core licensing. However, the more memory you have, you reduce the potential for slow query performance due to memory pressure. If you are virtualizing systems, you can always allocate the memory for other VMs and services.
      • Tip: If you want the best query performance for data warehouse queries with SQL Server 2017, consider running on Linux and using a combination of In-Memory tables with Clustered Columnstore indexes. To determine how much system memory is needed to keep the data fully in memory, the general rule is to allocate 3 times the size of the table when using just a Clustered Columnstore index for RAM. For example, if your Fact table uses 1 TB of Clustered Columnstore data, plan on 3 TB of RAM.
    • Operating System: Now that SQL Server 2017 runs on Linux, you have yet another choice to make. Here is a data point to consider – the SQL Server 2017 Enterprise Edition running on SUSE Linux Enterprise Server 12 SP3 for X86_64 holds the world record for the new TPC-H Advanced Sort Results benchmark as of March 21, 2018.
      • Tip: If you run both Windows and Linux operating systems in your data center, consider SQL Server 2017 on Linux. DBA tools like SQL Server Management Studio fully support management of SQL Server 2017 on Linux. It’s well worth a little Linux training for your SQL Server DBAs to get better performance with lower operating costs. Our solution architects can craft a one-day crash course on Linux for the SQL Server DBA for your team.
    • Data Warehouse Fast Track solutions: One of the easiest ways to optimize SQL Server 2017 for data warehouse solutions is to choose one of the hardware vendors and their Fast Track solution. Hardware vendors like HPE, Dell EMC, Cisco, Quanta, Lenovo, and more have created hardware solutions based on the Microsoft fast-track reference architecture, optimized for data warehouse solutions. Appliances range in size to support data warehouses from 1 TB  to 120 TB using a single server with a symmetric multiprocessing (SMP) architecture.
    • Analytics Platform System solutions: If your data warehouse has to grow beyond 120 TB and it has to remain on-premises based on your organization’s data governance policies, Microsoft Analytics Platform is a great way to go for up to 6 petabytes of storage. Solutions from HPE. Dell EMC, and QCT.
  • Storage:
    • On-board versus different types of storage arrays: We recommend a hybrid approach when it comes to storage systems based on our experience and guidance from the Fast Track reference architecture.
      • Tip: Consider using Flash storage on the physical server for allocation to the virtual machine for SQL Server’s TempDB. For complex data warehouse queries, TempDB can often be the file IO bottleneck. Local flash storage can dramatically reduce disk latency attributed to TempDB usage.
      • Tip: Consider using all-flash storage arrays from your favorite storage vendor. You typically can get five times the density within your physical rack, five times the power efficiency, and ten times greater reliability over traditional spinning disk drives. Not to mention 2M raw IOPS depending on your vendor’s solution. ABCloudz has vendor relationships with Western Digital, Cisco, Pure Storage, and NetApp. Our solution architects along with architects from our storage partners can help your team optimize performance for your data warehouse solutions.

Choices in the cloud

Here is where the decision-making process becomes interesting.

Virtual machines:
Microsoft, AWS, and Google continually improve server technology with more cores and more RAM. For example, you can provision a server with up to 128 cores and 4 TB of RAM that can support 30 TB of data on SSD storage. If you then consider using SQL Server 2017 PolyBase with scale-out groups to connect to one of the supported Hadoop distributions, Azure Blob Store, or Azure Data Lake Store, you can effectively extend your data warehouse limited only by your cloud spending limit.

  • Tip: Based on our benchmarking and customer experience, running SQL Server on virtual machine solutions for Azure, AWS, and Google Cloud will give you greater performance per dollar for up to 15 TB data warehouses compared to Azure Synapse Analytics. Of course, you need to manage the server. However, Microsoft Azure provides manageability features like automated patching within a scheduled maintenance window, automated backups to blob storage, Geo Redundant Storage, easily configurable Always On Availability Groups with readable secondaries, Azure Site Recovery, and pre-configured VM templates optimized for data warehouse solutions.

Azure Synapse Analytics: This solution offers a cloud-based massively parallel processing (MPP) data warehouse architecture similar to the Microsoft Analytics Platform System. Due to the MPP architecture, we recommend, based on guidance from Microsoft and our own experience with customers, considering Azure Synapse Analytics for data warehouse sizes 10TB or more. You get essentially unlimited scale with PolyBase capabilities to integrate your cloud-borne dark data with data from your enterprise data warehouse.

  • Tip: Don’t consider a proof of concept pilot unless you have at least 10 TB of data from your data warehouse. We’ve had first-hand customer experience migrating small Oracle Data Warehouse solutions to Azure Synapse Analytics under 5 TB, where performance was substantially slower than the original Oracle database system. Once we increased the data over 10 TB, Azure Synapse Analytics performance was significantly faster with a great price per terabyte savings over the Oracle solution.

Azure Synapse Analytics Overview

Oracle and Microsoft prohibit us from publishing benchmark values, but you can contact us directly to learn more about migrating your Oracle Data Warehouse databases to Azure Synapse Analytics.

Azure Data Lake Store: This serverless based solution, along with the U-SQL for querying non-relational data across a variety of data sources, provides a great way to understand your dark data.

  • Tip: Consider using just Azure Data Lake Store to analyze your dark data. If you discover data that needs to get incorporated into your Enterprise Data Warehouse, go ahead and update your data model and ETL processes to load the data into either Azure Synapse Analytics or SQL Server running on an Azure VM. You can then incorporate your standard practices of using data cleansing, master data management, and optimizations with Analysis Services for a consistent view of the data for business decision makers.

Here is a great overview provided by Microsoft to better understand the choices available to get the most out Microsoft Data Warehouse solutions.

Microsoft Modern Data Estate for Data Warehouse Solutions

Getting started

We have a variety of offers and incentives from Microsoft to help you get started with our team of experts to future-proof your data warehouse and analytics solutions.

We start with assessing your data environment. At the beginning of a project, you discuss existing database deployments and tools with a dedicated ABCloudz project team. This step helps establish the information, resources, and team members needed from your side.

You then make your data infrastructure available to trusted ABCloudz team members. We use this access to identify issues and craft a modernization path that fits within your current environment.

Our proficient solutions architects then design the future-state cloud architecture of your data environment. Our experienced team can provide you with comprehensive technical and financial information about the potential pros and cons of any proposed modernization project.

As a result, we can either provide your team with cloud deployment scripts, recommendations, and suggestions, or go all the way and create the cloud infrastructure using our manpower.

Building an ultimate cloud-based business intelligence solution

One of the largest North American agricultural enterprises had been envisioning and designing a new cloud-based Business Intelligence and Data Warehousing architecture. They were using a fragmented system, based on Oracle databases, while the reports were presented as a mix of SQL Server Reporting Services (SSRS) and Excel dashboards.

We came up with a solution allowing them to consolidate all reports with Power BI dashboards. In addition, we migrated the source Oracle database to Azure Synapse Analytics to reduce licensing costs and improve database performance.

Learn more about how we approached this migration project from the following video.

Hi there! How can I help you?