This post shares how ABCloudz and Ellucian helped one of the largest U.S. community college systems move from a legacy SQL Server infrastructure to a cloud-native analytics platform on AWS, using Ellucian Data Connect to modernize data pipelines and integrations. We dealt with business-critical legacy procedures, manual PowerShell scripts, SQL Server Agent jobs, and a complete lack of automated testing. We rebuilt the legacy ETL with Ellucian Data Connect, automated PostgreSQL testing, and addressed the most critical performance issues between the AWS-hosted warehouse and the SAS platform in Azure.

If you are responsible for university data architecture, analytics modernization, or large-scale cloud migrations, this story will help you understand the challenges involved and see the practical solutions and technical expertise needed to make these projects successful.  

Stepping in for a small batch conversion

The ABCloudz team initially stepped into this project when Ellucian and the customer requested our help migrating a few dozen database stored procedures. At first glance, the task looked routine. The customer’s legacy data warehouse relied extensively on Microsoft SQL Server stored procedures to process and validate incoming institutional data. Our job was simply to convert a limited number of these from T-SQL into PL/pgSQL for deployment on AWS RDS for PostgreSQL.

As we began work, our engineers quickly grasped the true scope of the system.


It comprised multiple interconnected databases: 

  • INIT stored initial data loads. 
  • REPORT handled validation and preliminary reporting. 
  • TRANSFORMED managed in-depth data transformations. 
  • PRODUCTION was the primary production database containing transformed data. 
  • ARCHIVE stored all raw and historical data loads, so any original input could be reviewed and audited if an issue was identified in the future. 
  • Wage supported specialized wage-related data integration and reporting. 
  • DiplomaSender handled external diploma sender integration data. 
  • RPM (Research and Performance Management) drove analytics and reporting. 

The processes moving data through these databases depended on classic ETL (Extract-Transform-Load) techniques. SQL Server Agent Jobs executed stored procedures according to predefined schedules. PowerShell scripts copied data files arriving via manual SFTP (Secure File Transfer Protocol) uploads into the warehouse. Each step involved critical data validation, transformation, and promotion to final storage for reporting. 

Expanding scope

Using the AWS Schema Conversion Tool (SCT), we rapidly converted the first 20 stored procedures to run effectively in PostgreSQL. Yet this initial migration was just the beginning: once our team demonstrated its capability, the scope expanded dramatically. The customer, impressed by the results, asked us to take on nearly 200 additional stored procedures integral to their daily operations. These procedures formed the core of their complex reporting infrastructure, deeply interwoven with business logic accumulated over decades. 

Each stored procedure originally written in T-SQL had to be converted into PL/pgSQL to run efficiently on AWS RDS for PostgreSQL. Although AWS SCT simplified the initial translations, we faced substantial manual adjustments due to fundamental differences between the two database engines. We resolved discrepancies related to data types, syntax, and error handling unique to PostgreSQL. However, transaction control required a separate, more careful redesign, since the legacy system and PostgreSQL follow different execution and error-handling patterns. 
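To give a sense of the manual work involved, the fragment below shows the kinds of idioms we routinely rewrote by hand after SCT’s first pass. It is an illustrative sketch only; the schema and object names are hypothetical, not taken from the customer’s databases.

```sql
-- Illustrative PL/pgSQL fragment showing typical post-SCT adjustments
-- (hypothetical object names; comments note the original T-SQL constructs)
DO $$
DECLARE
    v_loaded_at timestamp := now();           -- was GETDATE() with a DATETIME variable
    v_row_count integer;
BEGIN
    INSERT INTO report.enrollment_stage (student_id, gpa, loaded_at)
    SELECT student_id,
           COALESCE(gpa, 0),                   -- was ISNULL(gpa, 0)
           v_loaded_at
    FROM   init.enrollment_raw
    LIMIT  1000;                               -- was SELECT TOP (1000)

    GET DIAGNOSTICS v_row_count = ROW_COUNT;   -- was @@ROWCOUNT
    RAISE NOTICE 'Staged % rows', v_row_count; -- was PRINT / RAISERROR
EXCEPTION
    WHEN OTHERS THEN                           -- was BEGIN TRY ... BEGIN CATCH
        RAISE;
END;
$$;
```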

Transaction control reimplementation

Most of the original stored procedures followed this transaction control pattern: 

  • The procedure opened a transaction. 
  • It performed data changes (such as INSERT, UPDATE, and DELETE statements). 
  • If there were no issues, the procedure committed the changes and returned control to the invoking SQL Server Agent job. 
  • If an error occurred, the error handling block caught the issue, rolled back all preceding changes, and then disabled the parent SQL Server Agent job for manual investigation and resolution. 
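In T-SQL terms, a typical legacy procedure looked roughly like the sketch below. This is a simplified illustration with hypothetical object and job names, not actual customer code.

```sql
-- Simplified T-SQL sketch of the legacy pattern (hypothetical names)
CREATE OR ALTER PROCEDURE dbo.usp_load_enrollment
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        -- data changes (INSERT / UPDATE / DELETE) ...
        INSERT INTO dbo.enrollment_clean (student_id, term_code)
        SELECT student_id, term_code
        FROM   dbo.enrollment_raw;

        COMMIT TRANSACTION;      -- success: control returns to the Agent job
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;

        -- park the parent SQL Server Agent job until an engineer investigates
        EXEC msdb.dbo.sp_update_job @job_name = N'Load Enrollment', @enabled = 0;

        THROW;                   -- surface the original error
    END CATCH
END;
```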

The diagram below illustrates this workflow in the legacy system:

In PostgreSQL, we reimplemented this transaction control logic as follows: 

  • Step 1: The Data Connect pipeline calls a procedure in transactional mode. 
  • Step 2: The procedure executes converted SQL statements in PostgreSQL. 
  • Step 3: If execution succeeds, control returns to the invoking Data Connect pipeline with a success result (0). 
  • Step 4: If execution fails, the error handling block returns a failure result (1) to the Data Connect pipeline. Upon receiving a failure, the pipeline sets a global context flag to prevent the next pipeline steps from running and blocks future executions until a data engineer investigates and resolves the issue. 

This approach allowed us to preserve robust error handling and data consistency in the new environment while adapting to the architectural standards of cloud-based PostgreSQL and Data Connect. 
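Below is a minimal sketch of what a converted procedure looks like under this pattern, assuming a hypothetical report schema and procedure name. The INOUT parameter carries the 0/1 status back to the pipeline, and the EXCEPTION block’s implicit savepoint undoes the block’s changes when an error is caught.

```sql
-- Simplified PL/pgSQL sketch of the converted pattern (hypothetical names)
CREATE OR REPLACE PROCEDURE report.load_enrollment(INOUT p_status integer)
LANGUAGE plpgsql
AS $$
BEGIN
    -- converted DML statements (INSERT / UPDATE / DELETE) ...
    INSERT INTO report.enrollment_clean (student_id, term_code)
    SELECT student_id, term_code
    FROM   init.enrollment_raw
    WHERE  term_code IS NOT NULL;

    p_status := 0;   -- success: the pipeline continues to the next step
EXCEPTION
    WHEN OTHERS THEN
        -- the block's implicit savepoint rolls back the changes made above
        RAISE WARNING 'load_enrollment failed: %', SQLERRM;
        p_status := 1;   -- failure: the pipeline halts further steps
END;
$$;
```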

Automating critical testing

As we undertook this extensive migration, our team quickly identified a significant risk. The legacy SQL Server-based system completely lacked automated testing. Each procedural change required manual validation by loading data files, executing jobs, and visually inspecting database tables. This manual approach posed risks of human error, introduced significant overhead, and threatened project timelines. 

To address this, we proactively implemented automated testing scripts in PostgreSQL. These scripts validated every step of the data loading and transformation processes, clearly logging outcomes, row counts, and processing times. Each automated test script was executed within a transaction. This allowed us to run thorough tests repeatedly without impacting the underlying database. If a test identified issues, we easily retained the problematic state in the database by saving results explicitly. Otherwise, we could roll back transactions and retest immediately. 
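A single test, reduced to its essentials, looked roughly like the sketch below; the procedure, table, and expected count are hypothetical placeholders.

```sql
-- Minimal sketch of one transaction-wrapped test (hypothetical names)
BEGIN;

CALL report.load_enrollment(NULL);    -- run the step under test

DO $$
DECLARE
    v_expected integer := 1200;       -- expected row count for the test fixture
    v_actual   integer;
BEGIN
    SELECT count(*) INTO v_actual FROM report.enrollment_clean;

    IF v_actual <> v_expected THEN
        RAISE WARNING 'Row count mismatch: expected %, got %', v_expected, v_actual;
    ELSE
        RAISE NOTICE 'Test passed: % rows loaded', v_actual;
    END IF;
END;
$$;

ROLLBACK;   -- or COMMIT to retain the problematic state for investigation
```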

This solution dramatically improved testing efficiency, reliability, and accuracy. Our proactive technical approach significantly reduced manual overhead and provided clear audit trails of each test outcome.  

Modernizing ETL workflows with Ellucian Data Connect

Once we successfully migrated the stored procedures, the next logical step was modernizing the legacy Extract-Transform-Load (ETL) processes. Previously, SQL Server Agent Jobs and PowerShell scripts handled ETL activities manually, triggering procedures, loading files from SFTP servers, and coordinating data flows between databases. 

Ellucian Data Connect, the tool designated for managing data pipelines in the new SaaS-based environment, initially lacked essential capabilities. Most critically, Data Connect could not directly execute database stored procedures. This posed a serious challenge. To fully modernize ETL processes, Data Connect needed to trigger our newly converted PL/pgSQL procedures automatically. 

Recognizing this gap, we worked closely with Ellucian’s product development team to advocate for necessary enhancements. Our collaboration led to the development of a new Data Connect feature allowing the direct execution of PostgreSQL stored procedures. 
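With that capability in place, a pipeline step can invoke a converted procedure through an ordinary CALL statement against the warehouse, conceptually like this (using the same hypothetical procedure name as above):

```sql
-- SQL a pipeline step can now execute directly against PostgreSQL (illustrative)
CALL report.load_enrollment(NULL);   -- INOUT status comes back as 0 (success) or 1 (failure)
```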

With this key enhancement, we re-engineered the entire ETL workflow using Data Connect pipelines.  


How the new data architecture works: 

  1. Data arrival: Educational institutions (over 60 colleges) and external sources now upload data files directly to Amazon S3, which serves as a secure, cloud-based landing zone. S3 provides durability, encryption, and lifecycle management, removing the burden of maintaining on-premises file servers. Event-driven triggers in S3 enable automated downstream processing. 
  2. Automatic decompression: Large data files delivered as compressed archives are automatically decompressed by AWS Lambda before any ingestion steps. This ensures the Data Connect pipelines work with ready-to-process data. 
  3. API integrations: Integrations relying on external APIs continue to use their established logic, but now leverage functionality native to Data Connect for unified orchestration and monitoring. 
  4. SFTP server integration: Some external sources (such as DiplomaSender) still use SFTP for data transfer. These files are now first staged in S3, enabling consistent troubleshooting and a unified data flow. 
  5. Orchestration and processing: Data ingestion, validation, and transformation are now fully orchestrated by Ellucian Data Connect pipelines. Instead of legacy SQL Server Agent Jobs, these cloud-native pipelines schedule and execute each step, loading validated data into the new Ellucian Insights Data Warehouse built on PostgreSQL and enhanced with custom schemas. 
  6. Analytics and reporting: Once data is processed, it becomes instantly available to analytics and reporting platforms, including Tableau, SAS, and Power BI, via standard database drivers and APIs. 
  7. Automated notifications: Data Connect also manages email notifications, keeping both institutional users and data warehouse teams informed about data loads, issues, and job status. 

This modernized ETL design delivered significant operational advantages. Manual file transfers and scripts were eliminated. The new pipelines automated data ingestion, transformation, validation, and error management. 

Performance tuning for cross-cloud data integration

As we moved forward, a significant performance issue emerged. Integrating the new PostgreSQL-based Data Warehouse hosted in AWS with the SAS analytics platform in Azure proved more challenging than expected. The migration itself was successful, yet the interactions between these two cloud environments began showing severe performance degradation. 

Identifying root causes of performance slowdown

We quickly investigated the underlying causes and found two main sources:  

  • Network architecture: Data moving between AWS and Azure passed through multiple VPN connections and a state-managed firewall. Because the firewall was centrally controlled by state-level IT services, neither our team nor the customer had direct control over its configuration or optimization. Each connection point added latency, causing dramatic slowdowns: queries that previously executed in minutes began taking hours. 
  • SAS query compatibility: Many legacy SAS queries originally optimized for SQL Server did not translate efficiently to PostgreSQL. Differences in query execution between SQL Server and PostgreSQL contributed further to the performance issue.  

Steps taken to mitigate performance issues

We addressed these issues methodically. Our team performed targeted query optimizations, adjusted indexing strategies, and conducted thorough troubleshooting sessions. These steps reduced execution times and mitigated some latency concerns. However, network bottlenecks still significantly limited overall efficiency. 
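As an illustration of what those tuning passes looked like in practice, a typical step was to inspect the execution plan of a slow SAS-issued query and add an index matching its filter and grouping columns. The table and column names below are hypothetical.

```sql
-- Inspect how PostgreSQL actually executes a slow query coming from SAS
EXPLAIN (ANALYZE, BUFFERS)
SELECT term_code, count(*)
FROM   production.enrollment
WHERE  college_code = '012'
GROUP  BY term_code;

-- Add an index matching the filter/grouping pattern, then refresh statistics
CREATE INDEX IF NOT EXISTS idx_enrollment_college_term
    ON production.enrollment (college_code, term_code);

ANALYZE production.enrollment;
```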

To resolve the networking issue, we recommended creating a direct, dedicated connection between AWS and Azure using AWS Direct Connect. This strategy would reduce the latency caused by multiple VPN hops and firewall checkpoints. Additionally, we advised moving certain data transformation workloads currently executed within SAS back into the PostgreSQL Data Warehouse. Performing transformations closer to data sources within AWS would significantly reduce unnecessary cross-cloud data transfers. 

Recommendations for long-term stability

Long-term, we proposed further modernization. This included optimizing existing SAS queries specifically for PostgreSQL, refining indexing strategies within PostgreSQL, and clearly defining data transfer policies. By strategically rethinking how analytics processes interacted with the data warehouse, we set the stage to avoid similar bottlenecks in the future.  

Cutover planning and reliable rollback strategy

As migration efforts approached the final stages, careful planning for the production cutover became essential. Our team took proactive leadership to ensure a smooth and controlled transition to the new AWS-based Data Warehouse environment. 

Sequencing the migration for minimal risk

We first identified and sequenced all critical migration steps. Our detailed cutover plan defined exactly how data, procedures, and infrastructure components would move to the new platform. Each step included specific tasks, dependencies, and responsible team members. Clear sequencing allowed us to execute the migration methodically, minimizing disruptions to the customer’s reporting and analytics capabilities. 

Next, we developed comprehensive validation scenarios. Each migrated procedure and data flow underwent rigorous testing before final deployment. We conducted multiple rehearsals of the cutover process itself. These rehearsals allowed us to identify and mitigate potential risks early, rather than facing surprises in production. Validation results were documented transparently, ensuring the customer maintained full visibility and confidence at each stage. 

Preparing for the unexpected

Recognizing that unexpected challenges could still occur, we developed a clear rollback strategy. This strategy allowed rapid restoration of the previous state if the migration encountered severe issues. Our rollback plan outlined precise steps to revert data, re-establish legacy processes, and quickly resume normal operations. Clearly defined criteria guided when and how to invoke the rollback strategy, reducing uncertainty during critical moments. 

Delivering a predictable transition

By structuring the cutover and rollback plans comprehensively, we significantly minimized risks associated with migrating mission-critical analytics infrastructure. The customer gained reassurance through our meticulous preparation, extensive validation testing, and proactive communication throughout the transition process.   

Modernize your analytics with ABCloudz

This project proved that a well-planned modernization can solve longstanding problems, reduce risk, and give your team confidence in daily operations. We guided our customer through every technical and organizational hurdle, bringing efficiency and transparency to each stage of the migration.

If you are evaluating a similar project or need a partner to support complex changes in your analytics environment, reach out to ABCloudz. We are ready to help you map out a migration plan, streamline your processes, and ensure your data infrastructure is prepared for future demands. Let’s discuss how our expertise can support your next steps. 

Contact us