European bank: 200% boost with a unified data warehouse utilizing Data Vault 2.0


SENLA helps speed up big data processing and cut ownership costs for a major financial institution.

Location:

Europe

Employees:

70,000+

Industry:

Banking & Finance

Customers:

15 million

The Challenge

A leading bank's data was dispersed across many separate databases after merging with two other banks. This significantly slowed down the launch of new services and created problems with reporting and data quality.

The Solution

SENLA facilitated the development of a unified data warehouse using the modern Data Vault 2.0 methodology on the base of an open-source data platform.

The Value

Our Client achieved a 200% boost in data availability and report generation speed, cut ownership costs and new products’ time to market by 50%, and increased data quality and processing speed several times over.

The Client and the challenges of legacy software

With volumes of data skyrocketing — Statista predicts 120 zettabytes of data generated in 2023! — businesses increasingly grapple with the complexity of data management, especially in critical sectors like banking and finance.

Our Client, a leading European bank operating across Europe, Asia, and Africa with over 20 subsidiaries, faced a complex situation after merging with two other banks.

They inherited three separate information structures and triple expenses on infrastructure, maintenance, and development. This also unleashed a cascade of other issues:

  • Dispersed data across numerous isolated databases slowed down the launch of new products. The technicians simply couldn’t integrate new data sets when the business needed to expand.
  • The bank’s analysts faced significant problems generating monthly reports, leading to delays in regulatory submissions.
  • Data duplication and inconsistencies undermined its quality.
  • The technological stack used for these repositories was rapidly becoming outdated.

As a result, our Client suffered from hindered growth and decreased competitiveness.

The search for the golden source

To address these issues, our Client sought to implement a unified "golden source" of data, enhance data quality, reduce processing time, and modernize and unify the technology stack. They also wanted to facilitate the generation of comprehensive reports and improve time to market for new products and services in the future.

It was clear that our Client needed a unified data warehouse (DWH). Their in-house team couldn’t handle a task of this scale, so they turned to SENLA to outsource the software development and implementation.


“This is the way”

Of course, dumping all the data into one database wouldn’t work. Our experts knew that a specific methodology was required to build an effective data warehouse, especially on an enterprise level. The things to consider included scalability, development speed, integration of new data formats, technology independence, business orientation, and security.

The answer to all this was Data Vault 2.0 (DV2), a robust methodology for designing data warehouses.

DV2 was chosen for the remarkable scalability and flexibility that follow from its architectural principles. In this methodology, the model is built around three entity types: hubs, which store the unique business keys of core business concepts; links, which record the relationships and associations between hubs; and satellites, which attach time-stamped descriptive attributes to hubs and links.

Thanks to this, DV2 effortlessly adapts to any changes in the business process, significantly simplifying the job for technicians compared to other methods. Using this methodology, a company can, for example, seamlessly and quickly integrate or modify services.
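As an illustration only (the table and column names below are hypothetical, not taken from the Client’s actual model), a minimal hub/link/satellite layout for a bank can be sketched with SQLite:

```python
import sqlite3

# Hypothetical Data Vault 2.0-style layout: hubs hold unique business
# keys, links connect hubs, and satellites hold time-stamped
# descriptive attributes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,  -- hash key derived from the business key
    customer_id   TEXT UNIQUE,       -- the business key itself
    load_date     TEXT,
    record_source TEXT
);
CREATE TABLE hub_account (
    account_hk    TEXT PRIMARY KEY,
    account_no    TEXT UNIQUE,
    load_date     TEXT,
    record_source TEXT
);
CREATE TABLE link_customer_account (
    link_hk       TEXT PRIMARY KEY,
    customer_hk   TEXT REFERENCES hub_customer(customer_hk),
    account_hk    TEXT REFERENCES hub_account(account_hk),
    load_date     TEXT,
    record_source TEXT
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT REFERENCES hub_customer(customer_hk),
    load_date     TEXT,              -- the point in time these attributes were observed
    name          TEXT,
    address       TEXT,
    record_source TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
""")
```

Note how a new source system or attribute set only adds new satellites or links; the existing hubs stay untouched, which is why a DV2 model absorbs business change so easily.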

How to set up a pipeline

SENLA had already collaborated with a colleague of the Client’s on a similar solution, so the vetting stage was short. After a series of project interviews, SENLA’s data engineers joined the development of the unified data warehouse in 2020. The Agile approach allowed us to release new features every sprint.

“Our team was enthusiastic to work on this project. The Data Vault 2.0 methodology is a relatively new approach and is highly relevant for the large big data projects we commonly work on at SENLA. Besides, we were able to help the Client beyond our initial scope. We found a bug in their software related to decimals being rounded incorrectly. Fixing it helped save a lot of trouble for our Client.”

Vadzim Herasimovich, SENLA’s Software Engineering Team Leader
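The source doesn’t describe the exact bug, but incorrect rounding of decimals is a classic class of defect in financial code. A minimal Python illustration of why binary floats are unsafe for money, and how the `decimal` module avoids the problem:

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly,
# so naive arithmetic and rounding on monetary values can drift.
print(0.1 + 0.2)        # 0.30000000000000004, not 0.3
print(round(2.675, 2))  # 2.67, because 2.675 is stored as 2.67499999...

# decimal.Decimal keeps exact decimal arithmetic with an explicit
# rounding rule, which is what monetary code needs.
amount = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(amount)           # 2.68
```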

Our main task was to develop and organize data transfer from multiple sources to the DWH. We also set up all the necessary transformations and scheduling so that the data would be loaded correctly.

This process represents the ETL pipeline, which stands for Extract > Transform > Load: the three steps the data goes through on its way to the unified repository.

Why ETL?

ETL is preferred in data warehouse use cases (as opposed to ELT and data lakes) because it requires the data to be transformed and cleansed before it is loaded into the repository.

So, by the time the data reaches the DWH, it is already normalized and consistent, ready to be used for reports and advanced analytics. This is what ensures that the repository data adheres to the “golden source” standard.
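As a toy sketch of the Extract > Transform > Load sequence (the sample records, cleaning rules, and in-memory "warehouse" are invented for illustration; the real pipeline is described below):

```python
def extract():
    # In reality: pull records from the source banking systems.
    # Here: an in-memory sample with a duplicate and messy formatting.
    return [
        {"customer_id": " 42 ", "balance": "1000,50"},
        {"customer_id": "42", "balance": "1000,50"},  # duplicate row
    ]

def transform(rows):
    # Normalize formats and deduplicate BEFORE loading, so the
    # warehouse only ever sees clean, consistent records.
    seen, clean = set(), []
    for row in rows:
        cid = row["customer_id"].strip()
        balance = float(row["balance"].replace(",", "."))
        if cid not in seen:
            seen.add(cid)
            clean.append({"customer_id": cid, "balance": balance})
    return clean

def load(rows, warehouse):
    # In reality: insert into the DWH; here, append to a list.
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'customer_id': '42', 'balance': 1000.5}]
```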

The pipeline was based on Apache Airflow, with microservices-based process management, and Apache Kafka as a data bus. The microservices' primary goal was to transmit data from numerous sources to the various levels of the warehouse. At this stage, we managed to successfully reduce the load on the sources to ensure stability under any circumstances.
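In Airflow, such a pipeline is declared as a DAG of dependent tasks. The sketch below is purely illustrative: the DAG name, schedule, and task bodies are assumptions, and the real microservices and Kafka consumption are reduced to comments.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():    # e.g. consume a batch of messages from a Kafka topic
    ...

def transform():  # cleanse and conform the batch to the warehouse model
    ...

def load():       # write the result to the target warehouse level
    ...

with DAG(
    dag_id="dwh_etl",                # hypothetical name
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",      # the real cadence is not stated in the source
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task  # enforce E > T > L ordering
```

Airflow then takes care of scheduling, retries, and backfills, which is exactly the "transformations and scheduling" work described above.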

“All your (data)base are belong to us”

The development team chose the open-source MPP Greenplum as the data platform.

MPP (massively parallel processing) distributes the workload across many servers, enabling multiple simultaneous operations. This dramatically sped up the processing of the bank’s huge volumes of data (up to petabytes).

The database was procured through a vendor because our Client wanted direct influence over and control of the product's evolution. They wanted to align development plans so they would receive exactly the support and features they needed. Moreover, features built for our Client would later become accessible to the vendor's other clients in the financial sector, and vice versa. This gave our Client confidence that their platform wouldn’t become "stagnant" but would instead be continuously updated in line with the latest market trends.

“At the start, we were not sure if we could launch such a solution from scratch using only open-source tools. But our partners provided us with all the necessary expertise. Yes, we had many discussions, including in the evenings, when we devised complex schemes. But there were never any unsolvable issues. Only complex engineering tasks.”

Client’s Representative

To align with legal restrictions and security considerations, the system was built on-premises, similar to the big data solution we did for another Client.

Prior to deploying the final solution to production, our experts conducted multiple testing sessions on various development environments. The tests focused on data loading, query speed, integration with BI tools, and stress testing to identify possible breakpoints. Notably, no breakpoints were found, and the solution demonstrated outstanding performance levels in both synthetic and real-life test scenarios.

The value: +200% data availability and +800% processing speed

We worked on this project for 1.5 years and helped our Client achieve impressive results.

"In addition to halving ownership costs thanks to using an open-source database, we saw a significant increase in processing speed. This gave us a higher level of data availability; by our estimates, it has doubled."

Client’s Representative

Some other outcomes of our collaboration included:

  • Data processing became up to 800% faster for some operations.
  • Data quality increased multiple times.
  • Report generation speed increased by 200%.
  • Time to market for new products was cut by at least 50%.
  • Storage space was saved: the data occupies 30 TB in the new repository, down from the initial 50 TB.
  • The bank's offers and services became more relevant, enhancing the overall efficiency of business units.

Big data doesn't have to be overwhelming. SENLA can help you streamline operations and deliver analytical insights quickly and efficiently. Contact us, and step into the data future today.

Why Senla?

Big data gurus

We understand the great importance of big data solutions and make them one of our priority services. Our data engineers take relevant courses and constantly adapt to the latest trends in technology.

Successful big data projects

Beyond a best-in-class theoretical foundation, our experts regularly apply it in the field, having delivered numerous profitable projects.

Lightning-fast onboarding

By choosing us, you spend as little time as possible on preparatory activities, minimizing the profit your business loses at this stage.

Frequently Asked Questions

What kind of services do you provide? Is it outstaffing, outsourcing, or custom software development?

All of them! Understanding the market’s needs, we offer all three engagement models to accommodate any need you might have, from simply reinforcing your team to developing a full solution for you from scratch.

I have legacy software that needs an update but I don’t want to shut down my business. Can you help?

Yes, we work with all kinds of systems, including legacy ones. If you need to modernize your software, we will introduce the changes gradually to make sure your business runs uninterrupted.

Can I count on the same level of maintenance & support after a project's launch as with an in-house team?

Absolutely! Many of our Clients choose extended maintenance that can continue for years after the release. For Abbott we even provided 24/7 online support during live events. So rest assured, we will do everything to maintain your satisfaction.
