Technology failures are becoming increasingly frequent and increasingly detrimental to banks. In our modern digital world, the impact of these failures is far-reaching for both finances and reputation. The question becomes, how do you build a genuinely resilient bank?
Network outages and tech failures have hit the headlines in 2024, with numerous regional banks suffering some form of digital disruption. These failures come at a time of growing consumer pressure for more advanced digital services and rising expectations of convenient, always-on access. In parallel, regulators’ own expectations are deepening across the region, as are regulatory interventions against banks perceived to be underperforming.
In Indonesia, cyber-resilience requirements are a focus of the technology risk-management guidelines championed by agencies including the Financial Services Authority (OJK). The risks of failing to comply are clear, with institutions and individuals potentially exposed to punitive action by regulators.
We know that banks are continuing to accelerate investment in tech, with 8.2 percent of revenue spent on technology in 2023 compared to 7.2 percent in 2020 (BCG analysis). But is it going to the right places to build resilience?
Despite expanding investment, banks are struggling to avoid failure. Investment is heavily focused on digital engagement, accounting for almost half of total annual spending. This can leave vulnerabilities in areas such as the data environment, core systems and infrastructure, incurring large technology debts and delivering fragmented response efforts to less critical but still severe incidents.
We see growing complexity across numerous IT touchpoints in our work with banks across Southeast Asia.
Modular architecture is leading to a drastic increase in the number of services to monitor. Cloud adoption brings the challenge of monitoring third-party assets, a challenge compounded by the growth of ecosystems and partnerships. The rise of data-driven organizations creates a flood of data to handle. Agile ways of working, shifting spending, changing service agreements and other factors add further complexity that must be managed.
The motivation for these changes is clear, as banks adapt to an evolving ecosystem, but resilience must be part of that.
Resilience in a broader context
To build a truly robust organization, resilience must be viewed in a broader context than just the uptime of information technology (IT) solutions. That means recognizing that failures will happen and investing in the ability to mitigate and respond to their impact.
Banks must recognize that the depth of tech failures goes far below the surface of complaints and downtime, to a hidden layer of infrastructure, coding and configuration that accounts for as much as 90 percent of failure costs.
We know that the barriers to transformation can be high. Limited funding, unrealistic expectations around availability and a focus on platform availability rather than end-to-end availability are hurdles to overcome. So too are a lack of joint ownership between business and technology and the often low automation of run processes, which introduces potential for human error. These problems can be compounded by weak governance across change and run processes.
Six fundamental elements for improving resilience
Building resilience requires six fundamental elements that together combine to create a flexible, modern tech ecosystem for banks.
First, strategy and ambition must be aligned. Banks should set a realistic ambition for service availability. Regulators expect 99.95 percent service availability, with most banks in Southeast Asia hovering between 97 percent and 99 percent. Raising resilience from 99.95 percent to 99.99 percent comes with a significant cost. Remember, too, that resilience is more than just availability; focus on reliability too.
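To make those availability figures concrete, the short Python sketch below converts each percentage into the downtime it permits over a year. It is only an illustration: the function name is invented here, and the calculation assumes a 365-day year of round-the-clock service.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# The availability levels mentioned in the text
for target in (97.0, 99.0, 99.95, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows {minutes:.0f} minutes "
          f"({minutes / 60:.1f} hours) of downtime per year")
```

On this arithmetic, 99.95 percent allows roughly 4.4 hours of downtime a year, while 99.99 percent allows about 53 minutes, which helps explain why the last step is so costly.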
Top-down alignment from management and the board is a key priority for resilience. Sufficient funding to deliver on these ambitions is also essential. Business and technology will need to work together to deliver on this.
Second, know your services and prioritize critical services for resilience. Set resilience ambitions according to the priority of each service, considering customer, bank and regulator impact. Ensure service-level agreements (SLAs) are broken down into service-level objectives (SLOs) and service-level indicators (SLIs), with targets set across availability, success rate, latency and throughput. Run regular scenario testing to identify potential points of failure.
Third, organizational readiness is vital, with predefined plans to deliver services when a tech outage occurs. Banks should develop a corporate playbook for incident management and recovery that goes beyond just technology. A multidisciplinary team should be established with dedicated capacity for, and accountability for, improving service resilience. Strong business-tech engagement and governance around backlog prioritization is key for this, with error margins that inform investment priorities.
Fourth, build with automation that incorporates continuous integration and continuous delivery, and be careful to avoid configuration drift. Embed strong controls for changes in critical applications, with higher testing requirements for critical services and minimal changes undertaken during critical hours.
Fifth, run with observability and monitoring of all components, both technology and non-technology. Redefine the alert model so that it surfaces true positives that need human action, improving emergency response when it is needed.
Finally, build on a solid tech ecosystem. This is the foundation on which banks’ wider success is delivered. It should be a modular, scalable architecture with the ability to identify and isolate points of failure. Incorporate horizontally scalable applications that degrade gracefully, and back them with a dedicated test environment for simulation and stress testing.
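As a rough illustration of how an SLA can be broken down, the sketch below computes two SLIs (success rate and 95th-percentile latency) from a batch of requests and checks them against SLO targets. The `Request` type, the function names and the target values are all hypothetical, chosen only for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    ok: bool            # did the request succeed?
    latency_ms: float   # how long it took to respond


def success_rate(requests):
    """SLI: share of requests that succeeded."""
    return sum(r.ok for r in requests) / len(requests)


def latency_percentile(requests, pct):
    """SLI: nearest-rank percentile of response latency."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical SLO targets for a critical service
SLO_MIN_SUCCESS_RATE = 0.999
SLO_MAX_P95_LATENCY_MS = 800.0


def check_slos(requests):
    """Return, for each SLO, whether the measured SLI meets it."""
    return {
        "success_rate": success_rate(requests) >= SLO_MIN_SUCCESS_RATE,
        "p95_latency_ms": latency_percentile(requests, 95) <= SLO_MAX_P95_LATENCY_MS,
    }


# Illustrative sample: 98 fast successes and 2 slow failures
sample = [Request(True, 120.0)] * 98 + [Request(False, 2500.0)] * 2
print(check_slos(sample))
```

In this sample the latency objective is met but the success-rate objective is not, the kind of distinction a single uptime figure would hide.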
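One way to read the "error margins" above is as something like the error budgets used in site reliability engineering; that reading is an assumption here, not a claim about any particular bank's method. Under that assumption, a minimal sketch of the calculation might look like this, with a hypothetical function name and illustrative figures:

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the period's error budget still unspent.

    A negative result means the budget has been overspent.
    """
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        raise ValueError("no failures are allowed, so there is no budget to track")
    return 1 - failed_requests / allowed_failures


# Illustrative figures: a 99.95 percent target over 2 million requests
# allows about 1,000 failures; 600 have occurred so far.
remaining = error_budget_remaining(0.9995, 2_000_000, 600)
print(f"{remaining:.0%} of the error budget remains")
```

A team that has spent most of its budget has a clear, shared signal to prioritize reliability work over new features in the backlog.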
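Configuration drift is the gap between the settings a system is supposed to have and the settings it actually has. A minimal sketch of a drift check, assuming both are available as flat dictionaries (the setting names below are invented for illustration), could be:

```python
def find_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every mismatch.

    A value of None means the setting is absent on that side, so this
    simple version cannot tell "absent" from "explicitly set to None".
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical settings: what is held in version control vs. what is running
desired = {"tls_min_version": "1.2", "max_connections": 500, "debug": False}
actual = {"tls_min_version": "1.2", "max_connections": 800, "debug": False,
          "legacy_port": 8081}

for setting, (want, have) in find_drift(desired, actual).items():
    print(f"{setting}: expected {want!r}, found {have!r}")
```

Running a check like this on a schedule, and treating any mismatch as a change that needs review, is one way to keep manual fixes from quietly accumulating.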
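One simple way to cut false positives is to page a human only when a problem is sustained rather than momentary. The sketch below shows that single technique; the function name, the threshold and the window count are assumptions for illustration, not recommended values.

```python
def should_page(error_rates, threshold=0.05, consecutive=3):
    """Page only if the latest `consecutive` samples all exceed `threshold`.

    `error_rates` is a list of per-interval error rates, oldest first.
    """
    if len(error_rates) < consecutive:
        return False
    return all(rate > threshold for rate in error_rates[-consecutive:])


# A single spike does not page anyone...
print(should_page([0.01, 0.20, 0.01]))
# ...but a sustained breach does.
print(should_page([0.01, 0.06, 0.08, 0.12]))
```

The trade-off is a slower first alert, so the window should be shorter for the most critical services.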
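Graceful degradation is often implemented with a circuit breaker: after repeated failures of a dependency, the application stops calling it for a while and serves a reduced answer instead. The sketch below is a simplified version of that pattern; the class, its parameters and the exchange-rate example are hypothetical.

```python
import time


class CircuitBreaker:
    """Serve a fallback while a failing dependency is given time to recover."""

    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the breaker opened, or None if closed

    def call(self, primary, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()  # still cooling down: skip the dependency
            # Cool-down over: allow a trial call; one more failure reopens
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result


def live_fx_rate():
    # Stands in for a third-party call that is currently failing
    raise ConnectionError("rate provider unavailable")


def cached_fx_rate():
    # Degraded answer: an older, clearly flagged value (illustrative figure)
    return {"rate": 15800.0, "stale": True}


breaker = CircuitBreaker(max_failures=2)
for _ in range(3):
    print(breaker.call(live_fx_rate, cached_fx_rate))
```

The customer still gets an answer on every call, and after the second failure the struggling provider is no longer hit with requests until the cool-down has passed.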
Leading financial organizations offer key lessons to inform this ambition.
A global financial services provider ensured availability by prioritizing mission-critical services and leveraging end-to-end service monitoring that included internal and external stakeholders. It also embedded the ability to provide a degraded service during the failure of third parties, leveraged automation with a particular focus on points of failure in the ecosystem and provided generous infrastructure reserves greater than 30 percent. These are the kinds of steps required to deliver genuine resilience.
Technology is central to a modern bank, and as a result, so too is technology resilience. Banks can’t afford to get fixated on shiny front-end tech improvements at the cost of the vital nuts and bolts which ensure a reliable and resilient service. Banks must commit to embed resilience, build reliability and deliver on the expectations of customers and regulators.
***
The writer is principal at Boston Consulting Group.