ARTICLE
2 September 2024

Part 22: Guidelines For Using Artificial Intelligence In Financial Institutions (FINMA)

Many banks, insurance companies and other Swiss financial institutions are currently working on projects involving the use of artificial intelligence...
Switzerland Technology

Many banks, insurance companies and other Swiss financial institutions are currently working on projects involving the use of artificial intelligence – some of which have been running for years, while others are new. While the Federal Council is still considering where and how AI should be specifically regulated, FINMA has already defined its expectations in the form of four guiding principles for AI. But what do these guidelines mean? We provide an answer in this, part 22 of our AI blog series.

Despite the current hype surrounding AI, the use of artificial intelligence in the Swiss financial industry is nothing new. Applications trained with machine learning have been used for many years, for example in recognizing money laundering or fraud, in investment decisions or for forecasts in the insurance industry. What is new are applications based on generative AI. They are used in financial institutions via tools such as "ChatGPT", but also built into applications for specific applications. Examples that we see used by our clients include the recording and evaluation of meetings, the extraction of content from documents, the analysis of call center conversations, the summarization of news or the automated analysis and design of contracts.

"Supervisory expectations" and controls

Of course, these activities have not gone unnoticed by FINMA. It already commented on the challenges of AI in its "Risk Monitor 2023" in November 2023 and formulated its supervisory expectations for Swiss financial institutions. It writes that AI is also expected to bring various changes to the financial market. In its "Annual Report 2023", it later stated that the autonomy and complexity of AI systems would harbor various risks. It cited the danger that results generated by AI cannot be understood or explained by humans, that errors or unequal treatment can creep in unnoticed or that responsibilities are unclear. It has been conducting on-site inspections and supervisory discussions on this topic with financial institutions since the end of 2023 and intends to continue doing so.

Financial institutions that use AI must therefore expect to be questioned and inspected by FINMA and to have to explain how they identify, limit and monitor AI-specific risks. They will have to show how they are fulfilling FINMA's "supervisory expectations". These consist of the following four guiding principles:

  1. Clear roles and responsibilities as well as risk management processes must be defined and implemented. Responsibility for decisions cannot be delegated to AI or third parties. All parties involved must have sufficient expertise in the field of AI.
  2. When developing, adapting and using AI, it must be ensured that the results are sufficiently accurate, robust and reliable. Both the data and the models and results must be critically scrutinized.
  3. The explainability of the results of an application and the transparency of its use must be ensured depending on the recipient, relevance and process integration.
  4. Unequal treatment that cannot be justified must be avoided.

While some of these expectations concern long-standing issues, other elements of these guiding principles raise questions as to how they are meant and can be implemented. We will deal with these below. Our comments are based on our understanding of the subject matter and our experience from specific cases and discussions. FINMA has not yet publicly elaborated on its guidelines in the Risk Monitor 2023. However, it regularly holds discussions with individual financial institutions in which the topic is discussed in more detail. It should be noted that although the term "supervisory expectations" chosen by FINMA itself may suggest a formal character, the guidelines do not originate from a supervisory guidance (Aufsichtsmitteilung) or a circular (Rundschreiben) and are therefore less binding. Institutions can thus also deal with the issue differently. In our opinion, however, it is important that financial institutions deal with the topic in such a way that they have a sensible plan for managing the risks associated with AI. Only then will they be able to answer any critical questions from FINMA without having to adopt FINMA's guidelines.

Guiding principle no. 1: Governance and accountability

This guiding principle consists of three different elements that are not necessarily closely related. First of all, it sets out what is always required for proper governance of compliance and risks in a company: the tasks, powers and responsibilities (which in German is abbreviated as the so-called AKV) must be defined, i.e. it must also be clear in relation to the use of AI which bodies (preferably individuals, not committees) are responsible for which aspects, how they are to do this and who is responsible for the respective success or achievement of objectives (e.g. compliance). As AI is ultimately just one of several possible forms of IT (AI legally refers to a system in which not only people have programmed how an output is created from input, but where this is sometimes done solely on the basis of training), the AKV for AI can often be, or are already, mapped within the framework of existing directives, processes and other governance measuresHowever, because AI has a strong interdisciplinary element and requires new perspectives, we see many companies that regulate the AKV and other requirements for the use of AI separately, e.g. in a directive for AI. Such a separate AI directive also offers communicative advantages. We also refer you to our blog post no. 5.

In this case, FINMA's statement that "[t]he responsibility for decisions ... cannot be delegated to AI or third parties" seems more important to us. At first glance, this sounds like a matter of course: a financial institution remains responsible for its management even if it outsources functions to third parties. This has always been the case in outsourcing and must apply even more so when a machine is used that is operated by the financial institution itself. The fact that FINMA nevertheless mentions this has to do with the fear that the persons responsible for decisions in the institutions could in fact still absolve themselves of responsibility if decisions are made by the AI. It sees this danger – in addition to the risk of poor quality (see guiding principle no. 2) and incomprehensible behavior (see guiding principle no. 3) – above all where AI errors go unnoticed, where processes become so complex that it is no longer clear who is responsible for what, or where there is simply a lack of expertise because the systems have become too complex. The peculiarity of AI systems, the element of "autonomy" (not every decision will be pre-programmed, but "only" trained), can also contribute to these risks.

FINMA is concerned both with fully automated decisions for which no one feels responsible and those made by humans on the basis of AI results, but whose errors they do not recognize, whether due to negligence or inability. This also has an indirect impact on FINMA's supervision, for example because it can no longer effectively monitor how a financial institution came to certain decisions because there is no one to answer all of its questions. It does not want the "blame" for errors to be shifted to AI, nor does it want the institution to no longer have the necessary expertise to either make important decisions without AI or to be able to understand and, if necessary, override them. In relevant areas, a financial institution should therefore not rely on a tool or technology that it does not understand itself. Although FINMA explicitly mentions generative AI (i.e. "ChatGPT") as an example in the risk monitor, the problem also affects deterministic or predictive AI (as has been used in AML checks for some time, for example), if not in fact even more so, because it is used in more important applications where errors can have a correspondingly higher impact (such as when using AI for risk management or combating money laundering).

However, it is also interesting to note what FINMA does not mention in its Guiding Principle No. 1 and therefore apparently does not expect e contrario: That decisions are only made by humans. In other words, it is possible to have decisions made by AI. This means: AI may be used for or to support decisions as long as someone in the bank who is authorized to make such decisions actually controls the use of AI and also "takes the rap" for it – end-to-end, not just for individual components. So, if AI is used to make investment decisions, someone from the department where such decisions are normally made must remain responsible for them. This person should know that they cannot hide behind IT or the AI supplier in the event of problems because the AI had a defect.

FINMA's position is in principle correct: there will be more and more areas in which it is appropriate to let computers make decisions, also based on pattern recognition, because human decisions can have relevant disadvantages. Transactions can, under certain circumstances, be checked much better by a machine for signs of money laundering, sanctions offences or fraud and stopped if necessary. Hence, we may be much better able to fulfil this objective with automated decisions, at least as a first step. It does not matter that mistakes are made, because people also make mistakes. The decisive factor is that an appropriate level of quality is delivered overall. We can compare this with the authorization of medicinal products: They also have side effects, but if the overall benefit is sufficient and the ratio is right, we allow them. In practice, the main problem is that we still lack experience in certain aspects of AI, such as the use of large language models, or we are sometimes fooled by the perceived quality of the output. Finally, this topic also includes the third element of guiding principle no. 1, namely that all those involved must have sufficient expertise in the field of AI. We see some deficits here in practice. Although many companies have now introduced their employees to generative AI and tools such as "ChatGPT", most of these activities focus on the legally compliant use of such tools in the personal sphere. A "prompting" workshop may well provide a good insight into the possibilities, limitations and certain risks of generative AI. The "scaremongering" presentations on the topic of AI that are always popular in the early days of a hype can also sensitize people (although we have always refrained from doing this). However, FINMA's expectations in terms of expertise rightly go far beyond this. On the one hand, it is about know-how about how AI actually works, i.e. for which applications which AI methods can be used and how, which technical and organisational measures exist to address AI-specific risks, etc.

On the other hand, it is important to understand what AI-specific risks the use of AI harbors for the institution as a whole, namely financial, operational, reputational and legal risks. It is not enough for employees to know where "ChatGPT" could be a problem. Nor is it enough for an institution's AI experts to know what they are doing. The institution's top management must also understand what and what risks they are exposing the institution to when using AI. The members of the executive board and the board of directors must also be able to describe what the AI-specific risks are for the institution and how it deals with them, and they must have thought about what they are prepared to accept before approving corresponding projects and initiatives. This in turn means that they need to have a basic understanding of what AI actually is, where its weaknesses and strengths lie and what approaches there are for dealing with it. It is about understanding what AI really brings to the financial institution and what it does not, and in which applications it is used.

Of course, thisunderstanding does not have to go into every detail at this level and it can and will be prepared by specialists, but we believe there must be a basic understanding of the AI-specific risks from the perspective of the institution as a whole, and this is also how we understand FINMA's position. The management bodies must have a good feeling for the topic, without scaremongering and without exaggerating the capabilities of AI. In our experience, this is still often lacking, which sometimes has to do with the fact that – apart from the aforementioned "ChatGPT" workshops and "scaremongering" presentations – AI risk training at C-level and for the board of directors tends to be neglected; this has led to us now also being active in the area of management training, although this is not within our typical area of activity.

The need to deal with AI-specific risks also explains the last element of guiding principle no. 1, the definition and implementation of risk management processes. This is nothing new either. One challenge for many institutions is to expand their existing risk maps and catalogues to include AI-specific risks. However, this is only a question of investing the time. We have also developed tools for this, which are available free of charge as open source and are already being used by a number of companies, including banks and insurance companies (see our GAIRA tool and blog post no. 4). This also includes a proposal on how AI projects can be classified according to their risk exposure for a company, because not every AI application is equally risky (see blog post no. 20).

The risk-based approach is also advocated by FINMA in its guidelines. It is primarily concerned with applications with correspondingly high risks: What happens if the AI makes a material error or grossly malfunctions? How many customers would be affected and how? What would be the financial impact? What impact would this have on compliance with legal requirements?

Guiding principle no. 2: Robustness and reliability

On the one hand, this requires appropriate quality management in relation to AI-based components and systems and, on the other hand, implicitly formulates the expectation that AI systems will only be used autonomously once they are sufficiently reliable and this can ultimately be proven.

FINMA is particularly concerned that AI systems might be used that could deliver incorrect or otherwise unacceptable results because they have been trained with incorrect or unrepresentative data or because circumstances change after they have gone live in such a way that their output no longer fits, for example because the model has not been updated (concept drift). In practice, the convincingly designed outputs of generative AI in particular can lead to their content quality being overestimated. This can lead not only to errors or bias in the output, but also to a bias in relation to the assumption that the AI is correct – in other words, it is trusted too much, even if a human checks the result (AI overreliance). For example, a large language model can easily be asked how likely a previously generated answer corresponds to the facts. It will provide an answer. However, as empirical research shows, these answers are extremely unreliable because although the model works with probabilities when it generates text, it cannot transfer these probabilities into its own text. However, these things are not apparent in the answer – it appears convincing and self-critical.

In addition to the risk of inadequate training data and changing circumstances, institutions should also be aware of the limitations of the techniques they use. This also applies to predictive AI, which can be used to predict financial parameters, for example. Each machine learning method has its advantages, disadvantages and areas of application that need to be recognized. For example, a particular method for predicting certain values may work reliably in its normal rangesbut fail with more extreme inputs or certain combinations – without this being apparent from the output. Whoever uses it must know these limits and be able to deal with them.

FINMA also sees AI systems as a new gateway for cyberattacks, which must be countered. It has every reason to expect this, as generative AI systems in particular allow new forms of attacks (see our blog post no. 6). This often poses a double challenge for financial institutions: firstly, they must ensure that information security is guaranteed in the traditional way; many of the new AI applications are used in the cloud, in relation to which some financial institutions (and their providers) are still gaining experience. Secondly, the new forms of attack require new defense measures. For example, an EDR system or firewall is of no use against an attack via prompt injection, in which an appropriately formulated command to a chatbot that is freely accessible to all causes it to override its security protocols.

So, if a financial institution wants to implement a chatbot for customer enquiries, it must ensure that it does so on a carefully selected model, test it extensively and successfully (including for misuse) and monitor it continuously after launch (e.g. to recognize when the model needs to be "re-calibrated" or when misuse occurs). If AI is used to recognize problematic transactions, KPIs for the model must be defined (e.g. accuracy, precision, recall) and measured regularly so that corrective measures can be initiated in good time. Is a financial institution able to detect outliers or major errors in the output of its AI?

Guiding principle no. 3: Transparency and explainability

Guiding principle no. 3 combines two topics that have only a limited nexus with each other. However, both have in common that they are about classic AI buzzwords: calling for them sounds obvious at first glance, but on closer inspection it becomes clear that they do not necessarily lead anywhere.

Firstly, FINMA requires the use of AI to be transparent. This is, rightly, not to be implemented according to the scattergun approach, but rather in a manner that is appropriate to the risks and affected persons, as FINMA also states. It is primarily concerned with transparency towards clients, rather than other market participants or employees. It should enable clients to assess the risks associated with the use of AI because they (can) know where and how AI is being used to affect them.

There is nothing wrong with this in itself. However, the reality is that transparency in the field of AI is usually either a mere fulfilment of duty or an alibi exercise for a clear conscience, but in both cases, it only has a limited positive effect. We have all experienced this in connection with privacy notices: companies today provide much more comprehensive information about their data processing than they did five or ten years ago (because they have to), and yet the data subjects do not really know what happens to their data, even if they are interested and read the privacy notices, which hardly anyone does. In addition, they regularly feel powerless: although they are given a hint of what is happening to their data, they cannot really do anything about it.

This is hardly any different in the area of AI: FINMA expects that if an AI answers a customer's questions directly, suggests personal investment decisions or monitors their payment behavior to combat fraud, they should know that it is doing so and on what basis. For example, they will know that their payments are being scanned for suspicious patterns. However, they will not understand how this works in detail, nor can it be reasonably communicated to them. They will be no more or less able to assess their risks than if no AI were used for this purpose, and they will not be able to adjust their behavior accordingly. If he knows that the investment decision comes from an AI instead of a human, this may be subjectively important information for him, but it will not enable him to make a reliable statement about the quality of the information – if guiding principle no. 2 is followed, it should not make any difference.

The transparency requirement will therefore lead to financial institutions specifically informing customers when they are dealing with trained rather than fully programmed algorithms. However, this will not change their use. FINMA does not go as far as the Federal Data Protection and Information Commissioner, who demands transparency about every use of AI. FINMA clearly only expects such transparency where the client is directly confronted with it in a way that is relevant to them. If AI helps a flesh and bone advisor to better formulate or translate their emails, they do not need to be informed about this. However, if a bot answers the enquiry without human intervention, this should be made clear. We explained in detail how information can be provided and what transparency is required in the area of AI in blog post no. 16, which includes a sample AI declaration.

Pro memoria: Art. 17 FinSA requires compliance with good faith in the processing of client orders, which also includes transparency. In turn, Art. 8 FinSA requires that clients receive certain information about financial services (such as investment advice). According to Art. 7 para. 1 FinSO, this includes information on the nature of the financial service, its characteristics and how it works. This may result in a specific obligation to provide information about AI. However, not every use of AI is special; AI is already being used in many places today without anyone thinking of demanding transparency in this regard. Please also refer to our blog post no. 16.

FINMA's second topic, the explainability of AI results, should also be treated with a certain amount of caution and should not be taken too literally. Firstly, it is obvious to demand that the results from an AI used for a decision should be explainable. Nobody wants to trust a black box. However, with some of the advanced methods of AI, we are simply not able to explain why a specific result was achieved in exactly the way it was. We are able to build such systems, we understand how they work in principle and we are becoming increasingly familiar with their behavior through empirical research. However, we do not really understand, for example, large language models in depth. We have discussed this in detail – including how they work – in our blog post no. 17. With some other machine learning methods, we may in principle understand every decision down to the last detail, but this might involve so many decision steps or calculations that this is no longer practicable. This is why the explainability of the results is literally wishful thinking, at least for certain AI applications. FINMA's demand does not change this. It is demanded in many places, but this does not mean that it can be implemented as conceived.

We must therefore interpret and adjust the requirement accordingly in order to be able to implement it sensibly. First of all, FINMA is of course aware of the problem itself. It knows that due to the large number of parameters and the complexity of AI models, the influence of the individual parameters on the result cannot be understood by us humans, at least not currently. However, it recognizes the risk that without an understanding of how an AI arrives at a certain result, a financial institution's decision based on that result can no longer be explained either. If an institution's decisions can no longer be explained, they can no longer be meaningfully reviewed – and the audit firms and FINMA can no longer fulfil their supervisory duties. This fear is already addressed in guiding principle no. 1. Not mentioned by FINMA, but just as relevant, is the fact that the institution itself can no longer really control any AI-supported decision-making if it does not know why decisions are made the way they are – quasi by the oracle of AI.

The explainability of the results of an (AI) application should therefore ensure that decisions based on AI remain comprehensible and therefore verifiable. Understood in this sense, explainability does not mean that a financial institution must understand why and how an AI has arrived at a specific or each individual result. It is sufficient if the result can be understood and confirmed in some way, even if the explanation is provided by way of an alternative justification. The validation of the result is therefore essential. The question is whether the result makes logical sense. We humans are no different: when we recognize a certain type of object, we cannot explain why we immediately know what it is. However, we can subsequently deduce why the object is what we spontaneously realized it was – even if we do this in a different way than our brain instinctively did.

To ensure this, a financial institution can, for example, establish what determines the output of an AI, i.e. which aspects in the input are the drivers in relation to the output. For predictive AI in particular, a so-called sensitivity analysis can be carried out to determine how sensitively the results of a model react to changes in the input variables. This makes it possible to understand which variables have the greatest influence on the model and how uncertainties in the input data affect the predictions. In this way, the robustness and reliability of a model can be assessed and the most important influencing factors can be identified. As a result, models can not only be optimized, but also better understood.

So, if an AI is used to combat fraud and it monitors a customer's transactions and blocks one as suspicious, the question arises as to whether the bank understands which patterns were responsible for this in order to check whether the blocking of a transaction was justified and that they agree with the decision. If this is not readily possible, the bank can alternatively show that the circumstances would have justified the blocking even if considered separately. This does not necessarily have to be the case for every such event, as there must be a certain tolerance where assessments are fuzzy by nature, but the AI should come to a result that can be justified sufficiently often. To find out whether that is the case, for example, the sensitivity of the AI model to very high and very low transaction amounts as drivers of suspicious activity could be tested, as could other parameters. Systematic variation of extreme cases could be used to determine how the model reacts - and whether this is comprehensible.

In generative AI, for example, source information in the output can contribute to explainability if the AI draws on information from certain databases (so-called retrieval augmented generation). This makes it possible to understand where a certain answer comes from, even if the user may not be able to find out why the AI has chosen this particular content. However, they will be able to categorize the answer.

An AI result is therefore "explainable" if it can be validated or justified independently of the AI. In this context, it has even been suggested that the principle of explainability should be fulfilled if the result of an AI can be validated by another AI trained on a different basis.

Guiding principle no. 4: Equal treatment

Unjustified unequal treatment due to the use of AI should be avoided. This requirement goes somewhat less far than it might appear at first glance. We must be aware that Swiss law does not recognize a general ban on discrimination in the private sphere. This only exists in certain areas (e.g. in the workplace). In addition, unequal treatment in financial institutions, for example when granting loans, is normal. There is no right to credit and no right to the same conditions. The principle of equal treatment only applies in narrow constellations, for example in the processing of client orders (Art. 17 FinSA).

The use of AI should not change this (for the time being). The guiding principle is accordingly worded softly: Avoiding unequal treatment through AI should ensure that the lack of balance in an AI system does not lead to one-sided results and thus unintentionally discriminate against groups of people. If customers with high potential are to be selected by an AI on the basis of their data in order to provide them with special offers, carefully curated training material, tests and other measures must be used to ensure that women or certain nationalities, for example, are not automatically rated lower because biased data sets have been used for machine learning.

The financial institution may favor or disadvantage groups of people using AI, but it must do so on the basis of a conscious decision. Leaving this to AI or chance is not permitted. AI should therefore not make subjective decisions in financial institutions and should not discriminate on its own initiative or due to its inability.

Implementation in the financial institutions

There is no transitional period for implementing these guidelines, which is clear from the fact that they are not binding. Nevertheless, they express FINMA's expectations and set out general principles for the use of AI that FINMA believes regulated financial institutions must comply with even without the guidelines it has formulated (even if it is not always clear what the legal basis is).

In concrete terms, this means that institutions must take the usual governance measures, such as issuing corresponding directives, defining GSCs and implementing compliance processes.

They should also consider how FINMA's guidelines (or their own guidelines) can be operationalized as part of the implementation and review of specific AI projects. The challenge lies not in the principle, but in the concrete operational implementation. Many financial institutions will already have AI projects, i.e. there is a need for action. However, in our experience, only a few institutions have looked closely at the specific implementation of these supervisory expectations, and in our experience, some institutions do not yet have a real overview of all relevant AI activities.

As a first step, we therefore recommend that, in addition to corresponding directives, a "map & track" of the relevant applications be carried out, i.e. a survey of the applications in which AI is used in a relevant manner. In a second step, these applications should be categorized according to a risk scale tailored to the respective institution and then assessed on a risk-oriented basis.

Moreover, FINMA's "supervisory expectations" will not stop there: By the end of the year (or early 2025), the Federal Council will set out where it sees a need for further adjustments to Swiss law in relation to AI. Furthermore, financial institutions should prepare for the application of the EU AI Act (see our blog post no. 7). This has an extraterritorial effect, particularly in cases where the output of an AI is also used in the EU in the way it was intended. FINMA expects Swiss financial institutions to comply with the AI Act insofar as it will apply to AI projects of these institutions.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.

Mondaq uses cookies on this website. By using our website you agree to our use of cookies as set out in our Privacy Policy.

Learn More