An effective anti-money laundering (AML) transaction monitoring (TM) system is essential for a bank to fulfill its regulatory obligations in detecting and reporting illicit transactional activity. It is important that AML TM systems are able to detect suspicious activity while minimizing non-productive alerts. With many TM systems embedding artificial intelligence and machine learning (AI/ML) components, model risk management practices need to be modified to get the best out of AI-enabled TM models.

Recent Regulatory Guidance

Modernization of model risk management practices should also take account of recent regulatory guidance, including the Model Risk Management Framework SR 26-2, issued jointly by the Federal Reserve, FDIC, and OCC on April 17, 2026. Drawing on supervisory experience, industry feedback, and advancements in modeling practices over the past 15 years, this guidance replaces the SR 11-7 Model Risk Management framework.

The updated guidance clarifies model risk management principles and emphasizes a risk-based approach to model risk management tailored to a financial institution’s risk profile and the size and complexity of its operations. SR 26-2 emphasizes that a one-size-fits-all approach to AML model governance is no longer aligned with supervisory expectations.

Revised Definition of Model

Key changes are the revised definition of “model” and the new materiality framework. Under the revised definition, a system must be a complex quantitative method or approach to qualify as a model. The definition explicitly excludes simple arithmetic calculations, deterministic rule-based processes, and software where no statistical or financial theories underpin the design or use.

For AML specifically, this means that institutions should consider a re-scoping exercise of their BSA/AML system inventory: basic rules-based transaction monitoring alert thresholds may no longer qualify as “models” under the revised definition and therefore may not require full model risk management treatment.

New Materiality Framework

Further, SR 26-2 treats materiality as a cornerstone of the MRM program, basing it on “model exposure” (the financial scale or business significance of the decisions a model drives) and “model purpose.” Institutions that re-tier their AML systems on this basis should document the analysis carefully. Examiners will expect to see a principled, evidence-based rationale for how AML systems have been classified and tiered, and an undocumented or inconsistently applied tiering methodology may invite supervisory challenge.

In addition, SR 26-2 states that “the quality of the validation process depends on the rigor and effectiveness of the review rather than on organizational structure,” explicitly providing that validators may be closer to model development teams, provided the rigor of the review is demonstrable and conflicts of interest are managed. The emphasis on demonstrable rigor means that validation documentation, challenge evidence, and conflict management protocols will face heightened scrutiny.

SR 26-2 and AI

SR 26-2 explicitly excludes generative and agentic AI from its scope, on the basis that these technologies are “novel and rapidly evolving,” while confirming that traditional machine learning models, classifiers, gradient-boosted models, and neural networks used for credit or fraud detection remain fully in scope.

Further details can be found on the Federal Reserve’s website.

Best Practices for Implementing or Enhancing a TM System

Aligned with the new model risk management guidance, best practices to consider when implementing, enhancing, or optimizing a TM system include the following.

Risk-Based Approach

The financial institution’s risk appetite should be defined up front. Customers should be classified into risk segments (consumer, small business, commercial, high net worth, correspondent banks, etc.), and the risk tolerance for each segment should be well-defined.

Data Integrity

It is crucial that the TM system receives high-quality data inputs. Data feeds from all channels (core banking, wire transfers, ACH, cards, trade finance, etc.) should be clean, complete, and timely. Data governance standards should be established, and data should be reconciled across systems so that accurate inputs produce quality alerts.
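As an illustration of reconciling data across systems, the following is a minimal sketch that compares records from a source system against what was loaded into the TM system. The field names (`txn_id`, `amount`) and the function `reconcile` are hypothetical, and a production reconciliation would typically also cover timeliness and field-level completeness.

```python
from decimal import Decimal

def reconcile(source_records, tm_records):
    """Compare source-system and TM-system records keyed by transaction ID.

    Each record is a dict with hypothetical keys 'txn_id' and 'amount'
    (amount as a string, parsed as Decimal to avoid float rounding).
    """
    source = {r["txn_id"]: Decimal(r["amount"]) for r in source_records}
    loaded = {r["txn_id"]: Decimal(r["amount"]) for r in tm_records}

    shared = source.keys() & loaded.keys()
    return {
        # Transactions the TM system never received (completeness gap).
        "missing_in_tm": sorted(source.keys() - loaded.keys()),
        # Transactions in the TM system with no source record.
        "unexpected_in_tm": sorted(loaded.keys() - source.keys()),
        # Transactions present in both but with different amounts.
        "amount_mismatch": sorted(t for t in shared if source[t] != loaded[t]),
        # Control totals for a quick high-level comparison.
        "source_total": sum(source.values(), Decimal("0")),
        "tm_total": sum(loaded.values(), Decimal("0")),
    }
```

Any non-empty list in the result, or a difference between the two control totals, would be a prompt to investigate the feed before relying on the alerts it generates.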

Scenario & Rule Design

A library of TM scenarios and typologies should be tailored to the financial institution’s risk profile. Common scenarios include structuring, rapid movement of funds, transactions to or from high-risk jurisdictions, and activity inconsistent with expected customer activity.

TM scenarios should be updated proactively to reflect emerging risk typologies faced by the financial institution and FinCEN’s National AML Priorities. TM rules should be documented with a clear rationale describing the AML typologies they are intended to detect, adjusted by customer segment, and reviewed regularly.
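To make the idea of a documented, segment-adjusted rule concrete, here is a minimal sketch of a structuring scenario that carries its rationale alongside its parameters. The class `Scenario`, the function `structuring_alert`, and every window and amount shown are illustrative assumptions, not recommended settings.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class Scenario:
    name: str
    rationale: str            # documented typology the rule is meant to detect
    window_days: int          # look-back window, inclusive of the as-of date
    per_txn_ceiling: float    # only deposits below this amount are counted
    aggregate_floor: dict     # customer segment -> aggregate needed to alert

# Hypothetical parameter values for illustration only.
STRUCTURING = Scenario(
    name="cash_structuring",
    rationale="Multiple smaller cash deposits that together exceed a set amount.",
    window_days=3,
    per_txn_ceiling=10_000.0,
    aggregate_floor={"consumer": 10_000.0, "small_business": 20_000.0},
)

def structuring_alert(deposits, segment, as_of, scenario=STRUCTURING):
    """Return True if the deposits in the window breach the segment's floor.

    deposits: list of (date, amount) tuples for one customer.
    """
    start = as_of - timedelta(days=scenario.window_days - 1)
    in_scope = [
        amount for day, amount in deposits
        if start <= day <= as_of and amount < scenario.per_txn_ceiling
    ]
    # Require at least two deposits so that a single transaction is not
    # treated as structuring.
    return len(in_scope) >= 2 and sum(in_scope) > scenario.aggregate_floor[segment]
```

Keeping the rationale and segment thresholds in the same object as the logic means the documentation travels with the rule when it is reviewed or tuned.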

For AI-based models, model explainability documentation and rationale should be developed upfront as part of the implementation cycle.

Threshold Tuning & Optimization

The TM system should be periodically tuned and optimized to generate higher-quality alerts and reduce the volume of false positives. A high false positive volume creates operational risk and burden, which may be remediated by tuning thresholds so that detection sensitivity is balanced against the share of alerts that prove productive. High false positive rates may signal that there is too much noise in alert volumes for queues to be cleared efficiently. Conversely, false positive rates that are too low may indicate that alerting thresholds do not align with the financial institution’s risk appetite.

Tuning and optimization decisions should be thoroughly documented, and the rules optimized to appropriately identify and mitigate risks faced by the financial institution. Several TM platforms have integrated AI-based tuning recommendations, proposing optimized threshold logic based on alert history and investigative outcomes.
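The trade-off described above can be sketched as a simple comparison of candidate thresholds against historical alert outcomes. The function `threshold_report` and its input layout are hypothetical. "False positive rate" is used here in the sense common in TM practice (the share of generated alerts that were not productive), and "productive" is whatever outcome the institution defines, such as escalation or a SAR filing.

```python
def threshold_report(alert_history, candidate_thresholds):
    """Summarize alert volume and quality for each candidate threshold.

    alert_history: list of (value, was_productive) tuples, where value is
    the amount or score that triggered the historical alert.
    """
    total_productive = sum(1 for _, productive in alert_history if productive)
    rows = []
    for threshold in candidate_thresholds:
        # Alerts that would still fire if the threshold were set here.
        fired = [(v, p) for v, p in alert_history if v >= threshold]
        productive = sum(1 for _, p in fired if p)
        rows.append({
            "threshold": threshold,
            "alerts": len(fired),
            # Share of fired alerts that were non-productive.
            "false_positive_rate": (
                (len(fired) - productive) / len(fired) if fired else None
            ),
            # Share of all historical productive alerts still captured.
            "productive_retained": (
                productive / total_productive if total_productive else None
            ),
        })
    return rows
```

Because the history only contains alerts that actually fired, a report like this can only evaluate thresholds at or above the one in production; assessing a lower threshold requires sampling activity that did not alert. The output is an input to a documented tuning decision, not a substitute for one.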

Behavioral Analytics & Machine Learning

Threshold-based detection scenarios should be combined with behavioral analytics and machine learning where applicable. While baseline scenarios can detect known typologies, analytics can help to identify expected activity for specific customers and deviations from that activity.

Machine learning is increasingly used to detect these deviations from a customer’s or account’s expected activity. By analyzing an account’s recent transaction history and comparing it with peer customer segments, it can identify anomalous behavior that traditional rules-based monitoring may not detect.
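The two comparisons described above, against a customer’s own history and against a peer segment, can be illustrated with a simple statistical stand-in. This sketch uses z-scores rather than a trained machine learning model, and the function names and the cutoff of 3.0 are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def z_score(value, baseline):
    """Standard deviations between value and the baseline mean.

    Returns None when the baseline is too short or has no variation,
    since a z-score is undefined in those cases.
    """
    if len(baseline) < 2:
        return None
    spread = stdev(baseline)
    if spread == 0:
        return None
    return (value - mean(baseline)) / spread

def deviation_flags(current_total, own_history, peer_totals, cutoff=3.0):
    """Compare a period total with the customer's history and with peers."""
    own_z = z_score(current_total, own_history)
    peer_z = z_score(current_total, peer_totals)
    return {
        "own_z": own_z,
        "peer_z": peer_z,
        # Only unusually high activity is flagged in this sketch.
        "unusual_vs_own_history": own_z is not None and own_z > cutoff,
        "unusual_vs_peers": peer_z is not None and peer_z > cutoff,
    }
```

A customer can be unusual relative to their own history while remaining ordinary for their peer segment, which is why the two flags are reported separately rather than combined.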

Case Management Workflow

Alerts should flow into a structured case management system with clear service level agreements (SLAs), escalation paths, and audit trails. Disposition decisions for both cleared and escalated alerts must be documented with a defensible rationale and subjected to quality reviews.

AI agents can be used to triage and clear first-level alert reviews by assessing investigative history to rank alerts from highest to lowest risk and by providing structured alert summaries covering customer profiles, transactional details, counterparty analysis, and similar information.

Safeguards should include full audit trails of every AI action to support transparency and compliance, traceable explanations of every AI action (how risk scores were calculated, which thresholds were crossed, etc.), and parallel testing before any changes are deployed to the live environment.
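A minimal sketch of how triage ranking and an audit trail might fit together follows. A transparent weighted score stands in for the AI component, and the function names, feature names, and record layout are all hypothetical; the point is that every scoring action writes an entry explaining how the score was calculated.

```python
from datetime import datetime, timezone

def score_alert(alert, weights):
    """Return the total score and the per-feature contributions behind it."""
    contributions = {
        name: weight * alert["features"].get(name, 0)
        for name, weight in weights.items()
    }
    return sum(contributions.values()), contributions

def triage(alerts, weights, audit_log):
    """Rank alert IDs from highest to lowest score, logging each action."""
    scored = []
    for alert in alerts:
        score, contributions = score_alert(alert, weights)
        # One audit entry per automated action, with a traceable explanation.
        audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "alert_id": alert["alert_id"],
            "action": "risk_scored",
            "score": score,
            "contributions": contributions,
        })
        scored.append((score, alert["alert_id"]))
    # sorted() is stable, so tied scores keep their original queue order.
    return [alert_id for _, alert_id in sorted(scored, key=lambda s: -s[0])]
```

In this sketch the audit log is an in-memory list; an actual deployment would presumably write to an append-only store so that entries cannot be altered after the fact.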

Staffing & Training

Alert analysts should receive ongoing training on AML risk typologies, TM scenario thresholds, and alert review procedures. A lack of adequate training can affect the quality of detection and impact the ability to discern unusual and/or potentially suspicious activity patterns.

Independent Testing & Model Validation

Periodic model validations should address rule coverage gaps, threshold adequacy, data integrity, false positive and alert-to-SAR reporting rates, and staffing sufficiency. Model validations are essential to verify that the TM system is accurately detecting potentially suspicious activity, that thresholds are appropriately calibrated to the bank’s risk profile, that models are conceptually sound, and that data inputs are complete and accurate and produce reliable alerts.

Conclusion

A strong AML TM system is well-defined and tailored to a financial institution’s specific risk appetite, relies on high-quality data, and is regularly reviewed for optimal effectiveness after implementation with rules adjusted as necessary. Alert thresholds should be tuned and optimized to ensure high-quality alerts while appropriately identifying potentially suspicious activity.

It is critical that staff are trained on the TM system and relevant typologies on an ongoing basis. Testing and validation of the TM system ensures that it is performing as intended and helps to identify and close any control gaps.