AML Shield
An AI-powered AML compliance platform for BaFin-regulated German financial institutions. A conditional ReAct agent combines Claude Haiku with an XGBoost classifier to deliver explainable transaction risk decisions grounded in live regulatory citations — from GwG to FATF to EU 6AMLD — producing BaFin-format SAR reports with immutable audit trails.
Legal Grounding
Every decision is anchored to a specific statutory obligation. The primary framework is German law (GwG), supplemented by FATF standards and EU directives where they impose a higher obligation. Citations are embedded in the agent's reasoning chain — compliance is not a post-hoc annotation.
| Instrument | Provision | How it is applied in AML Shield |
|---|---|---|
| GwG §10 Abs. 3 | Customer due diligence for transactions ≥ €10,000 | Triggers is_above_ctr feature in the ML model; regulatory checker raises a MEDIUM-severity flag |
| GwG §43 Abs. 1 | Obligation to report suspicious transactions to the BaFin FIU | The primary trigger for SAR_REQUIRED decisions; report is submitted via the goAML portal |
| GwG §47 Abs. 5 | Tipping-off prohibition — informing a customer of a SAR is a criminal offence | Explicit warning printed on every SAR output; UI prevents any customer-facing message referencing the case |
| GwG §17 | Criminal liability for under-reporting (Strafbarkeit) | Justifies the conservative default bias in the system prompt: "when uncertain between tiers, escalate higher" |
| FATF Rec. 16 | Wire transfer transparency — originator and beneficiary info must accompany cross-border payments | Missing/incomplete IBAN or counterparty data raises the risk score and triggers a regulatory flag |
| FATF Rec. 19 | Enhanced due diligence for transactions involving blacklisted jurisdictions | is_high_risk_country feature; Rule E: sanctions fast-track overrides all other processing |
| FATF Rec. 20 | Suspicious transactions must be reported regardless of amount | Overrides the €10,000 threshold: suspicious patterns below the CTR threshold are still escalated |
| EU 6AMLD Art. 18 | Expanded predicate offences and enhanced corporate liability | SAR narrative includes 6AMLD transposition language when predicate offence indicators are present |
| EU Reg. 2015/847 | Wire Transfer Regulation — information accompanying transfers of funds | Cross-border payment scrutiny layer; flags transfers lacking compliant originator records |
| EU MiCA 2023/1114 | Crypto-asset service provider obligations | Applied by the regulatory checker when transaction_type = crypto_exchange |
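Three of the rows above describe mechanical checks (the €10,000 threshold, missing counterparty data, and the crypto_exchange transaction type). A minimal sketch of how a regulatory checker might emit flags for them follows. The `Flag` class, the `check_transaction` function, and the dictionary keys `amount_eur`, `counterparty_iban`, and `counterparty_name` are hypothetical names chosen for illustration, and only the MEDIUM severity for the threshold rule comes from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Flag:
    citation: str
    reason: str
    # Only the CTR rule has a severity stated in the table; others stay unset.
    severity: Optional[str] = None


def check_transaction(tx: dict) -> List[Flag]:
    """Return regulatory flags for one transaction (illustrative field names)."""
    flags: List[Flag] = []
    if tx.get("amount_eur", 0) >= 10_000:
        flags.append(Flag("GwG §10 Abs. 3", "amount at or above EUR 10,000", "MEDIUM"))
    if not tx.get("counterparty_iban") or not tx.get("counterparty_name"):
        flags.append(Flag("FATF Rec. 16", "missing or incomplete counterparty data"))
    if tx.get("transaction_type") == "crypto_exchange":
        flags.append(Flag("EU MiCA 2023/1114", "crypto-asset service provider obligations"))
    return flags


example_flags = check_transaction(
    {"amount_eur": 12_000, "counterparty_iban": "", "transaction_type": "wire"}
)
```

Each flag carries its citation, so the statutory reference travels with the finding rather than being attached afterwards.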
Money Laundering Patterns
The rule engine and ML features jointly encode four primary typologies. Each maps to one or more statutory obligations.
Structuring (Smurfing)
Transactions in the €8,500–€9,999 band are flagged by the is_near_ctr_threshold feature, because amounts positioned just below the €10,000 Cash Transaction Reporting threshold suggest a deliberate attempt to avoid GwG §10 Abs. 3 scrutiny. The feature is the second-highest SHAP contributor in the reference case (0.614).
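The two threshold features can be sketched as simple predicates. The feature names is_above_ctr and is_near_ctr_threshold come from this document. The constant names and the treatment of cent amounts (anything from €8,500.00 up to but not including €10,000.00 counts as "near") are assumptions.

```python
from decimal import Decimal

CTR_THRESHOLD_EUR = Decimal("10000")  # GwG §10 Abs. 3 threshold cited in the text
NEAR_CTR_LOWER_EUR = Decimal("8500")  # lower edge of the structuring band


def is_above_ctr(amount_eur: Decimal) -> bool:
    """True when the amount is at or above the EUR 10,000 threshold."""
    return amount_eur >= CTR_THRESHOLD_EUR


def is_near_ctr_threshold(amount_eur: Decimal) -> bool:
    """True for amounts in the band just below the threshold."""
    return NEAR_CTR_LOWER_EUR <= amount_eur < CTR_THRESHOLD_EUR
```

The two predicates are mutually exclusive by construction, so a single transaction never sets both features.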
High-Risk Jurisdiction Exposure
Country codes are extracted from BIC/IBAN and matched against two lists derived from EU Delegated Regulation 2016/1675:
IR · KP · MM · SY · YE · AF: Iran, North Korea, Myanmar, Syria, Yemen, Afghanistan. Triggers the Rule E sanctions fast-track. Highest SHAP weight: 0.821.
PK · TR · ML · VN · MZ · TZ · JO: Pakistan, Turkey, Mali, Vietnam, Mozambique, Tanzania, Jordan. Elevated risk weight; does not trigger an automatic SAR.
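A rough sketch of the lookup follows, using the fact that an IBAN begins with a two-letter country code and a BIC carries its country code in positions 5 and 6. The function names and the tier labels `fast_track`, `elevated`, and `standard` are hypothetical. The country sets are copied from the two lists above.

```python
from typing import Optional

SANCTIONS_FAST_TRACK = frozenset({"IR", "KP", "MM", "SY", "YE", "AF"})
ELEVATED_RISK = frozenset({"PK", "TR", "ML", "VN", "MZ", "TZ", "JO"})


def country_from_iban(iban: str) -> Optional[str]:
    """Return the two-letter country prefix of an IBAN, or None if malformed."""
    code = iban.replace(" ", "").upper()[:2]
    return code if len(code) == 2 and code.isalpha() else None


def country_from_bic(bic: str) -> Optional[str]:
    """Return characters 5-6 of an 8- or 11-character BIC, or None if malformed."""
    cleaned = bic.strip().upper()
    if len(cleaned) not in (8, 11):
        return None
    code = cleaned[4:6]
    return code if code.isalpha() else None


def jurisdiction_tier(country: Optional[str]) -> str:
    """Map a country code to an illustrative tier label."""
    if country in SANCTIONS_FAST_TRACK:
        return "fast_track"
    if country in ELEVATED_RISK:
        return "elevated"
    return "standard"
```

For example, `jurisdiction_tier(country_from_bic("AAAAIRXX"))` (a made-up BIC) lands in the fast-track tier, while a German IBAN stays in the standard tier.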
Temporal Anomalies
The is_night feature flags transactions between 22:00 and 06:00. Legitimate retail banking activity is strongly concentrated in business hours; early-morning timestamps correlate with automated layering scripts. SHAP weight in reference case: 0.392.
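Because the window wraps past midnight, the check is an "or" rather than a simple range test. The sketch below assumes the start of the window is inclusive, the end is exclusive, and the timestamp is already in the institution's local time. None of these details are stated in the text.

```python
from datetime import datetime, time

NIGHT_START = time(22, 0)  # window opens at 22:00
NIGHT_END = time(6, 0)     # and closes at 06:00 the next morning


def is_night(ts: datetime) -> bool:
    """True when the timestamp falls in the 22:00-06:00 window."""
    t = ts.time()
    # The window spans midnight, so either condition is sufficient.
    return t >= NIGHT_START or t < NIGHT_END
```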
Network Graph Typologies
Detected via NetworkX traversal, the following four patterns are invisible to single-transaction screening:
One account rapidly distributes funds to many receivers — the classic placement layer in a three-stage laundering scheme.
Funds traverse ≥3 hops in under 24 hours, deliberately obscuring the beneficial ownership trail.
Funds return to the originating account after passing through one or more intermediaries — a classic integration indicator.
Many accounts funnel into one — structuring across multiple originators to avoid individual reporting thresholds.
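Three of these patterns (one-to-many, many-to-one, and funds returning to their origin) can be illustrated on a small directed transfer graph. The project uses NetworkX; the sketch below swaps in plain dictionaries so that it runs without dependencies. The function names, the minimum of three counterparties, and the hop limit are all hypothetical, and transaction timing is ignored.

```python
from collections import defaultdict


def build_graph(edges):
    """Build outgoing and incoming neighbour sets from (sender, receiver) pairs."""
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)
    return out_nbrs, in_nbrs


def fan_out_accounts(out_nbrs, min_receivers=3):
    """Accounts that send to at least min_receivers distinct receivers."""
    return {a for a, r in out_nbrs.items() if len(r) >= min_receivers}


def fan_in_accounts(in_nbrs, min_senders=3):
    """Accounts that receive from at least min_senders distinct senders."""
    return {a for a, s in in_nbrs.items() if len(s) >= min_senders}


def returns_to_origin(out_nbrs, origin, max_hops=4):
    """True if funds can come back to origin via at least one intermediary."""
    frontier = set(out_nbrs.get(origin, ())) - {origin}
    for _ in range(max_hops - 1):
        next_frontier = set()
        for node in frontier:
            next_frontier |= out_nbrs.get(node, set())
        if origin in next_frontier:
            return True
        frontier = next_frontier
    return False


edges = [("A", "B"), ("A", "C"), ("A", "D"),
         ("B", "E"), ("C", "E"), ("D", "E"), ("E", "A")]
out_nbrs, in_nbrs = build_graph(edges)
```

In this toy graph, account A distributes to three receivers, account E collects from three senders, and the money then flows back from E to A.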
XGBoost Classifier
The risk scoring model is an XGBoost binary classifier trained on the IBM AMLworld dataset (NeurIPS 2023). When real data is unavailable, the pipeline falls back to generating 2,000 synthetic transactions.
Training Data
IBM AMLworld is preferred when available in data/. The pipeline loads up to 3 CSV files (≤5,000 rows each), maps IBM payment format labels to the internal transaction type schema, and validates that the positive rate exceeds 1% before training.
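The file cap, the row cap, and the positive-rate gate could look roughly like this. The helper names are hypothetical, and the sketch works on already-parsed rows and labels rather than on the actual CSV schema, which is not described here.

```python
from itertools import islice

MAX_FILES = 3             # "up to 3 CSV files"
MAX_ROWS_PER_FILE = 5000  # "<= 5,000 rows each"
MIN_POSITIVE_RATE = 0.01  # positive rate must exceed 1%


def cap_rows(per_file_rows, max_files=MAX_FILES, max_rows=MAX_ROWS_PER_FILE):
    """Take at most max_rows rows from each of the first max_files row iterables."""
    rows = []
    for file_rows in islice(per_file_rows, max_files):
        rows.extend(islice(file_rows, max_rows))
    return rows


def validate_positive_rate(labels, min_rate=MIN_POSITIVE_RATE):
    """Return the share of positive labels, raising if it does not exceed min_rate."""
    if not labels:
        raise ValueError("no labels to validate")
    rate = sum(1 for y in labels if y == 1) / len(labels)
    if rate <= min_rate:
        raise ValueError(f"positive rate {rate:.4f} does not exceed {min_rate}")
    return rate
```

Failing early on a too-sparse positive class avoids training a classifier that has effectively never seen a laundering example.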
14 Engineered Features
All features are derived programmatically in models/features.py from raw transaction fields. No manual labelling is required.
Model Hyperparameters
SHAP Explainability
Every prediction is accompanied by SHAP values, a game-theoretic method that assigns each feature a contribution to the final score. This is intended to support EU AI Act interpretability requirements and gives compliance officers auditable explanations.
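To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. It is not the project's model and does not use the shap library. The weights in `toy_model` are invented for illustration; only the feature names come from this document.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the baseline input
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            score = model(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(f):
    """Invented score with one interaction term (night only matters if high-risk)."""
    return (0.1
            + 0.4 * f["is_high_risk_country"]
            + 0.3 * f["is_near_ctr_threshold"]
            + 0.2 * f["is_night"] * f["is_high_risk_country"])


instance = {"is_high_risk_country": 1, "is_near_ctr_threshold": 1, "is_night": 1}
baseline = {"is_high_risk_country": 0, "is_near_ctr_threshold": 0, "is_night": 0}
values = shapley_values(toy_model, instance, baseline)
```

The contributions sum to the difference between the instance score and the baseline score, which is the property that makes the attributions auditable. The interaction term is split evenly between the two features that jointly produce it.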
Reference Case: Iran Wire Transfer
A €9,750 international wire from Germany to Iran, timestamped 02:34. The model scores this SAR_REQUIRED. The SHAP breakdown shows which features drove the decision.
In the SHAP chart, red bars increase risk and the green bar mitigates it. The account age signal partially offsets the other factors: an older, established account is modestly less suspicious than a recently opened one, all else equal.
Suspicious Activity Reporting
SAR reports follow the BaFin GwG format required for FIU submission via the goAML portal. Each report receives an auto-generated ID in the format SAR-YYYYMMDD-TXID and enters DRAFT status pending compliance officer sign-off.
Claude compiles transaction facts, SHAP attributions, network findings, and triggered regulatory rules into the standardised BaFin template, with inline statutory citations at every assertion.
The report surfaces in the case queue as DRAFT. The designated officer reviews the findings, may annotate them, and approves or rejects the SAR before submission.
The approved SAR is forwarded to BaFin's Financial Intelligence Unit. The submission timestamp and portal reference number are written to the audit trail.
GwG §47 Abs. 5 is enforced at the UI level — no customer-facing communication may reference the SAR case. The prohibition is printed on every report as a statutory notice.
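The ID format and the sign-off workflow might be modelled as follows. The SAR-YYYYMMDD-TXID pattern and the DRAFT status come from this section. The status names APPROVED, REJECTED, and SUBMITTED, the `SarReport` class, and the transition table are assumptions made for the sketch.

```python
from dataclasses import dataclass, replace
from datetime import date

# Hypothetical status graph: a draft is approved or rejected, then submitted.
ALLOWED_TRANSITIONS = {
    "DRAFT": {"APPROVED", "REJECTED"},
    "APPROVED": {"SUBMITTED"},
    "REJECTED": set(),
    "SUBMITTED": set(),
}


@dataclass(frozen=True)
class SarReport:
    sar_id: str
    transaction_id: str
    status: str = "DRAFT"


def make_sar_id(created: date, transaction_id: str) -> str:
    """Build an ID in the SAR-YYYYMMDD-TXID format."""
    return f"SAR-{created:%Y%m%d}-{transaction_id}"


def transition(report: SarReport, new_status: str) -> SarReport:
    """Return a copy with the new status, rejecting moves the workflow forbids."""
    if new_status not in ALLOWED_TRANSITIONS[report.status]:
        raise ValueError(f"cannot move from {report.status} to {new_status}")
    return replace(report, status=new_status)


draft = SarReport(make_sar_id(date(2024, 3, 5), "TX123"), "TX123")
approved = transition(draft, "APPROVED")
```

Keeping the report object frozen means every status change produces a new record, which fits an append-only audit trail.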
The ReAct Agent
The decision engine uses the ReAct pattern — the model alternates between thinking about what to do and calling tools to do it. Each tool result feeds back into context, updating the risk assessment before the next step. The loop runs up to 10 iterations with Claude Haiku via the Anthropic tool-use API.
Each iteration: think (pick the next tool) → act (call it with precise inputs) → observe (update the assessment).
The entire chain — every thought, every tool call, every result — is stored in reasoning_chain and written to an immutable audit trail. The whole reasoning process is available for regulatory examination.
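The control flow of such a loop can be sketched without any API calls. In the sketch below the model is replaced by a stub `decide` function; in the real system that step would be a Claude Haiku call through the Anthropic tool-use API. All function names, the stub scoring logic, and the record layout are hypothetical. Only the 10-iteration cap and the reasoning_chain name come from the text.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_ITERATIONS = 10  # loop cap stated in the text

# A decision is (thought, tool_name, tool_input); tool_name None means "finish".
Decision = Tuple[str, Optional[str], dict]


def run_react(
    decide: Callable[[dict, List[dict]], Decision],
    tools: Dict[str, Callable[..., dict]],
    transaction: dict,
) -> List[dict]:
    """Alternate between deciding and calling tools, recording every step."""
    reasoning_chain: List[dict] = []
    observations: List[dict] = []
    for step in range(MAX_ITERATIONS):
        thought, tool_name, tool_input = decide(transaction, observations)
        entry = {"step": step, "thought": thought}
        if tool_name is None:          # the model chose to stop
            reasoning_chain.append(entry)
            break
        result = tools[tool_name](**tool_input)
        observations.append({"tool": tool_name, "result": result})
        entry.update(tool=tool_name, input=tool_input, result=result)
        reasoning_chain.append(entry)
    return reasoning_chain


def stub_decide(transaction, observations):
    """Stand-in for the model: score once, then finish."""
    if not observations:
        return ("Score the transaction first.", "score",
                {"amount_eur": transaction["amount_eur"]})
    return ("Enough evidence gathered.", None, {})


def stub_score(amount_eur):
    """Invented scoring tool used only to exercise the loop."""
    return {"risk_score": 90 if amount_eur >= 8500 else 10}


chain = run_react(stub_decide, {"score": stub_score}, {"amount_eur": 9750})
```

Because each entry holds the thought, the tool call, and the result together, the returned list can be persisted as-is for later examination.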
The Five Tools
Claude has exactly five registered tools, called in a default sequence that the conditional branching rules can override.
risk_score 0–100, confidence interval, and SHAP feature attributions. Always called first.
Conditional Branching Rules
The system prompt hardcodes five rules that override the default tool sequence, followed by a conservative tie-break default. Together they encode the proportionality judgments a senior compliance officer would apply intuitively.
IF score < 30 AND domestic
THEN skip network analysis
Domestic low-risk transactions don't warrant graph traversal. Cuts latency for the majority of legitimate payments.
IF score > 80
THEN network depth = 3
High-risk transactions require deeper traversal to surface layering across three degrees of separation.
IF flagged_connections not empty
THEN re-run network on flagged account
One hop is insufficient — flagged accounts must be investigated to their source.
IF 40 ≤ score ≤ 60
THEN document FOR / AGAINST before deciding
Ambiguous cases require documented balanced analysis — a formal pro/con for the audit trail before any escalation decision.
IF sanctions match detected
THEN SAR immediately, skip remaining tools
Sanctions matches are per-se SAR events under EU 6AMLD Art. 18 — no further analysis needed or permitted.
IF uncertain between tiers
THEN escalate higher
Under GwG §17, under-reporting carries criminal liability. The system prompt instructs Claude to default upward when in doubt.
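In the system these rules live in the system prompt, not in code, but the first five can be written out as a small planning function to show how they combine. The function name, the plan keys, and the use of `None` for the unspecified default depth are assumptions. The tie-break rule is omitted because the tier names are not listed here.

```python
from typing import Sequence


def plan_tools(
    score: float,
    is_domestic: bool,
    sanctions_match: bool = False,
    flagged_connections: Sequence[str] = (),
) -> dict:
    """Translate the branching rules into a plan for the remaining tool calls."""
    plan = {
        "decision": None,
        "run_network": True,
        "network_depth": None,      # None = default depth, not specified in the text
        "rerun_network_on": [],
        "document_for_against": False,
    }
    if sanctions_match:             # sanctions fast-track overrides everything else
        plan.update(decision="SAR_REQUIRED", run_network=False)
        return plan
    if score < 30 and is_domestic:  # low-risk domestic: skip graph traversal
        plan["run_network"] = False
    if score > 80:                  # high risk: traverse three degrees deep
        plan["network_depth"] = 3
    if flagged_connections:         # follow flagged accounts to their source
        plan["rerun_network_on"] = list(flagged_connections)
    if 40 <= score <= 60:           # ambiguous band: document pro/con first
        plan["document_for_against"] = True
    return plan
```

Checking the sanctions rule first and returning immediately mirrors the "skip remaining tools" wording of that rule.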
System Prompt Architecture
The system prompt is divided into eight sections and treated as a legal instrument — the source file header reads: DO NOT modify regulatory citations in this file.
AML Shield is in active development. Architecture and regulatory mappings are subject to change as the design is validated against production compliance requirements.