Compliance Platform In Development

AML Shield

An AI-powered anti-money laundering compliance platform built for BaFin-regulated German financial institutions. A conditional ReAct agent (Reason + Act) combines Claude with an XGBoost classifier to deliver explainable transaction risk decisions grounded in live regulatory citations — from German GwG to FATF Recommendations to EU 6AMLD — and produces BaFin-format SAR reports with full, immutable audit trails.

Python · FastAPI · Claude Haiku · XGBoost · SHAP · NetworkX · Docker · BaFin / GwG · FATF · 6AMLD
Regulatory Framework

Legal Grounding

Every decision AML Shield produces is anchored to a specific statutory obligation. The platform embeds regulatory citations directly into the agent's reasoning chain rather than treating compliance as a post-hoc annotation. The primary framework is German law (GwG), supplemented by FATF standards and EU directives where they impose a higher obligation.

Instrument | Provision | How it is applied in AML Shield
GwG §10 Abs. 3 | Customer due diligence for transactions ≥ €10,000 | Triggers is_above_ctr feature in the ML model; regulatory checker raises a MEDIUM-severity flag
GwG §43 Abs. 1 | Obligation to report suspicious transactions to the Financial Intelligence Unit (FIU) | The primary trigger for SAR_REQUIRED decisions; report is submitted via the goAML portal
GwG §47 Abs. 5 | Tipping-off prohibition: informing a customer of a SAR is a criminal offence | Explicit warning printed on every SAR output; UI prevents any customer-facing message referencing the case
GwG §17 | Criminal liability for under-reporting (Strafbarkeit) | Justifies the conservative default bias in the system prompt: "when uncertain between tiers, escalate higher"
FATF Rec. 16 | Wire transfer transparency: originator and beneficiary info must accompany cross-border payments | Missing/incomplete IBAN or counterparty data raises the risk score and triggers a regulatory flag
FATF Rec. 19 | Enhanced due diligence for transactions involving blacklisted jurisdictions | is_high_risk_country feature; Rule E: sanctions fast-track overrides all other processing
FATF Rec. 20 | Suspicious transactions must be reported regardless of amount | Overrides the €10,000 threshold; suspicious patterns below CTR are still escalated
EU 6AMLD Art. 18 | Expanded predicate offences and enhanced corporate liability | SAR narrative includes 6AMLD transposition language when predicate offence indicators are present
EU Reg. 2015/847 | Wire Transfer Regulation: information accompanying transfers of funds | Cross-border payment scrutiny layer; flags transfers lacking compliant originator records
EU MiCA 2023/1114 | Crypto-asset service provider obligations | Applied by the regulatory checker when transaction_type = crypto_exchange
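
As a rough illustration of how a few of these mappings could be encoded, the sketch below checks three of the rows against a transaction dictionary. The field names (amount_eur, counterparty_iban, transaction_type), the RuleHit type, and the check_rules function are hypothetical, and only the MEDIUM severity for GwG §10 Abs. 3 comes from the text; the other severities are placeholders.

```python
from dataclasses import dataclass

CTR_THRESHOLD_EUR = 10_000  # threshold cited for GwG §10 Abs. 3


@dataclass(frozen=True)
class RuleHit:
    citation: str
    severity: str
    reason: str


def check_rules(tx: dict) -> list:
    """Return the rules a transaction triggers (illustrative subset only)."""
    hits = []
    if tx["amount_eur"] >= CTR_THRESHOLD_EUR:
        # MEDIUM severity for this rule is stated in the mapping.
        hits.append(RuleHit("GwG §10 Abs. 3", "MEDIUM", "amount at or above €10,000"))
    if not tx.get("counterparty_iban"):
        # Placeholder severity: the source does not state one for this rule.
        hits.append(RuleHit("FATF Rec. 16", "MEDIUM", "missing counterparty IBAN"))
    if tx.get("transaction_type") == "crypto_exchange":
        # Placeholder severity: the source does not state one for this rule.
        hits.append(RuleHit("EU MiCA 2023/1114", "MEDIUM", "crypto-asset transaction"))
    return hits
```

Returning the citation alongside each hit keeps the statutory reference attached to the finding, so it can be carried into the reasoning chain unchanged.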
Detection Typologies

Money Laundering Patterns

The rule engine and ML features jointly encode four primary typologies. Each maps to one or more statutory obligations.

Structuring (Smurfing)

The is_near_ctr_threshold feature flags transactions in the €8,500–€9,999 band. Amounts in this range suggest deliberate positioning just below the €10,000 Cash Transaction Reporting (CTR) threshold to avoid GwG §10 Abs. 3 scrutiny. This is the second-highest SHAP contributor in the reference case (0.614).

High-Risk Jurisdiction Exposure

Country codes are extracted from BIC/IBAN and matched against two lists derived from EU Delegated Regulation 2016/1675:

FATF Blacklist (is_high_risk_country)

IR · KP · MM · SY · YE · AF

Iran, North Korea, Myanmar, Syria, Yemen, Afghanistan. Triggers the Rule E sanctions fast-track. Highest SHAP weight in the reference case: 0.821.

FATF Greylist (is_greylist_country)

PK · TR · ML · VN · MZ · TZ · JO

Pakistan, Turkey, Mali, Vietnam, Mozambique, Tanzania, Jordan. Elevated risk weight; does not trigger automatic SAR.
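
A minimal sketch of the country lookup, assuming the two lists above are held as plain sets. It relies on two facts about the identifiers: an IBAN begins with the two-letter country code, and characters 5 and 6 of a BIC carry it. The function names are illustrative.

```python
FATF_BLACKLIST = {"IR", "KP", "MM", "SY", "YE", "AF"}
FATF_GREYLIST = {"PK", "TR", "ML", "VN", "MZ", "TZ", "JO"}


def country_from_iban(iban: str) -> str:
    """The first two characters of an IBAN are the country code."""
    return iban.replace(" ", "")[:2].upper()


def country_from_bic(bic: str) -> str:
    """Characters 5-6 of a BIC are the country code."""
    return bic.strip()[4:6].upper()


def jurisdiction_flags(iban: str = "", bic: str = "") -> dict:
    """Prefer the IBAN when present, otherwise fall back to the BIC."""
    country = country_from_iban(iban) if iban else country_from_bic(bic)
    return {
        "country": country,
        "is_high_risk_country": int(country in FATF_BLACKLIST),
        "is_greylist_country": int(country in FATF_GREYLIST),
    }
```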

Temporal Anomalies

The is_night feature flags transactions between 22:00 and 06:00. Legitimate retail banking activity is strongly concentrated in business hours; early-morning timestamps correlate with automated layering scripts. SHAP weight in reference case: 0.392.

Network Graph Typologies

Detected via NetworkX traversal — these patterns are invisible to single-transaction screening.

Hub-and-Spoke

One account rapidly distributes funds to many receivers — the classic placement layer in a three-stage laundering scheme.

Rapid Layering

Funds traverse ≥3 hops in under 24 hours, deliberately obscuring the beneficial ownership trail.

Round-Trip Cycling

Funds return to the originating account after passing through one or more intermediaries — a classic integration indicator.

Fan-In Aggregation

Many accounts funnel into one — structuring across multiple originators to avoid individual reporting thresholds.
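
To make two of these patterns concrete, here is a small standard-library sketch rather than the platform's NetworkX code. It assumes transfers arrive as (sender, receiver, timestamp) tuples. The fan_counts helper covers hub-and-spoke and fan-in by counting distinct counterparties, and has_rapid_layering looks for a time-ordered chain of at least three hops inside 24 hours. Both function names and any cut-off applied to the counts are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def fan_counts(edges):
    """Count distinct receivers (fan-out) and distinct senders (fan-in) per account."""
    out_peers, in_peers = defaultdict(set), defaultdict(set)
    for sender, receiver, _ts in edges:
        out_peers[sender].add(receiver)
        in_peers[receiver].add(sender)
    fan_out = {acct: len(peers) for acct, peers in out_peers.items()}
    fan_in = {acct: len(peers) for acct, peers in in_peers.items()}
    return fan_out, fan_in


def has_rapid_layering(edges, start, min_hops=3, window=timedelta(hours=24)):
    """True if funds leaving `start` make >= min_hops time-ordered hops within `window`."""
    outgoing = defaultdict(list)
    for sender, receiver, ts in edges:
        outgoing[sender].append((receiver, ts))

    def walk(account, first_ts, last_ts, hops, visited):
        if hops >= min_hops:
            return True
        for nxt, ts in outgoing[account]:
            if nxt in visited:
                continue  # cycles are a separate typology (round-trip)
            t0 = ts if first_ts is None else first_ts
            in_order = last_ts is None or ts >= last_ts
            if in_order and ts - t0 < window:
                if walk(nxt, t0, ts, hops + 1, visited | {nxt}):
                    return True
        return False

    return walk(start, None, None, 0, {start})
```

The time-window constraint is the reason for a custom traversal here: a plain reachability query would report the chain even if the hops were weeks apart.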

Machine Learning

XGBoost Classifier

The risk scoring model is an XGBoost binary classifier trained to distinguish legitimate transactions from suspicious ones. It is trained on the IBM AMLworld dataset (NeurIPS 2023), a six-file CSV collection covering higher- and lower-illicit-ratio (HI/LI) variants at small, medium, and large sizes. When real data is unavailable, the pipeline falls back to generating 2,000 synthetic transactions.

Training Data

Synthetic Fallback: 2,000 Transactions

Legitimate: 1,400 (70%). Card payments, internal transfers, wire transfers. Low-risk countries, business hours, amounts €10–€5,000.
Suspicious: 600 (30%). Mixed structuring bands (€8,500–€9,999), high-risk countries, night timestamps, crypto exchanges.

IBM AMLworld is preferred when available in data/. The pipeline loads up to 3 CSV files (≤5,000 rows each), maps IBM payment format labels to the internal transaction type schema, and validates that the positive rate exceeds 1% before training.
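
The fallback and the positive-rate check might look roughly like the sketch below. It is a simplification under stated assumptions: every suspicious row carries all of the risk signals at once, whereas the description above says they are mixed; DE stands in for "low-risk country"; and the row fields and function names are hypothetical.

```python
import random


def make_synthetic(n=2000, suspicious_share=0.30, seed=42):
    """Generate labelled synthetic rows roughly matching the described profile."""
    rng = random.Random(seed)
    n_suspicious = round(n * suspicious_share)
    rows = []
    for i in range(n):
        suspicious = i < n_suspicious
        if suspicious:
            row = {
                "amount_eur": round(rng.uniform(8500, 9999), 2),
                "hour_of_day": rng.choice([22, 23, 0, 1, 2, 3, 4, 5]),
                "country": rng.choice(["IR", "KP", "MM", "SY", "YE", "AF"]),
                "transaction_type": rng.choice(["wire", "crypto_exchange"]),
            }
        else:
            row = {
                "amount_eur": round(rng.uniform(10, 5000), 2),
                "hour_of_day": rng.randint(9, 17),
                "country": "DE",  # assumption: stands in for any low-risk country
                "transaction_type": rng.choice(["card", "internal", "wire"]),
            }
        row["label"] = int(suspicious)
        rows.append(row)
    rng.shuffle(rows)
    return rows


def positive_rate_ok(rows, min_rate=0.01):
    """Pre-training check: the share of positive labels must exceed min_rate."""
    positives = sum(r["label"] for r in rows)
    return positives / len(rows) > min_rate
```

Fixing the seed keeps the fallback set reproducible between training runs, which matters if model versions are compared in an audit.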

14 Engineered Features

All features are derived programmatically in models/features.py from raw transaction fields. No manual labelling is required.

Amount: amount_log, amount_eur, is_near_ctr_threshold, is_above_ctr, is_round_number
Time: hour_of_day, is_night, is_weekend
Geographic: is_cross_border, is_high_risk_country, is_greylist_country
Transaction Type: transaction_type_wire, transaction_type_crypto, transaction_type_cash
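
A sketch of how the 14 features could be derived from a raw transaction. This is not the contents of models/features.py: the input field names, the use of log1p for amount_log, the €1,000-multiple definition of a round number, and the use of the receiver country for the geographic flags are all assumptions. The structuring band is treated as [8,500, 10,000) so that no amount falls between the two threshold flags.

```python
import math
from datetime import datetime

FATF_BLACKLIST = {"IR", "KP", "MM", "SY", "YE", "AF"}
FATF_GREYLIST = {"PK", "TR", "ML", "VN", "MZ", "TZ", "JO"}


def engineer_features(tx: dict) -> dict:
    """Derive the 14 model features from raw transaction fields."""
    amount = tx["amount_eur"]
    ts = tx["timestamp"]
    ttype = tx["transaction_type"]
    dest = tx["receiver_country"]
    return {
        "amount_log": math.log1p(amount),              # assumption: log(1 + amount)
        "amount_eur": amount,
        "is_near_ctr_threshold": int(8500 <= amount < 10000),
        "is_above_ctr": int(amount >= 10000),
        "is_round_number": int(amount % 1000 == 0),    # assumption: multiples of €1,000
        "hour_of_day": ts.hour,
        "is_night": int(ts.hour >= 22 or ts.hour < 6),  # 22:00 to 06:00
        "is_weekend": int(ts.weekday() >= 5),
        "is_cross_border": int(tx["sender_country"] != dest),
        "is_high_risk_country": int(dest in FATF_BLACKLIST),
        "is_greylist_country": int(dest in FATF_GREYLIST),
        "transaction_type_wire": int(ttype == "wire"),
        "transaction_type_crypto": int(ttype == "crypto_exchange"),
        "transaction_type_cash": int(ttype == "cash"),
    }
```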

Model Hyperparameters

Parameter | Value | Notes
n_estimators | 200 | Boosting rounds; capped by early stopping
max_depth | 6 | Sufficient for feature interactions without overfitting
learning_rate | 0.10 | Conservative shrinkage; balances bias/variance
scale_pos_weight | auto | = negatives / positives; corrects class imbalance
eval_metric | AUC | Optimises ranking, not accuracy; better for imbalanced data
early_stopping | 20 | Stops if AUC on held-out test set doesn't improve for 20 rounds
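
The "auto" value for scale_pos_weight is simply the class ratio, which the sketch below computes for the synthetic fallback split. The dictionary keys follow the parameter names of XGBoost's scikit-learn wrapper; in recent XGBoost releases such a dictionary can be unpacked into xgboost.XGBClassifier(**params), but the function name and the way the platform actually passes these values are assumptions.

```python
def build_xgb_params(n_negative: int, n_positive: int) -> dict:
    """Assemble the listed hyperparameters, deriving scale_pos_weight from the data."""
    if n_positive == 0:
        raise ValueError("no positive examples; cannot balance classes")
    return {
        "n_estimators": 200,
        "max_depth": 6,
        "learning_rate": 0.10,
        "scale_pos_weight": n_negative / n_positive,  # negatives / positives
        "eval_metric": "auc",
        "early_stopping_rounds": 20,
    }


# Synthetic fallback: 1,400 legitimate vs 600 suspicious.
params = build_xgb_params(n_negative=1400, n_positive=600)
print(round(params["scale_pos_weight"], 3))  # 2.333
```

On the IBM AMLworld data the positive class is far rarer, so the same formula would yield a much larger weight.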

SHAP Explainability

Every prediction is accompanied by SHAP (SHapley Additive exPlanations) values, a game-theoretic method that assigns each feature a contribution to the final score. This is intended to support EU AI Act interpretability requirements and gives compliance officers an auditable explanation for every automated decision.

Reference Case: Iran Wire Transfer

A €9,750 international wire from Germany to Iran, timestamped 02:34. The model scores this as SAR_REQUIRED. The SHAP breakdown shows which features drove the decision:

Feature | SHAP contribution
is_high_risk_country | +0.821
is_near_ctr_threshold | +0.614
is_night | +0.392
is_cross_border | +0.280
transaction_type_wire | +0.180
account_age_days | −0.120

Positive contributions increase risk; the negative contribution mitigates it. The account age signal partially offsets the other factors: an older, established account is modestly less suspicious than a recently opened one, all else equal.
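
SHAP values are additive: the model output equals a base value plus the sum of the per-feature contributions. The sketch below ranks the six listed contributions and sums them. It assumes the values are in log-odds space, which is the usual default when explaining an XGBoost classifier with a tree explainer, and it treats the base value as a free parameter because the source does not give one. The six values may also not be the complete attribution.

```python
import math

contributions = {
    "is_high_risk_country": 0.821,
    "is_near_ctr_threshold": 0.614,
    "is_night": 0.392,
    "is_cross_border": 0.280,
    "transaction_type_wire": 0.180,
    "account_age_days": -0.120,
}


def rank(contribs: dict) -> list:
    """Order features by the magnitude of their contribution."""
    return sorted(contribs.items(), key=lambda kv: abs(kv[1]), reverse=True)


def probability(base_value: float, contribs: dict) -> float:
    """Additivity: margin = base value + sum of SHAP values, then a sigmoid."""
    margin = base_value + sum(contribs.values())
    return 1.0 / (1.0 + math.exp(-margin))


total = sum(contributions.values())  # net push of the six listed features
print(round(total, 3))  # 2.167
```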

SAR Workflow

Suspicious Activity Reporting

SAR reports follow the BaFin GwG format required for FIU submission via the goAML portal. Each report receives an auto-generated ID in the format SAR-YYYYMMDD-TXID and enters DRAFT status pending compliance officer sign-off.
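
A minimal sketch of the ID format and initial status, assuming the TXID part is the transaction identifier copied through unchanged. The SarReport class and its fields are illustrative.

```python
from dataclasses import dataclass, field
from datetime import date


def make_sar_id(tx_id: str, filed_on: date) -> str:
    """Build an ID of the form SAR-YYYYMMDD-TXID."""
    return f"SAR-{filed_on:%Y%m%d}-{tx_id}"


@dataclass
class SarReport:
    tx_id: str
    filed_on: date
    status: str = "DRAFT"            # every report starts as a draft
    sar_id: str = field(init=False)  # derived, never supplied by the caller

    def __post_init__(self):
        self.sar_id = make_sar_id(self.tx_id, self.filed_on)


report = SarReport("TX12345", date(2025, 1, 15))
print(report.sar_id, report.status)  # SAR-20250115-TX12345 DRAFT
```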

01
Agent drafts SAR narrative

Claude compiles transaction facts, SHAP attributions, network findings, and triggered regulatory rules into the standardised BaFin template — with inline statutory citations at every assertion.

02
Compliance officer review

Report surfaces in the case queue as DRAFT. Designated officer reviews findings, may annotate, and approves or rejects the SAR before submission.

03
goAML submission

The approved SAR is forwarded to the Financial Intelligence Unit (FIU) through goAML. Submission timestamp and portal reference number are written to the audit trail.

04
Tipping-off safeguard

GwG §47 Abs. 5 is enforced at the UI level — no customer-facing communication may reference the SAR case. The prohibition is printed on every report as a statutory notice.

Statutory notice on every SAR output: "Alerting the customer about this SAR filing constitutes a criminal offence under GwG §47 Abs. 5."
Under the Hood — AI Engine

The ReAct Agent

The platform's decision engine is built on the ReAct (Reasoning + Acting) pattern — a framework where a language model alternates between thinking about what to do and calling tools to do it. Each tool result feeds back into the model's context, allowing it to update its risk assessment before deciding the next step. AML Shield's loop runs up to 10 iterations and uses Claude Haiku via the Anthropic tool-use API.

Step 1: Think. Analyze the transaction and pick the next tool.
Step 2: Act. Call the tool with precise inputs.
Step 3: Observe. Read the result and update the assessment.
Repeat up to 10×.

The entire chain — every thought, every tool call, every result — is stored in reasoning_chain and written to an immutable audit trail. The whole reasoning process is available for regulatory examination.
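
The loop can be sketched without any model call by passing in a think function that stands in for Claude. Everything here is illustrative: the run_react name, the shape of the thought dictionary, and the stubbed tool are assumptions, and only the 10-iteration cap and the reasoning_chain record come from the description above.

```python
from typing import Callable

MAX_ITERATIONS = 10


def run_react(tx: dict, think: Callable, tools: dict) -> dict:
    """Think -> Act -> Observe until a final decision is returned or the cap is hit."""
    reasoning_chain = []
    observations = []
    for step in range(1, MAX_ITERATIONS + 1):
        thought = think(tx, observations)  # stands in for the model call
        entry = {"step": step, "thought": thought["text"]}
        if thought.get("final") is not None:
            entry["final"] = thought["final"]
            reasoning_chain.append(entry)
            return {"decision": thought["final"], "reasoning_chain": reasoning_chain}
        result = tools[thought["tool"]](**thought["args"])                # Act
        observations.append({"tool": thought["tool"], "result": result})  # Observe
        entry.update(tool=thought["tool"], args=thought["args"], result=result)
        reasoning_chain.append(entry)
    return {"decision": None, "reasoning_chain": reasoning_chain}


def stub_think(tx, observations):
    """Hypothetical policy: score first, then decide."""
    if not observations:
        return {"text": "Score the transaction first.",
                "tool": "transaction_risk_scorer",
                "args": {"amount_eur": tx["amount_eur"]}}
    return {"text": "Score is in the 60-79 band.", "final": "ESCALATE"}


tools = {"transaction_risk_scorer": lambda amount_eur: {"risk_score": 72}}
out = run_react({"amount_eur": 9750}, stub_think, tools)
print(out["decision"])  # ESCALATE
```

Because every iteration appends to reasoning_chain before the next one starts, the record is complete even when the loop stops at the cap without a decision.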

The Five Tools

Claude has exactly five registered tools, called in a default sequence that the conditional branching rules can override.

transaction_risk_scorer
Runs the XGBoost model. Returns risk_score 0–100, confidence interval, and SHAP feature attributions. Always called first.
entity_network_analyzer
NetworkX graph traversal around sender/receiver accounts. Detects structuring, layering, fan-out/in, and cycle patterns. Default depth 2; depth 3 when score > 80; recursive if flagged connections returned.
Rule A: skipped if score < 30 and domestic
Rule B: depth=3 if score > 80
Rule C: recursive if flagged_connections non-empty
regulatory_rule_checker
Checks GwG, FATF 40 Recommendations, EU 6AMLD, Wire Transfer Reg. 2015/847, and MiCA. Returns triggered rules with severity and exact statutory citations.
sar_report_generator
Generates a BaFin/GwG-format SAR for goAML submission. Only invoked at score ≥ 80 or confirmed sanctions match. Auto-populates narrative and appends GwG §47 Abs. 5 tipping-off warning.
Rule E: immediate if sanctions match detected
case_escalation_decider
Final arbiter. Accepts risk score, network risk tier, triggered rule count, and Claude's reasoning summary. Returns decision, case priority, SLA hours, and compliance queue. Always called last.

Conditional Branching Rules

The system prompt hardcodes five rules that override the default tool sequence — encoding the same proportionality judgments a senior compliance officer would apply intuitively.

Rule A — Low Risk Shortcut
IF score < 30 AND domestic THEN skip network analysis

Domestic low-risk transactions don't warrant graph traversal. Cuts latency for the majority of legitimate payments.

Rule B — Deep Network Analysis
IF score > 80 THEN network depth = 3

High-risk transactions require deeper traversal to surface layering across three degrees of separation.

Rule C — Recursive Investigation
IF flagged_connections not empty THEN re-run network on flagged account

One hop is insufficient — flagged accounts must be investigated to their source.

Rule D — Grey Zone Analysis
IF 40 ≤ score ≤ 60 THEN document FOR / AGAINST before deciding

Ambiguous cases require documented balanced analysis — a formal pro/con for the audit trail before any escalation decision.

Rule E — Sanctions Fast-Track
IF sanctions match detected THEN SAR immediately, skip remaining tools

Sanctions matches are per-se SAR events under EU 6AMLD Art. 18 — no further analysis needed or permitted.

Conservative Default
IF uncertain between tiers THEN escalate higher

Under GwG §17, under-reporting carries criminal liability. The system prompt instructs Claude to default upward when in doubt.
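
In the platform these rules live in the system prompt and are followed by the model, but their effect on the tool sequence can be illustrated deterministically. The sketch below is such an illustration, with hypothetical function names. It takes the score as an input, as if the scorer had already run, and leaves out Rule C because recursion depends on what the network tool returns.

```python
def plan_tools(score: float, is_domestic: bool, sanctions_match: bool = False) -> list:
    """Return (tool, kwargs) pairs in call order, applying Rules A, B, and E."""
    if sanctions_match:
        # Rule E: SAR immediately, skip remaining tools.
        return [("sar_report_generator", {})]
    plan = [("transaction_risk_scorer", {})]
    if not (score < 30 and is_domestic):   # Rule A: skip network analysis
        depth = 3 if score > 80 else 2     # Rule B: deeper traversal above 80
        plan.append(("entity_network_analyzer", {"depth": depth}))
    plan.append(("regulatory_rule_checker", {}))
    if score >= 80:                        # SAR generator only at score >= 80
        plan.append(("sar_report_generator", {}))
    plan.append(("case_escalation_decider", {}))
    return plan


def needs_for_against(score: float) -> bool:
    """Rule D: grey-zone scores need a documented FOR / AGAINST analysis."""
    return 40 <= score <= 60
```

Note the boundary at exactly 80: the SAR generator is invoked (score ≥ 80) while the network depth stays at 2 (Rule B requires score > 80).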

System Prompt Architecture

The system prompt is divided into eight sections and treated as a legal instrument — the source file header reads: DO NOT modify regulatory citations in this file.

§1
Role Definition
Positions the agent as a licensed compliance officer. Sets accountability framing: "Every decision you make carries legal weight."
§2
Regulatory Framework
Exhaustive citation list: GwG §10, §17, §43, §47; FATF Rec. 16/19/20/29; EU 6AMLD; Wire Transfer Reg. 2015/847; MiCA 2023/1114.
§3
ReAct Protocol
Defines the Think → Act → Observe loop. Requires Claude to explicitly state what it learned from each tool result before deciding the next action.
§4
Tool Calling Order
Default sequence: scorer → network → rules → SAR → decider. Explicitly marked as overridable by the conditional rules in §5.
§5
Conditional Branching
Rules A through E. Prefixed "CRITICAL — You MUST follow these rules" to ensure reliable adherence.
§6
Decision Thresholds
Score-to-decision mapping: 0–29 CLEAR, 30–59 WATCHLIST, 60–79 ESCALATE, 80–100 SAR_REQUIRED, with SLA hours per tier.
§7
Output Format
Mandates a structured output block (DECISION, Risk Score, Key Findings, Regulatory Basis, Reasoning) for downstream parsing and audit logging.
§8
Behavioural Rules
Hard prohibitions: never fabricate tool results, always cite specific articles, never alert the customer (GwG §47 Abs. 5), escalate to the higher tier when uncertain.
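
The score-to-decision mapping in §6 and the conservative default can be sketched as a lookup over tier floors. The function names are illustrative, and SLA hours are left out because the values are not listed here.

```python
from bisect import bisect_right

_TIER_FLOORS = [30, 60, 80]  # lower bounds of WATCHLIST, ESCALATE, SAR_REQUIRED
_TIERS = ["CLEAR", "WATCHLIST", "ESCALATE", "SAR_REQUIRED"]


def decision_for(score: float) -> str:
    """Map a 0-100 risk score to its decision tier."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return _TIERS[bisect_right(_TIER_FLOORS, score)]


def conservative(tier_a: str, tier_b: str) -> str:
    """When uncertain between two tiers, pick the higher one."""
    return max(tier_a, tier_b, key=_TIERS.index)


print(decision_for(29), decision_for(30), decision_for(80))  # CLEAR WATCHLIST SAR_REQUIRED
```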

AML Shield is in active development. Architecture and regulatory mappings are subject to change as the design is validated against production compliance requirements.