Keras and TensorFlow let you save executable logic as part of the model graph. That logic runs every time you call predict, and because it is built from standard operators rather than exploits, it slips past security tools.

The pickle attack fires when you load. This one is patient. It waits in the model graph and fires when you run inference, which means it can sit dormant through your loading, your validation, even your first few test predictions, and then execute on real traffic.

Two formats make this easy, and in both cases the attacker is not exploiting a bug. They are using the format exactly as designed.

The Keras Lambda layer

Keras has a layer type called Lambda whose entire purpose is to let you drop an arbitrary Python function into a model. That is genuinely useful when you are building something custom. The catch is that when you save the model, the function comes along, serialised as bytecode inside the model config. When somebody else loads the model, that bytecode is deserialised and the function becomes a live part of the graph. Whatever the attacker wrote in the body of that function now runs on the next forward pass.

So a model that looks like a clean image classifier can carry a Lambda layer whose function quietly opens a file, calls out to a network, or shells out, every time it processes an input. The classification still works. The side effect rides along.
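One way to see whether a model carries a Lambda layer at all is to read its config without loading it. The sketch below assumes the model is a `.keras` archive, which is a zip file containing a `config.json`, and that layers appear as nested dictionaries with a `class_name` key. The function names and the demo config are illustrative, not part of Keras.

```python
import io
import json
import zipfile


def find_lambda_layers(config):
    """Recursively collect the names of layers whose class_name is 'Lambda'."""
    found = []
    if isinstance(config, dict):
        if config.get("class_name") == "Lambda":
            inner = config.get("config")
            name = inner.get("name") if isinstance(inner, dict) else None
            found.append(name or "<unnamed>")
        for value in config.values():
            found.extend(find_lambda_layers(value))
    elif isinstance(config, list):
        for item in config:
            found.extend(find_lambda_layers(item))
    return found


def lambda_layers_in_archive(path_or_file):
    """Read config.json from a .keras zip archive without loading the model."""
    with zipfile.ZipFile(path_or_file) as archive:
        config = json.loads(archive.read("config.json"))
    return find_lambda_layers(config)


# Toy config standing in for a real model (hypothetical layer names).
demo_config = {
    "class_name": "Sequential",
    "config": {"layers": [
        {"class_name": "Dense", "config": {"name": "dense"}},
        {"class_name": "Lambda", "config": {"name": "postprocess", "function": {}}},
    ]},
}

# Build an in-memory archive so the example runs without a model file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config.json", json.dumps(demo_config))
buffer.seek(0)

print(lambda_layers_in_archive(buffer))  # ['postprocess']
```

Finding a Lambda this way only tells you that one exists. It says nothing yet about what its function does.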

There is a quieter variant too. Keras lets a config reference a callable by structure rather than by source text: a class name, a module, a function name, assembled into a call at load time. That form carries no readable Python in the file at all, which is exactly why it slips past anything that only greps for suspicious strings. A reference to a module and a function name is enough to resolve to something dangerous, and it does so even with the framework's own safe-loading switch turned on. We mention this because it is a good example of why string matching is not detection. The danger is in what the reference resolves to, not in whether the file happens to spell out the word "system".
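A check that follows this reasoning looks at where each reference points rather than at the text of the file. The sketch below assumes serialised objects appear in the config as dictionaries carrying both a `module` and a `class_name` key. The allowlist, the helper names, and the `thirdparty.tools` reference are hypothetical and chosen only for illustration.

```python
# Hypothetical allowlist of modules a config may reference; extend deliberately.
ALLOWED_MODULES = {"keras", "keras.layers", "keras.models"}


def collect_references(config):
    """Yield (module, class_name) for every dict that names both."""
    if isinstance(config, dict):
        module = config.get("module")
        class_name = config.get("class_name")
        if isinstance(module, str) and isinstance(class_name, str):
            yield module, class_name
        for value in config.values():
            yield from collect_references(value)
    elif isinstance(config, list):
        for item in config:
            yield from collect_references(item)


def disallowed_references(config, allowed_modules=ALLOWED_MODULES):
    """Return references whose module is not explicitly allowed."""
    return sorted({
        (module, name)
        for module, name in collect_references(config)
        if module not in allowed_modules
    })


# Toy config: no suspicious word appears anywhere, yet one reference
# points outside the allowlist (module and class names are made up).
demo_config = {
    "module": "keras",
    "class_name": "Sequential",
    "config": {"layers": [
        {"module": "keras.layers", "class_name": "Dense", "config": {"units": 4}},
        {"module": "thirdparty.tools", "class_name": "Runner", "config": {}},
    ]},
}

print(disallowed_references(demo_config))  # [('thirdparty.tools', 'Runner')]
```

An allowlist is used instead of a denylist because the set of dangerous targets is open-ended. Even so, this is coarse: an allowed module can still expose a callable you would not want invoked at load time, so a tighter version would allow specific names rather than whole modules.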

TensorFlow's standard, dangerous operators

TensorFlow SavedModel is a code-inclusive format. The computation graph is a protobuf, and it can be loaded and run without the original training code. That is convenient for deployment and it means the graph itself can carry behaviour.

A handful of perfectly standard operators are the problem. There are operators that execute an arbitrary Python callback as part of the graph. There is an operator that runs shell commands. There are operators that read and write files on the host. None of these are exploits. They are documented parts of TensorFlow, intended for legitimate uses. But an attacker can place them in a graph to get code execution, a shell, credential theft, or file tampering at inference time, and because they are standard ops, tools built to look for "malware" do not flag them. There is no malformed structure to catch. The model is using the framework as intended, for an unintended end.
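Flagging these operators comes down to listing the op types in a graph and grouping them by the capability they grant. The sketch below is a minimal version under stated assumptions: the two op sets are illustrative and far from complete, and they leave out the shell operator because its op name is not given here. `classify_ops` is plain Python. `op_types_in_saved_model` needs TensorFlow installed and only parses the protobuf, it does not run the graph.

```python
# Illustrative, incomplete sets of TensorFlow op type names.
PYTHON_CALLBACK_OPS = {"PyFunc", "PyFuncStateless", "EagerPyFunc"}
FILESYSTEM_OPS = {"ReadFile", "WriteFile", "MatchingFiles"}


def classify_ops(op_types):
    """Group op type names by the capability they grant."""
    findings = {"python_callback": set(), "filesystem": set()}
    for op in op_types:
        if op in PYTHON_CALLBACK_OPS:
            findings["python_callback"].add(op)
        elif op in FILESYSTEM_OPS:
            findings["filesystem"].add(op)
    # Drop empty groups so a clean graph yields an empty dict.
    return {kind: sorted(ops) for kind, ops in findings.items() if ops}


def op_types_in_saved_model(pb_path):
    """Yield every op type found in a saved_model.pb file.

    Requires TensorFlow. The import is deferred so the rest of this
    sketch runs without it.
    """
    from tensorflow.core.protobuf import saved_model_pb2

    saved_model = saved_model_pb2.SavedModel()
    with open(pb_path, "rb") as handle:
        saved_model.ParseFromString(handle.read())
    for meta_graph in saved_model.meta_graphs:
        graph_def = meta_graph.graph_def
        for node in graph_def.node:
            yield node.op
        # Ops can also live inside the graph's function library.
        for function in graph_def.library.function:
            for node in function.node_def:
                yield node.op


# Hand-written op list standing in for a real graph.
print(classify_ops(["Conv2D", "Relu", "PyFunc", "ReadFile", "Softmax"]))
# {'python_callback': ['PyFunc'], 'filesystem': ['ReadFile']}
```

With a real model, the call would look like `classify_ops(op_types_in_saved_model("saved_model.pb"))`, where the path is a placeholder.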

This is the uncomfortable theme of the whole class: the most durable attacks do not look like attacks. They look like features.

Why detection is awkward here

Two things make this harder than the pickle case.

First, the dangerous logic is encoded, not written. A Lambda body is compiled bytecode, not Python you can read. Catching it means reconstructing what that bytecode actually does, which is more involved than scanning text, and when reconstruction is not fully possible you have to make a judgement call about an unverifiable blob of executable logic in a file from someone you do not know. Our position is that an unverifiable Lambda in an untrusted model is suspicious by default. Treating "I cannot read it" as "it is probably fine" is how things get missed.

Second, plenty of real models use custom layers and legitimate operators for legitimate reasons. So the same false-positive discipline from the pickle post applies. The goal is to flag the operators and structures that grant code execution or filesystem access, while leaving the ordinary custom layer that just does some math alone. Getting that balance right is the work. A scanner that flags every Lambda is not protecting anyone; it is just retraining its users to ignore it.
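Both points can be put into one small triage routine. It decodes a Lambda body, lists the names its bytecode refers to, and returns one of three verdicts. The sketch assumes the body is a base64-encoded `marshal` dump of a Python code object. The watchlist, the verdict labels, and the helper names are hypothetical. Two caveats matter in practice. Python's documentation warns against unmarshalling untrusted data, so this belongs in a sandbox. And marshal output is specific to the Python version, so a blob written by a different interpreter may fail to decode and land in the unverifiable bucket.

```python
import base64
import marshal
import types

# Hypothetical watchlist. Referenced names are a first signal only;
# a fuller scanner would reason about what the bytecode actually does.
WATCHED_NAMES = {"os", "subprocess", "socket", "system", "popen",
                 "open", "eval", "exec", "__import__"}


def referenced_names(code):
    """Collect global and attribute names used by a code object and nested ones."""
    names = set(code.co_names)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= referenced_names(const)
    return names


def triage_lambda(encoded):
    """Return 'unverifiable', 'dangerous', or 'benign' for an encoded body."""
    try:
        code = marshal.loads(base64.b64decode(encoded))
    except Exception:
        # Cannot reconstruct it: suspicious by default, never "probably fine".
        return "unverifiable"
    if not isinstance(code, types.CodeType):
        return "unverifiable"
    return "dangerous" if referenced_names(code) & WATCHED_NAMES else "benign"


def encode(func):
    """Mimic the assumed serialisation: marshal the code object, then base64 it."""
    return base64.b64encode(marshal.dumps(func.__code__)).decode("ascii")


def doubled(x):
    return x * 2


def side_effect(x):  # never called here; only its bytecode is inspected
    import os
    os.system("true")
    return x


print(triage_lambda(encode(doubled)))      # benign
print(triage_lambda(encode(side_effect)))  # dangerous
print(triage_lambda(base64.b64encode(marshal.dumps(42)).decode("ascii")))  # unverifiable
```

The three-way verdict is the design choice that matters. The layer that just does some math passes quietly, the one that reaches for the operating system is flagged, and anything that cannot be decoded is treated as suspicious rather than waved through.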

What to do about it

If you are publishing models, you almost never need a Lambda layer or a file-touching operator baked into a graph you ship to others. Keep custom logic in code that ships separately and is reviewed, not welded into the artifact.

If you are consuming models, be especially wary of these formats from unfamiliar authors, because the payload here survives the load step and waits. A model that passed a quick smoke test is not cleared. The trigger might simply not have come up yet.

Inference-Time Logic: when the backdoor is a layer
