Penetration test

AI and LLM pentesting: we put your artificial intelligence to the test like an attacker

AI and LLM pentesting, what many call an artificial intelligence security audit, attacks the systems you put into production (chatbots, assistants, agents and the models you have behind them) looking for how to trick them, get data out of them that they should not give or make them carry out actions they should not. It is a new field, with its own rules, where the attacker does not break in by force: they convince the model. And where almost no consultancy steps in yet.

The offensive arm of your AI: you govern it with the AI Act and ISO 42001, and here we put it to the test.

Why AI

The attacker no longer breaks in: they convince

Companies are putting AI into production faster than they secure it. And traditional security does not work here: an AI system is not attacked through its ports or its code, it is attacked by talking to it. It is tricked with hidden instructions, information it should not give is extracted from it and, if it has tools connected, it is made to act on your behalf.

And what changes the risk is that your AI is not alone: it has access to your data and, increasingly, to your systems. A tricked model is like an employee with keys who has been convinced to open the door. That is why putting it to the test is no longer optional.

Scope

What we put to the test

Three fronts, depending on what your AI does: whether it only talks, whether it reads your documents or whether it acts for you. The more it can do, the more it needs to be tested.

LLM applications

Chatbots, assistants and copilots that converse with your users and your data. The most widespread front and the way in for almost every attack.

direct prompt injectionindirect injectionjailbreakdata leakageinsecure output

RAG and knowledge bases

The model that responds by reading your documents. If the source is poisoned, the model responds with whatever the attacker wants, wearing your face and your trust.

source poisoningcontext manipulationleakage through retrieval

Agents, tools and MCP

When AI does not just respond, but acts: it carries out actions, uses tools and connects through MCP. Here a flaw stops being an improper response and becomes a real action in your systems.

tool abuseexcessive permissionsMCP server securityconfused deputy

Why Meta-Data

Few govern your AI. Even fewer attack it

Almost nobody does offensive AI security, and of those who do, almost nobody also understands how it is governed. We are at the intersection: we help companies comply with the AI Act and certify ISO 42001, and we attack those same systems to prove whether they hold up. The same firm that governs your AI puts it to the test.

And we do not improvise: we work with the reference frameworks for attacking AI, the OWASP Top 10 for LLM and MITRE ATLAS. Every finding becomes direct evidence for your AI Act and your ISO 42001, so governance and attack stop going their separate ways.

When

When you need to put your AI to the test

Before putting it into production

You are about to launch a chatbot, an assistant or an agent and you want to know how it would be broken before opening it to the world.

Your AI touches data or systems

The model accesses sensitive information or tools that carry out actions, where a trick turns into real harm.

The AI Act or ISO 42001 applies to you

You need evidence that your AI systems are secure for your AI Act or your ISO 42001.

You use agents or MCP

You have connected the model to tools or to MCP servers, and now your AI can act, not just respond.

Method

How we work

01

Scope and rules

We agree on which AI systems are in scope, what data and tools they reach and the rules, so we can attack freely and without risk.

02

Attack

Prompt injection, data extraction and abuse of tools and agents, just as a real adversary would do with your model.

03

Findings with proof

Every demonstrated flaw, with its business impact and how to close it, in language your team understands.

04

Verification

When you fix it, we test it again to confirm that the hole is really closed.

Fits with

It does not end with the report

An application with AI is also an application, so application pentesting covers its classic part and this one handles what is specific to the model. And since what we find proves real risks, it becomes evidence for your AI Act and your ISO 42001: the same work governs and attacks your AI.

And what we uncover here, with Sondriva, our SOC, we monitor afterwards: we detect abuse attempts against your AI in real time, while your team closes the flaws.

Questions

Frequently asked questions

What is AI or LLM pentesting?+

It means putting your artificial intelligence systems to the test by attacking them as a real adversary would: tricking the model with hidden instructions, getting data out of it that it should not give and, if it has tools connected, making it carry out actions for you. Unlike classic pentesting, here you do not break in by force, you convince the model.

Is this the same as an AI security audit?+

Yes, it is its offensive version. An AI security audit reviews and checks; we go further and attack: we do ethical hacking on your artificial intelligence to prove with evidence which flaws are really exploited. The name changes depending on who asks for it, AI pentesting, security audit or AI red teaming, but the work is the same: putting it to the test as a real adversary would.

What is a prompt injection?+

It is the flagship technique against LLM: slipping in instructions that the model obeys as if they came from its owner. It can be direct, in what the user writes, or indirect, hidden in a document, a website or an email that the model reads. With it, the model is made to ignore its rules, reveal data or misuse its tools.

Do you test AI agents and MCP servers?+

Yes, and it is one of the most important things right now. When AI does not just respond but acts, using tools, calling APIs or connecting through MCP, a flaw stops being an improper response and becomes a real action in your systems. We test tool abuse, agents with too many permissions and the security of MCP servers.

What is RAG and why is it attacked?+

RAG is when the model responds by reading your documents or your knowledge base. If an attacker manages to insert content into that source, they manipulate what the model retrieves and, with it, what it responds. We test source poisoning and data leakage through the retrieval path.

Do you follow any reference framework?+

Yes. We rely on the OWASP Top 10 for LLM and on MITRE ATLAS, which are the reference catalogs of attack techniques against AI. They give us a common base, but the interesting part is usually in how your specific system fits together.

Is it useful for the AI Act or ISO 42001?+

Yes, and it is one of its biggest advantages. The findings prove real risks in your AI systems, so they count as evidence for your AI Act compliance and for your ISO 42001. The same AI you govern with those standards, here you put to the test: governance and attack complement each other.

How does it differ from a normal application pentest?+

An application with AI is also an application, so application pentesting still applies to its classic part. What is new is the model: its behavior is not fixed code, it responds to language, and that opens up attacks that do not exist in traditional software. That is why it needs its own approach that adds to the application one.

Is it safe to do this on an AI in production?+

We agree on it beforehand and work carefully, just as in any test. When there is a risk of affecting real data or operations, we use an equivalent environment. The priority is to prove the flaw without causing harm.

Direct line

Shall we talk?

Tell us which AI system you want to put to the test, a chatbot, an assistant or an agent, and we will propose how to attack it safely.

Get in touch