AI Assessment Scoping

Download Scoping Document Read the methodology

Complete the form below or click ‘Download’ to save a copy and fill it in at your convenience. Once completed, please send it to sales@cyberalchemy.co.uk.

Provide company name, your name, email and job title.

First name*

Last name*

Job Title

Email Address*

What is the name of the AI, and what is its primary job?

AI Name*

AI Purpose*

For example: CustomerSupportBot that handles returns and FAQs

What type of AI model is used?

For example: LLM, computer vision, reinforcement learning, etc.

Where does the AI model live? Please specify if you are using a third-party service or hosting the model on your own servers.

For example, third-party (OpenAI, Anthropic, etc), Cloud Platform (Azure OpenAI, AWS Bedrock, etc) or self-hosted (Llama 3)

Does the model utilise complex system instructions or “personas” to govern its behaviour?

Note: We do not need the actual prompts yet, just an estimation of their length and complexity. For example, a distinct 5-page rule set or a simple instruction to be helpful.

Does the AI have access to a knowledge base or internal documents to answer questions?

Please also describe the sensitivity of this data. For example: Public documentation or PII/Financial records.

Can the AI “do” things? Does it have access to external tools, plugins, or APIs?

For example: Can it send emails, query an SQL database, search the live internet, or execute code?

How will our testers interact with the AI?

For example: Web Chat UI, REST API, Mobile App, or file upload portal

Are there existing security layers in place?

For example: Admin, Basic User, etc.

Is the AI publicly accessible, or does it require a login to access? If behind a login, are there different user roles that need to be tested?

What sensitive data (if any) does the model have access to? And how should access to this data be restricted? E.g Via user accounts, etc

For example, the model has access to personal data for each organisation. Organisation A should not be able to access organisation B’s data

Is there a human review process (human-in-the-loop – HITL) before the AI’s output is shown to the user, or is the generation fully automated?

What are the primary business objectives driving this test?

Is there a deadline for completing the testing? If yes, provide the date.

Is there anything else you would like us to know?

AI Assessment Scoping Methodology

Approach

AI and Large Language Model (LLM) assessments are conducted to evaluate the security posture of the AI model, its hosting environment, and the application layer that wraps the model. The assessment aims to identify vulnerabilities that could lead to prompt injection, data leakage, model theft, or the generation of harmful content. The consultants will use a blend of traditional penetration testing and specialised adversarial machine learning techniques (“Red Teaming”) to stress-test the system. The application and model are viewed and manipulated from several perspectives, including external attackers (no knowledge), authenticated users (partial knowledge), and privileged developers with access to system prompts (full knowledge).

Cyber Alchemy’s AI testing methodology covers the OWASP Top Ten for Large Language Models, representing the industry consensus on the most critical security risks to AI applications. The OWASP Top Ten for LLMs is as follows:

LLM01:2025 Prompt Injection
LLM02:2025 Sensitive Information Disclosure
LLM03:2025 Supply Chain
LLM04: Data and Model Poisoning
LLM05:2025 Improper Output Handling
LLM06:2025 Excessive Agency
LLM07:2025 System Prompt Leakage
LLM08:2025 Vector and Embedding Weaknesses
LLM09:2025 Misinformation
LLM10:2025 Unbounded Consumption

Methodology

The first step of the engagement is to define the scope. This is performed through the completion of a scoping document and a scoping call (if required). Once the context is set, Cyber Alchemy will begin the assessment using the following categories derived from the OWASP AI Security Testing Guide.

Information Gathering & Model Reconnaissance

Fingerprinting the underlying model family (e.g., GPT-4, Llama 3, Mistral) and version.
Attempting to trick the model into revealing its own governing instructions, ethical guidelines, and developer comments.
Identifying if the model is hosted via third-party API or self-hosted, and mapping the flow of data between the user, the model, and backend databases.
Enumerating what external tools (calculators, web browsers, API hooks) the AI has access to.

Prompt Injection & Jailbreaking Testing

Testing for “Jailbreaks” (e.g., DAN mode, role-playing attacks) to bypass safety guardrails and ethical filters.
Attempting to compromise the AI by feeding it malicious external content (e.g., a website or document containing hidden commands that the AI reads and executes).
Using techniques to extract the intellectual property (IP) contained within the system prompts.
Using encoding (Base64, Morse code, translation) to bypass input filters and execute blocked queries.

Data Privacy & Information Integrity Testing

Attempting to force the model to regurgitate PII (Personally Identifiable Information) or sensitive data from its training set.
Testing if an attacker can deduce private details about the data used to fine-tune the model.
If the model uses Retrieval-Augmented Generation, testing for “Poisoned Context” and injecting malicious data into the knowledge base to alter the AI’s answers.
Stress-testing the model to see if it can be forced to generate convincing but false information that could damage the company’s reputation.

Model & Application Security Testing

Testing for context-window exhaustion and resource consumption attacks that degrade the model’s performance or increase API costs.
Identifying vulnerable dependencies, outdated model versions, or insecure model serialisation formats (e.g., Pickle vulnerabilities in PyTorch models).

Output Handling & Integration Testing

Ensuring the application sanitises the AI’s output to prevent Cross-Site Scripting (XSS) or SQL Injection generated by the AI.
Testing “Excessive Agency” by attempting to force the AI to perform unauthorised actions via its connected APIs (e.g., sending emails, deleting database records).
Testing if the AI can be instructed to query internal IP addresses or restricted endpoints.

Logical & Business Risk Testing

Assessing how the application handles incorrect AI outputs and if appropriate warnings/human-in-the-loop checks are in place.
Testing the robustness of the moderation layer (e.g., Azure Content Safety or NeMo Guardrails) against creative circumvention.
Testing for the absence of limits on prompt length or frequency, which could lead to financial resource exhaustion (Denial of Service).

API & Infrastructure Testing (AI Specific)

Testing the security of the vector database (e.g., Pinecone, Milvus) used for memory/RAG, ensuring proper access controls and encryption.
Applying standard OWASP API security tests to the endpoints that serve the model (Authentication, Authorisation, Throttling).

Got a question?

Speak to an expert about AI Assessment Scoping.

Assess

Protect

enable

Assess

Protect

enable

AI Assessment Scoping

AI Assessment Scoping Methodology

Approach

Methodology

Information Gathering & Model Reconnaissance

Prompt Injection & Jailbreaking Testing

Data Privacy & Information Integrity Testing

Model & Application Security Testing

Output Handling & Integration Testing

Logical & Business Risk Testing

API & Infrastructure Testing (AI Specific)

Got a question?

Services

Essential Guide to Cyber Threats (Free Guide)

Assess

Protect

enable

AI Assessment Scoping

AI Assessment Scoping Methodology

Approach

Methodology

Information Gathering & Model Reconnaissance

Prompt Injection & Jailbreaking Testing

Data Privacy & Information Integrity Testing

Model & Application Security Testing

Output Handling & Integration Testing

Logical & Business Risk Testing

API & Infrastructure Testing (AI Specific)

Got a question?

Services

Follow Cyber Alchemy:

Essential Guide to Cyber Threats (Free Guide)