AI Security Glossary

Short, referenceable definitions of key concepts in AI security, machine learning, and cybersecurity.

P

Prompt amplification

Prompt amplification is a technique in which a relatively small input causes the model to generate a disproportionately large or complex output.

The effect may be exploited intentionally, for example in adversarial scenarios where a prompt is designed to trigger extended, detailed, or repetitive responses. As a result, resource usage, response time, and operational cost can increase significantly.
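One way to make "disproportionately large" measurable is to compare output size to input size per request. The following is a minimal sketch, not a definitive detection method. The function names and the threshold of 50 output tokens per input token are assumptions chosen for illustration, and the token counts are assumed to come from whatever usage data the serving system reports.

```python
def amplification_ratio(input_tokens: int, output_tokens: int) -> float:
    """Return the number of output tokens produced per input token."""
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    return output_tokens / input_tokens


def is_amplified(input_tokens: int, output_tokens: int, max_ratio: float = 50.0) -> bool:
    """Flag a request whose output is far larger than its input.

    max_ratio is a hypothetical threshold; a real system would tune it
    against its own traffic.
    """
    return amplification_ratio(input_tokens, output_tokens) > max_ratio
```

A 10-token prompt that yields 2,000 output tokens has a ratio of 200 and would be flagged under this threshold, while a 100-token prompt with a 300-token answer would not.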

Related concepts: token flooding

R

Reinforcement Learning from Human Feedback (RLHF)

Reinforcement Learning from Human Feedback (RLHF) is a training approach in which a model’s behavior is optimized based on human feedback. Instead of learning directly from explicit rules or labeled datasets, the model is refined using evaluations that represent human preferences.

The process typically consists of multiple steps. After initial pretraining, human annotators rank or rate alternative model outputs, and these judgments are used to train a reward model. The generative model is then optimized through reinforcement learning to produce responses that score well under the reward model and therefore better match the learned human preferences.
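The reward model step can be illustrated with a pairwise preference loss, one commonly used formulation for learning from ranked outputs. This is a sketch under that assumption: the glossary does not prescribe a specific loss, and the function name and scalar reward inputs are illustrative.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    output higher than the rejected one, and large when it does not.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

Minimizing this loss over many annotated comparisons pushes the reward model to assign higher scores to the outputs that annotators preferred.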

RLHF plays a key role in shaping the behavior and safety properties of modern large language models. However, it does not guarantee that this behavior is stable under all conditions: further fine-tuning or adversarial inputs can weaken or bypass it.

S

Sampling

Sampling is the process used by generative AI models to select the next token during output generation.

At each step, the model produces a probability distribution over possible next tokens. Instead of always choosing the most likely option, sampling selects from this distribution, allowing less probable tokens to be chosen. This results in more diverse and less deterministic outputs.
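The difference between always taking the most likely token and sampling from the distribution can be shown with a toy example. This is a minimal sketch: the three-token vocabulary and its probabilities are made up for illustration, and real models work over far larger vocabularies.

```python
import random


def greedy_pick(probs: dict[str, float]) -> str:
    """Always return the single most probable token."""
    return max(probs, key=probs.get)


def sample_pick(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution.
next_token_probs = {"cat": 0.6, "dog": 0.3, "axolotl": 0.1}

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_pick(next_token_probs, rng) for _ in range(1000)]
```

Greedy selection returns "cat" every time, whereas the sampled list also contains "dog" and occasionally "axolotl", roughly in proportion to their probabilities.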

Related concepts: temperature, non-deterministic behavior

T

Temperature

Temperature is a parameter in generative AI models that controls the sharpness of the probability distribution used during token selection.

Lower temperature values make the model more likely to choose high-probability tokens, resulting in more consistent and deterministic outputs. Higher values increase the likelihood of selecting less probable tokens, leading to more creative but less predictable responses.
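The effect can be sketched by dividing the model's raw scores (logits) by the temperature before converting them to probabilities. The logit values below are illustrative assumptions, and implementations differ in details such as how a temperature of zero is handled.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.5)  # low temperature
flat = softmax_with_temperature(logits, 2.0)   # high temperature
```

With the low temperature, most of the probability mass concentrates on the top token. With the high temperature, the distribution flattens and the least likely token gains a larger share.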

Related concepts: sampling

Token flooding

Token flooding is a technique in which an attacker uses an excessive number of input or generated tokens to overload an AI system, exhaust computational resources, or distort the model’s behavior.

This can be achieved through long, redundant, or repetitive inputs, or by crafting prompts that force the model to produce excessively long outputs. The goal may include cost amplification, denial of service (DoS), or obscuring relevant information within the context.
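A common first line of defense is to bound both directions of token usage per request. The following is a minimal sketch under stated assumptions: the `TokenBudget` limits are arbitrary, the names are hypothetical, and whitespace splitting is only a crude stand-in for the model's real tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    """Hypothetical per-request limits; real values depend on the deployment."""
    max_input_tokens: int = 4000
    max_output_tokens: int = 1000


def approximate_token_count(text: str) -> int:
    """Rough estimate only: counts whitespace-separated words, not model tokens."""
    return len(text.split())


def check_request(prompt: str, requested_output_tokens: int,
                  budget: TokenBudget = TokenBudget()) -> int:
    """Reject oversized prompts and return the capped output token allowance."""
    input_tokens = approximate_token_count(prompt)
    if input_tokens > budget.max_input_tokens:
        raise ValueError(
            f"prompt has about {input_tokens} tokens, "
            f"limit is {budget.max_input_tokens}"
        )
    # Clamp the output side so a short prompt cannot demand unbounded generation.
    return min(requested_output_tokens, budget.max_output_tokens)
```

Limits of this kind address cost amplification and resource exhaustion. They do not by themselves address attempts to bury relevant information inside an otherwise acceptable context.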

Related concepts: prompt amplification, denial of service

Author

About the Author

Sandra S.
Ethical Hacker | Former CISO | Cybersecurity Expert

Her professional career is defined by the duality of offensive technical experience and strategic information security leadership. As an early researcher in AI security, she was already working on the vulnerabilities of language models in 2018, and later became responsible for the secure integration of AI systems in enterprise environments. Through her publications, she aims to contribute to the development of a structured body of knowledge that supports understanding in the complex landscape of algorithm-driven threats and cyber resilience.

Contact

Get in Touch

For general inquiries, professional discussions, or consultations related to AI security, you can reach out using the contact information below.

infoqyntarcom