Glossary
Agent/agentic
An agent is a system that uses an LLM to autonomously perform tasks by combining reasoning, memory, and tool use. Unlike a plain LLM, it can plan, take actions, and interact with its environment to achieve goals.
Artificial general intelligence (AGI)
A hypothetical form of AI that can understand, learn, and apply knowledge across a wide range of tasks at a human level of general intelligence.
Constrained decoding
A technique used during LLM output generation to enforce specific rules or structures, such as valid JSON, matching a schema, or including required keywords, by limiting the model’s token choices at each step.
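As a toy illustration of the idea (made-up vocabulary and logits, not a real decoder), disallowed tokens are masked out before the next token is chosen:

```python
import math

# Toy vocabulary with made-up unnormalised scores (logits) for the next token.
# In a real system these scores come from the model at each decoding step.
logits = {'"': 2.1, "{": 1.8, "hello": 3.0, "7": 0.5, "}": 0.2}

def allowed(token, partial_output):
    # Hypothetical rule: we are inside a JSON object, so only structural
    # characters or an opening quote may come next.
    return token in {'"', "{", "}"}

partial_output = "{"
# Mask disallowed tokens by setting their score to -infinity...
masked = {t: (s if allowed(t, partial_output) else -math.inf)
          for t, s in logits.items()}
# ...so the highest-scoring *valid* token is chosen.
next_token = max(masked, key=masked.get)
print(next_token)  # '"' -- "hello" scored higher but is not valid here
```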
Chain of thought (CoT)
Chain of thought refers to prompting an LLM to generate a step-by-step explanation or reasoning process before producing the final answer.
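A minimal prompt sketch (the exact wording is illustrative, not prescriptive):

```python
# A chain-of-thought prompt simply asks the model to show its working
# before giving the final answer.
question = "A train travels 60 km in 40 minutes. What is its speed in km/h?"

cot_prompt = (
    f"{question}\n"
    "Think step by step, showing your reasoning, "
    "then give the final answer on its own line."
)
print(cot_prompt)
```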
Context
The input given to an LLM, constrained to a maximum size supported by the given model. It is typically assembled from pieces such as the system prompt, chat history and agent rules, which obscures the fact that the model takes a single input sequence and produces output from it in a stateless manner.
Deep learning
Deep learning is a subset of machine learning that uses artificial neural networks with many layers (hence “deep”) to automatically learn hierarchical representations of data. It’s the foundation of large language models (LLMs).
Diffusion model
A diffusion model is a type of generative model that creates data (such as images or text) by iteratively reversing a noise process. It starts with random noise and gradually refines it into coherent output through learned denoising steps. Most commonly used for image generation.
Distillation
Distillation is a model compression technique where a smaller, more efficient model (the “student”) is trained to replicate the behaviour of a larger, more powerful model (the “teacher”), typically by mimicking its output probabilities. This enables deployment of LLM-like capabilities at reduced computational cost.
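A minimal sketch of the core training signal, assuming we already have teacher and student output probabilities for one input (the numbers are made up):

```python
import math

# Output distributions over a tiny vocabulary for one training example.
teacher_probs = [0.70, 0.20, 0.10]   # from the large "teacher" model
student_probs = [0.50, 0.30, 0.20]   # from the small "student" model

# Distillation loss: KL divergence from teacher to student.
# Training nudges the student towards reproducing the teacher's distribution.
kl = sum(t * math.log(t / s)
         for t, s in zip(teacher_probs, student_probs) if t > 0)
print(f"KL(teacher || student) = {kl:.4f}")
```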
Eval/evals
Eval/evals refers to evaluation methods and benchmarks used to measure a model’s performance on specific tasks or capabilities, ranging from accuracy on standard datasets to assessments of reasoning, factuality, bias, and safety, often through curated tests or automated pipelines.
Fine-tuning
Fine-tuning is the process of continuing to train a pre-trained model on a smaller, task-specific dataset to adapt it to particular use cases or domains.
Foundation model
A foundation model is a large, general-purpose model produced by initial training. It can then be adapted to more specific tasks via fine-tuning, distillation, prompting, or other techniques.
Frontier model
A frontier model is an LLM that pushes the boundaries of scale or capabilities, typically developed by leading AI labs and considered close to or at the limits of current AI performance. There is a lot of spin and hype surrounding the term.
Function calling
A mechanism that allows an LLM to interact with external tools or APIs by outputting structured data (like JSON) that maps to predefined functions, enabling more reliable and programmable responses.
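A sketch of the application side, assuming the model has been given a tool schema and replies with JSON naming a hypothetical get_weather function (exact formats vary between providers):

```python
import json

# Hypothetical tool the application exposes to the model.
def get_weather(city: str) -> str:
    return f"Sunny, 21°C in {city}"   # stub; a real tool would call an API

TOOLS = {"get_weather": get_weather}

# The model, prompted with the tool schema, emits structured output like this
# instead of free text.
model_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'

call = json.loads(model_output)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # the result would normally be fed back into the context
```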
Generative AI
Generative AI refers to models that learn patterns and structures of their training data and apply them to generate new content such as text, images, code, or audio based on the given input.
Hallucination
When an LLM generates plausible-sounding but false or fabricated information, it is referred to as hallucination. This phenomenon appears to be an inherent characteristic of the current generation of LLMs.
Inference
The process of a trained LLM generating an output for a given input.
LLM (large language model)
An LLM is a deep learning model trained on vast amounts of text with the purpose of generating text output. It can be extended into a multi-modal model by means of representing input and output for other modes (image, audio) as token sequences.
LLMOps
A set of practices and tools related to training and operating LLMs, including managing training data, deploying models, monitoring performance etc. This is a subset of MLOps.
LRM (large reasoning model)
An LRM is an LLM that has been further trained to solve multi-step reasoning tasks (for example, using a dataset of reasoning tasks with example solutions and details of reasoning steps). LRMs perform an order of magnitude more computation than LLMs during inference.
Machine learning
An area of study concerned with the development of statistical algorithms that can learn from data and generalise to new data.
Model Context Protocol (MCP)
MCP is an open, JSON-RPC‑based standard created by Anthropic that standardises two‑way connections between LLM applications (clients) and external MCP servers which provide access to data or tools (such as filesystem access).
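For flavour, a client asking a server to run one of its tools sends an ordinary JSON-RPC 2.0 message; the sketch below shows the general shape (the tool name and arguments are purely illustrative, and the exact fields should be checked against the MCP specification):

```python
import json

# Approximate shape of an MCP client request invoking a server-provided tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_file",                      # a tool the server advertises
        "arguments": {"path": "/tmp/notes.txt"},  # hypothetical arguments
    },
}
print(json.dumps(request, indent=2))
```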
MLOps
A set of practices and tools for managing, operating and monitoring machine learning software and data.
Mixture of experts (MoE)
An LLM architecture where a gating network routes each input to a subset of specialised sub-networks (“experts”). This makes inference faster, as only a fraction of the parameters are used for any given input, but the downside is that the whole model still has to be loaded into memory.
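A toy gating step, assuming top-1 routing between two tiny “expert” functions (real MoE layers route between feed-forward sub-networks, usually to the top few experts per token):

```python
import math

# Two toy "experts": in a real MoE layer these are feed-forward sub-networks.
experts = [lambda x: 2.0 * x, lambda x: x + 10.0]

def gate(x):
    # The gating network scores each expert for this input (made-up scoring),
    # then a softmax turns the scores into routing probabilities.
    scores = [x, -x]
    exps = [math.exp(s) for s in scores]
    return [e / sum(exps) for e in exps]

x = 3.0
probs = gate(x)
chosen = max(range(len(experts)), key=lambda i: probs[i])
# Only the chosen expert runs, so far fewer parameters are used per input,
# even though all experts must still be resident in memory.
print(f"expert {chosen} -> {experts[chosen](x)}")
```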
Neural network
A computational model inspired by the structure and functions of biological neural networks. A neural network is made up of interconnected conceptual “neurons”, with every connection assigned a particular strength which is known as a weight.
Prompt injection
Prompt injection refers to a third party manipulating the context with nefarious intent such as extracting confidential information. Mechanisms like RAG and agents vastly increase the opportunities for this exploit.
Quantisation
Quantisation reduces the precision of model weights (e.g. from 32-bit to 8-bit) to decrease memory usage and improve inference speed, at the cost of some loss of accuracy.
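A naive sketch of symmetric 8-bit quantisation of a handful of weights (real schemes work per block or per channel and are more sophisticated):

```python
# Naive symmetric int8 quantisation of a few example weights.
weights = [0.42, -1.37, 0.05, 0.91]

scale = max(abs(w) for w in weights) / 127          # map the largest weight to ±127
quantised = [round(w / scale) for w in weights]     # stored as 8-bit integers
dequantised = [q * scale for q in quantised]        # reconstructed at inference time

print(quantised)                                    # [39, -127, 5, 84]
errors = [abs(w - d) for w, d in zip(weights, dequantised)]
print(max(errors))                                  # the accuracy cost of quantisation
```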
Retrieval-augmented generation (RAG)
A mechanism for retrieving external data (e.g. documents) and including it in the context window with the goal of producing higher-quality output. The nature of training means that models have a fixed cutoff date for the training data, so RAG is also a useful way of incorporating more recent information without going to the trouble of repeated fine-tuning.
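A minimal sketch of the retrieval step, with a toy word-overlap score standing in for a real embedding-based search (all documents and names are made up):

```python
# Toy retrieval: score documents by word overlap with the query.
# Real RAG systems typically use vector embeddings and a vector database.
documents = {
    "holidays.md": "Company holidays for 2025 include 25 December and 1 January.",
    "expenses.md": "Travel expenses must be filed within 30 days of the trip.",
}

query = "When are the 2025 company holidays?"

def score(doc: str) -> int:
    return len(set(doc.lower().split()) & set(query.lower().split()))

best_name, best_doc = max(documents.items(), key=lambda kv: score(kv[1]))

# The retrieved text is placed into the context alongside the question.
prompt = (
    f"Answer using the following document ({best_name}):\n"
    f"{best_doc}\n\nQuestion: {query}"
)
print(prompt)
```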
Reasoning
Reasoning refers to the model’s ability to draw logical inferences, make decisions, or solve problems by combining information, particularly triggered by techniques like Chain of Thought prompting.
System prompt
The system prompt is a (usually hidden) prefix added to every prompt, typically used to attempt to constrain the LLM’s tone, role or output.
Temperature
Temperature controls the randomness of an LLM’s output. Lower values make the output more deterministic, while higher values increase variability.
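A worked sketch of how temperature reshapes the next-token distribution (the logit values are made up):

```python
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s) for s in scaled]
    return [e / sum(exps) for e in exps]

logits = [2.0, 1.0, 0.1]   # made-up scores for three candidate tokens

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
# Low temperature concentrates probability on the top token (more deterministic);
# high temperature flattens the distribution (more varied output).
```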
Token
A token is an integer mapped to a subsequence of the input sequence (often a word or subword) that the model processes as a unit. Models operate on tokens rather than directly on text or pixels, so input has to be tokenised before being fed into the model. The reason for this is twofold: to expose a degree of semantic structure, and to reduce the amount of input/output required.
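A toy illustration of tokenisation with a made-up vocabulary (real tokenisers such as BPE learn their subword vocabulary from data):

```python
# A made-up subword vocabulary mapping string pieces to integer token ids.
vocab = {"un": 101, "break": 102, "able": 103, " ": 104, "code": 105}

def tokenise(text: str) -> list[int]:
    # Greedy longest-match tokenisation against the toy vocabulary.
    tokens, i = [], 0
    while i < len(text):
        for piece in sorted(vocab, key=len, reverse=True):
            if text.startswith(piece, i):
                tokens.append(vocab[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no token for text at position {i}")
    return tokens

print(tokenise("unbreakable code"))  # [101, 102, 103, 104, 105]
```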
Training
Training means processing vast amounts of diverse data (text, images, audio and video) to calculate internal neural network parameters called model weights, with the goal of reducing the error in the model’s predictions.
Vector embeddings
Vector embeddings are high-dimensional numeric representations of text (or other data types) that capture semantic meaning and can be compared using mathematical operations like cosine similarity.
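A minimal sketch of comparing two made-up, very low-dimensional embeddings with cosine similarity; real embeddings have hundreds or thousands of dimensions and are produced by an embedding model:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional embeddings; a real model would produce these from text.
cat_embedding     = [0.9, 0.1, 0.3, 0.0]
kitten_embedding  = [0.8, 0.2, 0.4, 0.1]
invoice_embedding = [0.0, 0.9, 0.0, 0.7]

print(cosine_similarity(cat_embedding, kitten_embedding))   # high: similar meaning
print(cosine_similarity(cat_embedding, invoice_embedding))  # low: unrelated meaning
```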
Vibecoding
Attempts to use LLMs or agents to generate working software (mostly) without reviewing or manually modifying generated code are called vibecoding. The term is sometimes used in a looser way to refer to any AI-assisted coding.
Weights
LLMs are multi-layered structures called neural networks which are made up of interconnected conceptual “neurons”, with every connection assigned a particular strength which is known as a weight. The set of model weights is the output of training.