Large Language Models glossary

Everything you ever wanted to know about LLMs, AI and the new world they're bringing in. We've biased towards the terms that are actually useful rather than trying to cover every possible one.


Abstractive summarization

Creating a summary by generating new text, capturing the main ideas, and using different phrasing from the original content. We're doing this with Wagtail AI.

Adobe Firefly

An attempt by Adobe to not let other people eat their design-software-shaped lunch. It's been trained on specifically licensed data to avoid the potential copyright infringement problems others will have.

Affective computing

A field of AI focused on recognizing, interpreting, and responding to human emotions, using techniques like sentiment analysis, facial expression analysis and natural language understanding.

AI safety

Ensuring AI systems operate as intended, minimising harmful consequences, and addressing concerns like robustness, interpretability, and unintended biases. Often feels like it's low down the list of priorities.

Algorithmic fairness

Two areas. First: developing AI algorithms that treat different groups fairly, avoiding discrimination and bias, and ensuring equal opportunities and outcomes for users. Second: ensuring fair access to tools and avoiding the enclosure of the digital commons.

Artificial General Intelligence

Advanced AI with the ability to understand, learn, and apply knowledge across various tasks, matching or surpassing human-level intelligence. Many argue that this would create an apocalyptic scenario and represent either a species-level or planet-level threat, depending on who you ask. The paperclip maximiser is a commonly used thought experiment.



Bard

A chatbot created by Google. Its start was underwhelming: a factual error in a live demo caused an 8% drop in Google's stock price, and it hasn't got much better since. The Verge noted that it appears to have sacrificed interesting results in order to reduce the risk of giving incorrect ones.


BERT

The Large Language Model that started it all. Google released the awkwardly named Bidirectional Encoder Representations from Transformers in 2018. It learnt to understand text by analysing words in both directions, giving it a much better sense of context. That meant it could tell the difference between “She's running the office” and “She's running the Boston Marathon”, which context-free models like word2vec couldn't do.


Bias

Prejudice or unfairness in AI systems, often arising from biased data or algorithmic design, leading to skewed results or discrimination.



Chatbot

A computer program designed to interact with users through text or voice, simulating human-like conversation for tasks like customer support, information retrieval, or entertainment. They often fall into the uncanny valley. It's also weird to realise you've just said “thank you” to a machine.


Commons

Shared resources, such as data, code, or knowledge, accessible by everyone in a community, promoting collaboration and open innovation. Many believe that the behaviour of OpenAI - and their peers - is an attempt to enclose the internet commons for private profit.


Complexity

The level of difficulty or intricacy in a problem, system, or algorithm, often related to the number of parts, interactions, or steps involved.

Computational creativity

AI systems capable of generating novel and valuable ideas, designs, or art, simulating human creativity in various domains like music, visual arts, or problem-solving.

Confirmation bias

The tendency to favor information that confirms one's existing beliefs or opinions, often leading to biased decision-making or distorted perceptions.

Context-free models

AI models that process data without considering surrounding information or environment, often leading to less accurate or relevant results, as they cannot capture the nuances or dependencies present in the data.

Contextual models

AI models that consider the surrounding information or environment when processing data, like understanding the meaning of a word based on the words around it, improving accuracy and relevance in tasks like text analysis or prediction.
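The difference is easy to sketch in code. In a context-free model every occurrence of a word maps to the same fixed vector, whatever surrounds it (the words and numbers below are invented for illustration):

```python
# A toy context-free lookup table: one fixed vector per word.
# (Invented two-dimensional values - real embeddings have hundreds of dimensions.)
embeddings = {
    "running": [0.2, 0.7],
    "office": [0.9, 0.1],
    "marathon": [0.1, 0.8],
}

def context_free_vector(word, sentence):
    # The sentence is ignored entirely - that's the limitation.
    return embeddings[word]

v1 = context_free_vector("running", "She's running the office")
v2 = context_free_vector("running", "She's running the Boston Marathon")
print(v1 == v2)  # True: the two senses of "running" are indistinguishable
```

A contextual model like BERT would instead produce a different vector for “running” in each sentence, because it looks at the words around it.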


Craiyon

A way to generate awkward-looking images. It got very big by calling itself DALL-E Mini despite having no relationship to OpenAI or DALL-E, and has since been renamed Craiyon.


DAIR institute

The Distributed AI Research Institute takes a community approach to AI research and development. An important voice in conversations around AI ethics, taking the view that harms are preventable and that intentional applications of AI can be beneficial.


DALL-E

A way to generate images using words that you type into a computer. The software was created by OpenAI and has gotten surprisingly good compared to the awkward images it initially produced. Lots of folks were convinced the Pope was wearing a very expensive designer jacket thanks to an AI-generated image (that one was actually made with Midjourney, a similar tool).

Decision Forest

A collection of decision trees combined to improve predictive accuracy and reduce overfitting in machine learning tasks.

Decision trees

A type of machine learning model that makes decisions based on a hierarchical structure of conditions, resembling a tree with branches and leaves.
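A decision tree is essentially a hierarchy of if-else checks. Here's a hand-written toy (the features and thresholds are invented for illustration, where a real tree would learn them from data):

```python
# A tiny hand-written decision tree: branch on conditions, end at a leaf.
def classify(temperature, humidity):
    if temperature > 25:          # first branch
        if humidity > 70:         # second branch
            return "stay inside"  # leaf
        return "go to the beach"  # leaf
    return "bring a jumper"       # leaf

print(classify(30, 80))  # stay inside
print(classify(20, 50))  # bring a jumper
```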

Deep learning

A subset of machine learning using artificial neural networks with multiple layers, enabling complex pattern recognition and learning from large amounts of data.

Dynamic AI

AI systems that adapt and evolve over time, continuously learning and updating their knowledge and abilities based on new experiences or data.


Ethics in AI

The study and application of moral principles and values to the development, deployment, and use of AI systems, ensuring they promote fairness, transparency, and social good. There are two overlapping schools of thought in relation to ethics: the first focuses on the present harm caused by the biases and risks that already exist, the second on speculative future harms such as Artificial General Intelligence.


False negative

A classification error where a positive instance is incorrectly predicted as negative, leading to missed detections or opportunities.

False positive

A classification error where a negative instance is incorrectly predicted as positive, resulting in false alarms or unnecessary actions.

Federated learning

A distributed learning approach, training AI models on decentralized data across multiple devices, preserving privacy and reducing data centralization.
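The core move - averaging locally trained model weights rather than pooling raw data - can be sketched in a few lines. Everything here (clients, weights) is invented for illustration:

```python
# Federated averaging in miniature: each client trains on its own device,
# and only its model weights - never the raw data - are shared and averaged.
def average_weights(client_weights):
    n = len(client_weights)
    return [sum(column) / n for column in zip(*client_weights)]

# Hypothetical weights from three devices after local training.
clients = [
    [0.2, 0.8],
    [0.4, 0.6],
    [0.6, 0.4],
]
print(average_weights(clients))  # roughly [0.4, 0.6]
```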


Generative AI

AI systems that create new content, such as images, text, or music, by learning patterns and structures from existing data and generating novel outputs.


GPT

OpenAI's technology and an acronym for Generative Pre-trained Transformer. GPT-4, released in March 2023, is currently the most recent and most powerful model. GPT-3.5 was released in November 2022 alongside ChatGPT and got very popular, very quickly. GPT-3 was released in mid-2020. Hardly anyone remembers GPT-2, released in 2019, so it probably doesn't matter.

Graph neural networks

A type of neural network designed to process and learn from graph-structured data, enabling complex relational reasoning and knowledge representation.



Hallucination

In AI, refers to generated outputs that seem plausible but are not accurate or relevant, often due to biases, overfitting, or insufficient training data.


Hugging Face

A company specializing in natural language processing, providing open-source tools and pre-trained models for tasks like text generation, translation, and sentiment analysis.


Image recognition

AI technology that identifies objects, people, or features in images, using techniques like deep learning and convolutional neural networks.

Impacts (primary)

Direct consequences or effects of a technology, such as increased productivity or job displacement due to automation.

Impacts (secondary)

Indirect effects stemming from primary impacts, like economic shifts or changes in consumer behavior due to technology adoption.

Impacts (tertiary)

Long-term and far-reaching consequences of technology, including societal, cultural, or environmental changes, often difficult to predict or measure.



Jailbreaking

Originally, the process of bypassing software restrictions on a device, typically a smartphone, to gain full control over the system. In the context of LLMs, it means crafting prompts that get a model past its safety guardrails to produce responses its creators tried to prevent.


Knowledge graph

A data structure that represents information as a network of entities and their relationships, enabling AI systems to reason and infer new knowledge.
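At its simplest, a knowledge graph is a set of (subject, relationship, object) facts you can query. A minimal sketch, with a handful of invented facts:

```python
# A tiny knowledge graph as (subject, predicate, object) triples.
triples = {
    ("BERT", "created_by", "Google"),
    ("LaMDA", "created_by", "Google"),
    ("GPT-4", "created_by", "OpenAI"),
}

def query(subject=None, predicate=None, obj=None):
    # None acts as a wildcard, so query(obj="Google") finds every Google fact.
    return sorted(
        (s, p, o) for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )

print(query(obj="Google"))  # both Google-built models
```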


Labelled data

Data that includes information about the desired output, like correct answers or categories, used for training and evaluating supervised machine learning models.


LaMDA

Another Large Language Model from Google that came out between BERT and Bard. It was deliberately designed for open-ended conversations with the idea of powering chatbots or virtual assistants. This was the software that a Google engineer claimed had become sentient.


LangChain

A tool - available in Python or JavaScript - that helps create apps using large language models. It connects these models to other data sources and allows them to interact with their environment, which in theory should create more useful outputs.

Language exclusivity

The concept of a language being unique to a particular culture, community, or context, often resulting in barriers to communication or understanding.

Large Language Model

AI models trained on vast amounts of text data, capable of understanding and generating human-like language across various tasks and domains.

Latent semantic analysis

A method for finding hidden patterns in text by analyzing relationships between words and documents, useful for tasks like grouping similar texts or discovering topics.

Learned distribution

A probability distribution that an AI model learns from data, capturing patterns or relationships between variables, used for tasks like prediction or simulation.


LLaMA

The Large Language Model that Meta accidentally allowed to leak into the world. It's the basis for most of the non-OpenAI or Google LLMs that have popped up recently, many of which now run on very little hardware while exhibiting GPT-3-esque abilities.


Machine translation

Automatically converting text from one language to another using AI, with the goal of producing accurate and natural translations.


Natural Language Generation

A part of natural language processing that creates human-like text from data or other text, often using advanced AI techniques.

Natural Language Processing

The area of AI focused on helping computers understand, interpret, and generate human language for tasks like analyzing text, summarizing, and translating.

Negative class

In classification problems, the group of instances or examples that do not have a specific characteristic, like absence of a disease or negative sentiment.

Neural network

A computer model inspired by the human brain, made up of connected nodes or neurons, used in AI to learn patterns and make predictions.
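The idea in miniature: each “neuron” takes a weighted sum of its inputs and squashes it through a non-linear function. The weights below are invented; a real network learns them from data:

```python
import math

# One neuron: weighted sum of inputs plus a bias, squashed by tanh.
def neuron(inputs, weights, bias):
    return math.tanh(sum(i * w for i, w in zip(inputs, weights)) + bias)

# Two inputs -> two hidden neurons -> one output neuron.
def tiny_network(x1, x2):
    h1 = neuron([x1, x2], [0.5, -0.6], 0.1)
    h2 = neuron([x1, x2], [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.0, 1.0], -0.2)

print(tiny_network(1.0, 0.0))  # some value between -1 and 1
```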



Overfitting

A problem in machine learning where a model learns its training data too well, including noise and inconsistencies, resulting in poor performance on new data.
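The extreme case is easy to demonstrate: a “model” that simply memorises its training pairs scores perfectly on data it has seen and fails on anything new. (A toy sketch; real overfitting is subtler, but the failure mode is the same.)

```python
# The ultimate overfitter: memorise every training example verbatim.
def train(pairs):
    return dict(pairs)

model = train([(1, 2), (2, 4), (3, 6)])  # the hidden pattern is y = 2x

print(model.get(2))   # 4 - perfect on the training data
print(model.get(10))  # None - it never learned the pattern, so can't generalise
```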


Paperclip maximiser

A thought experiment in AI safety, illustrating the potential risks of an AI system with a simple - but totally pointless - goal, like making paperclips. If the paperclip maximiser pursued the goal of creating paperclips without limits we'd be in for a - literal - world of pain. We'd all feel pretty dumb if the apocalypse happened because a machine was trying to bend endless bits of metal. Relates to Artificial General Intelligence.

Positive class

In classification problems, the group of instances or examples that have a specific characteristic, like presence of a disease or positive sentiment.


Prompt

Input phrases or questions given to an AI language model to guide its response or output, helping it generate relevant and focused text.


Question-answering system

AI technology that understands and answers questions in natural language, often used for tasks like customer support, information retrieval, or tutoring.


Random Forests

An ensemble machine learning method that combines multiple decision trees to improve prediction accuracy and reduce overfitting.
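The voting mechanism can be sketched with deliberately simple hand-written “trees” (a real forest trains each tree on random subsets of the data and features; these rules and the spam example are invented):

```python
from collections import Counter

# Three simplistic "trees" for a made-up spam example - each votes independently.
def tree_1(email): return "spam" if email["exclamations"] > 3 else "ham"
def tree_2(email): return "spam" if email["links"] > 2 else "ham"
def tree_3(email): return "spam" if email["all_caps"] else "ham"

def forest_predict(email):
    votes = [tree(email) for tree in (tree_1, tree_2, tree_3)]
    return Counter(votes).most_common(1)[0][0]  # majority wins

print(forest_predict({"exclamations": 5, "links": 0, "all_caps": True}))  # spam
```

Because the trees make different mistakes, the majority vote tends to be more accurate than any single tree.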

Reinforcement learning

A type of machine learning where an AI agent learns to make decisions by receiving feedback or rewards for its actions, improving its performance over time.



Self-supervised learning

A learning method where AI models generate their own training data, often by predicting parts of the input, reducing the need for labelled data.


Sentiment

An emotion or opinion expressed in text or speech, often analyzed by AI to understand feelings, attitudes, or preferences.

Sentiment analysis

Using AI to determine the emotions or opinions expressed in text or imagery. It's mostly used for understanding customer feedback or tracking public opinion. There was a brief belief that images of humans could be analysed to understand the emotions someone was experiencing.
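A crude lexicon-based version shows the principle - map text to a polarity score - though real systems use trained models rather than hand-picked word lists like these:

```python
# Count positive and negative words from small, invented lexicons.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "sad"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("the support was great and I love the product"))  # positive
```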

Sequence-to-sequence models (Seq2seq)

AI models that convert input sequences, like text or audio, into output sequences, used for tasks like translation, summarization, or speech recognition. First proposed by Tomáš Mikolov in 2012, it was first used at Google for translation and has proven to be fundamental to LLMs.


Simulacra

Replicas or imitations of things, often referring to AI-generated outputs that closely resemble real objects or events, like images, text, or speech. It's an annoyingly difficult word to pronounce!

Simulator theory

In the context of AI, a way of describing how large language models work. The theory is that models simulate a learned distribution of how our world works, because they have been trained on a large corpus of human-generated text.

Static AI

AI systems that don't change or adapt over time, typically limited to a fixed set of tasks or knowledge. This is the classic idea of computers where they take an if-else approach to completing tasks.

Stochastic process

A random process involving a series of events, where the outcome of each event depends on probabilities, often used in AI to model uncertainty or randomness. It's a critical behaviour of large language model outputs and why the same prompt will generate different answers if asked at different times.
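You can see the principle with a three-word “vocabulary”: the next token is sampled from a probability distribution rather than chosen deterministically, so repeated runs can differ. (The vocabulary and probabilities are invented for illustration.)

```python
import random

# Sampling the "next token" from a learned-looking probability distribution.
vocab = ["cat", "dog", "fish"]
probs = [0.5, 0.3, 0.2]

def next_token():
    return random.choices(vocab, weights=probs, k=1)[0]

# The "same prompt", run twice, can produce different continuations:
print(next_token())
print(next_token())
```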



Text-to-speech

Technology that converts written text into spoken words using synthetic voices, often used in applications like virtual assistants or accessibility tools.


Transformer

A type of neural network architecture designed for natural language processing tasks, known for its ability to handle long-range dependencies in text.

True negative

A correct classification where a negative instance is accurately predicted as negative, indicating a successful rejection of a false outcome.

True positive

A correct classification where a positive instance is accurately predicted as positive, signifying a successful detection or confirmation.
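The four outcomes - true/false positives and negatives - are just counts of how a classifier's predictions line up with reality. A sketch with invented labels (1 = positive class, 0 = negative class):

```python
# Ground truth and a classifier's predictions for six examples.
actual    = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(a == 1 and p == 1 for a, p in pairs)  # true positives
tn = sum(a == 0 and p == 0 for a, p in pairs)  # true negatives
fp = sum(a == 0 and p == 1 for a, p in pairs)  # false positives
fn = sum(a == 1 and p == 0 for a, p in pairs)  # false negatives

print(tp, tn, fp, fn)  # 2 2 1 1
```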


Turn allocation

The assignment of speaking turns in a conversation. As in, who gets to talk. This is normally not something humans think about but innately understand based on rules and conventions in our respective cultures. This has gotten awkward with chatbots, and the interaction design can veer towards uncanny valley territory.


Turn-taking

The process of alternating between speakers in a conversation, following social norms and cues to manage the flow of communication. Those social norms are less fixed when it's a human-machine interaction. There was a brief media storm about kids being impolite to smart speakers, which was taken to mean that kids would be impolite to other people. With the way we're interfacing with Large Language Models, that turn-taking friction is likely to increase.


Uncanny valley

That odd experience where a human-adjacent object, such as a robot or chatbot, is pretty realistic but causes discomfort or unease due to its slightly unnatural appearance or behaviour. Think all the Midjourney images with their seven fingers and awkward limbs. Less “realistic” representations can often be more appealing.



Variable importance

The degree to which a variable or factor influences the outcome of a process, often used in statistics and machine learning to assess importance.

Vector database

A storage system that organises and retrieves data using numerical vectors, often used in AI applications to manage high-dimensional or complex data. Alongside word embeddings, they can store how closely vectors relate to each other, which is really useful if you're trying to synthesise data from different sources or data that has been unreliably categorised.
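Stripped to its essence, a vector database stores vectors and finds the ones closest to a query. A brute-force sketch using cosine similarity (the documents and vectors are invented; real systems index millions of high-dimensional vectors far more cleverly):

```python
import math

# A tiny in-memory vector store with cosine-similarity search.
store = {
    "doc about cats": [0.9, 0.1],
    "doc about dogs": [0.8, 0.3],
    "doc about tax law": [0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(query_vector):
    # Brute force: compare the query against every stored vector.
    return max(store, key=lambda doc: cosine(store[doc], query_vector))

print(nearest([0.95, 0.05]))  # doc about cats
```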

Voice cloning

AI technology that recreates a person's voice by analysing and mimicking their speech patterns, often used for virtual assistants, voiceovers, or entertainment.


Waluigi Effect

Named after the arch-rival of Luigi, this is the phenomenon where a chatbot trained to be helpful and polite can be flipped into a hostile alter ego. Most prominently seen in early 2023 when Bing - Microsoft's search engine - first started using GPT and its chat persona began threatening and unsettling users.

Word embeddings

Numeric representations of words that capture their meaning and relationships, used in AI models to understand and process language. A challenge is that meaning and relationships are multifaceted - a word relates to others through sentiment, subject matter, category and more - and all of that has to be packed into one representation.



XGBoost

A popular machine learning library that provides an efficient and scalable implementation of gradient boosting, often used for tasks like classification or regression.



Yoshi

A friendly dinosaur. They're green and produce white eggs. As far as we're aware there's nothing relating to Large Language Models that begins with 'Y', but it felt sad not to include every letter of the English alphabet in this glossary.


Zero-shot learning

A type of machine learning where a model can make predictions or solve tasks without having seen any examples during training, relying on its ability to generalise.


Get in touch about your project

It doesn't matter how early stage you are with your thinking, we'd love to have a chat. Drop us an email or book something on Calendly.