Why It Is Called a 'Large' Language Model

With 4 years of experience in Product Management, Research, and Cross-functional Collaboration, I thrive at the intersection of business, technology, and users. I'm highly collaborative and naturally empathetic; these qualities have shaped my ability to build trust across diverse stakeholders, facilitate alignment, and bring teams together around a shared vision. Whether stepping into a Scrum Master capacity to keep delivery on track, diving into User Research to uncover actual needs, or driving Product Strategy, I bring a disciplined, goal-oriented approach to every role I take on. I'm energised by working with people; confident in leading conversations; comfortable taking initiative, and skilled at creating the kind of clarity that moves teams forward.

Series: Collaborating with AI Systems — Article 2 of 5

The name tells you almost everything, if you read it carefully.

Language: it is a system built specifically around language. Not spreadsheets, not databases, not physical sensors. It was trained on text: books, articles, websites, research papers, code repositories, conversations.

Model: in the scientific sense, a model is a simplified representation of something real. A weather model represents atmospheric behaviour. A financial model represents economic behaviour. A language model represents how language works: the patterns, the relationships between words and ideas, the way one sentence follows another.

Large: this is where things really changed. Researchers dramatically increased three things: the volume of training data, the size of the model itself, and the computing power used to train it. When all three grew together, the models didn’t just get a little better; they became very different. New capabilities emerged that nobody had explicitly programmed, such as reasoning, translating, summarising, writing, and holding conversations, all far better than smaller models could manage.

“Large” doesn’t just mean big in size; it’s what made this technology powerful 💪🏼

The 3 Pillars That Made This Possible

LLMs did not appear out of nowhere. They are the result of three pillars coming together at the right moment in history.

Algorithm
An algorithm here means a set of step-by-step mathematical instructions that the model follows to process input and produce output. The major breakthrough in this area was the transformer. Before transformers, most language models processed words one after another, in order. The transformer changed this by allowing models to look at all the words at the same time and pay different levels of attention to different words, depending on what the model is trying to understand.

Think of it like this: when you read the sentence "The football didn't fit in the luggage because it was too big", you instantly know that "it" refers to the football, not the luggage. You resolved that ambiguity using context from the whole sentence. The transformer is what gave language models this same ability to hold the whole context in view and weigh relationships between distant words.
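
To make "paying different levels of attention" a little more concrete, here is a minimal sketch of how attention weights can be computed, in plain Python. The word vectors are made-up numbers chosen for illustration, not values from any real model, and a real transformer learns its own vectors and repeats this step across many layers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score every key against the query at once, then normalise."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical 2-dimensional vectors, invented for this example.
words = ["football", "luggage", "big"]
keys = [[2.0, 0.5], [0.5, 1.0], [1.0, 1.0]]
query_for_it = [2.0, 0.0]  # stands in for the word "it"

weights = attention_weights(query_for_it, keys)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

With these invented numbers, "football" receives the largest weight, which is the kind of signal that lets a model link "it" to the right word.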

Data
The second pillar is the huge volume of training data. These models were trained on a vast amount of human-written text from the internet (billions of documents, in many languages, covering almost every area of human knowledge). This matters because language models learn patterns by seeing many examples. The more they are exposed to good reasoning, clear writing, technical explanations, and nuanced arguments, the better they become at producing those things themselves.

This is also why LLMs feel like they "know" things. They don’t store facts in one place like a normal database. Instead, they learn patterns from a lot of text and store those patterns as numbers inside the model. When you ask a question, the model uses those learned patterns to create the most likely answer.
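
As a toy illustration of "patterns stored as numbers", the sketch below counts which word follows which in a tiny made-up corpus, then uses those counts to pick the most likely next word. Treat it as an analogy only: real LLMs learn numeric weights through training rather than keeping explicit counts, and `corpus` and the function names are invented for this example.

```python
from collections import Counter, defaultdict

# A tiny invented corpus; real training data is vastly larger.
corpus = "the cat sat on the mat . the cat ate the fish ."

def learn_patterns(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

patterns = learn_patterns(corpus)
print(most_likely_next(patterns, "the"))  # prints "cat"
```

Nothing here is a stored fact about cats. The answer comes entirely from counted patterns, which is also why the answer can be wrong when the patterns are.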

Computation
The third pillar is computing power. Training a large language model means pushing a huge amount of data through an enormous number of calculations, using thousands of specialised processors, for weeks or even months. This was not realistic a decade ago. But as computing infrastructure became cheaper, especially specialised hardware, it became possible and affordable to train models at the scale needed for the capabilities we see today.

These three pillars (a smarter algorithm, much more data, and the computing power to process it) are what produced the leap in capability. Remove any one of them and you do not get Claude or ChatGPT. You get something far more limited.

How an LLM Actually Produces a Response

When you type a message to an LLM (such as ChatGPT, Claude, or Gemini), the model does not search a database or look up an answer. It does something more unusual: it predicts, one piece at a time, what the most likely next output should be, based on everything it has seen so far.

Those pieces are called tokens. A token can be a full word, part of a word, a punctuation mark, or even a space. For example, a short sentence might be broken into several tokens before the model processes it. When a model says you are out of tokens, it means the total amount of text it can handle has reached its limit. That total includes your message, previous conversation history, uploaded content, and the response the model is trying to generate. The model generates tokens one after another. Each token is influenced by the tokens that came before it in the conversation.
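
Here is a rough sketch of the two ideas above: splitting text into tokens, and checking whether everything fits within a limit. The regex splitter is only a stand-in, since real models use learned subword tokenizers that split text differently, and the limit of 50 tokens is an arbitrary number chosen for the example.

```python
import re

def toy_tokenize(text):
    """Split into words and punctuation marks (a simplification)."""
    return re.findall(r"\w+|[^\w\s]", text)

def fits_in_limit(history, message, reserved_for_reply, limit):
    """Check that history + new message + expected reply stay within the limit."""
    used = sum(len(toy_tokenize(turn)) for turn in history)
    used += len(toy_tokenize(message))
    return used + reserved_for_reply <= limit

tokens = toy_tokenize("Tokens aren't always whole words.")
print(tokens)

history = [
    "Summarise this report for me.",
    "Here is a short summary of the report.",
]
print(fits_in_limit(history, "Now make it shorter.", reserved_for_reply=20, limit=50))
```

Notice that even this toy splitter breaks "aren't" into several pieces, and that the space reserved for the reply counts against the same limit as your own text.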

Imagine you are an exceptionally brilliant colleague who has absorbed the writing, reasoning, and knowledge patterns of millions of documents. When someone asks you a question, you do not retrieve a memorised answer. You draw on all those patterns to compose a response in the moment in a way that fits the question, the context, and the tone of the conversation.

That is the closest human analogy to what an LLM does. It is composing a response based on learned patterns, which is what gives it both its impressive abilities and its important limitations.

What AI Does Exceptionally Well

Speed and scale: A task that takes a skilled analyst four hours, such as reading 50 documents and turning them into a structured summary, can be done in seconds by an LLM. It can also do this for thousands of documents at the same time, at any hour, without getting tired. This has a big economic impact on many professional workflows.

Pattern recognition: LLMs are extraordinarily good at recognising patterns in language: the structure of a strong argument, signs of weak or flawed analysis, the style of a specific document type, and the right tone for a particular audience. This is because pattern recognition is literally what they were trained to do.

Scope of knowledge: Because LLMs were trained on information from many different fields, they can engage meaningfully with topics in finance, law, medicine, engineering, marketing, and many others. This broad knowledge makes them a useful thinking partner across almost any professional context.

Language processing at scale: Translating, summarising, reformatting, extracting key information, drafting, and editing. Any task where the primary work is transforming or generating language is one where LLMs perform strongly.

Human Judgement Is Irreplaceable

Understanding what AI cannot do is essential for AI fluency:

Critical thinking and judgement: An LLM generates reasonable outputs based on patterns. It does not evaluate whether those outputs are actually correct, appropriate, or wise in your specific context. That evaluation is yours. A model can draft a strategy document. Only you can judge whether that strategy fits your market, your organisation, and the dynamics you understand from being inside it.

Creativity: LLMs can recombine what they have seen before into something that feels new. But the deeper kind of creativity still needs a human: the kind where you notice that everyone is asking the wrong question, take a bold direction, or make a risky decision with real consequences. An LLM can suggest possibilities, but a human still has to understand why it matters and take responsibility for the choice.

Ethical oversight: AI systems generate outputs. They do not have values. They do not understand consequence. They can produce content that is harmful, biased, or misleading without any internal signal that something has gone wrong. The professional working with AI is always the ethical layer, the one responsible for ensuring that what gets produced is appropriate, honest, and aligned with the interests of the people it affects.

Relationship and trust: In finance, customer success, product, and consulting, a lot of the work depends on people and relationships. Clients trust you, not just a model. Teams follow leaders, not algorithms. Knowing how to read a room, handle a difficult conversation, sense when something feels off, and make the right judgement in the moment are still very much human skills.

So the better way to think about it is not AI vs humans 😬

It is this: AI helps with speed, scale, and pattern matching, while humans bring judgement, context, accountability, and relationships.

The Limitations You Must Understand

I want to be direct about these, because understanding them is what allows you to use LLMs safely.

Knowledge cutoff
Every LLM was trained on data up to a specific point in time, called a training cutoff. After that date, the model has no knowledge of what has happened in the world. It does not know about recent regulatory changes, market events, product launches, or news. If you ask an LLM about the current state of anything, you may get an answer that was accurate 14 months ago and is now wrong. Always verify time-sensitive information from current sources.

Hallucination
This is the most important limitation to understand, and the one with the most potential for professional harm. Because LLMs generate text based on patterns rather than verified facts, they sometimes produce information that is entirely made up, but presented with complete confidence. A fake citation. A statistic that does not exist. This is called hallucination, and it does not come with a warning label. The model does not know it is wrong.

Non-deterministic output
Ask the same question twice and you may get a different answer. This is by design: LLMs introduce variability in their outputs to avoid being mechanical and repetitive. But for professionals who need consistency, this matters. A process that relies on an LLM producing the same output every time under the same conditions will eventually be surprised.
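
One common source of this variability is a setting usually called temperature, which controls how much randomness goes into choosing each token. The sketch below assumes a made-up set of scores for three candidate words; `sample_next` and the score values are hypothetical, and real systems may expose additional controls.

```python
import math
import random

def sample_next(scores, temperature, rng):
    """Pick a token: temperature 0 is greedy, higher values add variety."""
    if temperature == 0:
        return max(scores, key=scores.get)
    tokens = list(scores)
    weights = [math.exp(scores[t] / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical scores for the word following "The meeting went".
scores = {"well": 2.0, "badly": 1.5, "long": 1.0}
rng = random.Random()

greedy = {sample_next(scores, 0, rng) for _ in range(20)}
varied = {sample_next(scores, 1.0, rng) for _ in range(200)}
print(greedy)  # only the single top-scoring token
print(varied)  # typically several distinct tokens
```

In this sketch, a temperature of zero always returns the top-scoring word, while a higher temperature lets lower-scoring words through some of the time. That trade-off between consistency and variety is the reason a verification step matters for repeatable processes.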

Context window
Every conversation with an LLM happens within a container called the context window: the total amount of text the model can hold in view at once. Think of it like a desk. The desk has a fixed surface area. You can only work with what fits on the desk at any given moment. If you bring in more documents, some may fall off the edge, and the model simply cannot see them anymore.

In practice, this means very long conversations or very large documents can cause the model to lose track of earlier details. Important instructions given at the start of a long session may be effectively forgotten by the end.
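
The desk analogy can be sketched as a function that keeps only the most recent messages that fit. This is a simplification under stated assumptions: tokens are counted as whitespace-separated words, the window size of 13 is arbitrary, and real applications may summarise or otherwise handle older content rather than simply dropping it.

```python
def fit_to_window(messages, window_size):
    """Keep the newest messages whose combined word count fits the window."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = len(message.split())
        if used + cost > window_size:
            break  # everything older than this falls off the desk
        kept.append(message)
        used += cost
    return list(reversed(kept))

conversation = [
    "Always answer in British English.",       # early instruction, 5 words
    "Here is the first draft of the report.",  # 8 words
    "Please shorten the second paragraph.",    # 5 words
]
visible = fit_to_window(conversation, window_size=13)
print(visible)
```

Here the early instruction is the first thing to drop out of view, which is why restating key instructions in a long session is a sensible habit.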

Incorrect information presented confidently
Related to hallucination but worth separating: even when an LLM is not fabricating something entirely, it can be wrong in more subtle ways. It may give advice appropriate in one context but not yours. The confidence of the output does not correlate with its accuracy. As an experienced professional, your domain knowledge is your most important quality-control mechanism.

The Bottom Line: Why Understanding This Makes You Better at It

You might have noticed that most of the limitations I described have a common thread: they require a human professional to catch them.

A knowledge cutoff requires someone who knows enough to ask, "When was this information current?" A hallucination requires someone with enough domain knowledge to spot that a citation or statistic does not feel right. A context window problem requires someone structuring the interaction thoughtfully. Non-deterministic output requires someone building in a verification step.

LLMs are powerful pattern recognition engines, but they have no ground truth. No internal fact checker. No understanding of consequence. The professionals who will collaborate with AI most effectively are the ones who bring exactly what the system lacks: contextual knowledge, critical evaluation, ethical accountability, and professional judgement.

AI fluency is not about learning to trust the technology. It is about learning precisely when to trust it, when to verify it, and when to override it entirely.

What This Means for You

So given all of this (the mechanism, the strengths, the limitations), what changes about how you approach AI tools?

Three things.

First, you bring your expertise to the interaction. The more context, the more domain knowledge, the more specificity you give the model, the better its outputs become. Vague inputs produce vague outputs. Thoughtful, specific, well-framed inputs produce genuinely useful ones. The quality of what comes out depends on the quality of your thinking.

Second, you maintain your judgement as the final layer. Not because AI is untrustworthy across the board, but because you are the one accountable for what gets used and what gets acted on. Treat AI outputs the way you would treat a smart junior colleague's first draft: worth reading carefully, worth building on, but not ready to go out the door without your review.

Third, you develop the skills to work with it deliberately. Knowing how to frame a request, how to structure a multi-step task, how to verify an output, how to recover when the model goes off track. These are learnable skills, and they make an enormous difference to outcomes.


Part 3 introduces the core competencies that define effective collaboration with AI systems, along with prompting techniques.

The models evolve, and so do the techniques. But the underlying competencies translate across tools and across model generations. Professionals who develop them will not just use AI well today. They will adapt as the technology changes, because they understand what they are actually doing when they collaborate with it.

See you in Part 3 😎


This is Article 2 of the Collaborating with AI Systems series. Missed Part 1? Start here: https://adeolaamisu.hashnode.dev/foundation-for-working-with-ai-intelligently

