How Large Language Models Actually Work: A Business Leader's Technical Primer
A concise technical explanation of how large language models function — from training data and transformer architecture to why they produce the outputs they do.
A clear explanation of how large language models are built — covering pre-training, supervised fine-tuning, reinforcement learning from human feedback, and alignment techniques.
Large language models are not programmed with rules — they are trained on data. The training process has three distinct phases: pre-training, fine-tuning, and alignment. Each phase shapes a different aspect of the model's behaviour, and understanding the distinction helps leaders assess vendor claims, make sense of capability differences between models, and evaluate when customisation through fine-tuning is genuinely warranted.
Training an LLM is the process of adjusting hundreds of billions of numerical parameters — the model's weights — so that the model produces useful, accurate, and appropriately behaved outputs. This is not programming in the conventional sense: no one writes rules specifying what the model should say. Instead, the model learns statistical associations from enormous quantities of text, then is progressively shaped toward useful and safe behaviour through subsequent training phases.
The three phases are sequential, each building on the previous: pre-training, then fine-tuning, then alignment.
Understanding this training pipeline matters for three practical reasons.
First, it explains model differences. Two models with similar parameter counts can behave very differently based on their training data, fine-tuning approach and alignment technique. Benchmark scores do not fully capture these differences — behaviour in context does.
Second, it frames the fine-tuning decision correctly. Fine-tuning is often proposed as a solution for making a model perform well on domain-specific tasks. But fine-tuning changes the model's weights: it is a training operation requiring labelled data, compute, ongoing maintenance, and re-evaluation after each update. For most business use cases, well-designed retrieval and prompting outperform fine-tuning at a fraction of the cost and operational overhead.
Third, it grounds alignment expectations. A model's safety behaviour, tone, and tendency to follow instructions are not inherent; they are trained. As a result, different deployment configurations, model versions, and providers will behave differently in ways that reflect their respective alignment choices.
Pre-training: The model is trained on a very large dataset, commonly several trillion tokens, using a self-supervised objective: predict the next token given all preceding tokens. No human labels are required, because the text itself supplies the correct answer at every position. The model learns grammar, factual associations, logical patterns, coding conventions, and linguistic structures from this data. Pre-training is computationally intensive: frontier pre-training runs require thousands of specialised processors over weeks or months and cost in the tens to hundreds of millions of dollars.
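To make the next-token objective concrete, here is a minimal sketch in Python. It is a deliberately tiny stand-in, not how a real LLM is built: it uses a bigram count table (each prediction looks only at the single previous word) instead of a transformer, and it splits text on whitespace instead of using a real tokeniser. The function names and the toy corpus are illustrative assumptions. What it does share with real pre-training is the key idea: the training targets come from the text itself, and quality is measured by how much probability the model assigns to the token that actually came next.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token.
    The 'labels' are simply the next tokens in the text: no human annotation."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

def average_loss(counts, tokens):
    """Mean cross-entropy: -log(probability assigned to the true next token).
    Lower is better; this is the kind of quantity pre-training drives down."""
    losses = []
    for prev, nxt in zip(tokens, tokens[1:]):
        p = next_token_probs(counts, prev).get(nxt, 0.0)
        losses.append(-math.log(p) if p > 0 else float("inf"))
    return sum(losses) / len(losses)

# Toy corpus (an assumption for illustration), split on whitespace.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")   # "cat" follows "the" twice, "mat" once
loss = average_loss(model, corpus)
```

In this toy corpus, "the" is followed by "cat" two times out of three, so the model assigns "cat" a probability of two thirds after "the". A real model replaces the count table with billions of learned weights and conditions on the whole preceding context, but the objective has the same shape.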
Supervised fine-tuning (SFT): A pre-trained model is further trained on a curated dataset of (prompt, ideal response) pairs. This teaches the model to follow instructions rather than simply complete text. The dataset is human-generated or human-curated and represents the instruction-following behaviour the developers want the model to exhibit. SFT is much cheaper than pre-training — it uses the same pre-trained weights as a starting point and requires far less compute and data.
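The (prompt, ideal response) pairing can be sketched as a data-preparation step. The example below assumes one common convention, which the description above does not specify: the prompt and response are joined into one sequence, and a mask marks only the response tokens as counting toward the training loss, so the model is graded on its answer rather than on reproducing the question. The whitespace tokeniser, function names, and sample values are hypothetical.

```python
def build_sft_example(prompt, response, tokenize=str.split):
    """Join a (prompt, ideal response) pair into one training sequence.
    loss_mask is 0 for prompt tokens and 1 for response tokens."""
    prompt_tokens = tokenize(prompt)
    response_tokens = tokenize(response)
    tokens = prompt_tokens + response_tokens
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask

def masked_mean(per_token_losses, loss_mask):
    """Average the per-token losses over response positions only."""
    kept = [loss for loss, keep in zip(per_token_losses, loss_mask) if keep]
    return sum(kept) / len(kept)

tokens, mask = build_sft_example(
    "Summarise: sales rose 4%",   # prompt: 4 whitespace tokens
    "Sales grew 4%.",             # ideal response: 3 whitespace tokens
)
# Made-up per-token losses, purely to show that prompt positions are ignored.
example_loss = masked_mean([5.0, 5.0, 5.0, 5.0, 1.0, 2.0, 3.0], mask)
```

Here the four prompt positions are masked out, so the reported loss is the mean of the three response losses (2.0), however badly the model would have predicted the prompt itself.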
Reinforcement learning from human feedback (RLHF): Human evaluators compare pairs of model outputs and indicate which is preferred. These preferences are used to train a reward model, a separate neural network that scores outputs according to human preferences. The language model is then further fine-tuned using reinforcement learning to maximise the reward model's score, producing outputs that humans find more helpful, less harmful and more accurate. Variants of this approach include RLAIF (using AI feedback instead of human feedback) and Constitutional AI (used by Anthropic), which guides the alignment process with an explicit written set of principles.
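The step from "evaluators prefer output A over output B" to "a reward model that scores outputs" is often implemented with a pairwise loss. The sketch below shows one widely used formulation (a Bradley-Terry style loss, the negative log of the sigmoid of the score gap); it is an assumption that any particular vendor uses exactly this form, and the function name and scores are illustrative. The loss is small when the reward model scores the preferred output higher and large when it gets the ordering wrong.

```python
import math

def preference_loss(score_chosen, score_rejected):
    """Pairwise preference loss: -log(sigmoid(score_chosen - score_rejected)).
    Written in a numerically stable softplus form."""
    margin = score_chosen - score_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Reward model agrees with the human preference: low loss.
agrees = preference_loss(score_chosen=2.0, score_rejected=0.0)
# Reward model disagrees with the human preference: high loss.
disagrees = preference_loss(score_chosen=0.0, score_rejected=2.0)
# Reward model cannot tell the two apart: loss equals log(2).
undecided = preference_loss(score_chosen=1.0, score_rejected=1.0)
```

Training the reward model means adjusting its weights to reduce this loss across many human comparisons. The reinforcement learning stage that follows, which uses the trained reward model to update the language model, is not shown here.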
LoRA and parameter-efficient fine-tuning (PEFT): For organisations undertaking their own fine-tuning, full fine-tuning of all parameters is rarely practical. Low-rank adaptation (LoRA) and its quantised variant QLoRA freeze the original weights and train only a small set of added low-rank adapter parameters, making domain-specific fine-tuning computationally accessible. These approaches are widely used in enterprise fine-tuning projects.
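The saving from low-rank adaptation is easiest to see as arithmetic. The sketch below assumes the standard LoRA formulation: a frozen weight matrix W of shape d_out x d_in is left untouched, and two small trainable matrices are added, A (rank x d_in) and B (d_out x rank), so the layer computes W x + (alpha / rank) * B (A x). The layer size, rank, and function names are illustrative choices, and plain Python lists stand in for tensors.

```python
def full_params(d_out, d_in):
    """Trainable parameters when fine-tuning the whole weight matrix."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Trainable parameters with LoRA: only A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen base output plus the scaled low-rank update: W x + (alpha/rank) * B (A x)."""
    rank = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / rank) * u for b, u in zip(base, update)]

# One 4096 x 4096 layer, adapted at rank 8 (sizes chosen for illustration).
full = full_params(4096, 4096)        # 16,777,216
adapter = lora_params(4096, 4096, 8)  # 65,536
fraction = adapter / full             # 0.00390625, i.e. about 0.4%

# Tiny forward pass: B starts at zero, so the adapted layer initially
# behaves exactly like the frozen base layer.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
x = [2.0, 3.0]
unchanged = lora_forward(W, A, [[0.0], [0.0]], x)   # [2.0, 3.0]
adapted = lora_forward(W, A, [[1.0], [0.0]], x)     # [7.0, 3.0]
```

For this single layer the adapter trains roughly 0.4% of the parameters that full fine-tuning would touch, which is the source of the compute and memory savings described above.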
For most Australian mid-market organisations, the relevant decision is not whether to run pre-training — that remains the domain of major AI labs — but whether fine-tuning is warranted for a specific use case, and if so, which fine-tuning approach is appropriate.
The general guidance: fine-tune when the required behaviour cannot be reliably achieved through retrieval-augmented generation and well-structured prompting, and when a high volume of labelled examples of the desired behaviour is available. Fine-tuning for style, tone, or format is often warranted. Fine-tuning to inject factual knowledge is usually a worse choice than RAG, because knowledge fine-tuned into weights becomes stale as the world changes.
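The guidance above can be written down as a small decision helper, which some teams find useful as a checklist. This is a sketch of the reasoning only: the field names, the returned labels, and especially the `min_examples` threshold of 1,000 are hypothetical placeholders, not figures from this article or an industry standard. What counts as a "high volume" of labelled examples depends on the task.

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    goal: str                          # "style", "tone", "format", or "knowledge"
    prompting_and_rag_sufficient: bool # does RAG plus prompting already work reliably?
    labelled_examples: int             # labelled examples of the desired behaviour

def recommend(use_case, min_examples=1000):
    """Apply the general guidance in order. min_examples is a hypothetical threshold."""
    # Facts fine-tuned into weights go stale, so knowledge needs point to retrieval.
    if use_case.goal == "knowledge":
        return "rag"
    # If retrieval and prompting already deliver the behaviour, stop there.
    if use_case.prompting_and_rag_sufficient:
        return "prompting_and_rag"
    # Fine-tuning needs enough labelled examples of the desired behaviour.
    if use_case.labelled_examples < min_examples:
        return "collect_more_data"
    return "fine_tune"

house_style = recommend(UseCase("style", False, 5000))        # "fine_tune"
policy_lookup = recommend(UseCase("knowledge", False, 5000))  # "rag"
small_dataset = recommend(UseCase("tone", False, 50))         # "collect_more_data"
```

The ordering of the checks mirrors the guidance: knowledge problems are routed to retrieval first, fine-tuning is considered only after prompting and retrieval have been shown to fall short, and only then does data volume decide whether fine-tuning is feasible yet.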
Edison AI's AI training programmes include structured guidance on the pre-training / fine-tuning / RAG decision framework, helping teams evaluate options with reference to their specific data, task requirements and operational constraints rather than vendor marketing claims.
Edison AI runs practical AI training that turns this understanding into day-to-day team capability.
Pre-training is the initial phase of training a large language model, in which the model learns to predict the next token across a very large dataset, commonly several trillion tokens drawn from text sources including books, websites, code and documents. Pre-training builds the model's general language capability and world knowledge.
Fine-tuning is a further training phase that adjusts a pre-trained model on a smaller, task-specific dataset to specialise its behaviour — for example, to follow instructions more reliably, to adopt a particular tone, or to perform well on domain-specific tasks. Fine-tuning is appropriate when prompting alone cannot reliably produce the required behaviour, but it requires labelled data, compute budget and careful evaluation.
Reinforcement learning from human feedback (RLHF) is a training technique in which human evaluators compare model outputs, and those preferences are used to train a reward model. The language model is then trained to maximise that reward. RLHF is a primary mechanism for aligning AI behaviour with human preferences: reducing harmful outputs, improving helpfulness, and making models more likely to follow instructions rather than exploit literal interpretations of prompts.
Edison AI helps Australian businesses move from AI curiosity to practical implementation, with workflow design, team training and measurable outcomes. Tell us about your setup and we'll come back with a sequenced plan grounded in the same thinking you just read.
Article: How AI Models Are Trained: Pre-training, Fine-tuning and Alignment Explained