Chapter 1
- Foundation model (also called a base model or pre-trained model): a base model created by training (often also called 'pretraining') the to-be LLM architecture on raw, unlabeled text data (trillions of words). It has broad capabilities that can be adapted to specific tasks by fine-tuning it on smaller, domain-specific datasets (see the pretraining sketch after this term list).
- Fine-tuning: adapting the foundation model to perform specific tasks. This is done by taking the pre-trained model and further training it on a smaller, task-specific dataset (often called a domain-specific dataset); in other words, updating the pretrained model's weights with domain-specific data. Two common variants (a classification fine-tuning sketch follows this term list):
  - Instruction fine-tuning
  - Classification fine-tuning
- Transformer
- Encoder-decoder architecture
- Attention mechanism
- Self-attention mechanism
- BERT (Bidirectional Encoder Representations from Transformers)
- Zero-shot learning
- Few-shot learning
- Recurrent neural networks (RNNs)
- Convolutional neural networks (CNNs)
- Tokens and tokenization in GPTs
- Self-supervised learning
- Autoregressive model
- Emergent behavior
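
A minimal, hypothetical sketch (not from the chapter) of the pretraining objective behind a foundation model: next-token prediction on raw, unlabeled text, written in PyTorch. The tiny model, the GPT-2-like sizes, and the random token IDs are placeholders standing in for a real LLM architecture and corpus.

```python
# Hypothetical sketch of pretraining: next-token prediction on unlabeled text.
# Model, sizes, and data are placeholders, not a real LLM or corpus.
import torch
import torch.nn as nn

vocab_size, embed_dim = 50_257, 768          # GPT-2-like sizes (assumption)

class TinyCausalLM(nn.Module):               # stand-in for the full LLM architecture
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lm_head = nn.Linear(embed_dim, vocab_size)  # transformer blocks omitted

    def forward(self, token_ids):
        return self.lm_head(self.embed(token_ids))       # (batch, seq, vocab) logits

model = TinyCausalLM()
token_ids = torch.randint(0, vocab_size, (8, 128))        # a batch of tokenized raw text

# The "label" for each position is simply the next token in the same text,
# so no manual annotation is needed (this is the self-supervised part).
logits = model(token_ids[:, :-1])
targets = token_ids[:, 1:]
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                           # pretraining updates all weights
```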
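And a minimal, hypothetical sketch of classification fine-tuning under the same assumptions: the "pretrained" backbone below is a stand-in for a real foundation model, and the random token IDs and labels stand in for a small labeled, domain-specific dataset (e.g. spam vs. not spam). The point is that fine-tuning attaches a small task head and keeps updating the pretrained weights, just on task-specific data.

```python
# Hypothetical sketch of classification fine-tuning: start from pretrained
# weights, attach a new classification head, train on a small labeled dataset.
# Names, sizes, and data are placeholders, not the book's code.
import torch
import torch.nn as nn

embed_dim, num_classes = 768, 2

# Pretend this backbone already holds pretrained (foundation-model) weights.
pretrained_backbone = nn.Sequential(
    nn.Embedding(50_257, embed_dim),          # transformer blocks omitted for brevity
)
classifier_head = nn.Linear(embed_dim, num_classes)   # new, randomly initialized head

optimizer = torch.optim.AdamW(
    list(pretrained_backbone.parameters()) + list(classifier_head.parameters()),
    lr=5e-5,                                  # small learning rate, typical for fine-tuning
)

# A tiny labeled "domain-specific dataset": token IDs plus class labels.
token_ids = torch.randint(0, 50_257, (16, 64))
labels = torch.randint(0, num_classes, (16,))

for _ in range(3):                            # a few passes over the small dataset
    hidden = pretrained_backbone(token_ids)   # (batch, seq, embed_dim)
    logits = classifier_head(hidden[:, -1])   # classify from the last token's representation
    loss = nn.functional.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                          # fine-tuning = updating the pretrained weights
```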