Chapter 1

  • Foundation model (also called a base model or pre-trained model) A model created by training (often called "pretraining") the to-be LLM architecture on raw, unlabeled text data (trillions of words). It has broad capabilities that can be adapted to specific tasks by fine-tuning it on smaller, domain-specific datasets (see the pretraining sketch after this list).

  • Fine-tuning Adapting the foundation model to specific tasks. Done by taking the pre-trained model and further training it on smaller, task-specific datasets (often called domain-specific datasets); in other words, updating the pretrained model's weights with domain-specific data (see the fine-tuning sketch after this list).

    • Instruction fine-tuning
    • Classification fine-tuning
  • Transformer

  • Encoder-decoder architecture

  • Attention mechanism

  • Self-attention mechanism

  • BERT (Bidirectional Encoder Representations from Transformers)

  • Zero-shot learning

  • Few-shot learning

  • Recurrent neural networks (RNNs)

  • Convolutional neural networks (CNNs)

  • Tokens and tokenization in GPTs

  • Self-supervised learning

  • Autoregressive model

  • Emergent behavior
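
The foundation-model entry above describes pretraining on raw, unlabeled text. As a minimal illustration of the idea (not the actual setup used for any real LLM), the sketch below shows the self-supervised next-token-prediction objective, assuming PyTorch; the tiny embedding-plus-linear model, the vocabulary size, and the random token IDs are hypothetical stand-ins for a real transformer and a real text corpus.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, context_len = 1000, 64, 16

class TinyLanguageModel(nn.Module):
    """Placeholder for a GPT-like transformer: an embedding plus a linear output layer."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lm_head = nn.Linear(embed_dim, vocab_size)  # scores for the next token

    def forward(self, token_ids):
        return self.lm_head(self.embed(token_ids))       # (batch, seq_len, vocab_size)

model = TinyLanguageModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Self-supervised: unlabeled text provides its own targets, the input shifted by one token.
token_ids = torch.randint(0, vocab_size, (8, context_len + 1))  # stand-in for tokenized text
inputs, targets = token_ids[:, :-1], token_ids[:, 1:]

optimizer.zero_grad()
logits = model(inputs)
loss = nn.functional.cross_entropy(logits.flatten(0, 1), targets.flatten())
loss.backward()
optimizer.step()
```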

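The fine-tuning entry above describes updating pretrained weights on a smaller, labeled, domain-specific dataset. Below is a rough sketch of classification fine-tuning under the same assumptions (PyTorch, placeholder components); pretrained_backbone, the two-class labels, and the tiny dataset are hypothetical, and a real workflow would load actual pretrained weights.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, context_len, num_classes = 1000, 64, 16, 2

pretrained_backbone = nn.Embedding(vocab_size, embed_dim)    # stands in for a pretrained LLM
classifier_head = nn.Linear(embed_dim, num_classes)          # new, randomly initialized head

# Fine-tuning updates the pretrained weights as well as the new head on domain-specific data.
params = list(pretrained_backbone.parameters()) + list(classifier_head.parameters())
optimizer = torch.optim.AdamW(params, lr=5e-5)               # smaller learning rate than pretraining

# Hypothetical small labeled dataset: token IDs plus one class label per sequence.
token_ids = torch.randint(0, vocab_size, (32, context_len))
labels = torch.randint(0, num_classes, (32,))

for _ in range(3):                                           # a few passes over the small dataset
    optimizer.zero_grad()
    features = pretrained_backbone(token_ids).mean(dim=1)    # crude pooling over the sequence
    logits = classifier_head(features)
    loss = nn.functional.cross_entropy(logits, labels)
    loss.backward()
    optimizer.step()
```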
