01. Prompting and structured output
Instruction design, role prompting, schema-constrained responses, and prompt debugging.
Two-week intensive curriculum
This guide is built for an intensive schedule of 12 to 14 hours of study per day over two weeks. Each subject page is intentionally dense: it gives you a beginner track, an advanced engineering track, practical examples, Python code, diagrams, and a concrete to-do list covering what to learn, practice, and build before moving on.
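Since this page's own topic is schema-constrained responses, here is a minimal sketch of the idea in plain Python: parse a model's reply as JSON and check it against an expected key-to-type schema. The reply string, schema, and helper name are illustrative assumptions, not a fixed API.

```python
import json

# hypothetical model reply; in practice this string comes from an LLM call
reply = '{"sentiment": "positive", "confidence": 0.93}'

SCHEMA = {"sentiment": str, "confidence": float}  # expected keys and types

def parse_constrained(raw, schema):
    """Parse a model reply and check it against a simple key->type schema."""
    data = json.loads(raw)
    for key, typ in schema.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], typ):
            raise ValueError(f"bad type for {key}: {type(data[key]).__name__}")
    return data

parsed = parse_constrained(reply, SCHEMA)
```

Rejecting malformed replies at the boundary like this is the simplest form of prompt debugging: a schema failure tells you exactly which field the model got wrong.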
Instruction design, role prompting, schema-constrained responses, and prompt debugging.
Latency, throughput, observability, guardrails, caching, and cost control in real systems.
When to fine-tune, LoRA-style adapters, data formatting, and evaluation.
Compression, inference efficiency, hardware trade-offs, and quality loss management.
Grounding generation with external knowledge using retrieval pipelines.
Approximate nearest neighbor search, indexing, filtering, and storage design.
Dense representations, similarity, domain adaptation, and practical evaluation.
Splitting, metadata design, and index construction for retrieval quality.
Ranking, lexical retrieval, evaluation, and hybrid search strategies.
Detection, prevention, grounding, and response shaping for higher reliability.
Cross-lingual transfer, multilingual embeddings, and language-specific pitfalls.
Cleaning, standardization, Unicode handling, and pipeline design.
Risk management, governance, privacy, fairness, and safe deployment.
Tokens, corpora, ambiguity, syntax, semantics, and task framing.
Bag-of-words, n-grams, TF-IDF, RNNs, CNNs, Seq2Seq, and Transformers.
BoW, TF-IDF, word embeddings, sentence embeddings, and document embeddings.
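The TF-IDF weighting mentioned in the representation entries above can be sketched in a few lines of plain Python. The toy corpus and the smoothed IDF formula are illustrative choices, not a canonical implementation:

```python
import math
from collections import Counter

def tfidf(corpus):
    """Compute smoothed TF-IDF weights for a list of tokenized documents."""
    n_docs = len(corpus)
    # document frequency: in how many documents each term appears
    df = Counter(term for doc in corpus for term in set(doc))
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log((1 + n_docs) / (1 + df[term]))
            for term, count in tf.items()
        })
    return weights

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
w = tfidf(corpus)
# "the" appears in two of three documents, so it is down-weighted
# relative to "cat", which appears in only one.
```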
WordPiece, BPE, unigram tokenization, and vocabulary construction trade-offs.
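The core BPE loop behind these tokenizers is small enough to sketch: repeatedly find the most frequent adjacent symbol pair and merge it into one symbol. The toy word frequencies below are invented for illustration:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a corpus of symbol sequences."""
    pairs = Counter()
    for word, freq in words.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for word, freq in words.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# toy corpus: each word is a tuple of symbols with its frequency
words = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("n", "e", "w"): 3}
for _ in range(2):
    words = merge_pair(words, most_frequent_pair(words))
```

After two merges, the frequent word "low" has become a single symbol, which is the vocabulary-construction trade-off in miniature: common strings get dedicated tokens, rare ones stay split.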
Word2Vec, GloVe, fastText, analogies, and limitations of static embeddings.
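The analogy arithmetic mentioned here (king − man + woman ≈ queen) can be demonstrated with hand-crafted toy vectors; real embeddings are learned, and these three made-up dimensions exist only to show the mechanics:

```python
import math

# hand-crafted 3-d toy vectors (roughly: gender, royalty, spare axis)
vecs = {
    "king":  [ 1.0, 1.0, 0.0],
    "queen": [-1.0, 1.0, 0.0],
    "man":   [ 1.0, 0.0, 0.0],
    "woman": [-1.0, 0.0, 0.0],
    "apple": [ 0.1, -1.0, 0.3],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# king - man + woman, then find the nearest remaining word by cosine
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]
best = max((word for word in vecs if word not in ("king", "man", "woman")),
           key=lambda word: cosine(target, vecs[word]))
# best == "queen"
```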
Autoregressive modeling, sequence labeling, and temporal dependence.
Recurrent modeling, encoder-decoder systems, and practical weaknesses.
Modern language modeling architecture, scaling, and training mechanics.
Masked language modeling, pretraining objectives, and transfer learning.
Soft attention, self-attention, cross-attention, and computational trade-offs.
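The scaled dot-product attention underlying all of these variants fits in a short sketch: score queries against keys, normalize with softmax, and take a weighted sum of values. The 2-d toy tensors are illustrative only:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(wt * v[j] for wt, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# one query attending over two key/value pairs; the query matches key 0,
# so the output leans toward value 0
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
out = attention(q, k, v)
```

Self-attention is this routine with queries, keys, and values projected from the same sequence; cross-attention takes queries from one sequence and keys/values from another.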
Translation pipelines, alignment, decoding, and multilingual evaluation.
BLEU, ROUGE, perplexity, WER, accuracy, precision, recall, and F1.
Attention inspection, probing, attribution, and limits of mechanistic claims.
How performance changes with data, parameters, and compute budgets.
Capabilities, limitations, training pipeline, deployment patterns, and future directions.
What each metric measures, when accuracy misleads, the precision-recall trade-off, F-beta, and multi-class averaging.
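The precision-recall trade-off described here is easy to make concrete in plain Python. The imbalanced toy labels below are invented to show a case where accuracy looks fine while recall exposes the misses:

```python
def prf(y_true, y_pred, positive=1):
    """Binary precision, recall, and F1 from parallel label lists."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 4 positives, 6 negatives; the model finds only half the positives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p, r, f = prf(y_true, y_pred)
# accuracy is 0.8, yet recall is only 0.5: half the positives were missed
```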
What activation functions are, ReLU, sigmoid, and softmax explained, and how backpropagation trains a network using the chain rule.
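The activations named here, plus one chain-rule update for a single neuron, can be sketched as follows; the learning rate, initial weight, and 100-step budget are arbitrary illustrative choices:

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# one chain-rule step for a single sigmoid neuron with squared error:
# loss = (sigmoid(w*x) - y)^2, so dloss/dw = 2*(a - y) * a*(1 - a) * x
w, x, y, lr = 0.5, 1.0, 1.0, 0.1
for _ in range(100):
    a = sigmoid(w * x)
    grad = 2 * (a - y) * a * (1 - a) * x
    w -= lr * grad
# after training, sigmoid(w * x) has moved toward the target y = 1.0
```

This is backpropagation in its smallest form: multiply local derivatives along the path from loss to weight, then step against the gradient.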
Personal study notes: LLM fine-tuning, CNNs, RNNs, Transformers, attention, hyperparameters, KV cache, and more.
50 multiple-choice questions covering neural networks, attention, embeddings, training, and interview-focused topics.
MSE, MAE, Huber, binary cross-entropy, categorical cross-entropy, hinge, KL divergence, contrastive, and triplet loss.
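Two of these losses are worth contrasting directly. A minimal sketch (the toy predictions are invented) shows why binary cross-entropy is preferred for classification: it punishes confident wrong answers far more harshly than MSE does:

```python
import math

def mse(y_true, y_pred):
    """Mean squared error over parallel lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean BCE; eps guards against log(0) on extreme predictions."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for t, p in zip(y_true, y_pred)) / len(y_true)

# confident and correct vs. confident and wrong
good = binary_cross_entropy([1, 0], [0.9, 0.1])   # small loss
bad = binary_cross_entropy([1, 0], [0.1, 0.9])    # much larger loss
```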
Neurons, activation functions, backpropagation, CNNs, ResNet, transfer learning, Transformers, BERT, GPT, and generative models.