LLM Notes: From Transformer to RAG
Key takeaways and open questions from self-studying large language models.
The LLM learning curve is steep, but the core modules are separable.
Current Framework
- Transformer: self-attention as foundation
- Pretrain + fine-tune: pretraining supplies general ability; fine-tuning adapts it to specific tasks
- RAG: external knowledge when parametric memory isn’t enough
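To make the first bullet concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It assumes no masking, no multi-head split, and no learned parameters: the weight matrices `w_q`, `w_k`, `w_v` are hypothetical identity matrices chosen only so the numbers are easy to follow. Real implementations use tensor libraries and trained weights.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention: softmax(Q K^T / sqrt(d_k)) V.

    x is (seq_len, d_model); w_q, w_k, w_v are (d_model, d_k).
    Returns the attended output and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: three tokens with two features each (illustrative values only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical stand-in for learned weights
output, weights = self_attention(x, identity, identity, identity)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and says how much one token draws from every other token; the output for that token is the corresponding weighted mix of value vectors.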
Open Questions
- As context windows grow, where does RAG’s boundary lie?
- Can small models + good tooling approximate large-model UX?
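The first question can be made concrete with a toy retriever. The sketch below is an assumption-heavy illustration, not a production RAG pipeline: it scores passages by bag-of-words cosine similarity instead of learned embeddings, counts words as a rough proxy for tokens, and uses hypothetical names (`retrieve`, `build_prompt`, `token_budget`, `PASSAGES`). With a small budget only the most relevant passage fits; with a large enough budget everything fits and retrieval stops filtering anything.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word split; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, token_budget):
    """Rank passages by similarity to the query, then keep those that fit the budget."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    chosen, used = [], 0
    for passage in ranked:
        cost = len(tokenize(passage))
        if used + cost > token_budget:
            continue  # does not fit in the remaining context
        chosen.append(passage)
        used += cost
    return chosen


def build_prompt(query, context):
    """Prepend the retrieved passages to the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


# Illustrative corpus, not taken from any real dataset.
PASSAGES = [
    "Self-attention lets every token weigh every other token in the sequence.",
    "Fine-tuning adapts a pretrained model to a narrower task.",
    "Retrieval-augmented generation fetches external passages and adds them to the prompt.",
]

query = "How does retrieval-augmented generation add external knowledge?"
context = retrieve(query, PASSAGES, token_budget=12)
prompt = build_prompt(query, context)
print(prompt)
```

Raising `token_budget` until every passage fits mimics the long-context case, which is one way to frame where retrieval still earns its keep: when the corpus outgrows the window, or when irrelevant context costs more than it helps.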
More detailed notes to follow.