The best way to learn about LLMs is to read the actual papers that lay out the fundamental ideas behind them. I'd probably start off by learning about the attention mechanism, which is detailed in the following paper, and try to implement a vanilla transformer (there's a minimal attention sketch at the end of this post): https://lnkd.in/eV6NcXx8

Once you have that down, I would aim to learn about earlier models and try to implement them. Here are a couple of models that are good examples to implement from scratch (Karpathy and Umar have good videos for them if you get stuck or just want a general overview):
✓ BERT: https://lnkd.in/eU7F324a
✓ GPT: https://lnkd.in/eAzaDsP5
✓ GPT-2: https://lnkd.in/ehjhXveV

After you've gone through all of these, you should have a decent foundation in the basics and can start reading papers in the areas you are interested in. If you don't know where to start, here are some papers I think are interesting and worth reading:
✓ Scaling laws: https://lnkd.in/gdWbBH8i
✓ LoRA: https://lnkd.in/eW-V4Dcq
✓ Mixture of experts: https://lnkd.in/eM5ngGSj
✓ Reinforcement Learning from Human Feedback: https://rlhfbook.com/
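
If you want a feel for the core building block before tackling a full transformer, here is a minimal sketch of scaled dot-product attention (the operation at the heart of the "Attention Is All You Need" paper). This is my own illustration using PyTorch; the function name, tensor shapes, and example values are assumptions for demonstration, not code from the paper.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    d_k = q.size(-1)
    # Similarity between each query and every key, scaled by sqrt(d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Hide masked positions (e.g. padding or future tokens) before softmax
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Attention weights sum to 1 over the key dimension
    weights = torch.softmax(scores, dim=-1)
    # Each output token is a weighted sum of the value vectors
    return weights @ v

# Example: 1 batch, 2 heads, 4 tokens, 8-dim heads (arbitrary toy sizes)
q = k = v = torch.randn(1, 2, 4, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 2, 4, 8])
```

A full transformer block adds the learned Q/K/V projections, multi-head splitting, residual connections, layer norm, and a feed-forward network around this function, so implementing those pieces yourself is a good way to check your understanding of the paper.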