
Bhoop singh Gurjar

AI Deep Explorer | f... • 20d

Give me 2 minutes and I will tell you how to learn reinforcement learning for LLMs.

A humorous analogy for machine learning uses cake. Reinforcement learning, much like learning to bake, works by trial and error: you learn from rewards (a delicious cake) and penalties (a burnt cake) until you reliably get the outcome you want. In the wider picture, unsupervised learning is the foundation (the cake itself), supervised learning adds the frosting, and reinforcement learning is the cherry on top, the final touch.

⇛ Most important papers for LLM reinforcement learning
- Asynchronous Methods for Deep Reinforcement Learning (Google DeepMind, 2016) https://lnkd.in/gQUK3xmb
- Deep Reinforcement Learning from Human Preferences (OpenAI, 2017) https://lnkd.in/gf5iPfhJ
- Proximal Policy Optimization Algorithms (OpenAI, 2017) https://lnkd.in/gAG6As-7
- Fine-Tuning Language Models from Human Preferences (OpenAI, 2019) https://lnkd.in/gsfxReUg
- Learning to Summarize from Human Feedback (OpenAI, 2020) https://lnkd.in/grUG-XHU
- Direct Preference Optimization (Stanford University, 2023) https://lnkd.in/gTKSQnCN
- Group Relative Policy Optimization (DeepSeek, 2024) https://lnkd.in/gkNRn5sh
- Reinforcement Learning with Verifiable Rewards (DeepSeek, 2025) https://lnkd.in/gcksvi-v

⫸ Books for reinforcement learning
- Reinforcement Learning from Human Feedback (Nathan Lambert) https://lnkd.in/gJW4JmiS
- Reinforcement Learning: Industrial Applications of Intelligent Agents (Phil Winder) https://amzn.to/4iufoQz
- Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto) https://amzn.to/4jf0SNv

Keep exploring, keep growing, and always give back!
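
To make the cake analogy concrete, here is a minimal sketch of trial-and-error learning from rewards and penalties, written as an epsilon-greedy bandit. Everything in it is an assumption made for illustration and comes from none of the papers or books above: the `bake` function, the three oven temperatures, and their success probabilities are invented.

```python
import random


def bake(temperature_c, rng):
    """Hypothetical environment: +1 for a delicious cake, -1 for a ruined one."""
    # Assumed (made-up) chance of a good cake at each oven temperature.
    success_prob = {150: 0.2, 180: 0.8, 220: 0.1}
    return 1.0 if rng.random() < success_prob[temperature_c] else -1.0


def learn_to_bake(episodes=2000, epsilon=0.1, seed=0):
    """Estimate the average reward of each temperature by trial and error."""
    rng = random.Random(seed)
    temperatures = [150, 180, 220]
    value = {t: 0.0 for t in temperatures}  # running average reward per action
    count = {t: 0 for t in temperatures}    # how often each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: try a random temperature.
            t = rng.choice(temperatures)
        else:
            # Exploit: use the temperature with the best estimate so far.
            t = max(temperatures, key=lambda a: value[a])
        reward = bake(t, rng)
        count[t] += 1
        # Incremental mean update of the estimated reward.
        value[t] += (reward - value[t]) / count[t]
    return value


if __name__ == "__main__":
    estimates = learn_to_bake()
    print("Best temperature so far:", max(estimates, key=estimates.get))
```

The baker here never sees the success probabilities; it only sees whether each cake turned out well, which is the same kind of feedback signal that the reward-based methods in the list above build on in far more elaborate ways.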

