
Bhoop singh Gurjar

AI Deep Explorer | f... • 5h

Give me 2 minutes and I will tell you how to learn reinforcement learning for LLMs.

A humorous analogy for reinforcement learning uses cake. Much like baking a cake, reinforcement learning involves trial and error to reach a desired outcome, learning from rewards (a delicious cake) and penalties (a burnt cake). In the bigger picture, unsupervised learning is the foundation (the cake itself), supervised learning adds the frosting, and reinforcement learning is the cherry on top, the final touch.

⇛ Most important papers for LLM reinforcement learning
- Asynchronous Methods for Deep Reinforcement Learning (Google DeepMind, 2016) https://lnkd.in/gQUK3xmb
- Deep Reinforcement Learning from Human Preferences (OpenAI and DeepMind, 2017) https://lnkd.in/gf5iPfhJ
- Proximal Policy Optimization Algorithms (OpenAI, 2017) https://lnkd.in/gAG6As-7
- Fine-Tuning Language Models from Human Preferences (OpenAI, 2019) https://lnkd.in/gsfxReUg
- Learning to Summarize from Human Feedback (OpenAI, 2020) https://lnkd.in/grUG-XHU
- Direct Preference Optimization (Stanford University, 2023) https://lnkd.in/gTKSQnCN
- Group Relative Policy Optimization (DeepSeek, 2024) https://lnkd.in/gkNRn5sh
- Reinforcement Learning with Verifiable Rewards (DeepSeek, 2025) https://lnkd.in/gcksvi-v

⫸ Books for reinforcement learning
- Reinforcement Learning from Human Feedback (Nathan Lambert) https://lnkd.in/gJW4JmiS
- Reinforcement Learning: Industrial Applications of Intelligent Agents (Phil Winder) https://amzn.to/4iufoQz
- Reinforcement Learning: An Introduction (Richard S. Sutton and Andrew G. Barto) https://amzn.to/4jf0SNv

Keep exploring, keep growing, and always give back!
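If you want to see the cake analogy in code, here is a minimal sketch of trial-and-error learning from rewards and penalties. It is a simple epsilon-greedy bandit, not any of the methods from the papers above. The `bake` function, the oven temperatures, and the success probabilities are all made up for illustration.

```python
import random


def bake(temperature_c, rng):
    """Hypothetical environment: +1.0 for a delicious cake, -1.0 for a burnt one."""
    # Assumed success probabilities per oven temperature (illustrative only).
    success_prob = {160: 0.3, 180: 0.8, 220: 0.1}[temperature_c]
    return 1.0 if rng.random() < success_prob else -1.0


def learn_to_bake(episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a fixed set of oven temperatures."""
    rng = random.Random(seed)
    actions = [160, 180, 220]
    value = {a: 0.0 for a in actions}  # running average reward per action
    count = {a: 0 for a in actions}    # how often each action was tried

    for _ in range(episodes):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=value.get)

        reward = bake(action, rng)
        count[action] += 1
        # Incremental mean update: V <- V + (r - V) / n
        value[action] += (reward - value[action]) / count[action]

    return value, count


if __name__ == "__main__":
    value, count = learn_to_bake()
    print("estimated value per temperature:", value)
    print("preferred temperature:", max(value, key=value.get))
```

The agent never sees the success probabilities. It only sees the reward after each bake, which is the whole point of the analogy: try, get feedback, adjust.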

