
Rahul Agarwal

Founder | Agentic AI... • 2m

All LLMs are LMs, but not all LMs are LLMs. Most people still get confused, so I've explained it below.

• LMs (Language Models): Models that process and generate human language. They can be small or medium-sized and may not require huge datasets.

• LLMs (Large Language Models): A specific type of LM, but much larger in scale. LLMs like GPT-3 or GPT-4 are trained on massive datasets and have billions (or even trillions) of parameters.

An LLM is a subset of LM that is:
• Very large in size
• Trained on massive datasets
• Based on deep neural networks (Transformers)
• Capable of reasoning, coding, summarizing, etc.

Types of LMs:

By Size / Scale
1. Small LMs • Lightweight, fast, low-cost models with limited capability.
2. Medium LMs • Balanced speed and accuracy, suitable for most production systems.
3. Large LMs • High-capacity models with strong reasoning; powerful but expensive.
______________
By Usage
1. General-purpose LMs • Designed to handle many tasks • Chat, writing, coding, reasoning
2. Domain-specific LMs • Trained or tuned for one field • Legal, finance, medical, etc. • More accurate in narrow domains
3. Edge LMs • Run locally on devices • Privacy-friendly • Limited power due to size
______________
By Training Style
1. Pre-trained • Trained on general internet-scale data • Base intelligence layer
2. Fine-tuned • Adapted for specific tasks or domains • Improves accuracy and usefulness
3. Instruction-tuned • Optimized to follow user instructions • This is what ChatGPT-style models are

Most people only know about LLMs, but it's important to know these fundamentals.

✅ Repost for others so they can also know this fundamental difference.


More like this


AI Engineer

AI Deep Explorer | f... • 10m

LLM Post-Training: A Deep Dive into Reasoning LLMs. This survey paper provides an in-depth examination of post-training methodologies in Large Language Models (LLMs), focusing on improving reasoning capabilities. While LLMs achieve strong performance


Linkrcap Studio

A digital news platf... • 20h

India has no dearth of large language models (LLMs). Yet most AI models struggle with the country itself: 22 languages, multiple scripts, and patchy compute. Sarvam AI wants to fix this with its two indigenous models. But can it turn technical ambiti


Yash K

Avid Learner | In De... • 7d

India's biggest AI drop so far 🇮🇳 Introducing Sarvam-30B and Sarvam-105B, frontier LLMs built from India, for India. Sarvam-30B • 30B parameters (MoE) • 1B active params/token • 32K context window • Trained on 16T tokens • Competitive with Gemma-


AI Engineer

AI Deep Explorer | f... • 11m

"A Survey on Post-Training of Large Language Models" — this paper systematically categorizes post-training into five major paradigms: 1. Fine-Tuning 2. Alignment 3. Reasoning Enhancement 4. Efficiency Optimization 5. Integration & Adaptation 1️⃣ Fin


Rahul Agarwal

Founder | Agentic AI... • 3m

SLM vs LLM: which AI model is best for you? I've explained both in simple steps below. SLM (Small Language Model) (step-by-step) Lightweight AI models built for speed, focus, and on-device execution. 1. Define


Shuvodip Ray

YouTube • 1y

Researchers at Google DeepMind introduced Semantica, an image-conditioned diffusion model capable of generating images based on the semantics of a conditioning image. The paper explores adapting image generative models to different datasets. Instea


Comet

#freelancer • 1y

Text Generation. What It Is: Text generation involves using AI models to create human-like text based on input prompts. How It Works: Models like GPT-3 use Transformer architectures. They're pre-trained on vast text datasets to learn grammar, conte
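The generation loop this post describes (predict the next token, append it, repeat) can be sketched with a toy bigram counter standing in for a real Transformer. The tiny corpus and greedy decoding below are illustrative assumptions, not how GPT-3 is actually trained:

```python
from collections import defaultdict

def train_bigrams(corpus: str):
    """Count word-to-next-word transitions from a tiny corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start: str, max_tokens: int = 5) -> str:
    """Greedy autoregressive decoding: always pick the most
    frequent next word, exactly like argmax over model logits."""
    out = [start]
    for _ in range(max_tokens):
        nxt = counts.get(out[-1])
        if not nxt:  # no known continuation: stop early
            break
        out.append(max(nxt, key=nxt.get))
    return " ".join(out)

model = train_bigrams("the cat sat on the mat the cat ran")
print(generate(model, "the"))  # → the cat sat on the cat
```

A real LLM replaces the bigram table with a Transformer that conditions on the whole context window, and replaces greedy argmax with sampling, but the outer loop is the same.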


Rahul Agarwal

Founder | Agentic AI... • 1m

Most people overlook these basics of AI Agents. I've explained them in a very simple way below. 1. AI Agent: An AI system that observes its environment, gathers information, makes decisions, and takes actions to achieve a goal. 2. LLMs (Large


Yogesh Dubey

Hey I am on Medial • 1y

Weekly AI Roundup: Cost-Efficient Models, Advances in Robotics & Cutting-Edge AI Tools. OpenAI Unveils GPT-4o Mini: OpenAI's GPT-4o mini is a cost-efficient model aimed at expanding AI accessibility, offering a significant price reduction compared

