
Rahul Agarwal

Founder | Agentic AI... • 21d

6 Chunking Methods for RAG you should know. I've explained it in a simple, step-by-step way.

What is Chunking?
1. Chunking means splitting large documents into smaller pieces.
2. Helps LLMs search and understand data better.
3. Essential for Retrieval-Augmented Generation (RAG).

Step 1: Semantic Chunking
• Split content based on meaning, not just size.
• Group sentences that talk about the same idea.
• Uses embeddings to detect topic changes.
• Produces high-quality chunks but costs more compute.
Best for: Meaning-heavy content where context matters.

Step 2: Recursive Chunking
• Break text using a hierarchy (paragraphs → sentences → words).
• Ensures chunks stay within token limits.
• Works well for most text documents.
Best for: General-purpose RAG pipelines.

Step 3: Sentence-Level Chunking
• Split text strictly at sentence boundaries.
• Combine multiple sentences into one chunk.
• Preserves natural language flow.
Best for: Articles, blogs, and readable text.

Step 4: Parent–Child Chunking
• Store small chunks for search accuracy.
• Return larger parent chunks for full context.
• Balances precision and completeness.
Best for: Question answering systems.

Step 5: AST-Aware Code Chunking
• Split code using its structure (functions, classes).
• Avoids breaking logical blocks.
• Requires language-specific parsers.
• Keeps code clean and unbroken.
Best for: Codebases and developer tools.

Step 6: Hybrid Chunking Strategy
• Choose chunking method based on content type.
1. Code → AST-aware
2. PDFs → Page-based
3. Text → Recursive or Semantic
• Delivers the highest retrieval accuracy.
Best for: Production-grade AI systems.

✅ In short
• Semantic → splits by meaning
• Recursive → splits by structure (most common)
• Parent–Child → small search, big context
• Sentence-level → simple and natural
• AST-aware → best for code
• Hybrid → smart combination of all

Test with real queries, adjust chunk size, monitor performance, and continuously improve your RAG pipeline.

✅ Repost for others who can benefit from this.
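
To make the recursive, AST-aware, and hybrid steps above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the post author's implementation: the function names (`recursive_chunks`, `ast_chunks`, `chunk_document`), the character-based size limit (standing in for a token limit), the separator list, and the idea that PDF pages arrive as text separated by form-feed characters are all hypothetical choices. The AST-aware part only handles Python source, using the standard-library `ast` module.

```python
import ast

# Separators tried in order, coarsest first: paragraphs, lines, sentences, words.
SEPARATORS = ["\n\n", "\n", ". ", " "]


def recursive_chunks(text, max_chars=200, separators=SEPARATORS):
    """Split on the coarsest separator, recursing into pieces that are too big."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = f"{current}{sep}{piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= max_chars:
            current = piece
        else:
            # A single piece is still too large: go one level finer.
            chunks.extend(recursive_chunks(piece, max_chars, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


def ast_chunks(source):
    """One chunk per top-level function or class (other statements are skipped)."""
    wanted = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    tree = ast.parse(source)
    return [ast.get_source_segment(source, node)
            for node in tree.body if isinstance(node, wanted)]


def chunk_document(content, kind, max_chars=200):
    """Hybrid strategy: pick the chunker from a (hypothetical) content-type label."""
    if kind == "code":
        return ast_chunks(content)
    if kind == "pdf":
        # Assumption: pages were extracted as text separated by form feeds.
        return [page.strip() for page in content.split("\f") if page.strip()]
    return recursive_chunks(content, max_chars)


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n\nclass Greeter:\n    pass\n"
    for chunk in chunk_document(sample, "code"):
        print(repr(chunk))
```

A real pipeline would usually measure chunk size in tokens rather than characters and would need a parser for each programming language it indexes; the routing idea stays the same.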
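
The parent-child idea from Step 4 (search small chunks, return the larger parent) can be sketched in a few lines of Python. This is a toy illustration, assuming paragraphs act as parents and sentences as children; the word-overlap `score` function is only a stand-in for the embedding similarity a real RAG system would use, and all names here are hypothetical.

```python
import re


def build_parent_child_index(document):
    """Return (parents, children); each child remembers the id of its parent."""
    parents = [p.strip() for p in document.split("\n\n") if p.strip()]
    children = []
    for parent_id, parent in enumerate(parents):
        # Children are sentences, split after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", parent):
            if sentence:
                children.append({"parent_id": parent_id, "text": sentence})
    return parents, children


def score(query, text):
    """Toy relevance: count of distinct query words that appear in the text."""
    text_words = set(re.findall(r"\w+", text.lower()))
    query_words = set(re.findall(r"\w+", query.lower()))
    return len(query_words & text_words)


def retrieve_parent(query, parents, children):
    """Match against the small child chunks, then return the full parent chunk."""
    best_child = max(children, key=lambda child: score(query, child["text"]))
    return parents[best_child["parent_id"]]


if __name__ == "__main__":
    doc = (
        "Chunking splits documents into pieces. Small pieces are easier to search.\n\n"
        "Embeddings map text to vectors. Similar vectors mean similar meaning."
    )
    parents, children = build_parent_child_index(doc)
    print(retrieve_parent("what do embeddings map text to", parents, children))
```

The query matches a single sentence, but the caller receives the whole paragraph, which is the precision-versus-completeness balance the post describes.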

