
Rahul Agarwal

Founder | Agentic AI... • 1m

How can modern AI systems stop giving wrong answers? I've explained 4 guardrails in simple steps below.

1) 𝗦𝗮𝗳𝗲𝘁𝘆 𝗖𝗹𝗮𝘀𝘀𝗶𝗳𝗶𝗲𝗿
Purpose: detect dangerous, illegal, or policy-breaking content.
1. 𝗥𝗲𝗰𝗲𝗶𝘃𝗲 𝘁𝗵𝗲 𝘁𝗲𝘅𝘁 (the user's input or the model's draft).
2. 𝗡𝗼𝗿𝗺𝗮𝗹𝗶𝘇𝗲 𝗶𝘁: convert to a standard form (lowercase, remove weird spacing) so checks are reliable.
3. 𝗥𝘂𝗻 𝘁𝗵𝗲 𝘀𝗮𝗳𝗲𝘁𝘆 𝗺𝗼𝗱𝗲𝗹: a classifier labels the text as safe, risky, or unknown.
4. 𝗦𝗽𝗼𝘁 𝗷𝗮𝗶𝗹𝗯𝗿𝗲𝗮𝗸𝘀 𝗼𝗿 𝘁𝗿𝗶𝗰𝗸𝘀: look for attempts to bypass safety by hiding instructions.
5. 𝗦𝗰𝗼𝗿𝗲 𝘁𝗵𝗲 𝗿𝗶𝘀𝗸: low, medium, or high.
6. 𝗧𝗮𝗸𝗲 𝗮𝗰𝘁𝗶𝗼𝗻:
• Low risk → allow.
• Medium risk → modify the reply (offer a safe alternative) or add guidance.
• High risk → block the reply and send a safe refusal message.
7. 𝗡𝗼𝘁𝗶𝗳𝘆 𝘁𝗵𝗲 𝘀𝘆𝘀𝘁𝗲𝗺 & 𝗹𝗼𝗴 the incident for review.
8. 𝗔𝗱𝗷𝘂𝘀𝘁 𝘀𝗮𝗳𝗲𝘁𝘆 𝗿𝘂𝗹𝗲𝘀 if needed.

2) 𝗣𝗜𝗜 𝗙𝗶𝗹𝘁𝗲𝗿
Purpose: prevent sharing personal or private information.
1. 𝗧𝗮𝗸𝗲 𝘁𝗵𝗲 𝗺𝗼𝗱𝗲𝗹’𝘀 𝗼𝘂𝘁𝗽𝘂𝘁 (what it plans to say).
2. 𝗧𝗼𝗸𝗲𝗻𝗶𝘇𝗲 / 𝗯𝗿𝗲𝗮𝗸 𝗶𝗻𝘁𝗼 𝗽𝗶𝗲𝗰𝗲𝘀 (words, phrases).
3. 𝗖𝗼𝗺𝗽𝗮𝗿𝗲 𝘁𝗼𝗸𝗲𝗻𝘀 𝘁𝗼 𝗣𝗜𝗜 𝗽𝗮𝘁𝘁𝗲𝗿𝗻𝘀 (names, phone numbers, emails, SSNs).
4. 𝗔𝗽𝗽𝗹𝘆 𝗽𝗮𝘁𝘁𝗲𝗿𝗻 𝗿𝘂𝗹𝗲𝘀 (𝗿𝗲𝗴𝗲𝘅) to catch structured formats such as phone numbers and emails.
5. 𝗖𝗿𝗼𝘀𝘀-𝗰𝗵𝗲𝗰𝗸 𝘄𝗶𝘁𝗵 𝘀𝗲𝗰𝘂𝗿𝗲 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀 (if allowed) to avoid leaking real records.
6. 𝗜𝗳 𝗣𝗜𝗜 𝗳𝗼𝘂𝗻𝗱 → mask, redact, or replace the sensitive part (e.g., keep only the last four digits, "***-**-1234"), or refuse.
7. 𝗟𝗼𝗴 𝘁𝗵𝗲 𝗲𝘃𝗲𝗻𝘁 and update the PII rules if a new pattern is found.

3) 𝗥𝘂𝗹𝗲𝘀-𝗕𝗮𝘀𝗲𝗱 𝗣𝗿𝗼𝘁𝗲𝗰𝘁𝗶𝗼𝗻𝘀
Purpose: enforce hard business rules, legal limits, or customer policies.
1. 𝗜𝗻𝘀𝗽𝗲𝗰𝘁 𝘁𝗵𝗲 𝗿𝗲𝗾𝘂𝗲𝘀𝘁 against a list of banned words, commands, or limits.
2. 𝗥𝘂𝗻 𝗿𝗲𝗴𝗲𝘅 𝗼𝗿 𝗽𝗮𝘁𝘁𝗲𝗿𝗻 𝘀𝗰𝗮𝗻𝘀 for forbidden patterns (like SQL in a text field).
3. 𝗘𝗻𝗳𝗼𝗿𝗰𝗲 𝘂𝘀𝗮𝗴𝗲 𝗹𝗶𝗺𝗶𝘁𝘀 (e.g., prohibit oversized file attachments).
4. 𝗜𝗳 𝗮 𝗿𝘂𝗹𝗲 𝗶𝘀 𝗯𝗿𝗼𝗸𝗲𝗻 → deny the action and return a specific message explaining why.
5. 𝗥𝗲𝗰𝗼𝗿𝗱 𝘁𝗵𝗲 𝗮𝘁𝘁𝗲𝗺𝗽𝘁 and notify reviewers if needed.
6. 𝗥𝗲𝗳𝗶𝗻𝗲 𝘁𝗵𝗲 𝗿𝘂𝗹𝗲 𝗹𝗶𝘀𝘁 when new cases appear.

4) 𝗠𝗼𝗱𝗲𝗿𝗮𝘁𝗶𝗼𝗻
Purpose: detect and handle abusive, hateful, or toxic content.
1. 𝗖𝗼𝗹𝗹𝗲𝗰𝘁 𝘁𝗵𝗲 𝗶𝗻𝗽𝘂𝘁 𝗼𝗿 𝗺𝗼𝗱𝗲𝗹 𝗼𝘂𝘁𝗽𝘂𝘁.
2. 𝗖𝗹𝗲𝗮𝗻 𝗮𝗻𝗱 𝗽𝗿𝗲𝗽𝗿𝗼𝗰𝗲𝘀𝘀 (remove emojis, normalize language).
3. 𝗥𝘂𝗻 𝗺𝗼𝗱𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹𝘀 to detect hate speech, harassment, sexual content, self-harm, etc.
4. 𝗦𝗰𝗼𝗿𝗲 𝘀𝗲𝘃𝗲𝗿𝗶𝘁𝘆 (mild or severe).
5. 𝗧𝗮𝗸𝗲 𝗮𝗰𝘁𝗶𝗼𝗻:
• Mild → warn the user or sanitize the content.
• Severe → block, then escalate to human review or point the user to emergency resources.
6. 𝗟𝗼𝗴 𝗳𝗹𝗮𝗴𝗴𝗲𝗱 𝗰𝗼𝗻𝘁𝗲𝗻𝘁 for trend analysis and to improve the moderation model.
7. 𝗥𝗲𝘁𝗿𝗮𝗶𝗻 𝗼𝗿 𝘁𝘂𝗻𝗲 the moderation model using confirmed examples.

✅ Repost for others in your network who can benefit from this.
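
The regex and rule steps in the guardrails above can be sketched in a few lines of Python. This is only a minimal illustration, not a production filter: the patterns, labels, and function names (`normalize`, `redact_pii`, `violates_rules`, `action_for_risk`) are all hypothetical. A real system would likely use a trained classifier for the risk score and far broader PII coverage.

```python
import re

# Hypothetical PII patterns for illustration; real coverage must be much broader.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

# Hypothetical banned pattern: SQL fragments in a free-text field.
BANNED_PATTERNS = [
    re.compile(r"\b(?:drop\s+table|delete\s+from|union\s+select)\b"),
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so later checks are reliable."""
    return re.sub(r"\s+", " ", text).strip().lower()


def redact_pii(text: str) -> tuple[str, list[str]]:
    """Replace each PII match with a [LABEL] placeholder.

    Returns the redacted text and the labels that were found,
    so the caller can log the event.
    """
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found


def violates_rules(text: str) -> bool:
    """Return True if the normalized text matches any banned pattern."""
    cleaned = normalize(text)
    return any(pattern.search(cleaned) for pattern in BANNED_PATTERNS)


def action_for_risk(risk: str) -> str:
    """Map a risk level to an action; unrecognized levels are blocked.

    Blocking by default is an assumption of this sketch, not part of the post.
    """
    return {"low": "allow", "medium": "modify", "high": "block"}.get(risk, "block")


if __name__ == "__main__":
    draft = "Mail jane.doe@example.com or call 555-123-4567."
    print(redact_pii(draft))
    print(violates_rules("Please DROP   TABLE users"))
    print(action_for_risk("medium"))
```

Treating an unrecognized risk level as "block" is one possible design choice; a system that prefers availability over caution might route those cases to human review instead.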

