Rahul Agarwal

Founder | Agentic AI... • 10d

How do Voice, Coding & Computer Agents work? I've explained each one in a very simple way below.

1. 𝗩𝗼𝗶𝗰𝗲 𝗔𝗴𝗲𝗻𝘁𝘀
AI systems that talk with people using speech.
Examples: Vapi, Retell AI, OpenAI TTS etc.

𝗦𝘁𝗲𝗽𝘀:
1. 𝗨𝘀𝗲𝗿 𝘀𝗽𝗲𝗮𝗸𝘀 𝗮 𝗾𝘂𝗲𝗿𝘆 → Example: “What’s the stock price of Nvidia today?”
2. 𝗧𝗲𝗹𝗲𝗽𝗵𝗼𝗻𝘆 𝘀𝘆𝘀𝘁𝗲𝗺 captures the voice.
3. 𝗦𝗧𝗧 (𝗦𝗽𝗲𝗲𝗰𝗵-𝘁𝗼-𝗧𝗲𝘅𝘁) converts the spoken words into text.
4. 𝗔𝗴𝗲𝗻𝘁 𝗽𝗿𝗼𝗰𝗲𝘀𝘀𝗲𝘀 𝗶𝘁 using:
• 𝗘𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴 𝗠𝗼𝗱𝗲𝗹 & 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹 𝗔𝗣𝗜 (finds relevant knowledge).
• 𝗩𝗲𝗰𝘁𝗼𝗿 𝗗𝗕 (stores knowledge for quick search).
5. 𝗔𝗴𝗲𝗻𝘁 𝗱𝗲𝗰𝗶𝗱𝗲𝘀 𝗮𝗻 𝗮𝗻𝘀𝘄𝗲𝗿 using tools.
6. 𝗧𝗧𝗦 (𝗧𝗲𝘅𝘁-𝘁𝗼-𝗦𝗽𝗲𝗲𝗰𝗵) converts the answer text back into speech.
7. 𝗢𝘂𝘁𝗽𝘂𝘁 is spoken back to the user.

𝗜𝗻 𝘀𝗵𝗼𝗿𝘁: Voice → Text → Keywords → Search → Answer → Voice Back.
_____________________________________________

2. 𝗖𝗼𝗱𝗶𝗻𝗴 𝗔𝗴𝗲𝗻𝘁𝘀
AI systems that help developers write, debug, and test code faster.
Examples: GitHub Copilot, Cursor AI etc.

𝗦𝘁𝗲𝗽𝘀:
1. 𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗲𝗿 𝗴𝗶𝘃𝗲𝘀 𝗮 𝗾𝘂𝗲𝗿𝘆 → Example: “Write a function to sort numbers.”
2. 𝗔𝗴𝗲𝗻𝘁 receives the request.
3. Agent uses:
• 𝗖𝗼𝗱𝗲 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗼𝗿 → writes new code.
• 𝗖𝗼𝗱𝗲 𝗗𝗲𝗯𝘂𝗴𝗴𝗲𝗿 → finds and fixes errors.
• 𝗧𝗲𝘀𝘁 𝗥𝘂𝗻𝗻𝗲𝗿 → checks if the code works.
4. 𝗘𝗻𝘃𝗶𝗿𝗼𝗻𝗺𝗲𝗻𝘁 provides a workspace where the code runs.
5. 𝗢𝘂𝘁𝗽𝘂𝘁 → working code returned to the developer.

𝗜𝗻 𝘀𝗵𝗼𝗿𝘁: Query → Keywords → AI writes code → Debug + Test → Final Code.
_____________________________________________

3. 𝗖𝗨𝗔 (𝗖𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗨𝘀𝗶𝗻𝗴 𝗔𝗴𝗲𝗻𝘁𝘀)
AI systems that 𝘂𝘀𝗲 𝗮 𝗰𝗼𝗺𝗽𝘂𝘁𝗲𝗿 𝗹𝗶𝗸𝗲 𝗮 𝗵𝘂𝗺𝗮𝗻 𝘄𝗼𝘂𝗹𝗱 (e.g., clicking, typing, browsing).
Examples: Manus AI, Saner AI etc.

𝗦𝘁𝗲𝗽𝘀:
1. 𝗨𝘀𝗲𝗿 𝗴𝗶𝘃𝗲𝘀 𝗮 𝗾𝘂𝗲𝗿𝘆 → Example: “Book the cheapest flight from Hong Kong to Paris.”
2. 𝗩𝗶𝘀𝗶𝗼𝗻 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗠𝗼𝗱𝗲𝗹 (𝗩𝗟𝗠) → understands what’s on the screen.
3. 𝗚𝗲𝗻𝗲𝗿𝗮𝗹 𝗣𝘂𝗿𝗽𝗼𝘀𝗲 𝗟𝗟𝗠 → decides actions (click, type, drag, etc.).
4. Uses:
• 𝗩𝗲𝗰𝘁𝗼𝗿 𝗗𝗕 → recalls past actions or data.
• 𝗠𝗲𝗺𝗼𝗿𝘆 → remembers ongoing tasks.
• 𝗧𝗵𝗶𝗿𝗱-𝗣𝗮𝗿𝘁𝘆 𝗧𝗼𝗼𝗹𝘀 → connect with apps (email, Excel, etc.).
• 𝗗𝗲𝘀𝗸𝘁𝗼𝗽 𝗦𝗮𝗻𝗱𝗯𝗼𝘅 → safe test environment to interact with the computer.
5. 𝗢𝘂𝘁𝗽𝘂𝘁 → task completed automatically (like a digital assistant doing your computer work).

𝗜𝗻 𝘀𝗵𝗼𝗿𝘁: You instruct → AI sees screen + plans → clicks/types → finishes the task like a human would.

All these different agents are shaping the future of AI. What’s your take?

✅ Repost for others who want to understand how they work.
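
For anyone who wants to see the voice agent flow (Voice → Text → Search → Answer → Voice Back) as code, here is a minimal Python sketch. It is only an illustration under loud assumptions: the STT and TTS functions are hypothetical stubs that pass text through, the "embedding" is a toy bag-of-words count instead of a learned model, and an in-memory list stands in for the vector DB. None of the names below come from Vapi, Retell AI, or OpenAI.

```python
import math
from collections import Counter

# Toy knowledge base. Assumption: a real agent would store these in a vector DB.
DOCS = [
    "Nvidia stock price is available from the market data tool",
    "Flights from Hong Kong to Paris can be booked online",
    "A sort function orders numbers from smallest to largest",
]

def speech_to_text(audio: str) -> str:
    """Hypothetical STT stub: the 'audio' here is already a transcript."""
    return audio.strip()

def embed(text: str) -> Counter:
    """Toy embedding: word counts instead of a learned embedding model."""
    return Counter(text.lower().replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str) -> str:
    """Return the stored document most similar to the query."""
    q = embed(query)
    return max(DOCS, key=lambda d: cosine(q, embed(d)))

def text_to_speech(text: str) -> str:
    """Hypothetical TTS stub: wraps the text instead of producing audio."""
    return f"<audio>{text}</audio>"

def voice_agent(audio: str) -> str:
    """Run the full loop: STT, retrieval, answer, TTS."""
    query = speech_to_text(audio)
    context = retrieve(query)
    answer = f"Based on what I found: {context}"
    return text_to_speech(answer)

if __name__ == "__main__":
    print(voice_agent("What's the stock price of nvidia today?"))
```

In a real system each stub would be swapped for an actual speech, embedding, or LLM service, and the answer step would call tools (for example a market data API) rather than echo the retrieved text.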
