Python Developer 💻 ... • 9m
3B LLM outperforms 405B LLM 🤯
Similarly, a 7B LLM outperforms OpenAI o1 & DeepSeek-R1 🤯🤯

LLM: Llama 3
Datasets: MATH-500 & AIME-2024

This comes from research on compute-optimal Test-Time Scaling (TTS).

Recently, OpenAI o1 showed that Test-Time Scaling (TTS) can enhance the reasoning capabilities of LLMs by allocating additional computation at inference time, which improves LLM performance. In simple terms: "think slowly with long Chain-of-Thought."

Another route is to generate multiple outputs for a sample, pick the best one, and train the model again, which eventually lets a 0.5B LLM perform better than GPT-4o, but at the cost of more computation. To make it efficient, they used search-based methods with reward-aware compute-optimal TTS.

Paper: Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
https://arxiv.org/abs/2502.06703

#Openai #LLM #GPT #GPTo1 #deepseek #llama3
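To make the "generate multiple outputs and pick the best one" idea concrete, here is a minimal best-of-N sketch. It is not the paper's reward-aware compute-optimal method, only the plain sampling-and-selection step. The names `best_of_n`, `make_toy_generator`, and `toy_reward` are hypothetical, and the toy generator and reward are stand-ins (assumptions) for a real policy LLM and a real reward model.

```python
import random
from typing import Callable, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 8,
) -> Tuple[str, float]:
    """Sample n candidate answers and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    scores = [reward(prompt, c) for c in candidates]
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


def make_toy_generator(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a small LLM: a noisy guess at 17 + 25."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        return str(42 + rng.choice([-2, -1, 0, 0, 1, 2]))

    return generate


def toy_reward(prompt: str, answer: str) -> float:
    """Hypothetical stand-in for a reward model: closer to 42 scores higher."""
    return -abs(int(answer) - 42)


if __name__ == "__main__":
    prompt = "17 + 25 = ?"
    # More samples means more inference-time compute and more chances to
    # hit a high-reward answer.
    for n in (1, 4, 16):
        answer, score = best_of_n(prompt, make_toy_generator(seed=0), toy_reward, n=n)
        print(f"n={n:2d} best answer={answer} reward={score}")
```

The cost grows linearly with `n`, which is the "more computation" trade-off the post mentions; search-based methods try to spend that budget more selectively.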

Founder of Friday AI • 5m
🚨 OpenAI is a Wrapper 🤯 Hot take, but let's break it down logically: OpenAI is not a full-stack AI company; it's a high-level wrapper over Azure and NVIDIA. Here's why that matters: 🔹 1. Infra Backbone = Microsoft Azure. Almost 90%+ of Op
Founder of Friday AI • 1m
Big News: Friday AI Adaptive API is Coming! We're launching Adaptive API, the world's first real-time context scaling framework for LLMs. Today, AI wastes massive tokens on static context: chat, code, or docs all use the same window. The result?
Medial • 3m
I spent 4+ hours rewatching Karpathy's YC keynote. And I realized: we've been looking at LLMs the wrong way. They're not just "AI models." They're a new kind of computer. • LLM = CPU • Context window = mem
AI Deep Explorer | f... • 7m
LLM Post-Training: A Deep Dive into Reasoning LLMs This survey paper provides an in-depth examination of post-training methodologies in Large Language Models (LLMs) focusing on improving reasoning capabilities. While LLMs achieve strong performance
Medial • 4m
GPT-5 Full Review & 10 Mind-Blowing Use Cases. OpenAI has just launched its most awaited model yet: GPT-5. And it's not just one step closer to AGI, but has almost entirely automated a lot of things using just simple prompts. In this video, we put t
Medial • 1y
Jensen Huang, the CEO of NVIDIA, describes how AI is advancing in three key dimensions: 1. Pre-training: This is like getting a college degree. AI models are trained on massive datasets to develop broad, general knowledge about the world. 2. Post-
India's AI Filmmakin... • 6m
🚨 SmolVLA is here, and it's changing how we think about robotics AI. Hugging Face just released SmolVLA, a lightweight Vision-Language-Action model trained on community-shared datasets from their LeRobot platform. Despite being just 450M paramete