Python Developer 💻 ... • 4m
A 3B LLM outperforms a 405B LLM 🤯 Similarly, a 7B LLM outperforms OpenAI o1 & DeepSeek-R1 🤯🤯
LLM: Llama 3
Datasets: MATH-500 & AIME-2024
This comes from research on compute-optimal Test-Time Scaling (TTS). Recently, OpenAI o1 showed that TTS can enhance the reasoning capabilities of LLMs by allocating additional computation at inference time, which improves performance. In simple terms: "think slowly with a long Chain-of-Thought." Another approach is to generate multiple outputs for each problem and pick the best one, which can eventually make a 0.5B LLM perform better than GPT-4o, but it costs more computation. To make this efficient, they used search-based methods with reward-aware, compute-optimal TTS.
Paper: Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling 👉 https://arxiv.org/abs/2502.06703
#Openai #LLM #GPT #GPTo1 #deepseek #llama3
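For intuition, here is a minimal Best-of-N sketch of the "generate several candidates, keep the highest-reward one" idea described above. It assumes a Hugging Face causal LM as the policy; the model name and the reward scorer are placeholders, and the paper's compute-optimal TTS additionally picks the search strategy and budget per problem and per reward model, which this sketch does not attempt.

```python
# Minimal Best-of-N test-time scaling sketch (illustrative only).
# POLICY_NAME and score_with_reward_model are placeholder assumptions,
# not the paper's exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

POLICY_NAME = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder small policy model

tokenizer = AutoTokenizer.from_pretrained(POLICY_NAME)
policy = AutoModelForCausalLM.from_pretrained(POLICY_NAME, torch_dtype=torch.bfloat16)


def score_with_reward_model(prompt: str, completion: str) -> float:
    # Stub: plug in a real process/outcome reward model (PRM/ORM) here.
    # Compute-optimal TTS relies on such a scorer to rank candidates.
    return 0.0


def best_of_n(prompt: str, n: int = 8, max_new_tokens: int = 512) -> str:
    """Sample n chains of thought and return the highest-scoring one."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = policy.generate(
        **inputs,
        do_sample=True,
        temperature=0.8,
        num_return_sequences=n,
        max_new_tokens=max_new_tokens,
    )
    prompt_len = inputs["input_ids"].shape[1]
    candidates = [
        tokenizer.decode(out[prompt_len:], skip_special_tokens=True)
        for out in outputs
    ]
    # Keep the candidate the reward model likes best.
    return max(candidates, key=lambda c: score_with_reward_model(prompt, c))
```

With a real reward model plugged in, increasing `n` spends more inference compute per problem, which is the knob the paper's compute-optimal TTS tries to allocate efficiently.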
Founder of Friday AI • 23d
🚨 OpenAI is a Wrapper 🤯 Hot take, but let's break it down logically: OpenAI is not a full-stack AI company; it's a high-level wrapper over Azure and NVIDIA. Here's why that matters 👇 🔹 1. Infra Backbone = Microsoft Azure Almost 90%+ of Op
AI Deep Explorer | f... • 2m
LLM Post-Training: A Deep Dive into Reasoning LLMs This survey paper provides an in-depth examination of post-training methodologies in Large Language Models (LLMs), focusing on improving reasoning capabilities. While LLMs achieve strong performance
Medial • 7m
Jensen Huang, the CEO of NVIDIA, describes how AI is advancing in three key dimensions: 1. Pre-training: This is like getting a college degree. AI models are trained on massive datasets to develop broad, general knowledge about the world. 2. Post-
India's AI Filmmakin... • 1m
🚨 SmolVLA is here, and it's changing how we think about robotics AI. Hugging Face just released SmolVLA, a lightweight Vision-Language-Action model trained on community-shared datasets from their LeRobot platform. Despite being just 450M paramete
Welbe • 2m
Wake up, this is the GREATEST time to build a startup in 30 years... The words of Greg Isenberg, CEO of latecheckoutplz 👇🏻 I say this as a 36-year-old who's built/sold 3 companies, been part of companies that have raised billions and seeded multipl