Artificial Intellige... • 9d
DeepSeek published a new paper describing the technical details behind its R1 model, which shook up the AI space in January, and revealing that the model cost just $294,000 to train.
Finding my self 😶... • 8m
🤯 Chinese AI startup DeepSeek has surpassed OpenAI's ChatGPT in downloads on the US Apple App Store, achieving this milestone just a week after its launch on January 10, 2025. 🌐 The DeepSeek R1 model, which utilizes a hybrid architecture for enhanc
Building neoynai.co... • 1m
1 day ago, OpenAI released GPT-OSS-120B and GPT-OSS-20B, two massive open-weight models. Here’s everything you need to know about them 👇 2/ Key features: – 128K context window – Sliding window attention – MoE architecture – RoPE variant – New MXFP4
AI Deep Explorer | f... • 5m
Want to learn AI the right way in 2025? Don’t just take courses. Don’t just build toy projects. Look at what’s actually being used in the real world. The most practical way to really learn AI today is to follow the models that are shaping the indus
Python Developer 💻 ... • 7m
3B LLM outperforms 405B LLM 🤯 Similarly, a 7B LLM outperforms OpenAI o1 & DeepSeek-R1 🤯 🤯 LLM: Llama 3 Datasets: MATH-500 & AIME-2024 This was done in research on compute-optimal Test-Time Scaling (TTS). Recently, OpenAI o1 showed that Test-
AI Deep Explorer | f... • 5m
LLM Post-Training: A Deep Dive into Reasoning LLMs This survey paper provides an in-depth examination of post-training methodologies in Large Language Models (LLMs) focusing on improving reasoning capabilities. While LLMs achieve strong performance
Founder Snippetz Lab... • 2m
I didn’t think I’d enjoy reading 80+ pages on training AI models. But this one? I couldn’t stop. Hugging Face dropped a playbook on how they train massive models across 512 GPUs — and it’s insanely good. Not just technical stuff… it’s like reading a