
Aizend

Artificial Intellige... • 9d

DeepSeek published a new paper detailing the techniques behind its R1 model, which shook up the AI space in January, and revealed that the model cost just $294,000 to train.


More like this

Recommendations from Medial


Account Deleted

Hey I am on Medial • 8m

It's only been 4 days since DeepSeek R1 dropped, and it's INSANE. ChatGPT is now falling behind.


Account Deleted

Hey I am on Medial • 3m

Apple just exposed the truth behind so-called AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini: They’re not actually reasoning — they’re just really good at memorizing patterns. Here’s what Apple found:


Ashish Singh

Finding myself 😶... • 8m

🤯 Chinese AI startup DeepSeek has surpassed OpenAI's ChatGPT in downloads on the US Apple App Store, achieving this milestone just a week after its launch on January 10, 2025. 🌐 The DeepSeek R1 model, which utilizes a hybrid architecture for enhanced…


Ram Saravanan

Start up • 6m

The CEO of 01.AI, Lee Kai-fu, recently shared some exciting news about the progress of Chinese AI startups. He highlighted that companies like DeepSeek have managed to close the gap with the U.S. in AI development, bringing it down to just three months…


Saurav Singh

Building neoynai.co... • 1m

1 day ago OpenAI released GPT-OSS-120B and GPT-OSS-20B, two massive open-weight models. Here's everything you need to know about them 👇
2/ Key features:
– 128K context window
– Sliding window attention
– MoE architecture
– RoPE variant
– New MXFP4…

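For reference, here is a minimal sketch of the sliding-window attention idea the post above lists. The window size, shapes, and function names are illustrative assumptions, not GPT-OSS's actual implementation:

```python
# Sliding-window attention sketch: each query attends only to the most
# recent `window` keys (illustrative assumptions, not GPT-OSS's code).
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal mask where each token sees only the last `window` tokens."""
    i = np.arange(seq_len)[:, None]   # query positions
    j = np.arange(seq_len)[None, :]   # key positions
    return (j <= i) & (j > i - window)

def attention(q, k, v, mask):
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)            # drop out-of-window keys
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v

seq_len, d = 8, 4
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(seq_len, d))
out = attention(q, k, v, sliding_window_mask(seq_len, window=4))
print(out.shape)  # (8, 4)
```

Capping the window bounds the attention cost per token regardless of total sequence length, which is one reason it pairs well with very long contexts.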

AI Engineer

AI Deep Explorer | f... • 5m

Want to learn AI the right way in 2025? Don’t just take courses. Don’t just build toy projects. Look at what’s actually being used in the real world. The most practical way to really learn AI today is to follow the models that are shaping the industry…


Parampreet Singh

Python Developer 💻 ... • 7m

A 3B LLM outperforms a 405B LLM 🤯 Similarly, a 7B LLM outperforms OpenAI o1 & DeepSeek-R1 🤯🤯
LLM: Llama 3
Datasets: MATH-500 & AIME-2024
This was shown in research on compute-optimal Test-Time Scaling (TTS). Recently, OpenAI o1 showed that Test-Time…

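For context, here is a minimal sketch of the simplest test-time scaling strategy: best-of-N sampling scored by a reward model. `generate` and `score` are hypothetical stand-ins; the compute-optimal TTS research above uses process reward models and more sophisticated search, not this exact loop:

```python
# Best-of-N test-time scaling sketch (hypothetical stand-in functions).
import random

def generate(prompt: str) -> str:
    # stand-in for sampling one candidate solution from a small LLM
    return f"{prompt} -> candidate {random.random():.3f}"

def score(candidate: str) -> float:
    # stand-in for a (process) reward model scoring a candidate
    return random.random()

def best_of_n(prompt: str, n: int = 16) -> str:
    """Spend more inference compute (n samples) instead of more parameters."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("Solve: 2 + 2"))
```

The headline result is exactly this trade: a small model plus extra inference compute and a good verifier can beat a much larger model answering once.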

AI Engineer

AI Deep Explorer | f... • 5m

LLM Post-Training: A Deep Dive into Reasoning LLMs
This survey paper provides an in-depth examination of post-training methodologies in Large Language Models (LLMs), focusing on improving reasoning capabilities. While LLMs achieve strong performance…

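As a concrete illustration of one post-training recipe such surveys typically cover, here is a hedged sketch of rejection sampling for reasoning data: sample several chains of thought per problem, keep only those whose final answer is correct, and fine-tune on the keepers. `sample_cot` is a hypothetical stand-in for a model call:

```python
# Rejection-sampling data construction sketch (stand-in model call).
import random

def sample_cot(question: str) -> tuple[str, int]:
    # stand-in: returns (reasoning_text, final_answer) from one model sample
    answer = random.choice([3, 4, 5])
    return f"Step-by-step reasoning about '{question}'...", answer

def build_sft_data(problems: list[tuple[str, int]], k: int = 8) -> list[dict]:
    """Keep only sampled solutions whose answer matches the reference."""
    data = []
    for question, reference in problems:
        for _ in range(k):
            cot, answer = sample_cot(question)
            if answer == reference:        # reject wrong-answer traces
                data.append({"prompt": question, "completion": cot})
    return data

print(len(build_sft_data([("What is 2 + 2?", 4)])))
```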

Shuvodip Ray

YouTube • 1y

Researchers at Meta recently presented ‘An Introduction to Vision-Language Modeling’ to help people better understand the mechanics behind mapping vision to language. The paper covers everything from how VLMs work and how to train them to approaches…

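To make the vision-to-language mapping concrete, here is a minimal sketch of the common VLM recipe: project features from a (typically frozen) vision encoder into the LLM's embedding space and prepend them to the text tokens. Dimensions and module names are assumptions for illustration, not the paper's specific design:

```python
# Tiny vision-language bridge sketch (dimensions are illustrative).
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    def __init__(self, vision_dim=768, llm_dim=2048, vocab=32000):
        super().__init__()
        self.projector = nn.Linear(vision_dim, llm_dim)  # vision -> LLM space
        self.embed = nn.Embedding(vocab, llm_dim)        # text token embeddings

    def forward(self, image_feats, text_ids):
        # image_feats: (batch, num_patches, vision_dim) from a frozen encoder
        img_tokens = self.projector(image_feats)
        txt_tokens = self.embed(text_ids)
        # the concatenated sequence would feed the language-model backbone
        return torch.cat([img_tokens, txt_tokens], dim=1)

model = TinyVLM()
feats = torch.randn(1, 16, 768)          # e.g. 16 image patch embeddings
ids = torch.randint(0, 32000, (1, 8))    # 8 text tokens
print(model(feats, ids).shape)           # torch.Size([1, 24, 2048])
```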

Pulakit Bararia

Founder Snippetz Lab... • 2m

I didn’t think I’d enjoy reading 80+ pages on training AI models. But this one? I couldn’t stop. Hugging Face dropped a playbook on how they train massive models across 512 GPUs — and it’s insanely good. Not just technical stuff… it’s like reading a…

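As background for the multi-GPU training the playbook describes, here is a hedged sketch of the basic data-parallel pattern with PyTorch DistributedDataParallel. The playbook goes far beyond this (tensor and pipeline parallelism, communication overlap, etc.); this is only the starting point, with a toy model and objective:

```python
# Data-parallel training sketch. Launch with:
#   torchrun --nproc_per_node=<num_gpus> train.py
# Requires CUDA GPUs and NCCL; not the playbook's actual code.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")       # one process per GPU under torchrun
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    model = DDP(torch.nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):
        x = torch.randn(32, 1024, device=rank)  # each rank gets its own batch
        loss = model(x).pow(2).mean()           # toy objective
        loss.backward()                         # DDP all-reduces gradients here
        opt.step()
        opt.zero_grad()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each process holds a full model replica and averages gradients across GPUs every step; scaling to 512 GPUs is what forces the fancier parallelism strategies the playbook covers.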
