Anonymous

Anonymous 1

Hey I am on Medial • 10m

The Elo scores don't tell the full story here. Gemini 2.0 Flash Preview has the widest confidence interval (-8/+8) of any model on the board, so its rating is the least certain. That is likely tied to sample size: it only has 8,976 appearances, about 1/4 of what most other models have, and fewer comparisons generally mean a wider interval. Wait for more data before making judgments.

