Why it's impossible to review AIs, and why TechCrunch is doing it anyway

TechCrunch · 8m
AI models are evolving so rapidly that comprehensively evaluating their capabilities is difficult. New releases arrive too fast for thorough assessment, and the models themselves are increasingly complex and opaque. Companies also do not disclose changes made to their models, which makes it hard to keep evaluation frameworks relevant. Synthetic benchmarks offer only limited insight and can be manipulated by companies to showcase better performance. Despite these challenges, qualitative analysis of AI models remains valuable as a counter to industry hype and as a source of real-world perspective. TechCrunch takes a subjective approach to testing AI models, focusing on specific categories for evaluation. However, this review process is not comprehensive and has its limitations.
