Nelson Auner

Published in
AI Advances

Cutting-edge LLM Evals with humans, AI judges, and GPT token probabilities

In the race to deploy AI, one of the trickiest components is LLM Evaluation (Evals)

Jul 7, 2024

Published in
Cubed

Claude Sonnet 3.5 vs GPT-4o

Comparing SOTA LLM performance on a realistic few-shot categorization task

Jul 1, 2024

17 Followers

Writing on Machine Learning & Data Science. I've consulted for Series A startups and Fortune 100 firms. Previously Data Science at Affirm & Coalition.
