
Consider using OpenAI Evals #80

Open
walking-octopus opened this issue May 21, 2023 · 0 comments

walking-octopus commented May 21, 2023

OpenAI Evals is an open-source, crowdsourced collection of tests designed to evaluate newly emergent capabilities in LLMs. While it may be slightly GPT-4-centric, since tests that GPT-4 can easily pass don't get merged, it remains a valuable tool for automatically benchmarking LLMs.

As a well-designed standard benchmark, it could let us objectively and easily compare different open-access models against each other, or even against OpenAI's offerings, giving us deeper insight into what works and what doesn't.

For a reference on testing non-OpenAI models with Evals, see the OpenAssistant model evals.
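
If we go this route, the main piece of glue is a completion function that wraps whichever open-access model we want to benchmark. Below is a minimal sketch, assuming the `CompletionResult` interface from `evals/api.py` still looks the way I remember; `run_local_model` is a purely hypothetical stand-in for our own inference call, not anything from the Evals repo.

```python
from typing import Union

from evals.api import CompletionResult


def run_local_model(prompt: str) -> str:
    """Hypothetical stand-in for whatever inference backend we use
    (an HF pipeline, a llama.cpp server, etc.)."""
    raise NotImplementedError


class LocalCompletionResult(CompletionResult):
    def __init__(self, response: str):
        self.response = response

    def get_completions(self) -> list[str]:
        # Evals grades whatever completion strings are returned here.
        return [self.response]


class LocalCompletionFn:
    """Implements the CompletionFn protocol: a callable that takes a prompt
    and returns a CompletionResult."""

    def __call__(self, prompt: Union[str, list[dict]], **kwargs) -> CompletionResult:
        # Depending on the eval, `prompt` is a raw string or a list of
        # chat-style messages; flatten the latter conservatively.
        if isinstance(prompt, list):
            prompt = "\n".join(m.get("content", "") for m in prompt)
        return LocalCompletionResult(run_local_model(prompt))
```

If I read the docs right, registering a class like this in a completion-fn YAML in the Evals registry and then running something like `oaieval <registered-name> test-match` should produce the same reports as for OpenAI models, which is what would keep the comparisons directly apples-to-apples.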
