
GitHub - confident-ai/deepeval: The LLM Evaluation Framework
DeepEval is a simple-to-use, open-source framework for evaluating and testing large language model systems. It works much like Pytest, but is specialized for unit testing LLM outputs.
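As a rough illustration of that Pytest-style workflow, here is a minimal sketch of a DeepEval test, assuming the LLMTestCase, AnswerRelevancyMetric, and assert_test names from the DeepEval documentation; the input, output, and threshold values are illustrative only.

```python
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

def test_chatbot_answer():
    # A test case bundles the prompt, the LLM's actual output, and any retrieved context
    test_case = LLMTestCase(
        input="What are your shipping times?",
        actual_output="Standard orders arrive within 5-7 business days.",
        retrieval_context=["Standard shipping takes 5-7 business days."],
    )
    # Fails the test if answer relevancy falls below the threshold
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```

Running pytest on a file containing this test then evaluates the LLM output like any other unit test.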
GitHub - openai/evals: Evals is a framework for evaluating LLMs and …
Evals provides a framework for evaluating large language models (LLMs) or systems built using LLMs. It offers an existing registry of evals for testing different dimensions of OpenAI models, along with the ability to write your own custom evals.
GitHub - EleutherAI/lm-evaluation-harness: A framework for few-shot ...
This project provides a unified framework for testing generative language models on a large number of different evaluation tasks, including over 60 standard academic benchmarks for LLMs with hundreds of subtasks and variants implemented.
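To make the unified framework concrete, the sketch below shows one way to invoke the harness from Python, assuming the lm_eval.simple_evaluate entry point and the Hugging Face ("hf") backend; the model name and task choice are illustrative only.

```python
import lm_eval

# Evaluate a small Hugging Face model on a single benchmark task
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag"],
    num_fewshot=0,
    batch_size=8,
)

# Aggregated metrics (e.g. accuracy) are reported per task
print(results["results"]["hellaswag"])
```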
The LLM Evaluation guidebook ⚖️ - GitHub
If you've ever wondered how to make sure an LLM performs well on your specific task, this guide is for you! It covers the different ways you can evaluate a model and offers guidance on designing your own evaluations.
Supercharge Your LLM Application Evaluations - GitHub
Ragas is a toolkit for evaluating and optimizing Large Language Model (LLM) applications, replacing time-consuming, subjective assessments with data-driven, efficient evaluation.
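As a sketch of what such a data-driven evaluation looks like, the snippet below assumes the evaluate() entry point and the built-in faithfulness and answer_relevancy metrics from the Ragas documentation; the sample rows and column names are illustrative, and an LLM-judge API key is needed in practice.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# One row of a RAG application's inputs and outputs
data = {
    "question": ["When was the Eiffel Tower completed?"],
    "answer": ["The Eiffel Tower was completed in 1889."],
    "contexts": [["The Eiffel Tower was finished in 1889 for the World's Fair."]],
    "ground_truth": ["1889"],
}

# Ragas scores each row with the selected metrics using an LLM judge under the hood
results = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(results)
```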
GitHub - modelscope/evalscope: A streamlined and customizable …
EvalScope is a powerful and easily extensible model evaluation framework created by the ModelScope Community, aiming to provide a one-stop evaluation solution for large model developers.
GitHub - evalplus/evalplus: Rigorous evaluation of LLM-synthesized ...
EvalPlus provides rigorous, test-augmented code benchmarks (HumanEval+ and MBPP+), EvalPerf for evaluating the efficiency of LLM-generated code, and a framework of packages, images, and tools for easily and safely evaluating LLMs on these benchmarks.
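For a sense of the workflow, here is a minimal sketch of producing a samples file for EvalPlus to score, assuming the get_human_eval_plus and write_jsonl helpers from the EvalPlus documentation; generate_solution() is a hypothetical stand-in for your own model call.

```python
from evalplus.data import get_human_eval_plus, write_jsonl

def generate_solution(prompt: str) -> str:
    # Hypothetical placeholder: call your LLM here and return the completed code
    raise NotImplementedError

# Build one solution per HumanEval+ problem
samples = [
    {"task_id": task_id, "solution": generate_solution(problem["prompt"])}
    for task_id, problem in get_human_eval_plus().items()
]

# The resulting samples.jsonl is what the EvalPlus evaluator consumes
write_jsonl("samples.jsonl", samples)
```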
GitHub - huggingface/lighteval: Lighteval is your all-in-one toolkit ...
Lighteval is an all-in-one toolkit for lightning-fast, flexible LLM evaluation across multiple backends, from Hugging Face's Leaderboard and Evals team.
GitHub - raga-ai-hub/raga-llm-hub: Framework for LLM evaluation ...
The RagaAI LLM Hub is designed to help teams identify and fix issues throughout the LLM lifecycle, covering the entire RAG pipeline.
llm-evaluation-framework · GitHub Topics · GitHub
A GitHub topic page listing 43 public repositories tagged llm-evaluation-framework (as of January 2024).