Code Generation Eval
When To Use Code Generation Eval Template
This Eval checks the correctness and readability of the code from a code generation process. The template variables are:
query: the coding question being asked.
code: the code returned in response to the query.
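As a minimal sketch of how the two variables feed a template, the snippet below fills a placeholder prompt with a query and a code sample. The template text is a hypothetical stand-in, not the actual Code Generation Eval template, and `build_prompt` is an illustrative helper name.

```python
# Hypothetical stand-in template; the real eval template's wording differs.
EXAMPLE_TEMPLATE = (
    "You are evaluating generated code.\n"
    "Question: {query}\n"
    "Code: {code}\n"
    "Answer with a single word: readable or unreadable."
)


def build_prompt(query: str, code: str, template: str = EXAMPLE_TEMPLATE) -> str:
    """Substitute the two template variables into the prompt text."""
    return template.format(query=query, code=code)


prompt = build_prompt(
    query="Write a function that returns the square of a number.",
    code="def square(x):\n    return x * x",
)
print(prompt)
```

Because the code sample is passed as a format argument rather than embedded in the template string, braces inside the evaluated code do not interfere with the substitution.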
Code Generation Eval Template
Benchmark Results
GPT-4 Results

GPT-3.5 Results

How To Run the Eval
To run the eval, apply the code readability template to each query and code pair and classify the result.
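The loop can be sketched roughly as follows, assuming a generic `model` callable that takes a prompt string and returns text. The label names `readable` and `unreadable`, the template text, and the helper names are assumptions made for illustration; this is not the library's API, and `fake_model` is a stub so the sketch runs offline.

```python
from typing import Callable, Dict, List

# Assumed output labels ("rails"); anything else is marked as unparseable.
RAILS = ["readable", "unreadable"]

# Hypothetical stand-in for the eval template.
TEMPLATE = "Question: {query}\nCode: {code}\nAnswer readable or unreadable."


def snap_to_rail(response: str, rails: List[str] = RAILS) -> str:
    """Map a raw model response onto one of the allowed labels."""
    text = response.strip().lower()
    # Check longer labels first so "unreadable" is not matched as "readable".
    for rail in sorted(rails, key=len, reverse=True):
        if rail in text:
            return rail
    return "NOT_PARSABLE"


def run_code_eval(rows: List[Dict[str, str]], model: Callable[[str], str]) -> List[str]:
    """Fill the template for each row, call the model, return one label per row."""
    labels = []
    for row in rows:
        prompt = TEMPLATE.format(query=row["query"], code=row["code"])
        labels.append(snap_to_rail(model(prompt)))
    return labels


def fake_model(prompt: str) -> str:
    """Stub standing in for an LLM call."""
    return "Unreadable." if "lambda" in prompt else "readable"


rows = [
    {"query": "Add two numbers.", "code": "def add(a, b):\n    return a + b"},
    {"query": "Add two numbers.", "code": "f = lambda a, b: a + b"},
]
labels = run_code_eval(rows, fake_model)
print(labels)
```

Constraining the output to a fixed label set keeps free-form model responses from leaking into the results, which makes the labels directly comparable against ground truth.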
Code Eval   GPT-4   GPT-3.5   GPT-3.5-Instruct   Palm 2 (Text Bison)   Llama 7b (soon)
Precision   0.93    0.76      0.67               0.77
Recall      0.78    0.93      1                  0.94
F1          0.85    0.85      0.81               0.85
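For reference, precision, recall, and F1 can be computed from eval labels and ground-truth labels as in the sketch below. The sample labels are made up for illustration and are not the benchmark data; treating `readable` as the positive class is also an assumption.

```python
from typing import List, Tuple


def precision_recall_f1(
    truth: List[str], predicted: List[str], positive: str
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one positive label."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels, for illustration only.
truth = ["readable", "readable", "readable", "unreadable", "unreadable"]
predicted = ["readable", "readable", "unreadable", "readable", "unreadable"]

precision, recall, f1 = precision_recall_f1(truth, predicted, positive="readable")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```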