Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models’ Alignment
Introduction
We break down the paper Trustworthy LLMs: A Survey and Guideline for Evaluating Large Language Models' Alignment. Ensuring alignment (i.e., making models behave in accordance with human intentions) has…
40 minute read
By Sarah Welsh