Documentation Index
Fetch the complete documentation index at: https://arize-ax.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
What’s New
March 28, 2024Pre-Joined Evals in Arize
Arize now supports LLM assisted evals that have been generated by the Arize Phoenix evals package. Use evals to determine the performance of your LLM application across dimensions such as Hallucination, Toxicity, QA Correctness and more. Evals can also be run on a job and sent to Arize on a regular cadence. See our docs here to get started with Evals in Arize, with more releases coming to Evals soon.
GPT-4V(ision) Integration in Prompt Playground
March 18, 2024 Arize now offers multi-modal support with GPT-4V allowing users to pass an image as part of the request to OpenAI.
Custom LLM Endpoint Support in Prompt Playground
Connect custom or third-party Large Language Models seamlessly. Test and compare different LLMs to identify optimal configurations. Learn more → Note: This feature is gated - please contact support@arize.com for access. Endpoint must conform to OpenAI ChatCompletion or Completions format.
Enhancements
March 28, 2024deleteData Endpoint
This update allows users to self-serve data deletion through GraphQL. Learn more →Area Under the Curve (AUC) as a Custom Metric
We now support AUC in custom metrics. Learn more →Python SDK v7.12.0
-
Users can now send evals and spans together via the
log_spansmethod of the Arize PandasClient - On-prem users can pass a path to certificate files or disable the TLS verification.
📚 New Content
- LLM Observability Certification: Search & Retrieval Course
- LLM Benchmarks & Retrieval for RAG Systems
- Numeric Evals: Why You Should Not Use for LLM-As-a-Judge
- Klick Health: Q&A on Healthcare LLM Use Cases
- Ragas: How To Evaluate and Analyze Your RAG Pipeline
- Needle in a Haystack LLM: New Research
- RAG Evaluation: How-To Troubleshoot LLMs and Retrieval-Augmented Generation with Retrieval and Response Metrics
- Phi 2
- Mistral’s 8x7b
- RAG vs Fine Tuning
- Sora AI from OpenAI
- Tutorial: Everything You Need to Set Up a SQL Router Query Engine for Text-To-SQL
- LLM Task Evaluations vs Model Evals
- Anthropic Claude 3: Performance and Review
- Cerebral Valley on “How Arize Is Expanding the Field of AI Observability”
- Paper Read: Reinforcement Learning In An Era of LLMs