Krrish Dholakia
Ishaan Jaffer

alerting, prometheus, secret management, management endpoints, ui, prompt management, finetuning, batch

Note: v1.57.8-stable is currently being tested. It will be released on 2025-01-12.

New / Updated Models

  1. Mistral large pricing - https://github.com/BerriAI/litellm/pull/7452
  2. Cohere command-r7b-12-2024 pricing - https://github.com/BerriAI/litellm/pull/7553/files
  3. Voyage - new models, prices and context window information - https://github.com/BerriAI/litellm/pull/7472
  4. Anthropic - bump Bedrock claude-3-5-haiku max_output_tokens to 8192

General Proxy Improvements

  1. Health check support for realtime models
  2. Support calling Azure realtime routes via virtual keys
  3. Support custom tokenizer on /utils/token_counter - useful when checking token count for self-hosted models
  4. Request Prioritization - now also supported on the /v1/completions endpoint

LLM Translation Improvements

  1. Deepgram STT support. Start Here
  2. OpenAI Moderations - omni-moderation-latest support. Start Here
  3. Azure O1 - fake streaming support. This ensures that if stream=true is passed, the response is still returned as a stream. Start Here
  4. Anthropic - non-whitespace char stop sequence handling - PR
  5. Azure OpenAI - support Entra ID username + password based auth. Start Here
  6. LM Studio - embedding route support. Start Here
  7. WatsonX - ZenAPIKeyAuth support. Start Here
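
To make the "fake streaming" idea in item 3 concrete, here is a minimal sketch of the general technique: a complete, non-streamed response is sliced into chunks and yielded one at a time, so a caller that asked for a stream still receives one. This is an illustration only, not LiteLLM's implementation; the function name `fake_stream`, the `chunk_size` parameter, and the simplified chunk shape are all assumptions made for the example.

```python
from typing import Dict, Iterator


def fake_stream(full_text: str, chunk_size: int = 20) -> Iterator[Dict]:
    """Yield an already-complete response as a sequence of stream-style chunks.

    Hypothetical helper: the chunk layout loosely mimics an OpenAI-style
    streaming delta, simplified for illustration.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Slice the finished text into fixed-size pieces and emit each as a delta.
    for start in range(0, len(full_text), chunk_size):
        piece = full_text[start:start + chunk_size]
        yield {"choices": [{"delta": {"content": piece}, "finish_reason": None}]}
    # A final empty delta signals the end of the stream.
    yield {"choices": [{"delta": {}, "finish_reason": "stop"}]}


chunks = list(fake_stream("Hello from a model without native streaming.", chunk_size=16))
text = "".join(c["choices"][0]["delta"].get("content", "") for c in chunks)
```

The client-side code path stays the same whether or not the upstream model streams natively; the trade-off is that the first chunk only arrives after the full response has been generated.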

Prompt Management Improvements

  1. Langfuse integration
  2. HumanLoop integration
  3. Support for using load balanced models
  4. Support for loading optional params from prompt manager

Start Here

Finetuning + Batch APIs Improvements

  1. Improved unified endpoint support for Vertex AI finetuning - PR
  2. Add support for retrieving Vertex AI batch jobs - PR

NEW Alerting Integration

PagerDuty Alerting Integration.

Handles two types of alerts:

  • High LLM API Failure Rate. Configure X fails in Y seconds to trigger an alert.
  • High Number of Hanging LLM Requests. Configure X hangs in Y seconds to trigger an alert.

Start Here
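
The "X fails in Y seconds" rule above is a sliding-window threshold. The sketch below shows one common way to implement such a check; it is not taken from the LiteLLM codebase, and the class name `FailureRateAlert` and its parameters `threshold` and `window_seconds` are hypothetical names chosen for the example. The same structure would apply to counting hanging requests.

```python
from collections import deque
from typing import Deque


class FailureRateAlert:
    """Fire when at least `threshold` failures occur within `window_seconds`.

    Hypothetical illustration of an "X fails in Y seconds" rule.
    """

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Deque[float] = deque()  # failure timestamps, oldest first

    def record_failure(self, now: float) -> bool:
        """Record a failure at time `now` and report whether to alert."""
        self._events.append(now)
        # Drop failures that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) >= self.threshold


alert = FailureRateAlert(threshold=3, window_seconds=60)
results = [alert.record_failure(t) for t in (0, 10, 100, 110, 120)]
```

Here the first two failures are too far from the later ones to count together, so only the fifth call, which sees three failures within 60 seconds, returns True.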

Prometheus Improvements

Added support for tracking latency/spend/tokens based on custom metrics. Start Here

NEW HashiCorp Secret Manager Support

Support for reading credentials + writing LLM API keys. Start Here

Management Endpoints / UI Improvements

  1. Create and view organizations + assign org admins on the Proxy UI
  2. Support deleting keys by key_alias
  3. Allow assigning teams to org on UI
  4. Disable using the UI session token for the 'test key' pane
  5. Show model used in 'test key' pane
  6. Support markdown output in 'test key' pane
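
As a rough sketch of item 2, the snippet below builds (but does not send) a request that deletes keys by alias rather than by key value. The `/key/delete` route and the `key_aliases` body field are assumptions about the proxy's management API and should be checked against the LiteLLM documentation; the base URL, admin key, and alias are placeholders.

```python
import json
import urllib.request
from typing import List

PROXY_BASE_URL = "http://localhost:4000"  # placeholder proxy address
ADMIN_KEY = "sk-1234"  # placeholder admin key


def build_delete_by_alias_request(aliases: List[str]) -> urllib.request.Request:
    """Build a POST request that asks the proxy to delete keys by alias.

    Assumption: the proxy accepts {"key_aliases": [...]} on /key/delete.
    """
    body = json.dumps({"key_aliases": aliases}).encode("utf-8")
    return urllib.request.Request(
        url=f"{PROXY_BASE_URL}/key/delete",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {ADMIN_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_delete_by_alias_request(["staging-bot"])
# Passing `request` to urllib.request.urlopen would send it; that call is
# omitted so the sketch runs without a live proxy.
```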

Helm Improvements

  1. Prevent Istio injection for the DB migrations cron job
  2. Allow using the migrationJob.enabled variable within the job

Logging Improvements

  1. Braintrust logging: respect project_id, add more metrics - https://github.com/BerriAI/litellm/pull/7613
  2. Athina - support a custom base URL via ATHINA_BASE_URL
  3. Lunary - allow passing a custom parent run ID to LLM calls

Git Diff

This is the diff between v1.56.3-stable and v1.57.8-stable.

Use this to see the changes in the codebase.

Git Diff