Community
dot-inference
By devops-toolkit
Last changed 3 months ago

Notice something off about this package? Help us keep the marketplace safe and trustworthy by reporting inappropriate content or behavior.

Report this package
Overview
LLM inference on Kubernetes with vLLM and Crossplane

A Configuration package that defines an LLMInference type for deploying LLM models using vLLM Production Stack. Supports generative and embedding models with automatic GPU scaling, tensor parallelism, and Ingress routing.