
Features

Main features

  • Fast streaming indexing pipeline with async, parallel processing
  • Experimental query pipeline
  • Integrations with OpenAI, Groq, Redis, Qdrant, FastEmbed, Treesitter, and more
  • A variety of loaders, transformers, embedders, and other common, generic tools
  • Bring your own transformers by extending straightforward traits
  • Jinja-like templating for prompts
  • Evaluate pipelines with RAGAS
  • Splitting and merging pipelines
  • Store into multiple backends
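
As a rough illustration of the "bring your own transformers" idea, the sketch below models a pipeline step as a trait and chains two custom implementations. It uses only the standard library; `Node`, `Transformer`, `TrimChunk`, `PrefixWithPath`, and `run_pipeline` are hypothetical names chosen for this example and are not the library's actual API, which is async and richer than this.

```rust
// Hypothetical, std-only sketch: these names are illustrative assumptions,
// not the library's real types or traits.

/// A unit of data flowing through the pipeline.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    path: String,
    chunk: String,
}

/// Anything that can rewrite a node can act as a pipeline step.
trait Transformer {
    fn transform(&self, node: Node) -> Result<Node, String>;
}

/// Custom step: trims surrounding whitespace and rejects empty chunks.
struct TrimChunk;

impl Transformer for TrimChunk {
    fn transform(&self, mut node: Node) -> Result<Node, String> {
        node.chunk = node.chunk.trim().to_string();
        if node.chunk.is_empty() {
            return Err(format!("empty chunk in {}", node.path));
        }
        Ok(node)
    }
}

/// Custom step: prefixes each chunk with the file it came from.
struct PrefixWithPath;

impl Transformer for PrefixWithPath {
    fn transform(&self, mut node: Node) -> Result<Node, String> {
        node.chunk = format!("// {}\n{}", node.path, node.chunk);
        Ok(node)
    }
}

/// Applies every step to every node, stopping at the first error.
fn run_pipeline(nodes: Vec<Node>, steps: &[Box<dyn Transformer>]) -> Result<Vec<Node>, String> {
    nodes
        .into_iter()
        .map(|node| steps.iter().try_fold(node, |n, step| step.transform(n)))
        .collect()
}

fn main() {
    let nodes = vec![Node { path: "src/lib.rs".into(), chunk: "  fn add() {}  ".into() }];
    let steps: Vec<Box<dyn Transformer>> = vec![Box::new(TrimChunk), Box::new(PrefixWithPath)];
    let out = run_pipeline(nodes, &steps).expect("pipeline failed");
    println!("{}", out[0].chunk);
}
```

Keeping each step behind a small trait means new behavior can be added without touching the pipeline runner; the real crate's traits should be consulted for the exact method names and signatures.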

Other cool things

  • Logging and tracing support through the tracing crate; see /examples and the tracing crate documentation for more information
  • The indexing pipeline also supports closures as steps
  • Embed all fields on a node, combine them into a single field, or do both
  • Store results in memory for debugging and experimentation
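
To make the closure and embedding options above more concrete, here is a minimal std-only sketch. It assumes a node is just a list of named text fields; `EmbedMode`, `embeddables`, and `apply_step` are hypothetical names invented for this example, not the library's API. It shows a closure used as a step, the three ways of choosing what to embed, and results kept in a plain in-memory `Vec` for inspection.

```rust
// Hypothetical, std-only sketch; none of these names come from the library.

/// Which texts of a node get turned into embeddings.
#[derive(Debug, Clone, Copy)]
enum EmbedMode {
    /// One embedding per field.
    PerField,
    /// All fields joined into one text, embedded once.
    Combined,
    /// Both of the above.
    Both,
}

/// Returns the (label, text) pairs that would be sent to an embedder.
fn embeddables(fields: &[(&str, &str)], mode: EmbedMode) -> Vec<(String, String)> {
    let per_field = || {
        fields
            .iter()
            .map(|(name, text)| (name.to_string(), text.to_string()))
            .collect::<Vec<_>>()
    };
    let combined = || {
        let joined = fields
            .iter()
            .map(|(name, text)| format!("{name}: {text}"))
            .collect::<Vec<_>>()
            .join("\n");
        ("combined".to_string(), joined)
    };
    match mode {
        EmbedMode::PerField => per_field(),
        EmbedMode::Combined => vec![combined()],
        EmbedMode::Both => {
            let mut all = per_field();
            all.push(combined());
            all
        }
    }
}

/// A pipeline step can be a plain closure over the chunk text.
fn apply_step<F: Fn(String) -> String>(chunk: String, step: F) -> String {
    step(chunk)
}

fn main() {
    let chunk = apply_step("Hello World".to_string(), |c| c.to_lowercase());
    let fields = [("chunk", chunk.as_str()), ("title", "greeting")];

    // In-memory "storage": collect the results so they can be inspected.
    let stored = embeddables(&fields, EmbedMode::Both);
    for (label, text) in &stored {
        println!("{label} => {text:?}");
    }
}
```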

LLMs & Embeddings

Name | Prompting | Embedding | Feature flag | Notes
openai | | | openai |
AWS Bedrock | | | aws-bedrock | Mistral and Titan models supported
groq | | | groq | All major models supported, uses async_openai with Groq’s openai schema under the hood
FastEmbed | | | fastembed | Uses fastembed.rs under the hood, dense and sparse embedding models supported
Ollama | | | ollama | Ollama support

Additional integrations

Name | Feature flag | Notes
Qdrant | qdrant | Named vectors also supported.
Redis | redis | Supports caching and storage
Spider & htmd | scraping | Scrape websites fast and convert the html to markdown
Treesitter | tree-sitter | Code splitting and various transformers to effectively index code
Fluvio | fluvio | Loading data from fluvio streams
Lancedb | lancedb | Storing and retrieval from lancedb