Query Pipeline
After indexing data, you can use a query pipeline to answer questions. It combines the indexed data with various transformations to generate the most relevant context for the LLM.
Query pipelines, similar to indexing pipelines, function as parallel streams.
In the pipeline, a Query<states::Pending> goes through a series of transformations before being used to generate an answer.
A query holds the original query as a string, the current transformed query or context as a string, and its state, history, embeddings, and retrieved documents.
By default, a similarity search is used on a single embedding.
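The query structure and its states described above can be modelled with a typestate pattern. The following is a minimal, standard-library-only sketch, not the library's actual definition: the field names, the accessor methods, and the transformed, embedded, and retrieved methods are all assumptions made for illustration. Only the Pending and Retrieved states named in this document are modelled.

```rust
use std::marker::PhantomData;

/// Marker types for the two states named in the text.
pub mod states {
    pub struct Pending;
    pub struct Retrieved;
}

/// Illustrative model only; field names are assumptions, not the library's.
pub struct Query<State> {
    original: String,
    current: String,
    history: Vec<String>,
    embedding: Option<Vec<f32>>,
    documents: Vec<String>,
    state: PhantomData<State>,
}

/// Accessors available in every state.
impl<State> Query<State> {
    pub fn original(&self) -> &str { &self.original }
    pub fn current(&self) -> &str { &self.current }
    pub fn history(&self) -> &[String] { &self.history }
    pub fn embedding(&self) -> Option<&[f32]> { self.embedding.as_deref() }
}

impl Query<states::Pending> {
    /// A new query starts with `current` equal to the original text.
    pub fn new(original: impl Into<String>) -> Self {
        let original = original.into();
        Query {
            current: original.clone(),
            original,
            history: Vec::new(),
            embedding: None,
            documents: Vec::new(),
            state: PhantomData,
        }
    }

    /// Replace `current` and remember the previous value in the history.
    pub fn transformed(mut self, new_current: impl Into<String>) -> Self {
        let previous = std::mem::replace(&mut self.current, new_current.into());
        self.history.push(previous);
        self
    }

    /// Attach a single embedding (hypothetical helper).
    pub fn embedded(mut self, embedding: Vec<f32>) -> Self {
        self.embedding = Some(embedding);
        self
    }

    /// State transition: consuming a pending query yields a retrieved one.
    pub fn retrieved(self, documents: Vec<String>) -> Query<states::Retrieved> {
        Query {
            original: self.original,
            current: self.current,
            history: self.history,
            embedding: self.embedding,
            documents,
            state: PhantomData,
        }
    }
}

impl Query<states::Retrieved> {
    /// Documents are only readable once retrieval has happened.
    pub fn documents(&self) -> &[String] { &self.documents }
}
```

Because documents() exists only on Query<states::Retrieved>, calling it on a pending query is a compile-time error in this sketch, which is the main reason to encode the state in the type.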
A query pipeline step-by-step
-
Start with the default query pipeline that does a similarity search with a single embedding:
-
Then generate subquestions, improving the semantic coverage of the query:
This transformer takes the current query, generates subquestions, and sets them as current.
A query transformer takes a Query<states::Pending> and implements the TransformQuery trait. Query transformers always return a Result<Query<states::Pending>>. For each state, Query provides an interface for state transitions.
-
Closures are also supported at every step:
-
Now we need to generate an embedding for the transformed query:
An embedding is added to the query, based on what is in current, and used for retrieval. Note that multiple embeddings are not (yet!) supported.
-
Then use our embedding to retrieve documents from Qdrant:
Documents are retrieved and added to the query object, changing the state to states::Retrieved. For full customization, closures are supported here as well. Search strategies also have configurable defaults.
A retriever implements the Retrieve<S> trait, where S is the search strategy.
-
Sometimes presenting the documents as they are is not efficient. For example, you could summarize the documents first:
This summarizes the current set of retrieved documents, and updates the query.
Response transformers implement the TransformResponse trait.
-
Next, we need to explicitly tell the pipeline how to generate an answer:
The Simple answer forwards either the documents as they are or any generated context, as in the previous step.
-
Finally, the pipeline can be run as follows:
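To make the flow of the steps above concrete, here is a self-contained sketch that mimics the pipeline with the standard library only. It is not the library's API. The trait names TransformQuery, Retrieve<S>, and TransformResponse come from this document, but their method signatures here are assumptions, and KeywordSearch, InMemoryStore, FirstSentence, simple_answer, and run are hypothetical stand-ins. In particular, retrieval ranks documents by shared words instead of embeddings, and the "summary" simply keeps the first sentence of each document instead of calling an LLM.

```rust
use std::marker::PhantomData;

// Hypothetical state markers and query type; not the library's definitions.
pub struct Pending;
pub struct Retrieved;

pub struct Query<State> {
    pub current: String,
    pub documents: Vec<String>,
    pub context: Option<String>,
    state: PhantomData<State>,
}

impl Query<Pending> {
    pub fn new(text: &str) -> Self {
        Query { current: text.to_string(), documents: Vec::new(), context: None, state: PhantomData }
    }
}

pub type StepResult<T> = Result<T, String>;

/// Query transformers take a pending query and return a pending query.
pub trait TransformQuery {
    fn transform_query(&self, query: Query<Pending>) -> StepResult<Query<Pending>>;
}

/// Closures with a matching signature can be used as a step.
impl<F> TransformQuery for F
where
    F: Fn(Query<Pending>) -> StepResult<Query<Pending>>,
{
    fn transform_query(&self, query: Query<Pending>) -> StepResult<Query<Pending>> {
        self(query)
    }
}

/// Hypothetical search strategy with a configurable default.
pub struct KeywordSearch { pub top_k: usize }

impl Default for KeywordSearch {
    fn default() -> Self { KeywordSearch { top_k: 2 } }
}

/// A retriever is generic over the search strategy `S`.
pub trait Retrieve<S> {
    fn retrieve(&self, strategy: &S, query: Query<Pending>) -> StepResult<Query<Retrieved>>;
}

/// In-memory stand-in for a vector store.
pub struct InMemoryStore { pub docs: Vec<String> }

/// Counts the words of `doc` that also occur in `query` (case-insensitive).
fn shared_words(doc: &str, query: &str) -> usize {
    let query = query.to_lowercase();
    let wanted: Vec<&str> = query.split_whitespace().collect();
    let doc = doc.to_lowercase();
    doc.split_whitespace().filter(|w| wanted.contains(w)).count()
}

impl Retrieve<KeywordSearch> for InMemoryStore {
    fn retrieve(&self, strategy: &KeywordSearch, query: Query<Pending>) -> StepResult<Query<Retrieved>> {
        let mut ranked: Vec<&String> = self.docs.iter()
            .filter(|d| shared_words(d, &query.current) > 0)
            .collect();
        // Stable sort: best match first, ties keep their original order.
        ranked.sort_by_key(|d| std::cmp::Reverse(shared_words(d, &query.current)));
        let documents = ranked.into_iter().take(strategy.top_k).cloned().collect();
        Ok(Query { current: query.current, documents, context: query.context, state: PhantomData })
    }
}

/// Response transformers work on a query that already holds documents.
pub trait TransformResponse {
    fn transform_response(&self, query: Query<Retrieved>) -> StepResult<Query<Retrieved>>;
}

/// Stand-in for a summary step: keeps the first sentence of each document.
pub struct FirstSentence;

impl TransformResponse for FirstSentence {
    fn transform_response(&self, mut query: Query<Retrieved>) -> StepResult<Query<Retrieved>> {
        let firsts: Vec<&str> = query.documents.iter()
            .map(|d| d.split('.').next().unwrap_or("").trim())
            .collect();
        query.context = Some(firsts.join(". "));
        Ok(query)
    }
}

/// Forwards the generated context if there is one, else the raw documents.
pub fn simple_answer(query: &Query<Retrieved>) -> String {
    match &query.context {
        Some(context) => context.clone(),
        None => query.documents.join("\n"),
    }
}

/// Runs the steps in order: transform, retrieve, transform response, answer.
pub fn run<S>(
    question: &str,
    transformers: &[&dyn TransformQuery],
    strategy: &S,
    retriever: &dyn Retrieve<S>,
    response: Option<&dyn TransformResponse>,
) -> StepResult<String> {
    let mut query = Query::new(question);
    for transformer in transformers {
        query = transformer.transform_query(query)?;
    }
    let mut retrieved = retriever.retrieve(strategy, query)?;
    if let Some(step) = response {
        retrieved = step.transform_response(retrieved)?;
    }
    Ok(simple_answer(&retrieved))
}
```

In this sketch a closure such as |mut q: Query<Pending>| -> StepResult<Query<Pending>> { q.current.push_str(" steps"); Ok(q) } can be passed in the transformers slice, and passing None for the response step makes simple_answer fall back to the raw documents.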