Storing the results
After processing nodes in the pipeline, you will usually want to store the results. A pipeline supports multiple storage steps but needs at least one. A storage implements the Persist trait.
The Persist trait
The trait covers a setup step, per-node storage through store, batch storage through batch_store, and an optional batch_size.
Setup functions run asynchronously as soon as the pipeline starts. They can handle things like creating collections or tables and opening connections. Because further steps can follow storage, both store and batch_store are expected to return the nodes they processed.
If batch_size is implemented for the storage, the stream will always prefer batch_store.
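To make the contract above concrete, here is a minimal sketch of a storage and of a driver that prefers batching. It is not the library's actual trait definition: it is synchronous for brevity (the real setup runs asynchronously), and Node, StoreResult, InMemory, persist_all, and all method signatures are hypothetical stand-ins chosen for illustration.

```rust
// Hypothetical, simplified stand-in for a pipeline node.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    id: u64,
    chunk: String,
}

// Hypothetical result alias; a real storage would use a richer error type.
type StoreResult<T> = Result<T, String>;

// Synchronous sketch of a storage trait with the shape described above.
// The signatures are assumptions, not the library's definition.
trait Persist {
    // One-time preparation, such as creating tables or opening connections.
    fn setup(&mut self) -> StoreResult<()>;
    // Stores one node and hands it back so later steps can keep using it.
    fn store(&mut self, node: Node) -> StoreResult<Node>;
    // Stores several nodes at once and hands them back.
    fn batch_store(&mut self, nodes: Vec<Node>) -> StoreResult<Vec<Node>>;
    // Returning Some(n) signals that batches of n nodes are preferred.
    fn batch_size(&self) -> Option<usize> {
        None
    }
}

// Hypothetical in-memory storage used only for this illustration.
#[derive(Default)]
struct InMemory {
    ready: bool,
    nodes: Vec<Node>,
    batch_calls: usize,
    batch: Option<usize>,
}

impl Persist for InMemory {
    fn setup(&mut self) -> StoreResult<()> {
        self.ready = true;
        Ok(())
    }

    fn store(&mut self, node: Node) -> StoreResult<Node> {
        if !self.ready {
            return Err("setup was not run".to_string());
        }
        self.nodes.push(node.clone());
        Ok(node)
    }

    fn batch_store(&mut self, nodes: Vec<Node>) -> StoreResult<Vec<Node>> {
        if !self.ready {
            return Err("setup was not run".to_string());
        }
        self.batch_calls += 1;
        self.nodes.extend(nodes.iter().cloned());
        Ok(nodes)
    }

    fn batch_size(&self) -> Option<usize> {
        self.batch
    }
}

// Hypothetical driver: runs setup first, then uses batch_store whenever
// the storage reports a batch size, and falls back to store otherwise.
fn persist_all<S: Persist>(storage: &mut S, nodes: Vec<Node>) -> StoreResult<Vec<Node>> {
    storage.setup()?;
    match storage.batch_size() {
        Some(size) if size > 0 => {
            let mut stored = Vec::with_capacity(nodes.len());
            for chunk in nodes.chunks(size) {
                stored.extend(storage.batch_store(chunk.to_vec())?);
            }
            Ok(stored)
        }
        _ => nodes.into_iter().map(|node| storage.store(node)).collect(),
    }
}
```

Calling persist_all with an InMemory whose batch field is Some(2) routes every node through batch_store, while the default InMemory (no batch size) stores nodes one at a time. In both cases the caller gets the same nodes back, which is what lets further steps run after storage.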
Built-in storage
Name | Description | Feature Flag |
---|---|---|
Redis | Persists nodes as JSON by default | redis |
Qdrant | Persists nodes in Qdrant | qdrant |
MemoryStorage | Persists nodes in memory; great for debugging | |
LanceDB | Persists and retrieves nodes in LanceDB | lancedb |