In this tutorial, we will cover the basics of indexing and querying code with Swiftide. Step by step we will set up the required dependencies and build up a pipeline. We will then manually build a query function to answer a user’s question.
Before we can get started we need to set up a new Rust project and install Redis and Qdrant.
Make sure you have the Rust toolchain installed; rustup is recommended. For this example we will use OpenAI to enrich, embed, and query our data.
Then install both Redis and Qdrant, either via your favourite package manager or as one-off containers via Docker.
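If you have Docker available, the standard images will do. Redis is used later as a node cache, and Qdrant is our vector store:

```bash
# Redis on its default port
docker run -d -p 6379:6379 redis

# Qdrant, with both the HTTP and gRPC ports exposed
docker run -d -p 6333:6333 -p 6334:6334 qdrant/qdrant
```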
You can also find a Docker Compose file on GitHub.
Then make sure the following environment variables are configured:
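At minimum the OpenAI integration needs an API key, which the underlying client reads from the environment:

```bash
export OPENAI_API_KEY=your-key-here
```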
Creating the Rust project
First, let’s set up the Rust project:
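Any project name will do; swiftide-tutorial is simply what we'll use here:

```bash
cargo new swiftide-tutorial
cd swiftide-tutorial
```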
Next we’ll add Tokio and some useful dependencies to make our lives easier:
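A reasonable set, with the exact feature flags as suggestions: anyhow for error handling, clap for argument parsing, tracing and tracing-subscriber for logging, and Tokio as the async runtime:

```bash
cargo add anyhow
cargo add clap --features derive
cargo add tracing
cargo add tracing-subscriber --features env-filter
cargo add tokio --features full
```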
And add Swiftide itself:
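Swiftide puts its integrations behind feature flags; we need OpenAI, Qdrant, and Redis. The flag names below match recent Swiftide versions, but check the docs for yours:

```bash
cargo add swiftide --features openai,qdrant,redis
```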
Setting up our main function
Then set up a basic main for the project:
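A minimal sketch: an async main via Tokio, returning anyhow's Result so we can use `?` everywhere later on:

```rust
use anyhow::Result;

#[tokio::main]
async fn main() -> Result<()> {
    println!("Hello, Swiftide!");

    Ok(())
}
```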
Give it a test:
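If everything is wired up correctly, this prints our greeting:

```bash
cargo run
```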
And we’re ready to go!
Building the indexing pipeline
Any RAG pipeline is only as good as the data it ingests. For brevity, let’s assume the project is written in a single language and has some markdown, like a README. In bigger projects we might want to index documentation, websites, and other code as well. That’s all possible with Swiftide, but let’s keep it out of scope for now.
Getting user input
First, let’s get some user input when we run the program. We want to know the language, the directory it should index and query, and later the query itself. Let’s use clap for this, which we added earlier.
And add the argument parsing:
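A sketch of those arguments with clap's derive API; the flag names and defaults here are our own choices:

```rust
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Programming language of the project, e.g. "rust"
    #[arg(short, long)]
    language: String,

    /// Path of the repository to index
    #[arg(short, long, default_value = ".")]
    path: std::path::PathBuf,

    /// The question to answer about the code
    query: String,
}

#[tokio::main]
async fn main() -> Result<()> {
    let args = Args::parse();
    println!("Indexing {} code in {}", args.language, args.path.display());

    Ok(())
}
```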
Let’s run it to make sure it works:
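With the sketch above, the path defaults to the current directory:

```bash
cargo run -- --language rust "What does this project do?"
```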
Enabling logging
Before we continue, let’s enable logging so we can easily see if something goes wrong.
And initialize the stdout subscriber at the top of main:
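Assuming tracing-subscriber with the env-filter feature enabled, the default fmt subscriber takes its level from RUST_LOG:

```rust
#[tokio::main]
async fn main() -> Result<()> {
    // Log to stdout, with the log level taken from RUST_LOG
    tracing_subscriber::fmt::init();

    let args = Args::parse();
    // ...

    Ok(())
}
```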
By setting RUST_LOG to debug or info, it should now be a lot easier to identify problems.
Indexing markdown files
First we will index any markdown files, like a README. They might contain valuable information about what the project does, and we want to be sure to include that.
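A first skeleton, modelled on the pipeline examples in the Swiftide README; module paths and method names can differ slightly between Swiftide versions:

```rust
use swiftide::indexing::{loaders::FileLoader, Pipeline};

async fn index_markdown(path: &std::path::Path) -> anyhow::Result<()> {
    // Load every markdown file under the given path into the pipeline
    Pipeline::from_loader(FileLoader::new(path).with_extensions(&["md"]))
        .run()
        .await
}
```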
And call it from main:
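Main now just parses the arguments and hands the path to the pipeline:

```rust
#[tokio::main]
async fn main() -> Result<()> {
    tracing_subscriber::fmt::init();
    let args = Args::parse();

    index_markdown(&args.path).await?;

    Ok(())
}
```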
This loads all markdown files into the pipeline and runs it, but the pipeline does not do anything with the files yet.
Chunking the data
In RAG, we generally store smaller chunks than the full file. This gives more granular, detailed results and opens the door to very targeted enrichment. Chunking is an exercise in itself: the goal is to get small, meaningful blocks while still allowing for performance at scale.
For this we will use the markdown chunker, which under the hood uses the excellent Rust text-splitter crate. It tries to produce meaningful blocks of text by using the markdown formatting as a guide.
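ChunkMarkdown takes a character range for the chunk size; 20..2048 is a reasonable guess for READMEs, and the constructor name follows the Swiftide examples:

```rust
use swiftide::indexing::transformers::ChunkMarkdown;

async fn index_markdown(path: &std::path::Path) -> anyhow::Result<()> {
    Pipeline::from_loader(FileLoader::new(path).with_extensions(&["md"]))
        // Split every file into chunks of 20 up to 2048 characters
        .then_chunk(ChunkMarkdown::from_chunk_range(20..2048))
        .run()
        .await
}
```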
Enriching the data with questions and answers
A common technique in RAG is to generate questions for each chunk. When searching the index, these questions give more overlap with user queries.
Let’s use a transformer with defaults to add them. This is also where we set up OpenAI as the LLM. The OpenAI client is cheap to clone, as it wraps the underlying client in an Arc.
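MetadataQAText is the transformer that generates the questions and answers and attaches them to each chunk as metadata. We build the OpenAI client once in main and pass it along; the builder methods follow the Swiftide README:

```rust
use swiftide::indexing::transformers::MetadataQAText;
use swiftide::integrations::openai::OpenAI;

async fn index_markdown(openai: &OpenAI, path: &std::path::Path) -> anyhow::Result<()> {
    Pipeline::from_loader(FileLoader::new(path).with_extensions(&["md"]))
        .then_chunk(ChunkMarkdown::from_chunk_range(20..2048))
        // Generate questions and answers for each chunk as metadata
        .then(MetadataQAText::new(openai.clone()))
        .run()
        .await
}

#[tokio::main]
async fn main() -> Result<()> {
    tracing_subscriber::fmt::init();
    let args = Args::parse();

    // Build the client once; clones share the same underlying client
    let openai = OpenAI::builder()
        .default_embed_model("text-embedding-3-small")
        .default_prompt_model("gpt-3.5-turbo")
        .build()?;

    index_markdown(&openai, &args.path).await?;

    Ok(())
}
```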
When you try to run the pipeline, it will fail with an error that no storage is set. A pipeline must always end in at least one storage.
Embedding and storing the data
Next we will embed the chunks and store them in Qdrant. The Embed transformer only supports batch mode. Depending on the LLM, the chunk sizes, and the resources available, the batch size is a parameter worth tuning.
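Embed goes through then_in_batch (note: the batch size has moved between the argument and the transformer across Swiftide versions, so check your version's docs), and Qdrant's builder needs the vector size of the embedding model (1536 for text-embedding-3-small) plus a collection name of our choosing:

```rust
use swiftide::indexing::transformers::Embed;
use swiftide::integrations::qdrant::Qdrant;

async fn index_markdown(openai: &OpenAI, path: &std::path::Path) -> anyhow::Result<()> {
    Pipeline::from_loader(FileLoader::new(path).with_extensions(&["md"]))
        .then_chunk(ChunkMarkdown::from_chunk_range(20..2048))
        .then(MetadataQAText::new(openai.clone()))
        // Embed the chunks (plus their metadata) in batches of 100
        .then_in_batch(100, Embed::new(openai.clone()))
        .then_store_with(
            Qdrant::builder()
                .batch_size(50)
                .vector_size(1536) // must match the embedding model
                .collection_name("swiftide-tutorial")
                .build()?,
        )
        .run()
        .await
}
```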
Great! Let’s run this on a repository and check if it works:
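Pointing it at a repository of your choosing (the path here is a placeholder):

```bash
RUST_LOG=info cargo run -- --language rust --path ../some-repository "What does this project do?"
```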
Indexing code
For indexing code we will follow the exact same process in a new pipeline.
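Same shape as the markdown pipeline, with the code-aware transformers: ChunkCode splits on the language's syntax (via tree-sitter) and MetadataQACode generates the Q&A. Mapping the language argument to file extensions is hardcoded here for brevity:

```rust
use swiftide::indexing::transformers::{ChunkCode, MetadataQACode};

async fn index_code(
    openai: &OpenAI,
    language: &str,
    path: &std::path::Path,
) -> anyhow::Result<()> {
    // Hardcoded to Rust sources for brevity; a real version would derive
    // the extensions from the language argument
    Pipeline::from_loader(FileLoader::new(path).with_extensions(&["rs"]))
        .then(MetadataQACode::new(openai.clone()))
        .then_chunk(ChunkCode::try_for_language_and_chunk_size(language, 10..2048)?)
        .then_in_batch(100, Embed::new(openai.clone()))
        .then_store_with(
            Qdrant::builder()
                .batch_size(50)
                .vector_size(1536)
                .collection_name("swiftide-tutorial")
                .build()?,
        )
        .run()
        .await
}
```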
Let’s give it a whirl:
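Timing the full run:

```bash
time RUST_LOG=info cargo run --release -- --language rust --path ../some-repository "What does this project do?"
```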
1 minute and 23 seconds. Not bad!
Getting swifty
Let’s see if we can speed it up. Since requests to OpenAI take time, we can increase the concurrency (the default is the number of CPUs) up to whatever the rate limit allows.
There are also roughly 500 nodes to process while we batch per 100, which means we can experiment with smaller batches to get more parallelism.
Finally, we could also split the single stream into a markdown and a code stream. Let’s do it all at once and see where we’re at:
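A sketch of the tuned code pipeline. The stream splitting is left out here, as its API is version specific, but the concurrency and batch-size changes look like this:

```rust
Pipeline::from_loader(FileLoader::new(path).with_extensions(&["rs"]))
    // OpenAI, not the CPU, is the bottleneck: raise the concurrency
    // (defaults to the number of CPUs) towards your rate limit
    .with_concurrency(50)
    .then(MetadataQACode::new(openai.clone()))
    .then_chunk(ChunkCode::try_for_language_and_chunk_size(language, 10..2048)?)
    // Smaller batches mean more embedding requests in flight at once
    .then_in_batch(20, Embed::new(openai.clone()))
    .then_store_with(
        Qdrant::builder()
            .batch_size(50)
            .vector_size(1536)
            .collection_name("swiftide-tutorial")
            .build()?,
    )
    .run()
    .await
```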
And let’s give it another whirl:
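Another timed run:

```bash
time RUST_LOG=info cargo run --release -- --language rust --path ../some-repository "What does this project do?"
```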
Aww yeah, 478 nodes processed in 17 seconds! For the record, splitting gets it down to ~50 seconds, and bumping the concurrency gets it close to 20. The smaller embedding batches shave off the rest.
With that sorted, let’s quickly add a node cache right before splitting, so that subsequent runs only re-index changed nodes:
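The Redis integration doubles as a node cache: filter_cached drops nodes whose content was already seen in an earlier run. The constructor follows the Swiftide README; the second argument is a key prefix:

```rust
use swiftide::integrations::redis::Redis;

Pipeline::from_loader(FileLoader::new(path).with_extensions(&["rs"]))
    // Skip any node we already indexed in a previous run
    .filter_cached(Redis::try_from_url("redis://localhost:6379", "swiftide-tutorial")?)
    .with_concurrency(50)
    .then(MetadataQACode::new(openai.clone()))
    .then_chunk(ChunkCode::try_for_language_and_chunk_size(language, 10..2048)?)
    .then_in_batch(20, Embed::new(openai.clone()))
    .then_store_with(
        Qdrant::builder()
            .batch_size(50)
            .vector_size(1536)
            .collection_name("swiftide-tutorial")
            .build()?,
    )
    .run()
    .await
```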
Querying our data
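Swiftide handles the indexing; for the query side we talk to the underlying services directly, using the async-openai and qdrant-client crates (cargo add async-openai qdrant-client). A minimal sketch: embed the question, fetch the closest chunks, and let the LLM answer from them. The collection name and models mirror the indexing sketches above, and the "content" payload key is an assumption; inspect your collection to confirm it:

```rust
use async_openai::types::{
    ChatCompletionRequestUserMessageArgs, CreateChatCompletionRequestArgs,
    CreateEmbeddingRequestArgs,
};
use qdrant_client::qdrant::{value::Kind, SearchPointsBuilder};
use qdrant_client::Qdrant;

async fn query(question: &str) -> anyhow::Result<String> {
    let openai = async_openai::Client::new();

    // Embed the question with the same model we indexed with
    let mut embedding_response = openai
        .embeddings()
        .create(
            CreateEmbeddingRequestArgs::default()
                .model("text-embedding-3-small")
                .input(question.to_string())
                .build()?,
        )
        .await?;
    let embedding = embedding_response.data.remove(0).embedding;

    // Find the ten closest chunks in our collection
    let qdrant = Qdrant::from_url("http://localhost:6334").build()?;
    let search_result = qdrant
        .search_points(
            SearchPointsBuilder::new("swiftide-tutorial", embedding, 10).with_payload(true),
        )
        .await?;

    // Swiftide stores the chunk text in the payload; the "content" key
    // is an assumption, inspect your collection to confirm
    let context = search_result
        .result
        .iter()
        .filter_map(|point| match point.payload.get("content").map(|v| &v.kind) {
            Some(Some(Kind::StringValue(text))) => Some(text.as_str()),
            _ => None,
        })
        .collect::<Vec<_>>()
        .join("\n\n---\n\n");

    let prompt = format!(
        "Answer the question based on the provided context.\n\nQuestion: {question}\n\nContext:\n{context}"
    );

    // Ask the LLM for an answer grounded in the retrieved chunks
    let response = openai
        .chat()
        .create(
            CreateChatCompletionRequestArgs::default()
                .model("gpt-3.5-turbo")
                .messages([ChatCompletionRequestUserMessageArgs::default()
                    .content(prompt)
                    .build()?
                    .into()])
                .build()?,
        )
        .await?;

    Ok(response
        .choices
        .first()
        .and_then(|choice| choice.message.content.clone())
        .unwrap_or_default())
}

// In main, after indexing:
//     println!("{}", query(&args.query).await?);
```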
And let’s give it a final go:
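One more run, now with everything in place:

```bash
RUST_LOG=info cargo run --release -- --language rust --path ../some-repository "What does this project do?"
```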
Which gives us the following answer:
Swiftide offers a lot more features for customization and tuning. Every RAG application is different, with different data, and different requirements.
For more in-depth documentation, check out our API documentation and the rest of the docs. Questions? Join us on Discord!