You know that mix of excitement and uncertainty when you’re starting something new? That’s exactly how I felt when I set out to build a Retrieval Augmented Generation (RAG) pipeline using the Redis Vector Library.
RAG might sound like just another buzzword, but it’s really about using the power of semantic search and large language models to create smarter, more efficient ways of finding and using information. For someone like me, who’s constantly searching for ways to sharpen my technical skills, this was the perfect challenge.
What followed was a rollercoaster of wins, setbacks, and “aha!” moments. From understanding how to preprocess data for semantic search to learning about vector embeddings and schema design, this project was as much about discovery as it was about building.
This is my way of sharing the full experience—wins, challenges, and everything in between. If you’re interested in Redis, RedisVL, RAG, or just curious about tackling technical projects, this is for you.
Who am I? A PMM who speaks developer
I’m Rini, and my journey into tech has been anything but straightforward. I started out as a backend software engineer, burying myself in code and solving tricky technical problems. But over time, I found myself drawn to a different kind of problem-solving: understanding the needs of developers and bringing technical products to life in ways that truly resonate with users. That is what led me to my current role as a Product Marketing Manager (PMM) for AI at Redis.
Even though I’ve traded daily coding for marketing strategies, my curiosity for the technical side hasn’t faded. As a PMM, understanding our products inside out is key to helping users get the most out of them. That is why I rolled up my sleeves and built out my first RAG pipeline.
The Redis Vector Library felt like a great starting point. It’s at the forefront of intelligent search and AI-driven apps, and it gave me a chance to explore RAG—a technology I’ve been hearing so much about.
What was my goal here?
The main focus of my project was to build a RAG pipeline from scratch using the Redis Vector Library. I worked my way through this Redis tutorial to get started with RedisVL. RAG is an exciting technology that pairs semantic search with large language models (LLMs) to retrieve relevant information and generate accurate, context-aware answers. By tackling this project, I aimed to understand both the fundamentals of RAG and how Redis can power such applications.
The Redis Vector Library was an essential tool for this project. It simplifies working with vector embeddings, making fast and precise semantic search possible, which is key to building a functional RAG pipeline. RedisVL makes it easy to store, search, and retrieve the data that matters.
I ended up building a working AI assistant that can answer queries about a recent Nike earnings call. It pulls relevant context from Nike’s earnings report and generates accurate, context-aware responses using an LLM.
I learned how easy it can be to set up an AI assistant using RAG and Redis. Beyond the technical implementation, this project highlights how tools like Redis and RAG can have real-world impact. Scale this to industries like finance, healthcare, or education, and you’ve got AI assistants providing instant insights and making critical information easy to act on.
First, I set up the basics
To follow along, you can use this tutorial on our AI resources dev hub.
Set up your environment
Clone the project’s GitHub repository to access the datasets and resources.
Install the Python dependencies, including redis, redisvl, and LangChain (a quick sanity check follows below).
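If you want to confirm everything installed before going further, a version check like this minimal sketch works (the package list mirrors the tutorial’s dependencies; your exact set may differ):

```python
# Quick sanity check that the dependencies are installed, e.g. via:
#   pip install redis redisvl langchain langchain-community pypdf sentence-transformers
from importlib.metadata import version

for pkg in ("redis", "redisvl", "langchain"):
    print(pkg, version(pkg))
```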
Install and configure Redis
Set up a Redis Stack instance locally for storing, indexing, and querying vector embeddings.
Configure the Redis connection URL to work with either a local or cloud instance.
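In code, the connection boils down to a few lines. Here’s a minimal sketch, assuming a local Redis Stack on the default port; swap in your Redis Cloud URL if you’re using a hosted instance:

```python
import redis

# Assumes a local Redis Stack instance; replace with your cloud URL if needed.
REDIS_URL = "redis://localhost:6379"

client = redis.Redis.from_url(REDIS_URL)
print(client.ping())  # True means Redis is reachable
```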
Prepare the dataset
Load a financial 10-K PDF document using LangChain’s PyPDFLoader.
This is what the output should look like in the Colab.
Preprocess the document by splitting it into manageable chunks using RecursiveCharacterTextSplitter.
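Here’s roughly what that step looks like; the file path and chunking parameters are illustrative placeholders, not the tutorial’s exact values:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Placeholder path; point this at the 10-K PDF from the cloned repo.
loader = PyPDFLoader("nke-10k-2023.pdf")
pages = loader.load()  # one LangChain Document per PDF page

# Split pages into chunks sized for embedding and retrieval; a small
# overlap helps preserve context across chunk boundaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=2500, chunk_overlap=100)
chunks = splitter.split_documents(pages)
print(f"{len(pages)} pages -> {len(chunks)} chunks")
```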
Define schema & create index
Design a Redis index schema with fields for text, tags, and vector embeddings.
Configure the index in Redis to make semantic searches work efficiently.
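A schema definition looks something like the sketch below. The field names, dimensions, and algorithm here are my assumptions for illustration; the key constraint is that dims must match your embedding model’s output size:

```python
from redisvl.index import SearchIndex

# Illustrative schema; field names and vector settings are assumptions.
schema = {
    "index": {"name": "nike_docs", "prefix": "chunk"},
    "fields": [
        {"name": "chunk_id", "type": "tag"},
        {"name": "content", "type": "text"},
        {
            "name": "text_embedding",
            "type": "vector",
            "attrs": {
                "dims": 384,  # must match the embedding model's output
                "distance_metric": "cosine",
                "algorithm": "hnsw",
                "datatype": "float32",
            },
        },
    ],
}

# The exact connection API can vary slightly across redisvl versions.
index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)
index.create(overwrite=True)
```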
Load data into Redis
Process and load the preprocessed chunks and their embeddings into the Redis index.
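Generating the embeddings and loading them takes only a few more lines. This sketch uses RedisVL’s HFTextVectorizer with a common sentence-transformers model (the specific model is my assumption, chosen to match the 384-dim schema above):

```python
from redisvl.utils.vectorize import HFTextVectorizer

# Any sentence-transformers model works; this one outputs 384-dim vectors.
hf = HFTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2")

texts = [chunk.page_content for chunk in chunks]
embeddings = hf.embed_many(texts, as_buffer=True)  # raw bytes for Redis

data = [
    {"chunk_id": str(i), "content": text, "text_embedding": emb}
    for i, (text, emb) in enumerate(zip(texts, embeddings))
]
keys = index.load(data)
print(f"Loaded {len(keys)} chunks into Redis")
```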
Query the database
Construct vector queries to find text chunks semantically similar to user queries.
These are the results output in the Colab notebook.
This is what pagination through the results looked like in the Colab notebook.
Perform similarity searches, pull relevant results, and explore additional filtering/sorting options (both query types are sketched below).
These are the results of the query from the Colab notebook.
These are the results of the range query from the Colab notebook.
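Putting it together, a vector query looks roughly like this; the question text is just an example, and it reuses the index and hf vectorizer from the earlier snippets. A RangeQuery works the same way but bounds results by distance instead of count:

```python
from redisvl.query import VectorQuery, RangeQuery

question = "What is Nike's revenue outlook?"  # example question

# Top-k semantic search over the indexed chunks.
query = VectorQuery(
    vector=hf.embed(question),
    vector_field_name="text_embedding",
    return_fields=["content"],
    num_results=3,
)
for result in index.query(query):
    print(result["vector_distance"], result["content"][:80])

# Range query: return everything within a distance threshold instead.
range_query = RangeQuery(
    vector=hf.embed(question),
    vector_field_name="text_embedding",
    return_fields=["content"],
    distance_threshold=0.8,
)
```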
Build the RAG pipeline
Set up the RedisVL AsyncSearchIndex. This is a tool for creating and managing search indices in an asynchronous environment, enabling non-blocking operations for high-concurrency applications. It lets you define data schemas, load and query data, and perform vector-based searches efficiently, making it ideal for scalable AI workflows like RAG pipelines.
Integrate OpenAI’s GPT model (gpt-3.5-turbo-0125) to generate context-aware responses based on retrieval results.
Use a structured prompt to combine user questions and relevant document context for optimal responses.
Now that we have everything set up, we can ask questions about the earnings report.
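Stitched together, the whole pipeline fits in one async function. This is a minimal sketch rather than the notebook’s exact code: the system prompt is illustrative, and it reuses the schema, REDIS_URL, and hf vectorizer defined in the earlier snippets:

```python
import asyncio

from openai import AsyncOpenAI
from redisvl.index import AsyncSearchIndex
from redisvl.query import VectorQuery

openai_client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a financial analyst assistant. Answer the user's question "
    "using only the provided context from Nike's 10-K filing."
)

async def answer(question: str) -> str:
    # Non-blocking index handle for high-concurrency use.
    index = AsyncSearchIndex.from_dict(schema, redis_url=REDIS_URL)

    # Retrieve the chunks most semantically similar to the question.
    query = VectorQuery(
        vector=hf.embed(question),
        vector_field_name="text_embedding",
        return_fields=["content"],
        num_results=3,
    )
    results = await index.query(query)
    context = "\n\n".join(r["content"] for r in results)

    # Structured prompt: user question plus retrieved document context.
    response = await openai_client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```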
Test the pipeline
Ask financial questions (e.g., revenue trends, ESG practices) to test the RAG pipeline.
View the results
Retrieve accurate, context-based responses showcasing the pipeline’s effectiveness.
Here we can finally see some of the answers to our questions from the Colab notebook.
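In code, the testing step can be as simple as looping over a few questions (these are placeholders mirroring the examples above) and running each through the answer function from the previous sketch:

```python
# Run a few example questions through the full RAG loop.
for q in (
    "What is Nike's revenue trend over recent quarters?",
    "What does the filing say about Nike's ESG practices?",
):
    print(q, "->", asyncio.run(answer(q)))
```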
Highlights of my journey
Many parts of this project were surprisingly intuitive and enjoyable. Exploring the PyPDFLoader and RecursiveCharacterTextSplitter documentation was a highlight, as it showed me how easy it is to preprocess and structure text from PDF documents into meaningful chunks.
RedisVL stood out for its simplicity and efficiency. Tasks like generating text embeddings with HFTextVectorizer and integrating Hugging Face models felt seamless. RedisVL’s ability to handle vector search made it easy to store, index, and retrieve relevant data, which was crucial for building the RAG pipeline.
Additionally, defining the schema, loading data into Redis, and querying the database went smoothly. Watching everything work together was rewarding and showed how powerful and user-friendly the tools were. It made the whole process both informative and fun.
Where the road got bumpy
I also ran into some hiccups with the project, and I had to learn some new things to complete my AI assistant. First, I had to get up to speed on technical concepts that were new to me, like Hugging Face models, vector embeddings, and semantic search; it took additional reading and experimentation to fully grasp how everything worked together.
Another hurdle came when I hit an OpenAI API rate limit while testing the RAG pipeline. Each query to the API returned a “quota exceeded” error, effectively blocking progress. One option was to introduce a delay between API requests to stay under the rate limit; the solution I ultimately chose was switching to a different OpenAI API key that was part of a business plan, which let me keep testing and finish the pipeline successfully.
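For anyone who hits the same wall, the delay-based workaround I considered would look something like this sketch, wrapping the answer function from earlier in an exponential backoff loop (the attempt count and delays are arbitrary):

```python
import asyncio

from openai import RateLimitError

async def answer_with_retry(question: str, attempts: int = 5) -> str:
    # Back off exponentially whenever the API reports a rate limit.
    delay = 1.0
    for attempt in range(attempts):
        try:
            return await answer(question)
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            await asyncio.sleep(delay)
            delay *= 2
```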
Key learnings
Working through the project and building a RAG pipeline using RedisVL helped me learn a lot. RedisVL stood out as a powerful tool, making vector search and embedding management straightforward and efficient. Its seamless integration with Hugging Face models highlighted how well it supports complex AI workflows, and its role in powering semantic search for the RAG pipeline was invaluable. This project also deepened my understanding of RAG pipelines and their ability to combine semantic search with large language models to deliver precise, context-aware answers.
Another key lesson was the importance of effective data preprocessing. Using tools like PyPDFLoader and RecursiveCharacterTextSplitter made structuring data intuitive and ensured the rest of the pipeline worked smoothly.
Ultimately, this hands-on exploration reinforced the importance of approaching tools like Redis with a developer’s mindset, even in my role as a PMM. It was a rewarding journey that demonstrated the potential of Redis and RAG pipelines while leaving me eager to try more advanced use cases.
What’s next
If you are as intrigued by the possibilities of Redis and RAG pipelines as I am, the best way to get started is to try it yourself. Hands-on experience is invaluable, and you can use this Colab notebook to build the same RAG pipeline I did.
For more in-depth guidance, be sure to explore the RedisVL documentation. It’s a rich resource filled with detailed guides and examples that can help you understand the full range of capabilities RedisVL offers.
You can also consider using RedisVL to enhance your current projects or inspire new ones. From powering search engines to optimizing recommendation systems, RedisVL can help you solve real-world challenges in your work.
Finally, if you are looking for more inspiration, check out the Redis for AI docs. They feature additional use cases, tutorials, and hands-on projects to help you explore other ways to leverage RedisVL.