Redis Vector Library simplifies the developer experience by providing a streamlined client that enhances Generative AI (GenAI) application development. Redis Enterprise serves as a real-time vector database for vector search, LLM caching, and chat history.
Taking advantage of Generative AI (GenAI) has become a central goal for many technologists. Since ChatGPT launched in November 2022, organizations have set new priorities for their teams to apply GenAI. The expectations are high: AI can compose written language, construct images, answer questions, and write code. But several hurdles remain:
- AI must overcome hallucinations, i.e. made-up results.
- Teams must translate demos into live, production applications.
- Businesses must scale up projects efficiently and cost-effectively.
Countless techniques and tools have been developed to help mitigate these challenges. For example, Retrieval Augmented Generation (RAG) has gained prominence for its ability to blend domain-specific data stored in a vector database with the expansive capabilities of Large Language Models (LLMs). What began as a relatively simple method has evolved into a comprehensive suite of strategies to enhance conversational AI quality. This evolution reflects a broader trend toward more technical nuance, underscored by this fascinating deep-dive into RAG published by LangChain.
Redis is more than a standalone vector database for RAG. It boosts GenAI application performance by serving as a real-time data layer for many essential tasks: vector search, LLM semantic caching, and session management (e.g., user chat histories). We’ve been listening to our customers, users, and community members, and we want to make building these applications easier. To that end, we’ve developed the Redis Vector Library, a streamlined client that enables the use of Redis in AI-driven tasks, with a particular focus on vector embeddings for search.
Getting Started
The Python Redis Vector Library (redisvl) is built as an extension of the well-known redis-py client. Below we will walk through a simple example. Alternatively, try this hands-on tutorial on Google Colab that covers RAG from scratch with redisvl.
Setup requirements
Ensure you’re working in a Python environment running version 3.8 or higher. Then install the library with pip:
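The library is published on PyPI:

```shell
pip install redisvl
```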
Deploy Redis following one of these convenient paths:
- Redis Cloud – Jumpstart your project for FREE with a fully managed Redis service.
- Redis Stack Docker Image – Ideal for local development. Get Redis running swiftly using the following Docker command:
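The Redis Stack image bundles the search and JSON capabilities that redisvl relies on; port 8001 below exposes the optional RedisInsight UI:

```shell
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
```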
redisvl also ships with a dedicated CLI tool called rvl. You can learn more about using the CLI in the docs.
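For a quick taste, assuming redisvl is installed and Redis is running locally:

```shell
rvl version          # print the installed redisvl version
rvl index listall    # list all search indices in Redis
```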
Define a schema
Black box search applications rarely get the job done in production.
Redis optimizes production search performance by letting you explicitly configure index settings and dataset schema. With redisvl, defining, loading, and managing a custom schema is straightforward.
Consider a dataset composed of 10-K SEC filing PDFs, each broken down into manageable text chunks. Each record in this dataset includes:
- Id: A unique identifier for each PDF chunk.
- Content: The actual text extracted from the PDF.
- Content Embedding: A vector representation of the chunk’s text.
- Company: The name of the associated company.
- Timestamp: A numeric value representing the last update time.
First, define a schema that models this data’s structure in an index named sec-filings. Use a YAML file for convenience:
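Below is a sketch of such a schema in redisvl’s YAML format. The vector dims value (1024) is an assumption tied to the embedding model you choose; adjust it, along with the distance metric, to match your vectorizer:

```yaml
version: '0.1.0'

index:
  name: sec-filings
  prefix: chunk

fields:
  - name: id
    type: tag
  - name: content
    type: text
  - name: company
    type: tag
  - name: timestamp
    type: numeric
  - name: content_embedding
    type: vector
    attrs:
      algorithm: hnsw
      dims: 1024               # must match your embedding model
      distance_metric: cosine
      datatype: float32
```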
The schema.yaml file provides a clear, declarative expression of the schema. By default, the index will use a Hash data structure to store the data in Redis. JSON is also available along with support for different field types.
Now, load and validate this schema:
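Loading the file validates field names and types up front. A sketch, assuming the schema.yaml from the previous step sits in the working directory:

```python
from redisvl.schema import IndexSchema

# Parses and validates the YAML; raises on malformed fields
schema = IndexSchema.from_yaml("schema.yaml")
```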
Create an index
Now we’ll create the index for our dataset by passing a Redis Python client connection to a SearchIndex:
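A sketch assuming a local Redis at the default port; SearchIndex accepts the validated schema plus an existing redis-py client:

```python
import redis

from redisvl.index import SearchIndex
from redisvl.schema import IndexSchema

# Connect to the local Redis instance
client = redis.Redis.from_url("redis://localhost:6379")

# Bind the schema to a search index and create it in Redis
schema = IndexSchema.from_yaml("schema.yaml")
index = SearchIndex(schema, client)
index.create(overwrite=True)
```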
Load data
Before querying, populate the index with your data. If your dataset is a collection of dictionary objects, the .load() method simplifies insertion. It batches upsert operations, efficiently storing your data in Redis and returning the keys for each record:
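A sketch with hypothetical records (the id, content, and timestamp values are illustrative, and chunk_vector stands in for that chunk’s embedding); with the default hash storage, vectors are serialized to raw float32 bytes to match the schema’s datatype:

```python
import numpy as np

data = [
    {
        "id": "nike-10k-0001",                 # illustrative values
        "content": "Total long-term debt was ...",
        "content_embedding": np.array(chunk_vector, dtype=np.float32).tobytes(),
        "company": "nike",
        "timestamp": 1708041600,
    },
    # ... one dict per PDF chunk ...
]

# Batched upsert; returns the Redis key for each record
keys = index.load(data)
```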
Run queries
The VectorQuery is a simple abstraction for performing KNN/ANN style vector searches with optional filters.
Imagine you want to find the 5 PDF chunks most semantically related to a user’s query, such as "How much debt is the company in?". First, convert the query into a vector using a text embedding model (see the vectorizers section below). Next, define and execute the query:
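A sketch, where embed stands in for whichever embedding model you use and index is the SearchIndex created earlier:

```python
from redisvl.query import VectorQuery

# Embed the user's question with your embedding model of choice
query_embedding = embed("How much debt is the company in?")

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="content_embedding",
    return_fields=["content", "company", "timestamp"],
    num_results=5,  # top-5 nearest chunks
)

results = index.query(query)
```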
To further refine the search results, you can apply various metadata filters. For example, if you’re interested in documents specifically related to “Nike”, use a Tag filter on the company field:
Filters allow you to combine searches over structured data (metadata) with vector similarity to improve retrieval precision.
The VectorQuery is just the starting point. For those looking to explore more advanced querying techniques and data types (text, tag, numeric, vector, geo), this dedicated user guide will get you started.
Simplify embedding generation
The vectorizer module provides access to popular embedding providers like Cohere, OpenAI, VertexAI, and HuggingFace, letting you quickly turn text into dense, semantic vectors.
Below is an example using the Cohere vectorizer, assuming you have the cohere Python library installed and your COHERE_API_KEY set in the environment:
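A sketch following the redisvl vectorizer interface; Cohere’s v3 embedding models distinguish between embedding documents for indexing and embedding queries via input_type:

```python
from redisvl.utils.vectorize import CohereTextVectorizer

# Reads COHERE_API_KEY from the environment
co = CohereTextVectorizer(model="embed-english-v3.0")

# Use "search_document" when indexing and "search_query" at query time
embedding = co.embed(
    "How much debt is the company in?",
    input_type="search_query",
)
```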
Learn more about working with Redis & Cohere in this dedicated integration guide!
Boost performance with semantic caching
redisvl goes beyond facilitating vector search and query operations in Redis; it aims to showcase practical use cases and common LLM design patterns.
Semantic Caching is designed to boost the efficiency of applications interacting with LLMs by caching responses based on semantic similarity. For example, when similar user queries are presented to the app, previously cached responses can be used instead of processing the query through the model again, significantly reducing response times and API costs.
To do this, use the SemanticCache interface. You can store user queries and response pairs in the semantic cache as follows:
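A sketch assuming a local Redis; distance_threshold controls how semantically close a new prompt must be to a cached one to count as a hit, and is worth tuning for your data:

```python
from redisvl.extensions.llmcache import SemanticCache

cache = SemanticCache(
    name="llmcache",
    redis_url="redis://localhost:6379",
    distance_threshold=0.1,
)

# Cache a prompt/response pair
cache.store(
    prompt="What is the capital of France?",
    response="Paris",
)

# A semantically similar prompt returns the cached response
hits = cache.check(prompt="What actually is the capital of France?")
if hits:
    print(hits[0]["response"])
```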
When a new query is received, its embedding is compared against those in the semantic cache. If a sufficiently similar embedding is found, the corresponding cached response is served, bypassing the need for another expensive LLM computation.
We’ll be adding additional abstractions shortly, including patterns for LLM session management and LLM contextual access control. Follow and ⭐ the redisvl GitHub repository to stay tuned!
Bringing it all together
If you’ve followed along this far, you’ll want to take a look at our end-to-end RAG tutorial, which walks through PDF data preparation (extraction, chunking, modeling), indexing, search, and question answering with an LLM.
This particular use case centers around processing and extracting insights from public 10-K filing PDFs, as introduced above. It’s been optimized for use on Google Colab so that you won’t need to worry about dependency management or environment setup!
Learn with additional resources
We hope you’re as excited as we are about building real-time GenAI apps with Redis. Get started by installing the client with pip:
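As above, the client installs from PyPI:

```shell
pip install redisvl
```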
We’re also providing these additional resources to help you take your learning to the next level:
| Resource | Description | Link |
| --- | --- | --- |
| Documentation | Hosted documentation for redisvl. | https://redisvl.com |
| GitHub | The GitHub repository for redisvl. | https://github.com/RedisVentures/redisvl |
| Tutorial | A step-by-step guide to using redisvl in a RAG pipeline from scratch. | https://github.com/redis-developer/financial-vss |
| Application | An end-to-end application showcasing Redis as a vector database for a document retrieval application with multiple embedding models. | https://github.com/redis-developer/redis-arXiv-search |
Connect with experts at Redis to learn more. Download the Redis vector search cheat sheet and register for our upcoming webinar with LangChain on February 29, 2024!
Stay tuned for more updates on our AI ecosystem integrations, clients, and developer-focused innovations.
The post Introducing the Redis Vector Library for Enhancing GenAI Development appeared first on Redis.