Skip to main content

2 posts tagged with "Ollama"

View All Tags

Integrating Spring AI with Vector Databases - A Guide Using PGVector

· 7 min read
Huseyin BABAL
Software Developer

What is a Vector Database?

A vector database is a specialized type of database optimized for storing, retrieving, and performing operations on vector data. Vectors, in this context, are typically arrays of numerical values that represent data in a multi-dimensional space. These are widely used in machine learning and AI for tasks like similarity search, where the goal is to find data points that are close to a given query point in this multi-dimensional space. Vector databases provide efficient indexing and querying capabilities for such operations, often leveraging advanced mathematical and computational techniques to ensure fast and accurate results.

What is PGVector?

pgvector is an extension for PostgreSQL that adds support for storing and querying vector data. It allows users to leverage PostgreSQL's powerful database capabilities while adding specialized functionality for vector operations. With pgvector, you can store high-dimensional vectors, perform similarity searches, and integrate vector operations seamlessly with your existing PostgreSQL databases.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with pgvector support in Rapidapp in seconds here

How Spring Integrates with Vector Databases

Spring, a popular framework for building Java applications, provides robust support for integrating with various types of databases, including vector databases like pgvector. Using Spring AI PGVector Store, developers can easily manage data access and integrate vector operations into their applications. Spring AI offers additional capabilities to enhance machine learning and AI integrations, making it a powerful choice for applications that require advanced data handling and analytics.

Creating a Spring Project

To get started, we'll create a new Spring project. This can be done using Spring Initializr or any other method you prefer. For simplicity, we'll use Spring Initializr here.

  1. Navigate to Spring Initializr: Open your browser and go to Spring Initializr.
  2. Project Settings: Set the following options:
    • Project: Maven Project
    • Language: Java
    • Spring Boot: (select the latest stable version)
    • Dependencies: Add Spring Web Download the project and unzip it. Open the pom.xml file and add the following dependencies to the dependencies section:
pom.xml
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>

Application YAML Configuration

Next, we need to configure our application to use PostgreSQL as a vector store. Update your application.yaml file as follows:

application.yaml
spring:
datasource:
url: jdbc:postgresql://<host>:<port>/<db>?application_name=rapidapp_spring_ai
username: <user>
password: <password
ai:
ollama:
embedding:
enabled: false
vectorstore:
pgvector:
index-type: hnsw
distance-type: cosine_distance
dimensions: 384

index-type: Specifies the type of index to be used for vector data. Common options include ivfflat and hnsw.

dimension: Indicates the dimensionality of the vectors being stored.

distance-type: Defines the distance metric used for similarity search, such as l2 (Euclidean distance) or ip (Inner Product).

Index Types

You can see the brief descriptions of index types used in pgvector below, but if you want to know more, you can refer here

HNSW

HNSW (Hierarchical Navigable Small World) is an advanced indexing algorithm designed for efficient approximate nearest neighbor search in high-dimensional spaces. It builds a graph structure where each node represents a vector, and edges represent connections to other vectors. The graph is navigable through multiple layers, allowing for fast and scalable searches by traversing the most relevant nodes. HNSW is known for its high accuracy and low search latency, making it suitable for real-time applications requiring quick similarity searches.

IVF Flat

IVF Flat (Inverted File Flat) is a popular indexing method that partitions the vector space into clusters using a coarse quantizer. Each vector is assigned to a cluster, and an inverted list is maintained for each cluster containing the vectors assigned to it. During a search, only the clusters closest to the query vector are examined, significantly reducing the number of comparisons needed. IVF Flat provides a good balance between search speed and accuracy, and it is especially effective when dealing with large datasets, as it limits the scope of the search to relevant clusters.

Distance Types

Distance types are metrics used to measure the similarity or dissimilarity between vectors in a vector database. Different applications and data types may require different distance metrics to ensure accurate and meaningful results. Here are some commonly used distance types

Euclidean Distance (L2)

This is the most widely used distance metric, measuring the straight-line distance between two points in a multi-dimensional space. It's calculated as the square root of the sum of the squared differences between corresponding elements of the vectors. Euclidean distance is suitable for general-purpose similarity searches and is often used in clustering algorithms.

Cosine Similarity

This metric measures the cosine of the angle between two vectors, providing a value between -1 and 1. Cosine similarity is particularly useful when the magnitude of the vectors is not important, focusing instead on the direction. It's commonly used in text mining and natural language processing to measure the similarity of documents or word embeddings.

Inner Product (Dot Product)

This metric calculates the sum of the products of corresponding elements of two vectors. It's often used in neural networks and machine learning models to measure the alignment between vectors. Inner product similarity is useful when comparing vectors where higher values indicate greater similarity.

Manhattan Distance (L1)

Also known as the city block distance, it measures the sum of the absolute differences between corresponding elements of two vectors. Manhattan distance is useful in scenarios where differences in individual dimensions are more significant than the overall geometric distance, such as in certain types of image processing.

Hamming Distance

This metric counts the number of positions at which the corresponding elements of two vectors are different. It's mainly used for binary vectors or strings of equal length, making it suitable for applications in error detection and correction, as well as DNA sequence analysis.

Choosing the right distance type depends on the specific requirements of your application and the nature of your data. Each distance metric has its strengths and weaknesses, and understanding these can help optimize the performance and accuracy of similarity searches in your vector database.

Implementing a Document Controller

Create a new controller that will manage vector data operations. Start by defining a service to handle vector store interactions.

DocumentController.java
@RestController
@RequestMapping("/documents")
class DocumentController {

@Autowired
private VectorStore vectorStore;

@PostMapping
public void create(@RequestBody CreateDocumentRequest request) {
vectorStore.add(List.of(new Document(request.text(), request.meta())));
}

@GetMapping
public String list(@RequestParam("query") String query) {
List<Document> results = vectorStore.similaritySearch(SearchRequest.query(query).withTopK(5));
return results.toString();
}
}

Create Documents

# Document 1
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus collects metrics from targets by scraping metrics HTTP endpoints. Since Prometheus exposes data in the same manner about itself, it can also scrape and monitor its own health.", "meta": {"category": "getting-started"}}' \
http://localhost:8080/documents

# Document 2
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus local time series database stores data in a custom, highly efficient format on local storage.", "meta": {"category": "storage"}}' \
http://localhost:8080/documents

Search Documents

curl http://localhost:8080/documents?query="scrape"

Conclusion

Integrating Spring AI with vector databases like pgvector provides powerful capabilities for handling vector data and performing advanced similarity searches. By leveraging Spring's robust framework and pgvector's specialized vector operations, developers can build sophisticated applications that effectively manage and analyze high-dimensional data. Rapidapp further enhances this setup with its user-friendly interface and built-in vector store support, making it easier than ever to develop and maintain vector-based applications..

tip

You can find the complete source code for this project on GitHub.

Building Devops AI Assistant with Langchain, Ollama, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

Vector databases emerge as a powerful tool for storing and searching high-dimensional data like document embeddings, offering lightning-fast similarity queries. This article delves into leveraging PostgreSQL, a popular relational database, as a vector database with the pgvector extension. We'll explore how to integrate it into a LangChain workflow for building a robust question-answering (QA) system.

What are Vector Databases?

Imagine a vast library holding countless documents. Traditional relational databases might classify them by subject or keyword. But what if you want to find documents most similar to a specific concept or question, even if keywords don't perfectly align? Vector databases excel in this scenario. They store data as numerical vectors in a high-dimensional space, where closeness in the space reflects semantic similarity. This enables efficient retrieval of similar documents based on their meaning, not just exact keyword matches.

PostgreSQL as Vector Database

PostgreSQL, a widely adopted and versatile relational database system, can be empowered with vector search capabilities using the pgvector extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

If you maintain PostgreSQL database on your own, you can enable pgvector extension by executing the following command for each database as shown below.

CREATE EXTENSION vector;

LangChain: Building Flexible AI Pipelines

LangChain is a powerful framework that facilitates the construction of modular AI pipelines. It allows you to chain together various AI components seamlessly, enabling the creation of complex and customizable workflows.

Your Use Case: Embedding Data for AI-powered QA

In your specific scenario, you're aiming to leverage vector search to enhance a question-answering system. Here's how the components might fit together:

  • Data Preprocessing: Process your documents (e.g., web pages) using Natural Language Processing (NLP) techniques to extract relevant text content. Generate vector representations of your documents using an appropriate AI library (e.g., OllamaEmbeddings in your code).

  • Embedding Storage with pgvector: Store the document vectors and their corresponding metadata (e.g., titles, URLs) in your PostgreSQL database table using pgvector.

  • Building the LangChain Workflow: Construct a LangChain pipeline that incorporates the following elements:

    • Retriever: This component retrieves relevant documents from your PostgreSQL database using vector similarity search powered by pgvector. When a user poses a question, the retriever searches for documents with vector representations closest to the query's vector.
    • Question Passage Transformer: (Optional) This component can further process the retrieved documents to extract snippets most relevant to the user's query.
    • Language Model (LLM): This component uses the retrieved context (potentially augmented with question-specific passages) to formulate a comprehensive response to the user's question.

DevOps AI Assistant: Step-by-step Implementation

We will implement the application by using Pyhton, and will use Poetry for dependency management.

Project Creation

Create a directory and initiate a project by running the following command:

poetry init

This will create a pyproject.toml file in the current directory.

Dependencies

You can install dependencies by running the following command:

poetry add langchain-cohere \
langchain-postgres \
langchain-community \
html2text \
tiktoken

Once you installed dependencies, you can create a empty main.py file to implement our business logic.

Preparing the PostgreSQL Connection URL

Once you create your database on Rapidapp, or use your own database, you can construct the PostgreSQL connection URL as follows. postgresql+psycopg://<user>:<pass>@<host>:<port>/<db>

Defining the Vector Store

connection = "<connection_string>"
collection_name = "prometheus_docs"
embeddings = OllamaEmbeddings()

vectorstore = PGVector(
embeddings=embeddings,
collection_name=collection_name,
connection=connection,
use_jsonb=True,
)

As you can see, we use embedding in the codebase. Your implementation can interact with different AI providers like OpenAI, HuggingFace, HuggingFace, Ollama, etc. Embedding provides a standard interface for all of them. In our case, we use OllamaEmbeddings, since we will be using Ollama as AI provider.

Line 2: This is the collection name in the PostgreSQL database where we store vector documents. In our case, we will store a couple of Prometheus document to help AI provider to decide answers to user's questions.

Line 5: LangChain has lots of vector store implementations and PGVector is one of them. This will help us to interact vector search with PostgreSQL database.

Indexing Documents

urls = ["https://prometheus.io/docs/prometheus/latest/getting_started/", "https://prometheus.io/docs/prometheus/latest/federation/"]
loader = AsyncHtmlLoader(urls)
docs = loader.load()

htmlToText = Html2TextTransformer()
docs_transformed = htmlToText.transform_documents(docs)

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
docs = splitter.split_documents(docs_transformed)
vectorstore.add_documents(docs)

Line 1-3: With the help of AsyncLoader, we simply load 2 documentation pages of Prometheus.

Line 5-6: Since we cannot use raw html files, we will convert them to text using Html2TextTransformer.

Line 8-11: RecursiveCharacterTextSplitter helps by chunking large text documents into manageable pieces that comply with vector store limitations, improve embedding efficiency, and potentially enhance retrieval accuracy.

Line 12: Store processed documents into vector store.

Building the LangChain Workflow

retriever = vectorstore.as_retriever()
llm = Ollama()

message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_messages([("human", message)])

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm
response = rag_chain.invoke("how to federate on prometheus")
print(response)

Above code snippet demonstrates how to use LangChain to retrieve information from a vector store and generate a response using a large language model (LLM) based on the retrieved information. Let's break it down step-by-step:

Line 1: This line assumes you have a vector store set up and imports a function to use it as a retriever within LangChain. The retriever will be responsible for fetching relevant information based on a query.

Line 2: This line initializes an instance of the Ollama LLM, which will be used to generate the response to the question.

Line 4: The code defines a multi-line string variable named message. This string uses a template format to include two sections: question: This section will hold the specific question you want to answer. context: This section will contain the relevant background information for the question.

Line 13: Generates chat prompt template.

Line 15: Here the question and context is piped to template to generate prompt, then passed to llm to generate the response. Be sure this is a runnable chain.

Line 16: We invoke the chain with a question and get the response.

Conclusion

In this practical guide, we've delved into using PostgreSQL as a vector database, leveraging the pgvector extension. We explored how this approach can be used to build a context-aware AI assistant, focusing on Prometheus documentation as an example. By storing document embeddings alongside their metadata, we enabled the assistant to retrieve relevant information based on semantic similarity, going beyond simple keyword matching. LangChain played a crucial role in this process. Its modular framework allowed us to effortlessly connect various AI components, like PGVector for vector retrieval and OllamaEmbeddings for interacting with our chosen AI provider. Furthermore, LangChain's ability to incorporate context within user questions significantly enhances the relevance and accuracy of the assistant's responses.

tip

You can find the complete source code for this project on GitHub.