Innovative RAG Pipeline Development with Haystack Framework

Haystack has emerged as a dynamic open-source framework that enables developers to construct production-grade applications utilizing Large Language Models (LLMs), retrieval-augmented generation (RAG) pipelines, and cutting-edge search systems. Thanks to its modular architecture and extensive tools, Haystack streamlines the integration of advanced AI models, allowing developers to create sophisticated applications capable of scaling across large document repositories. Whether you're an AI novice or an experienced developer, Haystack equips you with the necessary tools to realize your AI initiatives.

What is Haystack?

Haystack is an open-source framework crafted for developing LLM applications and advanced search systems that efficiently handle extensive document collections. It empowers developers to experiment with state-of-the-art AI models while offering the versatility to incorporate various data sources and model providers. The framework has evolved through community contributions, resulting in a modular, user-friendly, and comprehensive solution that addresses a diverse array of AI applications.

Key Features of Haystack

  • Model Integration: Haystack supports models from platforms such as Hugging Face, OpenAI, Cohere, and Mistral, enabling developers to utilize top-tier AI models.
  • Document Stores: Seamlessly connect with leading document stores like OpenSearch, Pinecone, Weaviate, and Qdrant to effectively manage and retrieve large datasets.
  • Community Integrations: A growing array of community-driven integrations offers tools for evaluation, monitoring, data ingestion, and more, ensuring your LLM application is resilient and ready for production.

Building with Haystack

Haystack provides a robust toolkit for constructing advanced AI systems:

  • RAG Pipelines: Deploy sophisticated RAG systems on your datasets using cutting-edge retrieval and generation methods.
  • Chatbots and Agents: Develop chatbots powered by leading generative models like GPT-4, capable of interacting with external functions and services.
  • Generative Multi-Modal QA: Construct question-answering systems that manage various information types, including text, images, audio, and tables.
  • Information Extraction: Derive essential information from documents to populate databases or create knowledge graphs.

End-to-End Functionality for LLM Projects

A successful LLM project encompasses more than just language models. Haystack delivers comprehensive support to ensure your project thrives:

  • Model Integration: Effortlessly integrate models from Hugging Face and other providers into your workflow.
  • Data Retrieval: Incorporate diverse data sources for retrieval augmentation, guaranteeing your models access relevant information.
  • Advanced Prompting: Utilize the Jinja2 templating language for dynamic LLM prompting, customized for your requirements (see the PromptBuilder sketch after this list).
  • Data Preprocessing: Access built-in cleaning and preprocessing functionalities for multiple data formats and sources.
  • Document Stores: Keep your GenAI applications current with Haystack’s indexing pipelines, designed to prepare and maintain your data.
  • Evaluation Tools: Employ specialized tools to assess your entire system or specific components using various metrics.
  • Hayhooks Module: Serve Haystack Pipelines via HTTP endpoints, simplifying deployment and scalability of your applications.
  • Custom Logging: A customizable logging framework supports structured logging and tracing, with integrations for OpenTelemetry and Datadog.
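To make the Jinja2 prompting point concrete, here is a minimal PromptBuilder sketch; the template text and the variables (topic, n) are illustrative only, not from the original article:

from haystack.components.builders import PromptBuilder

# Render a Jinja2 template with runtime variables
builder = PromptBuilder(template="Summarize {{ topic }} in {{ n }} bullet points.")
result = builder.run(topic="vector search", n=3)
print(result["prompt"])  # Summarize vector search in 3 bullet points.

PromptBuilder fills the template from the keyword arguments passed to run() and returns the rendered string under the "prompt" key, which is exactly how it is used in the RAG pipeline later in this article.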

Haystack's Building Blocks

The architecture of Haystack is built around two core concepts: components and pipelines.

  • Components: These serve as the foundational elements of Haystack, handling operations like document retrieval, text generation, and embedding creation. Components can run local language models or call hosted models via APIs, and developers can use pre-built components or write their own (a minimal custom-component sketch follows this list).
  • Pipelines: Pipelines outline the data flow within your LLM application, consisting of interconnected components for flexible data processing. They can branch, merge, and cycle back, providing powerful abstractions for intricate workflows.
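To make the component abstraction concrete, here is a minimal sketch of a custom component built with Haystack 2.x's @component decorator; the UppercaseConverter class and its behavior are invented purely for illustration:

from haystack import component

@component
class UppercaseConverter:
    # Toy custom component: declares one output socket and implements run()
    @component.output_types(text=str)
    def run(self, text: str):
        return {"text": text.upper()}

# Components work standalone or wired into a pipeline
print(UppercaseConverter().run(text="hello haystack"))  # {'text': 'HELLO HAYSTACK'}

The decorator inspects run()'s parameters to define the component's inputs, and output_types declares what it emits, which is how pipelines know which sockets can be connected.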

Who is Haystack For?

Haystack caters to anyone interested in developing AI applications, ranging from LLM enthusiasts to developers with basic Python skills. You don’t need to be a machine learning expert to begin using Haystack. Its user-friendly design and extensive documentation make it approachable for both novices and seasoned professionals.

CODE EXAMPLE: Constructing a RAG Pipeline with Haystack

Step 1: Preparing the Environment

Enable GPU runtime in Colab and set the logging level to INFO. Then, install Haystack and the necessary packages:

pip install haystack-ai
pip install "datasets>=2.6.1"
pip install "sentence-transformers>=3.0.0"

Step 2: Fetching and Indexing Documents

Begin by initializing a DocumentStore, retrieving data, and creating embeddings for your documents:

from haystack.document_stores.in_memory import InMemoryDocumentStore
from datasets import load_dataset
from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

# Initialize DocumentStore
document_store = InMemoryDocumentStore()

# Fetch the data
dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
docs = [Document(content=doc["content"], meta=doc["meta"]) for doc in dataset]

# Initialize a Document Embedder
doc_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
doc_embedder.warm_up()

# Create embeddings and write documents to the DocumentStore
docs_with_embeddings = doc_embedder.run(docs)
document_store.write_documents(docs_with_embeddings["documents"])
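As a quick sanity check (not part of the original snippet), you can confirm that the write succeeded before moving on:

# Should print the number of indexed documents
print(document_store.count_documents())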

Step 3: Building the RAG Pipeline

Now, initialize the components and assemble the pipeline:

from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack import Pipeline

# Initialize a Text Embedder (it must use the same model as the document embedder)
text_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")

# Initialize the Retriever over the populated DocumentStore
retriever = InMemoryEmbeddingRetriever(document_store)

# Define a Jinja2 prompt template
template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}

Question: {{ question }}

Answer:
"""

prompt_builder = PromptBuilder(template=template)

# Initialize a Generator, prompting for the API key if it is not already set
import os
from getpass import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key:")

generator = OpenAIGenerator(model="gpt-3.5-turbo")

# Build the Pipeline
basic_rag_pipeline = Pipeline()
basic_rag_pipeline.add_component("text_embedder", text_embedder)
basic_rag_pipeline.add_component("retriever", retriever)
basic_rag_pipeline.add_component("prompt_builder", prompt_builder)
basic_rag_pipeline.add_component("llm", generator)

# Connect the components: query embedding -> retriever -> prompt -> LLM
basic_rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
basic_rag_pipeline.connect("prompt_builder", "llm")

Step 4: Asking a Question

Utilize the pipeline to pose a question and receive a generated response:

question = "What does Rhodes Statue look like?"

response = basic_rag_pipeline.run({"text_embedder": {"text": question}, "prompt_builder": {"question": question}})

print(response["llm"]["replies"][0])

You can try additional questions like:

examples = [
    "Where is Gardens of Babylon?",
    "Why did people build Great Pyramid of Giza?",
    "What does Rhodes Statue look like?",
    "Why did people visit the Temple of Artemis?",
    "What is the importance of Colossus of Rhodes?",
    "What happened to the Tomb of Mausolus?",
    "How did Colossus of Rhodes collapse?",
]

for q in examples:
    response = basic_rag_pipeline.run({"text_embedder": {"text": q}, "prompt_builder": {"question": q}})
    print(f"Question: {q}")
    print(f"Answer: {response['llm']['replies'][0]}\n")

BONUS: A Google Colab example of using a local LLM with Haystack for research-paper RAG:

https://colab.research.google.com/drive/1A5gniIhI6Dqj_U2tdRnruxvtSMDaeqq0?usp=sharing

  • Parts of this article were generated using Generative AI
  • Subscribe or leave a comment to stay updated with the latest AI trends.

Plug: Explore my digital products on Gumroad here. Please purchase ONLY if you can afford to. Use code: MEDSUB for a 10% discount!

Sponsor: Order Trendy New Fashions Here at ItsBenLifeStyle