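The previous article, "Querying the NEO4J Graph Database with a Large Language Model (LLM) (1)", showed how to query NEO4J with GraphCypherQAChain. That approach is quick and simple to set up, but it is hard to customize, which can become a problem in production.

This article uses the LangGraph framework to query the NEO4J graph database with an LLM (large language model). LangGraph lets you define a clear, multi-step workflow, so it copes better with more complex scenarios.

The LangGraph flow we are about to implement looks like this:
[Figure: LLM querying the NEO4J graph database]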

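Contents

    • Defining the state
    • The first node: guardrails
    • Node: generate_cypher (generating the NEO4J query statement)
      • Enhancing the prompt with a few examples
      • Inferring Cypher from the prompt
    • Node: executing the Cypher query
    • Generating the final answer
    • Building the workflow
    • Seeing it in action
    • Summary
    • Code
    • References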

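Defining the state

We start by defining the input, output, and overall state of the LangGraph application.
You can think of the state as the data format that nodes use to exchange data with each other. All three classes inherit from TypedDict.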

from operator import add
from typing import Annotated, List

from typing_extensions import TypedDict


class InputState(TypedDict):
    """Input"""
    question: str


class OverallState(TypedDict):
    """Overall state"""
    question: str
    next_action: str
    cypher_statement: str
    cypher_errors: List[str]
    database_records: List[dict]
    # The `add` reducer appends each node's steps instead of overwriting them
    steps: Annotated[List[str], add]


class OutputState(TypedDict):
    """Output"""
    answer: str
    steps: List[str]
    cypher_statement: str
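To make the `Annotated[List[str], add]` annotation more concrete, here is a minimal sketch of how a reducer attached to a state key can be applied when a node returns a partial update. It is a simplified illustration in plain Python, not LangGraph's actual implementation; `DemoState` and `merge_update` are hypothetical names introduced only for this example.

```python
from operator import add
from typing import Annotated, Any, Dict, List, TypedDict, get_type_hints


class DemoState(TypedDict):
    """Hypothetical, reduced version of the article's OverallState."""
    question: str
    steps: Annotated[List[str], add]


def merge_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into the state (simplified illustration)."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types carry their extra arguments in __metadata__
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            # A reducer is attached: combine the old and new values
            merged[key] = metadata[0](merged[key], value)
        else:
            # No reducer (or no previous value): the new value is stored as-is
            merged[key] = value
    return merged


state: Dict[str, Any] = {"question": "Who acted in Casino?"}
state = merge_update(state, {"steps": ["guardrail"]})
state = merge_update(state, {"steps": ["generate_cypher"]})
print(state["steps"])
```

Because `steps` carries the `add` reducer, each node only needs to return its own step name and the list grows as the workflow runs, whereas keys without a reducer are simply overwritten.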

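The first node: guardrails

The first node, guardrails, is a simple "guardrail" step: it checks whether the question is about movies or their cast. If it is not, we tell the user that we cannot answer any other kind of question. Otherwise we move on to the Cypher generation node.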

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from pydantic import BaseModel, Field

guardrails_system = """
As an intelligent assistant, your primary objective is to decide whether a given question is related to movies or not. 
If the question is related to movies, output "movie". Otherwise, output "end".
To make this decision, assess the content of the question and determine if it refers to any movie, actor, director, film industry, 
or related topics. Provide only the specified output: "movie" or "end".
"""
guardrails_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", guardrails_system),
        ("human", "{question}"),
    ]
)


class GuardrailsOutput(BaseModel):
    decision: Literal["movie", "end"] = Field(
        description="Decision on whether the question is related to movies"
    )


llm_llama = ChatOllama(model="llama3.1", temperature=0, verbose=True)

# The structured output forces the model to answer with "movie" or "end"
guardrails_chain = guardrails_prompt | llm_llama.with_structured_output(GuardrailsOutput)


def guardrails(state: InputState) -> OverallState:
    """Decides if the question is related to movies or not."""
    guardrails_output = guardrails_chain.invoke({"question": state.get("question")})
    database_records = None
    if guardrails_output.decision == "end":
        # Passed to the final answer node in place of real database records
        database_records = "This question is not about movies or their cast. Therefore I cannot answer this question."
    return {
        "next_action": guardrails_output.decision,
        "database_records": database_records,
        "steps": ["guardrail"],
    }

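This node uses llama3.1 and a prompt to decide whether the question is related to movies. If it is, the node returns movie, and the following nodes generate Cypher and query the NEO4J graph database. If it is not, the node returns end, and the question goes straight to the final answer node, where the LLM responds.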
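The structured output above constrains the decision to exactly two values. As a rough sketch of that same constraint in plain Python, the hypothetical helper `parse_decision` below normalizes raw model text and falls back to "end" for anything unexpected. The fallback policy is an assumption made for this illustration, not something the original code does.

```python
from typing import Literal, get_args

Decision = Literal["movie", "end"]
ALLOWED = get_args(Decision)  # the two permitted values


def parse_decision(raw: str) -> Decision:
    """Normalize raw model text to one of the allowed decisions."""
    cleaned = raw.strip().strip("\"'").lower()
    if cleaned in ALLOWED:
        return cleaned  # type: ignore[return-value]
    # Assumption: anything unexpected is treated as off-topic
    return "end"


print(parse_decision(' "Movie" '))
print(parse_decision("not sure"))
```

Failing closed like this keeps unrelated or malformed output away from the Cypher generation step.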

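Node: generate_cypher (generating the NEO4J query statement)

Enhancing the prompt with a few examples

Turning natural language into an accurate Cypher query is hard. One way to improve the result is to give the LLM relevant few-shot examples to guide query generation. To do that, we use SemanticSimilarityExampleSelector to pick the most relevant examples dynamically.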

# Few-shot prompting
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_neo4j import Neo4jVector
from langchain_ollama import OllamaEmbeddings

examples = [
    {
        "question": "How many artists are there?",
        "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)",
    },
    {
        "question": "Which actors played in the movie Casino?",
        "query": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name",
    },
    {
        "question": "How many movies has Tom Hanks acted in?",
        "query": "MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)",
    },
    {
        "question": "List all the genres of the movie Schindler's List",
        # The apostrophe is escaped so the Cypher string literal stays valid
        "query": "MATCH (m:Movie {title: 'Schindler\\'s List'})-[:IN_GENRE]->(g:Genre) RETURN g.name",
    },
    {
        "question": "Which actors have worked in movies from both the comedy and action genres?",
        "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name",
    },
    {
        "question": "Which directors have made movies with at least three different actors named 'John'?",
        "query": "MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person) WHERE a.name STARTS WITH 'John' WITH d, COUNT(DISTINCT a) AS JohnsCount WHERE JohnsCount >= 3 RETURN d.name",
    },
    {
        "question": "Identify movies where directors also played a role in the film.",
        "query": "MATCH (p:Person)-[:DIRECTED]->(m:Movie), (p)-[:ACTED_IN]->(m) RETURN m.title, p.name",
    },
    {
        "question": "Find the actor with the highest number of movies in the database.",
        "query": "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) RETURN a.name, COUNT(m) AS movieCount ORDER BY movieCount DESC LIMIT 1",
    },
]

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# Store the examples as vectors in Neo4j and return the 5 closest to each question
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, embeddings, Neo4jVector, k=5, input_keys=["question"]
)
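To show what "semantic similarity selection" boils down to, here is a self-contained sketch that ranks examples by cosine similarity and keeps the top k. The `embed` function is a toy word-count stand-in for a real embedding model such as nomic-embed-text, and `select_examples` is a hypothetical helper. The real selector embeds the examples with the model and searches them in a vector store.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(question: str, examples: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return the k examples whose questions are most similar to the input."""
    query_vec = embed(question)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query_vec, embed(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


demo_examples = [
    {"question": "How many artists are there?",
     "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)"},
    {"question": "Which actors played in the movie Casino?",
     "query": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name"},
    {"question": "How many movies has Tom Hanks acted in?",
     "query": "MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)"},
]

top = select_examples("What was the cast of the Casino?", demo_examples, k=1)
print(top[0]["question"])
```

Word counts only match on shared vocabulary, while a real embedding model can also relate words like "cast" and "actors". The ranking step is the same idea in both cases.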

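Inferring Cypher from the prompt

Next we implement the Cypher generation chain. The prompt contains the graph schema, the dynamically selected few-shot examples, and the user's question. Together these let the model generate a Cypher query that retrieves the relevant information from the graph database.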

import os

from langchain_core.output_parsers import StrOutputParser
from langchain_neo4j import Neo4jGraph


def create_enhanced_graph():
    """Creates the NEO4J graph object."""
    os.environ["NEO4J_URI"] = "bolt://localhost:7687"
    os.environ["NEO4J_USERNAME"] = "neo4j"
    os.environ["NEO4J_PASSWORD"] = "neo4j"

    enhanced_graph = Neo4jGraph(enhanced_schema=True)
    # print(enhanced_graph.schema)
    return enhanced_graph


enhanced_graph = create_enhanced_graph()

text2cypher_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            (
                "Given an input question, convert it to a Cypher query. No pre-amble. "
                "Do not wrap the response in any backticks or anything else. Respond with a Cypher statement only!"
            ),
        ),
        (
            "human",
            (
                """You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.
Do not wrap the response in any backticks or anything else. Respond with a Cypher statement only!
Here is the schema information
{schema}

Below are a number of examples of questions and their corresponding Cypher queries.

{fewshot_examples}

User input: {question}
Cypher query:"""
            ),
        ),
    ]
)

llm_qwen = ChatOllama(model="qwen2.5", temperature=0, verbose=True)
text2cypher_chain = text2cypher_prompt | llm_qwen | StrOutputParser()


def generate_cypher(state: OverallState) -> OverallState:
    """Generates a cypher statement based on the provided schema and user input"""
    NL = "\n"
    # Render the selected examples as "Question / Cypher" pairs separated by blank lines
    fewshot_examples = (NL * 2).join(
        [
            f"Question: {el['question']}{NL}Cypher:{el['query']}"
            for el in example_selector.select_examples(
                {"question": state.get("question")}
            )
        ]
    )
    generated_cypher = text2cypher_chain.invoke(
        {
            "question": state.get("question"),
            "fewshot_examples": fewshot_examples,
            "schema": enhanced_graph.schema,
        }
    )
    return {"cypher_statement": generated_cypher, "steps": ["generate_cypher"]}
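To see roughly what text the model receives, the following sketch fills the human-message part of the prompt with plain `str.format`. The one-line schema is a simplified placeholder chosen for this example (the real `enhanced_graph.schema` is much longer), and `format_fewshot` is a hypothetical helper that mirrors the join used in `generate_cypher`.

```python
from typing import Dict, List

NL = "\n"

# Same wording as the human message in the article's prompt
TEMPLATE = (
    "Here is the schema information\n"
    "{schema}\n\n"
    "Below are a number of examples of questions and their corresponding Cypher queries.\n\n"
    "{fewshot_examples}\n\n"
    "User input: {question}\n"
    "Cypher query:"
)


def format_fewshot(examples: List[Dict[str, str]]) -> str:
    """Render examples as 'Question / Cypher' pairs separated by blank lines."""
    return (NL * 2).join(
        f"Question: {el['question']}{NL}Cypher:{el['query']}" for el in examples
    )


selected = [
    {"question": "Which actors played in the movie Casino?",
     "query": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name"},
    {"question": "How many movies has Tom Hanks acted in?",
     "query": "MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)"},
]

prompt_text = TEMPLATE.format(
    schema="(:Person)-[:ACTED_IN]->(:Movie)",  # placeholder schema, assumption
    fewshot_examples=format_fewshot(selected),
    question="What was the cast of the Casino?",
)
print(prompt_text)
```

Note that the curly braces inside the Cypher examples are safe here because they arrive as substituted values rather than as part of the template itself.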

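Node: executing the Cypher query

Now we add a node that executes the generated Cypher statement. If the graph database returns no results, we should tell the LLM so explicitly, because an empty context sometimes leads the LLM to hallucinate.

You can add nodes such as query validation and query correction before this node to improve accuracy. Such nodes do not always deliver the expected improvement, though, because they can make mistakes themselves, so treat them with care.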

no_results = "I couldn't find any relevant information in the database"


def execute_cypher(state: OverallState) -> OverallState:
    """Executes the given Cypher statement."""
    records = enhanced_graph.query(state.get("cypher_statement"))
    return {
        # An explicit message replaces an empty result so the LLM gets some context
        "database_records": records if records else no_results,
        "next_action": "end",
        "steps": ["execute_cypher"],
    }
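As one possible shape for the validation step mentioned above, the sketch below rejects empty statements and statements that contain write clauses before they reach the database. `validate_cypher_statement` is a hypothetical helper, and the keyword check is a heuristic rather than a Cypher parser: it ignores quoted strings but can still be fooled, for example by backtick-quoted identifiers or by procedures that write data.

```python
import re
from typing import List

# Matches single- or double-quoted string literals, including escaped quotes
_STRING_LITERAL = re.compile(r"'(?:\\.|[^'\\])*'" + "|" + r'"(?:\\.|[^"\\])*"')
# Clauses that modify the graph (not an exhaustive list)
_WRITE_KEYWORD = re.compile(r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE)


def validate_cypher_statement(cypher: str) -> List[str]:
    """Return a list of error messages; an empty list means no problem was found."""
    if not cypher.strip():
        return ["Cypher statement is empty."]
    # Blank out string literals so words like 'The Set' are not flagged
    without_strings = _STRING_LITERAL.sub("''", cypher)
    found = sorted({m.group(1).upper() for m in _WRITE_KEYWORD.finditer(without_strings)})
    errors: List[str] = []
    if found:
        errors.append("Write clauses are not allowed: " + ", ".join(found))
    return errors


print(validate_cypher_statement(
    "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name"
))
print(validate_cypher_statement("MATCH (m:Movie {title: 'The Set'}) DETACH DELETE m"))
```

A node built on a check like this could store the messages in the `cypher_errors` field of the state and route back to generation or on to the final answer. That routing is left out here.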

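Generating the final answer

The last step is to generate the answer. It combines the original question with the output of the graph database to produce a relevant response.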

generate_final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        (
            "human",
            (
                """Use the following results retrieved from a database to provide
a succinct, definitive answer to the user's question.

Respond as if you are answering the question directly.

Results: {results}
Question: {question}"""
            ),
        ),
    ]
)

generate_final_chain = generate_final_prompt | llm_llama | StrOutputParser()


def generate_final_answer(state: OverallState) -> OutputState:
    """Generates the final answer from the question and the database records."""
    final_answer = generate_final_chain.invoke(
        {"question": state.get("question"), "results": state.get("database_records")}
    )
    return {"answer": final_answer, "steps": ["generate_final_answer"]}

Building the workflow

Now we implement the LangGraph workflow.

First, define the conditional edge function:

def guardrails_condition(
    state: OverallState,
) -> Literal["generate_cypher", "generate_final_answer"]:
    if state.get("next_action") == "end":
        return "generate_final_answer"
    elif state.get("next_action") == "movie":
        return "generate_cypher"

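This function is attached after the guardrails node. It reads the next_action value that guardrails produced and decides which node to route to: generate_cypher for movie questions, generate_final_answer otherwise.

The code below connects the nodes and edges above into a complete workflow: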

from langgraph.graph import END, START, StateGraph

langgraph = StateGraph(OverallState, input=InputState, output=OutputState)

# Each node is registered under its function name
langgraph.add_node(guardrails)
langgraph.add_node(generate_cypher)
langgraph.add_node(execute_cypher)
langgraph.add_node(generate_final_answer)

langgraph.add_edge(START, "guardrails")
langgraph.add_conditional_edges(
    "guardrails",
    guardrails_condition,
)
langgraph.add_edge("generate_cypher", "execute_cypher")
langgraph.add_edge("execute_cypher", "generate_final_answer")
langgraph.add_edge("generate_final_answer", END)

langgraph = langgraph.compile()
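To trace the routing without a model or a database, here is a plain-Python simulation of the same flow. Every node is a stub: the guardrail is a naive keyword check and the database record is made-up sample data, so this only illustrates the order in which nodes run, not how LangGraph executes a graph internally. All `stub_*` names and the `run` helper are hypothetical.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def stub_guardrails(state: State) -> State:
    # Stand-in for the LLM classifier: a naive keyword check
    on_topic = any(w in state["question"].lower() for w in ("movie", "cast", "actor"))
    return {"next_action": "movie" if on_topic else "end", "steps": ["guardrail"]}


def stub_generate_cypher(state: State) -> State:
    cypher = "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name"
    return {"cypher_statement": cypher, "steps": ["generate_cypher"]}


def stub_execute_cypher(state: State) -> State:
    # Made-up record standing in for a real query result
    records = [{"a.name": "Robert De Niro"}]
    return {"database_records": records, "next_action": "end", "steps": ["execute_cypher"]}


def stub_generate_final_answer(state: State) -> State:
    return {"answer": f"records: {state.get('database_records')}", "steps": ["generate_final_answer"]}


def guardrails_condition(state: State) -> str:
    return "generate_final_answer" if state.get("next_action") == "end" else "generate_cypher"


NODES: Dict[str, Callable[[State], State]] = {
    "guardrails": stub_guardrails,
    "generate_cypher": stub_generate_cypher,
    "execute_cypher": stub_execute_cypher,
    "generate_final_answer": stub_generate_final_answer,
}

# Each entry returns the name of the next node; only the first one is conditional
EDGES: Dict[str, Callable[[State], str]] = {
    "guardrails": guardrails_condition,
    "generate_cypher": lambda s: "execute_cypher",
    "execute_cypher": lambda s: "generate_final_answer",
    "generate_final_answer": lambda s: "END",
}


def run(question: str) -> State:
    """Walk the graph from the first node until END, accumulating steps."""
    state: State = {"question": question, "steps": []}
    current = "guardrails"
    while current != "END":
        update = NODES[current](state)
        steps = state["steps"] + update.pop("steps", [])
        state = {**state, **update, "steps": steps}
        current = EDGES[current](state)
    return state


print(run("What's the weather in Spain?")["steps"])
print(run("What was the cast of the Casino?")["steps"])
```

The off-topic question skips both Cypher nodes, while the movie question passes through all four.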

Seeing it in action

Everything is in place. Let's ask the finished LangGraph workflow two questions and see how it does:

def ask(question: str):
    response = langgraph.invoke({"question": question})
    print(f'response:\n{response["answer"]}')


ask("What's the weather in Spain?")
ask("What was the cast of the Casino?")

The first question has nothing to do with movies, so NEO4J was not queried and the LLM answered directly:

I'm happy to help with that! Unfortunately, I don't have access to real-time weather information for specific locations like Spain. However, I can suggest checking a reliable weather website or app, such as AccuWeather or Weather.com, for the most up-to-date forecast.

Would you like me to provide some general information about Spain's climate instead?

The second question took longer to run, and the final answer was:

The cast of the movie "Casino" included James Woods, Joe Pesci, Robert De Niro, and Sharon Stone.

Nice!

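Summary

This article showed how to build a graph-based workflow with LangGraph, a more involved approach than the previous one, and use it to handle queries against graph data.
In my view, the drawback of this approach is that it takes more work to set up. The benefits are a clear structure that is easy to customize and modify, which makes it a better fit for building more complex AI applications or agents in production.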


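Code

All the code and related resources for this article are shared here:

  • github
  • gitee

To make the code easy to find, the number at the start of each program file name matches the document number of the corresponding article in this series.

References

  • Build a Question Answering application over a Graph Database

🪐 Thanks for reading, and good luck 🪐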
