LlamaIndex 提示词模板

2025-06-27 张子阳分类: 大语言模型

在LlamaIndex中，不论是构建索引、提出问题，还是合成响应结果，都会用到提示词。只是在默认情况下，对于用户而言，没有感知罢了。这篇文章将简单介绍LlamaIndex中提示词的使用方法，从而加深对于LlamaIndex的理解。

直接使用LLM模型

我们先从一个最简单的例子看起。再看一下《LlamaIndex 使用大语言模型》中的例子：

from llama_index.llms.openai import OpenAI from dotenv import load_dotenv load_dotenv() llm = OpenAI(model="gpt-4.1-mini") response = llm.complete("帮我介绍一下这本书：《人性的枷锁》") print(response.text)

返回的结果类似下面这样：

《人性的枷锁》是英国作家威廉·萨默塞特·毛姆（W. Somerset Maugham）创作的一部经典小说，首次出版于1915年。小说通过主人公菲利普·凯里的经历，深刻探讨了人性的复杂性、自由与束缚、理想与现实之间的矛盾 ...

可以看到，仅用了几行代码，就完成了一个简单的问答任务。接下来，我们使用之前学习的内容，引入LlamaIndex的能力，构建一个VectorStoreIndex，然后提出相同的问题，看一下输出的结果：

使用空的VectorStoreIndex

from llama_index.core import VectorStoreIndex from llama_index.llms.openai import OpenAI from dotenv import load_dotenv load_dotenv() llm = OpenAI(model="gpt-4.1-mini") index = VectorStoreIndex(nodes=[]) # 空索引，不包含任何节点 query_engine = index.as_query_engine(llm=llm) response = query_engine.query("帮我介绍一下这本书：《人性的枷锁》") print(response)

执行一下，输出的结果是这样的：Empty Response。为什么引入RAG后，反而得不到结果了呢？先回顾一下query_engin的处理流程：

检索：从索引中查找与问题相关的上下文（context_str）
增强生成：根据提示词模板，将context_str+用户问题，拼接后发送给LLM生成答案

查看提示词模板

接下来我们查看一下具体的提示词模板是什么。

query_engine = index.as_query_engine(llm=llm) # 获取query_engin的默认prompts prompt_dic = query_engine.get_prompts() for key, value in prompt_dic.items(): print(f"{key}:\n {value.get_template() }\n")

上面代码的输出结果如下：

response_synthesizer:text_qa_template: Context information is below. --------------------- {context_str} --------------------- Given the context information and not prior knowledge, answer the query. Query: {query_str} Answer: response_synthesizer:refine_template: The original query is as follows: {query_str} We have provided an existing answer: {existing_answer} We have the opportunity to refine the existing answer (only if needed) with some more context below. ------------ {context_msg} ------------ Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer. Refined Answer:

上面的模板，尤其是第一个模板text_qa_template，就是LlamaIndex用来拼接初始请求的。注意到其中的一句话：Given the context information and not prior knowledge, answer the query。意思是：在只依赖上下文信息而非先验知识的情况下，回答该问题。

修改默认的prompt_template

可以通过下面的方法来修改默认的text_qa_template。我们修改提示词模板，让模型在回答时，可以使用先验知识（也就是训练时的通用信息）。

index = VectorStoreIndex(nodes=[]) text_qa_template=PromptTemplate( """ 下面是上下文信息： --------------------- {context_str} --------------------- 根据上下文信息以及先验知识，回答问题。问题: {query_str} 回答:""" ) query_engine = index.as_query_engine( text_qa_template = text_qa_template, response_mode ="compact", llm=llm) prompt_dic = query_engine.get_prompts() print(prompt_dic["response_synthesizer:text_qa_template"].get_template()) response = query_engine.query("帮我介绍一下这本书：《人性的枷锁》") print(response)

上面的代码，在输出时，依然为：Empty Response。这是因为LlamaIndex在查询时，如果没有检索到任何节点（VectorStoreIndex的节点为0），就会直接返回空结果。

对比不同模版下的输出

我们手动创建一个非空，但是不包含任何知识的 VectorStoreIndex，但这一次我们使用自定义的模版，然后再次执行：

index = VectorStoreIndex(nodes=[TextNode(text="empty")]) text_qa_template=PromptTemplate( """ 下面是上下文信息： --------------------- {context_str} --------------------- 根据上下文信息以及先验知识，回答问题。问题: {query_str} 回答:""" ) # 输出结果： # 《人性的枷锁》是英国作家威廉·萨默塞特·毛姆（W. Somerset Maugham）创作的一部著名小说，首次出版于1915年。...

此时的输出结果，相当于我们直接调用 llm.compleate("帮我介绍一下这本书：《人性的枷锁》")

为了对比一下prompt_compleate所产生的效果，删掉自定义的prompt_template，使用LlamaIndex的默认template，也就是上面的text_qa_template。其值为下面这样，最核心的区别是不使用先验知识，也就是预训练数据(not prior knowledge)：

Context information is below. --------------------- {context_str} --------------------- Given the context information and not prior knowledge, answer the query. Query: {query_str} Answer:

再次执行脚本：

llm = OpenAI(model="gpt-4.1-mini") index = VectorStoreIndex(nodes=[TextNode(text="empty")]) query_engine = index.as_query_engine(llm=llm) response = query_engine.query("帮我介绍一下这本书：《人性的枷锁》") print(response) # 输出结果： # 抱歉，目前没有关于《人性的枷锁》的相关信息，无法为您提供介绍。

由此可见，通过修改text_qa_template，我们就可以改变llm的输出结果。这次，没有直接返回Empty Response（索引节点为空），但是因为llm也没有知识来回答我们的问题（not prior knowledge）。于是，呈现了上面的结果。

获得发给LLM的原始内容

在上面的代码中，我们通过在query_engine上面调用 get_prompts() 获得了所有的 prompts提示词模板。看到其中有context_str 和 query_str 两个占位符。这些占位符在实际发送给大模型时，将被替换为合适的值，那么如何获得发送给大模型的原始内容呢？可以通过Observability机制来完成，具体可以参看官方Observability文档。修改代码：

from pydoc import text from llama_index.core import PromptTemplate, VectorStoreIndex from llama_index.llms.openai import OpenAI from llama_index.core.schema import TextNode from llama_index.core.instrumentation.event_handlers import BaseEventHandler from llama_index.core.instrumentation.events.llm import LLMCompletionEndEvent, LLMChatEndEvent from llama_index.core.instrumentation.events.embedding import EmbeddingEndEvent from llama_index.core.instrumentation import get_dispatcher from dotenv import load_dotenv load_dotenv() class ModelEventHandler(BaseEventHandler): @classmethod def class_name(cls) -> str: return "ModelEventHandler" def handle(self, event) -> None: """处理事件""" if isinstance(event, LLMCompletionEndEvent): print(f"LLM 请求: { event.prompt }") print(f"LLM 响应: {str(event.response.text)}") elif isinstance(event, LLMChatEndEvent): messages_str = "\n".join([str(x) for x in event.messages]) print(f"LLM 请求2: { messages_str }") print(f"LLM 响应2: {str(event.response.message)}") elif isinstance(event, EmbeddingEndEvent): print(f"Embedding {len(event.chunks)} chunks") root_dispatcher = get_dispatcher() root_dispatcher.add_event_handler(ModelEventHandler()) llm = OpenAI(model="gpt-4.1-mini") index = VectorStoreIndex(nodes=[TextNode(text="empty")]) query_engine = index.as_query_engine(llm=llm) response = query_engine.query("帮我介绍一下这本书：《人性的枷锁》") print(f"print 输出：{response}")

输出类似下面这样：

Embedding 1 chunks Embedding 1 chunks LLM 请求2: system: You are an expert Q&A system that is trusted around the world. Always answer the query using the provided context information, and not prior knowledge. Some rules to follow: 1. Never directly reference the given context in your answer. 2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines. user: Context information is below. --------------------- empty --------------------- Given the context information and not prior knowledge, answer the query. Query: 帮我介绍一下这本书：《人性的枷锁》 Answer: LLM 响应2: assistant: 抱歉，目前没有相关信息介绍《人性的枷锁》这本书。您可以提供更多细节或其他问题，我会尽力帮您解答。 print 输出：抱歉，目前没有相关信息介绍《人性的枷锁》这本书。您可以提供更多细节或其他问题，我会尽力帮您解答。

可以看到发送给LLM的原始文本，其中又看到了熟悉的 text_qa_template。另外注意到在 text_qa_template 模板之前，还有一些LlamaIndex底层提供的规则。当我们自定义提示词模版时，前面的提示语也将被清空，这里就不再演示了。

感谢阅读，希望这篇文章能给你带来帮助！