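The Problem

A common way to use a large language model is to send it a file and have it parse, extract from, or summarize the content. Official apps usually offer this feature, but many model APIs do not support it directly. So how can it be done through the API?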

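The model list in the official Alibaba Cloud documentation (Model List and Pricing, Alibaba Cloud Model Studio (百炼) Help Center) shows that the qwen-vl series of models can parse image-based documents such as scans or image-only PDFs. Under the hood, this amounts to OCR-style recognition of the image.

But what about a text-based PDF? Convert it to images first and then pass those to a qwen-vl model?

One option is to parse the PDF into plain text yourself and send that text to the model, for example with Tika: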

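    // Controller method: extract the text with Tika, then ask the model to summarize it.
    public String summarize(@RequestParam("file") MultipartFile file) {
        // TikaDocumentReader parses the uploaded file into one or more Documents
        List<Document> documents = new TikaDocumentReader(file.getResource()).get();
        String documentText = documents.stream()
                .map(Document::getFormattedContent)
                .collect(Collectors.joining("\n\n"));

        // The extracted text is injected into the system prompt via the "document" parameter
        return chatClient.prompt()
                .user(DEFAULT_SUMMARY_PROMPT)
                .system(systemSpec -> systemSpec.text(summarizeTemplate).param("document", documentText))
                .call()
                .content();
    }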

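This approach is a bit cumbersome, though, because you have to parse the file yourself. Is there a way to simply upload the file?

The answer is the qwen-long series of models.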

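Introduction to qwen-long

qwen-long is the model in the Tongyi Qianwen (Qwen) family with the longest context window, offering balanced capabilities at relatively low cost. It is suited to long-text analysis, information extraction, summarization, and classification or tagging. Reference: Qwen-Long, Alibaba Cloud Model Studio (百炼) Help Center.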

Usage

Upload the file first, then include the returned fileid in the conversation with the model.
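For reference, the second half of that flow can also be assembled by hand. The sketch below assumes DashScope's OpenAI-compatible chat endpoint and the `fileid://<id>` system-message convention as I understand them from the Qwen-Long documentation; both should be checked against the official docs. The class name, helper names, and the sample id are made up for illustration, and the upload call itself is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hand-rolled sketch of "reference the uploaded file id in the conversation". */
class QwenLongRequestSketch {

    // Assumed OpenAI-compatible chat endpoint of DashScope; verify against the official docs.
    static final String CHAT_URL =
            "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions";

    /** Minimal escaping (backslash, quote, newline), enough for this sketch only. */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    /** Builds the chat body: the uploaded file is referenced from a system message. */
    static String chatBody(String fileId, String question) {
        return "{\"model\":\"qwen-long\",\"messages\":["
                + "{\"role\":\"system\",\"content\":\"fileid://" + jsonEscape(fileId) + "\"},"
                + "{\"role\":\"user\",\"content\":\"" + jsonEscape(question) + "\"}]}";
    }

    /** Wraps the body in an HTTP POST; nothing is sent until a client executes it. */
    static HttpRequest chatRequest(String apiKey, String fileId, String question) {
        return HttpRequest.newBuilder(URI.create(CHAT_URL))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(chatBody(fileId, question)))
                .build();
    }

    public static void main(String[] args) {
        // "file-fe-example" is a placeholder for the id returned by the upload step.
        System.out.println(chatBody("file-fe-example", "Summarize the document"));
    }
}
```

The request could then be executed with `java.net.http.HttpClient`; in a real project a JSON library would be a safer choice than string concatenation.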

Calling it from code

For convenience, spring-ai-alibaba recently added DashScopeDocumentAnalysisAdvisor, which spares you from implementing the file upload and attaching the fileid yourself:

    @PostMapping(path = "/analyze", produces = "text/plain")
    public String analyze(@RequestParam("file") MultipartFile file) {
        ApiKey apiKey = new SimpleApiKey("your key");
        return chatClient.prompt()
                .advisors(new DashScopeDocumentAnalysisAdvisor(apiKey))
                // Hand the uploaded file to the advisor through the advisor context
                .advisors(a -> a.param(DashScopeDocumentAnalysisAdvisor.RESOURCE, file.getResource()))
                .user("Summarize the document") // or ask a question about the document
                .options(DashScopeChatOptions.builder().withModel("qwen-long").build())
                .call()
                .content();
    }

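If qwen-long is not the default model, specify it explicitly, as the options(...) call in the sample does.

To use the feature, create a DashScopeDocumentAnalysisAdvisor and pass in a Resource.

The Resource can be a local file (FileSystemResource), a URL (UrlResource), or an uploaded file as in the sample above.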

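P.S. This feature has not yet been published to Maven Central; you need to build and package it yourself, or wait for a release newer than 1.0.0.2.

P.S. Although this walkthrough started from text-based PDFs, the feature actually supports text files (TXT, DOCX, PDF, XLSX, EPUB, MOBI, MD, CSV, JSON) as well as image files (BMP, PNG, JPG/JPEG, GIF, and scanned PDFs), among others.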
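A cheap pre-check before uploading could look like the following. This is only a sketch: the extension set is copied from the list above (the authoritative list lives in the Qwen-Long documentation), and the class and method names are hypothetical, not part of spring-ai-alibaba.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: checks a file name against the formats listed above. */
class SupportedFormats {

    // Extensions copied from the list above; treat the official docs as the source of truth.
    static final Set<String> EXTENSIONS = Set.of(
            "txt", "docx", "pdf", "xlsx", "epub", "mobi", "md", "csv", "json",
            "bmp", "png", "jpg", "jpeg", "gif");

    /** Returns true if the file name ends in one of the listed extensions (case-insensitive). */
    static boolean isSupported(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("report.PDF"));  // true
        System.out.println(isSupported("archive.zip")); // false
    }
}
```

In the controller above, the value of `file.getOriginalFilename()` could be passed to such a check before the advisor is invoked.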

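How the code works

The internal logic of DashScopeDocumentAnalysisAdvisor is fairly simple.

When the resource parameter is not null, the advisor fetches the Resource, uploads it, and stores the resulting file information in the context.

It then takes the conversation's SystemMessage and adds the file information to it, so that the model can recognize the uploaded document: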

    public ChatClientRequest before(ChatClientRequest chatClientRequest, AdvisorChain advisorChain) {
        var context = chatClientRequest.context();
        Resource resource = (Resource) context.get(RESOURCE);
        if (resource != null) {
            // Upload the file and keep the response in the context
            ResponseEntity<UploadResponse> uploadResponse = upload(resource);
            context.put(UPLOAD_RESPONSE, uploadResponse);
            Assert.notNull(uploadResponse.getBody(), "upload response body is null");

            // Merge the file id into the original system message
            String augmentSystemMessage = DEFAULT_PROMPT_TEMPLATE.render(Map.of(
                    "id", uploadResponse.getBody().id,
                    "originSystemMessage", chatClientRequest.prompt().getSystemMessage().getText()));
            return chatClientRequest.mutate()
                    .prompt(chatClientRequest.prompt().augmentSystemMessage(augmentSystemMessage))
                    .build();
        }
        // No resource supplied: pass the request through unchanged
        return chatClientRequest;
    }
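To see the same data flow without any Spring types, here is a framework-free sketch. Everything in it is illustrative: the template string is a guess (the real DEFAULT_PROMPT_TEMPLATE may be worded differently), the upload is replaced by a function argument, and the context is a plain map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Framework-free sketch of the advisor's before() step; all names are illustrative. */
class BeforeHookSketch {

    static final String RESOURCE = "resource";
    static final String UPLOAD_RESPONSE = "upload_response";

    // Hypothetical template; the real DEFAULT_PROMPT_TEMPLATE may differ.
    static final String TEMPLATE = "fileid://{id}\n{originSystemMessage}";

    /**
     * Returns the system message to send. If the context holds a resource, it is
     * "uploaded" through the given function and the resulting id is merged in;
     * otherwise the original system message is returned unchanged.
     */
    static String before(Map<String, Object> context, String systemMessage,
                         Function<Object, String> upload) {
        Object resource = context.get(RESOURCE);
        if (resource == null) {
            return systemMessage;
        }
        String fileId = upload.apply(resource);
        context.put(UPLOAD_RESPONSE, fileId); // kept for later steps in the chain
        return TEMPLATE
                .replace("{id}", fileId)
                .replace("{originSystemMessage}", systemMessage);
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        context.put(RESOURCE, "report.pdf");
        // The lambda stands in for the real upload call and returns a fake id.
        System.out.println(before(context, "You are a helpful assistant.", r -> "file-fe-example"));
    }
}
```

The point of the pattern is that the caller only supplies a resource in the context; the upload and the system-message rewrite happen transparently before the request reaches the model.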