No Environment Setup Required: Try Alibaba's Tongyi Qianwen-7B-Chat Locally with One Click

2023-08-04 15:13 | Author: IT教程精选

Introduction

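Tongyi Qianwen-7B (Qwen-7B) is the 7-billion-parameter model in the Tongyi Qianwen large language model series developed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model pretrained on a very large corpus. The pretraining data is diverse and broad in coverage, including large amounts of web text, professional books, and code. Building on Qwen-7B, Alibaba Cloud applied alignment techniques to create Qwen-7B-Chat, an AI assistant based on the large language model.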

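学术Fun has packaged the model above into a one-click launcher that runs with a double-click, so you can avoid the problems that often come up when configuring a Python environment. Download link: https://xueshu.fun/2809/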

How to Use the Bundle

Download the archive from https://xueshu.fun/2809/
After extracting it, double-click the bat file to run it, as shown in the figure below.

Open http://127.0.0.1:7860/ in your browser to start chatting.
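If the page does not load, it can help to confirm that the launcher is actually listening before troubleshooting the browser. The following is a minimal sketch, assuming the bundle serves plain HTTP at the address above. The helper name `is_server_up` and the 3-second timeout are illustrative choices and are not part of the bundle.

```python
import http.client
import urllib.error
import urllib.request


def is_server_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # Bypass any system proxy: the target is a server on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is running.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply.
        return False


if __name__ == "__main__":
    url = "http://127.0.0.1:7860/"
    state = "reachable" if is_server_up(url) else "not reachable"
    print(f"{url} is {state}")
```

If the check reports that the address is not reachable, the bat window is the first place to look for startup errors.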

Quantization

The one-click bundle uses BF16 precision by default and takes about 16 GB of GPU memory.

Precision    MMLU    Memory
BF16         56.7    16.2 GB
Int8         52.8    10.1 GB
NF4          48.9    7.4 GB
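The trade-off in the table can be made concrete with a little arithmetic. The sketch below uses only the three rows reported above and computes, relative to BF16, how much GPU memory each quantized precision saves and how many MMLU points it gives up. The variable and function names are illustrative.

```python
# (precision, MMLU score, GPU memory in GB), copied from the table above.
RESULTS = [
    ("BF16", 56.7, 16.2),
    ("Int8", 52.8, 10.1),
    ("NF4", 48.9, 7.4),
]


def compare_to_baseline(results):
    """Return (precision, memory saved in %, MMLU points lost) vs. the first row."""
    _, base_mmlu, base_mem = results[0]
    rows = []
    for name, mmlu, mem in results[1:]:
        saved_pct = round((base_mem - mem) / base_mem * 100, 1)
        mmlu_drop = round(base_mmlu - mmlu, 1)
        rows.append((name, saved_pct, mmlu_drop))
    return rows


for name, saved_pct, mmlu_drop in compare_to_baseline(RESULTS):
    print(f"{name}: {saved_pct}% less GPU memory, {mmlu_drop} MMLU points lower")
```

With the reported figures, Int8 uses about 37.7% less memory for a 3.9-point MMLU drop, and NF4 about 54.3% less for a 7.8-point drop.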

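To use a lower-precision quantized model, such as the 4-bit or 8-bit variants, modify the 'app.py' file in the folder along the lines of the code below.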

    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks
    from transformers import BitsAndBytesConfig
    import torch

    model_id = 'qwen/Qwen-7B-Chat'

    # Load the weights in 4-bit NF4 and run computation in bfloat16.
    # For Int8, use BitsAndBytesConfig(load_in_8bit=True) instead.
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type='nf4',
        bnb_4bit_compute_dtype=torch.bfloat16)

    pipe = pipeline(
        task=Tasks.chat,
        model=model_id,
        device_map='auto',
        quantization_config=quantization_config)

    # First turn: there is no conversation history yet.
    history = None
    text = '浙江的省会在哪里？'  # "What is the capital of Zhejiang?"
    results = pipe(text, history=history)
    response, history = results['response'], results['history']
    print(f'Response: {response}')

    # Second turn: pass the history back so the model can resolve "it".
    text = '它有什么好玩的地方呢？'  # "What fun places does it have?"
    results = pipe(text, history=history)
    response, history = results['response'], results['history']
    print(f'Response: {response}')

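This approach loads the model quantized to NF4 or Int8 precision, which reduces GPU memory usage. The table above gives the corresponding benchmark figures: quantization costs some accuracy (MMLU drops from 56.7 at BF16 to 52.8 at Int8 and 48.9 at NF4), but GPU memory usage falls substantially (from 16.2 GB to 10.1 GB and 7.4 GB).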
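One practical use of these figures is deciding which precision to load on a given GPU. The following is a hedged sketch: the memory numbers come from the table above, while the helper `pick_precision` and the 0.5 GB of headroom are assumptions made for illustration and are not measured values.

```python
from typing import Optional

# Reported GPU memory footprint in GB, highest precision first.
FOOTPRINT_GB = [("BF16", 16.2), ("Int8", 10.1), ("NF4", 7.4)]


def pick_precision(free_gb: float, headroom_gb: float = 0.5) -> Optional[str]:
    """Return the highest precision whose footprint plus headroom fits, else None."""
    for name, needed_gb in FOOTPRINT_GB:
        if needed_gb + headroom_gb <= free_gb:
            return name
    return None


for free_gb in (24, 12, 8, 6):
    print(f"{free_gb} GB free -> {pick_precision(free_gb)}")
```

On a CUDA machine, the free memory could be read with `torch.cuda.mem_get_info()`, which returns free and total bytes. Long conversations may need more headroom than assumed here.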
