Building a Custom Model Package for Ollama


Packaging a Qwen model into a format Ollama can run.

I. Model Conversion

  • 1. Download Ollama and the llama.cpp submodule

    git clone https://github.com/ollama/ollama.git
    cd ollama
    git submodule init
    git submodule update llm/llama.cpp
    

  • 2. Install dependencies

    python3 -m venv llm/llama.cpp/.venv
    source llm/llama.cpp/.venv/bin/activate
    pip install -r llm/llama.cpp/requirements.txt
    

  • 3. Build the quantization tool

    make -C llm/llama.cpp quantize
    

  • 4. Download the model

    export HF_HUB_ENABLE_HF_TRANSFER=1
    huggingface-cli download --token hf_xxx \
        --resume-download \
        --local-dir-use-symlinks False \
        --local-dir Qwen/Qwen1.5-7B-Chat \
        Qwen/Qwen1.5-7B-Chat
    

  • 5. Convert the model
    Use convert-hf-to-gguf.py here in place of convert.py.

    python llm/llama.cpp/convert-hf-to-gguf.py  ./Qwen/Qwen1.5-7B-Chat --outtype f16 --outfile converted.bin
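The `--outtype f16` flag writes the weights as 16-bit floats, halving storage relative to 32-bit at a small cost in precision. A minimal Python sketch of that trade-off (illustrative only; it says nothing about the GGUF file layout itself):

```python
import struct

value = 0.123456789
as_f32 = struct.pack("f", value)  # IEEE 754 single precision: 4 bytes
as_f16 = struct.pack("e", value)  # IEEE 754 half precision: 2 bytes

roundtrip = struct.unpack("e", as_f16)[0]
print(len(as_f32), len(as_f16))   # 4 2
print(abs(roundtrip - value))     # small rounding error from the dropped bits
```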
    

  • 6. Quantize the model

    llm/llama.cpp/quantize converted.bin quantized.bin q4_0
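q4_0 stores weights in small blocks, each encoded as 4-bit integers plus one shared scale factor. The real llama.cpp format differs in detail (packed nibbles, specific scale convention), but the core idea can be sketched as:

```python
import random

def quantize_block(xs):
    """Map a block of floats to 4-bit ints in [-8, 7] plus one float scale."""
    scale = max(abs(x) for x in xs) / 7 or 1.0
    return scale, [max(-8, min(7, round(x / scale))) for x in xs]

def dequantize_block(scale, q):
    return [scale * v for v in q]

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(32)]
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.4f} max_err={max_err:.4f}")  # error stays below scale/2
```

Each 32-weight block thus shrinks from 128 bytes (f32) to roughly 20 bytes, at the cost of a bounded rounding error per weight.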
    

II. Building the Ollama Package

  • Prepare the Modelfile

     FROM quantized.bin
     TEMPLATE """<|im_start|>system
     {{ .System }}<|im_end|>
     <|im_start|>user
     {{ .Prompt }}<|im_end|>
     <|im_start|>assistant
     """

    Note: Qwen1.5 chat models are trained on the ChatML prompt format, so the template should wrap turns in <|im_start|>/<|im_end|> markers rather than [INST] tags.
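Ollama expands the Modelfile TEMPLATE for each request. Qwen1.5 chat models use the ChatML format; a small Python sketch of how such a prompt is assembled (the system/prompt values are hypothetical):

```python
# Hypothetical render of a ChatML-style prompt, mirroring a Modelfile TEMPLATE.
CHATML = (
    "<|im_start|>system\n{system}<|im_end|>\n"
    "<|im_start|>user\n{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

rendered = CHATML.format(system="You are a helpful assistant.", prompt="Hello!")
print(rendered)  # the model generates its reply after the final assistant marker
```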
    

  • Build the package

      davisgao@mac ~/ ollama create davisgao/qwen1.5 -f Modelfile
      transferring model data
      creating model layer
      creating template layer
      using already created layer sha256:0d655da2f0b08a1210068e234792da4dfcb5cd2896dfd57a813f52ccc9d0ab95
      using already created layer sha256:68693db5eb3e0501c644080a545730fc93d2ca2dfddf03633642b99f3a1f0e3c
      using already created layer sha256:92a6f4b0a39deb0199816f5b3f25ad0db39e3d04b262a204ac76595e0e979654
      writing manifest
      success
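The `using already created layer sha256:…` lines show that Ollama stores layers content-addressed: each blob is named by its SHA-256 digest, so identical content is written once and reused across builds. A minimal illustration of that naming scheme:

```python
import hashlib

def layer_digest(data: bytes) -> str:
    """Name a blob by its SHA-256 digest, as a content-addressed store does."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

a = layer_digest(b"same template layer")
b = layer_digest(b"same template layer")
c = layer_digest(b"modified template layer")
print(a == b, a == c)  # True False: identical content maps to one reusable layer
```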
    

  • Push the package
    This requires a registered ollama.com account.

    ollama push davisgao/qwen1.5
    
