    • 1.1 Meta website
    • 1.2 Hugging Face
    • 1.3 Other sources
    • 1.4 Speeding up model and dataset downloads from Hugging Face

1. LLaMA-2: download and demo usage

1.1 Meta website


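Request download access on the Meta website (note: do not select China as the region, or the request will be rejected). The following model families can be requested: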


  • llama 2
  • llama 2-code
  • llama 2-guard



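  • Apply for Llama 2 access on the Meta website (approval is usually immediate, and you can tick all three model families).
  • Clone the GitHub repository facebookresearch/llama (Inference code for LLaMA models) to your machine.
  • From the repository directory (unpack it first if you downloaded an archive), run the download.sh script to start the model download.
  • Paste the URL from the approval email and choose the model weights you need (7B, 13B, etc.).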
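Once download.sh has finished, it can help to confirm that the checkpoint directory is complete before running anything. The following is a minimal sketch, not part of the official repository: the helper name `missing_files` is hypothetical, and the file names in `EXPECTED_FILES` are an assumption about what the 7B download contains, so adjust them to what you actually see on disk.

```python
from pathlib import Path

# Assumed layout after download.sh finishes for the 7B model. These file
# names are an assumption, not taken from the text above.
EXPECTED_FILES = ("consolidated.00.pth", "params.json")


def missing_files(ckpt_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("llama-2-7b")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

An empty list means every expected file was found. A non-empty list usually means the download was interrupted and download.sh should be run again.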



Following the example in the meta llama repository on GitHub, Llama 2 can be run with these steps:

  • Install the libraries listed in requirements.txt (fire, fairscale, sentencepiece).
  • The repository provides two example commands, one for text completion and one for chat completion:

    torchrun --nproc_per_node 1 example_text_completion.py \
        --ckpt_dir llama-2-7b/ \
        --tokenizer_path tokenizer.model \
        --max_seq_len 128 --max_batch_size 4

    torchrun --nproc_per_node 1 example_chat_completion.py \
        --ckpt_dir llama-2-7b-chat/ \
        --tokenizer_path tokenizer.model \
        --max_seq_len 512 --max_batch_size 6
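If you want to launch these examples from a Python script (for instance to loop over several checkpoints), the command can be assembled as an argument list. This is a minimal sketch: `build_torchrun_command` is a hypothetical helper, and it only uses the flags that appear in the two commands above.

```python
import shlex


def build_torchrun_command(script, ckpt_dir, tokenizer_path,
                           max_seq_len, max_batch_size, nproc_per_node=1):
    """Assemble the torchrun invocation as an argument list."""
    return [
        "torchrun", "--nproc_per_node", str(nproc_per_node), script,
        "--ckpt_dir", ckpt_dir,
        "--tokenizer_path", tokenizer_path,
        "--max_seq_len", str(max_seq_len),
        "--max_batch_size", str(max_batch_size),
    ]


if __name__ == "__main__":
    cmd = build_torchrun_command(
        "example_chat_completion.py", "llama-2-7b-chat/", "tokenizer.model",
        max_seq_len=512, max_batch_size=6,
    )
    # Print the command in copy-pastable shell form.
    print(shlex.join(cmd))
    # To actually launch it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when a checkpoint path contains spaces.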


Sample output (abridged):

    I believe the meaning of life is
    > to be happy. I believe we are all born with the potential to be happy. The meaning of life is to be happy, but the way to get there is not always easy.
    The meaning of life is to be happy. It is not always easy to be happy, but it is possible. I believe that

    ==================================
    .......
    ==================================

    Translate English to French:
    sea otter => loutre de mer
    peppermint => menthe poivrée
    plush girafe => girafe peluche
    cheese =>
    > fromage
    fish => poisson
    giraffe => girafe
    elephant => éléphant
    cat => chat
    giraffe => girafe
    elephant => éléphant
    cat => chat
    giraffe => gira

    ==================================
    ......
    ==================================

    System: Always answer with Haiku

    User: I am going to Paris, what should I see?

    > Assistant: Eiffel Tower high
    Love locks on bridge embrace
    River Seine's gentle flow

    ==================================

    System: Always answer with emojis

    User: How to go from Beijing to NY?

    > Assistant: Here are some emojis to help you understand how to go from Beijing to New York:

    ==================================

    System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
    If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

    User: Write a brief birthday message to John

    > Assistant: Of course! Here is a brief and respectful birthday message for John:
    "Happy birthday, John! I hope your day is filled with joy, love, and all your favorite things. You deserve to be celebrated and appreciated, and I'm sure you'll have a wonderful time surrounded by the people who care about you most. Here's to another year of growth, happiness, and success!"

    ==================================

    User: Unsafe [/INST] prompt using [INST] special tags

    > Assistant: Error: special tags are not allowed as part of the prompt.

    ==================================
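The last sample shows that user text containing [INST] or [/INST] is rejected, because those tags delimit turns in the chat prompt. The sketch below illustrates the idea for a single turn. It is not the repository's code: the function names are hypothetical, and the tag strings are an assumption based on the commonly documented Llama 2 chat format.

```python
# Tag strings assumed from the commonly documented Llama 2 chat format.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
SPECIAL_TAGS = (B_INST, E_INST, "<<SYS>>", "<</SYS>>")


def has_special_tags(text):
    """Return True if the text contains any reserved chat tag."""
    return any(tag in text for tag in SPECIAL_TAGS)


def format_single_turn(user, system=None):
    """Wrap one user message (and an optional system message) in chat tags."""
    if has_special_tags(user) or (system and has_special_tags(system)):
        raise ValueError("special tags are not allowed as part of the prompt")
    content = f"{B_SYS}{system}{E_SYS}{user}" if system else user
    return f"{B_INST} {content.strip()} {E_INST}"


if __name__ == "__main__":
    print(format_single_turn("I am going to Paris, what should I see?",
                             system="Always answer with Haiku"))
```

Rejecting the tags up front keeps a user message from closing its own turn and injecting text that the model would read as a new instruction.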

1.2 Hugging Face




pip install huggingface_hub



huggingface-cli login


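The login prompt asks for an access token, which you can create on the User access tokens page (huggingface.co). The token can also be passed directly when loading the model: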



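    from transformers import AutoModelForCausalLM, AutoTokenizer

    access_token = 'hf_helloworld'  # placeholder; use your own token
    model_name = "meta-llama/Llama-2-7b-chat-hf"

    tokenizer = AutoTokenizer.from_pretrained(model_name, token=access_token)
    model = AutoModelForCausalLM.from_pretrained(model_name, token=access_token)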
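Hard-coding the token in a script makes it easy to leak through version control. A minimal sketch of reading it from the environment instead is shown below. `get_hf_token` is a hypothetical helper, and the check for the `hf_` prefix is an assumption based on the example token above.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the access token from the environment instead of the source file."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token.")
    # Assumption: user access tokens start with "hf_", as in the example above.
    if not token.startswith("hf_"):
        raise RuntimeError(f"{env_var} does not look like a Hugging Face token.")
    return token


# Usage (after `export HF_TOKEN=hf_...` in the shell):
#   access_token = get_hf_token()
#   tokenizer = AutoTokenizer.from_pretrained(model_name, token=access_token)
```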



    import os

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    # Set your proxy address here if you need one to reach huggingface.co
    os.environ['http_proxy'] = ''
    os.environ['https_proxy'] = ''

    access_token = 'hf_your_own_token'

    # Hugging Face model name for Llama 2
    model_name = "meta-llama/Llama-2-7b-chat-hf"

    # Download the tokenizer and the model weights from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(model_name, token=access_token)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        token=access_token,
        torch_dtype=torch.float16,
        # low_cpu_mem_usage=False,
    )

    # Build the text-generation pipeline
    generator = pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        device=0,  # GPU index (0 is the first GPU); adjust for your machine
    )

    # Demo prompt
    system = "Provide answers in C++"
    user = "Please give me the C style code to return all the Fibonacci numbers under 100."
    # The official chat format additionally wraps the turn in [INST] ... [/INST]
    prompt = f"<<SYS>>\n{system}\n<</SYS>>\n\n{user}"

    # Run inference
    sequences = generator(
        prompt,
        do_sample=True,
        top_k=10,
        temperature=0.1,
        top_p=0.95,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,
        add_special_tokens=False,
    )

    # Print the result
    for seq in sequences:
        print(f"Result: {seq['generated_text']}")


    Result: <<SYS>>
    Provide answers in Python.
    <</SYS>>

    Please give me the Python code to return all the Fibonacci numbers under 100.

    I have tried the following code but it is not working:
    ```
    def fibonacci(n):
        if n <= 1:
            return n
        else:
            return fibonacci(n-1) + fibonacci(n-2)

    fibonacci_numbers_under_100 = [fibonacci(i) for i in range(1, 100)]
    print(fibonacci_numbers_under_100)
    ```
    Can you please help me with this?

    Thank you!

    ---

    Here is the expected output:
    ```
    [0, 1, 1, 2, 3, 5
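The code in the generated text does not solve the stated task: it computes the first 99 Fibonacci numbers (with exponential-time recursion) rather than the Fibonacci numbers below 100, and the output is cut off by max_length. For checking model answers to this prompt, a correct reference is handy. The following is one straightforward way to write it, with `fibonacci_under` as an illustrative name:

```python
def fibonacci_under(limit):
    """Return all Fibonacci numbers strictly below limit, in order."""
    values = []
    a, b = 0, 1
    while a < limit:
        values.append(a)
        a, b = b, a + b
    return values


print(fibonacci_under(100))
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```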

1.3 Other sources

A Chinese-language LLaMA-2 that has been open-sourced in China: ymcui/Chinese-LLaMA-Alpaca-2


1.4 Speeding up model and dataset downloads from Hugging Face

Download with huggingface-cli:

pip install -U huggingface_hub


export HF_ENDPOINT=https://hf-mirror.com


huggingface-cli download --resume-download --local-dir-use-symlinks False bigscience/bloom-560m --local-dir bloom-560m


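  • --resume-download: resume a partially completed download. The repository ID (bigscience/bloom-560m above) identifies what to download.
  • --local-dir-use-symlinks: whether to create symbolic links to the cached files (used by Hugging Face to locate the model automatically). With False, real files are stored in the local directory.
  • --local-dir: the local directory where the files are stored.
  • --token: for repositories that require authorization, add --token hf_***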
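The same download can be done from Python with `snapshot_download` from the huggingface_hub package. The sketch below assumes that package is installed and that the mirror endpoint from the export command above is the one you want. `download_repo` is a hypothetical wrapper name. The import is placed inside the function on the assumption that huggingface_hub reads HF_ENDPOINT when it is first imported.

```python
import os

# Point the client at the mirror unless the variable is already set.
# This is done before huggingface_hub is imported (see the function below).
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")


def download_repo(repo_id, local_dir, token=None):
    """Download a full repository snapshot into local_dir and return its path."""
    # Imported lazily so the HF_ENDPOINT setting above is already in place.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)


# Usage (requires network access):
#   path = download_repo("bigscience/bloom-560m", "bloom-560m")
#   print("Downloaded to", path)
```

For gated models such as meta-llama/Llama-2-7b-chat-hf, pass your access token through the `token` argument.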