Model Intelligence Sheet

richarderkhov/xiangxinai_-_xiangxin-2xl-chat-1048k-chinese-llama3-70b-gguf overview

Xiangxin-2XL-Chat-1048k is a chat model developed by Xiangxin AI, built on Meta's Llama-3-70B-Instruct model and Gradient AI's extended-context work. It was trained with ORPO on a proprietary Chinese value-alignment dataset, which gives it stronger Chinese proficiency and closer alignment with Chinese values. Its context length reaches about 1 million words. Across eight evaluations (ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and CEVAL-VALID) it averages 70.22, ahead of Gradientai-Llama-3-70B-Instruct-Gradient-1048k at 69.66. The authors state that the training data does not include any evaluation datasets.

Model	Context Length	Pre-trained Tokens
Xiangxin-2XL-Chat-1048k	1048k	15T

Benchmark Evaluation

Model	Average	ARC	HellaSwag	MMLU	TruthfulQA	Winogrande	GSM8K	CMMLU	CEVAL
Xiangxin-2XL-Chat-1048k	70.22	60.92	83.29	75.13	57.33	76.64	81.05	65.40	62.03
Llama-3-70B-Instruct-Gradient-1048k	69.66	61.18	82.88	74.95	55.28	75.77	77.79	66.44	63.00

Note: TruthfulQA is scored with truthfulqa_mc2 and GSM8K with gsm8k flexible-extract.

Training

The model was trained using ORPO and a proprietary Chinese alignment dataset developed in-house. Due to the sensitivity of the content, the dataset cannot be publicly disclosed.
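The reported averages can be checked against the eight per-benchmark scores. The snippet below is a minimal sketch that assumes the average is an unweighted mean rounded to two decimals; the card does not state the averaging method, and the names `SCORES` and `averages` are illustrative only.

```python
from statistics import mean

# Per-benchmark scores copied from the benchmark table, in the order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K, CMMLU, CEVAL.
SCORES = {
    "Xiangxin-2XL-Chat-1048k": [60.92, 83.29, 75.13, 57.33, 76.64, 81.05, 65.40, 62.03],
    "Llama-3-70B-Instruct-Gradient-1048k": [61.18, 82.88, 74.95, 55.28, 75.77, 77.79, 66.44, 63.00],
}

# Assumption: "Average" is the plain arithmetic mean of the eight scores.
averages = {name: round(mean(values), 2) for name, values in SCORES.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Under that assumption the computed values agree with the Average column (70.22 and 69.66).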

gguf, endpoints_compatible, region:us, conversational

Downloads

101

Likes

0

Pipeline

—

Library

—

Visibility

Public

Access

Open

Repository Files & Downloads

31 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_M.gguf	GGUF	IQ3_M	29.74 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_S.gguf	GGUF	IQ3_S	24.21 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_XS.gguf	GGUF	IQ3_XS	23.93 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ4_XS.gguf	GGUF	IQ4_XS	35.64 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q2_K.gguf	GGUF	Q2_K	24.56 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K.gguf	GGUF	Q3_K	31.91 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_L.gguf	GGUF	Q3_K_L	34.59 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_M.gguf	GGUF	Q3_K_M	31.91 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_S.gguf	GGUF	Q3_K_S	28.79 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_0.gguf	GGUF	Q4_0	37.22 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_IQ4_NL-00001-of-00002.gguf	GGUF	IQ4_NL	36.77 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_IQ4_NL-00002-of-00002.gguf	GGUF	IQ4_NL	821.95 MB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_1-00001-of-00002.gguf	GGUF	Q4_1	37.25 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_1-00002-of-00002.gguf	GGUF	Q4_1	4.02 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K-00001-of-00002.gguf	GGUF	Q4_K	37.24 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K-00002-of-00002.gguf	GGUF	Q4_K	2.36 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K_M-00001-of-00002.gguf	GGUF	Q4_K_M	37.24 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K_M-00002-of-00002.gguf	GGUF	Q4_K_M	2.36 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K_S-00001-of-00002.gguf	GGUF	Q4_K_S	36.77 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q4_K_S-00002-of-00002.gguf	GGUF	Q4_K_S	821.95 MB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_0-00001-of-00002.gguf	GGUF	Q5_0	37.14 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_0-00002-of-00002.gguf	GGUF	Q5_0	8.17 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_1-00001-of-00002.gguf	GGUF	Q5_1	20.52 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_K-00001-of-00002.gguf	GGUF	Q5_K	37.14 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_K-00002-of-00002.gguf	GGUF	Q5_K	9.38 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_K_M-00001-of-00002.gguf	GGUF	Q5_K_M	26.15 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_K_S-00001-of-00002.gguf	GGUF	Q5_K_S	37.14 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q5_K_S-00002-of-00002.gguf	GGUF	Q5_K_S	8.17 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q6_K-00001-of-00002.gguf	GGUF	Q6_K	37.13 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q6_K-00002-of-00002.gguf	GGUF	Q6_K	11.18 GB	Download
Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B_Q8_0-00001-of-00002.gguf	GGUF	Q8_0	18.23 GB	Download
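
Several quantizations in the listing above are split into `-0000N-of-0000M` parts, and for some only the first part appears. The following is a rough sketch of how the parts could be grouped, summed, and checked for gaps. It uses a subset of the rows, the sizes are taken from the listing as printed, and `SPLIT` and `summarize` are hypothetical helper names, not part of any tool referenced on this page.

```python
import re
from collections import defaultdict

PREFIX = "Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B"

# (file name, size in GB) for a subset of the split files in the listing.
FILES = [
    (f"{PREFIX}_Q4_K_M-00001-of-00002.gguf", 37.24),
    (f"{PREFIX}_Q4_K_M-00002-of-00002.gguf", 2.36),
    (f"{PREFIX}_Q6_K-00001-of-00002.gguf", 37.13),
    (f"{PREFIX}_Q6_K-00002-of-00002.gguf", 11.18),
    (f"{PREFIX}_Q5_1-00001-of-00002.gguf", 20.52),
    (f"{PREFIX}_Q5_K_M-00001-of-00002.gguf", 26.15),
    (f"{PREFIX}_Q8_0-00001-of-00002.gguf", 18.23),
]

# Matches names such as "<base>-00001-of-00002.gguf".
SPLIT = re.compile(r"^(?P<base>.+)-(?P<part>\d{5})-of-(?P<total>\d{5})\.gguf$")


def summarize(files):
    """Return {base: (listed total in GB, missing part numbers)}."""
    parts = defaultdict(dict)
    expected = {}
    for name, size_gb in files:
        match = SPLIT.match(name)
        if match is None:
            continue  # single-file quantization, nothing to group
        base = match["base"]
        parts[base][int(match["part"])] = size_gb
        expected[base] = int(match["total"])
    report = {}
    for base, found in parts.items():
        missing = [i for i in range(1, expected[base] + 1) if i not in found]
        report[base] = (round(sum(found.values()), 2), missing)
    return report


report = summarize(FILES)
for base, (total_gb, missing) in report.items():
    status = "complete" if not missing else f"missing parts {missing}"
    print(f"{base}: {total_gb} GB listed, {status}")
```

On these rows the sketch reports Q4_K_M and Q6_K as complete and flags part 2 as absent from the listing for Q5_1, Q5_K_M, and Q8_0, so those quantizations may not be usable from the listed files alone.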

Model Details Live

Model Slug

richarderkhov/xiangxinai_-_xiangxin-2xl-chat-1048k-chinese-llama3-70b-gguf

Author

RichardErkhov

Pipeline Task

—

Library

—

Created

2024-08-02

Last Modified

2024-08-03

Gated

No

Private

No

HF SHA

43a3da5266d6e811772ad0e7eef85cb6400b2e9c

License

Unknown

Language

Unknown

Base Model

Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://github.com/xiangxinai/XiangxinLM/blob/main/assets/logo.png?raw=true",
    "summary": "Xiangxin-2XL-Chat-1048k是象信AI基于Meta Llama-3-70B-Instruct模型和Gradient AI的扩充上下文的工作，利用自行研发的中文价值观对齐数据集进行ORPO训练而形成的Chat模型。该模型具备更强的中文能力和中文价值观，其上下文长度达到100万字。在模型性能方面，该模型在ARC、HellaSwag、MMLU、TruthfulQA_mc2、Winogrande、GSM8K_flex、CMMLU、CEVAL-VALID等八项测评中，取得了平均分70.22分的成绩，超过了Gradientai-Llama-3-70B-Instruct-Gradient-1048k。我们的训练数据并不包含任何测评数据集。 Xiangxin-2XL-Chat-1048k is a Chat model developed by Xiangxin AI, based on the Meta Llama-3-70B-Instruct model and expanded context from Gradient AI. It was trained using a proprietary Chinese value-aligned dataset through ORPO training, resulting in enhanced Chinese proficiency and alignment with Chinese values. The model has a context length of up to 1 million words. In terms of performance, it surpassed the Gradientai-Llama-3-70B-Instruct-Gradient-1048k model with an average score of 70.22 across eight evaluations including ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and C-EVAL. It's worth noting that our training data did not include any evaluation datasets.  Model | Context Length | Pre-trained Tokens | :------------: | :------------: | :------------: | | Xiangxin-2XL-Chat-1048k | 1048k | 15T  # Benchmark 结果/Benchmark Evaluation |                         | **Average** | **ARC** | **HellaSwag** | **MMLU** | **TruthfulQA** | **Winogrande** | **GSM8K** | **CMMLU** | **CEVAL** | |:-----------------------:|:----------:|:--------:|:---------:|:----------:|:-----------:|:-------:|:-------:|:-------:|:-------:| |**Xiangxin-2XL-Chat-1048k**| 70.22     |\t60.92\t|  83.29\t|75.13|\t57.33|\t76.64|\t81.05|\t65.40|\t62.03   | |**Llama-3-70B-Instruct-Gradient-1048k**| 69.66|\t61.18\t|82.88\t|74.95\t|55.28\t|75.77\t|77.79\t|66.44\t|63.00| Note：truthfulqa_mc2, gsm8k flexible-extract # 训练过程模型/Training 该模型是使用ORPO技术和自行研发的中文价值观对齐数据集进行训练的。由于内容的敏感性，该数据集无法公开披露。 The model was trained using ORPO and a proprietary Chinese alignment dataset developed in-house. 
Due to the sensitivity of the content, the dataset cannot be publicly disclosed.",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nXiangxin-2XL-Chat-1048k-Chinese-Llama3-70B - GGUF\n- Model creator: https://huggingface.co/xiangxinai/\n- Original model: https://huggingface.co/xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q2_K.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q2_K.gguf) | Q2_K | 24.56GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_XS.gguf) | IQ3_XS | 23.93GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_S.gguf) | IQ3_S | 24.21GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_S.gguf) | Q3_K_S | 28.79GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ3_M.gguf) | IQ3_M | 29.74GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K.gguf) | Q3_K | 31.91GB |\n| 
[Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_M.gguf) | Q3_K_M | 31.91GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q3_K_L.gguf) | Q3_K_L | 34.59GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ4_XS.gguf) | IQ4_XS | 35.64GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_0.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/blob/main/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_0.gguf) | Q4_0 | 37.22GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | IQ4_NL | 37.58GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q4_K_S | 37.58GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_K.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q4_K | 39.6GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q4_K_M | 39.6GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q4_1.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q4_1 | 41.27GB |\n| 
[Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q5_0.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q5_0 | 45.32GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q5_K_S | 45.32GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q5_K.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q5_K | 46.52GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q5_K_M | 46.52GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q5_1.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q5_1 | 49.36GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q6_K.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q6_K | 53.91GB |\n| [Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B.Q8_0.gguf](https://huggingface.co/RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf/tree/main/) | Q8_0 | 69.83GB |\n\n\n\n\nOriginal model description:\n---\nlicense: llama3\nlanguage:\n- zh\n- en\npipeline_tag: text-generation\n---\n<div align=\"center\">\n\n<picture> \n  <img src=\"https://github.com/xiangxinai/XiangxinLM/blob/main/assets/logo.png?raw=true\" width=\"150px\">\n</picture>\n\n</div>\n<div align=\"center\">\n<h1>\n  Xiangxin-2XL-Chat-1048k\n</h1>\n</div>\n\n我们提供私有化模型训练服务，如果您需要训练行业模型、领域模型或者私有模型，请联系我们: customer@xiangxinai.cn\n\nWe offer customized model training services. 
If you need to train industry-specific models, domain-specific models, or private models, please contact us at: customer@xiangxinai.cn.\n\n\n# <span id=\"Introduction\">模型介绍/Introduction</span>\n\nXiangxin-2XL-Chat-1048k是[象信AI](https://www.xiangxinai.cn)基于Meta Llama-3-70B-Instruct模型和[Gradient AI的扩充上下文的工作](https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-1048k)，利用自行研发的中文价值观对齐数据集进行ORPO训练而形成的Chat模型。该模型具备更强的中文能力和中文价值观，其上下文长度达到100万字。在模型性能方面，该模型在ARC、HellaSwag、MMLU、TruthfulQA_mc2、Winogrande、GSM8K_flex、CMMLU、CEVAL-VALID等八项测评中，取得了平均分70.22分的成绩，超过了Gradientai-Llama-3-70B-Instruct-Gradient-1048k。我们的训练数据并不包含任何测评数据集。\n\nXiangxin-2XL-Chat-1048k is a Chat model developed by [Xiangxin AI](https://www.xiangxinai.cn), based on the Meta Llama-3-70B-Instruct model and [expanded context from Gradient AI](https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-1048k). It was trained using a proprietary Chinese value-aligned dataset through ORPO training, resulting in enhanced Chinese proficiency and alignment with Chinese values. The model has a context length of up to 1 million words. In terms of performance, it surpassed the Gradientai-Llama-3-70B-Instruct-Gradient-1048k model with an average score of 70.22 across eight evaluations including ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and C-EVAL. 
It's worth noting that our training data did not include any evaluation datasets.\n<div align=\"center\">\n  \nModel | Context Length | Pre-trained Tokens\n| :------------: | :------------: | :------------: |\n| Xiangxin-2XL-Chat-1048k | 1048k | 15T\n\n</div>\n\n\n# <span id=\"Benchmark\">Benchmark 结果/Benchmark Evaluation</span>\n\n|                         | **Average** | **ARC** | **HellaSwag** | **MMLU** | **TruthfulQA** | **Winogrande** | **GSM8K** | **CMMLU** | **CEVAL** |\n|:-----------------------:|:----------:|:--------:|:---------:|:----------:|:-----------:|:-------:|:-------:|:-------:|:-------:|\n|**Xiangxin-2XL-Chat-1048k**| 70.22     |\t60.92\t|  83.29\t|75.13|\t57.33|\t76.64|\t81.05|\t65.40|\t62.03   |\n|**Llama-3-70B-Instruct-Gradient-1048k**| 69.66|\t61.18\t|82.88\t|74.95\t|55.28\t|75.77\t|77.79\t|66.44\t|63.00|\n\nNote：truthfulqa_mc2, gsm8k flexible-extract\n\n\n# <span id=\"Training\">训练过程模型/Training</span>\n\n该模型是使用ORPO技术和自行研发的中文价值观对齐数据集进行训练的。由于内容的敏感性，该数据集无法公开披露。\n\nThe model was trained using ORPO and a proprietary Chinese alignment dataset developed in-house. Due to the sensitivity of the content, the dataset cannot be publicly disclosed.\n\n## Training loss\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/oLLnrWaxQnyVwI8n2QqHK.png)\n\n## Reward accuracies\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/yD4My-43lLRWecyq-bgZ2.png)\n\n## SFT loss\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/655b15957f2466433998bb89/iUoQfVZDftoW7C-2VXeWe.png)\n\n\n# <span id=\"Start\">快速开始/Quick Start</span>\n\n## Use with transformers\n\nYou can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. 
Let's see examples of both.\n\n使用Transformers运行本模型推理需要约400GB的显存。\n\nRunning inference with this model using Transformers requires approximately 400GB of GPU memory.\n\n\n### Transformers pipeline\n\n```python\nimport transformers\nimport torch\n\nmodel_id = \"xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B\"\n\npipeline = transformers.pipeline(\n    \"text-generation\",\n    model=model_id,\n    model_kwargs={\"torch_dtype\": torch.bfloat16},\n    device_map=\"auto\",\n)\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"\"},\n    {\"role\": \"user\", \"content\": \"解释一下“温故而知新”\"},\n]\n\nprompt = pipeline.tokenizer.apply_chat_template(\n\t\tmessages, \n\t\ttokenize=False, \n\t\tadd_generation_prompt=True\n)\n\nterminators = [\n    pipeline.tokenizer.eos_token_id,\n    pipeline.tokenizer.convert_tokens_to_ids(\"<|eot_id|>\")\n]\n\noutputs = pipeline(\n    prompt,\n    max_new_tokens=256,\n    eos_token_id=terminators,\n    do_sample=True,\n    temperature=0.6,\n    top_p=0.9,\n)\nprint(outputs[0][\"generated_text\"][len(prompt):])\n\n“温故而知新”是中国古代的一句成语，出自《论语·子路篇》。\n它的意思是通过温习过去的知识和经验，来获得新的理解和见解。\n这里的“温故”是指温习过去，回顾历史，复习旧知识，\n而“知新”则是指了解新鲜事物，掌握新知识。\n这个成语强调学习的循序渐进性，强调在学习新知识时，\n不能忽视过去的基础，而是要在继承和发扬的基础上，去理解和创新。\n```\n\n### Transformers AutoModelForCausalLM\n\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\n\nmodel_id = \"xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B\"\n\ntokenizer = AutoTokenizer.from_pretrained(model_id)\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_id,\n    torch_dtype=torch.bfloat16,\n    device_map=\"auto\",\n)\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"\"},\n    {\"role\": \"user\", \"content\": \"解释一下“温故而知新”\"},\n]\n\ninput_ids = tokenizer.apply_chat_template(\n    messages,\n    add_generation_prompt=True,\n    return_tensors=\"pt\"\n).to(model.device)\n\nterminators = [\n    tokenizer.eos_token_id,\n    tokenizer.convert_tokens_to_ids(\"<|eot_id|>\")\n]\n\noutputs 
= model.generate(\n    input_ids,\n    max_new_tokens=256,\n    eos_token_id=terminators,\n    do_sample=True,\n    temperature=0.6,\n    top_p=0.9,\n)\nresponse = outputs[0][input_ids.shape[-1]:]\nprint(tokenizer.decode(response, skip_special_tokens=True))\n\n“温故而知新”是中国古代的一句成语，出自《论语·子路篇》。\n它的意思是通过温习过去的知识和经验，来获得新的理解和见解。\n这里的“温故”是指温习过去，回顾历史，复习旧知识，\n而“知新”则是指了解新鲜事物，掌握新知识。\n这个成语强调学习的循序渐进性，强调在学习新知识时，\n不能忽视过去的基础，而是要在继承和发扬的基础上，去理解和创新。\n```\n\n# 协议/License\nThis code is licensed under the META LLAMA 3 COMMUNITY LICENSE AGREEMENT License.\n\n# 联系我们/Contact Us\nFor inquiries, please contact us via email at customer@xiangxinai.cn.\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 101,
  "gated": false,
  "private": false,
  "last_modified": "2024-08-03T05:04:53.000Z",
  "created_at": "2024-08-02T01:43:58.000Z",
  "pipeline_tag": "",
  "library_name": ""
}

Source payload excerpt (from Hugging Face API)

{
  "_id": "66ac39de308c1f8ae908d137",
  "id": "RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf",
  "modelId": "RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf",
  "sha": "43a3da5266d6e811772ad0e7eef85cb6400b2e9c",
  "createdAt": "2024-08-02T01:43:58.000Z",
  "lastModified": "2024-08-03T05:04:53.000Z",
  "author": "RichardErkhov",
  "downloads": 101,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 33
}
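
The Model Details fields earlier on this page appear to be derivable from this payload. The snippet below is a sketch under that assumption: the mapping (lowercased `id` as the slug, the date portion of the timestamps, `private` and `gated` mapped to the Visibility and Access labels) is inferred from the values shown on this page, and `details` and `to_date` are hypothetical names.

```python
import json
from datetime import datetime

# Subset of the payload excerpt above.
PAYLOAD = """
{
  "id": "RichardErkhov/xiangxinai_-_Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B-gguf",
  "createdAt": "2024-08-02T01:43:58.000Z",
  "lastModified": "2024-08-03T05:04:53.000Z",
  "author": "RichardErkhov",
  "gated": false,
  "private": false
}
"""


def to_date(stamp):
    """Convert a timestamp like 2024-08-02T01:43:58.000Z to YYYY-MM-DD."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").date().isoformat()


def details(payload):
    """Assumed mapping from payload fields to the sheet's display fields."""
    return {
        "slug": payload["id"].lower(),
        "author": payload["author"],
        "created": to_date(payload["createdAt"]),
        "last_modified": to_date(payload["lastModified"]),
        "visibility": "Private" if payload["private"] else "Public",
        "access": "Gated" if payload["gated"] else "Open",
    }


sheet = details(json.loads(PAYLOAD))
for key, value in sheet.items():
    print(f"{key}: {value}")
```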