richarderkhov/weyaxi_-_einstein-v6.1-llama3-8b-gguf overview
This model is a full fine-tune of meta-llama/Meta-Llama-3-8B on diverse datasets. It was fine-tuned on 8xRTX3090 + 1xRTXA6000 using axolotl (axolotl version: 0.4.0). Training was sponsored by sablo.ai.

# Prompt Template

You can use the ChatML prompt template with this model.

### ChatML

This prompt template is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method.

# Datasets used in this model

The datasets used to train this model are listed in the metadata section of the model card. Note that some of the datasets mentioned in the metadata may have been filtered on various criteria. The results of this filtering are in the data folder of the original repository: Weyaxi/Einstein-v6.1-Llama3-8B/data
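The ChatML layout mentioned in the overview can be sketched in plain Python. The `<|im_start|>` and `<|im_end|>` markers are taken from the original model card reproduced in the metadata further down; the helper name `format_chatml` and its `add_generation_prompt` flag are hypothetical, and the chat template bundled with the tokenizer may differ in details such as BOS handling, so treat this as an approximation rather than the reference implementation.

```python
from typing import Dict, List

# Markers copied from the ChatML template in the original model card.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout (illustrative helper)."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = format_chatml(messages)
print(prompt)
```

A string built this way can be passed as a raw prompt to a GGUF runtime that does not apply the chat template on its own.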
Repository Files & Downloads
| File | Type | Quantization | Size | Link |
|---|---|---|---|---|
| Einstein-v6.1-Llama3-8B.IQ3_M.gguf | GGUF | IQ3_M | 3.52 GB | Download |
| Einstein-v6.1-Llama3-8B.IQ3_S.gguf | GGUF | IQ3_S | 3.43 GB | Download |
| Einstein-v6.1-Llama3-8B.IQ3_XS.gguf | GGUF | IQ3_XS | 3.28 GB | Download |
| Einstein-v6.1-Llama3-8B.IQ4_NL.gguf | GGUF | IQ4_NL | 4.38 GB | Download |
| Einstein-v6.1-Llama3-8B.IQ4_XS.gguf | GGUF | IQ4_XS | 4.18 GB | Download |
| Einstein-v6.1-Llama3-8B.Q2_K.gguf | GGUF | Q2_K | 2.96 GB | Download |
| Einstein-v6.1-Llama3-8B.Q3_K.gguf | GGUF | Q3_K | 3.74 GB | Download |
| Einstein-v6.1-Llama3-8B.Q3_K_L.gguf | GGUF | Q3_K_L | 4.03 GB | Download |
| Einstein-v6.1-Llama3-8B.Q3_K_M.gguf | GGUF | Q3_K_M | 3.74 GB | Download |
| Einstein-v6.1-Llama3-8B.Q3_K_S.gguf | GGUF | Q3_K_S | 3.41 GB | Download |
| Einstein-v6.1-Llama3-8B.Q4_0.gguf | GGUF | Q4_0 | 4.34 GB | Download |
| Einstein-v6.1-Llama3-8B.Q4_1.gguf | GGUF | Q4_1 | 4.78 GB | Download |
| Einstein-v6.1-Llama3-8B.Q4_K.gguf | GGUF | Q4_K | 4.58 GB | Download |
| Einstein-v6.1-Llama3-8B.Q4_K_M.gguf | GGUF | Q4_K_M | 4.58 GB | Download |
| Einstein-v6.1-Llama3-8B.Q4_K_S.gguf | GGUF | Q4_K_S | 4.37 GB | Download |
| Einstein-v6.1-Llama3-8B.Q5_0.gguf | GGUF | Q5_0 | 5.21 GB | Download |
| Einstein-v6.1-Llama3-8B.Q5_1.gguf | GGUF | Q5_1 | 5.65 GB | Download |
| Einstein-v6.1-Llama3-8B.Q5_K.gguf | GGUF | Q5_K | 5.34 GB | Download |
| Einstein-v6.1-Llama3-8B.Q5_K_M.gguf | GGUF | Q5_K_M | 5.34 GB | Download |
| Einstein-v6.1-Llama3-8B.Q5_K_S.gguf | GGUF | Q5_K_S | 5.21 GB | Download |
| Einstein-v6.1-Llama3-8B.Q6_K.gguf | GGUF | Q6_K | 6.14 GB | Download |
| Einstein-v6.1-Llama3-8B.Q8_0.gguf | GGUF | Q8_0 | 7.95 GB | Download |
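As a rough aid for choosing among the files above, the following sketch picks the largest quantization that fits a given memory budget. The sizes are copied from a subset of the table and the URL pattern follows the links in the quantizer's README; the function names and the 1 GB headroom reserved for context and runtime overhead are assumptions for illustration, not measured requirements.

```python
from typing import Dict, Optional

REPO_ID = "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf"

# File sizes in GB, copied from a subset of the table above.
SIZES_GB: Dict[str, float] = {
    "Q2_K": 2.96,
    "IQ3_XS": 3.28,
    "Q3_K_M": 3.74,
    "Q4_K_M": 4.58,
    "Q5_K_M": 5.34,
    "Q6_K": 6.14,
    "Q8_0": 7.95,
}


def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest listed quantization whose file fits the budget.

    headroom_gb is an assumed allowance for context and runtime overhead.
    """
    fitting = {q: s for q, s in SIZES_GB.items() if s + headroom_gb <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def file_url(quant: str) -> str:
    """Build the repository page URL for a quantization, following the README links."""
    return f"https://huggingface.co/{REPO_ID}/blob/main/Einstein-v6.1-Llama3-8B.{quant}.gguf"


choice = pick_quant(8.0)
print(choice, file_url(choice) if choice else "nothing fits")
```

Larger files generally preserve more of the original model's quality, so picking the biggest file that fits is a common default; the right headroom depends on the context length and runtime used.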
Model Details
Metadata Inspector
Normalized metadata (stored in metadata_json)
{
"metadata": {},
"card_data": {
"frontmatter": {},
"hero_image_url": "https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png",
"summary": "This model is a full fine-tuned version of meta-llama/Meta-Llama-3-8B on diverse datasets. This model is finetuned using 8xRTX3090 + 1xRTXA6000 using axolotl. This model's training was sponsored by sablo.ai. See axolotl config axolotl version: 0.4.0 ``yaml base_model: meta-llama/Meta-Llama-3-8B model_type: LlamaForCausalLM tokenizer_type: AutoTokenizer load_in_8bit: false load_in_4bit: false strict: false chat_template: chatml datasets: ds_type: json type: alpaca conversation: chatml ds_type: json type: gpteacher conversation: chatml ds_type: json type: alpaca conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml dataset_prepared_path: last_run_prepared val_set_size: 0.002 output_dir: ./Einstein-v6.1-Llama3-8B-model sequence_len: 8192 sample_packing: true pad_to_sequence_len: true eval_sample_packing: false wandb_project: Einstein wandb_entity: wandb_watch: wandb_name: Einstein-v6.1-Llama3-2-epoch wandb_log_model: hub_model_id: Weyaxi/Einstein-v6.1-Llama3-8B save_safetensors: true gradient_accumulation_steps: 4 micro_batch_size: 1 num_epochs: 2 optimizer: adamw_bnb_8bit # look lr_scheduler: cosine learning_rate: 0.000005 # look train_on_inputs: false group_by_length: false bf16: true fp16: false tf32: false gradient_checkpointing: true early_stopping_patience: resume_from_checkpoint: local_rank: logging_steps: 1 xformers_attention: 
flash_attention: true warmup_steps: 10 evals_per_epoch: 2 eval_table_size: eval_table_max_new_tokens: 128 saves_per_epoch: 2 debug: deepspeed: zero3_bf16_cpuoffload_params.json weight_decay: 0.0 fsdp: fsdp_config: special_tokens: bos_token: \"<s>\" eos_token: \"<|im_end|>\" unk_token: \"<unk>\" pad_token: <|end_of_text|> # changed tokens: - \"<|im_start|>\" ` # Prompt Template You can use the ChatML prompt template with this model: ### ChatML ` <|im_start|>system {system}<|im_end|> <|im_start|>user {user}<|im_end|> <|im_start|>assistant {assistant}<|im_end|> ` This prompt template is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method: `python messages = [ {\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"}, {\"role\": \"user\", \"content\": \"Hello!\"} ] gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True) model.generate(**gen_input) `` # Datasets used in this model The datasets used to train this model are listed in the metadata section of the model card. Please note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria. The results of this filtering process and its outcomes are in the data folder of this repository: Weyaxi/Einstein-v6.1-Llama3-8B/data # Quantized versions",
"quick_links": [],
"benchmark_table_html": "",
"readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nEinstein-v6.1-Llama3-8B - GGUF\n- Model creator: https://huggingface.co/Weyaxi/\n- Original model: https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Einstein-v6.1-Llama3-8B.Q2_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q2_K.gguf) | Q2_K | 2.96GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_XS.gguf) | IQ3_XS | 3.28GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_S.gguf) | IQ3_S | 3.43GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_S.gguf) | Q3_K_S | 3.41GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_M.gguf) | IQ3_M | 3.52GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K.gguf) | Q3_K | 3.74GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_M.gguf) | Q3_K_M | 3.74GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_L.gguf) | Q3_K_L | 4.03GB |\n| 
[Einstein-v6.1-Llama3-8B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ4_XS.gguf) | IQ4_XS | 4.18GB |\n| [Einstein-v6.1-Llama3-8B.Q4_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_0.gguf) | Q4_0 | 4.34GB |\n| [Einstein-v6.1-Llama3-8B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ4_NL.gguf) | IQ4_NL | 4.38GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K_S.gguf) | Q4_K_S | 4.37GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K.gguf) | Q4_K | 4.58GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K_M.gguf) | Q4_K_M | 4.58GB |\n| [Einstein-v6.1-Llama3-8B.Q4_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_1.gguf) | Q4_1 | 4.78GB |\n| [Einstein-v6.1-Llama3-8B.Q5_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_0.gguf) | Q5_0 | 5.21GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K_S.gguf) | Q5_K_S | 5.21GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K.gguf) | Q5_K | 5.34GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K_M.gguf) | Q5_K_M | 5.34GB |\n| 
[Einstein-v6.1-Llama3-8B.Q5_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_1.gguf) | Q5_1 | 5.65GB |\n| [Einstein-v6.1-Llama3-8B.Q6_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q6_K.gguf) | Q6_K | 6.14GB |\n| [Einstein-v6.1-Llama3-8B.Q8_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q8_0.gguf) | Q8_0 | 7.95GB |\n\n\n\n\nOriginal model description:\n---\nlanguage:\n- en\nlicense: other\ntags:\n- axolotl\n- generated_from_trainer\n- instruct\n- finetune\n- chatml\n- gpt4\n- synthetic data\n- science\n- physics\n- chemistry\n- biology\n- math\n- llama\n- llama3\nbase_model: meta-llama/Meta-Llama-3-8B\ndatasets:\n- allenai/ai2_arc\n- camel-ai/physics\n- camel-ai/chemistry\n- camel-ai/biology\n- camel-ai/math\n- metaeval/reclor\n- openbookqa\n- mandyyyyii/scibench\n- derek-thomas/ScienceQA\n- TIGER-Lab/ScienceEval\n- jondurbin/airoboros-3.2\n- LDJnr/Capybara\n- Cot-Alpaca-GPT4-From-OpenHermes-2.5\n- STEM-AI-mtl/Electrical-engineering\n- knowrohit07/saraswati-stem\n- sablo/oasst2_curated\n- lmsys/lmsys-chat-1m\n- TIGER-Lab/MathInstruct\n- bigbio/med_qa\n- meta-math/MetaMathQA-40K\n- openbookqa\n- piqa\n- metaeval/reclor\n- derek-thomas/ScienceQA\n- scibench\n- sciq\n- Open-Orca/SlimOrca\n- migtissera/Synthia-v1.3\n- TIGER-Lab/ScienceEval\n- allenai/WildChat\n- microsoft/orca-math-word-problems-200k\n- openchat/openchat_sharegpt4_dataset\n- teknium/GPTeacher-General-Instruct\n- m-a-p/CodeFeedback-Filtered-Instruction\n- totally-not-an-llm/EverythingLM-data-V3\n- HuggingFaceH4/no_robots\n- OpenAssistant/oasst_top1_2023-08-25\n- WizardLM/WizardLM_evol_instruct_70k\nmodel-index:\n- name: Einstein-v6.1-Llama3-8B\n results:\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: AI2 Reasoning Challenge (25-Shot)\n type: ai2_arc\n config: 
ARC-Challenge\n split: test\n args:\n num_few_shot: 25\n metrics:\n - type: acc_norm\n value: 62.46\n name: normalized accuracy\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: HellaSwag (10-Shot)\n type: hellaswag\n split: validation\n args:\n num_few_shot: 10\n metrics:\n - type: acc_norm\n value: 82.41\n name: normalized accuracy\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: MMLU (5-Shot)\n type: cais/mmlu\n config: all\n split: test\n args:\n num_few_shot: 5\n metrics:\n - type: acc\n value: 66.19\n name: accuracy\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: TruthfulQA (0-shot)\n type: truthful_qa\n config: multiple_choice\n split: validation\n args:\n num_few_shot: 0\n metrics:\n - type: mc2\n value: 55.1\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: Winogrande (5-shot)\n type: winogrande\n config: winogrande_xl\n split: validation\n args:\n num_few_shot: 5\n metrics:\n - type: acc\n value: 79.32\n name: accuracy\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: GSM8k (5-shot)\n type: gsm8k\n config: main\n split: test\n args:\n num_few_shot: 5\n metrics:\n - type: acc\n value: 66.11\n 
name: accuracy\n source:\n url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: IFEval (0-Shot)\n type: HuggingFaceH4/ifeval\n args:\n num_few_shot: 0\n metrics:\n - type: inst_level_strict_acc and prompt_level_strict_acc\n value: 45.68\n name: strict accuracy\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: BBH (3-Shot)\n type: BBH\n args:\n num_few_shot: 3\n metrics:\n - type: acc_norm\n value: 29.38\n name: normalized accuracy\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: MATH Lvl 5 (4-Shot)\n type: hendrycks/competition_math\n args:\n num_few_shot: 4\n metrics:\n - type: exact_match\n value: 5.74\n name: exact match\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: GPQA (0-shot)\n type: Idavidrein/gpqa\n args:\n num_few_shot: 0\n metrics:\n - type: acc_norm\n value: 4.25\n name: acc_norm\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: MuSR (0-shot)\n type: TAUR-Lab/MuSR\n args:\n num_few_shot: 0\n metrics:\n - type: acc_norm\n value: 11.23\n name: acc_norm\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n 
name: Open LLM Leaderboard\n - task:\n type: text-generation\n name: Text Generation\n dataset:\n name: MMLU-PRO (5-shot)\n type: TIGER-Lab/MMLU-Pro\n config: main\n split: test\n args:\n num_few_shot: 5\n metrics:\n - type: acc\n value: 23.68\n name: accuracy\n source:\n url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n name: Open LLM Leaderboard\n---\n\n\n# Einstein-v6.1-Llama3-8B\n\nThis model is a full fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on diverse datasets.\n\nThis model is finetuned using `8xRTX3090` + `1xRTXA6000` using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).\n\nThis model's training was sponsored by [sablo.ai](https://sablo.ai). \n\n<details><summary>See axolotl config</summary>\n\naxolotl version: `0.4.0`\n```yaml\nbase_model: meta-llama/Meta-Llama-3-8B\nmodel_type: LlamaForCausalLM\ntokenizer_type: AutoTokenizer\n\nload_in_8bit: false\nload_in_4bit: false\nstrict: false\n\nchat_template: chatml\ndatasets:\n - path: data/merged_all.json\n ds_type: json\n type: alpaca\n conversation: chatml\n\n - path: data/gpteacher-instruct-special-alpaca.json\n ds_type: json\n type: gpteacher\n conversation: chatml\n\n - path: data/wizardlm_evol_instruct_70k_random_half.json\n ds_type: json\n type: alpaca\n conversation: chatml\n\n - path: data/capybara_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml\n\n - path: data/synthia-v1.3_sharegpt_12500.json\n ds_type: json\n type: sharegpt\n conversation: chatml \n\n - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml\n\n - path: data/slimorca_dedup_filtered_95k_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml \n\n - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml \n\n - path: 
data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json\n ds_type: json\n type: sharegpt\n strict: false\n conversation: chatml \n\n - path: data/pippa_bagel_repo_3k_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml \n\n - path: data/gpt4_data_lmys_1m_sharegpt.json\n ds_type: json\n type: sharegpt\n conversation: chatml \n\n - path: data/sharegpt_gpt4_english.json\n ds_type: json\n type: sharegpt\n conversation: chatml\n\n - path: data/no_robots_sharegpt.json\n ds_type: json\n type: sharegpt\n strict: false\n conversation: chatml\n\n - path: data/oasst_top1_from_fusechatmixture_sharegpt.json\n ds_type: json\n type: sharegpt\n strict: false\n conversation: chatml\n\n - path: data/everythinglm-data-v3_sharegpt.json\n ds_type: json\n type: sharegpt\n strict: false\n conversation: chatml\n\ndataset_prepared_path: last_run_prepared\nval_set_size: 0.002\n\noutput_dir: ./Einstein-v6.1-Llama3-8B-model\n\nsequence_len: 8192\nsample_packing: true\npad_to_sequence_len: true\neval_sample_packing: false\n\nwandb_project: Einstein\nwandb_entity:\nwandb_watch:\nwandb_name: Einstein-v6.1-Llama3-2-epoch\nwandb_log_model:\nhub_model_id: Weyaxi/Einstein-v6.1-Llama3-8B\n\nsave_safetensors: true\n\ngradient_accumulation_steps: 4\nmicro_batch_size: 1\nnum_epochs: 2\noptimizer: adamw_bnb_8bit # look\nlr_scheduler: cosine\nlearning_rate: 0.000005 # look\n\ntrain_on_inputs: false\ngroup_by_length: false\nbf16: true\nfp16: false\ntf32: false\n\ngradient_checkpointing: true\nearly_stopping_patience:\nresume_from_checkpoint:\nlocal_rank:\nlogging_steps: 1\nxformers_attention:\nflash_attention: true\n\nwarmup_steps: 10\nevals_per_epoch: 2\neval_table_size:\neval_table_max_new_tokens: 128\nsaves_per_epoch: 2\ndebug:\n\ndeepspeed: zero3_bf16_cpuoffload_params.json\nweight_decay: 0.0\nfsdp:\nfsdp_config:\nspecial_tokens:\n bos_token: \"<s>\"\n eos_token: \"<|im_end|>\"\n unk_token: \"<unk>\"\n pad_token: <|end_of_text|> # changed\ntokens:\n - 
\"<|im_start|>\"\n```\n</details><br>\n\n# Prompt Template\n\nYou can use the ChatML prompt template with this model:\n\n### ChatML\n\n```\n<|im_start|>system\n{system}<|im_end|>\n<|im_start|>user\n{user}<|im_end|>\n<|im_start|>assistant\n{assistant}<|im_end|>\n```\n\nThis prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the\n`tokenizer.apply_chat_template()` method:\n\n```python\nmessages = [\n {\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"},\n {\"role\": \"user\", \"content\": \"Hello!\"}\n]\n# return_dict=True yields input_ids and attention_mask, which generate() accepts as keyword arguments\ngen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True)\nmodel.generate(**gen_input)\n```\n\n# Datasets used in this model\n\nThe datasets used to train this model are listed in the metadata section of the model card.\n\nPlease note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria.\n\nThe results of this filtering process and its outcomes are in the data folder of this repository:\n\n[Weyaxi/Einstein-v6.1-Llama3-8B/data](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B/tree/main/data)\n\n# Quantized versions\n\n## GGUF [@bartowski](https://huggingface.co/bartowski)\n\n- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-GGUF\n\n## ExLlamaV2 [@bartowski](https://huggingface.co/bartowski)\n\n- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-exl2\n\n## AWQ [@solidrust](https://huggingface.co/solidrust)\n\n- https://huggingface.co/solidrust/Einstein-v6.1-Llama3-8B-AWQ\n\n# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\nDetailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)\n\n| Metric |Value|\n|---------------------------------|----:|\n|Avg. 
|68.60|\n|AI2 Reasoning Challenge (25-Shot)|62.46|\n|HellaSwag (10-Shot) |82.41|\n|MMLU (5-Shot) |66.19|\n|TruthfulQA (0-shot) |55.10|\n|Winogrande (5-shot) |79.32|\n|GSM8k (5-shot) |66.11|\n\n# [Open LLM Leaderboard v2 Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)\nDetailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)\n\n| Metric |Value|\n|-------------------|----:|\n|Avg. |19.99|\n|IFEval (0-Shot) |45.68|\n|BBH (3-Shot) |29.38|\n|MATH Lvl 5 (4-Shot)| 5.74|\n|GPQA (0-shot) | 4.25|\n|MuSR (0-shot) |11.23|\n|MMLU-PRO (5-shot) |23.68|\n\n\n# Some resources, discussions and reviews about this model\n\n#### Announcement tweet: \n\n- https://twitter.com/Weyaxi/status/1783050724659675627\n\n#### Reddit post in r/LocalLLaMA:\n\n- https://www.reddit.com/r/LocalLLaMA/comments/1cdlym1/introducing_einstein_v61_based_on_the_new_llama3/\n\n#### YouTube Video(s)\n\n- [Install Einstein v6.1 Llama3-8B Locally on Windows](https://www.youtube.com/watch?v=VePvv6OM0JY)\n\n#### Octopus-V4-3B\n\n- [Octopus-V4-3B](https://huggingface.co/NexaAIDev/Octopus-v4) leverages the incredible physics capabilities of [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B) in their model.\n\n# Additional information about training\n\nThis model was fully fine-tuned for 2 epochs. 
\n\nTotal number of steps was 2026.\n\n<details><summary>Loss graph</summary>\n\n\n\n</details><br>\n\n# Acknowledgments\n\nThanks to [sablo.ai](https://sablo.ai) for sponsoring this model.\n\nThanks to all the dataset authors mentioned in the datasets section.\n\nThanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.\n\nThanks to the entire open source AI community.\n\n[<img src=\"https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png\" alt=\"Built with Axolotl\" width=\"200\" height=\"32\"/>](https://github.com/OpenAccess-AI-Collective/axolotl)\n\nIf you would like to support me:\n\n[Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)\n\n",
"related_quantizations": []
},
"tags": [
"gguf",
"endpoints_compatible",
"region:us",
"conversational"
],
"likes": 0,
"downloads": 720,
"gated": false,
"private": false,
"last_modified": "2024-08-19T10:07:47.000Z",
"created_at": "2024-08-19T08:34:54.000Z",
"pipeline_tag": "",
"library_name": ""
}
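The training settings embedded in the metadata above allow a back-of-the-envelope estimate of how much data each optimizer step covered. The values for `micro_batch_size`, `gradient_accumulation_steps`, `sequence_len`, and the step count of 2026 come from the model card; treating all nine GPUs (8xRTX3090 + 1xRTXA6000) as data-parallel workers is an assumption the card does not state explicitly, and because sequences are packed and padded to `sequence_len`, the token figures are upper bounds rather than exact counts.

```python
# Values copied from the axolotl config and training notes in the model card.
MICRO_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
SEQUENCE_LEN = 8192
TOTAL_STEPS = 2026

# Assumption: all nine cards took part as data-parallel workers.
NUM_GPUS = 8 + 1


def sequences_per_step(micro_batch: int, grad_accum: int, num_gpus: int) -> int:
    """Effective batch size: sequences consumed per optimizer step."""
    return micro_batch * grad_accum * num_gpus


batch = sequences_per_step(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_GPUS)
tokens_per_step = batch * SEQUENCE_LEN
max_tokens_total = tokens_per_step * TOTAL_STEPS

print(f"sequences per step: {batch}")
print(f"token positions per step (upper bound): {tokens_per_step:,}")
print(f"token positions over {TOTAL_STEPS} steps (upper bound): {max_tokens_total:,}")
```

Under these assumptions, the two epochs cover at most roughly 0.6 billion token positions in total, padding included.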
Source payload excerpt (from Hugging Face API)
{
"_id": "66c303aeb01b19d8c3eeeb67",
"id": "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf",
"modelId": "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf",
"sha": "3306b90cbf42e41ba66c4002adcb79b5826c51e9",
"createdAt": "2024-08-19T08:34:54.000Z",
"lastModified": "2024-08-19T10:07:47.000Z",
"author": "RichardErkhov",
"downloads": 720,
"likes": 0,
"gated": false,
"private": false,
"pipeline_tag": "",
"library_name": "",
"siblings_count": 24
}