GraySoft
Model Intelligence Sheet

richarderkhov/weyaxi_-_einstein-v6.1-llama3-8b-gguf overview

This model is a fully fine-tuned version of meta-llama/Meta-Llama-3-8B, trained on diverse datasets. It was fine-tuned on 8xRTX3090 + 1xRTXA6000 using axolotl (version 0.4.0). Training was sponsored by sablo.ai.

Prompt Template: The model uses the ChatML prompt template. The template is also available as a chat template, so messages can be formatted with the tokenizer.apply_chat_template() method.

Datasets used in this model: The datasets used to train this model are listed in the metadata section of the model card. Some of the datasets listed there were filtered based on various criteria. The results of this filtering are in the data folder of the original repository: Weyaxi/Einstein-v6.1-Llama3-8B/data

Quantized versions: The GGUF quantizations in this repository are listed under Repository Files & Downloads.
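The ChatML prompt template mentioned in the overview can be sketched in plain Python. This is a minimal, hand-rolled illustration: `format_chatml` is a hypothetical helper, not part of any library, and the exact whitespace produced by the tokenizer's own chat template may differ. The `<|im_start|>` and `<|im_end|>` markers are taken from the model card reproduced in the metadata further down this page.

```python
from typing import Dict, List


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render role/content messages as a ChatML prompt string (illustrative helper)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = format_chatml(messages)
print(prompt)
```

A string built this way could be passed to a GGUF runtime that accepts raw prompts. When a runtime applies the chat template itself, this manual step is unnecessary.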

Tags: gguf, endpoints_compatible, region:us, conversational
Downloads
720
Likes
0
Pipeline
Not specified
Library
Not specified
Visibility
Public
Access
Open

Repository Files & Downloads

22 files detected
Direct downloads for all repository files
| File | Type | Quantization | Size |
| ---- | ---- | ---- | ---- |
| Einstein-v6.1-Llama3-8B.IQ3_M.gguf | GGUF | IQ3_M | 3.52 GB |
| Einstein-v6.1-Llama3-8B.IQ3_S.gguf | GGUF | IQ3_S | 3.43 GB |
| Einstein-v6.1-Llama3-8B.IQ3_XS.gguf | GGUF | IQ3_XS | 3.28 GB |
| Einstein-v6.1-Llama3-8B.IQ4_NL.gguf | GGUF | IQ4_NL | 4.38 GB |
| Einstein-v6.1-Llama3-8B.IQ4_XS.gguf | GGUF | IQ4_XS | 4.18 GB |
| Einstein-v6.1-Llama3-8B.Q2_K.gguf | GGUF | Q2_K | 2.96 GB |
| Einstein-v6.1-Llama3-8B.Q3_K.gguf | GGUF | Q3_K | 3.74 GB |
| Einstein-v6.1-Llama3-8B.Q3_K_L.gguf | GGUF | Q3_K_L | 4.03 GB |
| Einstein-v6.1-Llama3-8B.Q3_K_M.gguf | GGUF | Q3_K_M | 3.74 GB |
| Einstein-v6.1-Llama3-8B.Q3_K_S.gguf | GGUF | Q3_K_S | 3.41 GB |
| Einstein-v6.1-Llama3-8B.Q4_0.gguf | GGUF | Q4_0 | 4.34 GB |
| Einstein-v6.1-Llama3-8B.Q4_1.gguf | GGUF | Q4_1 | 4.78 GB |
| Einstein-v6.1-Llama3-8B.Q4_K.gguf | GGUF | Q4_K | 4.58 GB |
| Einstein-v6.1-Llama3-8B.Q4_K_M.gguf | GGUF | Q4_K_M | 4.58 GB |
| Einstein-v6.1-Llama3-8B.Q4_K_S.gguf | GGUF | Q4_K_S | 4.37 GB |
| Einstein-v6.1-Llama3-8B.Q5_0.gguf | GGUF | Q5_0 | 5.21 GB |
| Einstein-v6.1-Llama3-8B.Q5_1.gguf | GGUF | Q5_1 | 5.65 GB |
| Einstein-v6.1-Llama3-8B.Q5_K.gguf | GGUF | Q5_K | 5.34 GB |
| Einstein-v6.1-Llama3-8B.Q5_K_M.gguf | GGUF | Q5_K_M | 5.34 GB |
| Einstein-v6.1-Llama3-8B.Q5_K_S.gguf | GGUF | Q5_K_S | 5.21 GB |
| Einstein-v6.1-Llama3-8B.Q6_K.gguf | GGUF | Q6_K | 6.14 GB |
| Einstein-v6.1-Llama3-8B.Q8_0.gguf | GGUF | Q8_0 | 7.95 GB |
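Choosing among the files above usually comes down to a size budget. The sketch below is one possible heuristic, not a recommendation from the repository author: it picks the largest listed quantization whose file size fits a given budget and builds a download URL for it. The helper names are hypothetical. File size is only a rough proxy for memory use, since a runtime also needs room for context buffers. The `resolve/main` path is the usual Hugging Face direct-download form, whereas the model card itself links to `blob/main` pages.

```python
from typing import Optional

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 2.96, "IQ3_XS": 3.28, "Q3_K_S": 3.41, "IQ3_S": 3.43, "IQ3_M": 3.52,
    "Q3_K": 3.74, "Q3_K_M": 3.74, "Q3_K_L": 4.03, "IQ4_XS": 4.18, "Q4_0": 4.34,
    "Q4_K_S": 4.37, "IQ4_NL": 4.38, "Q4_K": 4.58, "Q4_K_M": 4.58, "Q4_1": 4.78,
    "Q5_0": 5.21, "Q5_K_S": 5.21, "Q5_K": 5.34, "Q5_K_M": 5.34, "Q5_1": 5.65,
    "Q6_K": 6.14, "Q8_0": 7.95,
}

REPO_ID = "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf"


def largest_quant_within(budget_gb: float) -> Optional[str]:
    """Return the largest quantization whose file fits the budget, or None."""
    candidates = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget_gb]
    if not candidates:
        return None
    # Ties on size are broken alphabetically by quantization name.
    return max(candidates)[1]


def download_url(quant: str) -> str:
    """Build a direct-download URL for the given quantization (assumed URL form)."""
    return f"https://huggingface.co/{REPO_ID}/resolve/main/Einstein-v6.1-Llama3-8B.{quant}.gguf"


choice = largest_quant_within(5.0)
if choice is not None:
    print(choice, download_url(choice))
```

The budget of 5.0 GB is an arbitrary example value.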

Model Details Live

Model Slug
richarderkhov/weyaxi_-_einstein-v6.1-llama3-8b-gguf
Author
RichardErkhov
Pipeline Task
Not specified
Library
Not specified
Created
2024-08-19
Last Modified
2024-08-19
Gated
No
Private
No
HF SHA
3306b90cbf42e41ba66c4002adcb79b5826c51e9
License
Unknown
Language
Unknown
Base Model
Unknown

Metadata Inspector

Normalized metadata (stored in metadata_json)
{
  "metadata": {},
  "card_data": {
    "frontmatter": {},
    "hero_image_url": "https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png",
    "summary": "This model is a full fine-tuned version of meta-llama/Meta-Llama-3-8B on diverse datasets. This model is finetuned using 8xRTX3090 + 1xRTXA6000 using axolotl. This model's training was sponsored by sablo.ai. See axolotl config axolotl version: 0.4.0 ``yaml base_model: meta-llama/Meta-Llama-3-8B model_type: LlamaForCausalLM tokenizer_type: AutoTokenizer load_in_8bit: false load_in_4bit: false strict: false chat_template: chatml datasets: ds_type: json type: alpaca conversation: chatml ds_type: json type: gpteacher conversation: chatml ds_type: json type: alpaca conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml ds_type: json type: sharegpt strict: false conversation: chatml dataset_prepared_path: last_run_prepared val_set_size: 0.002 output_dir: ./Einstein-v6.1-Llama3-8B-model sequence_len: 8192 sample_packing: true pad_to_sequence_len: true eval_sample_packing: false wandb_project: Einstein wandb_entity: wandb_watch: wandb_name: Einstein-v6.1-Llama3-2-epoch wandb_log_model: hub_model_id: Weyaxi/Einstein-v6.1-Llama3-8B save_safetensors: true gradient_accumulation_steps: 4 micro_batch_size: 1 num_epochs: 2 optimizer: adamw_bnb_8bit # look lr_scheduler: cosine learning_rate: 0.000005 # look train_on_inputs: false group_by_length: false bf16: true fp16: false tf32: false gradient_checkpointing: true early_stopping_patience: resume_from_checkpoint: local_rank: logging_steps: 1 xformers_attention: 
flash_attention: true warmup_steps: 10 evals_per_epoch: 2 eval_table_size: eval_table_max_new_tokens: 128 saves_per_epoch: 2 debug: deepspeed: zero3_bf16_cpuoffload_params.json weight_decay: 0.0 fsdp: fsdp_config: special_tokens: bos_token: \"<s>\" eos_token: \"<|im_end|>\" unk_token: \"<unk>\" pad_token: <|end_of_text|> # changed tokens: - \"<|im_start|>\" `  # 💬 Prompt Template You can use the ChatML prompt template while using the model: ### ChatML ` <|im_start|>system {system}<|im_end|> <|im_start|>user {user}<|im_end|> <|im_start|>assistant {assistant}<|im_end|> ` This prompt template is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method: `python messages = [ {\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"}, {\"role\": \"user\", \"content\": \"Hello!\"} ] gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors=\"pt\") model.generate(**gen_input) `` # 📊 Datasets used in this model The datasets used to train this model are listed in the metadata section of the model card. Please note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria. The results of this filtering process are in the data folder of this repository: Weyaxi/Einstein-v6.1-Llama3-8B/data # 🔄 Quantized versions",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "Quantization made by Richard Erkhov.\n\n[Github](https://github.com/RichardErkhov)\n\n[Discord](https://discord.gg/pvy7H8DZMG)\n\n[Request more models](https://github.com/RichardErkhov/quant_request)\n\n\nEinstein-v6.1-Llama3-8B - GGUF\n- Model creator: https://huggingface.co/Weyaxi/\n- Original model: https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B/\n\n\n| Name | Quant method | Size |\n| ---- | ---- | ---- |\n| [Einstein-v6.1-Llama3-8B.Q2_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q2_K.gguf) | Q2_K | 2.96GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_XS.gguf) | IQ3_XS | 3.28GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_S.gguf) | IQ3_S | 3.43GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_S.gguf) | Q3_K_S | 3.41GB |\n| [Einstein-v6.1-Llama3-8B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ3_M.gguf) | IQ3_M | 3.52GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K.gguf) | Q3_K | 3.74GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_M.gguf) | Q3_K_M | 3.74GB |\n| [Einstein-v6.1-Llama3-8B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q3_K_L.gguf) | Q3_K_L | 4.03GB |\n| 
[Einstein-v6.1-Llama3-8B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ4_XS.gguf) | IQ4_XS | 4.18GB |\n| [Einstein-v6.1-Llama3-8B.Q4_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_0.gguf) | Q4_0 | 4.34GB |\n| [Einstein-v6.1-Llama3-8B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.IQ4_NL.gguf) | IQ4_NL | 4.38GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K_S.gguf) | Q4_K_S | 4.37GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K.gguf) | Q4_K | 4.58GB |\n| [Einstein-v6.1-Llama3-8B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_K_M.gguf) | Q4_K_M | 4.58GB |\n| [Einstein-v6.1-Llama3-8B.Q4_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q4_1.gguf) | Q4_1 | 4.78GB |\n| [Einstein-v6.1-Llama3-8B.Q5_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_0.gguf) | Q5_0 | 5.21GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K_S.gguf) | Q5_K_S | 5.21GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K.gguf) | Q5_K | 5.34GB |\n| [Einstein-v6.1-Llama3-8B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_K_M.gguf) | Q5_K_M | 5.34GB |\n| 
[Einstein-v6.1-Llama3-8B.Q5_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q5_1.gguf) | Q5_1 | 5.65GB |\n| [Einstein-v6.1-Llama3-8B.Q6_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q6_K.gguf) | Q6_K | 6.14GB |\n| [Einstein-v6.1-Llama3-8B.Q8_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf/blob/main/Einstein-v6.1-Llama3-8B.Q8_0.gguf) | Q8_0 | 7.95GB |\n\n\n\n\nOriginal model description:\n---\nlanguage:\n- en\nlicense: other\ntags:\n- axolotl\n- generated_from_trainer\n- instruct\n- finetune\n- chatml\n- gpt4\n- synthetic data\n- science\n- physics\n- chemistry\n- biology\n- math\n- llama\n- llama3\nbase_model: meta-llama/Meta-Llama-3-8B\ndatasets:\n- allenai/ai2_arc\n- camel-ai/physics\n- camel-ai/chemistry\n- camel-ai/biology\n- camel-ai/math\n- metaeval/reclor\n- openbookqa\n- mandyyyyii/scibench\n- derek-thomas/ScienceQA\n- TIGER-Lab/ScienceEval\n- jondurbin/airoboros-3.2\n- LDJnr/Capybara\n- Cot-Alpaca-GPT4-From-OpenHermes-2.5\n- STEM-AI-mtl/Electrical-engineering\n- knowrohit07/saraswati-stem\n- sablo/oasst2_curated\n- lmsys/lmsys-chat-1m\n- TIGER-Lab/MathInstruct\n- bigbio/med_qa\n- meta-math/MetaMathQA-40K\n- openbookqa\n- piqa\n- metaeval/reclor\n- derek-thomas/ScienceQA\n- scibench\n- sciq\n- Open-Orca/SlimOrca\n- migtissera/Synthia-v1.3\n- TIGER-Lab/ScienceEval\n- allenai/WildChat\n- microsoft/orca-math-word-problems-200k\n- openchat/openchat_sharegpt4_dataset\n- teknium/GPTeacher-General-Instruct\n- m-a-p/CodeFeedback-Filtered-Instruction\n- totally-not-an-llm/EverythingLM-data-V3\n- HuggingFaceH4/no_robots\n- OpenAssistant/oasst_top1_2023-08-25\n- WizardLM/WizardLM_evol_instruct_70k\nmodel-index:\n- name: Einstein-v6.1-Llama3-8B\n  results:\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: AI2 Reasoning Challenge (25-Shot)\n      type: 
ai2_arc\n      config: ARC-Challenge\n      split: test\n      args:\n        num_few_shot: 25\n    metrics:\n    - type: acc_norm\n      value: 62.46\n      name: normalized accuracy\n    source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: HellaSwag (10-Shot)\n      type: hellaswag\n      split: validation\n      args:\n        num_few_shot: 10\n    metrics:\n    - type: acc_norm\n      value: 82.41\n      name: normalized accuracy\n    source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: MMLU (5-Shot)\n      type: cais/mmlu\n      config: all\n      split: test\n      args:\n        num_few_shot: 5\n    metrics:\n    - type: acc\n      value: 66.19\n      name: accuracy\n    source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: TruthfulQA (0-shot)\n      type: truthful_qa\n      config: multiple_choice\n      split: validation\n      args:\n        num_few_shot: 0\n    metrics:\n    - type: mc2\n      value: 55.1\n    source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: Winogrande (5-shot)\n      type: winogrande\n      config: winogrande_xl\n      split: validation\n      args:\n        num_few_shot: 5\n    metrics:\n    - type: acc\n      value: 79.32\n      name: accuracy\n    
source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: GSM8k (5-shot)\n      type: gsm8k\n      config: main\n      split: test\n      args:\n        num_few_shot: 5\n    metrics:\n    - type: acc\n      value: 66.11\n      name: accuracy\n    source:\n      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: IFEval (0-Shot)\n      type: HuggingFaceH4/ifeval\n      args:\n        num_few_shot: 0\n    metrics:\n    - type: inst_level_strict_acc and prompt_level_strict_acc\n      value: 45.68\n      name: strict accuracy\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: BBH (3-Shot)\n      type: BBH\n      args:\n        num_few_shot: 3\n    metrics:\n    - type: acc_norm\n      value: 29.38\n      name: normalized accuracy\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: MATH Lvl 5 (4-Shot)\n      type: hendrycks/competition_math\n      args:\n        num_few_shot: 4\n    metrics:\n    - type: exact_match\n      value: 5.74\n      name: exact match\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text 
Generation\n    dataset:\n      name: GPQA (0-shot)\n      type: Idavidrein/gpqa\n      args:\n        num_few_shot: 0\n    metrics:\n    - type: acc_norm\n      value: 4.25\n      name: acc_norm\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: MuSR (0-shot)\n      type: TAUR-Lab/MuSR\n      args:\n        num_few_shot: 0\n    metrics:\n    - type: acc_norm\n      value: 11.23\n      name: acc_norm\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n  - task:\n      type: text-generation\n      name: Text Generation\n    dataset:\n      name: MMLU-PRO (5-shot)\n      type: TIGER-Lab/MMLU-Pro\n      config: main\n      split: test\n      args:\n        num_few_shot: 5\n    metrics:\n    - type: acc\n      value: 23.68\n      name: accuracy\n    source:\n      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Weyaxi/Einstein-v6.1-Llama3-8B\n      name: Open LLM Leaderboard\n---\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/5s12oq859qLfDkkTNam_C.png)\n\n# 🔬 Einstein-v6.1-Llama3-8B\n\nThis model is a fully fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on diverse datasets.\n\nThis model was fine-tuned on `8xRTX3090` + `1xRTXA6000` using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).\n\nThis model's training was sponsored by [sablo.ai](https://sablo.ai). 
\n\n<details><summary>See axolotl config</summary>\n\naxolotl version: `0.4.0`\n```yaml\nbase_model: meta-llama/Meta-Llama-3-8B\nmodel_type: LlamaForCausalLM\ntokenizer_type: AutoTokenizer\n\nload_in_8bit: false\nload_in_4bit: false\nstrict: false\n\nchat_template: chatml\ndatasets:\n  - path: data/merged_all.json\n    ds_type: json\n    type: alpaca\n    conversation: chatml\n\n  - path: data/gpteacher-instruct-special-alpaca.json\n    ds_type: json\n    type: gpteacher\n    conversation: chatml\n\n  - path: data/wizardlm_evol_instruct_70k_random_half.json\n    ds_type: json\n    type: alpaca\n    conversation: chatml\n\n  - path: data/capybara_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml\n\n  - path: data/synthia-v1.3_sharegpt_12500.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml  \n\n  - path: data/cot_alpaca_gpt4_extracted_openhermes_2.5_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml\n\n  - path: data/slimorca_dedup_filtered_95k_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml  \n\n  - path: data/airoboros_3.2_without_contextual_slimorca_orca_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml  \n\n  - path: data/allenai_wild_chat_gpt4_english_toxic_random_half_4k_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    strict: false\n    conversation: chatml  \n\n  - path: data/pippa_bagel_repo_3k_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml  \n\n  - path: data/gpt4_data_lmys_1m_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml  \n\n  - path: data/sharegpt_gpt4_english.json\n    ds_type: json\n    type: sharegpt\n    conversation: chatml\n\n  - path: data/no_robots_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    strict: false\n    conversation: chatml\n\n  - path: data/oasst_top1_from_fusechatmixture_sharegpt.json\n    ds_type: json\n    type: 
sharegpt\n    strict: false\n    conversation: chatml\n\n  - path: data/everythinglm-data-v3_sharegpt.json\n    ds_type: json\n    type: sharegpt\n    strict: false\n    conversation: chatml\n\ndataset_prepared_path: last_run_prepared\nval_set_size: 0.002\n\noutput_dir: ./Einstein-v6.1-Llama3-8B-model\n\nsequence_len: 8192\nsample_packing: true\npad_to_sequence_len: true\neval_sample_packing: false\n\nwandb_project: Einstein\nwandb_entity:\nwandb_watch:\nwandb_name: Einstein-v6.1-Llama3-2-epoch\nwandb_log_model:\nhub_model_id: Weyaxi/Einstein-v6.1-Llama3-8B\n\nsave_safetensors: true\n\ngradient_accumulation_steps: 4\nmicro_batch_size: 1\nnum_epochs: 2\noptimizer: adamw_bnb_8bit # look\nlr_scheduler: cosine\nlearning_rate: 0.000005 # look\n\ntrain_on_inputs: false\ngroup_by_length: false\nbf16: true\nfp16: false\ntf32: false\n\ngradient_checkpointing: true\nearly_stopping_patience:\nresume_from_checkpoint:\nlocal_rank:\nlogging_steps: 1\nxformers_attention:\nflash_attention: true\n\nwarmup_steps: 10\nevals_per_epoch: 2\neval_table_size:\neval_table_max_new_tokens: 128\nsaves_per_epoch: 2\ndebug:\n\ndeepspeed: zero3_bf16_cpuoffload_params.json\nweight_decay: 0.0\nfsdp:\nfsdp_config:\nspecial_tokens:\n  bos_token: \"<s>\"\n  eos_token: \"<|im_end|>\"\n  unk_token: \"<unk>\"\n  pad_token: <|end_of_text|> # changed\ntokens:\n  - \"<|im_start|>\"\n```\n</details><br>\n\n# 💬 Prompt Template\n\nYou can use the ChatML prompt template while using the model:\n\n### ChatML\n\n```\n<|im_start|>system\n{system}<|im_end|>\n<|im_start|>user\n{user}<|im_end|>\n<|im_start|>assistant\n{assistant}<|im_end|>\n```\n\nThis prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the\n`tokenizer.apply_chat_template()` method:\n\n```python\nmessages = [\n    {\"role\": \"system\", \"content\": \"You are a helpful AI assistant.\"},\n    {\"role\": \"user\", \"content\": \"Hello!\"}\n]\ngen_input = 
tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_dict=True, return_tensors=\"pt\")\nmodel.generate(**gen_input)\n```\n\n# 📊 Datasets used in this model\n\nThe datasets used to train this model are listed in the metadata section of the model card.\n\nPlease note that certain datasets mentioned in the metadata may have undergone filtering based on various criteria.\n\nThe results of this filtering process are in the data folder of this repository:\n\n[Weyaxi/Einstein-v6.1-Llama3-8B/data](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B/tree/main/data)\n\n# 🔄 Quantized versions\n\n## GGUF [@bartowski](https://huggingface.co/bartowski)\n\n- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-GGUF\n\n## ExLlamaV2 [@bartowski](https://huggingface.co/bartowski)\n\n- https://huggingface.co/bartowski/Einstein-v6.1-Llama3-8B-exl2\n\n## AWQ [@solidrust](https://huggingface.co/solidrust)\n\n- https://huggingface.co/solidrust/Einstein-v6.1-Llama3-8B-AWQ\n\n# 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)\nDetailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)\n\n|             Metric              |Value|\n|---------------------------------|----:|\n|Avg.                             |68.60|\n|AI2 Reasoning Challenge (25-Shot)|62.46|\n|HellaSwag (10-Shot)              |82.41|\n|MMLU (5-Shot)                    |66.19|\n|TruthfulQA (0-shot)              |55.10|\n|Winogrande (5-shot)              |79.32|\n|GSM8k (5-shot)                   |66.11|\n\n# 🎯 [Open LLM Leaderboard v2 Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)\nDetailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Weyaxi__Einstein-v6.1-Llama3-8B)\n\n|      Metric       |Value|\n|-------------------|----:|\n|Avg.               
|19.99|\n|IFEval (0-Shot)    |45.68|\n|BBH (3-Shot)       |29.38|\n|MATH Lvl 5 (4-Shot)| 5.74|\n|GPQA (0-shot)      | 4.25|\n|MuSR (0-shot)      |11.23|\n|MMLU-PRO (5-shot)  |23.68|\n\n\n# 📚 Some resources, discussions and reviews about this model\n\n#### 🐦 Announcement tweet: \n\n- https://twitter.com/Weyaxi/status/1783050724659675627\n\n#### 🔍 Reddit post in r/LocalLLaMA:\n\n- https://www.reddit.com/r/LocalLLaMA/comments/1cdlym1/introducing_einstein_v61_based_on_the_new_llama3/\n\n#### ▶️ YouTube Video(s)\n\n- [Install Einstein v6.1 Llama3-8B Locally on Windows](https://www.youtube.com/watch?v=VePvv6OM0JY)\n\n#### 📱 Octopus-V4-3B\n\n- [Octopus-V4-3B](https://huggingface.co/NexaAIDev/Octopus-v4) leverages the incredible physics capabilities of [Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B) in their model.\n\n# 🤖 Additional information about training\n\nThis model was fully fine-tuned for 2 epochs. \n\nTotal number of steps was 2026.\n\n<details><summary>Loss graph</summary>\n\n![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/Ycs7ZpoqmxFt0u9rybCO1.png)\n\n</details><br>\n\n# 🤝 Acknowledgments\n\nThanks to [sablo.ai](https://sablo.ai) for sponsoring this model.\n\nThanks to all the dataset authors mentioned in the datasets section.\n\nThanks to [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) for making the repository I used to make this model.\n\nThanks to the entire open source AI community.\n\n[<img src=\"https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png\" alt=\"Built with Axolotl\" width=\"200\" height=\"32\"/>](https://github.com/OpenAccess-AI-Collective/axolotl)\n\nIf you would like to support me:\n\n[☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)\n\n",
    "related_quantizations": []
  },
  "tags": [
    "gguf",
    "endpoints_compatible",
    "region:us",
    "conversational"
  ],
  "likes": 0,
  "downloads": 720,
  "gated": false,
  "private": false,
  "last_modified": "2024-08-19T10:07:47.000Z",
  "created_at": "2024-08-19T08:34:54.000Z",
  "pipeline_tag": "",
  "library_name": ""
}
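The two "Avg." rows in the leaderboard tables embedded in the metadata above can be cross-checked against the individual scores. The sketch below assumes each average is the plain arithmetic mean of the six listed benchmarks, which the model card does not state explicitly. The dictionary names are illustrative, and the values are copied from the tables.

```python
from statistics import mean

# Scores copied from the Open LLM Leaderboard tables in the model card.
leaderboard_v1 = {
    "AI2 Reasoning Challenge (25-Shot)": 62.46,
    "HellaSwag (10-Shot)": 82.41,
    "MMLU (5-Shot)": 66.19,
    "TruthfulQA (0-shot)": 55.10,
    "Winogrande (5-shot)": 79.32,
    "GSM8k (5-shot)": 66.11,
}
leaderboard_v2 = {
    "IFEval (0-Shot)": 45.68,
    "BBH (3-Shot)": 29.38,
    "MATH Lvl 5 (4-Shot)": 5.74,
    "GPQA (0-shot)": 4.25,
    "MuSR (0-shot)": 11.23,
    "MMLU-PRO (5-shot)": 23.68,
}

# Assumption: the reported average is an unweighted mean rounded to two decimals.
avg_v1 = round(mean(leaderboard_v1.values()), 2)
avg_v2 = round(mean(leaderboard_v2.values()), 2)

print(f"v1 average: {avg_v1:.2f} (reported 68.60)")
print(f"v2 average: {avg_v2:.2f} (reported 19.99)")
```

Under that assumption both computed means agree with the reported values. These scores describe the original Weyaxi/Einstein-v6.1-Llama3-8B weights, and the sheet gives no separate results for the quantized GGUF files.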
Source payload excerpt (from Hugging Face API)
{
  "_id": "66c303aeb01b19d8c3eeeb67",
  "id": "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf",
  "modelId": "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf",
  "sha": "3306b90cbf42e41ba66c4002adcb79b5826c51e9",
  "createdAt": "2024-08-19T08:34:54.000Z",
  "lastModified": "2024-08-19T10:07:47.000Z",
  "author": "RichardErkhov",
  "downloads": 720,
  "likes": 0,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "",
  "siblings_count": 24
}
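The payload excerpt above can be read with the standard library alone. The sketch below parses the two timestamps and compares `siblings_count` with the 22 GGUF files listed earlier on this page. The helper name `parse_hf_timestamp` is hypothetical, and the timestamp format string is an assumption based on the two values shown here.

```python
import json
from datetime import datetime, timezone

# A subset of the payload excerpt shown above.
payload = json.loads("""
{
  "id": "RichardErkhov/Weyaxi_-_Einstein-v6.1-Llama3-8B-gguf",
  "createdAt": "2024-08-19T08:34:54.000Z",
  "lastModified": "2024-08-19T10:07:47.000Z",
  "downloads": 720,
  "siblings_count": 24
}
""")


def parse_hf_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2024-08-19T08:34:54.000Z as UTC (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


created = parse_hf_timestamp(payload["createdAt"])
modified = parse_hf_timestamp(payload["lastModified"])
elapsed = modified - created

# The sheet lists 22 GGUF files. The remaining siblings are not itemized on this page.
gguf_files_listed = 22
other_files = payload["siblings_count"] - gguf_files_listed

print(f"Time between creation and last modification: {elapsed}")
print(f"Files not shown in the GGUF table: {other_files}")
```

The difference between the two timestamps is about an hour and a half, so the repository's last recorded change came shortly after it was created.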