lewdiculous/l3-8b-stheno-v3.1-gguf-iq-imatrix (Q4_K_S GGUF, free GGUF download) is indexed on GraySoft with repository links, GGUF quant files, and Hugging Face metadata. Use this page to pick a local model for guIDE or other runtimes. Related models in the same shard are listed below.

Model Intelligence Sheet

lewdiculous/l3-8b-stheno-v3.1-gguf-iq-imatrix overview

My GGUF-IQ-Imatrix quants for Sao10K/L3-8B-Stheno-v3.1. This is a very promising roleplay model cooked by the amazing Sao10K!

Quantization process: For future reference, these quants were made after the fixes from #6920 had been merged. Imatrix data was generated from the FP16-GGUF, and conversions were made directly from the BF16-GGUF. This was a bit more disk and compute intensive but hopefully avoided any losses during conversion. If you notice any issues, let me know in the discussions.

General usage: Use the latest version of KoboldCpp. For 8GB VRAM GPUs, I recommend the Q4_K_M-imat (4.89 BPW) quant for context sizes up to 12288.

Presets: Some compatible SillyTavern presets can be found in Virt's Roleplay Presets. Check the discussions there for other recommendations and samplers.

Personal-support: I apologize for disrupting your experience. Currently I'm working on moving to a better internet provider. If you want and you are able to, you can spare some change on Ko-fi.

Author-support: You can support the author at their own page.

transformers, gguf, roleplay, llama3, sillytavern, en, base_model:Sao10K/L3-8B-Stheno-v3.1, base_model:quantized:Sao10K/L3-8B-Stheno-v3.1, license:cc-by-nc-4.0, region:us, imatrix, conversational

Downloads

798

Likes

94

Pipeline

—

Library

transformers

Visibility

Public

Access

Open

Repository Files & Downloads

11 files detected

Direct downloads for all repository files

File	Type	Quantization	Size	Link
L3-8B-Stheno-v3.1-IQ3_M-imat.gguf	GGUF	IQ3_M	3.52 GB	Download
L3-8B-Stheno-v3.1-IQ3_S-imat.gguf	GGUF	IQ3_S	3.43 GB	Download
L3-8B-Stheno-v3.1-IQ3_XXS-imat.gguf	GGUF	IQ3_XXS	3.05 GB	Download
L3-8B-Stheno-v3.1-IQ4_NL-imat.gguf	GGUF	IQ4_NL	4.36 GB	Download
L3-8B-Stheno-v3.1-IQ4_XS-imat.gguf	GGUF	IQ4_XS	4.14 GB	Download
L3-8B-Stheno-v3.1-Q4_K_M-imat.gguf	GGUF	Q4_K_M	4.58 GB	Download
L3-8B-Stheno-v3.1-Q4_K_S-imat.gguf	GGUF	Q4_K_S	4.37 GB	Download
L3-8B-Stheno-v3.1-Q5_K_M-imat.gguf	GGUF	Q5_K_M	5.34 GB	Download
L3-8B-Stheno-v3.1-Q5_K_S-imat.gguf	GGUF	Q5_K_S	5.21 GB	Download
L3-8B-Stheno-v3.1-Q6_K-imat.gguf	GGUF	Q6_K	6.14 GB	Download
L3-8B-Stheno-v3.1-Q8_0-imat.gguf	GGUF	Q8_0	7.95 GB	Download
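
As a rough aid for choosing among the files above, the following Python sketch picks the largest quant whose file fits a given size budget. The sizes are copied from the table; the function names and the budget values are hypothetical, and the Q8_0 label is taken from the file name. File size is only a lower bound on memory use, since the context cache needs additional room, so treat the result as a starting point rather than a guarantee.

```python
# File sizes in GB, copied from the table on this page.
# The "Q8_0" key is taken from the file name of the last row.
QUANT_SIZES_GB = {
    "IQ3_XXS": 3.05,
    "IQ3_S": 3.43,
    "IQ3_M": 3.52,
    "IQ4_XS": 4.14,
    "IQ4_NL": 4.36,
    "Q4_K_S": 4.37,
    "Q4_K_M": 4.58,
    "Q5_K_S": 5.21,
    "Q5_K_M": 5.34,
    "Q6_K": 6.14,
    "Q8_0": 7.95,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the quant with the largest file that still fits budget_gb, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def filename_for(quant):
    """Build the file name using the naming pattern seen in the table."""
    return f"L3-8B-Stheno-v3.1-{quant}-imat.gguf"


if __name__ == "__main__":
    # Hypothetical budget: leave headroom on an 8GB card for context.
    choice = largest_quant_within(5.0)
    print(choice, filename_for(choice))
```

With a 5.0 GB budget this selects Q4_K_M, which matches the quant the README recommends for 8GB VRAM GPUs.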

Model Details

Model Slug

lewdiculous/l3-8b-stheno-v3.1-gguf-iq-imatrix

Author

Lewdiculous

Pipeline Task

—

Library

transformers

Created

2024-05-20

Last Modified

2024-09-03

Gated

No

Private

No

HF SHA

03edb90c91814827a12208290395e4b23374dc4d

License

cc-by-nc-4.0

Language

en

Base Model

Sao10K/L3-8B-Stheno-v3.1

Metadata Inspector

Normalized metadata (stored in metadata_json)

{
  "metadata": {},
  "card_data": {
    "base_model": "Sao10K/L3-8B-Stheno-v3.1",
    "quantized_by": "Lewdiculous",
    "library_name": "transformers",
    "license": "cc-by-nc-4.0",
    "inference": false,
    "language": [
      "en"
    ],
    "tags": [
      "roleplay",
      "llama3",
      "sillytavern"
    ],
    "frontmatter": {
      "base_model": "Sao10K/L3-8B-Stheno-v3.1",
      "quantized_by": "Lewdiculous",
      "library_name": "transformers",
      "license": "cc-by-nc-4.0",
      "inference": "false",
      "language": [
        "en"
      ],
      "tags": [
        "roleplay",
        "llama3",
        "sillytavern"
      ]
    },
    "hero_image_url": "https://w.forfun.com/fetch/cb/cba2205390e517bea1ea60ca0b491af4.jpeg",
    "summary": "My GGUF-IQ-Imatrix quants for Sao10K/L3-8B-Stheno-v3.1. This is a very promising roleplay model cooked by the amazing Sao10K! > [!IMPORTANT] > **Quantization process:**  > For future reference, these quants have been done after the fixes from **#6920** have been merged.  > Imatrix data was generated from the FP16-GGUF and conversions directly from the BF16-GGUF.  > This was a bit more disk and compute intensive but hopefully avoided any losses during conversion.  > If you noticed any issues let me know in the discussions. > [!NOTE] > **General usage:**  > Use the latest version of **KoboldCpp**.  > For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** (4.89 BPW) quant for up to 12288 context sizes.  > > **Presets:**  > Some compatible SillyTavern presets can be found **here (Virt's Roleplay Presets)**.  > Check **discussions such as this one** for other recommendations and samplers. > [!TIP] > **Personal-support:**  > I apologize for disrupting your experience.  > Currently I'm working on moving for a better internet provider.  > If you **want** and you are **able to**...  > You can **spare some change over here (Ko-fi)**.  > > **Author-support:**  > You can support the author **at their own page**. !image/jpeg",
    "quick_links": [],
    "benchmark_table_html": "",
    "readme_markdown": "---\nbase_model: Sao10K/L3-8B-Stheno-v3.1\nquantized_by: Lewdiculous\nlibrary_name: transformers\nlicense: cc-by-nc-4.0\ninference: false\nlanguage:\n- en\ntags:\n- roleplay\n- llama3\n- sillytavern\n---\n\n> [!WARNING]\n> **Update:** <br>\n> [New and updated version 3.2 here!](https://huggingface.co/Lewdiculous/L3-8B-Stheno-v3.2-GGUF-IQ-Imatrix) <br>\n> It includes fixes for common issues! <br>\n> **You should prefer it over version 3.1.** <br>\n\n<br>\n\n# #roleplay #sillytavern #llama3\n\nMy GGUF-IQ-Imatrix quants for [Sao10K/L3-8B-Stheno-v3.1](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.1).\n\nThis is a very promising roleplay model cooked by the amazing Sao10K!\n\n> [!IMPORTANT]\n> **Quantization process:** <br>\n> For future reference, these quants have been done after the fixes from [**#6920**](https://github.com/ggerganov/llama.cpp/pull/6920) have been merged. <br>\n> Imatrix data was generated from the FP16-GGUF and conversions directly from the BF16-GGUF. <br>\n> This was a bit more disk and compute intensive but hopefully avoided any losses during conversion. <br>\n> If you noticed any issues let me know in the discussions.\n\n> [!NOTE]\n> **General usage:** <br>\n> Use the latest version of **KoboldCpp**. <br>\n> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** (4.89 BPW) quant for up to 12288 context sizes. <br>\n>\n> **Presets:** <br>\n> Some compatible SillyTavern presets can be found [**here (Virt's Roleplay Presets)**](https://huggingface.co/Virt-io/SillyTavern-Presets). <br>\n> Check [**discussions such as this one**](https://huggingface.co/Virt-io/SillyTavern-Presets/discussions/5#664d6fb87c563d4d95151baa) for other recommendations and samplers.\n\n> [!TIP]\n> **Personal-support:** <br>\n> I apologize for disrupting your experience. <br>\n> Currently I'm working on moving for a better internet provider. <br>\n> If you **want** and you are **able to**... 
<br>\n> You can [**spare some change over here (Ko-fi)**](https://ko-fi.com/Lewdiculous). <br>\n>\n> **Author-support:** <br>\n> You can support the author [**at their own page**](https://ko-fi.com/sao10k).\n\n![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/JVvOa7FKmO6FObjLdBWBv.jpeg)\n\n## **Original model information:**\n\n<img src=\"https://w.forfun.com/fetch/cb/cba2205390e517bea1ea60ca0b491af4.jpeg\"  style=\"width: 80%; min-width: 400px; display: block; margin: auto;\">\n\n**Model: Llama-3-8B-Stheno-v3.1**\n\nThis has been an experimental model I've been working on for a bit. Llama-3 was kind of difficult to work with. \n<br>I also had been hired to create a model for an Organisation, and I used the lessons I learnt from fine-tuning that one for this specific model. Unable to share that one though, unfortunately.\n<br>Made from outputs generated by Claude-3-Opus along with Human-Generated Data.\n\n\nStheno-v3.1\n\n\\- A model made for 1-on-1 Roleplay ideally, but one that is able to handle scenarios, RPGs and storywriting fine.\n<br>\\- Uncensored during actual roleplay scenarios. # I do not care for zero-shot prompting like what some people do. It is uncensored enough in actual usecases.\n<br>\\- I quite like the prose and style for this model.\n\n#### Testing Notes\n<br>\\- Known as L3-RP-v2.1 on Chaiverse, it did decently there [>1200 Elo]\n<br>\\- Handles character personalities well. Great for 1 on 1 Roleplay sessions.\n<br>\\- May need further token context & few-shot examples if using it as a Narrator / RPG Roleplaying session. It is able to handle them though.\n<br>\\- A model leaning towards NSFW, mention explicitly in prompts if you want to steer away. [Avoid Negative Reinforcement]\n<br>\\- Occasionally spits out leaking XML and nonsense. A regen / swipe instantly fixes that.\n<br>\\- Unique / Varied Answers when Regenerating answers. 
Pretty cool?\n<br>\\- Works best with *some* token context in the character card itself. A chef needs ingredients to cook, no?\n\n\n***\n\n**Recommended Samplers:**\n\n```\nTemperature - 1.12 to 1.32\nMin-P - 0.075\nTop-K - 40\nRepetition Penalty - 1.1\n```\n\n**Stopping Strings:**\n\n```\n\\n{{User}} # Or Equivalent, depending on Frontend\n<|eot_id|>\n<|end_of_text|>\n\\n< # If there is leakage of XML tags in response. May happen Occasionally, Regenerate Answer as Needed. Happens rarely.\n```\n\n**Prompting Template - Llama-3-Instruct**\n\n```\n<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n{output}<|eot_id|>\n```\n\n**Basic Roleplay System Prompt**\n```\nYou are an expert actor that can fully immerse yourself into any role given. You do not break character for any reason, even if someone tries addressing you as an AI or language model.\nCurrently your role is {{char}}, which is described in detail below. As {{char}}, continue the exchange with {{user}}.\n```\n\n***\n\nSupport me here if you're interested. [Ko-Fi](https://ko-fi.com/sao10k)\n\nIf not, that's fine too. Feedback would be nice.\n\n```\nArt by wada_kazu / わだかず (pixiv page private?)\n```\n\n***",
    "related_quantizations": []
  },
  "tags": [
    "transformers",
    "gguf",
    "roleplay",
    "llama3",
    "sillytavern",
    "en",
    "base_model:Sao10K/L3-8B-Stheno-v3.1",
    "base_model:quantized:Sao10K/L3-8B-Stheno-v3.1",
    "license:cc-by-nc-4.0",
    "region:us",
    "imatrix",
    "conversational"
  ],
  "likes": 94,
  "downloads": 798,
  "gated": false,
  "private": false,
  "last_modified": "2024-09-03T04:47:27.000Z",
  "created_at": "2024-05-20T15:01:38.000Z",
  "pipeline_tag": "",
  "library_name": "transformers"
}
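
The README stored in readme_markdown gives a Llama-3-Instruct prompting template, recommended samplers, and stopping strings. A minimal Python sketch of how those pieces could be assembled by hand is shown here. The function name and the sampler key names are hypothetical and not tied to any particular runtime; the token strings and numeric values are copied from the README, with the temperature set to the lower end of the stated 1.12 to 1.32 range. Frontends such as SillyTavern normally apply the template themselves, and some runtimes may add the begin-of-text token on their own, so this is only relevant when building raw prompts.

```python
def build_llama3_prompt(system_prompt: str, user_input: str) -> str:
    """Fill the Llama-3-Instruct template up to the point where the model replies."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


# Stopping strings from the README that do not depend on the frontend.
STOP_STRINGS = ["<|eot_id|>", "<|end_of_text|>"]

# Values from the README's "Recommended Samplers" block.
# Key names are illustrative; map them to whatever your runtime expects.
SAMPLERS = {
    "temperature": 1.12,
    "min_p": 0.075,
    "top_k": 40,
    "repetition_penalty": 1.1,
}


if __name__ == "__main__":
    prompt = build_llama3_prompt(
        "You are an expert actor that can fully immerse yourself into any role given.",
        "Hello there.",
    )
    print(prompt)
```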

Source payload excerpt (from Hugging Face API)

{
  "_id": "664b65d243f57ba4a46a9e73",
  "id": "Lewdiculous/L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix",
  "modelId": "Lewdiculous/L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix",
  "sha": "03edb90c91814827a12208290395e4b23374dc4d",
  "createdAt": "2024-05-20T15:01:38.000Z",
  "lastModified": "2024-09-03T04:47:27.000Z",
  "author": "Lewdiculous",
  "downloads": 798,
  "likes": 94,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "transformers",
  "siblings_count": 15
}
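
To show how the payload excerpt above might be consumed, here is a small Python sketch that parses it and derives a few summary values. The payload text is copied from this page; the helper names are hypothetical, and the timestamp format string is an assumption based on the two timestamps shown.

```python
import json
from datetime import datetime

# Payload excerpt copied from this page.
PAYLOAD_TEXT = """
{
  "_id": "664b65d243f57ba4a46a9e73",
  "id": "Lewdiculous/L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix",
  "modelId": "Lewdiculous/L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix",
  "sha": "03edb90c91814827a12208290395e4b23374dc4d",
  "createdAt": "2024-05-20T15:01:38.000Z",
  "lastModified": "2024-09-03T04:47:27.000Z",
  "author": "Lewdiculous",
  "downloads": 798,
  "likes": 94,
  "gated": false,
  "private": false,
  "pipeline_tag": "",
  "library_name": "transformers",
  "siblings_count": 15
}
"""


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps shaped like 2024-05-20T15:01:38.000Z (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


def summarize(payload: dict) -> dict:
    """Collect a few fields and the whole days between creation and last change."""
    created = parse_timestamp(payload["createdAt"])
    modified = parse_timestamp(payload["lastModified"])
    return {
        "id": payload["id"],
        "downloads": payload["downloads"],
        "likes": payload["likes"],
        "is_open": not payload["gated"] and not payload["private"],
        "days_between_create_and_modify": (modified - created).days,
    }


if __name__ == "__main__":
    print(summarize(json.loads(PAYLOAD_TEXT)))
```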

More models in this shard