Heretic: Automatic language model censorship removal

Tool that removes safety alignment from transformer language models using directional ablation without post-training.

Overall rank: #5 (96) · AI & ML rank: #4 (42)
Stars: 9.8K (+1.3K in 7 days) · Forks: 965 (+112 in 7 days)

Learn more about Heretic

Heretic removes censorship (safety alignment) from transformer-based language models without expensive post-training. It implements directional ablation, also known as abliteration, and pairs it with an Optuna-powered Tree-structured Parzen Estimator (TPE) optimizer that searches for good ablation parameters automatically. The optimizer co-minimizes the refusal rate and the KL divergence from the original model, so that refusals are removed while as much of the original model's capability as possible is preserved. It supports most dense models, including multimodal architectures, and several MoE variants.
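The linear algebra behind directional ablation can be illustrated on a toy matrix. The sketch below is not Heretic's code: the matrix `W`, the direction `r`, the `weight` parameter, and the function name `ablate_direction` are all hypothetical, and the direction is random rather than derived from a model. It only shows that subtracting the projection onto a unit vector leaves the matrix unable to write anything along that vector.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_in = 8, 5

# Toy stand-ins (assumptions): a matrix that writes into a d_model-dim space,
# and a random direction that we want its outputs to avoid.
W = rng.normal(size=(d_model, d_in))
r = rng.normal(size=d_model)
r_hat = r / np.linalg.norm(r)  # unit-length direction


def ablate_direction(W, r_hat, weight=1.0):
    """Return W - weight * r_hat r_hat^T W.

    With weight=1.0 this is an orthogonal projection: every output of the
    returned matrix has zero component along r_hat.
    """
    return W - weight * np.outer(r_hat, r_hat @ W)


W_ablated = ablate_direction(W, r_hat)

x = rng.normal(size=d_in)
component_before = float(r_hat @ (W @ x))
component_after = float(r_hat @ (W_ablated @ x))
print(component_before, component_after)  # the second value is numerically zero
```

A `weight` below 1.0 removes only part of the component, which is the kind of continuous knob an optimizer can tune.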

1. Fully Automatic

Requires no manual parameter tuning or understanding of transformer internals. Uses TPE optimization to automatically find optimal abliteration parameters.

2. Intelligence Preservation

Co-minimizes KL divergence from the original model while removing refusals, maintaining model capabilities better than manual approaches.
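KL divergence measures how far one probability distribution has drifted from another, which is why it can serve as a proxy for how much a modified model's next-token predictions differ from the original's. The sketch below computes it for two small hand-written logit vectors. The logits, the helper names, and the direction of the divergence (original as the reference distribution) are assumptions for illustration, not details taken from Heretic.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i); terms with p_i == 0 add 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up next-token logits over a three-token vocabulary.
original = softmax([2.0, 1.0, 0.1])
modified = softmax([1.5, 1.2, 0.3])

same = kl_divergence(original, original)     # identical distributions
drift = kl_divergence(original, modified)    # positive when they differ
print(same, drift)
```

A value of zero means the two distributions are identical, and larger values mean the modified predictions have moved further from the original ones.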

3. Research Features

Includes interpretability tools like residual vector visualization and PaCMAP projections for analyzing model internals and ablation effects.


pip install -U heretic-llm
heretic Qwen/Qwen3-4B-Instruct-2507

v1.2.0

  • @noctrex added a `max_memory` setting to limit memory usage in
  • @spikymoth added a mechanism to avoid excessive low-divergence iteration in
  • @accemlcc implemented a new LoRA-based abliteration engine with support for 4-bit quantization in
  • @accemlcc added enumeration of all available GPUs on startup in
  • @Vinayyyy7 added the ability to run more trials after optimization is complete in
v1.1.0

  • @mbarnson added basic MPS (Apple Silicon) support in
  • @red40maxxer reduced memory usage in
  • @Ooooze added IBM Granite MoE support in
  • @kldzj added multi-GPU support in and
  • @ricyoung fixed an error when Hugging Face user profile fields are missing in
