Heretic: Automatic language model censorship removal

Tool that removes safety alignment from transformer language models using directional ablation without post-training.

Rankings: #7 overall, #6 in AI & ML
Stars: 17.3K (+1.3K over 7 days)
Forks: 1.7K (+108 over 7 days)

Learn more about Heretic

Heretic is a tool that removes censorship (safety alignment) from transformer-based language models without expensive post-training. It combines directional ablation (abliteration) with a parameter optimizer based on the Tree-structured Parzen Estimator (TPE) and powered by Optuna, which searches for good abliteration parameters automatically. The optimizer co-minimizes the refusal rate and the KL divergence from the original model, so that refusals are removed while as much of the original model's capability as possible is preserved. Heretic supports most dense models, including multimodal architectures, and several mixture-of-experts (MoE) variants.
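The linear algebra behind directional ablation can be illustrated with a small sketch. It assumes a direction vector is already given and removes the component along that direction from everything a weight matrix writes to its output, i.e. W' = W - scale * r r^T W for a unit vector r. The function names and the `scale` parameter are hypothetical and chosen for illustration; this is not Heretic's actual code or parameterization.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def ablate_direction(weight, direction, scale=1.0):
    """Return W' = W - scale * r r^T W, where r is the normalized direction.

    weight is a list of rows with shape (d_out, d_in); direction has length
    d_out. With scale=1.0 the outputs of W' have no component along r.
    `scale` is a hypothetical knob, standing in for whatever parameters an
    optimizer would tune.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: the component along r of each column of W.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - scale * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for row in ablate_direction(W, [1.0, 1.0, 0.0]):
        print(row)
```

Because the operation is a direct edit of existing weights, it needs no gradient updates, which is consistent with the description above of avoiding post-training.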

Heretic

1. Fully Automatic

Requires no manual parameter tuning and no understanding of transformer internals. TPE optimization searches for good abliteration parameters automatically.

2. Intelligence Preservation

Minimizes KL divergence from the original model jointly with the refusal rate, with the aim of preserving model capabilities better than manually tuned abliteration.

3. Research Features

Includes interpretability tools, such as residual vector visualization and PaCMAP projections, for analyzing model internals and the effects of ablation.
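The two quantities that are co-minimized can be made concrete with a short sketch. It computes the KL divergence between two next-token distributions and then filters candidate trials down to those that are not dominated on both objectives (a Pareto front). The argument order of the divergence (original model first), the trial dictionary keys, and the sample numbers are assumptions made for illustration; the source does not specify how Heretic computes or combines these values.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same tokens.

    Assumption: p comes from the original model, q from the modified one.
    """
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0
    )


def pareto_front(trials):
    """Keep trials that no other trial beats on both refusals and KL.

    Each trial is a dict with hypothetical keys "refusals" and "kl";
    lower is better for both.
    """
    front = []
    for t in trials:
        dominated = any(
            o["refusals"] <= t["refusals"]
            and o["kl"] <= t["kl"]
            and (o["refusals"] < t["refusals"] or o["kl"] < t["kl"])
            for o in trials
        )
        if not dominated:
            front.append(t)
    return front


if __name__ == "__main__":
    # Made-up numbers, purely to exercise the functions.
    trials = [
        {"refusals": 10, "kl": 0.05},
        {"refusals": 3, "kl": 0.20},
        {"refusals": 12, "kl": 0.30},
    ]
    print(pareto_front(trials))
    print(kl_divergence([0.9, 0.1], [0.5, 0.5]))
```

A multi-objective search typically leaves several such non-dominated candidates, each trading a lower refusal count against a larger divergence from the original model.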


pip install -U heretic-llm
heretic Qwen/Qwen3-4B-Instruct-2507

v1.2.0

  • @noctrex added a `max_memory` setting to limit memory usage in
  • @spikymoth added a mechanism to avoid excessive low-divergence iteration in
  • @accemlcc implemented a new LoRA-based abliteration engine with support for 4-bit quantization in
  • @accemlcc added enumeration of all available GPUs on startup in
  • @Vinayyyy7 added the ability to run more trials after optimization is complete in

v1.1.0

  • @mbarnson added basic MPS (Apple Silicon) support in
  • @red40maxxer reduced memory usage in
  • @Ooooze added IBM Granite MoE support in
  • @kldzj added multi-GPU support in and
  • @ricyoung fixed an error when Hugging Face user profile fields are missing in
