Heretic: A Python Tool That Automatically Strips Safety Guardrails from Any Language Model

Heretic uses directional ablation to remove alignment training from transformer models fully automatically. Over 4,000 community-created models on HuggingFace so far, with lower quality loss than manual approaches.
artificial-intelligence
Author

Kabui, Charles

Published

2026-06-06

Keywords

heretic, abliteration, language-model-safety, alignment-removal, open-source-ai