The abliteration script ([link](https://github.com/IlyaGusev/saiga/blob/main/scripts/abliterate.py)) is based on code from the blog post and heavily uses [TransformerLens](https://github.com/TransformerLensOrg/TransformerLens). The only major difference from the code used for Llama is [scaling the embedding layer back](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/pretrained/weight_conversions/gemma.py#L13).
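
As a rough illustration of that difference (a minimal sketch, not the exact code from the script; `unscale_gemma_embedding` is an illustrative name): TransformerLens folds Gemma's `sqrt(d_model)` embedding scaling into `W_E` when loading the model, so the edited embedding has to be divided by `sqrt(d_model)` before it is written back to a Hugging Face checkpoint.

```python
import torch

def unscale_gemma_embedding(w_e: torch.Tensor, d_model: int) -> torch.Tensor:
    # TransformerLens multiplies Gemma's embedding matrix by sqrt(d_model)
    # during weight conversion; divide it back out before exporting the
    # edited weights to a Hugging Face checkpoint.
    return w_e / (d_model ** 0.5)
```
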
Orthogonalization **did not** produce the same results as regular interventions, since RMSNorm layers are applied before activations are merged into the residual stream. However, the final model still seems to be uncensored.
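
For context, this is the kind of projection involved; a minimal sketch of weight orthogonalization in the spirit of the blog post, not the exact code from the script (`orthogonalize` is an illustrative name):

```python
import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    # Project the refusal direction out of a matrix that writes to the
    # residual stream (e.g. W_O or W_out in TransformerLens, both of shape
    # [..., d_model]): subtract each output vector's component along it.
    direction = direction / direction.norm()
    return weight - (weight @ direction).unsqueeze(-1) * direction
```

Because Gemma applies RMSNorm (an RMS rescale plus a learned element-wise gain) between these matrices and the residual stream, the post-norm output can regain a component along the ablated direction, so editing the weights is not exactly equivalent to ablating the activations at runtime.
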
## Examples: