	---
	datasets:
	- glaiveai/reasoning-v1-20m
	language:
	- en
	base_model:
	- facebook/galactica-1.3b
	tags:
	- reasoning
	- text-generation-inference
	- medical
	---
	## What is Galactic Reasoning?

	The Galactic Reasoning adapters are a collection of LoRA adapters trained for the various sizes of the facebook/galactica models. These LoRAs enable the OPT-based Galactica models to reason inside `<think>` tags before answering, inspired by more recent models such as DeepSeek and OpenAI's o3.
	To achieve this, the [glaiveai/reasoning-v1-20m](https://huggingface.co/datasets/glaiveai/reasoning-v1-20m) dataset was used for both training and evaluation.

	| Size | Parameters | Galactic Reasoning Adapter |
	|:-----------:|:-----------:|:--------------------------:|
	| `mini` | 125 M | Too few neurons to reason |
	| `base` | 1.3 B | You are here :) |
	| `standard` | 6.7 B | Coming Soon™ |
	| `large` | 30 B | Coming Soon™ |
	| `huge` | 120 B | Short of a GPU grant, unlikely to happen. |

	## How were these adapters developed?
	In addition to the adapters, I will soon be releasing the training script I used on GitHub. The script fine-tunes a specified base model on a specified dataset for any number of steps, with a range of optional quantization settings.
	The GitHub training repo will also include a batch file that reproduces the exact arguments and seed passed to the script to create this adapter.

	## How do I prompt this galactic thinker?
	A proper inference script will be provided eventually™ but for the time being, refer to the following code snippet.

	```python
	import torch
	from peft import PeftModel
	from transformers import AutoTokenizer, OPTForCausalLM

	# Raw string, so the Windows backslashes are not read as escape sequences.
	ADAPTER_PATH = r"C:\Users\TitleOS\Downloads\GalacticReasoning-1.3b"  # Change to point to your downloaded adapter of course.
	BASE_MODEL_NAME = "facebook/galactica-1.3b"  # Use the right adapter for the right sized Galactica.

	new_special_tokens = ["<think>", "</think>"]
	new_pad_token = "<PAD>"
	model = OPTForCausalLM.from_pretrained(
	    BASE_MODEL_NAME,
	    load_in_8bit=False,
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_NAME)
	print(f"Original vocab size: {len(tokenizer)}")

	# Add the special tokens and the new pad token
	special_tokens_dict = {'additional_special_tokens': new_special_tokens, 'pad_token': new_pad_token}
	num_added_toks = tokenizer.add_special_tokens(special_tokens_dict)

	print(f"Number of tokens added: {num_added_toks}")
	print(f"New vocab size: {len(tokenizer)}")

	# Resize the model's token embeddings to match the new tokenizer
	model.resize_token_embeddings(len(tokenizer))
	print("Resized model's token embeddings to match the new tokenizer. This is critical for the model to recognize the thinking tokens and the new pad token.")

	print(f"New embed_tokens shape: {model.get_input_embeddings().weight.shape}")
	print(f"New lm_head shape: {model.get_output_embeddings().weight.shape}")
	print("\nLoading adapter...")
	model.load_adapter(ADAPTER_PATH, adapter_name="default", device_map="auto")
	print("Adapter loaded successfully!")

	def evaluate(instruction):
	    # Tokenize the prompt and move it to the same device as the model.
	    inputs = tokenizer(instruction, return_tensors="pt")
	    input_ids = inputs["input_ids"].to(model.device)
	    generation_output = model.generate(
	        input_ids=input_ids,
	        return_dict_in_generate=True,
	        output_scores=True,
	        do_sample=True,
	        max_length=1024,
	        temperature=0.7,
	        top_k=50,
	        top_p=0.95,
	        eos_token_id=tokenizer.eos_token_id,
	        pad_token_id=tokenizer.pad_token_id,
	    )
	    s = generation_output.sequences[0]
	    # Keep special tokens so the <think>...</think> block stays visible.
	    return tokenizer.decode(s, skip_special_tokens=False)

	output = evaluate("Do androids dream of electric sheep?")
	print(output)
	```
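
Because the snippet above decodes with `skip_special_tokens=False`, the printed text still contains the `<think>` and `</think>` markers. If you only want the final answer, a small post-processing helper along these lines may be handy. This is a minimal sketch, not part of the adapter or any released script: `split_reasoning` and `THINK_PATTERN` are hypothetical names, and the default `eos_token="</s>"` is an assumption about the tokenizer's end-of-sequence string (pass `tokenizer.eos_token` to be sure).

```python
import re

# Matches the first <think>...</think> span. DOTALL lets the reasoning
# run across multiple lines.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(decoded, eos_token="</s>"):
    """Split decoded model output into (reasoning, answer).

    Hypothetical helper: if no complete <think>...</think> pair is found,
    the reasoning is empty and the whole text is treated as the answer.
    """
    match = THINK_PATTERN.search(decoded)
    reasoning = match.group(1).strip() if match else ""
    # Everything after the closing tag is treated as the answer.
    answer = decoded[match.end():] if match else decoded
    answer = answer.strip()
    # Drop a trailing end-of-sequence marker (assumed to be "</s>").
    if eos_token and answer.endswith(eos_token):
        answer = answer[: -len(eos_token)].strip()
    return reasoning, answer


if __name__ == "__main__":
    # Made-up sample text, not real model output.
    sample = "Q? <think>step one\nstep two</think> Final answer.</s>"
    reasoning, answer = split_reasoning(sample)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

With the snippet above you would call it as `split_reasoning(output, eos_token=tokenizer.eos_token)`. Note that if generation stops at `max_length` before the closing `</think>` is emitted, the helper returns the whole text as the answer, so check for an empty reasoning string if that matters to you.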

	## Credits
	* Credit to Meta/Facebook for the OPT-based Galactica models.
	* Credit to GlaiveAI for the reasoning-v1-20m dataset.
	* Finally, credit to my highly overworked Tesla M40, which ran for days straight to produce this.