Official repository for the paper "Knowledge Acquisition through Continued Pretraining is Difficult: A Case Study on r/AskHistorians"
The AskHistorians Knowledge Filling dataset can be found on Hugging Face.
The training commands are in the training_commands directory and can be executed in a Docker container. Example:
sh run_in_docker.sh training_commands/train_askhist_zephyr_sft_dpo_bf16.sh
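To see which training scripts are available before launching one, a small helper along these lines could be used. It is a sketch and not part of the repository: the function name `print_training_commands` is hypothetical, and it assumes that every script in training_commands ends in `.sh` and that run_in_docker.sh takes the script path as its only argument, as in the example above. It only prints the commands and runs nothing.

```shell
# Hypothetical helper (not part of the repository).
# Prints one run_in_docker.sh invocation per .sh script found in the
# given directory (default: training_commands). Nothing is executed.
print_training_commands() {
    dir="${1:-training_commands}"
    for script in "$dir"/*.sh; do
        # Skip the unexpanded glob when the directory has no .sh files.
        [ -e "$script" ] || continue
        printf 'sh run_in_docker.sh %s\n' "$script"
    done
}
```

After sourcing the function in a shell at the repository root, calling `print_training_commands` lists the available invocations, and any printed line can be copied and run as is.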