Coming soon!
This repository accompanies the following preprint:
From Language Models over Tokens to Language Models over Characters. Tim Vieira, Ben LeBrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O'Donnell, Ryan Cotterell. 2024.
@misc{vieira2024languagemodelstokenslanguage,
  title         = {From Language Models over Tokens to Language Models over Characters},
  author        = {Tim Vieira and Ben LeBrun and Mario Giulianelli and Juan Luis Gastaldi and Brian DuSell and John Terilla and Timothy J. O'Donnell and Ryan Cotterell},
  year          = {2024},
  eprint        = {2412.03719},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2412.03719},
}