The overall goal of XymosTeX is to be a complete implementation of TeX. What does that mean? According to Knuth, a requirement is that it correctly translates trip.tex, a "torture test" for TeX. So, I will use that as my criterion for being correct and complete.
The other goal, which is orthogonal to the completeness or correctness of the implementation, is the main reason that I'm going to the trouble of reimplementing TeX at all. Instead of trying to add functionality to the rendering side of TeX, my goal is to add a more reasonable debugging/tracing facility for TeX. The original reason for attempting the wild goal of reimplementing TeX was as an aid for implementing features in KaTeX, the web LaTeX math renderer. With plain TeX, it is very difficult to decipher what many of the more complicated LaTeX math macros expand to (\begin{align} and friends being the most useful). With a reasonable debugging/tracing interface, it would be much easier to understand how these complicated macros work.
An auxiliary hope is that I will understand how core TeX works much better. To that end, I am going to try to make a complete implementation of TeX without looking at the original source of TeX and will only consult the TeXbook and look at the output of TeX. I'm worried that letting myself look at the source will encourage me to simply copy what is there instead of deeply understanding what is happening at its core. Maybe that is foolish, but it is interesting.
My plan is to implement features in stages, with each stage having a specific goal.
Status: Done! Difficulty: Easy Condition for success: correctly interpreting a series of macros, assignments, and conditionals that produce an output of prime numbers (this is a simplified version of the same function found in the TeXbook)
The goal of this first stage is to get some of the core parsing and lexing working. A large part of this stage will be ensuring that assignment, expansion, and conditionals work correctly.
Understanding and implementing the concepts in this stage is actually fairly difficult, but I have already gotten this working in my incomplete JavaScript implementation of TeX, so I can simply use a similar implementation here. Most of the problems here will be around translating JavaScript concepts into Rust ones.
Status: Done! Difficulty: Medium Condition for success: correctly evaluating and printing metrics for boxes and building horizontal boxes from commands that build horizontal boxes at different widths with glue
Instead of producing simple textual output as a result of the parsing, in this stage I will begin producing TeX boxes. I'll need to begin parsing character metrics for the individual characters to get the sizes for individual characters, and start allowing glue inside of horizontal boxes. I'll need to add box registers, and I'll need to allow setting the glue in a box, allow for reading metrics about the boxes, and allow for nesting boxes inside of other boxes.
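As a rough illustration of what "setting the glue in a box" involves, here is a minimal sketch of measuring a horizontal list and computing a glue set ratio for a box built to a target width. The names `Glue`, `HItem`, `natural_width`, and `glue_set_ratio` are hypothetical and are not taken from the XymosTeX source. The sketch assumes only finite stretch and shrink, ignoring the fil/fill/filll orders and overfull boxes.

```rust
/// Glue with a natural width plus finite stretch and shrink (in points).
/// Hypothetical type for illustration; infinite glue orders are left out.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Glue {
    pub natural: f64,
    pub stretch: f64,
    pub shrink: f64,
}

/// An item in a simplified horizontal list.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HItem {
    /// A character or nested box, reduced to just its width.
    Fixed(f64),
    Glue(Glue),
}

/// Natural width of the list: fixed widths plus the natural width of each glue.
pub fn natural_width(items: &[HItem]) -> f64 {
    items
        .iter()
        .map(|item| match item {
            HItem::Fixed(w) => *w,
            HItem::Glue(g) => g.natural,
        })
        .sum()
}

/// Glue set ratio for a box built to `target` width.
/// Positive means the glue stretches, negative means it shrinks, and
/// `None` means the list has no glue of the kind needed to reach the target.
pub fn glue_set_ratio(items: &[HItem], target: f64) -> Option<f64> {
    let excess = target - natural_width(items);
    if excess == 0.0 {
        return Some(0.0);
    }
    // Total stretchability and shrinkability of all glue in the list.
    let (stretch, shrink) = items
        .iter()
        .fold((0.0_f64, 0.0_f64), |(st, sh), item| match item {
            HItem::Glue(g) => (st + g.stretch, sh + g.shrink),
            HItem::Fixed(_) => (st, sh),
        });
    let available = if excess > 0.0 { stretch } else { shrink };
    if available > 0.0 {
        Some(excess / available)
    } else {
        None
    }
}
```

For a list of two 5pt boxes separated by glue of 3pt plus 2pt minus 1pt, the natural width is 13pt, and asking for a 15pt box gives a ratio of 1.0 (the glue stretches by its full stretchability).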
Status: Done! Difficulty: Medium Condition for success: correctly parsing, building, and measuring the vertical boxes from commands that build vertical boxes and enter and leave vertical and horizontal mode using different techniques
At this point, I'll be able to start parsing from (internal) vertical mode, have that correctly call out to a (restricted) horizontal mode, and then return to vertical mode to produce vertical boxes. This will add vertical glue as well.
Status: Done! Difficulty: Medium Condition for success: generate a DVI from a series of commands creating vertical and horizontal boxes with spacing and characters in them that is content-identical to the DVI produced by TeX run on the same commands
For this stage, I shouldn't have to work on the parser, because the current parser should be able to interpret the example file into boxes already. Instead, I'll be taking those generated boxes and turning them into a proper DVI file that represents the contents of the boxes.
Because the DVI file format isn't strict about how certain commands are used (such as the order of certain commands, the contents of comments, and which variables are used), and because there are many different ways to produce the same output from a given box, I won't be able to simply byte-compare the DVI files. I'll have to figure out a good way to check that the DVI I produce has the same content as the one that TeX produces.
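One possible way to compare DVI files by content rather than by bytes is to interpret both command streams and compare where each character ends up on the page. The sketch below does this for a heavily simplified command set. The `Op` enum, `Placed`, `placed_glyphs`, and `content_identical` are hypothetical names, not the actual XymosTeX approach. It also assumes that each character's width is carried inline (a real DVI interpreter looks widths up in the font metrics) and that the stack only saves the h and v positions.

```rust
/// A simplified subset of DVI-like commands (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    /// Typeset a character at (h, v), then advance h by its width.
    /// Assumption: the width is carried inline instead of coming from a font.
    Set { ch: u8, width: i32 },
    /// Move right by the given amount.
    Right(i32),
    /// Move down by the given amount.
    Down(i32),
    /// Save the current position.
    Push,
    /// Restore the most recently saved position.
    Pop,
}

/// A character together with the position it was typeset at.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Placed {
    pub h: i32,
    pub v: i32,
    pub ch: u8,
}

/// Interpret the commands and return the placed characters in sorted order,
/// so that the order in which they were emitted does not matter.
pub fn placed_glyphs(ops: &[Op]) -> Vec<Placed> {
    let (mut h, mut v) = (0_i32, 0_i32);
    let mut stack: Vec<(i32, i32)> = Vec::new();
    let mut out = Vec::new();
    for op in ops {
        match *op {
            Op::Set { ch, width } => {
                out.push(Placed { h, v, ch });
                h += width;
            }
            Op::Right(d) => h += d,
            Op::Down(d) => v += d,
            Op::Push => stack.push((h, v)),
            Op::Pop => {
                // An unmatched pop is ignored in this sketch.
                if let Some((saved_h, saved_v)) = stack.pop() {
                    h = saved_h;
                    v = saved_v;
                }
            }
        }
    }
    out.sort();
    out
}

/// Two command streams count as content-identical here if they place the
/// same characters at the same positions.
pub fn content_identical(a: &[Op], b: &[Op]) -> bool {
    placed_glyphs(a) == placed_glyphs(b)
}
```

Under this definition, a stream that sets "a", moves right, and sets "b" is identical to one that uses push/pop to set "b" first and "a" second, as long as both characters land at the same coordinates.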
Status: Done! Difficulty: Medium Condition for success: generate a DVI from a TeX file with basic and complicated math expressions that is content-identical to the DVI produced by TeX run on the same file.
For this stage, I will be working on both parsing and interpreting basic math expressions. For parsing, I will need to handle parsing the math expressions into a math list, which has some oddities like \atop and ^/_ which will hopefully ensure that the parsing works. Then, I will need to run the routine for turning a math list into a horizontal list. Once that is done, I will be able to use the existing horizontal list infrastructure to fully typeset and output the math into a DVI file.
Status: In progress Difficulty: Medium Condition for success: generate a DVI from a TeX file with basic paragraphs to be rendered that is content-identical to the DVI produced by TeX run on the same file (with very minor differences).
For this stage, I will be implementing the TeX line-breaking algorithm, which splits horizontal lists into lines of a given width. In addition to the main algorithm, I will also be implementing some of the extra features of the line-breaking algorithm, such as discretionary breaks and configurable tolerance and spaceskip. Also, I'm going to implement active characters so I can faithfully implement ties (~).
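To give a feel for how tolerance enters the line-breaking algorithm, here is a minimal sketch of the badness measure for a stretched line, using the TeXbook's description of badness as approximately 100 times the cube of the glue ratio, capped at 10000. The function names `approx_badness` and `is_feasible` are hypothetical. TeX itself uses an integer approximation that can differ slightly from this floating-point version, and the shrinking case (where a ratio above 1 makes the line overfull) is not modeled.

```rust
/// TeX treats a badness of 10000 as "infinitely bad".
pub const INF_BAD: u32 = 10_000;

/// Approximate badness of a line whose glue must stretch by `excess`
/// when the total stretchability on the line is `available`.
/// Assumption: this is the "100 * ratio^3" approximation, not TeX's exact
/// integer routine.
pub fn approx_badness(excess: f64, available: f64) -> u32 {
    if excess <= 0.0 {
        // Nothing needs to stretch, so the line is a perfect fit.
        return 0;
    }
    if available <= 0.0 {
        // Stretching is required but there is no stretchable glue.
        return INF_BAD;
    }
    let ratio = excess / available;
    let badness = (100.0 * ratio.powi(3)).round();
    if badness >= INF_BAD as f64 {
        INF_BAD
    } else {
        badness as u32
    }
}

/// A candidate line is feasible if its badness does not exceed the tolerance.
pub fn is_feasible(badness: u32, tolerance: u32) -> bool {
    badness <= tolerance
}
```

With a tolerance of 200, a line whose glue stretches by exactly its stretchability (badness 100) is feasible, while one that must stretch by twice its stretchability (badness 800) is not.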
Status: Not yet started Difficulty: Maybe Hard Condition for success: ???
Status: Not yet started Difficulty: ??? (Probably hard) Condition for success: ???
Status: Not yet started Difficulty: Hard Condition for success: Correctly interpreting trip.tex according to the manual.
Unaccounted for:
- Error recovery
- Alignment
- Headings & Footers
- \edef, \outer\def, \long\def
- \input
- \csname, \string
- Hyphenation