Releases: jeffpar/pcjs.v1
Another Fix for Microsoft Edge
I was hoping that Microsoft Edge would prove to be less of a headache than Internet Explorer, and as trouble-free as Safari, Firefox and Chrome usually are. Unfortunately, that's not been the case.
When Edge was first released, I had to release an update that removed the <DOCTYPE> tag from my XSL files, so that its XSLT processor wouldn't barf (some vestigial ActiveX behavior). And now a recent Edge update has caused another problem, this time involving self-closing tags.
Even though my XSL files do not contain any self-closing <div> or <select> tags, the XSLT processor in Edge now converts them into self-closing tags if they're empty. This is a problem when it comes time to render those tags, because the Edge browser doesn't like them; <div></div> works, but <div/> does not.
In the process of investigating this issue, I learned that I should have included <xsl:output method="html"/> in my XSL files. That informs the XSLT processor that the output will be used as HTML, and therefore it should avoid converting empty tags to self-closing tags. So, technically, the problem was my fault, but it sure would be nice if everyone used the same defaults.
8087 Coprocessor Support
The biggest new feature of this release is support for the 8087 Floating-Point Coprocessor (FPU). The feature is disabled by default. To enable it, you must explicitly add an FPU element to a machine's XML configuration file; eg:
<fpu id="fpu8087" model="8087"/>
Groundwork has also been laid for models "80287" and "80387", but support for those coprocessors is not yet complete. An FPU must be paired with a supported CPU:
- FPU model "8087" may only be paired with CPU models "8086" or "8088"
- FPU model "80287" may only be paired with CPU models "80286" or "80386"
- FPU model "80387" may only be paired with CPU model "80386"
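The pairing rules above amount to a small lookup table. The following is a minimal sketch of how such a check might look; the table and function names are hypothetical and are not taken from the PCjs sources.

```javascript
// Hypothetical lookup table built from the pairing rules listed above;
// these names are illustrative only, not PCjs identifiers.
const FPU_PAIRINGS = {
    "8087":  ["8086", "8088"],
    "80287": ["80286", "80386"],
    "80387": ["80386"]
};

/**
 * Returns true if the given FPU model may be paired with the given CPU model.
 * Unknown FPU models are rejected.
 */
function isValidPairing(fpuModel, cpuModel) {
    const cpus = FPU_PAIRINGS[String(fpuModel)];
    return Array.isArray(cpus) && cpus.includes(String(cpuModel));
}

console.log(isValidPairing("8087", "8088"));
console.log(isValidPairing("80387", "80286"));
```

A machine loader could run a check like this before instantiating the FPU and report a configuration error for an unsupported combination.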
Only limited testing has been done so far, so issues are to be expected.
The FPU doesn't perform its own cycle counting yet, so every FPU operation completes in 3 CPU cycles. In a future release, the FPU will start maintaining its own cycle count, so that an FWAIT instruction can "charge" the CPU a more realistic number of cycles. Like the CPU, the FPU will never be cycle-perfect -- I'm not that obsessive -- but I do want it to execute at roughly the same speed as a similarly clocked FPU. This means that the FPU element will also support a "cycles" attribute, defaulting to the CPU's value if none is specified.
It should also be noted that the precision of our FPU is slightly less than that of a real FPU. The latter uses an internal 80-bit "temporary real" format, whereas JavaScript provides only 64-bit "long real" support. In practice, the difference is probably insignificant for most applications, because values are frequently loaded in and out of the FPU as 64-bit numbers, but it does mean that test results from emulated and real FPUs will not always match down to the final bit.
The 16-bit difference means that the 80-bit values have 12 more bits of precision in the fraction and 4 more bits in the exponent; however, the effective increase in fractional precision is only 11 bits, since the most significant bit is always set unless/until the number becomes either an unnormal or denormal.
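The bit counts above can be checked with a little arithmetic. This sketch only illustrates the two formats' field widths and shows one value that a JavaScript number cannot hold exactly; the object and function names are made up for the example.

```javascript
// Field widths of the two formats. "storedSignificandBits" counts the bits
// physically present in the encoding; the 80-bit format stores its leading
// integer bit explicitly, while the 64-bit format implies it.
const LONG_REAL = { exponentBits: 11, storedSignificandBits: 52, explicitIntegerBit: false };
const TEMP_REAL = { exponentBits: 15, storedSignificandBits: 64, explicitIntegerBit: true };

// Effective precision counts the leading 1 whether it is stored or implied.
function effectivePrecision(fmt) {
    return fmt.explicitIntegerBit ? fmt.storedSignificandBits : fmt.storedSignificandBits + 1;
}

const storedGain = TEMP_REAL.storedSignificandBits - LONG_REAL.storedSignificandBits;
const effectiveGain = effectivePrecision(TEMP_REAL) - effectivePrecision(LONG_REAL);
const exponentGain = TEMP_REAL.exponentBits - LONG_REAL.exponentBits;
console.log(storedGain, effectiveGain, exponentGain);

// 2^53 + 1 needs 54 significant bits: it fits in a 64-bit significand but
// not in a JavaScript number, so converting it to a number loses the low bit.
const exact = 2n ** 53n + 1n;
const lostBit = BigInt(Number(exact)) !== exact;
console.log(lostBit);
```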
While it would certainly be possible to simulate 80-bit operations and theoretically provide bit-perfect results, by essentially writing our own 80-bit floating-point emulation library in JavaScript, that's a huge undertaking best left for another day. Another option might be to port an old floating-point emulation library to JavaScript, but such libraries usually suffered from similar (or worse) precision limitations.
The bottom line is that small differences in precision should not come as a surprise to any x86 application, since most applications had to be able to run on machines with or without a coprocessor, and so they had to include some sort of emulation fallback, which in turn came with its own set of compromises.
As usual, a number of bug fixes are included in this release as well:
- Improved mapping of touch events to simulated mouse events
- Eliminated some video glitches during mode changes and buffer updates
- Fixed SGDT/SIDT behavior on the 80386
- Fixed SPACEBAR handling on iOS, which some recent iOS update had broken
Steppings, Old Instructions, and Errata
PCjs now supports a stepping attribute on the <cpu> element, which you can use to simulate specific stepping behavior. For example, a machine.xml file with the following CPU definition:
<cpu id="cpu386" model="80386" stepping="b0"/>
will include support for the short-lived XBTS and IBTS instructions. Whereas this machine:
<cpu id="cpu386" model="80386" stepping="b1"/>
will not contain those instructions; they will trigger a #UD exception. That machine will, however, enable selected B1 errata, including the infamous 80386 32-bit multiplication errors. And this machine:
<cpu id="cpu386" model="80386" stepping="b2"/>
will eliminate the 32-bit multiplication errors but still include all previous B stepping errata. B2 was not an actual 80386 stepping; it is a pseudo-stepping that provides a simple way of specifying a "double sigma" B1 stepping (a B1 part that passes all 32-bit multiplication tests).
PCjs stepping support is extremely limited at this point. Here's a summary:
- 80386 steppings A0-B0 provide limited support for the XBTS and IBTS instructions
- 80386 steppings A0-B1 enable Errata #7 for STOSB (as tested by Windows 95)
- 80386 stepping B1 enables 32-bit multiplication errors (as tested by Windows 95)
- 80386 stepping B2 includes all supported B1 errata, but without 32-bit multiplication errors
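The summary above can be read as a small feature table keyed by stepping. The sketch below is one way to express it; the stepping list contains only the steppings named in the text, and the function and property names are assumptions for illustration, not PCjs identifiers. It also assumes that a missing or unrecognized stepping enables none of these behaviors.

```javascript
// Steppings named in the summary above, oldest first (assumed ordering).
const STEPPINGS = ["a0", "b0", "b1", "b2"];

/**
 * Returns the stepping-specific behaviors for an 80386 stepping string.
 */
function steppingFeatures(stepping) {
    const s = String(stepping || "").toLowerCase();
    const i = STEPPINGS.indexOf(s);
    const known = i >= 0;
    return {
        // XBTS/IBTS exist only on A0 through B0
        xbtsIbts: known && i <= STEPPINGS.indexOf("b0"),
        // Errata #7 (STOSB) applies to A0 through B1, and B2 inherits B1 errata,
        // so it applies to every stepping in this list
        stosbErratum: known,
        // 32-bit multiplication errors are specific to B1
        mul32Errors: s === "b1"
    };
}

console.log(steppingFeatures("b0"));
console.log(steppingFeatures("b2"));
```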
In addition, on 80386 reset, we set the CPU revision number in DX to the appropriate value for the specified stepping.
Support for additional 80286 and 80386 errata may be added over time, as interesting scenarios or test cases are discovered.
Aside from limited stepping support, this release also includes limited support for the XBTS and IBTS instructions, which existed only on 80386 A0-B0 steppings. Limited support means that those instructions function well enough to satisfy any code that merely tests for their existence. Because those instructions were short-lived, it seems unlikely we will find much if any code today that actually used them. In fact, it's difficult to even find a description of those instructions' precise operation.
Note that none of the machine configurations on pcjs.org currently support those instructions, because such a machine must explicitly include an 80386 stepping attribute from "A0" to "B0", which none of the machines in the project currently include.
Last but not least, a few lingering multiplication and division bugs were squashed (mostly 32-bit, but also one 16-bit unsigned division bug). While I can't claim that all 8-bit, 16-bit and 32-bit signed and unsigned multiplication and division operations are perfect now, I'm unaware of any more bugs in that general area.
Creating MS-DOS Prompts in Windows 95
A quick follow-up to v1.19.8, this release addresses two glaring problems creating "MS-DOS Prompt" windows in Windows 95:
- Starting a new "MS-DOS Prompt" would often result in a crash
- When creating the "MS-DOS Prompt" window, the desktop screen would flicker
The first problem was yet another instruction restartability issue, this time involving the "POP [mem]" instruction, where [mem] was a location inside a not-present page in the newly allocated DOS Virtual Machine (VM): ESP wasn't being snapshotted prior to the POP, so when the instruction restarted, ESP was pointing to the wrong part of the stack.
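To make the restartability problem concrete, here is a toy model of a "POP [mem]" whose store hits a not-present page. Everything in it (the PageFault class, the machine object, the step function, and the use of a thrown exception to signal the fault) is a hypothetical sketch and not PCjs code; the point is only that ESP must be rolled back to its pre-instruction value before the instruction is retried.

```javascript
// Hypothetical exception type used to signal a not-present page.
class PageFault extends Error {
    constructor(addr) {
        super("page fault at 0x" + addr.toString(16));
        this.addr = addr;
    }
}

// Throws if the 4K page containing addr is not present.
function checkPresent(m, addr) {
    if (!m.present.has(addr >>> 12)) throw new PageFault(addr);
}
function read32(m, addr) { checkPresent(m, addr); return m.mem.get(addr) || 0; }
function write32(m, addr, value) { checkPresent(m, addr); m.mem.set(addr, value >>> 0); }

// POP [mem]: pop a dword off the stack, then store it at destAddr.
function popMem(m, destAddr) {
    const value = read32(m, m.esp);
    m.esp += 4;                   // ESP is updated before the store...
    write32(m, destAddr, value);  // ...and the store may fault
}

// Run one instruction. On a page fault, restore ESP from the snapshot so
// that the instruction can be restarted cleanly.
function step(m, instruction) {
    const espSnapshot = m.esp;
    try {
        instruction(m);
        return true;
    } catch (e) {
        if (!(e instanceof PageFault)) throw e;
        m.esp = espSnapshot;
        m.present.add(e.addr >>> 12);  // stand-in for the guest's page fault handler
        return false;                  // caller should restart the instruction
    }
}

// Stack lives in a present page (0x1000); the destination page (0x2000) does not exist yet.
const m = { esp: 0x1ffc, mem: new Map([[0x1ffc, 0x12345678]]), present: new Set([0x1]) };
const pop = (mach) => popMem(mach, 0x2000);
const firstTry = step(m, pop);
const espAfterFault = m.esp;
const secondTry = step(m, pop);
console.log(firstTry, espAfterFault.toString(16), secondTry, m.esp.toString(16));
```

Without the snapshot, the second attempt would pop from ESP + 4 and store the wrong value.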
The second problem was caused by the way the Windows 95 Virtual Display Driver (VDD) reprograms the video card when initializing a new Virtual Machine (VM) state. For reasons not entirely clear, the VDD does more than simply initialize the virtual video card state of the new VM; it also briefly reprograms the physical video card state.
Although the VDD doesn't do anything to change the appearance of the desktop -- that responsibility belongs solely to the Windows display driver, not the VDD -- it does briefly put the video card into an inconsistent state. The PCjs Video component now detects this condition and ignores the VDD state change, preserving the appearance of the Windows 95 desktop.
I'm not really blaming the VDD for this problem, because the PCjs Video component was definitely misinterpreting the state change as a mode change, which it wasn't. However, the VDD does seem to be doing something that I would not have expected. It could be a bug in the VDD, which simply went unnoticed because it didn't change the appearance of the screen, or it could be something else. The VDD definitely needs to modify the physical video state whenever a VM is switched full-screen, but that's not what's happening here.
Assorted 32-bit Fixes
The list of fixes in this release includes:
1. Make 32-bit multiply results more x86-compatible when overflow occurs
2. Use normal JavaScript multiplication when both multiplicands are <= 16 bits
3. Be clearer about which exceptions are "traps" and which are "faults"
4. Treat Divide Error (#DE) exceptions as "faults" (must point at the DIV instruction)
5. Treat data breakpoint (#DB) exceptions as "traps" (generated after trapped instruction)
6. Fix IDT vector 0 dispatching, which was misidentified as a cached segment probe
7. Make CALLF restartable whenever pushing triggers a segment or page fault
8. Fix V86-mode interrupt frames (frame size is always 32-bit)
Fix #7 eliminated a common crash in the Windows 95 Explorer, where a CALLF instruction with the stack pointing at a not-present page might not be restartable.
Fix #8 makes WDEB386 work more reliably, especially when stepping through V86-mode code. If you prefer to use WDEB386 instead of the built-in PCjs Debugger, press F8 immediately after starting the Windows 95 Test Machine, choose boot menu option 5, and then type "WDEB386 /C:2 /V C:\WINDOWS\WIN.COM".
COM2 I/O is connected to the same <textarea> window that the PCjs Debugger uses; make sure that window has focus before pressing CTRL-C to wake WDEB386.
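The first two multiplication fixes in the list touch on a JavaScript limitation: numbers are 64-bit doubles, so a product is exact only below 2^53, and a full 32-bit by 32-bit multiply can need 64 bits. One common workaround, shown here as a sketch and not necessarily how PCjs does it, is to split each operand into 16-bit halves and combine the partial products; when both operands fit in 16 bits, ordinary multiplication is already exact. The function name is hypothetical.

```javascript
/**
 * Unsigned 32x32 -> 64-bit multiply, returned as two unsigned 32-bit halves.
 */
function mul32(a, b) {
    a >>>= 0;
    b >>>= 0;
    if (a <= 0xffff && b <= 0xffff) {
        // The product fits in 32 bits, so plain multiplication is exact.
        return { lo: (a * b) >>> 0, hi: 0 };
    }
    const aLo = a & 0xffff, aHi = a >>> 16;
    const bLo = b & 0xffff, bHi = b >>> 16;
    // Each partial product is below 2^32 and therefore exact.
    const ll = aLo * bLo;
    const lh = aLo * bHi;
    const hl = aHi * bLo;
    const hh = aHi * bHi;
    // Sum the pieces that land in bits 16..31, keeping the carry.
    const mid = (ll >>> 16) + (lh & 0xffff) + (hl & 0xffff);
    const lo = (((mid & 0xffff) << 16) | (ll & 0xffff)) >>> 0;
    const hi = (hh + (lh >>> 16) + (hl >>> 16) + (mid >>> 16)) >>> 0;
    return { lo, hi };
}

const r = mul32(0xffffffff, 0xffffffff);
console.log(r.hi.toString(16), r.lo.toString(16));
```

On x86, the high half corresponds to EDX and the low half to EAX after a 32-bit MUL.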
Work continues on 32-bit multiplication and division compatibility testing. I've created a simple DOS-compatible test suite named test386.com in the /tests/pc/80386 folder that runs through a series of arithmetic operations and dumps the results to COM2. Run the same tests on another machine, capture the output, and diff the results.
test386.com is built from test386.nasm by running make in the /tests/pc/80386 folder; you must have make and nasm installed on your machine. Also, since OS X continues to ship with an ancient version of nasm (v0.98.40), I've tried to make all my nasm source files compatible with that version.
test386.com can also be loaded as a pseudo ROM image using the command-line version of PCjs. In the /modules/pcjs/bin folder, run:
node pcjs --cmd="load test386.json"
test386.json is a JSON-style machine definition file that loads test386.com as a ROM image; the test machine is also configured to connect COM2 I/O to the console, making it easy to view and capture the test's output.
Windows 95 Desktop Release
This is the first release of PCjs capable of booting Windows 95 all the way to its desktop.
The most recent fix involved variations of three logical instructions (specifically, AND, OR and XOR) that sign-extend an immediate byte into a word. Those variations were failing to truncate the result when a 16-bit operand size was in effect, and if the destination was a register, the upper 16 bits of that register could become corrupted.
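The effect of the missing truncation is easy to reproduce in isolation. The sketch below uses hypothetical helper names and models only the register-destination case: with a 16-bit operand size, the result must be merged into the low word while the upper 16 bits of the register are left alone.

```javascript
// Sign-extend an 8-bit immediate to a 32-bit signed integer.
function signExtend8(imm8) {
    return (imm8 << 24) >> 24;
}

/**
 * Applies AND, OR or XOR with a sign-extended imm8 to a 32-bit register
 * value and returns the new register value (hypothetical helper).
 */
function logicalImm8(op, reg, imm8, operandSize) {
    const src = signExtend8(imm8 & 0xff);
    let result;
    switch (op) {
        case "AND": result = reg & src; break;
        case "OR":  result = reg | src; break;
        case "XOR": result = reg ^ src; break;
        default: throw new Error("unsupported op: " + op);
    }
    if (operandSize === 16) {
        // Truncate to 16 bits and preserve the upper half of the register.
        return ((reg & 0xffff0000) | (result & 0xffff)) >>> 0;
    }
    return result >>> 0;
}

const eax = 0x12340001;
const with16 = logicalImm8("OR", eax, 0x80, 16);
const with32 = logicalImm8("OR", eax, 0x80, 32);
console.log(with16.toString(16), with32.toString(16));
```

The 32-bit result shows what the register looks like when the sign-extended upper bits are allowed to leak through, which is the kind of corruption described above.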
The Windows 95 Test Machine hard disk has been updated with a complete set of Windows 95 files from a "Compact" installation, and first boot has finished, so instead of the initial "Getting ready to run Windows 95 for the first time..." splash screen, you'll see the normal Windows 95 startup screen.
The machine is still a bit finicky. It easily gets confused about the state of its shift keys if you switch away from the browser and then back again. And Explorer windows don't open in the correct view; for example, both My Computer and Recycle Bin open the same (incorrect) view. And if you close an Explorer window, reopen it, and click around on the menus, a crash will likely result.
The adventure continues.
Restartability of Stack Instructions
The most recent bug fix for the Windows 95 test machine involved a 32-bit CALL and a stack page that was swapped out: the CALL was decoded, the stack pointer was decremented, and then a page fault was generated when the stack was written. Unfortunately, the stack pointer was saved in its decremented state, so when the CALL was restarted, the stack pointer was no longer where it was supposed to be. Hopefully the CALL instruction, as well as every other stack-based instruction, is now fully restartable after any #GP, #SS, or #PF fault.
The PCjs Debugger also does a better job of mimicking WDEB386 now, and is able to associate DLL exports with the appropriate memory segments.
Windows 95 Initial Boot
PCjs v1.19.5 is now able to run Windows 95 from first boot up until the initial "Plug and Play" setup process.
More bugs remain, but more have also been squashed in this release, including:
- CALLF bug when source and destination segment sizes differed
- VGA ATC flip-flop was incorrectly toggled on read operations
- ESP was incorrectly updated when SS was loaded with a 16-bit segment
- POP [mem] operations incorrectly calculated the EA before the pop
And the biggest change so far: PCjs now throws real JavaScript exceptions whenever simulated CPU faults occur. I had resisted making that change for a long time, out of concerns over unanticipated side-effects and performance impacts, but with the addition of page faults, there were simply too many situations where unwinding from a fault was too painful to do any other way.
Windows 95 Debugger Support
"Windows 95 Debugger Support" refers to INT 0x68 and INT 0x41 monitoring that the PCjs Debugger now performs when it detects Windows 95 probing for a debugger (which would normally be WDEB386.EXE). This enables the PCjs Debugger to receive the same 16-bit and 32-bit segment load notifications that WDEB386 would normally receive, so that PCjs can provide some basic information about component memory usage.
PCjs commands like "ln" (for a given address) and "ks" (for the current stack) now include the names of the nearest Windows 95 32-bit code and data sections in the Virtual Machine Manager (VMM) and assorted VxDs. These commands are very rudimentary right now but will be improved over time.
Names of 16-bit entry points in 16-bit DLLs and EXEs can be provided if they're listed as exports. Names of 32-bit entry points in the VMM and assorted VxDs can be provided as well, but those will require reverse-engineering some support files from a Windows 95 DDK. Last but not least, exported 32-bit entry points in 32-bit DLL or EXE modules should be doable as well, but that is TBD.
PCjs must respond to certain INT 0x68 and INT 0x41 requests in order to enable the load notifications. It does not do this, however, by installing or modifying any code inside the machine. And if WDEB386.EXE is loaded in the machine before Windows 95 (ie, WIN.COM) is started, the PCjs Debugger will not respond to either of those interrupts -- although it will still monitor them, effectively "eavesdropping" on Windows 95's communication with WDEB386.
As usual, a few bugs were fixed in this release as well, including:
- 32-bit INT gates could create an incorrect stack frame
- BT/BTC/BTR/BTS with a memory operand could access the wrong bit
- SHRD/SHLD instructions could produce incorrect results if bit 31 was set
- 16-bit I/O instructions using AX as a target could trash EAX
- 16-bit ports were not properly implemented
Support for 16-bit ports didn't become an issue until we started using machines with an AT-class Hard Disk Controller (HDC), which processes 16 bits of data for every word operation to data port 0x1F0. PCjs used to break all 16-bit I/O requests into 8-bit requests to successive ports (eg, port 0x1F0 and 0x1F1), which was fine for the vast majority of (8-bit) devices but incorrect for the HDC. REP INSW and REP OUTSW instructions would work, but only because those instructions were incorrectly using the same port for both bytes, effectively fixing one bug by introducing another.
To properly support 16-bit ports, a new Bus interface has been added that allows a component to specify the width of a port. When no width is specified, the width continues to default to 1 byte (8 bits). The width for data port 0x1F0 is now set to 2 bytes (16 bits), so that the Bus component no longer divides word I/O to that port into separate byte I/O requests.
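The idea can be sketched with a toy bus that records a width for each registered port. The class and method names below are hypothetical and are not the actual PCjs Bus interface; the sketch also assumes that unregistered ports read as 0xFF.

```javascript
class ToyBus {
    constructor() {
        this.inputs = new Map();
    }

    // Register an input handler; width is in bytes and defaults to 1.
    registerInput(port, handler, width = 1) {
        this.inputs.set(port, { handler, width });
    }

    readByte(port) {
        const entry = this.inputs.get(port);
        return entry ? entry.handler(port) & 0xff : 0xff;
    }

    readWord(port) {
        const entry = this.inputs.get(port);
        if (entry && entry.width === 2) {
            // A 16-bit port receives a single word-sized request.
            return entry.handler(port) & 0xffff;
        }
        // Otherwise, split the request across two successive 8-bit ports.
        const lo = this.readByte(port);
        const hi = this.readByte(port + 1);
        return lo | (hi << 8);
    }
}

const bus = new ToyBus();
const sectorWords = [0x1234, 0xabcd];
let nextWord = 0;
bus.registerInput(0x1f0, () => sectorWords[nextWord++], 2);  // 16-bit data port
bus.registerInput(0x60, () => 0x11);                         // two ordinary 8-bit ports
bus.registerInput(0x61, () => 0x22);

const w0 = bus.readWord(0x1f0);
const w1 = bus.readWord(0x1f0);
const split = bus.readWord(0x60);
console.log(w0.toString(16), w1.toString(16), split.toString(16));
```

Each word read from port 0x1F0 consumes exactly one word of data, while the word read from port 0x60 is still assembled from ports 0x60 and 0x61.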
Fixed EGA Graphics Modes
The EGA graphics mode fix was part of the previous release, but WebStorm's "Commit" button didn't pick up the compiled files for some reason. So, if you care about the compiled files (like the website does), then use this release instead of the previous one.