---
layout: page
title: Advanced MS-DOS Programming
permalink: /pubs/pc/reference/microsoft/mspl13/msdos/advdos/
---

Advanced MS-DOS Programming

{% raw %}

Advanced MS-DOS Programming


════════════════════════════════════════════════════════════════════════════


Advanced MS-DOS Programming

The Microsoft(R) Guide for Assembly Language and C Programmers

By Ray Duncan


════════════════════════════════════════════════════════════════════════════


  PUBLISHED BY
  Microsoft Press
  A Division of Microsoft Corporation
  16011 NE 36th Way, Box 97017, Redmond, Washington 98073-9717
  Copyright (C) 1986, 1988 by Ray Duncan
  Published 1986. Second edition 1988.
  All rights reserved. No part of the contents of this book may be
  reproduced or transmitted in any form or by any means without the written
  permission of the publisher.
  Library of Congress Cataloging in Publication Data

  Duncan, Ray, 1952-
  Advanced MS-DOS programming.
  Rev. ed. of: Advanced MS-DOS. (C)1986.
  Includes index.
  1. MS-DOS (Computer operating system)  2. Assembler language
  (Computer program language)  3. C (Computer program language)
  I. Duncan, Ray, 1952-    Advanced MS-DOS.    II. Title.
  QA76.76.063D858      1988      005.4'46      88-1251
  ISBN 1-55615-157-8
  Printed and bound in the United States of America.

  1 2 3 4 5 6 7 8 9    FGFG    3 2 1 0 9 8

  Distributed to the book trade in the United States by Harper & Row.

  Distributed to the book trade in Canada by General Publishing Company,
  Ltd.

  Penguin Books Ltd., Harmondsworth, Middlesex, England
  Penguin Books Australia Ltd., Ringwood, Victoria, Australia
  Penguin Books N.Z. Ltd., 182-190 Wairau Road, Auckland 10, New Zealand

  British Cataloging in Publication Data available

  IBM(R), PC/AT(R), and PS/2(R) are registered trademarks of International
  Business Machines Corporation. CodeView(R), Microsoft(R), MS-DOS(R), and
  XENIX(R) are registered trademarks and InPort(TM) is a trademark of
  Microsoft Corporation.

  ──────────────────────────────────────────────────────────────────────────
      Technical Editor: Mike Halvorson  Production Editor: Mary Ann Jones
  ──────────────────────────────────────────────────────────────────────────



                                  Dedication

                                  For Carolyn



────────────────────────────────────────────────────────────────────────────
Contents

  Road Map to Figures and Tables

  Acknowledgments

  Introduction

  SECTION 1   PROGRAMMING FOR MS-DOS

  Chapter 1   Genealogy of MS-DOS

  Chapter 2   MS-DOS in Operation

  Chapter 3   Structure of MS-DOS Application Programs

  Chapter 4   MS-DOS Programming Tools

  Chapter 5   Keyboard and Mouse Input

  Chapter 6   Video Display

  Chapter 7   Printer and Serial Port

  Chapter 8   File Management

  Chapter 9   Volumes and Directories

  Chapter 10  Disk Internals

  Chapter 11  Memory Management

  Chapter 12  The EXEC Function

  Chapter 13  Interrupt Handlers

  Chapter 14  Installable Device Drivers

  Chapter 15  Filters

  Chapter 16  Compatibility and Portability

  SECTION 2   MS-DOS FUNCTIONS REFERENCE

  SECTION 3   IBM ROM BIOS AND MOUSE FUNCTIONS REFERENCE

  SECTION 4   LOTUS/INTEL/MICROSOFT EMS FUNCTIONS REFERENCE

  Index




────────────────────────────────────────────────────────────────────────────
Road Map to Figures and Tables

  MS-DOS versions and release dates

  MS-DOS memory map

  Structure of program segment prefix (PSP)

  Structure of .EXE load module

  Register conditions at program entry

  Segments, groups, and classes

  Macro Assembler switches

  C Compiler switches

  Linker switches

  MAKE switches

  ANSI escape sequences

  Video attributes

  Structure of normal file control block (FCB)

  Structure of extended file control block

  MS-DOS error codes

  Structure of boot sector

  Structure of directory entry

  Structure of fixed-disk master block

  LIM EMS error codes

  Intel 80x86 internal interrupts (faults)

  Intel 80x86, MS-DOS, and ROM BIOS interrupts

  Device-driver attribute word

  Device-driver command codes

  Structure of BIOS parameter block (BPB)

  Media descriptor byte



────────────────────────────────────────────────────────────────────────────
Acknowledgments

  My renewed thanks to the outstanding editors and production staff at
  Microsoft Press, who make beautiful books happen, and to the talented
  Microsoft developers, who create great programs to write books about.
  Special thanks to Mike Halvorson, Jeff Hinsch, Mary Ann Jones, Claudette
  Moore, Dori Shattuck, and Mark Zbikowski; if this book has anything unique
  to offer, these people deserve most of the credit.



────────────────────────────────────────────────────────────────────────────
Introduction

  Advanced MS-DOS Programming is written for the experienced C or
  assembly-language programmer. It provides all the information you need to
  write robust, high-performance applications under the MS-DOS operating
  system. Because I believe that working, well-documented programs are
  unbeatable learning tools, I have included detailed programming examples
  throughout──including complete utility programs that you can adapt to your
  own needs.

  This book is both a tutorial and a reference and is divided into four
  sections, so that you can find information more easily. Section 1
  discusses MS-DOS capabilities and services by functional group in the
  context of common programming issues, such as user input, control of the
  display, memory management, and file handling. Special classes of
  programs, such as interrupt handlers, device drivers, and filters, have
  their own chapters.

  Section 2 provides a complete reference guide to MS-DOS function calls,
  organized so that you can see the calling sequence, results, and version
  dependencies of each function at a glance. I have also included notes,
  where relevant, about quirks and special uses of functions as well as
  cross-references to related functions. An assembly-language example is
  included for each entry in Section 2.

  Sections 3 and 4 are references to IBM ROM BIOS, Microsoft Mouse driver,
  and Lotus/Intel/Microsoft Expanded Memory Specification functions. The
  entries in these two sections have the same form as in Section 2, except
  that individual programming examples have been omitted.

  The programs in this book were written with the marvelous Brief editor
  from Solution Systems and assembled or compiled with Microsoft Macro
  Assembler version 5.1 and Microsoft C Compiler version 5.1. They have been
  tested under MS-DOS versions 2.1, 3.1, 3.3, and 4.0 on an 8088-based IBM
  PC, an 80286-based IBM PC/AT, and an 80386-based IBM PS/2 Model 80. As far
  as I am aware, they do not contain any software or hardware dependencies
  that will prevent them from running properly on any IBM PC─compatible
  machine running MS-DOS version 2.0 or later.

Changes from the First Edition

  Readers who are familiar with the first edition will find many changes in
  the second edition, but the general structure of the book remains the
  same. Most of the material comparing MS-DOS to CP/M and UNIX/XENIX has
  been removed; although these comparisons were helpful a few years ago,
  MS-DOS has become its own universe and deserves to be considered on its
  own terms.

  The previously monolithic chapter on character devices has been broken
  into three more manageable chapters focusing on the keyboard and mouse,
  the display, and the serial port and printer. Hardware-dependent video
  techniques have been de-emphasized; although this topic is more important
  than ever, it has grown so complex that it requires a book of its own. A
  new chapter discusses compatibility and portability of MS-DOS applications
  and also contains a brief introduction to Microsoft OS/2, the new
  multitasking, protected-mode operating system.

  A road map to vital figures and tables has been added, following the Table
  of Contents, to help you quickly locate the layouts of the program segment
  prefix, file control block, and the like.

  The reference sections at the back of the book have been extensively
  updated and enlarged and are now complete through MS-DOS version 4.0, the
  IBM PS/2 Model 80 ROM BIOS and the VGA video adapter, the Microsoft Mouse
  driver version 6.0, and the Lotus/Intel/Microsoft Expanded Memory
  Specification version 4.0.

  In the two years since Advanced MS-DOS Programming was first published,
  hundreds of readers have been kind enough to send me their comments, and I
  have tried to incorporate many of their suggestions in this new edition.
  As before, please feel free to contact me via MCI Mail (user name LMI),
  CompuServe (user ID 72406,1577), or BIX (user name rduncan).

  Ray Duncan  Los Angeles, California  September 1988



────────────────────────────────────────────────────────────────────────────
SECTION 1  PROGRAMMING FOR MS-DOS
────────────────────────────────────────────────────────────────────────────



────────────────────────────────────────────────────────────────────────────
Chapter 1  Genealogy of MS-DOS

  In only seven years, MS-DOS has evolved from a simple program loader into
  a sophisticated, stable operating system for personal computers that are
  based on the Intel 8086 family of microprocessors (Figure 1-1). MS-DOS
  supports networking, graphical user interfaces, and storage devices of
  every description; it serves as the platform for thousands of application
  programs; and it has over 10 million licensed users──dwarfing the combined
  user bases of all of its competitors.

  The progenitor of MS-DOS was an operating system called 86-DOS, which was
  written by Tim Paterson for Seattle Computer Products in mid-1980. At that
  time, Digital Research's CP/M-80 was the operating system most commonly
  used on microcomputers based on the Intel 8080 and Zilog Z-80
  microprocessors, and a wide range of application software (word
  processors, database managers, and so forth) was available for use with
  CP/M-80.

  To ease the process of porting 8-bit CP/M-80 applications into the new
  16-bit environment, 86-DOS was originally designed to mimic CP/M-80 in
  both available functions and style of operation. Consequently, the
  structures of 86-DOS's file control blocks, program segment prefixes, and
  executable files were nearly identical to those of CP/M-80. Existing
  CP/M-80 programs could be converted mechanically (by processing their
  source-code files through a special translator program) and, after
  conversion, would run under 86-DOS either immediately or with very little
  hand editing.

  Because 86-DOS was marketed as a proprietary operating system for Seattle
  Computer Products' line of S-100 bus, 8086-based microcomputers, it made
  very little impact on the microcomputer world in general. Other vendors of
  8086-based microcomputers were understandably reluctant to adopt a
  competitor's operating system and continued to wait impatiently for the
  release of Digital Research's CP/M-86.

  In October 1980, IBM approached the major microcomputer-software houses in
  search of an operating system for the new line of personal computers it
  was designing. Microsoft had no operating system of its own to offer
  (other than a stand-alone version of Microsoft BASIC) but paid a fee to
  Seattle Computer Products for the right to sell Paterson's 86-DOS. (At
  that time, Seattle Computer Products received a license to use and sell
  Microsoft's languages and all 8086 versions of Microsoft's operating
  system.) In July 1981, Microsoft purchased all rights to 86-DOS, made
  substantial alterations to it, and renamed it MS-DOS. When the first IBM
  PC was released in the fall of 1981, IBM offered MS-DOS (referred to as
  PC-DOS 1.0) as its primary operating system.

  IBM also selected Digital Research's CP/M-86 and Softech's P-system as
  alternative operating systems for the PC. However, they were both very
  slow to appear at IBM PC dealers and suffered the additional disadvantages
  of higher prices and lack of available programming languages. IBM threw
  its considerable weight behind PC-DOS by releasing all the IBM-logo PC
  application software and development tools to run under it. Consequently,
  most third-party software developers targeted their products for PC-DOS
  from the start, and CP/M-86 and P-system never became significant factors
  in the IBM PC─compatible market.

  In spite of some superficial similarities to its ancestor CP/M-80, MS-DOS
  version 1.0 contained a number of improvements over CP/M-80, including the
  following:

  ■  An improved disk-directory structure that included information about a
     file's attributes (such as whether it was a system or a hidden file),
     its exact size in bytes, and the date that the file was created or last
     modified

  ■  A superior disk-space allocation and management method, allowing
     extremely fast sequential or random record access and program loading

  ■  An expanded set of operating-system services, including
     hardware-independent function calls to set or read the date and time, a
     filename parser, multiple-block record I/O, and variable record sizes

  ■  An AUTOEXEC.BAT batch file to perform a user-defined series of commands
     when the system was started or reset

  IBM was the only major computer manufacturer (sometimes referred to as an
  OEM, for original equipment manufacturer) to ship MS-DOS version 1.0 (as
  PC-DOS 1.0) with its products. MS-DOS version 1.25 (equivalent to IBM
  PC-DOS 1.1) was released in June 1982 to fix a number of bugs and also to
  support double-sided disks and improved hardware independence in the DOS
  kernel. This version was shipped by several vendors besides IBM, including
  Texas Instruments, COMPAQ, and Columbia, who all entered the personal
  computer market early. Due to rapid decreases in the prices of RAM and
  fixed disks, MS-DOS version 1 is no longer in common use.

  MS-DOS version 2.0 (equivalent to PC-DOS 2.0) was first released in March
  1983. It was, in retrospect, a new operating system (though great care was
  taken to maintain compatibility with MS-DOS version 1). It contained many
  significant innovations and enhanced features, including the following:

  ■  Support for both larger-capacity floppy disks and hard disks

  ■  Many UNIX/XENIX-like features, including a hierarchical file structure,
     file handles, I/O redirection, pipes, and filters

  ■  Background printing (print spooling)

  ■  Volume labels, plus additional file attributes

  ■  Installable device drivers

  ■  A user-customizable system-configuration file that controlled the
     loading of additional device drivers, the number of system disk
     buffers, and so forth

  ■  Maintenance of environment blocks that could be used to pass
     information between programs

  ■  An optional ANSI display driver that allowed programs to position the
     cursor and control display characteristics in a hardware-independent
     manner

  ■  Support for the dynamic allocation, modification, and release of memory
     by application programs

  ■  Support for customized user command interpreters (shells)

  ■  System tables to assist application software in modifying its currency,
     time, and date formats (known as international support)

  MS-DOS version 2.11 was subsequently released to improve international
  support (table-driven currency symbols, date formats, decimal-point
  symbols, currency separators, and so forth), to add support for 16-bit
  Kanji characters throughout, and to fix a few minor bugs. Version 2.11
  rapidly became the base version shipped for 8086/8088-based personal
  computers by every major OEM, including Hewlett-Packard, Wang, Digital
  Equipment Corporation, Texas Instruments, COMPAQ, and Tandy.

  MS-DOS version 2.25, released in October 1985, was distributed in the Far
  East but was never shipped by OEMs in the United States and Europe. In
  this version, the international support for Japanese and Korean character
  sets was extended even further, additional bugs were repaired, and many of
  the system utilities were made compatible with MS-DOS version 3.0.

  MS-DOS version 3.0 was introduced by IBM in August 1984 with the release
  of the 80286-based PC/AT machines. It represented another major rewrite of
  the entire operating system and included the following important new
  features:

  ■  Direct control of the print spooler by application software

  ■  Further expansion of international support for currency formats

  ■  Extended error reporting, including a code that suggests a recovery
     strategy to the application program

  ■  Support for file and record locking and sharing

  ■  Support for larger fixed disks

  MS-DOS version 3.1, which was released in November 1984, added support for
  the sharing of files and printers across a network. Beginning with version
  3.1, a new operating-system module called the redirector intercepts an
  application program's requests for I/O and filters out the requests that
  are directed to network devices, passing these requests to another machine
  for processing.

  Since version 3.1, the changes to MS-DOS have been evolutionary rather
  than revolutionary. Version 3.2, which appeared in 1986, generalized the
  definition of device drivers so that new media types (such as 3.5-inch
  floppy disks) could be supported more easily. Version 3.3 was released in
  1987, concurrently with the new IBM line of PS/2 personal computers, and
  drastically expanded MS-DOS's multilanguage support for keyboard mappings,
  printer character sets, and display fonts. Version 4.0, delivered in 1988,
  was enhanced with a visual shell as well as support for very large file
  systems.

  While MS-DOS has been evolving, Microsoft has also put intense efforts
  into the areas of user interfaces and multitasking operating systems.
  Microsoft Windows, first shipped in 1985, provides a multitasking,
  graphical user "desktop" for MS-DOS systems. Windows has won widespread
  support among developers of complex graphics applications such as desktop
  publishing and computer-aided design because it allows their programs to
  take full advantage of whatever output devices are available without
  introducing any hardware dependence.

  Microsoft Operating System/2 (MS OS/2), released in 1987, represents a new
  standard for application developers: a protected-mode, multitasking,
  virtual-memory system specifically designed for applications requiring
  high-performance graphics, networking, and interprocess communications.
  Although MS OS/2 is a new product and is not a derivative of MS-DOS, its
  user interface and file system are compatible with MS-DOS and Microsoft
  Windows, and it offers the ability to run one real-mode (MS-DOS)
  application alongside MS OS/2 protected-mode applications. This
  compatibility allows users to move between the MS-DOS and OS/2
  environments with a minimum of difficulty.

  ┌─────────────┐
  │ MS-DOS 1.0  │ 1981: First operating system on IBM PC
  │ PC-DOS 1.0  │
  └──────┬──────┘
         │
  ┌──────▼──────┐
  │ MS-DOS 1.25 │ Double-sided disk support and bug fixes added:
  │ PC-DOS 1.1  │ widely distributed by OEMs other than IBM
  └──────┬──────┘
         │
  ┌──────▼──────┐ 1983: Introduced with IBM PC/XT;
  │ MS-DOS 2.0  │ support for UNIX/XENIX-like hierarchical
  │ PC-DOS 2.0  │ file structure and hard disks added
  └──────┬──────┘
         ├──────────────────────────────────────┐
  ┌──────▼──────┐                        ┌──────▼──────┐
  │ MS-DOS 2.01 │ 2.0 with international │ PC-DOS 2.1  │ Introduced with PCjr
  └──────┬──────┘ support                └─────────────┘ 2.0 with bug fixes
         │
  ┌──────▼──────┐
  │ MS-DOS 2.11 │ 2.01 with bug fixes
  └──────┬──────┘
         ├──────────────────────────────────────┐
  ┌──────▼──────┐ 1984: Introduced with  ┌──────▼──────┐ 1985: Far East OEMs;
  │ MS-DOS 3.0  │ PC/AT; support for     │ MS-DOS 2.25 │ support for extended
  │ PC-DOS 3.0  │ 1.2 MB floppy disk,    └─────────────┘ character sets
  └──────┬──────┘ larger hard disk added
         │
  ┌──────▼──────┐
  │ MS-DOS 3.1  │ Support for Microsoft  ┌─────────────┐ 1985: Graphical
  │ PC-DOS 3.1  │ Networks added         │   Windows   │ user interface
  └──────┬──────┘                        │     1.0     │ for MS-DOS
         │                               └──────┬──────┘
  ┌──────▼──────┐                               │
  │ MS-DOS 3.2  │ 1986: Support for 3.5-        │
  │ PC-DOS 3.2  │ inch disks added              │
  └──────┬──────┘                               │
         │                               ┌──────▼──────┐ 1987: Compatibility
  ┌──────▼──────┐ 1987: Introduced with  │   Windows   │ with OS/2
  │ MS-DOS 3.3  │ IBM PS/2; generalized  │     2.0     │ Presentation Manager
  │ PC-DOS 3.3  │ code-page (font)       └─────────────┘
  └──────┬──────┘ support
         │
  ┌──────▼──────┐ 1988: Support for
  │ MS-DOS 4.0  │ logical volumes larger
  │ PC-DOS 4.0  │ than 32 MB; visual shell
  └─────────────┘

  Figure 1-1.  The evolution of MS-DOS.

  What does the future hold for MS-DOS? Only the long-range planning teams
  at Microsoft and IBM know for sure. But it seems safe to assume that
  MS-DOS, with its relatively small memory requirements, adaptability to
  diverse hardware configurations, and enormous base of users, will remain
  important to programmers and software publishers for years to come.



────────────────────────────────────────────────────────────────────────────
Chapter 2  MS-DOS in Operation

  It is unlikely that you will ever be called upon to configure the MS-DOS
  software for a new model of computer. Still, an acquaintance with the
  general structure of MS-DOS can often be very helpful in understanding the
  behavior of the system as a whole. In this chapter, we will discuss how
  MS-DOS is organized and how it is loaded into memory when the computer is
  turned on.


The Structure of MS-DOS

  MS-DOS is partitioned into several layers that serve to isolate the kernel
  logic of the operating system, and the user's perception of the system,
  from the hardware it is running on. These layers are

  ■  The BIOS (Basic Input/Output System)

  ■  The DOS kernel

  ■  The command processor (shell)

  We'll discuss the functions of each of these layers separately.

The BIOS Module

  The BIOS is specific to the individual computer system and is provided by
  the manufacturer of the system. It contains the default resident
  hardware-dependent drivers for the following devices:

  ■  Console display and keyboard (CON)

  ■  Line printer (PRN)

  ■  Auxiliary device (AUX)

  ■  Date and time (CLOCK$)

  ■  Boot disk device (block device)

  The MS-DOS kernel communicates with these device drivers through I/O
  request packets; the drivers then translate these requests into the proper
  commands for the various hardware controllers. In many MS-DOS systems,
  including the IBM PC, the most primitive parts of the hardware drivers are
  located in read-only memory (ROM) so that they can be used by stand-alone
  applications, diagnostics, and the system startup program.

  The terms resident and installable are used to distinguish between the
  drivers built into the BIOS and the drivers installed during system
  initialization by DEVICE commands in the CONFIG.SYS file. (Installable
  drivers will be discussed in more detail later in this chapter and in
  Chapter 14.)

  The BIOS is read into random-access memory (RAM) during system
  initialization as part of a file named IO.SYS. (In PC-DOS, the file is
  called IBMBIO.COM.) This file is marked with the special attributes hidden
  and system.
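
  The hidden and system attributes are individual bits in the attribute
  byte of a file's directory entry. The following portable C fragment is a
  minimal sketch of testing for them; the bit values (02H for hidden, 04H
  for system) are the standard MS-DOS assignments, while the macro and
  function names are illustrative assumptions rather than anything defined
  by MS-DOS.

```c
#include <stdint.h>

/* Attribute bits in an MS-DOS directory entry (names are illustrative). */
#define ATTR_READONLY 0x01
#define ATTR_HIDDEN   0x02
#define ATTR_SYSTEM   0x04

/* Returns nonzero only if BOTH the hidden and system bits are set. */
int is_hidden_system(uint8_t attr)
{
    uint8_t mask = ATTR_HIDDEN | ATTR_SYSTEM;
    return (attr & mask) == mask;
}
```

  A file whose attribute byte is 06H passes this test; a file marked only
  hidden (02H) does not.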

The DOS Kernel

  The DOS kernel implements MS-DOS as it is seen by application programs.
  The kernel is a proprietary program supplied by Microsoft Corporation and
  provides a collection of hardware-independent services called system
  functions. These functions include the following:

  ■  File and record management

  ■  Memory management

  ■  Character-device input/output

  ■  Spawning of other programs

  ■  Access to the real-time clock

  Programs can access system functions by loading registers with
  function-specific parameters and then transferring to the operating system
  by means of a software interrupt.
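
  The calling pattern just described can be modeled in portable C. The
  sketch below is a toy simulation, not MS-DOS code: the regs_t structure,
  the fake_int21 dispatcher, and the stubbed return values are all
  hypothetical. It does follow two documented conventions: the function
  number is placed in register AH before Int 21H is issued, and Function
  30H (get MS-DOS version) returns the major version in AL and the minor
  version in AH.

```c
#include <stdint.h>

/* Hypothetical stand-in for the registers used in a system call. */
typedef struct {
    uint8_t  al, ah;   /* low and high halves of AX */
    uint8_t  dl, dh;   /* low and high halves of DX */
    uint16_t cx;
    int      carry;    /* toy error indicator, modeled on the carry flag */
} regs_t;

/* Toy dispatcher standing in for "INT 21H"; AH selects the function. */
void fake_int21(regs_t *r)
{
    r->carry = 0;
    switch (r->ah) {
    case 0x30:             /* get MS-DOS version */
        r->al = 3;         /* stubbed major version */
        r->ah = 30;        /* stubbed minor version */
        break;
    case 0x2A:             /* get date: CX = year, DH = month, DL = day */
        r->cx = 1988;      /* stubbed date */
        r->dh = 9;
        r->dl = 1;
        break;
    default:               /* function not modeled in this sketch */
        r->carry = 1;
        break;
    }
}

/* Load the registers, "issue the interrupt", and unpack the results. */
void dos_version(int *major, int *minor)
{
    regs_t r = {0};
    r.ah = 0x30;
    fake_int21(&r);
    *major = r.al;
    *minor = r.ah;
}
```

  With the stubbed values above, dos_version reports version 3.30. A real
  program would replace fake_int21 with an actual software interrupt,
  issued from assembly language or through a compiler-supplied library
  routine.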

  The DOS kernel is read into memory during system initialization from the
  MSDOS.SYS file on the boot disk. (The file is called IBMDOS.COM in
  PC-DOS.) This file is marked with the attributes hidden and system.

The Command Processor

  The command processor, or shell, is the user's interface to the operating
  system. It is responsible for parsing and carrying out user commands,
  including the loading and execution of other programs from a disk or other
  mass-storage device.

  The default shell that is provided with MS-DOS is found in a file called
  COMMAND.COM. Although COMMAND.COM prompts and responses constitute the
  ordinary user's complete perception of MS-DOS, it is important to realize
  that COMMAND.COM is not the operating system, but simply a special class
  of program running under the control of MS-DOS.

  COMMAND.COM can be replaced with a shell of the programmer's own design by
  simply adding a SHELL directive to the system-configuration file
  (CONFIG.SYS) on the system startup disk. The product COMMAND-PLUS from ESP
  Systems is an example of such an alternative shell.

  More about COMMAND.COM

  The default MS-DOS shell, COMMAND.COM, is divided into three parts:

  ■  A resident portion

  ■  An initialization section

  ■  A transient module

  The resident portion is loaded in lower memory, above the DOS kernel and
  its buffers and tables. It contains the routines to process Ctrl-C and
  Ctrl-Break, critical errors, and the termination (final exit) of other
  transient programs. This part of COMMAND.COM issues error messages and is
  responsible for the familiar prompt

  Abort, Retry, Ignore?

  The resident portion also contains the code required to reload the
  transient portion of COMMAND.COM when necessary.

  The initialization section of COMMAND.COM is loaded above the resident
  portion when the system is started. It processes the AUTOEXEC.BAT batch
  file (the user's list of commands to execute at system startup), if one is
  present, and is then discarded.

  The transient portion of COMMAND.COM is loaded at the high end of memory,
  and its memory can also be used for other purposes by application
  programs. The transient module issues the user prompt, reads the commands
  from the keyboard or batch file, and causes them to be executed. When an
  application program terminates, the resident portion of COMMAND.COM does a
  checksum of the transient module to determine whether it has been
  destroyed and fetches a fresh copy from the disk if necessary.
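
  The checksum test can be sketched as follows. The text does not say
  which checksum COMMAND.COM computes, so this example assumes a simple
  16-bit wrapping sum of the bytes in the region; the function names are
  hypothetical as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed checksum: 16-bit wrapping sum of every byte in the region. */
uint16_t region_checksum(const uint8_t *p, size_t len)
{
    uint16_t sum = 0;
    size_t i;
    for (i = 0; i < len; i++)
        sum = (uint16_t)(sum + p[i]);
    return sum;
}

/* Returns nonzero if the region no longer matches the checksum that
   was recorded when the region was loaded. */
int transient_needs_reload(const uint8_t *p, size_t len, uint16_t saved)
{
    return region_checksum(p, len) != saved;
}
```

  The resident code would record the checksum once, when the transient
  module is loaded, and repeat the comparison each time an application
  terminates. A plain sum is cheap but can miss some changes (for
  example, two bytes that swap places), which is acceptable for a sketch.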

  The user commands that are accepted by COMMAND.COM fall into three
  categories:

  ■  Internal commands

  ■  External commands

  ■  Batch files

  Internal commands, sometimes called intrinsic commands, are those carried
  out by code embedded in COMMAND.COM itself. Commands in this category
  include COPY, REN(AME), DIR(ECTORY), and DEL(ETE). The routines for the
  internal commands are included in the transient part of COMMAND.COM.

  External commands, sometimes called extrinsic commands or transient
  programs, are the names of programs stored in disk files. Before these
  programs can be executed, they must be loaded from the disk into the
  transient program area (TPA) of memory. (See "How MS-DOS Is Loaded" in
  this chapter.) Familiar examples of external commands are CHKDSK, BACKUP,
  and RESTORE. As soon as an external command has completed its work, it is
  discarded from memory; hence, it must be reloaded from disk each time it
  is invoked.

  Batch files are text files that contain lists of other intrinsic,
  extrinsic, or batch commands. These files are processed by a special
  interpreter that is built into the transient portion of COMMAND.COM. The
  interpreter reads the batch file one line at a time and carries out each
  of the specified operations in order.

  In order to interpret a user's command, COMMAND.COM first looks to see if
  the user typed the name of a built-in (intrinsic) command that it can
  carry out directly. If not, it searches for an external command
  (executable program file) or batch file by the same name. The search is
  carried out first in the current directory of the current disk drive and
  then in each of the directories specified in the most recent PATH command.
  In each directory inspected, COMMAND.COM first tries to find a file with
  the extension .COM, then .EXE, and finally .BAT. If the search fails for
  all three file types in all of the possible locations, COMMAND.COM
  displays the familiar message

  Bad command or file name
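
  The search order for external commands and batch files can be sketched
  in portable C. The function below only builds the list of filenames
  that would be tried, in order; it does not touch the disk. The names
  list_candidates and CAND_LEN are illustrative assumptions, and the PATH
  string is assumed to use semicolons between directories.

```c
#include <stdio.h>
#include <string.h>

#define CAND_LEN 128   /* maximum length of one candidate name */

/* Extensions in the order COMMAND.COM tries them. */
static const char *const EXTS[] = { ".COM", ".EXE", ".BAT" };

/* Append the three candidates for one directory; dlen == 0 means the
   current directory. Returns the updated candidate count. */
static int add_dir(const char *dir, size_t dlen, const char *name,
                   char out[][CAND_LEN], int n, int max)
{
    int e;
    for (e = 0; e < 3 && n < max; e++, n++) {
        if (dlen == 0) {
            snprintf(out[n], CAND_LEN, "%s%s", name, EXTS[e]);
        } else {
            /* avoid a doubled backslash for entries such as "C:\" */
            const char *sep = (dir[dlen - 1] == '\\') ? "" : "\\";
            snprintf(out[n], CAND_LEN, "%.*s%s%s%s",
                     (int)dlen, dir, sep, name, EXTS[e]);
        }
    }
    return n;
}

/* Fill out[] with every filename to try, in search order: the current
   directory first, then each PATH directory from left to right. */
int list_candidates(const char *name, const char *path,
                    char out[][CAND_LEN], int max)
{
    const char *p = path;
    int n = add_dir("", 0, name, out, 0, max);

    while (p != NULL && *p != '\0') {
        size_t len = strcspn(p, ";");   /* length of this PATH entry */
        if (len > 0)
            n = add_dir(p, len, name, out, n, max);
        p += len;
        if (*p == ';')
            p++;                        /* skip the separator */
    }
    return n;
}
```

  For the command CHKDSK with a PATH of C:\DOS;C:\UTIL, the list starts
  with CHKDSK.COM, CHKDSK.EXE, and CHKDSK.BAT in the current directory,
  continues with the same three names in C:\DOS, and ends with
  C:\UTIL\CHKDSK.BAT. The first name that exists on disk wins.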

  If a .COM file or a .EXE file is found, COMMAND.COM uses the MS-DOS EXEC
  function to load and execute it. The EXEC function builds a special data
  structure called a program segment prefix (PSP) above the resident portion
  of COMMAND.COM in the transient program area. The PSP contains various
  linkages and pointers needed by the application program. Next, the EXEC
  function loads the program itself, just above the PSP, and performs any
  relocation that may be necessary. Finally, it sets up the registers
  appropriately and transfers control to the entry point for the program.
  (Both the PSP and the EXEC function will be discussed in more detail in
  Chapters 3 and 12.) When the transient program has finished its job, it
  calls a special MS-DOS termination function that releases the transient
  program's memory and returns control to the program that caused the
  transient program to be loaded (COMMAND.COM, in this case).

  A transient program has nearly complete control of the system's resources
  while it is executing. The only other tasks that are accomplished are
  those performed by interrupt handlers (such as the keyboard input driver
  and the real-time clock) and operations that the transient program
  requests from the operating system. MS-DOS does not support sharing of the
  central processor among several tasks executing concurrently, nor can it
  wrest control away from a program when it crashes or executes for too
  long. Such capabilities are the province of MS OS/2, which is a
  protected-mode system with preemptive multitasking (time-slicing).


How MS-DOS Is Loaded

  When the system is started or reset, program execution begins at address
  0FFFF0H. This is a feature of the 8086/8088 family of microprocessors and
  has nothing to do with MS-DOS. Systems based on these processors are
  designed so that address 0FFFF0H lies within an area of ROM and contains a
  jump machine instruction to transfer control to system test code and the
  ROM bootstrap routine (Figure 2-1).
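
  As background that this chapter does not spell out: the 8086 forms a
  20-bit physical address by shifting a 16-bit segment value left four
  bits and adding a 16-bit offset, and it comes out of reset with
  CS = 0FFFFH and IP = 0000H. The small C function below (its name is an
  illustrative assumption) performs that arithmetic and shows where the
  address 0FFFF0H comes from.

```c
#include <stdint.h>

/* Real-mode 8086 physical address: (segment * 16) + offset, truncated
   to 20 bits, because the 8086 has only 20 address lines. */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFUL;
}
```

  Calling phys_addr(0xFFFF, 0x0000) yields 0xFFFF0. Only 16 bytes lie
  between that address and the top of the 1 MB address space, which is
  why the location holds a jump instruction rather than the startup code
  itself.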

  The ROM bootstrap routine reads the disk bootstrap routine from the first
  sector of the system startup disk (the boot sector) into memory at some
  arbitrary address and then transfers control to it (Figure 2-2). (The
  boot sector also contains a table of information about the disk format.)

  The disk bootstrap routine checks to see if the disk contains a copy of
  MS-DOS. It does this by reading the first sector of the root directory and
  determining whether the first two files are IO.SYS and MSDOS.SYS (or
  IBMBIO.COM and IBMDOS.COM), in that order. If these files are not present,
  the user is prompted to change disks and strike any key to try again.
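
  The check performed by the disk bootstrap routine can be sketched in
  portable C. The sketch assumes the standard directory layout: each
  entry is 32 bytes long and begins with an 11-byte name field (eight
  characters of name and three of extension, padded with spaces, with no
  period stored). The function name has_system_files is hypothetical,
  and a real boot sector would do this work in assembly language.

```c
#include <string.h>

#define DIRENT_SIZE 32   /* bytes per directory entry */
#define NAME_LEN    11   /* 8-character name + 3-character extension */

/* Given the first sector of the root directory, return 1 if the first
   two entries are the system files in the required order, else 0. */
int has_system_files(const unsigned char *sector)
{
    const unsigned char *first  = sector;
    const unsigned char *second = sector + DIRENT_SIZE;

    /* MS-DOS names: IO.SYS followed by MSDOS.SYS */
    if (memcmp(first,  "IO      SYS", NAME_LEN) == 0 &&
        memcmp(second, "MSDOS   SYS", NAME_LEN) == 0)
        return 1;

    /* PC-DOS names: IBMBIO.COM followed by IBMDOS.COM */
    if (memcmp(first,  "IBMBIO  COM", NAME_LEN) == 0 &&
        memcmp(second, "IBMDOS  COM", NAME_LEN) == 0)
        return 1;

    return 0;
}
```

  Because the order of the two entries matters, a disk that holds both
  files in the wrong positions fails the test just as a disk without
  them does.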

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         │                                               │
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-1.  A typical 8086/8088-based computer system immediately after
  system startup or reset. Execution begins at location 0FFFF0H, which
  contains a jump instruction that directs program control to the ROM
  bootstrap routine.

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         ├───────────────────────────────────────────────┤
         │           Disk bootstrap routine              │
         ├───────────────────────────────────────────────┤ ◄ Arbitrary
         │                                               │   load location
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-2.  The ROM bootstrap routine loads the disk bootstrap routine
  into memory from the first sector of the system startup disk and then
  transfers control to it.

  If the two system files are found, the disk bootstrap reads them into
  memory and transfers control to the initial entry point of IO.SYS (Figure
  2-3). (In some implementations, the disk bootstrap reads only IO.SYS into
  memory, and IO.SYS in turn loads the MSDOS.SYS file.)

  The IO.SYS file that is loaded from the disk actually consists of two
  separate modules. The first is the BIOS, which contains the linked set of
  resident device drivers for the console, auxiliary port, printer, block,
  and clock devices, plus some hardware-specific initialization code that is
  run only at system startup. The second module, SYSINIT, is supplied by
  Microsoft and linked into the IO.SYS file, along with the BIOS, by the
  computer manufacturer.

  SYSINIT is called by the manufacturer's BIOS initialization code. It
  determines the amount of contiguous memory present in the system and then
  relocates itself to high memory. Then it moves the DOS kernel, MSDOS.SYS,
  from its original load location to its final memory location, overlaying
  the original SYSINIT code and any other expendable initialization code
  that was contained in the IO.SYS file (Figure 2-4).

  Next, SYSINIT calls the initialization code in MSDOS.SYS. The DOS kernel
  initializes its internal tables and work areas, sets up the interrupt
  vectors 20H through 2FH, and traces through the linked list of resident
  device drivers, calling the initialization function for each. (See Chapter
  14.)
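
  The interrupt vectors occupy the area from 00000H to 00400H shown at the
  bottom of each figure. As a rough sketch, assuming the usual 8086 layout
  of 256 vectors of 4 bytes each (an offset word followed by a segment
  word), the location of any vector can be computed as follows. The names
  are invented for this example.

```c
#include <stdint.h>

enum { VECTOR_SIZE = 4, VECTOR_COUNT = 256 };

/* Physical address of the 4-byte vector for interrupt `number`. */
static uint32_t vector_address(uint8_t number)
{
    return (uint32_t)number * VECTOR_SIZE;
}
```

  By this arithmetic, the vectors for Int 20H through Int 2FH lie at
  00080H through 000BFH.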

         ┌───────────────────────────────────────────────┐
         │             ROM bootstrap routine             │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │                                               │
         ├───────────────────────────────────────────────┤
         │            Disk bootstrap routine             │
         ├───────────────────────────────────────────────┤
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         ├───────────────────────────────────────────────┤
         │          DOS kernel (from MSDOS.SYS)          │
         ├───────────────────────────────────────────────┤ ◄ In temporary
         │             SYSINIT (from IO.SYS)             │   location
         ├───────────────────────────────────────────────┤
         │              BIOS (from IO.SYS)               │
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │               Interrupt vectors               │
  00000H └───────────────────────────────────────────────┘

  Figure 2-3.  The disk bootstrap reads the file IO.SYS into memory. This
  file contains the MS-DOS BIOS (resident device drivers) and the SYSINIT
  module. Either the disk bootstrap or the BIOS (depending upon the
  manufacturer's implementation) then reads the DOS kernel into memory from
  the MSDOS.SYS file.

  These driver functions determine the equipment status, perform any
  necessary hardware initialization, and set up the vectors for any external
  hardware interrupts the drivers will service.

  As part of the initialization sequence, the DOS kernel examines the
  disk-parameter blocks returned by the resident block-device drivers,
  determines the largest sector size that will be used in the system, builds
  some drive-parameter blocks, and allocates a disk sector buffer. Control
  then returns to SYSINIT.

  When the DOS kernel has been initialized and all resident device drivers
  are available, SYSINIT can call on the normal MS-DOS file services to open
  the CONFIG.SYS file. This optional file can contain a variety of commands
  that enable the user to customize the MS-DOS environment. For instance,
  the user can specify additional hardware device drivers, the number of
  disk buffers, the maximum number of files that can be open at one time,
  and the filename of the command processor (shell).

  If it is found, the entire CONFIG.SYS file is loaded into memory for
  processing. All lowercase characters are converted to uppercase, and the
  file is interpreted one line at a time to process the commands. Memory is
  allocated for the disk buffer cache and the internal file control blocks
  used by the handle file and record system functions. (See Chapter 8.) Any
  device drivers indicated in the CONFIG.SYS file are sequentially loaded
  into memory, initialized by calls to their init modules, and linked into
  the device-driver list. The init function of each driver tells SYSINIT how
  much memory to reserve for that driver.
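
  The first two steps of that processing, case conversion and division
  into lines, can be approximated in C as follows. This is a sketch under
  stated assumptions rather than SYSINIT's actual code: the function name
  is hypothetical, and the file is assumed to be already in memory with
  carriage returns and linefeeds as line terminators.

```c
#include <ctype.h>
#include <stddef.h>

/* Converts the buffer to uppercase in place and returns the number of
   non-empty lines. CR and LF both count as line terminators. */
static size_t prepare_config(char *text, size_t length)
{
    size_t lines = 0;
    int in_line = 0;

    for (size_t i = 0; i < length; i++) {
        text[i] = (char)toupper((unsigned char)text[i]);
        if (text[i] == '\r' || text[i] == '\n') {
            in_line = 0;            /* terminator ends the current line */
        } else if (!in_line) {
            in_line = 1;            /* first character of a new line */
            lines++;
        }
    }
    return lines;
}
```

  A buffer holding files=20, buffers=10, and device=ansi.sys on separate
  lines would be counted as three commands and left in uppercase.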

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │               SYSINIT module                  │
         ├───────────────────────────────────────────────┤
         │                                               │
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │                                               │
         ├───────────────────────────────────────────────┤
         │              Installable drivers              │
         ├───────────────────────────────────────────────┤
         │              File control blocks              │
         ├───────────────────────────────────────────────┤
         │               Disk buffer cache               │
         ├───────────────────────────────────────────────┤
         │                  DOS kernel                   │
         ├───────────────────────────────────────────────┤ ◄ In final
         │                     BIOS                      │   location
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-4.  SYSINIT moves itself to high memory and relocates the DOS
  kernel, MSDOS.SYS, downward to its final address. The MS-DOS disk buffer
  cache and file control block areas are allocated, and then the installable
  device drivers specified in the CONFIG.SYS file are loaded and linked into
  the system.

  After all installable device drivers have been loaded, SYSINIT closes all
  file handles and reopens the console (CON), printer (PRN), and auxiliary
  (AUX) devices as the standard input, standard output, standard error,
  standard list, and standard auxiliary devices. This allows a
  user-installed character-device driver to override the BIOS's resident
  drivers for the standard devices.

  Finally, SYSINIT calls the MS-DOS EXEC function to load the command
  interpreter, or shell. (The default shell is COMMAND.COM, but another
  shell can be substituted by means of the CONFIG.SYS file.) Once the shell
  is loaded, it displays a prompt and waits for the user to enter a command.
  MS-DOS is now ready for business, and the SYSINIT module is discarded
  (Figure 2-5).

         ┌───────────────────────────────────────────────┐
         │            ROM bootstrap routine              │
         ├───────────────────────────────────────────────┤
         │                                               │
         ├───────────────────────────────────────────────┤ ◄ Top of RAM
         │         Transient part of COMMAND.COM         │
         ├───────────────────────────────────────────────┤
         └──────────────────────┐                        │
         ┌────────────────────┐ └────────────────────────┘
         │                    └──────────────────────────┐
         │            Transient program area             │
         ├───────────────────────────────────────────────┤
         │         Resident part of COMMAND.COM          │
         ├───────────────────────────────────────────────┤
         │              Installable drivers              │
         ├───────────────────────────────────────────────┤
         │              File control blocks              │
         ├───────────────────────────────────────────────┤
         │               Disk buffer cache               │
         ├───────────────────────────────────────────────┤
         │                  DOS kernel                   │
         ├───────────────────────────────────────────────┤
         │                     BIOS                      │
         ├───────────────────────────────────────────────┤
         │                                               │
  00400H ├───────────────────────────────────────────────┤
         │             Interrupt vectors                 │
  00000H └───────────────────────────────────────────────┘

  Figure 2-5.  The final result of the MS-DOS startup process for a typical
  system. The resident portion of COMMAND.COM lies in low memory, above the
  DOS kernel. The transient portion containing the batch-file interpreter
  and intrinsic commands is placed in high memory, where it can be overlaid
  by extrinsic commands and application programs running in the transient
  program area.



────────────────────────────────────────────────────────────────────────────
Chapter 3  Structure of MS-DOS Application Programs

  Programs that run under MS-DOS come in two basic flavors: .COM programs,
  which have a maximum size of approximately 64 KB, and .EXE programs, which
  can be as large as available memory. In Intel 8086 parlance, .COM programs
  fit the tiny model, in which all segment registers contain the same value;
  that is, the code and data are mixed together. In contrast, .EXE programs
  fit the small, medium, or large model, in which the segment registers
  contain different values; that is, the code, data, and stack reside in
  separate segments. .EXE programs can have multiple code and data segments,
  which are respectively addressed by long calls and by manipulation of the
  data segment (DS) register.

  A .COM-type program resides on the disk as an absolute memory image, in a
  file with the extension .COM. The file does not have a header or any other
  internal identifying information. A .EXE program, on the other hand,
  resides on the disk in a special type of file with a unique header, a
  relocation map, a checksum, and other information that is (or can be) used
  by MS-DOS.
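
  The difference is easy to observe from a program. The following C sketch
  tests a file image for the two signature bytes, 4DH and 5AH (the ASCII
  characters MZ), that begin a .EXE header. The function name is
  hypothetical, and the sketch checks nothing else in the header.

```c
#include <stddef.h>

/* Returns 1 if the file image starts with the .EXE header signature
   bytes 4DH 5AH ("MZ"), 0 otherwise. */
static int has_exe_signature(const unsigned char *image, size_t length)
{
    return length >= 2 && image[0] == 0x4D && image[1] == 0x5A;
}
```

  A .COM file offers nothing comparable to test; its first byte is simply
  the first machine instruction.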

  Both .COM and .EXE programs are brought into memory for execution by the
  same mechanism: the EXEC function, which constitutes the MS-DOS loader.
  EXEC can be called with the filename of a program to be loaded by
  COMMAND.COM (the normal MS-DOS command interpreter), by other shells or
  user interfaces, or by another program that was previously loaded by EXEC.
  If there is sufficient free memory in the transient program area, EXEC
  allocates a block of memory to hold the new program, builds the program
  segment prefix (PSP) at its base, and then reads the program into memory
  immediately above the PSP. Finally, EXEC sets up the segment registers and
  the stack and transfers control to the program.

  When it is invoked, EXEC can be given the addresses of additional
  information, such as a command tail, file control blocks, and an
  environment block; if supplied, this information will be passed on to the
  new program. (The exact procedure for using the EXEC function in your own
  programs is discussed, with examples, in Chapter 12.)

  .COM and .EXE programs are often referred to as transient programs. A
  transient program "owns" the memory block it has been allocated and has
  nearly total control of the system's resources while it is executing. When
  the program terminates, either because it is aborted by the operating
  system or because it has completed its work and systematically performed a
  final exit back to MS-DOS, the memory block is then freed (hence the term
  transient) and can be used by the next program in line to be loaded.


The Program Segment Prefix

  A thorough understanding of the program segment prefix is vital to
  successful programming under MS-DOS. It is a reserved area, 256 bytes
  long, that is set up by MS-DOS at the base of the memory block allocated
  to a transient program. The PSP contains some linkages to MS-DOS that can
  be used by the transient program, some information MS-DOS saves for its
  own purposes, and some information MS-DOS passes to the transient
  program──to be used or not, as the program requires (Figure 3-1).

  Offset
  0000H ┌────────────────────────────────────────────────────────┐
        │                        Int 20H                         │
  0002H ├────────────────────────────────────────────────────────┤
        │            Segment, end of allocation block            │
  0004H ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  0005H ├────────────────────────────────────────────────────────┤
        │        Long call to MS-DOS function dispatcher         │
  000AH ├────────────────────────────────────────────────────────┤
        │        Previous contents of termination handler        │
        │               interrupt vector (Int 22H)               │
  000EH ├────────────────────────────────────────────────────────┤
        │ Previous contents of Ctrl-C interrupt vector (Int 23H) │
  0012H ├────────────────────────────────────────────────────────┤
        │      Previous contents of critical-error handler       │
        │               interrupt vector (Int 24H)               │
  0016H ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  002CH ├────────────────────────────────────────────────────────┤
        │          Segment address of environment block          │
  002EH ├────────────────────────────────────────────────────────┤
        │                        Reserved                        │
  005CH ├────────────────────────────────────────────────────────┤
        │             Default file control block #1              │
  006CH ├────────────────────────────────────────────────────────┤
        │             Default file control block #2              │
        │              (overlaid if FCB #1 opened)               │
  0080H ├────────────────────────────────────────────────────────┤
        └──────────────────────────┐                             │
        ┌────────────────────────┐ └─────────────────────────────┘
        │                        └───────────────────────────────┐
        │  Command tail and default disk transfer area (buffer)  │
  00FFH └────────────────────────────────────────────────────────┘

  Figure 3-1.  The structure of the program segment prefix.
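
  For programs that examine a PSP as a block of bytes, the offsets in
  Figure 3-1 can be written down as constants. The C fragment below is one
  possible rendering, not a definition from any Microsoft header file; all
  of the identifiers are invented for this example.

```c
enum psp_offset {
    PSP_INT20       = 0x00,   /* Int 20H instruction */
    PSP_MEM_TOP     = 0x02,   /* segment, end of allocation block */
    PSP_FAR_CALL    = 0x05,   /* long call to function dispatcher */
    PSP_OLD_INT22   = 0x0A,   /* previous termination handler vector */
    PSP_OLD_INT23   = 0x0E,   /* previous Ctrl-C vector */
    PSP_OLD_INT24   = 0x12,   /* previous critical-error vector */
    PSP_ENV_SEGMENT = 0x2C,   /* segment address of environment block */
    PSP_FCB1        = 0x5C,   /* default file control block #1 */
    PSP_FCB2        = 0x6C,   /* default file control block #2 */
    PSP_TAIL_LENGTH = 0x80,   /* command tail length byte, start of DTA */
    PSP_TAIL_TEXT   = 0x81,   /* first character of command tail */
    PSP_SIZE        = 0x100   /* total length of the PSP */
};

/* Reads a little-endian 16-bit word from a byte image of the PSP. */
static unsigned psp_word(const unsigned char *psp, enum psp_offset at)
{
    return psp[at] | ((unsigned)psp[at + 1] << 8);
}
```

  Reading the fields byte by byte avoids any dependence on how a particular
  C compiler pads or aligns structure members.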

  In the first versions of MS-DOS, the PSP was designed to be compatible
  with a control area that was built beneath transient programs under
  Digital Research's venerable CP/M operating system, so that programs could
  be ported to MS-DOS without extensive logical changes. Although MS-DOS has
  evolved considerably since those early days, the structure of the PSP is
  still recognizably similar to its CP/M equivalent. For example, offset
  0000H in the PSP contains a linkage to the MS-DOS process-termination
  handler, which cleans up after the program has finished its job and
  performs a final exit. Similarly, offset 0005H in the PSP contains a
  linkage to the MS-DOS function dispatcher, which performs disk operations,
  console input/output, and other such services at the request of the
  transient program. Thus, calls to PSP:0000 and PSP:0005 have the same
  effect as CALL 0000 and CALL 0005 under CP/M. (These linkages are not the
  "approved" means of obtaining these services, however.)

  The word at offset 0002H in the PSP contains the segment address of the
  top of the transient program's allocated memory block. The program can use
  this value to determine whether it should request more memory to do its
  job or whether it has extra memory that it can release for use by other
  processes.
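
  Because both values are segment addresses, the size of the block follows
  from simple paragraph arithmetic (a paragraph is 16 bytes). The sketch
  below assumes the program already knows the segment of its own PSP and
  has fetched the word at offset 0002H; the function name is hypothetical.

```c
#include <stdint.h>

enum { PARAGRAPH_BYTES = 16 };

/* Bytes between the base of the PSP and the top of the memory block,
   given the PSP's segment and the word found at PSP offset 0002H. */
static uint32_t block_bytes(uint16_t psp_segment, uint16_t top_segment)
{
    uint16_t paragraphs = (uint16_t)(top_segment - psp_segment);
    return (uint32_t)paragraphs * PARAGRAPH_BYTES;
}
```

  For example, a PSP at segment 1000H with a top-of-block value of 0A000H
  spans 9000H paragraphs, or 90000H (589,824) bytes.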

  Offsets 000AH through 0015H in the PSP contain the previous contents of
  the interrupt vectors for the termination, Ctrl-C, and critical-error
  handlers. If the transient program alters these vectors for its own
  purposes, MS-DOS restores the original values saved in the PSP when the
  program terminates.

  The word at PSP offset 002CH holds the segment address of the environment
  block, which contains a series of ASCIIZ strings (sequences of ASCII
  characters terminated by a null, or zero, byte). The environment block is
  inherited from the program that called the EXEC function to load the
  currently executing program. It contains such information as the current
  search path used by COMMAND.COM to find executable programs, the location
  on the disk of COMMAND.COM itself, and the format of the user prompt used
  by COMMAND.COM.
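
  A program can search the environment block with a few lines of C. The
  sketch below assumes the conventional layout, in which each string has
  the form NAME=value and the series ends with an additional zero byte
  (that is, an empty string). The function name env_value is invented for
  this example.

```c
#include <stddef.h>
#include <string.h>

/* Looks up `name` in an environment block: a run of "NAME=value"
   ASCIIZ strings ended by an empty string. Returns a pointer to the
   value, or NULL if the name is not present. */
static const char *env_value(const char *block, const char *name)
{
    size_t name_len = strlen(name);

    while (*block != '\0') {
        if (strncmp(block, name, name_len) == 0 && block[name_len] == '=')
            return block + name_len + 1;
        block += strlen(block) + 1;     /* skip string and its null byte */
    }
    return NULL;
}
```

  Searching a block that contains COMSPEC, PATH, and PROMPT strings for
  PATH returns a pointer to the search path itself.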

  The command tail──the remainder of the command line that invoked the
  transient program, after the program's name──is copied into the PSP
  starting at offset 0081H. The length of the command tail, not including
  the return character at its end, is placed in the byte at offset 0080H.
  Redirection or piping parameters and their associated filenames do not
  appear in the portion of the command line (the command tail) that is
  passed to the transient program, because redirection is transparent to
  applications.
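
  The following C sketch shows one way to extract the command tail from a
  256-byte image of the PSP, using the length byte at offset 0080H rather
  than searching for the return character. The function and constant names
  are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum { TAIL_LENGTH_OFFSET = 0x80, TAIL_TEXT_OFFSET = 0x81, TAIL_MAX = 127 };

/* Copies the command tail out of a 256-byte PSP image into `out` as a
   null-terminated C string and returns the number of characters copied. */
static size_t copy_command_tail(const unsigned char *psp,
                                char *out, size_t out_size)
{
    size_t length = psp[TAIL_LENGTH_OFFSET];

    if (out_size == 0)
        return 0;
    if (length > TAIL_MAX)              /* never read past offset 00FFH */
        length = TAIL_MAX;
    if (length > out_size - 1)          /* leave room for the null byte */
        length = out_size - 1;
    memcpy(out, psp + TAIL_TEXT_OFFSET, length);
    out[length] = '\0';
    return length;
}
```

  The tail normally begins with the blank that separated it from the
  program name, so a caller usually skips leading white space before
  parsing.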

  To provide compatibility with CP/M, MS-DOS parses the first two parameters
  in the command tail into two default file control blocks (FCBs) at
  PSP:005CH and PSP:006CH, under the assumption that they may be filenames.
  However, if the parameters are filenames that include a path
  specification, only the drive code will be valid in these default FCBs,
  because FCB-type file- and record-access functions do not support
  hierarchical file structures. Although the default FCBs were an aid in
  earlier years, when compatibility with CP/M was more of a concern, they
  are essentially useless in modern MS-DOS application programs that must
  provide full path support. (File control blocks are discussed in detail in
  Chapter 8 and hierarchical file structures are discussed in Chapter 9.)

  The 128-byte area from 0080H through 00FFH in the PSP also serves as the
  default disk transfer area (DTA), which is set by MS-DOS before passing
  control to the transient program. If the program does not explicitly
  change the DTA, any file read or write operations requested with the FCB
  group of function calls automatically use this area as a data buffer. This
  is rarely useful and is another facet of MS-DOS's handling of the PSP that
  is present only for compatibility with CP/M.

  ──────────────────────────────────────────────────────────────────────────
  WARNING
    Programs must not alter any part of the PSP below offset 005CH.
  ──────────────────────────────────────────────────────────────────────────


Introduction to .COM Programs

  Programs of the .COM persuasion are stored in disk files that hold an
  absolute image of the machine instructions to be executed. Because the
  files contain no relocation information, they are more compact, and are
  loaded for execution slightly faster, than equivalent .EXE files. Note
  that MS-DOS does not attempt to ascertain whether a .COM file actually
  contains executable code (there is no signature or checksum, as in the
  case of a .EXE file); it simply brings any file with the .COM extension
  into memory and jumps to it.

  Because .COM programs are loaded immediately above the program segment
  prefix and do not have a header that can specify another entry point, they
  must always have an origin of 0100H, which is the length of the PSP.
  Location 0100H must contain an executable instruction. The maximum length
  of a .COM program is 65,536 bytes, minus the length of the PSP (256 bytes)
  and a mandatory word of stack (2 bytes).
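
  Worked out, that limit is 65,536 - 256 - 2 = 65,278 bytes. The same
  arithmetic can be expressed in C; the macro and function names below are
  invented for this illustration.

```c
#define SEGMENT_BYTES     65536L   /* one 64 KB segment */
#define PSP_BYTES         256L     /* program segment prefix */
#define STACK_WORD_BYTES  2L       /* mandatory word of stack */
#define COM_MAX_BYTES     (SEGMENT_BYTES - PSP_BYTES - STACK_WORD_BYTES)

/* Returns 1 if a file of `size` bytes is within the .COM limit. */
static int fits_in_com(long size)
{
    return size >= 0 && size <= COM_MAX_BYTES;
}
```

  The constants carry an L suffix so that the subtraction is done in long
  arithmetic even on a compiler whose int is 16 bits wide.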

  When control is transferred to the .COM program from MS-DOS, all of the
  segment registers point to the PSP (Figure 3-2). The stack pointer
  register contains 0FFFEH if memory allows; otherwise, it is set as high as
  possible in memory minus 2 bytes. (MS-DOS pushes a zero word on the stack
  before entry.)

     SS:SP  ┌────────────────────────────────────────────────────────┐
            │                                                        │
            │       Stack grows downward from top of segment         │
            │                           │                            │
            │                           ▼                            │
            │                           ▲                            │
            │                           │                            │
            │                 Program code and data                  │
            │                                                        │
  CS:0100H  ├────────────────────────────────────────────────────────┤
            │                 Program segment prefix                 │
  CS:0000H  └────────────────────────────────────────────────────────┘
  DS:0000H
  ES:0000H
  SS:0000H

  Figure 3-2.  A memory image of a typical .COM-type program after loading.
  The contents of the .COM file are brought into memory just above the
  program segment prefix. Program code and data are mixed together in the
  same segment, and all segment registers contain the same value.

  Although the size of an executable .COM file can't exceed 64 KB, the
  current versions of MS-DOS allocate all of the transient program area to
  .COM programs when they are loaded. Because many such programs date from
  the early days of MS-DOS and are not necessarily "well-behaved" in their
  approach to memory management, the operating system simply makes the
  worst-case assumption and gives .COM programs everything that is
  available. If a .COM program wants to use the EXEC function to invoke
  another process, it must first shrink down its memory allocation to the
  minimum memory it needs in order to continue, taking care to protect its
  stack. (This is discussed in more detail in Chapter 12.)

  When a .COM program finishes executing, it can return control to MS-DOS by
  several means. The preferred method is Int 21H Function 4CH, which allows
  the program to pass a return code back to the program, shell, or batch
  file that invoked it. However, if the program is running under MS-DOS
  version 1, it must exit by means of Int 20H, Int 21H Function 0, or a
  NEAR RETURN. (Because a word of zero was pushed onto the stack at entry, a
  NEAR RETURN causes a transfer to PSP:0000, which contains an Int 20H
  instruction.)

  A .COM-type application can be linked together from many separate object
  modules. All of the modules must use the same code-segment name and class
  name, and the module with the entry point at offset 0100H within the
  segment must be linked first. In addition, all of the procedures within a
  .COM program should have the NEAR attribute, because all executable code
  resides in one segment.

  When you link a .COM program, the linker displays the message

  Warning: no stack segment

  This message can be ignored. The linker output is a .EXE file, which must
  be converted into a .COM file with the MS-DOS EXE2BIN utility before
  execution. You can then delete the .EXE file. (An example of this process
  is provided in Chapter 4.)

An Example .COM Program

  The HELLO.COM program listed in Figure 3-3 demonstrates the structure of
  a simple assembly-language program that is destined to become a .COM file.
  (You may find it helpful to compare this listing with the HELLO.EXE
  program later in this chapter.) Because this program is so short and
  simple, a relatively high proportion of the source code is actually
  assembler directives that do not result in any executable code.

  The NAME statement simply provides a module name for use during the
  linkage process. This aids understanding of the map that the linker
  produces. In MASM versions 5.0 and later, the module name is always the
  same as the filename, and the NAME statement is ignored.

  The PAGE command, when used with two operands, as in line 2, defines the
  length and width of the page. These default respectively to 66 lines and
  80 characters. If you use the PAGE command without any operands, a
  formfeed is sent to the printer and a heading is printed. In larger
  programs, use the PAGE command liberally to place each of your subroutines
  on separate pages for easy reading.

  The TITLE command, in line 3, specifies the text string (limited to 60
  characters) that is to be printed at the upper left corner of each page.
  The TITLE command is optional and cannot be used more than once in each
  assembly-language source file.

  ──────────────────────────────────────────────────────────────────────────
   1:          name    hello
   2:          page    55,132
   3:          title   HELLO.COM--print hello on terminal
   4:
   5:  ;
   6:  ; HELLO.COM:    demonstrates various components
   7:  ;               of a functional .COM-type assembly-
   8:  ;               language program, and an MS-DOS
   9:  ;               function call.
  10:  ;
  11:  ; Ray Duncan, May 1988
  12:  ;
  13:
  14:  stdin   equ     0               ; standard input handle
  15:  stdout  equ     1               ; standard output handle
  16:  stderr  equ     2               ; standard error handle
  17:
  18:  cr      equ     0dh             ; ASCII carriage return
  19:  lf      equ     0ah             ; ASCII linefeed
  20:
  21:
  22:  _TEXT   segment word public 'CODE'
  23:
  24:          org     100h            ; .COM files always have
  25:                                  ; an origin of 100h
  26:
  27:          assume  cs:_TEXT,ds:_TEXT,es:_TEXT,ss:_TEXT
  28:
  29:  print   proc    near            ; entry point from MS-DOS
  30:
  31:          mov     ah,40h          ; function 40h = write
  32:          mov     bx,stdout       ; handle for standard output
  33:          mov     cx,msg_len      ; length of message
  34:          mov     dx,offset msg   ; address of message
  35:          int     21h             ; transfer to MS-DOS
  36:
  37:          mov     ax,4c00h        ; exit, return code = 0
  38:          int     21h             ; transfer to MS-DOS
  39:
  40:  print   endp
  41:
  42:
  43:  msg     db      cr,lf           ; message to display
  44:          db      'Hello World!',cr,lf
  45:
  46:  msg_len equ     $-msg           ; length of message
  47:
  48:
  49:  _TEXT   ends
  50:
  51:          end     print           ; defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-3.  The HELLO.COM program listing.

  Dropping down past a few comments and EQU statements, we come to a
  declaration of a code segment that begins in line 22 with a SEGMENT
  command and ends in line 49 with an ENDS command. The label in the
  leftmost field of line 22 gives the code segment the name _TEXT. The
  operand fields at the right end of the line give the segment the
  attributes WORD, PUBLIC, and 'CODE'. (You might find it helpful to read
  the Microsoft Macro Assembler manual for detailed explanations of each
  possible segment attribute.)

  Because this program is going to be converted into a .COM file, all of its
  executable code and data areas must lie within one code segment. The
  program must also have its origin at offset 0100H (immediately above the
  program segment prefix), which is taken care of by the ORG statement
  in line 24.

  Following the ORG statement, we encounter an ASSUME statement on line
  27. The concept of ASSUME often baffles new assembly-language programmers.
  In a way, ASSUME doesn't "do" anything; it simply tells the assembler
  which segment registers you are going to use to point to the various
  segments of your program, so that the assembler can provide segment
  overrides when they are necessary. It's important to notice that the
  ASSUME statement doesn't take care of loading the segment registers with
  the proper values; it merely notifies the assembler of your intent to do
  that within the program. (Remember that, in the case of a .COM program,
  MS-DOS initializes all the segment registers before entry to point to the
  PSP.)

  Within the code segment, we come to another type of block declaration that
  begins with the PROC command on line 29 and closes with ENDP on line 40.
  These two directives declare the beginning and end of a procedure, a
  block of executable code that performs a single distinct function. The
  label in the leftmost field of the PROC statement (in this case, print)
  gives the procedure a name. The operand field gives it an attribute. If
  the procedure carries the NEAR attribute, only other code in the same
  segment can call it, whereas if it carries the FAR attribute, code located
  anywhere in the CPU's memory-addressing space can call it. In .COM
  programs, all procedures carry the NEAR attribute.

  For the purposes of this example program, I have kept the print procedure
  ridiculously simple. It calls MS-DOS Int 21H Function 40H to send the
  message Hello World! to the video screen, and calls Int 21H Function 4CH
  to terminate the program.

  The END statement in line 51 tells the assembler that it has reached the
  end of the source file and also specifies the entry point for the program.
  If the entry point is not a label located at offset 0100H, the .EXE file
  resulting from the assembly and linkage of this source program cannot be
  converted into a .COM file.


Introduction to .EXE Programs

  We have just discussed a program that was written in such a way that it
  could be assembled into a .COM file. Such a program is simple in
  structure, so a programmer who needs to put together this kind of quick
  utility can concentrate on the program logic and do a minimum amount of
  worrying about control of the assembler. However, .COM-type programs have
  some definite disadvantages, and so most serious assembly-language efforts
  for MS-DOS are written to be converted into .EXE files.

  Although .COM programs are effectively restricted to a total size of 64 KB
  for machine code, data, and stack combined, .EXE programs can be
  practically unlimited in size (up to the limit of the computer's available
  memory). .EXE programs also place the code, data, and stack in separate
  parts of the file. Although the normal MS-DOS program loader does not take
  advantage of this feature of .EXE files, the ability to load different
  parts of large programs into several separate memory fragments, as well as
  the opportunity to designate a "pure" code portion of your program that
  can be shared by several tasks, is very significant in multitasking
  environments such as Microsoft Windows.

  The MS-DOS loader always brings a .EXE program into memory immediately
  above the program segment prefix, although the order of the code, data,
  and stack segments may vary (Figure 3-4). The .EXE file has a header, or
  block of control information, with a characteristic format (Figures 3-5
  and 3-6). The size of this header varies according to the number of
  program instructions that need to be relocated at load time, but it is
  always a multiple of 512 bytes.

  Before MS-DOS transfers control to the program, the initial values of the
  code segment (CS) register and instruction pointer (IP) register are
  calculated from the entry-point information in the .EXE file header and
  the program's load address. This information derives from an END statement
  in the source code for one of the program's modules. The data segment (DS)
  and extra segment (ES) registers are made to point to the PSP so that the
  program can access the environment-block pointer, command tail, and other
  useful information contained there.
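
  The calculation described above can be sketched in C. This is only an
  illustration, not the actual MS-DOS loader code: the names entry_regs and
  exe_entry are hypothetical, and the sketch assumes that the load module
  begins 10H paragraphs (256 bytes, the size of the PSP) above the PSP
  segment. The two header values are the words at offsets 0014H and 0016H
  of the .EXE file header.

```c
#include <stdint.h>

/* Hypothetical container for the register values at entry. */
struct entry_regs {
    uint16_t cs, ip, ds, es;
};

/*
 * Sketch of how the entry values could be derived.
 *   psp_seg : segment address of the program segment prefix
 *   hdr_cs  : segment displacement of code module (header offset 0016H)
 *   hdr_ip  : contents of IP at entry (header offset 0014H)
 * Assumption: the PSP occupies 10H paragraphs, so the program image
 * starts at segment psp_seg + 10H.
 */
struct entry_regs exe_entry(uint16_t psp_seg, uint16_t hdr_cs,
                            uint16_t hdr_ip)
{
    struct entry_regs r;
    uint16_t start_seg = (uint16_t)(psp_seg + 0x10);

    r.cs = (uint16_t)(start_seg + hdr_cs);  /* displacement is relocated */
    r.ip = hdr_ip;                          /* offset is used unchanged  */
    r.ds = psp_seg;                         /* DS and ES point to the PSP */
    r.es = psp_seg;
    return r;
}
```

  With a PSP at segment 1000H and header values of zero for both fields,
  the sketch yields CS:IP = 1010:0000H, with DS and ES left at 1000H.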

     SS:SP ┌────────────────────────────────────────────────────────┐
           │                                                        │
           │                     Stack segment:                     │
           │        stack grows downward from top of segment        │
           │                           │                            │
           │                           ▼                            │
  SS:0000H ├────────────────────────────────────────────────────────┤
           │                      Data segment                      │
           ├────────────────────────────────────────────────────────┤
           │                      Program code                      │
  CS:0000H ├────────────────────────────────────────────────────────┤
           │                 Program segment prefix                 │
  DS:0000H └────────────────────────────────────────────────────────┘
  ES:0000H

  Figure 3-4.  A memory image of a typical .EXE-type program immediately
  after loading. The contents of the .EXE file are relocated and brought
  into memory above the program segment prefix. Code, data, and stack reside
  in separate segments and need not be in the order shown here. The entry
  point can be anywhere in the code segment and is specified by the END
  statement in the main module of the program. When the program receives
  control, the DS (data segment) and ES (extra segment) registers point to
  the program segment prefix; the program usually saves this value and then
  resets the DS and ES registers to point to its data area.

  The initial contents of the stack segment (SS) and stack pointer (SP)
  registers come from the header. This information derives from the
  declaration of a segment with the attribute STACK somewhere in the
  program's source code. The memory space allocated for the stack may be
  initialized or uninitialized, depending on the stack-segment definition;
  many programmers like to initialize the stack memory with a recognizable
  data pattern so that they can inspect memory dumps and determine how much
  stack space is actually used by the program.
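
  The fill-pattern technique can be illustrated with a short C sketch. The
  names STACK_FILL, stack_fill, and stack_bytes_used are hypothetical, as is
  the choice of 0A5H as the pattern. The sketch assumes that the stack grows
  downward from the top of the buffer, so bytes that were never used still
  hold the pattern at the low end.

```c
#include <stddef.h>
#include <string.h>

#define STACK_FILL 0xA5   /* arbitrary, easily recognized pattern (assumption) */

/* Fill the whole stack area with the pattern before the program runs. */
void stack_fill(unsigned char *stack, size_t size)
{
    memset(stack, STACK_FILL, size);
}

/*
 * Estimate how many bytes of the stack area have been used.
 * The stack grows downward from stack[size - 1], so the untouched
 * bytes form a run of pattern values starting at stack[0].
 * The estimate is low if the deepest bytes written happen to equal
 * the pattern value.
 */
size_t stack_bytes_used(const unsigned char *stack, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && stack[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

  In a memory dump, the same scan is done by eye: the boundary between the
  pattern and other data marks the deepest point the stack has reached.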

  When a .EXE program finishes processing, it should return control to
  MS-DOS through Int 21H Function 4CH. Other methods are available, but
  they offer no advantages and are considerably less convenient (because
  they usually require the CS register to point to the PSP).

  Byte
  offset
  0000H ┌────────────────────────────────────────────────────────┐
        │        First part of .EXE file signature (4DH)         │
  0001H ├────────────────────────────────────────────────────────┤
        │        Second part of .EXE file signature (5AH)        │
  0002H ├────────────────────────────────────────────────────────┤
        │                 Length of file MOD 512                 │
  0004H ├────────────────────────────────────────────────────────┤
        │    Size of file in 512-byte pages, including header    │
  0006H ├────────────────────────────────────────────────────────┤
        │            Number of relocation-table items            │
  0008H ├────────────────────────────────────────────────────────┤
        │      Size of header in paragraphs (16-byte units)      │
  000AH ├────────────────────────────────────────────────────────┤
        │   Minimum number of paragraphs needed above program    │
  000CH ├────────────────────────────────────────────────────────┤
        │   Maximum number of paragraphs desired above program   │
  000EH ├────────────────────────────────────────────────────────┤
        │          Segment displacement of stack module          │
  0010H ├────────────────────────────────────────────────────────┤
        │            Contents of SP register at entry            │
  0012H ├────────────────────────────────────────────────────────┤
        │                     Word checksum                      │
  0014H ├────────────────────────────────────────────────────────┤
        │            Contents of IP register at entry            │
  0016H ├────────────────────────────────────────────────────────┤
        │          Segment displacement of code module           │
  0018H ├────────────────────────────────────────────────────────┤
        │        Offset of first relocation item in file         │
  001AH ├────────────────────────────────────────────────────────┤
        │    Overlay number (0 for resident part of program)     │
  001CH ├────────────────────────────────────────────────────────┤
        │                Variable reserved space                 │
        ├────────────────────────────────────────────────────────┤
        │                    Relocation table                    │
        ├────────────────────────────────────────────────────────┤
        │                Variable reserved space                 │
        ├────────────────────────────────────────────────────────┤
        │               Program and data segments                │
        ├────────────────────────────────────────────────────────┤
        │                     Stack segment                      │
        └────────────────────────────────────────────────────────┘

  Figure 3-5.  The format of a .EXE load module.
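
  As an illustration of this layout, the following C sketch reads the fixed
  fields of the header from a buffer. The structure and function names are
  hypothetical and do not come from any Microsoft header file. The sketch
  assumes that every field from offset 0002H onward is a 16-bit word stored
  low byte first, and that the overlay number at 001AH is the last fixed
  field.

```c
#include <stddef.h>
#include <stdint.h>

#define EXE_SIGNATURE   0x5A4Du  /* bytes 4DH, 5AH read as one word */
#define EXE_FIXED_BYTES 0x1Cu    /* fixed fields end after the word at 001AH */

/* Hypothetical field names; the comments give the header offsets. */
struct exe_header {
    uint16_t signature;      /* 0000H */
    uint16_t last_page_len;  /* 0002H: length of file MOD 512 */
    uint16_t page_count;     /* 0004H: size in 512-byte pages */
    uint16_t reloc_count;    /* 0006H: relocation-table items */
    uint16_t header_paras;   /* 0008H: header size in paragraphs */
    uint16_t min_alloc;      /* 000AH */
    uint16_t max_alloc;      /* 000CH */
    uint16_t init_ss;        /* 000EH: stack segment displacement */
    uint16_t init_sp;        /* 0010H */
    uint16_t checksum;       /* 0012H */
    uint16_t init_ip;        /* 0014H */
    uint16_t init_cs;        /* 0016H: code segment displacement */
    uint16_t reloc_offset;   /* 0018H: first relocation item */
    uint16_t overlay;        /* 001AH */
};

/* Read a 16-bit little-endian word at byte offset off. */
static uint16_t word_at(const unsigned char *p, size_t off)
{
    return (uint16_t)(p[off] | (p[off + 1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or lacks "MZ". */
int exe_header_parse(const unsigned char *buf, size_t len,
                     struct exe_header *h)
{
    if (len < EXE_FIXED_BYTES || word_at(buf, 0x00) != EXE_SIGNATURE)
        return -1;

    h->signature     = word_at(buf, 0x00);
    h->last_page_len = word_at(buf, 0x02);
    h->page_count    = word_at(buf, 0x04);
    h->reloc_count   = word_at(buf, 0x06);
    h->header_paras  = word_at(buf, 0x08);
    h->min_alloc     = word_at(buf, 0x0A);
    h->max_alloc     = word_at(buf, 0x0C);
    h->init_ss       = word_at(buf, 0x0E);
    h->init_sp       = word_at(buf, 0x10);
    h->checksum      = word_at(buf, 0x12);
    h->init_ip       = word_at(buf, 0x14);
    h->init_cs       = word_at(buf, 0x16);
    h->reloc_offset  = word_at(buf, 0x18);
    h->overlay       = word_at(buf, 0x1A);
    return 0;
}
```

  Reading the fields byte by byte, rather than overlaying the structure on
  the buffer, keeps the sketch independent of the compiler's structure
  packing and of the byte order of the machine it runs on.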

  The input to the linker for a .EXE-type program can be many separate
  object modules. Each module can use a unique code-segment name, and the
  procedures can carry either the NEAR or the FAR attribute, depending on
  naming conventions and the size of the executable code. The programmer
  must take care that the modules linked together contain only one segment
  with the STACK attribute and only one entry point defined with an END
  assembler directive. The output from the linker is a file with a .EXE
  extension. This file can be executed immediately.

  ──────────────────────────────────────────────────────────────────────────
  C>DUMP HELLO.EXE
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  4D 5A 28 00 02 00 01 00 20 00 09 00 FF FF 03 00  MZ(..... .......
  0010  80 00 20 05 00 00 00 00 1E 00 00 00 01 00 01 00  .. .............
  0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0050  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        .
        .
        .
  0200  B8 01 00 8E D8 B4 40 BB 01 00 B9 10 00 90 BA 08  ......@.........
  0210  00 CD 21 B8 00 4C CD 21 0D 0A 48 65 6C 6C 6F 20  ..!..L.!..Hello
  0220  57 6F 72 6C 64 21 0D 0A                          World!..
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-6.  A hex dump of the HELLO.EXE program, demonstrating the
  contents of a simple .EXE load module. Note the following interesting
  values: the .EXE signature in bytes 0000H and 0001H, the number of
  relocation-table items in bytes 0006H and 0007H, the minimum extra memory
  allocation (MIN_ALLOC) in bytes 000AH and 000BH, the maximum extra memory
  allocation (MAX_ALLOC) in bytes 000CH and 000DH, and the initial IP
  (instruction pointer) register value in bytes 0014H and 0015H. See also
  Figure 3-5.
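
  The size fields in this dump can be checked with a little arithmetic,
  sketched here in C. The function names are hypothetical. The sketch
  assumes the usual reading of the two length fields: when the
  length-MOD-512 word is nonzero, the last 512-byte page is only partly
  filled.

```c
#include <stdint.h>

/* File length implied by the header words at 0004H and 0002H. */
uint32_t exe_file_length(uint16_t pages, uint16_t len_mod_512)
{
    uint32_t len;

    if (pages == 0)
        return 0;                    /* not a valid page count */
    len = (uint32_t)pages * 512u;
    if (len_mod_512 != 0)
        len -= 512u - len_mod_512;   /* last page is only partly used */
    return len;
}

/* Header size in bytes from the paragraph count at 0008H. */
uint32_t exe_header_bytes(uint16_t header_paras)
{
    return (uint32_t)header_paras * 16u;
}
```

  For HELLO.EXE, the page count is 2 and the remainder is 28H, which gives
  a file length of 552 bytes (228H) and agrees with the last byte of the
  dump at offset 0227H. The header size of 20H paragraphs is 512 bytes,
  which is why the program's code begins at offset 0200H, leaving 40 bytes
  of code and data.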

An Example .EXE Program

  The HELLO.EXE program in Figure 3-7 demonstrates the fundamental
  structure of an assembly-language program that is destined to become a
  .EXE file. At minimum, it should have a module name, a code segment, a
  stack segment, and a primary procedure that receives control of the
  computer from MS-DOS after the program is loaded. The HELLO.EXE program
  also contains a data segment to provide a more complete example.

  The NAME, TITLE, and PAGE directives were covered in the HELLO.COM example
  program and are used in the same manner here, so we'll move to the first
  new item of interest. After a few comments and EQU statements, we come to
  a declaration of a code segment that begins on line 21 with a SEGMENT
  command and ends on line 41 with an ENDS command. As in the HELLO.COM
  example program, the label in the leftmost field of the line gives the
  code segment the name _TEXT. The operand fields at the right end of the
  line give the attributes WORD, PUBLIC, and 'CODE'.

  Following the code-segment declaration, we find an ASSUME statement on
  line 23. Notice that, unlike the equivalent statement in the HELLO.COM
  program, the ASSUME statement in this program specifies several different
  segment names. Again, remember that this statement has no direct effect on
  the contents of the segment registers but affects only the operation of
  the assembler itself.

  ──────────────────────────────────────────────────────────────────────────
   1:          name    hello
   2:          page    55,132
   3:          title   HELLO.EXE--print Hello on terminal
   4:  ;
   5:  ; HELLO.EXE:    demonstrates various components
   6:  ;               of a functional .EXE-type assembly-
   7:  ;               language program, use of segments,
   8:  ;               and an MS-DOS function call.
   9:  ;
  10:  ; Ray Duncan, May 1988
  11:  ;
  12:
  13:  stdin   equ     0               ; standard input handle
  14:  stdout  equ     1               ; standard output handle
  15:  stderr  equ     2               ; standard error handle
  16:
  17:  cr      equ     0dh             ; ASCII carriage return
  18:  lf      equ     0ah             ; ASCII linefeed
  19:
  20:
  21:  _TEXT   segment word public 'CODE'
  22:
  23:          assume  cs:_TEXT,ds:_DATA,ss:STACK
  24:
  25:  print   proc    far             ; entry point from MS-DOS
  26:
  27:          mov     ax,_DATA        ; make our data segment
  28:          mov     ds,ax           ; addressable...
  29:
  30:          mov     ah,40h          ; function 40h = write
  31:          mov     bx,stdout       ; standard output handle
  32:          mov     cx,msg_len      ; length of message
  33:          mov     dx,offset msg   ; address of message
  34:          int     21h             ; transfer to MS-DOS
  35:
  36:          mov     ax,4c00h        ; exit, return code = 0
  37:          int     21h             ; transfer to MS-DOS
  38:
  39:  print   endp
  40:
  41:  _TEXT   ends
  42:
  43:
  44:  _DATA   segment word public 'DATA'
  45:
  46:  msg     db      cr,lf           ; message to display
  47:          db      'Hello World!',cr,lf
  48:
  49:  msg_len equ     $-msg           ; length of message
  50:
  51:  _DATA   ends
  52:
  53:
  54:  STACK   segment para stack 'STACK'
  55:
  56:          db      128 dup (?)
  57:
  58:  STACK   ends
  59:
  60:          end     print           ; defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-7.  The HELLO.EXE program listing.

  Within the code segment, the main print procedure is declared by the PROC
  command on line 25 and closed with ENDP on line 39. Because the procedure
  resides in a .EXE file, we have given it the FAR attribute as an example,
  but the attribute is really irrelevant because the program is so small and
  the procedure is not called by anything else in the same program.

  The print procedure first initializes the DS register, as indicated in the
  earlier ASSUME statement, loading it with a value that causes it to point
  to the base of the data area. (MS-DOS automatically sets up the CS and SS
  registers.) Next, the procedure uses MS-DOS Int 21H Function 40H to
  display the message Hello World! on the screen, just as in the HELLO.COM
  program. Finally, the procedure exits back to MS-DOS with an Int 21H
  Function 4CH on lines 36 and 37, passing a return code of zero (which by
  convention means success).

  Lines 44 through 51 declare a data segment named _DATA, which contains the
  variables and constants the program will use. If the various modules of a
  program contain multiple data segments with the same name, the linker will
  collect them and place them in the same physical memory segment.

  Lines 54 through 58 establish a stack segment; PUSH and POP instructions
  will access this area of scratch memory. Before MS-DOS transfers control
  to a .EXE program, it sets up the SS and SP registers according to the
  declared size and location of the stack segment. Be sure to allow enough
  room for the maximum stack depth that can occur at runtime, plus a safe
  number of extra words for registers pushed onto the stack during an MS-DOS
  service call. If the stack overflows, it may damage your other code and
  data segments and cause your program to behave strangely or even to crash
  altogether!

  The END statement on line 60 winds up our brief HELLO.EXE program, telling
  the assembler that it has reached the end of the source file and providing
  the label of the program's point of entry from MS-DOS.

  The differences between .COM and .EXE programs are summarized in Figure
  3-8.

                     .COM program               .EXE program
  ──────────────────────────────────────────────────────────────────────────
  Maximum size       65,536 bytes minus 256     No limit
                     bytes for PSP and 2 bytes
                     for stack

  Entry point        PSP:0100H                  Defined by END statement

  AL at entry        00H if default FCB #1 has  Same
                     valid drive, 0FFH if
                     invalid drive

  AH at entry        00H if default FCB #2 has  Same
                     valid drive, 0FFH if
                     invalid drive

  CS at entry        PSP                        Segment containing module
                                                with entry point

  IP at entry        0100H                      Offset of entry point within
                                                its segment

  DS at entry        PSP                        PSP

  ES at entry        PSP                        PSP

  SS at entry        PSP                        Segment with STACK attribute

  SP at entry        0FFFEH or top word in      Size of segment defined with
                     available memory,          STACK attribute
                     whichever is lower

  Stack at entry     Zero word                  Initialized or uninitialized

  Stack size         65,536 bytes minus 256     Defined in segment with
                     bytes for PSP and size of  STACK attribute
                     executable code and data

  Subroutine calls   Usually NEAR               NEAR or FAR

  Exit method        Int 21H Function 4CH       Int 21H Function 4CH
                     preferred, NEAR RET if     preferred
                     MS-DOS version 1

  Size of file       Exact size of program      Size of program plus header
                                                (multiple of 512 bytes)
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-8.  Summary of the differences between .COM and .EXE programs,
  including their entry conditions.
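
  The .COM size limit in the first row of the table is simple arithmetic,
  sketched here in C. The macro and function names are hypothetical. The
  constants are held as unsigned long values because 65,536 does not fit in
  a 16-bit int.

```c
#define SEGMENT_BYTES 65536UL  /* one 64 KB segment */
#define PSP_BYTES       256UL  /* program segment prefix */
#define MIN_STACK_BYTES   2UL  /* the zero word on the stack at entry */

/* Largest code-plus-data image that a .COM file can hold. */
unsigned long com_max_image(void)
{
    return SEGMENT_BYTES - PSP_BYTES - MIN_STACK_BYTES;
}

/* Nonzero if an image of the given size would fit in a .COM file. */
int com_image_fits(unsigned long image_bytes)
{
    return image_bytes <= com_max_image();
}
```

  The limit works out to 65,278 bytes. A real program needs more than the
  2-byte minimum of stack, so the practical limit is lower still.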


More About Assembly-Language Programs

  Now that we've looked at working examples of .COM and .EXE
  assembly-language programs, let's backtrack and discuss their elements a
  little more formally. The following discussion is based on the Microsoft
  Macro Assembler, hereafter referred to as MASM. If you are familiar with
  MASM and are an experienced assembly-language programmer, you may want to
  skip this section.

  MASM programs can be thought of as having three structural levels:

  ■  The module level

  ■  The segment level

  ■  The procedure level

  Modules are simply chunks of source code that can be independently
  maintained and assembled. Segments are physical groupings of like items
  (machine code or data) within a program and a corresponding segregation of
  dissimilar items. Procedures are functional subdivisions of an executable
  program──routines that carry out a particular task.

Program Modules

  Under MS-DOS, the module-level structure consists of files containing the
  source code for individual routines. Each source file is translated by the
  assembler into a relocatable object module. An object module can reside
  alone in an individual file or with many other object modules in an
  object-module library of frequently used or related routines. The
  Microsoft Object Linker (LINK) combines object-module files, often with
  additional object modules extracted from libraries, into an executable
  program file.

  Using modules and object-module libraries reduces the size of your
  application source files (and vastly increases your productivity), because
  these files need not contain the source code for routines they have in
  common with other programs. This technique also allows you to maintain the
  routines more easily, because you need to alter only one copy of their
  source code stored in one place, instead of many copies stored in
  different applications. When you improve (or fix) one of these routines,
  you can simply reassemble it, put its object module back into the library,
  relink all of the programs that use the routine, and voilà: instant
  upgrade.

Program Segments

  The term segment refers to two discrete programming concepts: physical
  segments and logical segments.

  Physical segments are 64 KB blocks of memory. The Intel 8086/8088 and
  80286 microprocessors have four segment registers, which are essentially
  used as pointers to these blocks. (The 80386 has six segment registers,
  which are a superset of those found on the 8086/8088 and 80286.) Each
  segment register can point to the bottom of a different 64 KB area of
  memory. Thus, a program can address any location in memory by appropriate
  manipulation of the segment registers, but the maximum amount of memory
  that it can address simultaneously is 256 KB.
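
  The arithmetic behind these figures can be sketched in C. The names are
  hypothetical, and the address formula is background on 8086 real-mode
  addressing rather than something stated in this section: a segment
  register holds a paragraph (16-byte) address, and the processor adds a
  16-bit offset to it.

```c
#include <stdint.h>

#define SEGMENT_BYTES 0x10000UL  /* 64 KB reachable from one segment register */
#define SEGMENT_REGS  4          /* CS, DS, ES, and SS on the 8086/8088 and 80286 */

/*
 * Physical address formed from a segment:offset pair in real mode.
 * The sketch does not model address wraparound at 1 MB on the 8086.
 */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Memory addressable at one time without reloading a segment register. */
uint32_t max_simultaneous_bytes(void)
{
    return (uint32_t)(SEGMENT_REGS * SEGMENT_BYTES);
}
```

  Four segment registers, each spanning 64 KB, give the 256 KB total
  (262,144 bytes) mentioned above, provided the four areas do not overlap.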

  As we discussed earlier in the chapter, .COM programs assume that all four
  segment registers always point to the same place──the bottom of the
  program. Thus, they are limited to a maximum size of 64 KB. .EXE programs,
  on the other hand, can address many different physical segments and can
  reset the segment registers to point to each segment as it is needed.
  Consequently, the only practical limit on the size of a .EXE program is
  the amount of available memory. The example programs throughout the
  remainder of this book focus on .EXE programs.

  Logical segments are the program components. A minimum of three logical
  segments must be declared in any .EXE program: a code segment, a data
  segment, and a stack segment. Programs with more than 64 KB of code or
  data have more than one code or data segment. The routines or data that
  are used most frequently are put into the primary code and data segments
  for speed, and routines or data that are used less frequently are put into
  secondary code and data segments.

  Segments are declared with the SEGMENT and ENDS directives in the
  following form:

  name   SEGMENT attributes
  .
  .
  .
  name   ENDS

  The attributes of a segment include its align type (BYTE, WORD, or PARA),
  combine type (PUBLIC, PRIVATE, COMMON, or STACK), and class type. The
  segment attributes are used by the linker when it is combining logical
  segments to create the physical segments of an executable program. Most of
  the time, you can get by just fine using a small selection of attributes
  in a rather stereotypical way. However, if you want to use the full range
  of attributes, you might want to read the detailed explanation in the MASM
  manual.

  Programs are classified into one memory model or another based on the
  number of their code and data segments. The most commonly used memory
  model for assembly-language programs is the small model, which has one
  code and one data segment, but you can also use the medium, compact, and
  large models (Figure 3-9). (Two additional models exist with which we
  will not be concerning ourselves further: the tiny model, which consists
  of intermixed code and data in a single segment──for example, a .COM file
  under MS-DOS; and the huge model, which is supported by the Microsoft C
  Optimizing Compiler and which allows use of data structures larger than 64
  KB.)

  Model                    Code segments           Data segments
  ──────────────────────────────────────────────────────────────────────────
  Small                    One                     One
  Medium                   Multiple                One
  Compact                  One                     Multiple
  Large                    Multiple                Multiple
  ──────────────────────────────────────────────────────────────────────────

  Figure 3-9.  Memory models commonly used in assembly-language and C
  programs.

  For each memory model, Microsoft has established certain segment and class
  names that are used by all its high-level-language compilers (Figure
  3-10). Because segment names are arbitrary, you may as well adopt the
  Microsoft conventions. Their use will make it easier for you to integrate
  your assembly-language routines into programs written in languages such as
  C, or to use routines from high-level-language libraries in your
  assembly-language programs.

  Another important Microsoft high-level-language convention is to use the
  GROUP directive to name the near data segment (the segment the program
  expects to address with offsets from the DS register) and the stack
  segment as members of DGROUP (the automatic data group), a special name
  recognized by the linker and also by the program loaders in Microsoft
  Windows and Microsoft OS/2. The GROUP directive causes logical segments
  with different names to be combined into a single physical segment so that
  they can be addressed using the same segment base address. In C programs,
  DGROUP also contains the local heap, which is used by the C runtime
  library for dynamic allocation of small amounts of memory.

  Memory      Segment      Align       Combine     Class        Group
  model       name         type        type        type
  ──────────────────────────────────────────────────────────────────────────
  Small       _TEXT        WORD        PUBLIC      CODE
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Medium      module_TEXT  WORD        PUBLIC      CODE
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Compact     _TEXT        WORD        PUBLIC      CODE
              data         PARA        PRIVATE     FAR_DATA
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP

  Large       module_TEXT  WORD        PUBLIC      CODE
              .
              .
              .
              data         PARA        PRIVATE     FAR_DATA
              .
              .
              .
              _DATA        WORD        PUBLIC      DATA         DGROUP
              STACK        PARA        STACK       STACK        DGROUP
  ──────────────────────────────────────────────────────────────────────────


  Figure 3-10.  Segments, groups, and classes for the standard memory models
  as used with assembly-language programs. The Microsoft C Optimizing
  Compiler and other high-level-language compilers use a superset of these
  segments and classes.

  For pure assembly-language programs that will run under MS-DOS, you can
  ignore DGROUP. However, if you plan to integrate assembly-language
  routines and programs written in high-level languages, you'll want to
  follow the Microsoft DGROUP convention. For example, if you are planning
  to link routines from a C library into an assembly-language program, you
  should include the line

  DGROUP group _DATA,STACK

  near the beginning of the program.

  The final Microsoft convention of interest in creating .EXE programs is
  segment order. The high-level compilers assume that code segments always
  come first, followed by far data segments, followed by the near data
  segment, with the stack and heap last. This order won't concern you much
  until you begin integrating assembly-language code with routines from
  high-level-language libraries, but it is easiest to learn to use the
  convention right from the start.

Program Procedures

  The procedure level of program structure is partly real and partly
  conceptual. Procedures are basically just a fancy guise for subroutines.

  Procedures within a program are declared with the PROC and ENDP directives
  in the following form:

  name   PROC attribute
  .
  .
  .
         RET
  name   ENDP

  The attribute carried by a PROC declaration, which is either NEAR or FAR,
  tells the assembler what type of call you expect to use to enter the
  procedure──that is, whether the procedure will be called from other
  routines in the same segment or from routines in other segments. When the
  assembler encounters a RET instruction within the procedure, it uses the
  attribute information to generate the correct opcode for either a near
  (intra-segment) or far (inter-segment) return.
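
  The assembler's choice can be pictured with a small C sketch. The names
  proc_attr and ret_opcode are hypothetical; the opcode values C3H and CBH
  are the 8086 encodings of the plain near and far return instructions,
  given here as background rather than taken from this chapter.

```c
/* Hypothetical stand-in for the attribute on a PROC directive. */
enum proc_attr { PROC_NEAR, PROC_FAR };

/* Opcode emitted for a plain RET inside a procedure with this attribute. */
unsigned char ret_opcode(enum proc_attr attr)
{
    return (attr == PROC_FAR) ? 0xCB   /* far return: pops IP, then CS */
                              : 0xC3;  /* near return: pops IP only */
}
```

  A procedure entered by a far call but ended with a near return would
  leave the caller's CS value on the stack, which is why the attribute and
  the type of call must agree.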

  Each program should have a main procedure that receives control from
  MS-DOS. You specify the entry point for the program by including the name
  of the main procedure in the END statement in one of the program's source
  files. The main procedure's attribute (NEAR or FAR) is really not too
  important, because the program returns control to MS-DOS with a function
  call rather than a RET instruction. However, by convention, most
  programmers assign the main procedure the FAR attribute anyway.

  You should break the remainder of the program into procedures in an
  orderly way, with each procedure performing a well-defined single
  function, returning its results to its caller, and avoiding actions that
  have global effects within the program. Ideally, procedures invoke each
  other only by CALL instructions, have only one entry point and one exit
  point, and always exit by means of a RET instruction, never by jumping to
  some other location within the program.

  For ease of understanding and maintenance, a procedure should not exceed
  one page (about 60 lines); if it is longer than a page, it is probably too
  complex and you should delegate some of its function to one or more
  subsidiary procedures. You should preface the source code for each
  procedure with a detailed comment that states the procedure's calling
  sequence, results returned, registers affected, and any data items
  accessed or modified. The effort invested in making your procedures
  compact, clean, flexible, and well-documented will be repaid many times
  over when you reuse the procedures in other programs.



────────────────────────────────────────────────────────────────────────────
Chapter 4  MS-DOS Programming Tools

  Preparing a new program to run under MS-DOS is an iterative process with
  four basic steps:

  ■  Use of a text editor to create or modify an ASCII source-code file

  ■  Use of an assembler or high-level-language compiler (such as the
     Microsoft Macro Assembler or the Microsoft C Optimizing Compiler) to
     translate the source file into relocatable object code

  ■  Use of a linker to transform the relocatable object code into an
     executable MS-DOS load module

  ■  Use of a debugger to methodically test and debug the program

  Additional utilities the MS-DOS software developer may find necessary or
  helpful include the following:

  ■  LIB, which creates and maintains object-module libraries

  ■  CREF, which generates a cross-reference listing

  ■  EXE2BIN, which converts .EXE files to .COM files

  ■  MAKE, which compares dates of files and carries out operations based on
     the result of the comparison

  This chapter gives an operational overview of the Microsoft programming
  tools for MS-DOS, including the assembler, the C compiler, the linker, and
  the librarian. In general, the information provided here also applies to
  the IBM programming tools for MS-DOS, which are really the Microsoft
  products with minor variations and different version numbers. Even if your
  preferred programming language is not C or assembly language, you will
  need at least a passing familiarity with these tools because all of the
  examples in the IBM and Microsoft DOS reference manuals are written in one
  of these languages.

  The survey in this chapter, together with the example programs and
  reference section elsewhere in the book, should provide the experienced
  programmer with sufficient information to immediately begin writing useful
  programs. Readers who do not have a background in C, assembly language, or
  the Intel 80x86 microprocessor architecture should refer to the tutorial
  and reference works listed at the end of this chapter.


File Types

  The MS-DOS programming tools can create and process many different file
  types. The following extensions are used by convention for these files:

  Extension  File type
  ──────────────────────────────────────────────────────────────────────────
  .ASM       Assembly-language source file

  .C         C source file

  .COM       MS-DOS executable load module that does not require relocation
             at runtime

  .CRF       Cross-reference information file produced by the assembler for
             processing by CREF.EXE

  .DEF       Module-definition file describing a program's segment behavior
             (MS OS/2 and Microsoft Windows programs only; not relevant to
             normal MS-DOS applications)

  .EXE       MS-DOS executable load module that requires relocation at
             runtime

  .H         C header file containing C source code for constants, macros,
             and functions; merged into another C program with the #include
             directive

  .INC       Include file for assembly-language programs, typically
             containing macros and/or equates for systemwide values such as
             error codes

  .LIB       Object-module library file made up of one or more .OBJ files;
             indexed and manipulated by LIB.EXE

  .LST       Program listing, produced by the assembler, that includes
             memory locations, machine code, the original program text, and
             error messages

  .MAP       Listing of symbols and their locations within a load module;
             produced by the linker

  .OBJ       Relocatable-object-code file produced by an assembler or
             compiler

  .REF       Cross-reference listing produced by CREF.EXE from the
             information in a .CRF file
  ──────────────────────────────────────────────────────────────────────────



The Microsoft Macro Assembler

  The Microsoft Macro Assembler (MASM) is distributed as the file MASM.EXE.
  When beginning a program translation, MASM needs the following
  information:

  ■  The name of the file containing the source program

  ■  The filename for the object program to be created

  ■  The destination of the program listing

  ■  The filename for the information that is later processed by the
     cross-reference utility (CREF.EXE)

  You can invoke MASM in two ways. If you enter the name of the assembler
  alone, it prompts you for the names of each of the various input and
  output files. The assembler supplies reasonable defaults for all the
  responses except the source filename, as shown in the following example:

  C>MASM  <Enter>

  Microsoft (R) Macro Assembler Version 5.10
  Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.

  Source filename [.ASM]: HELLO  <Enter>
  Object filename [HELLO.OBJ]:  <Enter>
  Source listing  [NUL.LST]:  <Enter>
  Cross-reference [NUL.CRF]:  <Enter>

    49006 Bytes symbol space free

        0 Warning Errors
        0 Severe Errors

  C>

  You can use a logical device name (such as PRN or COM1) at any of the MASM
  prompts to send that output of the assembler to a character device rather
  than a file. Note that the default for the listing and cross-reference
  files is the NUL device──that is, no file is created. If you end any
  response with a semicolon, MASM assumes that the remaining responses are
  all to be the default.

  A more efficient way to use MASM is to supply all parameters in the
  command line, as follows:

    MASM [options] source,[object],[listing],[crossref]

  For example, the following command lines are equivalent to the preceding
  interactive session:

  C>MASM HELLO,,NUL,NUL  <Enter>

  or

  C>MASM HELLO;  <Enter>

  These commands use the file HELLO.ASM as the source, generate the
  object-code file HELLO.OBJ, and send the listing and cross-reference files
  to the bit bucket.

  MASM accepts several optional switches in the command line to control
  code generation and output files. Figure 4-1 lists the switches accepted
  by MASM version 5.1. As shown in the following example, you can put
  frequently used options in a MASM environment variable, where they will be
  found automatically by the assembler:

  C>SET MASM=/T /Zi  <Enter>

  The switches in the environment variable will be overridden by any that
  you enter in the command line.

  In other versions of the Microsoft Macro Assembler, additional or fewer
  switches may be available. For exact instructions, see the manual for the
  version of MASM that you are using.

  Switch     Meaning
  ──────────────────────────────────────────────────────────────────────────
  /A         Arrange segments in alphabetic order.
  /Bn        Set size of source-file buffer (in KB).
  /C         Force creation of a cross-reference (.CRF) file.
  /D         Produce listing on both passes (to find phase errors).
  /Dsymbol   Define symbol as a null text string (symbol can be referenced
             by conditional assembly directives in file).
  /E         Assemble for 80x87 numeric coprocessor emulator using IEEE
             real-number format.
  /Ipath     Set search path for include files.
  /L         Force creation of a program-listing file.
  /LA        Force listing of all generated code.
  /ML        Preserve case sensitivity in all names (uppercase names
             distinct from their lowercase equivalents).
  /MX        Preserve lowercase in external names only (names defined with
             PUBLIC or EXTRN directives).
  /MU        Convert all lowercase names to uppercase.
  /N         Suppress generation of tables of macros, structures, records,
             segments, groups, and symbols at the end of the listing.
  /P         Check for impure code in 80286/80386 protected mode.
  /S         Arrange segments in order of occurrence (default).
  /T         "Terse" mode; suppress all messages unless errors are
             encountered during the assembly.
  /V         "Verbose" mode; report number of lines and symbols at end of
             assembly.
  /Wn        Set error display (warning) level; n=0─2.
  /X         Force listing of false conditionals.
  /Z         Display source lines containing errors on the screen.
  /Zd        Include line-number information in .OBJ file.
  /Zi        Include line-number and symbol information in .OBJ file.
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-1.  Microsoft Macro Assembler version 5.1 switches.

  MASM allows you to override the default extensions on any file──a feature
  that can be rather dangerous. For example, if in the preceding example you
  had responded to the Object filename prompt with HELLO.ASM, the assembler
  would have accepted the entry without comment and destroyed your source
  file. This is not too likely to happen in the interactive command mode,
  but you must be very careful with file extensions when MASM is used in a
  batch file.
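
  One way to guard against this mistake is to check the filenames before
  MASM is ever invoked. The following is only a minimal sketch in C, not part
  of MASM or any Microsoft tool; the function names same_dos_name and
  masm_names_are_safe are hypothetical, and the comparison looks at the
  strings alone without resolving drives or directories.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Compare two filenames without regard to case, as MS-DOS does.
   Returns 1 if the strings match, 0 otherwise. This sketch compares
   the text only; it does not resolve drives or directories. */
static int same_dos_name(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Hypothetical guard: returns 1 if the names can be passed to MASM,
   0 if the object (output) name would overwrite the source file. */
int masm_names_are_safe(const char *source, const char *object)
{
    return !same_dos_name(source, object);
}
```

  A build utility could call masm_names_are_safe("HELLO.ASM", "HELLO.OBJ")
  before running the assembler and refuse to continue when the result is 0.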


The Microsoft C Optimizing Compiler

  The Microsoft C Optimizing Compiler consists of three executable files──
  C1.EXE, C2.EXE, and C3.EXE──that implement the C preprocessor, language
  translator, code generator, and code optimizer. An additional control
  program, CL.EXE, executes the three compiler files in order, passing each
  the necessary information about filenames and compilation options.

  Before using the C compiler and the linker, you need to set up four
  environment variables:

  Variable                 Action
  ──────────────────────────────────────────────────────────────────────────
  PATH=path                Specifies the location of the three executable C
                           compiler files (C1, C2, and C3) if they are not
                           in the current directory; used by CL.EXE.

  INCLUDE=path             Specifies the location of #include files (default
                           extension .H) that are not found in the current
                           directory.

  LIB=path                 Specifies the location(s) for object-code
                           libraries that are not found in the current
                           directory.

  TMP=path                 Specifies the location for temporary working
                           files created by the C compiler and linker.
  ──────────────────────────────────────────────────────────────────────────

  CL.EXE does not support an interactive mode or response files. You always
  invoke it with a command line of the following form:

    CL [options] file [file ...]

  You may list any number of files──if a file has a .C extension, it will be
  compiled into a relocatable-object-module (.OBJ) file. Ordinarily, if the
  compiler encounters no errors, it automatically passes all resulting .OBJ
  files and any additional .OBJ files specified in the command line to the
  linker, along with the names of the appropriate runtime libraries.
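
  The rule just described can be sketched in C as a small classification
  routine. This is an illustration only, not the logic of CL.EXE itself; the
  enumeration and the function name classify_cl_file are hypothetical, and
  any extension other than .C or .OBJ is simply reported as not covered by
  the description above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum cl_action { CL_COMPILE, CL_PASS_TO_LINKER, CL_NOT_COVERED };

/* Case-insensitive comparison of an extension (including the period)
   against an uppercase pattern such as ".OBJ". */
static int ext_is(const char *ext, const char *upper)
{
    while (*ext != '\0' && *upper != '\0') {
        if (toupper((unsigned char)*ext) != (unsigned char)*upper)
            return 0;
        ext++;
        upper++;
    }
    return *ext == '\0' && *upper == '\0';
}

/* Decide what happens to one filename from the CL command line:
   .C files are compiled, .OBJ files go straight to the linker. */
enum cl_action classify_cl_file(const char *name)
{
    const char *dot;

    assert(name != NULL);
    dot = strrchr(name, '.');
    /* No period, or the last period belongs to a directory name. */
    if (dot == NULL || strchr(dot, '\\') != NULL)
        return CL_NOT_COVERED;
    if (ext_is(dot, ".C"))
        return CL_COMPILE;
    if (ext_is(dot, ".OBJ"))
        return CL_PASS_TO_LINKER;
    return CL_NOT_COVERED;
}
```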

  The C compiler has many optional switches controlling its memory models,
  output files, code generation, and code optimization. These are summarized
  in Figure 4-2. The C compiler's arcane switch syntax is derived largely
  from UNIX/XENIX, so don't expect it to make any sense.

  Switch                   Meaning
  ──────────────────────────────────────────────────────────────────────────
  /Ax                      Select memory model:
                           C = compact model
                           H = huge model
                           L = large model
                           M = medium model
                           S = small model (default)
  /c                       Compile only; do not invoke linker.
  /C                       Do not strip comments.
  /D<name>[=text]          Define macro.
  /E                       Send preprocessor output to standard output.
  /EP                      Send preprocessor output to standard output
                           without line numbers.
  /F<n>                    Set stack size (in hexadecimal bytes).
  /Fa [filename]           Generate assembly listing.
  /Fc [filename]           Generate mixed source/object listing.
  /Fe [filename]           Force executable filename.
  /Fl [filename]           Generate object listing.
  /Fm [filename]           Generate map file.
  /Fo [filename]           Force object-module filename.
  /FPx                     Select floating-point control:
                           a = calls with alternate math library
                           c = calls with emulator library
                           c87 = calls with 8087 library
                           i = in-line with emulator (default)
                           i87 = in-line with 8087
  /Fs [filename]           Generate source listing.
  /Gx                      Select code generation:
                           0 = 8086 instructions (default)
                           1 = 186 instructions
                           2 = 286 instructions
                           c = Pascal style function calls
                           s = no stack checking
                           t[n] = data size threshold
  /H<n>                    Specify external name length.
  /I<path>                 Specify additional #include path.
  /J                       Specify default char type as unsigned.
  /link [options]          Pass switches and library names to linker.
  /Ox                      Select optimization:
                           a = ignore aliasing
                           d = disable optimizations
                           i = enable intrinsic functions
                           l = enable loop optimizations
                           n = disable "unsafe" optimizations
                           p = enable precision optimizations
                           r = disable in-line return
                           s = optimize for space
                           t = optimize for speed (default)
                           w = ignore aliasing except across function
                           calls
                           x = enable maximum optimization (equivalent to
                           /Oailt /Gs)
  /P                       Send preprocessor output to file.
  /Sx                      Select source-listing control:
                           l<columns> = set line width
                           p<lines> = set page length
                           s<string> = set subtitle string
                           t<string> = set title string
  /Tc<file>                Compile file without .C extension.
  /u                       Remove all predefined macros.
  /U<name>                 Remove specified predefined macro.
  /V<string>               Set version string.
  /W<n>                    Set warning level (0─3).
  /X                       Ignore "standard places" for include files.
  /Zx                      Select miscellaneous compilation control:
                           a = disable extensions
                           c = make Pascal functions case-insensitive
                           d = include line-number information
                           e = enable extensions (default)
                           g = generate declarations
                           i = include symbolic debugging information
                           l = remove default library info
                           p<n> = pack structures on n-byte boundary
                           s = check syntax only
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-2.  Microsoft C Optimizing Compiler version 5.1 switches.


The Microsoft Object Linker

  The object module produced by MASM from a source file is in a form that
  contains relocation information and may also contain unresolved references
  to external locations or subroutines. It is written in a common format
  that is also produced by the various high-level compilers (such as FORTRAN
  and C) that run under MS-DOS. The computer cannot execute object modules
  without further processing.

  The Microsoft Object Linker (LINK), distributed as the file LINK.EXE,
  accepts one or more of these object modules, resolves external references,
  includes any necessary routines from designated libraries, performs any
  necessary offset relocations, and writes a file that can be loaded and
  executed by MS-DOS. The output of LINK is always in .EXE load-module
  format. (See Chapter 3.)

  As with MASM, you can give LINK its parameters interactively or by
  entering all the required information in a single command line. If you
  enter the name of the linker alone, the following type of dialog ensues:

  C>LINK  <Enter>

  Microsoft (R) Overlay Linker  Version 3.61
  Copyright (C) Microsoft Corp 1983-1987. All rights reserved.

  Object Modules [.OBJ]: HELLO  <Enter>
  Run File [HELLO.EXE]:  <Enter>
  List File [NUL.MAP]: HELLO  <Enter>
  Libraries [.LIB]:  <Enter>

  C>

  If you are using LINK version 4.0 or later, the linker also asks for the
  name of a module-definition (.DEF) file. Simply press the Enter key in
  response to such a prompt. Module-definition files are used when building
  Microsoft Windows or MS OS/2 "new .EXE" executable files but are not
  relevant in normal MS-DOS applications.

  The input file for this example was HELLO.OBJ; the output files were
  HELLO.EXE (the executable program) and HELLO.MAP (the load map produced by
  the linker after all references and addresses were resolved). Figure 4-3
  shows the load map.

  ──────────────────────────────────────────────────────────────────────────
   Start  Stop   Length Name                   Class
   00000H 00017H 00018H _TEXT                  CODE
   00018H 00027H 00010H _DATA                  DATA
   00030H 000AFH 00080H STACK                  STACK
   000B0H 000BBH 0000CH $$TYPES                DEBTYP
   000C0H 000D6H 00017H $$SYMBOLS              DEBSYM

    Address         Publics by Name

    Address         Publics by Value

  Program entry point at 0000:0000
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-3.  Map produced by the Microsoft Object Linker (LINK) during the
  generation of the HELLO.EXE program from Chapter 3. The program contains
  one CODE, one DATA, and one STACK segment. The first instruction to be
  executed lies in the first byte of the CODE segment. The $$TYPES and
  $$SYMBOLS segments contain information for the CodeView debugger and are
  not part of the program; these segments are ignored by the normal MS-DOS
  loader.
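
  In the segment lines of Figure 4-3, each Length value equals Stop minus
  Start plus 1 (for example, 00017H - 00000H + 1 = 00018H for _TEXT). The
  following C sketch reads one such line and checks that relationship. The
  structure and function names are hypothetical, the line layout is assumed
  to match the figure exactly, and empty segments are not handled.

```c
#include <assert.h>
#include <stdio.h>

struct map_segment {
    unsigned long start;     /* offset of first byte, from start of image */
    unsigned long stop;      /* offset of last byte */
    unsigned long length;    /* size in bytes, as printed by the linker */
    char name[32];           /* segment name, such as _TEXT */
    char class_name[32];     /* class name, such as CODE */
};

/* Parse one segment line in the layout shown in Figure 4-3:
   three hexadecimal numbers with an H suffix, then name and class.
   Returns 1 if all five fields were read, 0 otherwise. */
int parse_map_segment(const char *line, struct map_segment *seg)
{
    assert(line != NULL && seg != NULL);
    return sscanf(line, " %lxH %lxH %lxH %31s %31s",
                  &seg->start, &seg->stop, &seg->length,
                  seg->name, seg->class_name) == 5;
}

/* Check that Length = Stop - Start + 1 for a nonempty segment. */
int map_segment_is_consistent(const struct map_segment *seg)
{
    assert(seg != NULL);
    return seg->stop >= seg->start &&
           seg->length == seg->stop - seg->start + 1;
}
```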

  You can obtain the same result more quickly by entering all parameters in
  the command line, in the following form:

    LINK [options] objectfile, [exefile], [mapfile], [libraries]

  Thus, the command-line equivalent to the preceding interactive session is

  C>LINK HELLO,HELLO,HELLO,,  <Enter>

  or

  C>LINK HELLO,,HELLO;  <Enter>

  If you enter a semicolon as the last character in the command line, LINK
  assumes the default values for all further parameters.

  A third method of commanding LINK is with a response file. A response file
  contains lines of text that correspond to the responses you would give the
  linker interactively. You specify the name of the response file in the
  command line with a leading @ character, as follows:

    LINK @filename

  You can also enter the name of a response file at any prompt. If the
  response file is not complete, LINK will prompt you for the missing
  information.

  When entering linker commands, you can specify multiple object files with
  the + operator or with spaces, as in the following example:

  C>LINK HELLO+VMODE+DOSINT,MYPROG,,GRAPHICS;  <Enter>

  This command would link the files HELLO.OBJ, VMODE.OBJ, and DOSINT.OBJ,
  searching the library file GRAPHICS.LIB to resolve any references to
  symbols not defined in the specified object files, and would produce a
  file named MYPROG.EXE. LINK uses the current drive and directory when they
  are not explicitly included in a filename; it will not automatically use
  the same drive and directory you specified for a previous file in the same
  command line.
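
  The way an object-file field such as HELLO+VMODE+DOSINT breaks down into
  individual filenames can be sketched in C as follows. This is an
  illustration under stated assumptions, not LINK's actual parser: the
  function name split_object_field and the MAX_NAME limit are hypothetical,
  and the routine only splits on + and space characters and supplies the
  default .OBJ extension.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 80    /* arbitrary limit chosen for this sketch */

/* Split an object-file field on '+' and space characters, adding the
   default .OBJ extension to any name that lacks one. Returns the number
   of names stored in out[], or -1 if a name is too long. Names beyond
   max_out are silently ignored. */
int split_object_field(const char *field, char out[][MAX_NAME], int max_out)
{
    int count = 0;

    assert(field != NULL && out != NULL);
    while (count < max_out) {
        size_t len;
        const char *dot;

        field += strspn(field, "+ ");      /* skip separators */
        len = strcspn(field, "+ ");        /* length of the next name */
        if (len == 0)
            break;                         /* end of the field */
        if (len + 5 > MAX_NAME)
            return -1;                     /* no room for ".OBJ" and NUL */
        memcpy(out[count], field, len);
        out[count][len] = '\0';
        dot = strrchr(out[count], '.');
        /* No period, or the last period is part of a directory name. */
        if (dot == NULL || strchr(dot, '\\') != NULL)
            strcat(out[count], ".OBJ");
        field += len;
        count++;
    }
    return count;
}
```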

  By using the + operator or space characters in the libraries field, you
  can specify up to 32 library files to be searched. Each high-level-
  language compiler provides default libraries that are searched
  automatically during the linkage process if the linker can find them
  (unless they are explicitly excluded with the /NOD switch). LINK looks for
  libraries first in the current directory of the default disk drive, then
  along any paths that were provided in the command line, and finally along
  the path(s) specified by the LIB variable if it is present in the
  environment.

  LINK accepts several optional switches as part of the command line or at
  the end of any interactive prompt. Figure 4-4 lists these switches. The
  number of switches available and their actions vary among different
  versions of LINK. See your Microsoft Object Linker instruction manual for
  detailed information about your particular version.

  Switch   Full form                   Meaning
  ──────────────────────────────────────────────────────────────────────────
  /A:n     /ALIGNMENT:n                Set segment sector alignment factor.
                                       n must be a power of 2 (default =
                                       512). Not related to logical-segment
                                       alignment (BYTE, WORD, PARA, PAGE,
                                       and so forth). Relevant to segmented
                                       executable files (Microsoft Windows
                                       and MS OS/2) only.

  /B       /BATCH                      Suppress linker prompt if a library
                                       cannot be found in the current
                                       directory or in the locations
                                       specified by the LIB environment
                                       variable.

  /CO      /CODEVIEW                   Include symbolic debugging
                                       information in the .EXE file for use
                                       by CodeView.

  /CP      /CPARMAXALLOC               Set the field in the .EXE file header
                                       controlling the amount of memory
                                       allocated to the program in addition
                                       to the memory required for the
                                       program's code, stack, and
                                       initialized data.

  /DO      /DOSSEG                     Use standard Microsoft segment naming
                                       and ordering conventions.

  /DS      /DSALLOCATE                 Load data at high end of the data
                                       segment. Relevant to real-mode
                                       programs only.

  /E       /EXEPACK                    Pack executable file by removing
                                       sequences of repeated bytes and
                                       optimizing relocation table.

  /F       /FARCALLTRANSLATION         Optimize far calls to labels within
                                       the same physical segment for speed
                                       by replacing them with near calls and
                                       NOPs.

  /HE      /HELP                       Display information about available
                                       options.

  /HI      /HIGH                       Load program as high in memory as
                                       possible.

  /I       /INFORMATION                Display information about progress of
                                       linking, including pass numbers and
                                       the names of object files being
                                       linked.

  /INC     /INCREMENTAL                Force production of .SYM and .ILK
                                       files for subsequent use by ILINK
                                       (incremental linker). May not be used
                                       with /EXEPACK. Relevant to segmented
                                       executable files (Microsoft Windows
                                       and MS OS/2) only.

  /LI      /LINENUMBERS                Write address of the first
                                       instruction that corresponds to each
                                       source-code line to the map file. Has
                                       no effect if the compiler does not
                                       include line-number information in
                                       the object module. Force creation of
                                       a map file.

  /M[:n]   /MAP[:n]                    Force creation of a .MAP file listing
                                       all public symbols, sorted by name
                                       and by location. The optional value n
                                       is the maximum number of symbols that
                                       can be sorted (default = 2048); when
                                       n is supplied, the alphabetically
                                       sorted list is omitted.

  /NOD     /NODEFAULTLIBRARYSEARCH     Skip search of any default compiler
                                       libraries specified in the .OBJ file.

  /NOE     /NOEXTENDEDDICTSEARCH       Ignore extended library dictionary
                                       (if it is present). The extended
                                       dictionary ordinarily provides the
                                       linker with information about
                                       inter-module dependencies, to speed
                                       up linking.

  /NOF     /NOFARCALLTRANSLATION       Disable optimization of far calls to
                                       labels within the same segment.

  /NOG     /NOGROUPASSOCIATION         Ignore group associations when
                                       assigning addresses to data and code
                                       items.

  /NOI     /NOIGNORECASE               Do not ignore case in names during
                                       linking.

  /NON     /NONULLSDOSSEG              Arrange segments as for /DOSSEG but
                                       do not insert 16 null bytes at start
                                       of _TEXT segment.

  /NOP     /NOPACKCODE                 Do not pack contiguous logical code
                                       segments into a single physical
                                       segment.

  /O:n     /OVERLAYINTERRUPT:n         Use interrupt number n with the
                                       overlay manager supplied with some
                                       Microsoft high-level languages.

  /PAC[:n] /PACKCODE[:n]               Pack contiguous logical code segments
                                       into a single physical code segment.
                                       The optional value n is the maximum
                                       size for each packed physical code
                                       segment (default = 65,536 bytes).
                                       Segments in different groups are not
                                       packed.

  /PADC:n  /PADCODE:n                  Add n filler bytes to end of each
                                       code module so that a larger module
                                       can be inserted later with ILINK.
                                       Relevant to segmented executable
                                       files (Windows and MS OS/2) only.

  /PADD:n  /PADDATA:n                  Add n filler bytes to end of each
                                       data module so that a larger module
                                       can be inserted later with ILINK.
                                       Relevant to segmented executable
                                       files (Microsoft Windows and MS OS/2)
                                       only.

  /PAU     /PAUSE                      Pause during linking, allowing a
                                       change of disks before .EXE file is
                                       written.

  /SE:n    /SEGMENTS:n                 Set maximum number of segments in
                                       linked program (default = 128).

  /ST:n    /STACK:n                    Set stack size of program in bytes;
                                       ignore stack segment size
                                       declarations within object modules
                                       and definition file.

  /W       /WARNFIXUP                  Display warning messages for offsets
                                       relative to a segment base that is
                                       not the same as the group base.
                                       Relevant to segmented executable
                                       files (Microsoft Windows and MS OS/2)
                                       only.
  ──────────────────────────────────────────────────────────────────────────


  Figure 4-4.  Switches accepted by the Microsoft Object Linker (LINK)
  version 5.0. Earlier versions use a subset of these switches. Note that
  any abbreviation for a switch is acceptable as long as it is sufficient to
  specify the switch uniquely.


The EXE2BIN Utility

  The EXE2BIN utility (EXE2BIN.EXE) transforms a .EXE file created by LINK
  into an executable .COM file, if the program meets the following
  prerequisites:

  ■  It cannot contain more than one declared segment and cannot
     define a stack.

  ■  It must be less than 64 KB in length.

  ■  It must have an origin at 0100H.

  ■  The first location in the file must be specified as the entry point
     in the source code's END directive.

  Although .COM files are somewhat more compact than .EXE files, you should
  avoid using them. Programs that use separate segments for code, data, and
  stack are much easier to port to protected-mode environments such as MS
  OS/2; in addition, .COM files do not support the symbolic debugging
  information used by CodeView.

  Another use for the EXE2BIN utility is to convert an installable device
  driver──after it is assembled and linked into a .EXE file──into a
  memory-image .BIN or .SYS file with an origin of zero. This conversion is
  required in MS-DOS version 2, which cannot load device drivers as .EXE
  files. The process of writing an installable device driver is discussed in
  more detail in Chapter 14.

  Unlike most of the other programming utilities, EXE2BIN does not have an
  interactive mode. It always takes its source and destination filenames,
  separated by spaces, from the MS-DOS command line, as follows:

    EXE2BIN sourcefile [destinationfile]

  If you do not supply the source-file extension, it defaults to .EXE; the
  destination-file extension defaults to .BIN. If you do not specify a name
  for the destination file, EXE2BIN gives it the same name as the source
  file, with a .BIN extension.
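
  The defaulting rules just described can be modeled in a few lines of C.
  This is only a sketch of the naming rules as stated above, not the code
  EXE2BIN actually uses; the function names are hypothetical, and the sketch
  assumes plain filenames with no drive or directory part.

```c
#include <stdio.h>
#include <string.h>

/* Copy name to out, appending ext (for example ".EXE") only when the
   name has no extension of its own. */
void with_default_ext(const char *name, const char *ext,
                      char *out, size_t size)
{
    snprintf(out, size, "%s%s", name, strchr(name, '.') ? "" : ext);
}

/* Work out the source and destination names from the command-line
   arguments. dest may be NULL when no destination was given.
   src_out and dst_out must each hold at least size bytes. */
void exe2bin_names(const char *src, const char *dest,
                   char *src_out, char *dst_out, size_t size)
{
    with_default_ext(src, ".EXE", src_out, size);

    if (dest != NULL) {
        with_default_ext(dest, ".BIN", dst_out, size);
    } else {
        /* No destination: reuse the source's base name with .BIN. */
        const char *dot = strchr(src, '.');
        int base = dot ? (int)(dot - src) : (int)strlen(src);
        snprintf(dst_out, size, "%.*s.BIN", base, src);
    }
}
```

  Under this model, the arguments HELLO alone produce the pair HELLO.EXE and
  HELLO.BIN, which is why a .COM destination must always be named explicitly.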

  For example, to convert the file HELLO.EXE into HELLO.COM, you would use
  the following command line:

  C>EXE2BIN HELLO.EXE HELLO.COM  <Enter>

  The EXE2BIN program also has other capabilities, such as pure binary
  conversion with segment fixup for creating program images to be placed in
  ROM; but because these features are rarely used during MS-DOS application
  development, they will not be discussed here.


The CREF Utility

  The CREF cross-reference utility (CREF.EXE) processes a .CRF file produced
  by MASM, creating an ASCII text file with the default extension .REF. The
  file contains a cross-reference listing of all symbols declared in the
  program and the line numbers in which they are referenced. (See Figure
  4-5.) Such a listing is very useful when debugging large
  assembly-language programs with many interdependent procedures and
  variables.

  CREF may be supplied with its parameters interactively or in a single
  command line. If you enter the utility name alone, CREF prompts you for
  the input and output filenames, as shown in the following example:

  C>CREF  <Enter>

  Microsoft (R) Cross-Reference Utility  Version 5.10
  Copyright (C) Microsoft Corp 1981-1985, 1987. All rights reserved.

  Cross-reference [.CRF]: HELLO  <Enter>
  Listing [HELLO.REF]:

  15 Symbols

  C>

  ──────────────────────────────────────────────────────────────────────────
  Microsoft Cross-Reference  Version 5.10       Thu May 26 11:09:34 1988
  HELLO.EXE --- print Hello on terminal

    Symbol Cross-Reference    (# definition, + modification)    Cref-1

  @CPU . . . . . . . . . . . . . .   1#
  @VERSION . . . . . . . . . . . .   1#

  CODE . . . . . . . . . . . . . .  21
  CR . . . . . . . . . . . . . . .  17#    46     47

  DATA . . . . . . . . . . . . . .  44

  LF . . . . . . . . . . . . . . .  18#    46     47

  MSG. . . . . . . . . . . . . . .  33     46#
  MSG_LEN. . . . . . . . . . . . .  32     49#

  PRINT. . . . . . . . . . . . . .  25#    39     60

  STACK. . . . . . . . . . . . . .  23     54#    54     58
  STDERR . . . . . . . . . . . . .  15#
  STDIN. . . . . . . . . . . . . .  13#
  STDOUT . . . . . . . . . . . . .  14#    31

  _DATA. . . . . . . . . . . . . .  23     27     44#    51
  _TEXT. . . . . . . . . . . . . .  21#    23     41

   15 Symbols
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-5.  Cross-reference listing HELLO.REF produced by the CREF
  utility from the file HELLO.CRF, for the HELLO.EXE program example from
  Chapter 3. The symbols declared in the program are listed on the left in
  alphabetic order. To the right of each symbol is a list of all the lines
  where that symbol is referenced. The number with a # sign after it denotes
  the line where the symbol is declared. Numbers followed by a + sign
  indicate that the symbol is modified at the specified line. The line
  numbers given in the cross-reference listing correspond to the line
  numbers generated by the assembler in the program-listing (.LST) file, not
  to any physical line count in the original source file.

  The parameters may also be entered in the command line in the following
  form:

    CREF CRF_file, listing_file

  For example, the command-line equivalent to the preceding interactive
  session is:

  C>CREF HELLO,HELLO  <Enter>

  If CREF cannot find the specified .CRF file, it displays an error message.
  Otherwise, it leaves the cross-reference listing in the specified file on
  the disk. You can send the file to the printer with the COPY command, in
  the following form:

    COPY listing_file PRN:

  You can also send the cross-reference listing directly to a character
  device as it is generated by responding to the Listing prompt with the
  name of the device.


The Microsoft Library Manager

  Although the object modules that are produced by MASM or by high-level-
  language compilers can be linked directly into executable load modules,
  they can also be collected into special files called object-module
  libraries. The modules in a library are indexed by name and by the public
  symbols they contain, so that they can be extracted by the linker to
  satisfy external references in a program.

  The Microsoft Library Manager (LIB) is distributed as the file LIB.EXE.
  LIB creates and maintains program libraries, adding, updating, and
  deleting object files as necessary. LIB can also check a library file for
  internal consistency or print a table of its contents (Figure 4-6).

  LIB follows the command conventions of most other Microsoft programming
  tools. You must supply it with the name of a library file to work on, one
  or more operations to perform, the name of a listing file or device, and
  (optionally) the name of the output library. If you do not specify a name
  for the output library, LIB gives it the same name as the input library
  and changes the extension of the input library to .BAK.

  The LIB operations are simply the names of object files, with a prefix
  character that specifies the action to be taken:

  Prefix     Meaning
  ──────────────────────────────────────────────────────────────────────────
  -          Delete an object module from the library.
  *          Extract a module and place it in a separate .OBJ file.
  +          Add an object module or the entire contents of another library
             to the library.
  ──────────────────────────────────────────────────────────────────────────

  You can combine command prefixes. For example, -+ replaces a module, and
  *- extracts a module into a new file and then deletes it from the library.
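
  To make the prefix notation concrete, here is a small C sketch that splits
  one operation into its actions and its module name. The flag names and the
  function are hypothetical illustrations; LIB's own parser is not documented
  here and may well differ.

```c
/* Action flags for one LIB operation (names are illustrative only). */
enum { LIB_DELETE = 1, LIB_EXTRACT = 2, LIB_ADD = 4 };

/* Split an operation such as "-+VIDEO" into action flags and a name.
   Returns the OR of the LIB_* flags found, or 0 if the operation has
   no prefix character. *name is set to the first character that
   follows the prefixes. */
int parse_lib_operation(const char *op, const char **name)
{
    int actions = 0;

    for (;; op++) {
        if (*op == '-')
            actions |= LIB_DELETE;
        else if (*op == '*')
            actions |= LIB_EXTRACT;
        else if (*op == '+')
            actions |= LIB_ADD;
        else
            break;              /* first character of the module name */
    }
    *name = op;
    return actions;
}
```

  With this sketch, -+VIDEO yields LIB_DELETE combined with LIB_ADD (a
  replacement), and *-VIDEO yields LIB_EXTRACT combined with LIB_DELETE.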

  ──────────────────────────────────────────────────────────────────────────
  _abort............abort             _abs..............abs
  _access...........access            _asctime..........asctime
  _atof.............atof              _atoi.............atoi
  _atol.............atol              _bdos.............bdos
  _brk..............brk               _brkctl...........brkctl
  _bsearch..........bsearch           _calloc...........calloc
  _cgets............cgets             _chdir............dir
  _chmod............chmod             _chsize...........chsize
       .
       .
       .
  _exit             Offset: 00000010H  Code and data size: 44H
    __exit

  _filbuf           Offset: 00000160H  Code and data size: BBH
    __filbuf

  _file             Offset: 00000300H  Code and data size: CAH
    __iob             __iob2            __lastiob
       .
       .
       .
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-6.  Extract from the table-of-contents listing produced by the
  Microsoft Library Manager (LIB) for the Microsoft C library SLIBC.LIB. The
  first part of the listing is an alphabetic list of all public names
  declared in all of the modules in the library. Each name is associated
  with the object module to which it belongs. The second part of the listing
  is an alphabetic list of the object-module names in the library, each
  followed by its offset within the library file and the actual size of the
  module in bytes. The entry for each module is followed by a summary of the
  public names that are declared within it.

  When you invoke LIB with its name alone, it requests the other information
  it needs interactively, as shown in the following example:

  C>LIB  <Enter>

  Microsoft (R) Library Manager  Version 3.08
  Copyright (C) Microsoft Corp 1983-1987. All rights reserved.

  Library name:  SLIBC  <Enter>
  Operations: +VIDEO  <Enter>
  List file:  SLIBC.LST  <Enter>
  Output library:  SLIBC2  <Enter>

  C>

  In this example, LIB added the object module VIDEO.OBJ to the library
  SLIBC.LIB, wrote a library table of contents into the file SLIBC.LST, and
  named the resulting new library SLIBC2.LIB.

  The Library Manager can also be run with a command line of the following
  form:

    LIB library [commands],[list],[newlibrary]

  For example, the following command line is equivalent to the preceding
  interactive session:

  C>LIB SLIBC +VIDEO,SLIBC.LST,SLIBC2;  <Enter>

  As with the other Microsoft utilities, a semicolon at the end of the
  command line causes LIB to use the default responses for any parameters
  that are omitted.

  Like LINK, LIB can also accept its commands from a response file. The
  contents of the file are lines of text that correspond exactly to the
  responses you would give LIB interactively. You specify the name of the
  response file in the command line with a leading @ character, as follows:

    LIB @filename

  LIB has only three switches: /I (/IGNORECASE), /N (/NOIGNORECASE), and
  /PAGESIZE:number. The /IGNORECASE switch is the default. The /NOIGNORECASE
  switch causes LIB to regard as distinct any symbols that differ only in
  the case of their component letters. You should place the /PAGESIZE
  switch, which defines the size of a unit of allocation space for a given
  library, immediately after the library filename. The library page size is
  in bytes and must be a power of 2 between 16 and 32,768 (16, 32, 64, and
  so forth); the default is 16 bytes. Because the index to a library is
  always a fixed number of pages, setting a larger page size allows you to
  store more object modules in that library; on the other hand, it will
  result in more wasted space within the file.
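
  The page-size rule and its cost in wasted space can be expressed in C as
  follows. This is a rough sketch: the function names are hypothetical, and
  the padding calculation assumes that each module begins on a page boundary,
  which the text implies but does not spell out.

```c
/* Return 1 if n is a legal LIB page size: a power of 2 in the
   range 16 through 32,768 bytes. */
int is_valid_page_size(unsigned long n)
{
    return n >= 16UL && n <= 32768UL && (n & (n - 1UL)) == 0UL;
}

/* Filler bytes needed after a module of module_size bytes so that the
   next module starts on a page boundary (an assumption, see above). */
unsigned long page_padding(unsigned long module_size,
                           unsigned long page_size)
{
    unsigned long rem = module_size % page_size;
    return rem ? page_size - rem : 0UL;
}
```

  For a hypothetical module that occupies 44H (68) bytes in the file, the
  sketch gives 12 filler bytes with the default 16-byte page but 444 filler
  bytes with a 512-byte page.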


The MAKE Utility

  The MAKE utility (MAKE.EXE) compares dates of files and carries out
  commands based on the result of that comparison. Because of this single,
  rather basic capability, MAKE can be used to maintain complex programs
  built from many modules. The dates of source, object, and executable files
  are simply compared in a logical sequence; the assembler, compiler,
  linker, and other programming tools are invoked as appropriate.

  The MAKE utility processes a plain ASCII text file called, as you might
  expect, a make file. You start the utility with a command-line entry in
  the following form:

    MAKE makefile [options]

  By convention, a make file has the same name as the executable file that
  is being maintained, but without an extension. The available MAKE switches
  are listed in Figure 4-7.

  A simple make file contains one or more dependency statements separated by
  blank lines. Each dependency statement can be followed by a list of MS-DOS
  commands, in the following form:

    targetfile : sourcefile ...
      command
      command
      .
      .
      .

  If the date and time of any source file are later than those of the target
  file, the accompanying list of commands is carried out. You may use
  comment lines, which begin with a # character, freely in a make file. MAKE
  can also process inference rules and macro definitions. For further
  details on these advanced capabilities, see the Microsoft or IBM
  documentation.
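
  The date comparison at the heart of MAKE can be sketched in C as shown
  below. The function name is hypothetical, timestamps are represented as
  standard C time_t values rather than MS-DOS directory dates, and the case
  of a target file that does not yet exist is not modeled.

```c
#include <stddef.h>
#include <time.h>

/* Return 1 if the commands for a dependency statement should be
   carried out, that is, if any source file is later than the target. */
int needs_rebuild(time_t target, const time_t *sources, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (difftime(sources[i], target) > 0.0)
            return 1;           /* this source is newer than the target */
    }
    return 0;                   /* target is up to date */
}
```

  A source whose date and time exactly match the target's does not trigger
  the commands in this sketch; only a strictly later source does.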

  Switch     Meaning
  ──────────────────────────────────────────────────────────────────────────
  /D         Display last modification date of each file as it is processed.
  /I         Ignore exit (return) codes returned by commands and programs
             executed as a result of dependency statements.
  /N         Display commands that would be executed as a result of
             dependency statements but do not execute those commands.
  /S         Do not display commands as they are executed.
  /X <filename>
             Direct error messages from MAKE, or any program that MAKE runs,
             to the specified file. If filename is a hyphen (-), direct
             error messages to the standard output.
  ──────────────────────────────────────────────────────────────────────────

  Figure 4-7.  Switches for the MAKE utility.


A Complete Example

  Let's put together everything we've learned about using the MS-DOS
  programming tools so far. Figure 4-8 shows a sketch of the overall
  process of building an executable program.

  Assume that we have the source code for the HELLO.EXE program from Chapter
  3 in the file HELLO.ASM. To assemble the source program into the
  relocatable object module HELLO.OBJ with symbolic debugging information
  included, also producing a program listing in the file HELLO.LST and a
  cross-reference data file HELLO.CRF, we would enter

  C>MASM /C /L /Zi /T HELLO;  <Enter>

  To convert the cross-reference raw-data file HELLO.CRF into a
  cross-reference listing in the file HELLO.REF, we would enter

  C>CREF HELLO,HELLO  <Enter>

  ┌───────────────┐             ┌───────────────┐
  │     MASM      │             │  C or other   │
  │  source-code  │             │  HLL source-  │
  │     file      │             │   code file   │
  └───┬───────────┘             └───┬───────────┘
      │       ┌─────────────────────┘  Compiler
  ┌───▼───────▼───┐
  │  Relocatable  │
  │ object-module ├────┐
  │  file (.OBJ)  │    │
  └───┬───────────┘    │
      │ LIB            │
  ┌───▼───────────┐    │        ┌───────────────┐
  │ Object-module │    ▼  LINK  │  Executable   │
  │   libraries   ├─────────────►   program     │
  │    (.LIB)     │             │    (.EXE)     │
  └───────────────┘      │      └───┬───────────┘
                         │          │ EXE2BIN
  ┌───────────────┐      │      ┌───▼───────────┐
  │     HLL       │      │      │   Executable  │
  │   runtime     ├──────┘      │    program    │
  │  libraries    │             │     (.COM)    │
  └───────────────┘             └───────────────┘

  Figure 4-8.  Creation of an MS-DOS application program, from source code
  to executable file.

  To convert the relocatable object file HELLO.OBJ into the executable file
  HELLO.EXE, creating a load map in the file HELLO.MAP and appending
  symbolic debugging information to the executable file, we would enter

  C>LINK /MAP /CODEVIEW HELLO;  <Enter>

  We could also automate the entire process just described by creating a
  make file named HELLO (with no extension) and including the following
  instructions:

  hello.obj : hello.asm
   masm /C /L /Zi /T hello;
   cref hello,hello

  hello.exe : hello.obj
   link /MAP /CODEVIEW hello;

  Then, when we have made some change to HELLO.ASM and want to rebuild the
  executable HELLO.EXE file, we need only enter

  C>MAKE HELLO  <Enter>


Programming Resources and References

  The literature on IBM PC─compatible personal computers, the Intel 80x86
  microprocessor family, and assembly-language and C programming is vast.
  The list below contains a selection of those books that I have found to be
  useful and reliable. The list should not be construed as an endorsement by
  Microsoft Corporation.

MASM Tutorials

  Assembly Language Primer for the IBM PC and XT, by Robert Lafore. New
  American Library, New York, NY, 1984. ISBN 0-452-25711-5.

  8086/8088/80286 Assembly Language, by Leo Scanlon. Brady Books, Simon and
  Schuster, New York, NY, 1988. ISBN 0-13-246919-7.

C Tutorials

  Microsoft C Programming for the IBM, by Robert Lafore. Howard W. Sams &
  Co., Indianapolis, IN, 1987. ISBN 0-672-22515-8.

  Proficient C, by Augie Hansen. Microsoft Press, Redmond, WA, 1987. ISBN
  1-55615-007-5.

Intel 80x86 Microprocessor References

  iAPX 88 Book. Intel Corporation, Literature Department SV3-3, 3065 Bowers
  Ave., Santa Clara, CA 95051. Order no. 210200.

  iAPX 286 Programmer's Reference Manual. Intel Corporation, Literature
  Department SV3-3, 3065 Bowers Ave., Santa Clara, CA 95051. Order no.
  210498.

  iAPX 386 Programmer's Reference Manual. Intel Corporation, Literature
  Department SV3-3, 3065 Bowers Ave., Santa Clara, CA 95051. Order no.
  230985.

PC, PC/AT, and PS/2 Architecture

  The IBM Personal Computer from the Inside Out (Revised Edition), by Murray
  Sargent and Richard L. Shoemaker. Addison-Wesley Publishing Company,
  Reading, MA, 1986. ISBN 0-201-06918-0.

  Programmer's Guide to PC & PS/2 Video Systems, by Richard Wilton.
  Microsoft Press, Redmond, WA, 1987. ISBN 1-55615-103-9.

  Personal Computer Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6322507.

  Personal Computer AT Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6280070.

  Options and Adapters Technical Reference. IBM Corporation, IBM Technical
  Directory, P. O. Box 2009, Racine, WI 53404. Part no. 6322509.

  Personal System/2 Model 30 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2201.

  Personal System/2 Model 50/60 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2224.

  Personal System/2 Model 80 Technical Reference. IBM Corporation, IBM
  Technical Directory, P. O. Box 2009, Racine, WI 53404. Part no. 68X2256.



────────────────────────────────────────────────────────────────────────────
Chapter 5  Keyboard and Mouse Input

  The fundamental means of user input under MS-DOS is the keyboard. This
  follows naturally from the MS-DOS command-line interface, whose lineage
  can be traced directly to minicomputer operating systems with Teletype
  consoles. During the first few years of MS-DOS's existence, when
  8088/8086-based machines were the norm, nearly every popular application
  program used key-driven menus and text-mode displays.

  However, as high-resolution graphics adapters (and 80286/80386-based
  machines with enough power to drive them) have become less expensive,
  programs that support windows and a graphical user interface have steadily
  grown more popular. Such programs typically rely on a pointing device such
  as a mouse, stylus, joystick, or light pen to let the user navigate in a
  "point-and-shoot" manner, reducing keyboard entry to a minimum. As a
  result, support for pointing devices has become an important consideration
  for all software developers.


Keyboard Input Methods

  Applications running under MS-DOS on IBM PC─compatible machines can use
  several methods to obtain keyboard input:

  ■  MS-DOS handle-oriented functions

  ■  MS-DOS traditional character functions

  ■  IBM ROM BIOS keyboard-driver functions

  These methods offer different degrees of flexibility, portability, and
  hardware independence.

  The handle, or stream-oriented, functions are philosophically derived from
  UNIX/XENIX and were first introduced in MS-DOS version 2.0. A program uses
  these functions by supplying a handle, or token, for the desired device,
  plus the address and length of a buffer.

  When a program begins executing, MS-DOS supplies it with predefined
  handles for certain commonly used character devices, including the
  keyboard:

  Handle             Device name                          Opened to
  ──────────────────────────────────────────────────────────────────────────
  0                  Standard input (stdin)               CON
  1                  Standard output (stdout)             CON
  2                  Standard error (stderr)              CON
  3                  Standard auxiliary (stdaux)          AUX
  4                  Standard printer (stdprn)            PRN
  ──────────────────────────────────────────────────────────────────────────

  These handles can be used for read and write operations without further
  preliminaries. A program can also obtain a handle for a character device
  by explicitly opening the device for input or output using its logical
  name (as though it were a file). The handle functions support I/O
  redirection, allowing a program to take its input from another device or
  file instead of the keyboard, for example. Redirection is discussed in
  detail in Chapter 15.

  The traditional character-input functions are a superset of the character
  I/O functions that were present in CP/M. Originally included in MS-DOS
  simply to facilitate the porting of existing applications from CP/M, they
  are still widely used. In MS-DOS versions 2.0 and later, most of the
  traditional functions also support I/O redirection (although not as well
  as the handle functions do).

  Use of the IBM ROM BIOS keyboard functions presupposes that the program is
  running on an IBM PC─compatible machine. The ROM BIOS keyboard driver
  operates at a much more primitive level than the MS-DOS functions and
  allows a program to circumvent I/O redirection or MS-DOS's special
  handling of certain control characters. Programs that use the ROM BIOS
  keyboard driver are inherently less portable than those that use the
  MS-DOS functions and may interfere with the proper operation of other
  programs; many of the popular terminate-and-stay-resident (TSR) utilities
  fall into this category.

Keyboard Input with Handles

  The principal MS-DOS function for keyboard input using handles is Int 21H
  Function 3FH (Read File or Device). The parameters for this function are
  a handle, the segment and offset of a buffer, and the length of the
  buffer. (For a more detailed explanation of this function, see Section
  II of this book, "MS-DOS Functions Reference.")

  As an example, let's use the predefined standard input handle (0) and Int
  21H Function 3FH to read a line from the keyboard:

  ──────────────────────────────────────────────────────────────────────────
  buffer  db   80 dup (?)     ; keyboard input buffer
          .
          .
          .
          mov  ah,3fh         ; function 3fh = read file or device
          mov  bx,0           ; handle for standard input
          mov  cx,80          ; maximum bytes to read
          mov  dx,seg buffer  ; DS:DX = buffer address
          mov  ds,dx
          mov  dx,offset buffer
          int  21h            ; transfer to MS-DOS
          jc   error          ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  When control returns from Int 21H Function 3FH, the carry flag is clear if
  the function was successful, and AX contains the number of characters
  read. If there was an error, the carry flag is set and AX contains an
  error code; however, this should never occur when reading the keyboard.

  The standard input is redirectable, so the code just shown is not a
  foolproof way of obtaining input from the keyboard. Depending upon whether
  a redirection parameter was included in the command line by the user,
  program input might be coming from the keyboard, a file, another character
  device, or even the bit bucket (NUL device). To bypass redirection and be
  absolutely certain where your input is coming from, you can ignore the
  predefined standard input handle and open the console as though it were a
  file, using the handle obtained from that open operation to perform your
  keyboard input, as in the following example:

  ──────────────────────────────────────────────────────────────────────────
  buffer  db     80 dup (?)   ; keyboard input buffer
  fname   db     'CON',0      ; keyboard device name
  handle  dw     0            ; keyboard device handle
          .
          .
          .
          mov    ah,3dh       ; function 3dh = open
          mov    al,0         ; mode = read
          mov    dx,seg fname ; DS:DX = device name
          mov    ds,dx
          mov    dx,offset fname
          int    21h          ; transfer to MS-DOS
          jc     error        ; jump if open failed
          mov    handle,ax    ; save handle for CON
          .
          .
          .
          mov    ah,3fh       ; function 3fh = read file or device
          mov    bx,handle    ; BX = handle for CON
          mov    cx,80        ; maximum bytes to read
          mov    dx,offset buffer ; DS:DX = buffer address
          int    21h          ; transfer to MS-DOS
          jc     error        ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  When a programmer uses Int 21H Function 3FH to read from the keyboard, the
  exact result depends on whether MS-DOS regards the handle to be in ASCII
  mode or binary mode (sometimes known as cooked mode and raw mode). ASCII
  mode is the default, although binary mode can be selected with Int 21H
  Function 44H (IOCTL) when necessary.

  In ASCII mode, MS-DOS initially places characters obtained from the
  keyboard in a 128-byte internal buffer, and the user can edit the input
  with the Backspace key and the special function keys. MS-DOS automatically
  echoes the characters to the standard output, expanding tab characters to
  spaces (although they are left as the ASCII code 09H in the buffer). The
  Ctrl-C, Ctrl-S, and Ctrl-P key combinations receive special handling, and
  the Enter key is translated to a carriage return─linefeed pair. When the
  user presses Enter or Ctrl-Z, MS-DOS copies the requested number of
  characters (or the actual number of characters entered, if less than the
  number requested) out of the internal buffer into the calling program's
  buffer.
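
  Because the Enter key arrives in the buffer as a carriage return─linefeed
  pair, the count returned in AX includes those two bytes when the whole line
  fits. A program that wants only the text can trim them, as in the following
  C sketch. The function name is hypothetical, and the buffer and count are
  assumed to have come from a successful ASCII-mode read.

```c
#include <stddef.h>

/* Given a buffer filled by an ASCII-mode read and the byte count that
   was returned, give back the length of the text with any trailing
   carriage return and linefeed characters removed. */
size_t strip_crlf(const char *buf, size_t count)
{
    while (count > 0 &&
           (buf[count - 1] == '\r' || buf[count - 1] == '\n'))
        count--;
    return count;
}
```

  For example, if the user types HELLO and presses Enter, the read returns 7
  bytes and the sketch reports a text length of 5.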

  In binary mode, MS-DOS never echoes input characters. It passes the
  Ctrl-C, Ctrl-S, Ctrl-P, and Ctrl-Z key combinations and the Enter key
  through to the application unchanged, and Int 21H Function 3FH does not
  return control to the application until the exact number of characters
  requested has been received.

  Ctrl-C checking is discussed in more detail at the end of this chapter.
  For now, simply note that the application programmer can substitute a
  custom handler for the default MS-DOS Ctrl-C handler and thereby avoid
  having the application program lose control of the machine when the user
  enters a Ctrl-C or Ctrl-Break.

Keyboard Input with Traditional Calls

  The MS-DOS traditional keyboard functions offer a variety of character and
  line-oriented services with or without echo and Ctrl-C detection. These
  functions are summarized in the following table.

  Int 21H Function   Action                               Ctrl-C checking
  ──────────────────────────────────────────────────────────────────────────
  01H                Keyboard input with echo             Yes
  06H                Direct console I/O                   No
  07H                Keyboard input without echo          No
  08H                Keyboard input without echo          Yes
  0AH                Buffered keyboard input              Yes
  0BH                Input-status check                   Yes
  0CH                Input-buffer reset and input         Varies
  ──────────────────────────────────────────────────────────────────────────

  In MS-DOS versions 2.0 and later, redirection of the standard input
  affects all these functions. In other words, they act as though they were
  special cases of an Int 21H Function 3FH call using the predefined
  standard input handle (0).

  The character-input functions (01H, 06H, 07H, and 08H) all return a
  character in the AL register. For example, the following sequence waits
  until a key is pressed and then returns it in AL:

  ──────────────────────────────────────────────────────────────────────────
          mov     ah,1        ; function 01h = read keyboard
          int     21h         ; transfer to MS-DOS
  ──────────────────────────────────────────────────────────────────────────

  The character-input functions differ in whether the input is echoed to the
  screen and whether they are sensitive to Ctrl-C interrupts. Although
  MS-DOS provides no pure keyboard-status function that is immune to Ctrl-C,
  a program can read keyboard status (somewhat circuitously) without
  interference by using Int 21H Function 06H. Extended keys, such as the
  IBM PC keyboard's special function keys, require two calls to a
  character-input function.
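
  The two-call convention for extended keys works like this: the first call
  returns zero, and the second call returns the extended key code. The C
  sketch below wraps that logic. Everything named here is hypothetical: the
  callback stands in for whichever character-input function the program
  uses, replay_char exists only so that the sketch can be tried without
  MS-DOS, and the 100H marker for extended keys is an arbitrary choice.

```c
#include <stddef.h>

/* Callback that returns the next input byte, as a character-input
   function would return it in AL. */
typedef int (*read_char_fn)(void *ctx);

/* Read one key. Ordinary keys are returned as their character code;
   extended keys are returned as 100H plus the extended key code. */
int read_key(read_char_fn read_char, void *ctx)
{
    int c = read_char(ctx) & 0xFF;

    if (c != 0)
        return c;                           /* ordinary key */
    return 0x100 | (read_char(ctx) & 0xFF); /* second call: extended code */
}

/* Stand-in for the MS-DOS call: replays bytes from an array.
   ctx points to a pointer that walks through the array. */
int replay_char(void *ctx)
{
    const unsigned char **p = (const unsigned char **)ctx;
    return *(*p)++;
}
```

  Fed the byte sequence 41H, 00H, 3BH, the sketch returns the letter A first
  and then 13BH, the marker plus the extended code for the F1 key.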

  As an alternative to single-character input, a program can use
  buffered-line input (Int 21H Function 0AH) to obtain an entire line from
  the keyboard in one operation. MS-DOS builds up buffered lines in an
  internal buffer and does not pass them to the calling program until the
  user presses the Enter key. While the line is being entered, all the usual
  editing keys are active and are handled by the MS-DOS keyboard driver. You
  use Int 21H Function 0AH as follows:

  ──────────────────────────────────────────────────────────────────────────
  buff    db      81          ; maximum length of input
          db      0           ; actual length (from MS-DOS)
          db      81 dup (0)  ; receives keyboard input
          .
          .
          .
          mov     ah,0ah      ; function 0ah = read buffered line
          mov     dx,seg buff ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Int 21H Function 0AH differs from Int 21H Function 3FH in several
  important ways. First, the maximum length is passed in the first byte of
  the buffer, rather than in the CX register. Second, the actual length is
  returned in the second byte of the structure, rather than in the AX
  register. Finally, when the user has entered one less than the specified
  maximum number of characters, MS-DOS ignores all subsequent characters and
  sounds a warning beep until the Enter key is pressed.
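
  Seen from C, the Function 0AH buffer is a small structure. The sketch
  below mirrors the layout used in the preceding listing and converts the
  returned line to a C string. The structure and function names are
  hypothetical, and the sketch relies on the fact that the actual-length
  byte does not count the carriage return that ends the line.

```c
#include <stddef.h>
#include <string.h>

/* Layout of the buffer passed to Int 21H Function 0AH. */
struct line_buffer {
    unsigned char max_len;      /* filled in by the caller */
    unsigned char actual_len;   /* filled in by MS-DOS, excludes the CR */
    char text[81];              /* characters entered, then a CR */
};

/* Copy the entered line into out as a null-terminated C string,
   truncating if out is too small. Returns the number of characters
   copied, not counting the terminating null. */
size_t line_to_cstring(const struct line_buffer *b,
                       char *out, size_t out_size)
{
    size_t n = b->actual_len;

    if (out_size == 0)
        return 0;
    if (n > sizeof b->text)
        n = sizeof b->text;
    if (n > out_size - 1)
        n = out_size - 1;
    memcpy(out, b->text, n);
    out[n] = '\0';
    return n;
}
```

  Because every member is a single byte wide, the structure occupies 83
  bytes, matching the three db directives in the assembly-language version.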

  For detailed information about each of the traditional keyboard-input
  functions, see Section II of this book, "MS-DOS Functions Reference."

Keyboard Input with ROM BIOS Functions

  Programmers writing applications for IBM PC compatibles can bypass the
  MS-DOS keyboard functions and choose from two hardware-dependent
  techniques for keyboard input.

  The first method is to call the ROM BIOS keyboard driver using Int 16H.
  For example, the following sequence reads a single character from the
  keyboard input buffer and returns it in the AL register:

  ──────────────────────────────────────────────────────────────────────────
          mov    ah,0         ; function 0=read keyboard
          int    16h          ; transfer to ROM BIOS
  ──────────────────────────────────────────────────────────────────────────

  Int 16H Function 00H also returns the keyboard scan code in the AH
  register, allowing the program to detect key codes that are not ordinarily
  returned by MS-DOS. Other Int 16H services return the keyboard status
  (that is, whether a character is waiting) or the keyboard shift state
  (from the ROM BIOS data area 0000:0417H). For a more detailed explanation
  of ROM BIOS keyboard functions, see Section III of this book, "IBM ROM
  BIOS and Mouse Functions Reference."
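
  As a rough illustration of how a program might interpret these results,
  the portable C sketch below splits the AX value returned by Int 16H
  Function 00H into its scan code and ASCII halves and names the bits of
  the shift-state byte. The identifiers (struct keystroke, split_key, and
  the KB_ constants) are hypothetical, and the bit assignments are the
  conventional ones for the byte at 0000:0417H rather than anything
  defined in this chapter.

```c
/* Bits of the keyboard shift-state byte (0000:0417H). */
#define KB_RIGHT_SHIFT  0x01
#define KB_LEFT_SHIFT   0x02
#define KB_CTRL         0x04
#define KB_ALT          0x08
#define KB_SCROLL_LOCK  0x10
#define KB_NUM_LOCK     0x20
#define KB_CAPS_LOCK    0x40
#define KB_INSERT       0x80

struct keystroke {
    unsigned char scan;     /* from AH: keyboard scan code */
    unsigned char ascii;    /* from AL: ASCII character    */
};

/* Split the AX value returned by Int 16H Function 00H. */
struct keystroke split_key(unsigned int ax)
{
    struct keystroke k;

    k.scan  = (unsigned char)((ax >> 8) & 0xff);
    k.ascii = (unsigned char)(ax & 0xff);
    return k;
}

/* Function keys, arrow keys, and the like have no ASCII code,
   so AL is zero and only the scan code identifies the key.    */
int is_extended_key(struct keystroke k)
{
    return k.ascii == 0;
}
```

  For example, the F1 key arrives as AX = 3B00H, which split_key reports
  as scan code 3BH with an ASCII value of zero.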

  You should consider carefully before building ROM BIOS dependence into an
  application. Although this technique allows you to bypass any I/O
  redirection that may be in effect, ways exist to do this without
  introducing dependence on the ROM BIOS. And there are real disadvantages
  to calling the ROM BIOS keyboard driver:

  ■  It always bypasses I/O redirection, which sometimes may not be
     desirable.

  ■  It is dependent on IBM PC compatibility and does not work correctly,
     unchanged, on some older machines such as the Hewlett-Packard
     TouchScreen or the Wang Professional Computer.

  ■  It may introduce complicated interactions with TSR utilities.

  The other and more hardware-dependent method of keyboard input on an IBM
  PC is to write a new handler for ROM BIOS Int 09H and service the keyboard
  controller's interrupts directly. This involves translation of scan codes
  to ASCII characters and maintenance of the type-ahead buffer. In ordinary
  PC applications, there is no reason to take over keyboard I/O at this
  level; therefore, I will not discuss this method further here. If you are
  curious about the techniques that would be required, the best reference is
  the listing for the ROM BIOS Int 09H handler in the IBM PC or PC/AT
  technical reference manual.


Ctrl-C and Ctrl-Break Handlers

  In the discussion of keyboard input with the MS-DOS handle and traditional
  functions, I made some passing references to the fact that Ctrl-C entries
  can interfere with the expected behavior of those functions. Let's look at
  this subject in more detail now.

  During most character I/O operations, MS-DOS checks for a Ctrl-C (ASCII
  code 03H) waiting at the keyboard and executes an Int 23H if one is
  detected. If the system break flag is on, MS-DOS also checks for a Ctrl-C
  entry during certain other operations (such as file reads and writes).
  Ordinarily, the Int 23H vector points to a routine that simply terminates
  the currently active process and returns control to the parent process──
  usually the MS-DOS command interpreter.

  In other words, if your program is executing and you enter a Ctrl-C,
  accidentally or intentionally, MS-DOS simply aborts the program. Any files
  the program has opened using file control blocks will not be closed
  properly, any interrupt vectors it has altered may not be restored
  correctly, and if it is performing any direct I/O operations (for example,
  if it contains an interrupt driver for the serial port), all kinds of
  unexpected events may occur.

  Although you can use a number of partially effective methods to defeat
  Ctrl-C checking, such as performing keyboard input with Int 21H Functions
  06H and 07H, placing all character devices into binary mode, or turning
  off the system break flag with Int 21H Function 33H, none of these is
  completely foolproof. The simplest and most elegant way to defeat Ctrl-C
  checking is simply to substitute your own Int 23H handler, which can take
  some action appropriate to your program. When the program terminates,
  MS-DOS automatically restores the previous contents of the Int 23H vector
  from information saved in the program segment prefix. The following
  example shows how to install your own Ctrl-C handler (which in this case
  does nothing at all):

  ──────────────────────────────────────────────────────────────────────────
          push    ds          ; save data segment
                              ; set int 23h vector...
          mov     ax,2523h    ; function 25h = set interrupt
                              ; int 23h = vector for
                              ; Ctrl-C handler
          mov     dx,seg handler ; DS:DX = handler address
          mov     ds,dx
          mov     dx,offset handler
          int     21h         ; transfer to MS-DOS

          pop     ds          ; restore data segment
          .
          .
          .
  handler:                    ; a Ctrl-C handler
          iret                ; that does nothing
  ──────────────────────────────────────────────────────────────────────────

  The first part of the code (which alters the contents of the Int 23H
  vector) would be executed in the initialization part of the application.
  The handler receives control whenever MS-DOS detects a Ctrl-C at the
  keyboard. (Because this handler consists only of an interrupt return, the
  Ctrl-C will remain in the keyboard input stream and will be passed to the
  application when it requests a character from the keyboard, appearing on
  the screen as ^C.)

  When an Int 23H handler is called, MS-DOS is in a stable state. Thus, the
  handler can call any MS-DOS function. It can also reset the segment
  registers and the stack pointer and transfer control to some other point
  in the application without ever returning control to MS-DOS with an IRET.

  On IBM PC compatibles, an additional interrupt handler must be taken into
  consideration. Whenever the ROM BIOS keyboard driver detects the key
  combination Ctrl-Break, it calls a handler whose address is stored in the
  vector for Int 1BH. The default ROM BIOS Int 1BH handler does nothing.
  MS-DOS alters the Int 1BH vector to point to its own handler, which sets a
  flag and returns; the net effect is to remap the Ctrl-Break into a Ctrl-C
  that is forced ahead of any other characters waiting in the keyboard
  buffer.

  Taking over the Int 1BH vector in an application is somewhat tricky but
  extremely useful. Because the keyboard is interrupt driven, a press of
  Ctrl-Break lets the application regain control under almost any
  circumstance──often, even if the program has crashed or is in an endless
  loop.

  You cannot, in general, use the same handler for Int 1BH that you use for
  Int 23H. The Int 1BH handler is more limited in what it can do, because it
  has been called as a result of a hardware interrupt and MS-DOS may have
  been executing a critical section of code at the time the interrupt was
  serviced. Thus, all registers except CS:IP are in an unknown state; they
  may have to be saved and then modified before your interrupt handler can
  execute. Similarly, the depth of the stack in use when the Int 1BH handler
  is called is unknown, and if the handler is to perform stack-intensive
  operations, it may have to save the stack segment and the stack pointer
  and switch to a new stack that is known to have sufficient depth.

  In normal application programs, you should probably have the Int 1BH
  handler return with an IRET rather than retain control. Because of
  subtle differences among non-IBM ROM BIOSes, it is difficult to predict
  the state of the keyboard controller and the 8259 Programmable Interrupt
  Controller (PIC) when the Int 1BH handler begins executing. Also, MS-DOS
  itself may not be in a stable state at the point of interrupt, a situation
  that can manifest itself in unexpected critical errors during subsequent
  I/O operations. Finally, MS-DOS versions 3.2 and later allocate a stack
  from an internal pool for use by the Int 09H handler. If the Int 1BH
  handler never returns, the Int 09H handler never returns either, and
  repeated entries of Ctrl-Break will eventually exhaust the stack pool,
  halting the system.

  Because Int 1BH is a ROM BIOS interrupt and not an MS-DOS interrupt,
  MS-DOS does not restore the previous contents of the Int 1BH vector when a
  program exits. If your program modifies this vector, it must save the
  original value and restore it before terminating. Otherwise, the vector
  will be left pointing to some random area in the next program that runs,
  and the next time the user presses Ctrl-Break a system crash is the best
  you can hope for.
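
  The save-and-restore discipline can be pictured with a small portable
  model. This is only a sketch: the array vector_table stands in for the
  real interrupt vector table, get_vector and set_vector imitate Int 21H
  Functions 35H and 25H, and every name here is hypothetical. No real
  interrupt vectors are touched.

```c
/* A far pointer as held in the vector table: offset, then segment. */
struct farptr {
    unsigned int off;
    unsigned int seg;
};

/* Simulated vector table (one entry per interrupt number). */
static struct farptr vector_table[256];

/* Model of Int 21H Function 35H (get interrupt vector). */
struct farptr get_vector(unsigned char intno)
{
    return vector_table[intno];
}

/* Model of Int 21H Function 25H (set interrupt vector). */
void set_vector(unsigned char intno, struct farptr handler)
{
    vector_table[intno] = handler;
}

static struct farptr saved_int1b;   /* original Ctrl-Break vector */

/* Install a new Int 1BH handler, remembering the old one first. */
void hook_ctrl_break(struct farptr new_handler)
{
    saved_int1b = get_vector(0x1b);
    set_vector(0x1b, new_handler);
}

/* Put the original handler back; call this before terminating. */
void unhook_ctrl_break(void)
{
    set_vector(0x1b, saved_int1b);
}
```

  The essential point is the ordering: the old vector is read before the
  new one is written, and the restore step runs on every exit path.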

Ctrl-C and Ctrl-Break Handlers and High-Level Languages

  Capturing the Ctrl-C and Ctrl-Break interrupts is straightforward when you
  are programming in assembly language. The process is only slightly more
  difficult with high-level languages, as long as you have enough
  information about the language's calling conventions that you can link in
  a small assembly-language routine as part of the program.

  The BREAK.ASM listing in Figure 5-1 contains source code for a Ctrl-Break
  handler that can be linked with small-model Microsoft C programs running
  on an IBM PC compatible. The short C program in Figure 5-2 demonstrates
  use of the handler. (This code should be readily portable to other C
  compilers.)

  ──────────────────────────────────────────────────────────────────────────
          page    55,132
          title   Ctrl-C & Ctrl-Break Handlers
          name    break

  ;
  ; Ctrl-C and Ctrl-Break handler for Microsoft C
  ; programs running on IBM PC compatibles
  ;
  ; by Ray Duncan
  ;
  ; Assemble with:  C>MASM /Mx BREAK;
  ;
  ; This module allows C programs to retain control
  ; when the user enters a Ctrl-Break or Ctrl-C.
  ; It uses Microsoft C parameter-passing conventions
  ; and assumes the C small memory model.
  ;
  ; The procedure _capture is called to install
  ; a new handler for the Ctrl-C and Ctrl-Break
  ; interrupts (1bh and 23h).  _capture is passed
  ; the address of a static variable, which will be
  ; set to true by the handler whenever a Ctrl-C
  ; or Ctrl-Break is detected.  The C syntax is:
  ;
  ;               static int flag;
  ;               capture(&flag);
  ;
  ; The procedure _release is called by the C program
  ; to restore the original Ctrl-Break and Ctrl-C
  ; handler. The C syntax is:
  ;               release();
  ;
  ; The procedure ctrlbrk is the actual interrupt
  ; handler.  It receives control when a software
  ; int 1bh is executed by the ROM BIOS or int 23h
  ; is executed by MS-DOS.  It simply sets the C
  ; program's variable to true (1) and returns.
  ;

  args    equ     4               ; stack offset of arguments,
                                  ; C small memory model

  cr      equ     0dh             ; ASCII carriage return
  lf      equ     0ah             ; ASCII linefeed

  _TEXT   segment word public 'CODE'

          assume cs:_TEXT


          public  _capture
  _capture proc   near            ; take over Ctrl-Break
                                  ; and Ctrl-C interrupt vectors

          push    bp              ; set up stack frame
          mov     bp,sp

          push    ds              ; save registers
          push    di
          push    si

                                  ; save address of
                                  ; calling program's "flag"
          mov     ax,word ptr [bp+args]
          mov     word ptr cs:flag,ax
          mov     word ptr cs:flag+2,ds

                                  ; save address of original
          mov     ax,3523h        ; int 23h handler
          int     21h
          mov     word ptr cs:int23,bx
          mov     word ptr cs:int23+2,es
          mov     ax,351bh        ; save address of original
          int     21h             ; int 1bh handler
          mov     word ptr cs:int1b,bx
          mov     word ptr cs:int1b+2,es
          push    cs              ; set DS:DX = address
          pop     ds              ; of new handler
          mov     dx,offset _TEXT:ctrlbrk

          mov     ax,02523h       ; set int 23h vector
          int     21h

          mov     ax,0251bh       ; set int 1bh vector
          int     21h

          pop     si              ; restore registers
          pop     di
          pop     ds

          pop     bp              ; discard stack frame
          ret                     ; and return to caller

  _capture endp


          public  _release
  _release proc   near            ; restore original Ctrl-C
                                  ; and Ctrl-Break handlers

          push    bp              ; save registers
          push    ds
          push    di
          push    si

          lds     dx,cs:int1b     ; get address of previous
                                  ; int 1bh handler

          mov     ax,251bh        ; set int 1bh vector
          int     21h

          lds     dx,cs:int23     ; get address of previous
                                  ; int 23h handler

          mov     ax,2523h        ; set int 23h vector
          int     21h

          pop     si              ; restore registers
          pop     di              ; and return to caller
          pop     ds
          pop     bp
          ret
  _release endp

  ctrlbrk proc    far             ; Ctrl-C and Ctrl-Break
                                  ; interrupt handler

          push    bx              ; save registers
          push    ds

          lds     bx,cs:flag      ; get address of C program's
                                  ; "flag variable"

                                  ; and set the flag "true"
          mov     word ptr ds:[bx],1

          pop     ds              ; restore registers
          pop     bx

          iret                    ; return from handler

  ctrlbrk endp

  flag    dd      0               ; far pointer to caller's
                                  ; Ctrl-Break or Ctrl-C flag

  int23   dd      0               ; address of original
                                  ; Ctrl-C handler

  int1b   dd      0               ; address of original
                                  ; Ctrl-Break handler

  _TEXT   ends

          end
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-1.  BREAK.ASM: A Ctrl-C and Ctrl-Break interrupt handler that can
  be linked with Microsoft C programs.

  ──────────────────────────────────────────────────────────────────────────
  /*
      TRYBREAK.C

      Demo of BREAK.ASM Ctrl-Break and Ctrl-C
      interrupt handler, by Ray Duncan

      To create the executable file TRYBREAK.EXE, enter:

      MASM /Mx BREAK;
      CL TRYBREAK.C BREAK.OBJ
  */

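  #include <stdio.h>
  #include <conio.h>                   /* kbhit, getch, putch     */

  void capture(int *);                 /* defined in BREAK.ASM    */
  void release(void);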

  main(int argc, char *argv[])
  {
      int hit = 0;                     /* flag for key press      */
      int c = 0;                       /* character from keyboard */
      static int flag = 0;             /* true if Ctrl-Break
                                          or Ctrl-C detected      */

      puts("\n*** TRYBREAK.C running ***\n");
      puts("Press Ctrl-C or Ctrl-Break to test handler,");
      puts("Press the Esc key to exit TRYBREAK.\n");

      capture(&flag);                  /* install new Ctrl-C and
                                          Ctrl-Break handler and
                                          pass address of flag    */

      puts("TRYBREAK has captured interrupt vectors.\n");

      while(1)
      {
          hit = kbhit();               /* check for key press     */
                                       /* (MS-DOS sees Ctrl-C
                                           when keyboard polled)  */

          if(flag != 0)                /* if flag is true, an     */
          {                            /* interrupt has occurred  */
              puts("\nControl-Break detected.\n");
              flag = 0;                /* reset interrupt flag    */
          }
          if(hit != 0)                 /* if any key waiting      */
          {
              c = getch();             /* read key, exit if Esc   */
              if( (c & 0x7f) == 0x1b) break;
              putch(c);                /* otherwise display it    */
          }
      }
      release();                       /* restore original Ctrl-C
                                          and Ctrl-Break handlers */

      puts("\n\nTRYBREAK has released interrupt vectors.");
  }
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-2.  TRYBREAK.C: A simple Microsoft C program that demonstrates
  use of the interrupt handler BREAK.ASM from Figure 5-1.

  In the example handler, the procedure named capture is called with the
  address of an integer variable within the C program. It saves the address
  of the variable, points the Int 1BH and Int 23H vectors to its own
  interrupt handler, and then returns.

  When MS-DOS detects a Ctrl-C or the ROM BIOS detects a Ctrl-Break, the
  interrupt handler sets the integer variable within the C program to true
  (1) and returns. The C program can then poll this variable at its leisure.
  Of course, to detect more than one such event, the program must reset the
  variable to zero again.

  The procedure named release simply restores the Int 1BH and Int 23H
  vectors to their original values, thereby disabling the interrupt handler.
  Although it is not strictly necessary for release to do anything about Int
  23H, this action does give the C program the option of restoring the
  default handler for Int 23H without terminating.
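
  The same set-a-flag-and-poll pattern can be expressed with the standard
  C signal facility, for readers who want to experiment without an
  assembler. This sketch is not the BREAK.ASM technique: it assumes the
  compiler's run-time library delivers Ctrl-C as SIGINT, it does nothing
  about Int 1BH, and the function names are hypothetical.

```c
#include <signal.h>

static volatile sig_atomic_t break_flag = 0;    /* set by the handler */
static void (*old_handler)(int) = SIG_DFL;      /* previous handler   */

/* Signal handler: re-arm itself, set the flag, and return. */
static void on_break(int signo)
{
    signal(signo, on_break);    /* some libraries reset to SIG_DFL */
    break_flag = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int capture_break(void)
{
    void (*prev)(int) = signal(SIGINT, on_break);

    if (prev == SIG_ERR)
        return -1;
    old_handler = prev;
    return 0;
}

/* Restore whatever handler was in effect before capture_break. */
int release_break(void)
{
    return (signal(SIGINT, old_handler) == SIG_ERR) ? -1 : 0;
}

/* Poll the flag; returns 1 once per detected break, else 0. */
int break_seen(void)
{
    if (break_flag) {
        break_flag = 0;         /* reset so the next one is seen */
        return 1;
    }
    return 0;
}
```

  A main loop would call break_seen at a convenient point on each pass,
  just as TRYBREAK.C tests its flag variable.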


Pointing Devices

  Device drivers for pointing devices are supplied by the hardware
  manufacturer and are loaded with a DEVICE statement in the CONFIG.SYS
  file. Although the hardware characteristics of the available pointing
  devices differ greatly, nearly all of their drivers present the same
  software interface to application programs: the Int 33H protocol used by
  the Microsoft Mouse driver. Version 6 of the Microsoft Mouse driver (which
  was current as this was written) offers the following functions:

  Function           Meaning
  ──────────────────────────────────────────────────────────────────────────
  00H                Reset mouse and get status.
  01H                Show mouse pointer.
  02H                Hide mouse pointer.
  03H                Get button status and pointer position.
  04H                Set pointer position.
  05H                Get button-press information.
  06H                Get button-release information.
  07H                Set horizontal limits for pointer.
  08H                Set vertical limits for pointer.
  09H                Set graphics pointer type.
  0AH                Set text pointer type.
  0BH                Read mouse-motion counters.
  0CH                Install interrupt handler for mouse events.
  0DH                Turn on light pen emulation.
  0EH                Turn off light pen emulation.
  0FH                Set mickeys to pixel ratio.
  10H                Set pointer exclusion area.
  13H                Set double-speed threshold.
  14H                Swap mouse-event interrupt routines.
  15H                Get buffer size for mouse-driver state.
  16H                Save mouse-driver state.
  17H                Restore mouse-driver state.
  18H                Install alternate handler for mouse events.
  19H                Get address of alternate handler.
  1AH                Set mouse sensitivity.
  1BH                Get mouse sensitivity.
  1CH                Set mouse interrupt rate.
  1DH                Select display page for pointer.
  1EH                Get display page for pointer.
  1FH                Disable mouse driver.
  20H                Enable mouse driver.
  21H                Reset mouse driver.
  22H                Set language for mouse-driver messages.
  23H                Get language number.
  24H                Get driver version, mouse type, and IRQ number.
  ──────────────────────────────────────────────────────────────────────────

  Although this list of mouse functions may appear intimidating, the average
  application will only need a few of them.

  A program first calls Int 33H Function 00H to initialize the mouse driver
  for the current display mode and to check its status. At this point, the
  mouse is "alive" and the application can obtain its state and position;
  however, the pointer does not become visible until the process calls Int
  33H Function 01H.

  The program can then call Int 33H Functions 03H, 05H, and 06H to
  monitor the mouse position and the status of the mouse buttons.
  Alternatively, the program can register an interrupt handler for mouse
  events, using Int 33H Function 0CH. This latter technique eliminates the
  need to poll the mouse driver; the driver will notify the program by
  calling the interrupt handler whenever the mouse is moved or a button is
  pressed or released.
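
  The values returned by Int 33H Function 03H (button status in BX,
  horizontal position in CX, vertical position in DX) usually need a
  little decoding before an application can act on them. The sketch below
  is one way to do it in portable C. The names are hypothetical, and the
  division by 8 assumes an 80-by-25 text mode in which the driver reports
  positions in units of 8 per character cell; other modes would need a
  different scale factor.

```c
#define MOUSE_LEFT   0x01       /* bit 0 of BX: left button down  */
#define MOUSE_RIGHT  0x02       /* bit 1 of BX: right button down */

struct mouse_state {
    int left;                   /* nonzero if left button down    */
    int right;                  /* nonzero if right button down   */
    int col;                    /* text column, 0-based           */
    int row;                    /* text row, 0-based              */
};

/* Decode the BX, CX, and DX values from Int 33H Function 03H. */
struct mouse_state decode_mouse(unsigned int bx, unsigned int cx,
                                unsigned int dx)
{
    struct mouse_state m;

    m.left  = (bx & MOUSE_LEFT)  != 0;
    m.right = (bx & MOUSE_RIGHT) != 0;
    m.col   = (int)(cx / 8);    /* assumed: 8 units per column    */
    m.row   = (int)(dx / 8);    /* assumed: 8 units per row       */
    return m;
}
```

  With both buttons down at position (320,96), for instance, the function
  reports column 40 and row 12.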

  When the application is finished with the mouse, it can call Int 33H
  Function 02H to hide the mouse pointer. If the program has registered an
  interrupt handler for mouse events, it should disable further calls to the
  handler by resetting the mouse driver again with Int 33H Function 00H.

  For a complete description of the mouse-driver functions, see Section
  III of this book, "IBM ROM BIOS and Mouse Functions Reference." Figure
  5-3 shows a small demonstration program that polls the mouse continually,
  to display its position and status.

  ──────────────────────────────────────────────────────────────────────────
  /*
      Simple Demo of Int 33H Mouse Driver
      (C) 1988 Ray Duncan

      Compile with: CL MOUDEMO.C
  */

  #include <stdio.h>
  #include <dos.h>
  #include <stdlib.h>                 /* exit                      */

  union REGS regs;

  void cls(void);                     /* function prototypes       */
  void gotoxy(int, int);

  main(int argc, char *argv[])
  {
      int x,y,buttons;                /* some scratch variables    */
                                      /* for the mouse state       */

      regs.x.ax = 0;                  /* reset mouse driver        */
      int86(0x33, &regs, &regs);      /* and check status          */

      if(regs.x.ax == 0)              /* exit if no mouse          */
      {   printf("\nMouse not available\n");
          exit(1);
      }

      cls();                          /* clear the screen          */
      gotoxy(45,0);                   /* and show help info        */
      puts("Press Both Mouse Buttons To Exit");

      regs.x.ax = 1;                  /* display mouse cursor      */
      int86(0x33, &regs, &regs);

      do {
          regs.x.ax = 3;              /* get mouse position        */
          int86(0x33, &regs, &regs);  /* and button status         */
          buttons = regs.x.bx & 3;
          x = regs.x.cx;
          y = regs.x.dx;
          gotoxy(0,0);                 /* display mouse position    */
          printf("X = %3d  Y = %3d", x, y);

      } while(buttons != 3);           /* exit if both buttons down */

      regs.x.ax = 2;                   /* hide mouse cursor         */
      int86(0x33, &regs, &regs);

      cls();                           /* display message and exit  */
      gotoxy(0,0);
      puts("Have a Mice Day!");
  }

  /*
      Clear the screen
  */
  void cls(void)
  {
      regs.x.ax = 0x0600;              /* ROM BIOS video driver     */
      regs.h.bh = 7;                   /* int 10h function 06h      */
      regs.x.cx = 0;                   /* initializes a window      */
      regs.h.dh = 24;
      regs.h.dl = 79;
      int86(0x10, &regs, &regs);
  }

  /*
      Position cursor to (x,y)
  */
  void gotoxy(int x, int y)
  {
      regs.h.dl = x;                   /* ROM BIOS video driver     */
      regs.h.dh = y;                   /* int 10h function 02h      */
      regs.h.bh = 0;                   /* positions the cursor      */
      regs.h.ah = 2;
      int86(0x10, &regs, &regs);
  }
  ──────────────────────────────────────────────────────────────────────────

  Figure 5-3.  MOUDEMO.C: A simple Microsoft C program that polls the mouse
  and continually displays the coordinates of the mouse pointer in the upper
  left corner of the screen. The program uses the ROM BIOS video driver,
  which is discussed in Chapter 6, to clear the screen and position the
  text cursor.



────────────────────────────────────────────────────────────────────────────
Chapter 6  Video Display

  The visual presentation of an application program is one of its most
  important elements. Users frequently base their conclusions about a
  program's performance and "polish" on the speed and attractiveness of its
  displays. Therefore, a feel for the computer system's display facilities
  and capabilities at all levels, from MS-DOS down to the bare hardware, is
  important to you as a programmer.


Video Display Adapters

  The video display adapters found in IBM PC─compatible computers have a
  hybrid interface to the central processor. The overall display
  characteristics, such as vertical and horizontal resolution, background
  color, and palette, are controlled by values written to I/O ports whose
  addresses are hardwired on the adapter, whereas the appearance of each
  individual character or graphics pixel on the display is controlled by a
  specific location within an area of memory called the regen buffer or
  refresh buffer. Both the CPU and the video controller access this memory;
  the software updates the display by simply writing character codes or bit
  patterns directly into the regen buffer. (This is called memory-mapped
  I/O.)

  The following adapters are in common use as this book is being written:

  ■  Monochrome/Printer Display Adapter (MDA). Introduced with the original
     IBM PC in 1981, this adapter supports 80-by-25 text display on a green
     (monochrome) screen and has no graphics capabilities at all.

  ■  Color/Graphics Adapter (CGA). Also introduced by IBM in 1981, this
     adapter supports 40-by-25 and 80-by-25 text modes and 320-by-200,
     4-color or 640-by-200, 2-color graphics (all-points-addressable, or
     APA) modes on composite or digital RGB monitors.

  ■  Enhanced Graphics Adapter (EGA). Introduced by IBM in 1985 and upwardly
     compatible from the CGA, this adapter adds support for 640-by-350,
     16-color graphics modes on digital RGB monitors. It also supports an
     MDA-compatible text mode.

  ■  Multi-Color Graphics Array (MCGA). Introduced by IBM in 1987 with the
     Personal System/2 (PS/2) models 25 and 30, this adapter is partially
     compatible with the CGA and EGA and supports 640-by-480, 2-color or
     320-by-200, 256-color graphics on analog RGB monitors.

  ■  Video Graphics Array (VGA). Introduced by IBM in 1987 with the PS/2
     models 50, 60, and 80, this adapter is upwardly compatible from the EGA
     and supports 640-by-480, 16-color or 320-by-200, 256-color graphics on
     analog RGB monitors. It also supports an MDA-compatible text mode.

  ■  Hercules Graphics Card, Graphics CardPlus, and InColor Cards. These are
     upwardly compatible from the MDA for text display but offer graphics
     capabilities that are incompatible with all of the IBM adapters.

  The locations of the regen buffers for the various IBM PC─compatible
  adapters are shown in Figure 6-1.

         ┌───────────────────────────────────────────────────────┐
         │                       ROM BIOS                        │
  FE000H ├───────────────────────────────────────────────────────┤
         │          System ROM, Stand-alone BASIC, etc.          │
  F4000H ├───────────────────────────────────────────────────────┤
         │             Reserved for BIOS extensions              │
         │             (hard-disk controller, etc.)              │
  C0000H ├───────────────────────────────────────────────────────┤
         │                       Reserved                        │
  BC000H ├───────────────────────────────────────────────────────┤
         │    16 KB regen buffer for CGA, EGA, MCGA, and VGA     │
         │       in text modes and 200-line graphics modes       │
  B8000H ├───────────────────────────────────────────────────────┤
         │                       Reserved                        │
  B1000H ├───────────────────────────────────────────────────────┤
         │         4 KB Monochrome Adapter regen buffer          │
  B0000H ├───────────────────────────────────────────────────────┤
         │       Regen buffer area for EGA, MCGA, and VGA        │
         │        in 350-line or 480-line graphics modes         │
  A0000H ├───────────────────────────────────────────────────────┤
         │             Transient part of COMMAND.COM             │
         ├───────────────────────────────────────────────────────┤
         │                Transient program area                 │
  varies ├───────────────────────────────────────────────────────┤
         │                MS-DOS and its buffers,                │
         │              tables, and device drivers               │
  00400H ├───────────────────────────────────────────────────────┤
         │                   Interrupt vectors                   │
  00000H └───────────────────────────────────────────────────────┘

  Figure 6-1.  Memory diagram of an IBM PC─compatible personal computer,
  showing the locations of the regen buffers for various adapters.
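
  To make the idea of memory-mapped display concrete, the following
  portable C sketch computes where a given character cell falls within an
  80-by-25 text-mode regen buffer. It relies on background not covered in
  this section, namely that each cell occupies two bytes (character code,
  then attribute). An ordinary array stands in for the buffer; on real
  hardware the pointer would instead refer to B8000H or B0000H. The
  function names are hypothetical.

```c
#define TEXT_COLS  80           /* columns in 80-by-25 text mode  */
#define TEXT_ROWS  25           /* rows in 80-by-25 text mode     */

/* Byte offset of the cell at (row, col), both 0-based. */
unsigned int cell_offset(unsigned int row, unsigned int col)
{
    return (row * TEXT_COLS + col) * 2;
}

/* Store a character and its attribute into the regen buffer. */
void put_cell(unsigned char *regen, unsigned int row, unsigned int col,
              unsigned char ch, unsigned char attr)
{
    unsigned int off = cell_offset(row, col);

    regen[off]     = ch;        /* even byte: character code      */
    regen[off + 1] = attr;      /* odd byte: display attribute    */
}
```

  The last cell on the screen, row 24 and column 79, lies at offset 3998,
  so one full page of text fits within the 4 KB monochrome buffer shown
  in the figure.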


Support Considerations

  MS-DOS offers several functions to transfer text to the display. Version 1
  supported only Teletype-like output capabilities; version 2 added an
  optional ANSI console driver to allow the programmer to clear the screen,
  position the cursor, and select colors and attributes with standard escape
  sequences embedded in the output. Programs that use only the MS-DOS
  functions will operate properly on any computer system that runs MS-DOS,
  regardless of the level of IBM hardware compatibility.
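
  As a brief illustration of what such escape sequences look like, the C
  sketch below builds the strings for clearing the screen, positioning
  the cursor, and selecting an attribute. The helper names are
  hypothetical, and the sequences have an effect only if an ANSI console
  driver is loaded; without one, the characters simply appear on the
  screen.

```c
#include <stdio.h>

/* ESC [ 2 J : clear the screen and home the cursor. */
#define ANSI_CLEAR  "\033[2J"

/*
 * Build ESC [ row ; col H (position cursor) in buf.  Rows and
 * columns are 1-based.  buf must hold at least 32 bytes.
 * Returns the length of the sequence.
 */
int ansi_goto(char *buf, int row, int col)
{
    return sprintf(buf, "\033[%d;%dH", row, col);
}

/*
 * Build ESC [ attr m (select graphic rendition) in buf, for
 * example 0 = normal, 1 = bold, 7 = reverse video.  buf must
 * hold at least 16 bytes.  Returns the length of the sequence.
 */
int ansi_attr(char *buf, int attr)
{
    return sprintf(buf, "\033[%dm", attr);
}
```

  A program would write the resulting strings to the standard output in
  the ordinary way, interleaved with the text to be displayed.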

  On IBM PC─compatible machines, the ROM BIOS contains a video driver that
  programs can invoke directly, bypassing MS-DOS. The ROM BIOS functions
  allow a program to write text or individual pixels to the screen or to
  select display modes, video pages, palette, and foreground and background
  colors. These functions are relatively efficient (compared with the MS-DOS
  functions, at least), although the graphics support is primitive.

  Unfortunately, the display functions of both MS-DOS and the ROM BIOS were
  designed around the model of a cursor-addressable terminal and therefore
  do not fully exploit the capabilities of the memory-mapped, high-bandwidth
  display adapters used on IBM PC─compatible machines. As a result, nearly
  every popular interactive application with full-screen displays or
  graphics capability ignores both MS-DOS and the ROM BIOS and writes
  directly to the video controller's registers and regen buffer.

  Programs that control the hardware directly are sometimes called
  "ill-behaved," because they are performing operations that are normally
  reserved for operating-system device drivers. These programs are a severe
  management problem in multitasking real-mode environments such as DesqView
  and Microsoft Windows, and they are the main reason why such environments
  are not used more widely. It could be argued, however, that the blame for
  such problematic behavior lies not with the application programs but with
  the failure of MS-DOS and the ROM BIOS──even six years after the first
  appearance of the IBM PC──to provide display functions of adequate range
  and power.


MS-DOS Display Functions

  Under MS-DOS versions 2.0 and later, the preferred method for sending text
  to the display is to use handle-based Int 21H Function 40H (Write File or
  Device). When an application program receives control, MS-DOS has already
  assigned it handles for the standard output (1) and standard error (2)
  devices, and these handles can be used immediately. For example, the
  following sequence writes the message hello to the display using the
  standard output handle.

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message to display
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,1        ; BX = standard output handle
          mov     cx,msg_len  ; CX = message length
          mov     dx,seg msg  ; DS:DX = address of message
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  If there is no error, the function returns the carry flag cleared and the
  number of characters actually transferred in register AX. Unless a Ctrl-Z
  is embedded in the text or the standard output is redirected to a disk
  file and the disk is full, this number should equal the number of
  characters requested.

  As in the case of keyboard input, the user's ability to specify
  command-line redirection parameters that are invisible to the application
  means that if you use the predefined standard output handle, you can't
  always be sure where your output is going. However, to ensure that your
  output actually goes to the display, you can use the predefined standard
  error handle, which is always opened to the CON (logical console) device
  and is not redirectable.

  As an alternative to the standard output and standard error handles, you
  can bypass any output redirection and open a separate channel to CON,
  using the handle obtained from that open operation for character output.
  For example, the following code opens the console display for output and
  then writes the string hello to it:

  ──────────────────────────────────────────────────────────────────────────
  fname   db      'CON',0      ; name of CON device
  handle  dw      0            ; handle for CON device
  msg     db      'hello'      ; message to display
  msg_len equ     $-msg        ; length of message
          .
          .
          .
          mov     ax,3d02h     ; AH = function 3dh = open
                               ; AL = mode = read/write
          mov     dx,seg fname ; DS:DX = device name
          mov     ds,dx
          mov     dx,offset fname
          int     21h          ; transfer to MS-DOS
          jc      error        ; jump if open failed
          mov     handle,ax    ; save handle for CON
          .
          .
          .
          mov     ah,40h       ; function 40h = write
          mov     cx,msg_len   ; CX = message length
          mov     dx,seg msg   ; DS:DX = address of message
          mov     ds,dx
          mov     dx,offset msg
          mov     bx,handle    ; BX = CON device handle
          int     21h          ; transfer to MS-DOS
          jc      error        ; jump if error detected
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  As with the keyboard input functions, MS-DOS also supports traditional
  display functions that are upwardly compatible from the corresponding CP/M
  output calls:

  ■  Int 21H Function 02H sends the character in the DL register to the
     standard output device. It is sensitive to Ctrl-C interrupts, and it
     handles carriage returns, linefeeds, bell codes, and backspaces
     appropriately.

  ■  Int 21H Function 06H transfers the character in the DL register to the
     standard output device, but it is not sensitive to Ctrl-C interrupts.
     You must take care when using this function, because it can also be
     used for input and for status requests.

  ■  Int 21H Function 09H sends a string to the standard output device. The
     string is terminated by the $ character.

  With MS-DOS version 2 or later, these three traditional functions are
  converted internally to handle-based writes to the standard output and
  thus are susceptible to output redirection.

  The following sequence sounds a warning beep by sending an ASCII bell code
  (07H) to the display driver using the traditional character-output call
  Int 21H Function 02H.

  ──────────────────────────────────────────────────────────────────────────
          .
          .
          .
          mov     dl,7        ; 07h = ASCII bell code
          mov     ah,2        ; function 02h = display character
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The following sequence uses the traditional string-output call Int 21H
  Function 09H to display a string:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello$'
          .
          .
          .
          mov     dx,seg msg  ; DS:DX = message address
          mov     ds,dx
          mov     dx,offset msg
          mov     ah,9        ; function 09h = write string
          int     21h         ; transfer to MS-DOS
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Note that MS-DOS detects the $ character as a terminator and does not
  display it on the screen.
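
  The terminator convention can be modeled in portable C. The following is
  only a sketch, not MS-DOS code: the helper name dollar_length is
  hypothetical, and it merely reports how many characters Function 09H would
  display from a given buffer.

```c
#include <stddef.h>

/* Count the characters Function 09H would display: everything up to,
   but not including, the first '$'. Returns (size_t)-1 if no
   terminator is found within the first max bytes. */
size_t dollar_length(const char *s, size_t max)
{
    size_t i;

    for (i = 0; i < max; i++) {
        if (s[i] == '$')
            return i;           /* terminator found; it is not displayed */
    }
    return (size_t)-1;          /* no terminator in the buffer */
}
```

  One consequence is that a string containing a dollar sign cannot be
  displayed in full with Function 09H; Function 40H, which takes an explicit
  length, has no such restriction.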

Screen Control with MS-DOS Functions

  With version 2.0 or later, if MS-DOS loads the optional device driver
  ANSI.SYS in response to a DEVICE directive in the CONFIG.SYS file,
  programs can clear the screen, control the cursor position, and select
  foreground and background colors by embedding escape sequences in the text
  output. Escape sequences are so called because they begin with an escape
  character (1BH), which alerts the driver to intercept and interpret the
  subsequent characters in the sequence. When the ANSI driver is not loaded,
  MS-DOS simply passes the escape sequence to the display like any other
  text, usually resulting in a chaotic screen.

  The escape sequences that can be used with the ANSI driver for screen
  control are a subset of those defined in the ANSI X3.64─1979 Standard.
  These standard sequences are summarized in Figure 6-2. Note that case is
  significant for the last character in an escape sequence and that numbers
  must always be represented as ASCII digit strings, not as their binary
  values. (A separate set of escape sequences supported by ANSI.SYS, but not
  compatible with the ANSI standard, may be used for reprogramming and
  remapping the keyboard.)

  Escape sequence    Meaning
  ──────────────────────────────────────────────────────────────────────────
  Esc[2J             Clear screen; place cursor in upper left corner (home
                     position).
  Esc[K              Clear from cursor to end of line.
  Esc[row;colH       Position cursor. (Row is the y coordinate in the range
                     1─25 and col is the x coordinate in the range 1─80 for
                     80-by-25 text display modes.) Escape sequences
                     terminated with the letter f instead of H have the same
                     effect.
  Esc[nA             Move cursor up n rows.
  Esc[nB             Move cursor down n rows.
  Esc[nC             Move cursor right n columns.
  Esc[nD             Move cursor left n columns.
  Esc[s              Save current cursor position.
  Esc[u              Restore cursor to saved position.
  Esc[6n             Return current cursor position on the standard input
                     handle in the format Esc[row;colR.
  Esc[nm             Select character attributes:
                      0 = no special attributes
                      1 = high intensity
                      2 = low intensity
                      3 = italic
                      4 = underline
                      5 = blink
                      6 = rapid blink
                      7 = reverse video
                      8 = concealed text (no display)
                     30 = foreground black
                     31 = foreground red
                     32 = foreground green
                     33 = foreground yellow
                     34 = foreground blue
                     35 = foreground magenta
                     36 = foreground cyan
                     37 = foreground white
                     40 = background black
                     41 = background red
                     42 = background green
                     43 = background yellow
                     44 = background blue
                     45 = background magenta
                     46 = background cyan
                     47 = background white
  Esc[=nh            Select display mode:
                      0 = 40-by-25, 16-color text (color burst off)
                      1 = 40-by-25, 16-color text
                      2 = 80-by-25, 16-color text (color burst off)
                      3 = 80-by-25, 16-color text
                      4 = 320-by-200, 4-color graphics
                      5 = 320-by-200, 4-color graphics (color burst off)
                      6 = 640-by-200, 2-color graphics
                     14 = 640-by-200, 16-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     15 = 640-by-350, 2-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     16 = 640-by-350, 16-color graphics (EGA and VGA,
                     MS-DOS 4.0)
                     17 = 640-by-480, 2-color graphics (MCGA and VGA,
                     MS-DOS 4.0)
                     18 = 640-by-480, 16-color graphics (VGA, MS-DOS 4.0)
                     19 = 320-by-200, 256-color graphics (MCGA and VGA,
                     MS-DOS 4.0)
                     Escape sequences terminated with l instead of h have
                     the same effect.
  Esc[=7h            Enable line wrap.
  Esc[=7l            Disable line wrap.
  ──────────────────────────────────────────────────────────────────────────


  Figure 6-2.  The ANSI escape sequences supported by the MS-DOS ANSI.SYS
  driver. Programs running under MS-DOS 2.0 or later may use these
  functions, if ANSI.SYS is loaded, to control the appearance of the display
  in a hardware-independent manner. The symbol Esc indicates an ASCII escape
  code──a character with the value 1BH. Note that cursor positions in ANSI
  escape sequences are one-based, unlike the cursor coordinates used by the
  IBM ROM BIOS, which are zero-based. Numbers embedded in an escape sequence
  must always be represented as a string of ASCII digits, not as their
  binary values.
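
  As an illustration of the two points made in the caption (ASCII digit
  strings and one-based coordinates), the following C sketch builds a
  cursor-positioning sequence and an attribute sequence in a buffer. The
  function names ansi_cursor and ansi_attr are hypothetical, and the code
  assumes a C99 library that provides snprintf (compilers of the MS-DOS era
  would use sprintf instead). The resulting strings would still have to be
  written to the display, for example with Int 21H Function 40H.

```c
#include <stdio.h>

#define ESC "\x1b"              /* ASCII escape character, 1BH */

/* Build Esc[row;colH from zero-based (ROM BIOS style) coordinates.
   ANSI.SYS expects one-based values written as ASCII digits.
   Returns the length of the sequence, as reported by snprintf. */
int ansi_cursor(char *buf, size_t size, int bios_row, int bios_col)
{
    return snprintf(buf, size, ESC "[%d;%dH", bios_row + 1, bios_col + 1);
}

/* Build Esc[nm for one attribute code from Figure 6-2
   (for example, 31 = foreground red). */
int ansi_attr(char *buf, size_t size, int n)
{
    return snprintf(buf, size, ESC "[%dm", n);
}
```

  For example, the ROM BIOS home position (0,0) becomes Esc[1;1H, and the
  lower right corner of an 80-by-25 display, (24,79) as row and column,
  becomes Esc[25;80H.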

Binary Output Mode

  Under MS-DOS version 2 or later, you can substantially increase display
  speeds for well-behaved application programs without sacrificing hardware
  independence by selecting binary (raw) mode for the standard output. In
  binary mode, MS-DOS does not check between each character it transfers to
  the output device for a Ctrl-C waiting at the keyboard, nor does it filter
  the output string for certain characters such as Ctrl-Z.

  Bit 5 in the device information word associated with a device handle
  controls binary mode. Programs access the device information word by using
  Subfunctions 00H and 01H of the MS-DOS IOCTL function (I/O Control, Int
  21H Function 44H). For example, the following sequence places the standard
  output handle into binary mode.

  ──────────────────────────────────────────────────────────────────────────
                              ; get device information...
          mov     bx,1        ; standard output handle
          mov     ax,4400h    ; function 44h subfunction 00h
          int     21h         ; transfer to MS-DOS

          mov     dh,0        ; set upper byte of DX = 0
          or      dl,20h      ; set binary mode bit in DL

                              ; write device information...
                              ; (BX still has handle)
          mov     ax,4401h    ; function 44h subfunction 01h
          int     21h         ; transfer to MS-DOS
  ──────────────────────────────────────────────────────────────────────────

  Note that if a program changes the mode of any of the standard handles, it
  should restore those handles to ASCII (cooked) mode before it exits.
  Otherwise, subsequent application programs may behave in unexpected ways.
  For more detailed information on the IOCTL function, see Section II of
  this book, "MS-DOS Functions Reference."
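
  The bit manipulation involved in entering and leaving binary mode can be
  sketched in C. This is a model of the arithmetic only; it does not call
  MS-DOS. The names devinfo_raw and devinfo_cooked are hypothetical. Each
  function takes the device information word returned by Subfunction 00H
  and yields the value that would be passed in DX to Subfunction 01H.

```c
#include <stdint.h>

#define DEV_RAW_BIT 0x0020u     /* bit 5 = binary (raw) mode */

/* Value to write back when selecting binary mode: the upper byte is
   forced to zero (as the listing above does with DH) and bit 5 is set. */
uint16_t devinfo_raw(uint16_t info)
{
    return (uint16_t)((info & 0x00FFu) | DEV_RAW_BIT);
}

/* Value to write back when restoring ASCII (cooked) mode before exit:
   the upper byte is again zeroed and bit 5 is cleared. */
uint16_t devinfo_cooked(uint16_t info)
{
    return (uint16_t)((info & 0x00FFu) & ~DEV_RAW_BIT);
}
```

  A program that saves the original word at startup can simply write the
  cooked form of that word back just before it terminates.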


The ROM BIOS Display Functions

  You can somewhat improve the display performance of programs that are
  intended for use only on IBM PC─compatible machines by using the ROM BIOS
  video driver instead of the MS-DOS output functions. Accessed by means of
  Int 10H, the ROM BIOS driver supports the following functions for all of
  the currently available IBM display adapters:

  Function           Action
  ──────────────────────────────────────────────────────────────────────────
  Display mode control
  00H               Set display mode.
  0FH               Get display mode.

  Cursor control
  01H               Set cursor size.
  02H               Set cursor position.
  03H               Get cursor position and size.

  Writing to the display
  09H               Write character and attribute at cursor.
  0AH               Write character-only at cursor.
  0EH               Write character in teletype mode.

  Reading from the display
  08H               Read character and attribute at cursor.

  Graphics support
  0CH               Write pixel.
  0DH               Read pixel.

  Scroll or clear display
  06H               Scroll up or initialize window.
  07H               Scroll down or initialize window.

  Miscellaneous
  04H               Read light pen.
  05H               Select display page.
  0BH               Select palette/set border color.
  ──────────────────────────────────────────────────────────────────────────


  Additional ROM BIOS functions are available on the EGA, MCGA, VGA, and
  PCjr to support the enhanced features of these adapters, such as
  programmable palettes and character sets (fonts). Some of the functions
  are valid only in certain display modes.

  Each display mode is characterized by the number of colors it can display,
  its vertical resolution, its horizontal resolution, and whether it
  supports text or graphics memory mapping. The ROM BIOS identifies each
  mode by a unique number. Section III of this book, "IBM ROM BIOS and Mouse
  Functions Reference," documents all of the ROM BIOS Int 10H functions and
  display modes.

  As you can see from the preceding list, the ROM BIOS offers several
  desirable capabilities that are not available from MS-DOS, including
  initialization or scrolling of selected screen windows, modification of
  the cursor shape, and reading back the character being displayed at an
  arbitrary screen location. These functions can be used to isolate your
  program from the hardware on any IBM PC─compatible adapter. However, the
  ROM BIOS functions do not suffice for the needs of a high-performance,
  interactive, full-screen program such as a word processor. They do not
  support the rapid display of character strings at an arbitrary screen
  position, and they do not implement graphics operations at the level
  normally required by applications (for example, bit-block transfers and
  rapid drawing of lines, circles, and filled polygons). And, of course,
  they are of no use whatsoever in non-IBM display modes such as the
  monochrome graphics mode of the Hercules Graphics Card.

  Let's look at a simple example of a call to the ROM BIOS video driver. The
  following sequence writes the string hello to the screen:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'
  msg_len equ     $-msg
          .
          .
          .
          mov     si,seg msg  ; DS:SI = message address
          mov     ds,si
          mov     si,offset msg
          mov     cx,msg_len  ; CX = message length
          cld
  next:   lodsb               ; get AL = next character
          push    si          ; save message pointer
          mov     ah,0eh      ; int 10h function 0eh = write
                              ; character in teletype mode
          mov     bh,0        ; assume video page 0
          mov     bl,color    ; (use in graphics modes only)
          int     10h         ; transfer to ROM BIOS
          pop     si          ; restore message pointer
          loop    next        ; loop until message done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  (Note that the SI and DI registers are not necessarily preserved across a
  call to a ROM BIOS video function.)


Memory-mapped Display Techniques

  Display performance is best when an application program takes over
  complete control of the video adapter and the refresh buffer. Because the
  display is memory-mapped, the speed at which characters can be put on the
  screen is limited only by the CPU's ability to copy bytes from one
  location in memory to another. The trade-off for this performance is that
  such programs are highly sensitive to hardware compatibility and do not
  always function properly on "clones" or even on new models of IBM video
  adapters.

Text Mode

  Direct programming of the IBM PC─compatible video adapters in their text
  display modes (sometimes also called alphanumeric display modes) is
  straightforward. The character set is the same for all, and the cursor
  home position──(x,y) = (0,0)──is defined to be the upper left corner of
  the screen (Figure 6-3). The MDA uses 4 KB of memory starting at segment
  B000H as a regen buffer, and the various adapters with both text and
  graphics capabilities (CGA, EGA, MCGA, and VGA) use 16 KB of memory
  starting at segment B800H. (See Figure 6-1.) In the latter case, the 16
  KB is divided into "pages" that can be independently updated and
  displayed.

   (0,0)┌─────────────────────────────────┐(79,0)
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
        │                                 │
  (0,24)└─────────────────────────────────┘(79,24)

  Figure 6-3.  Cursor addressing for 80-by-25 text display modes (IBM ROM
  BIOS modes 2, 3, and 7).

  Each character-display position is allotted 2 bytes in the regen buffer.
  The first byte (even address) contains the ASCII code of the character,
  which is translated by a special hardware character generator into a
  dot-matrix pattern for the screen. The second byte (odd address) is the
  attribute byte. Several bit fields in this byte control such features as
  blinking, intensity (highlighting), and reverse video, depending on the
  adapter type and display mode (Figures 6-4 and 6-5). Figure 6-6 shows a
  hex and ASCII dump of part of the video map for the MDA.

  Display                  Background              Foreground
  ──────────────────────────────────────────────────────────────────────────
  No display (black)       000                     000
  No display (white)       111                     111
  Underline                000                     001
  Normal video             000                     111
  Reverse video            111                     000
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-4.  Attribute byte for 80-by-25 monochrome text display mode on
  the MDA, Hercules cards, EGA, and VGA (IBM ROM BIOS mode 7).

  Value              Color
  ──────────────────────────────────────────────────────────────────────────
   0                 Black
   1                 Blue
   2                 Green
   3                 Cyan
   4                 Red
   5                 Magenta
   6                 Brown
   7                 White
   8                 Gray
   9                 Light blue
  10                 Light green
  11                 Light cyan
  12                 Light red
  13                 Light magenta
  14                 Yellow
  15                 Intense white
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-5.  Attribute byte for the 40-by-25 and 80-by-25 text display
  modes on the CGA, EGA, MCGA, and VGA (IBM ROM BIOS modes 0─3). The table
  of color values assumes default palette programming and that the B or I
  bit controls intensity.
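
  A color attribute byte can be assembled from the values in this table. The
  C sketch below assumes the conventional layout of the byte: bit 7 is the
  blink (B) bit, bits 4 through 6 hold the background color, and bits 0
  through 3 hold the foreground color with bit 3 acting as the intensity
  (I) bit. The function name text_attr is hypothetical.

```c
#include <stdint.h>

/* Combine foreground (0-15), background (0-7), and blink (0 or 1)
   into one attribute byte. Out-of-range values are masked. */
uint8_t text_attr(unsigned fg, unsigned bg, unsigned blink)
{
    return (uint8_t)(((blink & 1u) << 7) | ((bg & 7u) << 4) | (fg & 15u));
}
```

  Under these assumptions, yellow on blue is text_attr(14, 1, 0), or 1EH,
  and the normal video attribute 07H is text_attr(7, 0, 0).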

  ──────────────────────────────────────────────────────────────────────────
  B000:0000 3e 07 73 07 65 07 6c 07 65 07 63 07 74 07 20 07
  B000:0010 74 07 65 07 6d 07 70 07 20 07 20 07 20 07 20 07
  B000:0020 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0030 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0040 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0050 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0060 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0070 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0080 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  B000:0090 20 07 20 07 20 07 20 07 20 07 20 07 20 07 20 07
  ──────────────────────────────────────────────────────────────────────────

  Figure 6-6.  Example dump of the first 160 bytes of the MDA's regen
  buffer. These bytes correspond to the first visible line on the screen.
  Note that ASCII character codes are stored in even bytes and their
  respective character attributes in odd bytes; all the characters in this
  example line have the attribute normal video.
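
  The interleaving of characters and attributes is easy to see by decoding
  such a dump. The following C sketch works on an ordinary byte array that
  holds a copy of the regen buffer; the function name regen_text is
  hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Extract the character codes (even bytes) of the first `cells`
   character positions into a NUL-terminated string. The caller must
   supply an output buffer of at least cells + 1 bytes. */
void regen_text(const uint8_t *regen, size_t cells, char *out)
{
    size_t i;

    for (i = 0; i < cells; i++)
        out[i] = (char)regen[i * 2];    /* skip the odd attribute bytes */
    out[cells] = '\0';
}
```

  Applied to the first 12 character positions of the dump in Figure 6-6,
  this yields the text >select temp.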

  You can calculate the memory offset of any character on the display as the
  line number (y coordinate) times 80 characters per line times 2 bytes per
  character, plus the column number (x coordinate) times 2 bytes per
  character, plus (for the text/graphics adapters) the page number times the
  size of the page (4 KB per page in 80-by-25 modes; 2 KB per page in
  40-by-25 modes). In short, the formula for the offset of the
  character-attribute pair for a given screen position (x,y) in 80-by-25
  text modes is

    offset = ((y * 50H + x) * 2) + (page * 1000H)

  In 40-by-25 text modes, the formula is

    offset = ((y * 28H + x) * 2) + (page * 0800H)

  Of course, the segment register being used to address the video buffer
  must be set appropriately, depending on the type of display adapter.
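
  The offset calculation can be expressed as a small C function. This is a
  sketch under stated assumptions: the name text_offset is hypothetical, the
  row stride is taken to be the number of columns in the mode (80 or 40),
  and the page size is 1000H bytes for 80-column modes and 0800H bytes for
  40-column modes.

```c
/* Byte offset of the character-attribute pair at column x, row y
   on the given display page. cols is 80 or 40. */
unsigned text_offset(unsigned x, unsigned y, unsigned page, unsigned cols)
{
    unsigned page_size = (cols == 40) ? 0x0800u : 0x1000u;

    return ((y * cols + x) * 2u) + page * page_size;
}
```

  For the lower right corner of page 0 in an 80-by-25 mode, the result is
  (24 * 80 + 79) * 2 = 3998, the last character cell of the 4000 bytes that
  one screen occupies.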

  As a simple example, assume that the character to be displayed is in the
  AL register, the desired attribute byte for the character is in the AH
  register, the x coordinate (column) is in the BX register, and the y
  coordinate (row) is in the CX register. The following code stores the
  character and attribute byte into the MDA's video refresh buffer at the
  proper location:

  ──────────────────────────────────────────────────────────────────────────
          push    ax          ; save char and attribute
          mov     ax,160      ; AX = bytes per row (80 * 2)
          mul     cx          ; DX:AX = Y * 160
          shl     bx,1        ; multiply X by 2
          add     bx,ax       ; BX = (Y*160) + (X*2)
          mov     ax,0b000h   ; ES = segment of monochrome
          mov     es,ax       ; adapter refresh buffer
          pop     ax          ; restore char and attribute
          mov     es:[bx],ax  ; write them to video buffer
  ──────────────────────────────────────────────────────────────────────────

  More frequently, we wish to move entire strings into the refresh buffer,
  starting at a given coordinate. In the next example, assume that the DS:SI
  registers point to the source string, the ES:DI registers point to the
  starting position in the video buffer (calculated as shown in the previous
  example), the AH register contains the attribute byte to be assigned to
  every character in the string, and the CX register contains the length of
  the string. The following code moves the entire string into the refresh
  buffer:

  ──────────────────────────────────────────────────────────────────────────
  xfer:   lodsb               ; fetch next character
          stosw               ; store char + attribute
          loop    xfer        ; until all chars moved
  ──────────────────────────────────────────────────────────────────────────

  Of course, the video drivers written for actual application programs must
  take into account many additional factors, such as checking for special
  control codes (linefeeds, carriage returns, tabs), line wrap, and
  scrolling.
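
  The same string transfer can be simulated in C on an ordinary array that
  stands in for the regen buffer of an 80-column text mode. This is a sketch
  only: the name put_string and the size parameter are hypothetical, and,
  like the assembly version, it handles no control codes, line wrap, or
  scrolling. Unlike the assembly version, it stops at the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define COLS 80                 /* characters per row */

/* Store the NUL-terminated string s at column x, row y, giving every
   character the same attribute. Returns the number of characters
   actually stored. */
size_t put_string(uint8_t *regen, size_t size, unsigned x, unsigned y,
                  uint8_t attr, const char *s)
{
    size_t off = ((size_t)y * COLS + x) * 2;
    size_t n = 0;

    while (s[n] != '\0' && off + 2 <= size) {
        regen[off++] = (uint8_t)s[n];   /* even byte: character code */
        regen[off++] = attr;            /* odd byte: attribute */
        n++;
    }
    return n;
}
```

  On real hardware the array would be replaced by a far pointer to segment
  B000H or B800H, which is compiler-specific and is not shown here.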

  Programs that write characters directly to the CGA regen buffer in text
  modes must deal with an additional complicating factor──they must examine
  the video controller's status port and access the refresh buffer only
  during the horizontal retrace or vertical retrace intervals. (A retrace
  interval is the period when the electron beam that illuminates the screen
  phosphors is being repositioned to the start of a new scan line.)
  Otherwise, the contention for memory between the CPU and the video
  controller is manifest as unsightly "snow" on the display. (If you are
  writing programs for any of the other IBM PC─compatible video adapters,
  such as the MDA, EGA, MCGA, or VGA, you can ignore the retrace intervals;
  snow is not a problem with these video controllers.)

  A program can detect the occurrence of a retrace interval by monitoring
  certain bits in the video controller's status register. For example,
  assume that the offset for the desired character position has been
  calculated as in the preceding example and placed in the BX register, the
  segment for the CGA's refresh buffer is in the ES register, and an ASCII
  character code to be displayed is in the CL register. The following code
  waits for the beginning of a new horizontal retrace interval and then
  writes the character into the buffer:

  ──────────────────────────────────────────────────────────────────────────
          mov     dx,03dah    ; DX = video controller's
                              ; status port address
          cli                 ; disable interrupts

                              ; if retrace is already
                              ; in progress, wait for
                              ; it to end...
  wait1:  in      al,dx       ; read status port
          and     al,1        ; check if retrace bit on
          jnz     wait1       ; yes, wait

                              ; wait for new retrace
                              ; interval to start...
  wait2:  in      al,dx       ; read status port
          and     al,1        ; retrace bit on yet?
          jz      wait2       ; jump if not yet on

          mov     es:[bx],cl  ; write character to
                              ; the regen buffer
          sti                 ; enable interrupts again
  ──────────────────────────────────────────────────────────────────────────

  The first wait loop "synchronizes" the code to the beginning of a
  horizontal retrace interval. If only the second wait loop were used (that
  is, if a character were written when a retrace interval was already in
  progress), the write would occasionally begin so close to the end of a
  horizontal retrace "window" that it would partially miss the retrace,
  resulting in scattered snow at the left edge of the display. Notice that
  the code also disables interrupts during accesses to the video buffer, so
  that service of a hardware interrupt won't disrupt the synchronization
  process.
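
  The logic of the two wait loops can be examined away from the hardware
  with a small C simulation. Everything here is an assumption made for
  illustration: the status port is replaced by a hypothetical fake_port
  structure that replays recorded samples, and wait_retrace_start returns
  the number of reads it performed. Real code would read port 3DAH and would
  also disable interrupts, as the assembly listing does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the status port: replays recorded samples. */
struct fake_port {
    const uint8_t *samples;
    size_t count;
    size_t pos;
};

static uint8_t fake_read(void *ctx)
{
    struct fake_port *p = (struct fake_port *)ctx;

    return p->samples[p->pos++ % p->count];
}

typedef uint8_t (*status_reader)(void *ctx);

/* Wait for the start of a new retrace interval (bit 0 of the status). */
unsigned wait_retrace_start(status_reader read_status, void *ctx)
{
    unsigned reads = 0;

    /* Loop 1: if a retrace is already in progress, wait for it to end. */
    do {
        reads++;
    } while ((read_status(ctx) & 1u) != 0);

    /* Loop 2: wait for the next retrace interval to begin. */
    do {
        reads++;
    } while ((read_status(ctx) & 1u) == 0);

    return reads;
}
```

  With the sample sequence 1, 1, 0, 0, 1 the function performs five reads:
  three to see the retrace in progress end and two more to see the next one
  begin, so the write that follows always starts at the beginning of a
  retrace window.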

  Because of the retrace-interval constraints just outlined, the rate at
  which you can update the CGA in text modes is severely limited when the
  updating is done one character at a time. You can obtain better results by
  calculating all the relevant addresses and setting up the appropriate
  registers, disabling the video controller by writing to register 3D8H,
  moving the entire string to the buffer with a REP MOVSW operation, and
  then reenabling the video controller. If the string is of reasonable
  length, the user won't even notice a flicker in the display. Of course,
  this procedure introduces additional hardware dependence into your code
  because it requires much greater knowledge of the 6845 controller.
  Luckily, snow is not a problem in CGA graphics modes.

Graphics Mode

  Graphics-mode memory-mapped programming for IBM PC─compatible adapters is
  considerably more complicated than text-mode programming. Each bit or
  group of bits in the regen buffer corresponds to an addressable point, or
  pixel, on the screen. The mapping of bits to pixels differs for each of
  the available graphics modes, with their differences in resolution and
  number of supported colors. The newer adapters (EGA, MCGA, and VGA) also
  use the concept of bit planes, where bits of a pixel are segregated into
  multiple banks of memory mapped at the same address; you must manipulate
  these bit planes by a combination of memory-mapped I/O and port
  addressing.

  IBM-video-systems graphics programming is a subject large enough for a
  book of its own, but we can use the 640-by-200, 2-color graphics display
  mode of the CGA (which is also supported by all subsequent IBM
  text/graphics adapters) to illustrate a few of the techniques involved.
  This mode is simple to deal with because each pixel is represented by a
  single bit. The pixels are assigned (x,y) coordinates in the range (0,0)
  through (639,199), where x is the horizontal displacement, y is the
  vertical displacement, and the home position (0,0) is the upper left
  corner of the display. (See Figure 6-7.)

    (0,0)┌─────────────────────────────────┐(639,0)
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
         │                                 │
  (0,199)└─────────────────────────────────┘(639,199)

  Figure 6-7.  Point addressing for 640-by-200, 2-color graphics modes on
  the CGA, EGA, MCGA, and VGA (IBM ROM BIOS mode 6).

  Each successive group of 80 bytes (640 bits) represents one horizontal
  scan line. Within each byte, the bits map one-for-one onto pixels, with
  the most significant bit corresponding to the leftmost displayed pixel of
  a set of eight pixels and the least significant bit corresponding to the
  rightmost displayed pixel of the set. The memory map is set up so that all
  the even y coordinates are scanned as a set and all the odd y coordinates
  are scanned as a set; this mapping is referred to as the memory interlace.

  To find the regen buffer offset for a particular (x,y) coordinate, you
  would use the following formula:

    offset = ((y AND 1) * 2000H) + (y/2 * 50H) + (x/8)

  The assembly-language implementation of this formula is as follows:

  ──────────────────────────────────────────────────────────────────────────
                              ; assume AX = Y, BX = X
          shr     bx,1        ; divide X by 8
          shr     bx,1
          shr     bx,1
          push    ax          ; save copy of Y
          shr     ax,1        ; find (Y/2) * 50h
          mov     cx,50h      ; with product in DX:AX
          mul     cx
          add     bx,ax       ; add product to X/8
          pop     ax          ; add (Y AND 1) * 2000h
          and     ax,1
          jz      label1
          add     bx,2000h
  label1:                     ; now BX = offset into
                              ; video buffer
  ──────────────────────────────────────────────────────────────────────────
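
  For readers working in C, the same calculation can be sketched as a small
  function. This is only an illustration: the name cga_mode6_offset is my
  own invention rather than part of any library, and the function assumes
  that the coordinates lie within the mode 6 range.

```c
#include <assert.h>

/* Byte offset into the mode 6 regen buffer for pixel (x, y).
   Mirrors: offset = ((y AND 1) * 2000H) + (y/2 * 50H) + (x/8) */
unsigned cga_mode6_offset(unsigned x, unsigned y)
{
    assert(x < 640 && y < 200);     /* valid mode 6 coordinates only */
    return ((y & 1u) * 0x2000u)     /* odd scan lines: second bank   */
         + ((y >> 1) * 0x50u)       /* 80 bytes per line in a bank   */
         + (x >> 3);                /* 8 pixels per byte             */
}
```

  For example, the pixel at (639,199) falls at offset 3F3FH, the last byte
  used in the odd-line bank.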

  After calculating the correct byte address, you can use the following
  formula to calculate the bit position for a given pixel coordinate:

    bit = 7 - (x MOD 8)

  where bit 7 is the most significant bit and bit 0 is the least significant
  bit. It is easiest to build an 8-byte table, or array of bit masks, and
  use the operation X AND 7 to extract the appropriate entry from the table:

  (X AND 7)          Bit mask          (X AND 7)          Bit mask
  ──────────────────────────────────────────────────────────────────────────
  0                  80H               4                  08H
  1                  40H               5                  04H
  2                  20H               6                  02H
  3                  10H               7                  01H
  ──────────────────────────────────────────────────────────────────────────

  The assembly-language implementation of this second calculation is as
  follows:

  ──────────────────────────────────────────────────────────────────────────
  table   db      80h         ; X AND 7 = offset 0
          db      40h         ; X AND 7 = offset 1
          db      20h         ; X AND 7 = offset 2
          db      10h         ; X AND 7 = offset 3
          db      08h         ; X AND 7 = offset 4
          db      04h         ; X AND 7 = offset 5
          db      02h         ; X AND 7 = offset 6
          db      01h         ; X AND 7 = offset 7
          .
          .
          .
                              ; assume BX = X coordinate
          and     bx,7        ; isolate 0─7 offset
          mov     al,[bx+table]
                              ; now AL = mask from table
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The program can then use the mask, together with the byte offset
  previously calculated, to set or clear the appropriate bit in the video
  controller's regen buffer.
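
  Putting the two calculations together, a C routine to set or clear one
  pixel might look like the following sketch. The names put_pixel, bit_mask,
  and REGEN_SIZE are hypothetical, and the routine operates on an ordinary
  16 KB array standing in for the regen buffer; a real program would instead
  pass a far pointer to the adapter's video memory.

```c
#include <assert.h>

#define REGEN_SIZE 0x4000u  /* assumed: 16 KB, two 8 KB interlace banks */

/* Bit masks indexed by (x AND 7); bit 7 is the leftmost pixel. */
static const unsigned char bit_mask[8] = {
    0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01
};

/* Set (on != 0) or clear (on == 0) pixel (x, y) in a mode 6 buffer. */
void put_pixel(unsigned char *regen, unsigned x, unsigned y, int on)
{
    unsigned offset;
    unsigned char mask;

    assert(regen != 0 && x < 640 && y < 200);

    /* byte address, using the interlace formula given earlier */
    offset = ((y & 1u) * 0x2000u) + ((y >> 1) * 0x50u) + (x >> 3);
    mask = bit_mask[x & 7u];        /* bit within that byte */

    if (on)
        regen[offset] |= mask;                  /* turn pixel on  */
    else
        regen[offset] &= (unsigned char)~mask;  /* turn pixel off */
}
```

  Because the routine only ORs or ANDs a single bit, the other seven pixels
  that share the byte are left undisturbed.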



────────────────────────────────────────────────────────────────────────────
Chapter 7  Printer and Serial Port

  MS-DOS supports printers, plotters, modems, and other hard-copy output or
  communication devices with device drivers for parallel ports and serial
  ports. Parallel ports are so named because they transfer a byte──8 bits──
  in parallel to the destination device over eight separate physical paths
  (plus additional status and handshaking signals). The serial port, on the
  other hand, communicates with the CPU with bytes but sends data to or
  receives data from its destination device serially──a bit at a time──over
  a single physical connection.

  Parallel ports are typically used for high-speed output devices, such as
  line printers, over relatively short distances (less than 50 feet). They
  are rarely used for devices that require two-way communication with the
  computer. Serial ports are used for lower-speed devices, such as modems
  and terminals, that require two-way communication (although some printers
  also have serial interfaces). A serial port can drive its device reliably
  over much greater distances (up to 1000 feet) over as few as three wires──
  transmit, receive, and ground.

  The most commonly used type of serial interface follows a standard called
  RS-232. This standard specifies a 25-wire interface with certain
  electrical characteristics, the use of various handshaking signals, and a
  standard DB-25 connector. Other serial-interface standards exist──for
  example, the RS-422, which is capable of considerably higher speeds than
  the RS-232──but these are rarely used in personal computers (except for
  the Apple Macintosh) at this time.

  MS-DOS has built-in device drivers for three parallel adapters, and for
  two serial adapters on the PC or PC/AT and three serial adapters on the
  PS/2. The logical names for these devices are LPT1, LPT2, LPT3, COM1,
  COM2, and COM3. The standard printer (PRN) and standard auxiliary (AUX)
  devices are normally aliased to LPT1 and COM1, but you can redirect PRN to
  one of the serial ports with the MS-DOS MODE command.

  As with keyboard and video display I/O, you can manage printer and
  serial-port I/O at several levels that offer different degrees of
  flexibility and hardware independence:

  ■  MS-DOS handle-oriented functions

  ■  MS-DOS traditional character functions

  ■  IBM ROM BIOS driver functions

  In the case of the serial port, direct control of the hardware by
  application programs is also common. I will discuss each of these I/O
  methods briefly, with examples, in the following pages.


Printer Output

  The preferred method of printer output is to use the handle write function
  (Int 21H Function 40H) with the predefined standard printer handle (4).
  For example, you could write the string hello to the printer as follows:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,4        ; BX = standard printer handle
          mov     cx,msg_len  ; CX = length of string
          mov     dx,seg msg  ; DS:DX = string address
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  If there is no error, the function returns the carry flag cleared and the
  number of characters actually transferred to the list device in register
  AX. Under normal circumstances, this number should always be the same as
  the length requested and the carry flag indicating an error should never
  be set. However, the output will terminate early if your data contains an
  end-of-file mark (Ctrl-Z).
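
  Because a Ctrl-Z embedded in the data cuts the transfer short, a program
  that sends arbitrary bytes to the printer may want to inspect its buffer
  before calling Function 40H. The following C helper is a minimal sketch of
  such a check; the name find_ctrl_z is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CTRL_Z 0x1A     /* ASCII end-of-file mark */

/* Return the index of the first Ctrl-Z in buf, or -1 if there is none. */
long find_ctrl_z(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == CTRL_Z)
            return (long)i;     /* output would stop around here */
    }
    return -1L;                 /* no end-of-file mark present */
}
```

  A result other than -1 tells the caller that the count returned in AX is
  likely to be smaller than the length requested.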

  You can write independently to several list devices (for example, LPT1,
  LPT2) by issuing a specific open request (Int 21H Function 3DH) for each
  device and using the handles returned to access the printers individually
  with Int 21H Function 40H. You have already seen this general approach in
  Chapters 5 and 6.

  An alternative method of printer output is to use the traditional Int 21H
  Function 05H, which transfers the character in the DL register to the
  printer. (This function is sensitive to Ctrl-C interrupts.) For example,
  the following assembly-language code sequence writes the string hello to
  the line printer:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = string length

  next:   mov     dl,[bx]     ; get next character
          mov     ah,5        ; function 05h = printer output
          int     21h         ; transfer to MS-DOS
          inc     bx          ; bump string pointer
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Programs that run on IBM PC─compatible machines can obtain improved
  printer throughput by bypassing MS-DOS and calling the ROM BIOS printer
  driver directly by means of Int 17H. Section III of this book, "IBM ROM
  BIOS and Mouse Functions Reference," documents the Int 17H functions in
  detail. Use of the ROM BIOS functions also allows your program to test
  whether the printer is off line or out of paper, a capability that MS-DOS
  does not offer.

  For example, the following sequence of instructions calls the ROM BIOS
  printer driver to send the string hello to the line printer:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for printer
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = string length
          mov     dx,0        ; DX = printer number

  next:   mov     al,[bx]     ; AL = character to print
          mov     ah,0        ; function 00h = printer output
          int     17h         ; transfer to ROM BIOS
          inc     bx          ; bump string pointer
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Note that the printer numbers used by the ROM BIOS are zero-based, whereas
  the printer numbers in MS-DOS logical-device names are one-based. For
  example, ROM BIOS printer 0 corresponds to LPT1.

  Finally, the most hardware-dependent technique of printer output is to
  access the printer controller directly. Considering the functionality
  already provided in MS-DOS and the IBM ROM BIOS, as well as the speeds of
  the devices involved, I cannot see any justification for using direct
  hardware control in this case. The disadvantage of introducing such
  extreme hardware dependence for such a low-speed device would far outweigh
  any small performance gains that might be obtained.


The Serial Port

  MS-DOS support for serial ports (often referred to as the auxiliary device
  in MS-DOS manuals) is weak compared with its keyboard, video-display, and
  printer support. This is one area where the application programmer is
  justified in making programs hardware dependent to extract adequate
  performance.

  Programs that restrict themselves to MS-DOS functions to ensure
  portability can use the handle read and write functions (Int 21H Functions
  3FH and 40H), with the predefined standard auxiliary handle (3) to
  access the serial port. For example, the following code writes the string
  hello to the serial port that is currently defined as the AUX device:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for serial port
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     ah,40h      ; function 40h = write file or device
          mov     bx,3        ; BX = standard aux handle
          mov     cx,msg_len  ; CX = string length
          mov     dx,seg msg  ; DS:DX = string address
          mov     ds,dx
          mov     dx,offset msg
          int     21h         ; transfer to MS-DOS
          jc      error       ; jump if error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  The standard auxiliary handle gives access to only the first serial port
  (COM1). If you want to read or write COM2 and COM3 using the handle calls,
  you must issue an open request (Int 21H Function 3DH) for the desired
  serial port and use the handle returned by that function with Int 21H
  Functions 3FH and 40H.

  Some versions of MS-DOS have a bug in character-device handling that
  manifests itself as follows: If you issue a read request with Int 21H
  Function 3FH for the exact number of characters that are waiting in the
  driver's buffer, the length returned in the AX register is the number of
  characters transferred minus one. You can circumvent this problem by
  always requesting more characters than you expect to receive or by placing
  the device handle into binary mode using Int 21H Function 44H.

  MS-DOS also supports two traditional functions for serial-port I/O. Int
  21H Function 03H inputs a character from COM1 and returns it in the AL
  register; Int 21H Function 04H transmits the character in the DL register
  to COM1. Like the other traditional calls, these two are direct
  descendants of the CP/M auxiliary-device functions.

  For example, the following code sends the string hello to COM1 using the
  traditional Int 21H Function 04H:

  ──────────────────────────────────────────────────────────────────────────
  msg     db      'hello'     ; message for serial port
  msg_len equ     $-msg       ; length of message
          .
          .
          .
          mov     bx,seg msg  ; DS:BX = string address
          mov     ds,bx
          mov     bx,offset msg
          mov     cx,msg_len  ; CX = length of string

  next:   mov     dl,[bx]     ; get next character
          mov     ah,4        ; function 04h = aux output
          int     21h         ; transfer to MS-DOS
          inc     bx          ; bump pointer to string
          loop    next        ; loop until string done
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  MS-DOS translates the traditional auxiliary-device functions into calls on
  the same device driver used by the handle calls. Therefore, it is
  generally preferable to use the handle functions in the first place,
  because they allow very long strings to be read or written in one
  operation, they give access to serial ports other than COM1, and they are
  symmetrical with the handle video-display, keyboard, printer, and file I/O
  methods described elsewhere in this book.

  Although the handle or traditional serial-port functions allow you to
  write programs that are portable to any machine running MS-DOS, they have
  a number of disadvantages:

  ■  The built-in MS-DOS serial-port driver is slow and is not interrupt
     driven.

  ■  MS-DOS serial-port I/O is not buffered.

  ■  Determining the status of the auxiliary device requires a separate call
     to the IOCTL function (Int 21H Function 44H)──if you request input and
     no characters are ready, your program will simply hang.

  ■  MS-DOS offers no standardized function to configure the serial port
     from within a program.

  For programs that are going to run on the IBM PC or compatibles, a more
  flexible technique for serial-port I/O is to call the IBM ROM BIOS
  serial-port driver by means of Int 14H. You can use this driver to
  initialize the serial port to a desired configuration and baud rate,
  examine the status of the controller, and read or write characters.
  Section III of this book, "IBM ROM BIOS and Mouse Functions Reference,"
  documents the functions available from the ROM BIOS serial-port driver.
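
  The initialization function of the Int 14H driver (Function 00H) takes its
  communications parameters packed into a single byte in AL. The following
  C sketch builds that byte under the bit layout as I understand it (bits
  7-5 baud rate, bits 4-3 parity, bit 2 stop bits, bits 1-0 word length).
  The function name int14_init_byte is hypothetical, and you should check
  the layout against the Section III reference before relying on it.

```c
#include <assert.h>

/* Build the AL parameter byte for Int 14H Function 00H.
   baud:      110, 150, 300, 600, 1200, 2400, 4800, or 9600
   parity:    'N', 'O', or 'E'
   data_bits: 7 or 8;  stop_bits: 1 or 2
   Returns -1 if any argument is not supported. */
int int14_init_byte(long baud, char parity, int data_bits, int stop_bits)
{
    static const long rates[8] =
        { 110, 150, 300, 600, 1200, 2400, 4800, 9600 };
    int code = -1, i, value;

    for (i = 0; i < 8; i++)             /* look up 3-bit baud code */
        if (rates[i] == baud)
            code = i;
    if (code < 0)
        return -1;
    value = code << 5;                  /* bits 7-5 = baud rate */

    switch (parity) {                   /* bits 4-3 = parity */
    case 'N':                 break;
    case 'O': value |= 0x08;  break;
    case 'E': value |= 0x18;  break;
    default:  return -1;
    }

    if (stop_bits == 2)                 /* bit 2 = stop bits */
        value |= 0x04;
    else if (stop_bits != 1)
        return -1;

    if (data_bits == 8)                 /* bits 1-0 = word length */
        value |= 0x03;
    else if (data_bits == 7)
        value |= 0x02;
    else
        return -1;

    assert(value >= 0 && value <= 0xFF);
    return value;
}
```

  Under that layout, 9600 baud with no parity, 8 data bits, and 1 stop bit
  packs to E3H.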

  For example, the following sequence sends the character X to the first
  serial port (COM1):

  ──────────────────────────────────────────────────────────────────────────
          .
          .
          .
          mov     ah,1        ; function 01h = send character
          mov     al,'X'      ; AL = character to transmit
          mov     dx,0        ; DX = serial-port number
          int     14h         ; transfer to ROM BIOS
          and     ah,80h      ; did transmit fail?
          jnz     error       ; jump if transmit error
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  As with the ROM BIOS printer driver, the serial-port numbers used by the
  ROM BIOS are zero-based, whereas the serial-port numbers in MS-DOS
  logical-device names are one-based. In this example, serial port 0
  corresponds to COM1.

  Unfortunately, like the MS-DOS auxiliary-device driver, the ROM BIOS
  serial-port driver is not interrupt driven. Although it will support
  higher transfer speeds than the MS-DOS functions, at rates greater than
  2400 baud it may still lose characters. Consequently, most programmers
  writing high-performance applications that use a serial port (such as
  telecommunications programs) take complete control of the serial-port
  controller and provide their own interrupt driver. The built-in functions
  provided by MS-DOS, and by the ROM BIOS in the case of the IBM PC, are
  simply not adequate.

  Writing such programs requires a good understanding of the hardware. In
  the case of the IBM PC, the chips to study are the INS8250 Asynchronous
  Communications Controller and the Intel 8259A Programmable Interrupt
  Controller. The IBM technical reference documentation for these chips is a
  bit disorganized, but most of the necessary information is there if you
  look for it.
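
  As a small illustration of how the two chips relate, the following C
  sketch derives the interrupt number and the 8259A mask bit for a serial
  port's IRQ line. It assumes the standard PC arrangement in which IRQ0
  through IRQ7 arrive on interrupts 08H through 0FH and a 0 bit in the mask
  register at port 21H enables the corresponding line. The function names
  are hypothetical.

```c
#include <assert.h>

/* Interrupt number on which a hardware IRQ line (0-7) is delivered. */
unsigned irq_vector(unsigned irq)
{
    assert(irq < 8);
    return 0x08u + irq;
}

/* Bit in the 8259A interrupt mask register that controls the IRQ. */
unsigned irq_mask_bit(unsigned irq)
{
    assert(irq < 8);
    return 1u << irq;
}

/* New mask-register value with the IRQ enabled (its bit cleared). */
unsigned char enable_irq(unsigned char current_mask, unsigned irq)
{
    return (unsigned char)(current_mask & ~irq_mask_bit(irq));
}
```

  For IRQ4 (COM1) these functions yield interrupt 0CH and mask bit 10H; for
  IRQ3 (COM2) they yield 0BH and 08H.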


The TALK Program

  The simple terminal-emulator program TALK.ASM (Figure 7-1) is an example
  of a useful program that performs screen, keyboard, and serial-port I/O.
  This program recapitulates all of the topics discussed in Chapters 5
  through 7. TALK uses the IBM PC's ROM BIOS video driver to put characters
  on the screen, to clear the display, and to position the cursor; it uses
  the MS-DOS character-input calls to read the keyboard; and it contains its
  own interrupt driver for the serial-port controller.
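
  The heart of TALK's serial-port handling is a ring buffer: the listing's
  com_stat and com_in routines compare and advance the asc_in and asc_out
  indexes into asc_buf. The following C sketch shows the same idea in
  isolation. The structure and function names are hypothetical, and
  ring_put is only my approximation of what the producer side must do; in
  TALK that job belongs to the interrupt handler.

```c
#include <assert.h>

#define RING_SIZE 4096u     /* same size as bufsiz in the listing */

struct ring {
    unsigned char data[RING_SIZE];
    unsigned in;            /* next slot the producer fills  */
    unsigned out;           /* next slot the consumer reads  */
};

/* Nonzero if at least one character is waiting (in != out). */
int ring_ready(const struct ring *r)
{
    return r->in != r->out;
}

/* Store a received character, wrapping the index at the end. */
void ring_put(struct ring *r, unsigned char c)
{
    r->data[r->in] = c;
    if (++r->in == RING_SIZE)
        r->in = 0;
}

/* Fetch the oldest character; the caller must check ring_ready first. */
unsigned char ring_get(struct ring *r)
{
    unsigned char c;

    assert(ring_ready(r));
    c = r->data[r->out];
    if (++r->out == RING_SIZE)
        r->out = 0;         /* reset index if wrapped */
    return c;
}
```

  Unlike com_in, which loops until a character arrives, ring_get expects the
  caller to test ring_ready first. The sketch makes no attempt to detect
  overflow when the producer laps the consumer.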

  ──────────────────────────────────────────────────────────────────────────
          name      talk
          page      55,132
          .lfcond             ; List false conditionals too
          title     TALK--Simple terminal emulator

  ;
  ; TALK.ASM--Simple IBM PC terminal emulator
  ;
  ; Copyright (c) 1988 Ray Duncan
  ;
  ; To assemble and link this program into TALK.EXE:
  ;
  ;       C>MASM TALK;
  ;       C>LINK TALK;
  ;

  stdin   equ     0               ; standard input handle
  stdout  equ     1               ; standard output handle
  stderr  equ     2               ; standard error handle

  cr      equ     0dh             ; ASCII carriage return
  lf      equ     0ah             ; ASCII linefeed
  bsp     equ     08h             ; ASCII backspace
  escape  equ     1bh             ; ASCII escape code

  dattr   equ     07h             ; display attribute to use
                                  ; while in emulation mode

  bufsiz  equ     4096            ; size of serial-port buffer

  echo    equ     0               ; 0 = full-duplex, -1 = half-duplex
  true    equ     -1
  false   equ     0

  com1    equ     true            ; use COM1 if nonzero
  com2    equ     not com1        ; use COM2 if nonzero

  pic_mask  equ   21h             ; 8259 interrupt mask port
  pic_eoi   equ   20h             ; 8259 EOI port

          if      com1
  com_data equ    03f8h           ; port assignments for COM1
  com_ier  equ    03f9h
  com_mcr  equ    03fch
  com_sts  equ    03fdh
  com_int  equ    0ch             ; COM1 interrupt number
  int_mask equ    10h             ; IRQ4 mask for 8259
          endif

          if      com2
  com_data equ    02f8h           ; port assignments for COM2
  com_ier  equ    02f9h
  com_mcr  equ    02fch
  com_sts  equ    02fdh
  com_int  equ    0bh             ; COM2 interrupt number
  int_mask equ    08h             ; IRQ3 mask for 8259
          endif

  _TEXT   segment word public 'CODE'

          assume  cs:_TEXT,ds:_DATA,es:_DATA,ss:STACK

  talk    proc    far             ; entry point from MS-DOS

          mov     ax,_DATA        ; make data segment addressable
          mov     ds,ax
          mov     es,ax
                                  ; initialize display for
                                  ; terminal emulator mode...

          mov     ah,15           ; get display width and
          int     10h             ; current display mode
          dec     ah              ; save display width for use
          mov     columns,ah      ; by the screen-clear routine

          cmp     al,7            ; enforce text display mode
          je      talk2           ; mode 7 ok, proceed
          cmp     al,3
          jbe     talk2           ; modes 0-3 ok, proceed

          mov     dx,offset msg1
          mov     cx,msg1_len
          jmp     talk6           ; print error message and exit

  talk2:  mov     bh,dattr        ; clear screen and home cursor
          call    cls

          call    asc_enb         ; capture serial-port interrupt
                                  ; vector and enable interrupts

          mov     dx,offset msg2  ; display message
          mov     cx,msg2_len     ; 'terminal emulator running'
          mov     bx,stdout       ; BX = standard output handle
          mov     ah,40h          ; function 40h = write file or device
          int     21h             ; transfer to MS-DOS

  talk3:  call    pc_stat         ; keyboard character waiting?
          jz      talk4           ; nothing waiting, jump

          call    pc_in           ; read keyboard character

          cmp     al,0            ; is it a function key?
          jne     talk32          ; not function key, jump

          call    pc_in           ; function key, discard 2nd
                                  ; character of sequence
          jmp     talk5           ; then terminate program

  talk32:                         ; keyboard character received
          if      echo
          push    ax              ; if half-duplex, echo
          call    pc_out          ; character to PC display
          pop     ax
          endif

          call    com_out         ; write char to serial port

  talk4:  call    com_stat        ; serial-port character waiting?
          jz      talk3           ; nothing waiting, jump

          call    com_in          ; read serial-port character

          cmp     al,20h          ; is it control code?
          jae     talk45          ; jump if not
          call    ctrl_code       ; control code, process it

          jmp     talk3           ; check keyboard again

  talk45:                         ; noncontrol char received,
          call    pc_out          ; write it to PC display

          jmp     talk4           ; see if any more waiting

  talk5:                          ; function key detected,
                                  ; prepare to terminate...

          mov     bh,07h          ; clear screen and home cursor
          call    cls

          mov     dx,offset msg3  ; display farewell message
          mov     cx,msg3_len

  talk6:  push    dx              ; save message address
          push    cx              ; and message length

          call    asc_dsb         ; disable serial-port interrupts
                                  ; and release interrupt vector

          pop     cx              ; restore message length
          pop     dx              ; and address

          mov     bx,stdout       ; handle for standard output
          mov     ah,40h          ; function 40h = write device
          int     21h             ; transfer to MS-DOS

          mov     ax,4c00h        ; terminate program with
          int     21h             ; return code = 0

  talk    endp

  com_stat proc   near            ; check asynch status; returns
                                  ; Z = false if character ready
                                  ; Z = true if nothing waiting
          push    ax
          mov     ax,asc_in       ; compare ring buffer pointers
          cmp     ax,asc_out
          pop     ax
          ret                     ; return to caller
  com_stat endp

  com_in  proc    near            ; get character from serial-
                                  ; port buffer; returns
                                  ; new character in AL

          push    bx              ; save register BX

  com_in1:                        ; if no char waiting, wait
          mov     bx,asc_out      ; until one is received
          cmp     bx,asc_in
          je      com_in1         ; jump, nothing waiting

          mov     al,[bx+asc_buf] ; character is ready,
                                  ; extract it from buffer

          inc     bx              ; update buffer pointer
          cmp     bx,bufsiz
          jne     com_in2
          xor     bx,bx           ; reset pointer if wrapped
  com_in2:
          mov     asc_out,bx      ; store updated pointer
          pop     bx              ; restore register BX
          ret                     ; and return to caller

  com_in  endp

  com_out proc    near            ; write character in AL
                                  ; to serial port

          push    dx              ; save register DX
          push    ax              ; save character to send
          mov     dx,com_sts      ; DX = status port address

  com_out1:                       ; check if transmit buffer
          in      al,dx           ; is empty (TBE bit = set)
          and     al,20h
          jz      com_out1        ; no, must wait

          pop     ax              ; get character to send
          mov     dx,com_data     ; DX = data port address
          out     dx,al           ; transmit the character
          pop     dx              ; restore register DX
          ret                     ; and return to caller

  com_out endp

  pc_stat proc    near            ; read keyboard status; returns
                                  ; Z = false if character ready
                                  ; Z = true if nothing waiting
                                  ; register DX destroyed

          mov     al,in_flag      ; if character already
          or      al,al           ; waiting, return status
          jnz     pc_stat1

          mov     ah,6            ; otherwise call MS-DOS to
          mov     dl,0ffh         ; determine keyboard status
          int     21h

          jz      pc_stat1        ; jump if no key ready

          mov     in_char,al      ; got key, save it for
          mov     in_flag,0ffh    ; "pc_in" routine

  pc_stat1:                       ; return to caller with
          ret                     ; Z flag set appropriately

  pc_stat endp

  pc_in   proc    near            ; read keyboard character,
                                  ; return it in AL
                                  ; DX may be destroyed

          mov     al,in_flag      ; key already waiting?
          or      al,al
          jnz     pc_in1          ; yes, return it to caller

          call    pc_stat         ; try to read a character
          jmp     pc_in

  pc_in1: mov     in_flag,0       ; clear char-waiting flag
          mov     al,in_char      ; and return AL = character
          ret

  pc_in   endp

  pc_out  proc    near            ; write character in AL
                                  ; to the PC's display

          mov     ah,0eh          ; ROM BIOS function 0eh =
                                  ; "teletype output"
          push    bx              ; save register BX
          xor     bx,bx           ; assume page 0
          int     10h             ; transfer to ROM BIOS
          pop     bx              ; restore register BX
          ret                     ; and return to caller

  pc_out  endp


  cls     proc    near            ; clear display using
                                  ; char attribute in BH
                                  ; registers AX, CX,
                                  ; and DX destroyed

          mov     dl,columns      ; set DL,DH = X,Y of
          mov     dh,24           ; lower right corner
          mov     cx,0            ; set CL,CH = X,Y of
                                  ; upper left corner
          mov     ax,600h         ; ROM BIOS function 06h =
                                  ; "scroll or initialize
                                  ; window"
          int     10h             ; transfer to ROM BIOS
          call    home            ; set cursor at (0,0)
          ret                     ; and return to caller

  cls     endp

  clreol  proc    near            ; clear from cursor to end
                                  ; of line using attribute
                                  ; in BH, registers AX, CX,
                                  ; and DX destroyed

          call    getxy           ; get current cursor position
          mov     cx,dx           ; current position = "upper
                                  ; left corner" of window;
          mov     dl,columns      ; "lower right corner" X is
                                  ; max columns, Y is same
                                  ; as upper left corner
          mov     ax,600h         ; ROM BIOS function 06h =
                                  ; "scroll or initialize
                                  ; window"
          int     10h             ; transfer to ROM BIOS
          ret                     ; return to caller

  clreol  endp

  home    proc    near            ; put cursor at home position

          mov     dx,0            ; set (X,Y) = (0,0)
          call    gotoxy          ; position the cursor
          ret                     ; return to caller

  home    endp

  gotoxy  proc    near            ; position the cursor
                                  ; call with DL = X, DH = Y

          push    bx              ; save registers
          push    ax

          mov     bh,0            ; assume page 0
          mov     ah,2            ; ROM BIOS function 02h =
                                  ; set cursor position
          int     10h             ; transfer to ROM BIOS

          pop     ax              ; restore registers
          pop     bx
          ret                     ; and return to caller

  gotoxy  endp


  getxy   proc    near            ; get cursor position,
                                  ; returns DL = X, DH = Y

          push    ax              ; save registers
          push    bx
          push    cx

          mov     ah,3            ; ROM BIOS function 03h =
                                  ; get cursor position
          mov     bh,0            ; assume page 0
          int     10h             ; transfer to ROM BIOS

          pop     cx              ; restore registers
          pop     bx
          pop     ax
          ret                     ; and return to caller

  getxy   endp

  ctrl_code proc  near            ; process control code
                                  ; call with AL = char

          cmp     al,cr           ; if carriage return
          je      ctrl8           ; just send it

          cmp     al,lf           ; if linefeed
          je      ctrl8           ; just send it

          cmp     al,bsp          ; if backspace
          je      ctrl8           ; just send it

          cmp     al,26           ; is it cls control code?
          jne     ctrl7           ; no, jump

          mov     bh,dattr        ; cls control code, clear
          call    cls             ; screen and home cursor

          jmp     ctrl9

  ctrl7:
          cmp     al,escape       ; is it Escape character?
          jne     ctrl9           ; no, throw it away

          call    esc_seq         ; yes, emulate CRT terminal
          jmp     ctrl9

  ctrl8:  call    pc_out          ; send CR, LF, or backspace
                                  ; to the display

  ctrl9:  ret                     ; return to caller

  ctrl_code endp


  esc_seq proc    near            ; decode Televideo 950 escape
                                  ; sequence for screen control

          call    com_in          ; get next character
          cmp     al,84           ; is it clear to end of line?
          jne     esc_seq1        ; no, jump

          mov     bh,dattr        ; yes, clear to end of line
          call    clreol
          jmp     esc_seq2        ; then exit

  esc_seq1:
          cmp     al,61           ; is it cursor positioning?
          jne     esc_seq2        ; no, jump

          call    com_in          ; yes, get Y parameter
          sub     al,33           ; and remove offset
          mov     dh,al

          call    com_in          ; get X parameter
          sub     al,33           ; and remove offset
          mov     dl,al
          call    gotoxy          ; position the cursor

  esc_seq2:                       ; return to caller
          ret

  esc_seq endp


  asc_enb proc    near            ; capture serial-port interrupt
                                  ; vector and enable interrupt

                                  ; save address of previous
                                  ; interrupt handler...
          mov     ax,3500h+com_int ; function 35h = get vector
          int     21h             ; transfer to MS-DOS
          mov     word ptr oldvec+2,es
          mov     word ptr oldvec,bx

                                  ; now install our handler...
          push    ds              ; save our data segment
          mov     ax,cs           ; set DS:DX = address
          mov     ds,ax           ; of our interrupt handler
          mov     dx,offset asc_int
          mov     ax,2500h+com_int ; function 25h = set vector
          int     21h             ; transfer to MS-DOS
          pop     ds              ; restore data segment

          mov     dx,com_mcr      ; set modem-control register
          mov     al,0bh          ; DTR, RTS, and OUT2 bits
          out     dx,al

          mov     dx,com_ier      ; set interrupt-enable
          mov     al,1            ; register on serial-
          out     dx,al           ; port controller

          in      al,pic_mask     ; read current 8259 mask
          and     al,not int_mask ; set mask for COM port
          out     pic_mask,al     ; write new 8259 mask

          ret                     ; back to caller

  asc_enb endp


  asc_dsb proc    near            ; disable interrupt and
                                  ; release interrupt vector

          in      al,pic_mask     ; read current 8259 mask
          or      al,int_mask     ; reset mask for COM port
          out     pic_mask,al     ; write new 8259 mask

          push    ds              ; save our data segment
          lds     dx,oldvec       ; load address of
                                  ; previous interrupt handler
          mov     ax,2500h+com_int ; function 25h = set vector
          int     21h             ; transfer to MS-DOS
          pop     ds              ; restore data segment

          ret                     ; back to caller

  asc_dsb endp


  asc_int proc    far             ; interrupt service routine
                                  ; for serial port

          sti                     ; turn interrupts back on

          push    ax              ; save registers
          push    bx
          push    dx
          push    ds

          mov     ax,_DATA        ; make our data segment
          mov     ds,ax           ; addressable

          cli                     ; clear interrupts for
                                  ; pointer manipulation

          mov     dx,com_data     ; DX = data port address
          in      al,dx           ; read this character
          mov     bx,asc_in       ; get buffer pointer
          mov     [asc_buf+bx],al ; store this character
          inc     bx              ; bump pointer
          cmp     bx,bufsiz       ; time for wrap?
          jne     asc_int1        ; no, jump
          xor     bx,bx           ; yes, reset pointer

  asc_int1:                       ; store updated pointer
          mov     asc_in,bx

          sti                     ; turn interrupts back on

          mov     al,20h          ; send EOI to 8259
          out     pic_eoi,al

          pop     ds              ; restore all registers
          pop     dx
          pop     bx
          pop     ax

          iret                    ; return from interrupt

  asc_int endp

  _TEXT   ends


  _DATA   segment word public 'DATA'

  in_char db      0               ; PC keyboard input char
  in_flag db      0               ; <>0 if char waiting

  columns db      0               ; highest numbered column in
                                  ; current display mode (39 or 79)

  msg1    db      cr,lf
          db      'Display must be text mode.'
          db      cr,lf
  msg1_len equ $-msg1

  msg2    db      'Terminal emulator running...'
          db      cr,lf
  msg2_len equ $-msg2

  msg3    db      'Exit from terminal emulator.'
          db      cr,lf
  msg3_len equ $-msg3

  oldvec  dd      0               ; original contents of serial-
                                  ; port interrupt vector

  asc_in  dw      0               ; input pointer to ring buffer
  asc_out dw      0               ; output pointer to ring buffer

  asc_buf db      bufsiz dup (?)  ; communications buffer

  _DATA   ends


  STACK   segment para stack 'STACK'

          db      128 dup (?)

  STACK   ends

          end     talk            ;  defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 7-1.  TALK.ASM: A simple terminal-emulator program for IBM
  PC─compatible computers. This program demonstrates use of the MS-DOS and
  ROM BIOS video and keyboard functions and direct control of the
  serial-communications adapter.

  The TALK program illustrates the methods that an application should use to
  take over and service interrupts from the serial port without running
  afoul of MS-DOS conventions.

  The program begins with some equates and conditional assembly statements
  that configure the program for half- or full-duplex and for the desired
  serial port (COM1 or COM2). At entry from MS-DOS, the main routine of the
  program──the procedure named talk──checks the status of the serial port,
  initializes the display, and calls the asc_enb routine to take over the
  serial-port interrupt vector and enable interrupts. The talk procedure
  then enters a loop that reads the keyboard and sends the characters out
  the serial port and then reads the serial port and puts the characters on
  the display──in other words, it causes the PC to emulate a simple CRT
  terminal.

  The TALK program intercepts and handles control codes (carriage return,
  linefeed, and so forth) appropriately. It detects escape sequences and
  handles them as a subset of the Televideo 950 terminal capabilities. (You
  can easily modify the program to emulate any other cursor-addressable
  terminal.) When one of the PC's special function keys is pressed, the
  program disables serial-port interrupts, releases the serial-port
  interrupt vector, and exits back to MS-DOS.

  Several TALK program procedures are worth your attention because they can
  easily be incorporated into other programs. They are listed in the
  following table.

  Procedure          Action
  ──────────────────────────────────────────────────────────────────────────
  asc_enb            Takes over the serial-port interrupt vector and enables
                     interrupts by writing to the modem-control register of
                     the INS8250 and the interrupt-mask register of the
                     8259A.

  asc_dsb            Restores the original state of the serial-port
                     interrupt vector and disables interrupts by writing to
                     the interrupt-mask register of the 8259A.

  asc_int            Services serial-port interrupts, placing received
                     characters into a ring buffer.

  com_stat           Tests whether characters from the serial port are
                     waiting in the ring buffer.

  com_in             Removes characters from the interrupt handler's ring
                     buffer and increments the buffer pointers
                     appropriately.

  com_out            Sends one character to the serial port.

  cls                Calls the ROM BIOS video driver to clear the screen.

  clreol             Calls the ROM BIOS video driver to clear from the
                     current cursor position to the end of the line.

  home               Places the cursor in the upper left corner of the
                     screen.

  gotoxy             Positions the cursor at the desired position on the
                     display.

  getxy              Obtains the current cursor position.

  pc_out             Sends one character to the PC's display.

  pc_stat            Gets status for the PC's keyboard.

  pc_in              Returns a character from the PC's keyboard.
  ──────────────────────────────────────────────────────────────────────────
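
  The ring-buffer bookkeeping shared by asc_int, com_stat, and com_in can be
  modeled in a few lines of C. The following is only a sketch of the idea,
  not a translation of TALK.ASM: the names ring_put, ring_avail, and
  ring_get and the buffer size RING_SIZE are assumptions made for
  illustration, and the sketch ignores the interrupt masking that the real
  handler needs.

```c
#include <stddef.h>

#define RING_SIZE 16            /* assumed size; TALK.ASM uses its own bufsiz */

static unsigned char ring_buf[RING_SIZE];
static size_t ring_in  = 0;     /* next slot the "interrupt handler" fills */
static size_t ring_out = 0;     /* next slot the main loop drains          */

/* Model of asc_int: store one received byte and advance the input index,
   wrapping to zero at the end of the buffer. Like the listing, this does
   not detect an overrun when the producer laps the consumer. */
void ring_put(unsigned char ch)
{
    ring_buf[ring_in] = ch;
    ring_in++;
    if (ring_in == RING_SIZE)
        ring_in = 0;
}

/* Model of com_stat: nonzero if at least one byte is waiting. */
int ring_avail(void)
{
    return ring_in != ring_out;
}

/* Model of com_in: remove one byte and advance the output index.
   The caller is expected to check ring_avail() first. */
unsigned char ring_get(void)
{
    unsigned char ch = ring_buf[ring_out];
    ring_out++;
    if (ring_out == RING_SIZE)
        ring_out = 0;
    return ch;
}
```

  Because the producer changes only the input index and the consumer
  changes only the output index, the two sides never write the same
  variable; the buffer is empty exactly when the two indexes are equal.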





────────────────────────────────────────────────────────────────────────────
Chapter 8  File Management

  The dual heritage of MS-DOS──CP/M and UNIX/XENIX──is perhaps most clearly
  demonstrated in its file-management services. In general, MS-DOS provides
  at least two distinct operating-system calls for each major file or record
  operation. This chapter breaks this overlapping battery of functions into
  two groups and explains the usage, advantages, and disadvantages of each.

  I will refer to the set of file and record functions that are compatible
  with CP/M as FCB functions. These functions rely on a data structure
  called a file control block (hence, FCB) to maintain certain bookkeeping
  information about open files. This structure resides in the application
  program's memory space. The FCB functions allow the programmer to create,
  open, close, and delete files and to read or write records of any size at
  any record position within such files. These functions do not support the
  hierarchical (treelike) file structure that was first introduced in MS-DOS
  version 2.0, so they can be used only to access files in the current
  subdirectory for a given disk drive.

  I will refer to the set of file and record functions that provide
  compatibility with UNIX/XENIX as the handle functions. These functions
  allow the programmer to open or create files by passing MS-DOS a
  null-terminated string that describes the file's location in the
  hierarchical file structure (the drive and path), the file's name, and its
  extension. If the open or create operation is successful, MS-DOS returns a
  16-bit token, or handle, that is saved by the application program and used
  to specify the file in subsequent operations.

  When you use the handle functions, the operating system maintains the data
  structures that contain bookkeeping information about the file inside its
  own memory space, and these structures are not accessible to the
  application program. The handle functions fully support the hierarchical
  file structure, allowing the programmer to create, open, close, and delete
  files in any subdirectory on any disk drive and to read or write records
  of any size at any byte offset within such files.

  Although we are discussing the FCB functions first in this chapter for
  historical reasons, new MS-DOS applications should always be written using
  the more powerful handle functions. Use of the FCB functions in new
  programs should be avoided, unless compatibility with MS-DOS version 1.0
  is needed.


Using the FCB Functions

  Understanding the structure of the file control block is the key to
  success with the FCB family of file and record functions. An FCB is a
  37-byte data structure allocated within the application program's memory
  space; it is divided into many fields (Figure 8-1). Typically, the
  program initializes an FCB with a drive code, a filename, and an extension
  (conveniently accomplished with the parse-filename service, Int 21H
  Function 29H) and then passes the address of the FCB to MS-DOS to open or
  create the file. If the file is successfully opened or created, MS-DOS
  fills in certain fields of the FCB with information from the file's entry
  in the disk directory. This information includes the file's exact size in
  bytes and the date and time the file was created or last updated. MS-DOS
  also places certain other information within a reserved area of the FCB;
  however, this area is used by the operating system for its own purposes
  and varies among different versions of MS-DOS. Application programs should
  never modify the reserved area.
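
  To make the layout of the name fields concrete, the following C sketch
  fills the 11-byte filename and extension area of an FCB from a simple
  "NAME.EXT" string. The helper fcb_set_name is hypothetical and is not a
  substitute for Int 21H Function 29H: it handles no drive prefix, path
  separators, or wildcards, and folding the name to uppercase is this
  sketch's own choice.

```c
#include <ctype.h>
#include <string.h>

/* Fill the 11-byte name area of an FCB (offsets 01H-0BH): 8 characters of
   filename followed by 3 of extension, each left justified and padded
   with blanks. Hypothetical helper; no drive, path, or wildcard handling. */
void fcb_set_name(unsigned char name11[11], const char *filespec)
{
    const char *dot = strchr(filespec, '.');
    size_t base_len = dot ? (size_t)(dot - filespec) : strlen(filespec);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;
    size_t i;

    memset(name11, ' ', 11);                    /* blank padding */
    for (i = 0; i < base_len && i < 8; i++)     /* filename, truncated to 8 */
        name11[i] = (unsigned char)toupper((unsigned char)filespec[i]);
    for (i = 0; i < ext_len && i < 3; i++)      /* extension, truncated to 3 */
        name11[8 + i] = (unsigned char)toupper((unsigned char)dot[1 + i]);
}
```

  Note that the period between name and extension is not stored; the
  extension always begins at the ninth byte of the area.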

  For compatibility with CP/M, MS-DOS automatically sets the record-size
  field of the FCB to 128 bytes. If the program does not want to use this
  default record size, it must place the desired size (in bytes) into the
  record-size field after the open or create operation. Subsequently, when
  the program needs to read or write records from the file, it must pass the
  address of the FCB to MS-DOS; MS-DOS, in turn, keeps the FCB updated with
  information about the current position of the file pointer and the size of
  the file. Data is always read to or written from the current disk transfer
  area (DTA), whose address is set with Int 21H Function 1AH. If the
  application program wants to perform random record access, it must set the
  record number into the FCB before issuing each function call; when
  sequential record access is being used, MS-DOS maintains the FCB and no
  special intervention is needed from the application.
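
  The arithmetic behind these fields is simple and can be sketched in C. A
  block is a group of 128 records, so a current-block/current-record pair
  corresponds to one relative-record number, and a relative-record number
  multiplied by the record size gives a byte offset in the file. The
  function names below are assumptions made for illustration; they are not
  MS-DOS services.

```c
#include <stdint.h>

#define RECORDS_PER_BLOCK 128u   /* one FCB "block" is 128 records */

/* Relative-record number equivalent to a current-block/current-record pair. */
uint32_t fcb_relative_record(uint16_t cur_block, uint8_t cur_record)
{
    return (uint32_t)cur_block * RECORDS_PER_BLOCK + cur_record;
}

/* Byte offset within the file of a given relative record.
   The result wraps at 32 bits, matching the 4-byte file-size field. */
uint32_t fcb_byte_offset(uint32_t rel_record, uint16_t record_size)
{
    return rel_record * record_size;
}
```

  For example, record 5 of block 2 is relative record 261; with a record
  size of 1024 bytes it begins at byte offset 267,264.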

  Byte offset
  00H ┌───────────────────────────────────────────────────────┐
      │                 Drive identification                  │ Note 1
  01H ├───────────────────────────────────────────────────────┤
      │                Filename (8 characters)                │ Note 2
  09H ├───────────────────────────────────────────────────────┤
      │               Extension (3 characters)                │ Note 2
  0CH ├───────────────────────────────────────────────────────┤
      │                 Current block number                  │ Note 9
  0EH ├───────────────────────────────────────────────────────┤
      │                      Record size                      │ Note 10
  10H ├───────────────────────────────────────────────────────┤
      │                  File size (4 bytes)                  │ Notes 3, 6
  14H ├───────────────────────────────────────────────────────┤
      │                 Date created/updated                  │ Note 7
  16H ├───────────────────────────────────────────────────────┤
      │                 Time created/updated                  │ Note 8
  18H ├───────────────────────────────────────────────────────┤
      │                       Reserved                        │
  20H ├───────────────────────────────────────────────────────┤
      │                 Current-record number                 │ Note 9
  21H ├───────────────────────────────────────────────────────┤
      │           Relative-record number (4 bytes)            │ Note 5
      └───────────────────────────────────────────────────────┘

  Figure 8-1.  Normal file control block. Total length is 37 bytes (25H
  bytes). See the notes for Figures 8-1 and 8-3 later in this section.
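
  For C programmers, the layout in Figure 8-1 can be written as a packed
  structure. This is a sketch under stated assumptions: the structure and
  member names are invented here, and #pragma pack is a compiler extension
  (accepted by most compilers, but not part of standard C). The static
  assertions check the offsets against the figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)            /* no padding: layout must match Figure 8-1 */
struct fcb {
    uint8_t  drive;              /* 00H: 0 = default, 1 = A:, 2 = B:, ...   */
    uint8_t  name[8];            /* 01H: filename, blank padded             */
    uint8_t  ext[3];             /* 09H: extension, blank padded            */
    uint16_t cur_block;          /* 0CH: current block number               */
    uint16_t rec_size;           /* 0EH: record size in bytes               */
    uint32_t file_size;          /* 10H: file size in bytes                 */
    uint16_t date;               /* 14H: date created/updated               */
    uint16_t time;               /* 16H: time created/updated               */
    uint8_t  reserved[8];        /* 18H: reserved for MS-DOS                */
    uint8_t  cur_record;         /* 20H: current-record number              */
    uint32_t rel_record;         /* 21H: relative-record number             */
};
#pragma pack(pop)

static_assert(offsetof(struct fcb, cur_block)  == 0x0C, "block at 0CH");
static_assert(offsetof(struct fcb, rec_size)   == 0x0E, "record size at 0EH");
static_assert(offsetof(struct fcb, file_size)  == 0x10, "file size at 10H");
static_assert(offsetof(struct fcb, reserved)   == 0x18, "reserved at 18H");
static_assert(offsetof(struct fcb, cur_record) == 0x20, "record at 20H");
static_assert(offsetof(struct fcb, rel_record) == 0x21, "relative at 21H");
static_assert(sizeof(struct fcb) == 37, "FCB is 37 bytes");
```

  The multibyte members rely on the host storing the least significant
  byte first, as the 8086 family does.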

  In general, MS-DOS functions that use FCBs accept the full address of the
  FCB in registers DS:DX and pass back a return code in the AL register
  (Figure 8-2). For file-management calls (open, close, create, and
  delete), this return code is zero if the function was successful and 0FFH
  (255) if the function failed. For the FCB-type record read and write
  functions, the success code returned in the AL register is again zero, but
  there are several failure codes. Under MS-DOS version 3.0 or later, more
  detailed error reporting can be obtained by calling Int 21H Function 59H
  (Get Extended Error Information) after a failed FCB function call.

  When a program is loaded under MS-DOS, the operating system sets up two
  FCBs in the program segment prefix, at offsets 005CH and 006CH. These are
  often referred to as the default FCBs, and they are included to provide
  upward compatibility from CP/M. MS-DOS parses the first two parameters in
  the command line that invokes the program (excluding any redirection
  directives) into the default FCBs, under the assumption that they may be
  file specifications. The application must determine whether they really
  are filenames or not. In addition, because the default FCBs overlap and
  are not in a particularly convenient location (especially for .EXE
  programs), they usually must be copied elsewhere in order to be used
  safely. (See Chapter 3.)
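
  The overlap is easy to verify with a little arithmetic, sketched here in
  C. The offsets come from the text (005CH, 006CH, and the default DTA at
  0080H); the macro and function names are assumptions made for
  illustration.

```c
#define PSP_FCB1   0x5Cu   /* first default FCB               */
#define PSP_FCB2   0x6Cu   /* second default FCB              */
#define PSP_DTA    0x80u   /* default DTA and command tail    */
#define FCB_SIZE   37u     /* length of a normal FCB          */

/* Number of bytes by which a full 37-byte FCB starting at 'start'
   runs past the offset 'limit' (0 if it fits entirely below it). */
unsigned fcb_overrun(unsigned start, unsigned limit)
{
    unsigned end = start + FCB_SIZE;      /* first offset past the FCB */
    return end > limit ? end - limit : 0;
}
```

  A full FCB at 005CH extends 21 bytes into the second default FCB, and a
  full FCB at 006CH extends 17 bytes into the default DTA, which is why
  opening a file through either default FCB in place can destroy other
  data in the PSP.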

  ──────────────────────────────────────────────────────────────────────────
                                               ; filename was previously
                                               ; parsed into "my_fcb"
                  mov   dx,seg my_fcb          ; DS:DX = address of
                  mov   ds,dx                  ; file control block
                  mov   dx,offset my_fcb
                  mov   ah,0fh                 ; function 0fh = open
                  int   21h
                  or    al,al                  ; was open successful?
                  jnz   error                  ; no, jump to error routine
                  .
                  .
                  .
  my_fcb          db    37 dup (0)             ; file control block
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-2.  A typical FCB file operation. This sequence of code attempts
  to open the file whose name was previously parsed into the FCB named
  my_fcb.

  Note that the structures of FCBs under CP/M and MS-DOS are not identical.
  However, the differences lie chiefly in the reserved areas of the FCBs
  (which should not be manipulated by application programs in any case), so
  well-behaved CP/M applications should be relatively easy to port to
  MS-DOS. It seems, however, that few such applications exist. Many of the
  tricks that were played by clever CP/M programmers to increase performance
  or circumvent the limitations of that operating system can cause severe
  problems under MS-DOS, particularly in networking environments. At any
  rate, much better performance can be achieved by thoroughly rewriting the
  CP/M applications to take advantage of the superior capabilities of
  MS-DOS.

  You can use a special FCB variant called an extended file control block to
  create or access files with special attributes (such as hidden or
  read-only files), volume labels, and subdirectories. An extended FCB has a
  7-byte header followed by the 37-byte structure of a normal FCB (Figure
  8-3). The first byte contains 0FFH, which could never be a legal drive
  code and thus indicates to MS-DOS that an extended FCB is being used. The
  next 5 bytes are reserved and are unused in current versions of MS-DOS.
  The seventh byte contains the attribute of the special file type that is
  being accessed. (Attribute bytes are discussed in more detail in Chapter
  9.) Any MS-DOS function that uses a normal FCB can also use an extended
  FCB.

  The FCB file- and record-management functions may be gathered into the
  following broad classifications:

  Byte
  offset
  00H ┌───────────────────────────────────────────────────────┐
      │                         0FFH                          │ Note 11
  01H ├───────────────────────────────────────────────────────┤
      │           Reserved (5 bytes, must be zero)            │
  06H ├───────────────────────────────────────────────────────┤
      │                    Attribute byte                     │ Note 12
  07H ├───────────────────────────────────────────────────────┤
      │                 Drive identification                  │ Note 1
  08H ├───────────────────────────────────────────────────────┤
      │                Filename (8 characters)                │ Note 2
  10H ├───────────────────────────────────────────────────────┤
      │               Extension (3 characters)                │ Note 2
  13H ├───────────────────────────────────────────────────────┤
      │                 Current-block number                  │ Note 9
  15H ├───────────────────────────────────────────────────────┤
      │                      Record size                      │ Note 10
  17H ├───────────────────────────────────────────────────────┤
      │                  File size (4 bytes)                  │ Notes 3, 6
  1BH ├───────────────────────────────────────────────────────┤
      │                 Date created/updated                  │ Note 7
  1DH ├───────────────────────────────────────────────────────┤
      │                 Time created/updated                  │ Note 8
  1FH ├───────────────────────────────────────────────────────┤
      │                       Reserved                        │
  27H ├───────────────────────────────────────────────────────┤
      │                 Current-record number                 │ Note 9
  28H ├───────────────────────────────────────────────────────┤
      │           Relative-record number (4 bytes)            │ Note 5
      └───────────────────────────────────────────────────────┘

  Figure 8-3.  Extended file control block. Total length is 44 bytes (2CH
  bytes). See the notes for Figures 8-1 and 8-3 later in this section.
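
  In C, an extended FCB can be sketched as a 7-byte header in front of the
  37 bytes of a normal FCB. The structure name, member names, and the
  xfcb_init helper are hypothetical; ATTR_HIDDEN uses the standard
  directory attribute bit for hidden files (attribute bytes are covered in
  Chapter 9).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ATTR_HIDDEN 0x02         /* directory attribute bit: hidden file */

#pragma pack(push, 1)            /* compiler extension; keeps layout exact */
struct xfcb {
    uint8_t flag;                /* 00H: always 0FFH for an extended FCB  */
    uint8_t reserved[5];         /* 01H: reserved, must be zero           */
    uint8_t attribute;           /* 06H: attribute byte                   */
    uint8_t fcb[37];             /* 07H: normal FCB, starting with drive  */
};
#pragma pack(pop)

static_assert(offsetof(struct xfcb, fcb) == 0x07, "normal FCB at 07H");
static_assert(sizeof(struct xfcb) == 44, "extended FCB is 44 bytes");

/* Zero the whole structure, then mark it as extended and set the
   attribute of the file type to be accessed. */
void xfcb_init(struct xfcb *x, uint8_t attribute)
{
    memset(x, 0, sizeof *x);
    x->flag = 0xFF;
    x->attribute = attribute;
}
```

  The address passed to MS-DOS is that of the flag byte at the start of the
  header, not the address of the embedded normal FCB.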

  Function                 Action
  ──────────────────────────────────────────────────────────────────────────
  Common FCB file operations
  0FH                     Open file.
  10H                     Close file.
  16H                     Create file.

  Common FCB record operations
  14H                     Perform sequential read.
  15H                     Perform sequential write.
  21H                     Perform random read.
  22H                     Perform random write.
  27H                     Perform random block read.
  28H                     Perform random block write.

  Other vital FCB operations
  1AH                     Set disk transfer address.
  29H                     Parse filename.

  Less commonly used FCB file operations
  13H                     Delete file.
  17H                     Rename file.

  Less commonly used FCB record operations
  23H                     Obtain file size.
  24H                     Set relative-record number.
  ──────────────────────────────────────────────────────────────────────────


  Several of these functions have special properties. For example, Int 21H
  Functions 27H (Random Block Read) and 28H (Random Block Write) allow
  reading and writing of multiple records of any size and also update the
  relative-record field automatically (unlike Int 21H Functions 21H and
  22H). Int 21H Function 28H can truncate a file to any desired size, and
  Int 21H Function 17H used with an extended FCB can alter a volume label
  or rename a subdirectory.

  Section 2 of this book, "MS-DOS Functions Reference," gives detailed
  specifications for each of the FCB file and record functions, along with
  assembly-language examples. It is also instructive to compare the
  preceding groups with the corresponding groups of handle-type functions
  listed later in this chapter.

  ──────────────────────────────────────────────────────────────────────────
  Notes for Figures 8-1 and 8-3
    1.  The drive identification is a binary number: 00=default drive,
        01=drive A:, 02=drive B:, and so on. If the application program
        supplies the drive code as zero (default drive), MS-DOS fills in the
        code for the actual current disk drive after a successful open or
        create call.

    2.  File and extension names must be left justified and padded with
        blanks.

    3.  The file size, date, time, and reserved fields should not be
        modified by applications.

    4.  All word fields are stored with the least significant byte at the
        lower address.

    5.  The relative-record field is treated as 4 bytes if the record size
        is less than 64 bytes; otherwise, only the first 3 bytes of this
        field are used.

    6.  The file-size field is in the same format as in the directory, with
        the less significant word at the lower address.

    7.  The date field is mapped as in the directory. Viewed as a 16-bit
        word (as it would appear in a register), the field is broken down as
        follows:

      F  E  D  C  B  A  9   8     7     6     5    4   3   2   1   0
    ┌─────────────────────┬─────────────────────┬─────────────────────┐
    │        Year         │        Month        │         Day         │
    └─────────────────────┴─────────────────────┴─────────────────────┘

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           Day (1─31)
    05H─08H           Month (1─12)
    09H─0FH           Year, relative to 1980
    ────────────────────────────────────────────────────────────────────────

    8.  The time field is mapped as in the directory. Viewed as a 16-bit
        word (as it would appear in a register), the field is broken down as
        follows:

      F   E   D   C   B   A   9   8   7   6   5   4   3   2   1   0
    ┌───────────────────┬───────────────────────┬─────────────────────┐
    │     Hours         │        Minutes        │ 2-second increments │
    └───────────────────┴───────────────────────┴─────────────────────┘

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           2-second increments (0─29)
    05H─0AH           Minutes (0─59)
    0BH─0FH           Hours (0─23)
    ────────────────────────────────────────────────────────────────────────

    9.  The current-block and current-record numbers are used together on
        sequential reads and writes. This simulates the behavior of CP/M.

    10. The Int 21H open (0FH) and create (16H) functions set the
        record-size field to 128 bytes, to provide compatibility with CP/M.
        If you use another record size, you must fill it in after the open
        or create operation.

    11. An 0FFH (255) in the first byte of the structure signifies that it
        is an extended file control block. You can use extended FCBs with
        any of the functions that accept an ordinary FCB. (See also note
        12.)

    12. The attribute byte in an extended FCB allows access to files with
        the special characteristics hidden, system, or read-only. You can
        also use extended FCBs to read volume labels and the contents of
        special subdirectory files.

  ──────────────────────────────────────────────────────────────────────────
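
  The bit layouts in notes 7 and 8 can be unpacked with shifts and masks.
  The following C sketch assumes the 16-bit word has already been read from
  the FCB; the structure and function names are invented for illustration.

```c
#include <stdint.h>

struct dos_date { unsigned year, month, day; };
struct dos_time { unsigned hours, minutes, seconds; };

/* Unpack a date word: bits 0-4 day, bits 5-8 month,
   bits 9-15 year relative to 1980. */
struct dos_date dos_date_decode(uint16_t w)
{
    struct dos_date d;
    d.day   =  w        & 0x1F;
    d.month = (w >> 5)  & 0x0F;
    d.year  = ((w >> 9) & 0x7F) + 1980;
    return d;
}

/* Unpack a time word: bits 0-4 two-second increments,
   bits 5-10 minutes, bits 11-15 hours. */
struct dos_time dos_time_decode(uint16_t w)
{
    struct dos_time t;
    t.seconds = (w & 0x1F) * 2;       /* stored in 2-second units */
    t.minutes = (w >> 5)  & 0x3F;
    t.hours   = (w >> 11) & 0x1F;
    return t;
}
```

  For example, the date word 10CFH decodes to June 15, 1988, and the time
  word 6DAFH decodes to 13:45:30. Because seconds are stored in 2-second
  increments, odd seconds cannot be represented.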

FCB File-Access Skeleton

  The following is a typical program sequence to access a file using the
  FCB, or traditional, functions (Figure 8-4):

  1.  Zero out the prospective FCB.

  2.  Obtain the filename from the user, from the default FCBs, or from the
      command tail in the PSP.

  3.  If the filename was not obtained from one of the default FCBs, parse
      the filename into the new FCB using Int 21H Function 29H.

  4.  Open the file (Int 21H Function 0FH) or, if writing new data only,
      create the file or truncate any existing file of the same name to zero
      length (Int 21H Function 16H).

  5.  Set the record-size field in the FCB, unless you are using the default
      record size. Recall that it is important to do this after a successful
      open or create operation. (See Figure 8-5.)

  6.  Set the relative-record field in the FCB if you are performing random
      record I/O.

  7.  Set the disk transfer area address using Int 21H Function 1AH. You can
      skip this step if the buffer address has not changed since the last
      call to this function. If the application never sets the DTA, the DTA
      address defaults to offset 0080H in the PSP.

  8.  Request the needed read- or write-record operation (Int 21H Function
      14H─Sequential Read, 15H─Sequential Write, 21H─Random Read,
      22H─Random Write, 27H─Random Block Read, 28H─Random Block Write).

  9.  If the program is not finished processing the file, go to step 6;
      otherwise, close the file (Int 21H Function 10H). If the file was
      used for reading only, you can skip the close operation under early
      versions of MS-DOS. However, this shortcut can cause problems under
      MS-DOS versions 3.0 and later, especially when the files are being
      accessed across a network.

  ──────────────────────────────────────────────────────────────────────────
  recsize      equ   1024                   ; file record size
               .
               .
               .
               mov   ah,29h                 ; parse input filename
               mov   al,1                   ; skip leading blanks
               mov   si,offset fname1       ; address of filename
               mov   di,offset fcb1         ; address of FCB
               int   21h
               or    al,al                  ; jump if name
               jnz   name_err               ; was bad
               .
               .
               .
               mov   ah,29h                 ; parse output filename
               mov   al,1                   ; skip leading blanks
               mov   si,offset fname2       ; address of filename
               mov   di,offset fcb2         ; address of FCB
               int   21h
               or    al,al                  ; jump if name
               jnz   name_err               ; was bad
               .
               .
               .
               mov   ah,0fh                 ; open input file
               mov   dx,offset fcb1
               int   21h
               or    al,al                  ; open successful?
               jnz   no_file                ; no, jump
               .
               .
               .
               mov   ah,16h                 ; create and open
               mov   dx,offset fcb2         ; output file
               int   21h
               or    al,al                  ; create successful?
               jnz   disk_full              ; no, jump
               .
               .
               .                            ; set record sizes
               mov   word ptr fcb1+0eh,recsize
               mov   word ptr fcb2+0eh,recsize
               .
               .
               .
               mov   ah,1ah                 ; set disk transfer
               mov   dx,offset buffer       ; address for reads
               int   21h                    ; and writes
               .
  next:        .                            ; process next record
               .
               mov   ah,14h                 ; sequential read from
               mov   dx,offset fcb1         ; input file
               int   21h
               cmp   al,01                  ; check for end of file
               je    file_end               ; jump if end of file
               cmp   al,03
               je    file_end               ; jump if end of file
               or    al,al                  ; other read fault?
               jnz   bad_read               ; jump if bad read
               .
               .
               .
               mov   ah,15h                 ; sequential write to
               mov   dx,offset fcb2         ; output file
               int   21h
               or    al,al                  ; write successful?
               jnz   bad_write              ; jump if write failed
               .
               .
               .
               jmp   next                   ; process next record
               .
  file_end:    .                            ; reached end of input
               .
               mov   ah,10h                 ; close input file
               mov   dx,offset fcb1
               int   21h
               .
               .
               .
               mov   ah,10h                 ; close output file
               mov   dx,offset fcb2
               int   21h
               .
               .
               .
               mov   ax,4c00h               ; exit with return
               int   21h                    ; code of zero
               .
               .
               .
  fname1       db    'OLDFILE.DAT',0        ; name of input file
  fname2       db    'NEWFILE.DAT',0        ; name of output file
  fcb1         db    37 dup (0)             ; FCB for input file
  fcb2         db    37 dup (0)             ; FCB for output file
  buffer       db    recsize dup (?)        ; buffer for file I/O
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-4.  Skeleton of an assembly-language program that performs file
  and record I/O using the FCB family of functions.

  Byte Offset  FCB before open       FCB contents       FCB after open
           ┌────────────────────┬────────────────────┬────────────────────┐
       00H │         00         │       Drive        │         03         │
           ├────────────────────┼────────────────────┼────────────────────┤
       01H │         4D         │                    │         4D         │
       02H │         59         │                    │         59         │
       03H │         46         │                    │         46         │
       04H │         49         │      Filename      │         49         │
       05H │         4C         │                    │         4C         │
       06H │         45         │                    │         45         │
       07H │         20         │                    │         20         │
       08H │         20         │                    │         20         │
           ├────────────────────┼────────────────────┼────────────────────┤
       09H │         44         │                    │         44         │
       0AH │         41         │     Extension      │         41         │
       0BH │         54         │                    │         54         │
           ├────────────────────┼────────────────────┼────────────────────┤
       0CH │         00         │                    │         00         │
       0DH │         00         │   Current block    │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       0EH │         00         │                    │         80         │
       0FH │         00         │    Record size     │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       10H │         00         │                    │         80         │
       11H │         00         │                    │         3D         │
       12H │         00         │     File size      │         00         │
       13H │         00         │                    │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       14H │         00         │                    │         43         │
       15H │         00         │     File date      │         0B         │
           ├────────────────────┼────────────────────┼────────────────────┤
       16H │         00         │                    │         A1         │
       17H │         00         │     File time      │         52         │
           ├────────────────────┼────────────────────┼────────────────────┤
       18H │         00         │                    │         03         │
       19H │         00         │                    │         02         │
       1AH │         00         │                    │         42         │
       1BH │         00         │                    │         73         │
       1CH │         00         │      Reserved      │         00         │
       1DH │         00         │                    │         01         │
       1EH │         00         │                    │         35         │
       1FH │         00         │                    │         0F         │
           ├────────────────────┼────────────────────┼────────────────────┤
       20H │         00         │   Current record   │         00         │
           ├────────────────────┼────────────────────┼────────────────────┤
       21H │         00         │                    │         00         │
       22H │         00         │  Relative-record   │         00         │
       23H │         00         │       number       │         00         │
       24H │         00         │                    │         00         │
           └────────────────────┴────────────────────┴────────────────────┘

  Figure 8-5.  A typical file control block before and after a successful
  open call (Int 21H Function 0FH).
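
  The fields that MS-DOS fills in during the open can be decoded with a few
  shifts and masks. The following C sketch is illustrative only: the names
  fcb_info, get_word, decode_fcb, and fcb_after_open are invented for this
  example, and it assumes that the date and time words use the same packed
  format as a directory entry (year counted from 1980, seconds stored in
  2-second units). The byte array reproduces the "FCB after open" column of
  Figure 8-5.

```c
#include <stdint.h>

/* Fields MS-DOS fills in when it opens a file through an FCB. */
struct fcb_info {
    unsigned record_size;            /* bytes per record      */
    unsigned long file_size;         /* bytes                 */
    unsigned year, month, day;       /* from the date word    */
    unsigned hour, minute, second;   /* from the time word    */
};

/* Read a little-endian 16-bit word from a byte array. */
static unsigned get_word(const uint8_t *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Decode offsets 0EH through 17H of a 37-byte FCB image. */
struct fcb_info decode_fcb(const uint8_t *fcb)
{
    struct fcb_info info;
    unsigned date = get_word(fcb + 0x14);
    unsigned time = get_word(fcb + 0x16);

    info.record_size = get_word(fcb + 0x0E);
    info.file_size   = (unsigned long)get_word(fcb + 0x10)
                     | ((unsigned long)get_word(fcb + 0x12) << 16);
    info.year   = 1980 + (date >> 9);      /* bits 9-15  */
    info.month  = (date >> 5) & 0x0F;      /* bits 5-8   */
    info.day    = date & 0x1F;             /* bits 0-4   */
    info.hour   = time >> 11;              /* bits 11-15 */
    info.minute = (time >> 5) & 0x3F;      /* bits 5-10  */
    info.second = (time & 0x1F) * 2;       /* bits 0-4   */
    return info;
}

/* The "FCB after open" column of Figure 8-5, offsets 00H through 24H. */
const uint8_t fcb_after_open[37] = {
    0x03, 'M','Y','F','I','L','E',' ',' ', 'D','A','T',
    0x00,0x00, 0x80,0x00, 0x80,0x3D,0x00,0x00,
    0x43,0x0B, 0xA1,0x52,
    0x03,0x02,0x42,0x73,0x00,0x01,0x35,0x0F,
    0x00, 0x00,0x00,0x00,0x00
};
```

  For the values in Figure 8-5, decode_fcb(fcb_after_open) reports a record
  size of 128 bytes, a file size of 15,744 bytes, and a time stamp of
  October 3, 1985, at 10:21:02.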

Points to Remember

  Here is a summary of the pros and cons of using the FCB-related file and
  record functions in your programs.

  Advantages:

  ■  Under MS-DOS versions 1 and 2, the number of files that can be open
     concurrently when using FCBs is unlimited. (This is not true under
     MS-DOS versions 3.0 and later, especially if networking software is
     running.)

  ■  File-access methods using FCBs are familiar to programmers with a CP/M
     background, and well-behaved CP/M applications require little change in
     logical flow to run under MS-DOS.

  ■  MS-DOS supplies the size, time, and date for a file to its FCB after
     the file is opened. The calling program can inspect this information.

  Disadvantages:

  ■  FCBs take up room in the application program's memory space.

  ■  FCBs offer no support for the hierarchical file structure (no access to
     files outside the current directory).

  ■  FCBs provide no support for file locking/sharing or record locking in
     networking environments.

  ■  In addition to the read or write call itself, file reads or writes
     using FCBs require manipulation of the FCB to set record size and
     record number, plus a previous call to a separate MS-DOS function to
     set the DTA address.

  ■  Random record I/O using FCBs for a file containing variable-length
     records is very clumsy and inconvenient.

  ■  You must use extended FCBs, which are incompatible with CP/M anyway, to
     access or create files with special attributes such as hidden,
     read-only, or system.

  ■  The FCB file functions have poor error reporting. This situation has
     been improved somewhat in MS-DOS version 3 because a program can call
     the added Int 21H Function 59H (Get Extended Error Information) after
     a failed FCB function to obtain additional information.

  ■  Microsoft discourages use of FCBs. FCBs will make your program more
     difficult to port to MS OS/2 later because MS OS/2 does not support
     FCBs in protected mode at all.


Using the Handle Functions

  The handle file- and record-management functions access files in a fashion
  similar to that used under the UNIX/XENIX operating system. Files are
  designated by an ASCIIZ string (an ASCII character string terminated by a
  null, or zero, byte) that can contain a drive designator, path, filename,
  and extension. For example, the file specification

  C:\SYSTEM\COMMAND.COM

  would appear in memory as the following sequence of bytes:

  43 3A 5C 53 59 53 54 45 4D 5C 43 4F 4D 4D 41 4E 44 2E 43 4F 4D 00
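
  In C, a string literal already has this form, because the compiler appends
  the terminating zero byte. The short sketch below formats the bytes of such
  a string so that you can compare them with the sequence shown above. The
  names filespec and dump_asciiz are hypothetical and exist only for this
  illustration.

```c
#include <stdio.h>
#include <string.h>

/* A C string literal is an ASCIIZ string. Note that each backslash in the
   path must be doubled in C source code. */
const char filespec[] = "C:\\SYSTEM\\COMMAND.COM";

/* Format the bytes of an ASCIIZ string, terminator included, as hex pairs
   separated by spaces. Returns the number of bytes formatted. */
size_t dump_asciiz(const char *s, char *out, size_t outsize)
{
    size_t n = strlen(s) + 1;          /* +1 for the null byte */
    size_t i, used = 0;

    /* Each pass writes at most a space, two digits, and a terminator. */
    for (i = 0; i < n && used + 4 <= outsize; i++)
        used += (size_t)sprintf(out + used, i ? " %02X" : "%02X",
                                (unsigned)(unsigned char)s[i]);
    return i;
}
```

  The array filespec occupies 22 bytes: the 21 characters of the file
  specification plus the terminating null.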

  When a program wishes to open or create a file, it passes the address of
  the ASCIIZ string specifying the file to MS-DOS in the DS:DX registers
  (Figure 8-6). If the operation is successful, MS-DOS returns a 16-bit
  handle to the program in the AX register. The program must save this
  handle for further reference.

  ──────────────────────────────────────────────────────────────────────────
               mov   ah,3dh                  ; function 3dh = open
               mov   al,2                    ; mode 2 = read/write
               mov   dx,seg filename         ; address of ASCIIZ
               mov   ds,dx                   ; file specification
               mov   dx,offset filename
               int   21h                     ; request open from DOS
               jc    error                   ; jump if open failed
               mov   handle,ax               ; save file handle
               .
               .
               .
  filename     db    'C:\MYDIR\MYFILE.DAT',0 ; filename
  handle       dw    0                       ; file handle
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-6.  A typical handle file operation. This sequence of code
  attempts to open the file designated in the ASCIIZ string whose address is
  passed to MS-DOS in the DS:DX registers.

  When the program requests subsequent operations on the file, it usually
  places the handle in the BX register before the call to MS-DOS. All the
  handle functions return with the CPU's carry flag cleared if the operation
  was successful, or set if the operation failed; in the latter case, the AX
  register contains a code describing the failure.

  MS-DOS restricts the number of handles that can be active at any one
  time──that is, the number of files and devices that can be open
  concurrently when using the handle family of functions──in two different
  ways:

  ■  The maximum number of concurrently open files in the system, for all
     active processes combined, is specified by the entry

     FILES=nn

     in the CONFIG.SYS file. This entry determines the number of entries
     to be allocated in the system file table; under MS-DOS version 3, the
     default value is 8 and the maximum is 255. After MS-DOS is booted and
     running, you cannot expand this table to increase the total number of
     files that can be open. You must use an editor to modify the CONFIG.SYS
     file and then restart the system.

  ■  The maximum number of concurrently open files for a single process is
     20, assuming that sufficient entries are also available in the system
     file table. When a program is loaded, MS-DOS preassigns 5 of its
     potential 20 handles to the standard devices. Each time the process
     issues an open or create call, MS-DOS assigns a handle from the
     process's private allocation of 20, until all the handles are used up
     or the system file table is full. In MS-DOS versions 3.3 and later, you
     can expand the per-process limit of 20 handles with a call to Int 21H
     Function 67H (Set Handle Count).

  The handle file- and record-management calls may be gathered into the
  following broad classifications for study:

  Function                 Action
  ──────────────────────────────────────────────────────────────────────────
  Common handle file operations
  3CH                     Create file (requires ASCIIZ string).
  3DH                     Open file (requires ASCIIZ string).
  3EH                     Close file.

  Common handle record operations
  42H                     Set file pointer (also used to find file size).
  3FH                     Read file.
  40H                     Write file.

  Less commonly used handle operations
  41H                     Delete file.
  43H                     Get or modify file attributes.
  44H                     IOCTL (I/O Control).
  45H                     Duplicate handle.
  46H                     Redirect handle.
  56H                     Rename file.
  57H                     Get or set file date and time.
  5AH                     Create temporary file (versions 3.0 and later).
  5BH                     Create file (fails if file already exists;
                           versions 3.0 and later).
  5CH                     Lock or unlock file region (versions 3.0 and
                           later).
  67H                     Set handle count (versions 3.3 and later).
  68H                     Commit file (versions 3.3 and later).
  6CH                     Extended open file (version 4).
  ──────────────────────────────────────────────────────────────────────────


  Compare the groups of handle-type functions in the preceding table with
  the groups of FCB functions outlined earlier, noting the degree of
  functional overlap. Section 2 of this book, "MS-DOS Functions Reference,"
  gives detailed specifications for each of the handle functions, along with
  assembly-language examples.

Handle File-Access Skeleton

  The following is a typical program sequence to access a file using the
  handle family of functions (Figure 8-7):

  1.  Get the filename from the user by means of the buffered input service
      (Int 21H Function 0AH) or from the command tail supplied by MS-DOS in
      the PSP.

  2.  Put a zero at the end of the file specification in order to create an
      ASCIIZ string.

  3.  Open the file using Int 21H Function 3DH and mode 2 (read/write
      access), or create the file using Int 21H Function 3CH. (Be sure to
      set the CX register to zero, so that you don't accidentally make a
      file with special attributes.) Save the handle that is returned.

  4.  Set the file pointer using Int 21H Function 42H. You may set the
      file-pointer position relative to one of three different locations:
      the start of the file, the current pointer position, or the end of the
      file. If you are performing sequential record I/O, you can usually
      skip this step because MS-DOS will maintain the file pointer for you
      automatically.

  5.  Read from the file (Int 21H Function 3FH) or write to the file (Int
      21H Function 40H). Both of these functions require that the BX
      register contain the file's handle, the CX register contain the length
      of the record, and the DS:DX registers point to the data being
      transferred. Both return the actual number of bytes transferred in the
      AX register.

      In a read operation, if the number of bytes read is less than the
      number requested, the end of the file has been reached. In a write
      operation, if the number of bytes written is less than the number
      requested, the disk containing the file is full. Neither of these
      conditions is returned as an error code; that is, the carry flag is
      not set.

  6.  If the program is not finished processing the file, go to step 4;
      otherwise, close the file (Int 21H Function 3EH). Any normal exit
      from the program will also close all active handles.
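
  The outcome test described in step 5 has three branches, which the
  following C sketch makes explicit. It models only the decision logic, not
  the Int 21H call itself; the names xfer_result and classify_transfer are
  hypothetical, and the carry flag and AX value are passed in as ordinary
  arguments.

```c
/* Possible outcomes of a handle read (3FH) or write (40H) call. */
enum xfer_result {
    XFER_COMPLETE,   /* all requested bytes were transferred            */
    XFER_SHORT,      /* fewer bytes: end of file (read) or disk full
                        (write); the carry flag is NOT set in this case */
    XFER_ERROR       /* carry flag set; AX holds an error code          */
};

/* carry     = state of the carry flag on return from Int 21H
   ax        = value returned in AX
   requested = byte count that was passed in CX */
enum xfer_result classify_transfer(int carry, unsigned ax, unsigned requested)
{
    if (carry)
        return XFER_ERROR;       /* AX is an error code, not a count */
    if (ax < requested)
        return XFER_SHORT;       /* partial or empty transfer        */
    return XFER_COMPLETE;
}
```

  Testing the carry flag first matters: when the call fails, AX contains an
  error code rather than a byte count, so comparing it with the requested
  length would be meaningless.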

  ──────────────────────────────────────────────────────────────────────────
  recsize      equ     1024                 ; file record size
               .
               .
               .
               mov   ah,3dh                 ; open input file
               mov   al,0                   ; mode = read only
               mov   dx,offset fname1       ; name of input file
               int   21h
               jc    no_file                ; jump if no file
               mov   handle1,ax             ; save token for file
               .
               .
               .
               mov   ah,3ch                 ; create output file
               mov   cx,0                   ; attribute = normal
               mov   dx,offset fname2       ; name of output file
               int   21h
               jc    disk_full              ; jump if create fails
               mov   handle2,ax             ; save token for file
               .
  next:        .                            ; process next record
               .
               mov   ah,3fh                 ; sequential read from
               mov   bx,handle1             ; input file
               mov   cx,recsize
               mov   dx,offset buffer
               int   21h
               jc    bad_read               ; jump if read error
               or    ax,ax                  ; check bytes transferred
               jz    file_end               ; jump if end of file
               .
               .
               .
               mov   ah,40h                 ; sequential write to
               mov   bx,handle2             ; output file
               mov   cx,recsize
               mov   dx,offset buffer
               int   21h
               jc    bad_write              ; jump if write error
               cmp   ax,recsize             ; whole record written?
               jne   disk_full              ; jump if disk is full
               .
               .
               .
               jmp   next                   ; process next record
               .
  file_end:    .                            ; reached end of input
               .
               mov   ah,3eh                 ; close input file
               mov   bx,handle1
               int   21h
               .
               .
               .
               mov   ah,3eh                 ; close output file
               mov   bx,handle2
               int   21h
               .
               .
               .
               mov   ax,4c00h               ; exit with return
               int   21h                    ; code of zero
               .
               .
               .
  fname1       db    'OLDFILE.DAT',0        ; name of input file
  fname2       db    'NEWFILE.DAT',0        ; name of output file
  handle1      dw    0                      ; token for input file
  handle2      dw    0                      ; token for output file
  buffer       db    recsize dup (?)        ; buffer for file I/O
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-7.  Skeleton of an assembly-language program that performs
  sequential processing on an input file and writes the results to an output
  file using the handle file and record functions. This code assumes that
  the DS and ES registers have already been set to point to the segment
  containing the buffers and filenames.

Points to Remember

  Here is a summary of the pros and cons of using the handle file and record
  operations in your program. Compare this list with the one given earlier
  in the chapter for the FCB family of functions.

  Advantages:

  ■  The handle calls provide direct support for I/O redirection and pipes
     with the standard input and output devices in a manner functionally
     similar to that used by UNIX/XENIX.

  ■  The handle functions provide direct support for directories (the
     hierarchical file structure) and special file attributes.

  ■  The handle calls support file sharing/locking and record locking in
     networking environments.

  ■  Using the handle functions, the programmer can open channels to
     character devices and treat them as files.

  ■  The handle calls make the use of random record access extremely easy.
     The current file pointer can be moved to any byte offset relative to
     the start of the file, the end of the file, or the current pointer
     position. Records of any length, up to an entire segment (65,535
     bytes), can be read to any memory address in one operation.

  ■  The handle functions have relatively good error reporting in MS-DOS
     version 2, and error reporting has been enhanced even further in MS-DOS
     versions 3.0 and later.

  ■  Microsoft strongly encourages use of the handle family of functions in
     order to provide upward compatibility with MS OS/2.

  Disadvantages:

  ■  There is a limit per program of 20 concurrently open files and devices
     using handles in MS-DOS versions 2.0 through 3.2.

  ■  Minor gaps still exist in the implementation of the handle functions.
     For example, you must still use extended FCBs to change volume labels
     and to access the contents of the special files that implement
     directories.
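
  To illustrate the random-access advantage listed above: reaching a
  fixed-length record requires only one multiplication and a call to Int 21H
  Function 42H. The C sketch below computes the register values for that
  call. It is a sketch under stated assumptions: the names seek_regs and
  seek_to_record are hypothetical, and the register conventions used (AL = 0
  to seek from the start of the file, with the 32-bit offset split across
  CX:DX) come from the Function 42H specification rather than from this
  chapter.

```c
#include <stdint.h>

/* Register values for Int 21H Function 42H (Set File Pointer). */
struct seek_regs {
    uint16_t ax;   /* AH = 42H, AL = method (0 = from start of file) */
    uint16_t bx;   /* file handle                                    */
    uint16_t cx;   /* most significant half of the byte offset       */
    uint16_t dx;   /* least significant half of the byte offset      */
};

/* Build the register values that position the file pointer at the start
   of a fixed-length record, counting records from zero. The caller must
   make sure that record * recsize fits in 32 bits. */
struct seek_regs seek_to_record(uint16_t handle, uint32_t record,
                                uint16_t recsize)
{
    uint32_t offset = record * (uint32_t)recsize;
    struct seek_regs r;

    r.ax = 0x4200;                        /* function 42H, method 0 */
    r.bx = handle;
    r.cx = (uint16_t)(offset >> 16);
    r.dx = (uint16_t)(offset & 0xFFFFu);
    return r;
}
```

  For example, record 100 of a file with 1024-byte records begins at byte
  offset 102,400 (19000H), so the call would be made with CX = 0001H and
  DX = 9000H.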


MS-DOS Error Codes

  When one of the handle file functions fails with the carry flag set, or
  when a program calls Int 21H Function 59H (Get Extended Error
  Information) following a failed FCB function or other system service, one
  of the following error codes may be returned:

  Value                    Meaning
  ──────────────────────────────────────────────────────────────────────────
  MS-DOS version 2 error codes
  01H                      Function number invalid
  02H                      File not found
  03H                      Path not found
  04H                      Too many open files
  05H                      Access denied
  06H                      Handle invalid
  07H                      Memory control blocks destroyed
  08H                      Insufficient memory
  09H                      Memory block address invalid
  0AH (10)                 Environment invalid
  0BH (11)                 Format invalid
  0CH (12)                 Access code invalid
  0DH (13)                 Data invalid
  0EH (14)                 Unknown unit
  0FH (15)                 Disk drive invalid
  10H (16)                 Attempted to remove current directory
  11H (17)                 Not same device
  12H (18)                 No more files

  Mappings to critical-error codes
  13H (19)                 Write-protected disk
  14H (20)                 Unknown unit
  15H (21)                 Drive not ready
  16H (22)                 Unknown command
  17H (23)                 Data error (CRC)
  18H (24)                 Bad request-structure length
  19H (25)                 Seek error
  1AH (26)                 Unknown media type
  1BH (27)                 Sector not found
  1CH (28)                 Printer out of paper
  1DH (29)                 Write fault
  1EH (30)                 Read fault
  1FH (31)                 General failure

  MS-DOS version 3 and later extended error codes
  20H (32)                 Sharing violation
  21H (33)                 File-lock violation
  22H (34)                 Disk change invalid
  23H (35)                 FCB unavailable
  24H (36)                 Sharing buffer exceeded
  25H─31H (37─49)          Reserved
  32H (50)                 Unsupported network request
  33H (51)                 Remote machine not listening
  34H (52)                 Duplicate name on network
  35H (53)                 Network name not found
  36H (54)                 Network busy
  37H (55)                 Device no longer exists on network
  38H (56)                 NetBIOS command limit exceeded
  39H (57)                 Error in network adapter hardware
  3AH (58)                 Incorrect response from network
  3BH (59)                 Unexpected network error
  3CH (60)                 Remote adapter incompatible
  3DH (61)                 Print queue full
  3EH (62)                 Not enough room for print file
  3FH (63)                 Print file was deleted
  40H (64)                 Network name deleted
  41H (65)                 Network access denied
  42H (66)                 Incorrect network device type
  43H (67)                 Network name not found
  44H (68)                 Network name limit exceeded
  45H (69)                 NetBIOS session limit exceeded
  46H (70)                 Temporary pause
  47H (71)                 Network request not accepted
  48H (72)                 Print or disk redirection paused
  49H─4FH (73─79)          Reserved
  50H (80)                 File already exists
  51H (81)                 Reserved
  52H (82)                 Cannot make directory
  53H (83)                 Fail on Int 24H (critical error)
  54H (84)                 Too many redirections
  55H (85)                 Duplicate redirection
  56H (86)                 Invalid password
  57H (87)                 Invalid parameter
  58H (88)                 Net write fault
  ──────────────────────────────────────────────────────────────────────────


  Under MS-DOS versions 3.0 and later, you can also use Int 21H Function
  59H to obtain other information about the error, such as the error locus
  and the recommended recovery action.
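
  A program typically translates these codes into messages for the user. The
  following C sketch shows one possible lookup routine. The names dos_error,
  dos_errors, and dos_error_text are hypothetical, and only a handful of
  entries from the table above are included; a real program would list every
  code it expects to encounter.

```c
#include <stddef.h>

struct dos_error {
    unsigned code;
    const char *text;
};

/* A few entries copied from the error-code table. */
static const struct dos_error dos_errors[] = {
    { 0x02, "File not found" },
    { 0x03, "Path not found" },
    { 0x04, "Too many open files" },
    { 0x05, "Access denied" },
    { 0x06, "Handle invalid" },
    { 0x20, "Sharing violation" },
    { 0x21, "File-lock violation" },
    { 0x50, "File already exists" }
};

/* Return the message for an error code, or a fallback string if the
   code is not in the table. */
const char *dos_error_text(unsigned code)
{
    size_t i;

    for (i = 0; i < sizeof dos_errors / sizeof dos_errors[0]; i++)
        if (dos_errors[i].code == code)
            return dos_errors[i].text;
    return "Unknown error";
}
```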

Critical-Error Handlers

  In Chapter 5, we discussed how an application program can take over the
  Ctrl-C handler vector (Int 23H) and replace the MS-DOS default handler, to
  avoid losing control of the computer when the user enters a Ctrl-C or
  Ctrl-Break at the keyboard. Similarly, MS-DOS provides a
  critical-error-handler vector (Int 24H) that defines the routine to be
  called when unrecoverable hardware faults occur. The default MS-DOS
  critical-error handler is the routine that displays a message describing
  the error type and the cue

  Abort, Retry, Ignore?

  This message appears after such actions as the following:

  ■  Attempting to open a file on a disk drive that doesn't contain a floppy
     disk or whose door isn't closed

  ■  Trying to read a disk sector that contains a CRC error

  ■  Trying to print when the printer is off line

  The unpleasant thing about MS-DOS's default critical-error handler is, of
  course, that if the user enters an A for Abort, the application that is
  currently executing is terminated abruptly and never has a chance to clean
  up and make a graceful exit. Intermediate files may be left on the disk,
  files that have been extended using FCBs are not closed properly (so the
  directory is never updated), interrupt vectors may be left pointing into
  the transient program area, and so forth.

  To write a truly bombproof MS-DOS application, you must take over the
  critical-error-handler vector and point it to your own routine, so that
  your program intercepts all catastrophic hardware errors and handles them
  appropriately. You can use MS-DOS Int 21H Function 25H to alter the Int
  24H vector in a well-behaved manner. When your application exits, MS-DOS
  will automatically restore the previous contents of the Int 24H vector
  from information saved in the program segment prefix.

  MS-DOS calls the critical-error handler for two general classes of
  errors──disk-related and non-disk-related──and passes different
  information to the handler in the registers for each of these classes.

  For disk-related errors, MS-DOS sets the registers as shown in the
  following table. (Bits 3─5 of the AH register are relevant only in MS-DOS
  versions 3.1 and later.)

  Register           Bit(s)            Significance
  ──────────────────────────────────────────────────────────────────────────
  AH                 7                 0, to signify disk error
                     6                 Reserved
                     5                 0 = ignore response not allowed
                                       1 = ignore response allowed
                     4                 0 = retry response not allowed
                                       1 = retry response allowed
                     3                 0 = fail response not allowed
                                       1 = fail response allowed
                     1─2               Area where disk error occurred
                                       00 = MS-DOS area
                                       01 = file allocation table
                                       10 = root directory
                                       11 = files area
                     0                 0 = read operation
                                       1 = write operation
  AL                 0─7               Drive code (0 = A, 1 = B, and so
                                       forth)
  DI                 0─7               Driver error code
                     8─15              Not used
  BP:SI                                Segment:offset of device-driver
                                       header
  ──────────────────────────────────────────────────────────────────────────
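
  The bit fields of the AH register can be unpacked with simple shifts and
  masks. The following C sketch shows the decoding only; the names
  crit_err_info and decode_ah are hypothetical, and the AH value is passed
  in as an ordinary argument rather than read from the CPU.

```c
/* Decoded form of the AH register passed to an Int 24H handler. */
struct crit_err_info {
    int is_disk_error;    /* bit 7 = 0 for a disk error               */
    int ignore_allowed;   /* bit 5 (versions 3.1 and later)           */
    int retry_allowed;    /* bit 4 (versions 3.1 and later)           */
    int fail_allowed;     /* bit 3 (versions 3.1 and later)           */
    int area;             /* bits 1-2: 0 = MS-DOS area,
                             1 = file allocation table,
                             2 = root directory, 3 = files area       */
    int is_write;         /* bit 0: 0 = read, 1 = write               */
};

struct crit_err_info decode_ah(unsigned char ah)
{
    struct crit_err_info e;

    e.is_disk_error  = (ah & 0x80) == 0;
    e.ignore_allowed = (ah >> 5) & 1;
    e.retry_allowed  = (ah >> 4) & 1;
    e.fail_allowed   = (ah >> 3) & 1;
    e.area           = (ah >> 1) & 3;
    e.is_write       = ah & 1;
    return e;
}
```

  For instance, an AH value of 3DH describes a disk error during a write to
  the root directory, with the ignore, retry, and fail responses all allowed.
  The area and response fields are meaningful only when is_disk_error is
  nonzero.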


  For non-disk-related errors, the interrupt was generated either as the
  result of a character-device error or because a corrupted memory image of
  the file allocation table was detected. In this case, MS-DOS sets the
  registers as follows:

  Register           Bit(s)            Significance
  ──────────────────────────────────────────────────────────────────────────
  AH                 7                 1, to signify a non-disk error
  DI                 0─7               Driver error code
                     8─15              Not used
  BP:SI                                Segment:offset of device-driver
                                       header
  ──────────────────────────────────────────────────────────────────────────

  To determine whether the critical error was caused by a character device,
  use the address in the BP:SI registers to examine the device attribute
  word at offset 0004H in the presumed device-driver header. If bit 15 is
  set, then the error was indeed caused by a character device, and the
  program can inspect the name field of the driver's header to determine the
  device.
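
  The test just described can be sketched in C as follows. This is an
  illustration under several assumptions: the function name char_device_name
  is hypothetical, the header is passed as a plain byte pointer rather than
  as the far pointer in BP:SI, and the position of the 8-byte name field
  (offset 000AH) is taken from the standard device-driver header layout
  rather than from this section.

```c
#include <stdint.h>
#include <string.h>

/* Examine a device-driver header image of at least 18 bytes.
   Returns 1 and copies the blank-padded device name into `name`
   (adding a terminating null) if the header belongs to a character
   device; returns 0 if bit 15 of the attribute word is clear. */
int char_device_name(const uint8_t *hdr, char name[9])
{
    /* device attribute word at offset 0004H, little-endian */
    unsigned attr = (unsigned)hdr[4] | ((unsigned)hdr[5] << 8);

    if ((attr & 0x8000u) == 0)
        return 0;                  /* bit 15 clear: not a character device */

    memcpy(name, hdr + 0x0A, 8);   /* name field, assumed at offset 000AH  */
    name[8] = '\0';
    return 1;
}
```

  When the function returns 0, the critical error was not caused by a
  character device, and the contents of `name` are left unchanged.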

  At entry to a critical-error handler, MS-DOS has already disabled
  interrupts and set up the stack as shown in Figure 8-8. A critical-error
  handler cannot use any MS-DOS services except Int 21H Functions 01H
  through 0CH (Traditional Character I/O), Int 21H Function 30H (Get MS-DOS
  Version), and Int 21H Function 59H (Get Extended Error Information).
  These functions use a special stack so that the context of the original
  function (which generated the critical error) will not be lost.

  ┌───────┐─┐
  │ Flags │ │
  ├───────┤ │  Flags and CS:IP pushed
  │  CS   │ ├─ on stack by original
  ├───────┤ │  Int 21H call
  │  IP   │ │
  ├───────┤═╡◄─SS:SP on entry to
  │  ES   │ │  Int 21H handler
  ├───────┤ │
  │  DS   │ │
  ├───────┤ │
  │  BP   │ │
  ├───────┤ │
  │  DI   │ │
  ├───────┤ ├─ Registers at point of
  │  SI   │ │  original Int 21H call
  ├───────┤ │
  │  DX   │ │
  ├───────┤ │
  │  CX   │ │
  ├───────┤ │
  │  BX   │ │
  ├───────┤ │
  │  AX   │ │
  ├───────┤═╡
  │ Flags │ │
  ├───────┤ │
  │  CS   │ ├─ Return address for
  ├───────┤ │  Int 24H handler
  │  IP   │ │
  └───────┘─┘
        └───── SS:SP on entry to
               Int 24H handler

  Figure 8-8.  The stack at entry to a critical-error handler.

  The critical-error handler should return to MS-DOS by executing an IRET,
  passing one of the following action codes in the AL register:

  Code               Meaning
  ──────────────────────────────────────────────────────────────────────────
  0                  Ignore the error (MS-DOS acts as though the original
                     function call had succeeded).
  1                  Retry the operation.
  2                  Terminate the process that encountered the error.
  3                  Fail the function (an error code is returned to the
                     requesting process). Versions 3.1 and later only.
  ──────────────────────────────────────────────────────────────────────────

  The critical-error handler should preserve all other registers and must
  not modify the device-driver header pointed to by BP:SI. A skeleton
  example of a critical-error handler is shown in Figure 8-9.

  ──────────────────────────────────────────────────────────────────────────
                                  ; prompt message used by
                                  ; critical-error handler
  prompt  db      cr,lf,'Critical Error Occurred: '
          db      'Abort, Retry, Ignore, Fail? $'

  keys    db      'aArRiIfF'      ; possible user response keys
  keys_len equ $-keys             ; (both cases of each allowed)

  codes   db      2,2,1,1,0,0,3,3 ; codes returned to MS-DOS kernel
                                  ; for corresponding response keys

  ;
  ; This code is executed during program's initialization
  ; to install the new critical-error handler.
  ;
          .
          .
          .
          push    ds              ; save our data segment

          mov     dx,seg int24    ; DS:DX = handler address
          mov     ds,dx
          mov     dx,offset int24
          mov     ax,2524h        ; function 25h = set vector
          int     21h             ; transfer to MS-DOS

          pop     ds              ; restore data segment
          .
          .
          .
  ;
  ; This is the replacement critical-error handler. It
  ; prompts the user for Abort, Retry, Ignore, or Fail, and
  ; returns the appropriate code to the MS-DOS kernel.
  ;

  int24   proc    far             ; entered from MS-DOS kernel

          push    bx              ; save registers
          push    cx
          push    dx
          push    si
          push    di
          push    bp
          push    ds
          push    es
  int24a: mov     ax,seg prompt   ; display prompt for user
          mov     ds,ax           ; using function 9 (print string
          mov     es,ax           ; terminated by $ character)
          mov     dx,offset prompt
          mov     ah,9
          int     21h

          mov     ah,1            ; get user's response
          int     21h             ; function 1 = read one character

          mov     di,offset keys  ; look up code for response key
          mov     cx,keys_len
          cld
          repne scasb
          jnz     int24a          ; prompt again if bad response

                                  ; set AL = action code for MS-DOS
                                  ; according to key that was entered:
                                  ; 0 = ignore, 1 = retry, 2 = abort,
                                  ; 3 = fail
          mov     al,[di+keys_len-1]

          pop     es              ; restore registers
          pop     ds
          pop     bp
          pop     di
          pop     si
          pop     dx
          pop     cx
          pop     bx
          iret                    ; exit critical-error handler

  int24   endp
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-9.  A skeleton example of a replacement critical-error handler.
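
  The table lookup at the heart of Figure 8-9 (the REPNE SCASB over keys
  followed by the indexed load from codes) can be restated in C as
  follows. The function name action_code is hypothetical; the two tables
  are copied from the figure.

```c
#include <stddef.h>
#include <string.h>

/* response keys and the action codes returned to MS-DOS for them */
static const char keys[] = "aArRiIfF";
static const unsigned char codes[] = { 2, 2, 1, 1, 0, 0, 3, 3 };

/* Return the action code (0 = ignore, 1 = retry, 2 = abort, 3 = fail)
   for a response key, or -1 if the key is not recognized, in which
   case the handler would prompt again. */
int action_code(char key)
{
    const char *p;

    if (key == '\0')              /* strchr would match the terminator */
        return -1;
    p = strchr(keys, key);
    return (p != NULL) ? codes[p - keys] : -1;
}
```

  Keeping the keys and codes in parallel tables, as the figure does,
  means that a new response key can be supported by adding one entry to
  each table.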


Example Programs: DUMP.ASM and DUMP.C

  The programs DUMP.ASM (Figure 8-10) and DUMP.C (Figure 8-11) are
  parallel examples of the use of the handle file and record functions. The
  assembly-language version, in particular, illustrates features of a
  well-behaved MS-DOS utility:

  ■  The program checks the version of MS-DOS to ensure that all the
     functions it is going to use are really available.

  ■  The program parses the drive, path, and filename from the command tail
     in the program segment prefix.

  ■  The program uses buffered I/O for speed.

  ■  The program sends error messages to the standard error device.

  ■  The program sends normal program output to the standard output device,
     so that the dump output appears by default on the system console but
     can be redirected to other character devices (such as the line printer)
     or to a file.

  The same features are incorporated into the C version of the program, but
  some of them are taken care of behind the scenes by the C runtime library.
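
  As a preview of the command-tail parsing performed by the assembly
  version, here is a minimal C sketch of the same scanning logic. It
  assumes the caller passes a pointer to the first character of the
  command tail (the byte after the length byte at offset 80H of the
  program segment prefix) and that the tail ends with a carriage return.
  The names tail_argc and tail_argv are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int is_sep(char c) { return c == ' ' || c == '\t'; }
static int is_end(char c) { return c == '\r' || c == '\0'; }

/* Count the arguments. The program name counts as argument 0, so the
   result is always at least 1. */
int tail_argc(const char *tail)
{
    int count = 1, inside = 0;

    assert(tail != NULL);
    for (; !is_end(*tail); tail++) {
        if (is_sep(*tail))
            inside = 0;               /* blank or tab: outside argument */
        else if (!inside) {
            inside = 1;               /* first character of an argument */
            count++;
        }
    }
    return count;
}

/* Find argument n (n >= 1). Returns its address and stores its length
   in *len, or returns NULL with *len = 0 if there is no such argument. */
const char *tail_argv(const char *tail, int n, size_t *len)
{
    int count = 0, inside = 0;

    assert(tail != NULL && len != NULL && n >= 1);
    for (; !is_end(*tail); tail++) {
        if (is_sep(*tail)) { inside = 0; continue; }
        if (inside) continue;
        inside = 1;
        if (++count == n) {           /* found it, now measure it */
            const char *end = tail;
            while (!is_end(*end) && !is_sep(*end))
                end++;
            *len = (size_t)(end - tail);
            return tail;
        }
    }
    *len = 0;
    return NULL;
}
```

  The argument is returned as an address and a length rather than as a
  null-terminated string, so the caller must copy it and append the null
  byte before passing it to an MS-DOS function that expects an ASCIIZ
  string.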

  ──────────────────────────────────────────────────────────────────────────
          name    dump
          page    55,132
          title   DUMP--display file contents

  ;
  ;  DUMP--Display contents of file in hex and ASCII
  ;
  ;  Build:   C>MASM DUMP;
  ;           C>LINK DUMP;
  ;
  ;  Usage:   C>DUMP unit:\path\filename.ext [ >device ]
  ;
  ;  Copyright (C) 1988 Ray Duncan
  ;

  cr      equ     0dh             ; ASCII carriage return
  lf      equ     0ah             ; ASCII line feed
  tab     equ     09h             ; ASCII tab code
  blank   equ     20h             ; ASCII space code

  cmd     equ     80h             ; buffer for command tail

  blksize equ     16              ; input file record size

  stdin   equ     0               ; standard input handle
  stdout  equ     1               ; standard output handle
  stderr  equ     2               ; standard error handle
  _TEXT   segment word public 'CODE'

          assume  cs:_TEXT,ds:_DATA,es:_DATA,ss:STACK

  dump    proc    far             ; entry point from MS-DOS

          push    ds              ; save DS:0000 for final
          xor     ax,ax           ; return to MS-DOS, in case
          push    ax              ; function 4ch can't be used

          mov     ax,_DATA        ; make our data segment
          mov     ds,ax           ; addressable via DS register

                                  ; check MS-DOS version
          mov     ax,3000h        ; function 30h = get version
          int     21h             ; transfer to MS-DOS
          cmp     al,2            ; major version 2 or later?
          jae     dump1           ; yes, proceed

                                  ; if MS-DOS 1.x, display
                                  ; error message and exit
          mov     dx,offset msg3  ; DS:DX = message address
          mov     ah,9            ; function 9 = print string
          int     21h             ; transfer to MS-DOS
          ret                     ; then exit the old way

  dump1:                          ; check if filename present
          mov     bx,offset cmd   ; ES:BX = command tail
          call    argc            ; count command arguments
          cmp     ax,2            ; are there 2 arguments?
          je      dump2           ; yes, proceed

                                  ; missing filename, display
                                  ; error message and exit
          mov     dx,offset msg2  ; DS:DX = message address
          mov     cx,msg2_len     ; CX = message length
          jmp     dump9           ; go display it

  dump2:                          ; get address of filename
          mov     ax,1            ; AX = argument number
                                  ; ES:BX still = command tail
          call    argv            ; returns ES:BX = address,
                                  ; and AX = length

          mov     di,offset fname ; copy filename to buffer
          mov     cx,ax           ; CX = length
  dump3:  mov     al,es:[bx]      ; copy one byte
          mov     [di],al
          inc     bx              ; bump string pointers
          inc     di
          loop    dump3           ; loop until string done
          mov     byte ptr [di],0 ; add terminal null byte

          mov     ax,ds           ; make our data segment
          mov     es,ax           ; addressable by ES too
                                  ; now open the file
          mov     ax,3d00h        ; function 3dh = open file
                                  ; mode 0 = read only
          mov     dx,offset fname ; DS:DX = filename
          int     21h             ; transfer to MS-DOS
          jnc     dump4           ; jump, open successful

                                  ; open failed, display
                                  ; error message and exit
          mov     dx,offset msg1  ; DS:DX = message address
          mov     cx,msg1_len     ; CX = message length
          jmp     dump9           ; go display it

  dump4:  mov     fhandle,ax      ; save file handle

  dump5:                          ; read block of file data
          mov     bx,fhandle      ; BX = file handle
          mov     cx,blksize      ; CX = record length
          mov     dx,offset fbuff ; DS:DX = buffer
          mov     ah,3fh          ; function 3fh = read
          int     21h             ; transfer to MS-DOS

          mov     flen,ax         ; save actual length
          cmp     ax,0            ; end of file reached?
          jne     dump6           ; no, proceed

          cmp     word ptr fptr,0 ; was this the first read?
          jne     dump8           ; no, exit normally

                                  ; display empty file
                                  ; message and exit
          mov     dx,offset msg4  ; DS:DX = message address
          mov     cx,msg4_len     ; CX = length
          jmp     dump9           ; go display it
  dump6:                          ; display heading at
                                  ; each 128-byte boundary
          test    fptr,07fh       ; time for a heading?
          jnz     dump7           ; no, proceed

                                  ; display a heading
          mov     dx,offset hdg   ; DS:DX = heading address
          mov     cx,hdg_len      ; CX = heading length
          mov     bx,stdout       ; BX = standard output
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS

  dump7:  call    conv            ; convert binary record
                                  ; to formatted ASCII

                                  ; display formatted output
          mov     dx,offset fout  ; DS:DX = output address
          mov     cx,fout_len     ; CX = output length
          mov     bx,stdout       ; BX = standard output
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS
          jmp     dump5           ; go get another record

  dump8:                          ; close input file
          mov     bx,fhandle      ; BX = file handle
          mov     ah,3eh          ; function 3eh = close
          int     21h             ; transfer to MS-DOS

          mov     ax,4c00h        ; function 4ch = terminate,
                                  ; return code = 0
          int     21h             ; transfer to MS-DOS

  dump9:                          ; display message on
                                  ; standard error device
                                  ; DS:DX = message address
                                  ; CX = message length
          mov     bx,stderr       ; standard error handle
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS

          mov     ax,4c01h        ; function 4ch = terminate,
                                  ; return code = 1
          int     21h             ; transfer to MS-DOS

  dump    endp
  conv    proc    near            ; convert block of data
                                  ; from input file

          mov     di,offset fout  ; clear output format
          mov     cx,fout_len-2   ; area to blanks
          mov     al,blank
          rep stosb

          mov     di,offset fout  ; convert file offset
          mov     ax,fptr         ; to ASCII for output
          call    w2a

          mov     bx,0            ; init buffer pointer

  conv1:  mov     al,[fbuff+bx]   ; fetch byte from buffer
          mov     di,offset foutb ; point to output area

                                  ; format ASCII part...
                                  ; store '.' as default
          mov     byte ptr [di+bx],'.'

          cmp     al,blank        ; in range 20h-7eh?
          jb      conv2           ; jump, not printable

          cmp     al,7eh          ; in range 20h-7eh?
          ja      conv2           ; jump, not printable

          mov     [di+bx],al      ; store ASCII character

  conv2:                          ; format hex part...
          mov     di,offset fouta ; point to output area
          add     di,bx           ; base addr + (offset*3)
          add     di,bx
          add     di,bx
          call    b2a             ; convert byte to hex

          inc     bx              ; advance through record
          cmp     bx,flen         ; entire record converted?
          jne     conv1           ; no, get another byte

                                  ; update file pointer
          add     word ptr fptr,blksize

          ret

  conv    endp
  w2a     proc    near            ; convert word to hex ASCII
                                  ; call with AX = value
                                  ;           DI = addr for string
                                  ; returns AX, DI, CX destroyed

          push    ax              ; save copy of value
          mov     al,ah
          call    b2a             ; convert upper byte

          pop     ax              ; get back copy
          call    b2a             ; convert lower byte
          ret

  w2a     endp

  b2a     proc    near            ; convert byte to hex ASCII
                                  ; call with AL = binary value
                                  ;           DI = addr for string
                                  ; returns   AX, DI, CX modified

          sub     ah,ah           ; clear upper byte
          mov     cl,16
          div     cl              ; divide byte by 16
          call    ascii           ; quotient becomes the first
          stosb                   ; ASCII character
          mov     al,ah
          call    ascii           ; remainder becomes the
          stosb                   ; second ASCII character
          ret

  b2a     endp

  ascii   proc    near            ; convert value 0-0fh in AL
                                  ; into "hex ASCII" character

          add     al,'0'          ; offset to range 0-9
          cmp     al,'9'          ; is it > 9?
          jle     ascii2          ; no, jump
          add     al,'A'-'9'-1    ; offset to range A-F

  ascii2: ret                     ; return AL = ASCII char

  ascii   endp

  argc    proc    near            ; count command-line arguments
                                  ; call with ES:BX = command line
                                  ; returns   AX = argument count
          push    bx              ; save original BX and CX
          push    cx              ; for later
          mov     ax,1            ; force count >= 1

  argc1:  mov     cx,-1           ; set flag = outside argument

  argc2:  inc     bx              ; point to next character
          cmp     byte ptr es:[bx],cr
          je      argc3           ; exit if carriage return
          cmp     byte ptr es:[bx],blank
          je      argc1           ; outside argument if ASCII blank
          cmp     byte ptr es:[bx],tab
          je      argc1           ; outside argument if ASCII tab

                                  ; otherwise not blank or tab,
          jcxz    argc2           ; jump if already inside argument

          inc     ax              ; else found argument, count it
          not     cx              ; set flag = inside argument
          jmp     argc2           ; and look at next character

  argc3:  pop     cx              ; restore original BX and CX
          pop     bx
          ret                     ; return AX = argument count

  argc    endp

  argv    proc    near            ; get address & length of
                                  ; command line argument
                                  ; call with ES:BX = command line
                                  ;           AX    = argument #
                                  ; returns   ES:BX = address
                                  ;           AX    = length

          push    cx              ; save original CX and DI
          push    di

          or      ax,ax           ; is it argument 0?
          jz      argv8           ; yes, jump to get program name

          xor     ah,ah           ; initialize argument counter

  argv1:  mov     cx,-1           ; set flag = outside argument
  argv2:  inc     bx              ; point to next character
          cmp     byte ptr es:[bx],cr
          je      argv7           ; exit if carriage return
          cmp     byte ptr es:[bx],blank
          je      argv1           ; outside argument if ASCII blank
          cmp     byte ptr es:[bx],tab
          je      argv1           ; outside argument if ASCII tab

                                  ; if not blank or tab...
          jcxz    argv2           ; jump if already inside argument

          inc     ah              ; else count arguments found
          cmp     ah,al           ; is this the one we're looking for?
          je      argv4           ; yes, go find its length
          not     cx              ; no, set flag = inside argument
          jmp     argv2           ; and look at next character

  argv4:                          ; found desired argument, now
                                  ; determine its length...
          mov     ax,bx           ; save param starting address

  argv5:  inc     bx              ; point to next character
          cmp     byte ptr es:[bx],cr
          je      argv6           ; found end if carriage return
          cmp     byte ptr es:[bx],blank
          je      argv6           ; found end if ASCII blank
          cmp     byte ptr es:[bx],tab
          jne     argv5           ; keep scanning unless ASCII tab

  argv6:  xchg    bx,ax           ; set ES:BX = argument address
          sub     ax,bx           ; and AX = argument length
          jmp     argvx           ; return to caller

  argv7:  xor     ax,ax           ; set AX = 0, argument not found
          jmp     argvx           ; return to caller

  argv8:                          ; special handling for argv = 0
          mov     ax,3000h        ; check if DOS 3.0 or later
          int     21h             ; (force AL = 0 in case DOS 1)
          cmp     al,3
          jb      argv7           ; DOS 1 or 2, return null param
          mov     es,es:[2ch]     ; get environment segment from PSP
          xor     di,di           ; find the program name by
          xor     al,al           ; first skipping over all the
          mov     cx,-1           ; environment variables...
          cld
  argv9:  repne scasb             ; scan for double null (can't use
          scasb                   ; SCASW since might be odd addr)
          jne     argv9           ; loop if it was a single null
          add     di,2            ; skip count word in environment
          mov     bx,di           ; save program name address
          mov     cx,-1           ; now find its length...
          repne scasb             ; scan for another null byte
          not     cx              ; convert CX to length
          dec     cx
          mov     ax,cx           ; return length in AX

  argvx:                          ; common exit point
          pop     di              ; restore original CX and DI
          pop     cx
          ret                     ; return to caller

  argv    endp

  _TEXT    ends

  _DATA   segment word public 'DATA'

  fname   db      64 dup (0)      ; buffer for input filespec

  fhandle dw      0               ; token from PCDOS for input file

  flen    dw      0               ; actual length read

  fptr    dw      0               ; relative address in file

  fbuff   db      blksize dup (?) ; data from input file

  fout    db      'nnnn'          ; formatted output area
          db      blank,blank
  fouta   db      16 dup ('nn',blank)
          db      blank
  foutb   db      16 dup (blank),cr,lf
  fout_len equ    $-fout

  hdg     db      cr,lf           ; heading for each 128 bytes
          db      7 dup (blank)   ; of formatted output
          db      '0  1  2  3  4  5  6  7  '
          db      '8  9  A  B  C  D  E  F',cr,lf
  hdg_len equ     $-hdg
  msg1    db      cr,lf
          db      'dump: file not found'
          db      cr,lf
  msg1_len equ    $-msg1

  msg2    db      cr,lf
          db      'dump: missing file name'
          db      cr,lf
  msg2_len equ    $-msg2

  msg3    db      cr,lf
          db      'dump: wrong MS-DOS version'
          db      cr,lf,'$'

  msg4    db      cr,lf
          db      'dump: empty file'
          db      cr,lf
  msg4_len equ    $-msg4

  _DATA   ends

  STACK   segment para stack 'STACK'

          db      64 dup (?)

  STACK   ends

          end     dump
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-10.  The assembly-language version: DUMP.ASM.
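
  The fixed output layout built by the conv routine (the fout area in the
  data segment) can be summarized by the following C sketch. The offsets
  are derived from the declarations in Figure 8-10: four hex digits, two
  blanks, sixteen groups of two hex digits and a blank, one more blank,
  sixteen ASCII characters, and a carriage return and line feed. The
  function name format_record and the constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    REC_LEN  = 16,   /* bytes per record (blksize)            */
    HEX_AT   = 6,    /* offset of the hex area (fouta)        */
    ASCII_AT = 55,   /* offset of the ASCII area (foutb)      */
    LINE_LEN = 73    /* total length of the line (fout_len)   */
};

static const char hexdigits[] = "0123456789ABCDEF";

/* Store the two hex digits of b at dst. */
static void put_hex(unsigned b, char *dst)
{
    dst[0] = hexdigits[(b >> 4) & 0x0F];
    dst[1] = hexdigits[b & 0x0F];
}

/* Format up to 16 bytes into line[LINE_LEN]. The line ends with CR and
   LF and is not null-terminated. Only the low 16 bits of the offset are
   shown, as in the assembly version. */
void format_record(unsigned offset, const unsigned char *rec,
                   size_t len, char *line)
{
    size_t i;

    assert(rec != NULL && line != NULL && len <= REC_LEN);
    memset(line, ' ', LINE_LEN - 2);          /* clear area to blanks  */
    line[LINE_LEN - 2] = '\r';
    line[LINE_LEN - 1] = '\n';
    put_hex((offset >> 8) & 0xFF, line);      /* file offset           */
    put_hex(offset & 0xFF, line + 2);
    for (i = 0; i < len; i++) {
        put_hex(rec[i], line + HEX_AT + 3 * i);         /* hex part    */
        line[ASCII_AT + i] =                            /* ASCII part  */
            (rec[i] >= 0x20 && rec[i] <= 0x7E) ? (char)rec[i] : '.';
    }
}
```

  Because the whole line is blanked first, a short final record leaves
  the unused hex and ASCII positions as spaces.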

  ──────────────────────────────────────────────────────────────────────────
  /*
      DUMP.C      Displays the binary contents of a file in
                  hex and ASCII on the standard output device.

      Compile:    C>CL DUMP.C

      Usage:      C>DUMP unit:\path\filename.ext

      Copyright (C) 1988 Ray Duncan
  */

  #include <stdio.h>
  #include <io.h>
  #include <fcntl.h>
  #include <stdlib.h>
  #define REC_SIZE 16               /* input file record size    */

  main(int argc, char *argv[])
  {
      int fd;                       /* input file handle         */
      int status = 0;               /* status from file read     */
      long fileptr = 0L;            /* current file byte offset  */
      char filebuf[REC_SIZE];       /* data from file            */

      if(argc != 2)                 /* abort if missing filename */
      {   fprintf(stderr,"\ndump: wrong number of parameters\n");
          exit(1);
      }

                                    /* open file in binary mode,
                                       abort if open fails       */
      if((fd = open(argv[1],O_RDONLY | O_BINARY) ) == -1)
      {   fprintf(stderr, "\ndump: can't find file %s \n", argv[1]);
          exit(1);
      }

                                    /* read and dump records
                                       until end of file         */
      while((status = read(fd,filebuf,REC_SIZE) ) != 0)
      {   dump_rec(filebuf, fileptr, status);
          fileptr += REC_SIZE;
      }

      close(fd);                    /* close input file          */
      exit(0);                      /* return success code       */
  }

  /*
      Display record (16 bytes) in hex and ASCII on standard output
  */

  dump_rec(char *filebuf, long fileptr, int length)
  {
      int i;                        /* index to current record   */

      if(fileptr % 128 == 0)        /* display heading if needed */
          printf("\n\n       0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F");

      printf("\n%04lX ",fileptr);   /* display file offset       */

                                    /* display hex equivalent of
                                       each byte from file       */
      for(i = 0; i < length; i++)
          printf(" %02X", (unsigned char) filebuf[i]);

      if(length != 16)              /* spaces if partial record  */
          for (i=0; i<(16-length); i++) printf("   ");

                                    /* display ASCII equivalent of
                                       each byte from file       */
      printf("  ");
      for(i = 0; i < length; i++)
      {   if(filebuf[i] < 32 || filebuf[i] > 126) putchar('.');
          else putchar(filebuf[i]);
      }
  }
  ──────────────────────────────────────────────────────────────────────────

  Figure 8-11.  The C version: DUMP.C.

  The assembly-language version of the DUMP program contains a number of
  subroutines that you may find useful in your own programming efforts.
  These include the following:

  Subroutine  Action
  ──────────────────────────────────────────────────────────────────────────
  argc        Returns the number of command-line arguments.
  argv        Returns the address and length of a particular command-line
              argument.
  w2a         Converts a binary word (16 bits) into hex ASCII for output.
  b2a         Converts a binary byte (8 bits) into hex ASCII for output.
  ascii       Converts 4 bits into a single hex ASCII character.
  ──────────────────────────────────────────────────────────────────────────
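
  For comparison, the three conversion routines can be written in C with
  the same arithmetic that the assembly-language versions use. This is a
  sketch, not a replacement for the listing; the function names are
  hypothetical. Each routine returns the advanced output pointer, which
  mirrors the way STOSB leaves DI pointing past the stored characters.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a value 0-0FH into one hex ASCII character (ascii). */
char nibble_to_ascii(unsigned v)
{
    char c;

    assert(v <= 0x0F);
    c = (char)('0' + v);              /* offset to range 0-9        */
    if (c > '9')
        c = (char)(c + 'A' - '9' - 1); /* offset to range A-F       */
    return c;
}

/* Convert a byte into two hex ASCII characters (b2a). */
char *byte_to_ascii(unsigned char b, char *dst)
{
    assert(dst != NULL);
    *dst++ = nibble_to_ascii(b / 16u); /* quotient is first digit   */
    *dst++ = nibble_to_ascii(b % 16u); /* remainder is second digit */
    return dst;
}

/* Convert a 16-bit word into four hex ASCII characters (w2a). */
char *word_to_ascii(unsigned w, char *dst)
{
    dst = byte_to_ascii((unsigned char)((w >> 8) & 0xFF), dst);
    return byte_to_ascii((unsigned char)(w & 0xFF), dst);
}
```

  None of the routines writes a terminating null byte, so the caller
  decides how the converted characters are delimited.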

  It is interesting to compare these two equivalent programs. The C program
  contains only 77 lines, whereas the assembly-language program has 436
  lines. Clearly, the C source code is less complex and easier to maintain.
  On the other hand, if size and efficiency are important, the DUMP.EXE file
  generated by the C compiler is 8563 bytes, whereas the assembly-language
  DUMP.EXE file is only 1294 bytes and runs twice as fast as the C program.



────────────────────────────────────────────────────────────────────────────
Chapter 9  Volumes and Directories

  Each file in an MS-DOS system is uniquely identified by its name and its
  location. The location, in turn, has two components: the logical drive
  that contains the file and the directory on that drive where the filename
  can be found.

  Logical drives are specified by a single letter followed by a colon (for
  example, A:). The number of logical drives in a system is not necessarily
  the same as the number of physical drives; for example, it is common for
  large fixed-disk drives to be divided into two or more logical drives. The
  key aspect of a logical drive is that it contains a self-sufficient file
  system; that is, it contains one or more directories, zero or more
  complete files, and all the information needed to locate the files and
  directories and to determine which disk space is free and which is already
  in use.

  Directories are simply lists or catalogs. Each entry in a directory
  consists of the name, size, starting location, attributes, and last
  modification date and time of a file or another directory that the disk
  contains. The detailed information about the location of every block of
  data assigned to a file or directory is in a separate control area on the
  disk called the file allocation table (FAT). (See Chapter 10 for a
  detailed discussion of the internal format of directories and the FAT.)

  Every disk potentially has two distinct kinds of directories: the root
  directory and all other directories. The root directory is always present
  and has a maximum number of entries, determined when the disk is
  formatted; this number cannot be changed. The subdirectories of the root
  directory, which may or may not be present on a given disk, can be nested
  to any level and can grow to any size (Figure 9-1). This is the
  hierarchical, or tree, directory structure referred to in earlier
  chapters. Every directory has a name, except for the root directory, which
  is designated by a single backslash (\) character.

  MS-DOS keeps track of a "current drive" for the system and uses this drive
  when a file specification does not include an explicit drive code.
  Similarly, MS-DOS maintains a "current directory" for each logical drive.
  You can select any particular directory on a drive by naming in order──
  either from the root directory or relative to the current directory──the
  directories that lead to its location in the tree structure. Such a list
  of directories, separated by backslash delimiters, is called a path. When
  a complete path from the root directory is prefixed by a logical drive
  code and followed by a filename and extension, the resulting string is a
  fully qualified filename and unambiguously specifies a file.

                           ┌────────────┐
                           │   Drive    │
                           │ identifier │
                           └─────┬──────┘
                                 │
                         ┌───────┴────────┐
                         │ Root directory │
                         │ (volume label) │
                         └─┬──┬──┬───┬──┬─┘
       ┌───────────────────┘  │  │   │  └───────────────────┐
       │          ┌───────────┘  │   └───────────┐          │
  ┌────┴───┐ ┌────┴──────┐   ┌───┴────┐   ┌──────┴────┐ ┌───┴────┐
  │ File A │ │ Directory │   │ File B │   │ Directory │ │ File C │
  └────────┘ └─┬───────┬─┘   └────────┘   └─┬─────────┘ └─┬──────┘
               │       │                    │             │
               │       │                    │             │
         ┌─────┘       │                    │             │
         │             │                    │             │
    ┌────┴──────┐   ┌──┴─────┐        ┌─────┴──┐      ┌───┴────┐
    │ Directory │   │ File D │        │ File E │      │ File F │
    └───────────┘   └────────┘        └────────┘      └────────┘

  Figure 9-1.  An MS-DOS file-system structure.


Drive and Directory Control

  You can examine, select, create, and delete disk directories interactively
  with the DIR, CHDIR (CD), MKDIR (MD), and RMDIR (RD) commands. You can
  select a new current drive by entering the letter of the desired drive,
  followed by a colon. MS-DOS provides the following Int 21H functions to
  give application programs similar control over drives and directories:

  Function                Action
  ──────────────────────────────────────────────────────────────────────────
  0EH                     Select current drive.
  19H                     Get current drive.
  39H                     Create directory.
  3AH                     Remove directory.
  3BH                     Select current directory.
  47H                     Get current directory.
  ──────────────────────────────────────────────────────────────────────────

  The two functions that deal with disk drives accept or return a binary
  drive code──0 represents drive A, 1 represents drive B, and so on. This
  differs from most other MS-DOS functions, which use 0 to indicate the
  current drive, 1 for drive A, and so on.
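
  The two numbering conventions are easy to confuse, so it can help to
  isolate the conversions in small helper routines. The following C
  sketch uses hypothetical function names and assumes drive letters A
  through Z.

```c
#include <assert.h>
#include <ctype.h>

/* Drive letter to the code used by Functions 0EH and 19H (0 = A). */
unsigned drive_code_from_letter(char letter)
{
    int up = toupper((unsigned char)letter);

    assert(up >= 'A' && up <= 'Z');
    return (unsigned)(up - 'A');
}

/* Code used by Functions 0EH and 19H (0 = A) to drive letter. */
char drive_letter_from_code(unsigned code)
{
    assert(code < 26);
    return (char)('A' + (int)code);
}

/* Convert a 0 = A code to the convention used by most other
   functions, where 0 means the current drive and 1 means drive A. */
unsigned default_based_code(unsigned code)
{
    assert(code < 26);
    return code + 1;
}
```

  For example, the value returned by Function 19H must be incremented
  before it is passed to a function that treats 0 as the current drive.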

  The first three directory functions in the preceding list require an
  ASCIIZ string that describes the path to the desired directory. As with
  the handle-based file open and create functions, the address of the ASCIIZ
  string is passed in the DS:DX registers. On return, the carry flag is
  clear if the function succeeds or set if the function fails, with an
  error code in the AX register. The directory functions can fail for a
  variety of reasons, but the most common cause of an error is that some
  element of the indicated path does not exist.

  The last function in the preceding list, Int 21H Function 47H, allows you
  to obtain an ASCIIZ path for the current directory on the specified or
  default drive. MS-DOS supplies the path string without the drive
  identifier or a leading backslash. Int 21H Function 47H is most commonly
  used with Int 21H Function 19H to build fully qualified filenames. Such
  filenames are desirable because they remain valid if the user changes the
  current drive or directory.
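
  The string assembly involved can be sketched in C as follows. The
  sketch assumes that the drive code (0 = A, as returned by Function 19H)
  and the current directory string (as returned by Function 47H, without
  drive identifier or leading backslash, and empty for the root
  directory) have already been obtained; the function name qualify_name
  is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build drive:\path\name in out. Returns 0 on success or -1 if the
   result does not fit in outsz bytes. */
int qualify_name(unsigned drive, const char *curdir, const char *name,
                 char *out, size_t outsz)
{
    int n;

    assert(drive < 26 && curdir != NULL && name != NULL && out != NULL);
    n = snprintf(out, outsz, "%c:\\%s%s%s",
                 'A' + (int)drive,                /* drive identifier   */
                 curdir,                          /* path from the root */
                 strlen(curdir) > 0 ? "\\" : "",  /* separator if any   */
                 name);
    return (n >= 0 && (size_t)n < outsz) ? 0 : -1;
}
```

  The separator before the filename is omitted when the current
  directory is the root, so that the result never contains two
  consecutive backslashes.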

  Section 2 of this book, "MS-DOS Functions Reference," gives detailed
  information on the drive and directory control functions.

Searching Directories

  When you request an open operation on a file, you are implicitly
  performing a search of a directory. MS-DOS examines each entry of the
  directory to find a match for the filename you have given as an argument;
  if the file is found, MS-DOS copies certain information from the directory
  into a data structure that it can use to control subsequent read or write
  operations to the file. Thus, if you wish to test for the existence of a
  specific file, you need only perform an open operation and observe whether
  it is successful. (If it is, you should, of course, perform a subsequent
  close operation to avoid needless expenditure of handles.)

  Sometimes you may need to perform more elaborate searches of a disk
  directory. Perhaps you wish to find all the files with a certain
  extension, a file with a particular attribute, or the names of the
  subdirectories of a certain directory. The locations of a disk's
  directories and the details of their entries are necessarily hardware
  dependent (for example, interpretation of the field describing the
  starting location of a file depends upon the physical disk format).
  Nevertheless, MS-DOS provides functions that allow a program to examine a
  disk directory in a hardware-independent fashion.

  In order to search a disk directory successfully, you must understand two
  types of MS-DOS search services. The first type is the "search for first"
  function, which accepts a file specification──possibly including wildcard
  characters──and looks for the first matching file in the directory of
  interest. If it finds a match, the function fills a buffer owned by the
  requesting program with information about the file; if it does not find a
  match, it returns an error flag.

  A program can call the second type of search service, called "search for
  next," only after a successful "search for first." If the file
  specification that was originally passed to "search for first" included
  wildcard characters and at least one matching file was present, the
  program can call "search for next" as many times as necessary to find all
  additional matching files. Like "search for first," "search for next"
  returns information about the matched files in a buffer designated by the
  requesting program. When it can find no more matching files, "search for
  next" returns an error flag.

  As with nearly every other operation, MS-DOS provides two parallel sets of
  directory-searching services:

  Action             FCB function      Handle function
  ──────────────────────────────────────────────────────────────────────────
  Search for first   11H               4EH
  Search for next    12H               4FH
  ──────────────────────────────────────────────────────────────────────────

  The FCB directory functions allow searches to match a filename and
  extension, both possibly containing wildcard characters, within the
  current directory for the specified or current drive. The handle directory
  functions, on the other hand, allow a program to perform searches within
  any directory on any drive, regardless of the current directory.

  Searches that use normal FCBs find only normal files. Searches that use
  extended FCBs, or the handle-type functions, can be qualified with file
  attributes. The attribute bits relevant to searches are as follows:

  Bit                      Significance
  ──────────────────────────────────────────────────────────────────────────
  0                        Read-only file
  1                        Hidden file
  2                        System file
  3                        Volume label
  4                        Directory
  5                        Archive needed (set when file modified)
  ──────────────────────────────────────────────────────────────────────────

  The remaining bits of a search function's attribute parameter should be
  zero. When any of the preceding attribute bits are set, the search
  function returns all normal files plus any files with the specified
  attributes, except in the case of the volume-label attribute bit, which
  receives special treatment as described later in this chapter. Note that
  by setting bit 4 you can include directories in a search, exactly as
  though they were files.
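
  The selection rule can be modeled in a few lines of C. This is a sketch
  of the behavior described above, not MS-DOS code: the names are
  hypothetical, and the model assumes that the read-only and archive bits
  never exclude a file from a search (that is, such files count as
  "normal"), which is how the search functions are generally documented to
  behave.

```c
#include <assert.h>

/* Attribute bits from the preceding table. */
enum {
    ATTR_READONLY  = 0x01,
    ATTR_HIDDEN    = 0x02,
    ATTR_SYSTEM    = 0x04,
    ATTR_VOLUME    = 0x08,
    ATTR_DIRECTORY = 0x10,
    ATTR_ARCHIVE   = 0x20
};

/* Model of the search rule: returns 1 if a directory entry with
   attribute byte entry_attr would be reported by a search that was
   started with attribute parameter search_attr, otherwise 0. */
int search_accepts(unsigned search_attr, unsigned entry_attr)
{
    unsigned special = ATTR_HIDDEN | ATTR_SYSTEM | ATTR_DIRECTORY;

    assert((search_attr & ~0x3Fu) == 0);   /* remaining bits must be zero */

    if (search_attr & ATTR_VOLUME)         /* volume-label search: */
        return (entry_attr & ATTR_VOLUME) != 0;   /* labels only   */
    if (entry_attr & ATTR_VOLUME)          /* labels never match   */
        return 0;                          /* an ordinary search   */

    /* Hidden, system, and directory entries are reported only when
       the corresponding bit was requested; everything else passes. */
    return (entry_attr & special & ~search_attr) == 0;
}
```

  For example, search_accepts(ATTR_DIRECTORY, 0) is 1 because normal files
  are always included, whereas search_accepts(0, ATTR_HIDDEN) is 0.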

  Both the FCB and handle directory-searching functions require that the
  disk transfer area address be set (with Int 21H Function 1AH), before the
  call to "search for first," to point to a working buffer for use by
  MS-DOS. The DTA address should not be changed between calls to "search for
  first" and "search for next." When it finds a matching file, MS-DOS places
  the information about the file in the buffer and then inspects the buffer
  on the next "search for next" call, to determine where to resume the
  search. The format of the data returned in the buffer is different for the
  FCB and handle functions, so read the detailed descriptions in Section 2
  of this book, "MS-DOS Functions Reference," before attempting to interpret
  the buffer contents.

  Figures 9-2 and 9-3 provide equivalent examples of searches for all
  files in a given directory that have the .ASM extension, one example using
  the FCB directory functions (Int 21H Functions 11H and 12H) and the
  other using the handle functions (Int 21H Functions 4EH and 4FH). (Both
  programs use the handle write function with the standard output handle to
  display the matched filenames, to avoid introducing tangential differences
  in the listings.)

  ──────────────────────────────────────────────────────────────────────────
  start:                          ; set DTA address for buffer
                                  ; used by search functions
          mov     dx,seg buff     ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          mov     ah,1ah          ; function 1ah = set DTA address
          int     21h             ; transfer to MS-DOS
                                  ; search for first match...
          mov     dx,offset fcb   ; DS:DX = FCB address
          mov     ah,11h          ; function 11h = search for first
          int     21h             ; transfer to MS-DOS
          or      al,al           ; any matches at all?
          jnz     exit            ; no, quit

  disp:                           ; go to a new line...
          mov     dx,offset crlf  ; DS:DX = CR-LF string
          mov     cx,2            ; CX = string length
          mov     bx,1            ; BX = standard output handle
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS

                                  ; display matching file
          mov     dx,offset buff+1 ; DS:DX = filename
          mov     cx,11           ; CX = length
          mov     bx,1            ; BX = standard output handle
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS

                                  ; search for next match...
          mov     dx,offset fcb   ; DS:DX = FCB address
          mov     ah,12h          ; function 12h = search for next
          int     21h             ; transfer to MS-DOS
          or      al,al           ; any more matches?
          jz      disp            ; yes, go show filename

  exit:                           ; final exit point
          mov     ax,4c00h        ; function 4ch = terminate,
                                  ; return code = 0
          int     21h             ; transfer to MS-DOS

          .
          .
          .

  crlf    db      0dh,0ah         ; ASCII carriage return-
                                  ; linefeed string

  fcb     db      0               ; drive = current
          db      8 dup ('?')     ; filename = wildcard
          db      'ASM'           ; extension = ASM
          db      25 dup (0)      ; remainder of FCB = zero

  buff    db      64 dup (0)      ; receives search results
  ──────────────────────────────────────────────────────────────────────────

  Figure 9-2.  Example of an FCB-type directory search using Int 21H
  Functions 11H and 12H. This routine displays the names of all files in
  the current directory that have the .ASM extension.
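
  The routine in Figure 9-2 displays the 11-byte name field exactly as
  MS-DOS returns it, padded with blanks and without a period. A program
  that wants a conventional filename might convert the field as in the
  following C sketch. The function name is hypothetical; its field argument
  corresponds to the address buff+1 in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert an 11-byte, blank-padded FCB name field ("HELLO   ASM")
   into an ASCIIZ filename ("HELLO.ASM"). out must have room for
   13 bytes: 8 name + '.' + 3 extension + null. Returns out. */
char *fcb_name_to_asciiz(const unsigned char *field, char *out)
{
    const unsigned char *sp;
    size_t base, ext, n;

    assert(field != NULL && out != NULL);

    sp   = memchr(field, ' ', 8);            /* first blank in name */
    base = sp ? (size_t)(sp - field) : 8;
    sp   = memchr(field + 8, ' ', 3);        /* first blank in ext  */
    ext  = sp ? (size_t)(sp - (field + 8)) : 3;

    memcpy(out, field, base);
    n = base;
    if (ext > 0) {                           /* add ".EXT" if any   */
        out[n++] = '.';
        memcpy(out + n, field + 8, ext);
        n += ext;
    }
    out[n] = '\0';
    return out;
}
```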

  ──────────────────────────────────────────────────────────────────────────
  start:                          ; set DTA address for buffer
                                  ; used by search functions
          mov     dx,seg buff     ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          mov     ah,1ah          ; function 1ah = set DTA address
          int     21h             ; transfer to MS-DOS

                                  ; search for first match...
          mov     dx,offset fname ; DS:DX = wildcard filename
          mov     cx,0            ; CX = normal file attribute
          mov     ah,4eh          ; function 4eh = search for first
          int     21h             ; transfer to MS-DOS
          jc      exit            ; quit if no matches at all

  disp:                           ; go to a new line...
          mov     dx,offset crlf  ; DS:DX = CR-LF string
          mov     cx,2            ; CX = string length
          mov     bx,1            ; BX = standard output handle
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS
                                  ; find length of filename...
          mov     cx,0            ; CX will be char count
                                  ; DS:SI = start of name
          mov     si,offset buff+30

  disp1:  lodsb                   ; get next character
          or      al,al           ; is it null character?
          jz      disp2           ; yes, found end of string
          inc     cx              ; else count characters
          jmp     disp1           ; and get another

  disp2:                          ; display matching file...
                                  ; CX already contains length
                                  ; DS:DX = filename
          mov     dx,offset buff+30
          mov     bx,1            ; BX = standard output handle
          mov     ah,40h          ; function 40h = write
          int     21h             ; transfer to MS-DOS
                                  ; find next matching file...
          mov     ah,4fh          ; function 4fh = search for next
          int     21h             ; transfer to MS-DOS
          jnc     disp            ; jump if another match found

  exit:                           ; final exit point
          mov     ax,4c00h        ; function 4ch = terminate,
                                  ; return code = 0
          int     21h             ; transfer to MS-DOS

          .
          .
          .

  crlf    db      0dh,0ah         ; ASCII carriage return-
                                  ; linefeed string

  fname   db      '*.ASM',0       ; ASCIIZ filename to
                                  ; be matched

  buff    db      64 dup (0)      ; receives search results
  ──────────────────────────────────────────────────────────────────────────

  Figure 9-3.  Example of a handle-type directory search using Int 21H
  Functions 4EH and 4FH. This routine also displays the names of all files
  in the current directory that have a .ASM extension.
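
  Figure 9-3 uses only the filename at offset 30 (1EH) of the buffer. The
  C sketch below shows how the other fields returned by the handle-type
  search might be picked out of the same buffer. The structure and function
  names are hypothetical, and the offsets for the attribute, time, date,
  and size fields are assumed from the commonly documented layout of the
  Function 4EH results; check them against Section 2 before relying on
  them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical unpacked form of a handle-type search result. */
struct found_file {
    unsigned      attr;      /* attribute byte of matched entry */
    unsigned      time;      /* packed time stamp               */
    unsigned      date;      /* packed date stamp               */
    unsigned long size;      /* file size in bytes              */
    char          name[13];  /* ASCIIZ filename and extension   */
};

/* Unpack the buffer filled in by "search for first" or "search for
   next". dta must point to at least 43 bytes. Multibyte fields are
   stored least significant byte first. */
void parse_find_buffer(const unsigned char *dta, struct found_file *f)
{
    assert(dta != NULL && f != NULL);

    f->attr = dta[0x15];
    f->time = (unsigned)dta[0x16] | ((unsigned)dta[0x17] << 8);
    f->date = (unsigned)dta[0x18] | ((unsigned)dta[0x19] << 8);
    f->size = (unsigned long)dta[0x1A]
            | ((unsigned long)dta[0x1B] << 8)
            | ((unsigned long)dta[0x1C] << 16)
            | ((unsigned long)dta[0x1D] << 24);

    memcpy(f->name, dta + 0x1E, 12);   /* name starts at offset 1EH */
    f->name[12] = '\0';                /* guarantee termination     */
}
```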

Moving Files

  The rename file function that was added in MS-DOS version 2.0, Int 21H
  Function 56H, has the little-advertised capability to move a file from
  one directory to another. The function has two ASCIIZ parameters: the
  "old" and "new" names for the file. If the old and new paths differ,
  MS-DOS moves the file; if the filename or extension components differ,
  MS-DOS renames the file. MS-DOS can carry out both of these actions in the
  same function call.

  Of course, the old and new directories must be on the same drive, because
  the file's actual data is not moved at all; only the information that
  describes the file is removed from one directory and placed in another
  directory. Function 56H fails if the two ASCIIZ strings include different
  logical-drive codes, if the file is read-only, or if a file with the same
  name and location as the "new" filename already exists.

  The FCB-based rename file service, Int 21H Function 17H, works only on
  the current directory and cannot be used to move files.


Volume Labels

  Support for volume labels was first added to MS-DOS in version 2.0. A
  volume label is an optional name of 1 to 11 characters that the user
  assigns to a disk during a FORMAT operation. You can display a volume
  label with the DIR, TREE, CHKDSK, or VOL command. Beginning with MS-DOS
  version 3.0, you can use the LABEL command to add, display, or alter the
  label after formatting. In MS-DOS version 4, the FORMAT program also
  assigns a semi-random 32-bit binary ID to each disk it formats; you can
  display this value, but you cannot change it.

  The distinction between volumes and drives is important. A volume label is
  associated with a specific storage medium. A drive identifier (such as A)
  is associated with a physical device that a storage medium can be mounted
  on. In the case of fixed-disk drives, the medium associated with a drive
  identifier does not change (hence the name). In the case of floppy disks
  or other removable media, the disk accessed with a given drive identifier
  might have any volume label or none at all.

  Hence, volume labels do not take the place of the logical-drive identifier
  and cannot be used as part of a pathname to identify a file. In fact, in
  MS-DOS version 2, the system does not use volume labels internally at all.
  In MS-DOS versions 3.0 and later, a disk driver can use volume labels to
  detect whether the user has replaced a disk while a file is open; this use
  is optional, however, and is not implemented in all systems.

  MS-DOS volume labels are implemented as a special type of entry in a
  disk's root directory. The entry contains a time-and-date stamp and has an
  attribute value of 8 (i.e., bit 3 set). Except for the attribute, a volume
  label is identical to the directory entry for a file that was created but
  never had any data written into it, and you can manipulate volume labels
  with Int 21H functions much as you manipulate files. However, a volume
  label receives special handling at several levels:

  ■  When you create a volume label after a disk is formatted, MS-DOS always
     places it in the root directory, regardless of the current directory.

  ■  A disk can contain only one volume label; attempts to create additional
     volume labels (even with different names) will fail.

  ■  MS-DOS always carries out searches for volume labels in the root
     directory, regardless of the current directory, and does not also
     return all normal files.

  In MS-DOS version 2, support for volume labels is not completely
  integrated into the handle file functions, and you must use extended FCBs
  instead to manipulate volume labels. For example, the code in Figure 9-4
  searches for the volume label in the root directory of the current drive.
  You can also change volume labels with extended FCBs and the rename file
  function (Int 21H Function 17H), but you should not attempt to remove an
  existing volume label with Int 21H Function 13H under MS-DOS version 2,
  because this operation can damage the disk's FAT in an unpredictable
  manner.

  In MS-DOS versions 3.0 and later, you can create a volume label in the
  expected manner, using Int 21H Function 3CH and an attribute of 8, and
  you can use the handle-type "search for first" function (4EH) to obtain
  an existing volume label for a logical drive (Figure 9-5). However, you
  still must use extended FCBs to change a volume label.

  ──────────────────────────────────────────────────────────────────────────
  buff    db      64 dup (?)   ; receives search results

  xfcb    db      0ffh         ; flag signifying extended FCB
          db      5 dup (0)    ; reserved
          db      8            ; volume attribute byte
          db      0            ; drive code (0 = current)
          db      11 dup ('?') ; wildcard filename and extension
          db      25 dup (0)   ; remainder of FCB (not used)
          .
          .
          .
                               ; set DTA address for buffer
                               ; used by search functions
          mov     dx,seg buff  ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          mov     ah,1ah       ; function 1ah = set DTA
          int     21h          ; transfer to MS-DOS

                               ; now search for label...
                               ; DS:DX = extended FCB
          mov     dx,offset xfcb
          mov     ah,11h       ; function 11h = search for first
          int     21h          ; transfer to MS-DOS
          cmp     al,0ffh      ; search successful?
          je      no_label     ; jump if no volume label
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Figure 9-4.  A volume-label search under MS-DOS version 2, using an
  extended file control block. If the search is successful, the volume label
  is returned in buff, formatted in the filename and extension fields of an
  extended FCB.
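
  Once the search in Figure 9-4 succeeds, the label can be copied out of
  the buffer. The following C sketch assumes the usual extended FCB layout,
  in which a 7-byte header and a drive byte precede the name, so that the
  11-byte label begins at offset 8 of buff. The function name is
  hypothetical. Only trailing blanks are removed, because a label may
  contain embedded blanks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the volume label out of an extended-FCB search result.
   dta is the buffer filled in by Function 11H; out must have room
   for 12 bytes (11 characters + null). Returns out. */
char *xfcb_label(const unsigned char *dta, char *out)
{
    size_t n = 11;

    assert(dta != NULL && out != NULL);
    assert(dta[0] == 0xFF);            /* extended FCB flag byte   */

    while (n > 0 && dta[8 + n - 1] == ' ')
        n--;                           /* drop trailing blanks     */

    memcpy(out, dta + 8, n);           /* label starts at offset 8 */
    out[n] = '\0';
    return out;
}
```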

  ──────────────────────────────────────────────────────────────────────────
  buff    db      64 dup (?)   ; receives search results

  wildcd  db      '*.*',0      ; wildcard ASCIIZ filename
          .
          .
          .
                               ; set DTA address for buffer
                               ; used by search functions
          mov     dx,seg buff  ; DS:DX = buffer address
          mov     ds,dx
          mov     dx,offset buff
          mov     ah,1ah       ; function 1ah = set DTA
          int     21h          ; transfer to MS-DOS

                               ; now search for label...
                               ; DS:DX = ASCIIZ string
          mov     dx,offset wildcd
          mov     cx,8         ; CX = volume attribute
          mov     ah,4eh       ; function 4eh = search for first
          int     21h          ; transfer to MS-DOS
          jc      no_label     ; jump if no volume label
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Figure 9-5.  A volume-label search under MS-DOS version 3, using the
  handle-type file functions. If the search is successful (carry flag
  returned clear), the volume name is placed at location buff+1EH in the
  form of an ASCIIZ string.



────────────────────────────────────────────────────────────────────────────
Chapter 10  Disk Internals

  MS-DOS disks are organized according to a rather rigid scheme that is
  easily understood and therefore easily manipulated. Although you will
  probably never need to access the special control areas of a disk
  directly, an understanding of their internal structure leads to a better
  understanding of the behavior and performance of MS-DOS as a whole.

  From the application programmer's viewpoint, MS-DOS presents disk devices
  as logical volumes that are associated with a drive code (A, B, C, and so
  on) and that have a volume name (optional), a root directory, and from
  zero to many additional directories and files. MS-DOS shields the
  programmer from the physical characteristics of the medium by providing a
  battery of disk services through Int 21H. Using these services, the
  programmer can create, open, read, write, close, and delete files in a
  uniform way, regardless of the disk drive's size, speed, number of
  read/write heads, number of tracks, and so forth.

  Requests from an application program for file operations actually go
  through two levels of translation before resulting in the physical
  transfer of data between the disk device and random-access memory:

  1.  Beneath the surface, MS-DOS views each logical volume, whether it is
      an entire physical unit such as a floppy disk or only a part of a
      fixed disk, as a continuous sequence of logical sectors, starting at
      sector 0. (A logical disk volume can also be implemented on other
      types of storage. For example, RAM disks map a disk structure onto an
      area of random-access memory.) MS-DOS translates an application
      program's Int 21H file-management requests into requests for transfers
      of logical sectors, using the information found in the volume's
      directories and allocation tables. (For those rare situations where it
      is appropriate, programs can also access logical sectors directly with
      Int 25H and Int 26H.)

  2.  MS-DOS then passes the requests for logical sectors to the disk
      device's driver, which maps them onto actual physical addresses (head,
      track, and sector). Disk drivers are extremely hardware dependent and
      are always written in assembly language for maximum speed. In most
      versions of MS-DOS, a driver for IBM-compatible floppy- and fixed-disk
      drives is built into the MS-DOS BIOS module (IO.SYS) and is always
      loaded during system initialization; you can install additional
      drivers for non-IBM-compatible disk devices by including the
      appropriate DEVICE directives in the CONFIG.SYS file.
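
  The second level of translation can be illustrated with the conventional
  mapping from a logical-sector number to a head, track, and sector. The C
  sketch below is an assumption about how a typical driver for
  IBM-compatible media performs the calculation, not a description of any
  particular driver; the names are hypothetical. Physical sectors are
  numbered from 1, and the hidden-sectors count is added so that a volume
  occupying only part of a fixed disk maps to the correct place.

```c
#include <assert.h>

/* Physical disk address: track and head count from 0, sector from 1. */
struct chs {
    unsigned track;
    unsigned head;
    unsigned sector;
};

/* Map a logical sector of a volume onto a physical address, given
   the sectors-per-track and number-of-heads values for the drive
   and the number of hidden sectors that precede the volume. */
struct chs logical_to_chs(unsigned long logical, unsigned long hidden,
                          unsigned spt, unsigned heads)
{
    struct chs r;
    unsigned long absolute;

    assert(spt > 0 && heads > 0);

    absolute = logical + hidden;
    r.sector = (unsigned)(absolute % spt) + 1;
    r.head   = (unsigned)((absolute / spt) % heads);
    r.track  = (unsigned)(absolute / ((unsigned long)spt * heads));
    return r;
}
```

  On a 9-sector, 2-sided floppy disk with no hidden sectors, logical sector
  0 maps to track 0, head 0, sector 1, and logical sector 9 maps to track
  0, head 1, sector 1.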

  Each MS-DOS logical volume is divided into several fixed-size control
  areas and a files area (Figure 10-1). The size of each control area
  depends on several factors──the size of the volume and the version of
  FORMAT used to initialize the volume, for example──but all of the
  information needed to interpret the structure of a particular logical
  volume can be found on the volume itself in the boot sector.

  ┌───────────────────────────────────────────────────────┐
  │                      Boot sector                      │
  │                     Reserved area                     │
  ├───────────────────────────────────────────────────────┤
  │               File allocation table #1                │
  ├───────────────────────────────────────────────────────┤
  │           Possible additional copies of FAT           │
  ├───────────────────────────────────────────────────────┤
  │                    Root directory                     │
  ├───────────────────────────────────────────────────────┤
  │                                                       │
  │                      Files area                       │
  │                                                       │
  └───────────────────────────────────────────────────────┘

  Figure 10-1.  Map of a typical MS-DOS logical volume. The boot sector
  (logical sector 0) contains the OEM identification, BIOS parameter block
  (BPB), and disk bootstrap. The remaining sectors are divided among an
  optional reserved area, one or more copies of the file allocation table,
  the root directory, and the files area.


The Boot Sector

  Logical sector 0, known as the boot sector, contains all of the critical
  information regarding the disk medium's characteristics (Figure 10-2).
  The first byte in the sector is always an 80x86 jump instruction──either a
  normal intrasegment JMP (opcode 0E9H) followed by a 16-bit displacement or
  a "short" JMP (opcode 0EBH) followed by an 8-bit displacement and then by
  an NOP (opcode 90H). If neither of these two JMP opcodes is present, the
  disk has not been formatted or was not formatted for use with MS-DOS. (Of
  course, the presence of the JMP opcode does not in itself ensure that the
  disk has an MS-DOS format.)
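
  This test is easy to express in C. The sketch below checks only the
  opcode pattern just described; the function name is hypothetical, and, as
  noted above, a positive result does not prove that the disk has an MS-DOS
  format.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the sector begins with one of the two JMP forms that
   an MS-DOS boot sector uses, otherwise 0. */
int has_boot_jump(const unsigned char *sector, size_t len)
{
    assert(sector != NULL);

    if (len < 3)
        return 0;
    if (sector[0] == 0xE9)                       /* JMP + 16-bit disp */
        return 1;
    if (sector[0] == 0xEB && sector[2] == 0x90)  /* short JMP + NOP   */
        return 1;
    return 0;
}
```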

  Following the initial JMP instruction is an 8-byte field that is reserved
  by Microsoft for OEM identification. The disk-formatting program, which is
  specialized for each brand of computer, disk controller, and medium, fills
  in this area with the name of the computer manufacturer and the
  manufacturer's internal MS-DOS version number.

  00H ┌───────────────────────────────────────────────┐
      │             E9 XX XX or EB XX 90              │
  03H ├───────────────────────────────────────────────┤
      │             OEM name and version              │
      │                   (8 bytes)                   │
  0BH ├───────────────────────────────────────────────┤─┐
      │          Bytes per sector (2 bytes)           │ │
  0DH ├───────────────────────────────────────────────┤ │
      │     Sectors per allocation unit (1 byte)      │ │
  0EH ├───────────────────────────────────────────────┤ │
      │   Reserved sectors, starting at 0 (2 bytes)   │ │
  10H ├───────────────────────────────────────────────┤ │
      │            Number of FATs (1 byte)            │ B
  11H ├───────────────────────────────────────────────┤ P
      │  Number of root-directory entries (2 bytes)   │ B
  13H ├───────────────────────────────────────────────┤ │
      │   Total sectors in logical volume (2 bytes)   │ │
  15H ├───────────────────────────────────────────────┤ │ MS-DOS
      │             Media descriptor byte             │ │ version 2.0
  16H ├───────────────────────────────────────────────┤ │
      │      Number of sectors per FAT (2 bytes)      │ │
  18H ├───────────────────────────────────────────────┤═╡
      │          Sectors per track (2 bytes)          │ │
  1AH ├───────────────────────────────────────────────┤ │
      │           Number of heads (2 bytes)           │ │ MS-DOS
  1CH ├───────────────────────────────────────────────┤ │ version 3.0
      │      Number of hidden sectors (4 bytes)       │═╡
  20H ├───────────────────────────────────────────────┤ │ MS-DOS
      │        Total sectors in logical volume        │ │ version 4.0
      │      (MS-DOS 4.0 and volume size >32 MB)      │ │
  24H ├───────────────────────────────────────────────┤═╡
      │             Physical drive number             │ │
  25H ├───────────────────────────────────────────────┤ │
      │                   Reserved                    │ │
  26H ├───────────────────────────────────────────────┤ │
      │     Extended boot signature record (29H)      │ │ Additional
  27H ├───────────────────────────────────────────────┤ │ MS-DOS 4.0
      │            32-bit binary volume ID            │ │ information
  2BH ├───────────────────────────────────────────────┤ │
      │            Volume label (11 bytes)            │ │
  36H ├───────────────────────────────────────────────┤ │
      │              Reserved (8 bytes)               │ │
  3EH ├───────────────────────────────────────────────┤─┘
      │                   Bootstrap                   │
      └───────────────────────────────────────────────┘

  Figure 10-2.  Map of the boot sector of an MS-DOS disk. Note the JMP at
  offset 0, the OEM identification field, the MS-DOS version 2 compatible
  BIOS parameter block (bytes 0BH─17H), the three additional WORD fields for
  MS-DOS version 3, the double-word number-of-sectors field and 32-bit
  binary volume ID for MS-DOS version 4.0, and the bootstrap code.

  The third major component of the boot sector is the BIOS parameter block
  (BPB) in bytes 0BH through 17H. (Additional fields are present in MS-DOS
  versions 3.0 and later.) This data structure describes the physical disk
  characteristics and allows the device driver to calculate the proper
  physical disk address for a given logical-sector number; it also contains
  information that is used by MS-DOS and various system utilities to
  calculate the address and size of each of the disk control areas (file
  allocation tables and root directory).

  The final element of the boot sector is the disk bootstrap routine. The
  disk bootstrap is usually read into memory by the ROM bootstrap, which is
  executed automatically when the computer is turned on. The ROM bootstrap
  is usually just smart enough to home the head of the disk drive (move it
  to track 0), read the first physical sector into RAM at a predetermined
  location, and jump to it. The disk bootstrap is more sophisticated. It
  calculates the physical disk address of the beginning of the files area,
  reads the files containing the operating system into memory, and transfers
  control to the BIOS module at location 0070:0000H. (See Chapter 2.)

  Figures 10-3 and 10-4 show a partial hex dump and disassembly of a
  PC-DOS 3.3 floppy-disk boot sector.

  ──────────────────────────────────────────────────────────────────────────
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  EB 34 90 49 42 4D 20 20 33 2E 33 00 02 02 01 00  .4.IBM  3.3.....
  0010  02 70 00 D0 02 FD 02 00 09 00 02 00 00 00 00 00  .p..............
  0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 12  ................
  0030  00 00 00 00 01 00 FA 33 C0 8E D0 BC 00 7C 16 07  .......3.....|..
        .
        .
        .
  01C0  0D 0A 44 69 73 6B 20 42 6F 6F 74 20 66 61 69 6C  ..Disk Boot fail
  01D0  75 72 65 0D 0A 00 49 42 4D 42 49 4F 20 20 43 4F  ure...IBMBIO  CO
  01E0  4D 49 42 4D 44 4F 53 20 20 43 4F 4D 00 00 00 00  MIBMDOS  COM....
  01F0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA  ..............U.
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-3.  Partial hex dump of the boot sector (track 0, head 0, sector
  1) of a PC-DOS version 3.3 floppy disk. This sector contains the OEM
  identification, a copy of the BIOS parameter block describing the medium,
  and the bootstrap routine that reads the BIOS into memory and transfers
  control to it. See also Figures 10-2 and 10-4.

  ──────────────────────────────────────────────────────────────────────────
          jmp     $+54            ; jump to bootstrap
          nop

          db      'IBM  3.3'      ; OEM identification

                                  ; BIOS parameter block
          dw      512             ; bytes per sector
          db      2               ; sectors per cluster
          dw      1               ; reserved sectors
          db      2               ; number of FATs
          dw      112             ; root directory entries
          dw      720             ; total sectors
          db      0fdh            ; media descriptor byte
          dw      2               ; sectors per FAT

          dw      9               ; sectors per track
          dw      2               ; number of heads
          dd      0               ; hidden sectors
          .
          .
          .
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-4.  Partial disassembly of the boot sector shown in Figure
  10-3.
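
  The same fields can be extracted by a program. The C sketch below reads
  the OEM identification and the BPB fields from a buffer holding the first
  28 bytes or more of a boot sector, using the offsets shown in Figure
  10-2. The structure and function names are hypothetical. All multibyte
  values are stored least significant byte first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical unpacked copy of the start of a boot sector. */
struct boot_info {
    char     oem[9];                /* 03H: OEM name, null added */
    unsigned bytes_per_sector;      /* 0BH */
    unsigned sectors_per_cluster;   /* 0DH */
    unsigned reserved_sectors;      /* 0EH */
    unsigned fat_count;             /* 10H */
    unsigned root_entries;          /* 11H */
    unsigned total_sectors;         /* 13H */
    unsigned media;                 /* 15H */
    unsigned sectors_per_fat;       /* 16H */
    unsigned sectors_per_track;     /* 18H */
    unsigned heads;                 /* 1AH */
};

/* Read a 16-bit word stored least significant byte first. */
static unsigned word_at(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

void read_boot_info(const unsigned char *s, struct boot_info *b)
{
    assert(s != NULL && b != NULL);

    memcpy(b->oem, s + 0x03, 8);
    b->oem[8] = '\0';
    b->bytes_per_sector    = word_at(s + 0x0B);
    b->sectors_per_cluster = s[0x0D];
    b->reserved_sectors    = word_at(s + 0x0E);
    b->fat_count           = s[0x10];
    b->root_entries        = word_at(s + 0x11);
    b->total_sectors       = word_at(s + 0x13);
    b->media               = s[0x15];
    b->sectors_per_fat     = word_at(s + 0x16);
    b->sectors_per_track   = word_at(s + 0x18);
    b->heads               = word_at(s + 0x1A);
}
```

  Applied to the first two rows of the dump in Figure 10-3, the sketch
  yields the values listed in Figure 10-4: 512 bytes per sector, 2 sectors
  per cluster, 112 root-directory entries, 720 total sectors, and a media
  descriptor byte of 0FDH.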


The Reserved Area

  The boot sector is actually part of a reserved area that can span from one
  to several sectors. The reserved-sectors word in the BPB, at offset 0EH in
  the boot sector, describes the size of this area. Remember that the number
  in the BPB field includes the boot sector itself, so if the value is 1 (as
  it is on IBM PC floppy disks), the boot sector is the only reserved sector
  and the first FAT begins at logical sector 1.


The File Allocation Table

  When a file is created or extended, MS-DOS assigns it space from the files
  area in fixed-size groups of sectors known as allocation units or
  clusters. The number of sectors per cluster is always a power of 2; for a
  given medium it is defined in the BPB and can be found at offset 0DH in
  the disk's boot sector. Below are some example cluster sizes:

  Disk type                     Power of 2    Sectors/cluster
  ──────────────────────────────────────────────────────────────────────────
  5.25" 180 KB floppy disk      0             1
  5.25" 360 KB floppy disk      1             2
  PC/AT fixed disk              2             4
  PC/XT fixed disk              3             8
  ──────────────────────────────────────────────────────────────────────────

  The file allocation table (FAT) is divided into fields that correspond
  directly to the assignable clusters on the disk. These fields are 12 bits
  in MS-DOS versions 1 and 2 and may be either 12 bits or 16 bits in
  versions 3.0 and later, depending on the size of the medium (12 bits if
  the disk contains fewer than 4087 clusters, 16 bits otherwise).
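
  The size rule, and the amount of disk space one copy of the FAT requires,
  can be sketched in C. The function names are hypothetical. The second
  routine assumes that the table holds one field for each cluster plus the
  two reserved fields at the start of the FAT, packed without gaps.

```c
#include <assert.h>

/* Width of a FAT field, in bits, for a volume with the given number
   of clusters, following the rule for MS-DOS 3.0 and later. */
int fat_entry_bits(unsigned long clusters)
{
    return clusters < 4087 ? 12 : 16;
}

/* Sectors needed to hold one copy of the FAT. */
unsigned long fat_sectors_needed(unsigned long clusters,
                                 unsigned bytes_per_sector)
{
    unsigned long entries, bytes;

    assert(bytes_per_sector > 0);

    entries = clusters + 2;                 /* two reserved fields */
    bytes   = (entries * (unsigned long)fat_entry_bits(clusters) + 7) / 8;
    return (bytes + bytes_per_sector - 1) / bytes_per_sector;
}
```

  For a volume with 354 clusters and 512-byte sectors, the fields are 12
  bits wide and one copy of the FAT occupies 534 bytes, or 2 sectors.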

  The first two fields in the FAT are always reserved. On IBM-compatible
  media, the first 8 bits of the first reserved FAT entry contain a copy of
  the media descriptor byte, which is also found in the BPB in the boot
  sector. The second, third, and (if applicable) fourth bytes, which
  constitute the remainder of the first two reserved FAT fields, always
  contain 0FFH. The currently defined IBM-format media descriptor bytes are
  as follows:

                                                             MS-DOS version
                                                             where first
  Descriptor     Medium                                      supported
  ──────────────────────────────────────────────────────────────────────────
  0F0H           3.5" floppy disk, 2-sided, 18-sector        3.3
  0F8H           Fixed disk                                  2.0
  0F9H           5.25" floppy disk, 2-sided, 15-sector       3.0
                 3.5" floppy disk, 2-sided, 9-sector         3.2
  0FCH           5.25" floppy disk, 1-sided, 9-sector        2.0
  0FDH           5.25" floppy disk, 2-sided, 9-sector        2.0
                 8" floppy disk, 1-sided, single-density
  0FEH           5.25" floppy disk, 1-sided, 8-sector        1.0
                 8" floppy disk, 1-sided, single-density
                 8" floppy disk, 2-sided, double-density
  0FFH           5.25" floppy disk, 2-sided, 8-sector        1.1
  ──────────────────────────────────────────────────────────────────────────

  The remainder of the FAT entries describe the use of their corresponding
  disk clusters. The contents of the FAT fields are interpreted as follows:

  Value              Meaning
  ──────────────────────────────────────────────────────────────────────────
  (0)000H            Cluster available
  (F)FF0─(F)FF6H     Reserved cluster
  (F)FF7H            Bad cluster, if not part of chain
  (F)FF8─(F)FFFH     Last cluster of file
  (X)XXX             Next cluster in file
  ──────────────────────────────────────────────────────────────────────────

  Each file's entry in a directory contains the number of the first cluster
  assigned to that file, which is used as an entry point into the FAT. From
  the entry point on, each FAT slot contains the cluster number of the next
  cluster in the file, until a last-cluster mark is encountered.

  At the computer manufacturer's option, MS-DOS can maintain two or more
  identical copies of the FAT on each volume. MS-DOS updates all copies
  simultaneously whenever files are extended or the directory is modified.
  If access to a sector in a FAT fails due to a read error, MS-DOS tries the
  other copies until a successful disk read is obtained or all copies are
  exhausted. Thus, if one copy of the FAT becomes unreadable due to wear or
  a software accident, the other copies may still make it possible to
  salvage the files on the disk. As part of its procedure for checking the
  integrity of a disk, the CHKDSK program compares the multiple copies
  (usually two) of the FAT to make sure they are all readable and
  consistent.


The Root Directory

  Following the file allocation tables is an area known in MS-DOS versions
  2.0 and later as the root directory. (Under MS-DOS version 1, it was the
  only directory on the disk.) The root directory contains 32-byte entries
  that describe files, other directories, and the optional volume label
  (Figure 10-5). An entry beginning with the byte value E5H is available
  for reuse; it represents a file or directory that has been erased. An
  entry beginning with a null (zero) byte is the logical end-of-directory;
  that entry and all subsequent entries have never been used.

  00H ┌──────────────────────────────┐
      │           Filename           │ Note 1
  08H ├──────────────────────────────┤
      │          Extension           │
  0BH ├──────────────────────────────┤
      │        File attribute        │ Note 2
  0CH ├──────────────────────────────┤
      │           Reserved           │
  16H ├──────────────────────────────┤
      │ Time created or last updated │ Note 3
  18H ├──────────────────────────────┤
      │ Date created or last updated │ Note 4
  1AH ├──────────────────────────────┤
      │       Starting cluster       │
  1CH ├──────────────────────────────┤
      │      File size, 4 bytes      │ Note 5
  20H └──────────────────────────────┘

  Figure 10-5.  Format of a single entry in a disk directory. Total length
  is 32 bytes (20H bytes).

  ──────────────────────────────────────────────────────────────────────────
  Notes for Figure 10-5
    1.  The first byte of the filename field of a directory entry may
        contain the following special information:

    Value             Meaning
    ────────────────────────────────────────────────────────────────────────
    00H               Directory entry has never been used; end of occupied
                      portion of directory.
    05H               First character of filename is actually E5H.
    2EH               Entry is an alias for the current or parent directory.
                      If the next byte is also 2EH, the cluster field
                      contains the cluster number of the parent directory
                      (zero if the parent directory is the root directory).
    E5H               File has been erased.
    ────────────────────────────────────────────────────────────────────────

    2.  The attribute byte of the directory entry is mapped as follows:

    Bit               Meaning
    ────────────────────────────────────────────────────────────────────────
    0                 Read-only; attempts to open file for write or to
                      delete file will fail.
    1                 Hidden file; excluded from normal searches.
    2                 System file; excluded from normal searches.
    3                 Volume label; can exist only in root directory.
    4                 Directory; excluded from normal searches.
    5                 Archive bit; set whenever file is modified.
    6                 Reserved.
    7                 Reserved.
    ────────────────────────────────────────────────────────────────────────

    3.  The time field is encoded as follows:

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           Binary number of 2-second increments (0─29,
                      corresponding to 0─58 seconds)
    05H─0AH           Binary number of minutes (0─59)
    0BH─0FH           Binary number of hours (0─23)
    ────────────────────────────────────────────────────────────────────────

    4.  The date field is encoded as follows:

    Bits              Contents
    ────────────────────────────────────────────────────────────────────────
    00H─04H           Day of month (1─31)
    05H─08H           Month (1─12)
    09H─0FH           Year (relative to 1980)
    ────────────────────────────────────────────────────────────────────────

    5.  The file-size field is interpreted as a 4-byte integer, with the
        low-order 2 bytes of the number stored first.

  ──────────────────────────────────────────────────────────────────────────
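
  The packed time and date words described in Notes 3 and 4 can be unpacked
  with a few shifts and masks. The following C fragment is a minimal sketch;
  the structure and function names are made up for this example, and the
  caller is assumed to have already assembled each 2-byte field into a
  16-bit value (low byte first).

```c
#include <stdint.h>

/* Hypothetical container for an unpacked directory time/date stamp. */
struct dos_stamp {
    unsigned year, month, day;
    unsigned hours, minutes, seconds;
};

static struct dos_stamp decode_dos_stamp(uint16_t time_word, uint16_t date_word)
{
    struct dos_stamp s;

    s.seconds = (time_word & 0x1Fu) * 2u;           /* bits 0-4, 2-second units */
    s.minutes = (time_word >> 5) & 0x3Fu;           /* bits 5-10  */
    s.hours   = (time_word >> 11) & 0x1Fu;          /* bits 11-15 */

    s.day     = date_word & 0x1Fu;                  /* bits 0-4   */
    s.month   = (date_word >> 5) & 0x0Fu;           /* bits 5-8   */
    s.year    = 1980u + ((date_word >> 9) & 0x7Fu); /* bits 9-15  */
    return s;
}
```

  For example, a time word of 6000H decodes to 12:00:00, and a date word of
  0E72H decodes to March 18, 1987.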

  The root directory has a number of special properties. Its size and
  position are fixed and are determined by the FORMAT program when a disk is
  initialized. This information can be obtained from the boot sector's BPB.
  If the disk is bootable, the first two entries in the root directory
  always describe the files containing the MS-DOS BIOS and the MS-DOS
  kernel. The disk bootstrap routine uses these entries to bring the
  operating system into memory and start it up.

  Figure 10-6 shows a partial hex dump of the first sector of the root
  directory on a bootable PC-DOS 3.3 floppy disk.

  ──────────────────────────────────────────────────────────────────────────
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  49 42 4D 42 49 4F 20 20 43 4F 4D 27 00 00 00 00  IBMBIO  COM'....
  0010  00 00 00 00 00 00 00 60 72 0E 02 00 54 56 00 00  .......'r...TV..
  0020  49 42 4D 44 4F 53 20 20 43 4F 4D 27 00 00 00 00  IBMDOS  COM'....
  0030  00 00 00 00 00 00 00 60 71 0E 18 00 CF 75 00 00  .......'q....u..
  0040  43 4F 4D 4D 41 4E 44 20 43 4F 4D 20 00 00 00 00  COMMAND COM ....
  0050  00 00 00 00 00 00 00 60 71 0E 36 00 DB 62 00 00  .......'q.6..b..
  0060  42 4F 4F 54 44 49 53 4B 20 20 20 28 00 00 00 00  BOOTDISK   (....
  0070  00 00 00 00 00 00 A1 00 21 00 00 00 00 00 00 00  ........!.......
  0080  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0090  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        .
        .
        .
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-6.  Partial hex dump of the first sector of the root directory
  for a PC-DOS 3.3 disk containing the three system files and a volume
  label.


The Files Area

  The remainder of the volume after the root directory is known as the files
  area. MS-DOS views the sectors in this area as a pool of clusters, each
  containing one or more logical sectors, depending on the disk format. Each
  cluster has a corresponding entry in the FAT that describes its current
  use: available, reserved, assigned to a file, or unusable (because of
  defects in the medium). Because the first two fields of the FAT are
  reserved, the first cluster in the files area is assigned the number 2.

  When a file is extended under versions 1 and 2, MS-DOS searches the FAT
  from the beginning until it finds a free cluster (designated by a zero FAT
  field); it then changes that FAT field to a last-cluster mark and updates
  the previous last cluster of the file's chain to point to the new last
  cluster. Under versions 3.0 and later, however, MS-DOS searches the FAT
  from the most recently allocated cluster; this reduces file fragmentation
  and improves overall access times.

  Directories other than the root directory are simply a special type of
  file. Their storage is allocated from the files area, and their contents
  are 32-byte entries──in the same format as those used in the root
  directory──that describe files or other directories. Directory entries
  that describe other directories contain an attribute byte with bit 4 set,
  zero in the file-length field, and the date and time that the directory
  was created (Figure 10-7). The first cluster field points, of course, to
  the first cluster in the files area that belongs to the directory. (The
  directory's other clusters can be found only by tracing through the FAT.)

  All directories except the root directory contain two special directory
  entries, named . and .. (a single and a double period). MS-DOS puts these
  entries in place when it creates a directory, and they cannot be deleted.
  The . entry is an
  alias for the current directory; its cluster field points to the cluster
  in which it is found. The .. entry is an alias for the directory's parent
  (the directory immediately above it in the tree structure); its cluster
  field points to the first cluster of the parent directory. If the parent
  is the root directory, the cluster field of the .. entry contains zero
  (Figure 10-8).

  ──────────────────────────────────────────────────────────────────────────
        .
        .
        .
  0080  4D 59 44 49 52 20 20 20 20 20 20 10 00 00 00 00  MYDIR      .....
  0090  00 00 00 00 00 00 87 9A 9B 0A 2A 00 00 00 00 00  ..........*.....
        .
        .
        .
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-7.  Extract from the root directory of an MS-DOS disk, showing
  the entry for a subdirectory named MYDIR. Bit 4 in the attribute byte is
  set, the cluster field points to the first cluster of the subdirectory
  file, the date and time stamps are valid, but the file length is zero.

  ──────────────────────────────────────────────────────────────────────────
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  2E 20 20 20 20 20 20 20 20 20 20 10 00 00 00 00  .         .....
  0010  00 00 00 00 00 00 87 9A 9B 0A 2A 00 00 00 00 00  ..........*.....
  0020  2E 2E 20 20 20 20 20 20 20 20 20 10 00 00 00 00  ..        .....
  0030  00 00 00 00 00 00 87 9A 9B 0A 00 00 00 00 00 00  ................
  0040  4D 59 46 49 4C 45 20 20 44 41 54 20 00 00 00 00  MYFILE  DAT ....
  0050  00 00 00 00 00 00 98 9A 9B 0A 2B 00 15 00 00 00  ..........+.....
  0060  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  0070  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        .
        .
        .
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-8.  Hex dump of the first block of the directory MYDIR. Note the
  . and .. entries. This directory contains exactly one file, MYFILE.DAT.


Interpreting the File Allocation Table

  Now that we understand how the disk is structured, let's see how we can
  use this knowledge to find a FAT position from a cluster number.

  If the FAT has 12-bit entries, use the following procedure:

  1.  Use the directory entry to find the starting cluster of the file in
      question.

  2.  Multiply the cluster number by 1.5.

  3.  Use the integral part of the product as the offset into the FAT and
      move the word at that offset into a register. Remember that a FAT
      position can span a physical disk-sector boundary.

  4.  If the product is a whole number, AND the register with 0FFFH.

  5.  Otherwise, "logical shift" the register right 4 bits.

  6.  If the result is a value from 0FF8H through 0FFFH, the file has no
      more clusters. Otherwise, the result is the number of the next cluster
      in the file.
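
  The 12-bit procedure can also be sketched in C. The version below assumes
  that the entire FAT has already been read into one contiguous buffer, so
  the sector-boundary problem mentioned in step 3 does not arise. It builds
  the word from two individual bytes so that it does not depend on the byte
  order of the machine running it. The function names are invented for this
  example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the 12-bit FAT field for the given cluster.
   fat points to an in-memory copy of the whole FAT (an assumption). */
static uint16_t fat12_entry(const uint8_t *fat, uint16_t cluster)
{
    /* integral part of cluster * 1.5 */
    size_t offset = (size_t)cluster + (size_t)(cluster / 2);

    /* fetch the word at that offset, low byte first */
    uint16_t word = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));

    if (cluster & 1)                      /* product was not a whole number */
        return (uint16_t)(word >> 4);     /* keep the upper 12 bits */
    return (uint16_t)(word & 0x0FFF);     /* keep the lower 12 bits */
}

/* True if the field marks the last cluster of a file (0FF8H-0FFFH). */
static int fat12_is_last(uint16_t entry)
{
    return entry >= 0x0FF8;
}
```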

  On disks with at least 4087 clusters formatted under MS-DOS version 3.0 or
  later, the FAT entries use 16 bits, and the extraction of a cluster number
  from the table is much simpler:

  1.  Use the directory entry to find the starting cluster of the file in
      question.

  2.  Multiply the cluster number by 2.

  3.  Use the product as the offset into the FAT and move the word at that
      offset into a register.

  4.  If the result is a value from 0FFF8H through 0FFFFH, the file has no
      more clusters. Otherwise, the result is the number of the next cluster
      in the file.
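
  A corresponding C sketch for 16-bit FATs follows. As before, it assumes
  that the whole FAT is held in a single buffer, and the function names are
  hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the 16-bit FAT field for the given cluster.
   fat points to an in-memory copy of the whole FAT (an assumption). */
static uint16_t fat16_entry(const uint8_t *fat, uint16_t cluster)
{
    size_t offset = (size_t)cluster * 2;  /* cluster number times 2 */
    return (uint16_t)(fat[offset] | (fat[offset + 1] << 8));
}

/* True if the field marks the last cluster of a file (0FFF8H-0FFFFH). */
static int fat16_is_last(uint16_t entry)
{
    return entry >= 0xFFF8;
}
```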

  To convert cluster numbers to logical sectors, subtract 2, multiply the
  result by the number of sectors per cluster, and add the logical-sector
  number of the beginning of the data area (this can be calculated from the
  information in the BPB).
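
  Expressed in C, the conversion is a single expression. The function and
  parameter names below are chosen for illustration only.

```c
#include <stdint.h>

/* Converts a cluster number (2 or greater) to the logical-sector number
   of the first sector in that cluster. first_files_sector is the
   logical-sector number at which the files area begins. */
static uint32_t cluster_to_sector(uint16_t cluster,
                                  uint8_t sectors_per_cluster,
                                  uint32_t first_files_sector)
{
    return (uint32_t)(cluster - 2) * sectors_per_cluster + first_files_sector;
}
```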

  As an example, let's work out the disk location of the file IBMBIO.COM,
  which is the first entry in the directory shown in Figure 10-6. First, we
  need some information from the BPB, which is in the boot sector of the
  medium. (See Figures 10-3 and 10-4.) The BPB tells us that there are

  ■  512 bytes per sector

  ■  2 sectors per cluster

  ■  2 sectors per FAT

  ■  2 FATs

  ■  112 entries in the root directory

  From the BPB information, we can calculate the starting logical-sector
  number of each of the disk's control areas and the files area by
  constructing a table, as follows:

                                                   Length       Sector
  Area                                             (sectors)    numbers
  ──────────────────────────────────────────────────────────────────────────
  Boot sector                                      1            00H
  2 FATs * 2 sectors/FAT                           4            01H─04H
  112 directory entries                            7            05H─0BH
    *32 bytes/entry
    /512 bytes/sector
  Total sectors occupied by bootstrap, FATs, and   12
  root directory
  ──────────────────────────────────────────────────────────────────────────

  Therefore, the first sector of the files area is 12 (0CH).
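
  The same table can be computed by a program. The following C sketch takes
  the relevant BPB values in a structure and returns the starting sector of
  each area. The structure layouts and names are hypothetical (they do not
  mirror the on-disk BPB format), and rounding the root directory up to a
  whole number of sectors is an assumption of this example.

```c
#include <stdint.h>

/* Hypothetical subset of BPB values needed for the calculation. */
struct bpb_info {
    uint16_t bytes_per_sector;
    uint16_t reserved_sectors;    /* includes the boot sector */
    uint8_t  fat_count;
    uint16_t sectors_per_fat;
    uint16_t root_entries;
};

struct volume_layout {
    uint32_t first_fat_sector;    /* first sector of the first FAT   */
    uint32_t root_dir_sector;     /* first sector of root directory  */
    uint32_t root_dir_sectors;    /* length of root directory        */
    uint32_t files_area_sector;   /* first sector of the files area  */
};

static struct volume_layout compute_layout(const struct bpb_info *b)
{
    struct volume_layout v;
    uint32_t root_bytes = (uint32_t)b->root_entries * 32u;  /* 32 bytes/entry */

    v.first_fat_sector  = b->reserved_sectors;
    v.root_dir_sector   = v.first_fat_sector
                        + (uint32_t)b->fat_count * b->sectors_per_fat;
    v.root_dir_sectors  = (root_bytes + b->bytes_per_sector - 1u)
                        / b->bytes_per_sector;
    v.files_area_sector = v.root_dir_sector + v.root_dir_sectors;
    return v;
}
```

  With the values listed above (512 bytes per sector, 1 reserved sector,
  2 FATs of 2 sectors each, and 112 root directory entries), the function
  places the first FAT at sector 1, the root directory at sector 5, and the
  files area at sector 12.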

  The word at offset 01AH in the directory entry for IBMBIO.COM gives us the
  starting cluster number for that file: cluster 2. To find the
  logical-sector number of the first block in the file, we can follow the
  procedure given earlier:

  1.  Cluster number - 2 = 2 - 2 = 0.

  2.  Multiply by sectors per cluster = 0 * 2 = 0.

  3.  Add logical-sector number of start of the files area = 0 + 0CH = 0CH.

  So the calculated sector number of the beginning of the file IBMBIO.COM is
  0CH, which is exactly what we expect knowing that the FORMAT program
  always places the system files in contiguous sectors at the beginning of
  the data area.

  Now let's trace IBMBIO.COM's chain through the file allocation table
  (Figures 10-9 and 10-10). This will be a little tedious, but a detailed
  understanding of the process is crucial. In an actual program, we would
  first read the boot sector using Int 25H, then calculate the address of
  the FAT from the contents of the BPB, and finally read the FAT into
  memory, again using Int 25H.

  From IBMBIO.COM's directory entry, we already know that the first cluster
  in the file is cluster 2. To examine that cluster's entry in the FAT, we
  multiply the cluster number by 1.5, which gives 0003H as the FAT offset,
  and fetch the word at that offset (which contains 4003H). Because the
  product of the cluster and 1.5 is a whole number, we AND the word from the
  FAT with 0FFFH, yielding the number 3, which is the number of the second
  cluster assigned to the file.

  ──────────────────────────────────────────────────────────────────────────
         0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
  0000  FD FF FF 03 40 00 05 60 00 07 80 00 09 A0 00 0B  ....@..'........
  0010  C0 00 0D E0 00 0F 00 01 11 20 01 13 40 01 15 60  ......... ..@..'
  0020  01 17 F0 FF 19 A0 01 1B C0 01 1D E0 01 1F 00 02  ................
  0030  21 20 02 23 40 02 25 60 02 27 80 02 29 A0 02 2B  ! .#@.%'.'..)..+
        .
        .
        .
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-9.  Hex dump of the first block of the file allocation table
  (track 0, head 0, sector 2) for the PC-DOS 3.3 disk whose root directory
  is shown in Figure 10-6. Notice that the first byte of the FAT contains
  the media descriptor byte for a 5.25-inch, 2-sided, 9-sector floppy disk.

  ──────────────────────────────────────────────────────────────────────────
  getfat    proc      near      ; extracts the FAT field
                                ; for a given cluster
                                ; call    AX = cluster #
                                ;      DS:BX = addr of FAT
                                ; returns AX = FAT field
                                ; other registers unchanged

            push      bx        ; save affected registers
            push      cx
            mov       cx,ax
            shl       ax,1      ; cluster * 2
            add       ax,cx     ; cluster * 3
            test      ax,1
            pushf               ; save remainder in Z flag
            shr       ax,1      ; cluster * 1.5
            add       bx,ax
            mov       ax,[bx]
            popf                ; was cluster * 1.5 whole number?
            jnz       getfat1   ; no, jump
            and       ax,0fffh  ; yes, isolate bottom 12 bits
            jmp       getfat2
  getfat1:  mov       cx,4      ; shift word right 4 bits
            shr       ax,cx
  getfat2:  pop       cx        ; restore registers and exit
            pop       bx
            ret
  getfat    endp
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-10.  Assembly-language procedure to access the file allocation
  table (assumes 12-bit FAT fields). Given a cluster number, the procedure
  returns the contents of that cluster's FAT entry in the AX register. This
  simple example ignores the fact that FAT entries can span sector
  boundaries.

  To examine cluster 3's entry in the FAT, we multiply 3 by 1.5, which gives
  4.5, and fetch the word at offset 0004H (which contains 0040H). Because
  the product of 3 and 1.5 is not a whole number, we shift the word right
  4 bits, yielding the number 4, which is the number of the third cluster
  assigned to IBMBIO.COM.

  In this manner, we can follow the chain through the FAT until we come to a
  cluster (number 23, in this case) whose FAT entry contains the value
  0FFFH, which is an end-of-file marker in FATs with 12-bit entries.

  We have now established that the file IBMBIO.COM contains clusters 2
  through 23 (02H─17H), from which we can calculate that logical sectors 0CH
  through 37H are assigned to the file. Of course, the last cluster may be
  only partially filled with actual data; the portion of the last cluster
  used is the remainder of the file's size in bytes (found in the directory
  entry) divided by the bytes per cluster.
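
  The arithmetic can be sketched in C as follows. The function names are
  invented for this example. Note one detail that the description above
  leaves implicit: when the remainder is zero and the file is not empty,
  the last cluster is completely full.

```c
#include <stdint.h>

/* Number of clusters needed to hold a file of the given size. */
static uint32_t clusters_needed(uint32_t file_size, uint32_t bytes_per_cluster)
{
    return (file_size + bytes_per_cluster - 1u) / bytes_per_cluster;
}

/* Number of bytes actually used in the file's last cluster. */
static uint32_t bytes_in_last_cluster(uint32_t file_size,
                                      uint32_t bytes_per_cluster)
{
    uint32_t rem = file_size % bytes_per_cluster;

    if (file_size == 0u)
        return 0u;                        /* empty file owns no data */
    return rem ? rem : bytes_per_cluster; /* zero remainder: cluster is full */
}
```

  For IBMBIO.COM in Figure 10-6, the file-size field holds 5654H (22,100
  bytes). With 1024-byte clusters, this gives 22 clusters, which agrees with
  the chain of clusters 2 through 23, and 596 bytes used in the last
  cluster.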


Fixed-Disk Partitions

  Fixed disks have another layer of organization beyond the logical volume
  structure already discussed: partitions. The FDISK utility divides a fixed
  disk into one or more partitions consisting of an integral number of
  cylinders. Each partition can contain an independent file system and, for
  that matter, its own copy of an operating system.

  The first physical sector on a fixed disk (track 0, head 0, sector 1)
  contains the master boot record, which is laid out as follows:

  Bytes              Contents
  ──────────────────────────────────────────────────────────────────────────
  000─1BDH           Reserved
  1BE─1CDH           Partition #1 descriptor
  1CE─1DDH           Partition #2 descriptor
  1DE─1EDH           Partition #3 descriptor
  1EE─1FDH           Partition #4 descriptor
  1FE─1FFH           Signature word (AA55H)
  ──────────────────────────────────────────────────────────────────────────

  The partition descriptors in the master boot record define the size,
  location, and type of each partition, as follows:

  Byte(s)            Contents
  ──────────────────────────────────────────────────────────────────────────
  00H                Active flag (0 = not bootable, 80H = bootable)
  01H                Starting head
  02H─03H            Starting cylinder/sector
  04H                Partition type
                       00H  not used
                       01H  FAT file system, 12-bit FAT entries
                       04H  FAT file system, 16-bit FAT entries
                       05H  extended partition
                       06H  "huge partition" (MS-DOS versions 4.0 and later)
  05H                Ending head
  06H─07H            Ending cylinder/sector
  08H─0BH            Starting sector for partition, relative to beginning of
                     disk
  0CH─0FH            Partition length in sectors
  ──────────────────────────────────────────────────────────────────────────

  The active flag, which indicates that the partition is bootable, can be
  set on only one partition at a time.

  MS-DOS treats partition types 1, 4, and 6 as normal logical volumes and
  assigns them their own drive identifiers during the system boot process.
  Partition type 5 can contain multiple logical volumes and has a special
  extended boot record that describes each volume. The FORMAT utility
  initializes MS-DOS fixed-disk partitions, creating the file system within
  the partition (boot record, file allocation table, root directory, and
  files area) and optionally placing a bootable copy of the operating system
  in the file system.

  Figure 10-11 contains a partial hex dump of a master boot record from a
  fixed disk formatted under PC-DOS version 3.3. This dump illustrates the
  partition descriptors for a normal partition with a 16-bit FAT and an
  extended partition.

  ──────────────────────────────────────────────────────────────────────────
  0000   .
         .
         .
  0180  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0190  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  01A0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  01B0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01
  01C0  01 00 04 04 D1 02 11 00 00 00 EE FF 00 00 00 00
  01D0  C1 04 05 04 D1 FD 54 00 01 00 02 53 00 00 00 00
  01E0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  01F0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA
  ──────────────────────────────────────────────────────────────────────────

  Figure 10-11.  A partial hex dump of a master boot record from a fixed
  disk formatted under PC-DOS version 3.3. This disk contains two
  partitions. The
  first partition has a 16-bit FAT and is marked "active" to indicate that
  it contains a bootable copy of PC-DOS. The second partition is an
  "extended" partition. The third and fourth partition entries are not used
  in this example.
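
  A partition descriptor can be unpacked from its 16 raw bytes with code
  along the following lines. This C fragment is a sketch only: the structure
  and function names are hypothetical, the multibyte fields are assumed to
  be stored low byte first (as elsewhere on the disk), and the
  cylinder/sector words are left in their raw form rather than decoded.

```c
#include <stdint.h>

/* Hypothetical in-memory form of one 16-byte partition descriptor. */
struct partition_desc {
    uint8_t  active;         /* 80H = bootable, 0 = not bootable       */
    uint8_t  start_head;
    uint16_t start_cylsec;   /* raw starting cylinder/sector word      */
    uint8_t  type;           /* 01H, 04H, 05H, 06H, or 00H if unused   */
    uint8_t  end_head;
    uint16_t end_cylsec;     /* raw ending cylinder/sector word        */
    uint32_t start_sector;   /* relative to the beginning of the disk  */
    uint32_t length;         /* partition length in sectors            */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* d points to the 16 bytes of one descriptor, for example the bytes at
   offset 1BEH of the master boot record for partition #1. */
static struct partition_desc parse_partition(const uint8_t *d)
{
    struct partition_desc p;

    p.active       = d[0x00];
    p.start_head   = d[0x01];
    p.start_cylsec = rd16(d + 0x02);
    p.type         = d[0x04];
    p.end_head     = d[0x05];
    p.end_cylsec   = rd16(d + 0x06);
    p.start_sector = rd32(d + 0x08);
    p.length       = rd32(d + 0x0C);
    return p;
}

/* True for the types that MS-DOS treats as normal logical volumes. */
static int is_dos_volume(uint8_t type)
{
    return type == 0x01 || type == 0x04 || type == 0x06;
}
```

  Applied to the first descriptor in Figure 10-11, this yields an active
  partition of type 04H that starts at sector 11H and is 0FFEEH sectors
  long; the second descriptor yields type 05H.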



────────────────────────────────────────────────────────────────────────────
Chapter 11  Memory Management

  Current versions of MS-DOS can manage as much as 1 megabyte of contiguous
  random-access memory. On IBM PCs and compatibles, the memory occupied by
  MS-DOS and other programs starts at address 0000H and may reach as high as
  address 09FFFFH; this 640 KB area of RAM is sometimes referred to as
  conventional memory. Memory above this address is reserved for ROM
  hardware drivers, video refresh buffers, and the like. Computers that are
  not IBM compatible may use other memory layouts.

  The RAM area under the control of MS-DOS is divided into two major
  sections:

  ■  The operating-system area

  ■  The transient-program area

  The operating-system area starts at address 0000H──that is, it occupies
  the lowest portion of RAM. It holds the interrupt vector table, the
  operating system proper and its tables and buffers, any additional
  installable drivers specified in the CONFIG.SYS file, and the resident
  part of the COMMAND.COM command interpreter. The amount of memory occupied
  by the operating-system area varies with the version of MS-DOS used, the
  number of disk buffers, the size of installed device drivers, and so
  forth.

  The transient-program area (TPA), sometimes called the memory arena, is
  the remainder of memory above the operating-system area. The memory arena
  is dynamically allocated in blocks called arena entries. Each arena entry
  has a special control structure called an arena header, and all of the
  arena headers are chained together. Three MS-DOS Int 21H functions allow
  programs to allocate, resize, and release blocks of memory from the TPA:

  Function                 Action
  ──────────────────────────────────────────────────────────────────────────
  48H                      Allocate memory block.
  49H                      Release memory block.
  4AH                      Resize memory block.
  ──────────────────────────────────────────────────────────────────────────

  MS-DOS itself uses these functions when loading a program from disk at the
  request of COMMAND.COM or another program. The EXEC function, which is the
  MS-DOS program loader, calls Int 21H Function 48H to allocate a memory
  block for the loaded program's environment and another for the program
  itself and its program segment prefix. It then reads the program from the
  disk into the assigned memory area. When the program terminates, MS-DOS
  calls Int 21H Function 49H to release all memory owned by the program.

  Transient programs can also employ the MS-DOS memory-management functions
  to dynamically manage the memory available in the TPA. Proper use of these
  functions is one of the most important criteria of whether a program is
  well behaved under MS-DOS. Well-behaved programs are most likely to be
  portable to future versions of the operating system and least likely to
  cause interference with other processes under multitasking user interfaces
  such as Microsoft Windows.


Using the Memory-Allocation Functions

  The memory-allocation functions have two common uses:

  ■  To shrink a program's initial memory allocation so that there is enough
     room to load and execute another program under its control.

  ■  To dynamically allocate additional memory required by the program and
     to release the same memory when it is no longer needed.

Shrinking the Initial Memory Allocation

  Although many MS-DOS application programs simply assume they own all
  memory, this assumption is a relic of MS-DOS version 1 (and CP/M), which
  could support only one active process at a time. Well-behaved MS-DOS
  programs take pains to modify only memory that they actually own and to
  release any memory that they don't need.

  Unfortunately, under current versions of MS-DOS, the amount of memory that
  a program will own is not easily predicted in advance. It turns out that
  the amount of memory allocated to a program when it is first loaded
  depends upon two factors:

  ■  The type of file the program is loaded from

  ■  The amount of memory available in the TPA

  MS-DOS always allocates all of the largest available memory block in the
  TPA to programs loaded from .COM (memory-image) files. Because .COM
  programs contain no file header that can pass segment and memory-use
  information to MS-DOS, MS-DOS simply assumes the worst case and gives such
  a program everything. MS-DOS will load the program as long as there is an
  available memory block as large as the size of the file plus 256 bytes for
  the PSP and 2 bytes for the stack. The .COM program, when it receives
  control, must determine whether enough memory is available to carry out
  its functions.

  MS-DOS uses more complicated rules to allocate memory to programs loaded
  from .EXE files. First, of course, a memory block large enough to hold the
  declared code, data, and stack segments must be available in the TPA. In
  addition, the linker sets two fields in a .EXE file's header to inform
  MS-DOS about the program's memory requirements. The first field,
  MIN_ALLOC, defines the minimum number of paragraphs required by the
  program, in addition to those for the code, data, and stack segments. The
  second, MAX_ALLOC, defines the maximum number of paragraphs of additional
  memory the program would use if they were available.

  When loading a .EXE file, MS-DOS first attempts to allocate the number of
  paragraphs in MAX_ALLOC plus the number of paragraphs required by the
  program itself. If that much memory is not available, MS-DOS assigns all
  of the largest available block to the program, provided that this is at
  least the amount specified by MIN_ALLOC plus the size of the program
  image. If that condition is not satisfied, the program cannot be executed.
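
  The decision just described can be modeled in a few lines of C. This is a
  simplified illustration of the rules as stated in the text, not the actual
  loader code. The function name is hypothetical, and all quantities are
  expressed in paragraphs.

```c
#include <stdint.h>

/* Returns the number of paragraphs that would be assigned to a .EXE
   program, or 0 if the program cannot be loaded.
   program_paras = paragraphs required by the program itself,
   largest_free  = size of the largest available block in the TPA. */
static uint32_t exe_allocation(uint32_t program_paras,
                               uint16_t min_alloc,
                               uint16_t max_alloc,
                               uint32_t largest_free)
{
    uint32_t wanted = program_paras + max_alloc;  /* first attempt     */
    uint32_t needed = program_paras + min_alloc;  /* absolute minimum  */

    if (largest_free >= wanted)
        return wanted;          /* full MAX_ALLOC request satisfied     */
    if (largest_free >= needed)
        return largest_free;    /* take all of the largest block        */
    return 0;                   /* not enough memory to run the program */
}
```

  Under this model, a MAX_ALLOC value of 0FFFFH requests more memory than
  can normally be available, so such a program receives all of the largest
  block, just as a .COM program does.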

  After a .COM or .EXE program is loaded and running, it can use Int 21H
  Function 4AH (Resize Memory Block) to release all the memory it does not
  immediately need. This is conveniently done right after the program
  receives control from MS-DOS, by calling the resize function with the
  segment of the program's PSP in the ES register and the number of
  paragraphs that the program requires to run in the BX register (Figure
  11-1).

  ──────────────────────────────────────────────────────────────────────────
          .
          .
          .
          org     100h

  main    proc    near            ; entry point from MS-DOS
                                  ; DS, ES = PSP address

          mov     sp,offset stk   ; COM program must move
                                  ; stack to safe area

                                  ; release extra memory...
          mov     ah,4ah          ; function 4Ah =
                                  ; resize memory block
                                  ; BX = paragraphs to keep
          mov     bx,(offset stk - offset main + 10FH) / 16
          int     21h             ; transfer to MS-DOS
          jc      error           ; jump if resize failed
          .
          .
          .
  main    endp

          .
          .
          .

          dw      64 dup (?)      ; new stack area
  stk     equ     $               ; new base of stack

          end     main            ; defines entry point
  ──────────────────────────────────────────────────────────────────────────

  Figure 11-1.  An example of a .COM program releasing excess memory after
  it receives control from MS-DOS. Int 21H Function 4AH is called with ES
  pointing to the program's PSP and BX containing the number of paragraphs
  that the program needs to execute. In this case, the new size for the
  program's memory block is calculated as the program image size plus the
  size of the PSP (256 bytes), rounded up to the next paragraph. .EXE
  programs use similar code.
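
  The arithmetic behind the expression loaded into BX in Figure 11-1 may be
  easier to follow in C. The sketch below assumes the caller knows the size
  of the program image in bytes; the function name paragraphs_to_keep and
  the two constants are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define PSP_BYTES   0x100u   /* size of the program segment prefix */
#define PARA_BYTES  16u      /* one paragraph */

/* image_bytes: size of the program image, excluding the PSP (the
   analogue of "offset stk - offset main" in the assembly version). */
uint16_t paragraphs_to_keep(uint32_t image_bytes)
{
    /* Adding PARA_BYTES - 1 before dividing rounds up to the next
       paragraph; PSP_BYTES + 15 is the 10FH constant in Figure 11-1. */
    return (uint16_t)((image_bytes + PSP_BYTES + PARA_BYTES - 1u)
                      / PARA_BYTES);
}
```

  An empty image still needs 10H paragraphs for the PSP, and each additional
  1 to 16 bytes of image adds one more paragraph.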

Dynamic Allocation of Additional Memory

  When a well-behaved program needs additional memory space──for an I/O
  buffer or an array of intermediate results, for example──it can call Int
  21H Function 48H (Allocate Memory Block) with the desired number of
  paragraphs. If a sufficiently large block of unallocated memory is
  available, MS-DOS returns the segment address of the base of the assigned
  area and clears the carry flag (0), indicating that the function was
  successful.

  If no unallocated block of sufficient size is available, MS-DOS sets the
  carry flag (1), returns an error code in the AX register, and returns the
  size (in paragraphs) of the largest block available in the BX register
  (Figure 11-2). In this case, no memory has yet been allocated. The
  program can use the value returned in the BX register to determine whether
  it can continue in a "degraded" fashion, with less memory. If it can, it
  must call Int 21H Function 48H again to allocate the smaller memory
  block.
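
  The two-step protocol for continuing with less memory can be sketched in
  C. Because the real call is a software interrupt, the sketch substitutes
  a simulated allocator, sim_alloc, that reports its results the way
  Function 48H does (carry flag, AX, BX). The names sim_alloc,
  alloc_degraded, and alloc_result, the simulated free-block size, and the
  segment value 2000H are all assumptions made for illustration.

```c
#include <stdint.h>

/* Stand-in for Int 21H Function 48H so the logic can run anywhere. */
typedef struct {
    int      carry;   /* 1 = failure, 0 = success */
    uint16_t ax;      /* segment on success, error code on failure */
    uint16_t bx;      /* largest available block (paragraphs) on failure */
} alloc_result;

static uint16_t sim_largest_free = 0x0600;   /* simulated free block */

alloc_result sim_alloc(uint16_t paras)
{
    alloc_result r = {0, 0, 0};
    if (paras > sim_largest_free) {
        r.carry = 1;
        r.ax = 8;                  /* error 8: insufficient memory */
        r.bx = sim_largest_free;
    } else {
        r.ax = 0x2000;             /* arbitrary simulated segment */
        sim_largest_free = (uint16_t)(sim_largest_free - paras);
    }
    return r;
}

/* Ask for `wanted` paragraphs but settle for as few as `minimum`.
   Returns the number of paragraphs obtained (0 if none) and stores the
   segment of the block in *seg. */
uint16_t alloc_degraded(uint16_t wanted, uint16_t minimum, uint16_t *seg)
{
    alloc_result r = sim_alloc(wanted);
    if (r.carry && r.bx >= minimum) {
        wanted = r.bx;             /* nothing has been allocated yet... */
        r = sim_alloc(wanted);     /* ...so a second call is required */
    }
    if (r.carry)
        return 0;
    *seg = r.ax;
    return wanted;
}
```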

  When the MS-DOS memory manager is searching the chain of arena headers to
  satisfy a memory-allocation request, it can use one of the following
  strategies:

  ■  First fit: Use the arena entry at the lowest address that is large
     enough to satisfy the request.

  ■  Best fit: Use the smallest arena entry that will satisfy the request,
     regardless of its location.

  ■  Last fit: Use the arena entry at the highest address that is large
     enough to satisfy the request.

  ──────────────────────────────────────────────────────────────────────────
                .
                .
                .
                mov   ah,48h                 ; function 48h = allocate mem block
                mov   bx,0800h               ; 800h paragraphs = 32 KB
                int   21h                    ; transfer to MS-DOS
                jc    error                  ; jump if allocation failed
                mov   buff_seg,ax            ; save segment of allocated block
                .
                .
                .
                mov   es,buff_seg            ; ES:DI = address of block
                xor   di,di
                mov   cx,08000h              ; store 32,768 bytes
                mov   al,0ffh                ; fill buffer with -1s
                cld
                rep   stosb                  ; now perform fast fill
                .
                .
                .
                mov   cx,08000h              ; length to write, bytes
                mov   bx,handle              ; handle for prev opened file
                push  ds                     ; save our data segment
                mov   ds,buff_seg            ; let DS:DX = buffer address
                mov   dx,0
                mov   ah,40h                 ; function 40h = write
                int   21h                    ; transfer to MS-DOS
                pop   ds                     ; restore our data segment
                jc    error                  ; jump if write failed
                .
                .
                .
                mov   es,buff_seg            ; ES = seg of prev allocated block
                mov   ah,49h                 ; function 49h = release mem block
                int   21h                    ; transfer to MS-DOS
                jc    error                  ; jump if release failed
                .
  error:        .
                .
  handle        dw    0                      ; file handle
  buff_seg      dw    0                      ; segment of allocated block
                .
                .
                .
  ──────────────────────────────────────────────────────────────────────────

  Figure 11-2.  Example of dynamic memory allocation. The program requests a
  32 KB memory block from MS-DOS, fills it with -1s, writes it to disk, and
  then releases it.

  If the arena entry selected is larger than the size requested, MS-DOS
  divides it into two parts: one block of the size requested, which is
  assigned to the program that called Int 21H Function 48H, and an unowned
  block containing the remaining memory.

  The default MS-DOS allocation strategy is first fit. However, under MS-DOS
  versions 3.0 and later, an application program can change the strategy
  with Int 21H Function 58H.
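
  The difference between the first-fit, best-fit, and last-fit strategies
  can be seen in a small C model. This sketch works on a plain array of
  free-block sizes rather than on real arena headers; the function
  choose_block and the strategy enumeration are hypothetical names, and the
  enumeration values are not meant to match any MS-DOS encoding.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { FIRST_FIT, BEST_FIT, LAST_FIT } strategy;

/* free_paras[i] is the size in paragraphs of the i-th free arena entry,
   listed in order of ascending address. Returns the index of the entry
   chosen, or -1 if no entry is large enough. */
int choose_block(const uint16_t *free_paras, size_t count,
                 uint16_t request, strategy s)
{
    int chosen = -1;

    for (size_t i = 0; i < count; i++) {
        if (free_paras[i] < request)
            continue;                    /* too small for the request */
        if (chosen < 0) {
            chosen = (int)i;             /* first usable candidate */
            if (s == FIRST_FIT)
                break;                   /* lowest address wins */
        } else if (s == LAST_FIT) {
            chosen = (int)i;             /* keep the highest address */
        } else if (s == BEST_FIT && free_paras[i] < free_paras[chosen]) {
            chosen = (int)i;             /* strictly tighter fit */
        }
    }
    return chosen;
}
```

  With free entries of 100H, 800H, 300H, and 400H paragraphs and a request
  for 200H paragraphs, this model selects the 800H entry under first fit,
  the 300H entry under best fit, and the 400H entry under last fit.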

  When a program is through with an allocated memory block, it should use
  Int 21H Function 49H to release the block. If it does not, MS-DOS will
  automatically release all memory allocations for the program when it
  terminates.


Arena Headers

  Microsoft has not officially documented the internal structure of arena
  headers, probably to deter programmers from manipulating their memory
  allocations directly instead of through the MS-DOS functions provided for
  that purpose.

  Arena headers have identical structures in MS-DOS versions 2 and 3. They
  are 16 bytes (one paragraph) and are located immediately before the memory
  area that they control (Figure 11-3). An arena header contains the
  following information:

  ■  A byte signifying whether the header is a member of the chain of such
     headers or the last entry in it

  ■  A word indicating whether the area it controls is available or whether
     it already belongs to a program (if the latter, the word points to the
     program's PSP)

  ■  A word indicating the size (in paragraphs) of the controlled memory
     area (arena entry)
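
  The three fields listed above can be modeled in C. The byte offsets and
  the signature values 'M' (member) and 'Z' (last entry) used in this sketch
  are not given in the list above; they follow the layout commonly reported
  for MS-DOS versions 2 and 3 and should be treated as assumptions. The
  names arena_header, decode_arena, arena_is_valid, and arena_is_free are
  hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint8_t  kind;    /* assumed: 'M' = member of chain, 'Z' = last entry */
    uint16_t owner;   /* 0 = available, else segment of the owner's PSP */
    uint16_t paras;   /* size of the controlled area, in paragraphs */
} arena_header;

/* Decode the leading bytes of a 16-byte header image. Words are stored
   low byte first, as on the 8086. The offsets are assumptions. */
arena_header decode_arena(const uint8_t raw[16])
{
    arena_header h;
    h.kind  = raw[0];
    h.owner = (uint16_t)(raw[1] | (raw[2] << 8));
    h.paras = (uint16_t)(raw[3] | (raw[4] << 8));
    return h;
}

int arena_is_valid(const arena_header *h)
{
    return h->kind == 'M' || h->kind == 'Z';
}

int arena_is_free(const arena_header *h)
{
    return h->owner == 0;
}
```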

  MS-DOS inspects the chain of arena headers whenever a program requests a
  memory-block allocation, modification, or release function, or when a
  program is EXEC'd or terminated. If any of the blocks appear to be
  corrupted or if the chain is broken, MS-DOS displays the dreaded message

  Memory allocation error

  and halts the system.

  In the example illustrated in Figure 11-3, COMMAND.COM originally loaded
  PROGRAM1.COM into the TPA and, because it was a .COM file, COMMAND.COM
  allocated it all of the TPA, controlled by arena header #1. PROGRAM1.COM
  then used Int 21H Function 4AH (Resize Memory Block) to shrink its memory
  allocation to the amount it actually needed to run and loaded and executed
  PROGRAM2.EXE with the EXEC function (Int 21H Function 4BH). The EXEC
  function obtained a suitable amount of memory, controlled by arena header
  #2, and loaded PROGRAM2.EXE into it. PROGRAM2.EXE, in turn, needed some
  additional memory to store some intermediate results, so it called Int 21H
  Function 48H (Allocate Memory Block) to obtain the area controlled by
  arena header #3. The highest arena header (#4) controls all of the
  remaining TPA that has not been allocated to any program.

  ┌─────────────────────────────────────────────────┐◄ Top of RAM
  │       Unowned RAM controlled by header #4       │  controlled by MS-DOS
  ├─────────────────────────────────────────────────┤
  │                 Arena header #4                 │
  ├─────────────────────────────────────────────────┤
  │ Memory area controlled by header #3; additional │
  │  storage dynamically allocated by PROGRAM2.EXE  │
  ├─────────────────────────────────────────────────┤
  │                 Arena header #3                 │
  ├─────────────────────────────────────────────────┤
  │      Memory area controlled by header #2,       │
  │             containing PROGRAM2.EXE             │
  ├─────────────────────────────────────────────────┤
  │                 Arena header #2                 │
  ├─────────────────────────────────────────────────┤
  │      Memory area controlled by header #1,       │
  │             containing PROGRAM1.COM             │
  ├─────────────────────────────────────────────────┤
  │                 Arena header #1                 │
  └─────────────────────────────────────────────────┘◄ Bottom of transient-
                                                       program area

  Figure 11-3.  An example diagram of MS-DOS arena headers and the
  transient-program area. The environment blocks and their associated
  headers have been omitted from this figure to increase its clarity.
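
  A chain such as the one in Figure 11-3 can be followed by stepping from
  each header to the paragraph just past the area it controls. The C sketch
  below does this on a simulated memory image, never on real MS-DOS memory.
  It assumes the commonly reported header layout (signature byte 'M' or 'Z'
  at offset 0, owner word at offset 1, size word at offset 3), which the
  text above does not confirm; the function names are hypothetical.

```c
#include <stdint.h>

#define PARA 16u   /* bytes per paragraph */

/* Write a header into the simulated image at paragraph `seg`. */
void put_arena_header(uint8_t *mem, uint32_t seg, char kind,
                      uint16_t owner, uint16_t paras)
{
    uint8_t *h = mem + seg * PARA;
    h[0] = (uint8_t)kind;
    h[1] = (uint8_t)(owner & 0xFFu);
    h[2] = (uint8_t)(owner >> 8);
    h[3] = (uint8_t)(paras & 0xFFu);
    h[4] = (uint8_t)(paras >> 8);
}

/* Walk the chain that starts at paragraph `first` of an image that is
   `total_paras` paragraphs long. Returns the number of headers visited,
   or -1 if a header is corrupted or the chain is broken. */
int walk_arena_chain(const uint8_t *mem, uint32_t total_paras,
                     uint32_t first)
{
    uint32_t seg = first;
    int count = 0;

    for (;;) {
        if (seg >= total_paras)
            return -1;                       /* chain ran off the end */
        const uint8_t *h = mem + seg * PARA;
        uint32_t paras = (uint32_t)h[3] | ((uint32_t)h[4] << 8);

        if (h[0] != 'M' && h[0] != 'Z')
            return -1;                       /* bad signature byte */
        count++;
        if (h[0] == 'Z')
            return count;                    /* last entry reached */
        seg += 1u + paras;                   /* skip header and its area */
    }
}
```

  Building four headers that mirror Figure 11-3 and walking them yields a
  count of four; overwriting any signature byte makes the walk fail, which
  is the condition MS-DOS reports as a memory allocation error.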


Lotus/Intel/Microsoft Expanded Memory

  When the IBM Personal Computer and MS-DOS were first released, the 640 KB
  limit that IBM placed on the amount of RAM that could be directly managed
  by MS-DOS seemed almost unimaginably huge. But as MS-DOS has grown in both
  size and capabilities and the popular applications have become more
  powerful, that 640 KB has begun to seem a bit crowded. Although personal
  computers based on the 80286 and 80386 have the potential to manage up to
  16 megabytes of RAM under operating systems such as MS OS/2 and XENIX,
  this is little comfort to the millions of users of 8086/8088-based
  computers and MS-DOS.

  At the spring COMDEX in 1985, Lotus Development Corporation and Intel
  Corporation jointly announced the Expanded Memory Specification 3.0 (EMS),
  which was designed to head off rapid obsolescence of the older PCs because
  of limited memory. Shortly afterward, Microsoft announced that it would
  support the EMS and would enhance Microsoft Windows to use the memory made
  available by EMS hardware and software. EMS versions 3.2 and 4.0, released
  in fall 1985 and summer 1987, expanded support for multitasking operating
  systems.

  The LIM EMS (as it is usually known) has been an enormous success. EMS
  memory boards are available from scores of manufacturers, and "EMS-aware"
  software──especially spreadsheets, disk caches, and terminate-and-stay-
  resident utilities──has become the rule rather than the exception.

What Is Expanded Memory?

  The Lotus/Intel/Microsoft Expanded Memory Specification is a functional
  definition of a bank-switched memory-expansion subsystem. It consists of
  hardware expansion modules and a resident driver program specific to those
  modules. In EMS versions 3.0 and 3.2, the expanded memory is made
  available to application software as 16 KB pages mapped into a contiguous
  64 KB area called the page frame, somewhere above the main memory area
  used by MS-DOS/PC-DOS (0─640 KB). The exact location of the page frame is
  user configurable, so it need not conflict with other hardware options. In
  EMS version 4.0, the pages may be mapped anywhere in memory and can have
  sizes other than 16 KB.

  The EMS provides a uniform means for applications to access as much as 8
  megabytes of memory (32 megabytes in EMS 4.0). The supporting software,
  which is called the Expanded Memory Manager (EMM), provides a
  hardware-independent interface between application software and the
  expanded memory board(s). The EMM is supplied in the form of an
  installable device driver that you link into the MS-DOS/PC-DOS system by
  adding a line to the CONFIG.SYS file on the system boot disk.
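
  The page and page-frame sizes quoted above for EMS versions 3.0 and 3.2
  imply some simple arithmetic, sketched here in C. The function names are
  hypothetical, and the page-frame segment passed in by the caller is
  whatever the user configured; the D000H value mentioned below is only an
  example.

```c
#include <stdint.h>

#define PAGE_BYTES       (16u * 1024u)               /* one EMS page */
#define FRAME_BYTES      (64u * 1024u)               /* page frame */
#define PAGES_PER_FRAME  (FRAME_BYTES / PAGE_BYTES)  /* physical pages */
#define PAGE_PARAS       (PAGE_BYTES / 16u)          /* 400H paragraphs */

/* Segment address of physical page `n` (0 through 3) in the page frame. */
uint16_t physical_page_segment(uint16_t frame_seg, unsigned n)
{
    return (uint16_t)(frame_seg + n * PAGE_PARAS);
}

/* Number of 16 KB logical pages in `megabytes` of expanded memory. */
uint32_t logical_pages(uint32_t megabytes)
{
    return megabytes * 1024u * 1024u / PAGE_BYTES;
}
```

  For a page frame at segment D000H, the four physical pages begin at
  segments D000H, D400H, D800H, and DC00H. Eight megabytes of expanded
  memory corresponds to 512 logical pages, and 32 megabytes to 2048.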

  Internally, the Expanded Memory Manager consists of two major portions,
  which may be referred to as the driver and the manager. The driver portion
  mimics some of the actions of a genuine installable device driver, in that
  it includes initialization and output status functions and a valid device
  header. The second, and major, portion of the EMM is the true interface
  between application software and the expanded-memory hardware. Several
  classes of services are provided:

  ■  Verification of functionality of hardware and software modules

  ■  Allocation of expanded-memory pages

  ■  Mapping of logical pages into the physical page frame

  ■  Deallocation of expanded-memory pages

  ■  Support for multitasking operating systems

  Application programs communicate with the EMM directly, by means of
  software Int 67H. MS-DOS versions 3.3 and earlier take no part in (and in
  fact are completely oblivious to) any expanded-memory manipulations that
  may occur. MS-DOS version 4.0 and Microsoft Windows, on the other hand,
  are "EMS-aware" and can use the EMS memory when it is available.

  Expanded memory should not be confused with extended memory. Extended
  memory is the term used by IBM to refer to the memory at physical
  addresses above 1 megabyte that can be accessed by an 80286 or 80386 CPU
  in protected mode. Current versions of MS-DOS run the 80286 and 80386 in
  real mode (8086-emulation mode), and extended memory is therefore not
  directly accessible.

Checking for Expanded Memory

  An application program can use either of two methods to test for the
  existence of the Expanded Memory Manager:

  ■  Issue an open request (Int 21H Function 3DH) using the guaranteed
     device name of the EMM driver: EMMXXXX0. If the open function succeeds,
     either the driver is present or a file with the same name
     coincidentally exists on the default disk drive. To rule out the
     latter, the application can use IOCTL (Int 21H Function 44H)
     subfunction 00H to confirm that the handle refers to a character
     device and subfunction 07H to confirm that the EMM is available. In
     either case,
     the application should then use Int 21H Function 3EH to close the
     handle that was obtained from the open function, so that the handle can
     be reused for another file or device.

  ■  Use the address that is found in the Int 67H vector to inspect the
     device header of the presumed EMM. Interrupt handlers and device
     drivers must use this method. If the EMM is present, the name field at
     offset 0AH of the device header contains the string EMMXXXX0. This
     approach is nearly foolproof and avoids the relatively high overhead of
     an MS-DOS open function. However, it is somewhat less well behaved
     because it involves inspection of memory that does not belong to the
     application.

  These two methods of testing for the existence of the Expanded Memory
  Manager are illustrated in Figures 11-4 and 11-5.

  ──────────────────────────────────────────────────────────────────────────
            .
            .
            .
                                 ; attempt to "open" EMM...
            mov  dx,seg emm_name ; DS:DX = address of name
            mov  ds,dx           ; of Expanded Memory Manager
            mov  dx,offset emm_name
            mov  ax,3d00h        ; function 3dh, mode = 00h
                                 ; = open, read only
            int  21h             ; transfer to MS-DOS
            jc   error           ; jump if open failed

                                 ; open succeeded, be sure
                                 ; it was not a file...
            mov  bx,ax           ; BX = handle from open
            mov  ax,4400h        ; function 44h subfunction 00h
                                 ; = IOCTL get device information
            int  21h             ; transfer to MS-DOS
            jc   error           ; jump if IOCTL call failed
            and  dx,80h          ; bit 7 = 1 if character device
            jz   error           ; jump if it was a file

                                 ; EMM is present, be sure
                                 ; it is available...
                                 ; (BX still contains handle)
            mov  ax,4407h        ; function 44h subfunction 07h
                                 ; = IOCTL get output status
            int  21h             ; transfer to MS-DOS
            jc   error           ; jump if IOCTL call failed
            or   al,al           ; test device status
            jz   error           ; if AL = 0 EMM is not available
                                 ; now close handle ...
                                 ; (BX still contains handle)
            mov  ah,3eh          ; function 3eh = close
            int  21h             ; transfer to MS-DOS
            jc   error           ; jump if close failed
            .
            .
            .
  emm_name  db   'EMMXXXX0',0    ; guaranteed device name for
                                 ; Expanded Memory Manager
  ──────────────────────────────────────────────────────────────────────────

  Figure 11-4.  Testing for the Expanded Memory Manager by means of the
  MS-DOS open and IOCTL functions.
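
  The two IOCTL tests in Figure 11-4 reduce to a small decision that can be
  written in C. The sketch assumes the caller has already obtained the
  values that the IOCTL calls return in DX and AL; the function name
  emm_usable and the constant name are hypothetical.

```c
#include <stdint.h>

#define DEVINFO_IS_DEVICE 0x0080u   /* bit 7 of the device information word */

/* dev_info:   DX as returned by IOCTL subfunction 00H.
   out_status: AL as returned by IOCTL subfunction 07H.
   Returns 1 if the handle refers to an available EMM, otherwise 0. */
int emm_usable(uint16_t dev_info, uint8_t out_status)
{
    if ((dev_info & DEVINFO_IS_DEVICE) == 0)
        return 0;    /* the handle refers to a file named EMMXXXX0 */
    if (out_status == 0)
        return 0;    /* the device exists but is not available */
    return 1;
}
```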

  ──────────────────────────────────────────────────────────────────────────
  emm_int   equ  67h            ; Expanded Memory Manager
                                ; software interrupt
            .
            .
            .
                                ; first fetch contents of
                                ; EMM interrupt vector...
            mov  al,emm_int     ; AL = EMM int number
            mov  ah,35h         ; function 35h = get vector
            int  21h            ; transfer to MS-DOS
                                ; now ES:BX = handler address

                                ; assume ES:0000 points
                                ; to base of the EMM...
            mov  di,10          ; ES:DI = address of name
                                ; field in device header
                                ; DS:SI = EMM driver name
            mov  si,seg emm_name
            mov  ds,si
            mov  si,offset emm_name
            mov  cx,8           ; length of name field
            cld
            repz cmpsb          ; compare names...
            jnz  error          ; jump if driver absent
            .
            .
            .


  emm_name  db   'EMMXXXX0'     ; guaranteed device name for
                                ; Expanded Memory Manager
  ──────────────────────────────────────────────────────────────────────────

  Figure 11-5.  Testing for the Expanded Memory Manager by inspection of the
  name field in the driver's device header.
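
  The comparison performed by Figure 11-5 can also be expressed in C. This
  sketch operates on a copy of the bytes at offset 0000H of the segment
  found in the Int 67H vector, so it can run without touching memory that
  belongs to another program; how that copy is obtained is left to the
  caller. The function name is_emm_header and the constant names are
  hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DEVHDR_NAME_OFFSET 0x0A   /* name field of the device header */
#define DEVHDR_NAME_LEN    8

/* segment_image: copy of the bytes starting at offset 0000H of the
   segment taken from the Int 67H vector; len is the number of bytes.
   Returns 1 if the name field holds EMMXXXX0, otherwise 0. */
int is_emm_header(const uint8_t *segment_image, size_t len)
{
    static const char emm_name[DEVHDR_NAME_LEN + 1] = "EMMXXXX0";

    if (len < DEVHDR_NAME_OFFSET + DEVHDR_NAME_LEN)
        return 0;                 /* image too short to hold the name */
    return memcmp(segment_image + DEVHDR_NAME_OFFSET,
                  emm_name, DEVHDR_NAME_LEN) == 0;
}
```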


Using Expanded Memory

  After establishing that the memory-manager software is present, the
  application program communicates with it directly by means of the "user
  interrupt" 67H, bypassing MS-DOS/PC-DOS. The calling sequence for the EMM
  is as follows:

  ──────────