Decode the analog waveform extracted from old computer mag tapes
LenShustek/readtape
This is a program for recovering data from old magnetic computer tapes by digitizing the analog waveforms and then using software to decode the bits and reconstruct the original data.

The objective is to correctly read tapes that have degraded beyond the point where conventional tape drives will work, or for which drives are no longer available. We have had good success using this for tapes in the collection of the Computer History Museum.

This lives at https://github.com/LenShustek/readtape.
For a slide show about the system, see VCF_Aug2020_01.pdf.
For a low-quality video of me giving a talk about it, see https://www.youtube.com/watch?v=7YoolSAHR5w&t=4200s.
For lots of detailed information about how to use it, and how it works, see A_documentation.txt.
For a detailed change log, see the beginning of src\readtape.c.

We record data using a 16-channel Saleae digital/analog logic analyzer (https://www.saleae.com) connected to any of several computer-grade magnetic tape decks, including the Qualstar 1052 with either a 9-track or 7-track head installed, the Qualstar 3418S, and the Kennedy 9600.

The input is typically taken from the output of a differential amplifier that produces a ±2V or larger analog signal for reasonable tapes. We ignore the drive's subsequent circuitry (thresholding, zero-crossing identification, envelope detection, etc.) that tries to recreate the data bits on the tape.

I had first designed and prototyped a custom 9-channel A-to-D converter using the amazing Cirrus/Wolfson WM8235 9-channel analog front end, http://www.mouser.com/ds/2/76/WM8235_v4.2-532446.pdf. But I was unable to find the right setting for their 300 configuration registers that made it work as a normal A-to-D converter, and their tech support would not respond to inquiries. The remnants of that suspended project are in the front_end directory.
The data exported by the logic analyzer is a comma-separated-value (CSV) file whose lines each contain a timestamp and the voltages for all the read heads. We set the sampling rate to generate about 20 samples per cycle. For 800 BPI NRZI tapes read at 50 IPS, the Saleae 781 kHz rate works well.

But the CSV files can be huge -- many tens of gigabytes for a few minutes of recording -- so for archival purposes we've defined a binary compressed "TBIN" format, and I wrote the utility program "csvtbin" that can convert between CSV and TBIN. The "readtape" decoding program can read either format. The compression using TBIN is about 10:1, and it speeds up decoding by about 2x.

The output of the decoding can include:
 - a log file
 - multiple binary files of the reconstructed data separated at filemarks, or
 - one SIMH .tap file that encodes data and filemarks
   (see http://simh.trailing-edge.com/docs/simh_magtape.pdf)
 - a text file in various formats of readable numeric and character interpretation
 - a CSV file with data showing peak dispersion after track deskewing
 - a CSV file, in DEBUG mode, that recreates one or all tracks of data with
   information about the state of the decoding, like peaks detected

So far we support 7-track NRZI format; 9-track NRZI, PE, and GCR formats; and, most recently, the bizarre 6-track tapes that were written on the vacuum-tube Whirlwind I computer. The museum has over a hundred of those, and we have had remarkable success (about 95%) in recovering data and programs that have been unread and unexamined for fifty years.

*** The files in this repository

(GitHub ought to allow the file list of a repository to say what the files *are*, not what the last minor edit was!
https://github.community/t5/How-to-use-Git-and-GitHub/Naive-question-about-describing-files/td-p/7532)

---DOCUMENTATION

 A_documentation.txt              A narrative about usage and internal operation
 A_experiences.txt                Some (old) anecdotes about what we have done with this
 AtoD_attachment.jpg              A photo showing how the analyzer connects to the drive
 example_01.pdf                   An example of a really bad block we can decode
 flux_transition_dispersion.jpg   A graph showing the effect of head skew
 VCF_Aug2020_01.pdf               The slide show about the project
 src\readtape.c                   The main source file, which also has the complete change
                                  log and notes about the internal program structure

---READTAPE source code

 src\readtape.c      main program: options, file handling, and block processing
 src\decoder.h       compile-time options, and common declarations
 src\csvtbin.h       the format of the .tbin compressed binary data file
 src\decoder.c       common routines for analog sample analysis and decoding
 src\decode_pe.c     PE (phase encoded) decoding routines
 src\decode_nrzi.c   NRZI (non-return-to-zero-inverted) decoding routines
 src\decode_gcr.c    GCR (group coded recording) decoding routines
 src\decode_ww.c     Whirlwind I 6-track decoding routines
 src\parmsets.c      parameter set processing, and their defaults
 src\textfile.c      interpreted text dump of the data
 src\ibmlabels.c     IBM 9-track standard label (SL) interpretation
 src\trace.c         create debugging output and spreadsheet graphs
 src\tapread.c       a .tap file reader in support of the -tapread option

---UTILITY PROGRAMS

 src\csvtbin.c       a program for converting between CSV and TBIN files
 src\dumptap.c       a deprecated program for dumping SIMH .tap files
                     (but this functionality, expanded, is now an option in readtape)

---BINARIES

 bin\readtape.exe    readtape Windows 64-bit (x64) executable
 bin\csvtbin.exe     csvtbin Windows 64-bit (x64) executable

---TEST DATA

 examples\README.txt   a directory with test magnetic tape data and decodes
                       and batch files to run them automatically

*** Thanks to:
 - Paul Pierce for the original inspiration of his similar work 10+ years ago.
 - Grant Saviers for detailed consulting on tape nitty-gritties.
 - Al Kossow for the tape drive, for making lots of good suggestions (Saleae,
   .tap format, compressed files, etc.), and for carefully reading many, many
   tapes from the collection of the Computer History Museum.

Len Shustek
6 Feb 2018
17 May 2018, 27 May 2018, 8 Oct 2018
4 Aug 2019, 29 Dec 2019, 28 Feb 2022, 21 Jun 2022, 20 Jul 2022
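
*** A worked example of the sampling-rate guideline

As a rough check on the sampling-rate guideline given earlier (about 20 samples per cycle, with the Saleae 781 kHz rate working well for 800 BPI NRZI tapes read at 50 IPS), the arithmetic can be sketched in C. This is only an illustration and is not code from readtape: the function names bit_rate_hz and samples_per_bit are hypothetical, and treating one bit cell as one "cycle" is an assumption.

```c
#include <assert.h>

/* Bits passing the read head per second on one track:
   recording density (bits per inch) times tape speed (inches per second). */
double bit_rate_hz(double bits_per_inch, double inches_per_second) {
    assert(bits_per_inch > 0.0 && inches_per_second > 0.0);
    return bits_per_inch * inches_per_second;
}

/* Number of analyzer samples that fall within one bit cell.
   Assumption: one bit cell is what the text calls a "cycle". */
double samples_per_bit(double sample_rate_hz,
                       double bits_per_inch,
                       double inches_per_second) {
    assert(sample_rate_hz > 0.0);
    return sample_rate_hz / bit_rate_hz(bits_per_inch, inches_per_second);
}
```

For example, samples_per_bit(781000.0, 800.0, 50.0) divides 781,000 samples per second by 800 x 50 = 40,000 bits per second, giving roughly 19.5 samples per bit, which is consistent with the "about 20" guideline.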