Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tzx: Add suport for ZX Spectrum TZX and TAP files #975

Merged
merged 15 commits into from
Aug 8, 2024
Merged

Conversation

mrcook
Copy link
Contributor

@mrcook mrcook commented Aug 1, 2024

For those not familiar with the ZX Spectrum, this was a popular 1980's home micro in Europe and various other countries, and sold in excess of 5 million units. Most software was distributed on cassette tapes and these were encoded as TZX or TAP formats for use in emulators.

This PR adds decoders for these formats based on the specification, and also my previous efforts in writing a TZX/TAP decoder.

The test file includes a simple BASIC program and was exported from the FUSE emulator -- the TZX was manually edited to add some "archive info" (metadata).

Perhaps in a future update I could add functionality to decode the BASIC programs.

Copy link
Owner

@wader wader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks! it's a bit late in sweden so i've only done quick skim through but it looks very good 👍 will have a closer look tomorrow.

Is there some large archive of tap and txz files one could stress test the decoders on?

@@ -0,0 +1,21 @@
$ fq -d tap dv basic_prog1.tap
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to add some instruction how the tap and tzx test files were created? if very simple just a comment in the .fqtest-file otherwise maybe a README.md in testdata?


### References

- https://worldofspectrum.net/zx-modules/fileformats/tapformat.html
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you want to add a author section to tap.md and tzx.md? ex https://github.com/wader/fq/blob/master/format/fit/fit.md?plain=1#L14-L15

// Sampling rate
d.FieldU24("sample_rate")
// Compression type
d.FieldU8("compression_type", scalar.UintMapSymStr{0x01: "RLE", 0x02: "Z-RLE"})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've usually tried to keep symbol mapped strings to be jq-ish, that is snake_case, but if the format spec etc is very clear about the naming then keep it think.

0x10| 00 0a 14 00 20 f5 22 66| .... ."f| data: raw bits 0x18-0x3e (38)
0x20|71 20 69 73 20 74 68 65 20 62 65 73 74 21 22 0d|q is the best!".|
0x30|00 14 0a 00 ec 31 30 0e 00 00 0a 00 00 0d |.....10....... |
0x30| b6| | .|| checksum: 0xb6 0x3e-0x3f (1)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a $ fq -d tap -o read_one_block=true dv basic_prog1.tap test here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, on running this all blocks are still being decoded. I'll take another look at this tonight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick debug I notice that the ReadOneBlock flag is not being set here:

var ti format.TAP_In
d.ArgAs(&ti)

Is my implementation missing something?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap see my comment for tap.go

@wader
Copy link
Owner

wader commented Aug 2, 2024

Failures all.fqtest i think is just to run tests with -update.

The failure in the zip tests i think is about that the tap decoder is in the probe group and succeeds on empty input (for !d.End() { ... } does no iteration), so maybe remove from probe group, if it's shaky to probe, or maybe make sure that at least one block was decoded? but it seems there is no real signature for a block so might not be possible? 🤔

@mrcook
Copy link
Contributor Author

mrcook commented Aug 2, 2024

Is there some large archive of tap and txz files one could stress test the decoders on?

Hi Matthias, the go to place these days is https://spectrumcomputing.co.uk. I've tested over 50 tape images myself: mostly TZX but a dozen or so TAPs too.

Most tapes seems to be just the Standard Speed Data and Turbo Speed Data blocks, and I've struggled to find many TZX with the more interest blocks.

There are some collections available on the Internet Archive, so I plan to do some bulk testing at some point.

@mrcook
Copy link
Contributor Author

mrcook commented Aug 2, 2024

The failure in the zip tests i think is about that the tap decoder is in the probe group and succeeds on empty input (for !d.End() { ... } does no iteration), so maybe remove from probe group, if it's shaky to probe, or maybe make sure that at least one block was decoded? but it seems there is no real signature for a block so might not be possible? 🤔

Yep, and in fact the specs state: "The TAP file may be empty. Then it has a size of 0 bytes".

I hadn't looked into what that probe group does, I just started with a copy of the format/nes decoder 🙃 I'll try removing it and see if that works.

@wader
Copy link
Owner

wader commented Aug 2, 2024

There are some collections available on the Internet Archive, so I plan to do some bulk testing at some point.

Great! if you end up writing some script or has some snippet to do it you can include that in testdata/README.md also. I've usually tried to document test data well so that me, you or someone in the future has some resonable chance at updating them :)

@wader
Copy link
Owner

wader commented Aug 2, 2024

Yep, and in fact the specs state: "The TAP file may be empty. Then it has a size of 0 bytes".

I hadn't looked into what that probe group does, I just started with a copy of the format/nes decoder 🙃 I'll try removing it and see if that works.

Aha i see. Being part of the "probe group" just means that fq will, if no format or group is specific with -d, try to decode using all format in that group and the first format to succeed will be selected. So that is why it's important to not succeed on empty input and also try to fail fast if part of the probe group. Also some archive formats like zip uses the probe group to do nested decoding.

About nes decoder: Oh then i should probably have a look at it so it does not have the same issue.

format.TAP,
&decode.Format{
Description: "TAP tape format for ZX Spectrum computers",
DecodeFn: tapDecoder,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this and i think the read_one_block option should work:

			DefaultInArg: format.TAP_In{
				ReadOneBlock: false,
			},

It's how the fq internals know the type and also default values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did initially think to add this, but then figured, it's false by default so no need to add it :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha :) i've thought about making Format generic somehow then the in/out types could be passed as use zero value... but maybe not worth it? maybe will refactor this in the future

Inside the emulated ZX Spectrum a BASIC program was created:

```
10 PRINT "fq is the best!"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

// A single TAP Data Block
peekBytes := d.PeekBytes(2) // get the TAP data block length
length := uint16(peekBytes[1])<<8 | uint16(peekBytes[0]) // bytes are stored in LittleEndian
length += 2 // include the two bytes for this value
Copy link
Owner

@wader wader Aug 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the ReadOneBlock option or could the length*8 limit and the end check be enough? or just to skip the blocks array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! In retrospect it even makes sense for these embedded tap blocks would be an array. As the TAP spec states: "a TAP file is simply one datablock or a group of 2 or more datablocks".

},
}

blockType := d.FieldU8("block_type", blockTypeMapper)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe can be just "type" if it's inside a block struct?

case 0:
// fragment with no data
case 1:
d.FieldRawLen("data", 8)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible future change: Are these data fragments (and data below) something that can be joined into something decodable? if so fq can append things together and decode a "sub"-buffer of some format or just provide a large raw field

}

// Simply all bytes XORed (including flag byte).
d.FieldU8("checksum", scalar.UintHex)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fq has some support for checksums if you like to add it

fq/format/png/png.go

Lines 232 to 233 in cdba38d

d.Copy(chunkCRC, bitio.NewIOReader(d.BitBufRange(crcStartPos, d.Pos()-crcStartPos)))
d.FieldU32("crc", d.UintValidateBytes(chunkCRC.Sum(nil)), scalar.UintHex)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately obvious to me how I'm going implement this, so I won't get this done tonight.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok but no worries, is up to you. Could be done later if it's something you want

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I'd have a quick look before I go to bed and put something together.

@wader
Copy link
Owner

wader commented Aug 4, 2024

Just some minor comments that you might want to do, otherwise i think it look good as it is now. And can always do more changes later on, i like to iterate and not let PRs drag on too long. Let me know when you feel your happy with the code and testing.

BTW what's your use case for this? just curious how the formats work or using to do some aggregations or querying for ROMs features etc?

@mrcook
Copy link
Contributor Author

mrcook commented Aug 5, 2024

Just some minor comments that you might want to do, otherwise i think it look good as it is now. And can always do more changes later on, i like to iterate and not let PRs drag on too long. Let me know when you feel your happy with the code and testing.

I agree that iterating is better than having long running PRs. Your new comments can certainly be considered improvements rather than requirements.

Whatever you think is best. :)

BTW what's your use case for this? just curious how the formats work or using to do some aggregations or querying for ROMs features etc?

Exactly that; extracting metadata, features, etc. The archive info (title, publisher, etc.) is the obvious one, but also the more interesting blocks like Hardware Type. Possibly this could be used for verifying/updating the https://github.com/zxdb/ZXDB

I already have the core of a BASIC decoder written, so in the future I'd also like to add that.

@wader
Copy link
Owner

wader commented Aug 5, 2024

Just some minor comments that you might want to do, otherwise i think it look good as it is now. And can always do more changes later on, i like to iterate and not let PRs drag on too long. Let me know when you feel your happy with the code and testing.

I agree that iterating is better than having long running PRs. Your new comments can certainly be considered improvements rather than requirements.

Whatever you think is best. :)

The smaller things maybe in this PR, the larger about fragments etc maybe later if you feel motivated.

BTW what's your use case for this? just curious how the formats work or using to do some aggregations or querying for ROMs features etc?

Exactly that; extracting metadata, features, etc. The archive info (title, publisher, etc.) is the obvious one, but also the more interesting blocks like Hardware Type. Possibly this could be used for verifying/updating the https://github.com/zxdb/ZXDB

Sounds similar to my use case for media files, diff, aggregate and query for very things. But i'll say a majority of the decoders in fq exist mostly out of curiosity :)

I already have the core of a BASIC decoder written, so in the future I'd also like to add that.

👍 I've done some experiments with decoding ISA:s to be something like a "structured disassembler", see #215, maybe can give some inspiration how it might work.

@mrcook
Copy link
Contributor Author

mrcook commented Aug 5, 2024

The smaller things maybe in this PR, the larger about fragments etc maybe later if you feel motivated.

I should be able to take a look at that tomorrow evening.

@wader
Copy link
Owner

wader commented Aug 6, 2024

test -update should fix CI error

@wader
Copy link
Owner

wader commented Aug 7, 2024

Looks good. Ready to merge or you wanted to do some more testing?

@mrcook
Copy link
Contributor Author

mrcook commented Aug 8, 2024

I'd be happy with what we have here now, so yes, feel free to merge it.

@wader wader merged commit 7fca197 into wader:master Aug 8, 2024
5 checks passed
@mrcook
Copy link
Contributor Author

mrcook commented Aug 8, 2024

Thanks Matthias! I also wanted to thank you for all the help in making my contribution better.

Last but no least, thanks for creating this great tool!!

@wader
Copy link
Owner

wader commented Aug 8, 2024

Glad to hear that 😊 and looking forward to future contributions! and if you have questions about how to use fq in general i'm happy to help, always interesting and useful to know how ppl are using it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants