Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unpack() function deals 1-bit data differently with ewanbarr/sigpyproc #26

Closed
xxgt opened this issue Oct 28, 2022 · 3 comments · Fixed by #35
Closed

unpack() function deals 1-bit data differently with ewanbarr/sigpyproc #26

xxgt opened this issue Oct 28, 2022 · 3 comments · Fixed by #35
Labels
bug Something isn't working

Comments

@xxgt
Copy link

xxgt commented Oct 28, 2022

Hi,
I found that this repository unpacks 1-bit data in big endian order, however ewanbarr/sigpyproc reacts differently in this case:
https://github.com/ewanbarr/sigpyproc/blob/54a804200723d30601026be5bfa37ec90c8266c1/c_src/libSigPyProc.c#L23-L27
I wonder if this is meant to be?

@ewanbarr
Copy link
Contributor

ewanbarr commented Apr 2, 2024

It is unclear if this is a bug or a convention. The implementation in the original sigpyproc was a direct port of the implementation in sigproc (see https://github.com/SixByNine/sigproc/blob/master/src/readchunk.c).

There the 1-bit unpacker explicitly unpacks in little-endian order.

    /* unpack the sample(s) if necessary */
    switch (nbits) {
    case 1:
      fread(&c,1,1,input);
      for (i=0;i<8;i++) {
	f[i]=c&1;
	c>>=1;
      }
      ns=8;
      break;

The sigproc documentation states:

Multi-byte precision data are written in different orders depending on your machine’s operating system. The original WAPP data, for example, was written on a PC (little endian format). The filterbank program knows about this and automatically does any byte swapping required while reading. When it comes to writing the data out, however, the program will always write data in the native order of the processing machine. To swap the bytes around before writing for use on other machines, use the-swapout option.

The endianness of the data depends on the recording instrument and it should probably be more explicitly set when reading the files.

@pravirkr
Copy link
Collaborator

pravirkr commented Apr 3, 2024

This is a convention so that we have the same bitorder for low-bit data. Looks like this need not be true.

So, the sigproc codebase follow different bitorder convention for 1-bit data? There is no header keyword defined to infer the bitorder from the filterbank file. The -swapout option only controls the writing convention.

I think using endianness as a parameter flag (bitorder) for these read/write functions and having both kind of unpacking implementation can be a good strategy. The flag would then be irrelevant if nbits > 8.

FilReader.read_block(), FilReader.read_plan(), Header.prep_outfile()

Ideally, it should have been a header flag.

Need to reverse/make changes in telegraphic/numbits#6 and telegraphic/numbits#7

@pravirkr pravirkr added the bug Something isn't working label Apr 3, 2024
@ewanbarr
Copy link
Contributor

ewanbarr commented Apr 3, 2024

bitorder is more correct than endianness here so it makes sense to use that terminology.

As a note, the previous discussion about unpack1_8 vs np.unpackbits is no longer valid after PR #34. The previous performance discrepancy was due to the intermediate buffer allocation.

In [72]: %timeit kernels.unpack1_8(in_ar, out_ar)
563 µs ± 2.26 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [73]: %timeit np.unpackbits(in_ar)
646 µs ± 9.54 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants