Updated global README
MatrixEditor committed Dec 25, 2023
1 parent 60976ce commit aec195d
Showing 3 changed files with 84 additions and 96 deletions.
79 changes: 4 additions & 75 deletions README.md
@@ -12,15 +12,15 @@
Caterpillar is a Python 3.12+ library to pack and unpack structured binary data. It
enhances the capabilities of [Python Struct](https://docs.python.org/3/library/struct.html)
by enabling direct class declaration. More information about the different configuration
options will be added in the future. A brief introduction can be found [here >](docs/INTRO.md).
options will be added in the future. A brief introduction can be found [here (outdated)](docs/INTRO.md) - documentation is [here >](https://matrixeditor.github.io/caterpillar/).

*Caterpillar* is able to:

* Pack and unpack data just by processing Python class definitions (including support for bitfields with a C++-like syntax!),
* apply a wide range of data types (with endianness and architecture configuration),
* dynamically adapt structs based on their inheritance layout,
* insert proper types into the class definition to support documentation and
* it helps you to create cleaner and more concise code.
* it helps you to create cleaner and more compact code.
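
For comparison, here is a minimal sketch of the manual work these features are meant to replace, written with only the standard-library `struct` module. The layout (a `MAGIC` constant, a little-endian 16-bit value, a big-endian 32-bit count) and the helper names are assumptions chosen for illustration; none of this is Caterpillar's API.

```python
import struct

MAGIC = b"Foo"  # hypothetical constant prefix, for illustration only


def pack_header(value: int, num_entries: int) -> bytes:
    # "<H" is a little-endian uint16, ">I" is a big-endian uint32.
    return MAGIC + struct.pack("<H", value) + struct.pack(">I", num_entries)


def unpack_header(data: bytes) -> tuple[int, int]:
    # A constant field has to be checked by hand when using plain struct.
    if data[: len(MAGIC)] != MAGIC:
        raise ValueError("unexpected magic bytes")
    (value,) = struct.unpack_from("<H", data, len(MAGIC))
    (num_entries,) = struct.unpack_from(">I", data, len(MAGIC) + 2)
    return value, num_entries


raw = pack_header(1, 1)
print(raw)
print(unpack_header(raw))
```

Every offset and byte order is tracked manually here, which is the bookkeeping a declarative class definition is intended to remove.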

## Installation

@@ -29,80 +29,9 @@ Simply use pip to install the package:
pip install git+https://github.com/MatrixEditor/caterpillar.git
```

## Starting Point

Let's start with a simple example (the full code is available in the `examples/` directory):
writing a parser for the [NIBArchive](https://github.com/matsmattsson/nibsqueeze/blob/master/NibArchive.md)
file format using only Python classes. *It has never been easier!*

1. Step: Create the header struct
```python
from caterpillar.fields import *
from caterpillar.shortcuts import struct, LittleEndian

@struct(order=LittleEndian)
class NIBHeader:
    # Here we define a constant value, which will raise an exception
    # upon a different parsed value.
    magic: b"NIBArchive"

    # Primitive types can be used just like this
    unknown_1: int32
    unknown_2: int32
    # ...
    value_count: int32
    offset_values: int32
    # --- other fields omitted ---
```

2. Step: We want to parse all values, so let's define the corresponding struct:
```python
@struct(order=LittleEndian)
class NIBValue:
    key: VarInt
    # NOTE the use of a default value; otherwise None would be set.
    type: Enum(ValueType, uint8, ValueType.UNKNOWN)
    # The field below describes a simple switch-case structure.
    value: Field(this.type) >> {
        ValueType.INT8: int8,
        ValueType.INT16: int16,
        # --- other options ---
        ValueType.OBJECT_REF: int32,
        # The following line shows how to manually return the parsed value (in
        # this case it would be the result of this.type). NOTE that the value
        # is only stored temporarily in the current context (not this-context).
        #
        # If this option is not specified and none of the above matched the input,
        # an exception will be thrown.
        DEFAULT_OPTION: Computed(ctx._value),
    }
```

3. Step: Define the final file structure
```python
@struct(order=LittleEndian)
class NIBArchive:
    # Other structs can be referenced as simple as this
    header: NIBHeader

    # this field is marked with '@': The parser will jump temporarily
    # to the position specified after the operator. Use | F_KEEP_POSITION to
    # continue parsing from the resulting position
    # --- other fields snipped out ---
    values: NIBValue[this.header.value_count] @ this.header.offset_values
```

4. Step: pack and unpack files
```python
from caterpillar.shortcuts import pack_file, unpack_file

# parse files
obj: NIBArchive = unpack_file(NIBArchive, "/path/to/file.nib")
# pack files: Note the use of 'use_tempfile' here. It will first
# write everything into a temporary file and copy it later on.
pack_file(obj, "/path/to/destination.nib", use_tempfile=True)
```
## Starting Point

Please visit the [Documentation](https://matrixeditor.github.io/caterpillar/); it contains a complete tutorial on how to use this library.

## Other Approaches

88 changes: 73 additions & 15 deletions docs/INTRO.md
@@ -174,18 +174,76 @@ class NIBArchive:
> [!CAUTION]
> If packing a very large object with offset-defined fields, consider using a temporary file to avoid excessive memory usage during processing (set `use_tempfile` to `True` when packing, e.g. in `pack_file`).
### Flags `|` and `^`

Flags

### Sequence `[]`

*TODO*

## Context

*TODO*

### Object-Context `this`

### Processing-Context `ctx`
## Starting Point (old)

Let's start with a simple example (the full code is available in the `examples/` directory):
writing a parser for the [NIBArchive](https://github.com/matsmattsson/nibsqueeze/blob/master/NibArchive.md)
file format using only Python classes. *It has never been easier!*

1. Step: Create the header struct
```python
from caterpillar.fields import *
from caterpillar.shortcuts import struct, LittleEndian

@struct(order=LittleEndian)
class NIBHeader:
    # Here we define a constant value, which will raise an exception
    # upon a different parsed value.
    magic: b"NIBArchive"

    # Primitive types can be used just like this
    unknown_1: int32
    unknown_2: int32
    # ...
    value_count: int32
    offset_values: int32
    # --- other fields omitted ---
```

2. Step: We want to parse all values, so let's define the corresponding struct:
```python
@struct(order=LittleEndian)
class NIBValue:
    key: VarInt
    # NOTE the use of a default value; otherwise None would be set.
    type: Enum(ValueType, uint8, ValueType.UNKNOWN)
    # The field below describes a simple switch-case structure.
    value: Field(this.type) >> {
        ValueType.INT8: int8,
        ValueType.INT16: int16,
        # --- other options ---
        ValueType.OBJECT_REF: int32,
        # The following line shows how to manually return the parsed value (in
        # this case it would be the result of this.type). NOTE that the value
        # is only stored temporarily in the current context (not this-context).
        #
        # If this option is not specified and none of the above matched the input,
        # an exception will be thrown.
        DEFAULT_OPTION: Computed(ctx._value),
    }
```
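
The switch-case behaviour described in the comments above can be sketched without the library as a lookup table keyed by the parsed type tag. This is only an illustration of the idea, not how Caterpillar implements it: the `ValueType` members and their numeric values, the `FORMATS` table, and `read_value` are all hypothetical.

```python
import struct
from enum import IntEnum
from io import BytesIO


class ValueType(IntEnum):
    # Hypothetical subset of type tags, for illustration only.
    INT8 = 0
    INT16 = 1
    OBJECT_REF = 10


# Maps each known tag to a little-endian struct format string.
FORMATS = {
    ValueType.INT8: "<b",
    ValueType.INT16: "<h",
    ValueType.OBJECT_REF: "<i",
}


def read_value(stream):
    """Read a one-byte tag, then the payload selected by that tag."""
    tag = stream.read(1)[0]
    try:
        kind = ValueType(tag)
    except ValueError:
        # Default branch: no payload is read and the raw tag is returned as the value.
        return tag, tag
    fmt = FORMATS[kind]
    (value,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
    return kind, value


print(read_value(BytesIO(b"\x01\x34\x12")))  # known tag: INT16 payload follows
print(read_value(BytesIO(b"\x63")))          # unknown tag: falls back to the default branch
```

Without the default branch, an unknown tag would have to raise an error, which mirrors the exception mentioned in the comments.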

3. Step: Define the final file structure
```python
@struct(order=LittleEndian)
class NIBArchive:
    # Other structs can be referenced as simple as this
    header: NIBHeader

    # this field is marked with '@': The parser will jump temporarily
    # to the position specified after the operator. Use | F_KEEP_POSITION to
    # continue parsing from the resulting position
    # --- other fields snipped out ---
    values: NIBValue[this.header.value_count] @ this.header.offset_values
```
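
The temporary jump performed by `@` resembles a seek-and-restore on the underlying stream. The sketch below shows that pattern with `io.BytesIO` and the standard-library `struct` module; `read_at` and the sample bytes are hypothetical, and the library's internals may well differ.

```python
import struct
from io import BytesIO


def read_at(stream, offset, fmt):
    """Read one struct at an absolute offset, then restore the old position."""
    saved = stream.tell()
    try:
        stream.seek(offset)
        return struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
    finally:
        # Restoring the position is what makes the jump temporary.
        stream.seek(saved)


# Sample layout: a 4-byte little-endian count, 4 filler bytes, two 1-byte values.
data = BytesIO(b"\x02\x00\x00\x00" + b"\xff" * 4 + b"\x0a\x0b")
(count,) = struct.unpack("<i", data.read(4))
values = read_at(data, 8, f"<{count}B")
print(count, values, data.tell())
```

Skipping the restore step in `finally` would correspond to continuing from the resulting position, which is what the comment attributes to `F_KEEP_POSITION`.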

4. Step: pack and unpack files
```python
from caterpillar.shortcuts import pack_file, unpack_file

# parse files
obj: NIBArchive = unpack_file(NIBArchive, "/path/to/file.nib")
# pack files: Note the use of 'use_tempfile' here. It will first
# write everything into a temporary file and copy it later on.
pack_file(obj, "/path/to/destination.nib", use_tempfile=True)
```
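
The comment on `use_tempfile` describes a write-then-copy pattern. How the library implements it is not shown here, so the following is only a general sketch of that pattern using the standard library; `write_via_tempfile` and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def write_via_tempfile(chunks, destination):
    """Write all chunks to a temporary file first, then copy it to destination."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for chunk in chunks:
            tmp.write(chunk)
        tmp_path = tmp.name
    try:
        # The destination is only written once all data has been produced.
        shutil.copyfile(tmp_path, destination)
    finally:
        os.remove(tmp_path)


# Usage: write into a scratch directory so no real file is touched.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "destination.bin")
write_via_tempfile([b"NIB", b"Archive"], out_path)
with open(out_path, "rb") as fh:
    result = fh.read()
print(result)
```

Streaming chunks to disk keeps the full output out of memory, which is the motivation the caution about large objects gives.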
13 changes: 7 additions & 6 deletions docs/source/index.rst
@@ -28,7 +28,8 @@ to write complex structures in a compact and readable manner.
class Format:
    magic: b"Foo"  # constant values
    name: CString(...)  # C-String without a fixed length
    num_entries: uint32  # simple field definition
    value: le + uint16  # little endian encoding
    num_entries: be + uint32  # simple field definition + big endian encoding
    entries: CString[this.num_entries]  # arrays just like that
.. admonition:: Hold up, wait a minute!
@@ -39,16 +40,16 @@ to write complex structures in a compact and readable manner.
Working with defined classes is as straightforward as working with normal classes. *All
constant values are created automatically!*

>>> obj = Format(name="Hello, World!", num_entries=1, entries=["Bar"])
>>> obj = Format(name="Hello, World!", value=1, num_entries=1, entries=["Bar"])
>>> print(obj)
Format(magic=b'Foo', name='Hello, World!', num_entries=1, entries=['Bar'])
Format(magic=b'Foo', name='Hello, World!', value=1, num_entries=1, entries=['Bar'])

Packing and unpacking have never been easier:

>>> pack(obj)
b'FooHello, World!\x00\x01\x00\x00\x00Bar\x00'
>>> unpack(Format, b'FooHello, World!\x00\x01\x00\x00\x00Bar\x00')
Format(magic=b'Foo', name='Hello, World!', num_entries=1, entries=['Bar'])
b'FooHello, World!\x00\x01\x00\x00\x00\x00\x00\x00\x01Bar\x00'
>>> unpack(Format, _)
Format(magic=b'Foo', name='Hello, World!', value=1, num_entries=1, entries=['Bar'])

.. admonition:: What about documentation?

