update readme
dragoon committed Nov 23, 2023
1 parent c3a6629 commit c98a6b1
Showing 1 changed file with 18 additions and 15 deletions.
33 changes: 18 additions & 15 deletions readme.md
@@ -7,15 +7,16 @@ scratch for crypto assets.

Here is a high-level overview of what we are going to cover:

-1. Data collection platform
-2. Signal generation
+1. Real-time data collection system
+2. Historical data processing
3. Backtesting & reporting
4. Unit and integration testing
5. Live trading

-## Data collection platform
+## Data collection system

-The first system we need to build is a data collection platform.
+The first system we are going to build is a real-time data collection system.
[TODO]
For backtesting, we need to collect at least the best bid/ask prices,
and to implement a trading strategy, we need additional features from the order book, so let's discuss it briefly.
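To make this concrete, here is a minimal sketch of the kind of features we can derive from an order-book snapshot. The snapshot layout and function names are assumptions for this example, not the exchange's actual payload:

```python
# Hypothetical order-book snapshot: price -> quantity maps for each side.
book = {
    "bids": {99.5: 2.0, 99.0: 5.0, 98.5: 1.5},
    "asks": {100.0: 1.0, 100.5: 3.0, 101.0: 0.5},
}


def best_bid_ask(book: dict) -> tuple[float, float]:
    """Best bid is the highest bid price, best ask is the lowest ask price."""
    return max(book["bids"]), min(book["asks"])


def depth_totals(book: dict) -> tuple[float, float]:
    """Total quantity resting on each side: a simple order-book feature."""
    return sum(book["bids"].values()), sum(book["asks"].values())


bid, ask = best_bid_ask(book)
print(bid, ask)            # 99.5 100.0
print(depth_totals(book))  # (8.5, 4.5)
```

Real feeds deliver incremental depth updates rather than full snapshots, but the features extracted per interval stay the same.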

@@ -144,7 +145,7 @@ if __name__ == '__main__':
asyncio.run(main("BTC"))

```
-Full initial version is available with tag [v0.1.0-beta](https://github.com/FarawayTech/faraway-finance/tree/v0.1.0-beta)
+Full initial version is available with tag [v0.1.0-beta](https://github.com/dragoon/acts-collector/tree/v0.1.0-beta)
This version is very short and fits into a single Python file, but it fails one important requirement that we defined: the code is completely untestable.

To make it testable, we need to split the implementation into several modules for business logic, data access, etc.,
@@ -153,7 +154,7 @@ This is a typical domain-driven design (DDD) structure that I have used successfully in many projects.
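As a hedged sketch of what that split buys us (class names here are illustrative, not the repository's actual ones): the business logic depends only on a data-access interface, so tests can inject an in-memory fake instead of a real MongoDB connection:

```python
from typing import Protocol


class DataRepository(Protocol):
    """Data-access interface the business logic depends on."""
    def save(self, entry: dict) -> None: ...


class InMemoryRepository:
    """Test double: keeps entries in a list instead of MongoDB."""
    def __init__(self) -> None:
        self.entries: list[dict] = []

    def save(self, entry: dict) -> None:
        self.entries.append(entry)


class CollectorService:
    """Business logic sees only the DataRepository interface."""
    def __init__(self, repository: DataRepository) -> None:
        self.repository = repository

    def process(self, symbol: str, bid: float, ask: float) -> None:
        self.repository.save({"symbol": symbol, "bid": bid, "ask": ask})


# In a unit test, the fake repository lets us assert on stored entries
# without any database running.
repo = InMemoryRepository()
CollectorService(repo).process("BTC", 99.5, 100.0)
print(repo.entries)  # [{'symbol': 'BTC', 'bid': 99.5, 'ask': 100.0}]
```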

### Testable implementation

-The new implementation is available with tag [XXX]().
+The new implementation is available with tag [v0.2.0](https://github.com/dragoon/acts-collector/tree/v0.2.0).
The structure of the project is now as follows:

- **/datacollector/**
@@ -236,15 +237,15 @@ async def collect_data(self):
self.logger.info(f"Starting order book collection for {self.asset_symbol}-USDT")

async for data in self.book_manager.get_data():
-    await self._process_depth_cache(data)
-    retry_count = 0
+    try:
+        await self._process_depth_cache(data)
+        retry_count = 0
+    except pymongo.errors.ConnectionFailure as e:
+        self.logger.error(f"Mongo error: {e}. Sleeping...")
+        await asyncio.sleep(self.retry_delay)
+        # in production the data will always continue
+        break

-    except asyncio.TimeoutError as e:
-        self.logger.error(f"Network error: {e}. Reconnecting...")
-        await asyncio.sleep(self.retry_delay)

except Exception as e:
self.logger.exception(f"An unexpected error occurred: {e}")
retry_count += 1
@@ -258,13 +259,16 @@ async def collect_data(self):
await asyncio.sleep(wait)
```

-In case we get ``asyncio.TimeoutError``, we simply sleep with a constant delay, and then try to re-connect.
In case of other exceptions, we sleep with an exponential backoff delay, and exit completely if the number of retries exceeds a pre-configured value.
+I only catch ``pymongo.errors.ConnectionFailure`` in the main data loop, since we don't want to stop the collection system in case our database temporarily becomes unavailable.

:warning: NB: while network errors are somewhat expected, other exceptions are not,
and the generic exception handler will swallow everything, including errors in your own implementation.
The log monitoring system should be configured to notify the dev team when such errors occur.
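The exponential backoff mentioned above can be sketched as follows; the base delay and retry cap here are illustrative assumptions, not the project's configured values:

```python
def backoff_delays(base_delay: float, max_retries: int) -> list[float]:
    """Delay doubles on every retry: base_delay * 2**attempt."""
    return [base_delay * 2 ** attempt for attempt in range(max_retries)]


# With a 1-second base delay and 5 allowed retries, the collector waits
# up to 16 seconds before giving up and exiting.
print(backoff_delays(1.0, 5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```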

:warning: I also noticed that the ``python-binance`` library swallows most errors and tries to reconnect to the socket internally.
This strategy, however, is not ideal, since it will not raise an error in case of prolonged network issues (more than 15-20 seconds).
You can try it yourself by starting the program and then disabling network access.

Finally, the ``_process_depth_cache`` function checks the elapsed time and sends a data entry for storage at minute intervals.
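That elapsed-time check can be sketched like this (the ``IntervalGate`` class is a hypothetical stand-in, not the actual ``_process_depth_cache`` internals; a one-minute interval is assumed):

```python
from datetime import datetime, timedelta


class IntervalGate:
    """Lets a data entry through only when `interval` has elapsed since the last one."""

    def __init__(self, interval: timedelta) -> None:
        self.interval = interval
        self.last_emit: datetime | None = None

    def should_emit(self, now: datetime) -> bool:
        # Emit on the first call, then only once per full interval.
        if self.last_emit is None or now - self.last_emit >= self.interval:
            self.last_emit = now
            return True
        return False


gate = IntervalGate(timedelta(minutes=1))
t0 = datetime(2023, 11, 23, 12, 0, 0)
print(gate.should_emit(t0))                          # True: first entry is always stored
print(gate.should_emit(t0 + timedelta(seconds=30)))  # False: less than a minute elapsed
print(gate.should_emit(t0 + timedelta(minutes=1)))   # True: a full minute has passed
```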


@@ -315,4 +319,3 @@ while total asks difference remains significant even after many hours.




1 comment on commit c98a6b1

@github-actions
Coverage

Coverage Report
| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| start_data_collector.py | 25 | 25 | 0% | 1–33 |
| datacollector/domain.py | 28 | 0 | 100% | |
| datacollector/repositories/data_repository.py | 6 | 0 | 100% | |
| datacollector/services/collector_service.py | 67 | 11 | 84% | 17, 20–34 |
| datacollector/services/data_process_service.py | 40 | 0 | 100% | |
| datacollector/services/datetime_service.py | 9 | 1 | 89% | 13 |
| **TOTAL** | 175 | 37 | 79% | |

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 6 | 0 💤 | 0 ❌ | 0 🔥 | 1.317s ⏱️ |
