Remove my_posts from chrono_trending
richardr1126 committed Dec 12, 2024
1 parent 05dcc1c commit 672d3b6
Showing 8 changed files with 526 additions and 87 deletions.
171 changes: 171 additions & 0 deletions .dockerignore
@@ -0,0 +1,171 @@
.DS_Store
.idea
*.iml
.env
*.db
*.db-wal
*.db-shm
output.txt

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
bin/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pyvenv.cfg

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
3 changes: 2 additions & 1 deletion Dockerfile
@@ -3,14 +3,15 @@ FROM python:3.12.8-slim-bookworm
# create a volume for the sqlite database, so that it persists between container restarts
# need persistent storage attached to server
VOLUME /var/data/
-WORKDIR /usr/src/app
+WORKDIR /usr/src/app/

# Copy package files and install dependencies
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code
COPY . .

EXPOSE 8000

# Runs when the container is started
60 changes: 29 additions & 31 deletions README.md
@@ -36,13 +36,39 @@ The feed generator uses the following filters to curate content:

The generator offers custom filtering using SQLite and regular expressions to identify Cosmere-related content. It integrates trending posts by calculating interaction scores and maintains the database by cleaning outdated entries with `apscheduler`. Deployment is streamlined with `gunicorn` and managed using `honcho`.
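
As a rough illustration of the two ideas in this paragraph (regex-based matching and an interaction score), here is a minimal Python sketch. The keyword list, the function names `is_relevant` and `interaction_score`, and the score weights are all assumptions made for illustration; the project's actual filters are in `firehose/data_filter.py` and its scoring may differ.

```python
import re

# Hypothetical keyword list, for illustration only; not taken from the repository.
COSMERE_PATTERN = re.compile(r"\b(cosmere|stormlight|mistborn)\b", re.IGNORECASE)


def is_relevant(text: str) -> bool:
    """Return True if the post text mentions one of the assumed keywords."""
    return COSMERE_PATTERN.search(text) is not None


def interaction_score(likes: int, reposts: int, replies: int) -> int:
    """Combine interaction counts into one number (assumed weights)."""
    return likes + 2 * reposts + replies
```

A feed could then keep only posts for which `is_relevant` is true and rank the trending subset by `interaction_score`.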

-## Getting Started
+## Making your own Feed

-### Prerequisites
+1. **Update files:**
+- Update `publish_feed.py` with your feed details. **(REQUIRED)**
+- Modify filters in `firehose/data_filter.py`. **(OPTIONAL)**
+- Change database names/routes in `firehose/database.py` and `web/database_ro.py`. **(REQUIRED (unless using Docker))**
+> **Note:** Currently `/var/data/` is used for database storage in a Docker volume. Change this to a different path if needed.
+2. **Publish Your Feed:** Follow the [Publishing Your Feed](#publishing-your-feed) instructions below.
+
+## Easiest installation (Docker)
+
+Configure the environment variables by copying and editing the example file:
+
+```shell
+cp example.env .env
+```
+
+Open `.env` in your preferred text editor and fill in the necessary variables.
+> **Note:** To obtain `CHRONO_TRENDING_URI`, publish the feed first using `publish_feed.py`.
+Build and run Docker image:
+```shell
+docker build -t myfeed .
+docker run --rm -it --env-file .env -p 8000:8000 -v feeddata:/var/data/ myfeed
+```
+
+
+### Manual Installation

Ensure you have **Python 3.7+** and **Conda** installed. [Download Miniconda](https://docs.conda.io/en/latest/miniconda.html) if you haven't already.

-### Installation
+### Prerequisites

Clone the repository and navigate to its directory:

@@ -64,28 +90,6 @@ Install the required dependencies:
pip install -r requirements.txt
```

-Configure the environment variables by copying and editing the example file:
-
-```shell
-cp example.env .env
-```
-
-Open `.env` in your preferred text editor and fill in the necessary variables.
-> **Note:** To obtain `CHRONO_TRENDING_URI`, publish the feed first using `publish_feed.py`.
-## Making your own Feed
-
-To create your own feed, install dependencies, configure environment variables, and customize the settings:
-
-1. **Update files:**
-- Update `publish_feed.py` with your details. **(REQUIRED)**
-- Modify filters in `firehose/data_filter.py`. **(OPTIONAL)**
-- Change database names/routes in `firehose/database.py` and `web/database_ro.py`. **(REQUIRED)**
-- Change `DID_TO_PRIORITIZE` in `algos/chrono_trending.py` with a bsky DID which will show it's posts at the top of the feed **(REQUIRED)**
-> **Note:** Because current DB folder for production `/var/data` might not be accessible in your environment.
-2. **Publish Your Feed:** Follow the [Publishing Your Feed](#publishing-your-feed) instructions below.

## Publishing Your Feed

Edit the `publish_feed.py` script with your specific information such as `HANDLE`, `PASSWORD`, `HOSTNAME`, `RECORD_NAME`, `DISPLAY_NAME`, `DESCRIPTION`, and `AVATAR_PATH`. Run the script to publish your feed:
@@ -100,12 +104,6 @@ To update your feed's display data, modify the relevant variables and rerun the

The server operates two main processes: the web server and the firehose data stream. Use `honcho` to manage these processes as defined in the `Procfile`:

-Build and run Docker image:
-```shell
-docker build -t myfeed .
-docker run --rm -it -p 8000:8000 -v feeddata:/var/data/ myfeed
-```

Manually run the server:
```shell
honcho start
10 changes: 5 additions & 5 deletions requirements.txt
@@ -1,7 +1,7 @@
-atproto==0.0.53
-peewee~=3.16.2
-Flask~=2.3.2
-python-dotenv~=1.0.0
-gunicorn~=20.1.0
+atproto
+peewee
+Flask
+python-dotenv
+gunicorn
honcho
apscheduler
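
Because this change removes the version pins, the versions that `pip` resolves can differ from one build to the next. The sketch below is one possible way to record what actually got installed, using the standard-library `importlib.metadata` (Python 3.8+). The helper names `installed_versions` and `as_pins` are hypothetical and are not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in requirements.txt after this commit.
PACKAGES = ["atproto", "peewee", "Flask", "python-dotenv",
            "gunicorn", "honcho", "apscheduler"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def as_pins(versions):
    """Render 'name==version' lines for the distributions that are installed."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]
```

Calling `as_pins(installed_versions(PACKAGES))` inside the built image would give a pinned list that could be saved if reproducible builds are needed later.
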
9 changes: 9 additions & 0 deletions utils/config.py
@@ -1,4 +1,5 @@
import os
+from utils.logger import logger

SERVICE_DID = os.environ.get('SERVICE_DID', None)
HOSTNAME = os.environ.get('HOSTNAME', None)
@@ -16,3 +17,11 @@
if CHRONOLOGICAL_TRENDING_URI is None:
    raise RuntimeError('Publish your feed first (run publish_feed.py) to obtain Feed URI. '
                       'Set this URI to "CHRONOLOGICAL_TRENDING_URI" environment variable.')
+
+# logger.info(f'HANDLE: {HANDLE}')
+# logger.info(f'PASSWORD: {PASSWORD}')
+if HANDLE is None:
+    raise RuntimeError('You should set "HANDLE" environment variable first.')
+
+if PASSWORD is None:
+    raise RuntimeError('You should set "PASSWORD" environment variable first.')
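
The two new checks repeat the same pattern. A minimal sketch of how they could be folded into one helper is shown below. `require_env` and `load_credentials` are hypothetical names, not functions in this repository, and the optional `env` argument exists only so the example can be exercised without touching the real environment.

```python
import os


def require_env(name, env=None):
    """Return the value of a required variable, or raise RuntimeError if unset."""
    source = os.environ if env is None else env
    value = source.get(name)
    if value is None:
        raise RuntimeError(f'You should set "{name}" environment variable first.')
    return value


def load_credentials(env=None):
    """Validate HANDLE and PASSWORD, the two variables checked in this commit."""
    return {name: require_env(name, env) for name in ("HANDLE", "PASSWORD")}
```

As in the diff above, only a missing variable is rejected; an empty string would still pass the `is None` test.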