Cleanup (#19)
* cleanup
eloyfelix authored Jan 16, 2025
1 parent a40e9cf commit f45056f
Showing 7 changed files with 264 additions and 289 deletions.
59 changes: 25 additions & 34 deletions .github/workflows/ci.yml
@@ -11,53 +11,44 @@ on:
release:
types:
- published

jobs:
build:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.8', '3.9', '3.10', '3.11']
name: Python ${{ matrix.python-version }}
python-version: ['3.9', '3.10', '3.11', '3.12', '3.13']
name: Test Python ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v3

- uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}

- name: Install cbl-migrator
shell: bash -l {0}
run: pip install -e . pytest twine build

- name: Install dependencies
run: pip install -e . pytest
- name: Run tests
shell: bash -l {0}
run: pytest

build-and-publish:
needs: test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: '3.12' # Use a stable version for building
- name: Install build dependencies
run: pip install build twine
- name: Build wheels
run: python3 -m build

run: python -m build
- name: Upload artifacts
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: dist
path: dist/*

upload_all:
name: Upload if release
runs-on: ubuntu-latest
needs: build
if: github.event_name == 'release' && github.event.action == 'published'

steps:
- uses: actions/setup-python@v4
with:
python-version: "3.x"

- uses: actions/download-artifact@v3
with:
name: artifact
path: dist

- uses: pypa/[email protected]
with:
user: __token__
password: ${{ secrets.pypi_password }}
- name: Publish to PyPI
if: github.event_name == 'release' && github.event.action == 'published'
uses: pypa/[email protected]
with:
user: __token__
password: ${{ secrets.pypi_password }}
47 changes: 18 additions & 29 deletions README.md
@@ -3,53 +3,42 @@
[![Supported Python versions](https://img.shields.io/pypi/pyversions/cbl_migrator.svg)](https://pypi.python.org/pypi/cbl_migrator/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)


# CBL Migrator

Small SQLAlchemy based library that migrates Oracle DBs to MySQL, PostgreSQL and SQLite. Used in ChEMBL dumps generation process.

to use it, as a Python library:
A lightweight SQLAlchemy-based tool that migrates Oracle databases to MySQL, PostgreSQL, or SQLite. It is used in ChEMBL dump generation.

## Usage in Python
```python
from cbl_migrator import DbMigrator

origin = 'oracle://{user}:{pass}@{host}:{port}/?service_name={service_name}&encoding=utf8'
#dest = 'mysql://{user}:{pass}@{host}:{port}/{dbname}?use_unicode=1&charset=utf8'
dest = 'postgresql://{user}:{pass}@{host}:{port}/{dbname}?client_encoding=utf8'

migrator = DbMigrator(origin, dest, ['excluded_table1', 'excluded_table2'], n_workers=4)
migrator.migrate()
```
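
The `{user}`, `{pass}` and similar placeholders in the URLs above have to be filled in before the strings are passed to `DbMigrator`. A minimal sketch of one way to do that, assuming hypothetical credentials and a hypothetical database name (`mydb`). The password is percent-encoded with `urllib.parse.quote_plus` so that characters such as `@` or `/` do not break URL parsing:

```python
from urllib.parse import quote_plus

# Hypothetical connection details, for illustration only.
params = {
    "user": "migrator",
    "pass": quote_plus("p@ss/word"),  # escape characters that are special in URLs
    "host": "localhost",
    "port": 5432,
    "dbname": "mydb",
}

template = "postgresql://{user}:{pass}@{host}:{port}/{dbname}?client_encoding=utf8"

# `pass` is a Python keyword, so the values are supplied by unpacking a dict
# rather than as literal keyword arguments.
dest = template.format(**params)
print(dest)
```

The resulting string can then be used as the `dest` argument shown above.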

directly from the command line:
## Command Line Usage
```bash
cbl-migrator "oracle://{user}:{pass}@{host}:{port}/?service_name={service_name}&encoding=utf8" "postgresql://{user}:{pass}@{host}:{port}/{dbname}?client_encoding=utf8" --n_workers 8
cbl-migrator "oracle://{user}:{pass}@{host}:{port}/?service_name={service_name}&encoding=utf8" \
"postgresql://{user}:{pass}@{host}:{port}/{dbname}?client_encoding=utf8" \
--n_workers 8
```

## What it does (in order of events)

- Copies tables from origin to dest using the closest data type for each field. No constraints except PKs are initially copied across.
- Table contents are migrated from origin to dest tables. In parallel.
- If the data migration is succesful it will first generate the constraints and then the indexes. Any index in a field with a previously created UK will be skipped (UKs are implemented as unique indexes).
- It logs every time it was not possible to migrate an object, e.g., ```(psycopg2.OperationalError) index row size 2856 exceeds maximum 2712 for index.```
## How It Works
- Copies tables from the source, preserving only PKs initially.
- Migrates table data in parallel.
- If successful, applies constraints and then indexes; skips indexes already covered by unique keys.
- Logs objects that fail to migrate.
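
The index-skipping rule can be illustrated with a small, self-contained sketch. This is not the library's code: the function name `indexes_to_create`, the dictionary layout, and the table and column names are all hypothetical, and it assumes an index counts as covered only when its column list exactly matches that of a unique key.

```python
def indexes_to_create(indexes, unique_keys):
    """Return the indexes whose columns are not already covered by a unique key.

    Both arguments map an object name to its ordered list of column names.
    """
    # A unique key is implemented as a unique index, so an ordinary index on
    # the same columns would be redundant.
    covered = {tuple(cols) for cols in unique_keys.values()}
    return {
        name: cols
        for name, cols in indexes.items()
        if tuple(cols) not in covered
    }


# Hypothetical metadata for a single table.
indexes = {
    "idx_mol_chembl_id": ["chembl_id"],
    "idx_mol_type": ["molecule_type"],
}
unique_keys = {"uk_mol_chembl_id": ["chembl_id"]}

remaining = indexes_to_create(indexes, unique_keys)
print(remaining)
```

Here only `idx_mol_type` remains, because `chembl_id` already has a unique key.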

## What it does not do

- It won't migrate any table without a PK. May hang with a table without PK and containing an UK field referenced as FK in another table.
- It does not try to migrate server default values.
- It does not set autoincremental fields.
- It does not try to migrate triggers nor procedures.
## What It Does Not Do
- Avoids tables without PKs (may hang if a unique field is referenced by an FK).
- Ignores server default values, autoincrement fields, triggers, and procedures.

## SQLite

SQLite can not:

- concurrently write
- alter table ADD CONSTRAINT

So only one core is used when migrating to it. All constraints are generated at the time of generating the destination tables and it sequentially inserts rows in tables in correct FKs order.

- No concurrent writes or ALTER TABLE ADD CONSTRAINT.
- Uses one core and creates constraints at table creation time.
- Inserts rows sequentially in correct FK order.
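
One way to obtain a correct FK order is a topological sort of the table dependency graph. The sketch below uses only the standard library (`graphlib`, Python 3.9+, and `sqlite3`). The three-table schema and its rows are hypothetical, and this is an illustration of the idea rather than the library's actual implementation.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables its FKs reference.
fk_dependencies = {
    "parent": set(),
    "child": {"parent"},
    "grandchild": {"child"},
}

# static_order() yields referenced tables before the tables that depend on them.
insert_order = list(TopologicalSorter(fk_dependencies).static_order())

rows = {
    "parent": [(1,)],
    "child": [(10, 1)],
    "grandchild": [(100, 10)],
}

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Constraints are declared when the tables are created, since SQLite
# cannot add them afterwards with ALTER TABLE.
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, "
    "parent_id INTEGER NOT NULL REFERENCES parent(id))"
)
conn.execute(
    "CREATE TABLE grandchild (id INTEGER PRIMARY KEY, "
    "child_id INTEGER NOT NULL REFERENCES child(id))"
)

# Insert sequentially, one table at a time, in dependency order.
for table in insert_order:
    placeholders = ", ".join("?" for _ in rows[table][0])
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows[table])
conn.commit()
```

Inserting in any other order would raise `sqlite3.IntegrityError` here, because the foreign keys are enforced as each row is written.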

## MySQL

CLOBs are migrated to LONGTEXT.
- Converts CLOBs to LONGTEXT.