Refactored to get things working in GitHub
Will Langdale committed Sep 30, 2024
1 parent d62c8c8, commit 511b896
Showing 15 changed files with 194 additions and 192 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Pull Request Description | ||
|
||
## Changes Made | ||
- [List the main changes you've made] | ||
|
||
## Reason for Changes | ||
[Explain why you've made these changes] | ||
|
||
## Testing Done | ||
[Describe the testing you've done to validate your changes] | ||
|
||
## Screenshots (if applicable) | ||
[Add screenshots here if your changes include visual elements] | ||
|
||
## Checklist: | ||
- [ ] My code follows the style guidelines of this project | ||
- [ ] I have performed a self-review of my own code | ||
- [ ] I have commented my code, particularly in hard-to-understand areas | ||
- [ ] I have made corresponding changes to the documentation | ||
- [ ] My changes generate no new warnings | ||
- [ ] I have added tests that prove my fix is effective or that my feature works | ||
- [ ] New and existing unit tests pass locally with my changes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
name: Unit tests | ||
|
||
jobs: | ||
uv-example: | ||
name: python | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
|
||
- name: Install uv | ||
uses: astral-sh/setup-uv@v2 | ||
|
||
- name: Set up Python | ||
run: uv python install | ||
|
||
- name: Set up PostgreSQL | ||
run: | | ||
docker compose up db -d --wait | ||
- name: Run pytest | ||
run: | | ||
uv run pytest |
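
For reference, a minimal sketch of how the steps in this workflow might be assembled into a complete file. The `on:` triggers are an assumption (none are visible in the diff as rendered here), and the final step uses `uv run`, which runs a command inside the project's environment; it assumes pytest is declared as a project dependency.

```yaml
name: Unit tests

# Assumed triggers; not shown in the diff above
on:
  push:
  pull_request:

jobs:
  uv-example:
    name: python
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v2

      # Installs the Python version requested by the project
      - name: Set up Python
        run: uv python install

      # Starts the db service and waits until it reports healthy
      - name: Set up PostgreSQL
        run: docker compose up db -d --wait

      # Runs the test suite inside the uv-managed environment
      - name: Run pytest
        run: uv run pytest
```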
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,46 +1,17 @@ | ||
# 🔗 Company matching framework | ||
# 🔥 Matchbox (née Company Matching Framework) | ||
|
||
A match orchestration framework to allow the comparison, validation, and orchestration of the best match methods for the company matching job. | ||
Record matching is a chore. We aim to: | ||
|
||
We envisage this forming one of three repos in the Company Matching Framework: | ||
* Make it an iterative, collaborative, measurable problem | ||
* Allow organisations to know they have matching records without having to share the data | ||
* Allow matching pipelines to run iteratively | ||
|
||
* `company-matching-framework`, this repo. A Python library for creating data linkage and deduplication pipelines over a shared relational database | ||
* `company-matching-framework-dash`, or https://matching.data.trade.gov.uk/. A dashboard for verifying links and deduplications, and comparing the performance metrics of different approaches. Uses `company-matching-framework` | ||
* `company-matching-framework-pipeline`. The live pipeline of matching and deduping methods, running in production. Uses `company-matching-framework` | ||
## Development | ||
|
||
## Coverage | ||
This project is managed by [uv](https://docs.astral.sh/uv/), linted and formatted with [ruff](https://docs.astral.sh/ruff/), and tested with [pytest](https://docs.pytest.org/en/stable/). | ||
|
||
* [Companies House](https://data.trade.gov.uk/datasets/a777d199-53a4-4d0a-bbbb-1559a86f8c4c#companies-house-company-data) | ||
* [Data Hub companies](https://data.trade.gov.uk/datasets/32918f3e-a727-42e6-8359-9efc61c93aa4#data-hub-companies-master) | ||
* [Export Wins](https://data.trade.gov.uk/datasets/0738396f-d1fd-46f1-a53f-5d8641d032af#export-wins-master-datasets) | ||
* [HMRC UK exporters](https://data.trade.gov.uk/datasets/76fb2db3-ab32-4af8-ae87-d41d36b31265#uk-exporters) | ||
Task running is done with [make](https://www.gnu.org/software/make/). To see all available commands: | ||
|
||
## Quickstart | ||
|
||
Clone the repo, then run: | ||
|
||
```bash | ||
. setup.sh | ||
``` | ||
|
||
Create a `.env` with your development schema to write tables into. Copy the sample with `cp .env.sample .env` then fill it in. | ||
|
||
* `SCHEMA` is where any tables the service creates will be written by default | ||
|
||
To set up the database in your specified schema run: | ||
|
||
```bash | ||
make cmf | ||
```console | ||
make | ||
``` | ||
|
||
## Usage | ||
|
||
See [the aspirational README](references/README_aspitational.md) for how we envisage the finished version of this Python library will be used. | ||
|
||
## Release metrics | ||
|
||
🛠 Coming soon! | ||
|
||
-------- | ||
|
||
<p><small>Project based on the <a target="_blank" href="https://drivendata.github.io/cookiecutter-data-science/">cookiecutter data science project template</a>.</small></p> |
Empty file.
Empty file.