Merge pull request #1 from H-IAAC/dev
Adding pre-commit hooks, applying linters and code formatters, and updating the Python version.
sildolfogomes authored May 17, 2024
2 parents c7a61d1 + c23607d commit 14cb8ad
Showing 33 changed files with 1,510 additions and 5,702 deletions.
2 changes: 1 addition & 1 deletion .gitattributes
@@ -61,4 +61,4 @@
*.jar binary
*.so binary
*.war binary
*.jks binary
*.jks binary
25 changes: 25 additions & 0 deletions .github/workflows/main.yaml
@@ -0,0 +1,25 @@
name: CI Workflow

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v2

      - name: Set up Python 3.12.3
        uses: actions/setup-python@v2
        with:
          python-version: 3.12.3

      - name: Install poetry
        run: |
          curl -sSL https://install.python-poetry.org | python3 -
      - name: Install dependencies with poetry
        run: poetry install

      - name: Run pytest
        run: poetry run pytest
59 changes: 25 additions & 34 deletions .pre-commit-config.yaml
@@ -1,42 +1,33 @@
repos:
- repo: https://github.com/psf/black
rev: 23.3.0
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
hooks:
- id: black
- repo: local
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files

- repo: https://github.com/psf/black
rev: 24.4.2
hooks:
- id: pylint
name: pylint
entry: pylint
language: system
types: [python]
args:
[
"-rn", # Only display messages
"-sn", # Don't display the score
"--max-args=7", # Allow up to 7 arguments
"--max-locals=16", # Allow up to 16 local variables
"--ignore=tests", # Ignore the tests directory
]
- id: black

- repo: https://github.com/pycqa/isort
rev: 5.12.0
rev: 5.13.2
hooks:
- id: isort
- repo: https://github.com/econchick/interrogate
rev: 1.5.0
hooks:
- id: interrogate
- repo: https://github.com/pre-commit/mirrors-mypy
rev: 'v1.4.1' # Use the sha / tag you want to point at
hooks:
- id: mypy
name: isort (python)

- repo: local
- repo: https://github.com/PyCQA/pydocstyle
rev: 6.3.0
hooks:
- id: pytest
name: pytest
stages: [commit]
language: system
entry: pytest
types: [python]
args: ["-v"]
- id: pydocstyle

# - repo: local
# hooks:
# - id: pip-audit
# name: pip-audit
# entry: poetry run pip-audit
# language: system
# pass_filename: false
# types: [python]
14 changes: 7 additions & 7 deletions Makefile
@@ -1,10 +1,10 @@
.PHONY: notebook docs
.EXPORT_ALL_VARIABLES:

setup:
setup:
initialize_git install

install:
install:
@echo "Installing..."
poetry install
poetry run pre-commit install
@@ -14,7 +14,7 @@ activate:
poetry shell

initialize_git:
git init
git init

pull_data:
poetry run dvc pull
@@ -23,11 +23,11 @@ test:
pytest

docs_view:
@echo View API documentation...
@echo View API documentation...
pdoc src --http localhost:8080

docs_save:
@echo Save documentation to docs...
@echo Save documentation to docs...
pdoc src -o docs

## Delete all compiled Python files
@@ -58,9 +58,9 @@ verificar_e_quebrar:
@$(MAKE) break_text ARQUIVO_ENTRADA=$(ARQUIVO_ENTRADA)

translate_marian:
@echo "Traduzindo texto..."
@echo "Traduzindo texto..."
poetry run python3 src/utilities/translate_marian.py

translate_t5:
@echo "Traduzindo texto..."
@echo "Traduzindo texto..."
poetry run python3 src/utilities/translate_t5.py
2 changes: 1 addition & 1 deletion README.md
@@ -12,5 +12,5 @@ Tools for translate dataset usin
* Android Specialist: Daniel Miranda ([email protected])

# How to Use
Install Poetry
Install Poetry
make activate
4 changes: 2 additions & 2 deletions config/main.yaml
@@ -3,7 +3,7 @@ defaults:
- model: model1
- _self_

raw:
raw:
path: data/raw/sample.csv

processed:
@@ -14,4 +14,4 @@ processed:
final:
dir: data/final
name: final.csv
path: ${final.dir}/${final.name}
path: ${final.dir}/${final.name}
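
The `path` entry above is built from two other keys through `${...}` references. The `defaults` list with `_self_` suggests a Hydra/OmegaConf-style config, but that is an assumption; the diff does not show how the file is loaded. As a rough illustration of what the interpolation resolves to, here is a toy resolver written with the standard library only. The `lookup` and `resolve` helpers are hypothetical and are not OmegaConf's implementation.

```python
import re

# The `final` block from config/main.yaml, written as a plain dict.
config = {
    "final": {
        "dir": "data/final",
        "name": "final.csv",
        "path": "${final.dir}/${final.name}",
    }
}

# Matches one ${dotted.key} reference.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def lookup(cfg, dotted_key):
    """Walk a nested dict following a dotted key such as 'final.dir'."""
    node = cfg
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(cfg, value):
    """Replace every ${a.b} reference in `value` with the entry it names."""
    return _REFERENCE.sub(
        lambda match: resolve(cfg, str(lookup(cfg, match.group(1)))), value
    )


print(resolve(config, config["final"]["path"]))  # data/final/final.csv
```

Keeping `dir` and `name` as separate keys means either one can be overridden without touching `path`.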
2 changes: 1 addition & 1 deletion config/model/model1.yaml
@@ -1 +1 @@
name: model1
name: model1
2 changes: 1 addition & 1 deletion config/model/model2.yaml
Original file line number Diff line number Diff line change
@@ -1 +1 @@
name: model2
name: model2
4 changes: 2 additions & 2 deletions config/process/process1.yaml
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
use_columns:
use_columns:
- col1
- col2
- col2
4 changes: 2 additions & 2 deletions config/process/process2.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use_columns:
use_columns:
- col1
- col2
- col3
- col3
1 change: 0 additions & 1 deletion docs/_config.yml

This file was deleted.

1 change: 0 additions & 1 deletion docs/data_dictionaries/README.md

This file was deleted.

18 changes: 16 additions & 2 deletions docs/index.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,17 @@
# baseline
# Welcome to MkDocs

Directory Baseline
For full documentation visit [mkdocs.org](https://www.mkdocs.org).

## Commands

* `mkdocs new [dir-name]` - Create a new project.
* `mkdocs serve` - Start the live-reloading docs server.
* `mkdocs build` - Build the documentation site.
* `mkdocs -h` - Print help message and exit.

## Project layout

    mkdocs.yml    # The configuration file.
    docs/
        index.md  # The documentation homepage.
        ...       # Other markdown pages, images and other files.
3 changes: 0 additions & 3 deletions docs/references/README.md

This file was deleted.

35 changes: 25 additions & 10 deletions merge.py
@@ -1,19 +1,26 @@
"""Merge module."""

import csv
import logging
import os

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logging.basicConfig(
level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)


def merge_csv_files(directory_path, output_file):
"""
Reads several separate CSV files in a directory and creates a new file merging all of them.
The script joins the CSV files whose names follow the pattern "nome_parte_X_Y.csv",
where X is the same across parts and Y varies, into a single row in the new CSV file.
Combines several .csv files into a single one.
The script joins the CSV files whose
names follow the pattern "nome_parte_X_Y.csv", where X is the same
across parts and Y varies, into a single row in the new CSV file.
Args:
directory_path (str): The path of the directory containing the separate CSV files.
directory_path (str): The path of the directory containing the separate
CSV files.
output_file (str): The path of the output file to be generated.
"""
merged_data = []
current_row = []
@@ -28,11 +35,15 @@ def merge_csv_files(directory_path, output_file):
header = rows[0]
if len(rows) == 1:
current_row.extend(rows[0])
elif len(rows) >= 2 and filename.endswith("_1.csv") and len(rows[1]) >= 1:
elif (
len(rows) >= 2 and filename.endswith("_1.csv") and len(rows[1]) >= 1
):
if current_row:
merged_data.append(current_row)
current_row = rows[1]
elif len(rows) >= 2 and filename.endswith("_2.csv") and len(rows[1]) >= 1:
elif (
len(rows) >= 2 and filename.endswith("_2.csv") and len(rows[1]) >= 1
):
current_row.extend(rows[1])

if current_row:
@@ -46,10 +57,14 @@ def merge_csv_files(directory_path, output_file):

logging.info("Arquivo CSV merged gerado: %s", output_file)


if __name__ == "__main__":
# Parameter configuration
directory_path = input("Digite o caminho do diretório contendo os arquivos CSV separados: ")
directory_path = input(
"Digite o caminho do diretório contendo os arquivos CSV separados: "
)
output_file = input("Digite o caminho do arquivo CSV de saída que será gerado: ")

# Run the merge
merge_csv_files(directory_path, output_file)
merge_csv_files(directory_path, output_file)
print("Hello!")
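
The hunks above show only part of `merge_csv_files`, so the following is a self-contained sketch of the merge rule that the docstring and the visible branches describe, not the repository's actual function. The name `merge_parts`, the sorted directory listing, the UTF-8 encoding and the skipping of empty files are assumptions made for illustration.

```python
import csv
import os


def merge_parts(directory_path, output_file):
    """Join the data rows of "*_1.csv" and "*_2.csv" parts into single rows."""
    merged_data = []
    current_row = []
    header = None

    # Assumption: sorted order places each "_1" part right before its "_2" part.
    for filename in sorted(os.listdir(directory_path)):
        if not filename.endswith(".csv"):
            continue
        path = os.path.join(directory_path, filename)
        with open(path, newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        if not rows:
            continue  # assumption: empty files are skipped
        if header is None:
            header = rows[0]

        if len(rows) == 1:
            # Mirrors the diff: a single-row file is appended to the open row.
            current_row.extend(rows[0])
        elif filename.endswith("_1.csv") and len(rows[1]) >= 1:
            # A "_1" part closes the previous merged row and starts a new one.
            if current_row:
                merged_data.append(current_row)
            current_row = list(rows[1])
        elif filename.endswith("_2.csv") and len(rows[1]) >= 1:
            # A "_2" part continues the row opened by its "_1" sibling.
            current_row.extend(rows[1])

    if current_row:
        merged_data.append(current_row)

    with open(output_file, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if header is not None:
            writer.writerow(header)
        writer.writerows(merged_data)
    return merged_data
```

Writing the output outside `directory_path` keeps a second run from picking up the merged file as an input.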
20 changes: 20 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,20 @@
site_name: Dataset Translation for PT-BR using Open Source Models
theme:
  name: material
  palette:
    primary: red
  language: pt

repo_name: "H-IAAC/translate-dataset/"
repo_url: "https://github.com/H-IAAC/translate-dataset"

markdown_extensions:
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid

plugins:
  - search
  - mkdocstrings:
      default_handler: python