Project Merge Analyses runs for each experiment when running isabl process-finished #46

Open
nickp60 opened this issue Apr 21, 2023 · 1 comment

Comments

@nickp60
Collaborator

nickp60 commented Apr 21, 2023

  • isabl_cli version: isabl, version 0.1.0
  • Python version: Python 3.11.0
  • Operating System: linux

Description

When running isabl process-finished to complete a group of analyses, the project-level merge logic runs after each analysis is completed. This is very time-consuming; it would probably be better to run the merge once, after the command has finished, to avoid wasting resources.

What I Did

isabl process-finished -fi projects 20
Isabl runs the merge analysis after processing each analysis.

New feature

Perhaps a fix could be to disable trigger_analyses_merge when executing patch_instance from the process-finished command. Instead, the process-finished command could keep a list of the analysis keys, individuals, and projects affected by the query and run the merges for those after updating the statuses:

# not actual code
analyses_processed = {"pks": [], "indvs": [], "projects": []}
for analysis in analyses:
    patch_instance(status="SUCCEEDED", run_triggers=False)
    analyses_processed["pks"].append(analysis.pk)
    analyses_processed["indvs"].append(analysis.individual)
    analyses_processed["projects"].extend(analysis.projects)
# run each merge once, after all statuses have been updated
for proj in set(analyses_processed["projects"]):
    run_project_merge_analyses(proj)
for indv in set(analyses_processed["indvs"]):
    run_individual_merge_analyses(indv)
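
To make the expected saving concrete, here is a small self-contained simulation of the idea above. It is only a sketch: patch_instance, run_project_merge and the run_triggers flag below are in-memory stand-ins I made up for illustration, not the real isabl_cli functions, and the simulation ignores any check for pending analyses.

```python
from collections import Counter

# Counts how many times a merge is launched per project (illustration only).
merge_calls = Counter()


def run_project_merge(project_pk):
    """Hypothetical stand-in for submitting a project-level merge."""
    merge_calls[project_pk] += 1


def patch_instance(analysis, status, run_triggers=True):
    """Hypothetical stand-in for the patch call; run_triggers is the proposed flag."""
    analysis["status"] = status
    if run_triggers:
        for project_pk in analysis["projects"]:
            run_project_merge(project_pk)


def process_finished(analyses, defer=True):
    """Mark analyses as SUCCEEDED, optionally deferring merges to the end."""
    affected_projects = set()
    for analysis in analyses:
        patch_instance(analysis, "SUCCEEDED", run_triggers=not defer)
        affected_projects.update(analysis["projects"])
    if defer:
        # One merge per affected project, after every status is updated.
        for project_pk in sorted(affected_projects):
            run_project_merge(project_pk)


def make_analyses():
    """Five finished analyses that all belong to project 20."""
    return [{"pk": i, "status": "FINISHED", "projects": [20]} for i in range(5)]


process_finished(make_analyses(), defer=False)
eager = merge_calls[20]

merge_calls.clear()
process_finished(make_analyses(), defer=True)
deferred = merge_calls[20]

print(eager, deferred)  # 5 1
```

With five analyses in one project, the per-analysis trigger launches the merge five times, while the deferred version launches it once.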
   
@juanesarango
Contributor

juanesarango commented Oct 18, 2024

@nickp60 I was long overdue to comment on this issue.

The project-level merge check runs after each analysis is completed (FAILED or SUCCEEDED), but the merge is only submitted when no other analyses of the same application in the project are still pending (SUBMITTED or STARTED). If there are pending analyses, the merge is skipped.

isabl_cli/isabl_cli/data.py

Lines 135 to 167 in f13b995

def trigger_analyses_merge(analysis):
    """Submit project level analyses merge if neccessary."""
    if analysis["status"] not in {"SUCCEEDED", "FAILED"}:
        return

    try:
        application = import_from_string(analysis["application"]["application_class"])()
    except ImportError:
        return

    def _echo_action(instance, application, pending):
        click.secho(
            ("Skipping " if pending else "Submitting ")
            + ("individual " if "species" in instance else "project ")
            + f"merge for {instance} and application {application}",
            fg="green" if pending else "yellow",
        )

    if application.has_project_auto_merge:
        projects = {j["pk"]: j for i in analysis["targets"] for j in i["projects"]}

        for i in projects.values():
            pending = api.get_instances_count(
                endpoint="analyses",
                status__in="STARTED,SUBMITTED",
                application=analysis["application"]["pk"],
                projects=i["pk"],
            )

            _echo_action(i, analysis.application, pending)
            if not pending:
                application.submit_merge_analysis(i)

I believe what you're seeing is the merge being submitted while other analyses are still in FINISHED status, because FINISHED is not counted as pending. We should add FINISHED to the pending statuses:
status__in="STARTED,SUBMITTED,FINISHED",
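
As a rough illustration of why this should help, the sketch below replays the pending check against an in-memory list instead of the API. The get_instances_count and should_submit_merge functions here are simplified stand-ins written for this example (they are not the isabl_cli implementations), and the analyses are made-up data.

```python
def get_instances_count(analyses, status_in, application, project):
    """In-memory stand-in for the API count query (illustration only)."""
    statuses = set(status_in.split(","))
    return sum(
        1
        for a in analyses
        if a["status"] in statuses
        and a["application"] == application
        and project in a["projects"]
    )


def should_submit_merge(analyses, application, project, pending_statuses):
    """Submit the merge only when nothing is counted as pending."""
    pending = get_instances_count(analyses, pending_statuses, application, project)
    return pending == 0


# State right after the first of three analyses has been patched to
# SUCCEEDED; the other two are still waiting in FINISHED.
analyses = [
    {"pk": 1, "status": "SUCCEEDED", "application": 7, "projects": [20]},
    {"pk": 2, "status": "FINISHED", "application": 7, "projects": [20]},
    {"pk": 3, "status": "FINISHED", "application": 7, "projects": [20]},
]

current = should_submit_merge(analyses, 7, 20, "STARTED,SUBMITTED")
proposed = should_submit_merge(analyses, 7, 20, "STARTED,SUBMITTED,FINISHED")

print(current, proposed)  # True False
```

With the current filter the merge is submitted right away, even though two analyses are still FINISHED. With FINISHED included it is skipped until the last analysis is processed.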
