1. Evaluation of OPI-Llama-3.1-8B-Instruct model on 9 tasks.

Each result comes from the Llama-3.1-8B-Instruct model fine-tuned on OPI_full_1.61M_train.json and then evaluated on the testing set of the corresponding task.

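| Task Type | Task Name | Testing file | Accuracy | Precision | Recall | F1 | Rouge-L |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_new_test | - | 0.3724 | 0.3374 | 0.3468 | - |
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_price_test | - | 0.0738 | 0.0738 | 0.0738 | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Fold_Holdout | 0.1045 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Superfamily_Holdout | 0.1507 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Family_Holdout | 0.6145 | - | - | - | - |
| Sequence Understanding | Subcellular Localization Prediction | subcell_loc_test | 0.4214 | - | - | - | - |
| Annotation Prediction | Function Keywords Prediction | CASPSimilarSeq_keywords_test | - | 0.4202 | 0.5057 | 0.4385 | - |
| Annotation Prediction | Function Keywords Prediction | IDFilterSeq_keywords_test | - | 0.6762 | 0.6905 | 0.6650 | - |
| Annotation Prediction | Function Keywords Prediction | UniProtSeq_keywords_test | - | 0.7606 | 0.7489 | 0.7374 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | CASPSimilarSeq_go_terms_test | - | 0.1113 | 0.0936 | 0.099 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | IDFilterSeq_go_terms_test | - | 0.6686 | 0.6287 | 0.6304 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | UniProtSeq_go_terms_test | - | 0.7150 | 0.6897 | 0.6849 | - |
| Annotation Prediction | Function Description Prediction | CASPSimilarSeq_function_test | - | - | - | - | 0.7524 |
| Annotation Prediction | Function Description Prediction | IDFilterSeq_function_test | - | - | - | - | 0.4786 |
| Annotation Prediction | Function Description Prediction | UniProtSeq_function_test | - | - | - | - | 0.5144 |
| Knowledge Mining | Tissue Location Prediction from Gene Symbol | gene_symbol_to_tissue_test | - | 0.4002 | 0.9356 | 0.5466 | - |
| Knowledge Mining | Cancer Prediction from Gene Symbol | gene_symbol_to_cancer_test | - | 0.2890 | 0.2701 | 0.2664 | - |
| Knowledge Mining | Cancer Prediction from Gene Name | gene_name_to_cancer_test | - | 0.2786 | 0.2707 | 0.2659 | - |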
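
The Precision, Recall, and F1 columns are reported for tasks whose output is a set of labels (keywords, GO terms, tissues, and so on). The source does not say how these are averaged. One observation: for gene_symbol_to_tissue_test the reported F1 (0.5466) is not the harmonic mean of the reported precision and recall (about 0.5606), which would be consistent with averaging each metric per sample, though that is an inference. The sketch below shows that per-sample averaging scheme under this assumption; the function names and the toy labels are hypothetical and not taken from the OPI evaluation scripts.

```python
def set_prf(predicted, reference):
    """Precision, recall, and F1 for one sample, comparing label sets."""
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


def sample_averaged_prf(predictions, references):
    """Average each metric over samples (assumed scheme, not confirmed by the source)."""
    scores = [set_prf(p, r) for p, r in zip(predictions, references)]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Toy, made-up labels purely for illustration.
predictions = [["Hydrolase", "Protease"], ["Kinase"]]
references = [["Hydrolase"], ["Kinase", "Transferase"]]
avg_p, avg_r, avg_f1 = sample_averaged_prf(predictions, references)
print(f"precision={avg_p:.4f} recall={avg_r:.4f} f1={avg_f1:.4f}")
```

In this toy case the averaged precision and recall are both 0.75 while the averaged F1 is about 0.6667, which illustrates why a sample-averaged F1 need not equal the harmonic mean of the averaged precision and recall.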

2. Evaluation of OPI-Galactica-6.7B model on 9 tasks

Each result comes from the Galactica-6.7B model fine-tuned on OPI_full_1.61M_train.json and then evaluated on the testing set of the corresponding task.

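| Task Type | Task Name | Testing file | Accuracy | Precision | Recall | F1 | Rouge-L |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_new_test | - | 0.2700 | 0.2663 | 0.2596 | - |
| Sequence Understanding | EC Number Prediction (split100) | CLEAN_EC_number_price_test | - | 0.0268 | 0.0268 | 0.0268 | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Fold_Holdout | 0.0808 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Superfamily_Holdout | 0.1348 | - | - | - | - |
| Sequence Understanding | Fold Type Prediction | fold_type_test_Family_Holdout | 0.4854 | - | - | - | - |
| Sequence Understanding | Subcellular Localization Prediction | subcell_loc_test | 0.7771 | - | - | - | - |
| Annotation Prediction | Function Keywords Prediction | CASPSimilarSeq_keywords_test | - | 0.8120 | 0.7360 | 0.7643 | - |
| Annotation Prediction | Function Keywords Prediction | IDFilterSeq_keywords_test | - | 0.8377 | 0.8019 | 0.8070 | - |
| Annotation Prediction | Function Keywords Prediction | UniProtSeq_keywords_test | - | 0.8596 | 0.8196 | 0.8276 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | CASPSimilarSeq_go_terms_test | - | 0.7613 | 0.7492 | 0.7476 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | IDFilterSeq_go_terms_test | - | 0.7404 | 0.7274 | 0.7207 | - |
| Annotation Prediction | Gene Ontology (GO) Terms Prediction | UniProtSeq_go_terms_test | - | 0.7638 | 0.7373 | 0.7358 | - |
| Annotation Prediction | Function Description Prediction | CASPSimilarSeq_function_test | - | - | - | - | 0.7430 |
| Annotation Prediction | Function Description Prediction | IDFilterSeq_function_test | - | - | - | - | 0.7014 |
| Annotation Prediction | Function Description Prediction | UniProtSeq_function_test | - | - | - | - | 0.7133 |
| Knowledge Mining | Tissue Location Prediction from Gene Symbol | gene_symbol_to_tissue_test | - | 0.3917 | 0.9077 | 0.5303 | - |
| Knowledge Mining | Cancer Prediction from Gene Symbol | gene_symbol_to_cancer_test | - | 0.3555 | 0.3189 | 0.3229 | - |
| Knowledge Mining | Cancer Prediction from Gene Name | gene_name_to_cancer_test | - | 0.2728 | 0.2554 | 0.2533 | - |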
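
The Function Description Prediction rows are scored with Rouge-L, which compares free-text output to a reference via their longest common subsequence (LCS). As a rough illustration of the metric, the sketch below computes an LCS-based F-measure over whitespace tokens. The tokenization, the balanced F1 weighting, and the example sentences are assumptions made for this sketch; the scores in the tables may come from a different ROUGE implementation with different settings.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """LCS-based F1 over whitespace tokens (a simplified ROUGE-L variant)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up sentences for illustration only.
score = rouge_l_f1(
    "binds ATP and catalyzes phosphorylation",
    "catalyzes phosphorylation of proteins",
)
print(f"{score:.4f}")
```

Here the LCS is the two-token run "catalyzes phosphorylation", giving precision 2/5, recall 2/4, and an F1 of about 0.4444.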

3. Performance comparison between OPI-Llama-3.1-8B-Instruct and OPI-Galactica-6.7B across 9 tasks.

The comparison highlights task-specific strengths of each model. Llama-3.1 excels in EC number prediction and fold type prediction, tasks whose prediction targets are numeric, such as 3.4.11.4 and 10. Galactica leads in all three Annotation Prediction (AP) tasks (the one exception at the test-set level is CASPSimilarSeq_function_test, where Llama-3.1 is marginally ahead, 0.7524 vs. 0.7430), as well as in cancer prediction from gene symbols, where the prediction targets are character strings.
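
The comparison can be reproduced from the two result tables. The sketch below takes one headline score per testing file (F1 where reported, otherwise Accuracy or Rouge-L), copied from the tables in sections 1 and 2, and reports which model scores higher on each. The choice of headline metric and all variable names are assumptions of this sketch, not part of the OPI tooling.

```python
# testing file -> (OPI-Llama-3.1-8B-Instruct, OPI-Galactica-6.7B)
# Values are F1 where reported, otherwise Accuracy or Rouge-L.
SCORES = {
    "CLEAN_EC_number_new_test": (0.3468, 0.2596),
    "CLEAN_EC_number_price_test": (0.0738, 0.0268),
    "fold_type_test_Fold_Holdout": (0.1045, 0.0808),
    "fold_type_test_Superfamily_Holdout": (0.1507, 0.1348),
    "fold_type_test_Family_Holdout": (0.6145, 0.4854),
    "subcell_loc_test": (0.4214, 0.7771),
    "CASPSimilarSeq_keywords_test": (0.4385, 0.7643),
    "IDFilterSeq_keywords_test": (0.6650, 0.8070),
    "UniProtSeq_keywords_test": (0.7374, 0.8276),
    "CASPSimilarSeq_go_terms_test": (0.099, 0.7476),
    "IDFilterSeq_go_terms_test": (0.6304, 0.7207),
    "UniProtSeq_go_terms_test": (0.6849, 0.7358),
    "CASPSimilarSeq_function_test": (0.7524, 0.7430),
    "IDFilterSeq_function_test": (0.4786, 0.7014),
    "UniProtSeq_function_test": (0.5144, 0.7133),
    "gene_symbol_to_tissue_test": (0.5466, 0.5303),
    "gene_symbol_to_cancer_test": (0.2664, 0.3229),
    "gene_name_to_cancer_test": (0.2659, 0.2533),
}

# Pick the higher-scoring model for each testing file.
leaders = {
    name: "Llama-3.1" if llama > galactica else "Galactica"
    for name, (llama, galactica) in SCORES.items()
}

for name, (llama, galactica) in SCORES.items():
    print(f"{name:36s} {leaders[name]:10s} diff={llama - galactica:+.4f}")

llama_wins = sum(1 for m in leaders.values() if m == "Llama-3.1")
galactica_wins = len(leaders) - llama_wins
print(f"Llama-3.1 leads on {llama_wins} testing files, Galactica on {galactica_wins}")
```

Comparing a single headline number per testing file is a simplification; precision and recall can tell a different story on rows where they diverge sharply, such as gene_symbol_to_tissue_test.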