[perf] Use local schemas if available #307

sneakers-the-rat · 2024-03-13T08:49:11Z

Edit to add relation to prior issues:

(Partial) Fix: linkml/linkml#866
(Full) Fix: linkml/linkml#1012

I also see that this was handled in a few different ways across prior PRs and issues.

This, to me, points to a greater need to simplify and unify the loading behavior. We seem to have a patchwork of fixes that didn't quite reach the root of the problem, because the loading behavior is quite complex.

You can confirm that network requests are still being made by monitoring network traffic, or by adding a debug print just before the hbread call that shows what it is about to read.
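One way to do that check without real network traffic is to swap out `urllib.request.urlopen` and record what gets requested. This is only a sketch: `record_urlopen` and the example URL are made up for illustration, and it only catches callers that look up `urllib.request.urlopen` at call time. A module that did `from urllib.request import urlopen` would need to be patched at its own import path.

```python
import io
import urllib.request
from contextlib import contextmanager
from unittest import mock


@contextmanager
def record_urlopen(fake_body: bytes = b""):
    """Record every URL handed to urllib.request.urlopen; never touches the network."""
    seen = []

    def fake_urlopen(url, *args, **kwargs):
        # urlopen accepts either a plain string or a Request object
        target = url.full_url if isinstance(url, urllib.request.Request) else url
        seen.append(target)
        return io.BytesIO(fake_body)

    with mock.patch("urllib.request.urlopen", side_effect=fake_urlopen):
        yield seen


# Hypothetical URL, standing in for a remote schema import
with record_urlopen(b"id: demo") as seen:
    urllib.request.urlopen("https://example.org/types.yaml").read()
    urllib.request.urlopen("https://example.org/types.yaml").read()

print(f"{len(seen)} requests: {seen}")
```

Repeated entries in `seen` are the same kind of signal as the repeated types.yaml fetches described here.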

I finally took the time to see which network requests were still happening during normal usage, because I kept getting hangs both on test runs and when just trying to use the tool.

It turns out that hbread doesn't use requests (which would be cached during testing) and instead calls urllib directly. It also turns out that most of the time we are just requesting types.yaml over and over again, so we can safely use the local version of the meta schema instead. Our local version should always be the one we prefer, since it is tied to the particular version of linkml_runtime that we're using. The URI version could be any version (i.e. it would be the most recent version even if we wanted to use an older version of the spec).
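Roughly, the idea is a lookup from known remote locations to the files bundled with the installed package, falling back to the original source otherwise. The sketch below is not the code in this PR: `resolve_source`, `TYPES_URL`, and the mapping are hypothetical names, and a temporary file stands in for the bundled schema.

```python
import tempfile
from pathlib import Path
from typing import Dict, Union

# Hypothetical URL, standing in for the remote location of a bundled schema
TYPES_URL = "https://example.org/linkml/types.yaml"


def resolve_source(source: str, local_copies: Dict[str, Path]) -> Union[str, Path]:
    """Return the bundled local file for a known URL, otherwise the source unchanged."""
    local = local_copies.get(source)
    if local is not None and local.exists():
        return local
    return source


with tempfile.TemporaryDirectory() as tmp:
    # A temporary file plays the role of the schema shipped with the package
    bundled = Path(tmp) / "types.yaml"
    bundled.write_text("id: types\n")
    local_copies = {TYPES_URL: bundled}

    known = resolve_source(TYPES_URL, local_copies)
    unknown = resolve_source("https://example.org/other.yaml", local_copies)
    known_is_local = known == bundled
```

Anything not in the mapping still goes over the network, so behavior for third-party schemas is unchanged.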

Edit: this was removed to satisfy a test that needed the FileInfo:

The only thing that looks dicey in here is that I am also trying to cache all identical results from reads for a given cache period, so I am avoiding passing FileInfo to hbread. That lets us use lru_cache, which requires hashable args.
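To illustrate the hashability constraint, here is a small sketch. `FileInfoLike` is a stand-in I made up, not the real FileInfo class; it is a plain mutable dataclass, which Python makes unhashable by default, and that is the assumption the example rests on.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass  # eq=True without frozen=True sets __hash__ to None, so instances are unhashable
class FileInfoLike:
    source_file: str = ""


calls = []


@lru_cache(maxsize=None)
def cached_read(source: str) -> str:
    """Pretend to read a file; only hashable arguments are accepted by lru_cache."""
    calls.append(source)
    return f"contents of {source}"


@lru_cache(maxsize=None)
def cached_read_with_info(source: str, info: FileInfoLike) -> str:
    info.source_file = source
    return f"contents of {source}"


# Identical string arguments hit the cache, so the body runs once
cached_read("types.yaml")
cached_read("types.yaml")

# Passing the mutable dataclass fails before the function body runs
unhashable_error = None
try:
    cached_read_with_info("types.yaml", FileInfoLike())
except TypeError as err:
    unhashable_error = err
```

The same tradeoff explains why the cached path cannot fill in a metadata object for the caller: the cache key has to be built from the arguments, and a cache hit skips the body that would populate it.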

Perf of request.py(urlopen):
Before: 288.5s (cumulative) 0.8291s per call
This PR: 18.94s (cumulative) 0.789s per call (the point is that we make far fewer calls)
Difference: -269s (-93%)
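The call counts are not reported above, but they can be estimated from the cumulative and per-call times. The snippet below does that arithmetic; the counts are inferred, not measured.

```python
# Figures copied from the profile summary above
before_total, before_per_call = 288.5, 0.8291
after_total, after_per_call = 18.94, 0.789

# Inferred number of urlopen calls (cumulative time / time per call)
before_calls = round(before_total / before_per_call)
after_calls = round(after_total / after_per_call)

saved_seconds = before_total - after_total
saved_percent = saved_seconds / before_total * 100

print(f"calls: {before_calls} -> {after_calls}")
print(f"saved: {saved_seconds:.1f}s ({saved_percent:.0f}%)")
```

That works out to roughly 348 calls before and 24 after, with the per-call time essentially unchanged.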

Edit: I have no idea why this test is failing. I tried to fix the source schema file and remove the newline at the end of the file, but otherwise I have no idea why it decided to stop printing the filename between 3 hours ago and now. I'll come back in the morning.

codecov · 2024-03-16T04:36:39Z

Codecov Report

Attention: Patch coverage is 84.21053%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 62.92%. Comparing base (27b9158) to head (33ca663).
Report is 3 commits behind head on main.

Files	Patch %	Lines
linkml_runtime/loaders/loader_root.py	72.72%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #307      +/-   ##
==========================================
+ Coverage   62.88%   62.92%   +0.03%     
==========================================
  Files          62       62              
  Lines        8528     8545      +17     
  Branches     2436     2437       +1     
==========================================
+ Hits         5363     5377      +14     
- Misses       2554     2557       +3     
  Partials      611      611

cmungall · 2024-03-18T15:36:50Z

Thanks! Yes, we should be phasing out the older hbread code over time

Use local schemas if available

1edef76

cache hashable calls to yaml loader if possible

sneakers-the-rat mentioned this pull request Mar 13, 2024

[perf] Use yamllib if available #306

Merged

Remove cached reads and restore repeated metadata instantiation which…

33ca663

… is apparently necessary for test issue 1040 to pass

cmungall approved these changes Mar 18, 2024

cmungall merged commit 3b0f817 into linkml:main Mar 21, 2024
9 checks passed
