Add GenAI Preprocessor logic #2314
Merged
Test Results for model-monitoring-ci: 258 tests, 244 ✅, 2h 16m 31s ⏱️. Results for commit c3c63ac. ♻️ This comment has been updated with latest results.
This reverts commit 836396b.
This reverts commit 14a4d63.
ycheng35xo approved these changes Feb 20, 2024
anushree1808 pushed a commit to anushree1808/azureml-assets that referenced this pull request Apr 23, 2024
* add sample raw logs to e2e test resources
* working on genai preprocessor
* adding logic for genAI preprocessor
* adding UT tests
* select cols in df to v1 schema
* UT
* fix UT
* fix data
* refactor shared mdc code in genai preprocessor
* syntax fixes
* let all extra columns through
* syntax
* syntax
* doc syntax
* adding e2e test
* adding dataref example
* remove unused import
* add file datalake pacakge to genai preprocessor
* no need to bump version. Hasn't released v1 yet
* add dataref testcase but comment out for now
* fix UT
* merge trace aggregator and genai preprocessor PRs into one (Azure#2323)
* temp testing files
* stubbing out trace logic
* update type hinting
* add sample raw logs to e2e test resources
* remove attributes for temp testing
* temp files
* working on span tree json_string repr
* fix syntaxes
* finished logic for trace aggregator
* remove temp files
* add type hints
* syntax fix and working on UTs
* remove unused property
* make span_row internal private
* fix syntax
* syntax fixes
* add SpanTreeNode tests
* add tests for spantree and spantreenode
* syntax
* Revert "add sample raw logs to e2e test resources" This reverts commit 5a6ca1b64ba29586aaa9f62f7f568cac268687d7.
* syntax fixes
* syntax fix
* fix UT
* Update test_trace_aggregator.py
* Update test_trace_aggregator.py
* change to timestamptype
* schema in alphabetical order
* columns will be in datetime. Handle correct json serialization
* expect datetime object in spantree tests not string
* moved span utils out of subfolder
* fix syntax
* add debug and temp file
* assign df to return
* remove debug info
* add from json str test
* use property in function instead of private var
* syntax changes
* syntax
* add more validation in span tree util
* move datetime serialization logic to appropriate function
* syntax
* fix test case for to_dict()
* add trace aggregator in preprocessor
* user_id and session_id may not be available for v0
* comment out user id and session id in schema
* unused import for now
* syntax
* fix UT with less schema
* slim down debugging show()
* refactor trace aggregator to use more pyspark
* fix import to be same as others
* fix UT and import stmts
* modify UT
* modify UT
* Update trace_aggregator.py
* fix UT
* fix syntax
* syntax
* refactor tree construction to include all fields from span logs
* syntax
* syntax fix
* format output of trace aggregator to fit agg trace schema
* remove unused variable
* unused imports
* unused definition
* set auto for parallel pytests
* Revert "set auto for parallel pytests" This reverts commit a08b2b0e3b8488bcd23b8d49f3fb805757e0298b.
* revert main merge
* Update model-monitoring-ci.yml
* revert auto change
* pass schema to map function, not call it on workers
* addressing some comments
* fix UT
* add new flag for calculate trace logs in preprocessor
* return None column if can't find in raw logs data
* add UT for None columns if not in raw logs
* Revert "add new flag for calculate trace logs in preprocessor" This reverts commit fe60dc9d3ef21f257d45b44d64dc6554136700f6.
* syntax
* syntax
* Remove problematic test for v1
* add trace aggregator UT
* setup pyspark path for each testcase
* remove broken UT for v1
* Revert "remove broken UT for v1" This reverts commit 26019ccd9271b19338a7eb650b6f3bec1f6e6f66.
* Reapply "remove broken UT for v1" This reverts commit e6bfb8f48e0d2d33d1d6a3955cc9d07896a48d66.
* Revert "setup pyspark path for each testcase" This reverts commit e9076184321b9154ab5a7ef328f17910f744d383.
* Revert "add trace aggregator UT" This reverts commit 9d3c8534747da8ef79f887dd159d9203e68ce756.
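For readers skimming the commit list, the span tree work it mentions (tree construction from span logs, `to_dict()`, and datetime-aware JSON serialization) could look roughly like the sketch below. This is only an illustration in plain Python, not the code from this PR: `SpanNode`, `build_span_tree`, `to_json_str`, and the field names are all hypothetical, and the real implementation runs on PySpark and may differ substantially.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class SpanNode:
    """Hypothetical stand-in for a span tree node (not the PR's SpanTreeNode)."""
    span_id: str
    parent_id: Optional[str]
    start_time: datetime
    end_time: datetime
    children: List["SpanNode"] = field(default_factory=list)

    def to_dict(self) -> dict:
        # Datetimes stay as datetime objects here; they are only turned
        # into strings at JSON serialization time.
        ordered = sorted(self.children, key=lambda c: c.start_time)
        return {
            "span_id": self.span_id,
            "parent_id": self.parent_id,
            "start_time": self.start_time,
            "end_time": self.end_time,
            "children": [child.to_dict() for child in ordered],
        }


def build_span_tree(spans: List[SpanNode]) -> Optional[SpanNode]:
    """Link spans by parent_id and return the root span.

    Assumption: each trace has exactly one span without a known parent.
    """
    by_id: Dict[str, SpanNode] = {span.span_id: span for span in spans}
    root = None
    for span in spans:
        parent = by_id.get(span.parent_id) if span.parent_id else None
        if parent is None:
            root = span
        else:
            parent.children.append(span)
    return root


def _json_default(value):
    # json.dumps cannot serialize datetime on its own, so emit ISO 8601.
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def to_json_str(node: SpanNode) -> str:
    """Serialize a span tree to a JSON string."""
    return json.dumps(node.to_dict(), default=_json_default)
```

Keeping the datetime-to-string conversion in one place, at serialization time, is one way to read the "move datetime serialization logic to appropriate function" and "expect datetime object in spantree tests not string" commits, though that reading is an assumption.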
sample e2e component test: https://ml.azure.com/experiments/id/15d0d2a8-d3fb-41ef-8609-72af4968f196/runs/modest_carpet_b58g2pwb8t?wsid=/subscriptions/5fca341e-4ec3-45bd-b006-a7648d9376c5/resourceGroups/modelmonitoring-e2e-rg/providers/Microsoft.MachineLearningServices/workspaces/momo-e2e-test-ws-eastus&tid=72f988bf-86f1-41af-91ab-2d7cd011db47#