Add GenAI Preprocessor logic #2314

Merged
merged 72 commits into from
Feb 20, 2024

Conversation

@alanpo1 requested a review from a team as a code owner February 13, 2024 03:03
@alanpo1 changed the title from "Add Preprocessor logic" to "Add GenAI Preprocessor logic" Feb 13, 2024

github-actions bot commented Feb 13, 2024

Test Results for model-monitoring-ci

258 tests: 244 passed ✅, 14 skipped 💤, 0 failed ❌
1 suite, 1 file
Duration: 2h 16m 31s ⏱️

Results for commit c3c63ac.

♻️ This comment has been updated with latest results.

@alanpo1 merged commit 9746f3a into main Feb 20, 2024
25 checks passed
@alanpo1 deleted the alanpoblette/SpanLogPreprocessor branch February 20, 2024 22:22
anushree1808 pushed a commit to anushree1808/azureml-assets that referenced this pull request Apr 23, 2024
* add sample raw logs to e2e test resources

* working on genai preprocessor

* adding logic for genAI preprocessor

* adding UT tests

* select cols in df to v1 schema

* UT

* fix UT

* fix data

* refactor shared mdc code in genai preprocessor

* syntax fixes

* let all extra columns through

* syntax

* syntax

* doc syntax

* adding e2e test

* adding dataref example

* remove unused import

* add file datalake package to genai preprocessor

* no need to bump version. Hasn't released v1 yet

* add dataref testcase but comment out for now

* fix UT

* merge trace aggregator and genai preprocessor PRs into one (Azure#2323)

* temp testing files

* stubbing out trace logic

* update type hinting

* add sample raw logs to e2e test resources

* remove attributes for temp testing

* temp files

* working on span tree json_string repr

* fix syntaxes

* finished logic for trace aggregator

* remove temp files

* add type hints

* syntax fix and working on UTs

* remove unused property

* make span_row internal private

* fix syntax

* syntax fixes

* add SpanTreeNode tests

* add tests for spantree and spantreenode

* syntax

* Revert "add sample raw logs to e2e test resources"

This reverts commit 5a6ca1b64ba29586aaa9f62f7f568cac268687d7.

* syntax fixes

* syntax fix

* fix UT

* Update test_trace_aggregator.py

* Update test_trace_aggregator.py

* change to timestamptype

* schema in alphabetical order

* columns will be in datetime. Handle correct json serialization

* expect datetime object in spantree tests not string

* moved span utils out of subfolder

* fix syntax

* add debug and temp file

* assign df to return

* remove debug info

* add from json str test

* use property in function instead of private var

* syntax changes

* syntax

* add more validation in span tree util

* move datetime serialization logic to appropriate function

* syntax

* fix test case for to_dict()

* add trace aggregator in preprocessor

* user_id and session_id may not be available for v0

* comment out user id and session id in schema

* unused import for now

* syntax

* fix UT with less schema

* slim down debugging show()

* refactor trace aggregator to use more pyspark

* fix import to be same as others

* fix UT and import stmts

* modify UT

* modify UT

* Update trace_aggregator.py

* fix UT

* fix syntax

* syntax

* refactor tree construction to include all fields from span logs

* syntax

* syntax fix

* format output of trace aggregator to fit agg trace schema

* remove unused variable

* unused imports

* unused definition

* set auto for parallel pytests

* Revert "set auto for parallel pytests"

This reverts commit a08b2b0e3b8488bcd23b8d49f3fb805757e0298b.

* revert main merge

* Update model-monitoring-ci.yml

* revert auto change

* pass schema to map function, not call it on workers

* addressing some comments

* fix UT

* add new flag for calculate trace logs in preprocessor

* return None column if can't find in raw logs data

* add UT for None columns if not in raw logs

* Revert "add new flag for calculate trace logs in preprocessor"

This reverts commit fe60dc9d3ef21f257d45b44d64dc6554136700f6.

* syntax

* syntax

* Remove problematic test for v1

* add trace aggregator UT

* setup pyspark path for each testcase

* remove broken UT for v1

* Revert "remove broken UT for v1"

This reverts commit 26019ccd9271b19338a7eb650b6f3bec1f6e6f66.

* Reapply "remove broken UT for v1"

This reverts commit e6bfb8f48e0d2d33d1d6a3955cc9d07896a48d66.

* Revert "setup pyspark path for each testcase"

This reverts commit e9076184321b9154ab5a7ef328f17910f744d383.

* Revert "add trace aggregator UT"

This reverts commit 9d3c8534747da8ef79f887dd159d9203e68ce756.
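
Several commits above mention a span tree built from span logs, a `SpanTreeNode`, a `to_dict()` method, and datetime-aware JSON serialization. The PR's code is not shown on this page, so what follows is only a minimal sketch of how such a structure might look. Every name and field in it (`span_id`, `parent_id`, `start_time`, `build_span_tree`, `to_json_str`) is hypothetical, and it uses plain Python rather than the PySpark-based logic the commits refer to.

```python
import json
from datetime import datetime
from typing import Dict, List, Optional


class SpanTreeNode:
    """Hypothetical node wrapping one span row (a plain dict)."""

    def __init__(self, span_row: dict):
        self._span_row = span_row  # kept private, exposed through properties
        self.children: List["SpanTreeNode"] = []

    @property
    def span_id(self) -> str:
        return self._span_row["span_id"]

    @property
    def parent_id(self) -> Optional[str]:
        return self._span_row.get("parent_id")

    def to_dict(self) -> dict:
        """Return the span fields plus recursively converted children."""
        result = dict(self._span_row)
        result["children"] = [child.to_dict() for child in self.children]
        return result


def build_span_tree(span_rows: List[dict]) -> Optional[SpanTreeNode]:
    """Link spans by parent_id and return the root span.

    Assumes one root per trace; if several spans lack a known parent,
    the last one encountered is returned.
    """
    nodes: Dict[str, SpanTreeNode] = {
        row["span_id"]: SpanTreeNode(row) for row in span_rows
    }
    root = None
    for node in nodes.values():
        parent = nodes.get(node.parent_id)
        if parent is None:
            root = node
        else:
            parent.children.append(node)
    return root


def to_json_str(root: SpanTreeNode) -> str:
    """Serialize the tree; datetime values become ISO 8601 strings."""

    def encode(value):
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    return json.dumps(root.to_dict(), default=encode)
```

Keeping datetime handling inside the serialization function, rather than converting timestamps up front, lets the tree hold real `datetime` objects until the point of output, which appears to be the direction the "move datetime serialization logic to appropriate function" commit took.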

3 participants