Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Notebook testing changes #1606

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

[WIP] Notebook testing changes #1606

wants to merge 17 commits into from

Conversation

sfc-gh-chu
Copy link
Contributor

@sfc-gh-chu sfc-gh-chu commented Oct 29, 2024

Description

Please include a summary of the changes and the related issue that can be
included in the release announcement. Please also include relevant motivation
and context.

Other details good to know for developers

Please include any other details of this change useful for TruLens developers.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to
    not work as expected)
  • New Tests
  • This change includes re-generated golden test results
  • This change requires a documentation update

Important

Enhance notebook testing with pytest, refactor preprocessors, improve environment handling, and update model references to gpt-4o-mini.

  • Testing Enhancements:
    • Replace unittest with pytest in test_notebooks.py.
    • Add test_notebook and test_notebook_backwards_compat functions.
  • Preprocessor Refactoring:
    • Refactor VariableSettingPreprocessor to include init_code and code_to_run_before_each_cell as optional parameters.
    • Add DBMigrationPreprocessor for database migration logic.
  • Environment Handling:
    • Use dotenv_values to load environment variables from .env file in test_notebooks.py.
  • Model Updates:
    • Update model references from gpt-3.5-turbo to gpt-4o-mini in App_TruBot.py, trubot.py, and provider.py.

This description was created by Ellipsis for 65600f9. It will automatically update as commits are pushed.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Oct 29, 2024
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to f5da453 in 18 seconds

More details
  • Looked at 208 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. tests/docs_notebooks/test_notebooks.py:2
  • Draft comment:
    The import statement for listdir is redundant since glob is used for listing files. Consider removing it to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The import statement for listdir is redundant since glob is used for listing files. Removing it will clean up the code.
2. tests/docs_notebooks/test_notebooks.py:5
  • Draft comment:
    The main() function from unittest is not used in the script. Consider removing this import to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The main() function from unittest is not used in the script. It should be removed to avoid confusion and clean up the code.
3. tests/docs_notebooks/test_notebooks.py:20
  • Draft comment:
    Consider using default values directly in the function signature for init_code and code_to_run_before_each_cell to simplify the logic. For example, use init_code: Optional[Iterable[str]] = [] and code_to_run_before_each_cell: Optional[Iterable[str]] = [].
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The VariableSettingPreprocessor class has a potential issue with the init_code and code_to_run_before_each_cell parameters. If these are empty lists, they will be converted to empty strings, which is fine, but the logic could be simplified by using default values directly in the function signature.
4. tests/docs_notebooks/test_notebooks.py:82
  • Draft comment:
    The NOTEBOOKS_TO_TEST variable uses glob to find files, making the listdir import unnecessary. Ensure listdir is removed to clean up the code.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The NOTEBOOKS_TO_TEST variable uses glob to find files, which is more efficient and flexible than listdir. The listdir import is unnecessary and should be removed.

Workflow ID: wflow_jXwZ85yySYMMLpuc


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped PR review on 1a61834 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped PR review on 7861740 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped PR review on a3c3a52 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 2c093b4 in 43 seconds

More details
  • Looked at 182 lines of code in 12 files
  • Skipped 9 files when reviewing.
  • Skipped posting 7 drafted comments based on config settings.
1. docs/component_guides/instrumentation/langchain.md:51
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
2. docs/component_guides/instrumentation/nemo.md:41
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo-instruct to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
3. examples/experimental/end2end_apps/trubot/App_TruBot.py:27
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
4. examples/experimental/end2end_apps/trubot/trubot.py:121
  • Draft comment:
    Ensure that the change from text-embedding-ada-002 to text-embedding-3-small is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
5. examples/expositional/use_cases/app_with_human_feedback.py:34
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
6. src/apps/langchain/trulens/apps/langchain/tru_chain.py:154
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.
7. src/core/trulens/core/feedback/provider.py:55
  • Draft comment:
    Ensure that the change from gpt-3.5-turbo to gpt-4o-mini is consistent across all documentation and code examples. This change is reflected here, but verify it is applied everywhere in the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR involves changing the model name from 'gpt-3.5-turbo' to 'gpt-4o-mini' across multiple files. This change should be consistent throughout the codebase.

Workflow ID: wflow_cpzptL9lps50rwtw


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@sfc-gh-pmardziel
Copy link
Contributor

Heys, I don't know if possible here but can you check whether the notebooks tested can be executed in a manner resembling colab or snowflake which notably executes each cell in a different async task?

sfc-gh-dkurokawa and others added 2 commits October 30, 2024 16:56
…ke.connector.cursor.SnowflakeCursor::execute` instead as its thread-safe.
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on c6c0a19 in 16 seconds

More details
  • Looked at 29 lines of code in 2 files
  • Skipped 4 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. Makefile:74
  • Draft comment:
    Ensure that the newly added dependencies llama-index-retrievers-bm25 and llama-index-tools-yelp are necessary for the notebook environment. If they are not required, consider removing them to keep the environment lean.
  • Reason this comment was not posted:
    Confidence changes required: 33%
    The change in the Makefile adds new dependencies for the notebook environment. This is a straightforward addition and does not seem to introduce any issues. However, it's important to ensure that these dependencies are necessary and correctly specified.

Workflow ID: wflow_Lg5455iAcdqxAMX5


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 65600f9 in 22 seconds

More details
  • Looked at 45 lines of code in 1 files
  • Skipped 5 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. Makefile:225
  • Draft comment:
    Consider re-adding the -n auto option to the pytest command in the test-notebook target to enable parallel test execution, which can improve test performance.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The test-notebook target in the Makefile was modified to remove the -n auto option from the pytest command. This change might affect the parallel execution of tests, potentially leading to longer test execution times. It's important to ensure that this change is intentional and that the impact on test execution time is acceptable.

Workflow ID: wflow_ZeR9aEk87af5pIV9


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants