Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kedro package as a CI check #236

Merged
merged 6 commits into from
Sep 9, 2024
Merged

Conversation

ElenaKhaustova
Copy link
Contributor

@ElenaKhaustova ElenaKhaustova commented Sep 4, 2024

Motivation and Context

Solves kedro-org/kedro#3209

How has this been tested?

  1. make install-test-requirements
  2. behave - to run all tests or behave -n <Scenario> to run specific test

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Assigned myself to the PR
  • Added tests to cover my changes

@ElenaKhaustova ElenaKhaustova marked this pull request as draft September 4, 2024 14:19
@ElenaKhaustova ElenaKhaustova marked this pull request as ready for review September 4, 2024 16:06
@ankatiyar
Copy link
Contributor

Do you mind also adding this in the github actions workflow - https://github.com/kedro-org/kedro-starters/blob/main/.github/workflows/all-checks.yml so it runs in the CI

@ElenaKhaustova ElenaKhaustova marked this pull request as draft September 4, 2024 16:16
@ElenaKhaustova
Copy link
Contributor Author

Do you mind also adding this in the github actions workflow - https://github.com/kedro-org/kedro-starters/blob/main/.github/workflows/all-checks.yml so it runs in the CI

ah, yes, sure

@ElenaKhaustova ElenaKhaustova marked this pull request as ready for review September 4, 2024 23:09
@astrojuanlu
Copy link
Member

Wondering if there's an easier way to setup Hadoop on Windows, maybe through conda (or better, micromamba)? According to conda-forge/pyspark-feedstock#34 (comment), -c conda-forge pyspark does not depend on Hadoop, and when I did tests with @ravi-kumar-pilla , an environment.yml like this was enough:

name: kedro-pyspark
channels:
- conda-forge
dependencies:
- findspark
- openjdk
- pyspark
- python=3.11

The fact that you managed to get Hadoop.exe installed through PowerShell is truly impressive, but I doubt Windows users would go that far, and therefore it would maybe be more useful if the way we do things in our CI is representative of how users get to install Kedro.

(Not to mention that I highly doubt anybody is trying to use Kedro + PySpark on Windows natively, but that's another story)

@astrojuanlu
Copy link
Member

(I don't intend to block the PR over this though, it was more of an informative comment - "if it works, don't touch it")

@ElenaKhaustova
Copy link
Contributor Author

(I don't intend to block the PR over this though, it was more of an informative comment - "if it works, don't touch it")

Since we use that approach to run other e2e tests as well, maybe we should address this in a separate PR?

Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
I think starters CI needs an update anyway (also need to update actions versions) we can do it separately.

@ElenaKhaustova
Copy link
Contributor Author

ElenaKhaustova commented Sep 6, 2024

@astrojuanlu, @ankatiyar - separate issue opened for updating the whole CI for starters: #237

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the Windows + Hadoop comment because I was passing by but please always take my comments on CI and infra work as suggestions and with a full bucket of salt 😊 Leaving the spot for others to do the final approval 👍🏼

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

features/package.feature Outdated Show resolved Hide resolved
Co-authored-by: Merel Theisen <[email protected]>
@ElenaKhaustova ElenaKhaustova enabled auto-merge (squash) September 9, 2024 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants