Commit

merged
Signed-off-by: rashidakanchwala <[email protected]>
rashidakanchwala committed Nov 14, 2024
2 parents c2bd671 + aa6c8cb commit f469ccc
Showing 71 changed files with 2,338 additions and 2,050 deletions.
Binary file added .github/img/backend-architecture.png
Binary file added .github/img/frontend-architecture.png
48 changes: 48 additions & 0 deletions .github/workflows/label-community-issues.yml
@@ -0,0 +1,48 @@
name: Label Community Issues

on:
  issues:
    types:
      - opened

jobs:
  label:
    runs-on: ubuntu-latest
    steps:
      - name: Check if issue author is a member of Kedro org
        uses: actions/github-script@v6
        id: membership
        with:
          github-token: ${{ secrets.GH_TAGGING_TOKEN }}
          result-encoding: string
          script: |
            try {
              const result = await github.rest.orgs.getMembershipForUser({
                org: "kedro-org",
                username: '${{ github.actor }}'
              })
              console.log(result?.data?.state)
              if (result?.data?.state == "active"){
                console.log("%s: detected as an active member of Kedro org", '${{ github.actor }}')
                return "member";
              } else {
                console.log("%s: not detected as active member of Kedro org", '${{ github.actor }}')
                return "notMember";
              }
            } catch (error) {
              console.log("%s: Error occurred and marked user as notMember", '${{ github.actor }}')
              console.log("Error", error.stack);
              console.log("Error", error.name);
              console.log("Error", error.message);
              return "notMember";
            }
      - name: Label issue if author is from community
        if: ${{ steps.membership.outputs.result == 'notMember' }}
        uses: actions-ecosystem/action-add-labels@v1
        with:
          github_token: ${{ secrets.GH_TAGGING_TOKEN }}
          labels: 'Community'
20 changes: 20 additions & 0 deletions .github/workflows/no-response.yml
@@ -0,0 +1,20 @@
name: No Response

on:
  issue_comment:
    types: [created]
  schedule:
    # Run every day at 9am (UTC time)
    - cron: '0 9 * * *'

jobs:
  noResponse:
    runs-on: ubuntu-latest
    steps:
      - uses: lee-dohm/[email protected]
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          responseRequiredLabel: "support: needs more info"
          daysUntilClose: 28
          closeComment: >-
            This issue has been closed due to lack of information. Feel free to re-open this issue if you're facing a similar problem. Please provide as much information as possible so we can help resolve your issue.
9 changes: 8 additions & 1 deletion ARCHITECTURE.md
@@ -6,6 +6,7 @@ For further information, see also:

- [Kedro-Viz contributing documentation](CONTRIBUTING.md), which covers how to start development on the project
- [Kedro-Viz style guide](STYLE_GUIDE.md), which walks through our standards and recommended best practices for our codebase
- [Kedro-Viz Architecture Diagram](https://miro.com/app/board/uXjVKhNg1RE=/?moveToWidget=3458764606468376036&cot=10), which gives a high-level overview of the back-end and front-end and how they are connected.

## High-level Overview

@@ -62,7 +63,7 @@ The `localStorage` state is updated automatically on every Redux store update, v

## Data ingestion

![Kedro-Viz data flow diagram](/.github/img/app-architecture-data-flow.png)
![Kedro-Viz data flow diagram](/.github/img/frontend-architecture.png)

Kedro-Viz currently utilizes two different methods of data ingestion: the Redux setup for the pipeline and flowchart-view related components, and GraphQL via Apollo Client for the experiment tracking components.

@@ -147,3 +148,9 @@ Kedro-Viz includes a graph layout engine, for details see the [layout engine doc
Our layout engine runs inside a web worker, which asynchronously performs these expensive calculations in a separate CPU thread, in order to avoid this blocking other operations on the main thread (e.g. CSS transitions and other state updates).

The app uses [redux-watch](https://github.com/ExodusMovement/redux-watch) with a graph input selector to watch the store for state changes relevant to the graph layout. If the layout needs to change, this listener dispatches an asynchronous action which sends a message to the web worker to instruct it to calculate the new layout. Once the layout worker completes its calculations, it returns a new action to update the store's `state.graph` property with the new layout. Updates to the graph input state during worker calculations will interrupt the worker and cause it to start over from scratch.

## Backend Architecture

![Kedro-Viz backend architecture](/.github/img/backend-architecture.png)

The backend of Kedro-Viz serves as the data provider and API layer that interacts with Kedro projects and manages data access for visualisations in the frontend. It offers both REST and GraphQL APIs to support data retrieval for the frontend, allowing access to pipeline structures, node-specific details, and experiment tracking data. Key components include the `DataAccessManager`, which interfaces with data `Repositories` to fetch and structure data. The CLI lets users launch Kedro-Viz from the command line, while the deploy and build options enable sharing of pipeline visualisations on any static website hosting platform.
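
As a rough illustration of the layering described above, the sketch below shows a manager object that delegates storage to a repository and shapes the result for an API response. Only the idea of a `DataAccessManager` sitting on top of repositories comes from the text; the `Node` record, `NodesRepository`, and every method name here are hypothetical and are not taken from the Kedro-Viz codebase.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical record for one pipeline node."""

    id: str
    name: str
    tags: List[str] = field(default_factory=list)


class NodesRepository:
    """Hypothetical in-memory repository that stores nodes keyed by id."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Node] = {}

    def add(self, node: Node) -> None:
        self._nodes[node.id] = node

    def get(self, node_id: str) -> Optional[Node]:
        return self._nodes.get(node_id)

    def all(self) -> List[Node]:
        return list(self._nodes.values())


class DataAccessManager:
    """Illustrative facade: the API layer talks to this, not to repositories."""

    def __init__(self) -> None:
        self.nodes = NodesRepository()

    def add_node(self, node_id: str, name: str, tags: Optional[List[str]] = None) -> Node:
        node = Node(id=node_id, name=name, tags=list(tags or []))
        self.nodes.add(node)
        return node

    def nodes_with_tag(self, tag: str) -> List[str]:
        # Shape repository data into what an API response might return.
        return sorted(n.name for n in self.nodes.all() if tag in n.tags)


manager = DataAccessManager()
manager.add_node("n1", "preprocess_companies", ["preprocessing"])
manager.add_node("n2", "train_model", ["modelling"])
print(manager.nodes_with_tag("preprocessing"))
```

Keeping the repositories behind a single manager means a REST handler and a GraphQL resolver could share the same lookup code, which is one plausible reason for the split the paragraph describes.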
3 changes: 3 additions & 0 deletions RELEASE.md
@@ -21,6 +21,9 @@ Please follow the established format:
- Display full dataset type with library prefix in metadata panel (#2136)
- Enable SQLite WAL mode for Azure ML to fix database locking issues (#2131)
- Replace `flake8`, `isort`, `pylint` and `black` by `ruff` (#2149)
- Refactor `DatasetStatsHook` to avoid showing error when dataset doesn't have file size info (#2174)
- Fix 404 error when accessing the experiment tracking page on the demo site (#2179)
- Add check for port availability before starting Kedro Viz to prevent unintended browser redirects when the port is already in use (#2176)


# Release 10.0.0
2 changes: 1 addition & 1 deletion cypress/tests/ui/flowchart/flowchart.cy.js
@@ -70,7 +70,7 @@ describe('Flowchart DAG', () => {
const nodeToToggleText = 'Parameters';

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${nodeToToggleText}]`).as(
cy.get(`.toggle-control__checkbox[name=${nodeToToggleText}]`).as(
'nodeToToggle'
);

34 changes: 17 additions & 17 deletions cypress/tests/ui/flowchart/menu.cy.js
@@ -41,7 +41,7 @@ describe('Flowchart Menu', () => {
});

// Pipeline Label in the Menu
cy.get('.pipeline-nodelist__row__label')
cy.get('.row-text__label')
.first()
.invoke('text')
.should((pipelineLabel) => {
@@ -57,7 +57,7 @@ describe('Flowchart Menu', () => {
cy.get('.search-input__field').type(searchInput, { force: true });

// Pipeline Label in the Menu
cy.get('.pipeline-nodelist__row__label')
cy.get('.row-text__label')
.first()
.invoke('text')
.should((pipelineLabel) => {
@@ -72,7 +72,7 @@ describe('Flowchart Menu', () => {

// Action
cy.get(
`.MuiTreeItem-label > .pipeline-nodelist__row > [data-test=nodelist-data-${nodeToClickText}]`
`.MuiTreeItem-label > .node-list-tree-item-row > [data-test=node-list-tree-item--row--${nodeToClickText}]`
)
.should('exist')
.as('nodeToClick');
@@ -91,7 +91,7 @@ describe('Flowchart Menu', () => {

// Action
cy.get(
`.MuiTreeItem-label > .pipeline-nodelist__row > [data-test=nodelist-data-${nodeToHighlightText}]`
`.MuiTreeItem-label > .node-list-tree-item-row > [data-test=node-list-tree-item--row--${nodeToHighlightText}]`
)
.should('exist')
.as('nodeToHighlight');
@@ -108,7 +108,7 @@ describe('Flowchart Menu', () => {
const nodeToToggleText = 'Companies';

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${nodeToToggleText}]`, {
cy.get(`.toggle-control__checkbox[name=${nodeToToggleText}]`, {
timeout: 5000,
}).as('nodeToToggle');

@@ -121,7 +121,7 @@ describe('Flowchart Menu', () => {

// Assert after action
cy.__checkForText__(
`[data-test=nodelist-data-${nodeToToggleText}] > .pipeline-nodelist__row__label--faded`,
`[data-test=node-list-tree-item--row--${nodeToToggleText}] > .row-text__label--faded`,
nodeToToggleText
);
cy.get('.pipeline-node__text').should('not.contain', nodeToToggleText);
@@ -137,7 +137,7 @@ describe('Flowchart Menu', () => {

// Action
cy.get(
`[for=${nodeToFocusText}-focus] > .pipeline-nodelist__row__icon`
`[for=feature_engineering-focus]`
).click();

// Assert after action
@@ -161,34 +161,34 @@ describe('Flowchart Menu', () => {
const visibleRowLabel = 'Companies';

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${nodeToToggleText}]`).as(
cy.get(`.toggle-control__checkbox[name=${nodeToToggleText}]`).as(
'nodeToToggle'
);

// Assert before action
cy.get('@nodeToToggle').should('be.checked');
cy.get(
`[data-test=nodelist-data-${visibleRowLabel}] > .pipeline-nodelist__row__label`
`[data-test=node-list-tree-item--row--${visibleRowLabel}] > .row-text__label`
)
.should('not.have.class', 'pipeline-nodelist__row__label--faded')
.should('not.have.class', 'pipeline-nodelist__row__label--disabled');
.should('not.have.class', 'row-text__label--faded')
.should('not.have.class', 'row-text__label--disabled');

// Action
cy.get('@nodeToToggle').uncheck({ force: true });

// Assert after action
cy.get(
`[data-test=nodelist-data-${visibleRowLabel}] > .pipeline-nodelist__row__label`
`[data-test=node-list-tree-item--row--${visibleRowLabel}] > .row-text__label`
)
.should('have.class', 'pipeline-nodelist__row__label--faded')
.should('have.class', 'pipeline-nodelist__row__label--disabled');
.should('have.class', 'row-text__label--faded')
.should('have.class', 'row-text__label--disabled');
});

it('verifies that after checking node type URL should be updated with correct query params', () => {
const nodeToToggleText = 'Parameters';

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${nodeToToggleText}]`).as(
cy.get(`.toggle-control__checkbox[name=${nodeToToggleText}]`).as(
'nodeToToggle'
);

@@ -207,7 +207,7 @@ describe('Flowchart Menu', () => {
cy.visit(`/?tags=${visibleRowLabel}`);

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${visibleRowLabel}]`).as(
cy.get(`.toggle-control__checkbox[name=${visibleRowLabel}]`).as(
'nodeToToggle'
);

@@ -220,7 +220,7 @@ describe('Flowchart Menu', () => {
cy.visit('/?types=datasets');

// Alias
cy.get(`.pipeline-nodelist__row__checkbox[name=${visibleRowLabel}]`).as(
cy.get(`.toggle-control__checkbox[name=${visibleRowLabel}]`).as(
'nodeToToggle'
);

8 changes: 4 additions & 4 deletions cypress/tests/ui/toolbar/global-toolbar.cy.js
@@ -81,14 +81,14 @@ describe('Global Toolbar', () => {
cy.get('@isPrettyNameCheckbox').should('be.checked');

// Menu
cy.get(`[data-test="nodelist-modularPipeline-${prettifyName(modularPipelineText)}"]`).click();
cy.get(`[data-test="nodelist-${nodeNameType}-${prettyNodeNameText}"]`).should('exist');
cy.get(`[data-test="node-list-tree-item--row--${prettifyName(modularPipelineText)}"]`).click();
cy.get(`[data-test="node-list-tree-item--row--${prettyNodeNameText}"]`).should('exist');

// Flowchart
cy.get('.pipeline-node__text').should('contain', prettyNodeNameText);

// Metadata
cy.get(`[data-test="nodelist-${nodeNameType}-${prettyNodeNameText}"]`).click({ force: true });
cy.get(`[data-test="node-list-tree-item--row--${prettyNodeNameText}"]`).click({ force: true });
cy.get('.pipeline-metadata__title').should(
'have.text',
prettyNodeNameText
@@ -106,7 +106,7 @@ describe('Global Toolbar', () => {
// Assert after action
cy.__waitForPageLoad__(() => {
// Menu
cy.get(`[data-test="nodelist-${nodeNameType}-${originalNodeNameText}"]`).should('exist');
cy.get(`[data-test="node-list-tree-item--row--${originalNodeNameText}"]`).should('exist');

// Flowchart
cy.get('.pipeline-node__text').should('contain', originalNodeNameText);
37 changes: 37 additions & 0 deletions docs/source/kedro-viz_visualisation.md
@@ -194,6 +194,43 @@ The visualisation now includes the layers:

![](./images/pipeline_visualisation_with_layers.png)

Duplicated definitions like:

```yaml
metadata:
  kedro-viz:
    layer: raw
```

can be avoided by using YAML's native syntax for anchors and aliases.

Use an anchor (`&`) first, to create a reusable piece of configuration:

```yaml
_raw_layer: &raw_layer
  metadata:
    kedro-viz:
      layer: 01_raw
```

And then use aliases (`*`) to reference it:

```yaml
companies:
  type: pandas.CSVDataset
  filepath: data/01_raw/companies.csv
  <<: *raw_layer
reviews:
  type: pandas.CSVDataset
  filepath: data/01_raw/reviews.csv
  <<: *raw_layer
# Same for other datasets of the raw layer...
```

See [this example from the Kedro docs](https://docs.kedro.org/en/stable/data/data_catalog_yaml_examples.html#load-multiple-datasets-with-similar-configuration-using-yaml-anchors) for more details.
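
To make the effect of the `<<` merge key concrete, the sketch below mimics in plain Python the dictionary each catalog entry would resolve to once the alias is expanded. It uses dict unpacking rather than a YAML parser, so it only approximates the merge behaviour (keys written on the entry itself take precedence over merged ones); the helper name `with_alias` is hypothetical.

```python
# Stand-in for the anchored block `_raw_layer: &raw_layer`.
raw_layer = {"metadata": {"kedro-viz": {"layer": "01_raw"}}}


def with_alias(entry: dict, anchor: dict) -> dict:
    """Mimic `<<: *anchor`: merged keys come first, the entry's own keys win."""
    return {**anchor, **entry}


companies = with_alias(
    {"type": "pandas.CSVDataset", "filepath": "data/01_raw/companies.csv"},
    raw_layer,
)
reviews = with_alias(
    {"type": "pandas.CSVDataset", "filepath": "data/01_raw/reviews.csv"},
    raw_layer,
)

# Both entries now carry the layer metadata without repeating it.
print(companies["metadata"]["kedro-viz"]["layer"])
print(reviews["metadata"]["kedro-viz"]["layer"])
```

Each dataset ends up with its own `type` and `filepath` plus the shared `metadata` block, which is the deduplication the anchor is meant to provide.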

## Share a pipeline visualisation

You can save a pipeline structure within a Kedro-Viz visualisation directly from the terminal as follows:
2 changes: 1 addition & 1 deletion package.json
@@ -13,7 +13,7 @@
},
"proxy": "http://localhost:4142/",
"scripts": {
"build": "cross-env GENERATE_SOURCEMAP=false react-scripts build",
"build": "cross-env GENERATE_SOURCEMAP=false react-scripts build && cp ./build/index.html ./build/404.html",
"postbuild": "rm -rf build/api",
"start": "REACT_APP_DATA_SOURCE=$DATA NODE_OPTIONS=\"--dns-result-order=ipv4first\" npm-run-all -p start:app start:lib",
"start:dev": "rm -rf node_modules/.cache && npm start",
28 changes: 18 additions & 10 deletions package/kedro_viz/integrations/kedro/hooks.py
@@ -7,6 +7,7 @@
from pathlib import Path, PurePosixPath
from typing import Any, Union

import fsspec
from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog
from kedro.io.core import get_filepath_str
@@ -141,19 +142,26 @@ def get_file_size(self, dataset: Any) -> Union[int, None]:
Args:
dataset: A dataset instance for which we need the file size
Returns: file size for the dataset if file_path is valid, if not returns None
Returns:
File size for the dataset if available, otherwise None.
"""

if not (hasattr(dataset, "_filepath") and dataset._filepath):
return None

try:
file_path = get_filepath_str(
PurePosixPath(dataset._filepath), dataset._protocol
)
return dataset._fs.size(file_path)
if hasattr(dataset, "filepath") and dataset.filepath:
filepath = dataset.filepath
# Fallback to private '_filepath' for known datasets
elif hasattr(dataset, "_filepath") and dataset._filepath:
filepath = dataset._filepath
else:
return None

fs, path_in_fs = fsspec.core.url_to_fs(filepath)
if fs.exists(path_in_fs):
file_size = fs.size(path_in_fs)
return file_size
else:
return None

except Exception as exc:
except Exception as exc: # pragma: no cover
logger.warning(
"Unable to get file size for the dataset %s: %s", dataset, exc
)
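
The lookup order introduced in this hunk (public `filepath` first, then the private `_filepath`, otherwise `None`) can be sketched in isolation as follows. This is a minimal standalone version under stated assumptions: it swaps `fsspec` for `os.path` so it runs with the standard library only and therefore handles local files only, and `FakeDataset` is a hypothetical stand-in for a real dataset class.

```python
import os
import tempfile
from typing import Any, Optional


def get_file_size(dataset: Any) -> Optional[int]:
    """Return the size in bytes of the file behind `dataset`, or None."""
    # Prefer a public `filepath`, then fall back to the private `_filepath`.
    filepath = getattr(dataset, "filepath", None) or getattr(dataset, "_filepath", None)
    if not filepath:
        return None
    try:
        path = os.fspath(filepath)
        # Assumption: local files only; the hunk resolves the path through fsspec.
        return os.path.getsize(path) if os.path.exists(path) else None
    except (OSError, TypeError):
        return None


class FakeDataset:
    """Hypothetical dataset exposing only the private attribute."""

    def __init__(self, path: str) -> None:
        self._filepath = path


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "companies.csv")
    with open(target, "wb") as handle:
        handle.write(b"id,name\n")  # 8 bytes
    size_found = get_file_size(FakeDataset(target))
    size_missing = get_file_size(FakeDataset(os.path.join(tmp, "absent.csv")))

size_no_path = get_file_size(object())
print(size_found, size_missing, size_no_path)
```

Returning `None` for a missing file, instead of letting the size call raise, matches the intent stated in the release note: datasets without file size information should not produce an error.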
5 changes: 4 additions & 1 deletion package/kedro_viz/launchers/cli/run.py
@@ -115,6 +115,7 @@ def run(
from kedro_viz.launchers.utils import (
_PYPROJECT,
_check_viz_up,
_find_available_port,
_find_kedro_project,
_start_browser,
_wait_for,
@@ -145,6 +146,9 @@ def run(
"https://github.com/kedro-org/kedro-viz/releases.",
"yellow",
)

port = _find_available_port(host, port)

try:
if port in _VIZ_PROCESSES and _VIZ_PROCESSES[port].is_alive():
_VIZ_PROCESSES[port].terminate()
@@ -186,7 +190,6 @@ def run(
)

display_cli_message("Starting Kedro Viz ...", "green")

viz_process.start()

_VIZ_PROCESSES[port] = viz_process
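
The body of `_find_available_port` is not part of this excerpt, so the following is only a sketch of one way such a check could work: probe the requested port and step to the next one while it is occupied. The names `is_port_in_use` and `find_available_port`, the `max_attempts` parameter, and the error raised when no port is free are all assumptions for illustration, not the Kedro-Viz implementation.

```python
import socket


def is_port_in_use(host: str, port: int) -> bool:
    """Return True if something accepts TCP connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def find_available_port(host: str, start_port: int, max_attempts: int = 5) -> int:
    """Return the first free port at or after start_port, within max_attempts."""
    last_port = min(start_port + max_attempts, 65536)  # stay within valid ports
    for port in range(start_port, last_port):
        if not is_port_in_use(host, port):
            return port
    raise RuntimeError(f"No free port found in range {start_port}-{last_port - 1}")


# Demo: occupy an ephemeral port, then ask for a port starting from it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
busy_port = listener.getsockname()[1]
chosen_port = find_available_port("127.0.0.1", busy_port, max_attempts=50)
listener.close()
print(busy_port, chosen_port)
```

Resolving the port before the `_VIZ_PROCESSES` lookup, as the hunk does, means the rest of `run` works with the port that will actually be used, so the browser is not pointed at an unrelated service already listening on the requested one.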