Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AGGrid implementation #260

Closed
wants to merge 8 commits into from
Closed

AGGrid implementation #260

wants to merge 8 commits into from

Conversation

maxschulz-COL
Copy link
Contributor

@maxschulz-COL maxschulz-COL commented Jan 10, 2024

Description

Implementation of the dash_ag_grid as alternative table function to be inserted in vm.Table

This PR is two things:

  • an implementation of the required functionality, see Base Version
  • a discussion of potential abstractions we might want to implement now, or at least be aware of

TLDR discussion

The discussion evolves around the question: where do we want implementation details of callable objects inserted into Graph and Table to lie: with the callable, or as an if distinction in the source code once the callable reaches that point. This also extends to implementation details of models such as Parameter and Filter, but is a slightly different question there.

The short answer is: of course the former (ie with the callable), but the consequences are at times a little tricky.

Base Version (96b6259):

Initial implementation of AGGrid with the following key decisions:

  • distinction between required Dash components (e.g. State(...)) rests with source code, not the supplied function

This is for example code like

if model == `graph`:
    inputs.append(
          {
              "clickData": State(component_id=triggered_model.id, component_property="clickData"),
          }
      )
elif model == `aggrid`:
    inputs.append(...)
elif model == `dashtable`:
    ....
  • filter application specifics (i.e. how to implement the click consequences for the table) also lie in source code

This is for example code that explains how to carry out actions such as filtering by clicking into a table

def apply_abc_filter_interaction():
        ....
        column = ctd_active_cell["value"]["column_id"]
        derived_viewport_data_row = ctd_active_cell["value"]["row"]
        clicked_data = ctd_derived_viewport_data["value"][derived_viewport_data_row][column]
        data_frame = data_frame[data_frame[column].isin([clicked_data])]
return data_frame

We need to determine different triggers:

    @validator("actions")
    def set_actions(cls, v, values):
        table_type = _get_table_type(values["figure"])
        if table_type == "DataTable":
            return _set_actions(v, values, "active_cell")
        elif table_type == "AgGrid":
            return _set_actions(v, values, "cellClicked")
        else:
            raise ValueError(f"Table type {table_type} not supported.")

Iteration on base (7d370c2)

  • improve on private attribute table_type to not require custom __init__
  • achieved by shift calculation of _table_type to pre_build

Abstraction approach (2c0e0a7)

  • abstract implementation details into function attribute as far as possible
  • also still a working example, but it proves a point

Which ultimate route do we want to go?

  • Base approach

This means hard coding the distinction between returned components in the source code. It is the easiest solution to implement. However, any new return for callables inserted into Graph and Table (and in the future maybe vm.React?) would require us to modify the source code. How often that would occur and whether this is a problem remains to be debated. If we decide that we want to get the feature out ASAP, we should choose this approach.

  • Abstraction approach

This means making heavy use of function attributes and would follow a light agreement we had previously on this topic. However, making such extensive use of function attributes could also be seen as generally questionable, and it opens up the question why we are not using an object oriented approach where attributes, inheritance and other things come naturally (see next approach).

On the flip side, this looks to some extent much improved to the base version. If you follow the example above, the implementation details of the dash_data_table now lie in the same file as the dash_data_table. Creating the dash_ag_grid would have been a breeze e.g.

  • Revisit our models vs callables as model attributes approach

Currently we are following this taxonomy, but we may question whether it is the correct one. Some questions that arise:

  • if we rely so heavily on attributes, and if we would love to create implementation recipes, are classes and ABCs not the way forward
  • are we still happy with two models that take callables (Table and Graph), but one of them (Table) takes 2 fundamentally different callable types
  • how will this look once we integrate more react components (our vague idea here is a vm.React model)
  • how do other components such as Card, but also Dropdown etc fit into the chosen approach

Other open tasks

  • Add tests
  • Add documentation
  • Add styling according to design

Screenshot

Notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

    • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
    • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
    • I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
    • I have not referenced individuals, products or companies in any commits, directly or indirectly.
    • I have not added data or restricted code in any commits, directly or indirectly.

@maxschulz-COL maxschulz-COL force-pushed the feature/enable_AG_grid branch from 1101cb8 to a74a28f Compare January 11, 2024 15:10
from vizro.models.types import capture


@capture("action")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@capture("action")
@capture("table")

@huong-li-nguyen
Copy link
Contributor

I'll take a proper look tomorrow if that's fine, but I already wanted to leave a comment saying that I love the PR description and the different iterations you've done. Amazing summary - it looks like lots of great stuff and interesting questions to ponder! 🚀 ❤️

Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments here while I remember, but let's discuss properly tomorrow or Thursday 🙂 I need to ponder some more also but didn't want to leave radio silence. on this.

Thank you very much for working through this and the very clear PR structure and description! 🙏

The amount of code that's different between the AGGrid and DataTable cases is bigger than I had hoped it would be 😬 I thought it might just be a case of switching a single string value for the input property and didn't realise that the whole filter interaction stuff would need to be different also.

This definitely makes the function attribute approach invalid. Think e.g. of what a user who write a custom table based on our inbuilt table function would do - we don't want them to have to put a whole function.action_info = ... thing on it. If we go for this approach (which I'm tending against now) then it would be used simply as a flag I think:

dash_ag_grid.table_type = "dash_ag_grid"

... and then the lookup for dash_ag_grid ➡️ filter_interaction_input would be encoded somewhere else.

What I had assumed previously was that all the information needed for the code to handle the different table cases could be encoded in such a short string, but that's obviously wrong because we have the different filter functions etc. also.

So there's really two different things here:

  1. how to distinguish different types of table function - currently you do this by evaluating the function in the Table model, which I originally didn't like but definitely see the advantage of doing and makes total sense if we need to evaluate the function for the id anyway. So this is in all likelihood the best solution, especially if we can untangle the code a bit
  2. where to encode the information that switches between different behaviours depending on _table_type. This is trickier and I think should not be done through function attributes since the discrepancies are just too great there

As for what the right solution is for 2, it's not immediately obvious... I do still think we should keep to one Table model though. The thing that we might want to change is to make the dash_table a callable class rather than a function, which would naturally give a home to these extra properties etc. This definitely has some disadvantages associated with it though. On second thoughts, I am more willing to be persuaded about having a new Grid model. Let's discuss more...

Comment on lines +24 to +30
kwargs = figure._arguments.copy()

# This workaround is needed because the underlying table object requires a data_frame
kwargs["data_frame"] = DataFrame()

# The underlying table object is pre-built, so we can fetch its ID.
underlying_table_object = figure._function(**kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
kwargs = figure._arguments.copy()
# This workaround is needed because the underlying table object requires a data_frame
kwargs["data_frame"] = DataFrame()
# The underlying table object is pre-built, so we can fetch its ID.
underlying_table_object = figure._function(**kwargs)
underlying_table_object = figure(data_frame=pd.DataFrame())

This should work by itself I think?

Probably we should do some try/except here to give a clear error message in the case that the function call fails for some reason.

I'd also like to understand why this evaluation of figure to extract id is needed (not changed by you here).

Copy link
Contributor Author

@maxschulz-COL maxschulz-COL Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posting the answer @petar-qb gave to me:

Graphs and Tables in the Dash (so in the Vizro too) are handled differently.

dcc.Graph has: id, figure attributes where figure attribute is any plotly chart.
If this graph has to be changed as Output of the callback we target it as - dash.Output(dcc_graph_id, "figure")
If this graph has to be an Input (let's say clickData property) of the callback we defined it as - dash.Input(dcc_graph_id, "clickData")
It means that we access to graph properties by accessing the outer wrapper dcc.Graph component.

The problem is that there's no outer dcc component wrapper for tables 😕. So, there is nothing like dcc.Table Dash inbuilt component that has id and figure attributes. Table ID is written directly inside its "figure" callable (e.g. inside the dash_table.DataTable()). This is different than graphs because plotly graph doesn't contain the ID (you cannot put the ID inside px.box(...)), but its outer component dcc.Graph does.

Yes, Vizro has created some kind of wrapper vm.Table that has id and figure where the figure is callable that can return dash_table.DataTable or AgGrid. Still, we didn't solve this problem because we can't (or at least, we didn't decide like that) to propagate the vm.Table.id into underlying table component.

Now, let's give an example on how callback inputs and outputs are created in the case of Tables.
If Vizro Table has to be changed as Output of the callback we target it with - dash.Output(vm_table_id, "children") - We can re-render dash_table.DashTable only if we change "children" of the outer Div component.
If Vizro Table has to be the Input (let's say active_cell property) of the callback we defined it as - dash.Input(underlying_table_id, "active_cell") - So we need to fetch this ID in the case that filter_interaction is defined on the Table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petar-qb this half makes sense to me. The half that doesn't make sense is:

  1. why can't we also use self._callable_object_id in callback outputs as well? Is the same true for AG Grid or just Dash datatable?
  2. why not set self._callable_object_id = self.id - you say we decided not to? I guess the answer is that if my above question is not possible you'll get duplicate ids for the table and the containing div
  3. do we keep on setting _callable_object_id somewhere outside Table.build?

Ideally what I'd like to do is this:

class Table(VizroBaseModel):
    def __call__(self, **kwargs):
        kwargs.setdefault("data_frame", data_manager._get_component_data(self.id))
        return self.figure(id=self.id, **kwargs)

    # no need for pre_build at all

so that the id is always injected from the automatically. But I'm guessing this will not be possible.

No need resolve this conversation as part of this PR because it's outside the scope of the PR, but it would be great to have a chat about it - let me put something in the calendar 🙂

Comment on lines +64 to +65
_, table_type = _get_table_type(values["figure"])
if table_type == "DataTable":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this by something other than string comparison? e.g. change _get_table_type to return the class and then use isinstance.

@@ -37,14 +49,26 @@ class Table(VizroBaseModel):
actions: List[Action] = []

_callable_object_id: str = PrivateAttr()
_table_type: str = (
PrivateAttr()
) # Ideally we would be able to use the populated content of this field in the `set_actions` validator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely the current code feels a bit tangled here, but I appreciate it's hard to get these things with private properties and validators working exactly as you'd like.

Copy link
Contributor Author

@maxschulz-COL maxschulz-COL Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I spent a while getting this to work, but also didn't hunt for a better solution once it was working (except removing the super().__init__) as there were other bigger questions. We should definitely revisit this once we have agreed on an implementation approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Indeed there's no point spending a long time perfecting this if we don't need it at all in the end.

@petar-qb
Copy link
Contributor

petar-qb commented Jan 17, 2024

First of all, Max, you’ve done excellent investigation 🚀, and thanks for that.

Let me recap my thoughts, and please correct me if I misunderstand something 🙏.

Base approach is probably good enough to support only AgGrid. The most important question here (the reason why the PR description looks like “spike issue”) is how to make vm.Table flexible enough to enable custom user underlying table implementation to work.

By setting only different vm.Table private properties (string properties e.g. _input_property(”active_cell”)) we do not achieve desired flexibility. The most basic reason is that sometimes input property has to be a dict or list of dash.States. (e.g. filter interaction for dash_table.DashTable).

One approach could be object oriented approach where we serve different implementations of _get_action_inputs (and _get_action_outputs) based on is the underlying table dash_table or ag_grid. This method could be overwritten by user in case that the underlying table is custom_user_table. Okay, but we also need to expose _filter_interaction_function so it can be overwritten too (filter_interaction functions differ based on the underlying table). The problem with object oriented solution is that we don't want to force our users to inherit vm.Table class if they need only to change the figure property, and we want to enable handling custom figures utilising the @capture("table") only instead. If you can remember any other difficulties about object oriented approach, please let me know. Currently the biggest issue with the object oriented approach is that we don’t want to inject dash dependencies in any model method that is not the build method.

Very Quick Question: Why? 😄

Okay.. Another approach is serving model_action_configuration through the function attributes. This approach has similar opportunities. There is a default configuration for predefined models and users are able to overwrite it and propagate their own custom_action_input_properties, custom_filter_interaction_function and so on. Ups and downs of using function attributes are:

Advantages:

  • No dash_dependencies in model methods,
  • No need to inherit vm.Table if the figure object is the only thing that should we overwritten.

Disadvantages:

  • unintuitive approach for the average user,
  • more disadvantages on the link.

So, If we want to enable only AgGrid - it should be fine. If we want to enable modifying how the model component would behave in the actions world then we should implement the solution on the Vizro library level (for every component: Dropdown, Button, Graph, Table). If we solve this problem generally it would mean that the custom react components, mathplotlib charts, and any custom tables will be automatically enabled to deal with actions.

Person A: “Is this implementation out of the scope of this PR?”
Person B: “Yes it is, but it’s worth at least thinking about it before we merge AgGrid on the main.”

Solving component action's information agnostically on the app level is a broad topic and probably we need to exchange and align our thoughts on some PS session.

Comment on lines +162 to +163
# This would also have to be abstracted outside the function, but is a little more complicated
# essentially we have to go: ctd_filter_interaction[<wildcard>]["id"] -> get parent Vizro Table -> get function -> get attributes
Copy link
Contributor

@petar-qb petar-qb Jan 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like implementing a solution for this would require a lot of development effort. However, this is something we should enable in case of increasing the flexibility.

Enabling filter interaction function to be customisable looks like it should be considered similarly as predefined action (e.g. taken into account in the filtering or parameterisation process).
Does this also mean we need to enable some kind of custom_filter/custom_parameter (e.g. users want to implement custom filer component with custom filer action e.g. range_data_picker_filtering)?

I keep coming across this question these days and it seems like something we should look into very soon. As it looks like a big feature, I suggest considering it separately from this PR. Maybe we should discuss it before this PR merges (just in case to avoid breaking changes in the future).

@AnnMarieW
Copy link
Contributor

Just FYI Dash AG Grid V31 was just released. It would be good to update to the latest version here too.

https://community.plotly.com/t/dash-ag-grid-31-0-0-released-more-function-support-new-quartz-theme-cell-data-types-built-in-cell-editors-and-more/82036

@maxschulz-COL
Copy link
Contributor Author

We decided to go with an approach that has multiple models per callable (ie one for Table, one for Grid). See here for implementation: #289

@maxschulz-COL
Copy link
Contributor Author

Just FYI Dash AG Grid V31 was just released. It would be good to update to the latest version here too.

https://community.plotly.com/t/dash-ag-grid-31-0-0-released-more-function-support-new-quartz-theme-cell-data-types-built-in-cell-editors-and-more/82036

In the PR raised above I have taken over what I think are the most powerful out-of-the-box features of Dash AG Grid V31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants