-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unified additional properties #16
base: master
Are you sure you want to change the base?
Conversation
alpha, | ||
'opacity', | ||
(int, float), | ||
alt.Opacity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe alpha can only be a scalar value for pandas plotting. In that case, it would make more sense to set it as a mark property (chart.mark_point(opacity=alpha)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for color
, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed alpha is (for some reason) either None or float. But color and size could be an array or dataframe column. But probably we need some uniform interface for them anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that color and size must also be scalar properties. I think data.plot.scatter
only accepts column names for x
, y
, and c
arguments, and other plot types only accept column names for x
and y
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are also some arguments that allow passing of arrays of values... we'll have to add those as new columns to the input data in order to use them in Altair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does decorator make sense, then? and/or storing processed data as an instance attribute - so that we can access it outside of the plotting function.
Of corse I can also replace every plot function ending with :
chart = ....
return _additional_attributes(chart, ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the best design right now, TBH. I don't think t's too important to store processed data as an instance attribute.
Maybe the best approach would be to have utility routines that take a set of kwargs and extract properties based on where they will be needed in an Altair chart. e.g. _extract_mark_properties(kwargs)
, _extract_additional_encodings(kwargs)
, _extract_additional_data_columns(kwargs)
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk, let me try that approach on some basics (title, alpha, figsize)
most (all?) pandas plot functions support
Many also support shape and size. Part of those arguments can take a specific value (e.g.
color='red'
), a dataframe column (e.g.color='column2'
), or a new array of the same length.We probably want to support those for altair as well, preferably in a unified fashion. Below is my first attempt - it does support title, color and alpha for all the given charts, and can switch between specific values, columns, and corresponding Altair objects. Arrays are not supported yet.
One question I need some advice on is to where should this happen?
right now, it works in the
plot
method. I think that it is safer to have this working separately for a DataFrame / Series plotters. I can decorate each method, or run the function from within each method if that's ok. That way we can use a similar strategy for scatterplot c/s values.Will be thankful for the feedback