When using `withDataFrame`, Vegas collects all the data to the driver, and samples it instead once it exceeds a threshold.
But when the plot itself does aggregations, this means Vegas will fetch all the data to the driver (potentially sampling it) and push all of it to vega-lite, where the aggregation will happen in JavaScript in the browser. This is probably never what you want.
It would be totally possible to map `AggOps` to Spark aggregations and push the aggregation itself down to Spark. This would reduce the cardinality of the data dramatically, and would probably eliminate the need to sample in most cases.
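To make the idea concrete, here is a rough sketch of what such a mapping might look like. It is not Vegas code: `PlotAgg` is a hypothetical stand-in for `AggOps`, and `AggPushdown`, `toSparkAgg` and `aggregate` are names invented for illustration. Only the Spark calls (`groupBy`, `agg`, and the functions in `org.apache.spark.sql.functions`) are real API.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{avg, col, count, max, min, sum}

// Hypothetical stand-in for Vegas' AggOps; NOT the library's actual type.
sealed trait PlotAgg
object PlotAgg {
  case object Sum   extends PlotAgg
  case object Mean  extends PlotAgg
  case object Count extends PlotAgg
  case object Min   extends PlotAgg
  case object Max   extends PlotAgg
}

// Hypothetical helper object, assumed for this sketch only.
object AggPushdown {
  // Translate a plot-level aggregation into the equivalent Spark column expression.
  def toSparkAgg(op: PlotAgg, field: String): Column = op match {
    case PlotAgg.Sum   => sum(col(field))
    case PlotAgg.Mean  => avg(col(field))
    case PlotAgg.Count => count(col(field)) // counts non-null values of the field
    case PlotAgg.Min   => min(col(field))
    case PlotAgg.Max   => max(col(field))
  }

  // Group and aggregate on the cluster; the result has one row per group,
  // and the aggregated column keeps the original field name.
  def aggregate(df: DataFrame, groupField: String, valueField: String, op: PlotAgg): DataFrame =
    df.groupBy(col(groupField)).agg(toSparkAgg(op, valueField).as(valueField))
}
```

With something like this, `AggPushdown.aggregate(df, "country", "pop", PlotAgg.Sum)` would return one row per country, and that much smaller DataFrame could be handed to `withDataFrame` with the plot's own aggregate removed, so the browser only has to draw the pre-aggregated values.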