Jo Stichbury
01/03/2023, 2:55 PM
def compare_passenger_capacity_go(preprocessed_shuttles: pd.DataFrame):
    data_frame = preprocessed_shuttles.groupby(["shuttle_type"]).mean().reset_index()
    fig = go.Figure(
        [
            go.Bar(
                x=data_frame["shuttle_type"],
                y=data_frame["passenger_capacity"],
            )
        ]
    )
    return fig
However, the code for Plotly Express isn't working in a kedro run.
def compare_passenger_capacity_exp(preprocessed_shuttles: pd.DataFrame):
    fig = px.bar(
        data_frame=preprocessed_shuttles.groupby(["shuttle_type"]).mean().reset_index(),
        x="shuttle_type",
        y="passenger_capacity",
    )
    return fig
The error returned is
PlotlyDataSet(filepath=/Users/jo_stichbury/Documents/GitHub/stichbury/kedro-projects/kedro-tutorial/data/08_reporting/shuttle_passenger_capacity_plot_exp.json, load_args={},
plotly_args={'fig': {'orientation': h, 'x': shuttle_type, 'y': passenger_capacity}, 'layout': {'title': Shuttle Passenger capacity, 'xaxis_title': Shuttles, 'yaxis_title': Average
passenger capacity}, 'type': bar}, protocol=file, save_args={}, version=Version(load=None, save='2023-01-03T14.43.36.537Z')).
Value of 'x' is not the name of a column in 'data_frame'. Expected one of [0] but received: shuttle_type
Before the holiday, I did a fair amount of trial and error to rewrite the function based on various Stack Overflow searches, but I couldn't find a way to fix it.
🚨 Please could I get some help from anyone who knows this code (maybe @Rashida Kanchwala?) or anyone who is familiar with Plotly to get the compare_passenger_capacity_exp method working? 🚨
My example is here so I hope it's just a matter of taking it and revising the method in the nodes.py file for the reporting pipeline. I should point out that it doesn't currently work on 0.18.4 (see this issue) so it's necessary to test against 0.18.3 (using the 'old' dataset notation) for now. Everything in my example is working apart from this node.
Rashida Kanchwala
01/03/2023, 4:00 PM
# This function uses plotly.express
def compare_passenger_capacity_exp(preprocessed_shuttles: pd.DataFrame):
    return preprocessed_shuttles.groupby(["shuttle_type"]).mean().reset_index()
If you use the JSONDataSet, then you need to call px.bar() in the node function, but then you don't provide any plotly_args. In your catalog.yml you simply do this:
shuttle_passenger_capacity_plot_exp:
  type: plotly.JSONDataSet
  filepath: data/08_reporting/shuttle_passenger_capacity_plot_exp.json
  versioned: true
I understand your confusion. It seems you have done both, and that's why your kedro run fails.
Jo Stichbury
01/03/2023, 4:21 PM