#questions

Francis Duval

01/26/2024, 2:31 PM
Hello! I have a template pipeline here:
compute_embeddings_template = pipeline(
    pipe=[
        node(
            func=compute_20_dim_embeddings,
            inputs=['mapping_agg_coarse_ibc_code', 'st_model'],
            outputs='embs_20_dim',
            name='compute_20_dim_embeddings'
        ),
        node(
            func=apply_umap,
            inputs=['embs_20_dim', 'umap_object'],
            outputs='embs_2_dim',
            name='apply_umap'
        ),
        node(
            func=plot_embeddings,
            inputs='embs_2_dim',
            outputs='plot_embeddings',
            name='plot_embeddings'
        ),
    ]
)
Then, I apply this template pipeline 2 times for 2 different `st_model` (I create 1 namespace for each):
nli_finetuned_pipeline = pipeline(
    pipe=compute_embeddings_template,
    inputs={'mapping_agg_coarse_ibc_code': 'mapping_agg_coarse_ibc_code', 'st_model': 'st_oussama', 'umap_object': 'umap_object'},
    namespace='nli_finetuned',
)

freq_and_nli_finetuned_pipeline = pipeline(
    pipe=compute_embeddings_template,
    inputs={'mapping_agg_coarse_ibc_code': 'mapping_agg_coarse_ibc_code', 'st_model': 'st_oussama_finetuned_freq', 'umap_object': 'umap_object'},
    namespace='freq_and_nli_finetuned',
)
Then, I consolidate everything in a namespace `compute_2D_embeddings`:
compute_2D_embeddings_pipeline = pipeline(
    pipe=nli_finetuned_pipeline + freq_and_nli_finetuned_pipeline,
    inputs={'mapping_agg_coarse_ibc_code', 'st_oussama', 'st_oussama_finetuned_freq', 'umap_object'},
    namespace='compute_2D_embeddings'
)
The problem is only with the visualization. Ideally, I would like to have 4 outputs to the supernode `compute_2D_embeddings`:
• compute_2D_embeddings.nli_finetuned.embs_2_dim
• compute_2D_embeddings.freq_and_nli_finetuned.embs_2_dim
• compute_2D_embeddings.nli_finetuned.plot_embeddings
• compute_2D_embeddings.freq_and_nli_finetuned.plot_embeddings

Right now I only have:
• compute_2D_embeddings.nli_finetuned.plot_embeddings
• compute_2D_embeddings.freq_and_nli_finetuned.plot_embeddings

Here is a screenshot of the visualization:

https://github.com/francisduval/to_delete/blob/main/Capture2.PNG

To sum it up, the datasets `embs_2_dim` are not displayed as outputs of the supernode `compute_2D_embeddings` since they are used as inputs in the `plot_embeddings` node, but I still would want them displayed as outputs. Thanks!

Nok Lam Chan

01/26/2024, 2:56 PM
Thanks, the current behaviour seems correct in terms of the topology. Why do you want it to be displayed as an output anyway? Is it just because you want to see clearly what the "output" of your pipeline is? Would some kind of highlight or annotation help? Cc @Nero Okwa

Francis Duval

01/26/2024, 3:03 PM
Hi Nok! Yes, it is just because these 2 objects are important and I would like to see them as outputs of the supernode. As a workaround, I made a node that outputs both objects:
node(
    func=output_embedding_results,
    inputs=['embs_20_dim', 'umap_object'],
    outputs=['embs_2_dim', 'plot_embeddings'],
    name='output_embedding_results'
)
def output_embedding_results(embs_20_dim, umap_object):
    embs_2_dim = apply_umap(embs_20_dim, umap_object)
    fig = plot_embeddings(embs_2_dim)

    return embs_2_dim, fig
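A rough way to see why this workaround changes what the supernode exposes: a dataset that some node inside the group consumes is treated as internal, so only datasets produced but never consumed inside the group are left as outputs. The sketch below is plain Python with no Kedro import. The helper `free_outputs` is a hypothetical name for illustration, not a Kedro or Kedro-Viz API, and it assumes the combined node replaces both `apply_umap` and `plot_embeddings` in the template.

```python
# Plain-Python sketch of the topology argument (no Kedro import needed).
# Each node is described as (name, inputs, outputs).
# `free_outputs` is a hypothetical helper, not part of Kedro or Kedro-Viz.

def free_outputs(nodes):
    """Return datasets produced by some node but consumed by no node in the group."""
    produced = set()
    consumed = set()
    for _name, inputs, outputs in nodes:
        consumed.update(inputs)
        produced.update(outputs)
    return produced - consumed


# The original template: embs_2_dim feeds plot_embeddings, so it is internal.
template = [
    ("compute_20_dim_embeddings", ["mapping_agg_coarse_ibc_code", "st_model"], ["embs_20_dim"]),
    ("apply_umap", ["embs_20_dim", "umap_object"], ["embs_2_dim"]),
    ("plot_embeddings", ["embs_2_dim"], ["plot_embeddings"]),
]

# The workaround (assumed to replace the last two nodes): nothing inside the
# group consumes embs_2_dim any more, so it becomes a free output.
workaround = [
    ("compute_20_dim_embeddings", ["mapping_agg_coarse_ibc_code", "st_model"], ["embs_20_dim"]),
    ("output_embedding_results", ["embs_20_dim", "umap_object"], ["embs_2_dim", "plot_embeddings"]),
]

print(sorted(free_outputs(template)))    # ['plot_embeddings']
print(sorted(free_outputs(workaround)))  # ['embs_2_dim', 'plot_embeddings']
```

The trade-off is that the UMAP step and the plotting step can no longer be run or inspected as separate nodes.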

Nero Okwa

01/26/2024, 3:49 PM
Thanks @Francis Duval. I'm not sure what's causing this. @Rashida Kanchwala, can you have a look?