Hello I have a template pipeline here ```compute embeddings Kedro #questions


Francis Duval

01/26/2024, 2:31 PM

Hello! I have a template pipeline here:


compute_embeddings_template = pipeline(
    pipe=[
        node(
            func=compute_20_dim_embeddings,
            inputs=['mapping_agg_coarse_ibc_code', 'st_model'],
            outputs='embs_20_dim',
            name='compute_20_dim_embeddings'
        ),
        node(
            func=apply_umap,
            inputs=['embs_20_dim', 'umap_object'],
            outputs='embs_2_dim',
            name='apply_umap'
        ),
        node(
            func=plot_embeddings,
            inputs='embs_2_dim',
            outputs='plot_embeddings',
            name='plot_embeddings'
        ),
    ]
)

Then, I apply this template pipeline 2 times for 2 different `st_model` (I create 1 namespace for each):


nli_finetuned_pipeline = pipeline(
    pipe=compute_embeddings_template,
    inputs={'mapping_agg_coarse_ibc_code': 'mapping_agg_coarse_ibc_code', 'st_model': 'st_oussama', 'umap_object': 'umap_object'},
    namespace='nli_finetuned',
)

freq_and_nli_finetuned_pipeline = pipeline(
    pipe=compute_embeddings_template,
    inputs={'mapping_agg_coarse_ibc_code': 'mapping_agg_coarse_ibc_code', 'st_model': 'st_oussama_finetuned_freq', 'umap_object': 'umap_object'},
    namespace='freq_and_nli_finetuned',
)

Then, I consolidate everything in a namespace `compute_2D_embeddings`:


compute_2D_embeddings_pipeline = pipeline(
    pipe=nli_finetuned_pipeline + freq_and_nli_finetuned_pipeline,
    inputs={'mapping_agg_coarse_ibc_code', 'st_oussama', 'st_oussama_finetuned_freq', 'umap_object'},
    namespace='compute_2D_embeddings'
)

The problem is only with the visualization. Ideally, I would like to have 4 outputs to the supernode `compute_2D_embeddings`:
• compute_2D_embeddings.nli_finetuned.embs_2_dim
• compute_2D_embeddings.freq_and_nli_finetuned.embs_2_dim
• compute_2D_embeddings.nli_finetuned.plot_embeddings
• compute_2D_embeddings.freq_and_nli_finetuned.plot_embeddings

Right now I only have:
• compute_2D_embeddings.nli_finetuned.plot_embeddings
• compute_2D_embeddings.freq_and_nli_finetuned.plot_embeddings

Here is a screenshot of the visualization:

https://github.com/francisduval/to_delete/blob/main/Capture2.PNG

To sum it up, the `embs_2_dim` datasets are not displayed as outputs of the supernode `compute_2D_embeddings` since they are used as inputs in the `plot_embeddings` node, but I would still like them displayed as outputs. Thanks!


Nok Lam Chan

01/26/2024, 2:56 PM

Thanks, the current behaviour seems correct in terms of the topology. Why do you want it to be displayed as an output? Is it just because you want to see clearly what the "output" of your pipeline is? Would some kind of highlight/annotation help? Cc @Nero Okwa

Francis Duval

01/26/2024, 3:03 PM

Hi Nok! Yes, it is just because these 2 objects are important and I would like to see them as outputs of the supernode. As a workaround, I made a node that outputs both objects:


node(
    func=output_embedding_results,
    inputs=['embs_20_dim', 'umap_object'],
    outputs=['embs_2_dim', 'plot_embeddings'],
    name='output_embedding_results'
)


def output_embedding_results(embs_20_dim, umap_object):
    embs_2_dim = apply_umap(embs_20_dim, umap_object)
    fig = plot_embeddings(embs_2_dim)

    return embs_2_dim, fig
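
For anyone reading along, here is a rough sketch of why the topology behaves this way and why the combined node changes it. It is plain Python and does not import Kedro: each node is just an (inputs, outputs) pair, and `free_outputs` is a hypothetical helper that mirrors the idea of "datasets produced but not consumed by any other node". It assumes the collapsed supernode only shows those free outputs, which matches the behaviour described above but is not taken from the Kedro-Viz source.

```python
# Plain-Python sketch (no Kedro import). Each node is (inputs, outputs).
# `free_outputs` is a hypothetical helper, not a Kedro function.
def free_outputs(nodes):
    """Return datasets produced by some node and consumed by none."""
    produced = {name for _, outputs in nodes for name in outputs}
    consumed = {name for inputs, _ in nodes for name in inputs}
    return produced - consumed


# The original template: embs_2_dim feeds plot_embeddings, so it is intermediate.
original = [
    (["mapping_agg_coarse_ibc_code", "st_model"], ["embs_20_dim"]),
    (["embs_20_dim", "umap_object"], ["embs_2_dim"]),
    (["embs_2_dim"], ["plot_embeddings"]),
]

# The workaround: one node emits both datasets, so nothing consumes embs_2_dim.
workaround = [
    (["mapping_agg_coarse_ibc_code", "st_model"], ["embs_20_dim"]),
    (["embs_20_dim", "umap_object"], ["embs_2_dim", "plot_embeddings"]),
]

print(sorted(free_outputs(original)))    # ['plot_embeddings']
print(sorted(free_outputs(workaround)))  # ['embs_2_dim', 'plot_embeddings']
```

The trade-off is that `apply_umap` and `plot_embeddings` no longer appear as separate nodes in the graph.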

Nero Okwa

01/26/2024, 3:49 PM

Thanks @Francis Duval. I'm not sure what's causing this. @Rashida Kanchwala, can you have a look?
