Melvin Kok
04/04/2023, 10:58 AM
1. The after_catalog_created hook is triggered before after_context_created. However, this is fixed when kedro-telemetry is uninstalled (I have raised an issue here).
2. kedro-telemetry is still sending information about the data catalog, the default pipeline, etc. to heapanalytics.com even if consent is set to false. KedroTelemetryProjectHooks calls _send_heap_event without checking for consent.
datajoely
04/04/2023, 11:02 AM
datajoely
04/04/2023, 11:02 AM
Nok Lam Chan
04/04/2023, 11:03 AM
Melvin Kok
04/04/2023, 11:11 AM
We see "WARNING Failed to send data to Heap. Exception of type 'ConnectTimeout' was raised" even though we set consent to false. Stepping through with a debugger eventually led me to KedroTelemetryProjectHooks calling _send_heap_event.
Melvin Kok
04/04/2023, 11:15 AM
Nok Lam Chan
04/04/2023, 11:15 AM
When I removed kedro-telemetry, after_context_created was triggered first. When I reinstalled kedro-telemetry, after_catalog_created was triggered first.
Some logs would help to confirm this - it’s pretty unlikely.
Nok Lam Chan
04/04/2023, 11:15 AM
Melvin Kok
04/04/2023, 11:16 AM
Nok Lam Chan
04/04/2023, 11:19 AM
kedro run or Python API?
Melvin Kok
04/04/2023, 11:19 AM
kedro run, with --pipeline pipeline_name if that matters
Melvin Kok
04/04/2023, 11:22 AM
2023-04-04 18:18:06,023 - kedro.framework.session.session - INFO - Kedro project <project_name>
2023-04-04 18:18:06,031 - kedro.config.common - INFO - Config from path '<project_folder>\conf\local' will override the following existing top-level config keys: base_path, workspace
2023-04-04 18:18:06,228 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\google\rpc\__init__.py:20: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.rpc')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See <https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages>
pkg_resources.declare_namespace(__name__)
2023-04-04 18:18:06,257 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\pkg_resources\__init__.py:2349: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See <https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages>
declare_namespace(parent)
2023-04-04 18:18:08,689 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\google\auth\_default.py:78: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. See the following page for troubleshooting: <https://cloud.google.com/docs/authentication/adc-troubleshooting/user-creds>.
warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
2023-04-04 18:18:21,007 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\seaborn\rcmod.py:82: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
if LooseVersion(mpl.__version__) >= "3.0":
2023-04-04 18:18:21,021 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\setuptools\_distutils\version.py:345: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
other = LooseVersion(other)
2023-04-04 18:19:44,457 - kedro_telemetry.plugin - WARNING - Failed to send data to Heap. Exception of type 'ConnectTimeout' was raised.
2023-04-04 18:19:45,165 - kedro.io.data_catalog - INFO - Loading data from '<dataset_name>' (ParquetDataSet)...
...
Melvin Kok
04/04/2023, 11:23 AM
datajoely
04/04/2023, 11:29 AM
pkg_resources.declare_namespace('google.rpc')
Melvin Kok
04/04/2023, 11:29 AM
datajoely
04/04/2023, 11:29 AM
datajoely
04/04/2023, 11:30 AM
kedro-telemetry from your dependencies
datajoely
04/04/2023, 11:31 AM
import sys
print(sys.version)
print(sys.executable)
Melvin Kok
04/04/2023, 11:31 AM
2023-04-04 18:24:27,500 - kedro.framework.session.session - INFO - Kedro project Py_FuelEfficiencyPOC_svc
2023-04-04 18:24:27,505 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\globals.yml'
2023-04-04 18:24:27,509 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\local\globals.yml'
2023-04-04 18:24:27,512 - kedro.config.common - INFO - Config from path '<project_folder>\conf\local' will override the following existing top-level config keys: base_path, workspace
2023-04-04 18:24:27,523 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l01_raw.yml'
2023-04-04 18:24:27,534 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l02_intermediate.yml'
2023-04-04 18:24:27,540 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l03_primary.yml'
2023-04-04 18:24:27,546 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l04_feature.yml'
2023-04-04 18:24:27,552 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l05_model_input.yml'
2023-04-04 18:24:27,557 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l06_models.yml'
2023-04-04 18:24:27,562 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l07_model_output.yml'
2023-04-04 18:24:27,566 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l08_reporting.yml'
2023-04-04 18:24:27,585 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\local\credentials.yml'
2023-04-04 18:24:27,700 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\google\rpc\__init__.py:20: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google.rpc')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See <https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages>
pkg_resources.declare_namespace(__name__)
2023-04-04 18:24:27,724 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\pkg_resources\__init__.py:2349: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See <https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages>
declare_namespace(parent)
2023-04-04 18:24:30,173 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\google\auth\_default.py:78: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. See the following page for troubleshooting: <https://cloud.google.com/docs/authentication/adc-troubleshooting/user-creds>.
warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
2023-04-04 18:24:36,907 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l01_raw.yml'
2023-04-04 18:24:36,915 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l02_intermediate.yml'
2023-04-04 18:24:36,928 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l03_primary.yml'
2023-04-04 18:24:36,936 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l04_feature.yml'
2023-04-04 18:24:36,943 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l05_model_input.yml'
2023-04-04 18:24:36,948 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l06_models.yml'
2023-04-04 18:24:36,954 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l07_model_output.yml'
2023-04-04 18:24:36,959 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l08_reporting.yml'
2023-04-04 18:24:36,968 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\local\parameters.yml'
2023-04-04 18:24:42,138 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\seaborn\rcmod.py:82: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
if LooseVersion(mpl.__version__) >= "3.0":
2023-04-04 18:24:42,152 - py.warnings - WARNING - <project_folder>\venv\lib\site-packages\setuptools\_distutils\version.py:345: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
other = LooseVersion(other)
2023-04-04 18:26:05,540 - kedro_telemetry.plugin - WARNING - Failed to send data to Heap. Exception of type 'ConnectTimeout' was raised.
2023-04-04 18:26:05,557 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l01_raw.yml'
2023-04-04 18:26:05,569 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l02_intermediate.yml'
2023-04-04 18:26:05,577 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l03_primary.yml'
2023-04-04 18:26:05,584 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l04_feature.yml'
2023-04-04 18:26:05,590 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l05_model_input.yml'
2023-04-04 18:26:05,597 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l06_models.yml'
2023-04-04 18:26:05,604 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l07_model_output.yml'
2023-04-04 18:26:05,610 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\catalog\l08_reporting.yml'
2023-04-04 18:26:05,631 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\local\credentials.yml'
2023-04-04 18:26:06,354 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l01_raw.yml'
2023-04-04 18:26:06,362 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l02_intermediate.yml'
2023-04-04 18:26:06,374 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l03_primary.yml'
2023-04-04 18:26:06,381 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l04_feature.yml'
2023-04-04 18:26:06,388 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l05_model_input.yml'
2023-04-04 18:26:06,394 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l06_models.yml'
2023-04-04 18:26:06,401 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l07_model_output.yml'
2023-04-04 18:26:06,407 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\base\parameters\l08_reporting.yml'
2023-04-04 18:26:06,414 - kedro.config.common - DEBUG - Loading config file: '<project_folder>\conf\local\parameters.yml'
2023-04-04 18:26:06,454 - kedro.io.data_catalog - INFO - Loading data from '<dataset_name>' (ParquetDataSet)...
...
2023-04-04 18:26:08,022 - kedro.runner.sequential_runner - INFO - Completed 3 out of 3 tasks
2023-04-04 18:26:08,025 - kedro.runner.sequential_runner - INFO - Pipeline execution completed successfully.
2023-04-04 18:26:08,028 - kedro.framework.session.store - DEBUG - 'save()' not implemented for 'BaseSessionStore'. Skipping the step.
Debug logs, if it helps
Melvin Kok
04/04/2023, 11:32 AM
>>> print(sys.version)
3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]
>>> print(sys.executable)
<project_folder>\venv\Scripts\python.exe
>>>
datajoely
04/04/2023, 11:32 AM
datajoely
04/04/2023, 11:32 AM
Melvin Kok
04/04/2023, 11:33 AM
datajoely
04/04/2023, 11:34 AM
.telemetry file in the project root?
Melvin Kok
04/04/2023, 11:34 AM
consent: false
Melvin Kok
04/04/2023, 11:35 AM
datajoely
04/04/2023, 11:37 AM
consent: false
datajoely
04/04/2023, 11:37 AM
Melvin Kok
04/04/2023, 11:37 AM
The before_command_run hook in KedroTelemetryCLIHooks is picking up my .telemetry file properly; it’s just the after_context_created hook in KedroTelemetryProjectHooks that doesn’t check for consent.
datajoely
04/04/2023, 11:37 AM
datajoely
04/04/2023, 11:40 AM
import pathlib; pathlib.Path('.telemetry').read_text()
Can you please add this to your logging?
datajoely
04/04/2023, 11:41 AM
Melvin Kok
04/04/2023, 11:41 AM
>>> import pathlib; pathlib.Path('.telemetry').read_text()
'consent: false'
datajoely
04/04/2023, 11:41 AM
datajoely
04/04/2023, 11:42 AM
datajoely
04/04/2023, 11:42 AM
datajoely
04/04/2023, 11:42 AM
print(sys.version)
print(sys.executable)
datajoely
04/04/2023, 11:42 AM
datajoely
04/04/2023, 11:42 AM
Melvin Kok
04/04/2023, 11:46 AM
Melvin Kok
04/04/2023, 11:47 AM
Melvin Kok
04/04/2023, 11:47 AM
_check_for_telemetry_consent at all
datajoely
04/04/2023, 12:00 PM
datajoely
04/04/2023, 12:01 PM
Nok Lam Chan
04/04/2023, 12:11 PM
datajoely
04/04/2023, 12:24 PM
Melvin Kok
04/04/2023, 1:59 PM
kedro-telemetry. Just pointing out that KedroTelemetryProjectHooks.after_context_created is missing the telemetry consent check (for reference, KedroTelemetryCLIHooks.before_command_run contains the consent check); perhaps that’s where a fix is needed 😀
Melvin Kok
04/04/2023, 2:00 PM
Melvin Kok
04/04/2023, 2:01 PM
after_catalog_created on my custom hook for MLFlow is being called twice - once before after_context_created and once after
Nok Lam Chan
04/04/2023, 2:12 PM
Melvin Kok
04/04/2023, 2:12 PM
Nok Lam Chan
04/04/2023, 2:14 PM
Melvin Kok
04/04/2023, 2:15 PM
Nok Lam Chan
04/04/2023, 2:19 PM
When context.catalog is accessed, the catalog gets created and triggers the after_catalog_created hook.
The telemetry hook's after_context_created creates the catalog, so it triggers after_catalog_created before your MLFlowHook’s after_context_created.
Nok Lam Chan
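The ordering described above can be reproduced with a toy model. This is only a sketch: none of these classes are Kedro's, the names TelemetryLikeHook, MlflowLikeHook, ToyContext and simulate_run are made up for illustration, and it assumes the catalog is built (and hooks notified) on every access of context.catalog, which would also match after_catalog_created being seen twice.

```python
calls = []  # records the order in which hook methods fire


class TelemetryLikeHook:
    """Stand-in for a plugin hook that touches the catalog early."""

    def after_context_created(self, context):
        calls.append("telemetry.after_context_created")
        _ = context.catalog  # accessing the catalog forces its creation

    def after_catalog_created(self, catalog):
        calls.append("telemetry.after_catalog_created")


class MlflowLikeHook:
    """Stand-in for a custom project hook."""

    def after_context_created(self, context):
        calls.append("mlflow.after_context_created")

    def after_catalog_created(self, catalog):
        calls.append("mlflow.after_catalog_created")


class ToyContext:
    def __init__(self, hooks):
        self._hooks = hooks

    @property
    def catalog(self):
        # Assumption: built on every access, and each build notifies all hooks.
        catalog = {}
        for hook in self._hooks:
            hook.after_catalog_created(catalog)
        return catalog


def simulate_run(hooks):
    """Create the context, then access the catalog as a run would."""
    calls.clear()
    context = ToyContext(hooks)
    for hook in hooks:
        hook.after_context_created(context)
    _ = context.catalog  # the run itself needs the catalog
    return list(calls)


if __name__ == "__main__":
    for name in simulate_run([TelemetryLikeHook(), MlflowLikeHook()]):
        print(name)
```

With the telemetry-like hook registered first, the custom hook's after_catalog_created fires before its own after_context_created and then once more during the run. With only the custom hook registered, after_context_created comes first.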
04/04/2023, 2:21 PM
settings.py explicitly, I guess in this case the telemetry hook is triggered first.
Melvin Kok
04/04/2023, 2:32 PM
datajoely
04/04/2023, 2:33 PM
Nok Lam Chan
04/04/2023, 2:41 PM
Yetunde
04/04/2023, 2:45 PM
kedro-telemetry.
We are going to do the following:
• Ship and release an immediate fix for the plugin which means that the hook which collected anonymised information about the size of the project (number of datasets, pipelines and nodes) will observe your consent
• And then we're deleting data collected from kedro-telemetry 0.2.2 and 0.2.3 which are the affected versions
• We'll also do a team retrospective to come up with additional actions to make sure that we don't miss things like this again
• And we'll roll out communication to all of our users which will cover all of the above
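For readers following along, here is a rough sketch of what "the hook will observe your consent" could look like. It is not the plugin's actual code: read_consent, ConsentAwareProjectHook, the injected send callable, the "project_statistics" event name and the simplified after_context_created(stats) signature are all hypothetical, and treating a missing .telemetry file as "no consent" is an assumption of this sketch only.

```python
from pathlib import Path
import tempfile


def read_consent(project_root):
    """Return True only if .telemetry explicitly contains 'consent: true'."""
    telemetry_file = Path(project_root) / ".telemetry"
    if not telemetry_file.exists():
        return False  # sketch assumption: no file means no consent
    for line in telemetry_file.read_text().splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "consent":
            return value.strip().lower() == "true"
    return False


class ConsentAwareProjectHook:
    """Hypothetical project hook that gates every send on consent."""

    def __init__(self, project_root, send):
        self._project_root = project_root
        self._send = send  # injected callable standing in for the real sender

    def after_context_created(self, stats):
        if not read_consent(self._project_root):
            return  # consent withheld: nothing is sent
        self._send("project_statistics", stats)


def demo(consent_text):
    """Write a .telemetry file with the given text and report what was sent."""
    sent = []
    with tempfile.TemporaryDirectory() as root:
        Path(root, ".telemetry").write_text(consent_text)
        hook = ConsentAwareProjectHook(root, lambda name, props: sent.append(name))
        hook.after_context_created({"number_of_datasets": 3})
    return sent


if __name__ == "__main__":
    print(demo("consent: false"))
    print(demo("consent: true"))
```

The point of the sketch is that the consent check sits inside the project hook itself, in front of the send call, rather than only in the CLI hook.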