Vishal Pandey
09/13/2024, 2:38 PM
Deepyaman Datta
09/13/2024, 2:47 PM
TableDataset implementation, you can do this read from filepath.
Deepyaman Datta
09/13/2024, 2:48 PM
FileDataset to read and write from files, but I don't have that PR up yet.)
Vishal Pandey
09/13/2024, 2:48 PM
Deepyaman Datta
09/13/2024, 2:50 PM
Vishal Pandey
09/13/2024, 2:50 PM
Vishal Pandey
09/13/2024, 2:51 PM
Deepyaman Datta
09/13/2024, 2:52 PM
Deepyaman Datta
09/13/2024, 2:52 PM
Vishal Pandey
09/13/2024, 2:56 PM
document_extraction_ibis:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: postgres
    host: ##
    port: ##
    database: ##
    user: ##
    password: ##
This is the existing table from which I read the data and ran the transformations.
Now I would like to save the transformed data as a CSV on S3, basically for dataset versioning.
Once that is done, I would like to pull the CSV from S3 and save it to an existing table in another database. It is a different database and a different table, not the ones we read from.
Vishal Pandey
09/13/2024, 3:06 PM
Deepyaman Datta
09/13/2024, 4:20 PM
Vishal Pandey
09/13/2024, 5:52 PM
Deepyaman Datta
09/13/2024, 6:46 PM
1. if this process can append new entries to the existing table in the SQL backend.
2. Also, is there a way we can update existing records in an existing SQL table?
Not right now, but I was planning to mention this. We can create an append mode, because Ibis does support insert.
Vishal Pandey
09/13/2024, 6:55 PM
Deepyaman Datta
09/13/2024, 7:04 PM
mode parameter for the dataset, and if mode=="append" use insert (https://ibis-project.org/backends/duckdb#ibis.backends.duckdb.Backend.insert)
Deepyaman Datta
09/13/2024, 7:05 PM
ibis.TableDataset enhancement I've promised to do this weekend, but I could try to address this next week.
Vishal Pandey
09/13/2024, 7:19 PM
Deepyaman Datta
09/13/2024, 7:22 PM
Do you think we can implement update queries in SQL as well by implementing a custom dataset?
Can you give an example?
Vishal Pandey
09/13/2024, 7:32 PM
Deepyaman Datta
09/13/2024, 7:33 PM
Vishal Pandey
09/13/2024, 7:34 PM
Vishal Pandey
09/13/2024, 7:34 PM
Vishal Pandey
09/13/2024, 7:35 PM
Vishal Pandey
09/13/2024, 7:36 PM
Deepyaman Datta
09/13/2024, 7:43 PM
insert method already; just need to make sure we end up with a reasonable design. 🙂
Upsert: As far as I'm aware, Ibis doesn't support upsert functionality out of the box (I did see mention of this in the H1 roadmap, but don't see work done on it). I am confirming with the team. That said, Ibis supports "escape hatches" that let you use the underlying backend's functionality; in this case, the dataset could implement upsert using the raw_sql() method, at least until Ibis natively supports upserting.
However, a raw_sql()-based upsert may or may not work cleanly for all backends (a big part of the value of Ibis is that it unifies the syntax for doing this across different SQL dialects). We can probably figure out a way to unblock you based on the backends you use (Postgres?) in the interim, even if the same approach may not work perfectly for MS SQL or another backend.
Deepyaman Datta
09/13/2024, 7:45 PM
Juan Luis
09/13/2024, 7:50 PM
upsert in Ibis https://github.com/ibis-project/ibis/issues/5391
Vishal Pandey
09/13/2024, 7:51 PM
Vishal Pandey
09/13/2024, 7:51 PM
Deepyaman Datta
09/13/2024, 7:53 PM
Vishal Pandey
09/13/2024, 7:54 PM
Deepyaman Datta
09/13/2024, 8:02 PM
raw_sql and make sure it works for you (again, shouldn't be hard), if you're down to collaborate on this.
Deepyaman Datta
09/13/2024, 8:03 PM
ibis.TableDataset in your project is probably the right way to go. 🙂 That way we can implement some of this functionality quicker, and you also won't have to wait for a new Kedro-Datasets release.
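
For reference, here is a rough sketch of what a project-local dataset along these lines might look like, covering the append mode (via insert) and a raw_sql()-based upsert. It is not the Kedro-Datasets implementation: the class name LocalTableDataset, the helper build_postgres_upsert, the key_columns parameter, and the staging-table approach are all hypothetical names and choices made for illustration. The upsert uses Postgres-specific INSERT ... ON CONFLICT syntax, so it assumes a Postgres backend and a unique constraint on the key columns. The connection is assumed to be an Ibis backend connection exposing table, create_table, insert, raw_sql, and drop_table.

```python
from typing import Any, Sequence


def _quote(name: str) -> str:
    """Quote an identifier for Postgres, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_postgres_upsert(
    target: str, staging: str, columns: Sequence[str], key_columns: Sequence[str]
) -> str:
    """Build a Postgres statement that upserts all rows of `staging` into `target`."""
    cols = ", ".join(_quote(c) for c in columns)
    keys = ", ".join(_quote(k) for k in key_columns)
    updates = ", ".join(
        f"{_quote(c)} = EXCLUDED.{_quote(c)}" for c in columns if c not in key_columns
    )
    # With no non-key columns there is nothing to update on conflict.
    action = f"DO UPDATE SET {updates}" if updates else "DO NOTHING"
    return (
        f"INSERT INTO {_quote(target)} ({cols}) "
        f"SELECT {cols} FROM {_quote(staging)} "
        f"ON CONFLICT ({keys}) {action}"
    )


class LocalTableDataset:
    """Hypothetical project-local table dataset with a `mode` parameter."""

    MODES = ("overwrite", "append", "upsert")

    def __init__(
        self,
        connection: Any,  # assumed: an Ibis backend connection
        table_name: str,
        mode: str = "overwrite",
        key_columns: Sequence[str] = (),
    ) -> None:
        if mode not in self.MODES:
            raise ValueError(f"mode must be one of {self.MODES}, got {mode!r}")
        if mode == "upsert" and not key_columns:
            raise ValueError("upsert mode needs at least one key column")
        self._connection = connection
        self._table_name = table_name
        self._mode = mode
        self._key_columns = tuple(key_columns)

    def load(self) -> Any:
        return self._connection.table(self._table_name)

    def save(self, data: Any) -> None:
        con = self._connection
        if self._mode == "overwrite":
            con.create_table(self._table_name, data, overwrite=True)
        elif self._mode == "append":
            con.insert(self._table_name, data)
        else:
            # Assumption: materialize the data in a staging table, then let
            # Postgres merge it into the target with a single statement.
            staging = f"{self._table_name}__staging"
            con.create_table(staging, data, overwrite=True)
            try:
                sql = build_postgres_upsert(
                    self._table_name, staging, list(data.columns), self._key_columns
                )
                cursor = con.raw_sql(sql)
                close = getattr(cursor, "close", None)
                if callable(close):
                    close()
            finally:
                con.drop_table(staging, force=True)
```

With something like this in the project, the catalog entry would point type at the local class and add mode (and key_columns for upserts) next to table_name and connection; how the connection gets built from the catalog config is left out here.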