# questions
p
Hi all. Does anyone know of a way to configure a dataset to write out a GeoPandas dataframe to a DuckDB database table, and vice versa? I tried using ibis.TableDataset to write but it complains that the GeoDataFrame object doesn't have an 'as_table' attribute. Would I have to implement a custom DuckDB dataset? It doesn't look too hard but I don't want to reinvent the wheel if there's already a way to do it...
d
Yeah, there’s a chance our implementation is incomplete. The easiest way to work with it is to copy our implementation into your project and reference the class in the catalog by class path. Extend it to fix the limitation and we’d love a PR to improve it for everyone else!
s
When I ran into the 'as_table' error, my solution was to return an ibis.memtable():
import ibis
import pandas as pd

def your_node(data: pd.DataFrame) -> ibis.Table:
    # ... some code that transforms the data
    return ibis.memtable(data)
This provides the ibis dataset with an acceptable format for it to consume and write.
This worked with the ibis.TableDataset connected to postgresql.
d
oh great, yes that will work!
p
Thanks for the suggestion. Using memtable seems to try to work but throws a table not found error. Maybe it needs the database itself to exist first. I don't know enough about the ibis implementation to troubleshoot, so I might just try writing a simple duckdb specific reader/writer to begin with and see if I can get that going.
d
duckdb backend is your friend with most things Ibis
s
When I connected ibis to a pgsql table, the table already existed. I do not think it will create the table for you.
d
It should on save, no?
s
I am not sure, but intuitively I would not expect it to. There are a lot of details with create table spec that I do not see covered in dataframe/ibis. I don't think an ibis created table would be a 'good' one
d
@Deepyaman Datta any thoughts here?
d
@Swift As @datajoely said, saving a table from a dataset will create a table or view (depending on the `materialized` save arg). Ibis-created tables work; that said, I think it will have less flexibility than raw DDL (e.g. maybe you can't create a table with constraints). It would be an enhancement on the Ibis side to support more complex DDL, although I'd also be curious what specifically you'd be looking for support for.
@Paul Haakma sorry I didn't respond here earlier. As @Swift wrote, the Ibis datasets just accept Ibis tables as objects to save, so you would need to return an `ibis.memtable` from the GeoPandas dataframe. I believe one user (@Nelson Zambrano) introduced the ability to automatically wrap DataFrames to save from `ibis.TableDataset`; that is still pending review though. When you load back, you'd still get an Ibis table.