Théo Andro
10/16/2024, 1:57 PM
When a databricks.ManagedTableDataset is created, there is a problem with the inferred types (e.g. the precision of a DecimalType).
To avoid that, I want to define the schema in my YAML file, so that the ManagedTableDataset in Databricks gets the schema it should have.
Would you have a YAML example of how to define this schema? (with DecimalType if possible 🙂) I did not find any example, and IntegerType (a Spark type) did not match anything, for instance.
Thanks and have a good day!
Nok Lam Chan
10/16/2024, 2:06 PM
Théo Andro
10/16/2024, 2:26 PM
import json
from pyspark.sql.types import StructType

# fromJson (note the casing) takes a parsed dict, not a file path
StructType.fromJson(json.load(open(path_to_file)))
I will follow this lead, and share my result.
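As an illustration of that lead, here is a minimal stdlib-only sketch of the JSON shape that Spark's `StructType.fromJson` expects; the field names and the decimal column are assumptions, not from the thread:

```python
import json

# Hypothetical schema in Spark's JSON schema format. Note that
# "nullable" is a JSON boolean, and a decimal column is written
# as "decimal(precision,scale)".
schema_json = """
{
  "type": "struct",
  "fields": [
    {"name": "content_id", "type": "long", "nullable": true, "metadata": {}},
    {"name": "price", "type": "decimal(10,2)", "nullable": true, "metadata": {}}
  ]
}
"""

schema_dict = json.loads(schema_json)

# StructType.fromJson takes this parsed dict, not a file path:
#   from pyspark.sql.types import StructType
#   schema = StructType.fromJson(schema_dict)
print(schema_dict["fields"][1]["type"])  # -> decimal(10,2)
```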
Nok Lam Chan
10/16/2024, 4:30 PM
def __init__(  # noqa: PLR0913
self,
*,
table: str,
catalog: str | None = None,
database: str = "default",
write_mode: str | None = None,
dataframe_type: str = "spark",
primary_key: str | list[str] | None = None,
version: Version | None = None,
# the following parameters are used by project hooks
# to create or update table properties
schema: dict[str, Any] | None = None,
partition_columns: list[str] | None = None,
owner_group: str | None = None,
metadata: dict[str, Any] | None = None,
) -> None:
From the constructor, it is expecting a dictionary type.
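Given that constructor, a catalog entry could pass the schema as a nested mapping. This is a hedged sketch: the dataset name, table, and columns are assumptions, and the schema follows Spark's JSON schema layout:

```yaml
my_managed_table:
  type: databricks.ManagedTableDataset
  table: my_table
  database: default
  write_mode: overwrite
  schema:
    type: struct
    fields:
      - name: content_id
        type: long
        nullable: true
        metadata: {}
      - name: price
        type: decimal(10,2)
        nullable: true
        metadata: {}
```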
Théo Andro
10/17/2024, 9:27 AM
Théo Andro
10/17/2024, 9:27 AM
Failed to convert the JSON string '{"metadata":{},"name":"content_id","nullable":"true","type":"long"}' to a field.
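That error is likely because "nullable" here is the JSON string "true" rather than the boolean true; Spark's parser expects a real boolean in that slot. A stdlib-only sketch of spotting and fixing this before handing the dict to `StructField.fromJson` (the coercion step is just an illustration, not part of the thread):

```python
import json

failing = '{"metadata":{},"name":"content_id","nullable":"true","type":"long"}'
field = json.loads(failing)

# The value is the *string* "true", which Spark cannot convert to a field
print(type(field["nullable"]).__name__)  # -> str

# Coerce string booleans to real booleans before StructField.fromJson
if isinstance(field["nullable"], str):
    field["nullable"] = field["nullable"].lower() == "true"

print(json.dumps(field, sort_keys=True))
```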
Théo Andro
10/17/2024, 9:28 AM
Nok Lam Chan
10/17/2024, 9:45 AM
Nok Lam Chan
10/17/2024, 9:45 AM
Théo Andro
10/17/2024, 9:51 AM
Nok Lam Chan
10/17/2024, 9:53 AM
Théo Andro
10/17/2024, 9:54 AM
Nok Lam Chan
10/17/2024, 9:56 AM
Théo Andro
10/17/2024, 12:17 PM
Théo Andro
10/17/2024, 1:03 PM
Nok Lam Chan
10/17/2024, 1:09 PM
.json()
as well. Maybe linking the Spark docs would help? https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.types.StructType.html