Puneet Saini
12/11/2024, 12:34 PM
Hall
12/11/2024, 12:34 PM
Merel
12/11/2024, 12:41 PM
Puneet Saini
12/11/2024, 12:44 PM
Laurens Vijnck
12/11/2024, 12:45 PM
json.load(fs_file) is used, which does not support JSONL files out of the box.
Puneet Saini
12/11/2024, 12:45 PM
Puneet Saini
12/11/2024, 12:45 PM
Puneet Saini
12/11/2024, 12:46 PM
Puneet Saini
12/11/2024, 12:46 PM
Laurens Vijnck
12/11/2024, 12:46 PM
Puneet Saini
12/11/2024, 12:49 PM
Puneet Saini
12/11/2024, 12:50 PM
Puneet Saini
12/11/2024, 12:51 PM
Laurens Vijnck
12/11/2024, 12:58 PM
Laurens Vijnck
12/11/2024, 12:58 PM
Laurens Vijnck
12/11/2024, 12:59 PM
datajoely
12/11/2024, 2:32 PM
datajoely
12/11/2024, 2:32 PM
Ian Whalen
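12/11/2024, 2:39 PM
"""
Dataset definition for a JSONLines dataset.
"""
import json
from typing import List

# Older Kedro import path; newer releases ship datasets in the separate kedro-datasets package.
from kedro.extras.datasets.text import TextDataSet


class JSONLinesDataSet(TextDataSet):
    """Class for handling JSON lines files (.jsonl)."""

    def _load(self) -> List[dict]:
        # Skip blank lines (e.g. after a trailing newline), which json.loads would reject.
        return [json.loads(line) for line in super()._load().split("\n") if line.strip()]

    def _save(self, data: List[dict]) -> None:
        # Write one JSON document per line.
        super()._save("\n".join(json.dumps(line) for line in data))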
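
To illustrate the point made earlier in the thread without needing Kedro installed, here is a minimal standard-library sketch. It shows that json.load rejects a multi-line JSONL payload, while parsing line by line works and round-trips. The sample data and the helper names parse_jsonl and dump_jsonl are hypothetical and only mirror the _load/_save logic above; this is not Kedro's own implementation.

```python
import io
import json

# Two JSON documents, one per line, as in a .jsonl file (made-up sample data).
jsonl_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# json.load expects exactly one JSON document, so the second line is "extra data".
try:
    json.load(io.StringIO(jsonl_text))
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False


def parse_jsonl(text):
    """Parse JSON Lines text into a list of objects, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def dump_jsonl(records):
    """Serialize records to JSON Lines text, one document per line."""
    return "\n".join(json.dumps(record) for record in records)


records = parse_jsonl(jsonl_text)
round_trip = parse_jsonl(dump_jsonl(records))
```

The blank-line filter matters because files often end with a trailing newline, which leaves an empty final element after splitting.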