Vishal Pandey
10/15/2024, 8:57 AMimport ibis
con = ibis.connect(POSTGRES_CONNECTION_STRING)
training_meta_table:ir.Table = con.table("training_metadata")
filters = {
"customer_ids" : [59] ,
"queue_names" : ["General Lit - Misclassifications", "MoveDocs-MR"],
"start_date" : "2024-09-5 00:00:00",
"end_date" : "2024-09-11 00:00:00",
"doc_types" : [],
"fields" : ["patientFirstName", "patientLastName", "Service Date", "Doctor"]
}
field_conditions = training_meta_table.fields_present.contains(filters["fields"][0]) | training_meta_table.fields_present.contains(filters["fields"][1]) | training_meta_table.fields_present.contains(filters["fields"][2]) | training_meta_table.fields_present.contains(filters["fields"][3])
So there are many or conditions we would like to dynamically join together to create 1 final condition based on the input filtersVishal Pandey
10/15/2024, 8:58 AMdatajoely
10/15/2024, 9:01 AMVishal Pandey
10/15/2024, 9:05 AMdatajoely
10/15/2024, 10:05 AMimport ibis
from functools import reduce
import operator
# Establish connection
con = ibis.connect(POSTGRES_CONNECTION_STRING)
training_meta = con.table("training_metadata")
# Define filters
filters = {
"customer_ids": [59],
"queue_names": ["General Lit - Misclassifications", "MoveDocs-MR"],
"start_date": "2024-09-05 00:00:00",
"end_date": "2024-09-11 00:00:00",
"fields": ["patientFirstName", "patientLastName", "Service Date", "Doctor"]
}
# Helper function to build filter conditions
def build_filters(table, filters):
conditions = [
table.customer_id.isin(filters["customer_ids"]),
table.queue_name.isin(filters["queue_names"]),
table.date.between(filters["start_date"], filters["end_date"]),
table.fields_present.overlaps(filters["fields"])
]
return reduce(operator.and_, conditions)
# Apply filters and select fields using the helper function
result = (
training_meta
.filter(build_filters(training_meta, filters))
.select(filters["fields"])
)
# Execute the query (uncomment when ready)
# fetched_results = result.execute()
Vishal Pandey
10/15/2024, 10:12 AMoverlaps()
here.
Just a bit of context here - The fields_present column in the table contains comma separated strings , attaching some examples as well here -
Document Date, Dr. Name, Doctor, Sender Free Text, Sender, Sender Logo, Sender Signature
Document Type, patientDOB, patientFirstName, patientLastName, providerEntity, referringPhysician, serviceDate
and we have to return the row if any of the value present in filters["fields"] is found as a substring in the fields_present column and that is why I used the code below :
training_meta_table.fields_present.contains(filters["fields"][0])
contains()
comes from ibis string expressions.
But you see it is difficult to manually create such ibis expressions for the whole list of values