Working with dataframes, I wanted a way to filter rows by string matching. You can use the usual boolean expressions, such as df["column"] == val. But if I have my own boolean function, boolfoo(str) -> bool, I can't just call boolfoo(df["column"])! That's because df["column"] isn't a single value; it's a pandas Series holding the whole column, so a function written for one string generally isn't going to work on it.
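Here is a small sketch of the problem, using a NumPy array as a stand-in for df["column"]. The body of boolfoo (a startswith check) and the sample strings are made up for illustration, and the exact error you hit will depend on what your own function does.

```python
import numpy as np

def boolfoo(s):
    # Hypothetical predicate: True when the string starts with "a".
    return s.startswith("a")

# Stand-in for df["column"]: many strings, not one.
column = np.array(["apple", "banana", "avocado"])

# == is applied element by element and gives one boolean per entry.
eq_mask = column == "apple"

# A plain Python function receives the whole array as a single argument,
# and an array has no .startswith method.
try:
    boolfoo(column)
    failed = False
except AttributeError:
    failed = True
```

So the comparison gives back a mask with one entry per row, while the plain function call falls over before it looks at any individual string.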
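So why does the boolean == work? The series as a whole obviously doesn't match val, but == is applied element by element. On NumPy arrays it is backed by a ufunc (np.equal), a special kind of function that smartly works on the elements of arrays, and a pandas Series compares element-wise in the same way. It's actually really easy to make your own function into a ufunc: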
import numpy as np

def boolfoo(input):
    # your stuff here
    return output

uboolfoo = np.frompyfunc(boolfoo, 1, 1)
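Now you can use uboolfoo on arrays, dataframe columns, whatever! It will apply boolfoo to each element inside. One caveat: np.frompyfunc always returns results with dtype object, so convert them with .astype(bool) before using them as a filter mask.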
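To make that concrete, here is a minimal sketch of filtering a dataframe with a wrapped function. The predicate body, the column names and the sample data are all invented for the example. I also pull the column out as a plain object array with to_numpy first, which is just a cautious choice on my part and not the only way to do it.

```python
import numpy as np
import pandas as pd

def boolfoo(s):
    # Hypothetical predicate: True when the string contains "an".
    return "an" in s

uboolfoo = np.frompyfunc(boolfoo, 1, 1)

# Made-up sample data.
df = pd.DataFrame({"column": ["banana", "cherry", "mango"], "n": [1, 2, 3]})

# Apply the predicate to every element of the column.
# frompyfunc returns dtype object, so cast to bool before masking.
values = df["column"].to_numpy(dtype=object)
mask = uboolfoo(values).astype(bool)

# Keep only the rows where the predicate was True.
filtered = df[mask]
```

Under the hood this is still calling the Python function once per element, so think of it as a convenience and not a speed-up.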
The first argument, boolfoo in the example above, is the function you want to turn into a ufunc. The second argument is the number of inputs your function takes, and the third is the number of outputs it returns. So change them as you like, but I guess 1, 1 is the most common!
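As a sketch of changing those numbers, here is a function that takes two inputs and gives one output, so it is wrapped with 2, 1. The names contains and ucontains and the sample words are hypothetical, chosen only for this illustration.

```python
import numpy as np

def contains(text, needle):
    # Hypothetical two-input predicate: is needle a substring of text?
    return needle in text

# Two inputs, one output.
ucontains = np.frompyfunc(contains, 2, 1)

words = np.array(["banana", "cherry", "mango"], dtype=object)

# The single string "an" is broadcast against every element of words.
hits = ucontains(words, "an").astype(bool)
```

Here the second argument is a single value that gets paired with every element of the array, the same way val does in df["column"] == val.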