Ufuncs

Working with dataframes, I wanted a way to filter rows by string matching. You can use the usual boolean expressions such as df["column"]==val. But if I have my own boolean function, boolfoo(str) => Bool, I can't just do boolfoo(df["column"])! That's because df["column"] isn't a single value, it's a pandas Series holding a whole column of data, so a function written for one string won't work on it.

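So why does == work? The series as a whole obviously doesn't equal val, but == is applied element by element, giving one True or False per row. In NumPy, functions that work elementwise on arrays like this are known as ufuncs (universal functions), and for NumPy arrays == is backed by the ufunc np.equal. It's actually really easy to make your own function into a ufunc.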
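To see the elementwise behaviour for yourself, here is a small sketch on a plain NumPy array. The array contents and variable names are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 1])

# Comparing an array with a single value gives one result per element.
mask = arr == 1

# np.equal is the ufunc that does the same elementwise comparison.
same = np.equal(arr, 1)

# Ufuncs are instances of np.ufunc.
is_ufunc = isinstance(np.equal, np.ufunc)
```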

import numpy as np

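def boolfoo(value):
  # your stuff here: test a single value and return True or False
  output = "foo" in value  # placeholder test, swap in your own
  return output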

uboolfoo = np.frompyfunc(boolfoo,1,1)

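Now you can call uboolfoo on an array or a series, and it will run boolfoo on each element inside. One catch: a ufunc made with np.frompyfunc always returns object dtype, so convert the result with .astype(bool) before using it as a filter.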
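Here is a minimal sketch of the whole round trip on a NumPy array of strings. The startswith test and the sample words are just assumptions for the example; put your own check in boolfoo.

```python
import numpy as np

def boolfoo(value):
    # Hypothetical test: does the string start with "a"?
    return value.startswith("a")

uboolfoo = np.frompyfunc(boolfoo, 1, 1)

words = np.array(["apple", "banana", "avocado"], dtype=object)

raw = uboolfoo(words)      # object-dtype array holding Python bools
mask = raw.astype(bool)    # real boolean array, safe to use as a mask
selected = words[mask]     # keeps only the matching elements
```

With a dataframe, the same idea should carry over: build the mask from df["column"].to_numpy() and then filter with df[mask].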

What is the 1,1 about?

The first argument, boolfoo in the example above, is the function you want to make into a ufunc. The second argument (nin) is the number of inputs your function takes. The third argument (nout) is the number of values your function returns. So change them as you like, but I guess 1,1 is the most common!
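For functions with a different shape, here is a quick sketch with values other than 1,1. The helper functions contains and first_and_length are hypothetical, made up only to show two inputs and two outputs.

```python
import numpy as np

def contains(text, needle):
    # Two inputs, one output.
    return needle in text

def first_and_length(text):
    # One input, two outputs (returned as a tuple).
    return text[0], len(text)

ucontains = np.frompyfunc(contains, 2, 1)
ufirst_and_length = np.frompyfunc(first_and_length, 1, 2)

words = np.array(["apple", "banana", "cherry"], dtype=object)

# The single string "an" is broadcast against every element of words.
hits = ucontains(words, "an").astype(bool)

# With two outputs, the ufunc returns a tuple of two arrays.
firsts, lengths = ufirst_and_length(words)
```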

Original URL: gemini://envs.net/~jupy/minihacks/ufuncs.gmi