Ancestors

Written by Dachary on 2025-01-30 at 23:06

🤞

I just kicked off a program that uses a local LLM on my laptop to categorize code examples based on definitions we’ve come up with.

This ONE docs repo I’m working on (we have 50+) has 5,500 code examples. Based on some smaller data sets, it will take over 5.5 hours to complete this one repo, assuming it doesn’t crash.

I’ve got it outputting a processed count to the console every 100 files so I’ll wander by occasionally to see if it’s still running. Hope it doesn’t crash!
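
A minimal sketch of what that kind of progress reporting might look like. The function name, the `categorize` callback, and the sample paths are hypothetical; the actual tool is not shown in this thread.

```python
from typing import Callable, Iterable


def process_with_progress(
    paths: Iterable[str],
    categorize: Callable[[str], str],
    every: int = 100,
) -> dict[str, str]:
    """Categorize each path, printing a running count every `every` files."""
    results: dict[str, str] = {}
    for count, path in enumerate(paths, start=1):
        results[path] = categorize(path)
        if count % every == 0:
            # flush=True so the count shows up immediately in the console
            print(f"Processed {count} files", flush=True)
    return results


if __name__ == "__main__":
    # Stand-in categorizer; a real one would call the local LLM here.
    fake_paths = [f"example_{i}.py" for i in range(250)]
    process_with_progress(fake_paths, lambda path: "placeholder")
```

Printing inside the loop means a crash still leaves the last count on screen, which gives a rough idea of how far the run got.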

=> More information about this toot | More toots from dachary@dacharycarey.social

Written by Dachary on 2025-01-31 at 03:38

For the curious, we are trying to categorize and count the code examples in our docs so we can get a better understanding of what we have. Developers ask for “more” and “more realistic” code examples, but we can’t really plan work to address gaps without more info. We have realized we have a very poor understanding of what already exists, so step one is to categorize and count our code examples by “type” and programming language. We can analyze an inventory of what exists to identify gaps.
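
As a rough illustration, an inventory counted by category and language could be built like this. The record shape and the "CLI Command" category name are assumptions made for the example; only "Usage Example" is named in the thread.

```python
from collections import Counter


def build_inventory(examples: list[dict[str, str]]) -> Counter:
    """Count code examples per (category, language) pair."""
    return Counter((ex["category"], ex["language"]) for ex in examples)


# Hypothetical records; real ones would come from the categorization run.
sample = [
    {"category": "Usage Example", "language": "python"},
    {"category": "Usage Example", "language": "python"},
    {"category": "CLI Command", "language": "shell"},
]

inventory = build_inventory(sample)
for (category, language), count in sorted(inventory.items()):
    print(f"{category:15} {language:8} {count}")
```

A pair that never shows up in the counts (or shows up with a very small number) is a candidate gap.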

Written by Dachary on 2025-01-31 at 03:43

We’re using an LLM to help assign categories to each code example based on some definitions we’ve come up with. We have over 20,000 code snippets across our docs corpus, so manually categorizing this volume just isn’t possible. We will spot check key product areas and any suspicious numbers (a Python snippet that came up as a CLI command, for example) and will take a reasonably accurate categorizing process as a starting point. Currently iterating on definitions for the LLM to improve results.

Written by Dachary on 2025-01-31 at 12:35

Run 2 completed overnight! The results seem - worse - at a quick glance. The LLM hallucinated more categories this time, and assigned more examples to the hallucinated categories.

I think I’ll kick off another run without changing category definitions. Curious how the results will vary.

Written by Dachary on 2025-01-31 at 18:33

Run 3 results were worse than run 2. The LLM added more undesired categories, and categorized more things into these made-up categories.

I tweaked the prompt so I’m now feeding it one of three possible prompts, based on the code example’s programming language. For example, JSON will never be a Usage Example, according to our definitions, so I have a version of the prompt that omits that category if the file is JSON.
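
One way to sketch language-dependent prompts, under stated assumptions: instead of three hand-written prompts, this version builds the prompt from an exclusion table. The category list (apart from "Usage Example"), the prompt wording, and all names are hypothetical.

```python
# Only "Usage Example" comes from the thread; the other names are placeholders.
ALL_CATEGORIES = ["Usage Example", "CLI Command", "Configuration Object"]

# Categories that should never be offered for a given language.
EXCLUDED_BY_LANGUAGE = {"json": {"Usage Example"}}


def categories_for(language: str) -> list[str]:
    """Return the categories the model may choose from for this language."""
    excluded = EXCLUDED_BY_LANGUAGE.get(language.lower(), set())
    return [name for name in ALL_CATEGORIES if name not in excluded]


def build_prompt(language: str, code: str) -> str:
    """Build a prompt that lists only the categories valid for the language."""
    options = "\n".join(f"- {name}" for name in categories_for(language))
    return (
        "Assign exactly one of these categories to the code example.\n"
        f"{options}\n\n"
        f"Language: {language}\n"
        f"Code:\n{code}\n"
    )


print(build_prompt("json", '{"name": "demo"}'))
```

Leaving a category out of the prompt does not stop the model from inventing one, so this works best alongside a check on the model's answer.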

Running again, then will add handling to reject invalid categories.
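
Handling for invalid categories could be as small as the following sketch. The category set and function name are assumptions; what to do with a rejected answer (retry, flag for review, record as uncategorized) is left open.

```python
from typing import Optional

# Placeholder set; only "Usage Example" is named in the thread.
VALID_CATEGORIES = {"Usage Example", "CLI Command"}


def validate_category(
    raw: str, valid: frozenset = frozenset(VALID_CATEGORIES)
) -> Optional[str]:
    """Return the canonical category name, or None if the answer is not valid."""
    # Tolerate stray whitespace, quotes, a trailing period, and case differences.
    cleaned = raw.strip().strip("\"'.").casefold()
    lookup = {name.casefold(): name for name in valid}
    return lookup.get(cleaned)


print(validate_category(" usage example.\n"))
print(validate_category("Tutorial Snippet"))
```

Returning `None` rather than raising lets the caller decide whether to re-prompt or set the file aside for a manual look.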

Written by Dachary on 2025-01-31 at 18:39

I do wonder now how much faster this would run on my personal laptop. I maxed out the laptop specs when I bought it in 2021, and it is still substantially more powerful in both CPU and RAM than my work laptop.

Maybe I’ll try it on my personal machine over the weekend just to see. 🧐

Written by Dachary on 2025-01-31 at 23:51

lol. Decided to try running my tool on my personal laptop. It is more than 2x as fast as my work laptop, even though the machine is older.

Thanks, past me, for looking out for future me! I will apparently never regret extra CPU & RAM.

Written by Dachary on 2025-02-01 at 03:26

Cool beans. I did two runs in a row on my personal laptop (without tweaking anything) in the time it took me to do 1 run on my work laptop. And they match! So at least the output is consistent between runs.
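
A consistency check like that can be sketched as a diff over two result mappings. The `{path: category}` shape and the sample data are assumptions for illustration, not the tool's actual output format.

```python
from typing import Optional


def diff_runs(
    run_a: dict[str, str], run_b: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return {path: (category_a, category_b)} for every path where runs disagree."""
    all_paths = run_a.keys() | run_b.keys()
    return {
        path: (run_a.get(path), run_b.get(path))
        for path in sorted(all_paths)
        if run_a.get(path) != run_b.get(path)
    }


# Hypothetical results from two runs over the same files.
first = {"a.py": "Usage Example", "b.sh": "CLI Command"}
second = {"a.py": "Usage Example", "b.sh": "Usage Example"}

print(diff_runs(first, second))
```

An empty result means the two runs agree on every file, including which files were processed.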

I’m weirdly looking forward to digging into the results next week.

I suspect it’s possible to make additional gains in accuracy by looking for specific keywords in the file path. But that might be yak shaving. So now my task is to tamp down the inclination to optimize.
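
For what it's worth, a file-path hint could be sketched as below. The keywords, the categories they map to, and the example paths are all hypothetical; whether such hints actually improve accuracy is untested here.

```python
import re
from typing import Optional

# Hypothetical keyword-to-category hints.
PATH_HINTS = {"cli": "CLI Command", "usage": "Usage Example"}


def category_hint_from_path(path: str) -> Optional[str]:
    """Suggest a category when a whole path token matches a known keyword."""
    # Split on anything that is not a letter or digit, so "cli" matches
    # ".../cli/install.sh" but not ".../client/connect.py".
    tokens = set(re.split(r"[^a-z0-9]+", path.lower()))
    for keyword, category in PATH_HINTS.items():
        if keyword in tokens:
            return category
    return None


print(category_hint_from_path("source/includes/cli/install.sh"))
print(category_hint_from_path("source/client/connect.py"))
```

Matching whole tokens rather than substrings avoids the more obvious false positives, at the cost of missing names like `cli_tools` only if the separator is unusual.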

Written by Dachary on 2025-02-01 at 03:50

Oh no.

The report was interesting. And I may have decided to run my tool across the rest of our corpus.

Why, oh why, am I spending my Friday night doing work?

Darn these interesting problems!

(I am a nerd. I have been sniped.)

Toot

Written by DJ Fragile Toolchains on 2025-02-01 at 04:36

@dachary it's just occurred to me that if you're given an interesting problem and then have it taken away from you, you've been nerd snipped. 😁

=> More information about this toot | More toots from plaindocs@chaos.social

Descendants

Written by Dachary on 2025-02-01 at 05:06

@plaindocs Well that is my favorite thing I have heard in a while! 😂 Nice wordplay!

Proxy Information

Original URL: gemini://mastogem.picasoft.net/thread/113926771529759525