Ancestors

Toot

Written by Romain Guy on 2024-11-11 at 00:34

"A Micro-optimization You Will Never Need"

And that you should never use. But you might learn something.

https://www.romainguy.dev/posts/2024/a-micro-optimization-you-will-never-need/

=> More information about this toot | More toots from romainguy@androiddev.social

Descendants

Written by Arseny Kapoulkine on 2024-11-11 at 02:23

@romainguy Once the value is 0-32, you can do a table lookup from a 33-byte table, which will eliminate branch mispredictions (if they happen).

Could be slower if most values follow the first branch, but not super likely.

=> More information about this toot | More toots from zeux@mastodon.gamedev.place
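
A minimal Java sketch of the table-lookup idea, assuming the "0-32" value is the leading-zero count of a 32-bit integer. The class name `BitBuckets`, both method names, and the 8/16/32 buckets are hypothetical and chosen only for illustration; they are not the thresholds from the blog post.

```java
final class BitBuckets {
    // Branchy reference version. The 8/16/32 buckets are illustrative
    // assumptions, not the thresholds used in the original post.
    static int bucketWithBranches(int value) {
        int bits = 32 - Integer.numberOfLeadingZeros(value);
        if (bits <= 8) return 8;
        if (bits <= 16) return 16;
        return 32;
    }

    // 33 entries, one per possible leading-zero count (0..32 inclusive).
    private static final byte[] TABLE = new byte[33];

    static {
        for (int lz = 0; lz <= 32; lz++) {
            int bits = 32 - lz;
            TABLE[lz] = (byte) (bits <= 8 ? 8 : bits <= 16 ? 16 : 32);
        }
    }

    // Table version: the bucket comparisons become a single indexed load.
    // The JVM may still emit a bounds check for the array access.
    static int bucketWithTable(int value) {
        return TABLE[Integer.numberOfLeadingZeros(value)];
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 255, 256, 65535, 65536, -1};
        for (int v : samples) {
            System.out.println(v + " -> " + bucketWithBranches(v) + " / " + bucketWithTable(v));
        }
    }
}
```

Both methods return the same bucket for every input; whether the table version is actually faster depends on how predictable the branches are for the real data.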

Written by Romain Guy on 2024-11-11 at 03:23

@zeux Most values fall in the first branch. There's also the problem of getting the lookup table (this is Java/Kotlin, so there are interesting issues related to that).

=> More information about this toot | More toots from romainguy@androiddev.social

Written by Romain Guy on 2024-11-11 at 16:38

@zeux I double-checked just in case, and for the common case the lookup table is 20% slower (accessing an array always has a branch in Java/Kotlin for the bounds check, and there are extra ldr instructions on the main path). When iterating over all possible values, it's ~6% faster.

=> More information about this toot | More toots from romainguy@androiddev.social

Written by Tanany on 2024-11-11 at 20:38

@romainguy

It may be a sign for me to focus on building more user-pleasing features and leave micro-optimization for fun outside work hours 😭😭

=> More information about this toot | More toots from itsTanany@mastodon.social

Written by Flo on 2024-11-11 at 22:06

@romainguy Correct me if I'm wrong, but isn't that just 32 - clz (count leading zeros)? Possibly clamped to at least 13.

Is all of this just because you can't ask the JVM to generate specific assembly on some architectures?

=> More information about this toot | More toots from pikzen@androiddev.social
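
For reference, the formula suggested in this post could be written in Java roughly as follows. This is a sketch of the suggestion only, not of the blog post's function; the class name `ClampedBits` and the method name `clampedBitWidth` are hypothetical.

```java
final class ClampedBits {
    // 32 - clz is the number of bits needed to represent value;
    // Math.max applies the "at least 13" clamp from the post.
    static int clampedBitWidth(int value) {
        return Math.max(13, 32 - Integer.numberOfLeadingZeros(value));
    }

    public static void main(String[] args) {
        System.out.println(clampedBitWidth(100));
        System.out.println(clampedBitWidth(1 << 17));
    }
}
```

`Integer.numberOfLeadingZeros` is the standard Java counterpart of a clz instruction and returns 32 for an input of 0, so the clamp also covers that case.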

Written by Romain Guy on 2024-11-11 at 22:24

@pikzen It's not just 32 - clz. We want values with 17 to 19 leading zeroes to return 15, and values with 14 to 16 leading zeroes to return 18. There might be a clever way to do this without a switch, but might as well use a lookup table at that point.

=> More information about this toot | More toots from romainguy@androiddev.social
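
A small Java sketch of the snapping described in this post. Only the two ranges named there are filled in; every other leading-zero count returns -1 as a placeholder, because the thread does not state those results. The class name `SnapDemo` and the method name `snappedBits` are hypothetical.

```java
final class SnapDemo {
    // Implements only the two ranges stated in the post:
    //   17..19 leading zeros -> 15
    //   14..16 leading zeros -> 18
    // Any other count returns -1 (placeholder, not specified in the thread).
    static int snappedBits(int value) {
        int lz = Integer.numberOfLeadingZeros(value);
        if (lz >= 17 && lz <= 19) return 15;
        if (lz >= 14 && lz <= 16) return 18;
        return -1;
    }

    public static void main(String[] args) {
        int[] samples = {1 << 12, 1 << 14, 1 << 15, 1 << 17};
        for (int v : samples) {
            int plain = 32 - Integer.numberOfLeadingZeros(v);
            System.out.println(v + ": plain=" + plain + " snapped=" + snappedBits(v));
        }
    }
}
```

For example, `1 << 12` has 19 leading zeros, so the plain bit width is 13 while the snapped result is 15. This gap is why a bare `32 - clz` does not reproduce the mapping.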

Written by Flo on 2024-11-12 at 20:41

@romainguy Ah, right, I didn't consider that there's some snapping to values. Yeah, at this point, the approach you have is probably the most efficient; lookup tables might cause memory reads instead.

=> More information about this toot | More toots from pikzen@androiddev.social

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113461509943028991