Ancestors

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:06

I wonder what the probability that 32 random bytes happens to be valid UTF-8 is

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:16

@monoidmusician about 0

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:18

@pierogiburo well it’s at least 1/32, if all the top bits happen to be zero. but i see where you’re coming from :3

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:18

@monoidmusician that's 1/2^32, no?

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:20

@pierogiburo oh wait, yeah you’re right. i knew that last week :neocat_woozy:

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:22

@pierogiburo that said, i’ve rolled one before in way less than 2^32 attempts, so it feels like it is a bit more reasonable of a number?

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:28

@monoidmusician hmmm lemme think

say P(n) is the probability that n bytes form a valid utf-8 string. define P(0)=1

so we have 1/2 probability that the highest bit is 0, so we have a sequence of length 1

1/8 probability that the highest bits are 110, times 1/4 probability that the next byte is continuation byte, sequence length 2

1/16 * 1/4 * 1/4 probability of valid sequence length 3

1/32 * 1/4 * 1/4 * 1/4 probability of valid sequence length 4

P(n)=P(n-1)/2 + P(n-2)/32 + P(n-3)/256 + P(n-4)/2048

i'm on my phone rn, so i can't check what P(32) evaluates to, but it seems unlikely to be that big

=> More informations about this toot | More toots from pierogiburo@tech.lgbt
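The coefficients in the recurrence above can be double-checked by counting bytes. The sketch below classifies all 256 byte values by their leading bits only, which is the model the toot uses. The names `byte_class` and `coefficients` are made up for illustration. Note the assumption: this model treats overlong encodings, surrogates, and lead bytes above 0xF4 as valid, so a strict UTF-8 decoder would accept somewhat fewer strings.

```python
from fractions import Fraction

def byte_class(b: int) -> str:
    """Classify a byte by its leading bits only (the thread's simplified model)."""
    if b >> 7 == 0b0:
        return "ascii"         # 0xxxxxxx
    if b >> 6 == 0b10:
        return "continuation"  # 10xxxxxx
    if b >> 5 == 0b110:
        return "lead2"         # 110xxxxx
    if b >> 4 == 0b1110:
        return "lead3"         # 1110xxxx
    if b >> 3 == 0b11110:
        return "lead4"         # 11110xxx
    return "invalid"           # 11111xxx is never used by UTF-8

# Count how many of the 256 byte values fall into each class.
counts = {}
for b in range(256):
    name = byte_class(b)
    counts[name] = counts.get(name, 0) + 1

prob = {name: Fraction(c, 256) for name, c in counts.items()}
cont = prob["continuation"]

# Probability that a random run of bytes starts with a well-formed
# sequence of length 1, 2, 3, or 4.
coefficients = [
    prob["ascii"],
    prob["lead2"] * cont,
    prob["lead3"] * cont**2,
    prob["lead4"] * cont**3,
]
print(coefficients)
```

The four values are the factors multiplying P(n-1), P(n-2), P(n-3), and P(n-4) in the recurrence.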

Toot

Written by Evvy :neofox_floof: on 2024-12-21 at 19:34

@monoidmusician 1.3044272060623614e-08 apparently (thanks python on my phone)

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Descendants

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 20:16

@pierogiburo oh that’s great, thank you!

huh, so either i got really really lucky, or this QR scanner isn’t fully validating UTF-8 or something :neocat_think_woozy:

=> More informations about this toot | More toots from monoidmusician@tech.lgbt
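What "not fully validating" could look like is sketched below. Nothing in the thread says how the QR scanner actually works, so this is purely an illustration: `structurally_valid` is a hypothetical checker that only looks at lead-byte and continuation-byte bit patterns, while `strictly_valid` defers to Python's built-in strict UTF-8 decoder. Some byte strings pass the first check and fail the second.

```python
def structurally_valid(data: bytes) -> bool:
    """Check only the lead/continuation bit patterns, nothing else."""
    i = 0
    while i < len(data):
        b = data[i]
        if b >> 7 == 0b0:
            extra = 0
        elif b >> 5 == 0b110:
            extra = 1
        elif b >> 4 == 0b1110:
            extra = 2
        elif b >> 3 == 0b11110:
            extra = 3
        else:
            return False  # stray continuation byte or 11111xxx
        tail = data[i + 1 : i + 1 + extra]
        # Every expected continuation byte must be present and look like 10xxxxxx.
        if len(tail) != extra or any(t >> 6 != 0b10 for t in tail):
            return False
        i += 1 + extra
    return True

def strictly_valid(data: bytes) -> bool:
    """Use Python's strict UTF-8 decoder as the reference."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

samples = [
    b"hello",               # plain ASCII
    "é".encode("utf-8"),    # ordinary 2-byte sequence
    b"\xc0\x80",            # overlong encoding of U+0000
    b"\xed\xa0\x80",        # UTF-16 surrogate U+D800
    b"\xf5\x80\x80\x80",    # lead byte beyond U+10FFFF
    b"\x80",                # stray continuation byte
]
for s in samples:
    print(s, structurally_valid(s), strictly_valid(s))
```

A decoder that behaves like the first function would accept random bytes more often than a strict one.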

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 20:34

@pierogiburo more precisely, that would be 4037006666794396657/309485009821345068724781056 :ms_wink_tongue: (thanks Haskell Ratio)

=> More informations about this toot | More toots from monoidmusician@tech.lgbt
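Both figures quoted in the thread, the float and the exact ratio, can be reproduced with a few lines of Python. This is a minimal sketch of the recurrence as stated by Evvy, assuming P(0)=1 and treating P(n) as 0 for negative n. The function name `p_valid` is made up here, and the thread does not show the code either poster actually ran.

```python
from fractions import Fraction

# Coefficients of P(n-1), P(n-2), P(n-3), P(n-4) from the thread's recurrence.
COEFFS = [Fraction(1, 2), Fraction(1, 32), Fraction(1, 256), Fraction(1, 2048)]

def p_valid(n: int) -> Fraction:
    """Probability that n random bytes are well-formed under the thread's model."""
    p = [Fraction(1)]  # P(0) = 1: the empty string is valid
    for k in range(1, n + 1):
        total = Fraction(0)
        for length, coeff in enumerate(COEFFS, start=1):
            # Terms with a negative index are skipped, i.e. treated as 0.
            if k - length >= 0:
                total += coeff * p[k - length]
        p.append(total)
    return p[n]

exact = p_valid(32)
print(exact)         # exact ratio, comparable to the Haskell Ratio result
print(float(exact))  # float approximation, comparable to the Python result
```

Using `Fraction` keeps the arithmetic exact, much like Haskell's `Ratio`; converting to `float` only at the end avoids accumulating rounding error across the 32 steps.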

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113692484821774712