Ancestors

Toot

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:06

I wonder what the probability that 32 random bytes happens to be valid UTF-8 is

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Descendants

Written by Raymond on 2024-12-21 at 19:13

@monoidmusician its 1 in UTF-8 Codec Can’t Decode Byte

=> More informations about this toot | More toots from isAdisplayName@mathstodon.xyz

Written by Evvy :neofox_floof: on 2024-12-21 at 19:16

@monoidmusician about 0

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:18

@pierogiburo well it’s at least 1/32, if all the top bits happen to be zero. but i see where you’re coming from :3

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:18

@monoidmusician that's 1/2^32, no?

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:20

@pierogiburo oh wait, yeah you’re right. i knew that last week :neocat_woozy:

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:21

@monoidmusician :neocat_pat_woozy:

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 19:22

@pierogiburo that said, i’ve rolled one before in way less than 2^32 attempts, so it feels like it is a bit more reasonable of a number?

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Evvy :neofox_floof: on 2024-12-21 at 19:28

@monoidmusician hmmm lemme think

say P(n) is the probability that n bytes form a valid utf-8 string. define P(0)=1

so we have 1/2 probability that the highest bit is 0, so we have a sequence of length 1

1/8 probability that the highest bits are 110, times 1/4 probability that the next byte is continuation byte, sequence length 2

1/16 * 1/4 * 1/4 probability of valid sequence length 3

1/32 * 1/4 * 1/4 * 1/4 probability of valid sequence length 4

P(n)=P(n-1)/2 + P(n-2)/32 + P(n-3)/256 + P(n-4)/2048

i'm on my phone rn, so can't check what P(32) evaluates to, but it seems unlikely to be that big

=> More informations about this toot | More toots from pierogiburo@tech.lgbt
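
The recurrence in the toot above can be evaluated exactly with rational arithmetic. The following is only a sketch of that calculation, not necessarily the code either poster ran; the names WEIGHTS and p_valid are made up for illustration, and P(k) is taken to be 0 for negative k. If the recurrence is transcribed correctly, the result for n = 32 should agree with the decimal and fractional values quoted later in the thread.

```python
from fractions import Fraction

# Weights from the thread's recurrence: the chance that the next 1, 2, 3 or 4
# random bytes have the bit pattern of a lead byte plus its continuation bytes.
WEIGHTS = [Fraction(1, 2), Fraction(1, 32), Fraction(1, 256), Fraction(1, 2048)]


def p_valid(n):
    """P(n) from the thread's recurrence, with P(0) = 1 and P(k) = 0 for k < 0."""
    p = [Fraction(1)]  # p[0] is P(0)
    for i in range(1, n + 1):
        total = Fraction(0)
        for length, weight in enumerate(WEIGHTS, start=1):
            if i - length >= 0:
                total += weight * p[i - length]
        p.append(total)
    return p[n]


p32 = p_valid(32)
print(p32)         # exact fraction
print(float(p32))  # floating-point approximation
```

Using Fraction instead of float keeps every step exact, which makes it easy to compare against both the Python and the Haskell Ratio results.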

Written by Evvy :neofox_floof: on 2024-12-21 at 19:34

@monoidmusician 1.3044272060623614e-08 apparently (thanks python on my phone)

=> More informations about this toot | More toots from pierogiburo@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 20:16

@pierogiburo oh that’s great, thank you!

huh, so either i got really really lucky, or this QR scanner isn’t fully validating UTF-8 or something :neocat_think_woozy:

=> More informations about this toot | More toots from monoidmusician@tech.lgbt
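
A side note on the estimate: the recurrence in the thread only looks at lead-byte and continuation-byte bit patterns, so it also counts overlong encodings, surrogates and code points above U+10FFFF, which a strict decoder rejects. A stricter variant can be sketched by counting the code points that each sequence length can legally encode. This is an illustrative sketch, not something from the thread; COUNTS, p_strict and brute_force are hypothetical names, and the brute-force check assumes Python's built-in "utf-8" codec as the reference for strictness.

```python
from fractions import Fraction
from itertools import product

# Number of well-formed UTF-8 sequences of each byte length:
# 1 byte:  U+0000..U+007F
# 2 bytes: U+0080..U+07FF
# 3 bytes: U+0800..U+FFFF, minus the surrogates U+D800..U+DFFF
# 4 bytes: U+10000..U+10FFFF
COUNTS = {
    1: 0x80,
    2: 0x800 - 0x80,
    3: 0x10000 - 0x800 - 0x800,
    4: 0x110000 - 0x10000,
}


def p_strict(n):
    """Probability that n uniformly random bytes are strictly valid UTF-8."""
    p = [Fraction(1)]  # the empty string is valid
    for i in range(1, n + 1):
        total = Fraction(0)
        for length, count in COUNTS.items():
            if length <= i:
                total += Fraction(count, 256 ** length) * p[i - length]
        p.append(total)
    return p[n]


def brute_force(n):
    """Same probability by trying every n-byte string (only feasible for tiny n)."""
    ok = 0
    for combo in product(range(256), repeat=n):
        try:
            bytes(combo).decode("utf-8")
            ok += 1
        except UnicodeDecodeError:
            pass
    return Fraction(ok, 256 ** n)


print(p_strict(2) == brute_force(2))
print(float(p_strict(32)))
```

Every strict weight is at most the corresponding weight in the thread's recurrence, so the strict probability for 32 bytes can only be smaller, which makes a lucky roll against a fully validating decoder even less likely.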

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-21 at 20:34

@pierogiburo more precisely, that would be 4037006666794396657/309485009821345068724781056 :ms_wink_tongue: (thanks Haskell Ratio)

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Fay 🏳️‍🌈 on 2024-12-23 at 01:06

@monoidmusician is this why windows uses utf-16?

=> More informations about this toot | More toots from obfusk@tech.lgbt

Written by Verity :transHaskell:​:verifiedtransfem: on 2024-12-23 at 01:09

@obfusk is that more or less likely? :neocat_upsidedown:

=> More informations about this toot | More toots from monoidmusician@tech.lgbt

Written by Fay 🏳️‍🌈 on 2024-12-23 at 01:12

@monoidmusician easier to get valid utf-16

=> More informations about this toot | More toots from obfusk@tech.lgbt
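
The same style of recurrence gives a rough number for the UTF-16 claim. This sketch assumes a fixed byte order and no byte-order-mark handling, so 32 bytes are read as 16 code units; a unit is either a non-surrogate (valid by itself) or a high surrogate that must be followed by a low surrogate. The names SURROGATE_HALF, BMP_UNIT and p_utf16 are made up for illustration.

```python
from fractions import Fraction

# 0x400 of the 0x10000 possible code units are high surrogates (0xD800..0xDBFF),
# and another 0x400 are low surrogates (0xDC00..0xDFFF).
SURROGATE_HALF = Fraction(0x400, 0x10000)  # 1/64 for each kind
BMP_UNIT = 1 - 2 * SURROGATE_HALF          # 31/32: any non-surrogate unit


def p_utf16(units):
    """Probability that `units` random 16-bit code units form valid UTF-16."""
    p = [Fraction(1)]  # the empty string is valid
    for i in range(1, units + 1):
        total = BMP_UNIT * p[i - 1]  # string ends in a single non-surrogate unit
        if i >= 2:
            # string ends in a high surrogate followed by a low surrogate
            total += SURROGATE_HALF * SURROGATE_HALF * p[i - 2]
        p.append(total)
    return p[units]


print(float(p_utf16(16)))  # 32 random bytes read as 16 code units
```

Since (31/32)^16 is already above one half, random bytes are far more likely to pass as UTF-16 than as UTF-8 under these assumptions.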

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113692375181777288