Ancestors

Toot

Written by Marc B. Reynolds on 2024-10-28 at 08:08

FWIW: That Geoff Langdale question about byte gathers in AVX2 led to @lemire pointing to a gather of 16-bit elements in a 128-bit register in simdjson, which uiCA predicts at less than 4 cycles/iteration (Skylake). Great example of super clever table lookups murdering computation.

https://github.com/simdjson/simdjson/blob/3c0d032dedcc3c87d4ef726a2f7a3c2a26a738b8/include/simdjson/westmere/simd.h#L119

=> More information about this toot | More toots from mbr@mastodon.gamedev.place

Descendants

Written by Marc B. Reynolds on 2024-10-28 at 08:46

@lemire Stripped down version:

https://gcc.godbolt.org/z/41qd4GsM3

Written by Leonard Ritter on 2024-10-28 at 08:47

@mbr @lemire "is this FPGA?"

=> More information about this toot | More toots from lritter@mastodon.gamedev.place

Written by Marc B. Reynolds on 2024-10-28 at 09:08

@lritter @lemire I haven't tried deciphering 'thintable' but a comment by Geoff seems to indicate that it's "the method of four Russians".

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113384023443788672