Ancestors

Written by Jepsen on 2024-11-12 at 14:15

A new #Jepsen report! We worked with Buf to analyze the safety of Bufstream, a Kafka-compatible streaming system. We found three safety and two liveness issues in Bufstream 0.1.0, including the loss of acknowledged writes in healthy clusters. These problems were resolved by version 0.1.3.

https://jepsen.io/analyses/bufstream-0.1.0

=> More information about this toot | More toots from jepsen@jepsen.io

Toot

Written by Jepsen on 2024-11-12 at 14:23

One of the surprising things we found during this collaboration was that the #Kafka transaction protocol assumes messages are delivered in order, but allows those messages to be delivered to different nodes over different TCP connections. This means that a client's commit or abort message could be applied to a completely different transaction, or even in the middle of a transaction.
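
The failure mode described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not Kafka's or Bufstream's actual implementation: `ToyCoordinator`, `run`, `SENT`, and `REORDERED` are hypothetical names, and the model assumes the end-of-transaction message identifies only the producer, so it applies to whatever transaction happens to be open when it arrives.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoordinator:
    """Hypothetical model: one open transaction per producer.

    Assumption: an "end" message carries no transaction identifier,
    so it closes whatever writes are currently open.
    """
    open_writes: list = field(default_factory=list)
    committed: list = field(default_factory=list)  # one entry per committed batch
    aborted: list = field(default_factory=list)    # one entry per aborted batch

    def deliver(self, message):
        kind, payload = message
        if kind == "write":
            self.open_writes.append(payload)
        elif kind == "end":
            outcome = self.committed if payload == "commit" else self.aborted
            outcome.append(self.open_writes)
            self.open_writes = []
        else:
            raise ValueError(f"unknown message kind: {kind}")


def run(messages):
    """Deliver messages in the given order and return the final state."""
    coordinator = ToyCoordinator()
    for message in messages:
        coordinator.deliver(message)
    return coordinator


# The client intends: T1 = {a1, a2}, aborted; T2 = {b1, b2}, committed.
SENT = [
    ("write", "a1"), ("write", "a2"), ("end", "abort"),
    ("write", "b1"), ("write", "b2"), ("end", "commit"),
]

# Same messages, but T1's abort is delayed until after T2's first write.
REORDERED = [
    ("write", "a1"), ("write", "a2"),
    ("write", "b1"), ("end", "abort"),
    ("write", "b2"), ("end", "commit"),
]

print(run(SENT).committed)       # [['b1', 'b2']]
print(run(REORDERED).committed)  # [['b2']]
print(run(REORDERED).aborted)    # [['a1', 'a2', 'b1']]
```

In the reordered run, the abort meant for T1 lands in the middle of T2: `b1` is discarded while `b2` commits, so the second transaction is torn and one of its writes is lost.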

Descendants

Written by Jepsen on 2024-11-12 at 14:26

We reproduced lost writes, aborted reads, and torn transactions in Bufstream and Kafka. We believe every Kafka-compatible system is likely vulnerable to these issues. Some clients may be able to mitigate them, but the official Java client presently does not.

Written by Jepsen on 2024-11-13 at 16:59

Thanks to everyone who wrote in objecting to the report's description of data loss due to auto-commit. Some experiments this morning suggest that we got it wrong (at least for the official Java client). We've published an update to the report: https://jepsen.io/analyses/bufstream-0.1.0#updates

Written by 🦇💩🤪🏳️‍⚧️👧🏽 on 2024-11-12 at 16:33

@jepsen OMG PTSD! That hit me a few years ago. Took forever to figure out what was going on.

=> More information about this toot | More toots from CumVampire@chaosfem.tw

Written by Jepsen on 2024-11-12 at 16:46

@CumVampire Ugh seriously!? I'm not surprised, it took us like a week to sort out too. KIP-890 sort of characterizes the problem, but it's sort of buried--they're talking about hanging transactions when there's actually a data loss risk!

Did you happen to write up your experience?

Written by 🦇💩🤪🏳️‍⚧️👧🏽 on 2024-11-12 at 16:56

@jepsen Oh, no. That would have been smart. That was a POC project I was working on, so it never made daylight.

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113470435374631236