A new #Jepsen report! We worked with Buf to analyze the safety of Bufstream, a Kafka-compatible streaming system. We found three safety and two liveness issues in Bufstream 0.1.0, including the loss of acknowledged writes in healthy clusters. These problems were resolved by version 0.1.3.
https://jepsen.io/analyses/bufstream-0.1.0
=> More informations about this toot | More toots from jepsen@jepsen.io
One of the surprising things we found during this collaboration was that the #Kafka transaction protocol assumes messages are delivered in order, but allows those messages to be delivered to different nodes over different TCP connections. This means that a client's commit or abort message could be applied to a completely different transaction, or even in the middle of a transaction.
=> More informations about this toot | More toots from jepsen@jepsen.io
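To make the reordering hazard described above concrete, here is a minimal toy sketch in Python. It is not Kafka's actual protocol or API: `ToyCoordinator` and `deliver` are hypothetical names, and the model deliberately ignores producer IDs, epochs, sequence numbers, and retries. It assumes only what the toot states: a commit or abort message does not arrive in order with the writes it was meant to end, so it is applied to whatever transaction happens to be open when it arrives.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for a transaction coordinator (not a real Kafka API)."""
    open_writes: list = field(default_factory=list)
    committed: list = field(default_factory=list)
    aborted: list = field(default_factory=list)

    def write(self, value):
        # Writes join whichever transaction is currently open.
        self.open_writes.append(value)

    def end_txn(self, commit):
        # Assumption of this toy model: the end message carries no transaction
        # identity, so it ends whatever is open at the moment it arrives.
        target = self.committed if commit else self.aborted
        target.extend(self.open_writes)
        self.open_writes = []


def deliver(messages):
    """Apply messages in arrival order and return the resulting coordinator."""
    coord = ToyCoordinator()
    for kind, value in messages:
        if kind == "write":
            coord.write(value)
        else:
            coord.end_txn(commit=(kind == "commit"))
    return coord


# What the client intended: abort transaction 1 (a), commit transaction 2 (b, c).
intended = [("write", "a"), ("abort", None),
            ("write", "b"), ("write", "c"), ("commit", None)]

# Same messages, but the abort travels on a slower connection and arrives
# in the middle of transaction 2.
reordered = [("write", "a"), ("write", "b"), ("abort", None),
             ("write", "c"), ("commit", None)]

in_order = deliver(intended)
delayed = deliver(reordered)

print(in_order.committed, in_order.aborted)  # ['b', 'c'] ['a']
print(delayed.committed, delayed.aborted)    # ['c'] ['a', 'b']
```

In the reordered run the client believes it committed both `b` and `c`, but only `c` survives: `b` is lost and the second transaction is torn. This is only an illustration of the failure shape; the report itself covers how the real protocol behaves.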
We reproduced lost writes, aborted reads, and torn transactions in Bufstream and Kafka. We believe every Kafka-compatible system is likely vulnerable to these issues. Some clients may be able to mitigate them, but the official Java client presently does not.
=> More informations about this toot | More toots from jepsen@jepsen.io
Thanks to everyone who wrote in objecting to the report's description of data loss due to auto-commit. Some experiments this morning suggest that we got it wrong (at least for the official Java client). We've published an update to the report: https://jepsen.io/analyses/bufstream-0.1.0#updates
=> More informations about this toot | More toots from jepsen@jepsen.io
@jepsen OMG PTSD! That hit me a few years ago. Took forever to figure out what was going on.
=> More informations about this toot | More toots from CumVampire@chaosfem.tw
@CumVampire Ugh seriously!? I'm not surprised, it took us like a week to sort out too. KIP-890 sort of characterizes the problem, but it's sort of buried--they're talking about hanging transactions when there's actually a data loss risk!
Did you happen to write up your experience?
=> More informations about this toot | More toots from jepsen@jepsen.io
@jepsen Oh, no. That would have been smart. That was a POC project I was working on, so it never made daylight.
=> More informations about this toot | More toots from CumVampire@chaosfem.tw