Release cluster state bytes earlier in PublicationTransportHandler #127054

original-brownbear · 2025-04-18T13:15:18Z

The cluster state publication can be quite large, especially when a node initially joins a cluster. By forking in the handler itself instead of leaving the forking to the transport layer, we can release the network buffer before actually processing the state. This can save up to O(100M) of heap in some situations and improves stability quite a bit when small nodes join clusters with large existing states.
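A minimal sketch of the idea, not the code from this PR: the handler takes its own reference on the incoming bytes, forks, deserializes on the forked thread, and drops its reference before the (potentially slow) state application starts. `PooledBytes`, `PublicationSketch`, and the `applier` function are hypothetical stand-ins for illustration, not Elasticsearch or Netty types.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical ref-counted buffer standing in for the pooled network bytes.
final class PooledBytes {
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile byte[] data;

    PooledBytes(byte[] data) {
        this.data = data;
    }

    byte[] data() {
        if (refs.get() <= 0) throw new IllegalStateException("already released");
        return data;
    }

    void incRef() {
        if (refs.getAndIncrement() <= 0) throw new IllegalStateException("already released");
    }

    void decRef() {
        int remaining = refs.decrementAndGet();
        if (remaining == 0) {
            data = null; // stands in for returning the buffer to the pool
        } else if (remaining < 0) {
            throw new IllegalStateException("released too often");
        }
    }

    boolean isReleased() {
        return refs.get() <= 0;
    }
}

final class PublicationSketch {
    // Called on the (hypothetical) transport thread, which drops its own
    // reference as soon as this method returns.
    static CompletableFuture<Integer> handle(PooledBytes request, Executor executor, Function<String, Integer> applier) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        request.incRef(); // keep the bytes alive until the forked task has read them
        try {
            executor.execute(() -> {
                try {
                    final String state;
                    try {
                        // "Deserialize": copy what we need out of the network buffer.
                        state = new String(request.data(), StandardCharsets.UTF_8);
                    } finally {
                        request.decRef(); // bytes are no longer needed from here on
                    }
                    // The expensive part runs without holding on to the buffer.
                    result.complete(applier.apply(state));
                } catch (Throwable t) {
                    result.completeExceptionally(t);
                }
            });
        } catch (RuntimeException e) {
            request.decRef(); // e.g. rejected execution: the task will never run
            throw e;
        }
        return result;
    }
}
```

If the transport layer did the forking instead, it would typically hold its reference until the handler finished, so the buffer would stay on heap for the whole duration of the state application.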

PS: I admit this looks a little hacky, but IMO we can separately create a nicer primitive for ensuring a single dec-ref under all failure conditions. We have a couple of similar-looking spots that could benefit from a cleanup as well.
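One possible shape for such a primitive, purely as an assumption about what it could look like: a wrapper that runs a release action at most once, no matter how many success or failure paths call it. `ReleaseOnce` is a hypothetical name, not an existing class in this PR.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: guarantees the wrapped release action runs at most once.
final class ReleaseOnce implements Runnable {
    private final AtomicBoolean released = new AtomicBoolean();
    private final Runnable releaseAction;

    ReleaseOnce(Runnable releaseAction) {
        this.releaseAction = releaseAction;
    }

    @Override
    public void run() {
        // Only the first caller wins the CAS; later calls are no-ops.
        if (released.compareAndSet(false, true)) {
            releaseAction.run();
        }
    }
}
```

Every code path (normal completion, exception, rejected execution) can then call `run()` unconditionally without risking a double dec-ref.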

See the following screenshot: with the recent transport layer fixes, this now does indeed free the actual underlying Netty buffer.

elasticsearchmachine · 2025-04-18T13:15:42Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

original-brownbear · 2025-04-18T15:49:30Z

Hmm, fixing that one failing test (testIncludesLastCommittedFieldsInDiffSerialization) is a little tricky. We play some tricks there with the way the transport request is handled, and they break once we move the forking into the handler itself. Not entirely sure yet how to fix this while preserving what we are actually trying to test. Any tips are welcome, but hopefully I'll be able to figure this out eventually :D

original-brownbear added the >non-issue, :Distributed Coordination/Cluster Coordination, v8.19.0, and v9.1.0 labels · Apr 18, 2025

elasticsearchmachine added the Team:Distributed Coordination label · Apr 18, 2025
