Skip to content

Centralize the concept of processors configuration #89662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Sep 6, 2022

Conversation

fcofdez
Copy link
Contributor

@fcofdez fcofdez commented Aug 26, 2022

This commit centralizes the processor count concept into the Processors class.
With this change now all the places using a processor count rely on this new class,
such as desired nodes, node.processors setting and autoscaling deciders.

  • Processor counts are rounded to up to 5 decimal places
  • Processors can be represented as doubles

Desired nodes processors were stored as floats, this poses some challenges during
upgrades as once the value is casted to a double, the precision increases and therefore
the number is not the same. In order to allow idempotent desired nodes updates after
upgrades, this commit introduces DesiredNode#equalsWithProcessorsCloseTo(DesiredNode that, double maxError)
which allows comparing two desired nodes that differ up to maxError in their processor
specification as floats.

@fcofdez fcofdez force-pushed the processors-precision branch 6 times, most recently from a047e87 to e5d90ae Compare August 29, 2022 22:44
@fcofdez fcofdez force-pushed the processors-precision branch from e5d90ae to 46ebfc1 Compare August 29, 2022 23:43
@fcofdez
Copy link
Contributor Author

fcofdez commented Aug 30, 2022

@elasticmachine run elasticsearch-ci/bwc

@fcofdez fcofdez added >enhancement :Distributed Coordination/Autoscaling Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. labels Aug 30, 2022
@fcofdez fcofdez marked this pull request as ready for review August 30, 2022 07:45
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine
Copy link
Collaborator

Hi @fcofdez, I've created a changelog YAML for you.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the production code and left comments.


@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
return builder.value(count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this produce more than 5 decimals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count is already truncated to 5 decimal places in the constructor. Additionally, there's a test (ProcessorsTests#testTruncatesAfterFiveDecimalPlaces) that checks that the string representation has up to 5 decimal places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to read Double.toString. IIUC, we would get scientific notation for numbers less than 1e-3? Maybe not a problem though, I suppose it is json compatible.

I also worry that there is some number, where even when rounded to 5 decimal places we need more decimals to represent it uniquely. I do not have an example, but did you run the test thousands of times to check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to read Double.toString. IIUC, we would get scientific notation for numbers less than 1e-3? Maybe not a problem though, I suppose it is json compatible.

According to RFC-8259 the scientific notation should be supported by any standard parser.

I also worry that there is some number, where even when rounded to 5 decimal places we need more decimals to represent it uniquely. I do not have an example, but did you run the test thousands of times to check?

I improved the test and I've run it 10K times.

}

private static boolean validNumberOfProcessors(double processors) {
return Double.isFinite(processors) && processors >= 0.0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to support 0.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, now seeing this is indeed needed for autoscaling. Sad, I wish we could find another way there, but that is for another day.

But can we ensure that only the ZERO constant can create a 0.0 value by having a special private constructor?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove 0.0 as a valid number now?

Suggested change
return Double.isFinite(processors) && processors >= 0.0;
return Double.isFinite(processors) && processors > 0.0;

@@ -39,6 +39,7 @@

public class TransportUpdateDesiredNodesAction extends TransportMasterNodeAction<UpdateDesiredNodesRequest, UpdateDesiredNodesResponse> {
private static final Logger logger = LogManager.getLogger(TransportUpdateDesiredNodesAction.class);
private static final double MAX_DELTA_PROCESSORS = 7E-5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a magic constant, how did you come to this? Would this also work for 128 processors for instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run quite a few experiments in DesiredNodeTests#testFloatProcessorsConvertedToDoubleAreCloseToEqual starting from 1E-5 but that failed quite a few times. I increased it by 1E-5 at a time until the test stopped failing. I agree that's not super scientific method.

Comment on lines 347 to 349
float floatCount = (float) a.count();
float otherFloatCount = (float) b.count();
return Float.isFinite(floatCount) && Float.isFinite(otherFloatCount) && (Math.abs(floatCount - otherFloatCount) < maxError);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead use Math.max(Math.ulp(floatCount), Math.ulp(otherFloatCount)) + MIN_REPRESENTABLE_PROCESSORS as the error bound? Or has the number been rounded during x-content too?

I also wonder if we can be slightly more precise since we could know that the new number originates from a double? Might not be worth it though...


private final long refreshInterval;
private final int availableProcessors;
private final int allocatedProcessors;
private final Processors allocatedProcessors;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this changes our nodes x-content output to have double processors. I wonder if we need to consider that a breaking change, since it could break downstream clients?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll serialize the rounded allocated processors in x-content. I wonder if we want to serialize the fractional value in an additional field.

@@ -92,7 +92,8 @@ private void validate(DesiredNode node) {
int minProcessors = node.roundedDownMinProcessors();
Integer roundedUpMaxProcessors = node.roundedUpMaxProcessors();
int maxProcessors = roundedUpMaxProcessors == null ? minProcessors : roundedUpMaxProcessors;
Setting.intSetting(NODE_PROCESSORS_SETTING.getKey(), minProcessors, 1, maxProcessors, Setting.Property.NodeScope).get(settings);
Setting.doubleSetting(NODE_PROCESSORS_SETTING.getKey(), minProcessors, 1, maxProcessors, Setting.Property.NodeScope)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we lower the minValue to Double.MIN_VALUE or Processors.MIN_REPRESENTABLE_PROCESSORS?

private static int roundDown(float value) {
return Math.max(1, (int) Math.floor(value));
private static boolean invalidNumberOfProcessors(Processors processors) {
return processors != null && processors.count() <= 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can only be true for processors.count() == 0 now, but is that even a legal case for Processors? Given that we now validate Processors separately perhaps this method is no longer needed?

}

@Nullable
public static Processors of(Double count) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should move all Processors construction to use an of method, just to simplify and avoid mistakes? That of method could hold the validation too then, helping on the ZERO part I commented on in another comment (though a minor point).

@@ -56,7 +56,7 @@ public AutoscalingDeciderResult scale(Settings configuration, AutoscalingDecider
builder.total(MINIMUM_FROZEN_STORAGE, MINIMUM_FROZEN_MEMORY, null);
builder.node(MINIMUM_FROZEN_STORAGE, MINIMUM_FROZEN_MEMORY, null);
} else {
builder.total(0L, 0L, 0f);
builder.total(ByteSizeValue.ZERO, ByteSizeValue.ZERO, null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use Processors.ZERO instead of null?

@@ -153,6 +154,10 @@ public static AutoscalingPolicy randomAutoscalingPolicyOfName(final String name)
return new AutoscalingPolicy(name, randomRoles(), randomAutoscalingDeciders());
}

public static Processors randomProcessors() {
return Processors.of(randomInt(128) + randomDouble());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use randomDoubleBetween?

@fcofdez
Copy link
Contributor Author

fcofdez commented Sep 6, 2022

@henningandersen could you review this when you have the chance? Thanks!

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

}

private static boolean validNumberOfProcessors(double processors) {
return Double.isFinite(processors) && processors >= 0.0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove 0.0 as a valid number now?

Suggested change
return Double.isFinite(processors) && processors >= 0.0;
return Double.isFinite(processors) && processors > 0.0;

}

private double randomNumberOfProcessors() {
return randomDoubleBetween(Math.ulp(0.0), 512.99999999, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Math.ulp(0.0) not the same as Double.MIN_VALUE? I'd prefer to use the latter then.

@@ -184,4 +184,8 @@ public static DiscoveryNode newDiscoveryNode(String nodeName) {
Version.CURRENT
);
}

public static double randomNumberOfProcessors() {
return randomDoubleBetween(Math.ulp(0.0), 512.99999999, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also use Double.MIN_VALUE here?


@Override
protected Processors createTestInstance() {
return Processors.of(randomDoubleBetween(Math.ulp(0.0), 512.99999999, true));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also use Double.MIN_VALUE here?

}

public void testRounding() {
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add 1-2 more cases:

  1. A number higher than 1.something, for instance 10.1.
  2. A number that is already an integer, for instance 10.

import static org.hamcrest.Matchers.lessThanOrEqualTo;

public class ProcessorsTests extends ESTestCase {
public void testTruncatesAfterFiveDecimalPlaces() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness, can we add a test that Processors.of(value).count() == value?

@fcofdez fcofdez merged commit 284dce6 into elastic:main Sep 6, 2022
@fcofdez
Copy link
Contributor Author

fcofdez commented Sep 6, 2022

Thanks Henning!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Autoscaling >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants