Skip to content

[ACP] RangeBounds::overlaps #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
theli-ua opened this issue Oct 12, 2023 · 15 comments
Open

[ACP] RangeBounds::overlaps #277

theli-ua opened this issue Oct 12, 2023 · 15 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@theli-ua
Copy link

theli-ua commented Oct 12, 2023

Proposal

Problem statement

RangeBounds today is a standard trait to represent ranges and provides a single method contains used to determine if a range contains a given element.
Another common operation on ranges (like intervals) is to check if 2 overlap. Today there is no method to check for that

Motivating examples or use cases

  • a database implementation using btree index where a parent node would split its total key range in ranges per child. To implement an operation to find keys within a given range it would need to check if child ranges overlap with an input range
  • An interval map implementation would need to be able to check for overlaps
  • A memory copy/move implementation operation may check for overlaps to decide if it needs to use move or copy
  • A database implementation allowing to operate on ranges (eg range deletes, range queries, etc) would need to have an ability to check for range overlaps

Solution sketch

pub trait RangeBounds<T: ?Sized> {

    /// Returns `true` if there exists an element present in both ranges.
    ///
    /// # Examples
    ///
    /// ```
    /// assert!( (3..5).overlaps(&(1..4));
    /// assert!(!(3..5).overlaps(&(5..6));
    /// ```
    ///
    fn overlaps<O, E>(&self, other: &O) -> bool
    where
        T: PartialOrd<E>,
        E: ?Sized + PartialOrd<T>,
        O: RangeBounds<E>,
    {
          todo!()
    }
}


Open questions

There is a question whether to consider (Excluded(1), Excluded(3)) as overlapping with (Excluded(2), Excluded(5)).
Since this is only a problem for cases where a discrete step between elements is known we could have a specialized implementation for anything that implements Step that handles these cases.

Alternatives

Alternative would be for developers to implement it on specific types, or an extension trait on top of RangeBounds (which could be a separate crate)

Links and related work

@theli-ua theli-ua added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 12, 2023
@pitaj
Copy link

pitaj commented Oct 12, 2023

In most cases, don't people want to perform some sort of operation on the overlapping region? Could the function return a range instead? Then you could use Range::is_empty to just check if there is overlap.

@theli-ua
Copy link
Author

@pitaj well, we'd probably need to add is_empty to RangeBounds in that case.

@scottmcm
Copy link
Member

Hmm, intersect probably wants different return types depending on the input ranges, not just an impl RangeBounds.

Is it reasonable to mix the types? I guess Range & RangeInclusive doesn't know the type to return, so maybe not?

@theli-ua
Copy link
Author

Yeah, given that this is a trait and in case of intersect/overlap we would need to return a concrete type I'd say for the trait specifically it makes sense to have overlaps that returns a boolean and would be similar to currently provided method - contains

@quaternic
Copy link

quaternic commented Oct 17, 2023

  fn overlaps<O, E>(&self, other: &O) -> bool
  where
      T: PartialOrd<E>,
      E: ?Sized + PartialOrd<T>,
      O: RangeBounds<E>,

Functions like overlaps or intersect should require T: Ord, because if the upper (or lower) bounds of the two sets are not mutually comparable, you can't necessarily compute the intersection as a range. It might not even be representable as a range at all.

Example: Nonnegative integers can be partially ordered by divisibility, so that e.g. D(2)..=D(30) contains D(x) iff x is even and divides 30. Then the intersection of the ranges D(2)..=D(30) and D(3)..=D(12) is D(6)..=D(6), but computing that requires more than what PartialOrd provides.

@theli-ua
Copy link
Author

@quaternic that makes sense so it'd be

pub trait RangeBounds<T: ?Sized> {

    /// Returns `true` if there exists an element present in both ranges.
    ///
    /// # Examples
    ///
    /// ```
    /// assert!( (3..5).overlaps(&(1..4));
    /// assert!(!(3..5).overlaps(&(5..6));
    /// ```
    ///
    fn overlaps(&self, other: &T) -> bool
    where
        T: Ord,
    {
          todo!()
    }
}

@pitaj
Copy link

pitaj commented Dec 25, 2023

Hmm, intersect probably wants different return types depending on the input ranges, not just an impl RangeBounds.

Is it reasonable to mix the types? I guess Range & RangeInclusive doesn't know the type to return, so maybe not?

Yeah, given that this is a trait and in case of intersect/overlap we would need to return a concrete type I'd say for the trait specifically it makes sense to have overlaps that returns a boolean

The function could just return (Bound, Bound) rather than a range type to avoid this issue. It would be good to have is_empty on RangeBounds though.

Should it maybe return Option<(Bound, Bound)> with None for the "no overlap" case, or should it return a specific empty range?

Like what should (2..6).intersect(11..13) return? (Bound::Excluded(11), Bound::Excluded(6))?

@scottmcm
Copy link
Member

Should it maybe return Option<(Bound, Bound)> with None for the "no overlap" case, or should it return a specific empty range?

For the concrete types case, I think the right behaviour would be Range::intersect(a..b, c..d) => max(a, c)..min(b, d). And thus (2..6).intersect(11..13) would give the range 11..6 -- on which people can call Range::is_empty if they want, just like they would on any other Range they get without knowing a priori whether it's empty.

I'm not exactly sure how it should work for other bound combinations, though. Can we usefully define min and max on Bound, maybe? Certainly things like min(Unbounded, Unbounded) => Unbounded and min(Unbounded, Included(10)) => Included(10) are easy...

@quaternic
Copy link

Bound on its own doesn't have a meaningful (unique) ordering, since it depends on whether it is used as the upper or lower bound:
As a lower bound, Unbounded < ... < Included(1) < Excluded(1) < Included(2) < Excluded(2) < ...
As an upper bound, ... < Excluded(1) < Included(1) < Excluded(2) < Included(2) < ... < Unbounded

But taking that into account by defining max_lower and min_upper, I agree that

the right behaviour would be Range::intersect(a..b, c..d) => max_lower(a, c)..min_upper(b, d)

where it turns out that both max_lower(a,c) and min_upper(b,d) have semantics of "choose according to the respective operator on the inner values, prefer Excluded when inner values are equal, and only Unbounded if both are".

@scottmcm
Copy link
Member

Bound on its own doesn't have a meaningful (unique) ordering, since it depends on whether it is used as the upper or lower bound

That just means it's not Ord, though, not that it can't define a min/max. After all, f32::min has the analogous "and only NAN if both are" behaviour.

Said otherwise, would a hypothetical min_lower be any different from min_upper?

@quaternic
Copy link

Yes, the minimum of two lower bounds (or the maximum of two upper bounds) would prefer Included over Excluded on equal inner values. These would similarly be used for the operation of computing the least range that contains both input ranges. E.g. (2..13).join(11..13) == 2..13 and (2..13).join(11..=13) == 2..=13. (That one is probably harder to justify for the standard range types than intersect, but comes up at least for bounding boxes.)

We won't be changing it now (or ever), but for the sake of discussion one could consider an alternative bound-type defined as

enum Bound<T> {
    BeforeAny,
    Before(T),
    After(T),
    AfterAny,
}

This wouldn't suffer from the same ambiguity, as you'd just define BeforeAny < ... < Before(0) < 0 < After(0) < Before(1) < 1 < After(1) < ... AfterAny

And the ranges would be like

  • 0..=5 -> (Before(0), After(5))
  • 0..5 -> (Before(0), Before(5))
  • 0.. -> (Before(0), AfterAny)
  • .. -> (BeforeAny, AfterAny)

@pitaj
Copy link

pitaj commented Feb 11, 2025

I've implemented clamp (intersection) in this playground along with is_empty and into_bounds. It has to work with the bounds by value, otherwise there's ambiguity (multiple applicable impls) as seen here

@kennytm
Copy link
Member

kennytm commented Feb 13, 2025

@pitaj thanks for the hard work

But if you need to go through that LessNonAdjacent mess to get (Excluded(1_u8), Excluded(2_u8)) return is-empty, I have doubt why ..0_u8 ((Unbounded, Excluded(0_u8))) remains treated as not-empty.

@quaternic
Copy link

Indeed.(Exluded(1), Excluded(2)) should be considered non-empty as a range. Sure, there isn't an x: u8 that would satisfy 1_u8 < x < 2_u8, but conceptually the range represents that predicate rather than any concrete set of values.

In fact, the contains-method allows for any type U: PartialOrd<T>, so you can implement another type that does have values in that range: playground

[src/main.rs:39:5] (Excluded(1), Excluded(2)).contains(&IntPlusHalf(1)) = true

On the other hand, when a >= b, the emptiness of a..b is implied by the required properties of PartialOrd.

@pitaj
Copy link

pitaj commented Feb 13, 2025

Great point! I've removed that and now (Excluded(1), Excluded(2)) is considered to be non-empty.

playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

5 participants