内容简介:Two years ago I had the luck to participate toSome time ago I started working on a little Haskell pet project to compute the common time availability of several people (something likeA bit after that
Two years ago I had the luck to participate to DDDEurope and listen to the nice keynote by Eric Evans on Modelling Time . I really liked the idea of modelling time not necessarily as instants but more as intervals, and making explicit the possible relations which two intervals can have. These relations are described by the so called Allen’s interval algebra .
Some time ago I started working on a little Haskell pet project to compute the common time availability of several people (something like doodle.com but where everyone first declares his/her availability and only then a time is chosen from the intersection of the availabilities) and the key concept of the domain were exactly intervals.
A bit after that Taylor Fausak released a Haskell package called Rampart which implements exactly the relations of Allen’s interval algebra.
Finding myself with a little of free time, I didn’t resist the temptation of trying to link my own library, which was dealing with intervals, to Rampart, so that I could consider relations between intervals.
While I was doing that, I realised that interpreting Allen’s interval algebra relations without ambiguities was not trivial (see for example this Rampart issue) and in the end I decided to reimplement it in a more general and unambiguous way.
INTERVALS
Loosely speaking we can define an interval as the set of values included between two boundaries. Trying to model this idea using types, we could sketch something like
data Boundary a = Boundary { _value :: a } data Interval a = Interval { _start :: Boundary a, _end :: Boundary a }
We would like to ensure that _start
is always smaller than _end
. To this end we hide the Interval
data constructor and we expose a function to build an interval from two boundaries
instance Ord a => Ord (Boundary a) boundsInterval :: Ord a => Boundary a -> Boundary a -> Interval
At this point we have to make our first design decision. What should boundsInterval
return if the first boundary if bigger that the second one? One option could be returning Maybe Interval
to notify the caller that he provided invalid values. Another option is to refine a bit the vague definition of interval we gave above, declaring an interval to be the set of values which are greater than _start
and smaller than _end
. Doing so, it becomes clear that, if _start
is greater than _end
, the interval delimited by _start
and _end
should simply be empty. Currently though we don’t have an explicit way to represent the empty interval, so let’s add it
data Interval a = Empty | Interval { _start :: Boundary a, _end :: Boundary a }
RELATIONS
Now that we defined what an interval could be, let’s try to understand how two intervals could relate to each other. We can list all the 13 possible relations in the following table
Relation | Illustration | Interpretation |
---|---|---|
X takes place before Y | ||
X meets Y ( i stands for i nverse ) | ||
X overlaps with Y | ||
X starts Y | ||
X during Y | ||
X finishes Y | ||
X is equal to Y |
and define the following Haskell data type to enumerate all the possibilities
data Relation = Equal | Starts | Finishes | During | StartedBy | FinishedBy | Contains | Before | After | Meets | IsMet | Overlaps | OverlappedBy
We would like now to define a function
relate :: Ord a => Interval a -> Interval a -> Relation
to compute how two given intervals are related.
SINGLETONS
When we try concretely to implement relate
we realise that there are some situations which are a bit ambiguous and they all pertain the case where one of the two intervals is a singleton, i.e. the starting bound and the ending bound coincide.
Let’s consider for example the relation between two intervals a = boundsInterval x x
and b = boundsInterval x y
. Should it be that a
meets b
since they have a boundary in common and b
happens all after a
? Or should it be that a
starts b
? Or maybe a
overlaps b
? It is not immediately clear what the correct choice should be.
The Allen’s paper notices the issue and dismisses it considering only intervals with a start smaller than the end.
One possible solution comes from a more precise inspection of the domain and using a bit of maths to abstract the relevant information out of our particular case.
BOUNDARIES
The first ingredient to solve our issue with relate
is to be more precise in the description of our domain model.
Above, when we gave the definition of interval, we never really specified the semantic of the boundaries. When we write Interval x y
do we mean that x
and y
are actually contained in the interval, or do we mean that they are excluded? There is no correct choice here, so to remain agnostic of this detail, we provide the possibility to specify explicitly in a boundary whether it should be included or excluded
data IsIncluded = Included | Excluded data Boundary a = Boundary { _boundaryValue :: a , _isIncluded :: IsIncluded }
Now we can be very explicit when we define our intervals
a = boundsInterval (Boundary 1 Included) (Boundary 3 Excluded) -- 1 ≤ x < 3 b = boundsInterval (Boundary 2 Included) (Boundary 2 Included) -- 2 ≤ x ≤ 2 => b = {2} c = boundsInterval (Boundary 0 Excluded) (Boundary 0 Included) -- 0 < x ≤ 0 => c is empty
RELATIONS RELOADED
Now that we introduced both included and excluded boundaries, the 13 cases in the Relation
data type reveal themselves as being not that well defined. For example, if we consider the two intervals
a = boundsInterval (Boundary 1 Excluded) (Boundary 2 Included) b = boundsInterval (Boundary 2 Included) (Boundary 3 Excluded)
what should relate a b
be? Should it be Meets
or should it be Overlaps
? We clearly need a more precise definition to disambiguate such cases.
To do this, we need to consider the operations that we have available on intervals. Considering intervals as sets, we can derive for them operations to compute the intersection and the union of two intervals
intersection :: Ord a => Interval a -> Interval a -> Interval a union :: Ord a => Interval a -> Interval a -> Either (Interval a) (Interval a, Interval a)
Notice how we are making explicit that the union of two intervals could be either one or two separate intervals.
Intersection and union induce a partial order ≤
on intervals defined by
-- a is contained in b a ≤ b = a `union` b == b = a `intersection` b == a
Using this partial order, we can identify four cases, when considering two intervals a
and b
:
-
a ≤ b
andb ≤ a
: this, by antisymmetry, means thata == b
; hence, it is clear that is must berelate a b = Equal
-
a ≤ b
andb ≰ a
: this means thata
is properly contained inb
. We can further dissect this possibility checking if the boundaries are equal:-
start a == start b
: in this caserelate a b = Starts
-
end a == end b
: in this caserelate a b = Finishes
-
start a != start b
andend a != end b
: in this caserelate a b = During
-
-
a ≰ b
andb ≤ a
: this is the symmetric case with respect to the previous one. Dissecting the possibilities as we did above we obtain theStartedBy
,FinishedBy
andContains
cases -
a ≰ b
andb ≰ a
: this means that neithera
is contained inb
, norb
is contained ina
The last case is still too broad, so we have to analyse it more precisely. We can check the intersection of a
and b
to split this further:
-
a `intersection` b != empty
: this means that the two intervals have some value in common, hence they overlap. Depending on the ordering ofstart a
andstart b
, we would get eitherrelate a b = Overlaps
orrelate a b = OverlappedBy
-
a `intersection` b == empty
: no value is in both intervals. At this point we inspect better the union ofa
andb
:-
a `union` b
is two separate intervals: in this case the two intervals are really disjoint, meaning that there exists a value which is bigger that one and smaller than the other. Depending on the order ofstart a
andstart b
we will get eitherrelate a b = Before
orrelate a b = After
-
a `union` b
is one single interval: the two intervals are disjoint, but there is not value between them, so they must be one right after the other. Depending on the order ofstart a
andstart b
we will get eitherrelate a b = Meets
orrelate a b = IsMet
-
This gives a precise algorithm to determine the relation occurring between any two intervals. We can see now that we have a precise way to distinguish what happens in cases where two intervals have a boundary with a common value
relate (boundsInterval (Boundary 0 Included) (Boundary 1 Excluded)) (boundsInterval (Boundary 1 Excluded) (Boundary 2 Included)) = Before -- not Meets relate (boundsInterval (Boundary 0 Included) (Boundary 1 Included)) (boundsInterval (Boundary 1 Excluded) (Boundary 2 Included)) = Meets relate (boundsInterval (Boundary 0 Included) (Boundary 1 Included)) (boundsInterval (Boundary 1 Included) (Boundary 2 Included)) = Overlaps -- not Meets
It disambiguates also the issues with singleton intervals
relate (boundsInterval (Boundary 0 Included) (Boundary 0 Included)) (boundsInterval (Boundary 0 Included) (Boundary 0 Included)) = Equals relate (boundsInterval (Boundary 0 Included) (Boundary 0 Included)) (boundsInterval (Boundary 0 Included) (Boundary 2 Excluded)) = Starts relate (boundsInterval (Boundary 1 Included) (Boundary 1 Included)) (boundsInterval (Boundary 0 Included) (Boundary 2 Included)) = During
CONCLUSION
We took a look at intervals and how they can relate with each other. We noticed how a naive model of the domain leads to subtle issues and ambiguities. With a better model, distinguishing between included and excluded interval boundaries, and using their algebraic operations on intervals as much as possible, we were able to disambiguate the problematic cases and recover a clear view of the situation.
In general I believe that trying to model a domain using mathematical abstractions is a very productive choice. It usually leads to more general and elegant solutions because it forces us to understand our domain to isolate the relevant aspects and forget about the noise given by context-over-specific information.
If you want to check out a working implementation of what we discussed above, you can find it in my availer project.
以上所述就是小编给大家介绍的《Intervals and their relations》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!
猜你喜欢:本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
数据结构与问题求解
韦斯 / 清华大学出版社 / 2011-8 / 89.50元
《数据结构与问题求解(Java语言版)(第4版)》是专为计算机科学专业的两个学期课程而设计的,从介绍什么足数据结构开始,继而对高级数据结构与算法进行分析。《数据结构与问题求解(Java语言版)(第4版)》以独特的方式,清晰地将每种数据结构的接口与其实现分离开来,即将如何使用数据结构与如何对数据结构编程相分离。《数据结构与问题求解(Java语言版)(第4版)》从抽象思维和问题求解的角度出发,为数据结......一起来看看 《数据结构与问题求解》 这本书的介绍吧!