Another look at halfspace depth: flag halfspaces with applications

The halfspace depth is a well-studied tool of nonparametric statistics in multivariate spaces. We introduce a flag halfspace – an intermediary between a closed halfspace and its interior – and demonstrate that the halfspace depth can be equivalently formulated also in terms of flag halfspaces. Flag halfspaces allow us to derive theoretical results regarding the halfspace depth without the need to differentiate absolutely continuous measures from measures containing atoms, as was frequently done previously. Flag halfspaces are used to state results on the dimensionality of the halfspace median set for random samples. We prove that under mild conditions, the dimension of the sample halfspace median set of d-variate data cannot be d−1 and that for d = 2, the sample halfspace median set must be either a two-dimensional convex polygon or a data point.


Introduction: Halfspace depth and its median
Denote by $\mathcal{M}(\mathbb{R}^d)$ the set of all finite Borel measures on the Euclidean space $\mathbb{R}^d$. The halfspace (or Tukey) depth of $x \in \mathbb{R}^d$ with respect to (w.r.t.) $\mu \in \mathcal{M}(\mathbb{R}^d)$ is defined as
(1) $D(x;\mu) = \inf\{\mu(H) : H \in \mathcal{H}(x)\}$,
where $\mathcal{H}(x)$ is the collection of closed halfspaces in $\mathbb{R}^d$ that contain $x$ on their boundary. The halfspace depth quantifies the centrality of $x$ w.r.t. the mass of $\mu$. This is useful in nonparametric statistics, as it allows us to rank sample points according to their depth, from the central to the peripheral ones. As such, the depth enables the introduction of rankings, orderings, and quantile-like inference to multivariate datasets [3,20,21].
The upper level sets of the halfspace depth of $\mu$, given for $\alpha \ge 0$ by
(2) $D_\alpha(\mu) = \{x \in \mathbb{R}^d : D(x;\mu) \ge \alpha\}$,
play in nonparametric statistics the role of the inner quantile regions of $\mu$. They are often called the (halfspace) central regions of $\mu$. The sets (2) are nested, closed and convex; they are compact for $\alpha > 0$, and non-empty for $\alpha \le \alpha^*(\mu)$, where $\alpha^*(\mu) = \sup_{x \in \mathbb{R}^d} D(x;\mu)$ is the maximum halfspace depth of $\mu$. Of special importance is the set $D^*(\mu) = D_{\alpha^*(\mu)}(\mu)$, which contains the points that are the most centrally positioned w.r.t. $\mu$. It is called the set of the halfspace medians of $\mu$ and, as its name suggests, it generalises the median to $\mathbb{R}^d$.
The halfspace depth has many applications in multivariate statistics, and has been a subject of active research for 30 years [11,12,13,15,16]. Although many other statistical depth functions have been developed [2,14,21], in this paper we focus on the halfspace depth, and sometimes write simply depth instead of halfspace depth.
We call a measure $\mu \in \mathcal{M}(\mathbb{R}^d)$ smooth if the $\mu$-mass of every hyperplane in $\mathbb{R}^d$ is zero. A measure with a density is smooth; examples of non-smooth measures are those with an atom. The infimum in (1) is attained for smooth measures. That is why theoretical results on the halfspace depth are often formulated only for smooth measures, and why the analysis of the sample halfspace depth (that is, the halfspace depth evaluated w.r.t. empirical measures of random samples) is performed using different techniques [10,12,13].
In this paper we introduce flag halfspaces, symmetrised variants of closed halfspaces that may be considered in (1) instead of $\mathcal{H}(x)$ without altering the depth, with the property that a flag halfspace attaining the depth always exists. We will see that our restatement of formula (1) simplifies many theoretical derivations about the halfspace depth, as it is no longer necessary to distinguish whether the infimum in (1) is attained.
Flag halfspaces are introduced in Section 2. Two applications to the computation of the depth are given in Section 3. In Section 3.1, we investigate the dimensionality and the structure of the median set $D^*(\mu)$ for $\mu$ an empirical measure. We show that for datasets sampled from absolutely continuous probability measures in $\mathbb{R}^d$, the halfspace median set cannot be of dimension $d - 1$, almost surely. In a series of examples in $\mathbb{R}^3$ we demonstrate that already for random samples of size $n = 8$ from the standard Gaussian distribution, halfspace median sets of dimensions 0, 1, and 3 occur with positive probability. In Section 3.2 we deal with the special situation of data of dimension $d = 2$. We show that if the dataset satisfies a mild condition of general position, then the halfspace median set must be either a full-dimensional polygon, or a data point. Both these advances find applications in the computation of the halfspace median and the central regions (2), where the dimensionality of $D^*(\mu)$ plays a crucial role [6,11]. The paper is complemented by online Supplementary Material containing R and Mathematica scripts with visualisations and computations completing the examples from Section 3.
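Definition (1) can be evaluated directly when $\mu$ is an empirical measure in the plane. The following Python sketch is only an illustration and is not taken from the Supplementary Material; the function name `halfspace_depth_2d` and the tolerance `tol` are assumptions made here. It uses the fact that, for finitely many atoms, the number of atoms in a closed halfplane $H_{x,v}$ changes only when some atom crosses the boundary line, so it suffices to test one inner normal inside each arc between consecutive critical directions.

```python
import math

def halfspace_depth_2d(x, points, tol=1e-9):
    """Halfspace depth (1) of x w.r.t. the empirical measure of `points` in R^2."""
    n = len(points)
    diffs = [(p[0] - x[0], p[1] - x[1]) for p in points]
    # Atoms located exactly at x belong to every closed halfplane from H(x).
    at_x = sum(1 for dx, dy in diffs if dx == 0 and dy == 0)
    others = [(dx, dy) for dx, dy in diffs if dx != 0 or dy != 0]
    if not others:
        return at_x / n
    # Angles of inner normals for which some atom lies on the boundary line.
    crit = []
    for dx, dy in others:
        t = math.atan2(dy, dx)
        crit.append((t + math.pi / 2) % (2 * math.pi))
        crit.append((t - math.pi / 2) % (2 * math.pi))
    crit.sort()
    best = n
    for k, a in enumerate(crit):
        # Next critical angle, wrapping around the circle after the last one.
        b = crit[k + 1] if k + 1 < len(crit) else crit[0] + 2 * math.pi
        if b - a <= tol:
            continue  # degenerate arc (repeated critical direction)
        mid = 0.5 * (a + b)
        v = (math.cos(mid), math.sin(mid))
        # Inside an arc no atom is on the boundary, so ">" and ">=" agree.
        count = at_x + sum(1 for dx, dy in others if dx * v[0] + dy * v[1] > 0)
        best = min(best, count)
    return best / n
```

For the four vertices of a square, `halfspace_depth_2d((0, 0), square)` gives the depth of the centre, and evaluating the function at a vertex or outside the convex hull gives the depth of a peripheral point. The sketch relies on floating-point angles, so nearly collinear configurations may need a different tolerance.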
Notations. Some of our proofs are based on convexity theory. As a basic reference we take [17]; we now gather notations and elementary definitions that will be used throughout the paper. The unit sphere in $\mathbb{R}^d$ is $\mathbb{S}^{d-1}$. We write $S \subset K$ for $S$ being a proper subset of $K$. The affine hull $\mathrm{aff}(S)$ of $S \subseteq \mathbb{R}^d$ is the smallest affine subspace of $\mathbb{R}^d$ containing $S$. The dimension $\dim(S)$ of $S$ is defined as the dimension of $\mathrm{aff}(S)$. For example, the affine hull of two different points in $\mathbb{R}^d$ is the infinite line joining them, and its dimension is 1. We write $\mathrm{int}(S)$, $\mathrm{cl}(S)$, and $\mathrm{bd}(S)$ for the interior, closure, and boundary of $S \subseteq \mathbb{R}^d$. The interior, closure, and boundary of $S$ when considered as a subset of its affine hull $\mathrm{aff}(S)$ are denoted by $\mathrm{relint}(S)$, $\mathrm{relcl}(S)$ and $\mathrm{relbd}(S)$, and are called the relative interior, relative closure, and relative boundary of $S$, respectively. Of course, if $\dim(S) = d$, the interior is the same as the relative interior etc.
The class of all closed halfspaces in $\mathbb{R}^d$ is $\mathcal{H}$. A generic halfspace from $\mathcal{H}$ may be denoted simply by $H$; $H_{x,v}$ means a halfspace $\{y \in \mathbb{R}^d : \langle y, v\rangle \ge \langle x, v\rangle\}$ whose boundary passes through $x \in \mathbb{R}^d$ with inner normal $v \in \mathbb{R}^d \setminus \{0\}$. For an affine space $A \subseteq \mathbb{R}^d$ and $x \in A$ we denote by $\mathcal{H}(x, A)$ the set of all relatively closed halfspaces $H$ in $A$ whose relative boundary contains $x$; in particular, $\mathcal{H}(x, \mathbb{R}^d) \equiv \mathcal{H}(x)$. We say that a sequence of halfspaces […] Finally, for any of the symbols $\mathcal{H}$, $\mathcal{H}(x)$, or $\mathcal{H}(x, A)$, a superscript $\circ$ designates the corresponding relatively open halfspaces, e.g. […]

Flag halfspaces
For $\mu \in \mathcal{M}(\mathbb{R}^d)$ and $x \in \mathbb{R}^d$ we call $H \in \mathcal{H}(x)$ a minimising halfspace of $\mu$ at $x$ if $\mu(H) = D(x;\mu)$. For $d = 1$ minimising halfspaces always trivially exist. They also exist if $\mu$ is smooth, or if $\mu$ is supported in a finite number of points. In general, however, the infimum in (1) does not have to be attained. We give a simple example.
Example 1. Take $\mu \in \mathcal{M}(\mathbb{R}^2)$ the sum of the Dirac measure at $a = (1, 1) \in \mathbb{R}^2$ and the uniform distribution on the disk $\{x \in \mathbb{R}^2 : \|x\| \le 2\}$. For $x = (1, 0) \in \mathbb{R}^2$ no minimising halfspace exists. As we see in Figure 1, the depth $D(x;\mu)$ is approached by $\mu(H_{x,v_n})$ for a sequence of halfspaces […]
The problem with measures not attaining the infimum in (1) is elegantly resolved by considering flag halfspaces instead of the usual closed halfspaces.
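The behaviour in Example 1 can be checked numerically. The sketch below is an illustration under stated assumptions: the uniform distribution on the disk is taken to have total mass 1, the Dirac measure has mass 1, and the normals $v(\theta) = (\cos\theta, \sin\theta)$ with $\theta \uparrow 0$ are one possible approximating sequence (the source does not preserve the sequence used in the paper). The function names are hypothetical. The disk mass of $H_{x,v}$ follows from the standard circular-segment area $r^2 \arccos(s/r) - s\sqrt{r^2 - s^2}$, where $s = \langle x, v\rangle$ is the signed distance of the boundary line from the origin.

```python
import math

R = 2.0          # radius of the disk
X = (1.0, 0.0)   # point whose depth is studied
A = (1.0, 1.0)   # location of the Dirac atom

def disk_cap_fraction(s, r=R):
    """Fraction of the disk of radius r lying in {y : <y, v> >= s}, |v| = 1."""
    s = max(-r, min(r, s))
    area = r * r * math.acos(s / r) - s * math.sqrt(r * r - s * s)
    return area / (math.pi * r * r)

def mass_closed_halfplane(theta):
    """mu(H_{x,v}) for v = (cos theta, sin theta); disk mass 1 plus unit atom."""
    v = (math.cos(theta), math.sin(theta))
    s = X[0] * v[0] + X[1] * v[1]
    # The atom a belongs to the closed halfplane iff <a - x, v> >= 0.
    atom = 1.0 if (A[0] - X[0]) * v[0] + (A[1] - X[1]) * v[1] >= 0 else 0.0
    return disk_cap_fraction(s) + atom

# Value approached along theta -> 0 from below: the cap cut off by {y1 >= 1}.
LIMIT = 1.0 / 3.0 - math.sqrt(3.0) / (4.0 * math.pi)
```

Under these assumptions `mass_closed_halfplane(-1/n)` decreases towards `LIMIT` as `n` grows, while the limiting halfplane `mass_closed_halfplane(0.0)` has mass `LIMIT + 1` because its boundary line contains the atom. This is consistent with the infimum in (1) not being attained; a flag halfspace consisting of $x$, the open halfplane $\{y_1 > 1\}$ and the open halfline below $x$ has mass `LIMIT`.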
Definition. Define $\mathcal{F}(x)$ as the system of all sets $F$ of the form
(3) $F = \{x\} \cup \bigcup_{i=1}^{d} G_i$,
where $G_d \in \mathcal{H}^\circ(x)$ and $G_i \in \mathcal{H}^\circ(x, \mathrm{relbd}(G_{i+1}))$ for $i = 1, \dots, d-1$. Any element of $\mathcal{F}(x)$ is called a flag halfspace centred at $x$.
The name flag comes from geometry [17], where an analogous recursive construction is considered, involving nested faces of convex polytopes. The formal definition of flag halfspaces is somewhat convoluted, but these sets appear naturally. In $\mathbb{R}^2$, a flag halfspace at $x$ is the union of an open halfplane $G_2$ whose boundary passes through $x$, a relatively open halfline $G_1$ originating at $x$ contained in the one-dimensional affine space (line) $\mathrm{bd}(G_2)$, and the 0-dimensional point $x$ itself. For an example see Figure 1. A flag halfspace is neither an open nor a closed set. In contrast to a usual closed halfspace, the complement of a flag halfspace $F \in \mathcal{F}(x)$ is, except for its central point $x$, again a flag halfspace from $\mathcal{F}(x)$, i.e. $(\mathbb{R}^d \setminus F) \cup \{x\} \in \mathcal{F}(x)$. Several more interesting properties and characterisations of flag halfspaces can be found in [9].
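The planar description of a flag halfspace translates into a short membership test. The sketch below is an illustration only; the function name `in_flag_halfplane` and the parametrisation by an inner normal `v` of $G_2$ and a direction `u` of the halfline $G_1$ are assumptions made here. It also lets one check the complement property: for every $y \ne x$, exactly one of $y$ and its reflection $2x - y$ belongs to $F$.

```python
def in_flag_halfplane(y, x, v, u):
    """Membership of y in the flag halfplane F = {x} ∪ G_1 ∪ G_2 in R^2.

    G_2 = {y : <y - x, v> > 0} is the open halfplane with inner normal v,
    G_1 = {x + t*u : t > 0} is the open halfline in bd(G_2) with direction u.
    """
    if v[0] * u[0] + v[1] * u[1] != 0:
        raise ValueError("u must be orthogonal to v, i.e. parallel to bd(G_2)")
    d = (y[0] - x[0], y[1] - x[1])
    if d == (0, 0):
        return True                      # the central point x
    s = d[0] * v[0] + d[1] * v[1]
    if s != 0:
        return s > 0                     # strictly inside or outside G_2
    return d[0] * u[0] + d[1] * u[1] > 0  # on the boundary line: test G_1
```

With integer or rational coordinates all comparisons are exact. For `x = (0, 0)`, `v = (0, 1)` and `u = (1, 0)`, the set $F$ is the open upper halfplane together with the non-negative part of the horizontal axis.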
We define a minimising flag halfspace of $\mu$ at $x$ to be any $F \in \mathcal{F}(x)$ that satisfies $\mu(F) = D(x;\mu)$. In the following Theorem 1 we show that the halfspace depth (1) of any measure can be expressed in terms of the $\mu$-mass of flag halfspaces, and that a minimising flag halfspace always exists. The intuition behind this result is as follows. Even if the minimising closed halfspace of $x \in \mathbb{R}^d$ does not exist, there is a sequence of closed halfspaces $H_n \equiv H_{x,v_n} \in \mathcal{H}(x)$ such that $\mu(H_n) \to D(x;\mu)$ as $n \to \infty$. Because the unit normals $\{v_n\}_{n=1}^{\infty}$ of these halfspaces come from the compact set $\mathbb{S}^{d-1}$, we can also assume that the sequence of halfspaces is convergent and $\lim_{n\to\infty} v_n = v \in \mathbb{S}^{d-1}$ (otherwise, we extract a convergent subsequence). For $n$ large enough, $\mu(H_n)$ is arbitrarily close to $D(x;\mu)$, but this fact alone, of course, does not imply that the $\mu$-mass of the limit $H \equiv H_{x,v}$ defined as $H = \lim_{n\to\infty} H_n$ is equal to $D(x;\mu)$. It turns out that for general measures, it is not possible to find any useful upper bound on the mass $\mu(H)$, but it is possible to bound the mass of its interior by $\mu(\mathrm{int}(H)) \le D(x;\mu)$. In the right hand panel of Figure 2 we see a visualisation of our setup, with $d = 2$. In the situation displayed, as $n \to \infty$, the halfspaces $H_n$ do not intersect the halfline $G_1^- \subset \mathrm{bd}(H)$ originating at $x$, so the $\mu$-mass of $G_1^-$ does not contribute to the depth of $x$, and $G_1^-$ should not be contained in $F$. The formal statement of our theorem follows.
Theorem 1. For any $\mu \in \mathcal{M}(\mathbb{R}^d)$ and $x \in \mathbb{R}^d$ we have
(5) $D(x;\mu) = \min\{\mu(F) : F \in \mathcal{F}(x)\}$.
In particular, there always exists a minimising flag halfspace.
We first bound both summands on the right hand side from below. For each $n = 1, 2, \dots$ […]
From the convergence of the halfspaces $\{H_n\}_{n=1}^{\infty}$ we know that $A_n \uparrow \mathrm{int}(H)$ as $n \to \infty$, and using the continuity of measure from below [4, Theorem 3.1.11] we obtain the equality in (7) […]
[Figure caption fragment: For any line segment with endpoints $y$, $z$ passing through $x$, exactly one of the points $y$, $z$ belongs to $F$. Right hand panel: […]]
On the other side, $x \in H_n \cap \mathrm{bd}(H)$ for all $n = 1, 2, \dots$, so $H_n \cap \mathrm{bd}(H)$ is either a closed halfspace when considered in the $(d-1)$-dimensional space $\mathrm{bd}(H)$, or is equal to $\mathrm{bd}(H)$. In any case, we have that $\mu(H_n \cap \mathrm{bd}(H)) \ge D(x;\mu|_{\mathrm{bd}(H)})$ for $\mu|_{\mathrm{bd}(H)}$ the restriction of $\mu$ to the hyperplane $\mathrm{bd}(H)$. Consequently […] Combining (6), (7) and (8) one gets (9) […]
Assume now for a contradiction that the inequality in (9) is strict, i.e. that […] The definition of the halfspace depth implies that there exists a halfspace $H' \in \mathcal{H}(x, \mathrm{bd}(H))$ in the hyperplane $\mathrm{bd}(H)$ that satisfies (10) […] Denote by $v \in \mathbb{S}^{d-1}$ the unit inner normal of $H$ and set […] For $n$ large enough we have $\mu(H_{x,w_n} \setminus H) < c/2$. Note also that $H_{x,w_n} \cap \mathrm{bd}(H) = H'$ for all $n = 1, 2, \dots$, due to the choice of $w_n$. Altogether, we have (11) […] where the last inequality in (11) follows from (10). Note that because $H_{x,w_n} \in \mathcal{H}(x)$, inequality (11) contradicts the definition of the halfspace depth (1), and we get
(12) $D(x;\mu) = \mu(G_d) + D(x;\mu|_{\mathrm{relbd}(G_d)})$,
where we denoted $G_d = \mathrm{int}(H)$. We have just constructed the first open halfspace $G_d$ in the system (3). We proceed by induction. We consider $\mu|_{\mathrm{bd}(H)} = \mu|_{\mathrm{relbd}(G_d)}$ instead of $\mu$ and using the same argument obtain that […] satisfies an equation analogous to (12), i.e. […]
Continuing the same procedure we eventually obtain a flag halfspace […] The last but one equality above follows from the fact that $\mathrm{relbd}(G_2)$ is a line, meaning that $G_1$ is the one of the two relatively open halflines determined by $x$ in $\mathrm{relbd}(G_2)$ that has the smaller $\mu$-mass. Thus, $D(x;\mu|_{\mathrm{relbd}(G_2)}) = \mu(\{x\}) + \mu(G_1)$.
In Example 1, the single minimising flag halfspace of $\mu$ at $x$ is […]
In formula (12) in the proof of Theorem 1 we unveiled the recursive nature of the halfspace depth. The following result formalises that observation. In the special situation of an empirical measure $\mu \in \mathcal{M}(\mathbb{R}^d)$, a related result has been observed in [5, Theorems 1 and 2] and successfully applied in the task of exact computation of the halfspace depth.
Corollary 2. […]
Proof. There are more flag halfspaces in $\mathcal{F}(x)$ than closed halfspaces in $\mathcal{H}(x)$, in the sense that the mapping $\mathcal{F}(x) \to \mathcal{H}(x) : F \mapsto \mathrm{cl}(F)$ is not bijective. We define an equivalence relation $\sim$ between the elements of $\mathcal{F}(x)$ by […] By $\mathcal{K}$ we denote the quotient set of $\sim$. This allows us to rewrite (5) from Theorem 1 as […] for $F'$ a flag halfspace centred at $x$ when considered inside the affine space $\mathrm{bd}(G_K)$ (denoted by $F' \in \mathcal{F}(x, \mathrm{bd}(G_K))$). We get, using Theorem 1 again, […] The mapping […], and we obtain the desired result.
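The recursive structure can be made concrete for an empirical measure in $\mathbb{R}^2$. As we read Theorem 1 and the recursion described around formula (12), the depth of $x$ is the smallest value of $\mu(\{x\}) + \mu(G_1) + \mu(G_2)$ over flag halfplanes. The sketch below is an illustration, not the algorithm of [5]: the function name `flag_depth_2d` is hypothetical, and the restriction of the boundary line of $G_2$ to lines through $x$ and an atom is an assumption made here, motivated by the observation that rotating a generic line about $x$ until it first meets an atom does not increase the mass of a suitably chosen flag halfplane.

```python
def flag_depth_2d(x, points):
    """Depth of x w.r.t. the empirical measure of `points` in R^2,
    computed as the minimum mass of a flag halfplane centred at x.
    Exact when the coordinates are integers or Fractions."""
    n = len(points)
    diffs = [(p[0] - x[0], p[1] - x[1]) for p in points]
    at_x = sum(1 for d in diffs if d == (0, 0))    # mass of the point {x}
    others = [d for d in diffs if d != (0, 0)]
    if not others:
        return at_x / n
    best = n
    for u in others:                 # boundary line of G_2 passes through x and x + u
        left = right = fwd = bwd = 0
        for d in others:
            cross = u[0] * d[1] - u[1] * d[0]
            if cross > 0:
                left += 1            # atom in one open halfplane
            elif cross < 0:
                right += 1           # atom in the opposite open halfplane
            elif u[0] * d[0] + u[1] * d[1] > 0:
                fwd += 1             # atom on the open halfline with direction u
            else:
                bwd += 1             # atom on the opposite open halfline
        # Two-level recursion: cheaper open halfplane G_2, plus the
        # one-dimensional depth inside the line (cheaper halfline G_1).
        best = min(best, at_x + min(left, right) + min(fwd, bwd))
    return best / n
```

The inner expression mirrors the last step of the construction in the proof of Theorem 1, where inside a line the depth is $\mu(\{x\})$ plus the smaller of the masses of the two open halflines.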

Applications: Properties of the sample halfspace median
We now use flag halfspaces to derive several properties of the sample halfspace median that are of interest in the practice of the depth; additional applications of flag halfspaces to the theory of the halfspace depth can be found in [8,9]. Write $\mathcal{A}(\mathbb{R}^d)$ for the set of all empirical measures $\mu \in \mathcal{M}(\mathbb{R}^d)$, that is all purely atomic probability measures with a finite number $n$ of atoms, each atom having $\mu$-mass $1/n$, for some $n = 1, 2, \dots$. These measures are typically obtained by observing a random sample $X_1, \dots, X_n \in \mathbb{R}^d$ from a probability distribution $\nu \in \mathcal{M}(\mathbb{R}^d)$, each sample point corresponding to an atom. To approximate the halfspace depth of $\nu$, the depth of $\mu$ is computed. The latter depth function is standardly used for inference about the unknown distribution $\nu$. It is therefore crucial to understand the behaviour of the halfspace depth w.r.t. empirical measures. We provide results on the dimensionality of the median set, assuming that the atoms of $\mu \in \mathcal{A}(\mathbb{R}^d)$ lie in a sufficiently general position. The last assumption is not restrictive; it is satisfied if, for instance, the measure $\nu$ from which we sample is smooth. The proof of the following lemma is standard and omitted.
Lemma 3. Let $X_1, X_2, \dots, X_n$ be independent random variables sampled from smooth (and possibly different) probability measures from $\mathcal{M}(\mathbb{R}^d)$. Then the following holds true almost surely.
(i) The points $X_1, X_2, \dots, X_n$ are in general position.
(ii) Writing $l(x, y)$ for the infinite line determined by […]
Proof. We use two auxiliary lemmas. Our first lemma is a special case of a more general result that can be found in [7, Lemma 4]. In [7], that lemma is formulated with a final inequality $\mu(\mathrm{int}(H)) \le \alpha$; for $\mu \in \mathcal{A}(\mathbb{R}^d)$ also a strict inequality can be written, because the depth of $\mu$ attains only finitely many values.
Lemma 5. […] and a face $F$ of $D_\alpha(\mu)$ are given so that the relatively open line segment $L(x, y)$ formed by $x$ and $y$ does not intersect $D_\alpha(\mu)$ for any $y \in F$. Then there exists a touching halfspace $H \in \mathcal{H}$ of $D_\alpha(\mu)$ such that $\mu(\mathrm{int}(H)) \le \alpha$, $x \in H$ and $F \subset \mathrm{bd}(H)$. If, in addition, $\mu \in \mathcal{A}(\mathbb{R}^d)$, then we can write even $\mu(\mathrm{int}(H)) < \alpha$.
Our second lemma is a simple observation about the structure of a simplex, that is a convex hull of $k + 1$ points in general position, in the linear space $\mathbb{R}^k$. These $k + 1$ points are called the vertices of $S$.
Lemma 6. For a simplex $S \subset \mathbb{R}^k$ and any convex set $K \subseteq S$ with non-empty interior there exist $x, y \in K$ and $v \in \mathbb{S}^{k-1}$ such that each of the disjoint halfspaces $H_{x,v}$ and $H_{y,-v}$ contains only one vertex of $S$.
Proof. In this proof, all the vectors are column vectors, and by $A^{\mathsf{T}}$ we denote the transpose of a matrix $A$. Denote by $s_1, \dots, s_{k+1} \in S$ the vertices of $S$. Denote by $a$ any point in the interior of $K$. We first transform both $S$ and $K$ by an affine transform $T : \mathbb{R}^k \to \mathbb{R}^k : z \mapsto A z + b$ for $A \in \mathbb{R}^{k \times k}$ non-singular and $b \in \mathbb{R}^k$ such that $T(s_i) = e_i$ for each $i = 1, \dots, k$, for $e_i$ the $i$-th standard basis vector in $\mathbb{R}^k$, and $T(a) = 0$ is the origin in $\mathbb{R}^k$. Such an affine transform certainly exists, because each full-dimensional simplex in $\mathbb{R}^k$ can be uniquely mapped to any other one using an invertible affine mapping. Because $a \in \mathrm{int}(K) \subseteq \mathrm{int}(S)$, the origin $T(a)$ must be contained in the interior of the $T$-image of $S$ defined by $T(S) = \{T(z) : z \in S\}$, meaning that necessarily $T(s_{k+1}) \in (-\infty, 0)^k$. Since $K$ is a convex set with $a$ in its interior, also $T(K)$ is convex with $0 = T(a) \in \mathrm{int}(T(K))$. Thus, there is a closed ball $B$ centred at the origin with radius $\delta > 0$ small enough so that $B \subseteq T(K) \subseteq T(S)$. For $\tilde{v} = e_1 \in \mathbb{S}^{k-1}$ we have $\langle \tilde{v}, T(s_1)\rangle = \langle \tilde{v}, e_1\rangle = 1$, $\langle \tilde{v}, T(s_i)\rangle = \langle \tilde{v}, e_i\rangle = 0$ for $i = 2, \dots, k$, and $\langle \tilde{v}, T(s_{k+1})\rangle < 0$. Take $\tilde{x} = \delta e_1 \in B$ and […] as the only vertex of $T(S)$, and […] as the only vertex of $T(S)$. Certainly, also $H_{\tilde{x},\tilde{v}} \cap H_{\tilde{y},-\tilde{v}} = \emptyset$. Now it remains to apply the inverse affine transform […] the inverse of $A$, and define $x = T^{-1}(\tilde{x})$, $y = T^{-1}(\tilde{y})$, and $v = A^{\mathsf{T}} e_1 / \|A^{\mathsf{T}} e_1\| \in \mathbb{S}^{k-1}$. Because $v$ is taken to be the inner normal vector of $T^{-1}(H_{\tilde{x},\tilde{v}}) = H_{x,v}$, we have indeed found the desired pair of halfspaces $H_{x,v}$ and $H_{y,-v}$.
We are ready to prove Theorem 4. Recall that $\alpha^*(\mu) = \sup_{x \in \mathbb{R}^d} D(x;\mu)$. Assume for a contradiction that $\dim(D^*(\mu)) = d - 1$. Then $D^*(\mu)$ is contained in a hyperplane that determines two different closed halfspaces; we denote them by $H^+$ and $H^-$, respectively. Take any $w \in \mathrm{int}(H^+)$ and $q \in \mathrm{int}(H^-)$. We can consider the set $D^*(\mu)$ itself as a $(d-1)$-dimensional face of $D^*(\mu)$ that satisfies the conditions of Lemma 5 for either of the choices $x = w$, or $x = q$. We apply Lemma 5 twice, first to $x = w$ and then also to $x = q$. We obtain that
(13) $\mu(\mathrm{int}(H^+)) < \alpha^*(\mu)$ and $\mu(\mathrm{int}(H^-)) < \alpha^*(\mu)$.
Applying Corollary 2 to $x \in D^*(\mu)$ and halfspaces $H^+$ and $H^-$ and using (13), we get that […]
From (14) and (15) it follows that $\alpha^*(\mu) > \mu(G) \ge \alpha^*(\mu) - 1/n$ for $G \in \{G^+, G^-\}$. Since $\mu$ is an empirical measure with $n$ atoms, the $\mu$-mass of any set can be only a multiple of $1/n$, so it must be that […]
Since we have shown that there are exactly $d$ atoms of $\mu$ in $A$, it has to be $n \ge d$. From an assumption of our theorem we thus have $n > d$. Then there exists $z \in \mathbb{R}^d \setminus A$ such that $\mu(\{z\}) = 1/n$. We apply Lemma 6 in the subspace $A$ to conclude that there exist $x, y \in D^*(\mu)$ and closed halfspaces […] Note that the sets $S_x$ and $S_y$ are disjoint and $(S_x \cup S_y) \cap (A \cup \{z\}) = \emptyset$, so (18) […] Combining (16), (17) and (18), we obtain […] It follows that $\min\{\mu(H_{x,u}), \mu(H_{y,-u})\} < \alpha^*(\mu)$, a contradiction with our choice $\{x, y\} \subset D^*(\mu)$.
Theorem 4 is valid for empirical measures; an analogous theorem for absolutely continuous measures can be found in [18, Proposition 3.4]. There, it was shown that for $\mu \in \mathcal{M}(\mathbb{R}^d)$ satisfying certain smoothness conditions including the existence of the density, the dimension of the median $D^*(\mu)$ cannot exceed $d - 2$ provided that $d \ge 2$. A version of the latter theorem with weaker conditions, but still requiring smoothness and contiguous support of $\mu \in \mathcal{M}(\mathbb{R}^d)$, is given in [7, Corollary 7]. Unlike the proofs for smooth measures, the proof of Theorem 4 requires the use of flag halfspaces, which makes the derivation more technical and delicate. Without the assumption of general position, the claim of Theorem 4 is not valid. An example of a measure $\mu \in \mathcal{A}(\mathbb{R}^2)$ whose atoms are not in general position but for which $\dim(D^*(\mu)) = 1$ is given in [7, Section 2].
Excluding the case of dim(D * (µ)) = d−1 for random samples from smooth probability measures, one can ask whether there are other dimensions that the sample median set cannot attain.The answer is negative already in the case of n = 8 points sampled randomly from a Gaussian distribution in R 3 , as we show in the next example.
Example 2. For $\nu \in \mathcal{M}(\mathbb{R}^3)$ the standard Gaussian probability measure and $X_1, \dots, X_8$ a random sample from $\nu$ with empirical measure $\mu \in \mathcal{A}(\mathbb{R}^3)$, the median set $D^*(\mu)$ is of dimension 3, 1, or a single-point set, all with positive probability. The claim follows by considering three setups of eight points $x_1, \dots, x_8$ in the space $\mathbb{R}^3$. Denote $k = \dim(D^*(\mu))$ and write $\mu \in \mathcal{A}(\mathbb{R}^3)$ for the empirical measure of $x_1, \dots, x_8$. The direct computations described below are based on the analysis performed using the R package TukeyRegion [1] for evaluation of full-dimensional central regions, and the Mathematica visualisations provided in the script in the online Supplementary Material. Plots of the three setups below are displayed in Figure 3.

• Case $k = 3$. This situation is standard and common. For example, direct computation shows that already randomly perturbed vertices of a unit cube in $\mathbb{R}^3$, i.e. points in a configuration where the convex hull of $x_1, \dots, x_8$ contains all the eight points on its boundary, possess a full-dimensional polyhedral median set with maximum depth $2/8$.
• Case $k = 1$. Arrange the points so that $x_1, x_2, x_3$ form vertices of a triangle $T_1$ in a plane, and $x_4, x_5, x_6$ form vertices of a triangle $T_2$ in a plane parallel to that determined by $T_1$, so that the convex hull of $x_1, \dots, x_6$ is a triangular prism in $\mathbb{R}^3$. To obtain points in general position, we perturb the six points slightly. Direct computation shows that for these six points, the halfspace median set is a three-dimensional polyhedron $M$ inside the prism that does not intersect $T_1$ or $T_2$, of points with depth $2/6$. Place the last two points $x_7$ and $x_8$ in the interior of $M$, so that the straight line $l(x_7, x_8)$ between these points intersects both relative interiors of $T_1$ and $T_2$. Note that certainly $D(x_7;\mu) = D(x_8;\mu) = 3/8$, since the two points were placed inside $M$. No point can have depth $4/8$, as in that situation the setup would exhibit halfspace symmetry, which is clearly impossible [10, Proposition 1]. Finally, projecting all points of $\mu$ into the plane orthogonal to $l(x_7, x_8)$ shows that any point $y \notin l(x_7, x_8)$ can be separated from $l(x_7, x_8)$ by a plane that is parallel to $l(x_7, x_8)$ and contains only two sample points, meaning that $D(y;\mu) \le 2/8$. The median set of $\mu$ is therefore the line segment between $x_7$ and $x_8$, with depth $3/8$.
• Case $k = 0$. Consider four points $x_1, \dots, x_4$ forming the vertices of a tetrahedron $T$ (blue points in the bottom panels of Figure 3). Three points $x_5, x_6, x_7 \notin T$ are attached to three different facets of $T$ so that each of these points together with its facet forms another (non-regular) tetrahedron not intersecting $\mathrm{int}(T)$ (red points in the bottom panels of Figure 3). Finally, a single point $x_8$ is placed strategically inside $T$ into the full-dimensional halfspace median of $x_1, \dots, x_7$. An example is the configuration […]
The medians in all three cases above are stable in the sense that for a small perturbation of all the sample points, the dimension of the median set remains unchanged. Thus, in each setup and for each $x_i$ we can find a small open ball around $x_i$ such that if $x_i$ is replaced by any element of this ball, the dimension of the new median remains the same.
In conclusion, all three cases $k = 0, 1, 3$ occur with positive probability if $x_1, \dots, x_8$ are sampled from any distribution in $\mathbb{R}^3$ with positive density everywhere.
3.2. Computation of the halfspace median in $\mathbb{R}^2$. In dimension $d = 2$, Theorem 4 leaves only trivial cases: the halfspace median must be either full-dimensional or a singleton, and both situations may occur. But, as we show in our last result below, if $\mu \in \mathcal{A}(\mathbb{R}^2)$ has a unique median and $n \ne 4$, then the median must be one of the data points. The case of $n = 4$ data points is trivial and not interesting.
Theorem 7. Let $\mu \in \mathcal{A}(\mathbb{R}^2)$ be an empirical measure with precisely $n$ atoms of mass $1/n$, with $n \ne 4$, that satisfy conditions (i) and (ii) from Lemma 3. If the halfspace median $D^*(\mu)$ is a single point set, then it must be an atom of $\mu$. In particular, the median set is either full-dimensional, or an atom of $\mu$.
Proof. Suppose without loss of generality that $x = 0 \in \mathbb{R}^2$ is the unique median of $\mu$. Assume, for a contradiction, that $\mu(\{0\}) = 0$. We start with the following observation: For every $v \in \mathbb{S}^1$ there is $w(v) \in \mathbb{S}^1$ that meets the following conditions
(19) $\langle v, w(v)\rangle \ge 0$, $\mu(\mathrm{int}(H_{0,w(v)})) = \alpha^*(\mu) - 1/n$, and $\mu(\mathrm{bd}(H_{0,w(v)})) = 2/n$.
To prove the existence of $w = w(v)$ satisfying (19) pick a real sequence $a_i \downarrow 0$ and note that for every $i = 1, 2, \dots$ we have $a_i v \notin D^*(\mu)$, so there is $w_i \in \mathbb{S}^1$ such that
(20) $\mu(H_{a_i v, w_i}) = D(a_i v;\mu) \le \alpha^*(\mu) - 1/n$.
The existence of a minimising halfspace $H_{a_i v, w_i} \in \mathcal{H}(a_i v)$ follows from the fact that minimising halfspaces always exist for $\mu \in \mathcal{A}(\mathbb{R}^d)$, as we observed in Section 2. Then necessarily $0 \notin H_{a_i v, w_i}$, meaning that $\langle v, w_i\rangle > 0$. The sequence $\{w_i\}_{i=1}^{\infty} \subset \mathbb{S}^1$ is bounded and therefore contains a convergent subsequence $\{w_{i_j}\}_{j=1}^{\infty}$ with a limit point $w \in \mathbb{S}^1$ that satisfies $\langle v, w\rangle \ge 0$. By the Fatou lemma [4, Lemma 4.3.3] applied to the sets $\mathrm{int}(H_{0,w}) \subseteq \liminf_{j\to\infty} \mathrm{int}(H_{a_{i_j} v, w_{i_j}})$ and (20) we have that
(21) $\mu(\mathrm{int}(H_{0,w})) \le \liminf_{j\to\infty} \mu(\mathrm{int}(H_{a_{i_j} v, w_{i_j}})) \le \alpha^*(\mu) - 1/n$,
which together with Corollary 2 gives us […] Therefore, $D(0;\mu|_{\mathrm{bd}(H_{0,w})}) \ge 1/n$. Because the halfspace median 0 is not an atom of $\mu$, the condition of general position of the atoms of $\mu$ from part (i) of Lemma 3 implies that the straight line $\mathrm{bd}(H_{0,w})$ contains exactly two atoms of $\mu$ at some points $y, z \in \mathrm{bd}(H_{0,w})$ such that 0 is contained in the relatively open line segment formed by $y$ and $z$. Denote by $l_y \subset \mathrm{bd}(H_{0,w})$ the open halfline originating at 0 that contains $y$. The flag halfspace $F = \{0\} \cup l_y \cup \mathrm{int}(H_{0,w}) \in \mathcal{F}(0)$ then satisfies $\mu(F) = \mu(\mathrm{int}(H_{0,w})) + 1/n$. Inequality (21) implies $\mu(F) \le \alpha^*(\mu)$, so it must be that $\mu(F) = \alpha^*(\mu)$ because of Theorem 1. Consequently, $\mu(\mathrm{int}(H_{0,w})) = \alpha^*(\mu) - 1/n$ and we may take $w(v) = w$. We have proved (19).
Pick any $v \in \mathbb{S}^1$. There exists $u = w(v) \in \mathbb{S}^1$ that satisfies (19). Using the same observation again, we are able to find $u' = w(-u) \in \mathbb{S}^1$ that satisfies (19) for $v$ replaced by $-u$. We consider two different cases.
First case: u ′ = −u.By summing up equalities µ (int (H 0,u )) = α * (µ) − 1/n, µ (int (H 0,−u )) = α * (µ) − 1/n and µ (bd (H 0,u )) = 2/n that all follow from (19), we obtain α * (µ) = 1/2.Consider any infinite line l that passes through the origin and the two open halfplanes G + and G − determined by l.If µ(l) = 1/n, then one of the open halfplanes G + and G − is of µ-mass at most 1/2 − 1/(2n).Assume that µ(G + ) ≤ 1/2 − 1/(2n).Because l contains only one atom of µ that is not at the origin, there is a flag halfspace F ∈ F (0) composed of G + and the relatively closed halfline in l starting at 0 that does not contain atoms.Then µ(F ) = µ(G + ) ≤ 1/2 − 1/(2n) < α * (µ), a contradiction with µ(l) = 1/n.Due to the assumption of general position of atoms from part (i) of Lemma 3, we know that µ(l) ≤ 2/n, so µ(l) can take only one of the two possible values: either 0 or 2/n.Because of our assumption from part (ii) of Lemma 3, there however cannot be three different lines determined by pairs of sample points that all intersect in the origin.This means that for only at most two lines l in R 2 passing through the origin, the µ-mass of l can be 2/n; all the other lines that we now consider have null µ-mass (given that we have already excluded the case µ(l) = 1/n).This leaves only two possibilities: either n = 2, or n = 4.If n = 2, then the median set D * (µ) is the line segment determined by the only two atoms of µ, and therefore it is one-dimensional.Only the case n = 4, not covered by the statement of this theorem, remains.
Second case: $u' \ne -u$. There exists a closed halfspace $H_{0,v'}$ whose boundary passes through the origin that does not contain any of the points $u$ and $u'$. Let $\tilde{u} = w(v')$ be the unit vector that satisfies (19) with $v = v'$. Directly by (19), each of the three different lines $\mathrm{bd}(H_{0,u})$, $\mathrm{bd}(H_{0,u'})$ and $\mathrm{bd}(H_{0,\tilde{u}})$ contains two atoms of $\mu$, a contradiction with our assumption from part (ii) of Lemma 3.
The last part of the statement of Theorem 7 follows directly from Theorem 4.
Theorems 4 and 7 fully justify the algorithmic procedure from [11] and [6] for finding the halfspace medians of samples from smooth probability distributions in R 2 .If the median set is full-dimensional, the algorithm from [11] implemented in the R package TukeyRegion [1] finds the median set exactly, as proved in [6].If the median is not full-dimensional, we conclude that it has to be a single sample point, and evaluation of the maximum halfspace depth of all sample points gives the unique halfspace median.
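The second branch of this procedure, the search for the deepest sample point, can be sketched as follows. This is a minimal illustration and not the implementation from the R package TukeyRegion; the names `depth_2d` and `deepest_sample_point` are hypothetical, and the depth routine simply minimises the mass of flag halfplanes whose boundary line passes through the evaluated point and another atom. By Theorem 7, the returned point is the halfspace median only when the median set is not full-dimensional; deciding that question is left to the algorithm of [11].

```python
def depth_2d(x, points):
    """Halfspace depth of x w.r.t. the empirical measure of `points` in R^2,
    obtained as the minimum mass of a flag halfplane centred at x."""
    n = len(points)
    diffs = [(p[0] - x[0], p[1] - x[1]) for p in points]
    at_x = sum(1 for d in diffs if d == (0, 0))
    others = [d for d in diffs if d != (0, 0)]
    if not others:
        return at_x / n
    best = n
    for u in others:                      # candidate boundary line through x
        left = right = fwd = bwd = 0
        for d in others:
            cross = u[0] * d[1] - u[1] * d[0]
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            elif u[0] * d[0] + u[1] * d[1] > 0:
                fwd += 1
            else:
                bwd += 1
        best = min(best, at_x + min(left, right) + min(fwd, bwd))
    return best / n

def deepest_sample_point(points):
    """Return (point, depth) for a sample point of maximum halfspace depth."""
    depths = [depth_2d(p, points) for p in points]
    i = max(range(len(points)), key=depths.__getitem__)
    return points[i], depths[i]
```

For example, for the four vertices of a square with one additional point strictly inside and off both diagonals, the interior point is the unique deepest sample point. In that particular configuration the intersection of the diagonals attains the same depth, so the median set is not a singleton, which shows why the full-dimensional case has to be examined first.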
In dimension $d > 2$, the situation with possible less-than-full-dimensional halfspace medians appears to be much more convoluted, as demonstrated already in Example 2. Our proof technique from Theorem 7 does not extend directly to $d > 2$. One might, however, conjecture that in accordance with Theorem 7, a less-than-full-dimensional median of a dataset in general position must contain at least one atom of $\mu \in \mathcal{A}(\mathbb{R}^d)$. Our final example shows that this is not true: a configuration of points in general position without an atom in the halfspace median set is indeed possible.
Similarly to case $k = 1$ in Example 2, points $x_1, \dots, x_6$ are perturbed vertices of a triangular prism. Points $x_7$ and $x_8$ determine a line segment that passes through both triangular bases of that prism. The dataset is in general position. A direct computation performed in Mathematica, provided in the script in the online Supplementary Material, confirms that the sample halfspace median set of this dataset attains depth $3/8$, and the median set consists of the line segment between points $(0, 1/2, 0)$ and $(3/44, 1/2, 9/22)$. This median line segment lies strictly in the relative interior of the straight line segment between points $x_7$ and $x_8$, and does not contain any atoms of the corresponding measure $\mu \in \mathcal{A}(\mathbb{R}^3)$. Thus, it is possible that in dimension $d > 2$, a less-than-full-dimensional median set contains no data points. For a visualisation of our dataset and its median set see Figure 4.
In Example 3, we constructed a dataset in general position in dimension $d = 3$, with a one-dimensional median set. We do not know an example of a dataset sampled from a distribution with a density in $\mathbb{R}^d$, $d > 2$, with a unique (zero-dimensional) halfspace median that is not a data point. The higher dimensional situation therefore deserves further investigation.
H is the first open halfspace G d in the construction of the minimising flag halfspace (3).The remaining relatively open halfspaces G k are found by iterating the same process inside the relative boundary of the previous G k+1 , k = 1, . . ., d − 1.
wn = H and lim n→∞ µ(C n ) = 0 due to the continuity of measure from above [4, Theorem 3.1.1].

3.1. Dimensionality of the sample halfspace median. As our first application we show that for an empirical measure with atoms in general position, the median set $D^*(\mu)$ in dimension $d \ge 2$ cannot be $(d-1)$-dimensional, unless we are in the trivial case when the number of atoms is equal to $d$. Our findings should be seen as complementary to the earlier advances from [19, Lemma 6], where it was demonstrated that for $\mu \in \mathcal{A}(\mathbb{R}^d)$ with atoms in general position, all the depth regions $D_\alpha(\mu)$ are full-dimensional, except for possibly the depth median $D^*(\mu)$.
Theorem 4. Let $\mu \in \mathcal{A}(\mathbb{R}^d)$ be a measure with $n$ atoms of mass $1/n$ in general position. If $n \ne d \ge 2$, then $\dim(D^*(\mu)) \ne d - 1$.

(14) $D(x;\mu|_A) \ge \alpha^*(\mu) - \mu(G) > 0$ for all $x \in D^*(\mu)$ and $G \in \{G^+, G^-\}$.
Because $\dim(D^*(\mu)) = d - 1$ and $D(x;\mu|_A) > 0$ for all $x \in D^*(\mu)$, there must exist at least $d$ atoms of $\mu$ in the hyperplane $A$. At the same time, due to our assumption of the atoms of $\mu$ being in general position, there are at most $d$ atoms of $\mu$ in any hyperplane, meaning that $A$ contains exactly $d$ atoms of $\mu$, and these atoms are in general position inside $A$. Consequently,
(15) $D(x;\mu|_A) = 1/n$ for all $x \in D^*(\mu)$.
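The property of general position used here, that no hyperplane contains more than $d$ atoms, can be verified for a concrete dataset. The sketch below is an illustration under an assumption: the paper's own definition of general position is not preserved in this text, so only the hyperplane condition quoted above is tested. The function names are hypothetical. Exact rational arithmetic is used so that the outcome does not depend on rounding.

```python
from fractions import Fraction
from itertools import combinations

def _determinant(rows):
    """Exact determinant of a square matrix via Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    k = len(m)
    det = Fraction(1)
    for c in range(k):
        pivot = next((r for r in range(c, k) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det                    # a row swap flips the sign
        det *= m[c][c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            for j in range(c, k):
                m[r][j] -= f * m[c][j]
    return det

def no_hyperplane_has_more_than_d_points(points):
    """True if no d + 1 of the given points in R^d share a hyperplane."""
    d = len(points[0])
    for subset in combinations(points, d + 1):
        base = subset[0]
        rows = [[p[j] - base[j] for j in range(d)] for p in subset[1:]]
        if _determinant(rows) == 0:       # the d + 1 points are affinely dependent
            return False
    return True
```

The check enumerates all subsets of $d + 1$ points, which is adequate for small samples such as the eight-point configurations of Section 3 but grows quickly with $n$.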

Figure 3. Example 2. Top left hand panel: Convex hull of the sample points in case $k = 3$ (outer polyhedron) and the full-dimensional median set $D^*(\mu)$ (inner coloured polyhedron). Top right hand panel: Convex hull of the sample points in case $k = 1$ (outer polyhedron), the median line segment (thick green line segment between the pair of the inner coloured points), and three planes, each separating two sample points from the median set (yellow planes). Bottom panels: Convex hull of the sample points (coloured polyhedron) with the single median (green point). The halfspace in the right hand panel is one minimising halfspace of the median $z$ containing 3 sample points (the green one and two blue ones). For interactive visualisations see the supplementary Mathematica script.

Example 3. Consider a dataset of $n = 8$ points in $\mathbb{R}^3$ given by $x_1 =$ […]

Figure 4. Example 3. Left hand panel: Convex hull of the sample points (outer polyhedron) and the full-dimensional central region $D_\alpha(\mu)$ with $\alpha = 2/8$ (inner coloured polyhedron). Right hand panel: Convex hull of the sample points (outer polyhedron), the median line segment of depth $\alpha^*(\mu) = 3/8$ (thick orange line segment), and the line between data points $x_7$ and $x_8$ (dashed black line). The halfspace median line segment forms a piece of the dashed black line, and is contained inside the region $D_{2/8}(\mu)$ from the left hand panel. For interactive visualisations see the supplementary Mathematica script.
[…] $\mu(H_{x_8,u_i}) = D(x_8;\mu) = 3/8$ for each $i = 1, 2, 3, 4$. At the same time, the union of the open halfspaces $\mathrm{int}(H_{x_8,u_i})$ is $\mathbb{R}^3 \setminus \{x_8\}$, meaning that for any $y \ne x_8$ we can find $i$ with $y \in \mathrm{int}(H_{x_8,u_i})$, and the shifted closed halfspace $H_{y,u_i} = H_{x_8,u_i} + (y - x_8) \in \mathcal{H}(y)$ necessarily contains at most two atoms of $\mu$. Thus, $D(y;\mu) \le 2/8$ and the point $x_8$ is the single halfspace median of $\mu$.