The Chern-Mather class of the multiview variety

The multiview variety associated to a collection of $N$ cameras records which sequences of image points in $\mathbb{P}^{2N}$ can be obtained by taking pictures of a given world point $x\in\mathbb{P}^3$ with the cameras. In order to reconstruct a scene from its picture under the different cameras it is important to be able to find the critical points of the function which measures the distance between a general point $u\in\mathbb{P}^{2N}$ and the multiview variety. In this paper we calculate a specific degree $3$ polynomial that computes the number of critical points as a function of $N$. In order to do this, we construct a resolution of the multiview variety, and use it to compute its Chern-Mather class.


Introduction
Suppose that a collection of cameras is used to generate images of a scene. The problem of triangulation is to deduce the world coordinates of an object from its position in each of the camera images. If we assume that the image points are given with infinite precision, then two cameras suffice to determine the world point. However, due to the many sources of noise in real images, such as pixelization and distortion, there typically will not be an exact solution, and we will instead try to find a world point whose picture is "as close as possible" to the image points.
More precisely, suppose the cameras are $C_1, \dots, C_N$ and the image points are $p_1, \dots, p_N \in \mathbb{R}^2$. The goal is to find a world point $q \in \mathbb{R}^3$ that minimizes the least square error

$$\mathrm{error}(q) = \sum_{i=1}^{N} \| C_i(q) - p_i \|^2.$$

One application is the problem of reconstructing the 3D structure of a tourist attraction based on millions of online pictures. It is difficult to obtain the precise configuration of any single camera, so it would not make sense to use only a small subset of them and disregard the rest. A better approach is to solve an optimization problem which incorporates as many of the cameras as possible. This technique was used in [1] to reconstruct the entire city of Rome from two million online images.
Since the camera function $C_i : \mathbb{R}^3 \to \mathbb{R}^2$ is not linear, the standard method for solving the triangulation problem is to first find the critical points of $\mathrm{error}(q)$ (e.g., with gradient descent), and then select the one with the smallest error. In order to gauge the difficulty of this problem, it is important to be able to predict the number of critical points that we expect to find for a given configuration of cameras.
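The objective can be made concrete with a small sketch. It assumes the matrix model of a camera used later in the paper (a $3 \times 4$ matrix whose rows $f$, $g$, $h$ give the image point $(f/h, g/h)$); the names `project` and `error` and the two sample cameras are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence, Tuple

Camera = List[List[float]]  # assumed model: 3x4 matrix with rows f, g, h


def project(P: Camera, q: Sequence[float]) -> Tuple[float, float]:
    """Image of the world point q = (x, y, z): (f(q)/h(q), g(q)/h(q))."""
    qh = list(q) + [1.0]  # append 1 so each row acts as an affine function
    f, g, h = (sum(a * b for a, b in zip(row, qh)) for row in P)
    if h == 0:
        raise ZeroDivisionError("q lies on the principal plane of this camera")
    return (f / h, g / h)


def error(cameras: List[Camera], points: List[Tuple[float, float]],
          q: Sequence[float]) -> float:
    """Least square error: sum of squared distances between the pictures
    of q and the observed image points."""
    total = 0.0
    for P, (u, v) in zip(cameras, points):
        x, y = project(P, q)
        total += (x - u) ** 2 + (y - v) ** 2
    return total


# Two hypothetical cameras that differ by a shift along the x-axis.
cameras = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
]
q = (1.0, 2.0, 4.0)
exact = [project(P, q) for P in cameras]   # noise-free image points
noisy = [exact[0], (exact[1][0], exact[1][1] + 1.0)]
```

With noise-free image points the error at the true world point vanishes; once one image point is perturbed, no world point is guaranteed to reproduce both images exactly, which is the situation the optimization problem addresses.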
The goal of this paper is to give an explicit expression for the number of critical points of $\mathrm{error}(q)$ as a function of the number of cameras $N$. In fact, we compute this expression for a variation of the problem in which we allow the world points to take complex values, and we allow these points to lie in the projective space $\mathbb{P}^3_{\mathbb{C}}$ as opposed to the affine space $\mathbb{C}^3$. Our main result is that the number of critical points of $\mathrm{error}(q)$ is polynomial in the number of cameras.

Theorem 1. The number of critical points of $\mathrm{error}(q)$ on $\mathbb{P}^3_{\mathbb{C}}$ is equal to $p(N)$, a polynomial of degree $3$ in $N$, where $N \geq 3$ is the number of cameras.
By the argument in the proof of theorem 4, the polynomial $p(N)$ is an upper bound on the number of critical points in the complex affine version of the problem. One can solve the original real version by first finding these complex affine points, and then discarding the ones that are not in $\mathbb{R}^3$. In [10], a detailed investigation of the Lagrange multiplier equations which define the complex affine critical points is used to compute the number of such points for $N \leq 7$. Based on these results, it was conjectured in [4, Conjecture 3.4] that the number of points should grow as a polynomial in $N$. We note that our upper bound $p(N)$ is fairly close to the conjectured polynomial.
In order to compute the number $p(N)$, we take a slightly different perspective on the function $\mathrm{error}(q)$. By combining the cameras $C_i : \mathbb{R}^3 \to \mathbb{R}^2$, we obtain a rational map $\mathbb{R}^3 \dashrightarrow \mathbb{R}^{2N}$. After passing to the complex numbers and taking the projective closure we obtain a rational map $\varphi : \mathbb{P}^3_{\mathbb{C}} \dashrightarrow \mathbb{P}^{2N}_{\mathbb{C}}$. The image of this map is a three-dimensional variety $MV_N \subset \mathbb{P}^{2N}$ which is known as the multiview variety. We can now interpret the error function as measuring the distance between a given point of $\mathbb{P}^{2N}$ (the tuple of image points) and $MV_N$. With this formulation, the number of critical points is known as the Euclidean distance degree (ED degree) of the variety $MV_N$. The notion of ED degree was introduced in [4], and the authors remark in [4, Ex. 3.3] that the triangulation problem was their original motivation for this concept.
In particular, by using results from [4] we prove in section 5 that this number can be computed in terms of the Chern-Mather class $c_M(MV_N)$. In general, the Chern-Mather class only provides an upper bound on the ED degree, but in the proof of theorem 4 we show that this inequality can be promoted to an equality for reasons specific to the multiview variety. One advantage of this approach is that it depends only on the geometric properties of $MV_N$ and not on the specific features of the defining equations. Another advantage is that it reduces most of the difficulty to local calculations on $MV_N$.
One common way of calculating the Chern-Mather class of a singular variety $X$ is to first find a resolution $f : \tilde{X} \to X$ and then analyze the singularities of $f$ in order to compare the Chern class $c(\tilde{X})$ to the Chern-Mather class $c_M(X)$. In our situation, it is natural to build a resolution of $MV_N$ by resolving the rational map $\varphi$. In section 3, we construct such a resolution $\tilde{\varphi} : \tilde{\mathbb{P}}^3 \to MV_N$ and calculate its Chow ring and Chern class.
In order to compare the Chern class of $\tilde{\mathbb{P}}^3$ to the Chern-Mather class of $MV_N$, we use the theory of higher discriminants which was introduced in [9]. One aspect of this theory is that it specifies which parts of the singular locus of $X$ we need to understand in order to relate $c(\tilde{X})$ to $c_M(X)$. A precise statement is given in proposition 5.
As we show in proposition 6, the higher discriminants of $\tilde{\varphi}$ are surprisingly nice. Specifically, it turns out that in order to calculate $c_M(MV_N)$, we only have to compute the Euler obstruction of a single point $x \in MV_N$. Moreover, in section 5.2, we show that after intersecting $MV_N$ with a hyperplane at $x$, the resulting surface singularity $(S, x)$ is taut. In particular, the Euler obstruction $\mathrm{Eu}_{MV_N}(x)$ is determined by the resolution graph of $x$ in $S$. This allows us to use the enumerative properties of $\tilde{\mathbb{P}}^3$ that are worked out in section 3 to compute $\mathrm{Eu}_{MV_N}(x)$.
In the final section, we put these pieces together and obtain the polynomial $p(N)$.

Definitions and notation
Let $P$ be a $3 \times 4$ matrix with values in $\mathbb{R}$. We consider each row $l$ as an affine function on $\mathbb{R}^3$. Explicitly, $l$ sends a vector $v = (x, y, z)$ to the dot product of $l$ and $(x, y, z, 1)$. We denote the functions given by the three rows by $f$, $g$ and $h$.
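The matrix $P$ defines a rational map

$$\varphi_P : \mathbb{R}^3 \dashrightarrow \mathbb{R}^2, \qquad v \mapsto \left( \frac{f(v)}{h(v)}, \frac{g(v)}{h(v)} \right),$$

which corresponds to the operation of mapping the "world coordinates" $\mathbb{R}^3$ to the "image coordinates" $\mathbb{R}^2$. In other words, it describes the process of taking a picture of the world with a camera whose parameters are encoded in $P$.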
It is not hard to prove that this description of a camera is equivalent to the pinhole camera model. In particular, the camera has a position called the camera center and is pointing in a certain direction. The plane defined by the camera center and direction is called the principal plane. It turns out that with the above notation, the principal plane is the plane defined by the ideal $(h)$, and the camera center is the point defined by $(f, g, h)$. For the purposes of this paper, this observation will be taken as a definition.

Now, suppose that we have a collection of cameras $P_1, \dots, P_N$, and write $f_i$, $g_i$, $h_i$ for the functions given by the rows of $P_i$. By taking a picture of the world with each of the cameras, we obtain a rational map $\mathbb{R}^3 \dashrightarrow \mathbb{R}^{2N}$. This map clearly extends to the complex numbers, giving us a rational map $\mathbb{C}^3 \dashrightarrow \mathbb{C}^{2N}$. Furthermore, by clearing the denominators in the definition of the maps $\varphi_{P_i}$ (and regarding $f_i$, $g_i$, $h_i$ as linear forms in homogeneous coordinates) we obtain a rational map

$$\varphi : \mathbb{P}^3 \dashrightarrow \mathbb{P}^{2N}, \qquad \varphi = \Big[\, f_1 \prod_{j \neq 1} h_j : g_1 \prod_{j \neq 1} h_j : \cdots : f_N \prod_{j \neq N} h_j : g_N \prod_{j \neq N} h_j : \prod_{j} h_j \,\Big]. \tag{1}$$

The scheme theoretic image of this map is called the multiview variety associated to the cameras $P_1, \dots, P_N$.
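The definition of the camera center as the common zero of the three rows can be checked on a small instance. The sketch below is an illustration under the matrix model described above: it computes a kernel vector of a $3 \times 4$ matrix from its signed $3 \times 3$ minors, which is a standard linear algebra identity rather than a method taken from the paper. The helper names and the sample camera are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def camera_center(P):
    """Homogeneous vector c with P c = 0, built from signed 3x3 minors."""
    center = []
    for k in range(4):
        minor = [[row[j] for j in range(4) if j != k] for row in P]
        center.append((-1) ** k * det3(minor))
    return center


def affine_center(P):
    """Camera center in affine coordinates (x, y, z), if it is not at infinity."""
    c = camera_center(P)
    if c[3] == 0:
        raise ValueError("camera center lies at infinity (or P has rank < 3)")
    return [Fraction(c[k], c[3]) for k in range(3)]


def evaluate_rows(P, v):
    """Values (f(v), g(v), h(v)) of the three affine row functions at v."""
    vh = list(v) + [1]
    return [sum(a * b for a, b in zip(row, vh)) for row in P]


# Hypothetical camera with f = x - 1, g = y - 2, h = z - 3.
P = [[1, 0, 0, -1], [0, 1, 0, -2], [0, 0, 1, -3]]
center = affine_center(P)
```

For this camera the center is the point where $f = g = h = 0$, and any point with $z = 3$ lies on the principal plane $h = 0$, where the projection is undefined.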

Example 1.
Consider the following three cameras: The associated rational map is

We say that a collection of cameras is in general position if the hyperplanes defined by the linear functions $\{f_1, g_1, h_1, \dots, f_N, g_N, h_N\}$ associated to the rows of the camera matrices are in general position.
Finally, we will use the following notation throughout the paper (see Figure 1). The principal plane of the $i$-th camera will be denoted by $H_i$ and the center of the $i$-th camera will be denoted by $q_i$. Also, we denote the line $H_i \cap H_j$ by $L_{ij}$ and the point $H_i \cap H_j \cap H_k$ by $p_{ijk}$.

A resolution of the multiview variety
In this section we describe a resolution of the multiview variety associated to N ≥ 3 cameras in general position. It is obtained as an iterated blowup along smooth centers. We then apply standard theorems to compute a presentation of the Chow ring of the resolution, and identify a couple of important ring elements.
Let $P_1, \dots, P_N$ be camera matrices for a collection of $N$ cameras in general position, and let $\varphi : \mathbb{P}^3 \dashrightarrow \mathbb{P}^{2N}$ be the corresponding rational map. We denote the associated multiview variety by $MV_N \subset \mathbb{P}^{2N}$.

Proposition 1.
The base locus $B$ of $\varphi$ is the reduced scheme supported on the union of the camera centers $q_1, \dots, q_N$ and the lines $L_{ij} = H_i \cap H_j$ for $1 \leq i < j \leq N$.

Proof. It can be seen directly from the equations of $\varphi$ (equation 1) that $B$ is supported on the union of the camera centers and the lines $L_{ij}$. We will show that the scheme structure of $B$ is the reduced structure on this set. By a strategic choice of coordinates on $\mathbb{P}^3$, we can assume that $h_1 = x$, $h_2 = y$ and $h_3 = z$. We now analyze the scheme structure of $B$ in a neighborhood of the point $p_{123}$, whose ideal is $(x, y, z)$. First of all, recall that the $i$-th camera contributes the two equations $f_i \cdot \prod_{j \neq i} h_j$ and $g_i \cdot \prod_{j \neq i} h_j$ to the ideal of $B$.
By our genericity assumptions, all of the $f_i$'s, all of the $g_i$'s, and $h_i$ for $i \geq 4$ are invertible in some Zariski neighborhood of $p_{123}$. This implies that in a neighborhood of $p_{123}$, the ideal of $B$ has the form $(xy, xz, yz)$.
Thus, the scheme defined by this ideal is reduced and supported on the coordinate axes. The same argument shows that all of the lines $L_{ij}$ in the base locus have the reduced scheme structure. A similar argument implies that the points $q_i$ are reduced.
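The set-theoretic part of the proposition can be tested numerically. The sketch below assumes that the homogeneous coordinates of the map are $f_i \prod_{j \neq i} h_j$, $g_i \prod_{j \neq i} h_j$ and $\prod_j h_j$, as obtained by clearing denominators; the three sample cameras use $h_1 = x$, $h_2 = y$, $h_3 = z$ as in the proof and are hypothetical (they are not claimed to be in general position).

```python
from math import prod


def linear_form(row, x):
    """Value of the linear form with coefficients `row` at the point x of P^3."""
    return sum(a * b for a, b in zip(row, x))


def phi(cameras, x):
    """Homogeneous coordinates of the image of x = [x : y : z : w].

    Assumed form: for each camera i, the entries f_i * prod_{j != i} h_j and
    g_i * prod_{j != i} h_j, followed by the product of all h_j.
    """
    h = [linear_form(P[2], x) for P in cameras]
    coords = []
    for i, P in enumerate(cameras):
        others = prod(h[j] for j in range(len(cameras)) if j != i)
        coords.append(linear_form(P[0], x) * others)
        coords.append(linear_form(P[1], x) * others)
    coords.append(prod(h))
    return coords


def in_base_locus(cameras, x):
    """True when every coordinate of phi vanishes at x."""
    return all(c == 0 for c in phi(cameras, x))


# Hypothetical cameras; rows are (f_i, g_i, h_i) as linear forms in x, y, z, w.
cameras = [
    [[0, 1, 0, 1], [0, 0, 1, 2], [1, 0, 0, 0]],  # h_1 = x
    [[1, 0, 0, 1], [0, 0, 1, 3], [0, 1, 0, 0]],  # h_2 = y
    [[1, 0, 0, 2], [0, 1, 0, 3], [0, 0, 1, 0]],  # h_3 = z
]
on_line_L12 = [0, 0, 5, 1]     # satisfies x = y = 0
center_q1 = [0, -1, -2, 1]     # satisfies f_1 = g_1 = h_1 = 0
generic_point = [1, 1, 1, 1]
```

A point on a single principal plane, such as `[0, 1, 1, 1]`, is not in the base locus: only the coordinates coming from the other cameras and the last coordinate vanish there.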

Constructing a resolution of φ
In this section we construct a resolution of $MV_N$ in two stages. First, we blow up $\mathbb{P}^3$ at the points $q_1, \dots, q_N$ and at the points $p_{ijk}$ for all $1 \leq i < j < k \leq N$. This gives us a map $b_1 : Y_1 \to \mathbb{P}^3$. Let $\tilde{L}_{ij} \subset Y_1$ denote the proper transform of $L_{ij}$. Note that these proper transforms are disjoint lines in $Y_1$.
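For the second step, we blow up each of the lines $\tilde{L}_{ij}$ and obtain a resolution $b_2 : Y_2 \to Y_1$. Let us denote $Y_2$ by $\tilde{\mathbb{P}}^3$, and denote the composition $b_1 \circ b_2$ by $\pi$. Since the pullback of the base locus $\pi^{-1}(B)$ is a Cartier divisor on $\tilde{\mathbb{P}}^3$, there exists a canonical map $\psi : \tilde{\mathbb{P}}^3 \to \mathrm{Bl}_B \mathbb{P}^3$ satisfying $b \circ \psi = \pi$, where $b : \mathrm{Bl}_B \mathbb{P}^3 \to \mathbb{P}^3$ is the blowup map and $\mathrm{Bl}_B \varphi : \mathrm{Bl}_B \mathbb{P}^3 \to \mathbb{P}^{2N}$ is the resolution of the rational map $\varphi$. Finally, we define $\tilde{\varphi} = \mathrm{Bl}_B \varphi \circ \psi$. Since $\tilde{\mathbb{P}}^3$ is smooth, we thus obtain the following resolution of $MV_N$:

$$\tilde{\varphi} : \tilde{\mathbb{P}}^3 \to MV_N.$$

By an abuse of notation, we will sometimes think of $\tilde{\varphi}$ as a map to $\mathbb{P}^{2N}$, and other times as a map to $MV_N$.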

The Chow ring of $\tilde{\mathbb{P}}^3$
Since $\tilde{\mathbb{P}}^3$ is an iterated blowup of $\mathbb{P}^3$ along smooth centers, we can use standard theorems to compute its Chow ring. We will use a result from [6], which we state here for convenience.
is a degree $d$ polynomial whose constant term is $[X]$, and whose restriction to $X$ is the Chern polynomial of $N_{X/Y}$. In other words, The polynomial $P_{X/Y}$ is called the Poincaré polynomial of $X$ in $Y$.
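By applying theorem 2 first to $b_1 : Y_1 \to \mathbb{P}^3$ and then to $b_2 : \tilde{\mathbb{P}}^3 \to Y_1$, we obtain a presentation of the Chow ring $A^\bullet(\tilde{\mathbb{P}}^3)$ in terms of generators $h$, $Q_i$, $P_{ijk}$ and $T_{ij}$. The meaning of the generators is as follows. Let $\tilde{q}_i \subset \tilde{\mathbb{P}}^3$ denote the exceptional divisor of the camera center $q_i$, let $\tilde{p}_{ijk} \subset \tilde{\mathbb{P}}^3$ denote the exceptional divisor of the point $p_{ijk}$, and let $\tilde{L}_{ij} \subset \tilde{\mathbb{P}}^3$ denote the exceptional divisor of the line $L_{ij}$.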
Then, we have the following identities in $A^\bullet(\tilde{\mathbb{P}}^3)$:

In the next section, we will need to evaluate the degree map $\deg : A^3(\tilde{\mathbb{P}}^3) \to \mathbb{Z}$. Since $\tilde{\mathbb{P}}^3$ is irreducible, $A^3(\tilde{\mathbb{P}}^3)$ has rank 1. In addition, $\deg(h^3) = 1$. This means that calculating the degree map is equivalent to expressing every monomial $\alpha \in A^3(\tilde{\mathbb{P}}^3)$ as a multiple of $h^3$, namely $\alpha = \deg(\alpha) \cdot h^3$. To simplify the calculation, note that the product of two generators that correspond to disjoint subschemes of $\tilde{\mathbb{P}}^3$ is zero. For example, $Q_i \cdot P_{jkl} = 0$ for all $i$, $j$, $k$ and $l$.
Thus, the main difficulty is dealing with self intersections such as $T_{ij}^3$. In order to deal with these, we will calculate the Poincaré polynomials of $q_i \in \mathbb{P}^3$, $p_{ijk} \in \mathbb{P}^3$ and $L_{ij} \subset Y_1$. By theorem 2, this will give us relations involving the self intersections, which in this case turn out to suffice for the degree calculation.
Since $q_i \subset \mathbb{P}^3$ is a point, its Poincaré polynomial is $P_{q_i/\mathbb{P}^3}(Q_i) = Q_i^3 + h^3$, and similarly, $P_{p_{ijk}/\mathbb{P}^3}(P_{ijk}) = P_{ijk}^3 + h^3$. Finally, note that $L_{ij} \subset Y_1$ is a line that passes through $N - 2$ blown up points. We deduce from this that
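The effect of the $N - 2$ blown up points (the points $p_{ijk}$ with $k \neq i, j$) on the normal bundle of the line can be checked with a short degree count. The sketch below assumes the standard fact that blowing up a point on a smooth curve in a smooth threefold twists the normal bundle of the proper transform by $O(-1)$; it is a consistency check on the first Chern class only, not a reconstruction of the full Poincaré polynomial.

```latex
\begin{align*}
N_{L_{ij}/\mathbb{P}^3} &\cong O_{\mathbb{P}^1}(1) \oplus O_{\mathbb{P}^1}(1),
  & \deg N_{L_{ij}/\mathbb{P}^3} &= 2, \\
N_{\tilde{L}_{ij}/Y_1} &\cong N_{L_{ij}/\mathbb{P}^3} \otimes O_{\mathbb{P}^1}\bigl(-(N-2)\bigr),
  & \deg N_{\tilde{L}_{ij}/Y_1} &= 2 - 2(N-2) = -2(N-3).
\end{align*}
```

This agrees with the value $c_1 = -2(N-3)h$ that is used for the line $L_{ij}$ in the proof of proposition 3.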

The Chern class of the resolution
In this section we compute $c(\tilde{\mathbb{P}}^3)$ as an element of $A^\bullet(\tilde{\mathbb{P}}^3)$ and find its pushforward to $\mathbb{P}^{2N}$ (proposition 4).
Our main tool will be the following proposition. Suppose that $c_k(N_{X/Y}) = i^* c_k$ for some $c_k \in A^k(Y)$, and that $c(X) = i^* \alpha$ for some $\alpha \in A^\bullet(Y)$. Let $\eta = c_1(O_{\tilde{Y}}(X))$. Then,

One takeaway of this proposition is that the Chern class of the blowup along a disjoint union of subvarieties is obtained by summing over contributions from the individual components.

Proposition 3. The Chern class of the resolution $\tilde{\mathbb{P}}^3$ is equal to
Proof. Our strategy will be to use proposition 2 to compute the contributions to the Chern class of each of the varieties that are blown up during the construction of $\tilde{\mathbb{P}}^3$.
We first apply proposition 2 to the situation where $Y = \mathbb{P}^3$ and $X = q_i$ for some $i$. In this case, we can take $c_0 = 1$, $c_k = 0$ for $k > 0$ and $\alpha = 1$. By proposition 2 the blowup at $q_i$ will contribute Similarly, $\gamma_{ijk}$ represents the contribution from the blowup of the point $p_{ijk}$. Finally, we compute the contribution from the blowup along a line $f : L_{ij} \hookrightarrow Y_1$. Since $L_{ij}$ passes through $N - 2$ of the blown up points in $Y_1$, a quick calculation shows that we can take $c_0 = 1$, $c_1 = -2(N - 3)h$, and the rest to be zero. In addition, since $L_{ij} \cong \mathbb{P}^1$, we can take $\alpha = (1 + h)^2$. This implies that the contribution coming from $L_{ij}$ is

We now compute the pullback of $c_1(O_{\mathbb{P}^{2N}}(1))$ in $A^\bullet(\tilde{\mathbb{P}}^3)$ along the map $\tilde{\varphi}$.
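Proof. It is well known (e.g. [5, 4.4]) that if $L$ is a line bundle on $X$, $V \subset H^0(X, L)$ is a linear system, and $\tilde{f} : \tilde{X} \to \mathbb{P}^n$ is the induced resolution with blowup map $b : \tilde{X} \to X$, then $\tilde{f}^* O_{\mathbb{P}^n}(1) \cong b^* L \otimes O_{\tilde{X}}(-E)$, where $E \subset \tilde{X}$ is the exceptional divisor.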
In our case, one can show by a local calculation that the preimage in $\tilde{\mathbb{P}}^3$ of the base locus $B$ has class gives the stated expression.
We can now compute the pushforward $\tilde{\varphi}_* c(\tilde{\mathbb{P}}^3)$ as an element of the Chow ring of $\mathbb{P}^{2N}$.
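Proof. Since we have already calculated $c(\tilde{\mathbb{P}}^3)$, the calculation of $\tilde{\varphi}_* c(\tilde{\mathbb{P}}^3)$ is reduced to calculating the degrees of the intersections of the components of $c(\tilde{\mathbb{P}}^3)$ with $\tilde{\varphi}^* c_1(O_{\mathbb{P}^{2N}}(1))^k$ for $0 \leq k \leq 3$. Using the relations in $A^\bullet(\tilde{\mathbb{P}}^3)$ that we described in Section 3.2, the result follows by a direct calculation.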

Higher discriminants
Higher discriminants, introduced in [9], provide a framework in which to study the singularities of a map. In particular, we will use them to understand how the Chern class of $\tilde{\mathbb{P}}^3$ computed above pushes forward along $\tilde{\varphi}$. We now recall the definitions from [9] and phrase them in a way that will be easiest to use in our context.

Definition 1.
Let $f : Y \to X$ be a map of smooth manifolds. The $i$-th higher discriminant of the map $f$ is the locus of points $x \in X$ such that for every $(i-1)$-dimensional subspace $V \subset T_x X$, there exists a point $y \in f^{-1}(x)$ such that $V + f_* T_y Y \neq T_x X$. We denote the $i$-th higher discriminant by $\Delta^i(f)$.
For example, a point $x \in X$ is in $\Delta^1(f)$ if and only if it is a critical value of $f$. Indeed, according to the definition this happens exactly when there is a point $y \in f^{-1}(x)$ whose Jacobian is not surjective. On the other extreme, $x \in \Delta^{\dim(X)}(f)$ if and only if for every codimension one subspace $V \subset T_x X$, there exists a point $y \in f^{-1}(x)$ that satisfies $f_* T_y Y \subset V$.

It is instructive to consider the blowdown map $f : Y = \mathrm{Bl}_p \mathbb{P}^2 \to \mathbb{P}^2$. For every point $y \in E_p = f^{-1}(p)$, $f_* T_y Y$ is one-dimensional. This means that $p \in \Delta^1(f)$. In addition, it is not hard to see that for every one-dimensional subspace $V \subset T_p \mathbb{P}^2$, there is a point $y \in E_p$ such that $f_* T_y Y = V$. This implies that $p \in \Delta^2(f)$.
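The two claims about the blowdown map can be verified in the standard affine charts of the blowup. The coordinates $(u, v)$ and $(s, t)$ below are the usual chart coordinates near the exceptional curve $E_p$, with $p$ placed at the origin of an affine chart of $\mathbb{P}^2$; they are notation introduced for this sketch only.

```latex
\begin{align*}
\text{Chart 1, } E_p = \{u = 0\}: \quad
  & f(u, v) = (u,\, uv), \quad
    Df_{(u,v)} = \begin{pmatrix} 1 & 0 \\ v & u \end{pmatrix}, \quad
    f_* T_{(0,v)} Y = \operatorname{span}\{(1, v)\}, \\
\text{Chart 2, } E_p = \{t = 0\}: \quad
  & f(s, t) = (st,\, t), \quad
    Df_{(s,t)} = \begin{pmatrix} t & s \\ 0 & 1 \end{pmatrix}, \quad
    f_* T_{(s,0)} Y = \operatorname{span}\{(s, 1)\}.
\end{align*}
```

At every point of $E_p$ the image of the differential is a line, and as $v$ ranges over the first chart these lines sweep out every direction in $T_p \mathbb{P}^2$ except $\operatorname{span}\{(0, 1)\}$, which is attained at $s = 0$ in the second chart.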

Lemma 2 ([9, Rem. 3]).
Let $f : Y \to X$ be a proper map of smooth manifolds. Then all of the higher discriminants of $f$ are closed, and we have the following stratification of $X$:

The significance of the higher discriminants is that they tell us which strata appear when writing $f_* 1_Y$ in the basis of Euler obstruction functions on $X$. For background on Euler obstructions we recommend [8].

Higher discriminants of the resolution $\tilde{\varphi}$
In this section we describe the higher discriminants of the map $\tilde{\varphi} : \tilde{\mathbb{P}}^3 \to MV_N \subset \mathbb{P}^{2N}$.
Since the definition of higher discriminants assumes that the source and target are smooth, in this section we consider $\tilde{\varphi}$ as a map to $\mathbb{P}^{2N}$. Let $X_i \cong \mathbb{P}^1 \subset MV_N$ denote the image of the proper transform of the principal plane of the $i$-th camera. The restriction of $\tilde{\varphi}$ to the complement of the preimage of the $X_i$'s is an isomorphism, which means that the set theoretic singular locus of $\tilde{\varphi}$ is contained in the disjoint union $\coprod_i X_i$.
The following proposition describes the higher discriminants of $\tilde{\varphi}$.

Proposition 6. The higher discriminants of $\tilde{\varphi}$ are given as follows:

To prove this proposition, we use the following lemma, which follows almost immediately from the definition of the higher discriminants.
Lemma 3. Let $f : Y \to X$ be a map of smooth complex algebraic varieties. Let $C \subset X$ be a smooth curve. Suppose that the restriction of $f$ to $f^{-1}(C)$ has no critical values. Then $C \cap \Delta^{\dim(X)}(f) = \emptyset$.
Proof. Since the restriction of $f$ to $f^{-1}(C)$ has no critical values, for every point $x \in C$ and every point $y \in f^{-1}(x)$ the one-dimensional space $T_x C \subset T_x X$ is contained in $f_* T_y Y$. Therefore, if $V \subset T_x X$ is any vector space complement to $T_x C$, then $f_* T_y Y$ is not contained in $V$. By definition, this implies that $x \notin \Delta^{\dim(X)}(f)$.
We apply this lemma to each of the $\mathbb{P}^1$'s $X_i \subset \mathbb{P}^{2N}$. Let $f : Y \to \mathbb{P}^1 \cong X_i$ denote the restriction of $\tilde{\varphi}$ to the preimage of $X_i$. Then $Y$ is isomorphic to the blowup of $\mathbb{P}^2$ at $1 + \binom{N-1}{2}$ points: $q = q_i$ and $p_{ijk}$ for $j, k \neq i$. The map $f$ is obtained as follows. First, let $g : \mathrm{Bl}_q \mathbb{P}^2 \to \mathbb{P}^1$ be the resolution of the projection away from $q$. Then, let $h : \mathrm{Bl}_{q, p_{ijk}}(\mathbb{P}^2) \to \mathrm{Bl}_q(\mathbb{P}^2)$ be the blowup along all of the points $p_{ijk}$ for $j, k \neq i$.
Finally, we claim that $f \cong g \circ h$.
In particular, f has no critical values. According to the lemma, this proves proposition 6.

The Chern-Mather class of the multiview variety
In this section we compute the Chern-Mather class of $MV_N$ using the theory of higher discriminants. We then use the result to determine the ED degree of $MV_N$.

The basic setup
By propositions 5 and 6, there exists an integer $\alpha$ such that At a general point $x \in X_i$, the Euler characteristic of the fiber is $\chi(\tilde{\varphi}^{-1}(x)) = \chi(\mathbb{P}^1) = 2$ and $\mathrm{Eu}_{X_i}(x) = 1$. This implies that For the moment, suppose we knew the Euler obstruction $\mathrm{Eu}_{MV_N}(x)$. Then, by taking the Chern-Schwartz-MacPherson class (see [8]) of both sides of equation 2 and recalling that $X_i \cong \mathbb{P}^1$ we obtain Since we have already calculated $\tilde{\varphi}_*(c(\tilde{\mathbb{P}}^3))$ for all $N$, this would give us the Chern-Mather class of the multiview variety $MV_N$.

Calculating $\mathrm{Eu}_{MV_N}(x)$
To compute $\mathrm{Eu}_{MV_N}(x)$, first note that we can intersect $MV_N$ with a general hypersurface $H$ passing through $x$. As a result, we obtain a surface singularity: $x \in S = MV_N \cap H$.
By a well-known theorem about Euler obstructions (see [3, Sec. 3]), $\mathrm{Eu}_{MV_N}(x) = \mathrm{Eu}_S(x)$. Now, suppose we restrict the resolution $\tilde{\varphi}$ to $S$.

Lemma 4. $\tilde{\varphi}|_S$ is a resolution of $S$ such that the preimage of $x$ is a rational curve with self intersection $-(N - 1)$.
Proof. Let $E$ be the preimage of $x$. Note that $E$ is the proper transform of a line in the principal plane of the $i$-th camera. To compute the self-intersection of $E$ in $\tilde{S} = \tilde{\varphi}^{-1}(S)$ consider the following embeddings: $E \xrightarrow{\,i\,} \tilde{S} \xrightarrow{\,j\,} \tilde{\mathbb{P}}^3$. By the Whitney sum formula, we have As we have already computed $\tilde{\varphi}^*(O_{\mathbb{P}^{2N}}(1)) \in A^\bullet(\tilde{\mathbb{P}}^3)$, we just have to calculate $(ji)_* c(N_{E/\tilde{\mathbb{P}}^3})$. By intersecting $E$ with the generators of $A^2(\tilde{\mathbb{P}}^3)$ we find Using this identity together with our presentation of $A^\bullet(\tilde{\mathbb{P}}^3)$ gives Plugging everything into the Whitney sum formula shows that the degree of $c(N_{E/\tilde{S}})$ is $-(N - 1)$, which completes the proof.
We now show that this self intersection number determines the Euler obstruction Eu S (x).

Lemma 5.
With $x \in S$ the isolated singularity as above, $\mathrm{Eu}_S(x) = 3 - N$.
Proof. Recall ([7]) that a singularity germ $(X, x)$ is taut if the analytic type of $(X, x)$ is determined by the resolution graph of some resolution of singularities. By [7, 2.2] the vertex of the cone over the rational normal curve of degree $n$ is taut. Let us denote this singularity by $(X_n, 0)$. Since this singularity has a resolution in which the exceptional divisor is a $\mathbb{P}^1$ with self intersection $-n$, the resolution graph is a single vertex with weight $(0, -n)$. It follows that any singularity with this resolution graph is analytically equivalent to $(X_n, 0)$. In particular, by lemma 4, $(S, x)$ is analytically equivalent to $(X_{N-1}, 0)$, so the Euler obstruction $\mathrm{Eu}_S(x)$ is equal to the Euler obstruction $\mathrm{Eu}_{X_{N-1}}(0)$. By [2, 3.17], the latter is equal to $3 - N$.
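As a sanity check on the value $3 - N$, one can compare it with the standard formula for the Euler obstruction at the vertex of the cone over a smooth projective curve $C$, namely the Euler characteristic of $C$ minus its degree. That formula is an assumption brought in for this check and is not quoted from [2]; here $C_n \cong \mathbb{P}^1$ denotes the rational normal curve of degree $n$.

```latex
\begin{align*}
\mathrm{Eu}_{X_n}(0) &= \chi(C_n) - \deg(C_n) = 2 - n, \\
n = 1 &: \quad \mathrm{Eu}_{X_1}(0) = 1
  \quad \text{($X_1$ is a plane, so the vertex is a smooth point)}, \\
n = N - 1 &: \quad \mathrm{Eu}_{X_{N-1}}(0) = 2 - (N - 1) = 3 - N.
\end{align*}
```

In particular, for $N = 3$ cameras the surface singularity is the vertex of a quadric cone and the Euler obstruction is $0$.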

The ED degree of the multiview variety
As a corollary of theorem 3, we can compute the Euclidean distance degree of $MV_N$.

Proof. We can use the formula in [2] to express the sum of the polar degrees of $MV_N$ in terms of the Chern-Mather classes. Using this formula gives: Now, by the proof of [4, 6.11], if $X$ is an affine cone, then the ED degree of $X_v$ is equal to the sum of the polar classes of $X_v$ for a general translate $X_v$ of $X$.
Suppose $MV_N$ is the multiview variety associated to the camera matrices $P_1, \dots, P_N$. Recall that $MV_N \subset \mathbb{P}^{2N}$ is the projective closure of a subvariety of $\mathbb{C}^{2N}$ which we will call $X$. Let $v = (v_1, v_2, \dots, v_{2N-1}, v_{2N}) \in \mathbb{C}^{2N}$ be a vector. We will now show that $X_v$ is the multiview variety associated to a different collection of cameras. Indeed, let $M_i$ be the $3 \times 3$ matrix that translates the $i$-th image plane by $(v_{2i-1}, v_{2i})$. Then, the variety $X_v$ is the multiview variety associated to the cameras $M_i \cdot P_i$ for $1 \leq i \leq N$.
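The claim that a translate of the affine multiview variety is again a multiview variety can be illustrated for a single camera. The sketch below assumes that $M_i$ is the matrix of a translation of the image plane in homogeneous coordinates, with the translation vector in its last column; this explicit form, the helper names and the sample camera are assumptions made for illustration.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def project(P, q):
    """Image (f/h, g/h) of the world point q under the 3x4 camera matrix P."""
    qh = list(q) + [1]
    f, g, h = (sum(a * b for a, b in zip(row, qh)) for row in P)
    return (Fraction(f, h), Fraction(g, h))


def translation(a, b):
    """Assumed form of M_i: translation of the image plane by (a, b)."""
    return [[1, 0, a], [0, 1, b], [0, 0, 1]]


# Hypothetical camera and world point with integer entries.
P = [[1, 0, 0, -1], [0, 1, 0, -2], [0, 0, 1, -3]]
q = (2, 4, 5)
M = translation(3, -1)

original = project(P, q)            # picture taken with P
shifted = project(matmul(M, P), q)  # picture taken with M * P
```

Multiplying by such a matrix replaces the rows $f$ and $g$ by $f + a h$ and $g + b h$ and leaves $h$ unchanged, so every image point is shifted by $(a, b)$.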
In conclusion, there exists a general configuration of cameras such that the ED degree of the associated multiview variety $MV_N$ is equal to the sum of the polar classes of $MV_N$.