Implementation and characterization of the optical-see-through Maxwellian near-eye display prototype using three-dimensional printing

By presenting always-in-focus virtual images, Maxwellian near-eye displays can remove the vergence accommodation conflict, which causes eye fatigue or dizziness. This paper describes the implementation of an optical-see-through Maxwellian near-eye display prototype made with a three-dimensional printer, together with observations of the virtual images it displays.


Introduction
Near-eye displays (NEDs) present virtual images to their users, and they are crucial devices in augmented reality (AR) and virtual reality (VR) applications. Various methods have been proposed to design imaging configurations for AR and VR NEDs [1][2][3][4][5][6][7][8]. Most such configurations, however, suffer from a vergence accommodation conflict (VAC), which is caused by the difference between the distance at which the eye focuses and the distance of the virtual image plane [9]. The VAC problem should be addressed because it may cause dizziness and nausea in users. Holographic or light field displays can solve the VAC problem by providing true three-dimensional (3D) images, but their complicated optical configurations and specialized content creation requirements are problematic.
Maxwellian displays, also called 'retinal scanning displays,' have sparked great interest because they can relieve the VAC problem in a simple way [10][11][12][13][14]. Maxwellian displays give two-dimensional (2D) virtual images with a large depth of focus (DOF), making it possible to remove the monocular focus cue of the displayed virtual images. The removal of the focus cue addresses the VAC problem by forcing the users to estimate the image distance based only on the binocular disparity or eye convergence. Although Maxwellian displays can address the VAC problem, they also have some limitations. One of these is their restricted eyebox due to the spot-like exit pupil formed by a convex lens or a concave mirror. An earlier study by S.-B. Kim addressed this limitation [13]. It was shown that the use of holographic optical elements (HOEs), in which three concave mirrors are recorded, could expand the eyebox to about three times its previous size. In that study, however, only a proof-of-concept experiment was conducted on an optical bench, and a compact wearable prototype was not fabricated. This paper reports the fabrication of a prototype of the eyebox-enlarged Maxwellian NED using a 3D printing method. The design and fabrication of the prototype as well as the analysis of the displayed images are described in detail in the following sections.

Principle of the Maxwellian displays
Maxwellian displays can address the VAC problem by providing always-in-focus 2D images to a user. Two simple configurations of the traditional Maxwellian display are shown in Figure 1. As can be seen in Figure 1(a), a convex lens is used to create a focal spot in front of the eye. In Figure 1(b), a concave mirror is used for the same purpose. With the lens or mirror in front of the eye, the effective eye pupil becomes a small pinhole. As in a pinhole camera, this optical system displays virtual images with a large DOF, always presenting in-focus images regardless of the focal power of the eye lens.
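The pinhole argument above can be made concrete with the usual geometric small-angle approximation, in which the angular blur equals the aperture diameter (in meters) multiplied by the defocus (in diopters). The following is a minimal sketch of that relation only; the function name and the two pupil diameters are hypothetical illustration values and are not taken from the prototype.

```python
def blur_angle_mrad(aperture_mm: float, defocus_diopters: float) -> float:
    """Geometric angular blur in milliradians.

    Small-angle approximation: blur [rad] = aperture [m] * defocus [D].
    Diffraction is ignored, so this is only a first-order estimate.
    """
    aperture_m = aperture_mm * 1e-3
    blur_rad = aperture_m * abs(defocus_diopters)
    return blur_rad * 1e3

# Hypothetical values chosen for illustration, not measured in the paper.
natural_pupil_mm = 4.0    # assumed ordinary eye pupil diameter
effective_pupil_mm = 0.5  # assumed size of the Maxwellian focal spot
defocus = 1.67            # diopters between the eye focus and the image plane

wide = blur_angle_mrad(natural_pupil_mm, defocus)
narrow = blur_angle_mrad(effective_pupil_mm, defocus)
print(f"natural pupil: {wide:.2f} mrad, effective pinhole: {narrow:.2f} mrad")
```

Under these assumptions the blur scales linearly with the aperture, so shrinking the effective pupil by a factor of eight reduces the defocus blur by the same factor.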
These systems, however, cannot realize AR because the real scene is distorted by the convex lens or blocked by the concave mirror in front of the eye. Another problem is the strictly limited eyebox because the focal spot must be located exactly within the pupil of the eye. To overcome these drawbacks of the Maxwellian display, an HOE is used in the prototype of the proposed eyebox-enlarged Maxwellian NED. HOEs have two crucial properties. First, they can be made to act like any kind of mirror or lens. Second, they are very thin and transparent, allowing the users to see through them. Thanks to these optical characteristics, they can act as a compact and lightweight optical combiner, mixing real-world and virtual images [15,16]. Figure 2 shows a schematic diagram of the Maxwellian display using an HOE attached onto a waveguide. To fabricate this HOE, diffraction gratings must be recorded in the medium. Among the several holographic recording media candidates, a photopolymer film was used in the prototype of the proposed eyebox-enlarged Maxwellian NED because it is easy to handle and process. Figure 3 shows a schematic diagram of the HOE recording setup. One reference beam and three object beams are used to make the HOE act like three concave mirrors, creating three focal spots instead of one. Using the recorded HOE, an optical-see-through Maxwellian NED prototype providing AR images to the users can be implemented. The input beam is diffracted by the HOE, providing virtual images with a large DOF, and the light from the outside passes through the HOE without diffraction, giving an undistorted real-world scene. Moreover, the three recorded concave mirrors expand the horizontal eyebox size to about three times that of the traditional Maxwellian displays. The Maxwellian optical-see-through NED prototype was designed based on this method.

Maxwellian near-eye display prototype fabrication
To fabricate the Maxwellian NED prototype, suitable components were first chosen according to the optical design. Figure 4 shows a schematic diagram of the arrangement of the prototype's optical components. The detailed specifications of the components are shown in Table 1. The light source of the NED is a green laser (Coherent sapphire SF-532), which was also used for the HOE recording. A single-mode optical fiber is used to transmit the laser light to the NED prototype. For eye safety, the laser intensity was kept low during operation [17]. One lens, located in front of the optical fiber, collimates the light from the light source. The two other lenses and the aperture form a 4f-system, which removes unwanted high-order diffraction from the spatial light modulator (SLM). A slanted waveguide was designed to satisfy the total internal reflection (TIR) conditions of the input beam. After the collimated light passes through the SLM and the 4f-system, it goes into the slanted waveguide. When the light is reflected several times in the waveguide via TIR and then reaches the HOE, the beam is diffracted by the HOE, which creates three focal spots so that clear images will always be provided to the user's eye. After the optical elements were chosen, a housing that holds them in place had to be designed. 3D printing is a good way to create housings because it is cheap and can easily make complicated 3D shapes. Considering the sizes and characteristics of the components, the 3D model of the NED housing was designed with a computer-aided design (CAD) tool. Figure 5 shows the rendering image of the bottom plate of the housing and the optical components. With the combined bottom and upper plates, the housing's dimensions are about 99.31 × 177.31 × 37.20 mm. After the design of the 3D model, a proper 3D printer was selected.
Fused deposition modeling (FDM) 3D printers are now frequently used to make 3D models because of their low printing cost and simple mechanism. They build 3D models by melting a filament and stacking it layer by layer, so the printed parts are vulnerable to heat. A heat-resistant polylactic acid (PLA) filament was therefore used to print the housing.
As shown in Figure 6(a), the bottom plate of the housing was successfully printed, and it was then assembled with the optical components. Figure 6(b) shows the completed Maxwellian NED prototype combining the bottom and upper housing plates.
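The TIR condition that the slanted waveguide was designed to satisfy follows from Snell's law: light stays inside the waveguide when its internal incidence angle exceeds the critical angle arcsin(n_outside / n_waveguide). The sketch below checks that condition. The refractive index of 1.5 and the test angles are hypothetical values, since the waveguide material and slant angle are not stated in this section.

```python
import math

def critical_angle_deg(n_waveguide: float, n_outside: float = 1.0) -> float:
    """Critical angle (degrees from the surface normal) for TIR."""
    if n_waveguide <= n_outside:
        raise ValueError("TIR requires n_waveguide > n_outside")
    return math.degrees(math.asin(n_outside / n_waveguide))

def satisfies_tir(incidence_deg: float, n_waveguide: float,
                  n_outside: float = 1.0) -> bool:
    """True if a ray at this internal incidence angle is totally reflected."""
    return incidence_deg > critical_angle_deg(n_waveguide, n_outside)

# Hypothetical waveguide index (typical glass), not taken from Table 1.
n_glass = 1.5
print(f"critical angle: {critical_angle_deg(n_glass):.2f} deg")
print("50 deg guided:", satisfies_tir(50.0, n_glass))
print("30 deg guided:", satisfies_tir(30.0, n_glass))
```

With these assumed values, any slant that keeps the internal incidence angle above roughly 42° would keep the beam guided until it reaches the HOE.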

Experiment results
In this section, the image display characteristics of the prototype of the proposed eyebox-enlarged Maxwellian NED are shown and described. The resolving power of the prototype is also analyzed by measuring the modulation depth according to the spatial frequency of the test images.

Observed images from the Maxwellian NED prototype
Figure 7 shows the schematic diagram of the optical system of the prototype and the captured image of the three focal spots generated by the HOE. The focal-spot picture was taken by positioning a diffuser in the eyebox plane, which was 8.5 cm from the attached HOE. The lateral distance between adjacent focal spots was 3 mm, comparable to the diameter of the human pupil. To verify that the prototype can always provide in-focus virtual images to the user's eye, a camera was positioned at one of the focal spots, and pictures were taken while changing the camera's focus distance. Figure 9(a), (b), and (c) show the pictures that were obtained when the camera was focused at 2.63D (38 cm), 1.67D (60 cm), and 0.00D (infinite plane), respectively. As expected, no matter where the camera was focused, sharp virtual images were always observed from the prototype.
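The focus settings above are quoted both in diopters and in centimeters; the two are related by distance [m] = 1 / power [D]. A small helper, written here only as an illustration (the function names are not from the paper), reproduces the quoted pairs.

```python
import math

def diopters_to_cm(power_d: float) -> float:
    """Focus distance in centimeters for a focus setting in diopters."""
    if power_d == 0:
        return math.inf  # 0.00D corresponds to focus at infinity
    return 100.0 / power_d

def cm_to_diopters(distance_cm: float) -> float:
    """Focus setting in diopters for a focus distance in centimeters."""
    return 100.0 / distance_cm

for power in (2.63, 1.67, 0.00):
    print(f"{power:.2f}D -> {diopters_to_cm(power):.1f} cm")
```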

Analysis of the virtual images
As the Maxwellian display provides virtual images with a large but not infinite DOF, the clearest virtual image exists at a certain distance from the camera. As the aim was to provide a virtual image with a DOF from 3.33D to 0.00D, the prototype was adjusted to form the virtual image plane at 1.67D, which is the dioptric midpoint between 3.33D and 0.00D.
With these settings, the NED prototype was analyzed to determine how large a DOF it provides as a Maxwellian display. An experiment was performed to determine the highest spatial frequency that the NED prototype can express when the camera focus deviates from the image plane at 1.67D. Twenty-seven images with different spatial frequencies ranging from 1 to 16 line pairs (lp)/mm, measured based on the SLM display size (11.176 × 8.382 mm), were created. Figure 10(a) shows zoomed input images with four pairs of lines, and Figure 10(b) shows an observed image from the prototype. The observed color images were converted to grayscale, as shown in Figure 10(c), and the pixel intensities of the images were analyzed. In the analysis, the 145 pixel rows marked with a red box in Figure 10(c) were averaged for higher reliability. Using these values, a graph was plotted, as shown in Figure 10(d), where the x-axis is the horizontal pixel position and the y-axis is the average intensity of each pixel column.
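The grayscale conversion and row averaging described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the BT.601 luma weights are an assumption (the paper does not state which grayscale conversion was used), and the tiny synthetic image stands in for a captured photograph.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to grayscale intensities.

    Assumption: ITU-R BT.601 luma weights; the paper does not specify them.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def column_profile(gray_image, row_start, row_end):
    """Average intensity of each pixel column over rows [row_start, row_end)."""
    rows = gray_image[row_start:row_end]
    if not rows:
        raise ValueError("the selected row range is empty")
    return [sum(column) / len(rows) for column in zip(*rows)]

# Synthetic 3-row image with alternating bright and dark columns.
white, black = (255, 255, 255), (0, 0, 0)
image = [[white, black, white, black] for _ in range(3)]

profile = column_profile(to_gray(image), 0, 3)
print(profile)
```

In the experiment the averaged range would cover the 145 rows inside the marked box; averaging over rows suppresses pixel noise without changing the horizontal line pattern.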
For the evaluation of the resolving power of the prototype, the modulation depth of the displayed images was measured at different spatial frequencies. Virtual images with 27 different spatial frequencies were displayed by the prototype and captured at eight different camera focus distances, giving a total of 216 captured images for analysis. As each intensity graph contained four bright and three dark areas, the average of the four maxima and the average of the three minima in these areas were used to obtain the modulation depth. Figure 11 shows the results of the analysis. The measured modulation depth was plotted according to the different spatial frequencies of the input images. As can be seen in this graph, the prototype of the proposed eyebox-enlarged Maxwellian NED could provide images of 6 lp/mm at all camera focus distances, keeping the modulation depth higher than 70%. With the same modulation depth criterion, 8 lp/mm images could be displayed with the camera focus from 2.00D to the infinite plane, and 10 lp/mm images could be clearly seen with the camera focus from 2.00D to 1.00D. From the 12 lp/mm spatial frequency upward, the modulation depth did not exceed 70% at any camera focus. As a result, the prototype of the proposed eyebox-enlarged Maxwellian NED can always provide in-focus virtual images with a 6 lp/mm spatial frequency at all eye focus distances.
Figure 11. Graph of the modulation depth according to the spatial frequency of the input image.
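The modulation depth measurement can be sketched as follows. The text specifies that the four maxima and three minima were averaged but does not give the formula, so the Michelson definition (I_max - I_min) / (I_max + I_min) is assumed here. The half-range threshold used to separate bright and dark areas and the sample profile are likewise hypothetical.

```python
from itertools import groupby
from statistics import mean

def modulation_depth(profile):
    """Modulation depth of a line-pair intensity profile.

    Assumption: Michelson contrast on the averaged peaks and valleys.
    Bright and dark areas are separated at the half-range threshold,
    and dark areas outside the outermost bright lines are ignored.
    """
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return 0.0  # a flat profile carries no modulation
    threshold = (hi + lo) / 2.0
    runs = [(bright, list(values))
            for bright, values in groupby(profile, key=lambda v: v > threshold)]
    while runs and not runs[0][0]:
        runs.pop(0)   # drop leading background
    while runs and not runs[-1][0]:
        runs.pop()    # drop trailing background
    peaks = [max(values) for bright, values in runs if bright]
    valleys = [min(values) for bright, values in runs if not bright]
    if not valleys:
        raise ValueError("need at least two bright areas")
    i_max, i_min = mean(peaks), mean(valleys)
    return (i_max - i_min) / (i_max + i_min)

# Hypothetical profile with four bright and three dark areas.
sample = [10, 200, 210, 40, 30, 190, 200, 50, 220, 60, 40, 180, 5]
depth = modulation_depth(sample)
print(f"modulation depth: {depth:.3f}, passes 70% criterion: {depth > 0.70}")
```

Applying such a function to each of the 216 captured profiles and plotting the result against spatial frequency would give one curve per camera focus, which is the form of the graph in Figure 11.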
Field of view (FOV) is also an important specification for evaluating AR NEDs. The theoretical FOV of the prototype of the proposed eyebox-enlarged Maxwellian NED was calculated as 9.4° × 9.4° because the laser beam diameter in the HOE recording process was 1.4 cm and the recorded focal length of the HOE was 8.5 cm. Only a 5.7° × 4.3° FOV, however, was actually used in the experiment because of the poor quality of the HOE at the boundary of the recorded region. Note that the FOV of the proposed eyebox-enlarged Maxwellian NED is not fundamentally limited to this small value because it depends only on the HOE recording setup. Note also that the proposed eyebox-enlarged Maxwellian NED has a relaxed trade-off relationship between the eyebox and the FOV [13,18]. The size of the individual eyebox associated with a single focal spot of the HOE has a trade-off relationship with the FOV due to the etendue conservation law, as with conventional NEDs. The number of replications of the individual eyebox, however, is determined only by the number of focal spots of the HOE in the proposed eyebox-enlarged Maxwellian NED. Therefore, the total eyebox area can be enlarged well beyond the conventional trade-off limit at a given FOV. In the prototype of the proposed eyebox-enlarged Maxwellian NED, a 9 × 3 mm total eyebox was achieved by using three focal spots of the HOE, while the individual eyebox was about 3 × 3 mm in size.
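The theoretical FOV and the total eyebox quoted above can be checked with simple geometry. The sketch assumes that the FOV is the full angle subtended by the recorded HOE aperture at the focal spot, 2·arctan(D / 2f); this formula is inferred from the quoted beam diameter and focal length rather than stated explicitly in the text.

```python
import math

def full_fov_deg(aperture_mm: float, focal_length_mm: float) -> float:
    """Full angle (degrees) subtended by an aperture seen from its focal spot."""
    return 2.0 * math.degrees(math.atan((aperture_mm / 2.0) / focal_length_mm))

beam_diameter_mm = 14.0   # 1.4 cm recording beam diameter (from the text)
focal_length_mm = 85.0    # 8.5 cm recorded HOE focal length (from the text)
fov = full_fov_deg(beam_diameter_mm, focal_length_mm)

# Eyebox replication: three focal spots, each with a 3 mm wide eyebox.
num_spots = 3
individual_eyebox_mm = 3.0
total_eyebox_mm = num_spots * individual_eyebox_mm

print(f"theoretical FOV: {fov:.1f} deg")
print(f"total horizontal eyebox: {total_eyebox_mm:.0f} mm")
```

The FOV depends only on the recording aperture and focal length, while the total eyebox grows with the number of recorded focal spots, which is the relaxed trade-off described above.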

Conclusion
An optical-see-through Maxwellian NED prototype was implemented using an FDM 3D printer with a PLA filament. The experimental analysis confirmed that virtual images with a 6 lp/mm spatial frequency can be clearly observed with a modulation depth of over 70% while the eye or camera focus changes from 3.33D to 0.00D. Within a narrower eye focus range, the implemented prototype can display images with higher spatial frequencies at the same modulation depth. These results demonstrate that the Maxwellian display can be fabricated as a compact prototype rather than remaining a proof of concept on an optical bench.

Disclosure statement
No potential conflict of interest was reported by the authors.