Evaluation of the legibility of the on-screen contents of in-vehicle transparent displays

With regard to the on-screen contents of transparent displays utilized inside vehicles, legibility is one of the most important factors to be considered. This paper proposes a quantitative legibility evaluation model of in-vehicle transparent displays. Human visual experiments were performed based on the simulation of a perceived image. The results of the visual experiments served as qualitative evaluation data of legibility, and were utilized to determine the weights in the proposed model through the multiple regression technique. The experiment results indicated that the proposed model obtained higher Pearson correlation coefficient values than those obtained in the previous works.


Introduction
Transparent displays can simultaneously provide on-screen information and background information from behind the display, which makes them suitable for various applications. One candidate application is the use of transparent displays inside vehicles. For example, instead of a head-up display system, transparent displays can be utilized for providing driving-related information. In addition, contents describing the scenery or buildings visible through the window of a vehicle can be displayed on the transparent displays. In these applications, it is important for the on-screen information to be clearly recognizable. There are two different criteria for measuring the degree of perception of on-screen contents. One is readability, which often refers to the ease with which words, sentences, and/or paragraphs can be recognized [1,2]. The other is legibility, which indicates the ease with which symbols or single characters presented in a noncontextual format can be distinguished [1,2]. This paper focuses on the legibility of the on-screen information of transparent displays.
The legibility of the on-screen information of transparent displays depends on the perceived contrast of the text, symbols, and through-screen contents. It is affected by the illuminance, transmittance, and (maximum) luminance of transparent displays, etc. In this paper, illuminance refers to that outside the vehicle. Transmittance is the ratio of the luminous flux transmitted through transparent displays [3]. The (maximum) luminance of transparent displays affects the luminance level of the on-screen text. The legibility of transparent displays also depends on the size of the text or symbol. In this paper, all of these factors are considered to develop a quantitative evaluation model of the legibility of the on-screen contents of in-vehicle transparent displays.
A few studies have developed quantitative models for evaluating the legibility or readability of displays [2,4-7], but most of them were aimed at traditional displays and HUDs (head-up displays). Studies on transparent displays are scarce. In [2,4], human visual experiments were first performed. The contrast between the text and the background was chosen in constructing the legibility evaluation model. The values of the weights in the legibility evaluation model were determined by applying regression techniques. The method used in [2] was designed for the EPD (electronic paper display), and [4] utilized CRTs (cathode ray tubes). In [2,4], the effects of the viewing environments on the legibility were not considered. The RVP (relative visual performance) model quantifies the readability of displays [5,6]. In [5], conventional LCDs (liquid crystal displays) and transflective displays were utilized in the experiments. The contrast between the text and the background as well as the size of the text were considered. In addition, the effect of the ambient illumination on the readability was reflected in the RVP model. The PJND (perceptible just-noticeable difference) model quantifies the readability of an in-vehicle HUD [7]. In the PJND model, the effect of the text size is considered during the calculation of the relative contrast sensitivity [8].
This paper proposes a method of quantitatively evaluating the legibility of the on-screen contents of in-vehicle transparent displays. Human visual experiments were first performed. The results of the visual experiments served as the qualitative evaluation data of the legibility in this study. To perform visual experiments, a set of perceived images of the on-screen contents of in-vehicle transparent displays was needed. Various factors affect the legibility of the on-screen contents of in-vehicle transparent displays. They include the viewing illuminance, the transmittance and (maximum) luminance of the transparent displays, and the size of the text and/or symbols, among others. In this study, simulation of a perceived image was performed to obtain perceived images with varying degrees of legibility. The different values of the aforementioned factors affecting the legibility served as the parameters in the simulation. The examination of the effects of the aforementioned factors allowed the selection of the set of components that make up the proposed model equation. The values of the weights in the proposed legibility evaluation model were obtained using the multiple regression technique. The performance of the proposed model was compared with those from previous works. This paper is organized as follows. In section 2, the performed simulations of perceived images of the on-screen contents of in-vehicle transparent displays are described. In section 3, the human visual experiments that were performed to obtain qualitative evaluation data of the legibility are described, the four major factors affecting the legibility are examined, and the construction of the proposed legibility evaluation model is explained. In section 4, the performance evaluations of the proposed model are described. Finally, section 5 concludes this paper.

Simulation of perceived images of the on-screen contents of in-vehicle transparent displays
This paper proposes a method of simulating perceived images of the on-screen contents of in-vehicle transparent displays to evaluate the legibility of such contents. For the simulation, it was assumed that transparent OLED (organic light-emitting diode) displays are used as in-vehicle transparent displays. The proposed simulation method generates perceived images with varying degrees of legibility. The simulated images are utilized in human visual experiments to obtain qualitative evaluation data of the legibility. The first step in the simulation is to identify the major factors affecting the legibility so that they can be considered in the simulation process.

Major factors affecting the legibility
Four different factors affecting the legibility can be described using the examples of perceived images of the on-screen contents of a transparent display illustrated in Figure 1. Figure 1(a) shows a through-screen image, and Figure 1(b) shows an example of an on-screen content. Figure 1(c) shows perceived images of the on-screen contents of a transparent display with different illuminance levels. The lower the illuminance is, the lower the background brightness. It can be noticed that the legibility is higher in the right image of Figure 1(c) because the on-screen content appears much brighter than the background. Figure 1(d) shows examples of perceived images with different transmittances. The left image exhibits the case with higher transmittance. Changes in legibility can be clearly noticed in Figure 1(d). Figure 1(e) shows perceived images of the on-screen contents of transparent displays with different maximum luminance values. The left image is the case when the maximum luminance is lower than that in the case in the right image. The on-screen content appears brighter when the maximum luminance of the display increases. Figure 1(f) shows the effect of the size of the on-screen content. This paper simulates perceived images of the on-screen contents of in-vehicle transparent displays by changing the values of the aforementioned factors.

Simulation of perceived images
Simulation of perceived images is performed in two steps. First, a perceived image is simulated with different values of the aforementioned factors. Suppose that a perceived image is generated under the illuminance value of 15,000 lux. The legibility of the perceived image can be properly evaluated when it is viewed under ambient illumination yielding the illuminance of 15,000 lux. In this study, it was assumed that the illuminance is measured by placing an illuminance meter perpendicular to the ground in front of the windshield of the vehicle. Second, to perform visual experiments with the simulated images generated under different illuminance conditions at a (single) reference viewing environment, an extra simulation for conversion to the reference viewing environment is needed. This process is performed based on the CIECAM02 model [9]. Figure 2 shows a flowchart of the proposed simulation method. First, the CIEXYZ values of the see-through image are calculated according to the illuminance and transmittance [10]. To obtain the see-through image, blur simulation is first performed [11]. The spectral transmission through a transparent display is measured and applied, and simulation for the depth of field is employed. In addition, the RGB values of the on-screen image are converted into CIEXYZ values through the colorimetric characterization of the display. This paper utilizes the LUT (look-up table)-based characterization method [12]. When the RGB-to-CIEXYZ conversion of the see-through and on-screen images is completed, the CIEXYZ values of both images are added. These added values represent the CIEXYZ values of the perceived image of an on-screen content of the in-vehicle transparent display under a given illuminance value. The simulated image is then converted to a reference viewing condition for the visual experiments. The color appearance model CIECAM02 is used for viewing condition conversion [9]. In CIECAM02, the QMH color space is used.
The CIEXYZ values of the perceived image of an on-screen content of the in-vehicle transparent display are converted to QMH values by CIECAM02. The measured illuminance and maximum luminance values of the transparent display are used in this process. Then the QMH values are converted to CIEXYZ values through the reverse process of CIECAM02. The reference viewing illuminance and maximum luminance values of the experimental display are used in this process. Figure 3 shows examples of the simulation results.
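The additive step of the flowchart, in which the CIEXYZ values of the see-through and on-screen images are summed, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `perceived_xyz`, the per-pixel tuple layout, and the sample values are hypothetical, and the see-through values are assumed to have already been computed from the illuminance and transmittance.

```python
from typing import List, Tuple

XYZ = Tuple[float, float, float]


def perceived_xyz(see_through: List[XYZ], on_screen: List[XYZ]) -> List[XYZ]:
    """Add the CIEXYZ values of the see-through and on-screen images pixel by pixel."""
    if len(see_through) != len(on_screen):
        raise ValueError("both images must have the same number of pixels")
    return [
        (sx + ox, sy + oy, sz + oz)
        for (sx, sy, sz), (ox, oy, oz) in zip(see_through, on_screen)
    ]


# Hypothetical two-pixel images (illustrative values only).
background = [(10.0, 12.0, 8.0), (30.0, 32.0, 28.0)]  # see-through image
content = [(0.0, 0.0, 0.0), (50.0, 55.0, 5.0)]        # on-screen image
mixed = perceived_xyz(background, content)
```

The first pixel carries no on-screen content, so it keeps the background values; the second pixel is brightened by the emitted light of the display.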

Proposed legibility evaluation model
The proposed legibility evaluation model was developed as follows. First, perceived images of the on-screen contents of in-vehicle transparent displays were generated through the simulation method described in section 2. Second, human visual experiments were performed using the simulated images. Third, by examining the effects of the aforementioned factors, the set of components that make up the proposed model equation was selected. Finally, the values of the weights in the proposed legibility evaluation model were determined using the multiple regression technique.

Visual experiments
Figure 4(a) shows an example of an on-screen image that was used in the experiment. The hue of the content was set as yellow, which has high visibility in both daytime and nighttime [13]. Images of the outside taken from the driver's position using a camera were utilized as the background scene. Figure 4(b) and (c) show examples of the daytime and nighttime background scenes, respectively. In this study, four different illuminance values were utilized for the perceived-image generation via simulation. The low- and medium-illuminance environments were represented by 50 and 3,000 lux, respectively [13], and the 6,000 and 15,000 lux illuminance values were utilized for the high-illuminance environments [13]. The 15,000, 6,000, and 3,000 lux illuminance values can be regarded as daytime illuminance, and 50 lux as nighttime illuminance. Four different maximum luminance values of in-vehicle transparent displays were utilized for each illuminance value. The selected maximum luminance values were determined experimentally and lay within a range where no glare due to the display luminance occurs. The selected transmittance values were 2.3, 7.5, 15.0, 26.3, 37.5, and 48.8%. They were chosen to generate perceived images covering a wide range of legibility. In this paper, the size of the on-screen content is defined by the height of the content in the arcmin unit. The sizes of the on-screen contents that were used in this study were 225, 150, and 75 arcmin. They correspond to 4.5, 3.0, and 1.5 cm, respectively. Table 1 provides the configuration of the simulation images for the visual experiments. The number of simulated images was 576. They were generated through the procedure illustrated in Figure 2. First, the simulation images satisfying the conditions in Table 1 were obtained. All the human visual experiments were performed in a darkroom, which was chosen as the reference viewing condition in this study. Therefore, viewing condition conversion was applied to the simulated images.
Human visual experiments were performed to obtain qualitative evaluation data of the legibility. A category judgment method with a 5-score scale was used [14]. To mimic a real driving situation where the driver is watching a transparent navigation display, the participants were asked to evaluate the legibility of the transparent navigation display's on-screen contents by watching the simulated images for a short time only. The evaluation criteria that were used for the visual experiments are listed in Table 2. Score 3 represents acceptable legibility for the on-screen contents of in-vehicle transparent displays. The visual experiments were performed in the darkroom. The participants had a dark adaptation time of about 5 min. Simulated images were displayed on a 55-inch OLED TV. The distance between the display and the participant was set to 70 cm, which represents the typical distance between the driver and the windshield of a vehicle. When the FOV (field of view) value is too narrow or too wide, it may affect the legibility of the on-screen contents. Thus, in this study, the FOV value was set at 10.5° so that it would not influence the degree of legibility [13]. Ten male and ten female observers aged 20-28 years participated in the experiment. All of them had normal vision, without color blindness, color weakness, or astigmatism. Their visual acuity ranged from 0.8 to 1.2.
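The relation between the content heights in centimeters and the sizes in arcmin at the 70 cm viewing distance can be checked with a short sketch. It assumes a flat screen perpendicular to the line of sight with the content centered on it, which the text does not state; the function name `visual_angle_arcmin` is hypothetical. Under these assumptions the computed angles fall within about 2% of the stated 225, 150, and 75 arcmin, which suggests that the stated sizes are rounded.

```python
import math


def visual_angle_arcmin(height_cm: float, distance_cm: float) -> float:
    """Visual angle subtended by an object of the given height, in arcminutes."""
    angle_rad = 2.0 * math.atan(height_cm / (2.0 * distance_cm))
    return math.degrees(angle_rad) * 60.0


VIEWING_DISTANCE_CM = 70.0  # display-to-participant distance used in the experiments

for height_cm in (4.5, 3.0, 1.5):
    angle = visual_angle_arcmin(height_cm, VIEWING_DISTANCE_CM)
    print(f"{height_cm} cm -> {angle:.1f} arcmin")
```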

Selection of components of the proposed model formula
In this study, the contrast of the perceived image was calculated using equation (1), and was utilized for the proposed evaluation model.
where $L_{on}$ is the average luminance of the on-screen contents on the simulated perceived image, and $L_{bk}$ is the average luminance of the remaining areas on the simulated perceived image. The change in legibility due to a combination of the illuminance, transmittance, and maximum luminance of the display may be represented by the contrast defined in equation (1). Figure 5 shows a graph of the qualitative evaluation data of the legibility obtained from the visual experiments vs. the calculated contrast value in equation (1). The graph in Figure 5 indicates that as the contrast increases, the legibility increases as well in the form of a log curve, as marked by the red dotted line. All the sample points in Figure 5 have the same content size but different illuminance, transmittance, and maximum display luminance values. In the graph, the sample points with a fixed contrast value exhibit a wide range of legibility scores. In other words, the contrast in equation (1) alone cannot accurately express the change in legibility due to the differences in the illuminance, transmittance, and maximum luminance of the display. This means that in addition to the contrast in equation (1), other components should be added to the proposed model formula. Figure 6 shows the changes in legibility due to the changes in illuminance. Figures 5 and 7 show the sample points with the same on-screen content size. Figure 6 shows that as the illuminance increases, the legibility is reduced even though the contrast values remain the same. This is mainly because the CSF (contrast sensitivity function) of the human visual system depends on the illuminance [15]. Based on this finding, illuminance is added to the proposed legibility model formula. Figure 7 shows the variations in legibility when two selected components, the contrast in equation (1) and illuminance, remain constant. In addition, the sizes of the contents for the sample points in Figure 7 are the same.
The vertical axis represents the qualitative evaluation values of legibility obtained from the visual experiments. The horizontal axis represents the values of the transmittance and maximum luminance of the transparent displays. It can be noticed from both Figure 7(a) and (b) that the legibility increases as the transmittance and maximum luminance of the transparent display increase when the values of the contrast in equation (1), the illuminance, and the size of the content remain the same. This finding implies that the contrast in equation (1) and illuminance may not be sufficient to accurately represent the degree of legibility. Based on the results in Figure 7, the transmittance and maximum luminance of transparent displays can serve as candidates for an additional component of the proposed model. An increase in transmittance results in the improvement of the legibility only when the contrast in equation (1) and the illuminance remain the same. Otherwise, the legibility decreases as the transmittance value increases. This can be explained by the examples illustrated in Figure 1(d). The left image in the said figure shows high transmittance, and the right image shows low transmittance. The lower the transmittance is, the darker the background, because the amount of background light transmitted through the transparent display is lowered. Therefore, the legibility increases, as shown in the right image in Figure 1(d). An increase in the maximum luminance of the transparent display, however, improves the legibility without such preconditions. Therefore, the maximum luminance of a transparent display is selected as the third component of the proposed legibility evaluation model. Legibility is also affected by the size of the on-screen content. Figure 8 illustrates the relationship between the legibility and the size of the on-screen content.
As the size of the content increases, the legibility also increases in the form of a log curve, even though all the other factors remain fixed. Therefore, the size of the on-screen content is selected as the final component added to the proposed legibility evaluation model formula.

Derivation of the legibility evaluation model
In this paper, the legibility of the on-screen contents of in-vehicle transparent displays is modeled by the following equation (2).
where $x_0$ is the contrast defined by equation (1), $x_1$ is the size of the on-screen content in the arcmin unit, $x_2$ is the maximum luminance of the transparent display in the cd/m$^2$ unit, and $x_3$ is the illuminance in lux. According to the findings presented and explained in section 3.2, the log relationship is assumed for the contrast, size of the on-screen content, and maximum luminance of the transparent display. In equation (2), $a_i$, $i = 1, 2, \ldots, 5$ are constant weights to be estimated using the regression technique.
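How five weights can be estimated by multiple linear regression is sketched below. Several things here are assumptions rather than the paper's method: the model form (base-10 log terms for the contrast, content size, and maximum luminance, a linear illuminance term, and a constant), the least-squares solver (a modified Gram-Schmidt QR factorization), the function names, and the synthetic data, which are generated from made-up weights and are not the experimental results.

```python
import math
from itertools import product
from typing import List, Sequence


def design_row(contrast: float, size_arcmin: float,
               max_luminance: float, illuminance: float) -> List[float]:
    """Regressors of the ASSUMED model form, plus a constant term."""
    return [
        math.log10(contrast),       # x0: contrast (log term)
        math.log10(size_arcmin),    # x1: content size in arcmin (log term)
        math.log10(max_luminance),  # x2: maximum luminance in cd/m^2 (log term)
        illuminance,                # x3: illuminance in lux (assumed linear)
        1.0,                        # constant term
    ]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def fit_weights(rows: Sequence[Sequence[float]], scores: Sequence[float]) -> List[float]:
    """Least-squares weights via a modified Gram-Schmidt QR factorization."""
    k = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(k)]
    q: List[List[float]] = []            # orthonormal basis vectors
    r = [[0.0] * k for _ in range(k)]    # upper-triangular factor
    for j in range(k):
        v = columns[j][:]
        for i, qi in enumerate(q):
            r[i][j] = dot(qi, v)
            v = [a - r[i][j] * b for a, b in zip(v, qi)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("regressors are linearly dependent")
        r[j][j] = norm
        q.append([a / norm for a in v])
    qty = [dot(qi, scores) for qi in q]
    weights = [0.0] * k
    for i in range(k - 1, -1, -1):       # back substitution on R w = Q^T y
        tail = sum(r[i][j] * weights[j] for j in range(i + 1, k))
        weights[i] = (qty[i] - tail) / r[i][i]
    return weights


# Synthetic, noise-free data from made-up weights (NOT the paper's values).
true_weights = [2.0, 1.5, 0.8, -0.0001, -3.0]
grid = product([1.5, 3.0, 10.0], [75.0, 150.0, 225.0],
               [100.0, 300.0, 500.0], [50.0, 3000.0, 6000.0, 15000.0])
rows = [design_row(*combo) for combo in grid]
scores = [dot(true_weights, row) for row in rows]
weights = fit_weights(rows, scores)
```

Because the synthetic scores contain no noise, the fitted weights recover the made-up ones; with real observer scores the fit would instead minimize the squared residuals.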

Performance evaluation of the proposed legibility model
The weights in the proposed model in equation (2) were determined using a training set of simulation images. Cross-validation for weight estimation was first performed [16]. The selected components of the proposed model were verified by examining the value of the t-statistic [17] and the p-value [18]. The accuracy of the proposed model was evaluated by calculating the values of the Pearson correlation coefficient and RMSE (root mean square error) of the proposed legibility evaluation model. These performance indices were also calculated for the four existing methods of evaluating the legibility or readability of the on-screen contents of displays [2,4-7]. In addition, the prediction probability of the acceptable legibility was calculated based on the acceptable legibility criterion [19].

Validity of the proposed model
The weights in the proposed legibility evaluation model defined in equation (2) were determined through the multiple linear regression technique. In this study, the number of simulated images used in the human visual experiments was 576. They were utilized for model construction and performance evaluation. As such, they were divided into two groups: the training and testing sets. The training set used for determining the proposed model consisted of 432 simulated images (75% of the 576 simulated images). The remaining 144 samples (25% of the 576 simulated images) served as testing samples for verifying the performance of the proposed model. The performance of the proposed model depends on the estimated weights, which are determined using the training samples. In this study, training samples were randomly selected from the 576 simulated images. The random selection of the training samples was repeated 10,000 times. Therefore, there were 10,000 different sets of training and testing samples. For each of the 10,000 training sets, the five weights in equation (2) were determined through the multiple linear regression technique. Thus, 10,000 model equations were obtained. To cross-validate the proposed model construction method, the Pearson correlation coefficient between the calculated value of the proposed model and the qualitative evaluation data of legibility was calculated on the testing set of each of the 10,000 models. Figure 9 shows the histogram of the Pearson correlation coefficient for the 10,000 sets of testing images. The average Pearson correlation coefficient value was 0.90, and the standard deviation was 0.01. The weights estimated from the 10,000 training sets were averaged. The proposed legibility evaluation model with the average weights is expressed by equation (3).
Figure 9. Histogram of the Pearson correlation coefficients.

The validity of the selection of the components of the proposed legibility evaluation model was verified using a t-statistic and a p-value, which are mainly used in statistical analysis [16,17]. It is desirable that all the components satisfy the verification criteria that the absolute value of the t-statistic is greater than 2 [17] and the p-value is less than 0.05 [18]. Table 3 presents the verification results. It can be seen that all the components that were selected for the proposed legibility evaluation model satisfy the verification criteria.
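The repeated random 75%/25% partition and the Pearson correlation used for scoring in this subsection can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `fit` and `predict` are hypothetical callables standing in for the regression of equation (2), the demonstration uses a one-variable line fit on synthetic data, and only 100 repetitions are run instead of 10,000 to keep the example short.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def random_split(n_samples: int, train_fraction: float,
                 rng: random.Random) -> Tuple[List[int], List[int]]:
    """Randomly partition sample indices into training and testing sets."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def repeated_holdout(features: Sequence, scores: Sequence[float],
                     fit: Callable, predict: Callable, n_repeats: int,
                     train_fraction: float = 0.75, seed: int = 0) -> List[float]:
    """Refit on each random training set; return the testing-set Pearson values."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_repeats):
        train_idx, test_idx = random_split(len(scores), train_fraction, rng)
        model = fit([features[i] for i in train_idx], [scores[i] for i in train_idx])
        predicted = [predict(model, features[i]) for i in test_idx]
        correlations.append(pearson(predicted, [scores[i] for i in test_idx]))
    return correlations


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Stand-in model: one-variable least-squares line (slope, intercept)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def predict_line(model: Tuple[float, float], x: float) -> float:
    slope, intercept = model
    return slope * x + intercept


# Synthetic data with 576 samples (illustrative only, not the experimental data).
data_rng = random.Random(1)
xs = [i / 100 for i in range(576)]
ys = [1.0 + 0.5 * x + data_rng.gauss(0.0, 0.3) for x in xs]
correlations = repeated_holdout(xs, ys, fit_line, predict_line, n_repeats=100)
print(statistics.fmean(correlations), statistics.stdev(correlations))
```

Collecting one correlation per repetition is what allows a histogram, mean, and standard deviation of the kind reported above to be computed.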

Accuracy of the proposed model
The accuracy of the proposed legibility evaluation model was compared with those of the following models proposed for the evaluation of the legibility or readability of the on-screen contents of displays: two models based on regression [2,4], the RVP model [5,6], and the PJND model [7]. For the clarity of the explanation, the regression-based models in [2] and [4] are called 'RM1' and 'RM2' in this paper, respectively. For a fair comparison with the proposed model, the weights in RM1 and RM2 were estimated using the training set utilized in this study. The same procedure applied for the proposed model was utilized for RM1 and RM2. To avoid dependency on the selection of training samples, the weights of RM1 and RM2 were estimated based on each of the 10,000 randomly selected sets of training images. Their values were averaged. Equations (4) and (5) represent the model equations for RM1 and RM2, respectively.
$$\mathrm{Legib}_{\mathrm{RM1}} = 3.38 \cdot \log(L_{con}) - 0.78 \cdot C_{pd} + 5.13 \tag{4}$$

$$\mathrm{Legib}_{\mathrm{RM2}} = 13.29 \cdot \frac{\log(\Delta L + 1)}{\log(L_b + 4)} + 2.24 \cdot \log A - 13.85 \tag{5}$$

In equation (4), $L_{con}$ is a contrast measure defined in [2], and $C_{pd}$ is the content size expressed in the CPD (cycles per degree) unit. In equation (5), $\Delta L$ is the luminance difference between the background and the content, and $L_b$ is the background luminance. In [4], the $\log(\Delta L + 1)/\log(L_b + 4)$ term is defined to represent the contrast. In equation (5), $A$ is the content size expressed by the height of the content in the arcmin unit.
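Equations (4) and (5) can be evaluated directly, as in the sketch below. The base of the logarithm is not stated in the text, so base 10 is assumed here; this choice affects the individual log terms, although the ratio of logarithms in equation (5) is independent of the base. The function and parameter names are hypothetical.

```python
import math


def legib_rm1(l_con: float, c_pd: float) -> float:
    """Equation (4): RM1 score from the contrast measure and size in CPD."""
    return 3.38 * math.log10(l_con) - 0.78 * c_pd + 5.13


def legib_rm2(delta_l: float, l_b: float, size_arcmin: float) -> float:
    """Equation (5): RM2 score from luminance difference, background luminance, size."""
    contrast_term = math.log10(delta_l + 1.0) / math.log10(l_b + 4.0)
    return 13.29 * contrast_term + 2.24 * math.log10(size_arcmin) - 13.85


# Illustrative inputs only: luminance difference 9, background luminance 6, 100 arcmin.
example_score = legib_rm2(9.0, 6.0, 100.0)
```

With these inputs both logarithms in the contrast term equal 1 and the size term equals 2, so the score reduces to 13.29 + 4.48 − 13.85.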
The accuracy of the proposed model and those of the two regression-based models RM1 and RM2 can be compared based on the Pearson correlation coefficient values and the RMSE between the calculated value of the model and the qualitative evaluation data of legibility. The accuracy of the proposed model and those of the two remaining models (the RVP and PJND models), however, can be compared based only on the Pearson correlation coefficient values because they are not regression-based models. Figure 10 shows the RMSE results for the proposed model and the regression-based models RM1 and RM2. The average RMSE values of the training and testing sets for the proposed model were both 0.43. The average RMSE values for the training and testing sets for the RM1 model were 0.53 and 0.54, respectively. The corresponding values for RM2 were both 0.51. The main reason that the proposed model yielded smaller RMSE values is that the effects of the viewing environment on the legibility are considered in neither RM1 nor RM2.
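For reference, the RMSE between model outputs and observer scores can be computed as in the following minimal sketch; the function name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between model outputs and observer scores."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative scores on the 5-score scale (not experimental data).
model_scores = [2.0, 4.0]
observer_scores = [1.0, 5.0]
error = rmse(model_scores, observer_scores)  # each prediction is off by one score
```

Because the RMSE is expressed in the same unit as the scores, a value of 0.43 corresponds to a typical deviation of less than half a category on the 5-score scale.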
The accuracy of the proposed model was compared with those of the four existing models by examining the Pearson correlation coefficient values between the calculated measure of the model and the qualitative evaluation data of legibility obtained from the visual experiments. Figure 11 shows the results of the Pearson correlation coefficients calculated and averaged for the training and testing sets. In addition, the Pearson correlation coefficients are listed in Table 4. The Pearson correlation coefficient of the proposed model was the highest among the five models that were tested in this study. For the RVP and PJND models, this is because they do not properly account for the change in legibility with the content size. In the RVP model, the content size means a Landolt ring gap. The size of the on-screen content in the proposed model can be compared with that in the RVP model by converting the ranges of the Landolt ring gap in [6] to the height of the contents utilized in the proposed model. The content size in the proposed model is much bigger than that in [6]. In the PJND model, the content size is considered through the term called 'RCS (relative contrast sensitivity)' [8]. The slope of RCS changes based on the content size, but the RCS change considerably decreases when the luminance of the display increases.

Prediction probability of acceptable legibility
The qualitative evaluation data of the legibility of the on-screen contents of in-vehicle transparent displays were obtained through human visual experiments employing the category judgment method with a 5-score scale [14]. As listed in Table 2, if the average score from the visual experiments is greater than score 3, it means 'moderately visible (acceptable),' and it can be judged that the legibility is acceptable for in-vehicle application. In addition to the RMSE and Pearson correlation coefficient, the prediction probability of acceptable legibility can be defined as an additional performance index of the legibility evaluation models. The confusion matrix can be utilized for model verification [19]. Figure 12 shows a confusion matrix. If the qualitative evaluation data from the visual experiments and the calculation results from the evaluation models are both greater than score 3 or both less than score 3, a count of 'correct' is obtained because the result of the acceptability judgment is the same. Otherwise, a count of 'error' is obtained because the result of the acceptability judgment is different. The acceptability prediction performance can be expressed as a probability using 'correct' and 'error,' through equation (6).
$$\text{Prob. of acceptability}\,(\%) = \frac{\text{Correct}}{\text{Correct} + \text{Error}} \times 100 \tag{6}$$

Figure 13 illustrates the average probability of acceptability for the training and testing sets of the proposed model. The average probability of acceptability is 91% both in the training and testing sets. These results indicate that the proposed model can predict legibility acceptability with a high probability.
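The counting behind equation (6) can be sketched as follows. The function name and sample scores are hypothetical, and treating a score of exactly 3 as not acceptable is an assumption, since the text only distinguishes scores greater than and less than 3.

```python
from typing import Sequence


def acceptability_probability(observed: Sequence[float],
                              predicted: Sequence[float],
                              threshold: float = 3.0) -> float:
    """Equation (6): percentage of samples whose acceptability judgment agrees."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: a score strictly greater than the threshold counts as acceptable.
    correct = sum((o > threshold) == (p > threshold)
                  for o, p in zip(observed, predicted))
    error = len(observed) - correct
    return 100.0 * correct / (correct + error)


# Illustrative scores (not experimental data): the first two judgments agree,
# the last two disagree, so the result is 50.0.
observer_scores = [3.5, 2.0, 4.2, 2.9]
model_scores = [3.2, 2.5, 2.8, 3.4]
probability = acceptability_probability(observer_scores, model_scores)
```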

Conclusion
With regard to the on-screen contents of transparent displays utilized inside vehicles, legibility is one of the most important factors to be considered. This paper proposes a method for quantitatively evaluating the legibility of the on-screen contents of in-vehicle transparent displays. Human visual experiments were performed to obtain the qualitative evaluation data of legibility in this study. Simulation of perceived images was performed to obtain perceived images with varying degrees of legibility. The viewing illuminance, transmittance, (maximum) luminance of transparent displays, and size of the text and/or symbols were considered in the simulation. The simulated images were utilized for the construction and verification of the proposed model. By examining the effects of the aforementioned factors, four major components affecting the legibility were selected for the proposed model equation. The values of the weights in the proposed model were determined through the multiple regression technique. The performance of the proposed model was compared with those of the models from previous works.
The experiment results indicate that the proposed model has better accuracy than the models from previous works. It was also determined in this study that the legibility of the on-screen contents of in-vehicle transparent displays depends on the colors of the on-screen contents as well as on the display background. In future work, the effect of the color information on the legibility will be investigated. In addition, in this study, visual experiments were performed using still images. In a future study, it is desirable to utilize moving scenes mimicking real driving environments. The transmission of ambient light through a transparent display degrades the contrast ratio, which results in decreased legibility. In addition, there is a limit to increasing the maximum luminance. When transparent plastics that can control the light transmission are added to the imaging chain, the legibility of the on-screen contents can be increased. The proposed legibility evaluation model can be utilized to develop the technical specifications for the in-vehicle application of transparent displays and plastics.

Notes on contributors
Seong-En Lee received his B.S. degree in Information and Communication Engineering from Inha University, Incheon, South Korea in 2017. He is currently pursuing a Master's degree at the same university. His research interests include image analysis, image enhancement, and image quality of modern displays.