Visual assessment of viewing direction based on color discrimination

For the large-screen TV viewer, it makes sense to obtain visual information within the same viewing direction rather than based on any physical change between the normal-axis and viewing directions, as in the conventional measurement methods. The aim of the proposed visual assessment method is to enable visual assessment based on color discrimination. This article describes a structural sensitivity assessment method that is dependent on the viewing direction, interpretation of the assessment results, and correlation between the assessment and S-CIELAB results. The paper also reviews the impacts of the color changes, image complexities, and image luminance of the test patterns on the visual assessment results. In fact, the proposed method has a good correlation with the S-CIELAB method. It may be used in the display industry as a supplemental method to physical measurement.


Introduction
The current display measurement standards use simple patterns to measure display performance. More recent studies have suggested multiple color patterns that simulate real images based on physical measurements. This traditional method is important and remains an essential measurement method in the display industry. In terms of image quality, however, humans can often perceive the structural similarity (SS) of image contents more sensitively than a change in color [1]. The viewing-direction-dependent display characteristic has recently become more important as 4K- and 8K-UHD large-screen displays are used in multiviewer situations.
The image quality of a display usually depends on the viewing direction. Accordingly, the conventional physical measurement method is well established in industry, with a stable and robust performance [2]. The physical measurement, however, cannot fully capture the sensations and abilities of the human visual system (HVS). In particular, color discrimination within a certain viewing direction is not covered by the physical measurement standards [2]. There are only measurement methods for the color and luminance differences between the normal axis (as reference) and other viewing directions.
It has been proven that SS, or the ability to reproduce the image details and shape, is a good basis for measuring the image quality. This means that two neighboring colors should be distinguishable from each other. Thus, SS can be an auxiliary feature for the conventional measurement of the color and luminance characteristics. Figure 1 shows a comparison of the new visual assessment method for color discrimination under varying viewing direction and the conventional physical measurement method.
The conventional measurement method describes the viewing-direction-dependent characteristic in relation to the normal axis (IEC TS 62977-3-1). With respect to the TV viewer, it is significant to obtain visual information within a certain viewing direction rather than based on any physical change in relation to the normal axis.
Another significant advantage of the visual assessment method compared to the conventional physical measurement method concerns the observers. The colorimetric value of the measuring device is evaluated by the 1931 CIE color-matching function (CMF), which is derived from the average sensitivities of the observers. The effectiveness of the standard CMF has been extensively reviewed, and there has been a strong demand for improvements to such CMF. Although 1931 CIE-CMF had to be supplemented, its continuous use was considered more efficient than its revision [3,4].
Therefore, the physical measurement method needs to be complemented by the visual assessment method because there is a limit to the representation of the change in the spectral sensitivity of different observers by the mean-value-derived CMF. Such observer variation can also be considered by the proposed visual assessment method.
Generally, human color perception is also influenced by the geometric changes of object shapes that occur through a change in the viewing direction. Physical measurement, however, cannot take this effect into account. Furthermore, the color space and color difference metrics may not be perfectly consistent with human vision for all colors [5,6]. Thus, the proposed visual assessment method may be a good supplemental method for the conventional measurement method. This paper describes a method of SS assessment that is dependent on the viewing direction, interpretation of the assessment results, and correlation between the assessment and physical measurement results. This correlation value is a basis for the determination of the viewing direction range of display, which has relevance from the point of view of visual quality. This paper consists of the following sections. Section 2 describes the standard physical measurement methods and defines the coordinate system for the viewing direction. Section 3 introduces the geometric and colorimetric design concepts of test patterns for visual assessment. Section 4 explains the visual assessment method, including the issues of the ethnic origins of the observers, the instruction procedure of the practical visual assessment, and the presentation method of the visual assessment results. Section 5 considers the S-CIELAB transform to check the performance of the proposed visual assessment method [7]. Section 6 first presents the primary assessment results of the white-luminance dependency and the dots fill factor dependency of the pattern, and then the main assessment results depending on the viewing direction. Section 7 presents the results of the physical measurement in CIELAB and the simulation results of S-CIELAB. Section 8 provides the results of the visual assessment depending on the viewing direction, pattern colors, and individual observers. 
Section 9 compares the performance of the visual assessment with that of the S-CIELAB transform and the physical measurement. Finally, section 10 concludes the paper.

Standard measuring equipment and coordinate system
Light measuring devices (LMDs) as described in IEC 62977-2-1 were used for the initial setup of the visual assessment in this study [2]. The LMD is used to verify the color differences of the visual test pattern and to measure the colorimetric values of a test display. The viewing direction coordinate system for the LMD, which is also specified in subclause 5.6 of IEC 62977-2-1, was used and is shown in Figure 2.
For visual assessment in a horizontal viewing direction, the observer can be placed as shown in Figure 3. The observation layout for the vertical viewing direction can

Construction of pattern
The geometrical structures and dimensions of the test patterns are shown in Figure 4. The first, rectangular color patch type 1, is used for optical measurement. The second, inside font (number or alphabet) type 2, is designed for visual assessment. All the sizes are specified as a factor of the display screen height (H), and the size of the inside stimuli is H/18, with a 2° viewing angle for the observer at 1.6H as the default viewing distance. This can be adjusted to obtain a proper measuring field at a higher viewing direction angle.
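The relation between the stimulus size H/18, the default viewing distance 1.6H, and the stated 2° viewing angle can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name `visual_angle_deg` is chosen here for illustration and is not part of the original method.

```python
import math

def visual_angle_deg(stimulus_size: float, viewing_distance: float) -> float:
    """Full angle (in degrees) subtended by a stimulus centered on the line of sight."""
    return math.degrees(2.0 * math.atan(stimulus_size / (2.0 * viewing_distance)))

# The screen height H cancels out, so any positive value can be used.
H = 1.0
angle = visual_angle_deg(H / 18.0, 1.6 * H)
print(f"{angle:.2f} deg")  # about 1.99 deg, i.e. approximately the stated 2 deg
```

If the pattern is enlarged, the same function gives the viewing distance check: scaling both arguments by the same factor leaves the angle unchanged.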

Pattern color assignment
The Macbeth color chart was used (Table 2) for the assignment of colors on the test pattern. These 24 colors (6 achromatic and 18 chromatic) are named as reference colors, and the proximal fields (each circle of type 2) are filled with these reference colors.
The inside stimuli (fonts of type 2) are filled with single colors varying in the lightness, hue, and chroma directions in the CIE-$L^*C^*H^*$ color space (Figure 5(a)). For example, the 2nd-row numbers increased ('+'-marked number '3') and decreased ('−'-marked number '4') the hue from the reference colors for the given $\Delta E^*_{00}$ (CIE-2000 color difference [5]). In the same way, chroma- and luminance-varying colors can be assigned for chromatic and achromatic patterns, as shown in the 1st row of Figure 5(b,c).
To measure the reference and font colors, they will be successively assigned to the inner rectangular area of type 1.
The size and luminance of the gray background of type 2 can be freely selected by test organizations, but it is preferred that 18% ($L^* = 56$ for dim-surround) of the DUT white luminance and full-screen size be allocated, where the 18% gray luminance represents the average luminance of the scenes during TV viewing, according to the gray-world theory [5]. Hence, the observers can adapt to this average scene as an approximate average of perceived lightness. This gray background may help stabilize the light adaptation of HVS. Figure 5(d) shows an example of the achromatic and chromatic test patterns.

Dots fill factor
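Generally, most scene contents are more or less structured. This image complexity can be emulated by the fill factor (FF) of the dots in test pattern type 2. FF is defined as the ratio of the dots area to the circle area, as shown in the equation below.
$$FF = \frac{A_{\mathrm{dots}}}{A_{\mathrm{circle}}} \times 100\% \qquad (1)$$
where $A_{\mathrm{dots}}$ is the total area of the dots and $A_{\mathrm{circle}}$ is the area of the enclosing circle.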
This construction is similar to the Ishihara pattern for color deficiency check. By default, the dots FF of 70% is utilized for the proposed visual assessment. The locations of the dots in the circle and of the multiple-sized dots of the test pattern are randomly generated by a computer program. For all the observers, the same dot pattern should be viewed to achieve the same sensation within the same visual assessment session. More details related to this topic are presented in section 6.
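The generation of a randomly dotted circle with a target fill factor can be sketched as follows. This is only an illustration under simplifying assumptions: the dots are rasterized on a pixel grid, overlapping dots are allowed (FF is then the covered fraction of the circle), and the dot radii, image size, and the function name `dotted_circle_mask` are hypothetical choices. A fixed random seed reproduces the same dot pattern for all observers.

```python
import numpy as np

def dotted_circle_mask(size=256, target_ff=0.70, radii=(3, 5, 8), seed=0):
    """Return (dots, circle) boolean masks; dots cover at least target_ff of the circle."""
    rng = np.random.default_rng(seed)      # fixed seed -> identical pattern every run
    yy, xx = np.mgrid[0:size, 0:size]
    center = (size - 1) / 2.0
    circle_radius = size / 2.0
    circle = (xx - center) ** 2 + (yy - center) ** 2 <= circle_radius ** 2
    circle_area = circle.sum()

    dots = np.zeros((size, size), dtype=bool)
    while dots.sum() / circle_area < target_ff:
        r = rng.choice(radii)
        cx, cy = rng.uniform(0, size - 1, size=2)
        # Reject dots that would cross the circle boundary.
        if np.hypot(cx - center, cy - center) + r > circle_radius:
            continue
        dots |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
    return dots, circle

dots, circle = dotted_circle_mask()
ff = dots.sum() / circle.sum()
print(f"fill factor: {ff:.3f}")
```

An Ishihara-like pattern would normally use non-overlapping dots; allowing overlap is used here only because it guarantees that the loop reaches high fill factors.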

Visual assessment method
The proposed visual assessment method was constructed based on the single-stimulus method of ITU-R BT.500-13 [10]. A series of single test patterns is displayed on the screen, and the assessors provide their responses for all the patterns. All the responses are collected by an operator, and the statistical results of the responses are evaluated. Figure 6 shows the test environment. The display under test (DUT) is connected to an operating PC, from which the operator can display the test pattern sequence on the DUT screen manually or semi-automatically. These test patterns are viewed sequentially by the observer, who is asked to name the visible fonts in each pattern within a limited time. The operator records the observer's responses in 'True' or 'False' binary form.
The assessors' viewing conditions should be as follows. The test room should be dark to prevent any reflection on the screen. The assessors should adapt to the test room environment for at least 15 min before starting visual assessment. For a given pattern size, the viewing distance must satisfy the viewing angle condition shown in Figure 4.
The display white luminance varies between individual displays, so the white luminance can be freely adjusted by the test organization. It is preferred that it be set to over 100 cd/m², however, because the visual assessment results are slightly affected by the luminance of the pattern images. This luminance dependency will be discussed in section 6. In addition, it is preferred that the correlated color temperature of DUT be set to D65 white (Table 3).
The darkroom contrast ratio may be adjusted to less than 0.02, as low as possible, to achieve the best color reproduction on the DUT screen. For the same reason, the optimum brightness and contrast of DUT should be autonomously adjusted by the test organization.
The number of pixels of DUT should be greater than the FHD (1920 × 1080) pixel resolution due to the high pixel resolution of the dotted pattern image. If the dotted pattern image rendered on the screen is not sufficient due to a low pixel resolution, it is recommended that the pattern image be enlarged, and that the viewing distance be proportionally increased.

Observers
The following criteria for the observers are mentioned by ITU-R BT.500-13 [10,11]. The observers may be experts or non-experts, depending on the objectives of the assessment. A non-expert observer is an observer with no expertise in image artifacts, while an expert observer is an observer with expertise therein. In any case, the observers should not be and should not have ever been directly involved in any related research and development. Prior to a session, the observers should be screened for normal visual acuity on the Snellen or Landolt chart, and for normal color vision using specially selected charts (e.g. Ishihara).
There should be at least 15 observers. For studies with a limited scope (e.g. exploratory studies), however, there may be fewer than 15 observers. In this case, the study should be designated as 'informal' [10].
A study on the consistency between the results obtained at different testing laboratories has revealed a possible explanation for the differences in the results obtained from different laboratories: that there may be different skill levels among different groups of assessors. In the interim, however, experimenters should include as many details as possible on the characteristics of their assessment panels to facilitate the further investigation of this factor. The data suggested to be provided can include the observers' occupation category (broadcast organization employee, student, office worker, etc.), gender, and age range [10].
The 1931 CIE-CMF was derived from the average values of the spectral responses of a number of color-normal observers [3]. According to Asano, the CMF is usually dependent on the age, viewing field size, density of the eye lens and macula, and optical densities of the LMS cones [4]. These characteristic values are dependent on the individual observers.
The average CMFs of the 151 observers in Asano's database (122 Europeans, 9 North Americans, 17 Asians, and 3 Africans), grouped by ethnic origin, are reproduced in Figure 7. For example, the color differences between Europeans and Americans and between Europeans and Asians for D65-white stimuli are 0.78 and 1.03 $\Delta E^*_{00}$, respectively. The dependency on the ethnic origin may therefore be ignored. The abovementioned characteristics of the individual observers have a stronger influence than the ethnic origin.
The physical measurement is performed based on CIE-CMF; therefore, it is a representative value of the average observer's CMF. In the same sense, the results of the proposed visual assessment are derived from the average recognition values of 15 observers. The average CMF of 15 individual observers does not significantly differ from 1931-CIE-CMF; as such, 15 observers with prior observer screening may be sufficient [10].

Instructions for the visual assessment method
A test session should not exceed 30 min to prevent any response affected by tiredness or insufficient adaptation.
First of all, the method of assessment should be carefully explained to the observers, and sufficient training sequences should demonstrate the range and type of the test pattern to be assessed (Figure 8). All the observers' doubts should be cleared during the training sessions [10]. The training sessions should have the same duration (e.g. 5 min) for all the observers.
To stabilize the observers' answers, at the beginning of the first session, about three to five 'dummy sessions' should be conducted [10]. The results of these tests, however, must not be included. If several sessions are necessary, only about three dummy presentations should be made at the beginning of the following session.
As in Figure 8, each test pattern (type 2) will be presented for T1 (5-7) seconds to the observer, who must provide answers to the recognized fonts. The operator records the correct and incorrect answers during the next gray pattern within T2 (8-10) seconds. The gray pattern helps neutralize the visual adaptation of HVS for the following experiment. This procedure should be repeated for all the test patterns.
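The presentation timing above allows a rough estimate of the time needed per viewing direction. The sketch below assumes 24 test patterns per viewing direction and seven viewing directions in total; both numbers are assumptions taken from the later experiment descriptions, and the training and dummy presentations are not counted.

```python
def presentation_minutes(n_patterns: int, t1_seconds: float, t2_seconds: float) -> float:
    """Time in minutes to present n_patterns, each followed by the gray pattern."""
    return n_patterns * (t1_seconds + t2_seconds) / 60.0

N_PATTERNS = 24    # assumed number of test patterns per viewing direction
N_DIRECTIONS = 7   # assumed: five horizontal and two vertical viewing directions

shortest = presentation_minutes(N_PATTERNS, 5, 8)    # T1 = 5 s, T2 = 8 s
longest = presentation_minutes(N_PATTERNS, 7, 10)    # T1 = 7 s, T2 = 10 s
print(f"per direction: {shortest:.1f} to {longest:.1f} min")
print(f"all directions: up to {N_DIRECTIONS * longest:.1f} min")
```

Under these assumptions, one viewing direction takes roughly 5 to 7 min, and all directions together stay below 1 h but exceed 30 min, so the presentations would have to be split over more than one session.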

Presentation and interpretation of the assessment results
The results of the visual assessment can be presented as the statistical plot of the recognition rates of all the observers' ratings for the test patterns with dependence on the viewing direction, as shown in Figure 19. In most cases, the plot of the mean (marked as '−' inside the bar graph) values of the recognition rates is sufficient to characterize the viewing direction dependency. The recognition rate $R$ is defined as the ratio of the number of correctly answered fonts $N_f$ to the total number of fonts $N_t$ in the test patterns, as follows:
$$R = \frac{N_f}{N_t} \qquad (2)$$
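The recognition rate can be computed directly from the operator's 'True'/'False' records. The following minimal sketch uses a hypothetical response log; the data values and the dictionary layout are invented for illustration only.

```python
def recognition_rate(responses):
    """R = N_f / N_t for a sequence of True/False font responses."""
    responses = list(responses)
    if not responses:
        raise ValueError("no responses recorded")
    return sum(bool(r) for r in responses) / len(responses)

# Hypothetical operator log: viewing direction -> one True/False entry per presented font.
log = {
    "H_0": [True, True, True, True, True, False, True, True],
    "H_60": [True, False, False, True, False, True, False, False],
}
rates = {direction: recognition_rate(answers) for direction, answers in log.items()}
print(rates)  # {'H_0': 0.875, 'H_60': 0.375}
```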

S-CIELAB transform
The entire process of the quality judgment in S-CIELAB is shown in Figure 9. As the first step, the computer-generated colors (font and background) of the 24 test patterns are replaced by the measured values for the given viewing directions.
As the next step, the perspective transformation of the test pattern for a given viewing direction is performed [12]. This geometrically distorted test pattern is then transformed into the opponent color space (AC1C2) of HVS and is further transformed into the frequency domain using DFT (discrete Fourier transform).
The contrast sensitivity function (CSF) of HVS (Figure 10) is applied in this frequency domain [13,14]. Thereafter, the filtered patterns are inverse-transformed into the AC1C2 and CIELAB spatial domains.
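The frequency-domain filtering step can be sketched for a single opponent channel as follows. This is a simplified illustration, not the filter used in this work: the Gaussian low-pass `placeholder_csf` merely stands in for the CSF of Figure 10, and the `pixels_per_degree` value is an assumed sampling density.

```python
import numpy as np

def filter_channel(channel, pixels_per_degree, csf):
    """Weight the 2-D spectrum of one channel by csf(f), with f in cycles/degree."""
    rows, cols = channel.shape
    fy = np.fft.fftfreq(rows) * pixels_per_degree   # vertical frequencies
    fx = np.fft.fftfreq(cols) * pixels_per_degree   # horizontal frequencies
    radial = np.hypot(*np.meshgrid(fx, fy))         # radial frequency per DFT bin
    spectrum = np.fft.fft2(channel)
    return np.real(np.fft.ifft2(spectrum * csf(radial)))

def placeholder_csf(f, cutoff=8.0):
    """Assumed Gaussian low-pass with unit gain at f = 0 (not the paper's CSF)."""
    return np.exp(-(f / cutoff) ** 2)

flat = np.full((64, 64), 0.5)
stripes = np.tile([0.0, 1.0], (64, 32))             # one-pixel stripes, finest detail
flat_out = filter_channel(flat, 60.0, placeholder_csf)
stripes_out = filter_channel(stripes, 60.0, placeholder_csf)
print(flat_out.std(), stripes_out.std())
```

A uniform patch passes unchanged because the filter has unit gain at zero frequency, whereas the fine stripes are blurred toward their mean, which is the qualitative behavior expected from a low-pass CSF.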
Next, the S-CIELAB-transformed images are divided into four sub-images (four fonts) and are masked in the font and background regions for each. Here, the masking filters are obtained from the perspective-transformed image. Using the masking images, the CIELAB pixel values of the font and background regions are gathered and utilized to compute the mean of the color difference in $\Delta E^*_{00}$ between the font and the background.
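The masking step can be sketched as follows. Two simplifications are assumed here: the difference is taken between the mean colors of the two regions, which is one possible reading of the description above, and the Euclidean CIELAB distance is used as a stand-in for the $\Delta E^*_{00}$ formula. The function name and the synthetic image are hypothetical.

```python
import numpy as np

def mean_region_difference(lab, font_mask, background_mask):
    """Euclidean CIELAB distance between the mean font and mean background colors."""
    font_mean = lab[font_mask].mean(axis=0)              # mean (L*, a*, b*) of the font
    background_mean = lab[background_mask].mean(axis=0)  # mean (L*, a*, b*) of the background
    return float(np.linalg.norm(font_mean - background_mean))

# Synthetic 4 x 4 sub-image: background (50, 0, 0) with a 2 x 2 font of (53, 4, 0).
lab = np.zeros((4, 4, 3))
lab[..., 0] = 50.0
font_mask = np.zeros((4, 4), dtype=bool)
font_mask[1:3, 1:3] = True
lab[font_mask] = (53.0, 4.0, 0.0)

difference = mean_region_difference(lab, font_mask, ~font_mask)
print(difference)  # 5.0
```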

Preliminary experiment
The color reproduction performance of DUT should be verified before starting the visual assessment. For this purpose, 18 uniform type 1 patterns with four hue and chroma variations were generated for the given 2 $\Delta E^*_{00}$. They were displayed on the DUT screen and measured using LMD (Minolta CS-1000A). The measured values were then compared with the theoretically computed values. As shown in Figure 11, some data fluctuation is caused by the noise of the LMD and by the quantization of the digitized signals. The means of the computed values, the measured values, and the differences between the two are 2.1, 2.0, and 0.3 $\Delta E^*_{00}$, respectively. The performance of DUT is therefore considered sufficient for visual assessment.
To verify an effect of the white adaptation of HVS, the following visual experiment was carried out. The following Table 4 shows the parameters of the visual assessment.
The assessment results of the 15 observers are shown in Figure 12.
At each luminance level, the R rate deviates by approximately 15% from observer to observer. The mean recognition rate increased slightly from 83% to 92%, with a tendency toward gradual saturation. A fitting function, indicated by the blue line in Figure 12, approximates the mean recognition rate. The white-luminance dependency is much higher at low luminance due to the incomplete white adaptation of HVS for low-luminance colors in dark-surround [14,15]. At over 100 cd/m² white luminance of DUT, the luminance dependency of the test patterns may be practically disregarded. Therefore, it is recommended that the white luminance of DUT be set to over 100 cd/m² for stable visual assessment.
The test patterns are composed of multiple-sized dots, whose positions are randomly generated. The recognition rate of the fonts of the test pattern in visual assessment depends on the fill factor of the dots.
According to Fairchild et al., the just-noticeable difference (JND) of the colors in complex scenes is approximately 2.3 $\Delta E^*_{ab,76}$ [15,16]. So that a test pattern with the same JND (2.3 $\Delta E^*_{ab,76}$) performance can be used instead of a typical complex scene, the fill factor of the test pattern is adjusted (Figure 13). Thus, the dependency of the recognition rate on the fill factor was investigated experimentally.
For this experiment, almost the same test conditions as those in Table 4 were used, except for the following:
• 15 observers (9 male) with a 22-55 age range;
• test patterns for 18 chromatic reference colors with four inside numbers varied by 3 $\Delta E^*_{00}$ in terms of hue and chroma;
• dots fill factor: 30%-70% in 10% incremental steps, and 100% (uniform pattern);
• display: Rec.709 color space, 100 cd/m² white luminance; and
• background: 20% gray of the display white luminance with full-screen size.
The assessment results are shown in Figure 14 as a box-whisker statistical plot of the R rates, with dependency on the fill factor FF. The recognition rate is proportional to the fill factor. The deviation of the R rate decreased from low to high fill factor. The fitting function, drawn as the blue line, has an R-squared value of 0.993.
Figure 15 shows a recognition rate comparison between the uniform pattern (100% FF) and the various fill factors. The four red triangles are the mean recognition rates for the 100% FF pattern with the variation in the font colors for the $\Delta E^*_{00}$ = {0, 1, 2, 3} cases. The blue line is a fitting function of the mean recognition rates for 100% FF with dependency on the color difference, $\Delta E^*_{00}$. This fitting function is approximated by an error function, and its R-squared value is 0.999, as shown in the equation below.
where $\mathrm{Erf}[z] = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt$ is the error function, which is the integral of the Gaussian distribution.
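The definition of the error function can be checked numerically. The sketch below integrates the Gaussian with a midpoint rule and compares the result with `math.erf` from the Python standard library; the helper name `erf_by_quadrature` and the number of steps are illustrative choices, and no coefficients of the fitting function are assumed.

```python
import math

def erf_by_quadrature(z: float, steps: int = 10_000) -> float:
    """Evaluate (2/sqrt(pi)) * integral_0^z exp(-t^2) dt with the midpoint rule."""
    h = z / steps
    total = sum(math.exp(-((i + 0.5) * h) ** 2) for i in range(steps))
    return 2.0 / math.sqrt(math.pi) * total * h

for z in (0.5, 1.0, 2.0):
    print(z, erf_by_quadrature(z), math.erf(z))
```

Because the error function rises monotonically from 0 and saturates at 1, it is a natural shape for a recognition rate that grows with the color difference and then levels off.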
For example, the recognition rate of the 70% FF pattern with $\Delta E^*_{00} = 3$ is equivalent to the result of the 100% fill factor with $\Delta E^*_{00} = 1.59$ (see the dashed line of 70% FF in Figure 15). Considering that the value of $\Delta E^*_{00}$ is generally less than $\Delta E^*_{ab,76}$, the JND (2.3 $\Delta E^*_{ab,76}$) of the complex scene is then replaced by the 70% FF pattern with approximately 5 $\Delta E^*_{00}$. Therefore, the pattern with 5 $\Delta E^*_{00}$ and 70% FF can be used in the visual assessment instead of the complex scene.

Results of CIELAB and S-CIELAB
Figure 16 shows the statistical plot of the color differences measured in CIELAB between the reference (the dots of the outer circle) and font colors for all the 24 color patterns (Figure 4) with dependency on the viewing direction. All the patterns are designed for a color difference of 5 $\Delta E^*_{00}$. For accurate measurement, each of the reference and font colors is sequentially displayed as a rectangle of size H/18 at the center of the screen. For the measurement, the viewing directions are varied horizontally (0°, 15°, 30°, 45°, and 60°) and vertically (15° and 30°). As the viewing direction angle increases, the mean value of the color difference decreases. It is important to note that there is no drastic change in the mean color differences even though the viewing direction angle increases. The variance increases, however, due to the viewing-direction-dependent changes of the noise of the measurement device and of the display colors.
To proceed with S-CIELAB transformation, the measured CIELAB values of all the test patterns are computationally assigned as the font and reference colors in the test pattern. For these patterns, the entire process of S-CIELAB transformation is applied according to section 5. The S-CIELAB results are shown in Figure 17.
The variance in the color difference with increasing viewing direction is more stable than that of the physical measurements in CIELAB. Furthermore, the mean color differences decrease more clearly with increasing viewing direction than in the CIELAB results.
The correlation between the CIELAB and S-CIELAB results is shown in Figure 18. Pearson's correlation coefficient was 0.958 in this experiment, which indicates a strong correlation.
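Pearson's correlation coefficient between two result series can be computed as in the following minimal sketch. The two lists are hypothetical mean color differences per viewing direction and are not the data of this work; `pearson_r` is an illustrative helper that is cross-checked against `numpy.corrcoef`.

```python
import numpy as np

def pearson_r(x, y) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical mean color differences for five viewing directions (illustration only).
cielab = [5.0, 4.9, 4.7, 4.6, 4.4]
s_cielab = [4.8, 4.5, 4.1, 3.6, 3.0]

r = pearson_r(cielab, s_cielab)
print(r, np.corrcoef(cielab, s_cielab)[0, 1])
```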

Visual assessment for the viewing direction
For the visual assessment, 24 test patterns (type 2) with four number patches, 70% FF, and 5 $\Delta E^*_{00}$ were used. The results in Figure 19 were determined for a 55-inch LCD-type UHDTV with 16 observers in a 22-55 age range. The observers were nine males and seven females from Asia. Figure 19(a) shows the statistical plot of the recognition rates of all the observers and all the 24 test patterns with dependency on the viewing direction. The bar graph indicates the mean recognition values as a function of the viewing direction. With increasing viewing direction, the recognition rate decreased. In contrast to the physical measurements in particular, the R rate at horizontal 60° was drastically reduced to half of the R rate at the normal viewing direction. This means that the perceived color difference is near the JND threshold of HVS at this viewing direction.

Pattern color dependency
The pattern color dependency is presented in Figure 20. In this plot, all the answers of the observers and all the viewing directions are separately plotted for the 24 color patterns. In the case of a well-designed test pattern, the R rate may vary from the maximum at the 0° viewing direction to the minimum at the 60° viewing direction.
The recognition rates are usually dependent on the colors of the test pattern because the color difference metric $\Delta E^*_{00}$ is not perfectly uniform throughout the entire CIELAB color space [6]. For example, patterns 9, 13, and 21 were not well designed in this respect. Even though it has a measured color difference of approximately 5 $\Delta E^*_{00}$, pattern 9 could hardly be perceived by the observers. In contrast, pattern 13 was well perceived by all the observers as a case of excessive perceived color difference. Such pattern types are reflected as DC offsets in the recognition rates.
There are also problems regarding the metamerism of the display and observer. This type of metamerism and a non-uniform CIE color difference metric, however, do not fall within the scope of this work. The proper design of the test pattern and the study of metamerism represent future tasks.

Observer dependency
The observer dependency is shown in Figure 21. The last column ('All') indicates the statistical plot of the recognition rates for all the observers and viewing directions.
A fairly good result was achieved by observer 7, who had good visual acuity and color discrimination ability and was faithful to the experiment. On the other hand, a relatively poor performance was recorded by observer 13. An extreme outlier observer in relation to the average data ('All') can be carefully screened out according to ITU-R BT.500.

Performance of visual assessment
The performance of the proposed visual assessment is described in relation to the physical measurements in CIELAB $\Delta E^*_{00}$ and a quality judgment based on the S-CIELAB transformation. How the visual experiment relates to the S-CIELAB and physical measurement methods is also discussed.
The recognition rate usually decreases with increasing viewing direction, similar to the results of the physical measurement and the S-CIELAB transformation. Specifically, the reduction in the R rate at H_60 is relatively higher than that in the other methods because HVS can perceive not only the color changes but also the geometric changes by perspective transform of the patterns. Furthermore, at this viewing direction, the perceived color difference of HVS is near the JND threshold. These features are limited in the CIELAB and S-CIELAB methods.
The results of Pearson's test in Figure 22 show that the visual assessment is more strongly correlated with the S-CIELAB results than with the simple CIELAB measurement. The reason for this is that S-CIELAB takes into account the perspective transformation and the CSF of HVS. The correlation coefficients with S-CIELAB have high values of 0.96 and 0.84 with and without the critical data of H_60, respectively (Table 5). These correlation values are much better than the values with the simple physical measurement in CIELAB.

Conclusion
In this paper, a new visual assessment method for viewing direction is introduced. This method uses the color discrimination ability of the observers to obtain visual information in a consistent viewing direction rather than the color changes in relation to the normal axis, as in the conventional viewing angle measurement method. The proposed method considers the typical realistic image complexity using the dotted test pattern and many other color appearance effects (change of object shapes, image complexity, CSF, etc. [17,18]), which are not easy to handle in the conventional measurement methods. The proposed visual assessment method may be a good supplemental method for the conventional physical measurement method, and only 15 observers may be required therein. The time cost of each observer for all the viewing directions will not exceed 1 h.