Sky type classification in Harbin during winter

ABSTRACT Sky type classification is a significant element in daylight simulation. As a “winter city”, Harbin has a distinctive daylight climate, and there is therefore a particular need to determine the most appropriate sky models for establishing the sky luminance distribution in the city and thereby improving the accuracy of daylight simulations. To determine the sky types closest to those of Harbin during winter, sky luminance data covering 145 sky elements were collected by a sky scanner at 10 min intervals from June 2018 to February 2019. The sky luminance data were compared with the 15 standard sky definitions, ranging from overcast to clear sky, proposed by the Commission Internationale de l'Éclairage (CIE). For each scan, the standard sky with the lowest root-mean-square error was selected as the most appropriate sky type. The results show that clear skies prevail in Harbin during winter, with an occurrence of 70.56%. The dominant sky type is type 12, with an occurrence of 48.57%, followed by types 11, 10, and 8 with occurrences of 17.67%, 5.81%, and 5.36%, respectively. The sum of the occurrences of these four sky types exceeds 77%, which means that they can be regarded as generally representative of Harbin’s skies during winter.


Introduction
In recent years, daylight performance evaluation has come to be considered an important element in "green" building design and has accordingly attracted increasing attention (Gago et al. 2015). Daylight calculations are usually carried out using computer simulations and focus on sky luminance distribution. An accurate sky model is required to establish sky luminance distributions for daylight simulations. The 15 Commission Internationale de l'Éclairage (CIE) standard sky models were introduced to describe sky luminance distribution accurately and to facilitate classification. The models provide a set of well-defined sky luminance patterns, from overcast to clear skies. However, sky luminance distribution is also affected by other factors, such as solar position, cloud conditions, turbidity, and the pollutant content of the atmosphere (Kittler and Darula 2016). There are distinct differences in luminance distribution between areas with different climates and geographical features. Thus, there is a need to determine the closest CIE sky type at a given location and time, based on regional daylight measurement data, in order to calculate luminance distribution accurately.
The literature reports many climate variables that contribute to classification of the CIE standard skies, including sky luminance, Lz/Dv, VSC, Dv/Ev, and Gv/Ev. The ratio of zenith luminance to horizontal sky-diffuse illuminance (Lz/Dv) is the base criterion for sky classification, but it is not suitable for locations at low altitudes (Tregenza 2004). The vertical sky component (VSC) appears to be ambiguous at certain scattering angles (Alshaibani 2008, 2011; Li et al. 2011). The ratio of horizontal sky-diffuse illuminance to extraterrestrial illuminance (Dv/Ev) can be applied in combination with other methods for effective sky classification. The ratio of horizontal global illuminance to extraterrestrial illuminance (Gv/Ev) is sometimes not reliable in describing actual sky conditions (Li and Tang 2008). Horizontal sky illuminance (Alshaibani 2017) is a simple criterion, but is sometimes uncertain. Overall, the sky luminance of the entire sky hemisphere has generally been considered to provide accurate and direct data for sky classification (Li, Chau, and Wan 2014). A sky scanner is normally used for sky luminance observation, based on the recommendations of CIE 108-1994. Furthermore, long-term luminance measurements can provide basic data for daylight modelling and dynamic daylight simulations.
However, luminance data for the entire sky hemisphere are not available in many areas. Studies on sky luminance distributions have been carried out in only a few areas around the world, such as England (53.38°N) and other locations (Bartzokas et al. 2003, 2005; Markou et al. 2005; Ferraro, Mele, and Marinelli 2011; Torres et al. 2010a, 2010b; Suárez-García et al. 2018; Janjai and Palon 2011; Mettanant, Chaiwiwatworakul, and Chirarattananon 2017; Wittkopf and Soon 2007; Li, Lau, and Lam 2004; Ng et al. 2007; Luo et al. 2015). Sky luminance research has been carried out in very few areas in Asia, and the sky luminance distribution in the northeast of China, which has a high latitude, is relatively unknown. Harbin is a historic industrial city that faces air pollution problems. The city also has a comparatively long period during winter when municipal heating is in operation because of the extremely cold climate. Increases in anthropogenic emissions exacerbate air pollution problems such as haze (Mao et al. 2018). These pollutants increase the turbidity of the atmosphere and reduce solar radiance and sunshine duration, which in turn affects exterior daylight conditions and indoor illuminance. This probably distinguishes Harbin from other cities and explains variations in performance between winter and other seasons. In addition, visual comfort concerns have recently given rise to more creativity in daylight design. Harbin has a lower solar altitude during winter than other major cities because of its relatively high latitude, and there is therefore high potential for discomfort glare when the skies are sunny in winter. Hence, more attention has to be paid to visual comfort during winter in the design of buildings in Harbin. Knowledge of Harbin's typical sky type in winter will therefore contribute significantly to appropriate daylight design in the city.
The present study aims to define typical daylight conditions and representative skies during the winter period in Harbin, based on sky luminance observations. This will provide a reference for daylight simulation for buildings in Harbin in the future.
The structure of the paper is as follows: Information from the measurement facility and data collection is described in Section 2. An introduction to the CIE standard general skies and sky classification method is given in Section 3. The characteristics of Harbin's skies during the winter period and analysis of time and solar altitudes are set out in Section 4. A summary of the observations and an assessment of the contribution of the present work to future building projects is set out in Section 5.

Sky luminance distribution data measurements
Harbin is the capital city of Heilongjiang Province. The city is characterised by severely cold winters, with an average annual sunshine duration of between 2300 and 2800 h. China is divided into five daylight climate zones by annual average daylight illuminance according to China's Standard for Daylighting Design of Buildings (Architecture and Press, 2013). Harbin is classified into zone Ⅳ, a daylight climate zone with an annual average daylight illuminance of between 30 and 35 klx and an exterior design illuminance value of 13,500 lx. The meteorological observation station used to provide data for Harbin is located in the urban centre of the city (45.75°N, 126.68°E). It is installed on the roof of a five-storey building, 28 m above ground level. Figure 1 (Zi 2020) shows the location of the meteorological observation station.
As shown in Figure 2 (Zi 2020), the measurement instrumentation consists of a sky scanner and an all-sky imager. The sky scanner is an EKO (MS-321 LR), and is used to measure the sky luminance and radiance of 145 elements of the sky hemisphere. Measurement is accomplished by using a sensitive sensor with a viewing angle of 11° and a dual-axis control drive. A single scan takes 4.5 min. The luminance value is measured in kcd/m² and the radiance value in W/m²/sr. The luminance and radiance ranges extend to 50 kcd/m² and 300 W/m²/sr, respectively. The sky scanner was calibrated before measurements were carried out. The all-sky imager is installed next to the sky scanner. It is equipped with a fisheye lens with a 180° field of view. It is used to take photographs of the entire sky and contributes to defining sky conditions.
The two instruments were set to operate every 10 min from sunrise to sunset every day from June 2018. The period from 1 November 2018 to 28 February 2019 was defined as the winter period. Luminance data and sky image data from November 2018 to February 2019 were selected for winter analysis, and data gathered before November 2018 were used for comparative analysis in this study. Unfortunately, sky luminance data for the period from 7 February 2019 to 13 February 2019 are missing as a result of a power failure. In addition, some abnormal measurement data were removed before data analysis (Figure 3) (Zi 2020). Measurement scans with solar altitude angles equal to or lower than 5° were excluded from analysis (Suárez-García et al. 2018). Further, as some sky elements close to the sun are strongly influenced by beam sunlight, luminance data with extremely high values were removed. A luminance value of 50 kcd/m² was adopted as the upper limit of the accepted luminance range. Luminance data with values lower than 0.1 kcd/m² were also removed. Luminance data from sky elements at an altitude angle of 6° were eliminated, as these measurements would be influenced by tall buildings in the distance (Ferraro, Mele, and Marinelli 2011, 2013). In practice, abnormal measurement data (values higher than 50 kcd/m², values lower than 0.1 kcd/m², and values from elements at an altitude angle of 6°) were assigned a value of 0. This process was carried out using a MATLAB program.
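The filtering rules above can be sketched in a few lines. The following Python fragment is only an illustration of those rules, not the authors' MATLAB program; the function name `clean_scan` and the list-based data layout are assumptions made for the example, while the thresholds are those stated in the text.

```python
# Thresholds stated in the text; all names here are illustrative assumptions.
L_MAX = 50.0            # kcd/m^2, upper limit of the accepted luminance range
L_MIN = 0.1             # kcd/m^2, lower limit of the accepted luminance range
LOWEST_BAND_ALT = 6.0   # degrees, altitude of the band affected by distant buildings
MIN_SOLAR_ALT = 5.0     # degrees, scans at or below this solar altitude are dropped


def clean_scan(luminances, altitudes_deg, solar_altitude_deg):
    """Return the cleaned luminances of one scan, or None if the scan is rejected.

    Abnormal elements are assigned 0.0, as described in the text, so that
    they can be skipped in later steps.
    """
    if solar_altitude_deg <= MIN_SOLAR_ALT:
        return None
    cleaned = []
    for lum, alt in zip(luminances, altitudes_deg):
        abnormal = lum > L_MAX or lum < L_MIN or alt == LOWEST_BAND_ALT
        cleaned.append(0.0 if abnormal else lum)
    return cleaned
```

Assigning 0 instead of deleting the element keeps every scan at a fixed length of 145 values, which makes it straightforward to exclude void elements later by testing for zero.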

CIE standard sky luminance distribution
The CIE standard skies scheme was originally proposed by Kittler and others (Kittler, Darula, and Perez 1998), and subsequently adopted by the CIE and the International Organization for Standardization (ISO) (International Standards Organization 2004). The classification scheme is now known as the "CIE/ISO Standard General Skies". The CIE standard skies classify skies into 15 defined sky types, comprising 5 overcast skies, 5 intermediate skies, and 5 clear skies, as listed in Table 1 (International Standards Organization 2004). The CIE sky model describes sky luminance as the ratio of the luminance of a sky element, $L$, to the sky zenith luminance, $L_z$ (Equation 1).

$$\frac{L}{L_z} = \frac{f(\chi)\,\varphi(Z)}{f(Z_s)\,\varphi(0)} \tag{1}$$
The gradation function $\varphi(Z)$ describes the change in luminance from the zenith to the horizon, and the indicatrix function $f(\chi)$ describes the relationship between the sky element and the sun. They are calculated as follows:

$$\varphi(Z) = 1 + a \exp\left(\frac{b}{\cos Z}\right) \tag{2}$$

$$f(\chi) = 1 + c\left[\exp(d\chi) - \exp\left(\frac{d\pi}{2}\right)\right] + e\cos^2\chi \tag{3}$$

where: $a$ is a parameter of the gradation function, used to determine whether the horizon region is darker or brighter than the zenith region; $b$ is a parameter of the gradation function, used to indicate the luminance gradation close to the horizon region; $c$ is a parameter of the indicatrix function, used to describe the luminance rise towards the sun position; $d$ is a parameter of the indicatrix function, used to express the width of the circumsolar region; $e$ is a parameter of the indicatrix function, used to indicate changes in horizon luminance caused by solar angular distance; $Z$ is the zenith angle of a sky element; $Z_s$ is the zenith angle of the sun; and $\chi$ is the angular distance of a sky element from the sun.
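The value $\chi$ can be calculated using Equation (4). Descriptions of the different parameters are shown in Figure 4 (Zi 2020).

$$\chi = \arccos\left(\cos Z_s \cos Z + \sin Z_s \sin Z \cos\lvert\alpha - \alpha_s\rvert\right) \tag{4}$$

where $\alpha$ and $\alpha_s$ are the azimuth angles of a sky element and the sun.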
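As a rough illustration of how the CIE formulation is evaluated, the sketch below computes the relative luminance of a sky element for a given sky type. It follows the standard CIE/ISO expressions for the gradation and indicatrix functions and treats α as an azimuth angle. Only two parameter sets are included, for types 1 and 12 as tabulated in the CIE/ISO standard; the function names and the dictionary layout are assumptions made for the example.

```python
import math

# (a, b, c, d, e) for two of the 15 standard general skies (CIE/ISO standard).
CIE_PARAMS = {
    1: (4.0, -0.70, 0.0, -1.0, 0.00),     # standard overcast sky
    12: (-1.0, -0.32, 10.0, -3.0, 0.45),  # standard clear sky, low turbidity
}


def gradation(zenith, a, b):
    """Gradation function phi(Z); phi equals 1 at the horizon (Z = pi/2)."""
    if zenith >= math.pi / 2:
        return 1.0
    return 1.0 + a * math.exp(b / math.cos(zenith))


def indicatrix(chi, c, d, e):
    """Indicatrix function f(chi) for angular distance chi from the sun."""
    return (1.0 + c * (math.exp(d * chi) - math.exp(d * math.pi / 2))
            + e * math.cos(chi) ** 2)


def angular_distance(zenith, azimuth, sun_zenith, sun_azimuth):
    """Angular distance between a sky element and the sun, in radians."""
    cos_chi = (math.cos(sun_zenith) * math.cos(zenith)
               + math.sin(sun_zenith) * math.sin(zenith)
               * math.cos(abs(azimuth - sun_azimuth)))
    return math.acos(max(-1.0, min(1.0, cos_chi)))  # clamp rounding errors


def relative_luminance(sky_type, zenith, azimuth, sun_zenith, sun_azimuth):
    """Ratio of element luminance to zenith luminance for one sky type."""
    a, b, c, d, e = CIE_PARAMS[sky_type]
    chi = angular_distance(zenith, azimuth, sun_zenith, sun_azimuth)
    return (indicatrix(chi, c, d, e) * gradation(zenith, a, b)
            / (indicatrix(sun_zenith, c, d, e) * gradation(0.0, a, b)))
```

By construction the ratio is 1 at the zenith for every sky type, which provides a quick sanity check when the remaining 13 parameter sets are added.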

Sky classification with Tregenza methods
The method in the present study is based on the Tregenza method. The principal characteristic of this method lies in its normalisation of measured luminance with respect to horizontal illuminance and solar elevation, rather than using measured sky zenith luminance directly. The use of measured zenith luminance could lead to uncertainty, as the luminance of a single sky element can change considerably, even under stable cloudy conditions, and the luminance of the sky zenith cannot be measured accurately when the sun is close to the zenith. The Tregenza method appears to be more stable and effective, as it takes these potential issues into consideration. The method has been applied successfully by many researchers (Ng et al. 2007; Torres et al. 2010a, 2010b; Suárez-García et al. 2018). As shown in Figure 5 (Zi 2020), the sky hemisphere was divided into 145 sky elements. The horizontal illuminance from the entire sky ($E_h$) is the sum of the horizontal illuminance from the 145 sky elements ($E_{hp}$) (Equation 5). As some abnormal data were discarded prior to analysis, a correction factor ($F_c$) was proposed for the calculation of horizontal illuminance (Equations 6 and 7).
where $b_p$ is the band number, ranging from 1 to 8, and $n_p$ is the number of sky elements in each band. The horizontal illuminance of each sky element ($E_{hp}$) is the contribution of the luminance of sky element $p$, and can be calculated according to Equations 8 and 9.
where $L_p$ is the measured luminance value of a sky element.
The normalised luminance of each sky element ($L_{p,sc}$) can then be determined as the ratio of the measured luminance of each sky element ($L_p$) to the horizontal illuminance from the whole sky hemisphere ($E_h$) (Equation 10).
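The normalisation step can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of Equations 5 to 10: the hemisphere is split into the usual eight bands of 30, 30, 24, 24, 18, 12, 6, and 1 elements at altitudes of 6° to 90°, each element is assigned an equal share of the solid angle of its 12° band, and the correction factor is passed in as a plain argument (`fc`, default 1) because its definition is not reproduced here. All function names are hypothetical.

```python
import math

BAND_ALTITUDES_DEG = [6, 18, 30, 42, 54, 66, 78, 90]
BAND_COUNTS = [30, 30, 24, 24, 18, 12, 6, 1]   # 145 elements in total


def element_geometry():
    """Return (altitude_rad, solid_angle_sr) for each of the 145 elements.

    Assumption: each band spans 12 degrees of altitude and its solid angle
    is shared equally among the elements of that band.
    """
    half = math.radians(6.0)
    geometry = []
    for alt_deg, count in zip(BAND_ALTITUDES_DEG, BAND_COUNTS):
        alt = math.radians(alt_deg)
        if count == 1:   # zenith cap above 84 degrees
            omega = 2.0 * math.pi * (1.0 - math.sin(alt - half))
        else:
            band = 2.0 * math.pi * (math.sin(alt + half) - math.sin(alt - half))
            omega = band / count
        geometry.extend([(alt, omega)] * count)
    return geometry


def horizontal_illuminance(luminances, geometry, fc=1.0):
    """Sum the per-element contributions L_p * omega_p * sin(altitude_p).

    Void elements (luminance 0) contribute nothing; fc stands in for the
    correction factor mentioned in the text.
    """
    return fc * sum(lum * omega * math.sin(alt)
                    for lum, (alt, omega) in zip(luminances, geometry))


def normalise(luminances, geometry, fc=1.0):
    """Divide every element luminance by the whole-sky horizontal illuminance."""
    e_h = horizontal_illuminance(luminances, geometry, fc)
    return [lum / e_h for lum in luminances]
```

Because each luminance is divided by the illuminance of its own scan, two skies with the same relative pattern but different overall brightness give identical normalised values, which is what allows measured scans to be compared with the dimensionless standard skies.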
The luminance distribution of each standard general sky type ($L_{pred}$) was subsequently calculated according to Equations 1-4. To improve the comparison, the mean luminance of each sky element was obtained by taking the mean of the calculated luminance at the four corners of the element (Figure 5) (Zi 2020). The angular positions of the four corners ($p_1$, $p_2$, $p_3$, and $p_4$) of element $p$ $(\alpha, Z)$ are $(\alpha - \pi/n_p, Z - \pi/30)$, $(\alpha - \pi/n_p, Z + \pi/30)$, $(\alpha + \pi/n_p, Z - \pi/30)$, and $(\alpha + \pi/n_p, Z + \pi/30)$. The final sky element, at the zenith, was divided into six triangles (Figure 5) (Zi 2020). The luminance of this element was calculated as the mean luminance of the six triangles (Equation 11).
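The four-corner averaging described above can be written compactly. In this sketch, `model` is a hypothetical callable that returns the model luminance for a given azimuth and zenith angle (for instance a wrapper around the CIE formulas); the treatment of the zenith element by six triangles is not covered.

```python
import math


def corner_mean(model, azimuth, zenith, n_p):
    """Mean of model(azimuth, zenith) over the four corners of an element.

    n_p is the number of elements in the band, so the element spans
    2*pi/n_p in azimuth and pi/15 (12 degrees) in zenith angle.
    """
    d_az = math.pi / n_p
    d_z = math.pi / 30.0
    corners = [
        (azimuth - d_az, zenith - d_z),
        (azimuth - d_az, zenith + d_z),
        (azimuth + d_az, zenith - d_z),
        (azimuth + d_az, zenith + d_z),
    ]
    return sum(model(az, z) for az, z in corners) / 4.0
```

For a luminance field that varies linearly across the element, the corner mean equals the value at the centre; the averaging only changes the result where the model is strongly curved, for example close to the sun.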
The horizontal illuminance ($E_{h,pr}$) from each standard sky was also calculated according to Equations 8 and 9, and 15 sets of normalised luminance ($L_{p,st}$) were obtained using Equation (10).
The final step was the comparison between each standard sky and the measured sky to determine the closest sky type. The root-mean-square error (RMSE) was obtained by subtracting the normalised luminance of each measured sky element from that of the corresponding element of each standard sky, squaring the differences, summing them, dividing the sum by the total number of available sky elements, and taking the square root (Equation 12). The standard sky with the lowest RMSE value was taken as the closest sky type. A sequence of MATLAB code was developed for the entire process.
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{p=1}^{145} m_p \left(L_{p,st} - L_{p,sc}\right)^2}{n}} \tag{12}$$

where $n$ is the number of available luminance values, which is required to exclude void elements from the sum:

$$n = \sum_{p=1}^{145} m_p$$

where, for each sky element, $m_p$ is initially set to 1 and is assigned a value of 0 if the luminance of the sky element is 0.
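The comparison step can be illustrated as below. This is a sketch of the idea rather than the authors' MATLAB code: it assumes that the measured and standard skies are given as equally long lists of normalised luminances, that void measured elements carry the value 0, and that the function names are free choices for the example.

```python
import math


def rmse(measured, standard):
    """Root-mean-square error over the available (non-zero) measured elements."""
    squared_sum = 0.0
    n = 0
    for l_meas, l_std in zip(measured, standard):
        if l_meas == 0:        # void element: m_p = 0, excluded from the sum
            continue
        squared_sum += (l_std - l_meas) ** 2
        n += 1
    if n == 0:
        raise ValueError("scan contains no valid sky elements")
    return math.sqrt(squared_sum / n)


def closest_sky_type(measured, standards):
    """Return the key of the standard sky with the lowest RMSE.

    standards maps a sky type number to its list of normalised luminances.
    """
    return min(standards, key=lambda sky_type: rmse(measured, standards[sky_type]))
```

A toy call such as `closest_sky_type([0.30, 0.0, 0.34], {1: [0.31, 0.9, 0.33], 12: [0.20, 0.3, 0.50]})` selects type 1, since the second element is void and the remaining two values lie much closer to the first candidate.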

Characterisation of Harbin's skies during winter
In this study, a total of 6332 sky scans during winter were obtained, of which 5589 were eventually used for analysis. Figure 6(a) shows the frequency of occurrence (%) of different sky types and Figure 6 (b) shows the distribution of overcast, intermediate, and clear sky conditions in Harbin during winter. Figure 6(c) shows the RMSE between the best-fitting CIE skies and the measured skies. It can be observed that the frequency distributions of the 15 general sky types are demonstrably different.
The dominant sky type in Harbin during winter is type 12 (CIE standard clear sky), with a frequency of occurrence of 48.57%. This is followed by types 11, 10, and 8, with frequencies of occurrence of 17.67%, 6.29%, and 5.36%, respectively. The total frequency of occurrence of these four sky types is 77.89%, which means that, between them, these four sky types are effectively representative of the sky luminance distribution of Harbin during winter. Figure 7 shows typical sky images of these four sky types at certain times. Type 1 appeared with quite a low frequency of occurrence, at about 3.75%, which suggests that the standard overcast sky type is not suitable for prediction of daylight conditions in Harbin during winter, even though current standards still use this sky type for daylight evaluation across China. From the perspective of cloudiness conditions, the frequency of occurrence of clear skies is 70.56%, with types 12, 11, and 13 being the dominant sky types. As shown in Figure 6(c), the RMSE tends to increase from overcast skies to clear skies. For clear skies, a high deviation was found, particularly for type 15, which has an extremely low frequency of occurrence. However, types 11, 12, and 13 show relatively low RMSE values with a high frequency of occurrence, which confirms that these three sky types better represent clear sky conditions. For intermediate skies, the distribution of the RMSE values shows a similar pattern to the frequency of occurrence. Type 8 had the highest RMSE as well as the highest frequency of occurrence. The five overcast sky types have relatively low deviation compared to the other sky types, with type 1 exhibiting the lowest value. This means that type 1 was closest to the real sky luminance distribution when it occurred, though it is not regarded as one of the representative skies of Harbin during winter.
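The frequencies quoted above follow from simple counting over the classified scans. The sketch below shows one way to compute such frequencies and checks the combined share of the four representative types using the percentages reported in the text; the function name and the list-of-types input are assumptions made for the example.

```python
from collections import Counter


def frequency_of_occurrence(sky_types):
    """Percentage of scans assigned to each sky type."""
    counts = Counter(sky_types)
    total = len(sky_types)
    return {sky_type: 100.0 * count / total for sky_type, count in counts.items()}


# Percentages reported in the text for the four most frequent winter sky types.
REPORTED = {12: 48.57, 11: 17.67, 10: 6.29, 8: 5.36}
combined_share = round(sum(REPORTED.values()), 2)   # 77.89
```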
For comparison, some cities at latitudes similar to that of Harbin were selected for analysis using data from previous research, including Pamplona (42.83°N, 1.6°W), Bratislava (48.2°N, 17.1°E), Beijing (40°N, 116°E), and Sheffield (53.38°N, 1.50°W) (Torres et al. 2010b; Bartzokas et al. 2005; Luo et al. 2015; Markou et al. 2005). Significant differences in sky luminance distribution appear between these sites. Overcast skies prevail in Bratislava, Sheffield, and Pamplona, as Bratislava has a Mediterranean climate and the other two cities have warm temperate climates. The climates of these three cities are humid, and rainy days occur more frequently in winter. In contrast, intermediate skies prevail in Beijing and clear skies dominate in Harbin, although both cities have a temperate monsoon climate. Both are generally cold and dry in winter, although Beijing is markedly warmer. The extremely low temperatures, high atmospheric pressure, and low water vapour content in Harbin mean that clear skies prevail during winter.
In this study, some data from other months were chosen for comparison to assist in defining the sky luminance distributions of Harbin during winter. Figure 8 (Zi 2020) shows the frequency of occurrence of the 15 sky types in different months from summer to winter. Considerable changes can be observed from summer to winter for overcast and clear skies, with the frequency of occurrence of overcast skies showing a declining trend. Conversely, a significant increase from summer to winter can be found for clear sky conditions. The frequency of occurrence of intermediate skies is generally stable, with only slight fluctuations. Type 1 was the dominant sky type in July with a frequency of 5.29%, but had frequencies of only 1.5% in October and 0.09% in January. This is because there are more rainy days during summer than winter in Harbin, and type 1 prevails on rainy days. Clear days are more prevalent in winter, and the change in frequency is most significant for sky type 12, with frequencies of 1.2% in July, 4.61% in October, and 4.69% in January. Figure 9 presents sky distributions at various times of the day during winter. To find the relationships between the sky types and the periods of the day, the winter day was divided into four periods: early morning (before 09:00 solar time), morning (between 09:00 and 12:00 solar time), afternoon (from 12:00 to 15:00 solar time), and late afternoon (after 15:00 solar time). Figure 10 shows the distribution of sky types in the different periods of the day during winter in Harbin. It can be observed that clear skies prevail overall, especially in the early mornings and mornings. The frequency of intermediate skies increases steadily through the day, with a comparatively low frequency of occurrence in the early mornings and a high frequency in the late afternoons. The frequency of overcast skies appears to be stable across the different daily periods.
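The grouping by period of the day can be sketched as follows. The period boundaries come from the text; which period a scan taken exactly at 9, 12, or 15 solar hours belongs to is not stated, so the half-open intervals used here are an assumption, as are the function names and the input layout. The grouping of types 1 to 5, 6 to 10, and 11 to 15 into overcast, intermediate, and clear skies follows the CIE ordering.

```python
from collections import Counter, defaultdict


def period_of_day(solar_hour):
    """Map a solar hour (e.g. 10.5) to one of the four periods used in the text."""
    if solar_hour < 9:
        return "early morning"
    if solar_hour < 12:
        return "morning"
    if solar_hour < 15:
        return "afternoon"
    return "late afternoon"


def sky_category(sky_type):
    """Group the 15 CIE types into overcast (1-5), intermediate (6-10), clear (11-15)."""
    if 1 <= sky_type <= 5:
        return "overcast"
    if 6 <= sky_type <= 10:
        return "intermediate"
    if 11 <= sky_type <= 15:
        return "clear"
    raise ValueError("sky type must be between 1 and 15")


def distribution_by_period(scans):
    """scans: iterable of (solar_hour, sky_type). Returns {period: {category: %}}."""
    counts = defaultdict(Counter)
    for solar_hour, sky_type in scans:
        counts[period_of_day(solar_hour)][sky_category(sky_type)] += 1
    return {
        period: {cat: 100.0 * k / sum(counter.values()) for cat, k in counter.items()}
        for period, counter in counts.items()
    }
```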

Analysis of daylight hours
For clear skies, sky type 12 was generally the dominant type, with a high and consistent frequency of occurrence of over 40%. Aside from type 12, type 13 had a relatively high frequency of occurrence in the mornings but a lower frequency in the afternoons, probably because of a reduction in atmospheric pollution at that time. Type 11 exhibited the opposite trend. At around 12:00, the frequency of occurrence of type 11 peaked and exceeded that of type 12. For intermediate skies, type 8 prevailed in the early mornings and mornings, and type 10 prevailed in the afternoons and late afternoons. For overcast skies, type 1 was the dominant sky type, with a high frequency of occurrence across all periods of the day. Figure 11(a-d) shows the sky types at various time points during four representative days. The dynamic sky conditions over time can be clearly identified. A recognised limitation of the CIE standard sky models is that the correct sky type must be selected before simulation. Selecting the closest sky type makes it possible to establish the corresponding sky luminance distribution for daylight calculations, and in this way assists in the simulation of real-time indoor daylight conditions.

Analysis of solar altitude
The solar altitude in Harbin during winter is relatively low, with values of not more than 37°. Figure 12 shows the frequency of occurrence of different sky types against solar altitude in Harbin during winter, and Figure 13 shows data for the four representative sky types during winter. A clear dependence on solar altitude can be identified for type 12: its frequency of occurrence declines gradually as solar altitude increases, but then rises abruptly when the solar altitude approaches 30°. Conversely, the frequency of occurrence of type 11 exhibits a rise-fall-rise tendency, with a first peak of about 25% when the solar altitude is approximately 21°. When the solar altitude is close to 35°, the frequency of occurrence of type 11 is about 50%. Types 10 and 8 have similar distributions and appear to be stable, with only slight fluctuations. Type 10 exhibits a peak value of about 15% at a solar altitude of 26°.
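The upper bound on winter solar altitude can be checked with a short calculation. The sketch below uses Cooper's approximation for solar declination, which is a common simplification and not necessarily the solar geometry used in the study, together with the station latitude of 45.75°N given earlier; the function names and the non-leap-year day numbering are assumptions.

```python
import math

HARBIN_LATITUDE_DEG = 45.75   # station latitude given in the text


def declination_deg(day_of_year):
    """Solar declination in degrees (Cooper's approximation)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def noon_solar_altitude_deg(day_of_year, latitude_deg=HARBIN_LATITUDE_DEG):
    """Solar altitude at solar noon, the daily maximum, in degrees."""
    return 90.0 - latitude_deg + declination_deg(day_of_year)


# 1 November to 28 February, numbered for a non-leap year.
WINTER_DAYS = list(range(305, 366)) + list(range(1, 60))
noon_altitudes = [noon_solar_altitude_deg(day) for day in WINTER_DAYS]
highest = max(noon_altitudes)   # reached at the end of February
lowest = min(noon_altitudes)    # reached around the winter solstice
```

Under these assumptions the highest noon altitude in the winter period stays below 37°, in line with the limit stated above, while around the winter solstice the sun does not rise much above 20° even at noon.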
During daylight hours, the windows of a room are the main source of glare. With relatively low solar altitudes and a high frequency of occurrence of clear skies, rooms that do not face north, particularly those facing directly south, are exposed to beam sunlight for long periods, which increases the potential for discomfort glare. With the high frequency of occurrence of type 12 skies, glare is more likely to occur in Harbin than in the other cities mentioned above. Glare evaluation therefore seems essential for Harbin during winter, and it is reasonable to select a clear sky type for glare risk evaluation.

Conclusion
In this study, the distribution of sky types in Harbin during winter was analysed for the first time. The representative sky types of Harbin were identified by comparing measured sky luminance distributions with theoretical sky models. The frequency of occurrence of the different sky types turned out to vary significantly. Clear skies have the highest frequency of occurrence, at 70.56%, followed by intermediate skies. Overcast skies were found to have the lowest frequency of occurrence. The dominant sky type in Harbin during winter is type 12, with a frequency of occurrence of approximately 50%. A reduced set of CIE standard skies could be used to describe the sky luminance distribution of Harbin in winter. A simplified set of four of the fifteen sky types (types 12, 11, 10, and 8) can be regarded as representing the skies of Harbin in winter, with a total frequency of occurrence of more than 77%.
A standard overcast sky is currently still used in China as a benchmark to evaluate daylight conditions, but according to the results of this study, this appears to be inaccurate and ineffective for Harbin. Clear skies are more suitable for describing the overall sky luminance distribution of Harbin during winter. The closest sky type could be identified from real-time luminance observation and used to establish the sky luminance distribution. A sky model based on measured luminance data could also be established to represent real sky brightness, which would help to develop real-time daylight simulation and improve the accuracy of daylight simulation.
In the future, further long-term luminance measurements will be required for an annual study of the daylight climate of Harbin and other cities in China. This could contribute to the establishment of acceptable general daylight design rules, and observations of other meteorological data could be analysed alongside the luminance measurements.

Disclosure statement
No potential conflict of interest was reported by the authors.