Construction of functional data analysis modeling strategy for global solar radiation prediction: application of cross-station paradigm

To support initiatives for global emissions targets set by the United Nations Framework Convention on Climate Change, sustainable extraction of usable power from freely-available global solar radiation as a renewable energy resource requires accurate estimation and forecasting models for solar energy. Understanding the Global Solar Radiation (GSR) pattern is highly significant for determining the solar energy in any particular environment. The current study develops a new mathematical model based on the concept of Functional Data Analysis (FDA) to predict daily-scale GSR in the Burkina Faso region of West Africa. Eight meteorological stations are adopted to examine the proposed predictive model. The modeling procedure of the regression FDA is performed using two different internal parameter tuning approaches including Generalized Cross-Validation (GCV) and Generalized Bayesian Information Criteria (GBIC). The modeling procedure is established based on a cross-station paradigm wherein the climatological variables of six stations are used to predict GSR at two targeted meteorological stations. The performance of the proposed method is compared with the panel data regression model. Based on various statistical metrics, the applied FDA model attained convincing absolute error measures and best goodness of fit compared with the observed measured GSR. In quantitative evaluation, the predictions of GSR at the Ouahigouya and Dori stations attained correlation coefficients of R = 0.84 and 0.90 using the FDA model, respectively. All in all, the FDA model introduced a reliable alternative modeling strategy for global solar radiation prediction over the Burkina Faso region with accurate line fit predictions.


Introduction
The growth in electrical energy demand is becoming a critical issue, especially as regards promoting sufficient technologies for solar (and other renewable) energy utilization that must support United Nations Sustainable Development Goal 7. Over the past three decades, the sun has been regarded as the main genuine channel of energy because it can maintain and sustain every process and activity that enhances the lives of animals, plants and other materials on earth (Yang, 2019). The main source of energy that meets the environmental challenges related to limited reserves and fossil fuels is solar (Li, Bu, Long, Zhao, & Ma, 2012; Ulgen & Hepbasli, 2004). Non-renewable forms of energy can generate significant environmental issues. Renewable energy sources, including tidal, solar, wind and geothermal, are favored because they have a reduced environmental impact compared to traditional sources like fossil fuels. Hence, solar energy can be a sustainable and promising energy source that can minimize environmental hazards (De Souza et al., 2016). Solar radiation from the sun resulting in solar energy is an electromagnetic radiation with a varied wavelength from radio waves (10-8 μm) to U-rays (10-6 μm) (Adeyefa & Adedokum, 1991). Extra-terrestrial and terrestrial spectra deviate from each other owing to different absorptions in the atmosphere (Ugwuoke & Okeke, 2012). In general, solar irradiance is the power per unit area received from the sun as electromagnetic radiation within the wavelength range of the measuring instrument. Solar radiation helps in improving energy efficiency, de-carbonizing the global economy and ameliorating greenhouse gas emitter costs (Besharat, Dehghan, & Faghih, 2013). A coherent understanding and precise evaluation of solar radiation is needed for several applications, including: the supply of energy to natural processes such as photosynthesis and to photovoltaic and thermal systems (Yakintepe & Genc, 2015); climatology, meteorology, energy budgets and radiation, water treatment processes, natural lighting and heating, use of renewable energy, forestry and agriculture (De Souza et al., 2016); and the work of energy-conscious building designers and air conditioning engineers (Li et al., 2012; Muneer & Munawwar, 2006).

CONTACT: Zaher Mundher Yaseen, yaseen@tdtu.edu.vn
Solar radiation changes from one geographical area to another. It depends on: (i) meteorological variables including the effects of cloud cover, evaporation, relative humidity, temperature, precipitation, extra-terrestrial solar radiation and sunshine duration; (ii) geographical variables including the elevation of the site, longitude and latitude; (iii) geometrical variables including the orientation and inclination angles of solar receivers; (iv) astronomical variables including hour angle, solar constant, solar declination and earth-sun distance; and (v) physical variables including water vapour content, scattering due to air molecules, scattering due to dust, earth-sun distance, and other atmospheric components such as CO2, N2 and O2.
Different methods and measurements have been employed in various parts of the world to measure global solar radiation. These techniques require consistent measurements using meteorological instruments such as Eppley pyranometers, or satellite remote sensing products such as Meteosat images and Moderate-Resolution Imaging Spectroradiometer (MODIS) products. Because of the maintenance, cost and skill required in producing satellite-derived data and ground measurements, especially in developing and rural nations, several prediction models have been postulated to generate global solar radiation data that do not require a high initial outlay for the instrumental network (Sunday, Agbasi, & Samuel, 2016; Sunday, Samuel, Agbasi, & Sylvia, 2016).
Any technologically conscious developing country can obtain energy through conventional or renewable sources. The choice might depend on the enormous energy usage required by some developing countries, expertise, cost of installation, and required maintenance. Thus, combining both non-renewable and renewable sources will favor power supply in developing countries; nevertheless, renewable energy sources should be focused upon owing to their minimal environmental hazard. Most West African countries lack global solar radiation data. For the past 30 years, Global Solar Radiation (GSR) has been evaluated on a horizontal surface on a monthly and daily basis.
Different kinds of empirical models have been postulated in several West African countries. Several input variables have been used in these models, giving rise to many functional forms. The models that have been employed fall into six groups, depending on the input variables used. These groups were further divided into several subgroups, depending on the year in which the models were postulated. Overall, a total of 68 functional forms and 356 empirical models have been postulated in previous studies for evaluating GSR in West African countries. Soft computing and empirical models were compared for evaluating GSR across West Africa, and the results obtained reflected a better outcome for the soft computing models.
It is not possible to gather solar radiation data in many regions/locations owing to the absence of solar power stations. Thus, the solar radiation data for such locations have to be predicted, and the accuracy of the predictions depends on the model used (Yagli, Yang, & Srinivasan, 2019). Several statistical and data-driven models have been proposed for solar radiation prediction. For example: Olatomiwa, Mekhilef, Shamshirband, and Petković (2015b) developed an Adaptive Neuro-Fuzzy Inference System (ANFIS) to predict solar radiation; Aybar-Ruiz et al. (2016) proposed grouping genetic and extreme learning machine algorithms to predict global solar radiation; Kaplani, Kaplani, and Mondal (2018) investigated a spatiotemporal model for predicting daily global solar radiation; Meenal and Selvakumar (2018) compared the accuracy of several data-driven methods in predicting solar radiation; Bahrooz, Mert, and Kisi (2018) compared four different heuristic regression methods for estimating solar radiation; Khosravi, Koury, Machado, and Pabon (2018) proposed two machine learning algorithms to predict the hourly solar irradiance; Cornejo-Bueno, Casanova-Mateo, Sanz-Justo, and Salcedo-Sanz (2019) compared several machine learning regression techniques for global solar radiation estimation; and Torres-Barran, Alonso, and Dorronsoro (2019) evaluated the accuracy of random forest, gradient boosted and extreme gradient boosting regression models in solar radiation prediction. Such models are fitted to data observed at a single time point. Throughout the literature, multiple investigations have been conducted in this regard, for example: solar energy prediction using linear and nonlinear models by the American Meteorological Society (Aggarwal & Saini, 2014); operational and ground-based models developed for solar radiation prediction for multiple advanced daily scales throughout Greece (Kosmopoulos, Kazadzis, Lagouvardos, Kotroni, & Bais, 2015); the feasibility of a support vector regression model examined for solar irradiance prediction throughout coastal Taiwan (Kosmopoulos et al., 2015); the prediction of monthly-scale global solar radiation conducted based on a statistical distribution modeling strategy using a clearness index for Nigeria (Ayodele & Ogunjuyigbe, 2015); and hourly-scale solar irradiance prediction established using the potential of the Long Short-Term Memory (LSTM) model for Santiago Island, Cape Verde (Qing & Niu, 2018). The literature has demonstrated noticeable progress in solar radiation pattern prediction using diverse advanced methodologies.
Among the procedures for the generation of global solar data, the ideal method is the use of a proper radiometric instrument that will directly measure the solar data at a given solar farm. However, the cost demand and expertise required for on-site global solar radiation measurement have limited the availability of radiometric data in most African and Asian countries (Zou et al., 2019). Another problem is the hosting of solar radiation stations in urban areas while effectively neglecting rural areas, where the energy crisis is more predominant. In Burkina Faso, most stations owned by the government do not have the capacity to measure solar radiation data routinely (Azoumah, Ramde, Tabsoba, & Thiam, 2010), while monthly or daily radiometric data are missing in areas with readily available data due to poor calibration of equipment. Solar radiation can also be generated using a meteorological reanalysis technique called Meteoblue (David & Lauret, 2018). This involves the use of physical models to simulate meteorological parameters physically (5 km × 5 km). This simulation relies on Non-hydrostatic Meso-scale Modeling (NMM) technology, which depends on parameters such as topography, soil and coverage. One major problem of this approach is that the generated values are simulated rather than real. However, its major advantage is the incorporation of physical processes that influence ground-based solar radiation. Given that mathematical equations with predetermined initial and model boundary conditions are used during the simulation, the data observed physically at a station may differ significantly from the predicted data (Fabbri, Canuti, & Ugolini, 2017).
Most investigations on solar radiation prediction have generally been done using empirical datasets collected from a single time point. However, datasets that are repeatedly measured over discrete time points may provide more information. Also, recent technological developments lead to data collection processes having high-dimensional and complex structures. Traditional statistical/mathematical techniques may not be applicable for such data types because of difficulties such as multicollinearity, high dimensionality and high correlation between sequential observations. Analyzing such datasets using Functional Data Analysis (FDA) techniques may be more useful since FDA has several important advantages over traditional statistical techniques. For example, FDA does not suffer from the missing data problem and the high correlation problem between repeated measurements; by smoothing the data, it minimizes the noise present in the data; and it can be used for irregularly sampled data. Thus, the need for FDA techniques is gradually increasing.
Functional regression models are used, among others, to explore the relationship between the functional response and predictor variables, and these models have received substantial attention in the literature. Also, they have successfully been used in many areas; see for example Valderrama, Ocana, Aguilera, and Ocana-Peinado (2010), Ivanescu, Staicu, Scheipl, and Greven (2015) and Chiou, Yang, and Chen (2016). See also Ferraty and Vieu (2006), Horvath and Kokoszka (2012) and Cuevas (2014) for more information about functional regression models and their applications. In this paper, we propose a functional regression model to predict global solar radiation data using meteorological variables so as to improve prediction accuracy. In summary, the proposed model works as follows: first, Gaussian basis function expansion and two information criteria, Generalized Cross-Validation (GCV) and Generalized Bayesian Information Criteria (GBIC), are used to convert discretely observed data into a functional form. Second, the penalized log-likelihood method is used to estimate the discretized version of the model parameter matrix. Finally, the coefficient function of the functional regression model is obtained by applying a smoothing step. To the best of our knowledge, this work is the first study to predict global solar radiation data using a functional regression model. For future work, the FDA procedure proposed in this study can be extended to other real-life problems as an alternative to the methods proposed by Chau and Muttil (2007), Ghorbani, Kazempour, Chau, Shamshirband, and Ghazvinei (2017), Yaseen, Sulaiman, Deo, and Chau (2018) and Moazenzadeh, Mohammadi, Shamshirband, and Chau (2018).
The rest of the paper is organized as follows. Section 2 presents the details of the proposed method and the panel data regression model. The performance of the proposed method is evaluated with real-world data and the results are given in Sections 3 and 4. Section 5 concludes the paper.

Functional regression model
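Let $\{t_j\}_{j=1}^{J} \in T$ represent the discrete time points at which the data are observed. For $n = 1, \ldots, N$ and $m = 1, \ldots, M$, let $x_{nm}(s)$, $s \in T_m$, and $y_n(t)$, $t \in T$, denote the $M$ functional predictors and a functional response with ranges $T_m \subset \mathbb{R}$ and $T \subset \mathbb{R}$, respectively. The functional relationship between the predictors and response can be modeled by the following functional regression model (Matsui, Kawano, & Konishi, 2009; Ramsay & Silverman, 2005):

$$y_n(t) = \beta_0(t) + \sum_{m=1}^{M} \int_{T_m} x_{nm}(s)\,\beta_m(s,t)\,ds + \epsilon_n(t), \tag{1}$$

where $\beta_0(t)$, $\beta_m(s,t)$ and $\epsilon_n(t)$ represent the intercept function, bivariate coefficient functions and error functions, respectively. For the sake of clarity, the role of function $\beta_0(t)$ can be eliminated by centering the functional predictors and response. Let $x^*_{nm}(s) = x_{nm}(s) - \bar{x}_m(s)$ and $y^*_n(t) = y_n(t) - \bar{y}(t)$, with $\bar{x}_m(s) = N^{-1}\sum_{n=1}^{N} x_{nm}(s)$ and $\bar{y}(t) = N^{-1}\sum_{n=1}^{N} y_n(t)$, denote the centered functional predictors and response, respectively. Then the functional regression model (1) can be written as follows:

$$y^*_n(t) = \sum_{m=1}^{M} \int_{T_m} x^*_{nm}(s)\,\beta_m(s,t)\,ds + \epsilon^*_n(t),$$

where $\epsilon^*_n(t) = \epsilon_n(t) - \bar{\epsilon}(t)$ denotes the centered error functions. Hereafter it is assumed that both functional response and predictors are centered.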
The first step in FDA is to smooth the functional data using a suitable basis function system. Let $\Phi(t) = \{\phi_1(t), \ldots, \phi_K(t)\}$ denote a system of $K$ basis functions; then a function, say $y(t)$, can be defined as $y(t) = \sum_{k=1}^{K} c_k \phi_k(t)$, where $c_k$ is the coefficient of the $k$th basis function $\phi_k(t)$. Accordingly, the smooth functions of the (centered) functional predictors $x_{nm}(s)$ and functional response $y_n(t)$ are defined as follows:

$$x_{nm}(s) = \sum_{j=1}^{K_{m,x}} d_{nmj}\,\psi_{mj}(s) = d_{nm}^\top \Psi_m(s), \qquad y_n(t) = \sum_{k=1}^{K_y} c_{nk}\,\phi_k(t) = c_n^\top \Phi(t),$$

where $\Phi(t) = \{\phi_1(t), \ldots, \phi_{K_y}(t)\}$ and $\Psi_m(s) = \{\psi_{m1}(s), \ldots, \psi_{mK_{m,x}}(s)\}$ are vectors of basis functions and $c_n = \{c_{n1}, \ldots, c_{nK_y}\}$ and $d_{nm} = \{d_{nm1}, \ldots, d_{nmK_{m,x}}\}$ are the corresponding vectors of coefficients. Choosing the right basis functions is one of the most crucial steps in FDA. Several types of basis function, such as the Fourier basis, the B-splines basis and the radial basis, have been proposed to smooth functional data; please see Ramsay and Silverman (2005) for more details. We consider Gaussian basis functions in our numerical analyses (see Matsui et al., 2009), where the equally spaced knots $\tau_k$ and $\tau^{(m)}_j$ determine the centers of the basis functions, and $\sigma = (\tau_{k+2} - \tau_k)/2$ and $\sigma_m = (\tau^{(m)}_{j+2} - \tau^{(m)}_j)/2$ are the widths. Another important task in smoothing functional data is to choose the optimum number of basis functions $K$. Generally, (i) the data are well fitted by the functions when $K$ is large, but the noise present in the data may not be eliminated; on the other hand, (ii) some key features of the smooth function could be ignored when $K$ is too small. To select the optimal $K$, we consider the generalized cross-validation and generalized Bayesian information criteria proposed by Matsui et al. (2009).
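The smoothing step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian form exp{-(t - tau_k)^2 / (2 sigma^2)}, the second-difference roughness penalty, the synthetic curve and the function names `gaussian_basis` and `smooth_curve` are all assumptions made for illustration; only the equally spaced knots and the width (tau_{k+2} - tau_k)/2 are taken from the text.

```python
import numpy as np

def gaussian_basis(t, n_basis):
    """Return the (len(t) x n_basis) matrix of Gaussian basis functions.

    Centers are equally spaced knots on [min(t), max(t)]; the common width
    is (tau_{k+2} - tau_k) / 2, i.e. the knot spacing. Requires n_basis >= 3.
    The exact Gaussian form used here is an assumption.
    """
    t = np.asarray(t, dtype=float)
    knots = np.linspace(t.min(), t.max(), n_basis)
    sigma = (knots[2] - knots[0]) / 2.0
    return np.exp(-((t[:, None] - knots[None, :]) ** 2) / (2.0 * sigma ** 2))

def smooth_curve(t, y, n_basis, lam):
    """Penalized least-squares fit of y(t) = sum_k c_k * phi_k(t)."""
    Phi = gaussian_basis(t, n_basis)
    # Second-difference penalty on the coefficients (illustrative choice).
    D = np.diff(np.eye(n_basis), n=2, axis=0)
    lhs = Phi.T @ Phi + lam * (D.T @ D)
    coef = np.linalg.solve(lhs, Phi.T @ np.asarray(y, dtype=float))
    return coef, Phi @ coef

# Synthetic "daily" curve on the midpoint grid 0.5, 1.5, ..., 364.5.
rng = np.random.default_rng(0)
t = np.arange(365) + 0.5
truth = 20.0 + 3.0 * np.sin(2.0 * np.pi * t / 365.0)
y = truth + rng.normal(scale=0.5, size=t.size)
coef, fitted = smooth_curve(t, y, n_basis=10, lam=1e-2)
```

Here `coef` plays the role of the coefficient vector c_n and `fitted` is the smooth function evaluated on the grid; a larger `n_basis` follows the data more closely, while a larger `lam` gives a smoother curve.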
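Using the basis functions, the bivariate coefficient functions $\beta_m(s,t)$ in (1) can be written as follows:

$$\beta_m(s,t) = \sum_{j=1}^{K_{m,x}} \sum_{k=1}^{K_y} \psi_{mj}(s)\, b_{mjk}\, \phi_k(t) = \Psi_m^\top(s)\, B_m\, \Phi(t), \tag{6}$$

where $B_m = (b_{mjk})_{j,k}$ is a coefficient matrix with dimension $K_{m,x} \times K_y$. From (2), (3) and (6), the functional regression model given in (1) can be written as follows:

$$c_n^\top \Phi(t) = \sum_{m=1}^{M} d_{nm}^\top J_{\psi_m} B_m \Phi(t) + \epsilon^*_n(t) = z_n^\top B\, \Phi(t) + \epsilon^*_n(t), \tag{7}$$

where $J_{\psi_m} = \int_{T_m} \Psi_m(s) \Psi_m^\top(s)\, ds$, $z_n = (d_{n1}^\top J_{\psi_1}, \ldots, d_{nM}^\top J_{\psi_M})^\top$ and $B = (B_1^\top, \ldots, B_M^\top)^\top$. Accordingly, the functional linear model for the whole system can be expressed as follows:

$$C\, \Phi(t) = Z B\, \Phi(t) + E(t), \tag{8}$$

with $C = (c_1, \ldots, c_N)^\top$, $Z = (z_1, \ldots, z_N)^\top$ and $E(t) = (\epsilon^*_1(t), \ldots, \epsilon^*_N(t))^\top$. Several techniques, including the least squares, maximum likelihood and penalized maximum likelihood methods, have been proposed to estimate the coefficient matrix $B$; see for example Ramsay and Silverman (2005), Yao, Muller, and Wang (2005), Konishi and Kitagawa (2008) and Matsui et al. (2009). The least squares and/or maximum likelihood methods provide unstable/unfavorable estimates for the model parameters (Matsui et al., 2009), and thus the penalized maximum likelihood method proposed by Matsui et al. (2009), which controls the degree of smoothness of the functions and provides more flexible results, has been considered to estimate the functional parameters of the regression model (8). Suppose that the error function $\epsilon^*_n(t)$ has the form $\epsilon^*_n(t) = e_n^\top \Phi(t)$, where the $K_y$-dimensional error vectors $e_n = (e_{n1}, \ldots, e_{nK_y})^\top$ are assumed to be independent and identically distributed Gaussian random vectors with mean 0 and variance-covariance matrix $\Sigma$. Define the functional regression model (7) by

$$c_n^\top \Phi(t) = z_n^\top B\, \Phi(t) + e_n^\top \Phi(t). \tag{9}$$

Multiplying both sides of (9) by $\Phi^\top(t)$ and integrating with respect to $T$ yields

$$c_n^\top J_\phi = z_n^\top B J_\phi + e_n^\top J_\phi, \quad \text{that is,} \quad c_n = B^\top z_n + e_n, \tag{10}$$

where $J_\phi = \int_T \Phi(t) \Phi^\top(t)\, dt$ is nonsingular when the basis functions are linearly independent. Let $f(y_n \mid x_n; \theta)$ with parameter vector $\theta = (B, \Sigma)$ denote the probability density function of the model (10). Then the penalized log-likelihood function for $\theta$ is obtained as follows:

$$\ell_\lambda(\theta) = \sum_{n=1}^{N} \log f(y_n \mid x_n; \theta) - \frac{N}{2} \operatorname{tr}\{B^\top (\Lambda_M \odot \Omega) B\}, \tag{11}$$

where $\Lambda_M$ is a $(p \times p)$-dimensional matrix of penalty parameters $\lambda_m$ with $p = \sum_m K_{m,x}$, $\odot$ and $\operatorname{tr}\{\cdot\}$ are, respectively, the Hadamard product and the trace of a matrix, and $\Omega$ is a positive semi-definite matrix. Equating the derivatives of the penalized log-likelihood function given in (11) with respect to $\theta = (B, \Sigma)$ to 0 gives the penalized maximum likelihood estimators of $\theta$, $\hat{\theta} = (\hat{B}, \hat{\Sigma})$, as follows:

$$\operatorname{vec}(\hat{B}) = \{\hat{\Sigma}^{-1} \otimes Z^\top Z + N\, I_{K_y} \otimes (\Lambda_M \odot \Omega)\}^{-1} (\hat{\Sigma}^{-1} \otimes Z^\top) \operatorname{vec}(C), \qquad \hat{\Sigma} = \frac{1}{N} (C - Z\hat{B})^\top (C - Z\hat{B}).$$

Finally, the penalized maximum likelihood estimator of $C$ is obtained as $\hat{C} = Z\hat{B}$. In practice, the performance of the penalized maximum likelihood method depends strongly on a suitable choice of the penalty parameters $\lambda_m$, since the estimated model parameters $\hat{\theta}$ and $\hat{C}$ depend on the penalty matrix $\Lambda_M$. Several information criteria have been proposed to select proper penalty terms, $\lambda_m$, that minimize the corresponding objective function. Two different tuning parameter techniques, GCV and GBIC, are implemented to obtain a suitable prediction process; in their definitions, $q = p - \operatorname{rank}(\Omega)$, $p = \sum_m K_{m,x}$ and $r = K_y(K_y + 1)/2$ (please see Matsui et al. (2009) for the derivation of GCV and GBIC). To select the best prediction model, these two criteria work as follows: (1) the functional regression model is constructed based on the data, which are approximated by basis function expansion using several combinations of smoothing parameter and number of basis functions; then (2) GCV and GBIC each select as the best model the combination of smoothing parameter and number of basis functions that produces the minimum GCV or GBIC value.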
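The two-step selection procedure, fitting over a grid of combinations and keeping the one with the smallest criterion value, can be sketched as follows. This is a simplified illustration and not the criteria of Matsui et al. (2009): it uses the classical GCV score of a single linear smoother, N * RSS / (N - tr(S))^2, and the Gaussian basis form, the second-difference penalty, the grids and the names `gaussian_design`, `gcv_score` and `select_by_gcv` are assumptions.

```python
import numpy as np

def gaussian_design(t, n_basis):
    """Gaussian basis matrix with equally spaced centers (assumed form)."""
    knots = np.linspace(t.min(), t.max(), n_basis)
    sigma = (knots[2] - knots[0]) / 2.0
    return np.exp(-((t[:, None] - knots[None, :]) ** 2) / (2.0 * sigma ** 2))

def gcv_score(t, y, n_basis, lam):
    """Classical GCV of the penalized smoother: N * RSS / (N - tr(S))^2."""
    Phi = gaussian_design(t, n_basis)
    D = np.diff(np.eye(n_basis), n=2, axis=0)
    lhs = Phi.T @ Phi + lam * (D.T @ D)
    hat = Phi @ np.linalg.solve(lhs, Phi.T)   # smoother ("hat") matrix S
    resid = y - hat @ y
    n = y.size
    return n * float(resid @ resid) / (n - np.trace(hat)) ** 2

def select_by_gcv(t, y, basis_grid, lam_grid):
    """Step (1): score every (K, lambda) pair. Step (2): keep the minimizer."""
    scores = {(k, lam): gcv_score(t, y, k, lam)
              for k in basis_grid for lam in lam_grid}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic daily curve, used only to exercise the search.
rng = np.random.default_rng(1)
t = np.arange(365) + 0.5
y = 20.0 + 3.0 * np.sin(2.0 * np.pi * t / 365.0) + rng.normal(scale=0.5, size=365)
best, scores = select_by_gcv(t, y, basis_grid=[5, 10, 15, 20],
                             lam_grid=[1e-4, 1e-2, 1.0, 100.0])
```

The same grid search would apply to GBIC by swapping the scoring function; only the criterion changes, not the selection logic.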
For the sake of clarity, a flowchart is presented in Figure 1 to show how the proposed method works to obtain the experimental results in this paper.

Panel data regression model
In this study, the finite sample performance of the proposed modeling strategy is compared with the linear panel data regression model with fixed effects. Let $i = 1, \ldots, N$ denote the individuals observed at time points $t = 1, \ldots, T$. Let also $y_{it}$ and $x_{it}$ denote the response and $K$-dimensional predictor variables. The linear panel data regression model is then defined as follows:

$$y_{it} = \alpha_i + x_{it}^\top \beta + u_{it},$$

where $\alpha_i$, $\beta$ and $u_{it}$ represent the individual effects, coefficient vector, and the error terms, respectively. The coefficient vector $\beta$ is estimated using the Ordinary Least Squares (OLS) method. Briefly, let $\bar{y}_i$, $\bar{x}_i$ and $\bar{u}_i$ denote the averages of $y_{it}$, $x_{it}$ and $u_{it}$ for each individual $i = 1, \ldots, N$. Then, the OLS estimate of $\beta$ is obtained as follows:

$$\hat{\beta} = \left( \sum_{i=1}^{N} \sum_{t=1}^{T} \tilde{x}_{it} \tilde{x}_{it}^\top \right)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} \tilde{x}_{it}\, \tilde{y}_{it},$$

where $\tilde{x}_{it} = x_{it} - \bar{x}_i$ and $\tilde{y}_{it} = y_{it} - \bar{y}_i$. Readers are referred to Baltagi (2005) for more information about the linear panel data regression model.
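The fixed-effects estimate described above amounts to demeaning each individual's data and running OLS on the demeaned values. A minimal Python sketch, assuming the standard within transformation; the function name `within_estimator` and the synthetic, noise-free data are hypothetical and only serve to show the mechanics:

```python
import numpy as np

def within_estimator(y, X, ids):
    """Fixed-effects (within) OLS: demean y and X per individual, then regress."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    groups = np.unique(ids)
    y_tilde, X_tilde = y.copy(), X.copy()
    for g in groups:
        mask = ids == g
        y_tilde[mask] -= y[mask].mean()
        X_tilde[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)
    # Individual effects recovered as alpha_i = ybar_i - xbar_i' beta.
    alpha = np.array([y[ids == g].mean() - X[ids == g].mean(axis=0) @ beta
                      for g in groups])
    return beta, alpha

# Three hypothetical "stations" with 50 observations each, no noise.
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 2))
alpha_true = np.array([5.0, -2.0, 0.5])
beta_true = np.array([2.0, -1.0])
y = alpha_true[ids] + X @ beta_true
beta_hat, alpha_hat = within_estimator(y, X, ids)
```

Because the demo data contain no noise, `beta_hat` and `alpha_hat` reproduce the generating values up to floating-point error.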

Case study
The solar radiation prediction model was developed for the Burkina Faso region, located in Sub-Saharan Africa (Figure 2). About 70% of the total power generation capacity in Burkina Faso is sourced from thermal-fossil fuel, while hydro-power accounts for the remaining 30% (REN21, 2015, 2017). Owing to the incremental cost of production, the instability of the oil price, as well as the ever-increasing demand for electricity, the country recently installed 28 fossil-fuel powered stations with a generating capacity of 247 MW. The net energy import of the country from its neighboring countries currently stands at about 20%. However, fuel-wood, charcoal, agricultural residues and animal dung are used as major sources of energy in remote villages.
In the present study, the proposed mathematical model was developed for the prediction of daily global solar radiation using eight meteorological stations distributed all over the Burkina Faso region, namely Bur Dedougou, Bobo Dioulasso, Fada N'gourma, Ouahigouya, Boromo, Dori, Gaoua and Po. The daily-scale climatological data, obtained from 1 January 1998 to 31 December 2012, consist of six variables: wind power, temperature, log humidity, the difference between the saturation and the actual vapour pressure (Es−Ea), evaporation (Eo) and solar radiation. The datasets were averaged over the data points obtained from the whole time span to construct a functional regression model. The mean values of the climate variables in time series for each meteorological station involved are plotted in Figure 3.

Application and results
The current study has examined the feasibility of a newly developed mathematical model based on the FDA technique to predict daily-scale global solar radiation in the Burkina Faso region of West Africa. The global solar radiation was simulated based on various related climatological variables using a consistent timescale. At first, the datasets of all the climate variables were converted into functional form using penalized Gaussian basis function expansion, with the number of basis functions K and the penalty parameter λ selected by GBIC and GCV. The modeling was conducted using the cross-station paradigm. Six meteorological stations were selected randomly to predict the solar radiation at two targeted meteorological stations (i.e. Ouahigouya and Dori). The main merit of this modeling archetype is the possibility of using information from nearby, well-maintained meteorological stations as predictors for any particular station. This is highly significant and essential where monitoring measurements are inconsistent, where climate information is lacking over certain historical periods, or where other such limitations are experienced in developing countries. For validation purposes, the predictive performance of the FDA technique was compared with a well-known regression model, the panel data regression model.
To demonstrate the functionality of GBIC and GCV over the inspected meteorological dataset, the functions of the variables for the Fada N'gourma station (selected as an example) are illustrated using the K and λ values of GBIC and GCV (see Figures 4 and 5, respectively). Table 1 reports the tuning parameters K and λ. Based on the tabulated values, GCV selects a larger number of basis functions than GBIC to convert the raw data to functional form. Both criteria provide a clear picture of the raw data, as demonstrated in Figures 4 and 5.
The functional regression model was constructed using the variables of six randomly selected stations, i.e. Bobo Dioulasso, Boromo, Bur Dedougou, Fada N'gourma, Gaoua and Po, as follows:

$$y^*_n(t) = \sum_{m=1}^{5} \int_{T} x^*_{nm}(s)\, \beta_m(s,t)\, ds + \epsilon^*_n(t), \qquad t \in T, \tag{18}$$

where $y^*_n$, $x^*_{n1}$, $x^*_{n2}$, $x^*_{n3}$, $x^*_{n4}$ and $x^*_{n5}$ are the centered functional variables for solar radiation, wind, temperature, log humidity, Es−Ea and Eo, respectively, and $T = \{0.5, 1.5, 2.5, \ldots, 364.5\}$. The parameter matrix $B$ was estimated using the penalized maximum likelihood method, and GBIC and GCV were used to select the best model.
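To make the role of the coefficient surfaces concrete, the integral term of the model can be approximated on the midpoint grid T = {0.5, 1.5, ..., 364.5} by a Riemann sum with step 1. The sketch below is illustrative only: `predict_response` is a hypothetical helper, and the coefficient surfaces in the demo are made-up constants, not the estimates reported in this study.

```python
import numpy as np

def predict_response(x_curves, beta_surfaces, ds=1.0):
    """Approximate sum_m integral of x_m(s) * beta_m(s, t) ds on a regular grid.

    x_curves:      list of M arrays of shape (J,), centered predictor curves.
    beta_surfaces: list of M arrays of shape (J, J), beta_m(s_j, t_l).
    Returns the centered response evaluated at the J grid points t_l.
    """
    total = np.zeros(np.asarray(beta_surfaces[0]).shape[1])
    for x, beta in zip(x_curves, beta_surfaces):
        # Midpoint rule: sum_j x(s_j) * beta(s_j, t_l) * ds for every t_l.
        total += ds * (np.asarray(x, dtype=float) @ np.asarray(beta, dtype=float))
    return total

grid = np.arange(365) + 0.5                       # midpoint grid T
x_demo = [np.full(365, 2.0),                      # hypothetical predictor 1
          np.cos(2.0 * np.pi * grid / 365.0)]     # hypothetical predictor 2
beta_demo = [np.full((365, 365), 1.0 / 365.0),    # averages predictor 1 over s
             np.zeros((365, 365))]                # predictor 2 has no effect
y_hat = predict_response(x_demo, beta_demo)
```

With these made-up surfaces the prediction is simply the annual mean of the first predictor at every t, which makes the effect of each surface easy to check by hand.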
Scatter plots of the observed average global solar radiation versus the fitted smooth function values are displayed in Figures 6 and 7. These scatter plots show that the observed solar radiation values were well fitted by the smooth functions obtained from the proposed model. On the other hand, Figure 8 presents the modeling performance of the panel regression model for the same six stations. Based on the modeling performance attained for the six randomly selected meteorological stations, the model was then used to predict the GSR at the two targeted stations (i.e. Ouahigouya and Dori). Following several research works in the literature, the current modeling was validated statistically using various performance metrics, including the root mean squared error (RMSE), the determination coefficient (R²) and the correlation coefficient (R), computed from the observed (average) solar radiation and the fitted and/or predicted solar radiation functions. The mathematical formulations follow Rodrigues and Henggeler Antunes (2018) and Yadav, Malik, and Chandel (2015). Values of all the performance metrics examined (i.e. RMSE, R² and R) are reported in Table 2. Note that the values given in columns three to eight belong to the stations used in the training modeling phase, whereas the values in the last two columns belong to the stations for which the predictions were made. These values indicate that the fitted/predicted functions evaluated by GCV provide slightly better approximations compared to those obtained by GBIC. However, both the GCV and GBIC functions reveal a much better predictive capacity in comparison with the panel regression model.
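For readers who wish to reproduce such metrics, a minimal Python sketch is given below. It assumes the common textbook definitions: RMSE as the root of the mean squared difference, R as the Pearson correlation, and R² as one minus the ratio of residual to total sum of squares. The exact formulas used in the cited references may differ, and the function names and the four-point example are hypothetical.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def correlation(obs, pred):
    """Pearson correlation coefficient R."""
    return float(np.corrcoef(obs, pred)[0, 1])

def r_squared(obs, pred):
    """Determination coefficient as 1 - SSE / SST (assumed definition)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    sse = np.sum((obs - pred) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - sse / sst)

# Made-up values, only to show the calls.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = {"RMSE": rmse(observed, predicted),
           "R2": r_squared(observed, predicted),
           "R": correlation(observed, predicted)}
```

In practice `observed` would hold the averaged daily solar radiation at a station and `predicted` the fitted or predicted function evaluated on the same grid.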
The current research results are validated against established research in the literature and within the African region. Olatomiwa et al. (2015b) established an ANFIS method to predict monthly solar radiation at [...] (Olatomiwa et al., 2015a). Another study was developed using the empirical formulation for diffuse solar radiation prediction by Khorasanizadeh and Mohammadi (2015). The results demonstrate six different empirical formulations with prediction accuracy in the range RMSE = 0.9548-1.1698. Based on the statistical metrics reported in Table 3, the current research results demonstrate superior prediction performance at the Ouahigouya and Dori stations.
The estimated coefficient functions of the functional linear model are presented in Figures 13 (GBIC) and 14 (GCV). These figures show the effects of the meteorological variables on the predicted solar radiation. For example, panel (c) of Figure 13 indicates that, while the humidity has little effect on the predicted solar radiation during the early months of the year, it has a large effect in the last months of the year. Although the predictive performance attained by the functional data analysis technique for global solar radiation is good, there is still room for modeling enhancement via the incorporation of the physical-based model established using Meteoblue (Fabbri et al., 2017). Indeed, formulating such an integrative model based on functional data analysis and the mathematical formulation of the Meteoblue method could possibly enhance the prediction capacity further.

Conclusions
The development of a scientific, robust and reliable modeling strategy to predict global solar radiation in particular climatic regions could help climate change mitigation advocates and energy decision-makers to embrace renewable energy as a dynamic solution for mitigating the risk of global warming and climate change. Converting global solar radiation into grid power requires an economical and intelligent model whose simulations are reliable. Hence, the exploration of new, user-friendly and robust mathematical models for understanding the correlated climate variables stimulates research interest and innovation in the new era of energy engineering. The current study was devoted to exploring the feasibility of a new mathematical model based on the functional data analysis modeling technique to simulate daily-timescale global solar radiation in the Burkina Faso region of West Africa. Two different tuning procedures (i.e. GCV and GBIC) were used for the prediction learning process. Fifteen years of daily-scale climate variables, including wind power, temperature, log humidity, the difference between the saturation and the actual vapour pressure (Es−Ea), evaporation (Eo) and solar radiation, were used to implement the prediction process. The findings of the current research are presented as follows.
• The FDA modeling technique provided a reliable predictive model for GSR, with a high and acceptable degree of accuracy based on the reported statistical metrics.
• Based on comparison with well-known machine learning predictive models reported in the literature for the same region, FDA showed greater prediction capacity in terms of RMSE and R².
• The predictability of the established modeling strategy was location dependent, as the results varied between stations. Hence, the idea of initiating a cross-station paradigm was an excellent proposition with which to gather more informative climate information from the nearby meteorological stations in order to enhance the learning procedure.
• Both of the applied learning procedures (i.e. GCV and GBIC) demonstrated an efficient computational methodology for solar radiation simulation based on various climate input variables. The merit of the results supports the possibility of embedding the model as a generalized predictive tool for the simulation of other meteorological stations.
• The investigated FDA predictive model provided a reasonable solar radiation prediction that relies entirely on the selected climate input attributes. Also, the appropriate tuning of the internal parameters, verified using the GCV and GBIC approaches, controlled the reliability of the modeling procedure.
Future investigations could be performed on the uncertainty analysis of data, model structure and input variability.

Figure 1. Flowchart of the proposed method.

Figure 3. Time series plots of averaged datasets.

Figure 4. Functional datasets for Fada N'gourma city obtained using GBIC. Gray points represent the raw data and the solid (black) lines are the functions.

Figure 5. Functional datasets for Fada N'gourma city obtained using GCV. Gray points represent the raw data and the solid (black) lines are the functions.

Figure 6. Scatter plots of the observed (average) and fitted (function) solar radiation values obtained from the model evaluated by GBIC.

Figure 7. Scatter plots of the observed (average) and fitted (function) solar radiation values obtained from the model evaluated by GCV.

Figure 8. Scatter plots of the observed and fitted solar radiation values obtained from the panel data model.

Figure 9. Results for the Ouahigouya and Dori stations. Gray points are the observed discrete solar radiation data points, black solid lines are the smoothed raw data, blue solid lines are the predicted functions obtained using the functional regression model, and the brown dashed lines are the approximate 95% confidence intervals of the predicted functions.

Figure 10. Results for the Ouahigouya and Dori stations. Gray points are the observed discrete solar radiation data points and blue solid lines are the predicted observations obtained using the panel data model.

Figure 11. Scatter plots of the observed (average) and predicted (function) solar radiation

Figure 14. Estimates of coefficient functions β_i(s, t), for i = 1, . . ., 5, of the functional regression model given by Equation (18). Note that the coefficient functions were estimated based on GCV.

Table 1. Estimated number of basis functions and penalty parameters.