Development of a component-based interactive visualization system for the analysis of ocean data

ABSTRACT With the continuous development of various types of fixed marine observation equipment, satellite remote sensing technology and computer simulation technology, modern marine scientific research has entered the era of big data. Interactive ocean visualization has become ubiquitous owing to the use of ocean data in studies of marine disasters, global climate change and fisheries. However, the primary challenge in analyzing large amounts of ocean data originates from the complexity of the data themselves. Therefore, an interactive multi-scale, multivariate visualization system with dynamic expansion potential is needed for analyzing larger volumes of ocean data. In this study, a unified visual data service was constructed, and a component-based interactive visualization structure for multi-dimensional, spatiotemporal ocean data is presented in this paper. Based on this structure, users can easily customize the system to visualize other types of scientific data.


Introduction
Ocean big data are complex and diverse. The integrated analysis of these multiple types of data can be used to describe various dynamic physical, chemical, biological and other processes that take place in the ocean. As a result of advances in various observation technologies, the era of big data has arrived, and extensive, continuous, multi-source, three-dimensional observations have led to the amount of ocean data reaching the Exabyte (EB) level (Guidi et al., 2020). In addition, the rapid development of computer simulation technology has led to enormous growth in various ocean data types that are highly accurate and exhibit dynamic changes in time and space. However, the highly heterogeneous nature of ocean data together with the huge data volume and strong spatiotemporal correlations that are present pose challenges for the automated mining and analysis of these data as well as for traditional visual representation methods. Interactive visual analysis can effectively integrate user knowledge with computer algorithms and allow users to explore, analyze and evaluate different types of ocean data in multiple ways. As a result, users gain insight into the inner workings of the data (Thomas & Cook, 2006), thus contributing to a better understanding of complex ocean dynamic processes (Andrienko & Andrienko, 2013). In addition, the construction of interactive visualization and analysis methods based on ocean big data and the development of oceanography-specific visualization and analysis models are conducive to the rapid identification of ocean phenomena and problem-solving.
Marine data analysis is typically performed using the following four main methods (Xie, Li, Wang, & Dong, 2019): 1) data representation and visual analysis of multielement marine environments (Liu, Chen, Yao, Tian, & Liu, 2017), 2) identification and tracking of ocean phenomena (Franz, Roscher, Milioto, Wenzel, & Kusche, 2018), 3) pattern or feature discovery (Sacha et al., 2017) and 4) uncertainty exploration and cognition (Kappe, Böttinger, & Leitte, 2019). Unfortunately, most types of visual analytics software or systems encounter difficulties when attempting to visualize the fusion of data from multiple sources because of the specific demands of different oceanographic domains.
Marine interactive analysis has become a powerful tool for analyzing and modeling marine data and has produced satisfactory results (Li, Jaroszynski, Pearse, Orf, & Clyne, 2019; Rautenhaus, Kern, Schäfler, & Westermann, 2015). Providing users with the most widely applicable analytical tool based on interaction and improving the efficiency of analysis are the goals behind the development of component-based visualization platforms. Therefore, in this paper, we develop the concept of an interactive analysis platform using a component-based approach; each analytical tool is embedded in the platform as a component. Each component can be executed independently or in any combination with other components, and different components can be used to solve different problems (https://casearthocean.qdio.ac.cn/oceanVisual_latest/).
In the remainder of this paper, we first describe the platform architecture and explain the function of each part. Following this, we introduce different methods of storing and processing ocean data. Next, based on the use of existing data sources, we describe the construction of ocean visualization functional components, the production of an interactive, multifaceted, multi-scale ocean visualization tool pool, and the development of visualization application services that are designed for use in marine environmental protection and marine scientific discovery. Finally, we describe the construction of applications of the platform based on two particular cases.
Figure 1 shows the basic structure of the visualization framework, which is divided into five layers: a data source layer, a data-processing layer, a visual data layer, a visual component layer and a visual application layer. Compared with traditional visualization frameworks, a visual data layer has been added between the front and back ends; this provides unified data services for the different visual components. In the visual data layer, the data are converted and encoded according to their type after cleaning, quality control and standardization in the data-processing layer. We constructed a RESTful API service interface based on open-source tools such as GeoServer, MinIO and Thredds. The RESTful API gives the component library of the visual component layer a single interface to the data; this allows rapid retrieval and application of data. Commonly used analysis methods are combined in an orderly manner in the visual component layer to form independent visual ocean components.

Data source layer
Currently, the visualization framework incorporates ocean observation data, satellite remote sensing data and the results of model calculations (Figure 2).

Ocean observation data
Ocean observation information from several main sources was integrated into our system. These sources included: 1) survey data from instruments and equipment on oceanographic research vessels; 2) buoy data including hydrological, meteorological and water quality data; and 3) current, temperature and salinity data acquired by submersibles. These data can be used to produce vertical profiles of a particular area.

Satellite remote sensing data
Satellite ocean remote sensing sensors include ocean color, infrared and microwave sensors, as well as altimeters, scatterometers, radiometers and synthetic aperture radar systems. The majority of the marine environmental monitoring data included in our system consisted of remote sensing data.

Model calculation data
To allow accurate prediction of marine disasters and identify oceanic phenomena, numerous complex, accurate and efficient numerical models of the ocean were integrated into our system. These included the Regional Ocean Modeling System (ROMS) (Klonaris et al., 2021) and the Hybrid Coordinate Ocean Model (Metzger et al., 2020). The rapid development of GPU technology has led to artificial intelligence technology being widely used in ocean studies. In our system, we integrated floodplain recognition data (Liu, Li, & Zheng, 2019), internal wave prediction data (Zhang, Li, & Zheng, 2021), temperature prediction data and pCO2 reconstruction data based on deep learning (Wang et al., 2021).

Data-processing layer
The collected data cannot be directly used for data analysis or information mining. Instead, they need to be cleaned, pre-processed, and stored in a unified space to provide raw data services for other applications.

Data cleaning
Data cleaning first involves the removal of noise from the original data using traditional range checking, null checking, error checking and ranking. The valid data are then complemented by integrating data from different sources or by using interpolation methods to fill in missing data. We also adopted various data quality control measures from other disciplines, such as the statistical and gradient tests used in the processing of hydrological and meteorological data. For example, we applied the total and correlation tests to marine geological survey data, whereas we applied heading, speed and gradient tests to geophysical data.
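The range check, gradient test and interpolation steps described above can be illustrated with a minimal sketch for a single vertical profile. The function names, the thresholds and the sample profile are assumptions made for illustration only; they are not taken from the system described in this paper, which applies discipline-specific tests.

```python
from typing import List, Optional

Profile = List[Optional[float]]


def range_check(values: Profile, lower: float, upper: float) -> Profile:
    """Null and range check: keep a value only if it is present and within [lower, upper]."""
    return [v if v is not None and lower <= v <= upper else None for v in values]


def gradient_check(values: Profile, depths: List[float], max_gradient: float) -> Profile:
    """Gradient test: drop a value whose change per metre, relative to the
    previous valid sample, exceeds max_gradient."""
    cleaned = list(values)
    prev = None  # index of the last sample that passed the test
    for i, v in enumerate(cleaned):
        if v is None:
            continue
        if prev is not None:
            dz = depths[i] - depths[prev]
            if dz != 0 and abs(v - cleaned[prev]) / abs(dz) > max_gradient:
                cleaned[i] = None
                continue
        prev = i
    return cleaned


def fill_linear(values: Profile, depths: List[float]) -> Profile:
    """Fill interior gaps by linear interpolation in depth.
    Leading and trailing gaps are left as None."""
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            w = (depths[i] - depths[left]) / (depths[right] - depths[left])
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


# Hypothetical temperature profile (degrees Celsius) with one spike and one missing value.
depths = [0.0, 10.0, 20.0, 30.0, 40.0]
temps = [20.0, 19.0, 99.0, 17.0, None]
checked = range_check(temps, lower=-2.0, upper=40.0)
checked = gradient_check(checked, depths, max_gradient=0.5)
print(fill_linear(checked, depths))
```

In this sketch the spike at 20 m is rejected by the range check and then filled from its neighbours, whereas the missing value at the bottom of the profile is left empty because it has no valid sample below it.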

Data conversion
Ocean data are derived from various measurement devices and computer simulations, typically with differences in structure and resolution between the types of data. Therefore, it is necessary to convert data into a uniform format to establish metadata indexing services based on multiple data types and convert physical data to computer-friendly data types before services for different applications can be provided. For example, we converted data acquired by temperature-depth instruments and ship-based acoustic Doppler current profilers into CSV format files. Similarly, for the output, we used NetCDF files as the unified format.

Data management
After the data conversion was completed, the data were divided into structured and unstructured types according to their characteristics. Structured data, such as data from ship-based surveys, buoys and submersible markers, and other real-time or delayed fixed-point observation data, exhibit a high degree of correlation between voyages, stations and survey times. Therefore, their structures and formats are already determined. Metadata can be created automatically based on these data, and such data are typically stored in relational databases. However, in the case of shipboard experimental data, owing to the uncertainty in the survey results, the data are generally stored in NoSQL databases to facilitate dynamic expansion (Han, Le, & Du, 2011). Unstructured data such as pictures, audio and video, satellite remote sensing data and model results are indexed by establishing a distributed file storage system.

Visual data service layer
Before performing an analysis of ocean data, it is essential to first understand the data. We therefore added a visual data layer, which provides a unified data interface for the visual application layer, to the traditional data-management process. The addition of this layer means that users will not notice any effects due to the heterogeneity of different types of data when working with very large volumes of data.
Because of the multi-dimensional characteristics of ocean data and the strong spatiotemporal dynamics associated with them, we selected the Cesium map engine as the spatial display platform. This engine supports 2D, 2.5D, and 3D map extensions and time-based dynamic displays (Fu, Luan, Cai, Li, & Zhao, 2020). However, the data sources that this map engine can identify are limited. In order to make the system compatible with more types of data, we added a visualization data service layer between the visual component layer and the data-processing layer. This layer provides different types of data as a unified data interface service, and the visualization layer builds applications through the data interface so as to achieve a smooth transition from raw data to visual data.
Given the requirement for the application service to provide unified spatial data for the visualization application layer, the visualization data service layer provides three services (multi-source data processing and conversion, massive image data services and distributed object storage). It supports the basic data types of point, line, section and volume, which change dynamically over time (Figure 3).

Multi-source data processing and conversion
The format of the data in the data management platform satisfies the data retrieval and download requirements; however, to achieve fast application of the data, we unified the format of the multi-source ocean data used for this service and standardized the structural, configuration and user interface data to the JSON format. For raster data, we unified the data format using coordinate system correction and layer rendering. For vector data, automated conversion to the GeoJSON format was applied; unified data conversion was also applied to the standard ocean model results. Each tool service completes the corresponding transformation using different encoding request formats, sends messages to different data service terminals according to the application requirements and renders the data in the map engine through the data control terminals in the application layer.
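As an illustration of the vector-data conversion described above, the following minimal sketch turns a list of station records into a GeoJSON FeatureCollection. The record fields (`station`, `lon`, `lat`, `sst`) and the function name are hypothetical; only the GeoJSON structure itself (Feature, Point geometry, coordinates ordered as longitude then latitude) follows the GeoJSON standard.

```python
import json


def stations_to_geojson(records):
    """Convert point observations to a GeoJSON FeatureCollection.

    Each record is a dict with 'lon' and 'lat' keys; every other key is
    copied into the feature's properties.
    """
    features = []
    for rec in records:
        properties = {k: v for k, v in rec.items() if k not in ("lon", "lat")}
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [rec["lon"], rec["lat"]]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical buoy record used only to show the output shape.
records = [{"station": "B01", "lon": 120.5, "lat": 36.0, "sst": 18.2}]
print(json.dumps(stations_to_geojson(records), indent=2))
```

Because the result is plain JSON, it can be served through the same interface as the structural and configuration data and loaded by any map engine that accepts GeoJSON.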

Massive image data services
We adopted the multi-resolution image pyramid hierarchy model (Figure 4) to achieve real-time display of massive image data and fast access to multi-scale information. This model collects series of images with fine to coarse resolutions generated according to specific rules. The image pyramid technique creates several image layers with different resolutions using image resampling methods. Each layer is split and stored by a corresponding spatial indexing mechanism, which improves the display speed when zooming through the images. As shown in Figure 4, the bottom of the image pyramid has the highest image resolution (256 × 256 pixels), and the resolution becomes lower toward the top of the pyramid. The resolution of the middle layer is 128 × 128 pixels; the top has the lowest resolution (64 × 64 pixels). Thus, the image pyramid consists of three layers corresponding to three levels of resolution, allowing image data to be explored from coarse to fine resolution and from whole images to the local scale.
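The three-level pyramid described above can be sketched as follows. This is only an illustration of the resampling idea: it uses 2 × 2 block averaging as an assumed resampling method and plain Python lists as the image, and it does not reproduce the tiling or spatial indexing performed by the actual service.

```python
def downsample(image):
    """Halve the resolution by averaging each 2 x 2 block of pixels."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def build_pyramid(image, levels=3):
    """Return a list of images from finest (index 0) to coarsest resolution."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


# Synthetic 256 x 256 single-band image used as the pyramid base.
base = [[float(r + c) for c in range(256)] for r in range(256)]
pyramid = build_pyramid(base, levels=3)
print([(len(level), len(level[0])) for level in pyramid])
```

A viewer would request the coarsest level first and switch to finer levels as the user zooms in, which is why only a small part of the full-resolution data needs to be transferred at any one time.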
We used the open-source GeoServer (GeoServer, 2021) to slice the original data and provide them as a service; appropriate professional parameter settings were used for this. In our system, the automated display of image data is implemented using the GeoServer API. Users can upload TIFF files in the background to obtain the pyramid service link for the image, thus ensuring that the data can be directly accessed when the visual component is used to build the application.

Distributed object storage
The system was designed so that the different files (including vector data, raster data, pictures, videos and gene data) are organized and coded in a uniform way based on the user's usage habits. Table 1 lists the encoding methods used for several common data types. The encoded data are stored using the MinIO (MinIO, n.d.) object storage service. Data service links are created according to the coding rules to provide a unified RESTful interface service (Pautasso, 2014) for the visualization component layer.
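To make the idea of rule-based encoding and link generation concrete, the sketch below builds an object key from a data category, dataset name and date, and then derives a service link from it. The key layout, the extension table and the base URL are all hypothetical and stand in for the rules listed in Table 1, which are not reproduced here; no MinIO call is made.

```python
from datetime import date
from urllib.parse import quote

# Hypothetical mapping from data category to file extension.
EXTENSIONS = {"vector": "geojson", "raster": "tif", "picture": "png", "video": "mp4"}


def object_key(category, dataset, day, name):
    """Encode a file as '<category>/<dataset>/<YYYYMMDD>/<name>.<ext>'."""
    if category not in EXTENSIONS:
        raise ValueError(f"unknown data category: {category}")
    return f"{category}/{dataset}/{day:%Y%m%d}/{name}.{EXTENSIONS[category]}"


def service_link(base_url, bucket, key):
    """Build the link under which a component could request the object."""
    # quote() percent-encodes unsafe characters but keeps '/' separators.
    return f"{base_url.rstrip('/')}/{quote(bucket)}/{quote(key)}"


key = object_key("raster", "sst", date(2021, 4, 1), "daily mean")
print(service_link("https://example.org/api/", "ocean", key))
```

Because the link is derived from the key alone, a component only needs to know the coding rule, not where or how the object is physically stored.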

Visual component layer
Oceanographic problems typically involve multidisciplinary analysis of data. Thus, different approaches are needed to deal with the relevant issues. When designing the visualization system, we therefore considered oceanographic problems to be made up of separate problems that needed to be analyzed and designed appropriate visualization components for each problem. We established a component analysis library for marine characteristics based on ecosystem responses to provide rapid analysis and decision making. Figure 5 shows the basic arrangement of the interactive component and the display component.

Moving particle component
We used the moving particle method for vector data such as wind, ocean current and wave data (Figure 6). Compared with the traditional arrow symbols, the moving particle method can intuitively express the laws of motion that apply to wind, currents and waves. The process of building the dynamic particles on the Cesium engine can be divided into five steps.
(1) Based on Canvas technology (Fulton & Fulton, 2013), a high-resolution grid is constructed according to the boundaries of the study area.
(2) Randomly moving particles are drawn in each grid cell.
(3) For each particle, the direction of movement is determined according to the horizontal and vertical components of the wind, ocean currents and waves in the grid cell.
(4) Some particles are reset to their original random positions. This ensures that the areas through which the moving particles pass never become completely empty.
(5) The continuous motion of particles is simulated by the fading in and fading out of different particles.
In addition, corresponding interactive components are constructed that can be used to modify the relative speed, color and density of the moving particles. After the user initiates an application request using the browser, the interactive components communicate with the background through the control terminal and transmit the information to the moving particle component. After the data have been loaded, the interactive component can directly perform interactive operations on the style and movement mode of the moving particle component.
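The five steps of the moving particle method can be sketched independently of the rendering engine. The class below is a simplified illustration written in Python rather than in the browser code of the system; the class name, the parameters (`max_age`, `reset_prob`) and the triangular fade function are assumptions, and drawing is left to whatever canvas consumes the particle positions and opacities.

```python
import random


class ParticleField:
    """Particles advected through a gridded vector field (u, v given per grid cell)."""

    def __init__(self, u, v, n_particles, max_age=50, reset_prob=0.02, seed=0):
        self.u, self.v = u, v                      # step 1: grid over the study area
        self.rows, self.cols = len(u), len(u[0])
        self.max_age = max_age
        self.reset_prob = reset_prob
        self.rng = random.Random(seed)
        self.particles = [self._spawn() for _ in range(n_particles)]

    def _spawn(self):
        # Step 2: place a particle at a random position and remember its origin.
        x = self.rng.uniform(0, self.cols)
        y = self.rng.uniform(0, self.rows)
        return {"x": x, "y": y, "x0": x, "y0": y, "age": 0}

    def step(self, dt=1.0):
        for p in self.particles:
            col = min(int(p["x"]), self.cols - 1)
            row = min(int(p["y"]), self.rows - 1)
            # Step 3: move along the horizontal and vertical components of the local cell.
            p["x"] += self.u[row][col] * dt
            p["y"] += self.v[row][col] * dt
            p["age"] += 1
            outside = not (0 <= p["x"] < self.cols and 0 <= p["y"] < self.rows)
            # Step 4: send some particles back to their original random positions.
            if outside or p["age"] > self.max_age or self.rng.random() < self.reset_prob:
                p["x"], p["y"], p["age"] = p["x0"], p["y0"], 0

    def opacity(self, p):
        # Step 5: fade in during the first half of a particle's life, fade out in the second.
        half = self.max_age / 2.0
        return max(0.0, 1.0 - abs(p["age"] - half) / half)


# Uniform eastward flow on a 4 x 4 grid, used only to exercise the sketch.
u = [[1.0] * 4 for _ in range(4)]
v = [[0.0] * 4 for _ in range(4)]
field = ParticleField(u, v, n_particles=20)
for _ in range(100):
    field.step()
print(len(field.particles))
```

In this sketch the interactive controls mentioned above map onto plain parameters: `dt` scales the relative speed, `n_particles` sets the density, and the opacity value can be combined with any color chosen by the user.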

Visual application layer
The United Nations has announced outlines to guide and support the development of marine science over the next decade (2021-2030). In addition, the ocean is a significant theme of the United Nations Sustainable Development Goals (SDGs). Achieving unified data management and building a component-based visualization system can help with the understanding of scientific issues such as climate prediction, biogeochemistry and ocean-atmosphere physical coupling using various analytical tools. As part of our system, we developed reusable components that can be applied to individual scientific problems. This will allow the rapid construction of visual analysis tools that can support the realization of the SDGs. Using the system that was developed, we built an early warning and forecasting system for coastal dikes in Fujian as well as an integrated system for marine ranching. These systems can aid disaster prevention and mitigation and the sustainable management of fisheries.

Early warning and forecasting system for Fujian coastal dikes
China is one of the countries most seriously affected by marine disasters. With the rapid development of the marine economy, the risk of maritime disasters in coastal areas has become increasingly prominent, posing a severe challenge to marine disaster prevention and mitigation. According to the China Marine Disaster Bulletin released in April 2021, storm surges are the most destructive type of marine disaster, causing 97% of the total economic losses directly attributable to marine disasters. Therefore, it is crucial to strengthen research on typhoon storm surges and improve forecasting accuracy to aid disaster prevention and mitigation. Figure 7 shows the operating process of our early warning and forecasting system for coastal dikes in Fujian province. First, a check is made to see whether the typhoon center location exceeds the 48-hour warning line (the yellow dashed line in Figure 8). Once the storm-surge and wave-forecasting models have been activated to generate forecast data, the warning information and typhoon information are directly stored in the MongoDB database and published as JSON-format data by the visual data layer. After being processed by the data-processing layer, the data are stored in the object storage system to provide services via the RESTful application programming interface, and the interface services can be called on demand using appropriate components to create a decision-support system. In this forecasting system, we used coupled storm-surge and wave models with unstructured grids; these models have high nearshore resolution and improve the portrayal of complex topography and physical mechanisms (Feng, Li, Yin, Yang, & Yang, 2018a). In addition, the joint operations from data calculation to storage and display are fully automated. Decision support is supplied by the component-based visualization platform. We thus established an improved database for the collaborative analysis of disaster big data.
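The first step of this workflow, deciding whether a typhoon center has crossed the warning line, can be sketched as a simple geometric test. Everything specific in the example is an assumption: the line vertices are placeholders rather than the official 48-hour warning line, the coast is assumed to lie west of the line, and the field names of the JSON message are illustrative.

```python
import json

# Hypothetical warning line as (longitude, latitude) vertices ordered south to north.
# These coordinates are placeholders, not the official 48-hour warning line.
WARNING_LINE_48H = [(125.0, 15.0), (127.0, 25.0), (127.0, 34.0)]


def line_lon_at(lat, line):
    """Longitude of the line at a given latitude (linear interpolation).
    Returns None when the latitude is outside the span of the line."""
    for (lon1, lat1), (lon2, lat2) in zip(line, line[1:]):
        if lat1 != lat2 and min(lat1, lat2) <= lat <= max(lat1, lat2):
            t = (lat - lat1) / (lat2 - lat1)
            return lon1 + t * (lon2 - lon1)
    return None


def has_crossed(center, line):
    """True when the typhoon center lies on or west of the line (coast assumed to the west)."""
    lon, lat = center
    boundary = line_lon_at(lat, line)
    return boundary is not None and lon <= boundary


def warning_message(name, center, line):
    """Build a JSON message that a data service could publish to the components."""
    lon, lat = center
    return json.dumps({
        "typhoon": name,
        "lon": lon,
        "lat": lat,
        "run_models": has_crossed(center, line),
    })


print(warning_message("demo", (124.0, 20.0), WARNING_LINE_48H))
```

A scheduler could evaluate this test each time a new typhoon position is received and start the storm-surge and wave models only when `run_models` is true.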
In terms of data presentation, the temporal and spatial fields of the model data are interlinked through the integration of multiple visualization components (e.g. moving particles, charts, rotating layers and GeoJSON components). The result is that these components cooperate to visually display the path, size and intensity of a typhoon. Moreover, the rise in water level, the distribution of the wind field and the distribution of the wave field are integrated to predict the development of the storm surge and to provide warnings for the dikes, thus providing decision support for government agencies and related users (Figure 8).

Integrated monitoring system for marine ranches
The marine ranching environment represents the primary conditions required for the survival and development of marine organisms. Real-time environmental monitoring of marine ranches can provide information that can be applied to the environmental protection and sustainable development of marine ranches. Figure 9 shows the operating process of our marine ranch information system. In this system, we integrated information such as ship data, meteorological data, water quality data, hydrological data, and video using chart components to achieve real-time monitoring of marine ranches and ensure the stable operation of these ranches (Figure 10). Monitoring data were used to construct a biological carrying capacity assessment model (Feng et al., 2018b). We then used visual components and model results to build applications of this model and made predictions of biomass changes for major resource organisms using these applications. Based on the results, we then developed harvesting strategies and guidance that could be applied to the healthy development of marine ranch ecosystems (Figure 11).

Discussion
The applications described above demonstrate how building component-based visualization applications can quickly facilitate multiple combinations of visualization methods and interactions between different types of data. This technology can effectively be used to display multi-variable, multi-dimensional ocean datasets in a browser. The complex and time-consuming tasks of data processing and conversion, as well as other tasks such as data organization, image data service publishing and grid data service publishing, are performed in advance at the back end. The front end needs only to combine the different components according to the application requirements and access the data. The development of interactive visualization technology is difficult and requires a significant amount of programming; in addition, many factors must be considered when displaying large-capacity, high-dimensional data in a network environment. In contrast, our system is based on the Cesium and React frameworks, and the basic ocean visualization application library contains many visualization components and provides users with development interfaces. Users can create visualization functions and development specifications according to their requirements and add these to the component library to achieve compatibility with and the visualization of different types of data. Users can also choose different components for publishing and constructing their application system.
Figure 9. The operating process of the marine ranch information system. Monitoring data are stored in the MongoDB database after quality control and data conversion, and model data are directly stored in the database. The panel layout components, chart components, GeoJSON components, and streaming components obtain data and then combine these to form a marine ranch information system.
Compared with ODV and Vapor, our system has the following advantages. 1) A browser-server hierarchical architecture is adopted; thus, service providers host the software applications, and users can access them through a browser without installing any software or plug-ins. 2) The software is not specific to certain types of data. For example, ODV's primary function is the visual display of voyage survey data, and it cannot integrate marine data from different data sources. In contrast, our system provides online data-publishing functions and visual component combination functions. In this way, the fusion of data from multiple sources can be achieved. 3) Users are provided with a programming interface in the visualization layer. The user uploads the data through the background to obtain the data link. The user then selects the component function according to the application and inputs the data link and parameters (time range, spatial range, frequency, etc.) according to the function requirements; finally, the user combines multiple components to form an application service. In addition, the proposed visualization framework can easily be extended and applied to other ocean visualization data, thus reducing development costs.
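The workflow in point 3 (obtain a data link, choose a component, supply parameters, combine components) can be illustrated with a small configuration sketch. The class names, component kinds, parameter names and links below are hypothetical and do not describe the programming interface of the system; the sketch only shows how an application can be expressed as a list of components, each bound to a data link and a parameter set.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Component:
    kind: str                  # e.g. a moving particle, chart or GeoJSON component
    data_link: str             # link returned after the data are uploaded
    params: dict = field(default_factory=dict)


@dataclass
class Application:
    title: str
    components: list = field(default_factory=list)

    def add(self, component):
        """Append a component and return self so that calls can be chained."""
        self.components.append(component)
        return self

    def to_json(self):
        """Serialize the whole application, including nested components."""
        return json.dumps(asdict(self), indent=2)


# Hypothetical links and parameters used only to show the shape of a configuration.
app = (
    Application("Storm surge demo")
    .add(Component("moving_particle", "https://example.org/data/wind.json",
                   {"time_range": ["2021-07-20", "2021-07-27"], "frequency": "1h"}))
    .add(Component("chart", "https://example.org/data/water_level.json",
                   {"spatial_range": [117.0, 23.5, 121.0, 27.5]}))
)
print(app.to_json())
```

Keeping the configuration declarative in this way is one possible design choice: the same description could be produced by hand, by a script, or by a graphical editor.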
Despite these achievements, our visualization framework has certain limitations. A unified data service is used to improve data compatibility; as a consequence, results cannot be presented quickly in time-sensitive applications. For example, the triangular grid data calculated by the storm surge model need to be converted before they can be used by the visualization layer. However, a typhoon forecast is time-sensitive; this means that it is necessary to further improve the data compatibility of the visualization layer to achieve faster display. Another problem is that in order to be able to display greater amounts of marine data, a large number of users is needed to participate in the development of the system. This will include the development of component libraries and the visualization of marine data. In addition, to be able to combine modules, our system currently requires users to be able to program.
Figure 11. The use of a marine components library to build a marine ranch decision-support information system: (a) Chart component.
Our next step will be to address these limitations. First, we intend to include more applications such as the recognition of Enteromorpha, sea ice and eddies based on artificial intelligence, El Niño Southern Oscillation forecasting and global ocean temperature changes. Second, we will continue to expand the ocean component library and plan to develop an interactive front-end system for combining components. Users will be able to create custom visual application displays by using drag-and-drop operations. Finally, we will build an interactive visual analysis platform for application to oceanography.

Conclusions
Because of the complexity of ocean data and of the ocean itself, a single tool cannot meet all the requirements of ocean data analysis. Current visualization and analysis systems face challenges in dealing with high volumes of data from multiple sources and with the strong spatiotemporal correlation within ocean datasets. These challenges include the display and storage of multi-resolution big data, the scalability of display systems and the fast visualization of ocean applications. In this study, we first built a unified data resource service system for multi-source heterogeneous ocean data. Second, we set up a task-oriented system to develop ocean visualization components for different applications in response to the characteristics of ocean data and realized a modular combination of component libraries. Third, we adopted various interactive analysis schemes that could be applied to different ocean problems. These schemes allow component reuse and enhance the ability to solve ocean problems by providing decision support and will thus contribute to the realization of the United Nations SDGs.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the Key R&D project of Shandong Province (2019JZZY010102), the Big Earth Data Science Engineering Project (XDA19060104), the 13th Five-year Informatization Plan of the Chinese Academy of Sciences, the Construction of Scientific Data Center System (XXH-13514).