Application of deep neural network learning in composites design

Abstract A timely review is presented on artificial intelligence (AI) and, more specifically, deep learning, a subfield of machine learning (ML), applied to the design and behaviour of modern composite materials systems. The use of composites is increasing due to their high specific strength and stiffness, which make them competitive with metals, and their tunable properties, which can be tailored to produce lightweight materials with efficient structural configurations. Recent studies are examined and discussed, wherein computational tools have been developed that mimic human brain activity to answer questions and solve challenging problems toward characterizing materials behaviour and improving the performance of materials with less effort and cost. The attractiveness of AI comes from its self-learning capability, the faster computer processing of large datasets, and the potential to yield highly accurate results. However, as deep learning is a data-driven method, the quantity and quality of the data largely determine the accuracy of ML, alongside the need for well-designed AI algorithms and virtual reality models; hence, research efforts in this area must continue.


Introduction
Materials engineering has progressed in every field of activity, including metallurgy, inorganic materials, and polymer science and engineering. Recently, new alloys as lightweight innovative structural materials (Ogawa, Ando, Sutou, & Koike, 2016) and phase change materials as low-energy non-volatile memories (Hatayama et al., 2018; Mori, Hatayama, Shuang, Ando, & Sutou, 2020) have been developed, and much progress has been made in technology for the design and evaluation of functional composite materials (Wang, Yeo, Su, Wang, & Abdalla, 2020; Wang, Yeo, et al., 2020) and in technology to detect the cracking or damage of composite materials systems (Léonard, Stein, Soutis, & Withers, 2017; Takeda & Narita, 2017). Until recently, the development and characterization of materials have mostly relied on the knowledge and experience of researchers rather than computing power.
Originally, computers were used by the United States (U.S.) military, as the mathematical equations that needed to be solved were becoming more complex and would thus have required a costly labour force to solve. Since then, computers have pervaded many aspects of daily life, with microprocessors having been embedded in many household appliances and the use of personal computers becoming more commonplace. It can now be said that the lives of humankind are inextricably linked to computers. As computing speed and storage capacity improved, scientists began to use computers more extensively in their investigations. However, the use of computers in research is not limited to solving linear mathematical calculations, as many methods have also been proposed to deal with nonlinear problems. To this aim, artificial intelligence (AI) is a useful computational tool for such nonlinear tasks, as it mimics human brain activity to answer questions and solve challenging problems.
Traditionally, to optimize performance, materials research has involved a great deal of time-consuming and labour-intensive experimentation. Experimental research is often aided by analytical and numerical modelling to gain a better understanding of the measured data and the factors affecting behaviour. In line with the advances made in technology and computing power, more accurate methods and equipment have been developed to facilitate research investigations. In particular, more accurate information related to materials nano/microstructures can be obtained using measurement and observation techniques, such as optical and scanning electron microscopy (SEM), differential scanning calorimetry, Fourier-transform infrared spectroscopy, and X-ray computed tomography (CT). For this reason, the scale of research data is huge, and the data are not confined to ordinary numerical values but rather involve multifarious data types. Analyzing such data and drawing conclusions from them would require many professional operators, and it would be a waste of time and resources if the data were not adequately analyzed and utilized.
Although there has not been widespread adoption of AI and machine learning (ML), it appears that the time is ripe for this to happen due to powerful computers now being readily available, in addition to the big data and internet of things revolution. Furthermore, deep learning has recently attracted much interest due to its faster processing time, self-learning capability, and the potential to yield highly accurate results (Amanullah et al., 2020). The distribution of publications about deep learning in each subject area was investigated and is shown in Figure 1(a). Apart from computer science itself, the applications of deep learning are mostly associated with informative subjects where a large volume of data needs to be processed, such as mathematics, physics and astronomy, and medicine. Benefiting from many successful applications, the barrier to employing ML in materials science is lower than ever, as resources and tools for machine learning become more abundant and easier to access. As materials informatics has evolved from a niche area of research into an established discipline, distinct frontiers have come into focus, and best practices for applying ML to materials science are emerging (Riley, 2019). In the field of materials science, composite materials are extensively used as their performance is superior in many aspects as a result of the thoughtful design and tailoring of their properties. However, from design to production many factors and parameters need to be considered, and thus countless attempts at their design and preparation are required to succeed. Consequently, a staggering amount of information on the design of composites has been generated and collected in various databases over the years.
This information makes it possible to significantly reduce the time and cost associated with the design, development, and evaluation of composite materials, by utilizing computer-calculated materials databases and ML to simulate the properties of the materials. Figure 1(b) illustrates the number of publications related to composite materials published per year, which provide valuable data for taking advantage of deep-learning techniques.
The application of deep learning in the optimization and prediction of the properties of composite materials systems is graphically shown in Figure 2, with this research article focusing on the training dataset, algorithm, and output data, which are the three basic operational steps of deep learning involved in analyzing the properties of materials. In terms of the training dataset, researchers are expected to observe, understand, and collect data, becoming aware of the usage potential of the data generated during the experimental process. After becoming familiar with the entire application process, researchers may realize that well-informed AI algorithms can solve design problems more effectively than traditional methods. A brief introduction to deep learning, including its development and some fundamental knowledge, is presented, followed by a review of several applications in composite materials science; the prediction of properties, data processing, and composite materials design (topology optimization) are discussed. Finally, the progress made and the limitations that ML faces in the development of composite materials, and factors related to this, are presented, with some suggestions made for future research.

Development of deep learning
Deep learning has emerged from research on AI and ML, Figure 3 (Kelleher, 2019). There has long been fascination as to when programmable computers could become as intelligent as a human brain, process information, and deal with complex problems (Goodfellow, Bengio, & Courville, 2016). As the human brain is capable of dealing with vast amounts of data and solving problems based on long-term knowledge, a computer likewise needs to capture valid knowledge from an enormous amount of informal information. To simulate human decision-making and reasoning processes, AI aims to program intelligence into machines by prompting them to learn from experience and adapt to changes in their environment (Muthukrishnan et al., 2020). In the early days of AI, some problems in relatively simple environments were rapidly solved using a list of formal, mathematical rules, because the computer did not require much knowledge about the wider world. For example, the Deep Blue chess-playing system was the first computer system to outsmart a human, defeating a reigning world champion in a chess match under standard tournament time controls (Saletan, 2007). However, chess is a relatively simple environment, containing only 64 locations and 32 pieces that can move in specific ways. The movement of the game pieces can thus be governed by a set of formal rules after an appropriate strategy has been devised.
However, many problems in the real world are too complex to be described by such simple rules, meaning that AI systems must learn from raw data patterns. This capability is known as ML, which is a subset of AI, where the ML algorithms are designed to optimize the performance of a certain task using examples and/or experience (Alpaydin, 2014). Training and inference are the two steps of ML. An algorithm processes a dataset during training and selects the function that best matches the data patterns, with the model fixed once training is completed. The model is then used to infer new values from new examples in the inference stage. The concept of a function as a deterministic mapping from inputs to outputs is introduced, and the goal of ML is to find a function that matches the mappings from input features to the output features that are observed in the examples in the dataset. Several factors make ML tasks difficult to learn, even with the assistance of a computer. ML is an ill-posed problem when the set of possible functions exceeds the set of examples in the dataset: the information given in the problem is insufficient to find a single best solution; instead, multiple possible solutions will match the data.
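The training and inference stages described above can be illustrated with a deliberately simple sketch: "training" selects, from a family of candidate functions, the one that best matches the patterns in a dataset, and "inference" applies the now-fixed function to new inputs. The data here are synthetic (an assumed linear rule plus noise), chosen purely for illustration.

```python
import numpy as np

# Synthetic dataset: inputs and outputs following an assumed rule y = 3x + 1,
# corrupted by a small amount of noise.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=50)
y_train = 3.0 * x_train + 1.0 + rng.normal(0, 0.1, size=50)

# Training: search the family of functions y = a*x + b for the member that
# best matches the observed input-to-output mappings (least squares).
a, b = np.polyfit(x_train, y_train, deg=1)

# Inference: the model is fixed after training and maps new examples to outputs.
def predict(x):
    return a * x + b
```

Even this toy case hints at why ML can be ill-posed: with too few examples (or too rich a function family), many different functions would match the data equally well.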
Deep learning was introduced as a subfield of ML that focuses on the design and evaluation of modern neural network training algorithms and model architectures. Figure 4 presents the evolution of deep learning. In its early years it was not "deep," since it just used a simple activation function to define the output of a node given an input or set of inputs. It had only a single layer (a layer is a structure or network topology in the architecture of the model), which took information from previous layers and then passed information to the next layer. After the "Exclusive Or" (XOR) problem was raised by Minsky and Papert in 1969, it was realized that many problems could not be solved using a single linear function. As a result, by connecting multiple layers of functions, connectionism was used to solve such nonlinear problems. As more function layers are required to connect and solve problems, owing to the rise of big data and increased problem complexity, models are becoming more sophisticated. The deep learning that all this has produced is thus capable of making accurate data-driven decisions by thinking with the "mind power" of a group of human experts.
ML algorithms are machine-like in the sense that they still require human intervention to fulfil their designed purpose. In traditional ML techniques, most features need to be extracted by a domain expert to reduce the complexity of the data and make patterns more visible for the algorithms to work on. The greatest advantage of deep-learning algorithms is that they can learn categories incrementally through their hidden-layer architecture, progressing from low-level to higher-level categories. They can also automatically extract features and classify datasets for further processing in practice, eliminating the need for domain expertise and hard-core feature extraction. Another advantage of deep learning, and a key factor in understanding why it is becoming popular, is that it is powered by massive amounts of data. As the amount of data collected grows, the performance of older learning algorithms improves and then plateaus at a certain point, whereas the performance of deep learning continues to improve in line with data growth. Correspondingly, in contrast to traditional ML, deep learning requires a high-performance processor and sufficient time to train models due to the large number of parameters involved. Deep-learning algorithms, such as the popular residual network algorithm, can take around two weeks to train from scratch, whereas traditional ML algorithms take just a few seconds to a few hours to train. Surprisingly, this scenario is reversed in the testing phase, where the deep-learning algorithm takes much less time to run.

Fundamentals of deep learning
Deep learning refers to a class of neural network models that contain multiple layers of simple information-processing programs, referred to as neurons. Figure 5 illustrates the structure of a simple neural network, where the squares in the input layer represent locations in memory that are used to present inputs to the network and are excluded when the depth of the deep-learning model is taken into consideration. The information-processing neurons in the network are represented by circles in the diagram. Each of these neurons takes a set of numeric values as input and maps them to a single output value. The arrows in Figure 5 indicate how information flows through the network from the output of one neuron to the input of another neuron. Each connection in a network connects two neurons and each connection is directed, which means that information flows only in one direction. Each connection in a network carries a weight that influences how a neuron processes the data it receives. Searching for the best set of weights is the essence of training an artificial neural network.
Deep-learning algorithms can be used in both supervised and unsupervised types of ML. In supervised learning, each example in the dataset is labelled with the expected output (or target) value. The most common type of supervised ML is when the algorithm uses these target values in the dataset to aid the learning process. However, the target feature values are sometimes difficult and expensive to collect. In unsupervised learning, as there is no target value in the database, the algorithm cannot compare the fitness of a candidate function to the target values of the dataset. As a result, most data clustering is completed using unsupervised ML. Other variants of the learning paradigm are possible. For instance, in semi-supervised learning, some examples include a supervision target but others do not. In multi-instance learning, an entire collection of examples is labelled as containing or not containing an example of a class, but the individual members of the collection are not labelled. As the dataset will be inserted into a neural network, there is a need to understand how an artificial neuron processes information. Figure 6 illustrates the structure of an artificial neuron that receives n inputs [x₁, x₂, …, xₙ] from n different input connections, where each connection has an associated weight [w₁, w₂, …, wₙ]. To map inputs to output, a neuron uses a two-stage process, the first of which involves calculating a weighted sum, z, of the inputs of the neuron, which can be written as:

z = w₁x₁ + w₂x₂ + ⋯ + wₙxₙ

The result of the weighted sum is then passed through a second function that maps the weighted sum score to the final output value of the neuron. When designing a neuron, various functions can be used, which can be as simple as an "add" function or may be more complex.
Typically, the output values of a neuron are known as its activation values, so this second function, which maps from the result of the weighted sum to the activation values of the neuron, is known as an activation function (φ in Figure 5). The z value, the result of the weighted sum, is passed through this activation function in the second stage of processing within a neuron. Some neurons in a neural network may use activation functions that are different from those used by other neurons. In brief, the calculation of the output activation of this neuron can be summarized as:

output = φ(z) = φ(w₁x₁ + w₂x₂ + ⋯ + wₙxₙ)

Deep-learning networks are composed of large numbers of simple processing units that work together to learn and implement complex mappings from large datasets. Once the output value is calculated, the error can be calculated by subtracting the estimated output from the correct output for the example listed in the dataset, as:

error = output_correct − output_estimated

Based on this error value, the weights of the model are then adjusted for the next calculation to find the optimal set of weights.
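The two-stage computation of a single neuron, and the error used to adjust its weights, can be sketched in a few lines. The input and weight values below are illustrative assumptions, and the logistic sigmoid is used as one common choice of activation function φ.

```python
import numpy as np

def sigmoid(z):
    # A common activation function: maps the weighted sum to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # inputs  [x1, x2, x3] (illustrative values)
w = np.array([0.4,  0.3, 0.1])   # weights [w1, w2, w3] (illustrative values)

z = np.dot(w, x)                 # stage 1: weighted sum z = w1*x1 + w2*x2 + w3*x3
activation = sigmoid(z)          # stage 2: activation function applied to z

target = 1.0                     # correct output listed in the dataset
error = target - activation      # error = output_correct - output_estimated
```

During training, this error value would drive the adjustment of the weights toward the optimal set.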

Commonly used neural networks
Over the years, researchers have come up with amazing improvements on the original idea of deep learning, with each new architecture targeted on a specific problem and being improved upon in terms of accuracy and speed. A range of neural networks will be briefly explained in this section for later analysis in terms of different applications.
An artificial neural network (ANN), often simply referred to as a neural network, is a computing system inspired by biological neural networks, built from multiple layers of the neurons introduced in the previous section. A shallow neural network is a network with only one or two hidden layers. Deep neural networks (DNNs), which are the basic architectures of deep learning, become deeper as the number of layers increases. Neural networks are classified as feedforward neural networks (FFNNs) or backpropagation neural networks (BPNNs), according to the direction of information transfer.
Fully connected FFNNs, also known as DNNs, are supervised neural networks that are suitable for most classification tasks. However, in reality, it is difficult to support a fully connected FFNN due to the huge demands required in its implementation. Comparatively, convolutional neural networks (CNNs) add convolution and pooling layers before fully connected layers, where only a handful of neurons connect with the next ones. In a way, CNNs attempt to regularize FFNNs to avoid overfitting, which makes them very good at identifying spatial relationships between data (Lecun, Bottou, Bengio, & Haffner, 1998). That is why their primary use is in computer vision and applications such as image classification, video recognition, medical image analysis, and self-driving cars.
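The convolution and pooling stages that CNNs place before their fully connected layers can be sketched directly in NumPy. The 6×6 "image" and the 3×3 vertical-edge-detecting kernel below are illustrative assumptions, not data from any cited study; the point is only to show how a small, shared kernel picks out a spatial feature and how pooling condenses the result.

```python
import numpy as np

def conv2d(image, kernel):
    # Valid (no-padding) 2-D convolution: slide the kernel over the image
    # and record the weighted sum at each position.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Max pooling: keep only the strongest response in each size x size block.
    oh, ow = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:oh*size, :ow*size].reshape(oh, size, ow, size).max(axis=(1, 3))

image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # a simple vertical edge
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])      # responds to left-to-right brightness increase

feature_map = conv2d(image, kernel)     # strong responses only along the edge
pooled = max_pool(feature_map)          # condensed 2x2 summary of the feature map
```

Because the same small kernel is reused everywhere, far fewer weights are needed than in a fully connected layer, which is the regularizing effect noted above.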
In the aforementioned FFNN, the outputs of previous layers are used as inputs of the next layers. Recurrent neural networks (RNNs) are FFNNs with feedback loops, where in each step neurons are fed information not just from the previous layer but also from themselves, allowing information to persist (Elman, 1990). In this way, the neurons can recall past data and use it to make predictions. Furthermore, the more computation that is conducted in RNNs, the higher the volume of information gathered; in addition, the model size does not increase in line with the input size. Hence, these networks are well suited to time-related data and are used in various fields, such as image captioning, time-series analysis, natural-language processing, handwriting recognition, and machine translation. To improve performance and avoid the vanishing gradient problem of RNNs, which causes information to be lost rapidly over time, more complex structures, involving long short-term memory (LSTM) units and gated recurrent units (GRUs), have been devised to remember more data. LSTM units have been used extensively in natural-language processing in tasks such as language translation, speech generation, and text-to-speech synthesis.
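The recurrence that lets an RNN "remember" can be reduced to a single scalar update: the new hidden state mixes the current input with the previous state. The weights and the toy sequence below are illustrative assumptions; the example also hints at the vanishing-information problem, since the memory of the early signal decays step by step.

```python
import numpy as np

def rnn_step(x_t, h_prev, w_x, w_h):
    # One recurrent step: the neuron is fed the current input x_t AND its own
    # previous state h_prev, so information can persist across time steps.
    return np.tanh(w_x * x_t + w_h * h_prev)

sequence = [1.0, 0.0, 0.0, 0.0]   # a single early signal, then silence
h = 0.0
states = []
for x_t in sequence:
    h = rnn_step(x_t, h, w_x=1.0, w_h=0.9)
    states.append(h)

# Even after the input drops to zero, the hidden state decays gradually rather
# than vanishing at once: the network retains a fading memory of the signal.
```

With a recurrent weight below 1 this memory shrinks every step, which is exactly the rapid loss of information that LSTM and GRU gating mechanisms were devised to counter.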
A generative adversarial network (GAN) is a type of network with two main components: a generator that generates fake data and a discriminator that learns from that data. During the training period, the generator becomes increasingly better at generating images, as its ultimate goal is to fool the discriminator. The discriminator, whose goal is not to be fooled, then gradually improves its ability to distinguish between fake and real images. As a result, the fake data of the generator become incredibly realistic, and these data can be used to complement expensive real data labelled by experts. On the other hand, GANs are challenging to train, because they require not only the training of two networks but also the balancing of their dynamics: if either the generator or the discriminator becomes too good compared to the other, the GAN will not converge (Goodfellow et al., 2014).
A Hopfield network (HN) is a network where every neuron is connected to every other neuron: a completely entangled plate of spaghetti, as all the nodes function in relation to all the others. Each node serves as input before training, is hidden during training, and serves as output afterwards (Hopfield, 1982). Boltzmann machines (BMs) are similar to HNs, except that some neurons are labelled as input neurons while others are left unlabelled (Hinton & Sejnowski, 1986). Restricted Boltzmann machines (RBMs) are remarkably similar to BMs, and therefore also similar to HNs, in that none of them have an output layer. RBMs, however, are easier to train due to additional constraints, such as neurons not being connected to other neurons within the same input or hidden group (Smolensky, 1986). RBMs are stochastic neural networks that can learn from a probability distribution over the data they have been fed, and they have found uses in regression, collaborative filtering, feature learning, and even many-body quantum mechanics. RBMs are still used occasionally in deep learning, but GANs and variational autoencoders have largely replaced them.
An unsupervised ANN referred to as an autoencoder (AE) learns how to compress and encode data efficiently. The input is first encoded and reduced to a smaller entity by the AE, and is finally reconstructed via decoding. A feature of AEs is that the entire network always resembles an hourglass shape, with the hidden layers being smaller than the input and output layers (Bourlard & Kamp, 1988). In this way, the middle of the network can be used to extract a representation of the input with fewer dimensions. As such, AEs are used for dimensionality reduction, in pharmaceutical discovery, popularity prediction, and image processing.
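The hourglass idea can be demonstrated with the smallest possible case: a linear 2 → 1 → 2 autoencoder trained by plain gradient descent on synthetic correlated data. All sizes, rates, and the data-generating rule are illustrative assumptions; the point is that the narrow middle layer is forced to find a compressed representation from which the input can be reconstructed, with no target labels involved.

```python
import numpy as np

# Synthetic 2-D data lying (almost) on a line: compressible to one dimension.
rng = np.random.default_rng(1)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2.0 * t]) + 0.01 * rng.normal(size=(200, 2))

W_enc = rng.normal(scale=0.1, size=(2, 1))   # encoder: input -> 1-D code
W_dec = rng.normal(scale=0.1, size=(1, 2))   # decoder: code -> reconstruction

def loss(X, W_enc, W_dec):
    # Mean squared reconstruction error over all entries.
    X_hat = X @ W_enc @ W_dec
    return np.mean((X - X_hat) ** 2)

initial_loss = loss(X, W_enc, W_dec)
lr = 0.05
for _ in range(500):
    code = X @ W_enc                          # compress through the middle layer
    X_hat = code @ W_dec                      # reconstruct via decoding
    err = X_hat - X
    grad_dec = code.T @ err / len(X)          # gradient w.r.t. decoder weights
    grad_enc = X.T @ (err @ W_dec.T) / len(X) # gradient w.r.t. encoder weights
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final_loss = loss(X, W_enc, W_dec)
```

After training, the 1-D code captures the dominant direction of variation, so the reconstruction error falls close to the noise floor even though the representation is half the size of the input.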
Deep-learning and neural network algorithms are prone to overfitting because of their relative increase in complexity. Furthermore, increased model and algorithmic complexity can necessitate a significant amount of computational time and resources. Given all of this, proper care must be taken when leveraging AI algorithms to solve problems, including the selection, implementation, and performance assessment of the algorithms themselves. ANNs and the more complex deep-learning technique are among the most capable AI tools for solving very complex problems and for this reason will continue to be developed and leveraged in the future.

Applications of deep learning in composite materials science
Deep learning can be described as a method for simulating the human brain by creating a model that can learn from data, extract features, and map those features and results. Three aspects need to be considered when deep learning is applied to materials science research. To begin with, the outcomes that need to be achieved must be identified, such as the materials properties, structural shape or geometry, or component geometry. Secondly, the type of data involved needs to be identified; are the relationships between the data and results appropriate for learning? Is there enough data to learn from? Finally, in terms of data mining, forward learning, or backward learning, the type of method of learning needs to be decided upon. These three considerations are applied to composite materials, as discussed in the following.
As technology advances, different types of data storage in a computer have become available. From the simplest initial type, 0/1 sequences, data types have become more complicated, including images, audio, and oscilloscope traces or photographs (oscillograms). On the one hand, images or audio are more informative than traditional types, capable of storing larger amounts of data. On the other hand, they are easily corrupted by artifacts, which are impurities in the images, or by audio noise. Therefore, interpreting the information that such data represent generally requires professionally trained researchers to ultimately make decisions and come to conclusions. Deep learning can be used in this situation to save money and avoid human error by automatically accounting for nonlinear relationships between raw data and output.
Due to the performance limitations of monolithic materials, synthetic composite materials systems can be purposely designed to satisfy given loading and environmental conditions encountered when in service. However, composite data are harder to acquire because the circumstances of both the fabrication and construction processes influence the selection of the constituents and the properties of the composites. Additionally, in the traditional design process of composites, due to many design variables, several attempts have to be made to achieve the desired hygrothermo-electro-mechanical performance for minimum weight. Even though there are analytical and numerical tools to accelerate the materials design process, such as preliminary analytical models, detailed finite element analysis calculations, or deeper molecular dynamics simulations, the time and cost associated with high accuracy can be prohibitive. However, over time, much research data have become available on the internet and elsewhere, allowing deep learning to be used in composite design.

Prediction of properties
Composites combine the properties of two or more materials (constituents), and any two materials (metals, ceramics, polymers, elastomers, and glasses) can be used. These materials may adopt different geometries (particulate, chopped fibre, woven, unidirectional fibrous, and laminate composites) to create a system with a property profile not offered by any monolithic material. Mechanical design is often used to improve the stiffness-to-weight ratio or strength-to-weight ratio or to improve the toughness of materials, while thermo-mechanical design is used to reduce thermal expansion, maximize heat transfer, or minimize thermal distortion. Because of the heterogeneous nature and anisotropy of composites, many experiments have been devised to evaluate their properties under different environmental conditions and under static (tension, compression, shear, and bending), fatigue, creep, and dynamic (impact, blast) loading, in order to characterize their behaviour and confirm whether they meet the requirements (Soutis, 2005).
Thousands of test samples must be meticulously prepared according to international testing standards, significantly increasing production costs. Furthermore, it is hard to evaluate and test the structural integrity of composite materials in service. As a result, developing appropriate non-destructive testing (NDT) techniques that enable the determination of residual strength properties and life becomes critical (Diamanti & Soutis, 2010; Kessler, Spearing, & Soutis, 2002). Deep learning appears to be an appealing tool for mapping these multifaceted relationships and accurately predicting long-term performance in composites design, where the constituent characteristics and properties must be linked to the overall materials system behaviour, to the component, and, ultimately, to the structural response.
Indeed, the emergence of big data has been one of the most critical factors that has driven the rapid development of deep learning over the last few decades. Massive datasets have become available due to the proliferation of sensors and online scientific platforms, which provide the necessary data to train neural network models and support new applications in a variety of domains. For example, the fatigue behaviour of a composite wind turbine blade is a very complex task because it can be influenced by wind speed, light, moisture, and temperature exposure, in addition to materials and loading condition variables. As a result, fatigue testing must be performed to determine the stiffness evolution of the blade. However, it is difficult to accomplish such testing for the following reasons: 1) the test process is inconvenient due to the blade size, 2) fatigue testing is costly and time-consuming, and 3) it is not easy to deal with the large amount of data generated during the process. With this type of problem, the deep-learning method becomes appealing because the data can be used to train a neural network model to identify critical design issues and effectively propose optimal solutions for the operating conditions.

ANN based on the dataset of the experiment
Polyvinyl chloride (PVC) is one of the most widely used and valuable polymers in the chemical industry. Products with different mechanical properties can be obtained by varying the proportions of the constituents, with the resultant materials being used in a variety of applications and industries. However, due to the nonlinear nature of the relationship between its composition and the resulting after-production properties, predicting the production properties of PVC under specific operating conditions is difficult. Altarazi, Ammouri, and Hijazi (2018) built a detailed supervised feedforward ANN model (Figure 7) to predict and optimize three properties of PVC composites (tensile strength, ductility, and density) based on different weight percentages of the different compositions. Additionally, this model was also able to identify the optimal weights of the ingredients of the composites needed to achieve any other desired physical properties. To model this multi-input and multi-output relationship, different training algorithms, activation functions, and ANN architectures were developed. However, only 240 datasets were available for this study, which is not quite enough for a fully connected ANN model. To increase the size of the training datasets, the proposed modelling methodology could be improved or extended by considering more datasets and composite ingredients, or by combining the ANN with other models, such as a GAN.
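A feedforward ANN of the kind used in such composition-to-property studies can be sketched with a single hidden layer trained by gradient descent. The "weight fraction → tensile strength" data below are synthetic stand-ins following an assumed smooth nonlinear rule, not the 240 experimental points of the cited study, and the network size and learning rate are likewise illustrative.

```python
import numpy as np

# Synthetic composition -> property data (assumed nonlinear rule plus noise).
rng = np.random.default_rng(42)
frac = rng.uniform(0.0, 1.0, size=(240, 1))            # e.g. a filler weight fraction
strength = np.sin(np.pi * frac) + 0.05 * rng.normal(size=(240, 1))

# One hidden layer of 8 tanh neurons, linear output neuron.
W1 = rng.normal(scale=0.5, size=(1, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)        # hidden activations
    return h, h @ W2 + b2           # predicted property

mse0 = np.mean((forward(frac)[1] - strength) ** 2)
lr = 0.1
for _ in range(2000):
    h, pred = forward(frac)
    err = pred - strength
    # Backpropagation: output-layer then hidden-layer gradients.
    gW2 = h.T @ err / len(frac); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = frac.T @ dh / len(frac); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((forward(frac)[1] - strength) ** 2)      # training error after fitting
```

The same structure extends directly to the multi-input, multi-output case (several composition fractions in, several properties out) by widening the input and output layers.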
Polycaprolactone (PCL) is a widely used polyester in the biomedical and pharmaceutical fields, which is conventionally synthesized under harsh conditions, such as high reaction temperature, long reaction time, and the use of organic solvents. Also, PCL should be manufactured with caution to avoid the release of toxic compounds, which may cause side effects in users or organisms (Espinoza, Patil, San Martin Martinez, Casañas Pimentel, & Ige, 2020). In a study by You and Arumugasamy (2020), the effects of reaction temperature and time on PCL molecular weight were studied for specified production goals using FFNN and adaptive neural fuzzy inference system (ANFIS) methods. These models can be used to predict the molecular weight (output) of the biopolymer from enzymatic polymerization after the ANN has been trained with operating temperature and polymerization time (input) datasets. Comparison of the FFNN and ANFIS results proved that the ANFIS model is better at prediction, achieving 99.99% validation accuracy. This is because the ANFIS model combines the advantages of the fuzzy logic system and neural network learning, so it can adapt and establish a good relationship between the input variables and the molecular weight of the PCL. Other similar research studies have been conducted, using various ANN models to solve this type of nonlinear relationship problem (Ashhab, Breitsprecher, & Wartzack, 2014; Muñoz-Escalona & Maropoulos, 2010; Velten, Reinicke, & Friedrich, 2000).

Li et al. (2019) proposed a model (shown in Figure 8) to predict the effective modulus of a shale sample, which is a complex heterogeneous composite consisting of multiple mineral constituents. The model was trained on SEM images, from which a large number of stochastic samples were generated using a stochastic reconstruction method. The finite element method was used to calculate the labels of the dataset, which are the moduli of the samples.
In this study, a CNN was trained on an ordinary desktop computer equipped with an i7-8700 CPU using 10,000 generated stochastic mesoscale shale samples. The training was iterated for 100 cycles and took around 43 minutes, giving this method relatively high efficiency compared to other deep-learning models or ML methods. Eventually, the average prediction error was as low as 0.97%, which suggests that the trained CNN model exhibits promising performance in predicting the effective moduli of real shale samples. However, since all the labels in the dataset were simulated, many real-world factors were not considered thoroughly, meaning that the accuracy of this model still needs to be confirmed.
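The core operation by which a CNN extracts features from microstructure images is the discrete 2D convolution. The following minimal NumPy sketch applies a hand-picked edge-detecting kernel to a toy two-phase image; the image and kernel are illustrative and unrelated to the actual shale dataset.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core CNN operation."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

# A toy two-phase microstructure: 1 = stiff inclusion, 0 = matrix.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0

# A vertical-edge kernel responds at the phase boundaries of the inclusion.
kernel = np.array([[1., 0., -1.],
                   [2., 0., -2.],
                   [1., 0., -1.]])
fmap = conv2d(img, kernel)
print(fmap.shape)   # (6, 6)
```

In a real CNN, many such kernels are learned from data rather than hand-picked, and the stacked feature maps feed further convolution and pooling layers.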
Practically, SEM is mostly used as an auxiliary tool because its limited observation area makes it insensitive to small differences when evaluating object characteristics. Carbon fibre (CF) reinforced cement-based composites (CFRCs) are attractive for use in future civil engineering applications because of their excellent mechanical (Graham, Huang, Shu, & Burdette, 2013), electrical (Wang, Li, Li, Guo, & Jiao, 2008), and thermal properties (Teomete, 2015). In research by Tong, Gao, Wang, Wei, and Dou (2019), a deep-learning method was proposed to characterize the CF morphology distribution in the CFRC and predict the properties of the material. Ordinary Portland cement, short-cut CFs, and mixing water were used to make the CFRC samples in this study. Firstly, a modified CNN, in which the fully connected layers were replaced with deconvolutional layers (the reverse process of convolution), was used to segment X-ray images and characterize the CF morphology distributions, as shown in Figure 9(a). Red, yellow, and green were used to represent CF bundles, CF clustered areas, and uniformly dispersed CF areas, respectively. Following that, based on the segmentation results, 3D reconstruction [Figure 9(b)] was applied to the samples. The 3D reconstruction results provided visual and analyzable models, wherein the volumes and changes of each component were presented and utilized to predict resistivity and mechanical properties. However, using empirical equations based on the CF morphology distribution to predict CFRC properties is still difficult, because even the same CF distribution in different slices of the samples makes a different contribution toward the CFRC properties, which is difficult for humans to define.
The cascade deep-learning method proposed in this work, which combines a radial basis function (RBF) network and a fully convolutional network, effectively coped with this problem and quantitatively measured the contributions of the different CF morphology distributions to the properties of the CFRC, exploiting the ability of RBF networks to provide output feedback control of nonlinear systems (Seshagiri & Khalil, 2000).
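An RBF network of the kind combined in this cascade approximates a nonlinear mapping with a weighted sum of Gaussian basis functions. A minimal sketch, assuming fixed, evenly spaced centres and least-squares output weights on an invented one-dimensional target:

```python
import numpy as np

def rbf_design(X, centres, gamma):
    """Gaussian radial basis activations for each (sample, centre) pair."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

# Toy 1D target: a property varying nonlinearly with one descriptor.
X = np.linspace(0, 1, 50)[:, None]
y = np.sin(2 * np.pi * X[:, 0])

centres = np.linspace(0, 1, 10)[:, None]     # fixed RBF centres
Phi = rbf_design(X, centres, gamma=50.0)     # design matrix of activations
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # output weights by least squares

pred = Phi @ w
print(f"max abs error: {np.abs(pred - y).max():.4f}")
```

Because only the linear output weights are fitted, training reduces to a least-squares solve, which is one reason RBF networks are attractive in control settings.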
In addition to SEM images and numerical values, other forms of data, such as spectra, can be utilized to train deep-learning models. A deep-learning framework consisting of a GAN and a DNN was used by Tong et al. (2021) to characterize the hydration and dry shrinkage behaviour of cement emulsified asphalt composites (CEACs). With the help of the GAN component, which can generate SEM images and characterize the hydration products and background in the microstructure of the CEACs, the proposed framework is capable of mapping the design parameters of the CEACs to their X-ray powder diffraction spectra and SEM images. The proposed framework achieved lower errors across the 36 test groups than an ANN-based method with the same architecture and training dataset. In a study by Ma et al. (2021), an accurate prediction structure for the flank wear experienced by a milling tool when machining TC18 titanium alloy was established by determining the tool wear mechanism and the real-time milling force, as shown in Figure 10. Compared with images and scalar values, signal data have an important feature: they represent a dynamic, time-based process rather than static values. This suits RNN-based methods, which process time-related data by memorizing previous inputs to the network. For this task, LSTM and GRU units were chosen in combination with a CNN to prevent backpropagated errors from vanishing or exploding. The predicted minimum values were all found to have errors of <8%, demonstrating that the deep-learning method is a novel and promising approach for online tool wear monitoring.
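To illustrate how a gated recurrent unit carries memory across a time-ordered force signal, the following NumPy sketch implements a single, randomly initialized (untrained) GRU cell and runs a hypothetical three-axis force sequence through it. It shows the gating mechanics only and is not the architecture of Ma et al.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell: gated memory over a sequence of force samples."""
    def __init__(self, n_in, n_hid):
        s = 1.0 / np.sqrt(n_hid)
        self.Wz = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wr = rng.uniform(-s, s, (n_hid, n_in + n_hid))
        self.Wh = rng.uniform(-s, s, (n_hid, n_in + n_hid))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)            # update gate
        r = sigmoid(self.Wr @ xh)            # reset gate
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde     # blend old state and candidate

# Run a hypothetical 3-axis force signal (Fx, Fy, Fz) through the cell.
cell = GRUCell(n_in=3, n_hid=8)
h = np.zeros(8)
for t in range(100):
    force = np.array([np.sin(0.1 * t), np.cos(0.1 * t), 0.5])
    h = cell.step(force, h)
print(h.shape)   # final hidden state summarizes the whole sequence
```

The gates are what let gradients flow over long sequences, which is why GRU and LSTM units mitigate the vanishing/exploding-gradient problem mentioned above.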

Deep learning based on theoretical data
Figure 10. Outline of the prediction method. The raw force data were the milling force signals in the three directions Fx, Fy, and Fz.
Besides methods that learn from experimental data, the rise of computer science has brought about the development of many calculation methods, such as stochastic image reconstruction, finite element, and 3D reconstruction methods, which can be used to generate enormous datasets. However, all these methods are based on complex mathematical equations, making them extremely costly to compute. Moreover, to maintain sufficient accuracy, tasks are in general finely divided into elements for calculation, which may take a long time. Therefore, this is another area in which the deep-learning method is expected to help reduce calculation time.
In some situations, conductive heat transfer methods are used, such as embedding conduit materials with a high thermal conductivity into substrate materials with a much lower thermal conductivity to act as cooling channels (Dbouk, 2017). Thermal conductivity is a significant thermophysical property of any composite material. Generally, experiments are carried out to study the heat transfer process in composites. Extensive trial and error testing, however, is prohibitive due to issues such as the cost of experiments and uncertainties in measurement processes. To investigate the capacity of ML methods for heat transfer analysis, Wei, Zhao, Rong, and Bao (2018) created a database containing the properties and structures of composites generated using a quartet structure generation set, and applied the lattice Boltzmann method to calculate the effective thermal conductivity of the materials. In a study by Rong, Wei, Huang, and Bao (2019), in consideration of it being more convenient to obtain 2D images than 3D images, 2D cross-sectional images and 2D CNNs (a set of CNN structures) were used to predict the effective thermal conductivity of 3D composite materials. Using multiple cross-sectional images along or perpendicular to the preferred directionality of the fillers in the materials, the results showed that 2D CNNs can provide a relatively accurate prediction of thermal conductivity.
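Before reaching for a learned surrogate, the effective thermal conductivity of a two-phase composite can at least be bracketed analytically. The classical Wiener (parallel/series) bounds follow from the rule of mixtures and its inverse; the conductivity values below are rough, assumed figures chosen only to illustrate the spread between the bounds.

```python
# Classical Wiener bounds on the effective thermal conductivity of a
# two-phase composite: parallel (upper) and series (lower) arrangements.
def wiener_bounds(k1, k2, phi1):
    """k1, k2: phase conductivities (W/m.K); phi1: volume fraction of phase 1."""
    phi2 = 1.0 - phi1
    k_upper = phi1 * k1 + phi2 * k2            # rule of mixtures
    k_lower = 1.0 / (phi1 / k1 + phi2 / k2)    # inverse rule of mixtures
    return k_lower, k_upper

# Example: copper conduits (k ~ 400 W/m.K) in an epoxy matrix (k ~ 0.2 W/m.K).
lo, hi = wiener_bounds(400.0, 0.2, phi1=0.1)
print(f"effective k lies in [{lo:.3f}, {hi:.2f}] W/m.K")
```

The large gap between the bounds for high-contrast phases is precisely why microstructure-aware predictors (lattice Boltzmann simulations or trained CNNs) are needed.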
In addition to the prediction of thermal conductivity, several topology optimization (TO) methods have been developed for designing the layout of cooling channels, including the constructal theory (Bejan, 2015), level set (LST) method (Yaji, Yamada, Kubo, Izui, & Nishiwaki, 2015), phase-field method (August et al., 2015), variable thickness method (VTM) (Chiba, 2012), homogenization method (HDM) (Zhou & Li, 2008), evolutionary structural optimization method (Ansola, Veguería, Canales, & Alonso, 2012), and solid isotropic material with penalization method (Marck, Nemer, Harion, Russeil, & Bougeard, 2012; Page, Dirker, & Meyer, 2016). Lin, Liu, and Hong (2019) proposed a novel supervised deep-learning predictor (SDLP) to directly infer and predict the optimal layout of cooling channels. The resulting predictor is made up of an encoder and a decoder, like an autoencoder: the goal of the encoder is to reduce the dimensionality of the input data by encoding it into an intermediate variable, and the goal of the decoder is to decode the intermediate variable into a topological structure that describes conductive heat transfer. The physical parameters describing the cooling problem to be optimized, such as the boundary and constraint conditions, were then used as input for the deep-learning predictor, with the final predicted and actual conductive heat transfer topologies shown in Figure 11. As depicted, the prediction of the main branches is accurate, but some tail ends are missing, which suggests that the accuracy still needs further improvement. Furthermore, some unreasonable structures, such as disconnected sections, were generated as a consequence of the coarse grid and the chosen filtering technique. If the grid size were smaller, the structure would be much smoother and finer, with no disconnected sections.
These points highlight some of the inherent flaws of SDLPs, such as the fact that errors are unavoidable due to the reduction in dimensionality compared to the raw images. In addition, the TO algorithm had to be executed 10,000 times to create the training dataset, which consumed a large amount of time in both the data generation process and the training period. Once trained, however, the elapsed time from input to output was on the order of 1.837 s, which is almost instantaneous.
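The encoder-decoder idea behind the SDLP can be sketched in its simplest form: compress the input into a low-dimensional intermediate variable, then decode it back. For a purely linear autoencoder the optimum coincides with the PCA subspace, so the sketch below builds the encoder and decoder in closed form from an SVD on synthetic "layout" vectors; the data and dimensions are invented for illustration and information is genuinely lost whenever the bottleneck is narrower than the data's intrinsic dimensionality.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "layouts": 200 flattened 8x8 fields lying in a 3D latent subspace.
latent = rng.normal(size=(200, 3))
basis = rng.normal(size=(3, 64))
X = latent @ basis

# A linear autoencoder's optimum is the PCA subspace, so the
# encoder/decoder can be built in closed form from the SVD.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 3
encode = Vt[:k].T        # 64 -> 3 projection (the "intermediate variable")
decode = Vt[:k]          # 3 -> 64 reconstruction

Z = X @ encode           # latent codes at the bottleneck
R = Z @ decode           # decoded layouts
mse = ((R - X) ** 2).mean()
print(f"bottleneck size {k}, reconstruction MSE: {mse:.2e}")
```

Here the data is exactly rank 3, so a width-3 bottleneck reconstructs it almost perfectly; shrinking k below the intrinsic dimension introduces exactly the kind of unavoidable error the paragraph above describes.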
Using the mechanics of structure genome (MSG) and a fully connected DNN, Liu, Gasco, Goodsell, and Yu (2019) proposed new failure criteria for fibre tows based on a micromechanical model. In addition, benchmark results from a meso-microscale coupled model were used to verify the accuracy and efficiency of the proposed model. The database for this study, containing 3,000,000 samples, was generated by microscale failure analysis based on an MSG solid model. The results indicated that the proposed model is in good agreement with the benchmark results, while the traditional failure criteria show a significant loss of accuracy in some strength predictions. All computations were carried out on a single CPU on the same Windows workstation. Building the DNN-based failure criteria took 8.3 h, including the sampling and MSG microscale analysis. Once the proposed criteria were implemented in the DNN model, the calculation time for determining the strength constants under six loading cases was reduced from 8.1 to 0.32 h compared with traditional MSG microscale analysis.
Homogenization refers to the transfer of salient information from a lower materials structure scale to a higher one, whereas localization is the process of transferring important information from a higher materials structure (macro)scale to a lower (nano)scale. Localization is critical in assessing or predicting failure-related properties of composite materials. Many numerical approaches have been used to address localization problems, such as the finite element method (FEM), which subdivides a large system into smaller and simpler elements (Aarnes, Krogstad, & Lie, 2006; Luscher, Mcdowell, & Bronkhorst, 2010), iterative methods employing Green's functions, and fast Fourier transforms (FFTs) (Moulinec & Suquet, 1998). Yang, Dai, Rao, and Chyu (2019) proposed a deep-learning model that can efficiently and accurately predict the microscale elastic strain field of 3D composites that have large differences in elasticity between their components. Figure 12 shows the localization in hierarchical multiscale modelling (Yang, Dai, et al., 2019); finite element simulations produced the datasets in this study. In addition, there are other investigations (Mendizabal, Márquez-Neila, & Cotin, 2020; Qi, Zhang, Liu, & Chen, 2019) that have been conducted using both deep-learning and finite element methods.
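Localization is easiest to see in one dimension. For a two-phase bar loaded in series, both phases carry the same stress, so the local phase strains follow in closed form from the phase moduli and the Reuss effective modulus. The moduli below are rough, assumed figures for carbon fibre and epoxy, used only to make the partitioning concrete.

```python
# Homogenization and localization for a 1D two-phase bar loaded in series:
# both phases carry the same stress, so local strains follow from phase moduli.
def reuss_modulus(E1, E2, phi1):
    """Effective (Reuss) modulus of two phases in series."""
    return 1.0 / (phi1 / E1 + (1 - phi1) / E2)

def localize(eps_macro, E_eff, E_phase):
    """Local strain in a phase given the macroscopic strain (equal stress)."""
    sigma = E_eff * eps_macro      # macroscopic stress
    return sigma / E_phase         # phase-level strain

E_fibre, E_matrix, phi_f = 230e9, 3e9, 0.5   # assumed moduli in Pa
E_eff = reuss_modulus(E_fibre, E_matrix, phi_f)
eps = 0.01
print(localize(eps, E_eff, E_fibre), localize(eps, E_eff, E_matrix))
```

Note how almost all of the macroscopic strain localizes in the compliant matrix; in 3D microstructures this partitioning has no closed form, which is what the deep-learning localization models approximate.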
As evidenced by the examples presented in this section, deep-learning technology is already being used in materials research to predict the properties and thermo-mechanical behaviour of materials under a variety of experimental conditions based on materials compositions, microstructure images, numerical experimentation, or indirect measurement results.

Data processing
Even though continuous improvements are being made to equipment to ensure greater accuracy, there is still a need for professionally trained and qualified personnel to analyze the gathered data. From the raw data of electron microscope tests, ultrasonic tests, and thermography, for example, it is difficult to quickly determine and identify all critical aspects. The analysis of these data therefore takes a long time due to their volume and complexity. Worse still, human errors made throughout the process reduce the accuracy of the results, especially for features that are difficult to distinguish with the naked eye. For this purpose, a deep-learning algorithm can utilize known data to train a model, making it possible to mine and memorize the intrinsic features of given data, and then recognize these features in a new, unknown dataset.
Composite materials are emerging to meet the high requirements of a variety of fields. Similarly, the complexity of composites causes a slew of problems in their manufacturing and in-service inspection processes. During the manufacturing process of composites, microcracks, fibre breakage, voids, porosity, delamination, and inclusions are all common occurrences (Le, Pham, & Lee, 2021). These defects may lead to serious accidents and casualties depending on the intended applications of the composites. To evaluate composite materials, ultrasonic, thermographic, infrared thermographic, radiographic, visual detection, acoustic emission, acoustic-ultrasonic, stereographic, optical, electromagnetic, liquid penetrant, and magnetic particle testing can all be used. To reduce the effects of operator subjectivity and improve defect detection efficiency, methods based on modern computer vision, particularly deep learning, are used to detect defects in composite materials. Many different types of research on using deep-learning techniques to process detection data will be introduced in this section. Gong, Shao, Luo, and Li (2020) proposed a deep transfer learning model that accurately extracts features from unlabelled X-ray images of aerospace composite materials (ACMs) containing inclusion defects. The proposed deep transfer learning model is made up of four modules: a feature extractor, a label classifier, a domain classifier, and distance metrics. The feature extractor, trained jointly with these four modules, obtains domain-invariant features, and the label classifier uses them to achieve good inclusion defect detection performance. To demonstrate the benefits of this model, comparison experiments were conducted on the same ACM X-ray image samples using four widely used methods: ANN, DANN, a conscious neighbourhood-based crow search algorithm, and a deep convolutional transfer learning network.
The proposed models exhibited an accuracy of 96.8%, exceeding the performance of the other models. As shown in Figure 13, inconspicuous inclusions can be detected by this model, and the corresponding heat map results can also be shown for reference after the automatic process. Furthermore, this model is real-time, with a detection time of 0.24 s for a single ACM X-ray image of a common size.

Crack detection
Aside from inclusions, crack formation is the other important factor that greatly influences the function of structures and can lead to serious consequences. At the macrostructural level, many structures suffer from crack propagation, which poses safety risks. For instance, according to Nowak (2012), 40% of the 570,000 bridges in the USA have been classified as deficient, requiring rehabilitation or replacement at an estimated cost of 50 billion dollars. Civil structures and infrastructures, such as bridges, tunnels, buildings, dams, and roads, are prone to damage due to various mechanisms related to mechanical loading, chemical processes, and environmental action. As field evaluation is costly and inconvenient, several structural health monitoring techniques have been proposed for detecting, locating, and monitoring such damage. Visual inspection has been the most widely used method for monitoring concrete structures in service. A study by Flah, Suleiman, and Nehdi (2020) proposed a nearly automated inspection method based on image processing and deep learning for detecting defects in typically inaccessible areas of concrete structures. The training datasets in this study included 20,000 images of cracked concrete structures and 20,000 images of structurally sound concrete structures. After CNN classification and segmentation, the dimensions of the cracks, such as length, width, and angle, were calculated. The failure mode was then predicted, and the extent of the damage was assessed. Finally, a comprehensive evaluation of the cracks was achieved. For a typical CPU processor, the training time for this network was 2-3 h, with a recorded testing accuracy of ≥98.15% according to the classification analysis. Finally, unlike comparable current state-of-the-art methods, the proposed approach demonstrated computational efficiency and prompt performance, although damage recognition and quantification remained time-consuming tasks.
This research, however, is not without flaws. For example, when comparing one single crack pattern per image to a group of cracks, the prediction results are frequently overestimated. Similarly, Zheng, Lei, and Zhang (2020) proposed different building crack detection models (convolutional networks) to analyze the surface data of roads, bridges, houses, and dams. According to this investigation, a richer fully convolutional networks model based on image recognition exhibits the best processing effect, with higher degrees of recognition, accuracy, and precision, as well as better stability. Aside from cracks in buildings, Huang et al. (2020) presented an intelligent surface damage detection method based on a CNN, which has powerful learning ability and can automatically extract discriminant features via training on surface images of steel wire rope. The use of more layer connections and neuron learning units in common fully connected CNN models usually results in better learning ability. However, this may exacerbate the problem of overfitting. The "dropout" function (Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov, 2014), which allows neurons to be randomly deactivated during training according to a dropout ratio, was introduced in this study to solve this problem and improve the training efficiency and generalization ability of the model. In addition, by normalizing all the input features, a batch normalization method (Ioffe & Szegedy, 2015) was used to speed up the deep learning. As a consequence, the intrinsic limitations of manual feature extraction methods were not only overcome, but an outstanding accuracy of 99% was achieved, a performance better than those of other conventional ML methods.
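The two tricks mentioned above are simple to state in code. A minimal NumPy sketch of "inverted" dropout and a stripped-down batch normalization (without the learnable scale and shift parameters used in practice):

```python
import numpy as np

rng = np.random.default_rng(5)

def dropout(H, p, training=True):
    """Randomly zero activations with probability p, scaled so that the
    expected activation matches at test time (inverted dropout)."""
    if not training:
        return H
    mask = rng.random(H.shape) >= p
    return H * mask / (1.0 - p)

def batch_norm(H, eps=1e-5):
    """Normalize each feature over the batch to zero mean, unit variance."""
    mu = H.mean(axis=0)
    var = H.var(axis=0)
    return (H - mu) / np.sqrt(var + eps)

H = rng.normal(3.0, 2.0, (256, 8))    # a batch of hidden activations
Hd = dropout(H, p=0.5)                # regularization: random deactivation
Hn = batch_norm(H)                    # normalization: speeds up training
print(Hn.mean(axis=0).round(6), Hn.std(axis=0).round(3))
```

Dropout discourages co-adaptation of neurons (reducing overfitting), while batch normalization keeps activation statistics stable across layers, which is what accelerates training.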
Depending on their width, length, and propagation, a collective pattern of microcracks can lead to the formation of macrocracks associated with the catastrophic breakage of the structural unit (Ohno & Ohtsu, 2010). Problematic cracks must be identified before secondary measurements or final improvement and processing to avoid any negative consequences. Issues can be effectively avoided if future crack propagation can be predicted. Cracks can range in size from internal microcracks to large (macroscale) cracks, and their origins and classifications are varied and complicated. Macroscale cracks can be formed either through the collective motion of existing microcracks or through external factors, such as inappropriate onsite construction practices, errors in structural design and detailing, and excessive interactions in hostile environments (Landis, 1999; Walker, Lane, & Stutzman, 2004). The visual inspection of cracks takes time and can be subjective depending on the experience and skill of the inspector, as evidenced by the misclassification of small and noisy cracks. In the study of Hwang et al. (2019), loess/water mixtures were chosen as a model system that generates sufficient data on cracks that form as a result of water evaporation. A deep-learning model was then used to detect and classify the edges and nodes of cracks that formed during the drying stage of the loess/water mixture system, as shown in Figure 14, using the MATLAB implementations of the AlexNet and YOLO object detection algorithms. High-precision crack detection was implemented using topology network-based analysis, focusing on the connectivity between neighbouring nodes, based on the predetermined information of nodes and edges. The total training using RGB-based and binary images was completed in 8 min 20 s and 9 min 12 s, respectively. Moreover, the training procedure was successfully carried out until the accuracy reached 99%.
Consequently, this method is a fast, reliable, and objective approach with high precision for understanding crack formation and propagation from both qualitative and quantitative aspects.
In addition to these studies, other formats of datasets from different composite materials systems have also been used to build deep-learning models for crack detection. Yang, Chen, Wang, and Wang (2021) used a deep-learning method to develop an acoustical crack detection model for carbide anvils, which play a significant role in producing synthetic diamonds. Rather than using ordinary digital images of cracks to train deep-learning models, this study used acoustic signals expressed as sound impulses to build a detection system. When faced with complex sound impulses, the shallow ANN technique can result in poor representation (Tran-Ngoc, Khatir, De Roeck, Bui-Tien, & Abdel Wahab, 2019). In addition, if the network is too deep, the training process becomes difficult and prone to overfitting. Compared to ANN-based models, a stacked autoencoder (SAE) (Zenzen, Khatir, Belaidi, Thanh, & Abdel Wahab, 2020) exhibits stronger nonlinear expression ability due to its increased number of network layers and can discover useful hierarchical feature representations. Furthermore, hyperparameters are regularized using dropout and weight decay coefficient methods to improve generalization ability and overcome the flaws of the SAE model, namely information loss. As a result, the proposed acoustic detection method, which is based on an improved stacked autoencoder optimized by a particle swarm optimization algorithm (SAE-PSO algorithm), can recognize cracked anvils in synthetic diamond workshops with greater accuracy than other algorithms.
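Particle swarm optimization, used in that work to tune the SAE, is itself a short algorithm: candidate solutions move under attraction to their own best position and the swarm's best. A minimal sketch minimizing a toy objective, with standard but arbitrary parameter choices rather than the SAE-PSO configuration of the study:

```python
import numpy as np

rng = np.random.default_rng(6)

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimization of f over [-5, 5]^dim."""
    x = rng.uniform(-5, 5, (n_particles, dim))       # positions
    v = np.zeros_like(x)                             # velocities
    pbest = x.copy()                                 # personal bests
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[pbest_f.argmin()].copy()               # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# Minimize a shifted sphere function; optimum is at (1, 1, 1).
best_x, best_f = pso(lambda z: ((z - 1.0) ** 2).sum(), dim=3)
print(best_x.round(3), best_f)
```

Because PSO needs only objective evaluations, not gradients, it suits hyperparameter tuning, where the "objective" is a full model-training run.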
Another popular NDT imaging technique is thermography, which detects defects by observing heat patterns on a target material (Lu, Wang, Qin, & Ma, 2017; Yang & He, 2016). The advantages of thermography, such as short inspection times, no contact or coupling, and easy interpretation of data, have led to its recognition by relevant organizations as standard practice for NDT in the aerospace field (Ciampa, Mahmoodi, Pinto, & Meo, 2018; Yang, Choi, Hwang, An, & Sohn, 2016). However, the surrounding environment limits its ability to detect defects over a large surface due to thermal gradients, the high energy input demanded for thermal stimulation, and difficulties associated with uniformly heating a large area (Busse, Wu, & Karpen, 1992; Lucchi, 2018). Moreover, NDT results obtained by thermography are largely determined by inspectors, which means that the quantitative information is inherently biased (Yang, Choi, et al., 2016). Bang, Park, and Jeon (2020) prepared two types of CF composite samples with artificial defects to obtain thermographic images and evaluate identification accuracy using a CNN-based model. A faster region-based convolutional neural network, including a region proposal algorithm, was used to reduce the computation time of the CNN. Instead of selecting a large number of regions using sliding-window techniques, specific regions are extracted and warped into squares as CNN input images, resulting in increased efficiency. The performance of the proposed system was evaluated by assessing its ability to identify defects, and discussed in light of the average precision for defect identification. Although some of the groups in the study showed promising accuracy, the overall precision of the study was low due to a lack of data. By training with more data obtained from larger-scale practice, the reliability of the proposed system can be further improved.
Similarly, Luo, Gao, Woo, and Yang (2019) labelled thermal optical pulsed thermography (OPT) images, as shown in Figure 15, to train a defect detection system with increased accuracy. A hybrid spatial and temporal mathematical model can be used in the OPT system to express the temperature field of the measured object as a function of space and time. In this model, spatial and temporal information are represented and processed by two types of network, respectively: an AE-based model was used for the spatial information because it is extremely complex and large, whereas LSTM models were used for the temporal information because it is linked to time. The most appropriate deep-learning network was thus chosen for each task to achieve optimal performance. Moreover, the probability of detection (POD), defined as the proportion of detected defects among all defects, was used to represent the detection performance of a model. After feature extraction (labelling), the average POD of the principal component analysis algorithm combined with the random color method was increased by as much as 96%. Furthermore, when the training and test sets contain the same type of data, performance can be improved.

Semantic segmentation
Deep learning has also been used in the composite field for semantic segmentation, which is a direct and visible application. Through integrated semantic segmentation using image processing-assisted stereography tools, detailed microstructural features can be quantitatively extracted from 2D images automatically and objectively, without any human involvement, such as the size distribution, surface (or equivalently, volume) fraction, length of two-phase boundaries, and density of triple-phase boundaries (Bulgarevich, Tsukamoto, Kasuya, Demura, & Watanabe, 2018; Hagita, Higuchi, & Jinnai, 2018; Modarres et al., 2017).
Figure 15. Process of labeling: a) one frame with the strongest defect information is selected in the thermal sequence, confirmed by peers and a preparation map; b) the data calibration tool LabelMe is used to calibrate the defects of the frame image, a process confirmed by multiple peers; and c) a binary image is generated (Luo et al., 2019).
Solid oxide fuel cells
(SOFCs) were once thought to be extremely complicated multicomponent materials systems with porous electrodes, densified electrolytes, interconnects, and sealing materials, and have been hailed as one of the most powerful next-generation energy conversion systems due to their high efficiency, great fuel flexibility, and low pollution levels (Hwang et al., 2019; Singhal, 2000; Ziatdinov et al., 2017). The implications of semantic segmentation in SOFCs are discussed in consideration of the efficient analysis and design of high-performance electrode structures in devices for energy-related applications. Hwang et al. (2020) proposed a novel approach in which semantic segmentation operations can be automated based on a state-of-the-art deep-learning algorithm, named DeepLabV3+, to minimize labour-intensive human involvement (Filippo, da Fonseca Martins Gomes, da Costa, & Mota, 2021). In DeepLabV3+, the encoder is divided into depthwise separable convolution and atrous spatial pyramid pooling layers, and the decoder merges low-level features and performs feature map recovery. The proposed model is made faster and stronger by the use of separable convolution, which significantly reduces the computational complexity (Zeng, Peng, & Li, 2020). Different types of images can be obtained to analyze different issues via automated segmentation and parametric adjustment, as shown in Figure 16. Image segmentation of phases is very effective for petrographic analysis in the more complicated case of concrete, because it saves not only a lot of time but also a lot of money in terms of labour costs. Furthermore, this process can alleviate concerns about operator subjectivity during the repeated visual judgments of an inspector over several hours (Zhao et al., 2000).
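Once a segmentation is available, the stereological parameters mentioned above reduce to simple array operations. A toy sketch computing the phase area fraction and a pixel-edge estimate of two-phase boundary length on an invented binary mask (real analyses would also calibrate pixel size and correct for orientation bias):

```python
import numpy as np

def area_fraction(mask):
    """Fraction of pixels belonging to the phase."""
    return mask.mean()

def boundary_length(mask):
    """Count of horizontal/vertical pixel edges between phase and matrix,
    a simple estimate of two-phase boundary length (in pixel units)."""
    horiz = np.abs(np.diff(mask.astype(int), axis=1)).sum()
    vert = np.abs(np.diff(mask.astype(int), axis=0)).sum()
    return horiz + vert

# Toy segmented image: a 4x4 square of one phase inside a 10x10 field.
mask = np.zeros((10, 10), dtype=bool)
mask[3:7, 3:7] = True
print(area_fraction(mask), boundary_length(mask))
```

The value of deep-learning segmentation is upstream of this step: it turns raw micrographs into reliable masks, after which the stereology itself is nearly free.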
Song, Huang, Shen, Shi, and Lange (2020) explored the potential of using novel deep-learning techniques for the petrographic analysis of concrete, with the results showing that the CNN involved is highly accurate and exhibits a time advantage over conventional color-based approaches. Furthermore, the CNN segmentation approaches the quality achieved by human judgment in many of the cases presented in their paper, but at a fraction of the time. CNN segmentations take seconds to complete, with parameters able to be computed almost instantly using a simple program.
The process of phase boundary extraction is currently used to segment the majority of data. The deep-learning method can recognize a variety of features in addition to those listed above. Metallic nanoparticle complexes, for example, exhibit a strong relationship between their physicochemical and mechanical properties and the order and distribution of their atomic structure. Although TEM can provide the highest spatial resolution for imaging the atomic structure of nanoparticles, it is time-consuming and cumbersome to estimate atomic column heights from 2D projected TEM images. With the advancement of TEM techniques in the discovery of nanoscale science, it is critical to combine them with AI and real-time TEM imaging methods. Ragone et al. (2020) presented a modelling framework based on a deep-learning approach for detecting the atomic column heights in high-resolution TEM images for different sizes of gold nanoparticles. Moreover, 2D images were processed and reconstructed into a 3D model for further calculation, in accordance with published literature (Ali, Guan, Umer, Cantwell, & Zhang, 2020). A dataset from both real and computer-generated virtual samples was used to train a deep-learning network in this study. The real samples were raw 3D volumes reconstructed from CT images, whereas the virtual samples were computer-generated geometrical representative volume elements.
Figure 16. Stereological analysis of resolved phase images for extracting microstructural parameters: (a) original electron microscopy image, (b) image inferred from semantic-segmentation-assisted deep learning, (c) application of line intercepts to binary data files, (d) two-phase boundaries between GDC, LSC, and pores, and (e) the distribution of triple-phase boundaries (Hwang et al., 2020).
This work shows great potential for developing effective and accurate segmentation procedures, enabling further prediction of mechanical and structural properties and promising the optimization of manufacturing processes by paving the way toward smart manufacturing, after being enhanced by the datasets of virtual samples.

Composite materials design (TO)
Both the prediction of properties and data processing discussed in the last two sections are forward propagation processes, where neural networks map from input to output space. In contrast, searching for a topological design involves inverse mapping from output to input space. In the traditional optimization process, the optimization algorithm must begin with an initial topology and run iterations until the optimal topology is obtained. However, because it has no unique solution, this inversion is an ill-posed problem. To put it another way, a specific output can be derived from a variety of different inputs. As a result, finding the patterns with the smallest loss function is the goal.
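The ill-posedness of this inverse mapping can be demonstrated with a toy forward model whose output depends only on the sum of the design variables, so that infinitely many designs share one output; gradient descent on the loss then recovers one solution among many, depending on the starting point. The forward model below is invented purely for illustration.

```python
import numpy as np

# A toy forward model: the "response" depends only on the sum of the design
# variables, so infinitely many designs share one output (ill-posed inverse).
def forward(x):
    return np.sin(x.sum())

x_a = np.array([0.2, 0.8])
x_b = np.array([0.5, 0.5])
assert np.isclose(forward(x_a), forward(x_b))   # distinct inputs, same output

# Inverse design by gradient descent on the loss |forward(x) - target|^2
# recovers *a* solution, which depends on the starting point.
target = np.sin(1.0)
x = np.array([2.0, -0.5])
for _ in range(500):
    s = x.sum()
    grad = 2 * (np.sin(s) - target) * np.cos(s)  # d loss / d x_i (same for each i)
    x = x - 0.1 * grad
print(x, forward(x))
```

The optimizer converges to a design whose components sum to 1, but it is only one point on a whole line of equally valid designs, which is why inverse design methods seek the pattern with the smallest loss rather than a unique answer.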
Over the years, several optimization techniques have been widely used to determine the optimal shape and size of engineering structures (trusses and frames) under different constraints (stress, displacement, buckling instability, kinematic stability, and natural frequency) (Aykut, 2020). Topology optimization (TO) of structures has proven to be the most valuable tool for concept identification in the early stages of the design process, and it is widely used in the automotive and aerospace industries, as well as in civil engineering, materials science, and biomechanics, to design lightweight structures (Bujny, Aulig, Olhofer, & Duddeck, 2016; Zhu & Gao, 2016). TO offers the conceptual design of lighter and stiffer structures. With the help of FEM software, it can also check a design with respect to its feasible design range, its accuracy for different loads and conditions, and the design and manufacturing constraints.
TO offers a systematic platform for obtaining new designs of materials and structural systems with optimized responses (Bendsøe & Sigmund, 2003; Bendsøe & Kikuchi, 1988; Hofmeyer, Schevenels, & Boonstra, 2017; Ivarsson, Wallin, & Tortorelli, 2018; Li, Zhang, & Khandelwal, 2017). The pursuit of high-performance materials and lightweight structures has piqued both scientific and industrial interest (Abou-Ali, Al-Ketan, Rowshan, & Abu Al-Rub, 2019; Michaleris, Tortorelli, & Vidal, 1994). In general, composite materials and structures are created from components, volume fractions, and architectures. Due to the high degree of design freedom, however, realizing this process and achieving the optimal material structure is difficult. Obtaining optimal nonlinear structures is now possible thanks to cloud computing, ML, and simulation. In addition, designing the architectures of materials is epoch-making, since the obtained materials can possess unprecedented properties not observed in natural materials. These properties include, but are not limited to, a very high stiffness-to-weight ratio and a negative Poisson's ratio (Ruzzene & Scarpa, 2005). TO aims to find the best material distribution that maximizes system performance while adhering to design constraints (Vangelatos, Gu, & Grigoropoulos, 2019).
Metamaterials are gaining traction as promising materials with unique architectures that exhibit tailorable and unprecedented properties for a diverse range of applications. Due to the limitations of manufacturing techniques in the past, producing complex geometries was difficult. The fabrication of these materials with complex architectures has become possible in recent years thanks to advances in additive manufacturing (Bendsøe & Kikuchi, 1988; Bikas, Stavropoulos, & Chryssolouris, 2016; Gardan, 2016), and while more advanced fabrication techniques provide new opportunities for making such metamaterials, the optimization problem is still challenging (Alhammadi et al., 2020). Kollmann, Abueidda, Koric, Guleryuz, and Sobh (2020) developed a deep-learning model which can noniteratively optimize metamaterials by either maximizing the bulk modulus, maximizing the shear modulus, or minimizing the Poisson's ratio. Figure 17 summarizes the different stages of developing the CNN-based optimizer. The CNN model in this study was based on ResUNet (Zhang, Liu, & Wang, 2018) and made use of batch normalization, rectified linear units, and convolutional layers; by taking advantage of residual learning and U-Net, it improved segmentation accuracy by concatenating feature maps from different neural network levels. The proposed model takes three input images (filter radius r_min, volume fraction V_f, and numeric identifier ID) and produces an output image. The model was trained over 150 epochs and took 9.1 h to complete the task. As shown in Figure 18, the prediction result is very close to the ground-truth image, implying that with this CNN model, quality TO can be achieved almost instantly.
Figure 17. Flowchart showing the different steps used to develop a CNN-based optimizer, which is used to identify optimal material distributions of 2D materials and is trained using TO data (Kollmann et al., 2020).
Moreover, since this study was performed on a low-end computing platform, the consumption of computing power is low. Similar research has been conducted by Xue, Wallin, Menguc, Adriaenssens, and Chiaramonte (2020), who, in addition to simulation, carried out the same experiment using 3D printing to ensure that their research was valid. This TO framework is reliable when compared to experimental results and those generated by the deep-learning method.
Figure 18. Comparisons between optimized designs and ground-truth values for the case of minimizing the Poisson's ratio under specific conditions (Kollmann et al., 2020); the objective function and the volume of the optimized representative unit cells obtained from the CNN model are compared with those of the ground truth.
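The idea of feeding scalar design parameters to an image-to-image CNN can be sketched as follows: each scalar is broadcast to a constant-valued channel, and a convolution then maps the stacked channels to an output map. This is a hypothetical NumPy sketch; the channel size, weights, and 1×1 convolution are illustrative stand-ins for the full ResUNet, not the architecture of the cited study.

```python
import numpy as np

# Hypothetical encoding of the three scalar inputs (filter radius r_min,
# volume fraction V_f, and objective identifier ID) as constant-valued
# image channels -- the form an image-to-image CNN expects.
def encode_inputs(r_min, v_f, obj_id, size=64):
    return np.stack([
        np.full((size, size), r_min),   # channel 0: filter radius
        np.full((size, size), v_f),     # channel 1: volume fraction
        np.full((size, size), obj_id),  # channel 2: objective identifier
    ])

x = encode_inputs(r_min=2.5, v_f=0.4, obj_id=1.0)    # shape (3, 64, 64)

# A 1x1 convolution collapsing the three channels to a single output map:
# a minimal stand-in for the full encoder-decoder network.
w = np.array([0.2, 0.5, 0.3])                        # illustrative weights
density = np.tensordot(w, x, axes=1)                 # shape (64, 64)
```

In the real model, many such convolutional layers with learned weights turn this stacked input into a predicted density field; the sketch only shows the data layout.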
In the same way, Abueidda, Koric, and Sobh (2020) developed a CNN model to predict optimized designs for a given set of boundary conditions, loads, and optimization constraints. Material and geometric nonlinearities influence the optimal design, especially when applied loads are sufficiently large to trigger structural and/or material nonlinearities (Buhl, Pedersen, & Sigmund, 2000; Maute, Schwarz, & Ramm, 1998; Osanov & Guest, 2016). In addition, to further evaluate the proposed model, the case of a linear elastic material under stress constraint was considered. Each point of input data can be viewed as five channels (images), as shown in Figure 19, where V_f represents the volume fraction, u_x and u_y are displacements in the x and y directions, respectively, and P_x and P_y denote the components of the load vector P. Finally, the optimal design was validated in the ABAQUS environment by using the same parameters to generate the mesh and assign the boundary conditions and loads automatically. A supercomputer generated 18,000 data points, where high-throughput computing was applied to generate as many as 10 data points simultaneously, with an average rate of data generation of 0.31 min/data point.
Figure 19. Demonstration of the different channels (Abueidda et al., 2020): (1) u_x with a dimension of 33 × 33, (2) u_y with a dimension of 33 × 33, (3) P_x with a dimension of 33 × 33, (4) P_y with a dimension of 33 × 33, and (5) V_f with a dimension of 32 × 32. The u_x and u_y matrices have zero components everywhere except at the nodes on the left-hand side, where fixed boundary conditions are imposed and a value of 1 is assigned. The P_x and P_y matrices have zero values everywhere except at the node where the load P is applied. The output of each data point is composed of one channel, where the pixel (element) values are the densities obtained from the optimization framework, shown as the rightmost black V shape.
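The five-channel input encoding described above can be sketched as follows, assuming a 33 × 33 nodal grid with fixed left-edge nodes and a single loaded node; the load location and magnitude here are hypothetical, chosen only to show the sparse structure of the channels.

```python
import numpy as np

n = 33  # nodal grid, matching the 33 x 33 displacement channels

# u_x, u_y: 1 at the fixed left-edge nodes, 0 elsewhere.
u_x = np.zeros((n, n))
u_x[:, 0] = 1.0
u_y = u_x.copy()

# P_x, P_y: nonzero only at the loaded node (location/value hypothetical).
P_x = np.zeros((n, n))
P_y = np.zeros((n, n))
P_y[n // 2, n - 1] = -1.0   # unit downward load at the mid-right node

# V_f is element-based rather than node-based, hence 32 x 32.
V_f = np.full((n - 1, n - 1), 0.5)

# Stack the nodal channels as one multi-channel input sample.
nodal_channels = np.stack([u_x, u_y, P_x, P_y])      # shape (4, 33, 33)
```

Because the displacement and load channels are almost entirely zero, the CNN effectively learns where the supports and loads sit from a handful of nonzero pixels.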
The training process took 1.25 and 1.5 h for the linear and nonlinear cases, respectively. Once trained, the model yields the corresponding optimized design as soon as the material properties and desired optimization parameters are specified. Ground-truth designs were chosen at random and compared to their corresponding predictions for a qualitative evaluation of the model. Based on the findings, it was concluded that the developed models are stable and in good agreement with the outcomes of mathematically rigorous nonlinear TO frameworks. However, around 90 min was required to solve a single optimization task on a personal computer equipped with a Core i5 processor. Due to their promising mechanical properties, continuous fibre reinforced plastics (CoFRP) are attracting attention in industries that require materials for weight-sensitive applications. To achieve the optimal structural performance of CoFRP, the stacking sequences and fibre orientations of the materials must be carefully adjusted. Moreover, defect-free manufacture needs to be ensured, which is a challenging engineering task with potentially competing goals. In the textile draping stage of the manufacturing process, local defects may significantly reduce the load-bearing capacity of the materials, and the ill effects of manufacture will be reflected in structural simulations conducted via continuous virtual process chains (Kärger et al., 2015). In many cases, manufacturing issues arise as a result of the poor design of components rather than flawed process configuration (Dostaler, 2010). As a consequence, considering manufacturing during component design greatly contributes toward lean development and negates the need for costly redesign loops. Zimmerling, Dörr, Henning, and Kärger (2019) applied ML techniques at the component scale to optimize textile forming and forming-related composite design analysis.
This research investigated the possibility of applying ML techniques to assess the formability of textiles on variable geometries, obtaining physically accurate predictions while maintaining computational efficiency. Evaluation of the model function μ_ML was embedded in the workflow, as schematically illustrated in Figure 20. The algorithm accepts the 3D geometry of interest as an input and automatically generates geometric features (corners) via scanning. Eventually, design improvement is achieved through a set of parameters that can be visualized on a design map. Accordingly, designers can explore more design alternatives intuitively without the need for laborious and computation-intensive FE simulations. For example, to create a corner design map, 1116 function evaluations are required, whereas the ML model only requires a total of 81 simulations for training and validation. Thus, the simulation effort reduces to 81/1116 ≈ 7.3% of an entirely FE-based computation. Beyond the completed ML model, optimization of the manufacturing process parameters (structure) carried out using the deep-learning method can also produce a surrogate model for part of the FE simulations (Pfrommer et al., 2018), which greatly improves efficiency.
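The surrogate idea behind this effort reduction can be sketched generically: train a cheap model on a handful of expensive evaluations, then sweep the design space with the surrogate alone. In the sketch below a polynomial fit stands in for μ_ML, and the "simulation" is an arbitrary smooth function, not the draping model of the study.

```python
import numpy as np

# Stand-in for an expensive FE simulation (hypothetical response function).
def f_FE(radius):
    return 1.0 / (1.0 + radius) + 0.1 * radius

# Train a cheap surrogate on a small number of "simulations" ...
train_x = np.linspace(0.5, 5.0, 9)          # 9 expensive evaluations
coeffs = np.polyfit(train_x, f_FE(train_x), deg=4)
surrogate = np.poly1d(coeffs)

# ... then sweep the design space densely with the surrogate alone:
# 9 simulations replace 200 direct evaluations.
query_x = np.linspace(0.5, 5.0, 200)
max_err = np.max(np.abs(surrogate(query_x) - f_FE(query_x)))
```

The same economics drive the cited work: a fixed, modest training budget of simulations buys an arbitrarily dense exploration of the design map afterwards.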
In large-scale industrial equipment, such as launch vehicles, missiles, and aircraft, stiffened structures such as thin-walled load-bearing components are widely used (Hao, Wang, & Li, 2012; Zhang, Chen, Shi, Teng, & Zhou, 2016; Zhu, Zhang, & Xia, 2016). In particular, curvilinearly stiffened panels have garnered significant attention due to their excellent design flexibility (Hao et al., 2018; Vescovini, Oliveri, Pizzi, Dozio, & Weaver, 2020; Wang, Abdalla, Wang, & Su, 2019). Although curvilinearly stiffened panels have been proven to exhibit superior structural efficiency, the explosion of design variables compared to traditional stiffened structures makes their design a challenging task (Wang, Yeo, et al., 2020). Hao et al. (2021) proposed a novel structural layout optimization framework based on deep-learning models. The image-based structural layout, characterized by curvilinear stiffener paths, is used as the design variable, as opposed to traditional design patterns that use various parameter values as constraint conditions. CNNs are used to extract layout features from the curvilinear stiffeners and to construct a surrogate model between layout features and structural performance. Since the flexibility of the layout is extremely high, it is difficult to design a layout from scratch. An autoencoder (AE) model is adopted here, as it can effectively reduce dimensionality; it extracts features via a backpropagation algorithm that trains the network to reproduce its input at the output. Each RBF and Kriging remodelling takes only a few seconds during the optimization process, whereas each CNN retraining and corresponding sub-optimization takes around 5 min. Including the initial sampling process, the durations required to perform a complete CNN-based optimization and a traditional model-based optimization are 24.2 and 22.5 h, respectively.
The specific curvilinear stiffener layouts and corresponding buckling modes are presented in Figure 21; the framework performs admirably in structural layout design with a variable number of stiffeners, a task that cannot be addressed using traditional models. Although the accepted principle in the aerospace field dictates that a stiffened panel with maximum structural efficiency is characterized by the simultaneous occurrence of global and local buckling in the first-order buckling mode, the optimal layout identified in this study only involved global buckling. Moreover, the numerical examples used to evaluate this model are relatively simple, and failure analysis, manufacturability, and the manufacturing costs of the structure have not been considered. Therefore, the viability of this model merits further research.
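The dimensionality-reduction role of the AE can be illustrated with its linear special case, whose optimal solution is given by the principal components: layouts lying on a low-dimensional subspace compress to a short latent code and reconstruct with negligible loss. This NumPy sketch uses synthetic data, not the stiffener layouts of the cited study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "layout images" (flattened) that lie on a 3-dimensional
# subspace, so a 3-dimensional latent code suffices to describe them.
basis = rng.normal(size=(3, 256))        # 3 latent directions
codes = rng.normal(size=(100, 3))        # 100 sample layouts
layouts = codes @ basis                  # shape (100, 256)

# A linear autoencoder's optimum is given by the top principal directions:
# encode by projection, decode with the transpose.
_, _, vt = np.linalg.svd(layouts, full_matrices=False)
encoder = vt[:3].T                       # 256 -> 3
latent = layouts @ encoder               # compressed representation
reconstruction = latent @ encoder.T      # 3 -> 256

err = np.max(np.abs(reconstruction - layouts))  # ~0: lossless here
```

A nonlinear AE trained by backpropagation plays the same role for data that lie on a curved, rather than flat, low-dimensional manifold.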
Figure 21. Comparison between the optimal layouts and corresponding buckling modes (Hao et al., 2021).
Effusion cooling and transpiration cooling are efficient cooling technologies for hot-section components of missiles, space vehicles, rockets, and gas turbines (Song, Choi, & Scotti, 2006; Zhu, Jiang, Sun, & Xiong, 2013). Porous media are used as the core structures in these technologies, providing numerous micro avenues for the coolant to eject and form a uniform coolant film on the external surface. In contrast, traditional cooling structures are unable to adapt to nonuniform incoming temperature loads due to model and design limitations. Yang, Dai, et al. (2019) developed a conditional generative adversarial network (cGAN) that constructs a high-dimensional, nonlinear mapping between surface profile and temperature for a series of effusively cooled plates. After training, the discriminator in this model identifies the information of the image pairs at different levels, including magnitudes, gradients, and textures. The discriminator serves as a strong loss function for evaluating the regression accuracy, one much more sophisticated than conventional manually designed loss functions such as the root-mean-square error. The trained cGAN model, which converts over 100 surface profile images into temperature images in less than a second, was one of the components of this optimization workflow. The results, including geometry, local cooling effectiveness, and external surface temperature, are displayed in Figure 22. In general, in all cases, temperatures were well controlled within a range of 500-525 K for a large proportion of the target surface areas. This resulted in a significant reduction in the thermal stress of the cooled plates and demonstrated the success of the presented optimization workflow, in which the cGAN model is coupled to a genetic algorithm.
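The role of the discriminator as a learned loss can be sketched with the standard cGAN objectives: the discriminator scores (condition, image) pairs, and the generator is trained to raise that score on its outputs. The toy discriminator and the field values below are hypothetical; only the binary cross-entropy loss structure follows the standard cGAN formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy discriminator scoring a (profile, temperature) pair; a high score
# means "looks like a real simulation result". Weights are hypothetical.
def discriminator(profile, temperature):
    return sigmoid(np.mean(temperature - 2.0 * profile))

# Standard cGAN losses (binary cross-entropy) for one real and one
# generated pair conditioned on the same surface profile:
#   L_D = -log D(x, y_real) - log(1 - D(x, y_fake))
#   L_G = -log D(x, y_fake)
profile = np.full((8, 8), 1.0)
y_real = np.full((8, 8), 2.1)     # "simulated" temperature field
y_fake = np.full((8, 8), 1.5)     # generator output

d_real = discriminator(profile, y_real)
d_fake = discriminator(profile, y_fake)
loss_d = -np.log(d_real) - np.log(1.0 - d_fake)
loss_g = -np.log(d_fake)          # generator improves by raising d_fake
```

Because the discriminator is itself trained, it penalizes whatever distinguishes generated temperature maps from simulated ones, which is why it outperforms a fixed metric such as the root-mean-square error.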
Optical materials with special optical properties are widely used in a broad range of applications, from computer displays to solar energy utilization, which has led to massive datasets accumulated over years of extensive materials synthesis and optical characterization. Compared to single materials, the optical properties of composite metal oxides are gaining increasing attention in materials research because these properties can be unique and suited to a specific environment. First-principles calculations, such as density-functional theory, have been widely used to study the optical properties of composite materials (Khalid et al., 2019). Although first-principles calculations are powerful, their associated computational costs are prohibitive and greatly restrict the size of the materials design space and the number of materials considered. Over the past few years, ML has succeeded in predicting new features (Ward & Wolverton, 2017) that help to guide chemical synthesis, and it has led to the discovery of compounds with desirable properties (Lu et al., 2018; Popova, Isayev, & Tropsha, 2018). As materials that cannot easily be found have yet to be discovered, and as the scope of experience-based experimental exploration is narrow, new methods of materials discovery are still required (Zunger, 2018), which has boosted the development of approaches to inverse materials design (Sanchez-Lengeling & Aspuru-Guzik, 2018). Inverse design was first employed in the field of alloy design (Ikeda, 1997), using genetic algorithms and molecular dynamics simulations to optimize the composition of multicomponent alloys.
This method has since received widespread attention and is now widely used in the design of nanophotonics (Jiang & Fan, 2019; Liu, Tan, Khoram, & Yu, 2018; Peurifoy et al., 2018), surfaces (Aharoni, Xia, Zhang, Kamien, & Yang, 2018; Liu, Tan, et al., 2018), catalysts (Freeze, Kelly, & Batista, 2019), drugs, and materials. In this vein, Dong, Dan, Li, and Hu (2021) developed a transfer learning-based inverse optical materials design algorithm that suggests material compositions active in a desired light absorption spectrum. The final results demonstrated the universality of the model and proved the feasibility of using this inverse design method with the selected datasets. This inverse design model has allowed researchers to discover new materials with the desired optical properties by exploring the hidden relationships between the compositions of the materials and their optical absorption properties.
Electromechanical coupling is a more complicated feature of materials than purely mechanical properties, as it arises from the strains that occur when structures are subjected to mechanical stress and electric fields. The flexoelectric electromechanical energy conversion mechanism has been widely used to design energy harvesters that could replace traditional batteries as an environmentally friendly alternative (Deng, Kammoun, Erturk, & Sharma, 2014). Thoroughly understanding and evaluating the flexoelectric effect is very beneficial for the design of materials for target applications. Among all flexoelectric structures, flexo-nanostructures are complex, high-dimensional systems that depend on several uncertain random variables. Physical experiments are not only expensive and consume tremendous time and resources, but are also largely constrained by the experimental conditions and techniques. Hamdia et al. (2019) employed neural networks for the computational design of flexoelectric materials, with the aim of developing novel constitutive models and finding optimal topology patterns. The arbitrary geometry distribution of the flexoelectric and elastic blocks in this system is depicted in Figure 23. Material properties and electromechanical coupling effects can be determined by arranging the flexoelectric and elastic phases in nonuniform strain distributions. The final predictions based on these DNNs reveal high accuracy and performance, with results obtained in very little time. To solve a problem, the DNN requires at least a factor of 255-300 less time than the corresponding original isogeometric analysis model.
Figure 23. Two-phase material model composed of flexoelectric (green block) and elastic (blue block) constituents (Hamdia et al., 2019).
Furthermore, the method exhibits a strong ability to find high-efficiency, optimal design patterns, in terms of the energy conversion factor, even for composite proportions not seen in training, demonstrating the high generalization capability of the developed model. The mean absolute percentage errors for the training and testing sets are 5.1% and 5.6%, respectively.
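The quoted error metric, the mean absolute percentage error (MAPE), is computed as follows; the values here are purely illustrative, not the data of the cited study.

```python
import numpy as np

# Mean absolute percentage error, the metric quoted for the surrogate.
def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Illustrative values only -- not the data of the cited study.
y_true = np.array([1.00, 0.80, 1.20, 0.95])
y_pred = np.array([1.05, 0.76, 1.26, 0.95])
error = mape(y_true, y_pred)   # 3.75 (%)
```

Because each residual is normalized by the true value, MAPE is comparable across outputs of different magnitude, which suits surrogate models predicting quantities on varying scales.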
Several deep-learning models have been discussed in this article, along with their main features and applications. These models can be used in a variety of scenarios because their neural networks have a variety of structures and their neurons have a range of activation functions. Nevertheless, they all have inherent limitations in the mechanisms they use to process information. New models are currently being developed, and improved versions will be released in the future to address more complex issues; however, it is difficult for a non-expert in computing to develop such models or to comprehend all of the principles of deep learning. The research on composite materials science reviewed here shows that different models can be combined to take advantage of their respective merits. If the whole workflow of a task can be reasonably divided into several modules and the right model or algorithm developed for each, the potential of each model can be exploited to a great extent and their intrinsic limitations avoided. The pros and cons of each model are summarized in Table 1.

Concluding remarks and future directions
This review article highlights that deep learning has been used in studies to predict the properties of polymer materials, to develop new systems for data processing, and to optimize the performance of structural geometries, benefiting from its powerful feature extraction capability and reduced computational costs. However, the application of deep learning in heterogeneous composite materials science is still a work in progress. High-quality data are still in short supply, resulting in low calculation accuracy; more research is therefore needed to improve the application of deep learning in composite materials systems.
The construction of standard materials databases is extremely important for the future development of deep learning. As a data-driven method, the quantity and quality of data primarily determine the accuracy of ML. The scientific literature and technical records contain a wealth of materials data that can be used to train deep-learning models; however, these data need to be organized and classified. Text mining techniques could be used to quickly collect data scattered across conference proceedings, journal articles, and scientific magazines, greatly enriching the existing databases and improving the accuracy of calculations. Interpretability also remains a problem to be solved, as deep-learning algorithms are used as black boxes whose inner workings lie outside basic understanding. Models based on unknown physical principles may fail in unexpected ways, even if they produce acceptable results in the majority of cases. As a result, the widespread adoption of black-box deep-learning models is hampered by the current lack of trust in them. Understanding the principles and assumptions that underpin the models not only improves their abilities but also allows further development. Using virtual reality models in materials design to identify potential problems prior to the manufacture of the materials will make Industry 4.0 and the smart factory for composites a reality. Industry 4.0 is focused on data-driven manufacturing, where in the future billions of machines, systems, and sensors will communicate with each other and share information, with physical systems connected to digital twins and the industrial internet of things. This will not only enable companies to design and produce products with significantly greater efficiency, but will also give them greater flexibility when it comes to tailoring production to meet market requirements (Soutis, 2020).
In conclusion, although deep learning has the potential to be used in the design and evaluation of composites, the desired accuracy cannot yet be achieved due to a lack of materials data and difficult-to-follow processing principles. Although deep learning is continually changing and improving, with more reliable algorithms and materials models being introduced to account for fabrication-induced defects and the evolution of damage in materials under different loading conditions, more work is needed to improve the accuracy and efficiency of these processes. Overall, it is certain that deep learning will reduce the effort and cost of developing materials, reshaping many industries in a variety of ways in the years ahead.

Disclosure statement
No potential conflict of interest was reported by the authors.

Notes on contributor
Y. Wang is a PhD student at the graduate school of Tohoku University, studying materials science. Currently, she is conducting research on piezoelectric materials. As she has knowledge on both computers and materials science, she is particularly interested in interdisciplinary research.
Professor C. Soutis PhD (Cantab) holds an Emeritus Chair in Aerospace Engineering at the University of Manchester (UK). He is a Fellow of the Royal Academy of Engineering, distinguished for his major contributions to the science and technology of fibre composite materials based upon polymeric matrices. Professor Soutis has authored or co-authored more than 400 ISI-listed papers and successfully supervised over 40 PhD students.
D. Ando studied materials science at Tohoku University in Japan and received his PhD at the same university in 2011. His present main research topic is structural light metals with functionality, such as magnesium alloys with a shape memory effect and high-strength aluminium alloys for high-temperature use.
Y. Sutou studied materials science at Tohoku University in Japan and received his PhD at the same University in 2001. His present main research topic is functional materials, such as shape memory alloys and phase-change non-volatile memory materials.
F. Narita received his PhD degree from Tohoku University, Japan, in 1998. He is currently a Professor at Tohoku University. He is engaged in research to design and develop piezoelectric and magnetostrictive composite materials in energy harvesting and self-powered environmental monitoring.