Knowledge mapping of credit risk research: scientometrics analysis using CiteSpace

To clarify the development track of credit risk research and uncover hidden connections within the literature, this article uses the scientometric software CiteSpace to conduct a scientometric analysis (citation analysis, co-citation analysis and co-occurrence analysis) of 2,384 articles on credit risk from Web of Science (W.o.S.) published between 1998 and 2017. The results support the following conclusions: (1) credit risk research has become interdisciplinary, and the subjects involved include ‘Business Finance’, ‘Economics’, ‘Operations Research Management Science’ and ‘Mathematics Interdisciplinary Applications’; (2) the U.S., Europe and Asia make the majority of contributions, and there are numerous collaborations among countries; (3) the key researchers with influence and authority in this field are mainly Merton Robert Cox and Jarrow Robert Alan; (4) ‘Rollover risk’, ‘Arbitrage-free pricing’, ‘Default cycle’, ‘Credit risk evaluation’ and ‘Correlated default’ are the major research areas; (5) ‘Crisis’, ‘Contagion’, ‘Monetary policy’, ‘Counterparty risk’ and ‘Systemic risk’ have become the major current research hotspots. Finally, we hope this scientometric analysis can provide some inspiration for credit risk researchers.
ARTICLE HISTORY Received 18 March 2019 Accepted 5 July 2019


Introduction
Credit has the dual function of stabilising market order and increasing the capital utilisation rate in market economic activities. Good credit is a basic condition for the steady development of the market, but credit risk inevitably arises in the transaction process, especially in the era of cloud payment and virtual payment. Credit risk, also known as counterparty risk or performance risk, refers to the risk that a counterparty fails to perform on a debt that is due; it arises not only in loans but also in guarantee, acceptance and securities investment business. A small credit risk affects the operation and efficiency of the market, while a large one can trigger a credit crisis that affects social and economic development. Credit risk is arguably one of the most challenging problems the market faces, and the relevant research literature is growing rapidly. The objectives of this study include the following: (1) to understand the cooperation relationships of authors, institutions and countries in the field of credit risk research; (2) to identify the most cited and most noteworthy references, authors and journals; and (3) to clarify the structure of knowledge development and emerging trends in this field.
The remainder of this article is organised as follows. Section 2 describes data collection and the current status (number of publications, subject categories and research directions) of the credit risk research domain. A comprehensive analysis of the research results, which includes citation analysis (authors, institutions and countries), co-citation analysis (documents, authors and journals) and co-occurrence analysis (keywords), is then provided in Sections 3, 4 and 5 respectively. Finally, Section 6 summarises the key conclusions and proposes future expectations.

The current status of credit risk research
The objective of this study is to carry out a scientometric analysis of credit risk research. The relevant literature data (including titles, abstracts, subject words, keywords and references) are obtained through literature retrieval and serve as the database for knowledge mapping. It is important to choose a reputable and comprehensive bibliographic database that provides a wide range of high-quality articles as a reliable source. The data selected in this study are from the Science Citation Index Expanded (S.C.I.-E.) and the Social Sciences Citation Index (S.S.C.I.) in the W.o.S. core collection. This database is relatively comprehensive, and the exported data can be read by CiteSpace directly, because the CiteSpace data format requirements are based on W.o.S. text data and are updated as the data format in the I.S.I. database changes. CiteSpace can import data from the W.o.S. and PubMed databases directly for visual analysis. Meanwhile, it also provides nine data converters, namely W.o.S., C.N.K.I., C.S.S.C.I., arXiv, Derwent, N.S.F., S.C.O.P.U.S., S.D.S.S. and Project D.X., to convert data from other databases. Furthermore, it is essential to determine appropriate keywords for selecting articles from the bibliographic database. This process should focus on the representativeness and validity of the keywords.
The timespan of the credit risk literature studied is from 1998 to 2017. The main reason for choosing this period is that 20 years of literature data is representative, while the volume of data is not so large that it affects the efficiency of the software. In addition, since 2018 was not yet over at the time of retrieval, the cut-off is 2017 in order to maintain data integrity. The specific details are summarised as follows:
Topics = credit risk
Timespan = From 1998
Databases = SCI-E and SSCI
By filtering out some record types, such as program files and editorial materials, and limiting the results to reviews or articles to eliminate the 'noise' of the databases, a total of 2,386 related articles published from 1998 to 2017 with credit risk terms in the titles, index terms or abstracts were retrieved. The number of publications per year is shown in Figure 1. An overall upward trend can be observed: the number of publications on credit risk increases from 12 in 1998 to 224 in 2017, which indicates that credit risk research has gradually attracted the attention of scholars. According to the trend of the column chart, the timespan can be divided into three periods: 1. 1998-2003 (start-up period)
There are eight different document types among the documents on credit risk research, namely Articles, Proceedings papers, Editorial materials, Reviews, Meeting abstracts, Book chapters, Book reviews and Corrections. The detailed document types are shown in Figure 2.
Research on credit risk has become multidisciplinary according to the distribution of categories. Table 1 presents a broad summary of the top 10 categories in the credit risk research area. 'Business Finance' is clearly the most popular category with 1,167 publications, accounting for 48.910% of the total publications. 'Economics' is the second most popular category, with 1,124 publications, accounting for 47.108% of the total. These are followed by 'Operations Research Management Science' (280) and 'Mathematics Interdisciplinary Applications'. Comparing these categories with Table 2, we found that some categories correspond to research directions. Table 2 shows the top 10 research directions in credit risk research. Among them, 'Business Economics' is currently the main research direction, at 75.943%. Other research directions with a share of more than 10% are 'Mathematics' and 'Operations Research Management Science', while research directions such as 'Government Law', 'Physics' and 'Science Technology other Topics' have fewer publications.
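The percentage shares quoted above can be checked with simple arithmetic. The following minimal sketch assumes that the shares are computed against the 2,386 records retrieved before duplicate removal; this denominator is inferred from the reported figures rather than stated in the text, and the variable and function names are illustrative only.

```python
# Illustrative sketch: reproduce the category shares reported in the text.
# Assumption: shares are relative to the 2,386 records retrieved before
# duplicate removal (inferred from the figures, not stated by the source).
TOTAL_RECORDS = 2386

category_counts = {
    "Business Finance": 1167,
    "Economics": 1124,
}


def share_percent(count, total=TOTAL_RECORDS):
    """Return the share of `count` in `total` as a percentage, rounded to 3 decimals."""
    return round(100 * count / total, 3)


shares = {name: share_percent(n) for name, n in category_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.3f}%")
```

Because the two largest shares alone add up to roughly 96%, a single publication can evidently be assigned to more than one category, so the shares should not be expected to sum to 100%.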
Articles are the most direct embodiment of a subject's research direction. The top 20 most used articles on W.o.S. in credit risk research (1998-2017) are listed in Table 3, with the number of uses counted from 2013. These articles have been published in recent years, and six of them are from Expert Systems with Applications. Furthermore, these articles involve research directions such as Economics, Operational Research and Information Science, which correspond to Tables 1 and 2. Scientometric analysis (citation analysis, co-citation analysis and co-occurrence analysis) can demonstrate the macroscopic structure of scientific knowledge visually. The relationship between research frontiers and the knowledge base can be ascertained by analysing a series of visual maps (collaboration network, co-citation network and co-occurrence network) drawn by CiteSpace. It should be noted that the input data need to be processed by the 'remove duplicates' function of CiteSpace before running. After this step, a total of 2,384 records are retained. To illustrate this process logically, a workflow diagram is provided in Figure 3.
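The duplicate-removal step, which in this study reduces the 2,386 retrieved records to 2,384, can be pictured with a minimal sketch. This is not CiteSpace's own implementation: it simply assumes that each exported record carries a unique identifier, stored here under the hypothetical key "id", and keeps the first occurrence of each identifier.

```python
# Illustrative sketch of duplicate removal; this is NOT CiteSpace's code.
# Assumption: every record is a dict with a unique identifier under the
# hypothetical key "id".
def remove_duplicates(records):
    """Keep the first occurrence of each record identifier, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record["id"] not in seen:
            seen.add(record["id"])
            unique.append(record)
    return unique


# Hypothetical toy data: four exported records, one of which is repeated.
records = [
    {"id": "rec-001", "title": "Paper A"},
    {"id": "rec-002", "title": "Paper B"},
    {"id": "rec-001", "title": "Paper A"},
    {"id": "rec-003", "title": "Paper C"},
]
unique_records = remove_duplicates(records)
print(len(records), "->", len(unique_records))
```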

Citation analysis in credit risk research
Solving more complex scientific problems and stimulating innovative thinking require cooperation between countries, institutions and scholars across research fields. Based on the collected literature data, this study has drawn collaboration maps (author collaboration network, institution collaboration network, country collaboration network and geographical collaboration network), which help to identify the researchers, institutions and countries that deserve attention, as well as their social relations.

Author collaboration network analysis in credit risk research
The author collaboration network, shown in Figure 4, can be used to examine the cooperative strength and mutual relationships between different authors. There are 256 nodes and 160 links. Each node represents an author, the size of the node represents the number of times the author is cited, and the width of a link indicates the number of collaborative articles. The overall density of the network is only 0.0049, which signifies that the network is sparse. Table 4 summarises the top 20 authors based on cooperative frequency. Jeanblanc M ranks first with a frequency of 11 and first appeared in a cooperative relationship in 2004, followed by Lucas A, with a frequency of 10 and a first appearance in 2005. Their collaboration history can be clearly seen in Figure 5. The cooperative frequencies of the other scholars are all below 10, and they started working with other scholars later. This indicates that scholars conducting credit risk studies are dispersed and have weak academic links.
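The density figure can be reproduced from the node and link counts. The sketch below assumes the standard definition for an undirected network without self-links, density = 2E / (N(N - 1)), where N is the number of nodes and E the number of links; the text does not state which formula CiteSpace applies, but this definition matches the densities reported in this article for all five networks.

```python
def network_density(nodes, links):
    """Density of an undirected simple graph: realised links / possible links."""
    possible_links = nodes * (nodes - 1) / 2
    return links / possible_links


# (nodes, links) as reported in the text for each network.
networks = {
    "author collaboration": (256, 160),
    "institution collaboration": (335, 407),
    "country collaboration": (54, 262),
    "document co-citation": (648, 2293),
    "author co-citation": (356, 1629),
}

densities = {
    name: round(network_density(n, e), 4) for name, (n, e) in networks.items()
}
for name, density in densities.items():
    print(f"{name}: {density:.4f}")
```

A density of 0.0049 means that fewer than 0.5% of all possible author pairs have actually collaborated, which is what the description of a sparse network refers to.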

Institution collaboration network analysis in credit risk research
The institution collaboration network consists of 335 nodes and 407 links, with an overall density of 0.0073. As shown in Figure 6, each node represents a different institution. The size of a node represents the number of documents published by the institution: the larger the node, the more documents the institution has published. Among them, the node representing New York University is the largest. The links represent cooperation between institutions: the more links, the closer the cooperation. The links between the various institutions are complex and mainly yellow, indicating that many close relationships exist and that most were formed in recent years. Besides, the outermost purple circle of a node reflects the centrality of the institution, that is, the proportion of shortest paths between all other pairs of vertices in the network that pass through the node. The larger this ratio, the thicker the purple circle, indicating that the node has high centrality and the institution has a higher status in the field. Combined with Table 5, the node representing New York University has the thickest purple circle, with a centrality of 0.19. Other institutions with a purple circle include Columbia University (0.16), the National Bureau of Economic Research (0.12) and the European Central Bank (0.11), which means that these institutions have great influence and are worthy of attention. Furthermore, among the 20 mentioned institutions in

Country collaboration network analysis in credit risk research
The country collaboration network is made up of 54 nodes and 262 links. The overall density is 0.1831, and the nodes represent countries or regions (Figure 7). According to the direct connections between nodes, there was much cooperation among countries or regions in the late twentieth and early twenty-first centuries. The top 20 countries or territories, which account for the major part of the total output, are listed in Table 6. The U.S. is the largest contributor, publishing 723 articles, followed by the People's Republic of China (281), England (274) and Germany (244), while the other countries each published fewer than 200. In terms of centrality, the U.S. again ranks first (0.49), followed by England (0.37), indicating that their mediating role in credit risk research is more obvious and that they have played a vital role in establishing contacts with other countries. Interestingly, although Spain and the Netherlands rank tenth and twelfth respectively in publications, their centrality (0.12) is the same as that of Germany, which ranks fourth. By contrast, the centrality of the People's Republic of China, which ranks second in publications, is only 0.05, and that of the Taiwan region is 0.00. This means that Asia still lacks influence in this field.
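The centrality values reported for the institution and country networks are described as the share of shortest paths between other nodes that pass through a given node, which corresponds to betweenness centrality. The sketch below is an independent illustration of that measure using Brandes' algorithm on a hypothetical five-node network; it is not CiteSpace's implementation, and the node names are invented.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalised betweenness for an undirected, unweighted graph (Brandes)."""
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {v: [] for v in nodes}
        path_count = {v: 0 for v in nodes}
        distance = {v: -1 for v in nodes}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first visit
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v precedes w on a shortest path
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies in reverse breadth-first order.
        dependency = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    n = len(nodes)
    # Each unordered pair is visited from both endpoints, so dividing by
    # (n - 1)(n - 2) normalises the undirected score to the range [0, 1].
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in nodes}


# Hypothetical toy network: a triangle A-B-C with a chain C-D-E attached.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
centrality = betweenness_centrality(toy_network)
for node, value in centrality.items():
    print(node, round(value, 4))
```

In this toy network, nodes C and D receive non-zero centrality because every shortest path between the triangle and node E must pass through them, which mirrors the mediating role attributed to high-centrality countries and institutions.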

Geographical collaboration network analysis in credit risk research
A geographical collaboration map is produced through the interaction between CiteSpace and Google Maps. Through such networks, we can quickly understand the geographical distribution of a research field and its academic cooperation. The red dots represent the specific geographical distribution of credit risk research. The greater the density of red dots, the more research is happening there. The lines between the red dots represent cooperation between authors in the institutions of the corresponding countries, and the different colours correspond to the time of cooperation. It is obvious that the U.S. and European countries are major contributors to credit risk research and that the distribution of research in Europe is the most intensive (Figure 8). In addition, Europe, the U.S. and Asia have formed a triangular zone of cooperation. As far as Asia is concerned, China, South Korea and Japan have more research in the field of credit risk (b). These results are consistent with the conclusions drawn from Figures 6 and 7.
Figure 7. A visualisation of the country collaboration network.
ECONOMIC RESEARCH-EKONOMSKA ISTRAŽIVANJA

Co-citation analysis in credit risk research
Co-citation analysis can reveal the internal relations and regularities of scientific literature and describe the dynamic structure of scientific development. Two documents are co-cited when they are cited together by one or more later documents; this relationship can be used for research on document relations, literature retrieval and literature structure. The two documents in a co-citation relationship are always in a passive position: the relationship between them is established only when other documents cite both. Co-citation is therefore well suited to research objects that are constantly changing and developing, and credit risk research is such a dynamic field. This article conducts co-citation analysis of documents, authors and journals to investigate the development track of credit risk research. We expect this to clarify the knowledge base and identify what plays a leading role in this field.
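As a concrete illustration of the co-citation relationship, the following minimal sketch counts how often each pair of references appears together in the reference list of a citing paper. The data are hypothetical placeholders, and the function name is illustrative; CiteSpace builds its networks from W.o.S. records with additional thresholds that are not modelled here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited together."""
    counts = Counter()
    for references in reference_lists:
        # Sort and de-duplicate so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(references)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each inner list is the reference list of one citing paper.
citing_papers = [
    ["ref_a", "ref_b", "ref_c"],
    ["ref_a", "ref_c"],
    ["ref_a", "ref_b"],
]
counts = cocitation_counts(citing_papers)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Each pair count becomes the weight of a link in the co-citation network, so the counts grow only as new citing papers appear, which is the passive character of the relationship noted above.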

Document co-citation network analysis in credit risk research
In general, a traditional narrative literature review is qualitative and depends mainly on individual judgement and interpretation, whereas CiteSpace can visualise the relationships in the literature. The two methods complement each other and make the study more convincing. Document co-citation analysis can identify key or core literature in the field of credit risk research. Figure 9 shows an overview of the document co-citation network, which consists of 648 nodes and 2,293 links, with a density of 0.0109. The citation tree-rings of a node represent the citation history of the article. The colour of a ring corresponds to the time of citation, and its thickness is proportional to the number of citations in the corresponding time slice. Using multivariate statistical methods such as cluster analysis, the intricate co-citation relationships among many objects are simplified into a relatively small number of groups and represented visually. The results obtained after clustering the literature are displayed as a timeline fisheye diagram in Figure 10. The transition of the citation tree-rings from cool to warm colours intuitively conveys the continuous advancement of scientific knowledge, and yellow tree-rings mark the current research hotspots. The entire network is divided into 85 clusters; the main clusters are labelled with index terms from their own citers and are listed with '#' on the right side of Figure 10. It should be noted that CiteSpace uses title terms and a log-likelihood ratio (L.L.R.) weighting algorithm to label the clusters. L.L.R. is an algorithm for selecting the label of each cluster, presenting the core concept of the cluster in professional terms (Fang, Yin, & Wu, 2017). These primary clusters marked with '#' reflect the research frontiers of the discipline's development.
Table 7.
Size denotes the number of publications in the cluster. The largest cluster (#0) contains 96 member references, followed by cluster (#1), which has 82. Silhouette is an index that measures the homogeneity of a cluster: the higher the silhouette score, the more homogeneous the cluster. When the silhouette score is greater than 0.7, the clustering result is considered highly reliable; when it is greater than 0.5, the clustering result is reasonable. The silhouette scores of the 10 largest clusters listed in Table 7 are all above 0.7, implying that these clusters are efficient and convincing. Mean represents the average publication year of the documents in the cluster and can be used to judge whether the cluster is new or old (Yu & Xu, 2017). Cluster (#0) is the most recent, indicating that 'Rollover risk' is a current research hotspot. Table 8 shows the top 20 most cited references, each with more than 38 citations. The listed references are the documents most cited by the 2,384 documents retrieved in this article, not the most cited documents in W.o.S. or Google Scholar. These cited documents reflect the knowledge base, which is the citation trajectory of the research frontier in the literature. Among the 20 references listed, the article 'Corporate yield spreads: Default risk or liquidity? New evidence from the credit default swap market' in cluster (#0), by Longstaff et al. in the Journal of Finance, is the most cited, with 67 citations. The article used information from credit default swaps to measure the size of the default and non-default components in corporate spreads and found that most of the corporate spread is due to default risk. The authors also discussed the relationship between the non-default component and bond-specific illiquidity measures as well as macroeconomic measures of bond market liquidity. It is a core document in credit risk research.
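To make the silhouette index more tangible, the sketch below computes it for a hypothetical toy data set using the usual definition s = (b - a) / max(a, b), where a is a point's mean distance to the other members of its own cluster and b is its smallest mean distance to the members of any other cluster. The one-dimensional points, the absolute-difference distance and the function name are assumptions made for illustration; CiteSpace evaluates silhouette on co-citation clusters rather than on numeric points. The sketch also assumes at least two clusters.

```python
def silhouette_scores(points, labels):
    """Silhouette score of each point, using absolute difference as distance."""
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances to the other members of the point's own cluster.
        own = [
            abs(p - q)
            for j, (q, other_label) in enumerate(zip(points, labels))
            if other_label == label and j != i
        ]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels)
            if other != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator else 0.0)
    return scores


# Hypothetical toy data: two well-separated clusters on a line.
points = [1.0, 2.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
scores = silhouette_scores(points, labels)
mean_score = sum(scores) / len(scores)
print(round(mean_score, 4))
```

For these well-separated toy clusters the mean score is well above the 0.5 and 0.7 thresholds mentioned above; overlapping clusters would push it towards zero or below.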
Furthermore, we believe research frontiers represent the development of a research field. The articles that cite these frequently cited references constitute the research frontiers. For example, Table 9 lists the top 20 articles citing the previously mentioned article by Longstaff et al. These citing articles are generally recent. By studying them, the evolution of scientific knowledge in credit risk research and changes in the major research areas can be seen clearly. Other documents in Table 8 from cluster (#0), such as 'How sovereign is sovereign credit risk?', published in the American Economic Journal: Macroeconomics, were also by Longstaff et al., who have made a significant contribution to this field.

Author co-citation network analysis in credit risk research
By analysing author co-citation, we can identify the influential researchers in the credit risk field and provide a reference for relevant institutions seeking to recruit talent. The network, illustrated in Figure 11, contains 356 nodes and 1,629 links and has an overall density of 0.0258. It should be pointed out that only the first author was considered in the study. The largest node represents the cited author Merton RC, with a frequency of 666. The centrality of this node, which has a purple outer ring, is 0.12, indicating that Merton RC holds an important position in this field. Table 10 lists the top 20 cited authors, each with a citation frequency greater than 136. The node representing Altman EI has the highest centrality, 0.23, so it is a key node, because high centrality is a measure of the transformative potential associated with scientific contributions. Other cited authors such as Jarrow RA, Longstaff FA, Acharya W, Berger AN and Das SR, whose centrality values are all greater than 0.1, are also worthy of attention. The conclusion that Longstaff FA is a core author is consistent with the results of the document co-citation analysis, which verifies the accuracy of our research from a different angle.

Journal co-citation network analysis in credit risk research
The research literature on credit risk comes from various journals. Understanding the distribution of core journals in this field can help to provide a valid basis for literature collection. The main journal clusters in the credit risk research area are revealed in Figure 12. Combining Figure 12 with Table 11, among the top 10 most cited journals listed, the Journal of Finance is the most frequently cited, with 1,512 co-citations. In terms of centrality, the Journal of Finance also ranks first, with 0.16. It also has the highest impact factor (8.968) of the listed journals and so is a core journal in the field. According to these two indicators, it plays a major role in connecting with other journals. Other journals with high centrality are the Journal of Financial Economics and Econometrica, both with a centrality of 0.11; they are also of great value for credit risk researchers.
Figure 11. A visualisation of the author co-citation network.
With the increasing complexity of research issues, the intersection between disciplines is required. An overlay of the cited journals is presented in Figure 13. Each point in the figure represents a journal, and each cluster of colour corresponds to a subject. The left side of the map shows the subject distribution of the citing articles, while the right side shows that of the cited articles. The base map includes Mathematics, Medicine, Ecology, Molecular, Physics, Systems, Environment and so on. In addition, the positions of the ellipses represent the distribution of disciplines involved in this study. The horizontal axis of an ellipse reflects the number of authors, and the vertical axis reflects the number of articles. Two distinct wavy lines running from left to right can be found.
The red wavy line illustrates the interdisciplinary cross-reference between 'Mathematics, Systems, Mathematical' and 'Systems, Computing, Computer', while the blue wavy lines connect the citing and cited articles that represent 'Economic' and 'Political' in this study.
Figure 13. An overlay of the cited journal in credit risk research.
Table 12 reveals the top 10 most prolific journals in credit risk research during 1998-2017. The Journal of Banking and Finance published the most articles (244), accounting for 10.226% of the total, while each of the other journals published fewer than 100. However, the Journal of Finance, which has the highest centrality and citation count in Table 11, does not appear here.

Burst detections of keywords
Keywords distil the main content of an article and can fully reflect the author's academic ideas and viewpoints. Burst detection identifies keywords whose frequency of use rises sharply within a short period, reflecting the special attention they receive from scholars at a certain time. According to the changes in the frequency of burst keywords, the frontiers and trends in this research area can be judged. Table 13 lists the top 20 keywords with the strongest bursts. The last column of each row spans the entire study period (1998-2017), and the red line represents the duration of the keyword burst. In terms of burst strength, the top-ranked keyword is 'Corporate debt' (10.6585), followed by 'Crisis' (9.7225), 'Term structure' (9.6572), 'Interest rate' (8.7493), 'Basel ii' (8.5946), 'System' (7.8983), 'Value at risk' (7.8627), 'Contagion' (7.7693), 'Risk management' (7.7076) and 'Performance' (7.2428), all with burst strengths above 7. These burst keywords reflect the characteristics of a certain period. A detailed study shows that the hot topics in this field change over time. In the 1990s, 'Term structure' and 'Interest rate' were the mainstream topics of credit risk research. However, 'Crisis', 'Contagion', 'Performance', 'Economy', 'Firm', 'Monetary policy', 'Management', 'Financial crisis', 'Counterparty risk', 'Corporate governance' and 'Systemic risk' have become the research frontier in recent years.

Conclusion
This article has presented a scientometric review of 2,384 records on credit risk from W.o.S. using CiteSpace. Some useful conclusions have been obtained through citation analysis, co-citation analysis and co-occurrence analysis. The literature on credit risk has increased significantly, especially after 2008. However, the research has been scattered, and there have been few collaborations among scholars. The U.S. has been the main contributor, as many highly productive institutions, such as New York University, Columbia University and the National Bureau of Economic Research, are located there. Other areas with more research on credit risk are Europe and Asia. In addition, Merton Robert Cox and Jarrow Robert Alan are researchers worthy of attention in the field of credit risk. The Journal of Finance, the Journal of Financial Economics and Econometrica have played a vital role in establishing links with other journals. Besides, 'Rollover risk', 'Arbitrage-free pricing', 'Default cycle', 'Credit risk evaluation', 'Correlated default', 'Credit portfolio model', 'Interest rate risk', 'Systemic risk', 'Quantifying credit risk' and 'Asset price' have been the main research areas in the knowledge field of 'credit risk'. Further, 'Crisis', 'Contagion', 'Performance', 'Economy', 'Firm', 'Monetary policy', 'Management', 'Financial crisis', 'Counterparty risk', 'Corporate governance' and 'Systemic risk' have become research hotspots in recent years.
In short, scientometric analysis is of great significance for identifying potential relationships within the literature and investigating the knowledge evolution of credit risk research. However, it should be pointed out that this research still has some limitations. For example, the keywords used when searching for data are not comprehensive, which can lead to the omission of some data. To verify the accuracy of the research results and broaden the scope, further study could collect data with different related terms, conduct specific research on one or more core journals in this field using CiteSpace, or use other scientometric software tools for the analysis.

Disclosure statement
The author reports no conflicts of interest.