Do we need to pay technical debt in blockchain software systems?

In blockchain software systems, framework developers may introduce technical debt that application developers are not aware of. Because this technical debt can have a negative impact on software projects, it needs to be investigated. We set out to determine which types of self-admitted technical debt (SATD) exist in open-source blockchain software systems and how they are distributed. We selected six of the most popular blockchain software projects on GitHub, extracted their code comments, labelled the comments manually and analysed them statistically. We propose a new type of technical debt, resource debt, which is explicitly identified by the framework developers and requires special attention in subsequent production systems. Six types of technical debt are prevalent, and no algorithm debt was found. In addition, we find that whether a code comment contains technical debt is not entirely determined by task tags. SATD is prevalent in blockchain projects, and the distribution of the different types of technical debt varies considerably between projects. The results imply that SATD detection should rely on models that capture deeper semantics, such as pre-trained models.


Introduction
Decentralised cryptocurrencies have attracted the attention of industry and academia all over the world (Sai et al., 2021). As a result, blockchain technologies are being applied in a variety of fields, including games, P2P energy markets (Borges et al., 2022), distributed storage and blockchain-based anonymous authentication for traffic reporting (Zhang & Xu, 2022). A blockchain software system framework is a software product developed to provide basic blockchain services. Such frameworks offer a quick way to deploy blockchain services because they implement the basic components, including network discovery, data sending and receiving, and cryptographic libraries. There are currently thousands of open-source projects on GitHub that help application developers to quickly deploy their own business application systems (D. . Developers of blockchain frameworks are expected to deliver software frameworks that application developers can deploy directly. These frameworks should be fully tested so that they can be integrated into application systems as they are. However, because of constraints such as project deadlines or the budget of the open-source project, a released stable version may still contain code issues that have not been fully resolved. The technical debt metaphor, first proposed by Cunningham (1992), describes this situation. A previous study indicated that, although technical debt may have negative effects, its impact is not limited to defects; rather, it makes the system more difficult to change in the future (Wehaibi et al., 2016). However, the impact of technical debt on blockchain systems has not been investigated.
To help framework developers manage their projects and application developers use blockchain frameworks, we investigate technical debt in blockchain frameworks. Prior studies have shown that technical debt is an important factor affecting the quality of software projects (Potdar & Shihab, 2014). Blockchain framework developers have a clear understanding of the technical debt in their own project. However, once the framework has been downloaded, the application developer may not be able to track this technical debt promptly. We focus on the technical debt that programmers introduce on their own initiative and state explicitly in code comments. This kind of technical debt was first described by Potdar and Shihab (2014), who called it self-admitted technical debt (SATD). SATD refers to the situation where developers know that the current implementation is not optimal and write comments pointing out the inadequacy of the solution (Maldonado & Shihab, 2015b). Understanding the different types of SATD in blockchain software systems is important because (1) the limitations caused by technical debt can be understood more deeply through code comments, so that framework developers can better manage technical debt; (2) SATD can be better understood from the developer's perspective; and (3) the study of SATD in blockchain software systems is a useful complement to the study of technical debt.
Therefore, we examine and quantify the different types of self-admitted technical debt in blockchain software projects. We mine the SATD in six of the most popular open-source projects from Github. 144,815 comments are extracted from these projects. These code comments are manually labelled. For these projects, we study the following issues.
(1) What types of SATD exist in blockchain projects?
We take ten types of technical debt mentioned in the literature as the classification standard: requirement debt (Y. , design debt (Maldonado & Shihab, 2015b), architecture debt (Y. , algorithm debt (Liu et al., 2020), code debt (Cunningham, 1992), documentation debt (Y. , compatibility debt (Liu et al., 2020), defect debt (Y. , build debt (Y. , and test debt (Liu et al., 2020). Architecture debt, code debt and build debt are not used because their coverage overlaps with that of other technical debts, such as design debt. We therefore categorised code comments into the seven categories used by Liu et al. (2020): requirement debt, design debt, algorithm debt, documentation debt, compatibility debt, defect debt and test debt. Six of these types are found in the six blockchain projects. In addition, we find a new type of technical debt in blockchain projects, which we call resource debt.
(2) Is SATD prevalent in blockchain projects?
SATD is common in all six projects studied. In the bitcoin project in particular, there are a total of 1201 SATD instances.
(3) What is the distribution of different types of SATD in blockchain projects?
Across the six projects, defect debt and design debt account for the largest proportions of technical debt. In the bitcoin project, defect debt has the highest proportion, which should not be ignored.
(4) Can the four representative task tags (i.e. TODO, FIXME, XXX and HACK) be used to distinguish between SATD comments and WITHOUT_SATD comments in blockchain software projects?
In real project development, a project manager usually asks developers to mark technical debt explicitly with task tags so that it can be checked and refactored more conveniently from the task list later. Task tags can be used to detect SATD (Guo et al., 2021). However, in blockchain projects the presence of a task tag does not fully determine whether a code comment contains SATD.
To the best of our knowledge, the contributions of our study can be summarised as follows:
• We crawl six open-source blockchain software systems and manually label the data set. We share the rich dataset of self-admitted technical debt for blockchain software systems used in this study; it is publicly available on GitHub 1.
• We are the first to investigate different types of SATD in blockchain systems.
• We propose a new type of SATD, resource debt, for blockchain software systems.
The rest of this paper is organised as follows. Section 2 presents the related work, including the background of self-admitted technical debt and software engineering for blockchain-based software systems. Section 3 describes the motivation of this research. Section 4 reports our case study setup. Section 5 presents the results of our empirical study. Section 6 discusses these results. Section 7 analyses the potential threats to the validity of our empirical results. Section 8 concludes the paper and outlines future work.

Related work
In this section, we introduce two areas related to this research: research on self-admitted technical debt and research on blockchain projects in software engineering.

Self-admitted technical debt
Delivering high-quality and defect-free software products with limited resources (including time and cost) is an important goal of software engineering. However, because of unexpected factors such as changes in requirements and budgets, it is sometimes necessary to temporarily sacrifice software quality in order to meet delivery deadlines. This situation is called technical debt, a term first introduced by Cunningham (1992). Technical debt is common in software development and is inevitable. It should be made visible and explicit and be tracked using a wiki, backlog or task board (Lim et al., 2012).
Software developers sometimes submit low-quality or defective code because they misunderstand the requirements or are unfamiliar with new development techniques. This kind of technical debt is introduced by programmers unconsciously (Lim et al., 2012). Because it lacks a clear basis for traceability, it can only be located through reverse tracking and debugging after the system encounters an error. In other cases, after weighing the short-term benefits of software development against long-term maintenance costs, programmers consciously introduce temporary solutions to meet deployment needs. This kind of technical debt is called self-admitted technical debt (SATD) because it is introduced by programmers intentionally. Potdar and Shihab (2014) were the first to study SATD through code comments. Research on SATD makes it easier to identify potential defects in software projects, so that software assurance teams can arrange the corresponding tests. Current research on SATD mainly focuses on source code comments, in which developers explicitly mark SATD so that they can revise their temporary solutions in the future.
Research on SATD has attracted wide attention from academia. After Potdar and Shihab (2014) introduced the use of code comments to identify SATD and conducted an empirical study, Maldonado et al. classified SATD comments into five different types (da Silva . SATD detection is one of the hotspots of current research and is usually framed as a binary classification problem. Maldonado et al. use a maximum entropy classifier to classify SATD (da Silva . Huang et al. propose using the bag-of-words model to classify code comments (Huang et al., 2018). Ren et al. propose using a convolutional neural network to analyse code comments in depth, and achieve good classification results (Ren et al., 2019). Natural language processing techniques, text mining techniques and convolutional neural networks have thus all been used to classify SATD. These techniques target open-source Java projects. So far, no research has analysed blockchain software systems.
Another important research direction is empirical research on the characteristics of SATD in different projects, with the aim of managing SATD better and ensuring software quality. Researchers currently study these characteristics along several dimensions. First, they have examined SATD categories in different projects; classifying SATD helps to handle each kind appropriately and improves the efficiency of debt repayment (da Silva Y. Li et al., 2020; Liu et al., 2020; Maldonado & Shihab, 2015a). Second, they have studied the distribution of SATD, which helps to prioritise the handling of SATD when test resources are limited (Bavota & Russo, 2016; Maldonado & Shihab, 2015a; Mensah et al., 2016, 2018; Potdar & Shihab, 2014). Third, because SATD is intentionally introduced by developers, analysing its root causes can help developers manage SATD better and avoid introducing it (Bavota & Russo, 2016;. Research on the removal of SATD helps programmers to repay technical debt as quickly as possible (Iammarino et al., 2019; Maipradit et al., 2020; Zampetti et al., 2020).
Researchers also investigate the impact of different kinds of SATD in order to determine how harmful each kind is and to guide programmers to introduce SATD with care during software development (Bavota & Russo, 2016; Kamei et al., n.d.; Mensah et al., 2016; Potdar & Shihab, 2014; Sierra et al., 2019; Wehaibi et al., 2016). The existing SATD categories cover the types of technical debt found in common software projects, but the types of SATD specific to blockchain software systems have not been studied in depth.
The difference from the previous research is that our research focuses on self-admitted technical debt in open-source blockchain projects. The SATD in these projects has different characteristics and has a greater impact on the application system of the blockchain framework.

Software engineering for blockchain based software systems
Many scientific and practical areas have shown increasing interest in reaping the benefits of blockchain technology to empower software systems (Fahmideh et al., n.d.). Cryptocurrencies such as Bitcoin, with features such as decentralised, open transaction ledgers (Sarode et al., 2021), together with smart contracts, provide an excellent demonstration of the application of blockchain technology (Dai et al., 2021; Li et al., 2021). Blockchain-based software systems are growing rapidly: Chakroborty et al. (Bosu et al., 2019) reported that in March 2018 there were more than 3000 blockchain-based software systems hosted on GitHub, and that by October the number had more than doubled to 6800.
The boom in blockchain applications is accompanied by a number of malicious attacks on these software systems that must be taken seriously. Porru et al. (2017) suggest that successful attacks on these projects, and their incorrect execution, are caused by a disorganised and hasty software development process. This situation has still not fundamentally improved, as many blockchain framework projects are themselves still at an experimental stage.
To alleviate these issues, more systematic software engineering approaches, referred to as engineering methodologies, have been researched to ensure the quality of both development and maintenance of blockchain-based software systems (Boopathi et al., 2020; Panda & Nagwani, 2021; Porru et al., 2017; Rankovic et al., 2021). Based on 58 selected core studies, Fahmideh et al. organise the research on blockchain-based software systems into four aspects (approaches, processes, modelling and roles), which guide software developers, business managers and academic researchers in exploring the practical side and its implications (Fahmideh et al., n.d.). The development of blockchain-based software systems has four phases: system analysis, system design, system implementation and test, and system maintenance. In the test phase, conventional testing techniques are used to improve software quality. However, traditional software testing techniques are inadequate in practice because blockchain technology has its own characteristics. For example, smart contracts cannot be modified after they are deployed. Code review is one of the basic methods for identifying potential software bugs and fixing them quickly and efficiently.
Unlike previous research, we focus on self-admitted technical debt as a basis for allocating smart contract test resources, and we investigate blockchain-based software systems from this perspective. Blockchain framework developers may be well aware that a technical debt has been introduced in a smart contract, while application developers are not aware of its existence.

Motivation
Blockchain technology has achieved great success. A few years after the release of the Bitcoin white paper, the open-source Bitcoin software was implemented; this stage is referred to as blockchain V1.0. Smart properties and smart contracts were then introduced into blockchain technology. The representative cryptocurrency of this stage, referred to as blockchain V2.0, is Ethereum. Bitcoin and Ethereum (Wood, 2014) are the two most popular cryptocurrencies that use blockchain technology, and their market cap reached $62 billion by April 2018. Applications of blockchain technology in non-financial fields have also developed rapidly; this stage is referred to as blockchain V3.0. Its application areas include identity management, dispute resolution, contract management, supply chain management, insurance and healthcare, to name a few (Padma et al., 2021). Developers of blockchain frameworks are expected to deliver software frameworks that application developers can deploy directly. These frameworks should be fully tested so that they can be integrated into application systems as they are. Generally speaking, most software frameworks can be deployed and used directly, and programmers can quickly carry out secondary development on top of them. However, because of constraints such as project deadlines or the budget of the open-source project, a released stable version may still contain code issues that have not been fully resolved. It is therefore necessary to detect SATD promptly during the rapid development and evolution of blockchain projects, so that application programmers can identify potential software quality issues in time.

Case study setup
In this section, we mainly introduce the key steps of data set creation, including project selection, code comment extraction, and the process of manually labelling the data set. Figure 1 shows an overview of our case study setup, and the following subsections detail each step of it.

Project selection
We focus on open-source blockchain framework projects on GitHub. These projects cover multiple fields, from cryptocurrency to distributed storage and medical health. The selection follows the development sequence of blockchain technology and includes both early cryptocurrency projects and application frameworks that have been more active in recent years. To this end, we search GitHub for open-source projects with the keyword blockchain and manually select projects by reading their readme files. We first choose the two most popular cryptocurrency projects, Bitcoin 2 and Ethereum 3. We then select four other projects on GitHub because they have received a large number of "stars" from other developers, which we take to mean that they are widely used. These four projects are chia-blockchain 4, diem-libra-core 5, fabric 6 and solidity 7. Table 1 describes the blockchain projects used in our research, including the name of the project, the code version used, the total number of lines in the project, the programming language used and the number of GitHub stars. The bitcoin project is an experimental digital currency. The Ethereum project is the official Golang implementation of the Ethereum protocol; Ethereum is an open-source public blockchain platform with smart contract functionality. The diem project is a decentralised, programmable database. The solidity project is a statically typed, contract-oriented, high-level language. The fabric project is a platform for distributed ledger solutions. The chia project is a modern cryptocurrency. For the Bitcoin, solidity and chia projects, we use sloccount 8 to calculate the total source lines of code, following a previous study by Maldonado and Shihab (2015b). For the remaining three projects, we developed a Python-based program to count the total number of lines of code.

Comment extraction
We download the source code of the latest stable versions of the six projects from GitHub and extract the code comments from it. As shown in Table 1, the six projects use five different programming languages: C++, Rust, Python, Go and Solidity. For the projects written in C++, we use srcML 9 to parse the source files into XML files and then use a Python-based parser that we developed to extract the code comments from the XML files. For the projects written in Python, we use the tokenize module 10 in the Python standard library to extract code comments according to the "COMMENT" tag. For the projects written in Rust, Go and Solidity, we develop a Python-based program for each programming language. Both single-line comments and multi-line comments are extracted from the source code. In total, we extract 125,430 comments from the 6 projects.
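The Python case described above can be illustrated with a minimal sketch. It assumes only the standard-library `tokenize` module named in the text; the function name `extract_comments` and the sample source are hypothetical and are not taken from the study's tooling.

```python
import io
import tokenize


def extract_comments(source: str) -> list[tuple[int, str]]:
    """Return (line_number, comment_text) pairs for every '#' comment."""
    comments = []
    # generate_tokens expects a readline callable that yields str lines.
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            # Drop the leading '#' characters and surrounding whitespace.
            comments.append((tok.start[0], tok.string.lstrip("#").strip()))
    return comments


# Hypothetical input, used only to show the shape of the output.
example = (
    "# TODO: handle reorgs\n"
    "def height(chain):\n"
    "    return len(chain)  # FIXME: off by one for genesis\n"
)
for line_no, text in extract_comments(example):
    print(line_no, text)
```

Note that triple-quoted docstrings are reported by `tokenize` as STRING tokens rather than COMMENT tokens, so a sketch like this one does not pick them up; they would need separate handling.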

Filtering rules for code comments
The comments extracted from the software projects contain a lot of noisy data. da Silva  classified the noisy comments into five categories. We filter the following types of comments for blockchain-based software systems. Table 2 shows the filtering details of our research.
(1) License comments. This type of comment is usually added at the head of a source file to describe the licensing information of the source code. The filtering keywords we use include "mit software license", "spdx-license-identifier" etc.
(2) Automatically generated comments. These comments are generated by the development environment and have no real meaning.
(3) Document comments. This type of comment is merely a functional description of the source code and has no practical meaning for SATD. For example, comments in the Python projects that start with a set of triple quotes may be documentation comments. The filtering keywords we use include "/usr/bin/env", "spdx-license-identifier" etc.
(4) Commented source comments. A commented-out code segment is code that is no longer used and has nothing to do with SATD. Such segments can only be filtered manually.
(5) Multi-line comments. Most programming languages support multi-line comments, and the lines often need to be combined to represent the developer's ultimate intention. In our empirical study, we combine multiple comment lines into one code comment using a heuristic strategy for each programming language. For example, in a Python project, the comments extracted by the tokenize module may be single-line comments, so we combine adjacent single-line comments into one multi-line comment. In projects using other programming languages, such as Go, Rust and Solidity, we merge multi-line comments for two different situations using a heuristic strategy.
(6) Pure number comments. In addition to the common comment types above, we also filter out meaningless identifiers in the comments. Comments that contain only numbers are typically used to number a list or to highlight something and carry no meaning of their own, so they are filtered out.
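A few of the rules above lend themselves to a short sketch. The code below is an illustration under stated assumptions, not the filtering program used in the study: the two license keywords are the ones quoted in the text, while the function names, the number pattern and the "consecutive line numbers" adjacency rule are hypothetical choices.

```python
import re

# The two license keywords quoted in the text; the full list is not given.
LICENSE_KEYWORDS = ("mit software license", "spdx-license-identifier")


def is_license_comment(text: str) -> bool:
    lowered = text.lower()
    return any(keyword in lowered for keyword in LICENSE_KEYWORDS)


def is_pure_number(text: str) -> bool:
    """True if the comment holds only digits, whitespace and list punctuation."""
    has_digit = any(ch.isdigit() for ch in text)
    return has_digit and re.fullmatch(r"[\d\s.,:()\-]+", text) is not None


def merge_adjacent(comments):
    """Merge single-line comments found on consecutive lines.

    `comments` is a list of (line_number, text) pairs sorted by line number.
    Returns (first_line_number, merged_text) pairs.
    """
    merged = []  # entries are (first_line, last_line, text)
    for line_no, text in comments:
        if merged and line_no == merged[-1][1] + 1:
            first, _, previous = merged[-1]
            merged[-1] = (first, line_no, previous + " " + text)
        else:
            merged.append((line_no, line_no, text))
    return [(first, text) for first, _, text in merged]


def filter_comments(comments):
    """Merge adjacent lines first, then drop license and pure-number comments."""
    return [
        (line_no, text)
        for line_no, text in merge_adjacent(comments)
        if not (is_license_comment(text) or is_pure_number(text))
    ]


raw = [
    (1, "SPDX-License-Identifier: MIT"),
    (3, "todo: refactor so that the fee delta"),
    (4, "is calculated before inserting"),
    (9, "1."),
    (12, "plain note"),
]
print(filter_comments(raw))
```

Merging before filtering means a license header spread over several lines is removed as a whole. Commented-out code and automatically generated comments are left out of the sketch because the text gives no keyword rule for them.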

Manual classification
After filtering out unrelated source comments, we manually labelled all the instances. To reduce the subjective bias of different annotators and the difficulty of the annotation work, we first divided all samples into three categories: WITHOUT_SATD, SATD and resource debt. Comments labelled WITHOUT_SATD or SATD have more obvious characteristics, so they can be annotated quickly. Comments involving network resources and the like were labelled as resource debt by a second experienced programmer. Following the annotation process of Maldonado and Shihab (2015b), all labelled data were then sampled by stratified random sampling, with 20% of the data sampled from each project. We invited a third participant with extensive programming experience to label the sampled data. The annotations produced by the different annotators were analysed statistically, and the results showed a high level of agreement.
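The sampling and agreement check can be sketched as follows. Two points are assumptions rather than facts from the study: the text does not name the agreement statistic, so Cohen's kappa is used here only as a common choice for two annotators, and the record layout (dictionaries with a `project` key) and function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, fraction=0.2, seed=0):
    """Draw `fraction` of the records from each project (one stratum per project)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[record["project"]].append(record)
    sample = []
    for project in sorted(strata):
        group = strata[project]
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: ten labelled comments in each of two projects.
records = [{"project": p, "id": i} for p in ("bitcoin", "diem") for i in range(10)]
sample = stratified_sample(records)
first = ["SATD", "SATD", "WITHOUT_SATD", "WITHOUT_SATD"]
second = ["SATD", "WITHOUT_SATD", "WITHOUT_SATD", "WITHOUT_SATD"]
print(len(sample), cohen_kappa(first, second))
```

Stratifying by project keeps the 20% proportion fixed within every project, so a large project such as bitcoin cannot dominate the re-labelled subset.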

Empirical study
In this section, we describe the results of our empirical study in detail. We first investigate what types of technical debt are present. Then we investigate whether SATD is prevalent and what proportion each type of technical debt accounts for. Finally, we show the relationship between the four representative task tags and comments with SATD in blockchain software projects.

RQ1: what types of SATD exist in blockchain projects?
Motivation. We want to check how many different types of SATD exist. Because different types of code comments have different impacts, and different code comments have different management priorities, the software quality assurance team needs to consider different strategies for handling them. Our classification is based on the previous research (Liu et al., 2020;Maldonado & Shihab, 2015b).
Approach. According to Maldonado and Shihab (2015b), there are five different types of SATD: design debt, defect debt, requirement debt, test debt and documentation debt. By investigating deep learning frameworks, Liu et al. (2020) add two further types, algorithm debt and compatibility debt. When we manually labelled the source code comments, as described in Section 4.4, we followed these classification criteria. However, we also found comments that did not fit these types during the labelling process, so we propose a new type of technical debt for blockchain software systems. To illustrate our findings clearly, we describe them type by type and use specific examples to show the basis of the classification and the possible impact of each kind of technical debt.
To explain the basis for our classification of the different code comments more clearly, we describe the labelling process. Since Maldonado et al. shared their data set, a variety of classification techniques have been used to detect SATD, including a pattern matching-based approach (Potdar & Shihab, 2014), a natural language processing-based approach (da Silva , a text mining-based approach (Huang et al., 2018), a convolutional neural network-based approach (Ren et al., 2019), a two-step approach (Yu et al., 2020) and a simple heuristic approach that matches task annotation tags (Guo et al., 2021). These approaches show that the classification of code comments is usually reduced to a natural language processing or text mining problem, and it is generally accepted that SATD detection requires complicated tools such as machine learning. However, Guo et al. show that task tags have a significant impact on the labelling of code comments and that simple but effective pattern matching can achieve excellent classification results (Guo et al., 2021). We believe that this finding is closely related to the dataset used by the researchers. As shown in Table 3, the projects collected by Maldonado et al. and Guo et al. are developed in the Java programming language, for which Eclipse provides a friendly programming interface. For the open-source blockchain software systems we study, however, the programming environments used by the developers do not provide easy code annotation shortcuts like those of Eclipse, so we cannot rely on task tags alone to complete the annotation. This brings two problems: first, we need to review in detail all code comments that may be technical debt; second, we need to make special annotations for code comments with multiple meanings.
Therefore, our annotation process divides the data into samples with definite SATD, samples whose SATD status cannot be determined from the comment alone (unchecked SATD), and samples WITHOUT_SATD. To verify the reliability of our annotation, we validated some of the samples by reviewing the source code or the corresponding pull request record on GitHub.
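The limits of relying on task tags alone can be seen with a minimal sketch of tag matching. The four tags are the ones named in this paper (TODO, FIXME, XXX and HACK); the regular expression, the function names and the tiny labelled list are hypothetical, and the labels on the sample comments are illustrative only.

```python
import re
from collections import Counter

TASK_TAGS = ("TODO", "FIXME", "XXX", "HACK")
# Whole-word, case-insensitive match, since the extracted comments are lower-cased.
_TAG_PATTERN = re.compile(r"\b(" + "|".join(TASK_TAGS) + r")\b", re.IGNORECASE)


def find_task_tags(comment: str) -> list[str]:
    """Return the task tags found in a comment, in order of appearance."""
    return [match.group(1).upper() for match in _TAG_PATTERN.finditer(comment)]


def tag_label_table(labelled):
    """Count comments by (has_task_tag, manual_label)."""
    table = Counter()
    for comment, label in labelled:
        table[(bool(find_task_tags(comment)), label)] += 1
    return table


labelled = [
    ("todo: refactor so that the fee delta is calculated before inserting", "SATD"),
    ("this could be done more elegantly, but for now this will do", "SATD"),
    ("returns the block height", "WITHOUT_SATD"),
]
print(tag_label_table(labelled))
```

The second comment carries no task tag although it admits a temporary solution, which is the kind of case that tag matching misses and that manual review has to catch.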
Result. Next we will describe the different technical debts that we categorise in blockchain software systems. We first classify code comments according to the classification criteria of Liu et al. (2020); then we investigate specific code comments in blockchain software systems and we propose a new non-negligible technical debt.
(1) Design Debt indicates a sub-optimal design, including temporary solutions, functions that need to be extended to support more functionality, etc.
In blockchain software systems, we found that time or development cost constraints lead to solutions that meet the current needs of the project but are only temporary. As the system is improved, these sub-optimal code segments should ideally be refined. For example,
this could be done more elegantly, but for now this will do. (From diem 11)
According to the source code comments, a lot of code needs to be refactored, including method implementations, function signatures and function calls. For example,
todo: refactor so that the fee delta is calculated before inserting. (From bitcoin 12)
We found that design debt related to resource allocation needs special attention. The proper operation of a blockchain software system may require front-end resource allocation, such as the allocation of memory resources and, in particular, the synchronisation of block files. For example,
todo: consider checking if the file was just written to (which would mean that the file is still being copied). a segfault might happen in this edge case. (From chia 13)
(2) Defect Debt indicates the presence of an obvious error in a software code segment. Unlike design debt, which indicates that the code is a sub-optimal solution, defect debt indicates that there is a problem in the code. More software assurance resources should be allocated to defect debt. In blockchain software projects, defect debt is identified in two ways: either the code comment explicitly indicates that there are errors in the code that need attention, or it explicitly describes the errors and what caused them.
Programmers introduce defect debt for two reasons: developers find erroneous code segments during collaborative development and comment on them, or developers adopt a compromise in the business logic that is clearly problematic and requires a later upgrade. For example, these code comments clearly indicate that there is an error in the code segment. From the previous code comments, we can see that programmers sometimes describe the problems they have found in the form of a question, in order to prompt subsequent development to focus on the identified issues.
Another type of defect debt explicitly states why the defect debt occurred and that the errors must be removed in subsequent development. The programmer may also describe in detail how to remove these errors. For example,
we include 'kwargs' as a hack for the wallet, which for some reason allows parameters to '_start '. this is serious brain damage, and should be fixed at some point. todo: move those parameters to '__init__ '. (From chia 16)
(3) Documentation Debt indicates the lack of documentation describing the software development process. When software development progresses rapidly, it needs to be supplemented with appropriate documentation, such as testing documentation. In our empirical study of six blockchain software projects, we found little documentation debt. This does not necessarily mean that there is no documentation debt; the programmers may simply not have marked it explicitly. For example,
write test summary. (From diem 17)
(4) Requirement Debt indicates incomplete code for methods, algorithms, etc. The code meets the current software requirements, while the long-term requirements are postponed. Requirement debt must be repaid, or it will affect the robustness of the software functionality. For example,
todo: remove this requirement by making cuckoocache not require external locks. (From bitcoin 18)
(5) Test Debt indicates incomplete testing of the source code. The code may already have been implemented to meet the current requirements, or it may need to be tested again after a refactoring is complete. For example,
todo (?) -test that if we remove several head-files, as well as data last data-file, the index is truncated accordingly right now, the freezer would fail on these conditions:1. have data files d0, d1, d2, d3 2. remove d2,d3 however, all 'normal' failure modes arising due to failing to sync() or save a file should be handled already, and the case described above can only (?) happen if an external process user deletes files from the filesystem.
(6) Compatibility Debt. In the above examples, the programmer explicitly pointed out the existence of compatibility debt in the current code. In particular, as shown in the following example, an error occurs when calling an earlier version of the API. Such errors are a potential threat to the stability of software projects. If the error is not handled in a timely manner, it can have a catastrophic impact on the production system.
previous versions of solidity turned this into a parser error (they wrongly recognized these functions as state variables of function type). (From solidity 22)
(7) Algorithm Debt indicates a sub-optimal algorithm implementation or an algorithm that is yet to be implemented.
We did not explicitly find algorithm debt in the software projects we investigated. Some technical debts may be related to the implementation of an algorithm, for example, a specific function to be implemented in the future. We categorised such technical debt as design debt or defect debt.
(8) Resource Debt indicates a new type of technical debt that cannot be identified from the code comment itself. This kind of technical debt is much better hidden than SATD. In the process of code annotation, we found that in some projects (for example, diem), the existence of technical debt could not be fully determined from the context of the code comment. For example, reqwest error's builder is private to the crate, so send out a fake request that should fail to get an error. (From diem 23) This code comment does not contain a task tag word, and we cannot fully confirm whether there is SATD in the code comment, so we need to check the corresponding source code. The related source code is:
let test_client = ClientBuilder::new().timeout(Duration::from_millis(1)).build().unwrap();
let req_err = test_client.get(Url::parse("http://192.108.0.1").unwrap()).send().unwrap_err();
From the source code we can see that this is test code. Under certain test conditions, it should return an error. Whether or not this error will occur in the application depends on the parameters passed. Therefore we believe that there is a technical debt that needs to be paid in production application software systems.
The common feature of such technical debts, which cannot be determined from the context of the code comments, is that specific preconditions must be satisfied before the code that follows can execute. For example, "spawn should succeed" holds only under the precondition "executor has a free slot". This is closely related to the operating environment of blockchain software systems. A blockchain software system is a distributed software system, and its resource allocation requires the cooperation of multiple nodes.
Another situation is that we cannot determine whether a comment is SATD from the code comment alone; however, we can identify the pending functionality of the source code segment to which the comment refers. For example, "by now, we waited a total of 5 seconds. off-by-two for two reasons: * the internal precision is one second * account for network delay" (from bitcoin 24). The related source code is: "with self.nodes[0].assert_debug_log(expected_msgs = expected_timeout_logs): sleep(3)". In the original source code, a hard-coded wait totalling five seconds was used to confirm the closure of other connected nodes. Due to complex network conditions, there may be cases where the connected node fails to close in that time. In this case, application developers need to perform specific software tests. This issue has been reported and the fix has been merged on GitHub. 25 In the fixed source code, the node is disconnected more gracefully by waiting until the connection is closed.
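The difference between a hard-coded wait and waiting for the closure itself can be illustrated with a minimal Python sketch. It is not the code of the bitcoin project: the helper wait_until and the class FakePeer are hypothetical names introduced here, and the timeout values are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 5.0,
               poll_interval: float = 0.05) -> None:
    """Poll `predicate` until it is true, or raise once `timeout` seconds pass.

    Hypothetical helper for illustration; it replaces a fixed `sleep(n)`
    with a wait that ends as soon as the awaited condition holds.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePeer:
    """Hypothetical stand-in for a connected node that closes after a delay."""

    def __init__(self, closes_after: float) -> None:
        self._closes_at = time.monotonic() + closes_after

    def is_closed(self) -> bool:
        return time.monotonic() >= self._closes_at


if __name__ == "__main__":
    peer = FakePeer(closes_after=0.2)
    # Returns shortly after the peer closes instead of always sleeping a fixed time.
    wait_until(peer.is_closed, timeout=2.0)
    print("peer closed:", peer.is_closed())
```

With this pattern a slow network makes the test wait longer up to the timeout rather than fail silently, and a peer that never closes produces an explicit TimeoutError.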
The third case is when the framework developer actively throws a runtime exception, but the exception is not handled in time because of insufficient stress testing during use. For example, "Raise an AssertionError with msg modified to identify this node" (from bitcoin 26). The related source code is: "raise AssertionError(self._node_msg(msg))". This issue has been reported and the fix has been merged on GitHub. 27 Another similar example is the mempool manager in the chia project: "all coins spent in all conflicting items must also be spent in the new item" (from chia 28). This very strong condition needed to be removed in subsequent development for the exact replacement set. This issue has been reported on GitHub. 29

RQ2: is SATD prevalent in blockchain projects?
Motivation. To the best of our knowledge, this is the first empirical study of self-admitted technical debt in blockchain software systems. To show the existence of technical debt in blockchain software systems more clearly, we statistically analyse technical debt from several perspectives.
Approach. To present detailed information on technical debt, we counted technical debt in the different projects according to the result of Section 4. After the source code comments had been extracted and cleaned up, we also counted technical debt from a second perspective: at the granularity of files, analysing the percentage of files that contain technical debt in each project. The file-level description is used to show the approximate distribution of technical debt, because these six open-source blockchain software projects use multiple programming languages, including C++, Python, Go, Rust, Solidity, etc. We then sum the code comments labelled as SATD and those labelled as resource debt and report the total as comments with technical debt, shown in Table 4 as "#TD".
Result. Table 4 shows the prevalence of technical debt in blockchain software systems. For every project, the total number of comments/files, the number of comments/files with technical debt, and their corresponding proportion are presented. In these six software projects, the solidity project has the highest percentage of technical debt at 28.42%, while the ethereum project has the lowest percentage at 1.71%. Moreover, at the granularity of files we see that the number of files containing SATD varies from project to project. In particular, the highest percentage of files containing SATD is found in the chia project with 21.84%, while the lowest percentage of files containing SATD is found in the solidity project with 5.62%.
In summary, we find that the amount of technical debt varies widely across blockchain software projects, with the percentage of comments with technical debt ranging from 1.71% to 28.42%; at the file level, the percentage of files with technical debt ranges from 5.62% to 21.84%. The results of this RQ reveal that SATD is common in blockchain software systems, which indicates that there is clear technical debt to be repaid in these popular open-source projects. These technical debts were introduced deliberately by programmers during the development process, and the programmers are able to deal with them consciously in the future. But application developers using these open-source frameworks have no way to keep abreast of these potential technical risks. Especially since blockchain systems are still in a phase of rapid development, the existence of these technical debts cannot be ignored.
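The two levels of counting used in this RQ can be sketched as follows. This is a minimal Python illustration and not the scripts used in the study: the record type Comment, the label names "SATD", "RESOURCE_DEBT" and "WITHOUT_SATD", and the toy data are assumptions made for the example, and only files with at least one extracted comment are counted.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Comment(NamedTuple):
    project: str
    file: str
    label: str  # assumed label names: "SATD", "RESOURCE_DEBT", "WITHOUT_SATD"


# Comments counted as technical debt ("#TD"): SATD plus resource debt.
TD_LABELS = {"SATD", "RESOURCE_DEBT"}


def prevalence(comments: Iterable[Comment]) -> Dict[str, Dict[str, float]]:
    """Return, per project, the percentage of comments and of files with TD."""
    total_comments = defaultdict(int)
    td_comments = defaultdict(int)
    all_files = defaultdict(set)
    td_files = defaultdict(set)
    for c in comments:
        total_comments[c.project] += 1
        all_files[c.project].add(c.file)
        if c.label in TD_LABELS:
            td_comments[c.project] += 1
            td_files[c.project].add(c.file)
    return {
        project: {
            "comment_pct": 100.0 * td_comments[project] / total_comments[project],
            "file_pct": 100.0 * len(td_files[project]) / len(all_files[project]),
        }
        for project in total_comments
    }


if __name__ == "__main__":
    toy = [  # made-up records, not data from the study
        Comment("demo", "a.py", "SATD"),
        Comment("demo", "a.py", "WITHOUT_SATD"),
        Comment("demo", "b.py", "WITHOUT_SATD"),
        Comment("demo", "b.py", "WITHOUT_SATD"),
    ]
    print(prevalence(toy))
```

The sketch makes explicit why the two percentages can rank projects differently: many debt comments concentrated in a few files give a high comment-level figure but a low file-level figure.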

RQ3: what is the distribution of different types of SATD in blockchain projects?
Motivation. From RQ1 and RQ2, we have conducted a qualitative analysis of technical debt in blockchain software systems. However, we need to continue with a quantitative study of the distribution of different technical debts. This quantitative study will help application developers to properly schedule software assurance resources and deal with the impact of different technical debts.
Approach. We count the number of occurrences of each type of technical debt in each of the six open-source blockchain software projects. The statistics are presented in Table 5. Based on RQ1, for every project we sum the code comments with SATD, which are shown in the second column of Table 5 as "#TD".
Result. From the results of the statistical analysis, we can see that design debt and defect debt together account for the highest percentage in all software projects. We find that the new technical debt, resource debt, occupies a non-negligible proportion in every project. This may be related to the fact that synchronous allocation of computing resources and network latency have a more direct impact on blockchain software systems. This is a significant difference from the study by Liu et al. (2020). Similar to previous studies, defect debt and design debt carry the most weight in each software project. The proportions of requirement debt, test debt, and compatibility debt are not very high in any software project. In these six software projects, there is almost no documentation debt or algorithm debt.
Therefore, for these six blockchain software projects, design debt and defect debt should receive the most attention in the allocation of resources. Although the negative effect of resource debt may be weaker than that of design debt and defect debt, it also needs attention in technical debt management because of its high proportion in source comments. Finally, we found that there are compatibility debts in every project. These may have a greater negative impact on the production system as the software projects continue to evolve, so compatibility debt also deserves more attention.

RQ4: can the four representative task tags (i.e. "TODO", "FIXME", "XXX" and "HACK") be used to distinguish between SATD comments and WITHOUT_SATD comments in blockchain software projects?
Motivation. According to Guo et al.'s (2021) research, the four representative task tags (i.e. "TODO", "FIXME", "XXX" and "HACK") can be used to distinguish between SATD comments and WITHOUT_SATD comments for open-source software projects written in Java. This is a very important conclusion based on data observation, and it guided Guo et al. to propose a fuzzy matching-based SATD detection method. Is this method effective for blockchain software projects based on four programming languages? If the method is valid, then we can directly adopt that fuzzy matching method for SATD detection. Otherwise we have to design an SATD detection classifier for blockchain software projects.
Approach. All instances are classified as "WITHOUT_SATD" or "SATD". In particular, resource debt is classified as "SATD", because resource debt indicates a new type of technical debt that needs to be paid when running the corresponding code: software failures may occur when the resources needed by the software system cannot be allocated. We then count the instances of every category in each of the six open-source blockchain software projects. We would like to show the relationship between task tags and SATD categorisation in different software projects through statistical data.
Result. Table 6 reports the detailed number of sampled comments that contain each representative task tag on each project. "All Tags" denotes the number of comments that contain any one of the four task tags. "#Count" denotes the number of comments of a specific type, including SATD, resource debt or WITHOUT_SATD. "#Percent" indicates the proportion of comments containing tags among all comments.
Overall, the presence of task tags does not completely determine whether a code comment contains SATD. Among the six projects, the highest percentage of code comments with SATD containing the four tags is 50.74%, for the fabric project. As shown in Table 1, its programming language is Go. Another project mainly using Go is Ethereum, where the percentage of code comments with SATD containing the four tags is 18.64%. Therefore, we cannot conclude that the programming language is related to whether code comments containing specific tags are marked as SATD. The lowest percentage of code comments with SATD containing the four tags is 0.6%, for the solidity project. In this project, code comments containing SATD are essentially unrelated to specific tags, and manual annotation of SATD can only be done by understanding the semantics of the comments. For this project, we can basically conclude that SATD detection cannot be done by fuzzy matching of task tags. However, we also see that the percentage of code comments containing specific tags ranges from 12.32% to 50.9% in the five projects other than solidity, so task tags can be considered important indicators of whether a code comment contains SATD. Moreover, we clearly see that code comments containing specific tags have nothing to do with resource debt.
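The kind of tag-based check examined in this RQ can be sketched in a few lines of Python. The sketch is a simplified whole-word match and is not the fuzzy matching method of Guo et al. (2021); the function names has_task_tag and tag_coverage are hypothetical, and the three sample comments are taken from the examples quoted in RQ1.

```python
import re
from typing import Iterable

# Whole-word, case-insensitive match of the four representative task tags.
TASK_TAG_PATTERN = re.compile(r"\b(TODO|FIXME|XXX|HACK)\b", re.IGNORECASE)


def has_task_tag(comment: str) -> bool:
    """Return True if the comment contains any of the four task tags."""
    return TASK_TAG_PATTERN.search(comment) is not None


def tag_coverage(debt_comments: Iterable[str]) -> float:
    """Fraction of manually labelled debt comments that contain a task tag."""
    comments = list(debt_comments)
    if not comments:
        return 0.0
    return sum(has_task_tag(c) for c in comments) / len(comments)


if __name__ == "__main__":
    labelled_as_debt = [
        # requirement debt (bitcoin): contains "todo"
        "todo: remove this requirement by making cuckoocache not require external locks",
        # compatibility debt (solidity): no task tag
        "previous versions of solidity turned this into a parser error",
        # resource debt (diem): no task tag
        "reqwest error's builder is private to the crate, so send out a fake "
        "request that should fail to get an error",
    ]
    for text in labelled_as_debt:
        print(has_task_tag(text), "|", text[:50])
    print("coverage:", tag_coverage(labelled_as_debt))
```

Only the first of the three comments is caught by the tag check, which illustrates on a small scale why a purely tag-based detector misses tag-free debt comments such as resource debt.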

Implications
For blockchain framework developers, we offer the following advice.
(1) Code snippets with design debt should be refactored promptly to improve the robustness of the code.
(2) Framework developers must pay special attention to code snippets with defect debt when refactoring the code. Defect debt marks critical code that affects whether the software can function properly.
For blockchain application developers, based on our research findings, we propose the following implementation recommendations.
(1) To address the widespread design debt in the code, application developers should proactively investigate the relevant source snippet, including method signatures, class descriptions, etc., to confirm whether the design debt has any impact on the functionality to be implemented.
(2) For the source snippet with defect debt, the application developer must manually pay off the technical debt. Although the number of defect debts is small, developers need to pay more attention to defect debt.
(3) For the source snippet with test debt, the application developer has to perform the appropriate software tests in time. Due to the rapid evolution of blockchain software projects, timely repayment of test debt can prevent software systems from having compatibility issues.
(4) According to the result of RQ3, SATD detection should be formalised as a machine-learning-based three-class classification, rather than a machine-learning-based binary classification. Source comments with different kinds of SATD should be given different priorities when allocating resources. We believe that source snippets with SATD should be addressed first, particularly those with defect debt. Secondly, resource debt should be reviewed in a timely manner depending on the testing resources available. The complexity of blockchain software systems and issues such as the synchronisation of resource requirements make this type of technical debt somewhat insidious and disruptive. Finally, source snippets without SATD can be scheduled for software code testing at the end.
(5) According to the result of RQ4, the representative task tags cannot be used to distinguish between SATD comments and WITHOUT_SATD comments in blockchain software projects. We therefore believe that deep learning-based techniques such as pre-trained models should be used to learn the deep semantics in code comments. To manage source code more rationally and allocate software quality assurance resources, the evaluation metrics for machine-learning classification models should take the weights of the different categories into account in order to better reflect their impact. For example, each category is given a weight determined by its proportion in the true distribution, and the per-category Precision or Recall is multiplied by that weight and summed over all categories to obtain the overall Precision or Recall.
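The weighted metric described in recommendation (5) can be written out as a minimal Python sketch. The function name weighted_precision_recall, the three label names and the toy predictions are assumptions made for illustration; the weights follow the text, i.e. each category's share of the true labels.

```python
from collections import Counter
from typing import List, Tuple


def weighted_precision_recall(y_true: List[str],
                              y_pred: List[str]) -> Tuple[float, float]:
    """Sum per-category Precision and Recall, each multiplied by the
    category's proportion in the true label distribution."""
    support = Counter(y_true)       # number of true instances per category
    predicted = Counter(y_pred)     # number of predictions per category
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = 0.0
    recall = 0.0
    for label, n_true in support.items():
        weight = n_true / total
        # A category that is never predicted contributes a precision of 0.
        p = true_pos[label] / predicted[label] if predicted[label] else 0.0
        r = true_pos[label] / n_true
        precision += weight * p
        recall += weight * r
    return precision, recall


if __name__ == "__main__":
    # Toy three-class example (made-up labels, not data from the study).
    y_true = ["SATD", "SATD", "RESOURCE_DEBT",
              "WITHOUT_SATD", "WITHOUT_SATD", "WITHOUT_SATD"]
    y_pred = ["SATD", "WITHOUT_SATD", "RESOURCE_DEBT",
              "WITHOUT_SATD", "WITHOUT_SATD", "SATD"]
    print(weighted_precision_recall(y_true, y_pred))
```

Note that with weights equal to the true class proportions, the weighted Recall coincides with overall accuracy, so the per-category values should still be reported alongside the weighted figures.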

Threats to validity
Threats to internal validity come from two main areas: the selection of blockchain software projects and the annotation of code comments. We selected software projects on GitHub with more stars for our research project. We also tried to select blockchain system frameworks from different application domains. In addition, the number as well as the quality of code comments in software projects is a key factor affecting our empirical study. Therefore, we extracted code comments from all code files in the software projects and obtained a larger dataset. Another key factor that influenced our empirical study was the personal bias of the programmers who performed the labelling. We reduced the difficulty of annotation by labelling code comments in steps, first the easy ones and then the difficult ones. The annotated dataset was then statistically analysed and inconsistent data were adjusted, thus obtaining a credible dataset. Finally, code comments and source code may not co-evolve. Although the study by Fluri et al. (2007) confirms that 97% of code comments evolve in parallel with the source code, the co-evolution risk remains.
Threats to external validity stem mainly from the generalisation ability of our empirical study. Our projects mainly come from GitHub and they are not commercial projects. This may affect the generalisation ability of our findings. Therefore, we chose open-source projects that are more influential and widely used, such as bitcoin. In addition, unlike previous studies, our projects cover programming languages such as C++, Go, Python, Rust, Solidity, Java, etc. These cover basically all the programming languages that can be used to develop blockchain application systems, which supports the generalisation ability of our research at the programming environment level.
Blockchain software systems are developing rapidly and gaining attention from both industry and academia. The existence of SATD in blockchain software systems may greatly damage the stability of software operation and bring catastrophic consequences if it is not given sufficient attention. However, due to the closed-source nature of commercial software systems, we cannot conduct an in-depth analysis of commercial blockchain software systems; we will conduct further in-depth studies in future work.

Conclusion
In this paper, we investigate the technical debt in blockchain software projects. Compared to previous empirical studies, we found six types of technical debt in the investigated blockchain software projects, including design debt, etc. Moreover, documentation debt and compatibility debt, which are widely present in other projects, are rarely present in the investigated blockchain software projects. This also indicates that there is significant variability between different application software projects for different technical debts. We propose a new type of technical debt, resource debt, which is explicitly identified by the framework developers and requires special attention in subsequent production systems. It arises mainly from network latency, resource allocation synchronisation, etc. Our statistical analysis of software projects at two levels of granularity shows that code comments containing technical debt are prevalent. In addition, we found that in blockchain software systems, the code comments containing technical debt are not entirely determined by task tags. However, task tags can be used as an important indicator for marking SATD. This implies that deep semantic discovery models for classification, such as pre-trained models, should be used when detecting whether code comments contain technical debt.
In recent years more and more scholars have started to focus on SATD detection in various areas of software systems (Azuma et al., 2022). There are also newer research perspectives for SATD detection (Tan et al., 2022). In the future, we will conduct research on more blockchain software projects and build a deep semantic-based SATD detection model for blockchain software projects to help framework developers and application developers better manage technical debt.