Measuring the effect of automatically authored video aid on assembly time for procedural knowledge transfer among operators in adaptive assembly stations

Can automatically authored videos of industrial operators help other operators to learn procedural tasks? This question is relevant to the advent of the industrial internet of things (IIoT) and Industry 4.0, where smart machines can help human operators rather than replace them, in order to benefit from the best of humans and machines. This study considers an industrial ecosystem where procedural knowledge (PK) is quickly and effectively transferred from one operator to another. Assembly tasks are procedural in nature and present a certain complexity that still does not allow machines and their sensors to capture all the details of the operations, especially if the assembly operation is adaptive and not fixed in terms of assembly sequence plan. In order to help the operators, videos of other operators executing the complex procedural tasks can be automatically recorded and authored by machines. This study shows by means of statistical design and analysis of experiments that expert aid can reduce the assembly time of an untrained operator, whereas automatically authored video aids can transfer PK but produce the opposite effect on the assembly time. Therefore, hybrid training methods are still necessary and trade-offs have to be considered. Managerial insights from the results suggest a non-negligible impact of the choice to digitise industrial operations too early. The experimental studies presented can act as guidelines for the correct statistical testing of innovative solutions in industry.


Introduction
The industrial internet of things (IIoT) is part of the technologies adopted by Industry 4.0 (Wollschlaeger, Sauter, and Jasperneite 2017; Yin, Stecke, and Li 2017; Liao et al. 2017). Large fluctuations in product demand require a new manufacturing system to have rapid reactive abilities (Hwang et al. 2016). One of the opportunities offered by IIoT is that all the production machines are interconnected both among themselves and, through sensors, to the environment (Sisinni et al. 2018). Thus, machines can become context aware, and the recognition of the presence of human operators with their goal-oriented behaviours opens up new kinds of human-machine collaboration (Di Nardo, Forino, and Murino 2020; Gorecky et al. 2014; Zheng et al. 2019; Wang, Törngren, and Onori 2015). Machines, for example, can directly aid inexperienced operators and prevent human errors. This is important because humans remain an integral part of current manufacturing environments despite attempts to automate everything that can be automated (Mital 1997). The use of industrial sensors increases day by day, and research fields such as computer vision and image analysis are improving the processing of information that can be gathered through image and depth acquisition devices, reproducing digital twins of operators and their operations (Nikolakis et al. 2019). Some researchers already focus on digitally tracking the workers with videos for offline performance measurements (Elnekave and Gilad 2006). Despite this, recognising the state of a production system from cameras and sensors requires a level of understanding that computers are only partially able to handle, especially in real time. Examples of successful applications are found in automatic task segmentation from videos (Petersen and Stricker 2012), but machine understanding, when applied for example to following assembly tasks even from a depth camera, is quite limited and requires trade-offs (Oyekan et al. 2019).
Humans can fill in the gaps when machines have a limited understanding, especially by means of digital servitisation of the collaborative tasks (Tronvoll et al. 2020), e.g. by defining an assembly task and its instructions, and machines can fill in the computational gap, i.e. when humans have a limited ability to replicate production tasks without introducing errors. In such a scenario, the assembly lines of the future could become more adaptive to changes regarding both production requirements and human needs. This article aims at exploring the ability to automatically transfer procedural knowledge (PK) among operators when the PK transfer medium of choice is a video and IIoT machines and sensors mediate the PK transfer. The autonomy of learning transfer is a desired outcome among the digital transformations of industrial production (Ardolino et al. 2018). The focus is on PK rather than declarative knowledge (DK) because the former is defined as knowledge how (ten Berge and van Hezewijk 1999) to properly execute an industrial task, whereas the latter is defined as knowledge that, which mostly pertains to the ability to recognise the contextual information.
Given the premises presented as an introduction, this work aims at answering the following research question (RQ): Can operators learn PK and produce subassemblies in shorter assembly time when aided by automatically authored videos of other operators or aided by expert operators by means of vocal and gesture instructions?
In order to answer the RQ, statistical design of experiments techniques are employed, as they allow the experiments to be planned so that appropriate data can be collected and analysed by statistical methods, resulting in valid and objective conclusions (Montgomery 2013).
The experiments run in this study focus on automatically collecting videos from several subassembly tasks, which are then used to train new operators immediately before they perform the same subassembly tasks.
The experimental study is mediated by two technologies. The first one is an assembly guidance system (AGS) that is in charge of eliciting and transferring the DK components while producing an assembly sequence plan (de Giorgio et al. 2021). The second one is a smart recording/visualisation device, connected to the AGS, which is in charge of transferring the PK components of assembly knowledge through automatically authored videos.
The article structure is as follows. A thorough literature study is summarised in Section 2. The experimental setup and the scientific methodology are presented in Section 3. The results are reported in Section 4. Discussions and conclusions are presented in Section 5, together with a list of main contributions of the article, managerial insights, limitations and suggestions for future studies.

Procedural knowledge transfer with videos
Including videos in manufacturing courses is an established practice for PK transfer (Shih et al. 2016). In particular, learning assembly processes through videos is not only possible (BalaSeshan and Janardhan Reddy 2021), but also improves the overall outcome of the lessons (Dencker et al. 1999). The ability to learn both DK and PK through videos has also been tested experimentally (Hong, Pi, and Yang 2018), though the literature does not present a clear structure to differentiate between them, especially when it comes to their encodings in videos. Chen, Liou, and Chen (2019) showed that videos can be used in the flipped classroom strategy to improve the learning of PK. Yildirim, Yasar Ozden, and Aksu (2001) showed that the retention of knowledge is improved by hypermedia learning. Other studies (Scheurwater 2017) report instead that using a video can be less effective than providing more precise written instructions, without clarifying what information is important to convey through videos or written instructions. In contrast to that, Palmqvist et al. (2021) showed in their survey that most assembly operators do not use the instructions provided (in any form). Thilakumara et al. (2018) focused their study on comparing live demonstrations with video demonstrations in transferring procedural knowledge. The results were in favour of the videos. However, the experimenters allowed the control group to watch only one live demonstration, whereas the study group could watch the video demonstrations repeatedly, which could have strongly contributed to the improved performance of the group watching the videos.
The ability to automatically transfer PK from operator to operator with videos has been implemented and tested in the car manufacturing industry (Dencker et al. 1999), but there is no satisfactory indication, to the best of the authors' knowledge, of the quality of automatically authored videos to indirectly transfer PK among operators. On the other hand, it is not uncommon to use videos directly and purposely authored by operators as a medium for PK transfer to other operators (Molitor et al. 2019). Once the videos are authored, e.g. cut and saved with a title that refers to the recorded task, there are instances in which they have been played in automated ways, for example adapting the video speed to the operator's actions detected by sensors (Georgescu et al. 2019). A fundamental result found in the literature is that even less experienced operators can often teach novice operators with some advantage: Hinds, Patterson, and Pfeffer (2001) showed that novices are better than experts at training other novices because they can express knowledge in less abstract and more basic terms. When novices try to follow the experts, they end up spending more time on tasks and making more errors than when instructed by other novices. Thus, it is plausible to elicit knowledge from operators and transfer it to other operators, even if they are not the most expert available. This finding favours the choice of not studying the operators' skills in relation to the quality of PK transfer, but rather averaging out expert and novice operators' contributions, in favour of a study on PK transferability with videos automatically authored by IIoT machines.

Video recording and authoring on the shop floor
To the best of the authors' knowledge, there are no studies indicating the best positioning of a fixed camera pointed at an assembly station for the authoring of tutorial videos. However, there is a study showing an increased quality of knowledge transfer when a first-person perspective is chosen for tutorial videos. This requirement is hardly met when a camera cannot be mounted either on the operator's helmet or close to their first-person perspective. A study from Menn et al. (2017) shows that procedural instructions should make use of an international language, i.e. a visual one, which motivates the use of silent videos rather than language-specific instructions in this research.
Regarding the site of recording, Styhre, Josephson, and Knauseder (2006) noted that in the construction industry, the written instructions from the designers, consisting of layouts and technical specifications, need to be translated into actual practices on the construction sites, because they are highly contextual and designers do not know the constraints of a specific construction site. A similar work by Brandt, Hillgren, and Björgvinsson (2004) is done in a hospital environment for the transfer of knowledge about medical procedures with videos directly recorded by doctors and watched by other doctors. The article observes that the use of the videos recorded in the same environment where they are used facilitates the knowledge transfer. The same implications can be suggested for assembly tasks in which a certain variety introduced by a changing environment and the decisions of the operators might compromise the optimality of a tutorial video; for instance, the video provided by an instructor does not account for the assembly station used, the possible assembly variations due to a more and more personalised production (Lu, Xu, and Wang 2020) or the presence of different operators and levels of expertise. Thus, adaptiveness becomes a required feature for the new assembly systems of Industry 4.0 (Molina et al. 2005) and continual updates of the authored videos might better suit adaptiveness over the changing assembly environment.

Alternative solutions
More innovative digital technologies than videos for providing assembly instructions exist, e.g. those based on augmented reality (Yuan, Ong, and Nee 2008); however, they require specific designs and developments for each product, and cannot be automatically recorded on the shop floor and authored by machines. Although augmented reality is still a technology under rapid development and shows strong potential advantages (Chimienti et al. 2010), it is videos that still maintain the industrial preference as they faithfully reproduce the recorded scene, can be automatically recorded and authored, and humans can directly watch them and notice details that machines might have missed.

Experimental setup and research methodology
An assembly use case is found within a manufacturing course called Tillverkningsteknik (MG1026) at KTH Royal Institute of Technology. About 150 students per course, twice a year, are requested to complete a four-hour laboratory exercise in which they have to produce a metal locomotive toy, see Figure 1(a). The students are divided into groups of two to five people and each group produces one locomotive. When the production of each component of the locomotive is completed, the group finishes the laboratory with an assembly task that takes about 15 minutes to complete. This scenario has several advantages. Firstly, the students are not graded on the assembly task, as it is not part of the learning outcomes of the manufacturing course. This guarantees that the researchers can make use of the assembly time and change the scenario for study and control groups that receive a slightly different education. Secondly, laboratory assistants guide the students during the tasks and the quality of the final product is not affected much by wrongly executed assemblies, as the assistants will eventually fix them. This allows the researchers to step in and replace the assistants. Note that all the laboratory exercises are performed in groups, but the researchers ask that only one member of the group performs the assembly in order to simulate a real assembly station scenario. The experimental setup is such that the continuous flow of new students provides a continually renewed source of results to observe during the iteration of several perfectly independent assembly processes, at least with respect to the assembly operators and products. The assembly tools and the experimental equipment at the assembly station are the same throughout the experiment. Such an environment would be very similar to sparse and independent assembly stations with the task of assembling low quantities of new products, ideally only one exemplar for each.
In such a scenario, the operators have limited instructions and knowledge about each new assembly process, thus they have to learn it on the spot.
In this article, subassemblies $S_n$, $n \in \mathbb{N}$, are defined and uniquely identified by the set notation $S_n = \{c_1, \dots, c_A\}$, where $c_1, \dots, c_A \in \mathbb{N}$ are the identities (IDs) of the assembly components belonging to the subassembly $S_n$. See Figure 1(b) for an overview of the locomotive assembly components and the corresponding IDs.

Assembly guidance system and recording device
The selection of assembly step is mediated by an AGS. The AGS allows an operator to autonomously decide the next assembly step, without a fixed assembly sequence plan. It does so by constraining the operator choices to the few that do not prevent the assembly from being executed until the end, leaving the optimisation of the sequence to the operator. See Figure 2 for some graphical details.
The operator's selection of the next subassembly triggers the playback of the corresponding video and the subsequent recording of the assembly operation. The AGS is connected to an IIoT recording device (RD) that acquires the subassembly actions with a camera pointed at a 45-degree angle over the assembly station. The camera is positioned to face the operator, because head-mounted cameras are unable to capture a stable image of the assembly station.
All the recorded videos are stored in the cloud and can be displayed on a screen installed on the assembly station, upon selection of the corresponding subassembly from the AGS. The selection criterion for the best video to display with each subassembly is part of the experimental setup and is discussed in the following subsections.

Preparation of the experiments
The preparation of each experiment consists of resetting the assembly station to the initial conditions and describing the task to the next group of students. The assembly station is a large table; on a side of the table are the tools and all the locomotive components. These are accessible only to the researcher. The AGS is a touch screen tablet placed on the same table and accessible to the students; only one student is in charge of using it and performs the assembly after its selection.
Each group receives an explanation at the beginning of the experiment. The researcher illustrates how to use the AGS, how the tools and components are handed to them after each subassembly choice on the AGS, and how and where the videos are shown. The latter explanation is given only in the experiments in which the videos are actually shown. The group is asked to complete the assembly and to decide, before starting each subassembly, who among the group members performs the manual operation. It is acceptable if the operator changes between one subassembly and the next because they are recorded independently. The researcher does not specifically ask for any assembly optimisations or for a performance better than the videos in terms of time or quality of assembly. If questions on those aspects are asked, the researcher answers that the only requirement is completing the assembly using the guidance from the AGS and, if the experimental round includes videos, using the help from a video before each subassembly.

Execution of the experiments
In order to control the variability, all the experimental rounds share the same structure at the same assembly station and with the same tools. The first operation consists in planning an assembly step on the AGS. In this phase, the whole group can be involved in the discussion; one person operates the AGS interface. When this person selects a subassembly, the researcher reads the input from the AGS, checks that the group is ready to watch, and starts the video or provides the expert aid (only if one of these aids is used in the given experimental round). Videos do not contain any sound. When the operator in the group is ready to assemble, the researcher hands the needed tools and components to them and starts the recording. After each subassembly is completed, the researcher retrieves the tools and the completed subassembly from the operator, in order to be ready for the next step. The operation is repeated until the whole locomotive is successfully assembled. See Figure 3 for further details.

Experimental rounds
Five experiments are performed. Each experiment is called a round (see Figure 4) to distinguish it from 'group', which refers to each group of students executing an assembly. For each experimental round, eleven locomotive assemblies are recorded.
The first round (R1) is meant to provide a baseline for the control of the experimental conditions. The focus is on monitoring quality issues in the manufactured parts, the modality in which instructions on the use of the AGS are given to the operators, and spotting anything that would prevent the assembly from being executed at a regular pace. Each assembly in this round is recorded to test the RD and provide test videos for the next round. Once several groups have stably performed the assembly task, the next round is started.
The second round (R2) is used to create a database of recorded videos to be shown in the fourth round (R4). In R2, all the groups are shown at least a previously recorded test video from round R1. The test videos are replaced as soon as new videos are recorded in R2. The criterion is that shorter videos shall replace the previously recorded ones. During R2, each time that a video is shown, the researcher says 'a previous group has completed this subassembly in n seconds, this is how they did it', where n is the length in seconds of the video that is shown. The video length is also displayed on the screen, under the video. Thus, the videos in this round are recorded with video aid, assuming that one short video could lead to a shorter and qualitatively better one, an assumption that does not need to be verified. In fact, note that only one of the videos for each subassembly collected in round R2 is used for round R4, and each video is selected based on the average assembly time recorded in R2 for that subassembly rather than the shortest assembly time. The aim of the shortest video rule is to motivate the operators to perform successful subassemblies, and thus produce the related videos, in a reasonably short time.
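The replacement rule used during R2 can be sketched as follows. This is a minimal illustration, not the software used in the study: the record layout (`id`, `seconds`) and the function names are hypothetical, and keeping the current video on a tie is an assumption, since the text only states that shorter videos replace earlier ones.

```python
def update_shown_video(current, candidate):
    """Return the video to show to the next group for one subassembly.

    A newly recorded candidate replaces the current video only if it is
    strictly shorter (tie handling is an assumption of this sketch).
    """
    if current is None or candidate["seconds"] < current["seconds"]:
        return candidate
    return current


def announcement(video):
    """Sentence spoken by the researcher when a video is shown in R2."""
    n = round(video["seconds"])
    return (f"a previous group has completed this subassembly in {n} seconds, "
            "this is how they did it")


# Hypothetical records: a test video from R1 and two recordings from R2.
test_video = {"id": "R1-test", "seconds": 48.0}
first_r2 = {"id": "R2-group1", "seconds": 41.0}
second_r2 = {"id": "R2-group2", "seconds": 44.0}

shown = update_shown_video(None, test_video)
shown = update_shown_video(shown, first_r2)   # shorter, so it replaces the test video
shown = update_shown_video(shown, second_r2)  # longer, so the shown video is kept
```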
The third round (R3) is the control experiment. This is the only case in which videos are never shown. Since there are no instructions, the researcher refrains from aiding the group, unless a clear first unsuccessful assembly attempt is performed, because the subassembly has to be completed. In the control round, videos of all the subassemblies are still recorded, even if they are not to be shown to the operators.
The fourth round (R4) consists of eleven assemblies aided by videos, in order to assess the validity of using automatically authored videos to transfer PK among operators. In this experiment, the videos that are shown are always the same for each subassembly. The selected videos are those with the closest length to the average length of all the videos recorded in round R2 for each given subassembly.
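The selection criterion for R4 can be illustrated with a short sketch. The video identifiers and lengths below are made-up values for illustration only, and the tie-breaking behaviour (the first video encountered wins) is an assumption that the text does not specify.

```python
from statistics import mean


def select_representative(videos):
    """Pick the video whose length is closest to the mean length.

    `videos` is a list of (video_id, length_in_seconds) pairs recorded in R2
    for a single subassembly. On ties, the first pair in the list is returned.
    """
    average = mean(length for _, length in videos)
    return min(videos, key=lambda video: abs(video[1] - average))


# Hypothetical R2 recordings for one subassembly (not data from the study).
r2_videos = [("g1", 42.0), ("g2", 35.0), ("g3", 58.0), ("g4", 39.0)]
chosen = select_representative(r2_videos)  # mean is 43.5 s, so "g1" is chosen
```

In this example the shortest video ("g2") is not the one selected, which mirrors the distinction between the motivational shortest-video rule of R2 and the average-length criterion applied for R4.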
In all these rounds (R1-R4), the researcher steps in and aids a group only if the group does not properly perform a subassembly at the first attempt. When this happens, the video recording is still stopped upon the successful completion of the subassembly, unless several minutes pass and there is a major impediment. The latter case is marked as an outlier and excluded from the analysis. Given the same subassembly, a video with an unsuccessful first attempt is clearly longer than a video with a successful first attempt. A video of average length might contain several attempts; however, a subsequent analysis does not reveal any double attempts in any of the subassembly videos displayed in round R4.
The fifth and last round (R5) consists of eleven assemblies aided by the researcher who acts as an expert. In this experiment, the videos are not shown. The operator selects the subassembly on the AGS and the researcher provides all the explanations necessary to perform the operation in terms of vocal and gesture instructions. The assembly task starts afterward and the expert does not interfere with it.

Quantitative analysis methodology
A two-factor factorial experimental design is selected to study the effect on the assembly time when operators are aided by automatically authored videos from other operators or aided by expert operators by means of vocal and gesture instructions. Thus, the collections of videos from rounds R3 (no aid), R4 (video aid) and R5 (expert aid) are used to determine the assembly time of each subassembly. In particular, subassemblies $S_1 = \{7, 12\}$, $S_2 = \{7, 9, 18\}$, $S_3 = \{8, 9, 17\}$, $S_4 = \{9, 10, 11, 19\}$ and $S_5 = \{9, 11, 20, 21\}$ are used for the quantitative study of the results because by design they are performed eleven times out of the eleven total assemblies in each experimental round. The other subassemblies may vary in number of executions because the AGS allows the operators to select customised assembly plans, thus they are excluded. Notice that five different subassemblies from the entire product assembly are selected because of their different complexity of assembly. Because the factors of the experimental design are varied together, instead of one at a time, the design allows studying whether the interaction between aid and subassemblies is significant or not.
Thus, the factors selected are Aid with three levels (control, expert, video) and Subassembly with five levels ($S_1$, $S_2$, $S_3$, $S_4$, $S_5$). Since each treatment combination is replicated eleven times, a total of $3 \times 5 \times 11 = 165$ sample data, which refer to the subassembly time expressed in seconds, are collected for the analysis. All the sample data collected are reported in the appendix. The analysis of variance (ANOVA) is employed to analyse the data. For this purpose, the statistical software Minitab, which contains a specific toolbox for the design of experiments, is selected for the data analysis.
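As a quick check of the design size, the treatment combinations can be enumerated programmatically. The sketch below only uses the levels and component IDs stated above; the variable names are illustrative and not taken from the study's software.

```python
from itertools import product

# Factor Aid: three levels, corresponding to rounds R3, R5 and R4.
AID_LEVELS = ("control", "expert", "video")

# Factor Subassembly: five levels, each identified by its set of component IDs.
SUBASSEMBLIES = {
    "S1": frozenset({7, 12}),
    "S2": frozenset({7, 9, 18}),
    "S3": frozenset({8, 9, 17}),
    "S4": frozenset({9, 10, 11, 19}),
    "S5": frozenset({9, 11, 20, 21}),
}

REPLICATES = 11  # one replicate per locomotive assembly in a round

# Every run is identified by (aid level, subassembly, replicate number).
runs = list(product(AID_LEVELS, SUBASSEMBLIES, range(1, REPLICATES + 1)))
total_runs = len(runs)  # 3 * 5 * 11 = 165
```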

Results
In order to present the results of the experiments, a statistical analysis of the sampled data is needed. Each analysis performed is presented in a separate subsection with the relative results. The main findings from the results are discussed in Section 5.

Qualitative analysis
The first step in the analysis of the results is checking the sample data distribution in a qualitative way. Since the sample size for each treatment combination is eleven, an individual value plot is selected to identify possible outliers and visualise the distribution of the data. From the individual value plot shown in Figure 5, a possible non-constant variance in the samples can be observed. Furthermore, the plot indicates the presence of one possible outlier, which is not removed from the analysis because its cause was not reported with the data. The graphical analysis shows comparable data, thus it is worthwhile to proceed to the statistical analysis.

Analysis of variance
The analysis of variance (ANOVA) follows the preliminary analysis in order to study the effects of the factors Aid and Subassembly, and of their interaction, on the assembly time. In order to assess the validity of ANOVA, the model hypotheses have to be verified. These hypotheses are on the residuals $\varepsilon_{ij}$, which have to be normally and independently distributed with mean zero and constant variance $\sigma^2$ (Montgomery 2013): $\varepsilon_{ij} \sim \mathrm{NID}(0, \sigma^2)$. In order to check the normality and the constant variance assumptions, a normality test (see Figure 6) and Levene's test (see Figure 7) are respectively conducted. Both tests give a p-value less than 5%, thus the null hypotheses of normality and of equal variances are rejected. The residuals on the sample data are reported in the appendix.
Since the ANOVA assumptions are violated, the ANOVA results on the untransformed data cannot be interpreted reliably. It is necessary to use further statistical methods to overcome this problem. Data transformations are often a very effective way to deal with the problem of non-normal responses and the associated inequality of variance (Montgomery 2013).
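For reference, the two-factor model underlying this analysis can be written in the standard textbook effects form. The symbols $\tau_i$, $\beta_j$, $(\tau\beta)_{ij}$ and the replicate index $k$ are assumptions of this sketch, chosen to follow common design-of-experiments notation; the text above writes the residuals with two indices only.

```latex
% Two-factor fixed-effects model with replicates (textbook form, assumed notation)
y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk},
\qquad
\begin{cases}
i = 1, 2, 3        & \text{(Aid: control, expert, video)} \\
j = 1, \dots, 5    & \text{(Subassembly: } S_1, \dots, S_5\text{)} \\
k = 1, \dots, 11   & \text{(replicates)}
\end{cases}
\qquad
\varepsilon_{ijk} \sim \mathrm{NID}(0, \sigma^2)
```

Here $y_{ijk}$ is the assembly time in seconds, $\mu$ the overall mean, $\tau_i$ the effect of the aid level, $\beta_j$ the effect of the subassembly and $(\tau\beta)_{ij}$ their interaction. The normality and equal-variance checks described above are checks on the estimated $\varepsilon_{ijk}$.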

Box-Cox transformation
The Box-Cox method is employed to select a form of transformation to be applied to the sample data. The Box-Cox results (see Figure 8) suggest a lambda value of zero, which is equivalent to taking the natural logarithm of the sample data. The transformed sample data are reported in the appendix.
On the condition that the Box-Cox transformation satisfies the ANOVA hypotheses, it is possible to run ANOVA on the transformed data and analyse the subsequent results.
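To make the link between a lambda of zero and the natural logarithm concrete, the sketch below implements the textbook one-parameter Box-Cox family. It is an illustration only: the exact scaling applied by the statistical software used in the study may differ, and the sample times are made-up values.

```python
import math


def box_cox(y, lam):
    """Textbook Box-Cox transform of a single positive observation.

    (y**lam - 1) / lam for lam != 0, and ln(y) for lam == 0, which is the
    limit of the first expression as lam tends to zero.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


# Hypothetical assembly times in seconds (not data from the study).
times = [18.0, 30.0, 75.0]
log_times = [box_cox(t, 0) for t in times]

# For a lambda close to zero the power form approaches the logarithm.
near_zero = [box_cox(t, 1e-6) for t in times]
```

The log transform compresses long assembly times more than short ones, which is why it tends to stabilise a variance that grows with the mean.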

Analysis of variance on transformed sample data
The ANOVA is performed again on the transformed sample data. The validity of the ANOVA hypotheses is verified on the residuals of the transformed sample data that have to be normally distributed with constant variance. To check the normality and the constant variance assumptions, the normality test (see Figure 9) and Levene's test (see Figure 10) are respectively conducted. In the normality test, a p-value of 0.391 indicates that the null hypothesis that the sample data on the residuals is normally distributed cannot be rejected with an alpha of 5%. In Levene's test, a p-value of 0.319 indicates that the null hypothesis that the sample data on the residuals have all equal variances cannot be rejected with an alpha of 5%. Thus, both the hypotheses are verified.
After investigating the underlying assumptions, it is possible to proceed with the analysis of variance on the transformed sample data. As shown in Table 1, both the factors Aid and Subassembly are statistically significant, with a p-value that is much less than an alpha (i.e. the probability of a type I error) of 5%. On the other hand, the interaction between Aid and Subassembly is not statistically significant because the p-value is 0.676, which is much greater than an alpha of 5%.
The ANOVA results alone are not sufficient to tell which experimental round has the best mean in terms of assembly time. In order to compare the experimental rounds, it is necessary to perform a post-ANOVA analysis, as shown in the following subsection.
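The decomposition behind a balanced two-factor ANOVA table can be sketched in plain Python. This is not the procedure of the statistical software used in the study; it is a minimal textbook computation of sums of squares, degrees of freedom and F ratios, and the toy log-times are made-up values. Turning the F ratios into p-values requires the F distribution, which is left out here.

```python
from itertools import product
from statistics import mean


def two_way_anova(cells):
    """Sums of squares, degrees of freedom and F ratios for a balanced design.

    `cells` maps (level_a, level_b) to the list of replicate responses; every
    combination must be present with the same number of replicates.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(cells[(levels_a[0], levels_b[0])])  # replicates per cell
    na, nb = len(levels_a), len(levels_b)

    grand = mean(y for ys in cells.values() for y in ys)
    mean_a = {a: mean(y for b in levels_b for y in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(y for a in levels_a for y in cells[(a, b)]) for b in levels_b}
    mean_ab = {key: mean(ys) for key, ys in cells.items()}

    ss = {
        "A": nb * n * sum((mean_a[a] - grand) ** 2 for a in levels_a),
        "B": na * n * sum((mean_b[b] - grand) ** 2 for b in levels_b),
        "AB": n * sum((mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                      for a, b in product(levels_a, levels_b)),
        "Error": sum((y - mean_ab[key]) ** 2 for key, ys in cells.items() for y in ys),
    }
    df = {"A": na - 1, "B": nb - 1, "AB": (na - 1) * (nb - 1),
          "Error": na * nb * (n - 1)}
    ms_error = ss["Error"] / df["Error"]
    f_ratio = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return ss, df, f_ratio


# Toy log-times for a 2 x 2 design with two replicates (illustrative only).
toy = {
    ("control", "S1"): [3.0, 3.4], ("control", "S2"): [3.6, 4.0],
    ("video", "S1"): [3.5, 3.7], ("video", "S2"): [4.1, 4.3],
}
ss, df, f_ratio = two_way_anova(toy)
```

For the design of this study (three aid levels, five subassemblies, eleven replicates) the same formulas give 2, 4, 8 and 150 degrees of freedom for Aid, Subassembly, their interaction and the error term, respectively.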

Post-ANOVA analysis
A mean multiple comparison test is performed to verify if the Aid means are statistically different with respect to the assembly time. A Tukey pairwise comparison is chosen and the results are shown in Table 2. All the levels of the factor Aid have means that are significantly different because they do not share the same letter. Note that the means refer to the natural logarithm of the assembly time. The expert aid minimises the assembly time, whereas the video aid leads to a higher assembly time.
Finally, it is possible to discuss these post-ANOVA analysis results in Section 5, as they carry statistical significance.
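Because the compared means are on the natural-log scale, it can help to read them back in seconds. The sketch below shows that exponentiating a mean of log-times gives the geometric mean of the original times, not the arithmetic mean; the numbers are made-up values, not results from Table 2.

```python
import math
from statistics import geometric_mean, mean


def back_transform(mean_log_seconds):
    """Convert a mean of natural-log times back to seconds."""
    return math.exp(mean_log_seconds)


# Hypothetical assembly times in seconds for one aid level.
times = [20.0, 45.0, 80.0]
mean_log = mean(math.log(t) for t in times)

typical_seconds = back_transform(mean_log)  # equals the geometric mean
arithmetic_seconds = mean(times)            # larger, pulled up by long times
```

A difference between two log-scale means therefore corresponds to a ratio of geometric mean times: a difference of d on the log scale means that one typical time is exp(d) times the other.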

Discussions and conclusions
In this article, statistical design and analysis of experiments are employed to study whether operators can learn procedural knowledge and produce subassemblies in shorter assembly times when aided by automatically authored videos from other operators or aided by expert operators by means of vocal and gesture instructions. Thus, a two-factor factorial design is conducted with the factor Aid at three levels (control, expert, video) and the factor Subassembly at five levels ($S_1$, $S_2$, $S_3$, $S_4$, $S_5$). Each treatment combination is replicated eleven times, leading to 165 experiments. The ANOVA results on the assembly time indicate that both factors Aid and Subassembly are statistically significant with a p-value < 0.05. On the other hand, the interaction term Aid and Subassembly has a p-value of 0.676, indicating that this interaction is not statistically significant.
Following the ANOVA analyses, a mean multiple comparison test is performed to verify if the Aid means are statistically different with respect to the assembly time. The Tukey pairwise comparison clearly separates the means of the aid levels video, control and expert in three different groups, which means that they are statistically different. The interpretation is that operators can learn procedural knowledge and produce subassemblies in shorter assembly times when aided by expert operators by means of vocal and gesture instructions. On the other hand, it appears that the same operators cannot learn procedural knowledge and produce subassemblies in shorter assembly times when aided by automatically authored videos from other operators.

Contributions to research
The main findings from the experimental results are that:
• Operators indeed can learn PK already from one repetition of a new subassembly, when this is preceded by expert aid on the field (the expert aid mean is less than the control mean, as reported in Table 2). The learning is confirmed by a statistically significant reduction of the subsequent assembly time with respect to the control group (no aid).
• Operators can learn PK from automatically authored videos of other operators, recorded on the shop floor; however, the experimental results show a statistically significant increase in assembly time, with respect to both the control group (no aid) and expert aid assemblies (the video aid mean is greater than the control mean, as reported in Table 2). Better methods have to be tested to enhance the quality of these videos in transferring PK.
• An analysis of variance might not immediately show a statistically significant difference between the study groups (see the ANOVA performed in Section 4.2), but subsequent analysis methods exist (see the Box-Cox transformation in Section 4.3) and can shed light on the nature of the results. The statistical methods applied in this study can act as guidance for the practitioners in testing their innovative solutions in production, especially in light of the IIoT servitisation of Industry 4.0.

Figure 9. Normality test on the transformed sample data.

Managerial implications
Current managerial efforts to deploy digital technologies as part of the digital servitisation process do not always go without challenges for the businesses (Alghisi and Saccani 2015). Therefore, managers need to adopt an experimental approach when introducing new technologies and act more collaboratively in order to support the digital transformation within the Industrial IoT context (Jocevski, Arvidsson, and Ghezzi 2020; Tronvoll et al. 2020). By studying the transfer of procedural knowledge between operators on the assembly line, the article offers several implications for managers:
• The finding that aid provided by an expert before each subassembly is better than automatically authored video aid shows that the introduction of certain digital technologies might not pay off. Therefore, there is an opportunity for an alternative arrangement in which experts, instead of providing direct aid to new operators, manually author the videos automatically recorded on the assembly lines, in order to have a technological basis for training the operators indirectly with videos.
• Experiments, such as the one run in this study, are fundamental to assess the state of industrial technological advance and the possible benefits of digitalisation. Managers can use this work as an example to design and run experiments, in order to verify the ability to digitise their processes (e.g. the training of industrial operators) and in turn benefit from the transformation.
Finally, not as a general implication, but as a general call, this article further contributes to an opening discussion arguing for a change in the mindset of managers, from an all-in approach to a more discovery-driven, step-by-step approach to introducing new digital technologies in business operations (McGrath and McManus 2020).

Limitations and future research directions
Some limitations of this study have to be considered. Firstly, the assembly operators recruited for the experiment are not expert assemblers but students in production engineering courses. To the best of their abilities, they have learned and executed assemblies that are compared with a control run without aid. A baseline run was used to remove any unwanted effects from the experiment due to the novelty of the task for every operator, including the researchers. Secondly, the product to assemble in the course, which is adopted as a use case for this study, has a certain degree of complexity that may or may not reflect the complexity of an average assembly. In order to minimise the effect of the complexity of assembly on the results, several subassemblies and their respective assembly times have been tested. Future research should aim at finding benchmark products to test. However, the technical specifications of the locomotive can be made available upon request and be reproduced for future research. Another limitation pertains to the use of videos related to subassemblies rather than the entire product, which is dictated by the need to keep the videos short and the assembly steps variable among groups. Longer videos could trigger attention deficits or hit memory limits. Furthermore, a complete tutorial video of an entire assembly cannot be used for modular and adaptive subassemblies. Future studies should address the differences between providing subassembly videos with variable assembly sequence plans and entire assembly videos with a fixed assembly sequence plan.
The expert aid is audio- and gesture-based and non-interactive (i.e. it does not continue during the assembly task), which makes it comparable, with some evident differences, to an ideal top-quality video. The fact that an expert aid can transfer PK in a way that shortens the assembly time, whereas an automatically authored video cannot, sets the outer boundaries of the PK transferring quality that can be achieved through videos. Future works should explore such a range and provide insights on what kind of setups can improve the PK transfer of the automatically authored videos. Another relevant question is how much artificial intelligence algorithms can help to make automatically authored videos resemble human expert aid.
One last limitation concerns the assumption made in this study that successful PK transfer can be estimated by a shorter assembly time. There is no linear dependence between the two variables but, intuitively, if there is any difference in assembly time between video aid and no aid, one can assume that PK has been transferred. This means that even if the video-aided assembly time turns out to be longer than in the case with no aid, PK was effectively transferred to the operators. In other words, PK can still be transferred through automatically authored videos but not successfully applied to reduce the assembly time during the first execution of the assembly task.
Future studies can address the ability to use automatically authored videos to transfer PK during training time, with the possibility for the operators to perform a few repetitions of each subassembly, assimilate the transferred PK and transform it into a better practice.
Future studies should also address the implications of the result found in this study that providing an expert aid at assembly time can improve the performance of untrained operators. Even though the results on the automatically authored videos do not seem encouraging, the fact that an operator can learn PK before their first assembly execution and immediately apply it means that procedural task aids can be effective even without a training phase. This is an indication that more effective automatically authored aids have to be studied in order to replace the expert trainer.
Finally, future studies should be able to replicate these experiments for different production processes, in order to assess the quality of knowledge transfer through the use of automatically authored videos. Possible scenarios include, but are not limited to, maintenance, calibration of machinery, disassembly, and any other procedural tasks that can be recorded by cameras in order to transfer knowledge to other industrial operators.
Malvina Roci is a Ph.D. candidate in Production Engineering at KTH Royal Institute of Technology. Her research focuses on developing analysis methods and tools to support the manufacturing industry in its transition from linear to circular manufacturing systems that are economically viable and environmentally sustainable. In particular, she works with complex systems modelling and simulation for enhanced decision-making in the context of circular manufacturing systems.