A comparative effect of high involvement load versus lack of involvement load on vocabulary learning among Iranian sophomore EFL learners

Abstract This study aimed to compare the impact of high involvement load versus lack of involvement load on vocabulary learning among Iranian sophomore EFL learners. In a cross-sectional control and experimental group design, two intact sophomore BA classes majoring in translational studies participated in this study. The first experimental group (the one with high involvement load) was required to sit for the first reading comprehension test. Here, the first Vocabulary Knowledge Scale (VKS1) was used to see whether any learning of the target vocabulary chosen from the reading had occurred. At the same time, in another place, the second experimental group (the one with lack of involvement load) was given the second reading comprehension task. After two weeks, both groups were given VKS2 and scores were gathered. Analysis of the data through independent-samples t-tests revealed that exposing learners to a high level of involvement load can play a significant role in developing English vocabulary. Furthermore, the difference in vocabulary retention was not significant, although words learned under a high level of involvement load were better remembered by the participants than those learned with lack of involvement load. Such results can be used by teachers of English as a foreign language (EFL) and vocabulary instructors to design effective reading activities with proper difficulty levels.

PUBLIC INTEREST STATEMENT
No one can deny the importance of words in learning any language, because without vocabulary nothing can be conveyed. Experts in Second-Language Acquisition (SLA) have suggested a variety of theories to explain the processes involved in vocabulary acquisition. The involvement load hypothesis suggests that retaining unfamiliar second language (L2) words depends on how much the task involves the learner, i.e. the more the learner is involved in the task, the more effective the vocabulary learning is. Focusing on this issue, this study examined the effect of high involvement load versus lack of involvement load on vocabulary learning. Analysis of the data revealed that exposing learners to a high level of involvement load can play a significant role in improving English vocabulary.

Keywords: Involvement load; vocabulary learning; vocabulary knowledge scale; sophomore EFL learners

Introduction
In second-language learning, the value of vocabulary is so paramount that it must be both learned and taught. To show its importance, Thornbury (2002, p. 13) cites the linguist David Wilkins: "without grammar very little can be communicated, nothing can be transmitted without vocabulary". All languages are made up of words and were first born as words. New words are constantly being coined, so learning existing and yet-to-be-coined vocabulary is a never-ending process. This continual cycle of coinage, acquisition, and learning is something that language users grapple with even in their first language (Anova, Antoni, & Kasyulita, 2015; Shakibaei, Shahamat, & Namaziandost, 2019). Moreover, teachers have accepted the importance of vocabulary because it is clear that vocabulary leads to effective communication. With a large vocabulary, students strive to express their feelings with precision, coherence, and clarity, although a large vocabulary alone does not amount to mastery.
This importance increases when it comes to learning a second language. The language learner finds him/herself immersed in studying a massive pile of foreign lexemes and their various facets. Most second-language learners and their teachers are well aware that learning a second language involves learning a significant number of words, and they are therefore often apprehensive when faced with the undertaking (Begriche, 2014; Namaziandost, Shatalebi, & Nasri, 2019). It therefore seems ironic that research into vocabulary learning is not as abundant as research on other topics in second-language development. Vocabulary competence is often seen as a vital resource for second-language learners because a restricted second-language vocabulary impedes successful communication. In short, lexical knowledge is central to communication skills and second-language acquisition.
Recently, interest in vocabulary-related topics has been renewed. Authored and edited books dedicated exclusively to vocabulary reflect a resurgence of interest in L2 vocabulary (Fhonna, 2014; Namaziandost, Hashemifardnia, & Shafiee, 2019). Many attempts have been made to explain how L2 vocabulary can be learned under different learning conditions and what factors affect the success and patterns of L2 vocabulary acquisition (Haratmeh, 2012; Nasri, Namaziandost, & Akbari, 2019). The relationship between vocabulary knowledge and language skills, particularly reading, has been one of the recurring themes in the second-language acquisition field. To narrow it down, the issue concerns the relationship between vocabulary learning and reading for comprehension. In vocabulary learning research, a general, if not universal, assumption is that words are acquired incidentally through reading and that this learning is fundamental. A description of incidental learning is in order before continuing the discussion. Studies on both first- and second-language development support the conclusion that most vocabulary acquisition happens incidentally as learners try to understand new words that they hear or read in context. This learning is deemed "incidental" because it happens when learners concentrate on something other than learning the words themselves (Paribakht & Wesche, 1999).

Research questions
The purpose of this study was to answer the following research questions:
(1) Is there any significant difference between the high-involvement load group vs. lack of involvement load group across two vocabulary tasks?
(2) Is there any difference between vocabulary retention of high involvement load group vs. lack of involvement load group?

History of vocabulary in language learning
As Zimmerman (1998) states, despite its importance for language learners, the role of vocabulary has been overlooked throughout the various stages of language teaching. Issues such as grammar, contrastive analysis, reading, and writing received great attention and interest from scholars and students, while vocabulary teaching and learning were neglected in research and methodology (Jeon & Yamashita, 2014). This apparent neglect could be attributed to the assumption that L2 vocabulary learning would take place naturally, with words simply absorbed like vocabulary in the native language (L1) (Alqahtani, 2015; Linderholm, Kwon, & Therriault, 2014). In addition, linguists attributed greater importance to syntax and morphology as "more important to linguistic science and more relevant to language pedagogy" (Zimmerman, 1998, p. 5). Such a misunderstanding of vocabulary learning resulted in lexical deficits among learners and, thus, failure to develop natural speech and writing (Grasparil & Hernandez, 2015; Kharaghani & Ghonsooly, 2015). To summarize historical trends in vocabulary instruction, the following paragraphs outline the teaching methods that were common in the nineteenth and twentieth centuries.
At the outset of the nineteenth century, the Grammar Translation Method (GTM) was the dominant method of language teaching. It stressed explicit grammar teaching and accuracy, as the method was essentially rule-governed, while little attention was paid to vocabulary. The content focused on the reading and comprehension of literary texts (Solak & Altay, 2014; Yang, 2015; Zimmerman, 1998). Vocabulary items were extracted solely from reading texts, and students were provided with the required vocabulary in the form of bilingual word lists (Alqahtani, 2015).
The Direct Approach emerged at the end of the nineteenth century, when GTM failed to ensure language use and concentrated instead on language analysis. It took a naturalistic view of learning, stressing listening and exposure to the language first and communicating second. It was believed that, through practice, language could be learned naturally. Concrete, everyday vocabulary and phrases were taught through demonstration or association of ideas (Ari, 2014; Hazrat, 2015).
In 1972, Hymes brought forward the idea of communicative competence, which stressed the communicative and functional dimension of language learning. It shifted the focus of vocabulary from "precision" to "appropriateness". In other words, the Communicative Language Teaching (CLT) approach was introduced with an emphasis on language use for effective and purposeful communication rather than grammatical precision. Although CLT is a meaning-based approach, vocabulary was granted a 'secondary status' and used as a tool for 'simple language' problems, such as how to apply (Eckerth & Tavakoli, 2012). Likewise, there were few guidelines on how to handle vocabulary in CLT, on the assumption that L2 vocabulary would be acquired naturally, like L1 vocabulary (Feng, 2015).
The Natural Approach emerged in 1977, alongside Communicative Language Teaching and other communicative methods then being developed. It was based on comprehension, or comprehensible input, without resorting to grammatical analysis or translation (Richards & Schmidt, 2002; Sawaki, Quinlan, & Lee, 2013). Because vocabulary is the carrier of meaning, the approach regarded it as fundamental to the process of language learning (Sidek & Rahim, 2015; Zimmerman, 1998).
Currently, vocabulary has gained unprecedented importance in the realm of language teaching among all parties (teachers, learners, and materials developers). Additionally, language specialists have stressed the need for a systematic and methodical approach to vocabulary for materials writers, teachers, and learners. This growing interest in vocabulary has inspired an emerging body of experimental studies, pedagogical resources, and computer-aided research, most of which aim to provide teachers and learners with answers to questions such as: what does it mean to know a word? (Decarrico, 2001; Etemadfar, Namaziandost, & Banari, 2019).

Knowing a word
Words are not separate sub-parts of languages, but parts of complex and interrelated structures. As a consequence, word knowledge has many facets and dimensions that learners need to master in order to use words properly and efficiently (Van Polen, 2014). Therefore, one must be aware of what knowing a word means.
Categorizing word knowledge into receptive (passive) knowledge and productive (active) knowledge is a common distinction. Receptive knowledge refers to words that can be recognized when heard or read (listening and reading), whereas productive knowledge is the capacity to retrieve and use words in speech and writing (speaking and writing skills). Since this is to some degree a practical distinction, this dimension of word knowledge has been adapted by a number of institutions and materials developers into word lists separated into words to be learned either passively or actively (Webb, 2013). Nevertheless, this categorization of words into passive and active may not be clear-cut in the mind "because effective passive skills also allow the reader or listener to consciously predict the words that will arise." The above facets of word knowledge are very important for learning and teaching foreign languages. Unfortunately, in the classroom, some of these facets, such as form and context, are given greater importance, and only a few studies have taken into account word use, that is, using words across different language skills. The purpose of the present study was to address this shortcoming by drawing on the 'task-induced involvement load' hypothesis of Laufer and Hulstijn (2001). This is explained in the section below.

Involvement load hypothesis and reading skill
After reviewing studies on the relationship between reading and vocabulary acquisition, Paribakht and Wesche (1997, p. 175) maintain that "these studies all point to the role of reading processes in the development of vocabulary, but an inconsistent one, not generally the most successful". To address these drawbacks, the goal should be to enhance vocabulary development systematically. Although many experiments on incidental learning and vocabulary acquisition had been performed, no attempt had been made to establish a theoretical framework for these studies, i.e. how tasks employed for incidental learning can be used to improve vocabulary acquisition and what the components of those tasks are (Soleimani, Rahmanian, & Sajedi, 2015). Keating (2008) notes that one of the key concerns of researchers and language instructors is to identify tasks that are most likely to help learners notice and elaborate on new words. Laufer and Hulstijn (2001) introduced a new construct called Involvement to stimulate theoretical and empirical research in the field of L2 vocabulary. Involvement comprises three motivational and cognitive components: need, search, and evaluation. In what follows, these dimensions are clarified through description and examples. Laufer and Hulstijn (2001) describe the Need component as the "drive to fulfill the requirements of the mission by which task specifications can either be enforced externally or self-imposed." (p. 14). Need is further classified as either "moderate" or "strong." In the former case, the need is imposed on the learner externally, i.e. it is not self-imposed. In the latter case, the need is imposed by the learner himself or herself. If, for instance, a learner is required to produce a sentence using a word chosen by the teacher, the need is imposed externally by the teacher, i.e. it is a moderate need.
In comparison, the Need is assumed to be self-imposed, i.e. strong, when the same learner decides to use a word of his or her own choosing in his or her own production. Here, Laufer and Hulstijn (2001) claim that the need is strong because it is imposed by the learner himself or herself. What has been addressed so far is the motivational component of the Involvement construct. The other two components are cognitive. The first one is Search. Search is described as 'trying to find the meaning of an undefined word L2 or attempting to find the word L2 to explain a definition (e.g. trying to find an L2 version of a word L1) by checking a dictionary or other source (e.g. a teacher)' (2001, p. 14). Evaluation is the third component of the Involvement construct. Laufer and Hulstijn (2001) describe it as "a similarity of a word with other terms, a particular sense of a word with its other meanings, or a combination of the word with other words to decide whether or not a word (i.e. a pair of forms) matches its context" (p. 14).
Again, Evaluation comes in "moderate" and "strong" variants. "Moderate" evaluation involves recognizing differences between words in a given context, or differences between multiple meanings of a word. On the other hand, if evaluation calls for judgments about the additional words to be combined with the new word in an original sentence or text, it is referred to as "strong" evaluation. The multiple-choice (MC) assessment method is a good way to illustrate this classification. In MC items, test takers are expected to choose the correct answer from four alternatives. In this case, test takers compare the alternatives with the stem sentence, and the outcome of this comparison is the chosen response. What is really being done here is comparing the alternatives with each other to see which one is the most appropriate answer.
By designing tasks with different degrees of need, search, and evaluation, the idea of involvement can be operationalized and can therefore be subjected to empirical investigation. This is what we pursue in this study, which begins with two research questions and the accompanying hypotheses. Keyvanfar and Badraghi (2011) examined whether, as suggested by Laufer and Hulstijn (2001), word learning and retention in a second language depend on a task's involvement load (i.e. the amount of need, search, and evaluation it imposes). For this purpose, 77 male and female pre-intermediate Iranian EFL students were chosen based on their performance on a Cambridge reading test. They were then divided into three groups, each completing one of three vocabulary learning tasks that differed in the amount of involvement they induced. The tasks were to translate, read, and write sentences, plus fill in target words. Immediately after the tasks and again two weeks later, learners' recall and retention of the 14 target words were tested. The findings showed that learners gained more from sentence writing, which required combining the new words with already known words for production purposes. As a final remark, they concluded that the evaluation component may play a major role in the involvement load induced by a task.

Previous studies on involvement load hypothesis
AsadzadehMaleki (2012) investigated whether word learning and retention in a second language depend on the involvement load of a task, i.e. the amount of need, search, and evaluation. Based on the results of the immediate and delayed vocabulary retention post-tests, she concluded that there was a significant difference in retention between the three tasks, which supported the Involvement Load Hypothesis: tasks with a higher involvement load lead to better retention.
In another study, Li (2014) tested Laufer and Hulstijn's Involvement Load Hypothesis using a methodology different from that adopted by previous similar studies. That is, instead of focusing only on the product of learning, the study attended to the details of learners' task-induced online learning behavior via a specially designed computer program. Eighty-one participants were randomly assigned to one of four tasks with different amounts of involvement load. On completing the task, the participants were unexpectedly tested on their retention of the target words that appeared in the texts. Two weeks later they were given two delayed posttests. The data were analyzed both quantitatively and qualitatively. The results suggested that different tasks did elicit different patterns of online learning behavior in terms of frequency of look-ups and amount of time spent on target words. It was also found that tasks assumed to carry a higher involvement load did not necessarily lead to higher retention scores.
Moreover, Hazrat (2015) tried to help Iranian English teachers and materials developers meet the needs of learners by putting into practice the Involvement Load Hypothesis and the Depth of Processing Hypothesis. The findings indicated that the writing task resulted in more effective long-term vocabulary learning than the reading and communicating tasks, although the tasks were equal in terms of involvement load. The communicating task was the least effective for vocabulary learning.
In the same vein, Pourakbari and Biria (2015) investigated the efficacy of task-induced involvement in the incidental lexical development of Iranian senior EFL students. For this purpose, based on the scores obtained from an Oxford Placement Test (OPT) administered to the population of senior EFL students at Khorasgan University, six samples of twenty-five students each were selected and assigned to work with a list of English words through six different tasks. Each task carried a different involvement load. Subsequently, receptive and productive vocabulary tests were administered as post-tests to determine the degree of learners' acquisition of the target words through incidental task-induced involvement. The results revealed that the group doing the task with the highest degree of involvement load obtained the best results on the vocabulary tests.
In addition, Amini and Maftoon (2017) investigated whether word learning and retention in a second language are contingent upon a task's involvement load, i.e., the amount of need, search, and evaluation the task imposes. Laufer and Hulstijn (2001) contend that tasks with higher degrees of these three components induce higher involvement load, and are, therefore, more effective for word learning. To test this claim, 64 Iranian intermediate EFL learners were selected based on their performance on the Preliminary English Test (PET). The participants were randomly assigned to two equal groups. Each group completed different vocabulary learning tasks that varied in the amount of involvement they induced. The tasks were a jigsaw task (Group A) and an information gap task (Group B). During the ten treatment sessions, recall and retention of the 100 unfamiliar target words were tested through immediate and delayed posttests. Data were analyzed using repeated-measures ANOVA. The results indicated that learners benefited more from the jigsaw task with higher involvement load.
Rahmani, Jafari, and Izadpanah (2018) examined the effect of four types of post-reading tasks with different indices of task-induced involvement load (Laufer & Hulstijn, 2001) on EFL learners' recognition and recall of unfamiliar L2 words. To this end, 88 intermediate EFL learners were randomly assigned to four groups and were instructed to carry out four different tasks after reading two narrative texts: (1) simple sentence writing; (2) text summary writing; (3) creative sentence writing; and (4) imaginary story writing. A day after the output activity session, the participants took two post-tests: a production test and a recognition test. Three weeks later, the delayed post-tests were administered. A mixed (split-plot) ANOVA was run to compare the performances of the groups on the immediate and delayed post-tests. The results revealed overall significant within-group and between-group differences among the four groups in both the immediate and delayed posttests. The creative sentence writing group outperformed the other three groups.
In the same line of inquiry, Un-udom (2018) investigated initial vocabulary learning and retention in light of the involvement load hypothesis, whose components of need, search, and evaluation form a motivational-cognitive construct held to result in vocabulary learning and retention. A total of 58 EFL learners were divided into two groups, Group A and Group B, to perform the tasks. Group A built a sentence with a target word shown in a marginal gloss, while Group B constructed a sentence after looking up the meaning in a bilingual dictionary. The two sentence-writing tasks were thus designed with different levels of involvement. The results showed no significant difference between the two tasks and only partially supported the involvement load hypothesis, since the task with low involvement yielded more vocabulary knowledge than the task with high involvement in both initial vocabulary learning and retention.

Participants
Two intact sophomore BA classes in translational studies participated in the present study. Each class had between 20 and 30 students. Twenty were chosen from each class and allocated to one of the experimental groups. Their ages ranged from 18 to 28. They were both male and female, although gender was not a focus of this research. They were in the seventh of eight semesters. By the end of the sixth semester they had completed approximately 80 units in total, 70 of which were in translation-related courses.

Reading tasks
Two reading tasks were used in the study, with the target words in bold print to help the participants notice them. The first task was a reading comprehension passage, an article chosen from Reading Master (Liu et al., 2002). The passage consisted of 395 words. It was about the suppression of emotions and the potential threats of such behavior to the mental and physical health of human beings. The two groups only had to read the text and answer its multiple-choice items. Since the participants had to know the meanings of the target words to answer the comprehension questions, they were told to bring their own dictionaries to class and use them when necessary. It should be noted that all the students in the reading group had already been trained by their teachers before the study began and knew how to use a dictionary. Since it was necessary in this task to use the dictionary to find and work out the meanings of the polysemous words, all three involvement components of need, search, and evaluation were present. As Laufer and Hulstijn (2001) claim, this type of reading task triggers neither need (being irrelevant to the task) nor search for the meaning (because of the glossary provided) nor evaluation; in other words, the involvement load for the task is 0 (− + − + − = 0). Here the minus symbol marks the absence of each of the three components of the task as defined by Laufer and Hulstijn (2001). Thus, the load of the task was at its lowest possible level and, accordingly, as Laufer and Hulstijn (2001) claim, the task offers the lowest likelihood of incidental acquisition of the target vocabulary.
In the second task, the participants had to read the same reading passage with the target words omitted. The target words were placed at the top of the page in random order. Having completed the gap-fill task, they had to answer the same comprehension questions as the first group. In this task, the need component was moderate, because it was externally induced, i.e., by the task itself. There was no search component, since students were provided with glosses and did not have to look up the words in a dictionary. In order to fill in the blanks with the correct words, the candidate words provided by the researcher had to be evaluated against one another to determine their contextual appropriacy. The task thus induced a moderate amount of evaluation. Based on the involvement load hypothesis, the involvement index of the task was 2 (need: + (1); search: − (0); evaluation: + (1)).
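The index arithmetic described above can be illustrated with a minimal Python sketch. It assumes the scoring commonly attributed to Laufer and Hulstijn (2001), in which an absent component counts 0, a moderate one 1, and a strong one 2; the class and variable names are hypothetical and are not part of the study's materials.

```python
from dataclasses import dataclass

# Assumed weights: absent = 0, moderate = 1, strong = 2.
# The two tasks in this study only use "absent" and "moderate".
LEVELS = {"absent": 0, "moderate": 1, "strong": 2}


@dataclass(frozen=True)
class Task:
    """A vocabulary task described by its three involvement components."""
    name: str
    need: str
    search: str
    evaluation: str

    def involvement_index(self) -> int:
        # The index is the plain sum of the three component weights.
        return LEVELS[self.need] + LEVELS[self.search] + LEVELS[self.evaluation]


# Component profile behind the index of 0 reported for the first task.
zero_load_reading = Task("reading (index 0)", "absent", "absent", "absent")

# Gap-fill task: moderate need, no search, moderate evaluation.
gap_fill = Task("gap-fill", "moderate", "absent", "moderate")

for task in (zero_load_reading, gap_fill):
    print(task.name, task.involvement_index())
```

Under these assumptions the sketch reproduces the two indices reported in the text, 0 and 2.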
The two reading texts were checked for their level of difficulty, and an effort was made to keep the difficulty low.

Target words
A total of nine words (agitate, malicious, quarrel, unfortunately, appeasement, intention, condition, withstand, preserve) were tested in the study. The selection criterion was frequency: the target words had to be of low frequency. This was decided deliberately to make sure that participants had not been exposed to the words beforehand. The words were therefore selected based on Nation's (2001) classification of vocabulary. Also, to control for intrinsic difficulty (the number of syllables might affect acquisition), all the target words were to be selected with the same number of syllables; the researchers planned to select words with two syllables. Moreover, to be representative of the range of vocabulary, nouns, verbs, and adjectives were given an equal share of three out of nine. Function words were not targeted in the present study.

Vocabulary knowledge scale (VKS)
The VKS was developed in the context of research on the vocabulary development of ESL learners in a university setting (Paribakht & Wesche, 1993a, 1993b). This instrument captures, in a relatively efficient way, certain stages in the initial development of core knowledge of given words. Paribakht and Wesche (1997) assert that the VKS should be viewed as a practical instrument for use in studies of the initial recognition and use of new words. The VKS uses a scale combining self-report and performance items to elicit both self-perceived and demonstrated knowledge of specific words in written form. The scale ratings range from complete unfamiliarity, through recognition of the word and some idea of its meaning, to the ability to use the word with grammatical and semantic accuracy in a sentence. The primary goal in developing the VKS was to capture initial stages or levels in word learning that are subject to accurate self-report or efficient demonstration, and that are precise enough to reflect gains during a relatively brief instructional period.
(1) I don't remember having seen this word before.
(2) I have seen this word before, but I don't know what it means.
(3) I have seen this word before, and I think it means …………….

As represented in the proposed categories of the VKS, a given learner may indicate one of the five levels of vocabulary knowledge. Based on the knowledge levels demonstrated by learners' performance, Paribakht and Wesche (1997) proposed the following scoring system (Figure 1):

Figure 1. VKS scoring: self-report categories (I to V), possible scores, and meaning of scores
Score 1: The word is not familiar at all.
Score 2: The word is familiar but its meaning is not known.
Score 3: A correct synonym or translation is given.
Score 4: The word is used with semantic appropriateness in a sentence.
Score 5: The word is used with semantic appropriateness and grammatical accuracy in a sentence.

The possible scores for a word on this instrument and their relationship to the self-report categories are given in Figure 1. As illustrated, wrong responses in self-report categories III, IV, or V will lead to a score of 2. A score of 3 indicates that an appropriate synonym or translation has been given for self-report categories III or IV. A score of 4 is given if the word is used in a sentence demonstrating the learner's knowledge of its meaning in that context but with inaccurate grammar (e.g., a target noun used as a verb: "This famous player announced his retire") or a mistakenly conjugated or derived form is given (e.g., "losed" for "lost"). A score of 5 reflects both semantically and grammatically correct use of the target word, even if other parts of the sentence contain errors.
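The scoring rules just described can be sketched as a small Python function. This is an illustrative reading of the rules as stated in this section, not the published scoring procedure: the function name and its boolean parameters are hypothetical, and the sketch simplifies by treating any category V response without a semantically appropriate sentence as a wrong response (score 2).

```python
def vks_score(category: int,
              meaning_correct: bool = False,
              semantically_appropriate: bool = False,
              grammatically_accurate: bool = False) -> int:
    """Map a self-report category (1-5, i.e. I-V) and a rater's judgments to a score."""
    if category not in (1, 2, 3, 4, 5):
        raise ValueError("category must be between 1 and 5")
    if category == 1:
        return 1  # the word is not familiar at all
    if category == 2:
        return 2  # familiar, but meaning not known
    if category in (3, 4):
        # A correct synonym or translation earns 3; a wrong one drops to 2.
        return 3 if meaning_correct else 2
    # Category V: the learner wrote a sentence with the word.
    if not semantically_appropriate:
        return 2  # wrong response (simplifying assumption of this sketch)
    return 5 if grammatically_accurate else 4


# Example: a learner claims category V and writes a sentence that shows the
# meaning but uses the wrong word form ("announced his retire").
print(vks_score(5, semantically_appropriate=True, grammatically_accurate=False))
```

Keeping the rater's judgments as explicit arguments separates what the learner reports from what the learner demonstrates, which is the distinction the scale is built on.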

Data collection procedures
The study's procedure comprised two stages. In the first stage, the target words and the VKS were given to participants of the first group to verify that they did not know the vocabulary. Those already familiar with the target words were excluded from the study. Simultaneously, the second experimental group (E2) was given the second reading comprehension task in another place. After finishing the test, they were given VKS1 to check whether any learning of the target vocabulary had occurred as a result of the treatment. At this point, the preliminary results of the treatment were available. However, because we also aimed to measure retention of the vocabulary beyond the immediate, incidental learning, VKS2 was administered two weeks later. This time, participants of the two experimental groups (E1 and E2) did not sit for reading comprehension tests. They were given only the list of words they had been exposed to two weeks earlier during the reading comprehension tasks and had to answer the questions included in VKS2.

Data analysis procedure
For data analysis, participants' answers on the tests were first converted into scores. Then, the VKS scores of the two experimental groups were analyzed and compared through statistical procedures in SPSS 21. To this end, an independent samples t-test was used to compare performances on VKS1. Then, to compare the performances of the two groups on the delayed vocabulary test (VKS2), a second independent samples t-test was run.
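For readers who want to see what an independent samples t-test computes, the following is a minimal sketch of the equal-variance (pooled) form of the statistic using only the Python standard library. It is not the SPSS procedure used in this study. The function name `pooled_t` is hypothetical, and the scores below are made-up illustrative values, not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples,
    assuming equal population variances (Student's pooled-variance t-test)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    # standard error of the difference between the two group means
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / std_err
    return t_stat, df


# Made-up mean VKS scores for illustration only (NOT the study's data)
high_load = [4, 5, 4, 3, 5]
no_load = [2, 1, 2, 2, 1]

t_stat, df = pooled_t(high_load, no_load)
print(f"t = {t_stat:.3f}, df = {df}")
```

The significance value is then obtained from the t distribution with the returned degrees of freedom. Statistical packages report it directly; in Python, `scipy.stats.ttest_ind` is one commonly used option, although it was not part of this study's analysis.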

Results
Before the administration of the reading comprehension and VKS tests, it was necessary to check that the participants were not familiar with the selected words. For this purpose, all 40 participants were given the words and a VKS. Their scores were recorded and a mean was calculated for each participant. Results showed that the students were neither familiar with the words nor knew their meanings. Then, they were randomly divided into two groups of 20 students each.
The first research question posed by the present study was: Is there any significant difference between the high-involvement load group vs. lack of involvement load group across two vocabulary tasks? To answer it, both groups were provided with the related reading passages, after which the VKS was administered. This VKS was called VKS1 because it was given to the students immediately after the reading tests. Results can be seen in Tables 1 and 2. It is evident that the mean obtained from students with high level of involvement load is higher than that obtained from those with lack of involvement load. To check whether the difference between the performances of these two groups was significant, a t-test was run; the results are presented in Table 2.
As Table 2 shows, the Sig. value is reported as 0.000 (p < 0.05), so there is a significant difference between the performances of the two groups. In other words, the group with high level of involvement load performed significantly better than the group with lack of involvement load.
The second research question in this study was "Is there any difference between vocabulary retention of high involvement load group vs. lack of involvement load group?" To answer it, the same procedures used for the first research question were carried out.
The mean of the group with high level of involvement load is 2.10, while for the group with no involvement load it is 1.70. What is notable is the sharp drop in the performance of the group with high level of involvement load, from 4.3 in VKS1 to 2.10 in VKS2. For the group with no involvement load, however, the mean increased negligibly from 1.60 to 1.70, which indicates that retention was essentially unaffected in the absence of involvement load.
To check whether the groups differed significantly on VKS2 (retention), an independent samples t-test was run on the scores obtained after the two-week interval (Table 3).
As can be seen in Table 4, the results of the independent samples t-test indicated that the difference between the mean scores of the two groups was not significant on VKS2. In other words, after separation from the reading passage, the group with high level of involvement load still had the higher mean score on the VKS. However, the group with lack of involvement load showed no substantial change between its performance on VKS1 and VKS2. Overall, the mean scores of both groups converged to almost equal values in the retention test, though that of the group with high level of involvement load was slightly higher.

Discussion and conclusion
The findings of the present study agree with the results of a number of studies in the literature. Some previous studies concluded that higher levels of involvement load did render better levels of acquisition on the part of the learners. The present study concluded that higher involvement load led to better acquisition of vocabulary in reading tasks, so it is consistent with this line of inquiry. Findings obtained by Amini and Maftoon (2017), Hazrat (2015), Un-udom (2018), and Keyvanfar and Badraghi (2011) were in agreement with those of the present study. Like the present study, they concluded that involvement load can play a significant role in rendering better vocabulary acquisition. All these studies, along with the present one, also imply that accurate manipulation of involvement load is important and requires close attention to the target students, level of task difficulty, classroom environment, level of proficiency, etc.
More specifically, Keating (2008) indicated that vocabulary learning and retention in a second language are contingent upon a task's involvement load (i.e., the amount of need, search, and evaluation it imposes), as proposed by Laufer and Hulstijn (2001). In a similar vein, Pourakbari and Biria (2015) confirmed the benefits of using high involvement load tasks by proposing that teachers and language learners can use tasks with higher involvement indexes, regardless of their type, in order to improve vocabulary acquisition.
Moreover, Amini and Maftoon (2017) lent support to the present finding by claiming that participants with high cognitive capacity used the verbal information for their credibility attribution. Similarly, Un-udom (2018) supported one type of task by reporting that shared control over task selection led to higher task involvement. In other words, they supported the application of tasks leading to higher learning outcomes combined with more effort directly invested in learning.
The present results were also in line with those obtained by Keyvanfar and Badraghi (2011). They also recommended examining tasks for the incorporation of vocabulary and suggested that the evaluation component might be playing the major role in task-induced involvement load.
Finally, Lu and Huang (2009), who examined three types of tasks, found that the one with higher involvement load produced better vocabulary retention than the other types. On the other hand, a number of studies reached contrary findings, claiming that levels of involvement load did not render different levels of vocabulary acquisition. Such findings were obtained by Rahmani et al. (2018) and Pourakbari and Biria (2015). For them, the "involvement load hypothesis" has not yet provided a definite answer to vocabulary acquisition in spite of its inclusiveness. However, these findings are in line with the second phase of the present study, in that results were not significant after the retention interval.
All in all, the findings of the present study were two-fold. Firstly, positive and significant effects of reading tasks on the acquisition of vocabulary were confirmed. Better results were obtained when the participants were more involved with the vocabulary within the reading tasks. Secondly, the same positive and significant effects of tasks with higher involvement load were not evident in the delayed posttest. Nonetheless, although these results were not significant, members of the group with higher involvement load performed better on the delayed posttest than members of the group with lack of involvement load.
To sum up, the present study provided strong support for the involvement load construct of Laufer and Hulstijn (2001). Increasing the overall involvement load levels was found to be effective in promoting vocabulary acquisition through tasks. Additionally, controlling for learner-related deficiencies such as dictionary use habits, writing skills, and attention span, which were present in Haratmeh (2012) and Van Polen (2014), was found useful for the components of search and evaluation to take effect. In a nutshell, the present study provided strong support for the predictive value of the task-induced involvement load hypothesis, which many studies could not achieve due to learner-related factors. It can be suggested that, keeping in mind the specific features of the learners in a specific context, the construct of involvement load can be exploited to design different tasks conducive to incidental vocabulary acquisition.
Given the importance of vocabulary in EFL/ESL contexts, the incorporation of vocabulary with appropriate levels of difficulty is an important issue, so great attention needs to be paid to the selection of suitable vocabulary for EFL/ESL learners. This importance is well captured by Wilkins, who held that without grammar very little can be conveyed, and without vocabulary nothing can be conveyed. Regarding the present findings, it can be claimed that designing tasks with high involvement loads can offer new strategies and solutions to current problems within the realm of vocabulary acquisition. The findings can be employed by EFL teachers, instructors, and vocabulary trainers to design reading tasks with appropriate levels of difficulty and vocabulary with a suitable involvement load level. In so doing, they can check the vocabulary to be incorporated in the reading tasks for its level of involvement load and then present it to the learners. In this way, they can train learners with better vocabulary knowledge.
Also, EFL textbook designers can make use of the present findings in order to incorporate tasks with high involvement loads. These can be included with reading tasks to make reading a more involving and engaging activity. Besides, they might be able to apply high involvement load vocabulary to language skills other than reading. For instance, practitioners of listening, writing, and speaking skills can provide their students with tasks whose vocabulary involvement loads are higher. In this way, vocabulary, which is an indispensable part of every piece of L2 material, can be acquired better and more efficiently.
In sum, using the framework of task-induced involvement load and specifically the findings of the present study, both material designers and teachers can prepare reading materials accompanied by tasks which both check reading comprehension and direct the learners' attention to the meanings of specific words that are assumed to be important for the learners. In particular, tasks that require learners to search for the meanings of the target words and create a text incorporating these words can be designed to promote vocabulary gain and retention. Also, considering the dynamics of the classes, such as class time and learners' competence in writing, the most suitable task among equally loaded tasks can be preferred over the others. Designing tasks that best suit the proficiency level and capabilities of the learners is of crucial importance, as overwhelming tasks will in no way be beneficial when the participants cannot attend to the target words thoroughly. Similarly, employing tasks that are not challenging enough for a specific audience will not bring about the expected vocabulary expansion.
Further studies can investigate different levels of vocabulary involvement load and test their effects on each of the three language skills other than reading. Moreover, the same reading tasks as those introduced by the present study can be examined by future investigations for BA and MA students of English, or more generally BA and MA students of different fields in ESP courses, and EFL/ESL learners of different levels of proficiency. This could result in interesting findings that would be beneficial to better vocabulary teaching and learning.
Since the present study dealt with only two levels of involvement load, namely high level and lack of involvement load, future studies can take account of a larger number of involvement load levels. In this sense, a diversity of tasks with different levels of involvement load can be explored. Results would be beneficial to practitioners of vocabulary, who would have a wide range of tasks to use for a specific group of learners. This could result in novel ways to teach vocabulary and could make English learning a more attractive activity. A final recommendation for further research is the administration of several delayed posttests at different time intervals. The present study administered a delayed posttest after a single time interval. Future studies can use posttests at different time intervals to see which interval best serves the retention of vocabulary with high involvement loads.
Like any other study, this study was not without limitations. First of all, this study chose its participants from a single university and therefore did not have a large sample size. As mentioned in the participants section, the participants of this study were 40 students at the BA level at Shahrekord University. For this reason, caution should be exercised in generalizing the present findings. Secondly, it should be noted that this research employed two reading tasks with two levels of involvement load as its instructional materials. This can be a limitation, since one could use a larger number of involvement load levels such as "lack of involvement load", "low level of involvement load", "medium level of involvement load", "high level of involvement load", etc. This diversity might be taken into account if results are expected to be more comprehensive. Moreover, the third limitation refers to the sampling of the participants. Had the participants been randomly chosen from different universities, more reliable results would have been obtained. In addition, these findings would have been more generalizable if participants had been chosen from among learners in different EFL contexts. If so, results would have been generalizable to a larger group of Iranian EFL learners of a specific level of language proficiency.