Multimedia vs. Analogue Text: Learning Outcome and the Importance of Short-Term Memory Capacity

The purpose of this study was to investigate the importance of individual differences in short-term memory (STM) capacity for learning from film (digitized video) and analogue text in a natural learning environment. The results are based on a survey of 396 students at Bachelor's level (military cadets, teachers' college and psychology majors). A short-term memory test battery was developed to measure different capacity types for several individuals simultaneously in a classroom environment. Respondents were divided into two groups, one receiving a film presentation and one reading an analogue text (the film narrative). The subject matter was the formation of the Norwegian nation in the tenth and eleventh century (history subject at high school/college level). A knowledge test measuring the total learning outcome as well as details and interconnection (understandings/context) was developed. In total, the results showed that text gave the best learning outcome. For both film and text, the learning outcome for details and understandings increased with increasing STM capacity, with the largest increase from low to medium capacity. Progressive capacity (successive) matters more than multi-capacity (processing a lot concurrently). Non-verbal intelligence (Raven/RAPM) has an underlying general importance, but is less important than the total STM capacity. Some types of capacity are more important than others depending on the presentation form and learning content. Visual sensory memory capacity for learning details in text was one of the types most clearly associated with learning outcome. This was explained by code-switching (representation transformation) during processing of information.


Introduction
What gives the best learning outcome, film or text? Research on this is relevant, not just in terms of how to organize multimedia-oriented instruction, briefing and training, but also in order to understand which cognitive processes result in differences in learning outcome. This study's main research question was therefore to measure the learning outcome from film and text, and to examine what impact short-term memory capacity has on learning outcomes depending on whether the learning takes place from film or text. The study is part of a series of three consecutive studies. The present one deals with STM and learning in general, while the other studies point toward specific sub-processes in STM and the effect of specific tools in multimedia (ICT).
Theoretically, the problem is based in the classic theories of cognitive information processing, and in the relationship between multi-presentation-based theories and empirical results related to STM's capacity range. Classic laboratory-based studies show that pictures are remembered better than words, and teaching-oriented research emphasizes that media with simultaneous multiple representations, such as film and ICT-based presentations, provide a better learning outcome than text [1][2][3][4][5].
On the one hand, the improved learning effect with several simultaneous representations is justified by dual coding theory, which means coding simultaneously through multiple forms of presentation, for example visual-verbal and/or visual-iconic and/or auditory-verbal/symbolic. This focus is, among others, represented by Wittich and Shuller [6][7][8]. Another variant of the same starting point is cue-summation theory, represented by, among others, Hartman [9][10][11]. This theory supports dual coding theory, but adds the assumption that a better effect can occur only if the degree of relevance between stimuli is high.
On the other hand, a number of empirical studies, including the classic Miller [12] and subsequent studies by representatives of this tradition such as Baddeley, Conway and Kane [13][14][15], conclude that STM has a defined and limited memory span, or capacity: Miller's [12] magical number 7±2. This may mean that multi-presentation does not necessarily produce a better learning outcome or a stronger memory trace. Too much information at once can overload the memory span and lead to a loss of capacity, so that information is not processed thoroughly enough to create lasting memory traces.
Both of these approaches conclude that there are large individual differences in capacity and memory span in relation to verbal and visual abilities [3][16][17][18][19][20]. The discussion is linked to visual and verbal short-term memory or channels, based on classic channel theories of the information management process [21,22]. The ability to transcode representation forms during processing or between the memory levels, as well as the formation and activation of semantic networks in long-term memory (LTM), is also emphasized [23,24]. Common to these empirical studies and theoretical approaches is that they have taken place in laboratories, and that the teaching materials used to measure learning outcomes have largely been designed for this purpose, often with simple pictures and words, rather than using realistic teaching material.
Furthermore, the tests measuring STM capacity (STMC) have often been outdated, having been developed before the media explosion in the 1990s and designed to be carried out individually, not in groups or in natural learning environments such as classrooms. These tests have also concentrated on response measurements, i.e., the reaction rate between stimulus and response (e.g. CogLab). The research has either studied only the effect on learning outcome, possibly connecting the results to existing school grades, or relied solely on STM tests without comparing the capacity measures with performance on different teaching materials. Both classic and recent results on learning outcomes from various multimedia differ greatly, and do not give a clear picture of which presentation form (text or film) is best [25,26].
The argument for different learning outcomes from various forms of presentation has largely been limited to explaining the individual differences in ability to process information, particularly in relation to visual and verbal capacities. However, these specific abilities, i.e., the ability or capacity types in short-term memory that are most important for learning with particular teaching materials or presentation forms, such as text versus film, have scarcely been empirically investigated. A major reason for this is methodological [2], such as the difficulty of finding suitable learning materials of which the participants have no previous knowledge. This has been addressed in this study by developing a test battery for measuring STM capacity in groups, by using realistic learning materials, and by carrying out the experiment in a natural learning environment. Thus the development of design, test and measurement instruments has been a central part of this project, and this paper is a further adaptation of that research.
The purpose of this study is to examine which STM-related capacity type or working process contributes most to learning outcome from film and text. If this is achieved, it may give a useful basis for preparation within fields of special education and for briefings and interaction within emergency-related environments, such as operative arenas within the defence, energy and transport sectors [27]. It can also be applicable to crisis management and narrative attrition amongst a population [28].
The specific goals for this study were to examine:
• The differences in learning outcome between the three STM capacity levels while being exposed to film or text.
• The relationship (correlation) between STM capacity types and learning outcome from film and text.
In this study, the term short-term memory (STM) is used, but here the term also covers some of the more process-oriented components within the term working memory [29][30][31]. This term includes recoding processes and phonetic circles or double-codings [13,32]. Among these, the term "sensory memory" (SM) is used as a concept for the first recoding process, including the identification process in the other memory stores, such as long-term memory (LTM) and STM.
The study's test battery for the STMC measurement used the classic storage interval of 18-30 seconds, while 1-2 seconds was used for the sensory register (SR). Capacity ranges from 8 to 14 units (items) were used, i.e., single images or two-phoneme words, and 30 items for simultaneous exposure capability. For SR, a capacity range of 4-6 items was used. Similar capacity ranges are found in other STM tests (e.g. CogLab) and special education diagnostic tools (Aston Index and the Illinois Test of Psycholinguistic Abilities, ITPA). Memory processes or capacity types correspond to ability variables in relation to the teaching material (details or context) from film and text. Information is subjected to various recoding processes in STM and in interaction with LTM. Pattern recognition or non-verbal intelligence is defined in the model as an underlying general ability.

Methodology
Samples and procedures
The sample (n=396) consisted of students at undergraduate level, including officers (n=94, Military Academy), student teachers (n=194) and a mixed group of engineering and psychology students (n=101). The sample contained 193 women and 185 men. In this investigation, the field of study was not used as a variable. The respondents were treated as one group, divided according to whether they were exposed to film (n=192) or to text (n=192). Gender was not considered, but the distribution of men and women was 99/88 for the film group and 94/97 for the text group. The overall response rate was 95.5%.
The survey was carried out in the respondents' regular classroom or lecture hall, in connection with a regular lesson. First, a brief (5 min) introduction was given, and forms for anonymity and informed consent were signed. Then followed the STM test, conducted in plenary with the use of PowerPoint (about 20 min). The respondents checked off their answers on the distributed form. Finally, the educational film was seen or the text was read, followed by a knowledge test (about 20 min in total). The entire survey was completed in about 60 minutes.

Measurement of learning outcomes
Two main variables were measured in this study. One was learning outcome from digital film presentation or analogue text and the other was the short-term memory capacity (STM). Learning outcomes were measured in two samples. One group was exposed to a digital film presentation and another group received an analogue text as a presentation. The film consisted of a selected sequence of 9 minutes and 15 seconds from an educational presentation which dealt with an era in Norwegian history, the unification-conflict (800-1270 AD).
The text material was identical to the film's narrative, with a total of 1113 words. The allotted time for text reading was 8 minutes and 25 seconds, which gave the same exposure time for both film and text, based on a normal reading speed of about 140 words per minute. The learning outcomes from both presentation forms were measured with a knowledge test consisting of 13 questions where the answers were divided equally between the film and text. The questions also measured differences in the learning material. The difference between detail and context (understandings) was emphasized [33]. The knowledge related to details required that certain dates and names were remembered, and this was measured with 9 questions. The knowledge that required context and further explanations was measured with 4 questions. The knowledge test gave five response options for each of the questions. All response options were academically relevant, but only one of the five was correct according to the presentations. The responses were oriented only toward verbal information, either given only verbally (also reproduced in the text), or both verbally and by text signs (in the film). The term "signs" means that important verbal information also appears as a short on-screen text (i.e., a form of double coding).
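The exposure-time arithmetic above can be checked with a minimal sketch. The word count (1113), the reading speed (about 140 words per minute) and the allotted time (8 min 25 s) are taken from the text; the function names are illustrative assumptions and not part of the study.

```python
def reading_time_seconds(word_count: int, words_per_minute: float) -> float:
    """Seconds needed to read word_count words at a given reading speed."""
    return word_count / words_per_minute * 60.0


def implied_words_per_minute(word_count: int, seconds: float) -> float:
    """Reading speed implied by reading word_count words in the given time."""
    return word_count / (seconds / 60.0)


WORDS = 1113                      # length of the text material (from the paper)
ALLOTTED_SECONDS = 8 * 60 + 25    # allotted reading time, 8 min 25 s

# Time needed at the nominal speed of 140 words per minute.
time_at_140 = reading_time_seconds(WORDS, 140)

# Reading speed that the allotted time actually corresponds to.
implied_speed = implied_words_per_minute(WORDS, ALLOTTED_SECONDS)

print(f"time at 140 wpm: {time_at_140:.0f} s")
print(f"implied speed:   {implied_speed:.1f} wpm")
```

Under these figures the allotted time corresponds to a reading speed somewhat below 140 words per minute, which is consistent with the "about 140" stated above.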

Measurement of STM-capacity
The STMC was measured with eight subtests, each of which measured the capacity of a particular feature or capacity type of STM (Table 1). The test battery was developed at the Military College, for use both in the military and for this project. The STM test was developed after studying the existing memory tests, which among other things revealed weaknesses with respect to testing many students simultaneously in their natural learning environment. The test is thus developed on the basis of this project's STM model and in accordance with associated cognitive theories. The selection and construction of appropriate capacity types and associated test equipment are also derived from the context of semantic theory construction [34][35][36][37].

Raven (RAPM)
Raven matrices of increasing difficulty were shown one at a time; for each matrix, the next logical pattern had to be identified among several options within a fixed display time. For the measurement of non-verbal general capacity, 12 pictures of increasing difficulty were picked from the Raven Advanced Progressive Matrices (RAPM) test. The exposure time was 20 seconds per matrix. The number of correct answers on this test was normally distributed within the group (n=396). Raven matrices were chosen to provide a well-established and documented measure of non-verbal capacity or intelligence, and to relate this to the STM tests. The 12 matrices gave a clear normal distribution and a good spread. Several of the matrices had a p-value of approximately 0.5, which indicates that each chosen Raven matrix captured a lot of variance. The scores on all Raven matrices were added to a raw score, which was transformed to derived scores. The number of correct answers on the Raven was divided into three categories: 0-3 correct (n=80), 4-7 correct (n=224) and 8-12 correct (n=80). This corresponds approximately to the quartiles.
To measure the capacity depending on how information is presented, certain tests were merged. Three capacity categories were defined: progressive capacity (PM), multi-output capacity (MC), and sensory capacity (SM) (Table 2). All STM subtests merged (excluding Raven) were used as a measure of the STM capacity level. This was divided into three capacity levels: low, medium and high. The division followed the distribution: low capacity corresponded to the first quartile (26.8%), medium to the second and third quartiles (48.5%), and high to the fourth quartile (24.7%). The total STMC was constructed from all the subtests, with each individual STM test weighted to give it equal importance. This weighting was achieved by entering the z-score from each individual STM test into the total measure of STM capacity.
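The construction of the total STM score can be illustrated with a minimal sketch, assuming that the composite is the plain sum of per-subtest z-scores and that the three levels are cut at the first and third quartiles. The function names, the choice of population standard deviation and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev, quantiles


def z_scores(values):
    """Standardize raw scores (assumes the scores are not all identical)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def composite_stm(subtests):
    """Sum z-scores across subtests so that each subtest weighs equally.

    subtests maps a subtest name to a list of raw scores, one per respondent,
    in the same respondent order for every subtest.
    """
    standardized = [z_scores(scores) for scores in subtests.values()]
    return [sum(person) for person in zip(*standardized)]


def capacity_levels(composite):
    """Label each respondent low / medium / high using quartile cut points."""
    q1, _, q3 = quantiles(composite, n=4)
    return [
        "low" if c <= q1 else "high" if c > q3 else "medium"
        for c in composite
    ]


# Hypothetical raw scores for two subtests and eight respondents.
example = {
    "PMvv": [1, 2, 3, 4, 5, 6, 7, 8],
    "SMvi": [2, 4, 6, 8, 10, 12, 14, 16],
}
total = composite_stm(example)
levels = capacity_levels(total)
```

Standardizing before summing keeps a subtest with a wider raw-score range (here the hypothetical SMvi scores) from dominating the total.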

Statistical Analysis
Pearson's product-moment correlation was used to investigate the relationship between performance on each STM test and learning outcome from film or text. A stepwise multiple regression analysis was used to clarify the significance of each test in relation to the categories and total STM. The enter procedure was also used to show non-significant β values where appropriate. ANOVA and MANOVA were used to test differences in outcome depending on STM capacity level and presentation form (film or text). In this analysis the general STM capacities (and Raven) were examined and presented first. Secondly, the more specific types of capacity and their impact on learning outcome were studied. Table 3 shows the average values for the learning outcome of details and context from film and text presentation. In total, the results from the text were best, with an average of 7.43 correct answers out of 13 possible. There was a small but significant difference between film and text in terms of learning outcome (F=3.69, p<0.05), where the text gave the best result. The difference corresponds to Cohen's d=0.20, a small effect size.
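Two of the statistics named above, Cohen's d and Pearson's product-moment correlation, can be sketched as follows. This is an illustration using the standard textbook definitions (pooled standard deviation for d), not the authors' analysis code, and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical knowledge-test scores (number correct out of 13).
text_scores = [8, 7, 9, 6, 8]
film_scores = [7, 6, 8, 5, 7]
d = cohens_d(text_scores, film_scores)

# Hypothetical STM subtest scores paired with knowledge-test scores.
r = pearson_r([3, 5, 6, 8, 9], [5, 6, 6, 8, 9])
```

By Cohen's conventional benchmarks, a d of about 0.2 is a small effect, which matches the reading of the film versus text difference given above.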

Difference in learning outcome from film and text in general
There were minor differences in relation to learning of details and context. However, there was a significant difference in the category learning of details, where text was the best (F=9.02, p<0.01). The η² values show that approximately 3% of the differences in outcome of the details can be explained by whether the respondents had been exposed to film or text.
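The interpretation of η² as a share of explained variance can be made concrete with a short sketch. It uses the standard one-way definition, the between-group sum of squares divided by the total sum of squares, which is assumed here to be the variant reported; the function name and the data are hypothetical.

```python
from statistics import mean


def eta_squared(*groups):
    """Share of total variance in the outcome explained by group membership."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of every individual score around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    return ss_between / ss_total


# Hypothetical detail scores for a film group and a text group.
film_details = [4, 5, 6, 5, 4]
text_details = [5, 6, 6, 5, 6]
share = eta_squared(film_details, text_details)
```

A value of 0.03 would mean that 3% of the variation in detail scores is associated with presentation form, with the remainder due to other sources.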

Table 4 shows the differences in outcome depending on whether STM capacity was low, medium or high. For both film and text, there was a significant overall difference in outcome between the three capacity levels (Wilks' λ=0.92, p<0.001 for film and Wilks' λ=0.98, p<0.05 for text).

Learning outcome and STM capacity levels
The group that was exposed to film showed significant differences between the three capacity levels in learning outcome for details (F=10.82, p<0.001) and for relationships (F=8.81, p<0.001). Respondents with high STM capacity performed significantly better than those with low STM capacity. The text group showed a significant difference between capacity levels in learning outcome for details (F=7.33, p<0.05), but no marked difference for relationships.
The differences in learning outcome between STM-capacity levels were larger in relation to detail than to context for both film and text. There was a significant increase in learning outcome related to details in accordance with a higher STM capacity from the group exposed to film. The same tendency applied to learning outcome of details from text, with a sharp increase from low to medium STM. This might indicate that the STM capacity is more important for the learning of details than contexts, and matters more for learning details from film than from text.
For learning of context (understandings) there was no difference in outcome between those who had medium and high-capacity STM. As shown in Table 4, there were significant differences in learning outcome from film in relation to context. The increase was in the low to medium STM capacity. In the group that had text presentation, there were no significant differences in learning outcome in relation to context between any of the groups.

The association between STM tests and learning the details and context of film and text
Overall there was a pronounced association between progressive memory capacity (PM) in general and learning outcome for both film (r=0.26, p<0.001) and text (r=0.22, p<0.01) (Pearson product-moment correlation). This related especially to the verbal part of the PM capacity, as measured by the PMvv test. There were similar results for multi-capacity (MC), which consisted of three subtests (MviB+MviF+MvvG). Table 5 shows correlations (stepwise) between STM tests and learning of details and contexts from film and text, as well as the overall learning outcome. Non-significant β values that tend to be pronounced are indicated in parentheses. This is informative for the explanation of learning contexts from text.
Raven is a clear and dominant explanatory variable. While the other tests reveal some tendencies, they do not provide significant contributions. Table 5 shows that Raven explains most of the variance, but there are some notable exceptions. For the learning outcome of details from film, PMvv explained 8% of the variance. As expected, Raven has a general explanatory tendency for learning outcome of both detail and context, and for learning in general.
There are nevertheless interesting patterns in the specific explanations that each STM test provides.

Discussion
Most impact studies on learning outcome from media have examined STM capacities at a general level, and the question of which presentation form provides the best learning has varied in results [26]. However, one of the reasons for the disparate results may be that these studies have examined the ability variable solely at a general level, and have not studied specific capabilities within the total STM capacity. The present study reveals that, in the overall result, text provides a better learning outcome than film. It also shows that learning is dependent on the STM capacity, and that the question of what is best, film or text, must be viewed in light of certain types of STM capacities and what is to be taught (details or relationships/context).
When film was used as a presentation, there was an increased learning outcome for both detail and context in accordance with rising STM capacity. The increase was greatest from low to medium STM capacity, both for learning from film and from text, and especially for text. This may imply that a special adaptation in teaching and briefing may be most beneficial for students with low capacity, in order to achieve a consistent learning outcome. For text, there was also a sharp increase from low to medium STM capacity for learning of details. The difference in outcome between the three capacity levels was greater for the learning of details than for the learning of contexts, for both film and text. This may imply that an additional pedagogical effort should be placed upon learning of details regardless of whether the presentation is film or text.
A possible theoretical explanation for the difference in outcome between film and text, as well as the difference between the three capacity levels, may lie in the specific types of capacity in STM. Owing mainly to the verbal capacity (PMvv), the results in these tests revealed an apparent association between the progressive memory capacity (PM) and learning outcome from both film and text. The term 'progressive' may be understood as revealing a little information at a time, which is the case for both film and text. PM tests measure the capacity for processing this type of information presentation. If this capacity is high, it is possible that the individual grasps more information when it is presented successively. This applies in particular to verbal information. The knowledge test only required verbal information as a basis for answering the questions, that is, the film's speech (verbal-auditory), and similarly for the text (visual-verbal). For the film, the images simply functioned as a support or visualization of the verbal information, with a possible cue-summation function. However, this effect did not contribute to a better learning outcome than in text as long as PMvv was high. This may be interpreted to imply that a high verbal progressive STM capacity helps to ensure that information which is presented progressively is processed thoroughly. This leads to a better memory of the processed information, which gave positive results on the knowledge test.
Film provides a lot of information simultaneously, both visually and verbally. It might be assumed that a high multi-oriented STM capacity (MC) would be an advantage. The multi-oriented capacity tests show that there is a correlation between MC and learning outcome for both film and text, but mostly for text, according to the verbal test (MvvG). This may indicate that the multi-oriented STM capacity actually has no special significance for presentation forms which provide a lot of information in multiple forms at once (multimedia), compared with text (a single presentation). In fact, the correlation between the multi-oriented STM capacity and learning outcome was greater for text than for film. This may indicate that text, read thoroughly, involves the same amount of processing activity as when a message is delivered via multimedia.
The Raven also measured a general effect on learning outcome from both film and text. Raven was especially evident for the learning of relations/contexts from text, and gave more relevance here than any of the other STM-capacity types. This may be due to the fact that text requires a more active and independent decoding effort from the individual than film does. On the other hand film may provide pictorial information that visualizes and supports educational purposes. An independent information synthesis of this kind is a feature that may have certain similarities in pattern recognition and logical deduction as required by the Raven tests. This may explain the apparent relationship between Raven and the learning outcome of relationships from text. However, the total STM capacity has more impact than Raven for learning details and relationships from both film and text.
On the other hand, since only verbal tests were utilized, and verbal information was required as an answer, it is uncertain whether a different capacity type would have affected the learning outcome if the visual information had been included in the knowledge test [38]. However, we have seen that visual effects did not contribute significantly to the learning outcome in relation to pure text, but pictures did not reduce the learning outcome either, considering the multi-oriented capacity. Since the respondents did not know the category of the answers in advance, it might imply that they did not consciously focus solely on the verbal information during the presentation. Thus it is reasonable to believe that respondents absorbed a lot of visual information from the film which was not tested. Evidently this visual information did not overload the total capacity in a manner that interfered with the verbal information.
The relationship between the capacity of the sensory register (SM) and the learning outcome of the details from text was particularly evident. In this case the visual test made a difference (SMvi), which does not immediately seem rational. One might assume that the verbal capacity had a greater impact on learning of details from the text, especially when the knowledge test required detailed verbal answers. The result may be due to measurement errors, but it is reasonable to assume that such a mistake would turn out equally for both film and text, as both groups received the same test.
It is generally known from classical studies that simple images are often remembered better than simple words. This is often explained by the fact that the identification process between the sensory register and long-term memory is activated more quickly and directly with pictures than with words [39,40]. Even short two-phoneme words, as applied in this study, will require a reading process.
In order for the text to make sense, representation transcoding from word to image and an activation of semantic networks take place [24,41,42]. This comprehensive process contributes to loading the STM capacity (when reading words compared to pictures). The more elaborate cognitive processes in reading can also load verbal sensory capacity more than visual sensory capacity. A stronger visual effect could therefore have influenced the results, but this should have proved equal for both film and text. This effect should have given the visual tests an advantage, with better capacity results than the verbal tests, for both the film and the text group. Nevertheless, as the results from the visual sensory capacity were so clear for learning of details from text, this might be due to parallel cognitive processes between the properties measured by the visual sensory test and the working processes involved in reading the details from the text. When a text is read, the words are connected to inner visual images, or images that give the details a meaning [43]. Sensory visual capacity measures a similar process, where recognition and activation of single images happens rapidly.
If this capacity is high, it means that many images can be quickly identified and remembered in a short time. When reading a detail-rich text, the ability to immediately activate relevant images or ideas that connect to the text helps to give the details meaning. It is reasonable to assume that this process corresponds to the cognitive processes that are measured by the visual sensory test. This may explain the relationship between learning outcome of details from the text and visual sensory capacity. Since the results did not show a similar relationship with the other tests, it must be concluded that visual sensory capacity actually has more impact than verbal sensory capacity in relation to learning of details from the text.

Conclusion
The conclusion of this study must therefore be that learning is dependent on the STM capacity, and that some types of capacity are more important than others depending on the presentation and the learning material. Progressive capacity matters more than multi-capacity, while non-verbal intelligence has a general significance, but less than the overall STM capacity. This may be a useful starting point in planning and facilitation of learning for individuals with low STM capacity. Furthermore, this knowledge may be relevant for facilitation of training which includes multimedia, such as simulators, and for the construction of instruments in multimedia and e-learning. It may also be relevant for crisis management and narrative perception amongst a population with regard to mass communication, by focusing on the design and analysis of communication and news dissemination.