08 HAR Misclassifications


Preliminary research for this project identified concrete instances of misclassification using the British public information films The Spirit of Dark and Lonely Water and A Boy Goes Cycling. In the three examples presented here, the HAR system incorrectly labelled certain film frames, highlighting limitations in the model’s classification accuracy. To investigate further, the relevant videos used to train the HAR system were downloaded from YouTube. A selection of stills from these videos is presented alongside the film frames to better understand the algorithm’s “decision-making” process.


Figure 9. HAR Misclassification – driving tractor, 2020. A composite of video stills from the Kinetics 400 dataset “driving tractor” category and the British public information film The Spirit of Dark and Lonely Water, 1973. 

At the centre of fig. 9 is a frame from the British public information film The Spirit of Dark and Lonely Water, labelled “driving tractor” by the ResNet HAR system. The frame shows a group of children moving through debris in an anonymous wasteland – a scene without an actual tractor. This misclassification reveals a broader limitation of HAR and ML models: how can models trained on limited, specific examples generalise their “knowledge” to new, unfamiliar contexts? The ResNet HAR model was trained on the Kinetics 400 dataset, which includes 922 videos of tractor driving, some of which depict children operating tractors (top right). While it is unclear whether this particular detail influenced the classification, it raises questions about the interpretive reliability of HAR models. For an audience, the frames of The Spirit of Dark and Lonely Water blend seamlessly into a coherent narrative. However, when the film is analysed frame by frame by the HAR model, without a guiding conceptual scheme, an alternate interpretation emerges. At this granular level, an outline resembling a tractor becomes apparent, with the children positioned roughly where a tractor driver would sit. The model’s classification appears to be shaped by its training dataset: it may be responding to the tractor-like outline and the presence of a human figure as cues for labelling.

Figure 10. HAR Misclassification – swimming backstroke, 2020. A composite of video stills from the Kinetics 400 dataset “swimming backstroke” category and the British public information film The Spirit of Dark and Lonely Water, 1973.

Central to fig. 10, a frame from the British public information film The Spirit of Dark and Lonely Water depicts a child in distress in the water, yet the HAR system mislabels it as “swimming backstroke”. Other misclassifications for the same scene include “catching fish”, “snorkelling”, and “swimming breaststroke”. While these labels share a semantic association with water-related activities, they highlight a significant limitation in the HAR model’s ability to differentiate between routine swimming actions and the urgent, life-threatening condition of drowning. This misclassification underscores the model’s lack of nuance in recognising critical contexts within similar visual settings.
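The way several water-related labels can attach to a single scene can be sketched in a few lines of Python. This is a minimal illustration and not the HAR system used in the project: the label list, the raw scores, and the helper names `softmax` and `top_k_labels` are all hypothetical, invented to show the ranking step only. It assumes a classifier that assigns one score to each label in a fixed vocabulary and can only choose among those labels, so a scene with no matching category (here, on the assumption that the vocabulary has no label for drowning) is still given the nearest available ones.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(labels, scores, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical vocabulary and scores for one frame; not taken from the project.
# The vocabulary is closed: the classifier cannot answer "none of the above".
LABELS = [
    "swimming backstroke",
    "catching fish",
    "snorkelling",
    "swimming breaststroke",
    "driving tractor",
]
SCORES = [4.1, 3.2, 3.0, 2.9, -1.5]  # made-up raw scores for illustration

for label, prob in top_k_labels(LABELS, SCORES, k=4):
    print(f"{label}: {prob:.2f}")
```

Under these invented scores, the four water-related labels occupy the top of the ranking. The point of the sketch is that the ranking step always returns some label from the vocabulary, however poorly the scene fits any of them.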

Figure 11. HAR Misclassification – abseiling, 2020. A composite of video stills from the Kinetics 400 dataset “abseiling” category and the British public information film A Boy Goes Cycling, 1963.

A review of the 1,146 training videos labelled “abseiling” in the Kinetics 400 dataset revealed a common feature: linear ropes under tension. This observation is significant in relation to specific frames from the British public information film A Boy Goes Cycling, where painted lines on the playground’s surface intersect at the child’s position and coincide with the bicycle frame, creating a configuration of lines around his body that visually “ensnares” him. As demonstrated in fig. 11, this arrangement parallels the training videos: the painted playground lines could be interpreted as ropes, and the bicycle frame might resemble the webbing straps of a climbing harness. This visual similarity likely prompts the HAR model to classify the scene under its “abseiling” category, despite the difference in actual context.