07 More Mimicry in Human-Action Datasets


The mimetic actions of machine learning engineers inspired the development of the Synset_Gloss project.


Another example of mimetic actions used to train a HAR model comes from the University of Texas at Dallas (UTD) video dataset, which contains 27 categories of human actions. This dataset informed the development of ideas for the Synset_Gloss project and was analysed with the Kinetics 400 HAR model; the creative merits of using this particular system were discussed in 02 Legacy Systems. Feeding the UTD dataset into this model revealed a number of limitations, stemming largely from the coarse taxonomy of the legacy HAR model. In one instance, video clips show UTD researchers mimicking a bowling action. Since the Kinetics dataset does not include bowling among its defined action categories, the model resorts to finding the nearest matches, in effect dissecting continuous motion into the rigid classifications demonstrated in Video 1. The intended action of bowling, its fluid and relational character obscured, was classified as several unrelated categories, including “front raises”, “side kick”, “jumpstyle dancing”, “juggling soccer ball”, “squat”, and “lunge”. This illustrates a critical challenge for the computer model: while it can interpret and label the constituent movements (an impressive feat in itself), the interpretive gap of machine vision highlights the absence of a contextual scheme for combining them into a meaningful, coherent action.

Video 1. 12. bowling, 2020, 02:02. Development video for Synset_Gloss. No audio.