CausalBatch: solving complexity/performance tradeoffs for deep convolutional and LSTM networks for wearable activity recognition

Pellatt, Lloyd and Roggen, Daniel (2020) CausalBatch: solving complexity/performance tradeoffs for deep convolutional and LSTM networks for wearable activity recognition. ISWC/UbiComp 2020, Virtual Cancun, 12 Sep 2020 - 16 Sep 2020. Published in: UbiComp-ISWC '20: Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers. 272-277. ACM ISBN 9781450380768

Abstract

Deep neural networks consisting of a combination of convolutional feature extractor layers and Long Short Term Memory (LSTM) recurrent layers (referred to as DeepConvLSTM architectures hereafter) are widely used models for activity recognition from wearable sensors. However, the subtleties of training these models on sequential time series data are not often discussed in the literature. Continuous sensor data must be segmented into temporal 'windows' and fed through the network to produce a loss, which is used to update the parameters of the network. If the network is trained naively using batches of randomly selected data, as commonly reported, then its temporal horizon (the maximum delay at which input samples can affect the output of the model) is limited to the length of the window. An alternative approach, which we will call CausalBatch training, is to construct batches deliberately such that each consecutive batch contains windows which are contiguous in time with the windows of the previous batch, with only the first batch in the CausalBatch consisting of randomly selected windows. After a given number of consecutive batches (referred to as the CausalBatch duration t), the LSTM states are reset, new random starting points are chosen from the dataset and a new CausalBatch is started. This approach allows us to increase the temporal horizon of the network without increasing the window size, which enables networks to learn data dependencies on a longer timescale without increasing computational complexity. We evaluate these two approaches on the Opportunity dataset. We find that using the CausalBatch method we can reduce the training time of DeepConvLSTM by up to 90%, while increasing the user-independent accuracy by up to 6.3% and the class-weighted F1 score by up to 5.9%, compared to the same model trained by random batch training with the best performing choice of window size for the latter. Compared to the same model trained using the same window length, and therefore the same computational complexity and almost identical training time, we observe an 8.4% increase in accuracy and a 14.3% increase in weighted F1 score. We provide the source code for all experiments, as well as a PyTorch reference implementation of DeepConvLSTM, in a public GitHub repository.
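
The batch construction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' reference implementation: the function name `causal_batches`, its parameters, and the choice of non-overlapping windows (stride equal to the window length) are assumptions made for illustration only. The sketch produces window index ranges, not sensor data, and leaves the model and the LSTM state handling to the caller.

```python
import random


def causal_batches(n_samples, window, batch_size, duration, seed=0):
    """Yield (step, windows) for one CausalBatch (hypothetical helper).

    Each element of `windows` is a (start, end) index pair into a
    recording of `n_samples` samples. Assumption: windows do not
    overlap, so consecutive windows advance by exactly `window`.
    """
    rng = random.Random(seed)
    # Latest start that still leaves room for `duration` contiguous windows.
    max_start = n_samples - window * duration
    if max_start < 0:
        raise ValueError("recording too short for this window and duration")
    # Only the first batch uses randomly selected starting points.
    starts = [rng.randint(0, max_start) for _ in range(batch_size)]
    for step in range(duration):
        offset = step * window
        # Row i of this batch continues row i of the previous batch in time.
        yield step, [(s + offset, s + offset + window) for s in starts]


if __name__ == "__main__":
    for step, windows in causal_batches(1000, 24, 4, 5, seed=1):
        # step == 0 is where LSTM states would be reset; on later steps the
        # states from the previous batch would be carried forward.
        print(step, windows)
```

In a training loop, the caller would reset the LSTM states whenever `step` is 0 and reuse the states from the previous batch otherwise, then call the generator again with fresh random starting points once `duration` batches have been processed. Under the non-overlapping assumption, each row of the batch covers `window * duration` contiguous samples, while each forward pass still processes only `window` samples.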

Item Type: Conference Proceedings
Keywords: Neural Networks, Deep Learning, LSTM, Activity Recognition, Wearable Computing, Best Practices, Batch Training
Schools and Departments: School of Engineering and Informatics > Engineering and Design
SWORD Depositor: Mx Elements Account
Depositing User: Mx Elements Account
Date Deposited: 11 Dec 2020 15:25
Last Modified: 11 Dec 2020 15:25
URI: http://sro.sussex.ac.uk/id/eprint/95598
