Computer Vision and Action Recognition

Author: Md. Atiqur Rahman Ahad

Preface
Foreword
Acknowledgments
List of Figures
List of Tables
1. Introduction
1.1 Introduction
1.2 What is Action?
1.3 Action Recognition in Computer Vision
1.4 Application Realms of Action Recognition
1.5 Categorization
1.6 Think Ahead!
2. Low-level Image Processing for Action Representations
2.1 Low-level Image Processing for Action Representations
2.2 Pre-processing Steps
2.3 Segmentation and Extraction
2.3.1 Feature Detection from an Image
2.3.2 Edge Detection
2.3.3 Corner Points
2.3.4 Blob Detectors
2.3.5 Feature Descriptors
2.3.6 Segmentation
2.4 Local Binary Pattern
2.4.1 LBP—Variants
2.5 Structure from Motion (SFM)
2.5.1 Constraints of FM
2.5.2 Improvements of FM
2.6 Other Issues
2.6.1 Intensity Normalization
2.6.2 Image Matching and Correspondence Problem
2.6.3 Camera Calibration
2.7 Think Ahead!
3. Action Representation Approaches
3.1 Action Representation Approaches
3.2 Classification of Various Dimensions of Representations
3.2.1 Bag-of-Features (BoF) or Bag-of-Visual-Words (BoVW)
3.2.2 Product Manifold Approaches
3.3 Action Recognition Approaches
3.3.1 Interest-point-based Approaches
3.3.2 Hidden Markov Model-based Approaches
3.3.3 Eigenspace-based Approach
3.3.4 Approaches to Manage Occlusion
3.3.5 Other Approaches
3.4 View-invariant Methods
3.5 Gesture Recognition and Analysis
3.6 Action Segmentation and Other Areas
3.7 Affective Computing and Expression Analysis
3.7.1 Games with Emotional Involvement
3.7.2 Interactive Arts
3.7.3 Anatomically-based Talking Head
3.8 Action Segmentation
3.8.1 Gestures
3.8.2 Basic Motion
3.8.3 Fundamental Gesture
3.8.4 Motion Alphabet
3.8.5 Atomic Movement
3.8.6 Direction-based Basic Motion
3.8.7 Distinct Behaviors
3.8.8 Motion Patterns based on Symbols
3.8.9 Basic Movement Transition Graph
3.9 Gait Analysis
3.10 Action Recognition in Low-resolution
3.10.1 Application Areas
3.10.2 Related Works on Low-Resolution Video Processing
3.11 Discussion
3.11.1 Salient Region and Its Associated Salient Region Construction
3.11.2 Biologically-inspired Visual Representations
3.12 Conclusion
3.13 Think Ahead!
4. MHI – A Global-based Generic Approach
4.1 Motion History Image (MHI)
4.2 Why MHI?
4.3 Various Aspects of the MHI — A Tutorial
4.3.1 Formation of an MHI Image
4.3.2 Motion Energy Image (MEI)
4.3.3 Parameter—τ
4.3.4 Parameter—δ
4.3.5 Temporal Duration vs. Decay Parameter
4.3.6 Update Function ψ(x, y, t)
4.3.7 Feature Vector for the MHI
4.4 Constraints of the MHI Method
4.4.1 Self-occlusion Problem
4.4.2 Failure in Dynamic Background
4.4.3 Improper Implementation of the Update Function
4.4.4 Label-based Recognition
4.4.5 Failure with Motion Irregularities
4.4.6 Failure to Differentiate Similar Motions
4.4.7 Not View-invariant Method
4.4.8 Problem with Varied Motion Duration
4.4.9 Non-trajectory Nature
4.5 Developments on the MHI
4.5.1 Direct Implementation of the MHI
4.5.2 Modified MHI
4.6 Solutions to Some Constraints of the Basic MHI
4.6.1 Solutions to Motion Self-occlusion Problem
4.6.2 Solving Variable-length Movements
4.6.3 timed-Motion History Image
4.6.4 Hierarchical-MHI
4.6.5 Pixel Signal Energy
4.6.6 Pixel Change History
4.6.7 Motion Flow History
4.6.8 Contour-based STV
4.6.9 Solving View-Invariant Issue
4.7 Motion Analysis
4.8 Implementations of the MHI
4.8.1 The MHI and its Variants in Recognition
4.8.2 The MHI and its Variants in Analysis
4.8.3 The MHI and its Variants in Interactions
4.9 MHI and its Future
4.10 Conclusion
4.11 Think Ahead!
5. Shape Representation and Feature Vector Analysis
5.1 Feature Points Tracking
5.1.1 Two-frame-based Approaches
5.1.2 Long-sequence-based Approaches
5.1.3 Discussions
5.2 Shape Representation Schemes
5.2.1 Contour-based Methods
5.2.2 Region-based Methods
5.3 Moment Invariants
5.3.1 Hu Moments for Feature Sets
5.3.2 Zernike Moments for Feature Sets
5.4 Component Analysis Methods
5.4.1 Appropriate Feature Selection
5.4.2 Dimension Reduction
5.5 Pattern Classifications
5.5.1 Supervised Learning
5.5.2 Unsupervised Learning
5.5.3 Nearest Neighbor
5.5.4 Cross-validation—Partitioning Scheme
5.6 Evaluation Matrices
5.7 Think Ahead!
6. Action Datasets
6.1 Action Datasets
6.2 Necessity for Standard Datasets
6.2.1 Motion Capture System
6.3 Datasets on Single-person in the View
6.3.1 KTH Dataset
6.3.2 Weizmann Dataset
6.3.3 IXMAS Dataset
6.3.4 CASIA Action Database
6.3.5 UMD Dataset
6.3.6 ICS Action Database
6.3.7 Korea University Gesture Database
6.3.8 Wearable Action Recognition Database (WARD)
6.3.9 Biological Motion Library (BML)
6.3.10 HDM05 (Hochschule der Medien) Motion Capture Database
6.4 Gesture Datasets
6.4.1 Cambridge Gesture Dataset
6.4.2 NATOPS Dataset
6.4.3 Keck Gesture Dataset
6.5 Datasets on Social Interactions
6.5.1 Youtube Dataset
6.5.2 Youtube Video Dataset
6.5.3 Hollywood2 Human Action (HOHA) Datasets
6.5.4 UCF Sports Dataset
6.5.5 Soccer Dataset
6.5.6 Figure-skating Dataset—Caltech Dataset
6.5.7 ADL—Assisted Daily Living Dataset
6.5.8 Kisses/Slaps Dataset
6.5.9 UIUC Action Dataset
6.6 Datasets on Other Arenas
6.6.1 Actions in Still Images
6.6.2 Nursing-home Dataset
6.6.3 Collective Activity Dataset
6.6.4 Coffee and Cigarettes Dataset
6.6.5 People Playing Musical Instrument (PPMI)
6.6.6 DARPA’s Mind’s Eye Program
6.6.7 VIRAT Video Dataset
6.6.8 UMN Dataset: Unusual Crowd Activity
6.6.9 Web Dataset
6.6.10 HumanEva Dataset
6.6.11 University of Texas Dataset
6.6.12 Other Datasets
6.7 Challenges Ahead on Datasets
6.8 Think Ahead!
7. Challenges Ahead
7.1 Challenges Ahead
7.2 Key Challenges Ahead in Action Recognition
7.3 Few Points for New Researchers
7.4 Conclusion
7.5 Think Ahead!
Bibliography


Book details

Year: 2011

Language: English