Subtitle: none

Author:

Classification number:

ISBN:9780857294791

Summary

Many real-world applications of pattern recognition (PR) systems require human post-processing to correct the errors committed by machines. This can create bottlenecks in recognition systems, yielding high operational costs. This important text/reference proposes a radically different approach to this problem, in which users of a system are involved during the recognition process. This can help to avoid later errors and reduce the costs associated with post-processing. The book also examines a range of advanced multimodal interactions between the machine and the users, including handwriting, speech and gestures.

Topics and features:

• Presents a thorough introduction to the fundamental concepts and general PR approaches for multimodal interaction modelling and search (or inference)
• Provides numerous examples and a helpful Glossary
• Includes work carried out in the context of the Spanish research program Multimodal Interaction in Pattern Recognition and Computer Vision (MIPRCV), which involves more than 100 highly-qualified researchers from ten research institutions
• Discusses approaches for computer-assisted transcription of handwritten and spoken documents
• Examines systems for computer-assisted language translation, interactive text generation and parsing, relevance-based image retrieval, and interactive document layout analysis
• Reviews several full working prototypes of multimodal interactive PR applications, including live demonstrations that can be publicly accessed through the Internet

Addressing the emerging field of interactive and multimodal systems in a fresh, unified and integrated way, this unique book is highly recommended reading for graduate students, academic and industrial researchers, lecturers, and practitioners working in the field of pattern recognition.

Dr. Alejandro Héctor Toselli is an Associate Professor at the Department of Computer Systems and Computation of the Polytechnic University of Valencia, Spain. Dr. Enrique Vidal and Dr. Francisco Casacuberta both hold the title of Full Professor at the same institution.

Table of Contents

Multimodal Interactive Pattern Recognition and Applications
Foreword
Preface
Contents
Chapter 1: General Framework
1.1 Introduction
1.2 Classical Pattern Recognition Paradigm
1.2.1 Decision Theory and Pattern Recognition
1.3 Interactive Pattern Recognition and Multimodal Interaction
1.3.1 Using the Human Feedback Directly
1.3.2 Explicitly Taking Interaction History into Account
1.3.3 Interaction with Deterministic Feedback
Example: Interactive Human Karyotyping
1.3.4 Interactive Pattern Recognition and Decision Theory
1.3.5 Multimodal Interaction
Basic Multimodal Fusion
Using Interaction Information to Help Decoding Non-deterministic Feedback Signals
Example: Non-deterministic Feedback in Human Karyotyping
1.3.6 Feedback Decoding and Adaptive Learning
Example: Adapting HTR Feedback Models in Human Karyotyping
1.4 Interaction Protocols and Assessment
1.4.1 General Types of Interaction Protocols
Example: Human Karyotyping Interaction Protocols
1.4.2 Left-to-Right Interactive-Predictive Processing
1.4.3 Active Interaction
1.4.4 Interaction with Weaker Feedback
1.4.5 Interaction Without Input Data
1.4.6 Assessing IPR Systems
1.4.7 User Effort Estimation
1.5 IPR Search and Confidence Estimation
1.5.1 \
Example: Word Graph of Human Karyotypes
1.5.2 Confidence Estimation
Example: Estimating Confidence Measures of Human Karyograms
1.6 Machine Learning Paradigms for IPR
1.6.1 Online Learning
Bayesian Statistics
1.6.2 Active Learning
1.6.3 Semi-Supervised Learning
1.6.4 Reinforcement Learning
References
Chapter 2: Computer Assisted Transcription: General Framework
2.1 Introduction
2.2 Common Statistical Framework for HTR and ASR
2.3 Common Statistical Framework for CATTI and CATS
2.4 Adapting the Language Model
2.5 Search and Decoding Methods
2.5.1 Viterbi-Based Implementation
2.5.2 Word-Graph Based Implementation
Example: Word Graph of Handwritten Text
2.6 Assessment Measures
References
Chapter 3: Computer Assisted Transcription of Text Images
3.1 Computer Assisted Transcription of Text Images: CATTI
3.2 CATTI Search Problem
3.2.1 Word-Graph-Based Search Approach
3.2.2 Word Graph Error-Correcting Parsing
3.3 Increasing Interaction Ergonomics in CATTI: PA-CATTI
3.3.1 Language Model and Search
3.4 Multimodal Computer Assisted Transcription of Text Images: MM-CATTI
3.4.1 Language Model and Search for MM-CATTI
3.5 Non-interactive HTR Systems
3.5.1 Main Off-Line HTR System Overview
Off-Line HTR Preprocessing
Off-Line HTR Feature Extraction
Modeling and Search
3.5.2 On-Line HTR Subsystem Overview
On-Line HTR Preprocessing
On-Line HTR Feature Extraction
Character, Word and Language Modeling and Search
3.6 Tasks, Experiments and Results
3.6.1 HTR Corpora
ODEC-M3
IAMDB
CS MANUSCRIPT
UNIPEN Corpus
3.6.2 Results
Baseline Off-Line HTR Results
Baseline On-Line HTR Results
CATTI Results
PA-CATTI Results
MM-CATTI Results
3.7 Conclusions
References
Chapter 4: Computer Assisted Transcription of Speech Signals
4.1 Computer Assisted Transcription of Audio Streams
4.2 Foundations of CATS
4.3 Introduction to Automatic Speech Recognition
4.3.1 Speech Acquisition
4.3.2 Pre-process and Feature Extraction
4.3.3 Statistical Speech Recognition
4.4 Search in CATS
4.5 Word-Graph-Based CATS
4.5.1 Error Correcting Prefix Parsing
4.5.2 A General Model for Probabilistic Prefix Parsing
4.6 Experimental Results
4.6.1 Corpora
4.6.2 Error Measures
4.6.3 Experiments
4.6.4 Results
4.7 Multimodality in CATS
4.8 Experimental Results
4.8.1 Corpora
4.8.2 Experiments
4.9 Conclusions
References
Chapter 5: Active Interaction and Learning in Handwritten Text Transcription
5.1 Introduction
5.2 Confidence Measures
5.3 Adaptation from Partially Supervised Transcriptions
5.4 Active Interaction and Active Learning
5.5 Balancing Error and Supervision Effort
5.6 Experiments
5.6.1 User Interaction Model
5.6.2 Sequential Transcription Tasks
5.6.3 Adaptation from Partially Supervised Transcriptions
5.6.4 Active Interaction and Learning
5.6.5 Balancing User Effort and Recognition Error
5.7 Conclusions
References
Chapter 6: Interactive Machine Translation
6.1 Introduction
6.1.1 Statistical Machine Translation
6.2 Interactive Machine Translation
6.2.1 Interactive Machine Translation with Confidence Estimation
Confidence Measure for IMT
6.3 Search in Interactive Machine Translation
6.3.1 Word-Graph Generation
6.3.2 Error-Correcting Parsing
6.3.3 Search for n-Best Completions
6.4 Tasks, Experiments and Results
6.4.1 Pre- and Post-processing
6.4.2 Tasks
6.4.3 Evaluation Measures
6.4.4 Results
6.4.5 Results Using Confidence Information
6.5 Conclusions
References
Chapter 7: Multi-Modality for Interactive Machine Translation
7.1 Introduction
7.2 Making Use of Weaker Feedback
7.2.1 Non-explicit Positioning Pointer Actions
7.2.2 Interaction-Explicit Pointer Actions
7.3 Correcting Errors with Speech Recognition
7.3.1 Unconstrained Speech Decoding (DEC)
7.3.2 Prefix-Conditioned Speech Decoding (DEC-PREF)
7.3.3 Prefix-Conditioned Speech Decoding (IMT-PREF)
7.3.4 Prefix Selection (IMT-SEL)
7.4 Correcting Errors with Handwritten Text Recognition
7.5 Tasks, Experiments and Results
7.5.1 Results when Incorporating Weaker Feedback
7.5.2 Results for Speech as Input Feedback
7.5.3 Results for Handwritten Text as Input Feedback
7.6 Conclusions
References
Chapter 8: Incremental and Adaptive Learning for Interactive Machine Translation
8.1 Introduction
8.2 On-Line Learning
8.2.1 Concept of On-Line Learning
8.2.2 Basic IMT System
8.2.3 Online IMT System
8.3 Related Topics
8.3.1 Active Learning on IMT via Confidence Measures
8.3.2 Bayesian Adaptation
8.4 Results
8.5 Conclusions
References
Chapter 9: Interactive Parsing
9.1 Introduction
9.2 Interactive Parsing Framework
9.3 Confidence Measures in IP
9.4 IP in Left-to-Right Depth-First Order
9.4.1 Efficient Calculation of the Next Best Tree
9.5 IP Experimentation
9.5.1 User Simulation Subsystem
9.5.2 Evaluation Metrics
9.5.3 Experimental Results
9.6 Conclusions
References
Chapter 10: Interactive Text Generation
10.1 Introduction
10.1.1 Interactive Text Generation and Interactive Pattern Recognition
10.2 Interactive Text Generation at the Word Level
10.2.1 N-Gram Language Modeling
10.2.2 Searching for a Suffix
10.2.3 Optimal Greedy Prediction of Suffixes
10.2.4 Dealing with Sentence Length
10.2.5 Word-Level Experiments
10.3 Predicting at Character Level
10.3.1 Character-Level Experiments
10.4 Conclusions
References
Chapter 11: Interactive Image Retrieval
11.1 Introduction
11.2 Relevance Feedback for Image Retrieval
Related Work
11.2.1 Probabilistic Interaction Model
11.2.2 Greedy Approximation Relevance Feedback Algorithm
11.2.3 A Simplified Version of GARF
11.2.4 Experiments
WANG Database
11.2.5 Image Feature Extraction
Color Histograms
Tamura Features
11.2.6 Baseline Methods
Simple Method
Relevance Score
Rocchio Relevance Feedback
11.2.7 Discussion
11.3 Multimodal Relevance Feedback
11.3.1 Fusion by Refining
11.3.2 Early Fusion
11.3.3 Late Fusion
11.3.4 Proposed Approach: Dynamic Linear Fusion
11.3.5 Experiments
11.3.6 Discussion
References
Chapter 12: Prototypes and Demonstrators
12.1 Introduction
12.1.1 Passive, Left-to-Right Protocol
12.1.2 Passive, Desultory Protocol
12.1.3 Active Protocol
12.1.4 Prototype Evaluation
12.2 MM-IHT: Multimodal Interactive Handwritten Transcription
12.2.1 Prototype Description
User Interaction Protocol
12.2.2 Technology
IHT Engine
Web Interface
12.2.3 Evaluation
Assessment Measures
Corpus
Participants
Apparatus
Procedure
Design
Discussion of Results
Limitations of the Study and Conclusion
12.3 IST: Interactive Speech Transcription
12.3.1 Prototype Description
User Interaction Protocol
12.3.2 Technology
Prediction Engine
Communication Module
12.3.3 Evaluation
12.4 IMT: Interactive Machine Translation
12.4.1 Prototype Description
User Interaction Protocol
System Interaction Modes
12.4.2 Technology
Interactive CAT Server
Web Interface
12.4.3 Evaluation
12.5 ITG: Interactive Text Generation
12.5.1 Prototype Description
System Architecture
User Interaction Protocol
System Interaction Modes
12.5.2 Technology
12.5.3 Evaluation
12.6 MM-IP: Multimodal Interactive Parsing
12.6.1 Prototype Description
User Interaction Protocol
12.6.2 Technology
Parsing System
Web Interface
12.6.3 Evaluation
12.7 GIDOC: GIMP-Based Interactive Document Transcription
12.7.1 Prototype Description
Block Detection
Line Detection
Training
Transcription
User Interaction Protocol
12.7.2 Technology
12.7.3 Evaluation
12.8 RISE: Relevant Image Search Engine
12.8.1 Prototype Description
User Interaction Protocol
Web Interface
12.8.2 Technology
Starting from Scratch
Gathering Images
Automatic Image Tagging
Retrieval Procedure
12.8.3 Evaluation
12.9 Conclusions
References
Glossary
Index
