Summary
"Contemporary speech synthesis is perceived as inadequate for general adoption for user interaction, largely because it rests on an inadequate model of human speech production and perception. This book reviews the underlying model, brings out areas of inadequacy and suggests how improvements might be made. It is argued that a greater understanding of the fine detail of speech will enable new research and application initiatives. The authors draw on their extensive experience in both theoretical and applied research to bring forward proposals for producing more natural sounding synthetic speech."
"Developments in Speech Synthesis provides the basis for a comprehensive approach that will appeal to speech synthesis and language technology engineers specialising in building dialogue systems. It will also be a resource for computer science and engineering students at both advanced undergraduate and postgraduate levels, as well as researchers in the general field of speech synthesis."--BOOK JACKET.
Contents
Contents 7
Acknowledgements 15
Introduction 17
How Good is Synthetic Speech? 17
Improvements Beyond Intelligibility 17
Continuous Adaptation 18
Data Structure Characterisation 19
Shared Input Properties 20
Intelligibility: Some Beliefs and Some Myths 21
Naturalness 23
Variability 24
The Introduction of Style 26
Expressive Content 27
Final Introductory Remarks 29
Part I Current Work 31
1 High-Level and Low-Level Synthesis 33
1.1 Differentiating Between Low-Level and High-Level Synthesis 33
1.2 Two Types of Text 33
1.3 The Context of High-Level Synthesis 34
1.4 Textual Rendering 36
2 Low-Level Synthesisers: Current Status 39
2.1 The Range of Low-Level Synthesisers Available 39
2.1.1 Articulatory Synthesis 39
2.1.2 Formant Synthesis 40
2.1.3 Concatenative Synthesis 44
Units for Concatenative Synthesis 44
Representation of Speech in the Database 47
Unit Selection Systems: the Data-Driven Approach 48
Unit Joining 49
Cost Evaluation in Unit Selection Systems 51
Prosody and Concatenative Systems 51
Prosody Implementation in Unit Concatenation Systems 52
2.1.4 Hybrid System Approaches to Speech Synthesis 53
3 Text-To-Speech 55
3.1 Methods 55
3.2 The Syntactic Parse 55
4 Different Low-Level Synthesisers: What Can Be Expected? 59
4.1 The Competing Types 59
4.2 The Theoretical Limits 61
4.3 Upcoming Approaches 61
5 Low-Level Synthesis Potential 63
5.1 The Input to Low-Level Synthesis 63
5.2 Text Marking 64
5.2.1 Unmarked Text 64
5.2.2 Marked Text: the Basics 64
5.2.3 Waveforms and Segment Boundaries 66
5.2.4 Marking Boundaries on Waveforms: the Alignment Problem 67
5.2.5 Labelling the Database: Segments 70
5.2.6 Labelling the Database: Endpointing and Alignment 71
Part II A New Direction for Speech Synthesis 73
6 A View of Naturalness 75
6.1 The Naturalness Concept 75
6.2 Switchable Databases for Concatenative Synthesis 76
6.3 Prosodic Modifications 77
7 Physical Parameters and Abstract Information Channels 79
7.1 Limitations in the Theory and Scope of Speech Synthesis 79
7.1.1 Distinguishing Between Physical and Cognitive Processes 80
7.1.2 Relationship Between Physical and Cognitive Objects 81
7.1.3 Implications 81
7.2 Intonation Contours from the Original Database 81
7.3 Boundaries in Intonation 83
8 Variability and System Integrity 85
8.1 Accent Variation 85
8.2 Voicing 88
8.3 The Festival System 90
8.4 Syllable Duration 91
8.5 Changes of Approach in Speech Synthesis 92
9 Automatic Speech Recognition 95
9.1 Advantages of the Statistical Approach 96
9.2 Disadvantages of the Statistical Approach 97
9.3 Unit Selection Synthesis Compared with Automatic Speech Recognition 97
Part III High-Level Control 99
10 The Need for High-Level Control 101
10.1 What is High-Level Control? 101
10.2 Generalisation in Linguistics 102
10.3 Units in the Signal 105
10.4 Achievements of a Separate High-Level Control 106
10.5 Advantages of Identifying High-Level Control 106
11 The Input to High-Level Control 109
11.1 Segmental Linguistic Input 109
11.2 The Underlying Linguistics Model 110
11.3 Prosody 112
11.4 Expression 114
12 Problems for Automatic Text Markup 115
12.1 The Markup and the Data 116
12.2 Generality on the Static Plane 117
12.3 Variability in the Database – or Not 118
12.4 Multiple Databases and Perception 121
12.5 Selecting Within a Marked Database 121
Part IV Areas for Improvement 125
13 Filling Gaps 127
13.1 General Prosody 127
13.2 Prosody: Expression 128
13.3 The Segmental Level: Accents and Register 129
13.4 Improvements to be Expected from Filling the Gaps 131
14 Using Different Units 135
14.1 Trade-Offs Between Units 135
14.2 Linguistically Motivated Units 135
14.3 A-Linguistic Units 137
14.4 Concatenation 139
14.5 Improved Naturalness Using Large Units 139
15 Waveform Concatenation Systems: Naturalness and Large Databases 143
15.1 The Beginnings of Useful Automated Markup Systems 145
15.2 How Much Detail in the Markup? 145
15.3 Prosodic Markup and Segmental Consequences 148
15.3.1 Method 1: Prosody Normalisation 148
15.3.2 Method 2: Prosody Extraction 149
15.4 Summary of Database Markup and Content 151
16 Unit Selection Systems 153
16.1 The Supporting Theory for Synthesis 153
16.2 Terms 154
16.3 The Database Paradigm and the Limits of Synthesis 155
16.4 Variability in the Database 155
16.5 Types of Database 156
16.6 Database Size and Searchability at Low-Level 158
16.6.1 Database Size 158
16.6.2 Database Searchability 160
Part V Markup 161
17 VoiceXML 163
17.1 Introduction 163
17.2 VoiceXML and XML 164
17.3 VoiceXML: Functionality 164
17.4 Principal VoiceXML Elements 165
17.5 Tapping the Autonomy of the Attached Synthesis System 167
18 Speech Synthesis Markup Language (SSML) 169
18.1 Introduction 169
18.2 Original W3C Design Criteria for SSML 169
Consistency 169
Interoperability 170
Generality 170
Internationalisation 170
Generation and Readability 171
Implementability 171
18.3 Extensibility 171
18.4 Processing the SSML Document 171
18.4.1 XML Parse 172
18.4.2 Structure Analysis 172
18.4.3 Text Normalisation 173
18.4.4 Text-To-Phoneme Conversion 173
18.4.5 Prosody Analysis 175
18.4.6 Waveform Production 176
18.5 Main SSML Elements and Their Attributes 176
18.5.1 Document Structure, Text Processing and Pronunciation 176
18.5.2 Prosody and Style 177
18.5.3 Other Elements 178
18.5.4 Comment 178
19 SABLE 181
20 The Need for Prosodic Markup 183
20.1 What is Prosody? 183
20.2 Incorporating Prosodic Markup 183
20.3 How Markup Works 184
20.4 Distinguishing Layout from Content 184
20.5 Uses of Markup 185
20.6 Basic Control of Prosody 186
20.7 Intrinsic and Extrinsic Structure and Salience 188
20.8 Automatic Markup to Enhance Orthography: Interoperability with the Synthesiser 190
20.9 Hierarchical Application of Markup 191
20.10 Markup and Perception 192
20.11 Markup: the Way Ahead? 193
20.12 Mark What and How? 195
20.12.1 Automatic Annotation of Databases for Limited Domain Systems 196
20.12.2 Database Markup with the Minimum of Phonology 196
20.13 Abstract Versus Physical Prosody 198
Part VI Strengthening the High-Level Model 199
21 Speech 201
21.1 Introductory Note 201
21.2 Speech Production 202
21.3 Relevance to Acoustics 202
21.4 Summary 203
21.5 Information for Synthesis: Limitations 203
22 Basic Concepts 205
22.1 How does Speaking Occur? 205
22.2 Underlying Basic Disciplines: Contributions from Linguistics 207
22.2.1 Linguistic Information and Speech 207
22.2.2 Specialist Use of the Terms 'Phonology' and 'Phonetics' 208
22.2.3 Rendering the Plan 209
22.2.4 Types of Model Underlying Speech Synthesis 210
The Static Model 210
The Dynamic Model 210
23 Underlying Basic Disciplines: Expression Studies 213
23.1 Biology and Cognitive Psychology 213
23.2 Modelling Biological and Cognitive Events 214
23.3 Basic Assumptions in Our Proposed Approach 214
23.4 Biological Events 214
23.5 Cognitive Events 217
23.6 Indexing Expression in XML 219
23.7 Summary 220
24 Labelling Expressive/Emotive Content 223
24.1 Data Collection 224
24.2 Sources of Variability 225
24.3 Summary 226
25 The Proposed Model 229
25.1 Organisation of the Model 229
25.2 The Two Stages of the Model 230
25.3 Conditions and Restrictions on XML 230
25.4 Summary 231
26 Types of Model 233
26.1 Category Models 233
26.2 Process Models 234
Part VII Expanded Static and Dynamic Modelling 235
27 The Underlying Linguistics System 237
27.1 Dynamic Planes 237
27.2 Computational Dynamic Phonology for Synthesis 238
27.3 Computational Dynamic Phonetics for Synthesis 239
27.4 Adding How, What and Notions of Time 240
27.5 Static Planes 240
27.6 Computational Static Phonology for Synthesis 241
27.7 The Term Process in Linguistics 242
27.8 Computational Static Phonetics for Synthesis 244
27.9 Supervision 246
27.10 Time Constraints 246
27.11 Summary of the Phonological and Phonetic Models 247
28 Planes for Synthesis 249
Part VIII The Prosodic Framework, Coding and Intonation 251
29 The Phonological Prosodic Framework 253
29.1 Characterising the Phonological and Phonetic Planes 255
30 Sample Code 261
31 XML Coding 265
31.1 Adding Detail 266
31.2 Timing and Fundamental Frequency Control on the Dynamic Plane 272
31.3 The Underlying Markup 273
31.3.1 Syllables and Stress 274
31.3.2 Durations 276
31.4 Intrinsic Durations 277
31.5 Rendering Intonation as a Fundamental Frequency Contour 278
1: Assign Basic f0 Values to All S and F Syllables in the Sentence: the Assigned Value is for the Entire Syllable 279
2: Assign f0 for all U Syllables; Adjust Basic Values 279
3: Remove Monotony 280
4: For Sentences with RESET, where a RESET Point is a Clause or Phrase Boundary 280
32 Prosody: General 281
32.1 The Analysis of Prosody 282
32.2 The Principles of Some Current Models of Intonation Used in Synthesis 284
32.2.1 The Hirst and Di Cristo Model (Including INTSINT) 284
32.2.2 Taylor's Tilt Model 285
32.2.3 The ToBI (Tones and Break Indices) Model 285
32.2.4 The Basis of Intonation Modelling 286
32.2.5 Details of the ToBI Model 287
32.2.6 The INTSINT (International Transcription System for Intonation) Model 289
32.2.7 The Tatham and Morton Intonation Model 290
Units in T&M Intonation 290
33 Phonological and Phonetic Models of Intonation 293
33.1 Phonological Models 293
33.2 Phonetic Models 293
33.3 Naturalness 294
33.4 Intonation Modelling: Levels of Representation 297
Part IX Approaches to Natural-Sounding Synthesis 299
34 The General Approach 301
34.1 Parameterisation 301
34.2 Proposal for a Model to Support Synthesis 302
34.3 Segments and Prosodics: Hierarchical Ordering 303
34.4 A Sample Wrapping in XML 304
34.5 A Prosodic Wrapper for XML 305
34.6 The Phonological Prosodic Framework 306
35 The Expression Wrapper in XML 307
35.1 Expression Wrapping the Entire Utterance 308
35.2 Sourcing for Synthesis 309
35.3 Attributes Versus Elements 310
35.4 Variation of Attribute Sources 312
35.5 Sample Cognitive and Biological Components 313
35.5.1 Parameters of Expression 314
35.5.2 Blends 314
35.5.3 Identifying and Characterising Differences in Expression 314
35.5.4 A Grammar of Expressions 315
36 Advantages of XML in Wrapping 317
36.1 Constraints Imposed by the XML Descriptive System 319
36.2 Variability 319
37 Considerations in Characterising Expression/Emotion 321
37.1 Suggested Characterisation of Features of Expressive/Emotive Content 321
37.1.1 Categories 321
37.1.2 Choices in Dialogue Design 323
37.2 Extent of Underlying Expressive Modelling 324
37.3 Pragmatics 325
38 Summary 329
38.1 Speaking 329
38.2 Mutability 331
Part X Concluding Overview 333
Shared Characteristics Between Database and Output: the Integrity of the Synthesised Utterance 335
Concept-To-Speech 337
Text-To-Speech Synthesis: the Basic Overall Concept 338
Prosody in Text-To-Speech Systems 339
Optimising the Acoustic Signal for Perception 341
Conclusion 342
References 345
Author Index 351
Index 353
Developments in Speech Synthesis