Speech recognition with deep recurrent neural networks
Alex Graves,
Abdel-rahman Mohamed,
Geoffrey E. Hinton
|
13 |
2013 |
|
Librispeech: An ASR corpus based on public domain audio books
Vassil Panayotov,
Guoguo Chen,
Daniel Povey,
S. Khudanpur
|
12 |
2015 |
|
Audio Set: An ontology and human-labeled dataset for audio events
8 auth.
J. Gemmeke,
D. Ellis,
Dylan Freedman,
A. Jansen,
W. Lawrence,
R. C. Moore,
...
Manoj Plakal,
Marvin Ritter
|
11 |
2017 |
|
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
13 auth.
Jonathan Shen,
Ruoming Pang,
Ron J. Weiss,
M. Schuster,
N. Jaitly,
Zongheng Yang,
Z. Chen,
Yu Zhang,
Yuxuan Wang,
R. Skerry-Ryan,
...
R. Saurous,
Yannis Agiomyrgiannakis,
Yonghui Wu
|
11 |
2017 |
|
X-Vectors: Robust DNN Embeddings for Speaker Recognition
David Snyder,
D. Garcia-Romero,
Gregory Sell,
Daniel Povey,
S. Khudanpur
|
11 |
2018 |
|
CNN architectures for large-scale audio classification
13 auth.
Shawn Hershey,
Sourish Chaudhuri,
D. Ellis,
J. Gemmeke,
A. Jansen,
R. C. Moore,
Manoj Plakal,
D. Platt,
R. Saurous,
Bryan Seybold,
...
M. Slaney,
Ron J. Weiss,
K. Wilson
|
11 |
2016 |
|
SWITCHBOARD: telephone speech corpus for research and development
J. Godfrey,
E. Holliman,
J. McDaniel
|
11 |
1992 |
|
Building high-level features using large scale unsupervised learning
8 auth.
Quoc V. Le,
Marc'Aurelio Ranzato,
R. Monga,
M. Devin,
G. Corrado,
Kai Chen,
...
J. Dean,
A. Ng
|
11 |
2011 |
|
Listen, attend and spell: A neural network for large vocabulary conversational speech recognition
William Chan,
N. Jaitly,
Quoc V. Le,
O. Vinyals
|
11 |
2015 |
|
Signal estimation from modified short-time Fourier transform
D. Griffin,
Jae S. Lim
|
10 |
1983 |
|
Kernel independent component analysis
F. Bach,
Michael I. Jordan
|
10 |
2003 |
|
Improved backing-off for M-gram language modeling
Reinhard Kneser,
H. Ney
|
10 |
1995 |
|
A complete ensemble empirical mode decomposition with adaptive noise
M. E. Torres,
M. A. Colominas,
G. Schlotthauer,
P. Flandrin
|
10 |
2011 |
|
UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation
9 auth.
Huimin Huang,
Lanfen Lin,
Ruofeng Tong,
Hongjie Hu,
Qiaowei Zhang,
Yutaro Iwamoto,
...
Xianhua Han,
Yenwei Chen,
Jian Wu
|
10 |
2020 |
|
Space-time adaptive processing for airborne radar
J. Ward
|
10 |
1994 |
|
Extensions of recurrent neural network language model
Tomas Mikolov,
Stefan Kombrink,
L. Burget,
J. Černocký,
S. Khudanpur
|
10 |
2011 |
|
Statistical Parametric Speech Synthesis
H. Zen,
K. Tokuda,
A. Black
|
10 |
2007 |
|
On robust Capon beamforming and diagonal loading
Jian Li,
P. Stoica,
Zhisong Wang
|
10 |
2003 |
|
Enhancement of speech corrupted by acoustic noise
M. Berouti,
R. Schwartz,
J. Makhoul
|
10 |
1979 |
|