Speaker-Recognition-Based-on-Deep-Learning-An-Overview

This repo lists the reference papers cited in "Speaker Recognition Based on Deep Learning: An Overview".

For an analysis of the paper, see: An Overview of Speaker Recognition Based on Deep Learning (基于深度学习的说话人识别概述)

All references are listed in the order they appear in the original paper, and the PDF files can be downloaded from the Releases section on the right.

References list

1 An overview of text-independent speaker recognition From features to supervectors.pdf

2 Forensic speaker recognition.pdf

3 The inference of identity in forensic speaker recognition.pdf

4 an overview of speaker identification accuracy and robustness issues.pdf

5 Talker‐Recognition Procedure Based on Analysis of Variance.pdf

6 Speaker Verification Using Adapted Gaussian Mixture Models.pdf

7 Support vector machines using GMM supervectors for speaker verification.pdf

8 Joint factor analysis versus eigenchannels in speaker recognition.pdf

9 Front-end factor analysis for speaker verification.pdf

10 Bayesian speaker verification with heavy tailed priors.pdf

11 Analysis of i-vector length normalization in speaker recognition systems.pdf

12 A novel scheme for speaker recognition using a phonetically-aware deep neural network.pdf

13 Deep neural networks for small footprint text-dependent speaker verification.pdf

14 Xvectors Robust DNN embeddings for speaker recognition.pdf

15 Voxceleb a large-scale speaker identification dataset.pdf

17 Speaker recognition by machines and humans A tutorial review.pdf

18 An overview of automatic speaker recognition technology.pdf

19 An overview of statistical pattern recognition techniques for speaker verification.pdf

20 Spoofing and countermeasures for speaker verification A survey.pdf

21 Speaker diarization A review of recent research.pdf

22 The Attacker's Perspective on Automatic Speaker Verification An Overview.pdf

23 Speaker verification using deep neural networks A review.pdf

24 Learning Discriminative Features for Speaker Identification and Verification.pdf

25 Centroid-based deep metric learning for speaker recognition.pdf

26 An End-to-End Text-Independent Speaker Identification System on Short Utterances.pdf

27 Combining deep embeddings of acoustic and articulatory features for speaker identification.pdf

28 A Memory Augmented Architecture for Continuous Speaker Identification in Meetings.pdf

29 X-vectors Robust neural embeddings for speaker recognition.pdf

30 A study of interspeaker variability in speaker verification.pdf

31 Deep neural network based posteriors for text-dependent speaker verification.pdf

32 Deep Neural Networks and Hidden Markov Models in i-vector-based Text-Dependent Speaker Verification.pdf

33 A deep neural network speaker verification system targeting microphone speech.pdf

34 Exploiting sequence information for text-dependent speaker verification.pdf

35 Text-dependent speaker verification based on i-vectors, neural networks and hidden Markov models.pdf

36 Deep neural network approaches to speaker and language recognition.pdf

37 Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition.pdf

38 Time delay deep neural network-based universal background models for speaker recognition.pdf

39 Advances in deep neural network approaches.pdf

40 Phone-centric local variability vector for text-constrained speaker verification.pdf

41 A unified deep neural network for speaker and language recognition.pdf

42 The IBM 2016 speaker recognition system.pdf

43 Application of convolutional neural networks to speaker recognition in noisy conditions.pdf

44 Insights into deep neural networks for speaker recognition.pdf

45 Exploring robustness of DNN RNN for extracting speaker Baum-Welch statistics in mismatched conditions.pdf

47 Exploring the role of phonetic bottleneck features for speaker and language recognition.pdf

48 Analysis and Optimization of Bottleneck Features for Speaker Recognition.pdf

49 Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data.pdf

50 Combination of cepstral and phonetically discriminative features for speaker verification.pdf

51 Deep Neural Network Embeddings for Text-Independent Speaker Verification.pdf

52 Towards directly modeling raw speech signal for speaker verification using CNNs.pdf

53 Speaker recognition from raw waveform with sincnet.pdf

54 Avoiding speaker overfitting in end-to-end dnns using raw waveform for text-independent speaker verification.pdf

55 A complete end-to-end speaker verification system using deep neural networks From raw signals to verification result.pdf

56 Rawnet Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification.pdf

57 Short utterance compensation in speaker verification via cosine-based teacher-student learning of speaker embeddings.pdf

58 Ensemble additive margin softmax for speaker verification.pdf

59 Voxceleb2 Deep speaker recognition.pdf

60 Utterance-level aggregation for speaker recognition in the wild.pdf

61 Frequency and temporal convolutional attention for text-independent speaker recognition.pdf

62 End-to-end text-independent speaker verification with triplet loss on short utterances..pdf

63 Text-independent speaker verification based on triplet convolutional neural network embeddings.pdf

64 Seq2seq attentional siamese neural networks for text-dependent speaker verification.pdf

65 Orthogonal training for text-independent speaker verification.pdf

66 Orthogonality regularizations for end-to-end speaker.pdf

67 JHU-HLTCOE system for the VoxSRC speaker recognition challenge.pdf

68 Multi-resolution multi-head attention in deep speaker embedding.pdf

69 Deep speaker an end-to-end neural speaker embedding system.pdf

70 Deep speaker representation using orthogonal decomposition and recombination for speaker verification.pdf

71 Magneto X-vector magnitude estimation network plus offset for improved speaker recognition.pdf

72 Deep Speaker Embeddings for Short-Duration Speaker Verification.pdf

73 Deep Discriminative Embeddings for Duration Robust Speaker Verification.pdf

74 Boundary discriminative large margin cosine loss for text-independent speaker verification.pdf

75 Text-independent speaker verification using 3d convolutional neural networks.pdf

76 Deep speaker feature learning for text-independent speaker verification.pdf

77 Attention-based models for text-dependent speaker verification.pdf

78 Generalized end-to-end loss for speaker verification.pdf

79 End-to-end text-dependent speaker verification.pdf

80 Improving deep CNN networks with long temporal context for text-independent speaker verification.pdf

81 Deep speaker embedding learning with multi-level pooling for text-independent speaker verification.pdf

82 Speaker recognition for multi-speaker conversations using x-vectors.pdf

83 Speaker embedding extraction with phonetic information.pdf

84 Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification..pdf

85 Attentive statistics pooling for deep speaker embedding.pdf

86 Partial AUC optimization based deep speaker embeddings with class-center learning for text-independent speaker verification.pdf

87 Gaussian-constrained training for speaker verification.pdf

88 Margin matters Towards more discriminative deep neural network embeddings for speaker recognition.pdf

89 State-of-the-Art Speaker Recognition for Telephone and Video Speech The JHU-MIT Submission for NIST SRE18.pdf

90 Statistics pooling time delay neural network based on X-vector for speaker verification.pdf

91 Bayesian x-vector Bayesian Neural Network based x-vector System for Speaker Verification.pdf

92 Deep Speaker Embedding Extraction with Channel-Wise Feature Responses and Additive Supervision Softmax Loss Function..pdf

93 An improved deep embedding learning method for short duration speaker verification.pdf

94 An Effective Deep Embedding Learning Architecture for Speaker Verification..pdf

95 Speaker characterization using tdnn-lstm based speaker embedding.pdf

96 A time delay neural network architecture for efficient modeling of long temporal contexts.pdf

97 Self-supervised speaker embeddings.pdf

98 JHU-HLTCOE system for the VoxSRC speaker recognition challenge.pdf

99 Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks..pdf

100 The JHU Speaker Recognition System for the VOiCES 2019 Challenge.pdf

101 Densely Connected Time Delay Neural Network for Speaker Verification.pdf

102 Compact Speaker Embedding lrx-vector.pdf

103 Deep residual learning for image recognition.pdf

104 Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms.pdf

105 Wav2Spk A Simple DNN Architecture for Learning Speaker Embeddings from Waveforms..pdf

106 BERTPHONE Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition.pdf

107 Self-attention encoding and pooling for speaker recognition.pdf

108 Autospeech Neural architecture search for speaker recognition.pdf

109 Evolutionary Algorithm Enhanced Neural Architecture Search for Text-Independent Speaker Verification.pdf

110 A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings.pdf

111 State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and speakers in the wild evaluations.pdf

112 Am-mobilenet1d A portable model for speaker recognition.pdf

113 Attention is all you need.pdf

114 A structured self-attentive sentence embedding.pdf

115 Deeply Fused Speaker Embeddings for Text-Independent Speaker Verification..pdf

116 An Improved Deep Neural Network for Modeling Speaker Characteristics at Different Temporal Scales.pdf

117 Cnn with phonetic attention for text-independent speaker verification.pdf

118 Self multi-head attention for speaker recognition.pdf

119 Vector-Based Attentive Pooling for Text-Independent Speaker Verification.pdf

120 NetVLAD CNN architecture for weakly supervised place recognition.pdf

121 Ghostvlad for set-based face recognition.pdf

122 Exploring the encoding layer and loss function in end-to-end speaker and language recognition system.pdf

123 Spatial pyramid encoding with convex length normalization for text-independent speaker verification.pdf

124 Total variability layer in deep neural network embeddings for speaker verification.pdf

125 Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System.pdf

126 Shortcut Connections Based Deep Speaker Embeddings for End-to-End Speaker Verification System.pdf

127 A deep neural network for short-segment speaker recognition.pdf

128 Improving multi-scale aggregation using feature pyramid module for robust speaker verification of variable-duration utterances.pdf

129 Sphereface Deep hypersphere embedding for face recognition.pdf

130 Angular Softmax for Short-Duration Text-independent Speaker Verification.pdf

131 On deep speaker embeddings for text-independent speaker recognition.pdf

132 Unified hypersphere embedding for speaker recognition.pdf

133 Dynamic Margin Softmax Loss for Speaker Verification.pdf

134 Large margin softmax loss for speaker verification.pdf

135 On Parameter Adaptation in Softmax-based Cross-Entropy Loss for Improved Convergence Speed and Accuracy in DNN-based Speaker Recognition.pdf

136 Additive margin softmax for face verification.pdf

137 Cosface Large margin cosine loss for deep face recognition.pdf

138 Arcface Additive angular margin loss for deep face recognition.pdf

139 Discriminative neural embedding learning for short-duration text-independent speaker verification.pdf

140 A discriminative feature learning approach for deep face recognition.pdf

141 Multi-Task Discriminative Training of Hybrid DNN-TVM Model for Speaker Verification with Noisy and Far-Field Speech.pdf

142 Multi-task learning for text-dependent speaker verification.pdf

143 Deep feature for text-dependent speaker verification.pdf

144 DNN based speaker embedding using content information for text-dependent speaker verification.pdf

145 On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction..pdf

146 Collaborative joint training with multitask recurrent model for speech and speaker recognition.pdf

147 SNR-invariant multitask deep neural networks for robust speaker verification.pdf

148 Multi-task network for noise-robust keyword spotting and speaker verification using CTC-based soft VAD and global query attention.pdf

149 End-to-end attention based text-dependent speaker verification.pdf

150 Deep neural network-based speaker embeddings for end-to-end speaker verification.pdf

151 End-to-end DNN based speaker recognition inspired by i-vector and PLDA.pdf

153 Optimization of False Acceptance Rejection Rates and Decision Threshold for End-to-End Text-Dependent Speaker Verification Systems.pdf

154 Joint i-vector with end-to-end system for short duration text-independent speaker verification.pdf

155 Tristounet triplet loss for speaker turn embedding.pdf

156 Speaker verification by partial AUC optimization with mahalanobis distance metric learning.pdf

157 End-to-end Text-dependent Speaker Verification Using Novel Distance Measures..pdf

158 Facenet A unified embedding for face recognition and clustering.pdf

159 Prototypical networks for few-shot learning.pdf

160 Few shot speaker recognition using deep neural networks.pdf

161 In defence of metric learning for speaker recognition.pdf

162 Meta-learning for short utterance speaker recognition with imbalance length pairs.pdf

163 Angular Margin Centroid Loss for Text-independent Speaker Recognition.pdf

164 End-to-end losses based on speaker basis vectors and all-speaker hard negative mining for speaker verification.pdf

165 DIHARD II is still hard Experimental results and discussions from the DKU-LENOVO team.pdf

166 Speaker diarization using deep neural network embeddings.pdf

167 Diarization is Hard Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge..pdf

168 Speaker diarization using latent space clustering in generative adversarial network.pdf

169 Speaker diarization with lstm.pdf

170 Speaker segmentation using deep speaker vectors for fast speaker change scenarios.pdf

171 Context and Uncertainty Modeling for Online Speaker Change Detection.pdf

172 Pre-training of speaker embeddings for low-latency speaker change detection in broadcast news.pdf

173 Speaker, environment and channel change detection and clustering via the bayesian information criterion.pdf

174 Automatic segmentation, classification and clustering of broadcast news audio.pdf

176 An Unsupervised Neural Prediction Framework for Learning Speaker Embeddings Using Recurrent Neural Networks.pdf

178 Convolutional neural network for speaker change detection in telephone speaker diarization system.pdf

179 Speaker diarization with PLDA i-vector scoring and unsupervised calibration.pdf

180 Speaker diarization with i-vectors from DNN senone posteriors.pdf

181 Bayesian HMM Based x-Vector Clustering for Speaker Diarization.pdf

182 But system for the second dihard speech diarization challenge.pdf

183 Optimizing Bayesian HMM based x-vector clustering for the second DIHARD speech diarization challenge.pdf

184 A comparison of neural network feature transforms for speaker diarization.pdf

185 Speaker diarisation using 2D self-attentive combination of embeddings.pdf

186 Speaker embeddings incorporating acoustic conditions for diarization.pdf

187 Speaker diarization with session-level speaker embedding refinement using graph neural networks.pdf

188 LSTM based similarity measurement with spectral clustering for speaker diarization.pdf

189 Self-attentive similarity measurement strategies in speaker diarization.pdf

190 Multilayer bootstrap networks.pdf

191 Universal Background Sparse Coding and Multilayer Bootstrap Network for Speaker Clustering.pdf

192 An Investigation of Speaker Clustering Algorithms in Adverse Acoustic Environments.pdf

193 Fully supervised speaker diarization.pdf

194 DNN-based speaker clustering for speaker diarisation.pdf

195 Active learning based constrained clustering for speaker diarization.pdf

196 Discriminative neural clustering for speaker diarisation.pdf

197 Supervised online diarization with sample mean loss for multi-domain data.pdf

198 Diarization resegmentation in the factor analysis subspace.pdf

199 Speaker Diarization based on Bayesian HMM with Eigenvoice Priors.pdf

200 Neural speech turn segmentation and affinity propagation for speaker diarization.pdf


zycv/Speaker-Recognition-Based-on-Deep-Learning-An-Overview
