Abstract
In this study, we evaluated how the choice of deep learning hyperparameters affects the detection of audio deepfakes with convolutional neural network (CNN) architectures. Through a series of experiments and comparative analyses, we developed four distinct models, each with a different combination of activation function, optimizer, and learning rate. The models were trained and evaluated on a dataset containing both fake and genuine audio samples. The results indicate that Model 1 achieved the highest accuracy, 97.8%, which we attribute primarily to the use of ReLU activation and the Adam optimizer. Model 4 also showed significant improvement, attaining a validation accuracy of 96% by employing advanced activation functions and the Adagrad optimizer. In contrast, Model 2, which used a sigmoid activation function in its fully connected layer with the RMSprop optimizer, and Model 3, which used the hyperbolic tangent activation function with the stochastic gradient descent optimizer, achieved only moderate accuracy.
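The four configurations described in the abstract can be summarised as a small lookup table. The sketch below is illustrative only: the class name `ModelConfig`, the field names, and the helper `best_reported` are hypothetical and are not taken from the paper. Learning rates and the specific "advanced" activation used in Model 4 are not stated in the abstract, so they are left unspecified rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelConfig:
    """One experimental configuration, as far as the abstract specifies it."""
    name: str
    activation: Optional[str]           # None where the abstract gives no name
    optimizer: str
    reported_accuracy: Optional[float]  # None where only "moderate" is stated


# Values come from the abstract only. Learning rates are not reported there,
# so they are deliberately omitted instead of being invented.
CONFIGS = [
    ModelConfig("Model 1", "relu", "adam", 0.978),
    ModelConfig("Model 2", "sigmoid", "rmsprop", None),  # sigmoid in the fully connected layer
    ModelConfig("Model 3", "tanh", "sgd", None),
    ModelConfig("Model 4", None, "adagrad", 0.96),       # validation accuracy; activation unnamed
]


def best_reported(configs):
    """Return the configuration with the highest accuracy stated in the abstract."""
    scored = [c for c in configs if c.reported_accuracy is not None]
    return max(scored, key=lambda c: c.reported_accuracy)


if __name__ == "__main__":
    for cfg in CONFIGS:
        print(cfg)
    print("Best reported:", best_reported(CONFIGS).name)
```

Keeping the unreported fields as `None` makes it explicit which parts of the comparison the abstract actually quantifies.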
Original language | English |
---|---|
Publication status | Accepted/In press - 8 Aug 2024 |
Event | International Conference on Emerging Technologies in Computing 2024, University of Essex, Southend Campus, UK, London, United Kingdom. Duration: 15 Aug 2024 → 16 Aug 2024. Conference number: 7. https://icetic24.theiaer.org/ |
Conference
Conference | International Conference on Emerging Technologies in Computing 2024 |
---|---|
Abbreviated title | iCETiC 24 |
Country/Territory | United Kingdom |
City | London |
Period | 15/08/24 → 16/08/24 |
Internet address |
Keywords
- Deepfakes Audio
- CNN
- Activation functions
- Optimizers
- Mel Spectrograms