The Krisp app is based on deep neural networks. We collected and listened to datasets of 20K distinct noises and 10K clean voices from speakers of different ages, genders, and ethnicities. Together, the audio adds up to 2.5K hours. The datasets were fed to a neural network, which was trained to remove background noise and leave only the clean voice.
For more details, refer to this article published on the NVIDIA blog by the Krisp team.
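
To make the training idea above more concrete, here is a minimal sketch of one common way noise and clean-voice datasets are combined for this kind of training: a noise clip is scaled to a chosen signal-to-noise ratio (SNR) and added to a clean clip, so the network sees the noisy mix as input and the clean voice as the target. This is an assumption for illustration only, not a description of Krisp's actual pipeline. The function names (`rms`, `mix_at_snr`), the 5 dB SNR, the 16 kHz sample rate, and the synthetic tone and white noise standing in for real recordings are all hypothetical.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise ratio equals `snr_db`, then add it to `clean`."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        # Silent noise clip: nothing to mix in.
        return list(clean)
    # SNR (dB) = 20 * log10(rms_clean / rms_noise), solved for the noise gain.
    gain = rms(clean) / (noise_rms * 10 ** (snr_db / 20))
    return [c + gain * n for c, n in zip(clean, noise)]


# Hypothetical stand-ins for real recordings:
# a 220 Hz tone as the "voice" and uniform white noise as the "noise".
sample_rate = 16000
rng = random.Random(0)
clean = [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(sample_rate)]

# One training pair: `noisy` would be the network input, `clean` the target.
noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

Mixing at several SNR values would let a limited set of recordings yield many distinct training pairs, which is one plausible reason to keep the noise and voice datasets separate.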