NavTarang

Features

Top-Down Attention

Focuses on target speech in mixed audio for effective separation.

Encoder-Decoder Structure

Compresses audio features and reconstructs separated speech efficiently.

Real-Time Processing

Optimized for low-latency speech separation in teleconferencing and on mobile devices.

About Us

NavTarang is an innovative audio processing app whose name combines "Nav" (navigation) with "Tarang" (audio waves).

Our team specializes in advanced speech separation, offering cutting-edge solutions through our flagship framework, TDANet.

Enhance your audio experience with our key features:

Advanced Attention Mechanisms
Efficient Encoder-Decoder Architecture
Real-Time, Low-Latency Processing
Extensible and Adaptable Solutions
