TTS

A model has been released that generates audio from text, producing not just speech but also sound effects and ambient sounds.

https://tango-web.github.io/

Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model

Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Soujanya Poria (DeCLaRe Lab, Singapore University of Technology and Design, Singapore)

Abstract: The immense scale of the recent large language models (LLM) allows many interesting properties, such as, i...


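TANGO, the latent diffusion model (LDM) based approach, outperforms the state-of-the-art AudioLDM on most metrics and stays comparable on the rest on the AudioCaps test set, despite training the LDM on a 63 times smaller dataset and keeping the text encoder frozen. This improvement might also be attributed to the adoption of audio pressure level based sound mixing for training set augmentation, whereas prior methods use a random mix.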
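To make the "audio pressure level based sound mixing" idea a bit more concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the mixing weight is derived from the level difference between the two clips, roughly p = 1 / (1 + 10^((G1 - G2) / 20)), and that the mix is rescaled by sqrt(p^2 + (1 - p)^2). It also uses a plain RMS level in dB as a simplified stand-in for a proper sound pressure level measurement. The function names `rms_level_db` and `mix_by_level` are made up for this illustration.

```python
import numpy as np


def rms_level_db(x, eps=1e-12):
    """Approximate level of a waveform in dB, based on its RMS value.

    Assumption: a simplified stand-in for a real (possibly weighted)
    sound pressure level measurement.
    """
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(rms + eps)


def mix_by_level(x1, x2):
    """Mix two equal-length waveforms so the louder clip does not dominate.

    Assumption: the weight p depends on the level difference G1 - G2 in dB,
    and the result is divided by sqrt(p^2 + (1 - p)^2) to keep the scale stable.
    Returns the mixed waveform and the weight p applied to x1.
    """
    g1, g2 = rms_level_db(x1), rms_level_db(x2)
    p = 1.0 / (1.0 + 10.0 ** ((g1 - g2) / 20.0))
    mixed = (p * x1 + (1.0 - p) * x2) / np.sqrt(p ** 2 + (1.0 - p) ** 2)
    return mixed, p


# Usage: a loud tone and a tone 20 dB quieter, one second at 16 kHz.
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
loud = 1.0 * np.sin(2 * np.pi * 440.0 * t)
quiet = 0.1 * np.sin(2 * np.pi * 880.0 * t)
mixed, p = mix_by_level(loud, quiet)
```

In this sketch the louder clip gets the smaller weight, so both clips end up contributing at a similar level. A random mix, by contrast, would pick the weight without looking at the signals, so a quiet clip could be drowned out.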


개발의신
