Audiobox is Meta's foundation research model for audio generation. It generates voices and sound effects from a combination of voice inputs and natural language text prompts, which makes it easier to create custom audio for a wide range of applications. The Audiobox family includes specialized models such as Audiobox Speech and Audiobox Sound, all built on the shared self-supervised model Audiobox SSL. This common foundation gives the family a broad range of capabilities for producing personalized audio in different contexts.




