VALL-E is a Windows app whose latest release can be downloaded as GreatlyimprovedaccuracyandFixGPUmemoryincreaseduringtraining.zip. It can be run online on OnWorks, a free hosting provider for workstations.
Download and run VALL-E online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 3. Upload this application to that file manager.
- 4. Start any OnWorks online OS emulator from this website, preferably the Windows online emulator.
- 5. From the OnWorks Windows OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 6. Download the application and install it.
- 7. Download Wine from your Linux distribution's software repositories. Once it is installed, you can double-click the app to run it with Wine. You can also try PlayOnLinux, a graphical interface over Wine that helps you install popular Windows programs and games.
Wine is a way to run Windows software on Linux, but with no Windows required. Wine is an open-source Windows compatibility layer that can run Windows programs directly on any Linux desktop. Essentially, Wine is trying to re-implement enough of Windows from scratch so that it can run all those Windows applications without actually needing Windows.
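As a rough illustration of step 7, the sketch below checks whether a `wine` executable is on the PATH and builds the command that would launch a Windows program with it. The helper names `build_wine_command` and `run_with_wine` and the file name `app.exe` are hypothetical; the only assumption about Wine is that it is invoked as `wine <program>`.

```python
import shutil
import subprocess
from typing import List, Optional


def build_wine_command(exe_path: str, wine_binary: str = "wine") -> List[str]:
    """Return the argument list for launching a Windows program under Wine."""
    return [wine_binary, exe_path]


def run_with_wine(exe_path: str) -> Optional[int]:
    """Run exe_path under Wine; return its exit code, or None if Wine is missing."""
    if shutil.which("wine") is None:
        print("Wine was not found on PATH; install it from your distribution's repositories.")
        return None
    return subprocess.run(build_wine_command(exe_path)).returncode


if __name__ == "__main__":
    # "app.exe" is a placeholder for the program you downloaded.
    print(build_wine_command("app.exe"))
```

Keeping the command construction separate from the launch makes it easy to inspect what would be run before actually starting the program.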
VALL-E
DESCRIPTION
We introduce a language modeling approach for text-to-speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than existing systems. VALL-E exhibits emergent in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experimental results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, we find that VALL-E can preserve the speaker's emotion and the acoustic environment of the acoustic prompt in synthesis.
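To make "TTS as a conditional language modeling task" more concrete, the toy sketch below generates discrete codes one at a time, each drawn from a distribution conditioned on the phonemes, the acoustic prompt codes, and the codes generated so far. The function `toy_next_code_distribution` is a made-up stand-in for a trained neural model, and the codebook size of 8 is arbitrary; nothing here reflects VALL-E's actual architecture or code.

```python
import random
from typing import List, Sequence

CODEBOOK_SIZE = 8  # arbitrary toy value, not VALL-E's real codebook size


def toy_next_code_distribution(
    phonemes: Sequence[str],
    prompt_codes: Sequence[int],
    generated: Sequence[int],
) -> List[float]:
    """Stand-in for a trained model: p(next code | phonemes, prompt, history)."""
    # Deterministic pseudo-weights derived from the conditioning context.
    seed = sum(len(p) for p in phonemes) + sum(prompt_codes) + sum(generated) + len(generated)
    weights = [1.0 + ((seed + 3 * k) % 5) for k in range(CODEBOOK_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]


def generate_codes(
    phonemes: Sequence[str],
    prompt_codes: Sequence[int],
    num_steps: int,
    rng: random.Random,
) -> List[int]:
    """Autoregressively sample num_steps discrete codes, conditioned on the inputs."""
    generated: List[int] = []
    for _ in range(num_steps):
        probs = toy_next_code_distribution(phonemes, prompt_codes, generated)
        code = rng.choices(range(CODEBOOK_SIZE), weights=probs, k=1)[0]
        generated.append(code)
    return generated


if __name__ == "__main__":
    print(generate_codes(["HH", "AH", "L", "OW"], [1, 2, 3], 10, random.Random(0)))
```

The point of the sketch is the loop: the output is a sequence of discrete tokens sampled step by step, not a continuous signal predicted directly.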
Features
- The pipeline of VALL-E is phoneme → discrete code → waveform
- VALL-E generates the discrete audio codec codes based on phoneme and acoustic code prompts
- VALL-E directly enables various speech synthesis applications, such as zero-shot TTS, speech editing, and content creation
- VALL-E can be combined with other generative AI models like GPT-3
- VALL-E can synthesize personalized speech while maintaining the acoustic environment of the speaker prompt
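The phoneme → discrete code → waveform pipeline listed above can be sketched as two stages composed in sequence. This is only a structural illustration: `synthesize`, `dummy_codec_lm`, and `dummy_codec_decoder` are hypothetical names, and the dummy stages are simple arithmetic placeholders for what would be neural networks in practice.

```python
from typing import Callable, List

Phonemes = List[str]
Codes = List[int]
Waveform = List[float]


def synthesize(
    phonemes: Phonemes,
    prompt_codes: Codes,
    codec_lm: Callable[[Phonemes, Codes], Codes],
    codec_decoder: Callable[[Codes], Waveform],
) -> Waveform:
    """phoneme -> discrete code -> waveform, with both stages passed in as callables."""
    codes = codec_lm(phonemes, prompt_codes)  # stage 1: phonemes + prompt -> codes
    return codec_decoder(codes)               # stage 2: codes -> waveform samples


def dummy_codec_lm(phonemes: Phonemes, prompt_codes: Codes) -> Codes:
    """Placeholder: one code per phoneme, offset by the first prompt code."""
    offset = prompt_codes[0] if prompt_codes else 0
    return [(len(p) + offset) % 8 for p in phonemes]


def dummy_codec_decoder(codes: Codes) -> Waveform:
    """Placeholder: map each code in 0..7 to a sample in [-1.0, 1.0]."""
    return [c / 3.5 - 1.0 for c in codes]


if __name__ == "__main__":
    wave = synthesize(["HH", "AH", "L", "OW"], [2], dummy_codec_lm, dummy_codec_decoder)
    print(wave)
```

Passing the two stages in as callables keeps the pipeline shape visible while leaving the actual models unspecified.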
Programming Language
Python
Categories
This application can also be fetched from https://sourceforge.net/projects/vall-e.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.