What is VocaTalk?

The Concept

VocaTalk generates personal podcasts that sound like an audio documentary, with sound effects and music in the background. You copy and paste any content, select music and effects, generate the episode, and download it to your iPod, Zune or any other mp3 player. If the content is too large, VocaTalk automatically splits it into episodes of 50-60 minutes. VocaTalk uses text-to-speech (TTS) technology to read the text aloud and digital signal processing (DSP) technology to massage the generated speech. Listening to a VocaTalk episode is much more comfortable and fun than listening to a raw TTS voice.
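As a rough illustration of the automatic splitting described above, the following Python sketch packs whole paragraphs into episodes that stay under a time limit. It is not VocaTalk's actual algorithm: the speaking rate of 150 words per minute, the greedy packing strategy, and the names `estimate_minutes` and `split_into_episodes` are all assumptions made for this example.

```python
WORDS_PER_MINUTE = 150      # assumed average TTS speaking rate (not a VocaTalk setting)
MAX_EPISODE_MINUTES = 60    # upper end of the 50-60 minute range mentioned above


def estimate_minutes(text, wpm=WORDS_PER_MINUTE):
    """Estimate narration time from a simple whitespace word count."""
    return len(text.split()) / wpm


def split_into_episodes(paragraphs, max_minutes=MAX_EPISODE_MINUTES,
                        wpm=WORDS_PER_MINUTE):
    """Greedily pack whole paragraphs into episodes of at most max_minutes.

    A single paragraph longer than max_minutes becomes its own episode.
    """
    episodes, current, current_minutes = [], [], 0.0
    for paragraph in paragraphs:
        minutes = estimate_minutes(paragraph, wpm)
        # Start a new episode when adding this paragraph would exceed the limit.
        if current and current_minutes + minutes > max_minutes:
            episodes.append(current)
            current, current_minutes = [], 0.0
        current.append(paragraph)
        current_minutes += minutes
    if current:
        episodes.append(current)
    return episodes
```

Splitting on paragraph boundaries keeps each episode from ending mid-sentence, at the cost of episodes that can be somewhat shorter than the limit.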

VocaTalk is designed to make reading and learning fun and efficient for people who read and learn a lot, or who want to but have no time. Students, engineers, teachers, doctors or anyone else who wants to stay up to date in a fun way can use VocaTalk.

What does it sound like?

A VocaTalk episode is whatever you choose it to be. You just give the content to VocaTalk and choose music and effects. VocaTalk reads the text while the music plays in the background. It picks randomly from all the voices installed on your system, so you don't get bored listening to the same voice, especially during long hours of listening. The result is like an audio documentary narrated by multiple speakers. VocaTalk also leaves periods of silence between paragraphs to bring the listening experience closer to a documentary.

The background music makes listening fun and engaging. You can get pretty creative here: try different genres and see which one suits which type of reading.

To make listening even more comfortable and to hold your attention, VocaTalk applies sound effects and enhancements to the generated mp3. There are a number of different effects that you can turn on and off. For example, the positional audio effect smoothly moves the position of the speakers and changes the depth, echo and pitch of the sound, so you are always immersed in an environment that sounds like a big theatre.
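To give an idea of how a voice can be moved smoothly between the left and right channels, here is a minimal Python sketch of constant-power panning driven by a slow sine sweep. It only illustrates the general technique; the function names `pan_gains` and `wander` and the `sweep_hz` parameter are hypothetical, and nothing here is taken from VocaTalk's own implementation.

```python
import math


def pan_gains(position):
    """Constant-power gains for a position from -1.0 (left) to +1.0 (right)."""
    angle = (position + 1.0) * math.pi / 4.0   # maps position to 0..pi/2
    return math.cos(angle), math.sin(angle)


def wander(mono, sample_rate=44100, sweep_hz=0.1):
    """Turn mono samples into (left, right) pairs whose position drifts slowly.

    sweep_hz is an assumed speed: 0.1 means one full left-right-left
    cycle every 10 seconds.
    """
    stereo = []
    for n, sample in enumerate(mono):
        position = math.sin(2.0 * math.pi * sweep_hz * n / sample_rate)
        left, right = pan_gains(position)
        stereo.append((sample * left, sample * right))
    return stereo
```

With constant-power gains the squared left and right gains always sum to one, so the voice keeps roughly the same perceived loudness while it moves.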

What makes VocaTalk unique?

  • Most TTS applications let you choose a single voice for an entire text. VocaTalk lets you choose multiple voices for the same text so you're not bored.
  • Most TTS applications just generate mp3. VocaTalk not only generates mp3, but also publishes podcasts so it's much easier to track the generated episodes. Also, you can just queue your episodes without generating them, and generate when you want to actually listen.
  • No TTS application can add background music. VocaTalk can, which creates a whole new experience.
  • Ordinary TTS output is 16 to 22 kHz, mono. VocaTalk's output is always 44.1 kHz, stereo (CD quality). This makes it possible to move the voice position and to add stereo music and other effects that are only achievable with stereo sound.
  • No TTS application supports brainwave entrainment technology. VocaTalk supports binaural beats and crossfeed modulation to improve focus and learning, or relaxation.
  • Most TTS applications save the generated audio directly into an mp3 file. VocaTalk massages the generated output and adds cool effects like echo, reverberation, positional audio, frequency modulation, and more. This makes it even more fun and engaging.
  • Most TTS applications do not save the original text into a file for your future reference. VocaTalk saves content in a rich format with images and font styles, allows you to reopen or regenerate the episode, and embeds the text into the mp3, so you can enjoy it on your player.
  • Some TTS applications are server-based. VocaTalk runs on your computer and can use its full power. If you have a multicore system, VocaTalk will also make use of multiple cores. You don't have to rely on internet connection speed; everything is local and private to you.
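For readers curious about what a binaural beat looks like as a signal, the Python sketch below generates one as 44.1 kHz stereo audio using only the standard library: the left channel carries one tone and the right channel a slightly higher one, so the difference between them is the beat frequency. The tone frequencies, amplitude, WAV output and the helper names `binaural_frames` and `write_wav` are assumptions chosen for this illustration, not a description of how VocaTalk produces its effect.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # CD-quality sample rate mentioned in the list above


def binaural_frames(base_hz, beat_hz, seconds, amplitude=0.2):
    """Return interleaved 16-bit stereo PCM.

    Left channel: base_hz. Right channel: base_hz + beat_hz.
    """
    frames = bytearray()
    for n in range(int(seconds * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        left = amplitude * math.sin(2.0 * math.pi * base_hz * t)
        right = amplitude * math.sin(2.0 * math.pi * (base_hz + beat_hz) * t)
        frames += struct.pack("<hh", int(left * 32767), int(right * 32767))
    return bytes(frames)


def write_wav(target, frames):
    """Write stereo 16-bit PCM frames to a file path or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)


# Example: a 200 Hz tone on the left and 210 Hz on the right (10 Hz difference).
buffer = io.BytesIO()
write_wav(buffer, binaural_frames(200.0, 10.0, seconds=1.0))
```

Because each ear must receive a different tone, this kind of effect needs stereo output and headphones, which fits the stereo requirement in the list above.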

More features are being planned and will be published on this site soon. All for one purpose: Make listening and learning fun and enjoyable!

Compare VocaTalk episodes to regular text-to-speech and vote

The following demonstration shows the difference between regular text-to-speech output and VocaTalk's output. The text is an extract from a technology article that was originally published at CodeProject by I. Benian.

Mono text-to-speech voice

Ordinary Text-to-speech

This is regular text-to-speech output, as generated by ordinary text-to-speech applications. A single voice reads the whole text continuously.

Download mp3

(The player may start with a few seconds of latency depending on your internet connection speed.)

Mono 16 kHz, Single speaker


VocaTalk Personal Podcast: stereo text-to-speech with background music and FX. Best listened to with headphones or earbuds.

VocaTalk episode sample 1

And this is a VocaTalk episode generated with background music and other enhancements. Multiple voices read the text, the voice position smoothly shifts, and the echo effect adds realism.

Download mp3


Stereo 44.1 kHz CD Quality, Multiple speakers, Movie Score and Ambient music, Echoes, Wandering Voices


VocaTalk Personal Podcast: stereo text-to-speech with background music and FX. Best listened to with headphones or earbuds.

VocaTalk episode sample 2

This is another VocaTalk episode, generated with techno music in the background and an additional voice modulator effect.

Download mp3


Stereo 44.1 kHz CD Quality, Multiple speakers, Techno and Electronic music, Echoes, Wandering Voices, Voice Modulator

Have you noticed the periods of silence in the VocaTalk episodes? Just like in a documentary, these pauses make listening much more comfortable and give you a break to digest the content while enjoying the music.


See the full original article 'A Simple Object Collaboration Framework' at CodeProject.


Learn More

Want to learn more? Check out:

  • Learn more about VocaTalk's features
  • Compare VocaTalk episodes to ordinary text-to-speech
  • See the 10 min screencast and watch VocaTalk in action
  • See the screenshots and listen to VocaTalk's introduction
  • OK, I'm convinced, download now!
