SAPI 5.1 and 5.3 Support
VocaTalk supports the latest SAPI 5.3 (Microsoft Speech API) as well as the older 5.1 version. This means you can choose from
many voice engines developed by various vendors. High-quality AT&T, Acapela Infovox Desktop and Ivona voices have been tested and confirmed to work.
Here's a 'Guide to voices supported by VocaTalk and their vendors'.
Unlike many text-to-speech applications on the market, VocaTalk randomly uses all or selected voices in its generated
output. By default, the system voice, Microsoft Anna, is used. But if you have other voices installed on your system,
each paragraph is read by a different voice, reducing the monotony of speech and enriching the listening experience.
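The per-paragraph voice selection described above can be sketched as follows. This is only an illustration, not VocaTalk's actual code: the function name `assign_voices`, the `seed` parameter, and the rule of avoiding the same voice twice in a row are all assumptions.

```python
import random


def assign_voices(paragraphs, voices, seed=None):
    """Pair each paragraph with a randomly chosen voice.

    Assumption: the same voice is not used for two consecutive
    paragraphs when more than one voice is available.
    """
    rng = random.Random(seed)
    assignments = []
    previous = None
    for paragraph in paragraphs:
        # Exclude the previous voice; fall back to all voices if only one exists.
        candidates = [v for v in voices if v != previous] or list(voices)
        voice = rng.choice(candidates)
        assignments.append((voice, paragraph))
        previous = voice
    return assignments
```

With a single installed voice, every paragraph simply gets that voice, which matches the default behavior described above.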
Digital Signal Processing
VocaTalk uses various DSP (Digital Signal Processing) technologies to enhance the speech output. Speech engines usually use
a 16 kHz or 22 kHz sampling rate, but VocaTalk's output is always at 44.1 kHz (CD quality). Speech engines normally produce mono
output, but VocaTalk's output is always stereo. VocaTalk applies various effects to the sound after
resampling it up to 44.1 kHz, so the processed sound has more detail than the original. For example, a 16 kHz
voice is resampled, turned into stereo, and enhanced by effects that operate on the extra samples,
about 5.5 times as many as the original (roughly 2.76 times per channel, across two channels). Effects such as
positional audio, echo, and voice pitch modulation thus produce a richer, more lifelike sound. Audio output from different
voice engines can be at different levels because of the original samples used. VocaTalk automatically adjusts the volume
level so that it is the same for all voices.
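One common way to bring different voices to the same level is to scale each one to a shared RMS (root mean square) target. The sketch below assumes that approach; the source does not say which loudness measure VocaTalk uses, and the names `rms`, `match_level`, and `target_rms` are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of float samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_level(samples, target_rms=0.1):
    """Scale samples so their RMS level equals target_rms (an assumed target)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silence stays silent
    gain = target_rms / current
    # Clamp to the valid range in case the gain pushes peaks past full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

Applying `match_level` to a quiet voice and a loud voice with the same target gives both the same average level.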
VocaTalk also uses digital filtering and floating-point precision to suppress artifacts that
signal processing may produce.
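The resampling and mono-to-stereo steps in this section can be illustrated with a minimal sketch. It uses linear interpolation, which is a simple stand-in chosen for brevity; production resamplers normally use band-limited filters, and the source does not describe VocaTalk's actual method. The function names are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample float samples from src_rate to dst_rate by linear interpolation."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # position in the source signal
        i0 = int(pos)
        i1 = min(i0 + 1, len(samples) - 1)   # hold the last sample at the end
        frac = pos - i0
        out.append(samples[i0] * (1.0 - frac) + samples[i1] * frac)
    return out


def mono_to_stereo(samples):
    """Duplicate each mono sample into a (left, right) frame."""
    return [(s, s) for s in samples]
```

One second of 16 kHz mono audio (16,000 values) becomes 44,100 stereo frames, or 88,200 values, which is the extra headroom the later effects work on.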
CD Ripping to use as Background Music
VocaTalk includes a CD Ripper tool embedded within the application UI. You can rip your CDs, create a library of albums, and
use them as background music for an episode. By default, VocaTalk will shuffle and use all the music in your library, but you
can choose which albums should be used. VocaTalk automatically adjusts the volume level of the music so that the speech
remains comfortable to listen to.
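As a rough illustration of keeping background music below the speech, the sketch below mixes the two with the music attenuated by a fixed number of decibels. The fixed `music_db` value and the function names are assumptions; the source does not say how VocaTalk chooses the music level.

```python
def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def mix_speech_over_music(speech, music, music_db=-18.0):
    """Mix float samples, attenuating the music by music_db (an assumed value).

    The music loops if it is shorter than the speech.
    """
    gain = db_to_gain(music_db)
    mixed = []
    for i, s in enumerate(speech):
        m = music[i % len(music)] if music else 0.0
        # Clamp the sum so it stays within the valid sample range.
        mixed.append(max(-1.0, min(1.0, s + gain * m)))
    return mixed
```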
mp3 Importing to use as Background Music
VocaTalk can import mp3 music and save it into your library for use as background music.
Album archival to manage a large collection of albums
VocaTalk allows archival of albums to save hard disk space.
You can archive and deactivate some of the albums, and unarchive and activate them whenever you want to use them again.
Archival of albums does not prevent you from selecting them as background music. When you select an archived album,
VocaTalk will automatically unarchive it.
Effects for Fun
All effects are optional and can be selected per episode.
Sliding Depth and Echo
The voice is modified so that it varies smoothly between a small-room echo and a large-hall reverberation.
Sliding Voice Position
The voice moves smoothly from left to right. This creates the illusion of speakers actually walking around you as they read the text.
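A sliding position effect of this kind is often built from a pan law plus a slow sweep. The sketch below uses a constant-power pan law, which keeps the perceived loudness steady as the voice moves; this is an assumed technique, and the names `pan_gains`, `sliding_pan`, and `sweep_hz` are hypothetical.

```python
import math


def pan_gains(position):
    """Constant-power gains for a position from -1.0 (left) to +1.0 (right).

    The returned (left, right) gains always satisfy left**2 + right**2 == 1.
    """
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def sliding_pan(mono, sample_rate, sweep_hz=0.1):
    """Move a mono voice slowly between the speakers; returns stereo frames."""
    stereo = []
    for n, s in enumerate(mono):
        # A slow sine wave drives the position back and forth.
        position = math.sin(2.0 * math.pi * sweep_hz * n / sample_rate)
        left, right = pan_gains(position)
        stereo.append((s * left, s * right))
    return stereo
```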
Sliding Voice Pitch
The voice pitch is smoothly shifted between low and high frequencies, creating the illusion of more voices than you
have installed on your system.
The number of voices therefore appears to be virtually unlimited.
3D Music
Sliding echo and positional effects for music. This effect makes the music sound richer and spreads it around you
rather than playing it in the middle of your head. It creates a dynamically changing
environment where sounds come from different directions and distances.
Ambient White Noise
For people with tinnitus, or for anyone who needs better isolation from the environment, VocaTalk can add
an ambient white noise effect.
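White noise is simply a stream of independent random samples. A minimal sketch, assuming uniformly distributed samples at a low amplitude (the function name, the default `amplitude`, and the `seed` parameter are hypothetical):

```python
import random


def white_noise(n_samples, amplitude=0.05, seed=None):
    """Generate uniformly distributed white noise in [-amplitude, amplitude]."""
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(n_samples)]
```

The noise would then be added sample by sample to each channel of the output at a level low enough not to mask the speech.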
Effects for Learning and Relaxation
VocaTalk uses brainwave entrainment technology to embed special signals and effects into the generated mp3.
These effects improve focus, memory and learning abilities or just relax the listener.
Binaural Beats
This is a special stereo signal that presents slightly different frequencies to the left and right ears. For example,
the left ear hears a sinusoidal signal of 300 Hz and the right ear hears 310 Hz. As the brain constantly tries to adapt
to this frequency difference of 10 Hz, it perceives a virtual sound that is not directly fed to either ear. This
is like a 10 Hz wobbling sound that is heard only when you listen with both ears and vanishes
when you listen to only the left or the right channel. The sound does not really exist; it is generated
by the brain because of the way the auditory system works.
This virtual sound is an indication that the brain is stimulated to transition
into different states of consciousness like focus, alertness, relaxation or sleep depending on
the frequency.
| Frequency | Brainwave | Associated state |
|---|---|---|
| Less than 4 Hz | Delta Waves | Dreamless sleep |
| 4-7 Hz | Theta Waves | REM sleep, Transition to Long Term Memory |
| 7-13 Hz | Alpha Waves | Awake relaxation |
| 13-40 Hz | Beta Waves | Concentration, cognition |
| Greater than 40 Hz | Gamma Waves | High mental activity, consciousness |
VocaTalk currently supports Theta Waves for memory and learning and Alpha Waves for relaxation and
leisure reading.
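The 300 Hz / 310 Hz example can be reproduced with two sine waves, one per channel. This is a minimal sketch; the function name `binaural_tone` and the default `amplitude` are assumptions, and how VocaTalk mixes the tones into the speech is not described in the source.

```python
import math


def binaural_tone(carrier_hz, beat_hz, seconds, sample_rate=44100, amplitude=0.1):
    """Return stereo frames: carrier_hz on the left, carrier_hz + beat_hz on the right."""
    frames = []
    for n in range(int(seconds * sample_rate)):
        t = n / sample_rate
        left = amplitude * math.sin(2.0 * math.pi * carrier_hz * t)
        right = amplitude * math.sin(2.0 * math.pi * (carrier_hz + beat_hz) * t)
        frames.append((left, right))
    return frames
```

For example, `binaural_tone(300.0, 10.0, 1.0)` gives the 10 Hz (Alpha range) difference from the example above, while a `beat_hz` between 4 and 7 falls in the Theta range of the table.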
You can learn more about binaural beats on the internet or by watching this introductory video. A lot of scientific research
has been done on this subject, and the technique is reported to be an effective way to improve learning.
Cross-feed Modulation
This is another learning improvement technique based on scientific research and supported by VocaTalk.
The spoken audio is shifted smoothly back and forth between left and right at about 1 cycle per second. You hear
the voice coming from all around you rather than in the middle of your head, as is the case with monaural (single-channel) sound.
Some superlearning recordings employ this technique to improve memorization.
You can learn more about the Cross-feed Modulation technique on the internet.
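The 1 cycle per second shift can be sketched with a sine wave that controls the left/right balance. The linear balance used here, in which the two channels always sum to the original sample, is an assumption made for illustration; the function name and the `depth` parameter are hypothetical.

```python
import math


def crossfeed_modulate(mono, sample_rate, cycles_per_second=1.0, depth=1.0):
    """Shift a mono voice between left and right; returns stereo frames."""
    stereo = []
    for n, s in enumerate(mono):
        # balance runs from -depth (fully left) to +depth (fully right)
        balance = depth * math.sin(2.0 * math.pi * cycles_per_second * n / sample_rate)
        left = s * (1.0 - balance) / 2.0
        right = s * (1.0 + balance) / 2.0
        stereo.append((left, right))
    return stereo
```

A `depth` below 1.0 keeps some of the voice in both channels at all times, which gives a gentler effect.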
More effects will be added in newer versions.
Local Podcasting Technology to integrate with iTunes, Zune and other podcatchers
VocaTalk can integrate directly with iTunes and Zune and publish the generated episodes just like any other
podcast that you subscribe to. The only difference is that VocaTalk podcasts are on your local computer and are not shared with others.
iTunes and Zune do a pretty good job of managing podcast episodes, and users are already familiar with them.
The only thing you need to keep in mind is that you have to keep the VocaTalk application open while
you're updating the podcast and downloading episodes, because VocaTalk has to serve them. Once an episode
is downloaded to iTunes or Zune, you can synchronize it with your iPod, iPhone or Zune just like other
music or podcasts that you subscribe to.
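Podcatchers read an RSS 2.0 feed in which each episode is an item with an enclosure pointing at the audio file. The sketch below builds such a feed for locally served files. It is an illustration only: the function name `build_feed`, the episode dictionary keys, and the `http://localhost:8080/` address used in the usage note are hypothetical, and the source does not describe VocaTalk's actual feed or port.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_feed(title, base_url, episodes):
    """Build an RSS 2.0 feed string for locally served episodes.

    episodes: list of dicts with the (assumed) keys 'title', 'filename',
    'size_bytes' and 'published' (a Unix timestamp).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Locally generated episodes"
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        url = base_url + ep["filename"]
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = url
        # RSS expects RFC 822 style dates, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        ET.SubElement(item, "pubDate").text = formatdate(ep["published"], usegmt=True)
        ET.SubElement(item, "enclosure", url=url,
                      length=str(ep["size_bytes"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")
```

A call such as `build_feed("My Episodes", "http://localhost:8080/", episodes)` produces a feed that only resolves while something is serving that address, which is consistent with the need to keep the application open during updates.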
Why podcast?
VocaTalk generates mp3 files that you can directly copy and play on any mp3 player device. However, podcasts
have some advantages over regular mp3 files.
- Podcast episodes are always listed in the order they are published. This is especially important when the episodes
are interrelated and must be listened to in a specific sequence.
- Podcatchers like iTunes or Zune manage your files better than you can by moving them around manually.
- Podcatchers know which episodes you have played and which you have not, so it's easier to manage them when the number of files gets large.
- Podcatchers can synchronize unlistened episodes to your device and delete the ones that you've already listened to.
- Players like iPod, iPhone and Zune treat podcasts specially so that you can easily find unlistened episodes
and stop or resume where you left off last time. You can even switch between episodes, or from music to a podcast episode,
and the player still remembers your last position in every episode. This is an extremely useful
feature, especially when episodes are long (40-60 mins).
- VocaTalk allows you to queue podcast episodes without generating them until you actually want to listen. If you have
a large amount of content to listen to, VocaTalk will split it into multiple episodes. This makes it easier to
manage such large content.
In short, the podcast experience is much more suitable for listening to speech-based recordings like the ones VocaTalk generates.
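Splitting large content into multiple episodes, as mentioned in the list above, can be sketched as a word-budget split on paragraph boundaries. The function name and the `max_words` budget are assumptions; 6,000 words corresponds to roughly 40 minutes only if you assume a speaking rate of about 150 words per minute, and the source does not say how VocaTalk decides where to split.

```python
def split_into_episodes(paragraphs, max_words=6000):
    """Group paragraphs into episodes of at most max_words words each.

    A paragraph is never cut in half, so a single paragraph longer than
    max_words becomes an episode of its own.
    """
    episodes = []
    current = []
    count = 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        if current and count + words > max_words:
            episodes.append(current)
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        episodes.append(current)
    return episodes
```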