As you said, speech synthesis simply isn't there yet. For good voice work, have you considered contacting your local community college and checking into the acting or broadcast curricula? If you found a professor who is jazzed about the idea, the audio could be done and recorded on campus as part of a project (and as a learning experience, since the students would have to keep their creativity within the guidelines you established, just like in the real world).
I believe L_D's done a fair bit of work here. This is exactly the sort of work I do independently. Text-to-speech tools can be used with some effectiveness, but to attain the level being sought by most folks, you need a voice quality in the 64 to 128 bit level, and those packages typically cost around 50k for a six-month lease of the service, as it requires multiple computers to render effectively. They are also usually ignored in favor of custom voice work, as the cost-benefit ratio is pretty poor. Voice work is going to be your best solution, given that you are describing a specific want in terms of the way it sounds. You'll want a seriously well-thought-out script for each step, and you'll need an effective sound editor. Assemble the basic voice talent with a decent recording setup, and then record the dialogue, spoken at a slower than normal rate, to get the inflections. Then edit the dialogue into single-word sound bites and reassemble them as needed for the final result. Although you can use MP3 as a storage format if you need to save space, for best results you'll want to save in 24-bit audio and use a lossless format. Typical work for a single three-path training video can take up to 8 weeks to set up, and possibly 3 to render.
thou and I, my friend, can, in the most flunkey world, make, each of us, one non-flunkey, one hero, if we like: that will be two heroes to begin with. (Carlyle)
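The cut-and-reassemble step described in the reply above (single-word clips stitched back into finished sentences, kept in a lossless format) could be sketched roughly as follows. This is only an illustration using Python's standard `wave` module; the function names `read_clip` and `assemble_sentence` and the 80 ms gap are assumptions made up for the example, not part of anyone's actual workflow, and a real sound editor would also handle crossfades and level matching.

```python
import io
import wave


def read_clip(source):
    """Return (params, raw_frames) for one WAV clip (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getparams(), wav.readframes(wav.getnframes())


def assemble_sentence(clips, out, gap_ms=80):
    """Concatenate word clips into one WAV with a short silence between words.

    clips  -- paths or file-like objects, one per word, in spoken order
    out    -- output path or writable file-like object
    gap_ms -- silence between words (illustrative default, tune by ear)
    Returns the number of audio frames written.
    """
    if not clips:
        raise ValueError("need at least one clip")

    loaded = [read_clip(clip) for clip in clips]
    ref = loaded[0][0]
    fmt = (ref.nchannels, ref.sampwidth, ref.framerate)

    # Every clip must share the same format, or the result would be garbage.
    for params, _ in loaded[1:]:
        if (params.nchannels, params.sampwidth, params.framerate) != fmt:
            raise ValueError("clips must share channels, bit depth and sample rate")

    # Zero bytes are silence for signed 16- or 24-bit PCM
    # (8-bit WAV is unsigned, so this sketch does not cover it).
    gap_frames = ref.framerate * gap_ms // 1000
    silence = b"\x00" * (gap_frames * ref.nchannels * ref.sampwidth)

    data = silence.join(frames for _, frames in loaded)

    with wave.open(out, "wb") as wav:
        wav.setnchannels(ref.nchannels)
        wav.setsampwidth(ref.sampwidth)  # 3 bytes per sample = 24-bit audio
        wav.setframerate(ref.framerate)
        wav.writeframes(data)

    return len(data) // (ref.nchannels * ref.sampwidth)
```

With a folder of recorded words, a call might look like `assemble_sentence(["press.wav", "the.wav", "green.wav", "button.wav"], "step_03.wav")`, where the file names are hypothetical. Because the samples are copied byte for byte, nothing is re-encoded and the 24-bit source quality is preserved.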
Text-to-speech is an interest of mine, and I've been monitoring its technical progress over the last twenty years.
It's true that today even the best TTS comes out a bit stilted and artificial, but they're starting to add inflection and emotion to the algorithms, and it's much improved over the neutral robotic monotones of yesteryear.
AT&T's Natural Voices is up near the top, but you might also consider this remarkable alternative from Rhetorical Systems:
Other options:
Nuance Vocalizer
Elan Speech Sayso
ScanSoft RealSpeak
ScanSoft Speechify
Kodak, er, kedo, clear up exactly what kind of production flowchart you and your employers think you're using. I've worked in client-driven graphic design/production long enough to know signoffs at multiple stages are not optional. I'm praying your production/signoff process is the following: script -> storyboard -> animatics -> looping -> final production. Repeat after me to your boss: "This is how we do it in the real world regardless of how small the production company is. Like it or suck it."
Chances are you'll have more dickering over the camera direction and the movements of your characters than over the dialogue. With both Shrek and Ice Age (different studios), I recall seeing crude test animatics where inert, untextured characters slide around the scenery with correct camera shots and simplified lighting. Point is, at the animatics stage, render without sound or Mimic and use some other video postproduction tool to composite your own voice (male, plus a squeaky, unconvincing falsetto female) over the characters. Just because the lips aren't moving onscreen doesn't mean it doesn't convey what the client's looking for.
When Kod--, er, the client gives you the thumbs up, hire the voice talent to loop the dialogue from the script AND MAKE IT CLEAR TO YOUR SUPERVISOR that there is no such thing as a one-off recording session. There will be last-minute changes, and the talent will have to return to rerecord. You may not know this, but Batman Forever had to have nearly everyone's dialogue relooped because Val Kilmer insisted on talking in a tiny little whisper and everyone else's dialogue ended up sounding like crackly shouting. When Jim Carrey was down in Charleston SC shooting "Ace Ventura 2," he had to drive 120 miles north to Wilmington NC to reloop all the Riddler's dialogue at a recording studio.
Once the dialogue is locked, generate the Mimic stuff, then re-render the final work to production specs.
I'll beat the drum one last time: unless you're working for yourself, taking a project in-house may save money, but DOES NOT mean the rules of production change. Project managers who think otherwise are gambling with their company's money if not their own employment (or yours if they are adept at blaming others).
OK, ladies and germs, the Poser rubber is meeting the road. I work for a division of one of the world's oldest, largest and best-known corporations (YOU GET THE PICTURE); anywho, we (we as in me) are using a Poser character to present training on some new manufacturing concepts. I'm creating figures with P4 and P5 and rendering the anims in Carrara 3, using Mimic for the lip-sync animations. Here's the problem: so far I've used co-workers for the voices, and this will not work in the long run; there are too many schedule conflicts. I need to create the voicing with text-to-speech software. I've tried AT&T's software and it's the best so far, but it's not there yet. Any other ideas? Is there a text-to-phoneme generator whose output I could put into the Mimic text field? Anyway, this could be an amazing showcase for Poser animation if I can overcome this problem. I'll try to post some files later on.
Kedo1981