vizact.speak


Philippe
08-10-2010, 10:18 AM
Hi,

When using vizact.speak, it is very difficult to make the Avatar speak in a natural manner. There are two parameters to set:

Scale =
Threshold =

We managed to set "Scale" satisfactorily, but we had trouble finding an appropriate value for "Threshold". If we set it high, there were parts of the speech where the Avatar did not open its lips at all. If we set it low, unnatural lip movements appeared in passages where the lips should hardly move, or not move at all.

What is the best way to make the Avatar speak in a natural manner? Is the vizact.speak method the only way, or is there a more sophisticated one? Which morph is best to use? I guess the quality of the sound file recording is important, but to what extent? Are there any tips concerning the recording (speaking loudly into the microphone, using a certain type of microphone, the format of the audio file, etc.)? We noticed that some types of voice gave better results than others. What is the current practice: taking some samples and then trying different morphs to see which fits best?
Even when the movement was acceptable and gave some approximation of real speech, it still looked quite mechanical: all we could control was how far the mouth opens (Scale) and the sensitivity to sound amplitude (Threshold). Are there other methods giving more control over the movement of the lips, not just open/close and amplitude? Is there some utility that could let us match the movement of the lips to the International Phonetic Alphabet? Finally: are there plans to improve the vizact.speak method in the foreseeable future?
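To make the Scale/Threshold tradeoff concrete, here is a toy sketch in plain Python of how an amplitude-threshold lip-sync scheme like the one described behaves. This is illustrative only, not Vizard's actual implementation, and `mouth_open` is a made-up name:

```python
# Illustrative sketch (NOT Vizard's internal code): a simple
# amplitude-driven lip sync maps each audio amplitude sample
# to a mouth-open morph weight.
def mouth_open(amplitude, threshold, scale):
    """Return a morph weight in [0, 1] from a raw amplitude sample."""
    if amplitude < threshold:
        return 0.0                      # below threshold: lips stay closed
    return min(1.0, amplitude * scale)  # above it: scale amplitude, clamp to 1

# A high threshold drops quiet speech entirely; a low one animates
# background noise. Scale controls how wide the mouth opens.
samples = [0.0005, 0.02, 0.3, 0.9]
print([round(mouth_open(a, threshold=0.001, scale=1.5), 3) for a in samples])
# → [0.0, 0.03, 0.45, 1.0]
```

This is exactly why the two-parameter model feels mechanical: the mapping sees only amplitude, never which phoneme is being spoken.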

Thanks.

masaki
08-10-2010, 02:37 PM
Hi Philippe,
Aside from going into 3D software like 3D Studio Max and manually creating a mouth animation specifically for your sound file, there is no method more sophisticated than vizact.speak (for now). Internally, we're discussing how to improve many avatar features, and one of the features we want to beef up is the speak method.
The best you can do for now is to have a very clean recording and to tweak the scale and threshold parameters. I have some pointers for getting a clean sound file:
1) If you can avoid it, do not use a PC mic connected to the mic line of your sound card.
2) Get a real microphone (e.g. a Shure SM58, about $100, would be top notch for this), boost the signal using a preamp, multitracker, etc., and feed it to the line-in of your sound card (*not* the mic line).
3) Use a pop filter, OR (this is important, especially if you're using a PC mic) speak into the microphone at an angle instead of head-on. This reduces the "pops" in the sound file caused by plosives (words with p's, b's, etc., like the word "pop"). When you speak these sounds head-on, a burst of air hits the microphone and makes the sound "clip" (distortion caused by maxing out the input level).
4) Finally, use audio editing software (e.g. Audacity, which is free) to clean up the sound. Use high/low-pass filters to get rid of low hums and rumbles as well as high-pitched hisses. You can also use a parametric EQ to boost the midrange (where most voices fall). Pop filters, click filters, etc. should help as well. Basically, you want a loud, clean sound file without any noise or clipping; that will yield the best lip movement.
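The high-pass filtering in step 4 can be sketched in plain Python. This is a minimal first-order filter for illustration, not what Audacity actually runs; the function name and cutoff are my own choices:

```python
import math

def high_pass(samples, cutoff_hz, sample_rate):
    """First-order high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows the *change* in the input, so slow
        # drifts (hum, rumble) decay away while fast content passes.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

rate = 8000
hum = [math.sin(2 * math.pi * 50 * i / rate) for i in range(rate)]  # 50 Hz mains hum
filtered = high_pass(hum, cutoff_hz=300, sample_rate=rate)
# After the filter settles, the 50 Hz hum is strongly attenuated
# (roughly -16 dB for this cutoff) while voice frequencies above
# 300 Hz would pass through nearly untouched.
print(max(abs(s) for s in filtered[rate // 2:]))
```

A low-pass stage for hiss works the same way in reverse; real editors chain several such filters with steeper slopes.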

Best,
Masaki

Deltcho
08-10-2010, 06:36 PM
Hi Philippe,

I've been working with making avatars talk as well. Maybe I can share some of the things that I've found to work.

As masaki has already suggested, using audio editing software is key. I've personally had *a lot* of success using a standard noise cancellation PC mic (ZALMAN ZM-MIC -- costs about $5-10). The mic clips onto the collar of your shirt.

Here's the process I came up with:

1) Record the audio using proper audio software (I personally use Acoustica, but I'm sure other tools work just as well). When recording, speak in a *natural tone*, as if you were talking to someone else in the room.

2) Normalize the volume of the speech file to 20 dB (for best results with my settings, all recorded files should be normalized to this level).

3) Run a noise reduction filter on it.

4) Export the file as a mono 16-bit PCM file (very important for proper syncing).

5) Use a scale value of 0.028 and a threshold of 0.0001.
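Steps 2 and 4 above can be sketched in plain Python using only the standard library. The function names and the -3 dB target below are my own illustrative choices, not the exact settings or tooling from the list:

```python
import struct
import wave

def normalize_peak(samples, target_db):
    """Scale float samples in [-1, 1] so the peak sits at target_db re full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = 10 ** (target_db / 20.0) / peak  # dB -> linear amplitude, then gain ratio
    return [s * gain for s in samples]

def write_mono_16bit_pcm(path, samples, sample_rate):
    """Export float samples as a mono 16-bit PCM WAV (the format step 4 calls for)."""
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b''.join(
            struct.pack('<h', int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples))

clip = normalize_peak([0.0, 0.1, -0.05, 0.02], target_db=-3.0)
print(round(max(abs(s) for s in clip), 3))  # → 0.708 (peak at -3 dBFS)
```

Normalizing every file to the same level is what makes a single scale/threshold pair work across all of them.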

I've currently got 4 pages of dialogue lip-synced perfectly using this method. I also recommend tweaking vizact.py to sync the lips more often (described in my other post).

I'll try and post a video demonstrating the end result if I have time.

Philippe
08-12-2010, 01:01 AM
Hi Masaki, Hi Deltcho,

Thanks a lot for your detailed answers! This is extremely valuable information for us; until now we were not really happy with the results, although we tried very hard to find the right parameters. I am sure that improving the quality of the sound file with the techniques, tools, and equipment you recommended will drastically improve the animation.

Thanks again, I really appreciate your tremendous help.