Sat Aug 5 10:54:32 2006 UTC (17 years, 8 months ago) by dpavlin
import current mbrola, cr1 database and perl pho generator for Croatian

1 Version 3.01h mar jun 8 18:32:35 MEST 1999
2 M     M   BBBB    RRRR     OOO    L         A
3 MM   MM   B   B   R   R   O   O   L        A A
4 M M M M   B   B   R   R   O   O   L       A   A
5 M  M  M   BBBB    RRR     O   O   L      AAAAAAA
6 M     M   B   B   R  R    O   O   L      A     A
7 M     M   B   B   R   R   O   O   L      A     A
8 M     M   BBBBB   R   R    OOO    LLLLL  A     A
9
10 --------------------------------------------------------------
11 Table of Contents
12 --------------------------------------------------------------
13
14 1.0 License
15 2.0 A brief description of the MBROLA software
16 3.0 Distribution
17 4.0 Installation and Tests
18 5.0 Format of input and output files - Limitations
19 6.0 Joining the MBROLA project as a user
20 7.0 Joining the MBROLA project as a database provider
21 8.0 Acknowledgments
22 9.0 Contacting the author
23
24 --------------------------------------------------------------
25 1.0 License
26 --------------------------------------------------------------
27
28 This program and object code is being provided to "you", the licensee,
29 by Thierry Dutoit, the "author", under the following license, which
30 applies to any program, object code or other work which contains a
31 notice placed by the copyright holder saying it may be distributed
32 under the terms of this license. The "program", below, refers to any
33 such program, object code or work.
34
35 By obtaining, using and/or copying this program, you agree that you
36 have read, understood, and will comply with these terms and
37 conditions:
38
39 Terms and conditions for the distribution of the program
40 --------------------------------------------------------
41
42 This program may not be sold or incorporated into any product which is
43 sold without prior permission from the author.
44
45 When no charge is made, this program may be copied and distributed
46 freely, provided that this notice is copied and distributed with
47 it. Each time you redistribute the program (or any work based on the
48 program), the recipient automatically receives a license from the
49 original licensor to copy or distribute the program subject to these
50 terms and conditions. You may not impose any further restrictions on
51 the recipients' exercise of the rights granted herein. You are not
52 responsible for enforcing compliance by third parties to this License.
53
54 If you wish to incorporate the program into other free programs whose
55 distribution conditions are different, write to the author to ask for
56 permission.
57
58 If, as a consequence of a court judgment or allegation of patent
59 infringement or for any other reason (not limited to patent issues),
60 conditions are imposed on you (whether by court order, agreement or
61 otherwise) that contradict the conditions of this license, they do not
62 excuse you from the conditions of this license. If you cannot
63 distribute so as to satisfy simultaneously your obligations under this
64 license and any other pertinent obligations, then as a consequence you
65 may not distribute the program at all. For example, if a patent
66 license would not permit royalty-free redistribution of the program by
67 all those who receive copies directly or indirectly through you, then
68 the only way you could satisfy both it and this license would be to
69 refrain entirely from distribution of the program.
70
71 Terms and conditions on the use of the program
72 ----------------------------------------------
73
74 Permission is granted to use this software for non-commercial,
75 non-military purposes, with and only with the voice and language
76 databases made available by the author from the MBROLA project www
77 homepage:
78
79 http://tcts.fpms.ac.be/synthesis
80
81 In return, the author asks you to mention the MBROLA reference paper:
82
83 T. DUTOIT, V. PAGEL, N. PIERRET, F. BATAILLE, O. VAN DER VRECKEN
84 "The MBROLA Project: Towards a Set of High-Quality Speech
85 Synthesizers Free of Use for Non-Commercial Purposes"
86 Proc. ICSLP'96, Philadelphia, vol. 3, pp. 1393-1396.
87
88 or, for a more general reference to Text-To-Speech synthesis, the
89 book:
90
91 An Introduction to Text-To-Speech Synthesis,
92 T. DUTOIT, Kluwer Academic Publishers, Dordrecht
93 Hardbound, ISBN 0-7923-4498-7
94 April 1997, 312 pp.
95
96 in any scientific publication referring to work for which this program
97 has been used.
98
99 Disclaimer
100 ----------
101
102 THIS SOFTWARE CARRIES NO WARRANTY, EXPRESSED OR IMPLIED. THE USER
103 ASSUMES ALL RISKS, KNOWN OR UNKNOWN, DIRECT OR INDIRECT, WHICH INVOLVE
104 THIS SOFTWARE IN ANY WAY. IN PARTICULAR, THE AUTHOR DOES NOT TAKE ANY
105 COMMITMENT IN VIEW OF ANY POSSIBLE THIRD PARTY RIGHTS.
106
107 --------------------------------------------------------------
108 2.0 A brief description of MBROLA
109 --------------------------------------------------------------
110
111 MBROLA is a speech synthesizer based on the concatenation of
112 diphones. It takes a list of phonemes as input, together with prosodic
113 information (duration of phonemes and a piecewise linear description
114 of pitch), and produces speech samples on 16 bits (linear), at the
115 sampling frequency of the diphone database.
116
117 It is therefore NOT a Text-To-Speech (TTS) synthesizer, since it does
118 not accept raw text as input. In order to obtain a full TTS system,
119 you need to use this synthesizer in combination with a text processing
120 system that produces phonetic and prosodic commands.
121
122 We maintain a web page with pointers to such freely available systems:
123
124 http://tcts.fpms.ac.be/synthesis/mbrtts.html
125
126 This software is the heart of the MBROLA project, the aim of which is
127 to obtain a set of speech synthesizers for as many languages as
128 possible, free of use for non-commercial applications.
129
130 The terms of this project can be summarized as follows :
131
132 After some official agreement between the author of this software and
133 the owner of a diphone database, the database is processed by the
134 author and adapted to the mbrola format, for free. The resulting
135 mbrola diphone database is made available for non-commercial use as
136 part of the MBROLA project. Commercial rights on the mbrola database
137 remain with the database provider, for exclusive use with the mbrola
138 software.
139
140 The ultimate goal of this project is to boost academic research on
141 speech synthesis, and particularly on prosody generation, known as one
142 of the biggest challenges facing Text-To-Speech synthesizers in
143 the years to come. If you want to provide a database to the mbrola
144 project, first write to mbrola@tcts.fpms.ac.be
145
146 More details can be found at the MBROLA project homepage :
147
148 http://tcts.fpms.ac.be/synthesis
149
150 The synthesizer uses a synthesis method known itself as MBROLA.
151
152 --------------------------------------------------------------
153 3.0 Distribution
154 --------------------------------------------------------------
155
156 This distribution of mbrola contains the following files :
157
158 mbrola.exe or mbrola: An executable file of the synthesizer itself
159 (depends on the computer supposed to run it)
160 readme.txt : This file
161
162 As such, it requires an MBROLA language/voice database to run
163 properly. American English, Arabic, Brazilian Portuguese, Breton,
164 British English, Croatian, Dutch, Estonian, French, German, Greek,
165 Mexican Spanish, Romanian, Spanish and Swedish voices are made
166 available. Additional languages and voices will be available in the
167 context of the MBROLA project.
168
169 Please consult the MBROLA project homepage to get the voices:
170
171 http://tcts.fpms.ac.be/synthesis
172
173 --------------------------------------------------------------
174 4.0 Installation and Tests
175 --------------------------------------------------------------
176
177 The following computers/OS are currently supported :
178
179 SUN Sparc 5/S5R4 (Solaris2.4)
180 HPUX9.0 and HPUX10.0
181 VAX/VMS V6.2 (V5.5-2 won't work)
182 DECALPHA(AXP)/VMS 6.2
183 AlphaStation 200 4/233
184 AlphaStation 200 4/166
185 IBM RS6000 Aix 4.12
186 PC486/DOS6 (but other PCs/DOSs should do, too)
187 PC486/Windows 3.1
188 PC486/Windows 95
189 PC-Pentium/Windows 98
190 PC-Pentium/Windows NT
191 PC/LINUX 1.2.11
192 PC/LINUX Redhat6.2
193 PCPentium120/Solaris2.4
194 OS/2
195 BeBox
196 BeOs (PPC,i386)
197 Macintosh
198 Sun Ultra1/SuSE Linux 7.0
199
200 Please let us know when mbrola works on a machine not listed
201 here. A special DLL version is distributed for PC/Windows to allow
202 direct audio output; see the MbrolaTools package on the MBROLA site.
203
204 See the MBROLA Homepage if your computer or OS is not supported yet.
205
206 Assuming you have copied the right .zip file, create a directory
207 mbrola (although this is not critical), copy the mbrXXX.zip file into
208 it (in which XXX stands for a version number), and unzip the file:
209
210 unzip mbrXXX.zip (or pkunzip on PC/DOS)
211
212 You are now ready to synthesize your first words....
213
214 First try: mbrola
215
216 to see the terms and conditions on the use of this software.
217
218 Then try: mbrola -h
219
220 to get some help on how to use the software:
221
222 > USAGE: ./synth [COMMAND LINE OPTIONS] database pho_file+ output_file
223 >
224 >A - instead of pho_file or output_file means stdin or stdout
225 >Extension of output_file ( raw, au, wav, aiff ) tells the wanted audio format
226 >
227 > Options can be any of the following:
228 > -i = display the database information if any
229 > -e = IGNORE fatal errors on unkown diphone
230 > -c CC = set COMMENT char (escape sequence in pho files)
231 > -F FC = set FLUSH command name
232 > -v VR = VOLUME ratio, float ratio applied to ouput samples
233 > -f FR = FREQ ratio, float ratio applied to pitch points
234 > -t TR = TIME ratio, float ratio applied to phone durations
235 > -l VF = VOICE freq, target freq for voice quality
236 > -R RL = Phoneme RENAME list of the form a A b B ...
237 > -C CL = Phoneme CLONE list of the form a A b B ...
238 >
239 > -I IF = Initialization file containing one command per line
240 > CLONE, RENAME, VOICE, TIME, FREQ, VOLUME, FLUSH, COMMENT,
241 > and IGNORE are available
242
243 Now in order to go further, you need to get a version of an MBROLA
244 language/voice database from the MBROLA project homepage. Let us
245 assume you have copied the FR1 database and referred to
246 the accompanying fr1.txt file for its installation.
247
248 Then try: mbrola fr1/fr1 fr1/TEST/bonjour.pho bonjour.wav
249
250 it uses the format:
251
252 mbrola diphone_database command_file1 command_file2 ... output_file
253
254 and creates a sound file for the word 'bonjour' ( Hello !).
255
256 Basically the output file is composed of signed integer numbers on 16
257 bits, corresponding to samples at the sampling frequency of the MBROLA
258 voice/language database (16 kHz for the diphone database supplied by
259 the author of MBROLA: Fr1). MBROLA can produce different audio file
260 formats: .au, .wav, .aiff, .aif, and .raw files, depending on the
261 output_file extension. If the extension is not recognized, the format
262 is RAW (no header). We recommend .wav for Windows, and .au for Unix
263 platforms.
264
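As an illustration of what the RAW format means in practice, a headerless
file of signed 16-bit samples can be wrapped into a WAV container with
Python's standard wave module. This is only a sketch: the function name
raw_to_wav is hypothetical, and mono output and little-endian byte order are
assumptions (raw byte order may depend on the machine mbrola ran on). The
16 kHz default matches the Fr1 database mentioned above.

```python
import wave

def raw_to_wav(raw_path, wav_path, sample_rate=16000):
    """Wrap headerless 16-bit PCM samples in a RIFF WAV container."""
    with open(raw_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as out:
        out.setnchannels(1)            # assumption: mono speech
        out.setsampwidth(2)            # 16 bits = 2 bytes per sample
        out.setframerate(sample_rate)  # must match the database rate
        out.writeframes(pcm)           # assumption: little-endian samples
    return len(pcm) // 2               # number of samples written
```

The sample rate has to be supplied by the caller because a RAW file carries
no header from which it could be read.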
265 To display information about the phoneme set used by the database,
266 type:
267 mbrola -i fr1/fr1
268
269 It displays the phonetic alphabet as well as copyright information
270 about the database.
271
272 Option -e makes Mbrola ignore wrong or missing diphone sequences
273 (they are replaced by silence), which can be quite useful when debugging
274 your TTS. It is equivalent to the "IGNORE" directive in the initialization
275 file (N.B. this replaces the obsolete ;;E=OFF, unsupported in .pho files).
276
277 Optional parameters let you shorten or lengthen synthetic speech and
278 transpose it by providing optional time and frequency ratios:
279
280 mbrola -t 1.2 -f 0.8 fr1/fr1 TEST/bonjour.pho bonjour.wav
281
282 or its equivalent in the initialization file:
283
284 TIME 1.2
285 FREQ 0.8
286
287 for instance, will result in a RIFF Wav file bonjour.wav 1.2 times
288 longer than the previous one (slower rate), and containing speech in
289 which all fundamental frequency values have been multiplied by 0.8
290 (sounds lower).
291
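The effect of the two ratios on a single phone can be sketched in Python as
follows. The function name apply_ratios is hypothetical and the code only
mirrors the description above (durations multiplied by the time ratio, pitch
values multiplied by the frequency ratio); it is not taken from the mbrola
sources.

```python
def apply_ratios(duration_ms, pitch_targets, time_ratio=1.0, freq_ratio=1.0):
    """Scale one phone the way the TIME and FREQ ratios are described to act."""
    scaled_duration = duration_ms * time_ratio
    # Target positions are percentages of the phone duration, so they are
    # left unchanged; only the pitch value in Hz is scaled.
    scaled_targets = [(pos, hz * freq_ratio) for pos, hz in pitch_targets]
    return scaled_duration, scaled_targets
```

With time_ratio=1.2 and freq_ratio=0.8, a 150 ms phone with a 91 Hz target
becomes roughly 180 ms long with a target near 72.8 Hz.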
292 You can also set the values of these coefficients directly in a .pho
293 file by adding special escape sequences like :
294
295 ;; F=0.8
296 ;; T=1.2
297
298 You can change the voice characteristics with the -l parameter. If the
299 sampling rate of your database is 16000, indicating -l 18000 allows
300 you to shorten the vocal tract by a ratio of 16/18 (a child's voice or
301 a woman's voice, depending on the voice you're working on). With -l
302 10000, you can lengthen the vocal tract by a ratio of 16/10 (namely
303 the voice of a Troll). The same command in an initialization file
304 becomes "VOICE 10000".
305
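The ratios quoted above are simply the database sampling rate divided by the
value given to -l. A minimal Python sketch of that arithmetic, with a
hypothetical function name:

```python
from fractions import Fraction

def vocal_tract_ratio(database_rate_hz, voice_freq_hz):
    """Ratio by which the vocal tract length is scaled for a given -l value."""
    return Fraction(database_rate_hz, voice_freq_hz)
```

A result below 1 (for example 16000 against 18000) shortens the vocal tract,
and a result above 1 (16000 against 10000) lengthens it.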
306 Option "-v" specifies a VolumeRatio which multiplies each output
307 sample. In the example below, each sample is multiplied by 0.7 (the
308 loudness goes down). Warning: setting VolumeRatio too high generates
309 saturation.
310
311 mbrola -v 0.7 fr1/fr1 TEST/bonjour.pho bonjour.wav
312
313 or add "VOLUME 0.7" in an initialization file
314
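Saturation occurs because the output samples are 16-bit signed integers: once
a scaled sample falls outside -32768..32767 it can no longer be represented.
The Python sketch below counts how many samples of an existing signal would
overflow for a candidate ratio. The function name is hypothetical, and how
mbrola itself treats such samples is not specified here.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def count_saturated(samples, volume_ratio):
    """Return how many samples would leave the 16-bit range after scaling."""
    return sum(
        1
        for s in samples
        if not INT16_MIN <= round(s * volume_ratio) <= INT16_MAX
    )
```

A count of zero means the chosen ratio is safe for that particular signal.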
315 The -c option lets you specify which symbol will be used as an escape
316 sequence for comments and commands in .pho files. The default value is
317 the semi-colon ';', but you may want to change this if your phonetic
318 alphabet uses this symbol, like in:
319
320 mbrola -c ! fr1/fr1 TEST/test1.pho test2.pho test.wav
321
322 equivalent to "COMMENT !" in an initialization file
323
324 The -F option lets you specify which symbol will be used to Flush the
325 audio output. The default value is #, you may want to change the
326 symbol like in:
327
328 mbrola -F FLUSH_COMMAND fr1/fr1 test.pho test.wav
329
330 equivalent to "FLUSH FLUSH_COMMAND" in the initialization file.
331
332
333 Using Pipes
334 -----------
335
336 A - instead of command_file or output_file means stdin or stdout. On
337 multitasking machines, it is easy to run the synthesizer in real time
338 to obtain audio output from the audio device, by using pipes.
339
340 Renaming and Cloning phonemes
341 -----------------------------
342
343 It may happen that the language processing module connected to MBROLA
344 doesn't use the same phonemic alphabet as the voice used. The Renaming
345 and Cloning mechanisms help you to quickly solve such problems
346 (without adding extra CPU load). The only limitation about phoneme
347 names is that they can't contain blank characters.
348
349 If, for instance, phoneme "a" in the mbrola voice you use is called
350 "my_a" in your alphabet, and phoneme "b" is called "my_b", then the
351 following command solves the problem:
352
353 mbrola -R "a my_a b my_b" fr1/fr1 test.pho test.wav
354
355 You can give as many renaming pairs as you want. Circular definitions
356 are not a problem -> "a b b c" will rename original [a] into [b] and
357 original [b] into [c] independently ([a] won't be renamed to [c]).
358
359 LIMITATION: you can't rename a phoneme into another that already
360 exists.
361
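The renaming rules above (pairs applied independently to the original names,
and no renaming into a name that already exists) can be modelled with the
following Python sketch. Both function names are hypothetical, and the code
only mirrors the behaviour described in this section, not the mbrola
implementation.

```python
def parse_pairs(pair_list):
    """Split 'a A b B ...' into [(a, A), (b, B), ...]."""
    names = pair_list.split()
    if len(names) % 2:
        raise ValueError("phoneme list must contain an even number of names")
    return list(zip(names[0::2], names[1::2]))

def rename_alphabet(alphabet, pair_list):
    """Apply every rename to the ORIGINAL names, so 'a b b c' does not chain."""
    mapping = dict(parse_pairs(pair_list))
    renamed = [mapping.get(name, name) for name in alphabet]
    # Two phonemes ending up with the same name means one of them was
    # renamed into a name that already exists.
    if len(set(renamed)) != len(renamed):
        raise ValueError("cannot rename a phoneme into one that already exists")
    return renamed
```

For the alphabet a, b, d and the list "a b b c", the result is b, c, d: the
original [a] becomes [b] and the original [b] becomes [c].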
362 The cloning mechanism does exactly the same thing, though the old
363 phoneme still exists afterwards. This is useful if you have 2
364 allophones in your alphabet, but the Mbrola voice only provides one.
365
366 Imagine, for instance, that you make the distinction between the
367 voiced [r] and its unvoiced counterpart [r0], and that you are using a
368 syllabic version [r=]. If, as a first approximation, using [r] for all
369 of them is OK, then you may use an Mbrola voice that only provides one
370 version of [r] by running:
371
372 mbrola -C "r r0 r r=" fr1/fr1 test.pho test.wav
373
374 which tells the synthesizer that [r0] and [r=] should be both
375 synthesized as [r]. You can write a long cloning list of phoneme
376 pairs to fit your needs.
377
378 Renaming and cloning have a one-off CPU cost, since the complete diphone
379 hash table has to be rebuilt, but once the renaming or cloning has
380 occurred there is absolutely NO RELATED PERFORMANCE DROP. So using this
381 feature is more efficient than a pre-processor, though incompatibilities
382 cannot always be solved by a simple phoneme mapping.
383
384 Before renaming anything as #, check paragraph 5.3
385
386 When you have long cloning and renaming lists, you can conveniently
387 write them into an initialization file according to the following
388 format:
389
390 RENAME a my_a
391 RENAME b my_b
392 CLONE r r0
393 CLONE r r=
394
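Long lists are easier to maintain when the initialization file is generated.
The Python sketch below builds the file contents from rename and clone pairs
plus optional settings such as TIME or FREQ. The function name and the order
in which directives are written are assumptions; only the directive names
come from this document.

```python
def init_file_text(renames=(), clones=(), **settings):
    """Return the text of an initialization file, one command per line."""
    lines = [f"RENAME {old} {new}" for old, new in renames]
    lines += [f"CLONE {old} {new}" for old, new in clones]
    # Keyword arguments become directives, e.g. time=1.2 -> "TIME 1.2".
    lines += [f"{key.upper()} {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"
```

The returned text can be saved to a file and passed to mbrola with the -I
option.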
395 The obsolete ";; RENAME a my_a" can't be used in .pho files any more,
396 but is correctly parsed in initialization files.
397
398 Note to Festival and EN1 users: the consequence of the change above is
399 that you must change the previous call format "mbrola en1 en1mrpa ..."
400 into "mbrola -I en1mrpa en1 ...".
401
402
403 BELOW ARE A NUMBER OF MACHINE-DEPENDENT HINTS FOR MAKING THE BEST USE OF MBROLA
404
405 On MSDOS/Windows or OS/2
406 ------------------------
407
408 Type: mbrola fr1/fr1 TEST/bonjour.pho bonjour.wav
409
410 Then you can play the RIFF Wav file with the Windows sound utility. On
411 OS/2, pipes may be used just as described below.
412
413 REMARK: MbrolaTools provide an excellent DLL and graphical pho player
414 called Mbroli. We advise you to use them instead of mbrola.exe for
415 Windows.
416
417 On modern Unix systems such as Solaris or HPUX or Linux
418 -------------------------------------------------------
419
420 mbrola fr1/fr1 TEST/bonjour.pho -.au | audioplay
421
422 where audioplay is your audio file player (* the name varies with the
423 platform, e.g. splayer for HPUX *)
424
425 If your audio player has problems with Sun .au files, try with .raw.
426 Never use the .wav format when you pipe the output (mbrola can't rewind
427 the file to write the audio size in the header). The Wav format was not
428 developed for Unix (on the contrary, the Au format lets you specify in
429 the header "we're on a pipe, read until end of file").
430
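The "read until end of file" convention of the Au format is a reserved value
in the data-size field of the header. The Python sketch below builds such a
header for 16-bit linear samples. It illustrates the file format only and is
not necessarily byte-for-byte what mbrola writes; the function name and the
minimal 24-byte header without annotation are assumptions.

```python
import struct

AU_MAGIC = 0x2E736E64          # the four bytes ".snd"
AU_UNKNOWN_SIZE = 0xFFFFFFFF   # data size unknown: read until end of file
AU_ENCODING_PCM16 = 3          # 16-bit linear PCM

def au_stream_header(sample_rate=16000, channels=1):
    """Build a 24-byte big-endian .au header for a stream of unknown length."""
    header_size = 24
    return struct.pack(">6I", AU_MAGIC, header_size, AU_UNKNOWN_SIZE,
                       AU_ENCODING_PCM16, sample_rate, channels)
```

A WAV header has no comparable convention, which is why a program writing
to a pipe cannot complete it without rewinding.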
431 NOTE FOR LINUX: you can use the GPL rawplay program provided at
432 ftp://tcts.fpms.ac.be/pub/mbrola/pclinux/
433
434 On Sun4 or with machines with an old audio interface
435 -----------------------------------------------------
436
437 Those machines are now quite old and only provide a mu-law 8 kHz
438 output. A hack is:
439
440 mbrola fr1/fr1 input.pho - | sox -t raw -sw -r 16000 - -t raw -Ub -r 8000 - > /dev/audio
441
442 (provided you have the public domain sox utility developed by Ircam).
443 You should hear 'bonjour' without the need to create intermediate
444 files. Note that we strongly recommend that you DON'T use SOX, since
445 its resampling method (linear interpolation) will permanently damage
446 the sound.
447
448 Other solution: The UTILITY.ZIP file available from the MBROLA
449 homepage provides RAW2SUN which does this conversion.
450
451 On VAX or AXP workstations
452 --------------------------
453
454 To make it easier for users to find MBROLA, you should add the
455 following command to your system startup procedure:
456
457 $ DEFINE/SYSTEM/EXEC MBROLA_DIR disk:[dir]
458
459 where "disk:[dir]" is the name of the directory you created for the
460 MBROLA_DIR files. You could also add the following command to your
461 system login command procedure:
462
463 $ MBROLA :== $MBROLA_DIR:MBROLA.EXE
464 $ RAW2SUN :== $MBROLA_DIR:RAW2SUN.EXE
465
466 to use the decsound device:
467
468 $ MCR DECSOUND - volume 40 -play sound.au
469
470 See also the MBR_OLA.COM batch file in the UTILITY.ZIP file available
471 from the MBROLA Homepage if you cannot play 16 bits sound files on
472 your machine.
473
474 --------------------------------------------------------------
475 5.0 Format of input and output files - Limitations
476 --------------------------------------------------------------
477
478 5.1 Phoneme commands
479 --------------------
480
481 The input file bonjour.pho in the above example simply contains :
482
483 ; bonjour
484 _ 51 25 114
485 b 62
486 o~ 127 48 170.42
487 Z 110 53.5 116
488 u 211
489 R 150 50 91
490 _ 91
491
492 This shows the format of the input data required by MBROLA. Each line
493 contains a phoneme name, a duration (in ms), and a series (possibly
494 none) of pitch targets composed of two float numbers each : the
495 position of the pitch target within the phoneme (in % of its
496 total duration), and the pitch value (in Hz) at this position.
497
498 In order to increase readability, it is also possible to enclose pitch
499 target in parentheses. Hence, the first line of bonjour.pho could
500 be written :
501
502 _ 51 (25,114)
503
504 it tells the synthesizer to produce a silence of 51 ms, and to put a
505 pitch target of 114 Hz at 25% of 51 ms. Pitch targets define a
506 piecewise linear pitch curve. Notice that the intonation curve they
507 define is continuous, since the program automatically drops pitch
508 information when synthesizing unvoiced phones.
509
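To make the notion of a piecewise linear pitch curve concrete, the Python
sketch below converts per-phone targets (position in %, pitch in Hz) into
absolute time points and interpolates between them. The function names are
hypothetical, and holding the first and last values outside the targets is
an assumption, not documented mbrola behaviour.

```python
def absolute_targets(phones):
    """Convert per-phone (percent, Hz) targets into absolute (ms, Hz) points."""
    points, start = [], 0.0
    for duration_ms, targets in phones:
        for percent, hz in targets:
            points.append((start + duration_ms * percent / 100.0, hz))
        start += duration_ms
    return points

def pitch_at(points, t_ms):
    """Linearly interpolate the pitch curve at time t_ms."""
    if not points:
        raise ValueError("no pitch targets")
    if t_ms <= points[0][0]:
        return points[0][1]      # assumption: hold the first value before it
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t_ms <= t1:
            if t1 == t0:
                return f1
            return f0 + (f1 - f0) * (t_ms - t0) / (t1 - t0)
    return points[-1][1]         # assumption: hold the last value after it
```

For the first line of bonjour.pho, a 51 ms silence with a target at 25%
yields a point at 12.75 ms with a pitch of 114 Hz.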
510 The data on each line is separated by blank characters or tabs.
511 Comments can optionally be introduced in command files, starting with
512 a semi-colon ';'. This default can be overridden with the -c option
513 of the command line.
514
515 Another special escape sequence ';;' allows the user to introduce
516 commands in the middle of .pho files as described below. This escape
517 sequence is also affected by the -c option.
518
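The line format described in this section can be read with a few lines of
Python. This is a sketch of a reader for the format as documented here, not
mbrola's own parser: the function name and the returned dictionary layout are
hypothetical, and comment or command lines are simply skipped.

```python
import re

def parse_pho_line(line, comment_char=";"):
    """Parse one .pho line; return None for blank lines, comments and commands."""
    text = line.strip()
    if not text or text.startswith(comment_char):
        return None
    parts = text.split(None, 1)          # phoneme name, then the rest
    name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    # Treat "(", ")" and "," as blanks so "(25,114)" reads like "25 114".
    numbers = [float(v) for v in re.sub(r"[(),]", " ", rest).split()]
    duration = numbers[0] if numbers else None   # e.g. a bare flush symbol
    values = numbers[1:]
    if len(values) % 2:
        raise ValueError(f"unpaired pitch target in {line!r}")
    targets = list(zip(values[0::2], values[1::2]))
    return {"phoneme": name, "duration_ms": duration, "pitch_targets": targets}
```

Both "_ 51 25 114" and "_ 51 (25,114)" produce the same result: phoneme "_",
a duration of 51 ms and one pitch target of 114 Hz at 25%.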
519 5.2 Changing the Freq Ratio or Time Ratio
520 -----------------------------------------
521
522 A command escape sequence followed by "T=xx" sets the time ratio to
523 xx. The fundamental frequency ratio is set the same way by replacing
524 T with F, as in:
525
526 ;; T = 1.2
527 ;;F=0.8
528
529
530 5.3 Flush the output stream
531 ---------------------------
532
533 Note, finally, that the synthesizer outputs chunks of synthetic speech
534 determined as sections of the piecewise linear pitch curve. Phones
535 inside a section of this curve are synthesized in one go. The last
536 phone of each chunk, however, cannot be properly synthesized while the
537 next phone is not known (since the program uses diphones as base
538 speech units). When using mbrola with pipes, this may be a
539 problem. Imagine, for instance, that mbrola is used to create a
540 pipe-based speaking clock on an HP:
541
542 speaking_clock | mbrola - -.au | splayer
543
544 which tells the time, say, every 30 seconds. The last phone of each
545 time announcement will only be synthesized when the next announcement
546 starts. To bypass this problem, mbrola accepts a special command
547 phone, which flushes the synthesis buffer : "#"
548
549 This default character can be replaced by another symbol thanks to the
550 command:
551
552 ;; FLUSH new_flush_symbol
553
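A program feeding mbrola through a pipe, such as the speaking clock above,
would end each announcement with the flush symbol. A minimal Python sketch,
assuming a hypothetical helper name and phones given as (name, duration in
ms) pairs without pitch targets:

```python
import sys

def write_announcement(phones, stream=sys.stdout, flush_symbol="#"):
    """Write one utterance as .pho lines, then ask mbrola to finish it."""
    for name, duration_ms in phones:
        stream.write(f"{name} {duration_ms}\n")
    # The flush command phone lets the pending last phone be synthesized.
    stream.write(flush_symbol + "\n")
    # Also empty the writer's own buffer so the lines reach the pipe now.
    stream.flush()
```

The second flush matters as much as the first: if the lines stay in the
writer's buffer, mbrola never sees the flush symbol in time.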
554 Another important issue with piping under UNIX, is the possibility to
555 prematurely end the audio output, if for example the user presses the
556 stop button of your application. Since release 3.01, Mbrola handles
557 signals.
558
559 If in the previous example the user wants to interrupt the speaking
560 clock message, the application just needs to send the USR1 signal. You
561 can send such a signal from the console with:
562
563 kill -SIGUSR1 mbrola_process_number
564
565 Once mbrola catches the signal, it reads its input stream until it
566 gets EOF or a FLUSH command (hence, surrounding sections with flush is
567 a good habit).
568
569 5.4 Limitations of the program
570 ------------------------------
571
572 Phones can be synthesized with a maximum duration which depends on
573 the fundamental frequency with which they are produced. The higher the
574 frequency, the lower the duration. For a frequency of 133 Hz, the
575 maximum duration is 7.5 sec. For a frequency of 66.5 Hz, it is 15 sec.
576 For a frequency of 266 Hz, it is 3.75 sec.
577
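The three figures above share the same product of frequency and duration
(133 x 7.5 = 66.5 x 15 = 266 x 3.75 = 997.5), which suggests a fixed limit of
roughly one thousand pitch periods per phone. The Python sketch below
extrapolates from that observation; the constant is inferred from these three
data points only, and the names are hypothetical.

```python
# Inferred from the three examples: frequency (Hz) times duration (s).
MAX_PITCH_PERIODS = 997.5

def max_phone_duration_s(pitch_hz):
    """Estimate the longest phone duration (in seconds) at a given pitch."""
    if pitch_hz <= 0:
        raise ValueError("pitch must be positive")
    return MAX_PITCH_PERIODS / pitch_hz
```

At 200 Hz, for example, this estimate gives a maximum of just under 5
seconds.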
578 --------------------------------------------------------------
579 6.0 Joining the MBROLA project as a user
580 --------------------------------------------------------------
581
582 For convenience, we have defined two mailing lists :
583
584 * mbrola-interest@tcts.fpms.ac.be : a forum for MBROLA questions and
585 issues. It is used by the maintainers of the mbrola project to
586 announce new releases, bug fixes, new voices and languages, and other
587 information of interest to all MBROLA users. Users who want to share
588 .pho files or free applications running on top of mbrola should send
589 mail to mbrola-interest.
590
591 It is in your interest, as a user, to subscribe to the mbrola-interest
592 mailing list, by sending an e-mail to :
593
594 mbrola-interest-request@tcts.fpms.ac.be
595
596 with the word 'subscribe' in either the header or the main text. To
597 unsubscribe, just send another mail with 'unsubscribe'.
598
599 BUGS
600 ----
601
602 If you detect a bug, or if you find an input for which the quality of
603 the speech provided by mbrola is not as good as usual, first consult
604 the FAQ file from the MBROLA Project homepage, which will be
605 frequently updated.
606
607 If this is of no help, send a kind mail to mbrola@tcts.fpms.ac.be in
608 which you include the .pho file with which the problem appears and
609 mention your machine architecture.
610
611 NEW DATABASES
612 -------------
613
614 If you want to participate in the mbrola project by providing a
615 diphone database (i.e. a set of sample files with one example of each
616 diphone in your language), refer to the mbrola WWW homepage, or send
617 an email to: mbrola@tcts.fpms.ac.be.
618
619 APPLICATIONS
620 ------------
621
622 If you have used mbrola to build speaking apps on top of it (like
623 talking clocks, talking agendas, talking tools for handicapped
624 persons, etc.), and want to make them available to the community (for
625 free, of course, and for non-commercial, non-military applications, as
626 imposed by the mbrola license agreement), just make an announcement to
627 the mbrola mailing list:
628
629 mbrola-interest@tcts.fpms.ac.be.
630
631 COMMERCIAL VERSION
632 ------------------
633
634 If you are interested in the commercial version of mbrola (source code
635 available), send an email to: mbrola@tcts.fpms.ac.be
636
637 FEEDBACK
638 --------
639
640 If you simply find this initiative useful, please drop us a note at
641 mbrola@tcts.fpms.ac.be. We have spent a lot of our time to provide you
642 with this program, and we would like to get some feedback in return.
643
644 Don't forget, either, to mention the MBROLA reference paper :
645
646 T. DUTOIT, V. PAGEL, N. PIERRET, F. BATAILLE, O. VAN DER VRECKEN
647 "The MBROLA Project: Towards a Set of High-Quality Speech
648 Synthesizers Free of Use for Non-Commercial Purposes"
649 Proc. ICSLP 96, Philadelphia, vol. 3, pp. 1393-1396
650
651 or, for a more general reference to Text-To-Speech synthesis, the
652 book:
653
654 An Introduction to Text-To-Speech Synthesis,
655 T. DUTOIT, Kluwer Academic Publishers, Dordrecht
656 Hardbound, ISBN 0-7923-4498-7
657 April 1997, 312 pp.
658
659 in any scientific publication referring to work for which this program
660 has been used.
661
662 --------------------------------------------------------------
663 7.0 Joining the MBROLA project as a database provider
664 --------------------------------------------------------------
665
666 One of the biggest interests of the MBROLA project (and definitely its
667 most original aspect) lies in its ability to provide an ever growing
668 set of languages/voices to users.
669
670 To achieve this goal, the MBROLA project has itself been organized so
671 as to incite other research labs or companies to share their diphone
672 databases.
673
674 The terms of this sharing policy can be summarized as follows :
675
676 1. We shall only use your database to adapt it to the mbrola format,
677 and destroy the copy when this is done.
678
679 2. The resulting mbrola diphone database will be copyright Faculte
680 Polytechnique de Mons. Non-commercial use of the database in the
681 framework of the MBROLA project will be automatically granted to
682 Internet users. In return, we shall send you a license agreement which
683 will transfer all our commercial rights on the database to you,
684 provided the database is used with and only with the MBROLA program.
685
686 3. All these details will be fixed by some official agreement before
687 you send us anything.
688
689 If you want to create a database from scratch
690 ---------------------------------------------
691
692 First, you should be aware that recording a diphone database is not a
693 trivial operation. If it is not performed carefully, the result can be
694 disappointing. FR1, for instance, required about one month of work, even
695 with the help of some efficient laboratory tools for signal recording
696 and editing. What is more, some phonetic knowledge of the targeted
697 language is necessary to create the initial corpus.
698
699 So if you just think of designing a new diphone database as a game,
700 forget it.
701
702 If, on the contrary, you are willing to spend some time to provide the
703 MBROLA community with a new language or voice, or if you already have
704 a diphone database and wish to share it in mbrola format (and receive
705 in return the rights for any commercial exploitation of the mbrola
706 diphone database we will create for you), welcome here.
707
708 If you want to build a new diphone database, please contact the author
709 first. He will help you as much as he can, by providing phonetic
710 information if available for instance.
711
712 In all cases, make a first dummy trial : create a small corpus for a
713 few diphones, record them, segment them, equalize them if you can, and
714 send the result directly to the author. He will test your data, tell
715 you how good it is, and what should be done to make it better.
716
717 If you want to share an existing database
718 -----------------------------------------
719
720 Contact the author (see below).
721
722 --------------------------------------------------------------
723 8.0 Acknowledgments
724 --------------------------------------------------------------
725
726 I would like to thank Vincent Pagel (Mons / BE) for his intensive
727 programming, testing, and debugging of this program, and for all sorts
728 of fruitful discussions. Vincent also wrote MBRDICO, a general purpose
729 trainable phonetizer. Not to forget Nicolas Pierret and Olivier Van
730 der Vrecken, for their contribution to the Mbrola coder.
731
732 Then let's greet our pioneer database providers:
733 Alejandro Barbosa (MX1),
734 Aggelos Bletsas (GR1),
735 Marian Boldea (RO1),
736 Gösta Bruce (SW1),
737 Alistair Conkie (EN1 ES1),
738 Denis Costa (BR1),
739 Arthur Dirksen (NL1 NL2),
740 Thierry Dutoit (FR1),
741 Céline Egéa (FR2),
742 Fred Englert (DE1 DE2),
743 Nikolaj Lazic (CR1),
744 Mike Macon (US3 MX1),
745 Einar Meister (EE1),
746 Yann-Ber Messager (BZ1),
747 Vincent Pagel (FR3 FR4),
748 Marcus Philipson (SW1),
749 George Sergiadis (GR1),
750 Nawfal Tounsi (AR1),
751 Raymond Veldhuis (NL3),
752 Gordon Tischer (ES2),
753 Johan Wouters (US3)
754 and the team at University Autonoma of Barcelona (ES1)!
755
756 May they be thanked for their work.
757
758 Sam Przyswa (NEXT Paris/FR), Fred Englert (IBMRS600 Frankfurt/DE),
759 Arnaud Gaudinat (VAX-VMS University of Geneva, CH), Cyrille
760 Mastchenko (BeOS Montreal/CA), Michael C. Thornburgh (SCO-Unix USA),
761 Bruno Langlois (Java port Quebec/CA), Christophe M. Vallat (OS2
762 Domerat/FR), Cristiano Verondini (Mac Bologna/Italy), Gerald Kerma
763 (Mac G'K2 Vaugrigneuse/FR), David Woodman (SUN4 Berkshire/England),
764 Gary Thomas (Linux-PPC Grenoble/France), Thomas Fletcher (QNX-OS CA),
765 Philippe Devallois (Mac DLL), Thomas Agopian (BeOs), Stephen Isard (Linux
766 PC Redhat6.2), Matthias Nutt (Ultra 1) for their help in the compilation
767 of MBROLA on many platforms.
768
769 Arnaud Gaudinat (Lausanne/CH), Thierry Gartiser (Nancy/France), Alec
770 Epting (Summer Institute of Linguistics/USA), Michael M. Cohen
771 (University of California - Santa Cruz), and Patrick Bouffer (France)
772 have arranged mirror sites.
773
774 David Haubensack has written a French TTS in PERL, Stephen Isard and
775 Alistair Conkie have provided the Freespeech British English TTS!!
776 Alan Black and Paul Taylor have supported the Mbrola Project in their
777 great Festival multilingual TTS Project.
778
779 Fabrice Malfrere (Mons/BE) has developed an efficient speech
780 alignment program for Windows (distributed on the mbrola site).
781
782 Alain Ruelle (Mons/BE) has developed the MBRPlay DLL and the
783 Mbroli interactive pho file player for Windows.
784
785 Nawfal Tounsi (Mons/BE) has developed the W project, aimed at
786 helping disabled people talk with the help of Mbrola.
787
788 Last but not least, I am also greatly indebted to Francois Bataille
789 (Mons/BE) for having supported the creation of this internet project.
790
791 --------------------------------------------------------------
792 9.0 Contacting the author
793 --------------------------------------------------------------
794
795 Dr Thierry Dutoit
796
797 Faculte Polytechnique de Mons, TCTS Lab,
798 31, bvd Dolez, B-7000 Mons, Belgium.
799 tel : /32/65/374133
800 fax : /32/65/374129
801 e-mail: mbrola@tcts.fpms.ac.be, for general information,
802 questions on the installation of software and databases.
803
