From fungsin.lui at gmail.com Mon Oct 9 07:21:46 2006
From: fungsin.lui at gmail.com (Lui Fungsin)
Date: Mon, 9 Oct 2006 00:21:46 -0700
Subject: [cl-wav-synth-devel] newbie question
Message-ID: <3990b5930610090021pdaf149fx63f426dbd6e03499@mail.gmail.com>

Hi,

I just finished watching the cl-wav-synth demo tutorial, it's way cool!

I see that this is a new project and there's not much traffic here, so I
hope that you guys won't mind a dumb question.

I'm clueless about audio, the wav file format, etc.
However, there's a simple task that I want to try my hand at with the
cl-wav-synth library.

Here are two sound files for some Chinese words. Some words have more
than one pronunciation (like the first file below), while most of the
others have only one.

http://209.172.124.170/pub/two_tone.wav
http://209.172.124.170/pub/single_tone.wav

Is it possible to programmatically detect whether there's a voice uttered
at the beginning of a wav file, then a short period of silence, and
then another voice uttered?

If this is the case, I want to split the file into two files (breaking at
the silence). Otherwise I can just leave it alone.

If this can be done, I'd greatly appreciate it if someone could briefly
describe the procedure or point me in the right direction (a URL to
read, etc.).

Many thanks.

fungsin

From hocwp at free.fr Mon Oct 9 19:18:44 2006
From: hocwp at free.fr (Philippe Brochard)
Date: Mon, 09 Oct 2006 21:18:44 +0200
Subject: [cl-wav-synth-devel] newbie question
In-Reply-To: <3990b5930610090021pdaf149fx63f426dbd6e03499@mail.gmail.com>
 (Lui Fungsin's message of "Mon, 9 Oct 2006 00:21:46 -0700")
References: <3990b5930610090021pdaf149fx63f426dbd6e03499@mail.gmail.com>
Message-ID: <871wphjo17.fsf@grigri.elcforest>

Lui Fungsin writes:

> Hi,
>

Hi, thanks a lot for your interest in cl-wav-synth!

> I just finished watching the cl-wav-synth demo tutorial, it's way cool!
>

thanks :)

> I see that this is a new project and there's not much traffic here, so I
> hope that you guys won't mind a dumb question.
>
> I'm clueless about audio, the wav file format, etc.
> However, there's a simple task that I want to try my hand at with the
> cl-wav-synth library.
>
> Here are two sound files for some Chinese words. Some words have more
> than one pronunciation (like the first file below), while most of the
> others have only one.
>
> http://209.172.124.170/pub/two_tone.wav
> http://209.172.124.170/pub/single_tone.wav
>
> Is it possible to programmatically detect whether there's a voice uttered
> at the beginning of a wav file, then a short period of silence, and
> then another voice uttered?
>
> If this is the case, I want to split the file into two files (breaking at
> the silence). Otherwise I can just leave it alone.
>
> If this can be done, I'd greatly appreciate it if someone could briefly
> describe the procedure or point me in the right direction (a URL to
> read, etc.).
>

Here is how I would write this (load it from slime or the clim repl):

--------------------------------------------------
(in-package :wav)

(defun find-peak (sample &optional (max-level 5000) (min-level 100) (min-index 1000))
  "Find the number of peaks in a sample.
Return the tone count and their indexes in a list as two values."
  (with-slots (data) sample
    (let ((count 0)
          (find-max nil)   ; true once a loud value (a tone) has been seen
          (find-min 0)     ; length of the current run of quiet values
          (acc nil))
      (loop for value across data
            for index from 0 do
            (cond ((> (abs value) max-level)
                   (setf find-max t
                         find-min 0))
                  ((< (abs value) min-level)
                   (incf find-min)
                   ;; A tone followed by enough silence: count it and
                   ;; record the index where the silence was confirmed.
                   (when (and find-max (> find-min min-index))
                     (incf count)
                     (setf find-max nil)
                     (push index acc)))
                  (t (setf find-min 0))))
      (values count (nreverse acc)))))
--------------------------------------------------

Then in the clim REPL:

WAV> Load As Sample (pathname) single_tone.wav
WAV> (with-sample (find-peak it))
0 1
1 (17525)
WAV> Load As Sample (pathname) two_tone.wav
WAV> (with-sample (find-peak it))
0 2
1 (23303 60504)
WAV> (set-sample (mix it (delay it 4)))
WAV> (with-sample (find-peak it))
0 4
1 (23303 60504 111503 148704)

The first value is the number of tones in the file.
The second value is a list of the index of each tone.

Then you can do what you want with these values.
For example, to isolate the first tone:

WAV> (set-sample (cut-i it 0 23303))
WAV> (with-sample (write-sample "first-tone.wav" it))

To isolate the second tone:

WAV> (set-sample (cut-i it 23303 60504))

Etc...

And if you want to automate this and save one file per tone:

--------------------------------------------------
(with-sample
  (multiple-value-bind (total-count index)
      (find-peak it)
    (declare (ignore total-count))
    ;; S and E are the start and end index of each tone.
    (loop for i in index
          for s = 0 then e
          for e = i
          for count from 0 do
          (write-sample (format nil "tone-~A.wav" count)
                        (cut-i it s e)))))
--------------------------------------------------

Note: a sample is just a wav header (bits per sample...) and a big
array of data.

You can adjust the levels:
 - Max and min level are the detection levels.
 - Min index is the minimal length of the silence, in sample indexes.

> Many thanks.
>

I hope that helps.

> fungsin
>

Philippe

-- 
Philippe Brochard http://hocwp.free.fr

-=-= http://www.gnu.org/home.fr.html =-=-

From fungsin.lui at gmail.com Mon Oct 23 03:55:57 2006
From: fungsin.lui at gmail.com (Lui Fungsin)
Date: Sun, 22 Oct 2006 20:55:57 -0700
Subject: [cl-wav-synth-devel] newbie question
In-Reply-To: <871wphjo17.fsf@grigri.elcforest>
References: <3990b5930610090021pdaf149fx63f426dbd6e03499@mail.gmail.com>
 <871wphjo17.fsf@grigri.elcforest>
Message-ID: <3990b5930610222055y557b08a8k9c9f22ff4bab4a6b@mail.gmail.com>

On 10/9/06, Philippe Brochard wrote:
> Here is how I would write this (load it from slime or the clim repl):
>

Hi Philippe,

This works well for me. Thanks!

BTW, in the course of parsing the pronunciation files I have, I
enhanced the wav header parsing method a bit to skip other misc header
fields.

With this patch I'm able to read all of the 10000+ wav samples I have.

Attached is the diff.
--
fungsin
-------------- next part --------------
Index: cl-wav-synth.lisp
===================================================================
--- cl-wav-synth.lisp (revision 910)
+++ cl-wav-synth.lisp (working copy)
@@ -353,9 +353,10 @@
 (defgeneric read-header (filename header))
 
 (defmethod read-header (filename (header header))
+  "Read wav header info. See http://www.sonicspot.com/guide/wavefiles.html"
   (labels ((expected (read-str orig-str)
              (assert (string= read-str orig-str) ()
-                     "error reading header: ~S is not a wav file" filename)))
+                     "error reading header: ~S is not a wav file. Expected ~A Got ~A" filename orig-str read-str)))
     (with-slots (n-samples-per-sec n-channels n-bits-per-sample
                  n-block-align n-avg-bytes-per-sec
@@ -365,16 +366,25 @@
         (expected (read-id stream 4) "RIFF")
         (read-32 stream)
         (expected (read-id stream 4) "WAVE")
-        (expected (read-id stream 4) "fmt ")
-        (read-32 stream)
-        (read-16 stream)
-        (setf n-channels (read-16 stream))
-        (setf n-samples-per-sec (read-32 stream))
-        (setf n-avg-bytes-per-sec (read-32 stream))
-        (setf n-block-align (read-16 stream))
-        (setf n-bits-per-sample (read-16 stream))
-        (expected (read-id stream 4) "data")
-        (setf total-byte (read-32 stream)))))
+        (loop
+          (let* ((next-header (read-id stream 4))
+                 (bytes (read-32 stream)))
+            (cond ((string= next-header "fmt ")
+                   (read-16 stream) ;; compression code
+                   (setf n-channels (read-16 stream))
+                   (setf n-samples-per-sec (read-32 stream))
+                   (setf n-avg-bytes-per-sec (read-32 stream))
+                   (setf n-block-align (read-16 stream))
+                   (setf n-bits-per-sample (read-16 stream))
+                   ;; possible extra format bytes
+                   (dotimes (i (- bytes 16)) (read-byte stream)))
+                  ((string= next-header "data")
+                   (setf total-byte bytes)
+                   (return))
+                  (t
+                   ;; There're a lot of headers that we don't
+                   ;; care. For instance, bext minf elmo, etc
+                   (dotimes (i bytes) (read-byte stream)))))))))
   header)
 
 (defgeneric print-header (header &optional comment))

From hocwp at free.fr Thu Oct 26 15:24:25 2006
From: hocwp at free.fr (Philippe Brochard)
Date: Thu, 26 Oct 2006 17:24:25 +0200
Subject: [cl-wav-synth-devel] newbie question
In-Reply-To: <3990b5930610222055y557b08a8k9c9f22ff4bab4a6b@mail.gmail.com>
 (Lui Fungsin's message of "Sun, 22 Oct 2006 20:55:57 -0700")
References: <3990b5930610090021pdaf149fx63f426dbd6e03499@mail.gmail.com>
 <871wphjo17.fsf@grigri.elcforest>
 <3990b5930610222055y557b08a8k9c9f22ff4bab4a6b@mail.gmail.com>
Message-ID: <87d58fyuae.fsf@grigri.elcforest>

Lui Fungsin writes:

> On 10/9/06, Philippe Brochard wrote:
>> Here is how I would write this (load it from slime or the clim repl):
>>
>
> Hi Philippe,
>
> This works well for me. Thanks!
>

Ok, cool :)

> BTW, in the course of parsing the pronunciation files I have, I
> enhanced the wav header parsing method a bit to skip other misc header
> fields.
>
> With this patch I'm able to read all of the 10000+ wav samples I have.
>
> Attached is the diff.
>

Thanks a lot, this is now in CVS and in the current release.

Philippe

-- 
Philippe Brochard http://hocwp.free.fr

-=-= http://www.gnu.org/home.fr.html =-=-
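
A note for readers of the patch in this thread: a RIFF/WAVE file is a
sequence of chunks, each made of a 4-byte id, a 32-bit little-endian size,
and that many payload bytes, which is why unknown chunks can be skipped by
size alone. The sketch below is a minimal, self-contained illustration of
that walk over an in-memory byte vector. It is not cl-wav-synth code:
read-u32-le, chunk-id and list-riff-chunks are hypothetical names chosen
for this example. It also skips the pad byte that the RIFF format places
after an odd-sized chunk, a case the patch as posted does not appear to
handle.

```lisp
;; Sketch only: these helpers are hypothetical and not part of cl-wav-synth.

(defun read-u32-le (bytes pos)
  "Decode a little-endian unsigned 32-bit integer from BYTES starting at POS."
  (logior (aref bytes pos)
          (ash (aref bytes (+ pos 1)) 8)
          (ash (aref bytes (+ pos 2)) 16)
          (ash (aref bytes (+ pos 3)) 24)))

(defun chunk-id (bytes pos)
  "Return the 4-character chunk id stored in BYTES at POS."
  (map 'string #'code-char (subseq bytes pos (+ pos 4))))

(defun list-riff-chunks (bytes)
  "Return a list of (id offset size) entries, one per chunk found in the
RIFF/WAVE image BYTES. OFFSET is the index of the first payload byte."
  (assert (string= (chunk-id bytes 0) "RIFF"))
  (assert (string= (chunk-id bytes 8) "WAVE"))
  (let ((pos 12)                 ; first chunk starts after RIFF, size, WAVE
        (end (length bytes))
        (chunks '()))
    (loop while (<= (+ pos 8) end)
          do (let ((id (chunk-id bytes pos))
                   (size (read-u32-le bytes (+ pos 4))))
               (push (list id (+ pos 8) size) chunks)
               ;; Skip the chunk header and payload, plus one pad byte
               ;; when the payload size is odd.
               (setf pos (+ pos 8 size (mod size 2)))))
    (nreverse chunks)))
```

Called on the bytes of a file that contains "fmt ", "bext" and "data"
chunks, list-riff-chunks should return one entry per chunk in file order;
a reader can then pick out "fmt " and "data" and ignore the rest.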