Publications

 

  •  * S. Khanal, H.F. Silverman, and R. Shakya, “A Free-Source Method (FrSM) for Calibrating a Large-Aperture Microphone Array,” [pdf]

  •  * D. Ayellon, A. Levi, and H. F. Silverman, “Real-time phase-isolation algorithm for speech separation,” accepted for EUSIPCO 2011, Aug-Sept 2011, Barcelona, Spain. [pdf]

  •  * B. Reggiannini and H. F. Silverman, “Real-Time Labeling of Talker Identity Using Short Speech Segments in a Microphone-Array Environment,” submitted (IEEE Transactions on Audio, Speech, and Language Processing), April 2011 [pdf]

  •  * B. Reggiannini and H. F. Silverman, “A Talker-Similarity Function Based on Fundamental Frequency for Use in Real-Time Talker Labeling of Microphone-Array Data,” submitted (The Journal of the Acoustical Society of America), Sept. 2010 [pdf]

  •  * A. Levi and H. F. Silverman, “A Robust Method to Extract Talker Azimuth Orientation Using a Large Aperture Microphone Array,” in IEEE Transactions on Speech and Audio Processing, vol. 18, no. 2, Feb. 2010, pp. 277 – 285 [pdf]

  •  * H. Do and H. F. Silverman, “Stochastic Particle Filtering: A fast SRP-PHAT single source localization algorithm,” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2009 (WASPAA 2009), New Paltz, New York, Oct. 2009, pp. 213 – 216 [pdf]

    •  * A. Levi and H. F. Silverman, “A new algorithm for the estimation of talker azimuthal orientation using a large aperture microphone array,” in Proceedings of the IEEE International Conference on Multimedia and Expo 2008 (ICME 2008), Germany, June 2008, pp. 565 – 568 [pdf]

      • * M. Gillette and H. F. Silverman, “A Linear Closed-Form Algorithm for Source Localization From Time-Differences of Arrival,” IEEE Signal Processing Letters, vol. 15, pp. 1 – 4, 2008. [pdf]

      • *   H. Do and H. F. Silverman, “A method for locating multiple sources using a frame of a large-aperture microphone array data without tracking,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2008 (ICASSP 2008), Las Vegas, NV, Apr. 2008, pp. 301 - 304 [pdf]

      •  *  H. Do and H. F. Silverman, “A fast microphone array SRP-PHAT source location implementation using coarse-to-fine region contraction (CFRC),” in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2007 (WASPAA 2007), New Paltz, NY, Oct. 2007, pp. 295 – 298. [pdf]

        • *  H. Do, H. F. Silverman, and Y. Yu, “A real-time SRP-PHAT source location implementation using stochastic region contraction (SRC) on a large-aperture microphone array,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2007 (ICASSP 2007), Honolulu, HI, Apr. 2007, vol. 1, pp. 121 – 124. [pdf]

        • * H. F. Silverman, Y. Yu, J. M. Sachar, and W. R. Patterson III, “Performance of real-time source-location estimators for a large-aperture microphone array,” IEEE Trans. Speech, Audio Process., vol. 13, no. 4, pp. 593 – 606, July 2005. [pdf]

        • * J. M. Sachar and H. F. Silverman, “The Time-Delay Graph and the Delayogram - New Visualizations for Time-Delay,” IEEE Signal Processing Letters, vol. 12, no. 4, pp. 301 – 304, April 2005. [pdf]

        • * J. M. Sachar, H. F. Silverman, and W. R. Patterson III, “Microphone Position and Gain Calibration for a Large-Aperture Microphone Array,” IEEE Transactions on Speech and Audio Processing, January 2005, vol. 13, no. 1, pp. 42 – 52. [pdf]

        • * Y. Yu and H. F. Silverman, “An improved TDOA-based location estimation algorithm for large-aperture microphone arrays,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2004 (ICASSP 2004), Montreal, Quebec, Canada, May 2004, vol. 4, pp. 77 – 80. [pdf]

        • * H. F. Silverman, W. R. Patterson III, and J. M. Sachar, “Factors Affecting The Performance of Large-aperture Microphone Arrays,” Acoustical Society of America, Mar., 2002. [pdf]

        • * D. Sturim, H. F. Silverman, and S. Levinson, “Early Results for a Non-Parametric Hidden Markov Model for Talker Characterization,” Computer Speech and Language, 2001. [pdf]

        • * H. F. Silverman, W. R. Patterson III, and J. M. Sachar, “First Measurements of a Large-aperture Microphone Array System for Remote Audio Acquisition,” in Proceedings of IEEE International Conference on Multimedia and Expo 2000, New York, NY, July/Aug., 2000. [pdf]

        • * H. F. Silverman, W. R. Patterson III, and J. M. Sachar, “Early Experimental Results for A Large-Aperture Microphone Array System,” in Proceedings of SAM 2000, Boston, MA, Mar., 1999. [pdf]

        • * M. S. Brandstein and H. F. Silverman, “A Robust Method for Speech Signal Time-Delay Estimation in Reverberant Rooms,” in Proceedings of ICASSP 1997, Munich, Germany, Apr., 1997, vol. 1, pp. 375 – 378. [pdf]

        • * D. E. Sturim, M. S. Brandstein, and H. F. Silverman, “Tracking Multiple Talkers Using Microphone-Array Measurements,” in Proceedings of ICASSP 1997, Munich, Germany, Apr., 1997, vol. 1, pp. 371 – 374. [pdf]

        • * M. S. Brandstein, J. E. Adcock, and H. F. Silverman, “A Closed-Form Location Estimator for use with Room Environment Microphone Arrays,” IEEE Transactions on Speech and Audio Processing, Jan., 1997, vol. 5, pp. 45 – 50. [pdf]

        • * M. S. Brandstein, J. E. Adcock, and H. F. Silverman, “Microphone Array Localization Error Estimation with Application to Sensor Placement,” Journal of the Acoustical Society of America, June 1996, vol. 99(6), pp. 3807 – 3816. [pdf]

        • * M. S. Brandstein, J. E. Adcock, and H. F. Silverman, “A Localization-Error Based Method for Microphone-Array Design,” in Proceedings of ICASSP 1996, Atlanta, GA, May 1996, vol. 2, pp. 901 – 904. [pdf]

        • * J. E. Adcock, Y. Gotoh, D. J. Mashao, and H. F. Silverman, “Microphone-Array Speech Recognition via Incremental MAP Training,” in Proceedings of ICASSP 1996, Atlanta, GA, May 1996, vol. 2, pp. 897 – 900. [pdf]

        • * M. S. Brandstein, J. E. Adcock, and H. F. Silverman, “A Practical Time-Delay Estimator for Localizing Speech Sources with a Microphone Array,” Computer Speech and Language, Sept. 1995, vol. 9, pp. 153 – 269. [pdf]

        • * M. S. Brandstein, J. E. Adcock, J. H. DiBiase, and H. F. Silverman, “A Closed-Form Method for Finding Source Locations from Microphone-Array Time-Delay Estimates,” in Proceedings of ICASSP 1995, Detroit, Michigan, May 1995, vol. 5, pp. 3019 – 3022. [pdf]

        • * P. C. Meuse and H. F. Silverman, “Characterization of Talker Radiation Pattern Using a Microphone-Array,” in Proceedings of ICASSP 1994, Adelaide, Australia, May 1994, vol. ii, pp. 257 – 260. [pdf]

        • This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. Please visit the IEEE copyright page for more information.

       

       

       

      Theses

                                                                                  

        PhD theses

      • 1. Brian Reggiannini. Real-Time Labeling of Talker Identity for Conversational Speech in a Microphone-Array Environment. Brown University, Providence, RI, May 2012 [pdf]

        • 2. Ying Yu. Distant Talker Labelling for a Conferencing Situation Using Data from a Large Aperture Microphone Array in a Noisy Reverberant Environment. Brown University, Providence, RI, May 2007 [pdf]

          • 3. John E. Adcock. Optimal Filtering and Speech Recognition With Microphone Arrays. Brown University, Providence, RI, May 2001 [pdf] or [Online file]

          • 4. Joseph H. DiBiase. A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays. Brown University, Providence, RI, May 2001 [pdf]

          • 5. Paul C. Meuse. The Radiation Pattern of Speech Production: Implications for Microphone Array Systems. Brown University, Providence, RI, May 2000 [Postscript]

          • 6. Michael S. Brandstein. A Framework for Speech Source Localization Using Sensor Arrays. Brown University, Providence, RI, May 1995 [pdf]


             

       

       

       

      Data and tools

                                                                                  

        Data

      New! We have just made a 120-second-long recording of 10 talkers using 181 microphones (plus 10 close-talking channels) in a real reverberant room. Please visit this page for more information and to download the data.

       

      • 1. Locations of 449 microphones of the HMA system [Matlab MAT file]

        • 2.  Microphone list for 8 locators (0-7) of the HMA system. Each locator has [8 x 3] mics [Matlab MAT file]

        • 3. Recording of 5 real talkers facing the 24 mics of locator 3 [Wav file]. Locations of the 5 talkers and of the 24 microphones [Text file]. Movie of the SRP-PHAT 3D plot (a slice approximately through the average height of the talkers) with the clean-speech recording embedded [Windows Media Player AVI file]

          • More data available [here]

        Tools

            • 1. Brown Acoustic Simulator by Avram Levi (please read readme.txt before using). [Zip file]

            • 2. Spectrogram plot by H. F. Silverman [Matlab M. file]

            • 3. Single-source localization algorithm using SRP-PHAT with SRC [Matlab M. file]

            • 4. HMAVIEW: A fast and powerful program by Ken Silverman to view the time signal/spectrogram/energy of a multi-channel signal in real time
              UPDATE (9/3/2010): A new feature lets users hand-label the speech segments of the signal. Hit the "Insert" key to mark the beginning of a speech segment, then hit "Insert" again to mark its end. Right-click the small rectangular box of the right mark (the end of the speech segment) to flip it to the inside of the segment; the boxes of both the left and right marks must lie inside the segment bounded by the two marks. To delete a mark, move the mouse cursor over it and hit "Delete". When you are done labeling, hit "Escape" and the program writes an output file with the extension "MRK". The file has 3 comma-separated columns: the first is the sample number, the second is the channel number, and the third (a binary value of 0 or 1) denotes the "on" (left mark) and "off" (right mark) of the speech segment, respectively. [Application]
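For readers who want to post-process the hand labels, the three-column MRK layout described above could be read along the following lines. This is only a sketch, not part of HMAVIEW: the names `Mark`, `parse_mrk`, and `speech_segments` are hypothetical, and it assumes the file has no header row and holds only integers. Which flag value (0 or 1) marks the "on" mark is left as an explicit `on_flag` argument and should be checked against a real MRK file.

```python
from collections import defaultdict
from typing import NamedTuple


class Mark(NamedTuple):
    sample: int   # column 1: sample number
    channel: int  # column 2: channel number
    flag: int     # column 3: binary on/off flag (0 or 1)


def parse_mrk(text):
    """Parse comma-separated MRK text into Mark records, skipping blank lines."""
    marks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        sample, channel, flag = (int(field) for field in line.split(","))
        marks.append(Mark(sample, channel, flag))
    return marks


def speech_segments(marks, on_flag):
    """Pair each "on" mark with the next "off" mark on the same channel.

    Returns {channel: [(start_sample, end_sample), ...]}.
    `on_flag` is the flag value assumed to denote the left ("on") mark.
    """
    open_start = {}
    segments = defaultdict(list)
    for mark in sorted(marks, key=lambda m: (m.channel, m.sample)):
        if mark.flag == on_flag:
            open_start[mark.channel] = mark.sample
        elif mark.channel in open_start:
            start = open_start.pop(mark.channel)
            segments[mark.channel].append((start, mark.sample))
    return dict(segments)


# Made-up example content, assuming here that 1 means "on" and 0 means "off".
example = "100,0,1\n2500,0,0\n300,1,1\n900,1,0\n"
print(speech_segments(parse_mrk(example), on_flag=1))
```

Keeping the on/off convention as a parameter avoids baking in a guess about the flag polarity; an "off" mark with no preceding "on" mark on its channel is simply ignored.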

             

        Under Construction

                                                                                              
