Sep. 26, Noon, BH 751
Bruce Randall Donald, Cornell Univ.
Massively-Parallel Distributed Manipulation Using
Microfabricated Actuator Arrays
Oct. 8, 4pm, BH 161
Jayant Shah, Math Dept. Northeastern Univ.
Synthesis of Curve Evolution and Variational Methods:
Image Segmentation and Extraction of Shape Skeletons
Oct. 15, 4pm, BH 161
Sibel Tari, Math Dept. Northeastern Univ.
Shape Representation and Analysis without Segmentation
Oct. 22, Noon, BH 751
Prof. Augustus K. Uht, Dept. of Electrical and Computer Engineering, URI
ILP Speedups in the 10's
Nov. 5, Noon, BH 751
Leslie M. Novak, MIT Lincoln Laboratory
Algorithms for Optimal Processing of Polarimetric
Radar Data
Nov. 11, Noon, BH 312
Yvan Leclerc, SRI International
Examples of the Simplest Description Principle in
Computer Vision
Nov. 19,
Jonathan Merril, Medical Consumer Media
Nov. 21, 2:30-4:00pm, Applied Math and Electrical Sciences
Seminar, Applied Math Building
Joseph Mundy, GE Corporate Research and Development
Advances in Object Recognition Using Concepts From Projective Geometry
Nov. 22, 11:00am, BH 312
Ray A. Bittner, Virginia Polytechnic Institute and State University
COLT: An Experiment In Wormhole Run-Time Reconfiguration
Jim Peterson, Virginia Polytechnic Institute and State University
Scheduling And Partitioning Ansi-C Programs Onto Multi-FPGA CCM Architectures
Nov. 25, noon, BH 245
Yoshi Watanabe,
Digital Equipment Corp.
Logic Decomposition During Technology Mapping:
A New Approach For Logic Synthesis
Nov. 26, 4:00pm, BH 161
Ruud Bolle, IBM Watson
Video Query -- The Problem of Content-based Video
Retrieval
Dec. 3, Noon, BH 751
David Jacobs, NEC Research Center
Parts-based Object Recognition with Simple, General Representations
Dec. 10
Lance Williams, NEC (tentative)
Abstract: Simultaneous multithreading (SMT) is a technique that permits multiple independent threads, or processes, to issue multiple instructions each cycle. While existing forms of hardware multithreading depend on fast context switches to time-share the processor, simultaneous multithreading permits multiple threads, or processes, to issue instructions to a superscalar's functional units in the same cycle. This allows SMT to significantly increase throughput in the face of both long instruction latencies and limited available parallelism per thread.
This presentation describes an architecture for SMT which achieves three goals: (1) it minimizes the architectural impact on the conventional superscalar design, (2) it has minimal performance impact on a single thread executing alone, and (3) it achieves significant throughput gains when running multiple threads. This architecture achieves a throughput of 5.4 instructions per cycle, a 2.5x improvement over an unmodified superscalar with similar hardware resources.
This speedup is enhanced by an advantage of multithreading previously unexploited in other architectures: the ability to favor for fetch and issue those threads most efficiently using the processor each cycle, thereby providing the ``best'' instructions to the processor. This talk is based on a paper of the same title which was presented at ISCA in May 1996.
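The fetch-priority idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that "most efficiently using the processor" is measured by how few of a thread's instructions are currently waiting in the front-end queues; the abstract does not specify the metric, and the names pick_fetch_threads, in_flight, and num_fetch are hypothetical.

```python
def pick_fetch_threads(in_flight, num_fetch=2):
    """Choose which threads get to fetch this cycle.

    in_flight maps a thread id to the number of its instructions currently
    waiting in the front end (an assumed efficiency metric; the abstract
    does not name one). Threads with the fewest waiting instructions are
    favored; ties are broken by the lower thread id.
    """
    ranked = sorted(in_flight, key=lambda tid: (in_flight[tid], tid))
    return ranked[:num_fetch]


if __name__ == "__main__":
    # Four hardware threads; threads 1 and 3 are draining their queues fastest.
    print(pick_fetch_threads({0: 12, 1: 3, 2: 7, 3: 3}))
```

The point of the sketch is only that the choice is re-evaluated every cycle from per-thread state, rather than following a fixed round-robin order.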
Bio: Rebecca Stamm is a Principal Hardware Engineer at Digital Semiconductor, a division of Digital Equipment Corporation in Hudson, Massachusetts. She has been involved in the microarchitecture and design of Digital microprocessors since 1983. She holds a B.S.E.E. from MIT and a B.A. in History from Swarthmore College.
Abstract: In distributed manipulation, a distributed system of programmable actuators manipulates its environment to perform a task. Examples of manipulation tasks include sorting or feeding parts, and mechanical assembly.
The manufacturing problems of parts-feeding, -orientation, and -singulation motivate our work on massively-parallel distributed manipulation. I'll discuss our progress on sensorless manipulation using SIMD arrays of microfabricated actuators. We are testing these strategies on the M-CHIP (Manipulation Chip), an array of programmable micro-motion pixels fabricated at Cornell. I'll show a prototype M-CHIP containing over 15,000 single-crystal silicon "cilia" (actuators) in one square inch. This may be a kind of record for parallelism and density. We have developed and analyzed efficient SIMD manipulation strategies and control algorithms for parts-sorting and -orienting.
The Cornell arrays are based on a modified SCREAM process (Single Crystal Reactive Etching and Metallization). I'll discuss the design, fabrication, control, and programming of such arrays of microactuators, and describe our experiments with other device technologies (both MEMS and macroscopic) for parts-orienting.
We believe such systems could result in flexible, programmable parts-feeders, or ``intelligent'' (i.e., programmable) manipulation surfaces, tiled with microactuators. I'll show a video of our prototype systems.
Biography: Bruce Donald received the SM and Ph.D. degrees in EECS from MIT, working under Tomas Lozano-Perez. He is co-founder of the Robotics and Vision Laboratory at Cornell University, where he is currently an Associate Professor of Computer Science. He received a National Science Foundation Presidential Young Investigator award in 1989. Donald has written three books and numerous scientific papers on Robotics; this year, he is on leave at Stanford University and Interval Research Corporation.
WWW: http://www.cs.cornell.edu/home/brd/
Abstract: In recent years, the method of curve evolution has developed into an important tool in Computer Vision and has been applied to a wide variety of problems, such as smoothing of shapes, determination of shape skeletons, shape recovery and the shape-from-shading problem. The first part of the talk will show that the different versions of curve evolution used for shape recovery, together with the preprocessing step of constructing an edge-strength function, can be integrated in the form of a new segmentation functional. The numerical solutions obtained by applying gradient descent to the new functional retain sharp discontinuities or "shocks", thus providing sharp demarcation of object boundaries. The second part of the talk will show that the edge-strength function associated with the new functional may be interpreted as a type of regularization of Blum's "grassfire" method of determining shape skeletons and thus provides a simple method for determining shape skeletons by linear diffusion. Consequently, shape skeletons may be calculated directly from the grayscale images without first extracting the shape boundary.
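For readers unfamiliar with the classical "grassfire" idea that the abstract refers to, here is a minimal sketch. It is the plain discrete grassfire (a breadth-first burn from the boundary of a binary shape), not the regularized linear-diffusion method of the talk; the function name grassfire and the 4-connected grid are assumptions made for illustration.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; other connectivities are possible).
N4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def grassfire(mask):
    """Return the burn time of every shape pixel in a binary mask.

    mask is a list of rows of 0/1 values. The fire is lit on every shape
    pixel that touches the background (or the grid edge) and spreads one
    pixel per time step. Background pixels keep burn time 0.
    """
    rows, cols = len(mask), len(mask[0])
    burn = [[0] * cols for _ in range(rows)]
    queue = deque()

    def is_background(r, c):
        return not (0 <= r < rows and 0 <= c < cols) or not mask[r][c]

    # Light the fire along the shape boundary.
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and any(is_background(r + dr, c + dc) for dr, dc in N4):
                burn[r][c] = 1
                queue.append((r, c))

    # Spread the fire inward, one ring per time step.
    while queue:
        r, c = queue.popleft()
        for dr, dc in N4:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and burn[nr][nc] == 0:
                burn[nr][nc] = burn[r][c] + 1
                queue.append((nr, nc))
    return burn


if __name__ == "__main__":
    square = [[1] * 5 for _ in range(5)]
    for row in grassfire(square):
        print(row)
```

Pixels where fire fronts from different parts of the boundary meet (the ridges of the burn-time map) approximate the shape skeleton. Note that this version needs the shape boundary up front, which is the requirement the abstract says the new method avoids.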
Abstract: Today's machine vision systems, which use a hierarchical (modular) approach, cannot cope with the general vision problem. With the hierarchical approach, one cannot determine the right amount of information that is relevant to the next level of abstraction, because the context determines the relevance. Designing feedback between existing modules tends to be ad hoc because the principles by which such feedback links can be designed are not clear. It is much easier to start with an integrated formulation from which one may derive a system of linked modules. In this talk, a new approach will be presented that integrates the steps into a single theoretically justified formulation while satisfying practical requirements such as computational efficiency, robustness, and ease of implementation. Our basic tool is a function called the edge-strength function. The local geometry of the edge-strength function carries information about the shape of the imaged object in terms of its parts and skeletons. The formulation involves two parameters which parameterize a two-dimensional representation space where the shape is represented at the desired level of detail. The edge-strength function is calculated from a raw image through a set of coupled linear diffusion equations; thus the method is hardware implementable. Various previous proposals for shape representation can be obtained as special cases of the formulation. For binary images, the formulation approximates the non-linear curve evolution equation. I will also discuss how the new method fits into the circle of ideas which have been presented within psychology, psychophysics, and engineering.
Abstract: In order to significantly improve processor performance, the parallelism among machine instructions must be exploited. Conditional branches are the major restrictors of this parallelism, especially since they are widespread in general-purpose programs. Until recently, the best Instruction Level Parallelism (ILP), or superscalar, methods known realized only about a factor of 2 to 3 speedup over purely sequential computers. The uncertainties of which way branches will execute are the culprits, and are called branch effects. After a brief introduction to instruction level parallelism and current Branch Effect Reduction Techniques (BERTs), our research is presented. A new BERT devised by the author, called Disjoint Eager Execution (DEE), is described and evaluated. Simulation results indicating potential order-of-magnitude speedups for one type of DEE are presented. The microarchitecture of the Levo prototype machine, embodying DEE, is also briefly described.
Abstract: M.I.T. Lincoln Laboratory is conducting a broad-based research effort, under DARPA sponsorship, to develop an understanding of the phenomenology of high-resolution polarimetric radar data and to relate this phenomenology to the performance results achieved by stationary-target detection and classification algorithms. This understanding of the underlying phenomenology is necessary for the development of target surveillance, fire control, and missile-seeker systems. The Lincoln program is designed to develop the understanding necessary to develop and quantify the performance of new polarimetric detection, classification, and radar imaging algorithms.
Fully polarimetric radar measurements offer the potential for improved performance in three areas: (1) speckle reduction in polarimetric synthetic aperture radar (POL-SAR) imaging, (2) detection of stationary ground targets, and (3) classification of ground cover and stationary ground targets. New, optimal algorithms for speckle reduction, detection, and classification will be presented.
Abstract: An image is inherently ambiguous because there are an infinite number of three-dimensional scenes that can give rise to it. Given this inherent ambiguity we can ask: What criterion do we use to choose a single scene from this set? How can we exploit this criterion to recover a single scene? One answer to the first question is to choose the simplest description (or model) of a scene that can give rise to a given image, where complexity is taken to be the length of the description of the model, plus the length of the description of the discrepancy between the given image and the image of the model. Unfortunately, minimizing complexity is generally an exponential problem, so we need to make approximations of the coding length and use approximate optimization methods in order to find minima of the coding length in reasonable time. I will present applications of the principle, and the approximations that I made, to four problems in computer vision: image segmentation, shape from shading, interpretation of line drawings, and surface reconstruction from multiple images.
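The two-part complexity measure described in the abstract (length of the model description plus length of the description of the discrepancy) can be illustrated on a toy problem. The sketch below segments a one-dimensional signal into piecewise-constant pieces; it is not the speaker's formulation. The cost of 8 bits per parameter, the fixed-variance Gaussian residual code, and all function names are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def residual_bits(values, mean, sigma=1.0):
    """Bits needed to code residuals under a Gaussian with fixed std sigma.

    Per-sample constant terms are dropped because they are identical for
    every candidate model and do not affect the comparison.
    """
    rss = sum((v - mean) ** 2 for v in values)
    return rss / (2.0 * sigma ** 2 * math.log(2))


def description_length(signal, breakpoints, bits_per_param=8.0):
    """Two-part code length: model bits plus discrepancy bits."""
    edges = [0] + sorted(breakpoints) + [len(signal)]
    num_segments = len(edges) - 1
    # Model cost: one mean per segment plus one position per breakpoint.
    model_bits = bits_per_param * (num_segments + len(breakpoints))
    data_bits = 0.0
    for start, stop in zip(edges, edges[1:]):
        segment = signal[start:stop]
        mean = sum(segment) / len(segment)
        data_bits += residual_bits(segment, mean)
    return model_bits + data_bits


def best_segmentation(signal, max_breaks=2):
    """Exhaustively search for the breakpoints with the shortest description."""
    best = None
    for k in range(max_breaks + 1):
        for bps in combinations(range(1, len(signal)), k):
            dl = description_length(signal, list(bps))
            if best is None or dl < best[0]:
                best = (dl, list(bps))
    return best


if __name__ == "__main__":
    step = [0.0] * 10 + [10.0] * 10
    print(best_segmentation(step))
```

The exhaustive search grows combinatorially with the number of allowed breakpoints, which is a small-scale instance of the cost problem the abstract mentions and the reason approximate optimization is needed on real images.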
Abstract: Over the last five years there has been considerable progress in object recognition processing based on discoveries enabled by the framework of projective geometry. Even though there was mention from time to time of projective geometry over the last thirty years, it failed to have any significant impact on the field of object recognition by computer until this decade. This talk will review the reasons for this rather late introduction of what has proved to be the central intellectual framework for computer vision. The presentation will review the key contributions of projective geometry, particularly in attacking the difficult problem of figure-ground separation. More recent emphasis on symmetry will be reviewed along with examples of current research results on symmetry from the speaker's laboratory.
Dr. Mundy has been at GE's Corporate R&D Center since 1963. He received his PhD from Rensselaer Polytechnic Institute in 1969. From 1977 until 1982 he was Manager, Visual Information Processing Program. In this capacity he led a number of large projects in the application of machine vision techniques to automatic visual inspection for quality control. A major application was the use of UV dye fluorescence and 3D range data in the inspection of castings for jet engines.
Abstract: The growth of high performance computing power to date can largely be attributed to continuing breakthroughs in materials and manufacturing. In order to increase computing capacity beyond these physical bounds, new computing paradigms must be developed that make more efficient use of existing manufacturing technologies. The concept of Wormhole Run-Time Reconfiguration (RTR) is the core of an attempt to create an improved computing paradigm for high performance computational tasks. The Colt CMOS integrated circuit has been developed as the first device to employ Wormhole RTR. By combining concepts from Field Programmable Gate Array (FPGA) technologies with Data Flow computing architectures the Colt/Stallion architecture achieves superior utilization of hardware resources while still attaining high clock speeds. Targeted mainly at DSP-type operations, the Colt chip compares favorably against similar contemporary DSP products in terms of silicon area consumed per unit computation and raw computing power. Although emphasis has been placed on signal processing applications, general purpose computation has not been neglected. Colt is a small prototype that defines an architecture not only at the chip level but also in terms of an overall system. Colt's future cousin, Stallion, will be a larger more powerful chip that will build on the principles exhibited in the prototype. As this system is realized, the concept of Wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
Abstract: The increasing size and speed of modern FPGAs allow complex computations, on the order of an average sized program, to be performed in a small collection of processing elements. It is well documented that the execution of large sections of a program within the ``virtual hardware'' offered by an attached FPGA processor can provide substantial speedup over the ordinary execution within a sequential, general-purpose processor. Unfortunately, the development tools currently available for FPGAs do not allow for easily configuring multi-FPGA custom computing machines. Configuration of an FPGA architecture requires scheduling: the mapping of computations onto existing functional units. To take advantage of all available logic, computations may span processing elements, calling for partitioning of a subroutine between one or more FPGAs. In this paper, an architecture-independent design tool is presented for translating programs written in C to a dataflow representation and then efficiently scheduling and partitioning the resulting graphs onto multi-FPGA computing platforms.
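Scheduling in the sense used in the abstract, mapping the computations of a dataflow graph onto existing functional units, can be sketched with a simple list scheduler. This is a generic textbook technique under simplifying assumptions (every operation takes one cycle, all functional units are identical, no partitioning across FPGAs); it is not the tool presented in the paper, and the names list_schedule, deps, and num_units are hypothetical.

```python
def list_schedule(deps, num_units):
    """Assign each operation of a dataflow graph a (cycle, unit) slot.

    deps maps an operation name to the list of operations whose results it
    consumes; every operation must appear as a key. Assumes unit latency and
    num_units identical functional units.
    """
    remaining = {op: set(d) for op, d in deps.items()}
    users = {op: [] for op in deps}
    for op, d in deps.items():
        for producer in d:
            users[producer].append(op)

    ready = sorted(op for op, d in remaining.items() if not d)
    schedule = {}
    cycle = 0
    while ready:
        # Issue as many ready operations as there are functional units.
        issued, ready = ready[:num_units], ready[num_units:]
        for unit, op in enumerate(issued):
            schedule[op] = (cycle, unit)
        # Results become visible next cycle; wake up dependent operations.
        for op in issued:
            for user in users[op]:
                remaining[user].discard(op)
                if not remaining[user]:
                    ready.append(user)
        cycle += 1

    if len(schedule) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return schedule


if __name__ == "__main__":
    # Dataflow graph for (a + b) * (c - d)
    graph = {"add": [], "sub": [], "mul": ["add", "sub"]}
    print(list_schedule(graph, num_units=2))
```

With two units the addition and subtraction run in the same cycle and the multiply follows; with one unit the same graph takes three cycles, which shows how the number of available functional units shapes the schedule.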
Abstract: Logic synthesis is a process for generating a circuit which implements a given function. Two key steps in this process are: (1) manipulation of logic expressions, where initially provided logic expressions are simplified, and (2) technology mapping, where expressions are mapped onto a set of gates specified in a technology library to generate the final circuit.
In the most common approach, these two steps are performed separately. Although this approach is popular, it has a drawback: even though the quality of the final circuit depends significantly on the expressions used in technology mapping, the expressions are generated with little information about how they will actually be implemented.
This drawback is critical especially for logic design with tight and complicated constraints. To resolve this problem, we developed a novel approach which applies technology mapping while modifying logic expressions. This approach is used for our commercial design projects, and its effectiveness has been demonstrated.
In this talk, we present the basic ideas of the new approach in contrast to conventional techniques. We show that the new approach can consider an exponentially larger solution space than the conventional approach, while the run time is typically within the same order of magnitude.
This talk requires no background in logic synthesis.
About Yoshi Watanabe: Yoshi Watanabe received the Ph.D. degree in electrical engineering and computer sciences from the University of California at Berkeley in 1994, where he worked on logic synthesis of combinational and sequential circuits for synthesis systems SIS and HSIS. In 1994, he joined Digital Equipment Corporation, where he has been engaged in design automation of high-performance microprocessors.
Abstract: Digital video databases are becoming more and more pervasive, and finding video of interest in large databases is rapidly becoming a problem. Because of the nature of video (streamed objects), accessing the content of such databases is inherently a time-consuming operation. Intelligent means of quick content-based video retrieval and content-based rapid video viewing are, therefore, an important topic of research.
Video is a rich source of data. It contains visual and audio information, and in many cases, there is text associated with the video. Content-based video retrieval should use all this information in an efficient and effective way. From a human perspective, a video query can be viewed as an iterated sequence of navigating, searching, browsing, and viewing. Video inquiries are iterated in the sense that the user can interactively refine the search based on intermediate results.
I will address all of these search phases: navigating, searching, browsing, and viewing. I will discuss where search on the various information sources -- text, audio, video, image search -- is important and how the information sources can be effectively used to retrieve video.
This work is done in the context of a large NIST/ATP award involving a joint venture of a number of companies. This project is on the HDTV studio of the future. I will describe the NIST/ATP program, this particular project, and the role of the different companies.
Abstract: A central problem in recognition is to find representations that are rich enough to capture the shape of complex objects and that have useful analogs that we can compute reliably from images. Existing approaches only partially address these problems. For example, local geometric features (e.g., corners, lines) may fail to capture the shape of non-polyhedral objects. Representations based on algebraic descriptions of object parts (e.g., generalized cylinders, superquadrics) may be difficult to compute reliably in images, and it may be hard to relate a 3-D model to its 2-D appearance.
Our work uses a simple direct representation of object parts and image regions as sets of points. This representation can clearly be applied to a wide class of objects; what is surprising is that we can use such a simple representation to recognize objects. We focus on the central problem of determining the position in an image of a known object. Our approach assumes that we have matched regions of an image to parts of an object model, without forming any explicit correspondences between local geometric features or portions of contours. We consider planar objects and 3-D objects with three types of occlusions: self-occlusion, occlusions whose locus is identified in the image, and completely arbitrary occlusions. We derive efficient algorithms for pose determination in all cases except 3-D objects with arbitrary occlusion. For that case, we prove that the problem of finding valid poses is computationally hard, but provide an efficient, approximate algorithm.
Joint work with: Ronen Basri