# IRIS: An Integrated, Scalable Focal Plane Architecture

William H. Robinson, D. Scott Wills, Martin Brooke, Nan Jokerst

Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0250

## Abstract

Wearable video processing systems integrate low-cost silicon detectors and analog interface circuitry with massively parallel digital processing on a single chip. While this "imaging system on a chip" can enable many significant new systems, many architectural and technological challenges must be addressed. This paper presents the IRIS architecture, in which a detector, analog interface circuitry, and massively parallel digital processing are integrated into a cell that can be tiled into a monolithic array. This pixel-level integration offers significant performance, efficiency, and cost advantages over multichip and non-interleaved approaches. The IRIS architecture and an example system are described.

## 1. Introduction

The increased demand for integrated, portable communication systems is illustrated by the use of alphanumeric pagers for text and cellular phones for voice. The natural extension of these systems is the inclusion of portable image processing. Applications such as "wearable" video and teleconferencing require intensive computational capability (>100 Gops/sec). In addition, portability requires a compact, low-power processing architecture.

Advances in semiconductor device technology enable the development of these systems. Decreasing feature size will lead to billion-transistor chips in the foreseeable future, powering massively parallel image processing architectures. Development of Active Pixel Sensor (APS) technology offers an alternative to traditional image acquisition using Charge Coupled Devices <sup>[1]</sup>.

Several strategies are being pursued to bring image processing to the focal plane. One approach integrates the silicon detectors and analog-to-digital conversion circuitry.
A companion digital chip is required for image processing. The Photobit digital camera-on-a-chip implements this strategy <sup>[2]</sup>. A second approach emphasizes the processing architecture, while image data is mapped into system memory. Examples include digital signal processing (DSP) chips, such as the TI 320C80 <sup>[3]</sup>, and massively parallel Single Instruction Multiple Data (SIMD) arrays, such as the Micro-Grain Array Processor (MGAP) <sup>[4]</sup> and the ABACUS <sup>[5]</sup>. A third approach places the image processing on the same chip as the image sensor array. Processing elements capable of data acquisition are integrated into a processing array through an interconnection network. The Near-Sensor Image Processor <sup>[6]</sup> and the Neighborhood Combinatorial Processing (NCP) Retina <sup>[7]</sup> use this method.

## 2. IRIS Focal Plane Architecture

Processing an image in the focal plane moves the computation closer to the data acquisition. This retains the spatial locality of pixel data. The interconnect wire length of the system is reduced, which decreases the latency within the processing. These ideas form the basis for the development of an Intelligent aRray for an Integrated Scalable (IRIS) focal plane processor at Georgia Tech.

The IRIS architecture integrates the functionality of pixel detection, analog-to-digital conversion, and digital image processing into a processing element. The processing elements are tiled into an array interconnected with nearest-neighbor communication. This system will support image processing applications such as convolution, morphological edge detection, and image segmentation.

The structure of the IRIS processing element is shown below. The silicon detector samples the light intensity. This analog signal is routed to a Sigma-Delta ($\Sigma$-$\Delta$) A/D converter. The analog component of the $\Sigma$-$\Delta$ converter produces a serial bit stream.
This bit stream must then be applied to a digital filter to produce the binary value. As each bit is produced, it is used to selectively add a globally broadcast filter weight in a 16-bit accumulator. This accumulator acts as a register in the Digital Image Processing (DIP) unit, a SIMD processing element capable of performing early image processing operations.

The DIP unit, shown below, incorporates a 16-bit multiply-accumulate-shift unit, a register file, and systolic communication and I/O registers for local north-east-west-south (NEWS) communication and transfer of results off chip.

The IRIS processing array incorporates a scalable array of these processing elements for the required system image resolution. The optical input is converted to a digital representation. The pixel values are then processed locally in the DIP unit. Efficiency is gained from the retention of pixel locality within the cell array. Also, long wires carrying analog signals are eliminated, reducing potential noise and signal cross-coupling problems. The global instruction broadcast also employs systolic registers to reduce power and increase clock frequency. The at-pixel processing of image data also permits early compression, reducing the across-chip and off-chip communication requirements.

System parameters targeted for a typical system are shown below.

| Parameter              | Target value |
|------------------------|--------------|
| Target VLSI Technology | 0.1 μm       |
| Target Clock Rate      | 500 MHz      |
| Cell Size              | 2,500 μm²    |
| Cell Transistor Count  | 50,000       |
| Cell Array Size        | 512 x 512    |
| Processing Performance | 130 Tops/sec |

## 3. Implementation Design Issues

The concept of the IRIS processing architecture requires the resolution of several design issues. The constraints of size, power, and instruction set capability must be resolved. These issues form a triangular trade-off relationship. The fill factor of the image sensor array will constrain the size of the remaining circuitry.
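
As a rough illustration of the size constraint, the sketch below works through the area and throughput budget implied by the target parameters listed in Section 2. It is not part of the IRIS design: the function names are hypothetical, the cell is assumed to be square, and both the fill factor and the rate of one operation per cell per clock cycle are assumptions, since the paper does not state either value.

```python
import math

# Target parameters quoted in the Section 2 table.
CELL_AREA_UM2 = 2_500          # cell size, square micrometres
CELL_TRANSISTORS = 50_000      # transistors per cell
ARRAY_ROWS, ARRAY_COLS = 512, 512
CLOCK_HZ = 500e6               # 500 MHz


def cell_budget(fill_factor):
    """Split one cell's area between the detector and the other circuitry.

    fill_factor is the fraction of the cell covered by the detector.
    The paper gives no value, so whatever is passed in is an assumption.
    """
    if not 0.0 <= fill_factor < 1.0:
        raise ValueError("fill_factor must be in [0, 1)")
    circuit_area = CELL_AREA_UM2 * (1.0 - fill_factor)
    return {
        "cell_pitch_um": math.sqrt(CELL_AREA_UM2),  # assumes a square cell
        "detector_area_um2": CELL_AREA_UM2 * fill_factor,
        "circuit_area_um2": circuit_area,
        # Density needed if all transistors sit outside the detector area.
        "transistors_per_um2": CELL_TRANSISTORS / circuit_area,
    }


def array_totals(ops_per_cell_per_cycle=1):
    """Whole-array figures; the ops-per-cycle rate is an assumption."""
    cells = ARRAY_ROWS * ARRAY_COLS
    return {
        "cells": cells,
        "array_area_mm2": cells * CELL_AREA_UM2 / 1e6,
        "peak_tops": cells * CLOCK_HZ * ops_per_cell_per_cycle / 1e12,
    }


if __name__ == "__main__":
    print(cell_budget(fill_factor=0.25))  # 0.25 is an illustrative value only
    print(array_totals())
```

Under these assumptions, 512 x 512 cells clocked at 500 MHz give a peak of about 131 Tops/sec, which is consistent with the 130 Tops/sec target, and the array occupies roughly 655 mm². Raising the fill factor shrinks the area left for the converter and the DIP unit, so the required transistor density rises accordingly.
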
Efficient designs of the $\Sigma$-$\Delta$ converter and the image processing circuitry will be required. Also, each processing element will continuously consume power while performing computations. The instruction set of the IRIS processing array will influence the required circuitry in each processing element. In addition, the bit precision of the $\Sigma$-$\Delta$ converter will determine the accuracy of each processing element.

The SIA roadmap projections will be used to provide information on future system implementation and performance. Work is currently underway to simulate target image processing applications on the IRIS architecture. Results from this work will help refine the final system definition.

## 4. Conclusions

Image processing applications have transcended traditional imaging systems and entered the realm of low-cost, portable systems. This new class of applications has motivated the need for imaging systems with 100 to 1000 times the computational performance available in current portable systems. The IRIS architecture offers a compact, efficient, monolithic imaging system with the on-chip processing required for real-time image applications. The co-location of the detector, analog interface circuitry, and digital processing will provide improvements in system power, clock frequency, and I/O bandwidth requirements for emerging systems. The IRIS processor will open a valuable area for future research in focal plane processing.

## 5. References

[1] E. R. Fossum, "Active pixel sensors: Are CCD's Dinosaurs?" in *Proc. SPIE*, Charge-Coupled Devices and Solid-State Optical Sensors III, Feb. 1993, vol. 1900, pp. 2-14.

[2] E. R. Fossum, "Digital Camera System on a Chip," in *IEEE Micro*, May-June 1998, pp. 8-15.

[3] C. P. Feigel, "TI Introduces Four-Processor DSP Chip," *Microprocessor Report*, March 28, 1994, pp. 22-25.

[4] K. Acken and E. Gayles, "The MGAP Family of Processor Arrays," *IEEE Great Lakes Symposium on VLSI*, 1997, pp. 105-110.

[5] M. Bolotski, R. Amrithrajah, and W. Chen, "ABACUS: A High-Performance Architecture for Vision," *International Conference on Pattern Recognition*, 1994.

[6] J. Eklund, C. Svensson, and A. Åström, "VLSI Implementation of a Focal Plane Image Processor – A Realization of the Near-Sensor Image Processing Concept," in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 4, no. 3, Sept. 1996, pp. 322-335.

[7] T. M. Bernard, B. Y. Zavidovique, and F. J. Devos, "A Programmable Artificial Retina," in *IEEE Journal of Solid-State Circuits*, vol. 28, no. 7, July 1993, pp. 789-798.