# Stacked Silicon CMOS Circuits with a 40-Mb/s **Through-Silicon Optical Interconnect**

Olivier Vendier, Student Member, IEEE, Steven W. Bond, Student Member, IEEE, Myunghee Lee, Sungyung Jung, Student Member, IEEE, Martin Brooke, Member, IEEE, Nan Marie Jokerst, Member, IEEE, and Richard P. Leavitt

Abstract— Optical interconnection through stacked silicon foundry complementary metal-oxide-semiconductor (CMOS) circuitry has been demonstrated at a data rate of over 40 Mb/s with an open eye diagram. The system consists of a 0.8- $\mu$ m transmitter and receiver realized in foundry digital CMOS. The use of digital CMOS enables on-chip integration with more complex digital systems, such as a microprocessor. Two layers of these circuits were integrated with thin-film InP-based light emitting diodes and metal-semiconductor-metal photodetectors operating at 1.3  $\mu$ m (to which the silicon is transparent) to enable vertical optical through-Si communication between the stacked silicon circuits.

Index Terms-Interconnections, integrated optoelectronics, optical, 3-D systems.

## I. INTRODUCTION

INTEGRATED circuit electrical interconnection density is increasing, and poses a communications bottleneck, which will worsen with time. One interconnection solution is multichip modules (MCM's); however, the electrical interconnections can suffer from high crosstalk, high latencies, and impedance mismatch [1]. An alternative interconnection solution utilizes vertical optical interconnections between layers of standard foundry silicon circuits to realize three-dimensional (3-D) structures. Optical interconnections have been previously analyzed as an interconnection solution, and address some of these electrical interconnect issues [2], [3].

Manuscript received June 27, 1997; revised December 22, 1997.

O. Vendier, S. W. Bond, M. Lee, S. Jung, M. Brooke, and N. M. Jokerst are with the School of Electrical and Computer Engineering, Microelectronics Research Center, Georgia Institute of Technology, Atlanta, GA 30332-0250 USA.

R. P. Leavitt is with the Army Research Laboratories, AMSRL-EP-EE, Adelphi, MD 20783-1197 USA.

Publisher Item Identifier S 1041-1135(98)02446-X.

<sup>1</sup> http://www.chips.ibm.com/products/interconnect/documents/sc/mlc/ht, Dec. 12, 1997.

Some computational challenges, such as real-time high-resolution medical imaging, real-time photorealistic virtual reality rendering, and automatic target recognition and tracking for automated vehicles, will require compact, low-cost, massively parallel processing systems [4]. Current and future electronic interconnects fall short of meeting the needs of these parallel systems. For example, a 3-D mesh computational structure with $32 \times 32 \times 4$ (4096) nodes could compute a high-resolution medical scan in seconds but would require a thin-film technology MCM substrate approximately $2 \text{ ft} \times 2 \text{ ft}$ in size. With advanced dielectrics and metallization, this substrate yields a 2-ns (4 ns with ceramic dielectric) maximum latency in processor-processor communication,<sup>1</sup> which is a major limitation to computational throughput. Because of the inherently nonscalable nature of two-dimensional (2-D) electrical interconnects, this latency increases to 4 ns (8 ns for a ceramic substrate) for a $64 \times 64 \times 8$ processor array (32768 nodes), and the electrical system size expands to nearly 6 ft $\times$ 6 ft! Backplanes could be used to reduce the size of the electrical system; however, for the $64 \times 64$ processor module, this would require a 27.7-Tb/s aggregate data transfer rate (nearly 30000 links operating at 1 Gb/s) for modest 300 MIP's processors [4].
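The scaling figures quoted for the electrical mesh can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that substrate area grows in proportion to node count (an assumption that is not stated in the text), and the variable names are hypothetical.

```python
import math

def mesh_nodes(nx, ny, nz):
    """Number of processing nodes in an nx x ny x nz mesh."""
    return nx * ny * nz

small = mesh_nodes(32, 32, 4)   # 4096 nodes
large = mesh_nodes(64, 64, 8)   # 32768 nodes

# Assumption: planar MCM area scales linearly with node count,
# so the side length scales with the square root of the node ratio.
side_small_ft = 2.0
side_large_ft = side_small_ft * math.sqrt(large / small)

# Backplane alternative: links needed at 1 Gb/s each for 27.7 Tb/s aggregate.
aggregate_bps = 27.7e12
link_bps = 1e9
links_needed = aggregate_bps / link_bps

print(f"{small} nodes -> {large} nodes")
print(f"substrate side: {side_small_ft:.1f} ft -> {side_large_ft:.2f} ft")
print(f"1-Gb/s links needed: {links_needed:.0f}")
```

Under that area assumption, the larger array needs a substrate of roughly 5.7 ft on a side, in line with the "nearly 6 ft" figure, and the backplane needs about 27700 links.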

Work on 3-D optically interconnected systems is progressing. Wafer bonding and flip-chip techniques have been used toward a demonstration of 3-D memory architectures with an external light source [5]. For MCM stacking, InGaAs-GaAs vertical-cavity surface-emitting laser (VCSEL) arrays have been coupled with InGaAsP-InP photoreceiver arrays and microlenses into a 3-D module with laser-drilled holes for the interconnection path [6]. In these cases, preliminary results on the system were reported without performance data. Vertical optical communication through stacked Si foundry complementary metal-oxide-semiconductor (CMOS) circuits with a sine wave at speeds in the kilohertz range has also been previously demonstrated [7].

Fig. 1. (a) Side view of the two-layer through-silicon circuit demonstrator. (b) Board level test circuit schematic.

In this letter, we report on 3-D stacked foundry silicon CMOS circuits, with the through-Si vertical optical link operating at data rates up to 40 Mb/s. The system described herein includes driver and receiver circuits designed in digital silicon CMOS technology. Both the input (on the driver side) and the output (after the on-chip comparator) are compatible with digital input/output levels. Thin-film light-emitting diodes (LED's) and metal-semiconductor-metal (MSM) photodetectors fabricated in the InP-based materials system are bonded to their respective circuits. The operating wavelength of these optoelectronic components is in the 1.3-1.6-$\mu$m range, where silicon is transparent. The result is a very compact module (40 mm<sup>3</sup>), consisting of two layers of integrated circuits optically interconnected through their on-chip optoelectronic interface. This optical data link can be used to implement a true 3-D computational mesh, without scalability problems or the need for ultrafast backplanes. The simplicity and scalability of this technology may ultimately yield low-cost massively parallel computational systems. For example, a $32 \times 32 \times 4$ processor system using the simple LED link demonstrated in this paper would yield a 12-in square module. The latency would be less than 1 ns for any size system and would improve substantially with the use of VCSEL's instead of LED's. With LED's, our theoretical $32 \times 32 \times 4$ system would use 2.5 kW of power for interconnect, which is less than the power consumption of an electrically interconnected system (97 kW for an aggressive MCM, assuming 2 pF/cm for interconnect and 3-V signal levels<sup>1</sup>). By using a 20-$\mu$m diameter, 1-mA threshold VCSEL, our model indicates a nearly 14 times improvement in the coupling efficiency for the system studied herein. By conservatively assuming a factor of ten improvement in the coupling efficiency, the optical interconnect power for the $32 \times 32 \times 4$ processor system would drop to approximately 250 W, an improvement of nearly 400 times over the MCM.
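The power comparison above reduces to simple ratios. The following sketch uses only the 97-kW MCM figure, the 250-W VCSEL estimate, and the 2-pF/cm, 3-V assumptions quoted in the text. The per-node figure and the stored energy per centimeter of line are derived here for illustration and are not values reported in the letter.

```python
# Figures quoted in the text for the 32 x 32 x 4 processor system.
mcm_power_w = 97e3      # aggressive electrical MCM interconnect
vcsel_power_w = 250.0   # optical interconnect with assumed 10x better coupling
nodes = 32 * 32 * 4

improvement = mcm_power_w / vcsel_power_w   # optical vs. electrical
per_node_w = vcsel_power_w / nodes          # derived, illustrative only

# Energy stored on 1 cm of electrical line at the quoted assumptions
# (2 pF/cm, 3-V swing), using E = C * V^2 / 2.
c_per_cm_f = 2e-12
v_swing = 3.0
energy_per_cm_j = 0.5 * c_per_cm_f * v_swing ** 2

print(f"improvement over MCM: {improvement:.0f}x")
print(f"optical power per node: {per_node_w * 1e3:.0f} mW")
print(f"stored energy per cm of line: {energy_per_cm_j * 1e12:.0f} pJ")
```

The ratio comes out at 388, which matches the "nearly 400 times" claim.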

## II. SYSTEM DESCRIPTION

Each silicon integrated circuit layer in the two-layer stack consisted of one optical transmitter and one optical receiver. These circuits were fabricated in 0.8-$\mu$m standard silicon CMOS provided through the MOSIS foundry. The through-silicon wafer demonstration was performed using two of these chips, stacked to enable optical communication between the transmitter and receiver, as shown in Fig. 1(a).

The digital CMOS transmitter circuits were designed and optimized to provide up to 80 mA of output current to the LED's. Digital level input to this driver circuit was enabled by a two-stage tapered buffer input designed to minimize power consumption at high speeds. The thin-film LED epilayers, as grown, consisted of: InGaAsP (2000 Å, $n^+ = 1 \times 10^{19}$ cm$^{-3}$, $\lambda = 1.2\ \mu$m)/InP (1.0 $\mu$m, $n = 2 \times 10^{17}$ cm$^{-3}$)/InGaAsP (1.0 $\mu$m, $n = 2 \times 10^{17}$ cm$^{-3}$, $\lambda = 1.3\ \mu$m)/InP (1.0 $\mu$m, $p = 2 \times 10^{17}$ cm$^{-3}$, $\lambda = 1.2\ \mu$m). The devices were pixellated into squares 100 $\mu$m on a side, the substrate was removed, and the devices were bonded onto the post-metallized circuit. Further post processing isolated and defined the top contact. More details about the fabrication and the characterization of this optical transmitter are given elsewhere [8].

Fig. 2. From top to bottom: trigger, input signal to the transmitter, and output signal from the comparator. The input signal data rate was 10 Mb/s with a digital level.

The optical receiver comprises a front-end amplifier directly connected to an on-chip comparator. The transimpedance amplifier was designed so that the output can drive an on-chip comparator to convert the analog amplifier output to a digital signal. Thus, the peak-to-peak output of the amplifier must be 20-50 mV to drive the comparator, which converts to a 12-k$\Omega$ total transimpedance gain with a 50-$\Omega$ load, assuming an input power of -23 dBm (SONET OC3 specifications). For wide bandwidth, a multistage low-gain-per-stage configuration was used [9]. This amplifier was optimized for minimum power dissipation at a rate of 155 Mb/s with -23 dBm, leading to an open-loop, current-mode configuration. Thin-film InP-based inverted metal-semiconductor-metal photodetectors (I-MSM's) were bonded onto the amplifier input pads. These photodetectors consisted of InP (substrate)/InGaAs (100 nm, etch layer)/InAlAs (40 nm)/InGaAs (1000 nm)/InAlAs (40 nm), with all layers nominally undoped. These structures have demonstrated up to 0.7-A/W responsivity [10] and, in this particular size, operation up to 1.1 GHz with low leakage current (less than 150 nA at 10-V bias). The measured capacitance of 250 fF is typical for 250 $\times$ 250 $\mu$m<sup>2</sup> active area I-MSM's with 2-$\mu$m finger width and 8-$\mu$m finger spacing. More details about the photoreceiver design and characterization are given elsewhere [11].
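The link between the -23-dBm input power, the 12-k$\Omega$ transimpedance gain, and the 20-50-mV comparator drive requirement can be illustrated numerically. This is a rough sketch: it assumes the detector operates at the 0.7-A/W responsivity quoted as an upper bound and treats the full optical power as signal swing. The function and variable names are hypothetical.

```python
def dbm_to_watts(p_dbm):
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

p_in_w = dbm_to_watts(-23.0)        # about 5 microwatts

# Assumption: detector at the 0.7-A/W "up to" responsivity from the text.
responsivity_a_per_w = 0.7
i_photo_a = responsivity_a_per_w * p_in_w

transimpedance_ohm = 12e3           # total gain quoted in the text
v_out = i_photo_a * transimpedance_ohm

print(f"input power:  {p_in_w * 1e6:.2f} uW")
print(f"photocurrent: {i_photo_a * 1e6:.2f} uA")
print(f"amp output:   {v_out * 1e3:.1f} mV")
print("within 20-50 mV window:", 0.020 <= v_out <= 0.050)
```

With these assumptions the amplifier output is about 42 mV, inside the stated 20-50-mV window. A lower actual responsivity or coupling loss would reduce the swing proportionally.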

## III. FABRICATION

To realize the optoelectronic integrated circuits (OEIC's) for through-Si communication, each of the chips was integrated with a thin-film LED and an I-MSM photodetector. Because the LED contact must be annealed (for lower contact resistance) at a temperature that would damage the MSM Schottky barrier, the LED was fully integrated before the I-MSM was bonded and integrated.

An MJB3 Karl Suss mask aligner with infrared back plane alignment was used to stack the OEIC's. Alignment was aided by the use of fiducial markers on the chip carrier, on the chip layout, and on the masks for the OE integration. The bottom OEIC's were actively aligned with IR imaging with respect to the fiducial markers previously defined on a glass substrate. To complete the fabrication of the chip stack, the top chip was aligned with respect to the bottom chip and then placed in an LDCC 44-pin flat pack for testing. To fasten the parts together, a UV/heat curable epoxy was used. The epoxy provided slight index matching between the stacked layers, with a refractive index of 1.62.

Fig. 3. Eye diagram generated with a 40-Mb/s, $2^7 - 1$ nonreturn-to-zero pseudorandom bit sequence at the transmitter input. The optical link nominal power consumption was 85 mW.

## IV. RESULTS AND DISCUSSION

A circuit illustration of the stacked circuit optical through-Si link is shown in Fig. 1(b). A pseudorandom digital signal from a Tektronix pattern generator was applied to the SMA-connectorized bottom transmitter input. A digitizing oscilloscope collected the output signal from the top receiver circuit. The voltage and current in each critical node of the system and the biasing of the integrated circuits were controlled by source measurement units.
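For readers unfamiliar with the test stimulus, a $2^7 - 1$ pseudorandom bit sequence can be produced by a 7-bit linear feedback shift register. The sketch below is a generic illustration, not a description of the pattern generator used here: the feedback taps (stages 7 and 6), the seed, and the 0-V/5-V nonreturn-to-zero levels are assumptions chosen for the example.

```python
def prbs7(n_bits, seed=0x7F):
    """Return n_bits of a length-127 pseudorandom bit sequence.

    Assumption: feedback from stages 7 and 6 of a 7-bit shift register;
    the letter does not specify the generator polynomial.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits

def nrz_levels(bits, v_low=0.0, v_high=5.0):
    """Map bits to nonreturn-to-zero voltage levels (assumed 0 V / 5 V)."""
    return [v_high if b else v_low for b in bits]

PERIOD = 2 ** 7 - 1                   # 127 bits before the pattern repeats
BIT_RATE = 40e6                       # 40 Mb/s, as in the eye-diagram test
bit_time_s = 1.0 / BIT_RATE           # 25 ns per bit
pattern_time_s = PERIOD * bit_time_s  # duration of one full pattern

seq = prbs7(2 * PERIOD)
levels = nrz_levels(seq)
print("repeats after 127 bits:", seq[:PERIOD] == seq[PERIOD:])
print("ones per period:", sum(seq[:PERIOD]))
print(f"bit time: {bit_time_s * 1e9:.0f} ns")
```

A maximal-length sequence of this kind contains every nonzero 7-bit pattern once per period, which is why it is a convenient stimulus for eye-diagram measurements.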

The through-silicon wafer optical link was tested at several speeds and power consumption levels. The best results were obtained at 10 Mb/s, with a power supply voltage of less than 3.5 V for both the transmitter and the receiver, and an LED nominal drive current of 20 mA. The overall power consumption was measured to be 70 mW (8 mW for the receiver). The output eye diagram shown in Fig. 2 at 10 Mb/s ($2^7-1$ pseudorandom data) shows very low distortion and indicates the potential for low BER operation. Nevertheless, ringing was slightly noticeable even at 10 Mb/s, and is believed to be the major source of noise at bit rates higher than 40 Mb/s, as shown in Fig. 3. To reach higher transmission rates, our off-chip decoupling network needs to be improved.
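One way to compare the two reported operating points is energy per transmitted bit, which is simply link power divided by data rate. The short sketch below uses the 70-mW, 10-Mb/s figure from the text and the 85-mW, 40-Mb/s figure from the Fig. 3 caption. The energy-per-bit values are derived here for illustration and are not reported in the letter.

```python
def energy_per_bit_j(power_w, bit_rate_bps):
    """Average link energy spent per transmitted bit, in joules."""
    return power_w / bit_rate_bps

# Operating points quoted in the text and in the Fig. 3 caption.
e_10mbps = energy_per_bit_j(70e-3, 10e6)   # 70 mW at 10 Mb/s
e_40mbps = energy_per_bit_j(85e-3, 40e6)   # 85 mW at 40 Mb/s

print(f"10 Mb/s: {e_10mbps * 1e9:.2f} nJ/bit")
print(f"40 Mb/s: {e_40mbps * 1e9:.3f} nJ/bit")
```

On this measure the 40-Mb/s operating point is the more efficient of the two, since power rises only modestly while the data rate quadruples.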

The LED-photodetector optical coupling efficiency was measured to be 2% for this system, a factor of 2 improvement over the previously demonstrated system [7], primarily due to the use of the large area I-MSM. Both the voltage output (5 V out of the comparator, compatible with digital signal levels) and the link speed (40 Mb/s) contribute to a factor of  $10^3$ improvement compared to previous results (0.3 mV and 10 kb/s, respectively) [7], and are due solely to improvements in the circuit designs. In addition, the use of the relatively large area I-MSM photodetector enabled this system to be alignment tolerant; our models of the system indicate that a lateral LED/I-MSM offset of  $\pm 50 \ \mu$ m will result in a signal degradation of only 10%. The use of VCSEL's would significantly improve the link performance in speed, coupling efficiency, latency, size, and power dissipation, as outlined in the introduction.
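The improvement factors relative to the earlier demonstration [7] follow directly from the quoted numbers. The sketch below computes them. The previous coupling efficiency is inferred from the stated factor of 2 and is not a value given in the text.

```python
# Values quoted in the text: this work vs. the earlier demonstration [7].
v_out_now, v_out_prev = 5.0, 0.3e-3    # volts
rate_now, rate_prev = 40e6, 10e3       # bits per second
coupling_now = 0.02                    # 2% measured coupling efficiency

voltage_ratio = v_out_now / v_out_prev
rate_ratio = rate_now / rate_prev
# Inferred from the stated factor-of-2 improvement (not given directly).
coupling_prev = coupling_now / 2

print(f"output voltage improvement: {voltage_ratio:.0f}x")
print(f"link speed improvement:     {rate_ratio:.0f}x")
print(f"implied earlier coupling:   {coupling_prev * 100:.0f}%")
```

Both ratios exceed $10^3$, so the stated factor is a conservative round figure for the voltage and speed gains.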

## V. CONCLUSION

Vertical optical communication through stacked silicon CMOS circuits has been demonstrated at speeds up to 40 Mb/s with an open eye diagram. This compact module was realized in a digital silicon CMOS foundry process and has both digital input and output, which enables the integration of this interconnection building block into more complex digital systems. Thin-film LED's and I-MSM detectors operating at a wavelength of 1.3  $\mu$ m (to which the Si is transparent) were bonded directly to the silicon integrated circuits to demonstrate a fully internal vertical optical interconnection link to realize the 3-D system.

## REFERENCES

- [1] A. Iwata and I. Hayashi, "Optical interconnections as a new LSI technology," *IEICE Trans. Electron.*, vol. E76-C, no. 1, pp. 90–99, 1992.
- [2] S. Tang and R. Chen, "1–27 highly parallel three-dimensional intra and inter board optical interconnects," *IEEE Photon. Technol. Lett.*, vol. 6, pp. 299–301, Feb. 1994.
- [3] J. W. Goodman and F. J. Leonberger, "Optical interconnection for VLSI systems," *Proc. IEEE*, vol. 72, pp. 850–866, 1984.
- [4] D. S. Wills, J. M. Baker, H. H. Čat, S. M. Chai, L. Codrescu, J. Cruz-Rivera, J. Eble, A. Gentile, M. Hopper, W. S. Lacy, A. Lopez-Lagunas, P. May, S. Smith, and T. Taha, "Processing architectures for smart pixel systems," *IEEE J. Select. Topics Quantum Electron.*, vol. 2, pp. 24–34, June 1996.
- [5] M. Koyanagi, H. Takata, H. Okano, and S. Yokoyama, "Three-dimensional memory LSI with optical interconnections," *Electron. Commun. Jpn., Pt. 2*, vol. 76, no. 2, pp. 1–13, 1993.
- [6] R. F. Carson, M. L. Lovejoy, K. L. Lear, M. E. Warren, P. K. Seigal, G. A. Patrizi, S. P. Kilcoyne, and D. C. Craft, "Low-power modular parallel photonic data links," in *1996 IEEE ECTC Proc.*, pp. 321–326.
- [7] N. M. Jokerst, C. Camperi-Ginestet, B. Buchanan, S. Wilkinson, and M. A. Brooke, "Communication through stacked silicon circuitry using integrated thin film InP-based emitters and detectors," *IEEE Photon. Technol. Lett.*, vol. 7, pp. 1028–1030, Sept. 1995.
- [8] S. W. Bond, M. Lee, J. J. Chang, O. Vendier, Z. Hou, M. Brooke, N. M. Jokerst, and R. P. Leavitt, "An integrated 155 Mbps digital transmitter using a 1.3 μm wavelength thin film LED," *Proc. IEEE LEOS Annu. Meet.* '96, vol. 1, 1996, pp. 342–343.
- [9] R. P. Jindal, "Gigahertz-band high-gain low-noise AGC amplifiers in fine-line NMOS," *IEEE J. Solid-State Circuits*, vol. SSC-22, no. 4, pp. 512–521, 1987.
- [10] O. Vendier, N. M. Jokerst, and R. P. Leavitt, "Thin film inverted MSM photodetectors," *IEEE Photon. Technol. Lett.*, vol. 8, pp. 266–268, Aug. 1996.
- [11] M. Lee, O. Vendier, M. A. Brooke, and N. M. Jokerst, "A scalable CMOS current mode preamplifier design for an optical receiver," *Analog Integrated Circuits and Signal Processing*, vol. 12, no. 2, pp. 133–144, 1997.