#### UNIVERSITY OF CALIFORNIA

Santa Barbara

#### 1.0 - 2.0 GHz Wideband PLL CMOS Frequency Synthesizer

A thesis submitted in partial satisfaction of the requirements for the degree of Master of Science in Electrical and Computer Engineering

by

Chao W. Huang

Committee in charge:

Professor Forrest D. Brewer, Chair

Professor Steve E. Butner

Professor P. Michael Melliar-Smith

June 2004

The dissertation of Chao W. Huang is approved.

Steve E. Butner

P. Michael Melliar-Smith

Forrest Brewer, Committee Chair

June 2004

## 1.0 - 2.0 GHz Wideband PLL CMOS Frequency Synthesizer

Copyright © 2004

by

Chao W. Huang

All rights reserved

#### Abstract

#### 1.0 - 2.0 GHz Wideband PLL CMOS Frequency Synthesizer

by

#### Chao W. Huang

CMOS mixed-signal design has become very popular in today's semiconductor industry. This thesis demonstrates a CMOS frequency synthesizer design whose primary purpose is to test the designer's high-speed, mixed-signal CMOS circuit design skills. The design uses a 0.25 µm deep sub-micron CMOS process technology, so issues such as noise rejection, high-frequency parasitic effects, leakage current, and power dissipation are exposed and the designer's problem-solving skills are exercised. A new counter design, the "modified Möbius counter", is also presented. Furthermore, this design is used to verify the usability of the previously set up custom design flow and standard cell design flow, which may be used for future academic teaching purposes.

## **Table of Contents**

| Section | Title                                     |
|---------|-------------------------------------------|
| 1       | INTRODUCTION                              |
| 2       | INTRODUCTION TO THE FREQUENCY SYNTHESIZER |
| 2.1     | Introduction                              |
| 2.2     | Frequency Synthesizer Basics              |
| 2.3     | Phase Noise and Timing Jitter Issues      |
| 2.4     | Conclusions                               |
| 3       | PHASE-LOCKED LOOP                         |
| 3.1     | Introduction                              |
| 3.2     | Voltage-Controlled Oscillator             |
| 3.2.1   | Design of VCO Cells                       |
| 3.2.1.1 | Circuit Analysis                          |
| 3.2.2   | Design of VCO Switching Units             |
| 3.3     | Charge Pump and Loop Filter               |
| 3.3.1   | Design Analysis                           |
| 3.3.2   | Design Implementation                     |
| 3.4     | Phase-Frequency Detector                  |
| 3.5     | PLL Simulation                            |
| 3.6     | Physical Layout                           |
| 3.7     | Conclusions                               |
| 4       | COUNTERS                                  |
| 4.1     | Introduction                              |
| 4.2     | Design and Circuit Analysis               |
| 4.2.1   | Modified Möbius Counter                   |
| 4.2.2   | Circuit Implementation                    |
| 4.2.3   | Timing Analysis                           |
| 4.3     | Simulation and Measurement                |
| 4.4     | Physical Layout                           |
| 4.5     | Conclusions                               |
| 5       | PADS AND PACKAGING                        |
| 5.1     | Introduction                              |
| 5.2     | High Speed Differential Pads              |
| 5.3     | General Purpose Pads                      |
| 5.3.1   | ESD Protection Circuit                    |
| 5.3.2   | Power Pads                                |
| 5.3.3   | Analog Power Pads and Reference Pad       |
| 5.3.4   | Digital I/O Pads                          |
| 5.4     | Packaging                                 |
| 6       | OTHER DESIGN ISSUES                       |
| 6.1     | Introduction                              |
| 6.2     | Power Dissipation                         |
| 6.3     | Noise Reduction                           |
| 6.4     | Design for Test                           |
| 6.5     | Dummy Metal/Pad Insertion                 |
| 7       | WHOLE CHIP SIMULATION AND TEST PLAN       |
| 7.1     | Introduction                              |
| 7.2     | Whole Chip Simulation                     |
| 7.3     | Test and Measurement Plan                 |
| 8       | CONCLUSIONS                               |
|         | References                                |

# **List of Figures**

| Figure | Caption                                                            |
|--------|--------------------------------------------------------------------|
| 2-1    | Simplified block diagram and waveforms for a frequency synthesizer |
| 2-2    | PFD phase error vs. V<sub>out</sub> plot                           |
| 2-3    | State diagram and schematic of PFD                                 |
| 2-4    | Sample waveforms for PFD/CP/LPF combination                        |
| 2-5    | Schematic of a charge pump                                         |
| 2-6    | Linear model of the frequency synthesizer                          |
| 2-7    | Schematic of modified loop filter                                  |
| 3-1    | VCO block diagram                                                  |
| 3-2    | n-stage inverter oscillator VCO ( $n \in I, n \ge 2$ )             |
| 3-3    | Schematic of ring oscillator delay stage                           |
| 3-4    | Step response when $V_{i+}$ : 0->1                                 |
| 3-5    | RC model of a 2-stage ring oscillator                              |
| 3-6    | Example of the 2-stage VCO output as a function of Vctrl           |
| 3-7    | Schematic of VCOs switch unit S1 (single-ended version)            |
| 3-8    | Schematic of VCOs switch unit S2 (single-ended version)            |
| 3-9    | MATLAB Simulink PLL model                                          |
| 3-10   | Simulink simulation output for N=1000                              |
| 3-11   | MOS capacitor structure and its characteristic plot                |
| 3-12   | Differential CP/LPF unit                                           |
| 3-13   | Current source for CP/LPF unit                                     |
| 3-14   | Common-mode feedback amplifier                                     |
| 3-15   | Schematic for phase-frequency detector unit                        |
| 3-16   | Schematic of single-ended-to-differential signal converter         |
| 3-17   | Simulated differential VCO control voltage                         |
| 3-18   | Simulated reference and feedback signals                           |
| 3-19   | Simulated control voltage after the PLL is locked                  |
| 3-20   | Full-custom design flow                                            |
| 3-21   | Layout plot of PLL block                                           |
| 4-1    | Pulse swallow frequency divider                                    |
| 4-2    | Block diagram of counter unit                                      |
| 4-3    | 4-bit Möbius counter counting pattern                              |
| 4-4    | Schematic of a 4-bit Möbius counter                                |
| 4-5    | Pattern edge detection circuit                                     |
| 4-6    | 64-count 2 stages Möbius counter                                   |
| 4-7    | Schematic of 256-state programmable counter                        |
| 4-8    | (249/372)-state fixed counter                                      |
| 4-9    | 3/2 prescaler                                                      |
| 4-10   | Sample waveforms from 1-256 programmable counter                   |
| 4-11   | Typical Standard Cell Design Flow                                  |
| 4-12   | Layout plot of programmable counter design                         |
| 4-13   | Layout plot of 8-bit shift register                                |
| 4-14   | Layout plot of 4-16 decoder design                                 |
| 5-1    | Schematic of high speed differential pad drivers                   |
| 5-2    | Layout plot of differential pads                                   |
| 5-3    | ggMOS ESD protection scheme                                        |
| 5-4    | A pad structure with ggMOS ESD protection                          |
| 5-5    | Schematic of power pads                                            |
| 5-6    | Schematics for analog power pads and reference pad                 |
| 5-7    | Schematic of the bi-directional digital pad                        |
| 5-8    | Layout plot of the bi-directional pad                              |
| 5-9    | Pin model of CLCC-44 package                                       |
| 5-10   | Layout plot of frequency synthesizer design (including pads)       |
| 6-1    | Metal and assembly stress relief fill                              |
| 7-1    | Setup of top-level simulation                                      |

## **List of Tables**

| Table | Caption                                                                                |
|-------|----------------------------------------------------------------------------------------|
| 3-1   | The frequency synthesizer's design specification                                       |
| 3-2   | Parameters for manual model of generic 0.25 µm CMOS process (minimum length device)    |
| 3-3   | Device sizes for the VCO delay cells                                                   |
| 3-4   | Corner case Kvco values                                                                |
| 3-5   | Device sizes for the switch unit S1                                                    |
| 3-6   | Device sizes of VCOs switch unit S2                                                    |
| 3-7   | CP/LPF design values                                                                   |
| 3-8   | Device sizes of CP/LPF unit                                                            |
| 3-9   | Device sizes of current source                                                         |
| 3-10  | Device sizes for common-mode feedback amplifier                                        |
| 4-1   | Counting values for the pulse swallow frequency divider                                |
| 5-1   | Device sizes of high speed differential pad drivers                                    |
| 5-2   | Device sizes for the bi-direction pad                                                  |
| 7-1   | Test result of whole chip simulations                                                  |

# **1 INTRODUCTION**

PLL-based frequency synthesizers are used in many electronic applications. In the past, frequency synthesizers were mainly implemented with discrete components. However, because of package electrical characteristics, building a frequency synthesizer from discrete components that meets ever-increasing speed requirements has become more and more difficult. This is most evident in the growing wireless communications market.

In a wireless communication system, to achieve higher speed, lower cost, smaller form factor, and lower power dissipation, the frequency synthesizer is usually integrated together with other circuits in low-cost CMOS technology. This in turn requires enough supply-noise isolation for the synthesizer to provide low phase-noise output. Because of its good supply noise and common-mode noise rejection properties, the differential structure is commonly used in on-chip frequency synthesizer design.

Another issue in on-chip synthesizer design is the difficulty of generating an output with a wide frequency tuning range, because a wider tuning range requires a larger capacitance. Since current CMOS technologies do not provide a good capacitance/area ratio, a large capacitor is expensive to build, which limits the synthesizer's output tuning range. To increase the tuning range without increasing chip area, a multi-VCO unit can be used. The idea is to partition the output frequency range into sections, with one VCO responsible for each section. A multiplexer directs the valid VCO output to the synthesizer output. Therefore, with a multi-VCO structure, a wider tuning range can be achieved with limited capacitance area.

One of the core components in the frequency synthesizer is the programmable counter. Traditional counter designs are normally based on either binary or one-hot design structure. Binary counter structure is chip-area efficient but gives slow performance. On the other hand, one-hot design provides good speed performance, but is chip-area expensive. Due to these limitations, in a high-speed and wide-tuning-range synthesizer, neither binary nor one-hot design structure may be usable for programmable counter design. Therefore, a new counter design structure is necessary.

In the following chapters, we will demonstrate the design and implementation of a frequency synthesizer using the concepts described above. We will also present a new counter design, the so-called modified Möbius counter. This counter structure provides speed comparable to a one-hot design while using a limited amount of chip area. It also offers lower power consumption and glitch-free output.

We will also briefly discuss the custom design flow and the standard cell design flow that are used in this design. To complete the demonstration, we will cover the design and construction of the high-speed differential pad and generic pad circuits. Lastly, we will discuss the IC test and measurement plan for the design. The complete synthesizer design complies with TSMC $0.25 \mu m$ CMOS technology, and the prototype was sent out for fabrication on May 17th, 2004.

# **2 INTRODUCTION TO THE FREQUENCY SYNTHESIZER**

## **2.1 Introduction**

The frequency synthesizer's simplified block diagram is shown in Figure 2-1. In the frequency synthesizer, the PLL block is responsible for generating an output signal whose frequency depends on the phase relationship between two input signals. The phases of a reference signal, $f_{ref}$, and a feedback signal, $f_{fb}$, are compared in a phase-frequency detector (PFD), and the phase difference is then converted by a charge pump and low-pass filter (CP/LPF) circuit into a control voltage. This voltage controls the VCO to generate a signal with the desired frequency. A divider is inserted in the feedback path, giving $f_{fb} = f_{out} / M$. Since $f_{ref}$ and $f_{fb}$ must be equal in the locked condition, $f_{out}$ is simply the product of $f_{ref}$ and M. Figure 2-1(b) shows simple waveforms with M=4. By changing the multiplication factor M, signals with the desired frequency can be generated.
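The locked-state relation between the reference, feedback, and output frequencies can be illustrated with a few lines of Python. This is only a sketch of an ideal locked loop: the function names and the 10 MHz reference are assumptions chosen for illustration and are not values from this design.

```python
def locked_output_frequency(f_ref_hz: float, m: int) -> float:
    """Output frequency of an ideal locked synthesizer: f_out = M * f_ref."""
    if m < 1:
        raise ValueError("the division factor M must be a positive integer")
    return m * f_ref_hz


def feedback_frequency(f_out_hz: float, m: int) -> float:
    """Frequency seen by the PFD after the feedback divider: f_fb = f_out / M."""
    if m < 1:
        raise ValueError("the division factor M must be a positive integer")
    return f_out_hz / m


if __name__ == "__main__":
    f_ref = 10e6                                  # hypothetical 10 MHz reference
    f_out = locked_output_frequency(f_ref, 4)     # M = 4, as in Figure 2-1(b)
    # In lock, the divided output matches the reference again.
    print(f_out, feedback_frequency(f_out, 4))
```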

In order to understand and analyze the functional behavior of the frequency synthesizer, it is necessary to construct a linear model for the system. As we will see, the frequency synthesizer is a non-linear device but it can be modelled as a linear device since under normal operation, the system behaves fairly linearly. In this chapter, we will start by briefly discussing each building block and its linear model. We will then combine the models and analyze the synthesizer system as a whole. Since we intend to concentrate on circuit design here, an in-depth discussion of each building block and its model is beyond the scope of this thesis. More information can be acquired from [Dai03] and [Rabaey02].

Figure 2-1. Simplified block diagram and waveforms for a frequency synthesizer. (a) Block diagram. (b) Typical waveforms with M=4.

## **2.2 Frequency Synthesizer Basics**

As seen in Figure 2-1(a), the frequency synthesizer has four sub-blocks: the PFD, CP/LPF, VCO, and feedback divider. The PFD block determines the phase difference between the reference and feedback signals. Depending on the input signals' phase relationship, i.e. whether one leads or lags the other, the PFD produces an appropriate output signal. This is best described with the PFD phase characteristic plot shown in Figure 2-2, in which $\Delta \phi = \text{phase}_{\text{ref}} - \text{phase}_{\text{feedback}}$. The plot shows that the PFD is a nonlinear device that is linear only for phase errors within 360 degrees. When the synthesizer is in the locked state, the phase error between the feedback and reference signals is normally small, well within the PFD's linear operating region. Therefore, in locked mode, we can consider the PFD block a linear device.

Figure 2-2. PFD phase error vs. V<sub>out</sub> plot.



The phase-frequency detector described above can be implemented by a digital state machine, whose state diagram and schematic are shown in Figure 2-3. The PFD design produces non-complementary outputs UP and DN. The UP signal is used for increasing the control voltage and therefore the output frequency of the VCO. The DN signal does exactly the opposite and is used for lowering the output frequency. The duty cycle of each signal depends on the phase relationship between the two input signals and their phase difference. Assuming $f_{ref}$ is leading, the DN signal remains inactive while the UP signal becomes active with a duty cycle of $\frac{\Delta\phi}{2\pi}$, as shown in Figure 2-4. When $f_{fb}$ leads, the UP and DN signals exchange roles.



Figure 2-3. State diagram and schematic of PFD

Figure 2-4. Sample waveforms for PFD/CP/LPF combination.



Since the UP and DN signals are digital signals, they must be converted into an analog voltage to control the VCO. A charge pump and a low-pass filter unit serve this purpose. Figure 2-5 shows one possible implementation of a CP/LPF unit. It consists of two switched current sources that pump charge into or out of the low-pass filter according to the PFD outputs. The VCO control voltage, $V_{out}$, rises when UP is active, and the amount of voltage change depends on the duty cycle of the UP signal, as can be seen in Figure 2-4. Similarly, $V_{out}$ decreases when DN is active.





Let us consider the case when the reference clock is leading the feedback clock. The average output current from the charge pump is then given by

$$I_{out} = I_{cp} \cdot \frac{t_{UP-active}}{T}.$$
 (2-1)

Here, T is the period of the reference signal. Therefore, $\frac{t_{UP-active}}{T}$ is simply the duty cycle of the UP signal. Thus, we have

$$I_{out} = I_{cp} \cdot \frac{\Delta \phi}{2\pi}.$$
 (2-2)

Equation 2-2 can be applied to the cases when the feedback signal is leading the reference clock as well, with both phase error and output current negative. Hence, the transfer function of the PFD and charge pump can be expressed as

$$\frac{I_{out}}{\Delta\phi}(s) = \frac{I_{cp}}{2\pi}.$$
 (2-3)
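Equation 2-2 can be evaluated directly. The Python sketch below is illustrative only: the function name, the 100 µA pump current, and the assumption that the linear range is $|\Delta\phi| < 2\pi$ are not taken from the design.

```python
import math


def average_cp_current(i_cp: float, delta_phi: float) -> float:
    """Average charge-pump output current per Equation 2-2.

    i_cp      : charge-pump current in A
    delta_phi : phase error in radians, positive when the reference leads.
                The linear range |delta_phi| < 2*pi is an assumption here.
    """
    if not -2 * math.pi < delta_phi < 2 * math.pi:
        raise ValueError("phase error outside the assumed linear PFD range")
    return i_cp * delta_phi / (2 * math.pi)


if __name__ == "__main__":
    i_cp = 100e-6                      # hypothetical 100 uA charge-pump current
    for dphi in (math.pi / 2, -math.pi / 2):
        # A leading reference sources current, a lagging one sinks it.
        print(dphi, average_cp_current(i_cp, dphi))
```

A positive result corresponds to the UP branch being active on average, and a negative result to the DN branch.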

In the simplest case, a low-pass filter can be built as a single capacitor connected from a limited-impedance signal line to ground, as shown in Figure 2-5. Its transfer function is then given by

$$\frac{V_{out}}{I_{out}}(s) = \frac{1}{s \cdot C_p}.$$
 (2-4)

The output voltage from the low-pass filter is fed into the voltage-controlled oscillator unit. Two types of VCO design are widely used in industry: the ring oscillator VCO and the LC VCO. An LC VCO offers good phase noise performance. However, it requires a special fabrication process, occupies a much bigger area, and has a narrow tuning range, which make it unsuitable for our design. Therefore, we choose to use a ring oscillator for the VCO. The transfer characteristic of a VCO unit can be described as

$$f_{vco}(t) = K_{vco} \times V_{out}(t).$$
(2-5)

Integrating on both sides, we have

$$\phi_{\rm vco}(t) = K_{\rm vco} \int_0^t V_{\rm out}(t) dt, \qquad (2-6)$$

yielding the transfer function

$$\frac{\Phi_{\text{out}}}{V_{\text{out}}}(s) = \frac{K_{\text{vco}}}{s}.$$
(2-7)

The output signal generated by the VCO unit is then fed back to the PFD via a divider unit. The feedback divider implements the function $f_{in} = M \cdot f_{out}$, which in our case is $f_{fb} = \frac{1}{M} \cdot f_{vco}$. With each subsystem's linear model defined, we can now construct a linear model for the frequency synthesizer system. The combined linear model of the synthesizer is shown in Figure 2-6. The model gives the closed-loop transfer function

$$H(s) = \frac{\Phi_{out}(s)}{\Phi_{in}(s)} = \frac{\frac{I_{cp}}{2\pi} \cdot \frac{1}{s \cdot C_p} \cdot \frac{K_{vco}}{s}}{1 + \frac{1}{M} \cdot \frac{I_{cp}}{2\pi} \cdot \frac{1}{s \cdot C_p} \cdot \frac{K_{vco}}{s}}.$$
 (2-8)
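As an intermediate step, Equation 2-8 can be simplified to expose its poles. The shorthand $K$ is introduced here only for compactness and is not used elsewhere in the text.

$$K = \frac{I_{cp} K_{vco}}{2\pi C_p}, \qquad H(s) = \frac{K/s^2}{1 + \frac{K}{M s^2}} = \frac{K}{s^2 + \frac{K}{M}} \quad\Longrightarrow\quad s_{1,2} = \pm j\sqrt{\frac{K}{M}} = \pm j\sqrt{\frac{I_{cp} K_{vco}}{2\pi M C_p}}$$

Both poles lie on the imaginary axis, so with a single capacitor as the loop filter the loop has no damping term.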

Figure 2-6. Linear model of the frequency synthesizer.



Figure 2-7. Schematic of modified loop filter. (a) Second-order filter. (b) Simplified filter.

The closed-loop system contains two imaginary poles, which suggests that the system is unstable. In order to stabilize the system, a modified loop filter is used to add a zero and a pole to the system, as shown in Figure 2-7(a). The zero and the pole are given by

$$\omega_{z} = \frac{1}{RC_{p}}$$
(2-9)

$$\omega_{\rm p} = \frac{{\rm C}_{\rm p} + {\rm C}_{\rm 2}}{{\rm R}{\rm C}_{\rm p}{\rm C}_{\rm 2}} \tag{2-10}$$

Capacitor  $C_2$  is used for suppressing the ripple noise caused by  $R-C_p$  loop and can be neglected as long as it is much smaller than  $C_p$ . Thus, the loop filter can be simplified to the circuit shown in Figure 2-7(b). The pole of the system becomes

$$\omega_{\rm p} \cong \frac{1}{\rm RC_2}, \qquad (2-11)$$

the loop filter's transfer function becomes

$$\frac{V_{out}}{I_{out}}(s) = \frac{1 + sRC_p}{sC_p},$$
(2-12)

yielding the new closed-loop transfer function of the frequency synthesizer

$$H(s) = \frac{\frac{I_{cp}}{2\pi} \cdot K_{vco} \left(sR + \frac{1}{C_p}\right)}{s^2 + sR \cdot \frac{I_{cp}}{2\pi M} \cdot K_{vco} + \frac{I_{cp}}{2\pi M C_p} \cdot K_{vco}}.$$
 (2-13)

Therefore, the natural frequency  $\omega_n$ , damping factor  $\zeta$ , and loop bandwidth  $\omega_{lpf}$  can be written as

$$\omega_{\rm n} = \sqrt{\frac{I_{\rm cp}}{2\pi M C_{\rm p}} \cdot K_{\rm vco}}$$
(2-14)

$$\zeta = \frac{R}{2} \sqrt{\frac{I_{cp}C_p}{2\pi M} \cdot K_{vco}}$$
(2-15)

$$\omega_{\rm lpf} = 2\zeta \omega_{\rm n} = \frac{I_{\rm cp} K_{\rm vco} R}{2\pi \cdot M}.$$
 (2-16)

To keep the system stable yet to have fast settling time,  $\omega_p = 4\omega_{lpf}$  and  $\omega_z = \frac{1}{4}\omega_{lpf}$  are used [Wolaver91].
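The sizing rules above can be turned into a small calculation. The following Python sketch solves Equations 2-9, 2-11 and 2-16 for R, $C_p$ and $C_2$ and then evaluates Equations 2-14 to 2-16. All numeric values are hypothetical, the function names are made up for this illustration, and $K_{vco}$ is assumed to be expressed in rad/(s·V).

```python
import math


def loop_filter_from_bandwidth(i_cp, k_vco, m, w_lpf):
    """Size R, C_p and C_2 from a target loop bandwidth.

    i_cp  : charge-pump current in A
    k_vco : VCO gain in rad/(s*V) (assumed unit)
    m     : feedback division factor
    w_lpf : target loop bandwidth in rad/s
    """
    r = 2 * math.pi * m * w_lpf / (i_cp * k_vco)   # Equation 2-16 solved for R
    c_p = 4.0 / (r * w_lpf)                        # w_z = 1/(R*C_p) = w_lpf/4
    c_2 = 1.0 / (4.0 * r * w_lpf)                  # w_p ~ 1/(R*C_2) = 4*w_lpf
    return r, c_p, c_2


def loop_dynamics(i_cp, k_vco, m, r, c_p):
    """Natural frequency, damping factor and bandwidth (Equations 2-14 to 2-16)."""
    w_n = math.sqrt(i_cp * k_vco / (2 * math.pi * m * c_p))
    zeta = (r / 2) * math.sqrt(i_cp * c_p * k_vco / (2 * math.pi * m))
    return w_n, zeta, 2 * zeta * w_n


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    i_cp, k_vco, m = 50e-6, 2 * math.pi * 500e6, 1500
    w_lpf = 2 * math.pi * 50e3
    r, c_p, c_2 = loop_filter_from_bandwidth(i_cp, k_vco, m, w_lpf)
    w_n, zeta, w_check = loop_dynamics(i_cp, k_vco, m, r, c_p)
    print(f"R = {r:.1f} ohm, C_p = {c_p:.3e} F, C_2 = {c_2:.3e} F")
    print(f"w_n = {w_n:.3e} rad/s, zeta = {zeta:.3f}, w_lpf = {w_check:.3e} rad/s")
```

Under these equations, choosing $\omega_z = \frac{1}{4}\omega_{lpf}$ fixes the damping factor at $\zeta = 1$, and the two placement rules together give $C_2 = C_p/16$, independent of the particular values of $I_{cp}$, $K_{vco}$ and M.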

## **2.3 Phase Noise and Timing Jitter Issues**

Since most applications require clean and precise clock signals, making a frequency synthesizer with low output phase noise and timing jitter is a major concern in this design. Because the PLL in the synthesizer system is an analog circuit, it is inherently sensitive to noise and interference. In particular, the ring oscillator used in our VCO design is the biggest contributor of phase noise and timing jitter [Kim90]. Other analog blocks, including the charge pump and the loop filter, also have significant effects on the system's phase noise and timing jitter. Therefore, a design with high supply and substrate noise rejection, such as one using a differential circuit structure, is desirable. In our design, we have adopted the differential VCO circuit topology in [Dai03] and the charge pump/loop filter topology in [Li00]. Detailed discussion of the circuit topologies and their impact on the system phase noise and timing jitter can be found in those sources.

## **2.4 Conclusions**

In this chapter, we have reviewed the basics of frequency synthesizer design. We have briefly studied each of the synthesizer's building blocks and its transfer function, as well as the linear model and transfer function of the synthesizer itself. Finally, we have briefly reviewed the importance of minimizing phase noise and timing jitter in a frequency synthesizer and one possible solution, which we use in our design. This chapter serves as a foundation for the design analysis in later chapters. Circuit design and analysis for each building block of the frequency synthesizer will be presented in the following chapters.

# **3 PHASE-LOCKED LOOP**

## **3.1 Introduction**

In this chapter, we will demonstrate the phase-locked loop block design. The PLL block is the core of our frequency synthesizer design. Because it is a system containing many analog sub-circuits, it can be used to test the designer's analog expertise. Also, since it has many design constraints, which will be shown later in this chapter, we use a full-custom design flow for the PLL block layout. Before presenting the PLL design, let us first look at the design specification for the frequency synthesizer.

| Parameter                 | Value                             |
|---------------------------|-----------------------------------|
| Input reference frequency | 1.0 MHz                           |
| Output frequency          | 1.0 - 2.0 GHz, step 2.0 MHz       |
| Lock-in time              | < 1.0 ms                          |
| Power consumption         | < 200 mW                          |
| Process technology        | Compliant with TSMC 0.25 µm CMOS  |
| Supply voltage            | 2.5 ± 0.2 V                       |

Table 3-1. The frequency synthesizer's design specification.

Since one of the intentions of this design is to test the standard cell set and pads that were previously created, an output frequency range from 1.0 to 2.0 GHz with 2.0 MHz resolution is desirable, as can be seen from the table. Since the maximum output frequency is twice the minimum output frequency, inserting a number of divide-by-two stages in the output path lets the system generate a frequency range from DC up to 2 GHz. Therefore, we can use the output signal to test the maximum frequency the standard cell sets and pads can support. However, such a wide range creates a number of potential issues, including the difficulty of building a VCO unit that covers this output frequency range and an increase in loop filter size. In the following sections, we will show how to address these issues.
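The octave argument can be made concrete: any frequency below 1.0 GHz can be reached by halving a frequency inside the 1.0 to 2.0 GHz range some number of times. The Python sketch below finds that number of divide-by-two stages. The function name is hypothetical, and the sketch ignores the 2.0 MHz step resolution, so it only illustrates the range coverage.

```python
def divider_stages_for(target_hz: int,
                       f_min_hz: int = 1_000_000_000,
                       f_max_hz: int = 2_000_000_000):
    """Return (k, f_vco) such that f_vco / 2**k == target_hz and
    f_min_hz <= f_vco <= f_max_hz, assuming f_max_hz == 2 * f_min_hz."""
    if target_hz <= 0:
        raise ValueError("target frequency must be positive")
    if target_hz > f_max_hz:
        raise ValueError("target frequency is above the synthesizer range")
    k = 0
    f_vco = target_hz
    # Each doubling corresponds to one divide-by-two stage in the output path.
    while f_vco < f_min_hz:
        f_vco *= 2
        k += 1
    return k, f_vco


if __name__ == "__main__":
    for target in (1_500_000_000, 300_000_000, 10_000_000):
        k, f_vco = divider_stages_for(target)
        print(f"{target} Hz: {k} stage(s), synthesizer output {f_vco} Hz")
```

Because the range spans exactly one octave, doubling a value below 1.0 GHz can never overshoot 2.0 GHz, so the loop always lands inside the range.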

## **3.2 Voltage-Controlled Oscillator**

As mentioned before, we choose a CMOS ring oscillator rather than an LC tank for our VCO design, because it gives a wider tuning range and occupies a smaller area. However, it is still difficult to construct a single oscillator that covers the 1.0 to 2.0 GHz output frequency range. To solve this issue, we came up with a VCO unit design whose block diagram is shown in Figure 3-1.

Figure 3-1. VCO block diagram.



Unlike a typical VCO unit, the one in our design contains two VCO cells, each generating a subset of the frequency range: the low-frequency VCO generates frequencies from 1.0 GHz to 1.49 GHz, and the other generates frequencies from 1.49 GHz to 2.0 GHz. A SEL signal is used to choose which VCO to use. By carefully designing these VCO blocks, we can have them cover the full output frequency range while having similar values of $\frac{K_{vco}}{M}$. This is important since it helps to reduce the system area. Equation 2-16 suggests that with constant $\frac{K_{vco}}{M}$, $I_{cp}$ and R can remain unchanged. Therefore, we can use one charge pump and loop filter in the design rather than two. As we shall see later, using a single CP/LPF unit reduces the system area by 40%.
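The band selection and the role of $\frac{K_{vco}}{M}$ can be sketched in Python. This is an illustration under stated assumptions: the function names, the "low"/"high" labels, the choice to assign the 1.49 GHz boundary to the low band, and all numeric values in the bandwidth comparison are hypothetical.

```python
import math


def select_vco(f_out_hz: float) -> str:
    """Pick the VCO cell for a target output frequency (models the SEL signal)."""
    if not 1.0e9 <= f_out_hz <= 2.0e9:
        raise ValueError("target frequency outside the 1.0 to 2.0 GHz range")
    # Assumption: the shared 1.49 GHz boundary is served by the low band.
    return "low" if f_out_hz <= 1.49e9 else "high"


def loop_bandwidth(i_cp: float, k_vco: float, r: float, m: float) -> float:
    """Loop bandwidth from Equation 2-16."""
    return i_cp * k_vco * r / (2 * math.pi * m)


if __name__ == "__main__":
    print(select_vco(1.2e9), select_vco(1.8e9))
    # Two hypothetical operating points with the same K_vco / M ratio:
    # the same I_cp and R then give the same loop bandwidth.
    bw_low = loop_bandwidth(50e-6, 2.0e9, 10e3, 1000)
    bw_high = loop_bandwidth(50e-6, 3.0e9, 10e3, 1500)
    print(bw_low, bw_high)
```

Equation 2-16 depends on $K_{vco}$ and M only through their ratio, which is why matching that ratio across the two VCO cells lets a single charge pump and loop filter serve both.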

In the following section we will cover the circuit design and component sizing calculations. For demonstration purposes, we use the design parameters given in [Rabaey02], which are re-listed in Table 3-2. These values will be used in later chapters.

Table 3-2. Parameters for manual model of generic 0.25 µm CMOS process (minimum length device).

|      | $V_{T0}(V)$ | $\gamma(V^{0.5})$ | $V_{DSAT}(V)$ | $k'(A/V^2)$          | $\lambda(V^{-1})$ |
|------|-------------|-------------------|---------------|----------------------|-------------------|
| NMOS | 0.43        | 0.4               | 0.63          | $115 \times 10^{-6}$ | 0.06              |
| PMOS | -0.4        | -0.4              | -1            | $-30 \times 10^{-6}$ | -0.1              |

#### **3.2.1 Design of VCO Cells**

An n-stage inverter oscillator block diagram is shown in Figure 3-2. The VCO design uses a fully differential structure in order to achieve good supply and substrate noise rejection. Since we partition the output frequency range into two parts, we use a 3-stage VCO to generate the lower frequency range and a 2-stage VCO for the upper range. For the inverter delay stage design, we use the circuit topology in [Dai03]. The schematic of the inverter delay cell is shown in Figure 3-3.



**Figure 3-2.** n-stage inverter oscillator VCO ( $n \in I, n \ge 2$ )

Figure 3-3. Schematic of ring oscillator delay stage.



As shown in the figure, transistors M1, M2, M7 and M8 together form the differential inverter pairs; M9 and M10 are the current-limiting devices, which are used to control the output frequency; and M3-M6 form a pair of latches. The addition of latches to the traditional inverter-only structure changes the delay stage into a design with hysteresis. This is due to the additional delay that is caused by the latches. With this extra delay, it is possible to obtain a 180 degree signal phase shift at a finite frequency with a minimum of two stages. Since this is the requirement for a ring oscillator to oscillate, an oscillator containing only 2 stages is possible. As mentioned before, the VCO is the biggest contributor of phase noise and timing jitter in the system. Since phase noise and timing jitter are accumulated by each delay stage in the ring, minimizing the number of stages in a VCO unit can greatly reduce the system phase noise and timing jitter.

It can be shown that the size ratio between the latch and the inverter pair decides the total delay of a stage. If we can change the effective widths of the inverter pair, we can also change the output frequency of the ring oscillator. M9 and M10 are used exactly for this purpose. Increasing the differential control voltages applied at the inputs of M9 and M10 is equivalent to increasing the effective widths of M1, M2, M7 and M8, and thus causes the output frequency to rise. To simplify the circuit analysis shown below, we remove M9 and M10 from the circuit.

#### **3.2.1.1 Circuit Analysis**

To understand how the oscillator circuit works, let us first look at the step response of the delay cell. Figure 3-4 shows an example case where step signals arrive at the inputs Vi+ and Vi-. Since transistors M1 and M8 are in cut-off mode in this example, they are removed from the schematic.

Figure 3-4. Step response when  $V_{i+}$ : 0->1



At t=0, both M2 and M7 are on, creating the two current paths shown in the figure. Since M3 and M6 are still on, the output states remain unchanged until $V_{o+} = V_{TN}$ or $V_{o-} = V_{dd} + V_{TP}$. At that moment, either M4 or M5 turns on. Due to the positive feedback formed by the latch pairs, both outputs switch states rapidly. Let us define this as the threshold point of the delay stage, at which $V_{i+} = V_{th}$ and $V_{i-} = V_{dd} - V_{th}$. At the threshold point, M2 and M7 are in saturation and M3 and M6 are in linear mode. On "current path 1", we have

$$I_{d} = \frac{k_{n}' W_{M2}}{L_{M2}} (V_{th} - V_{TN})^{2} = k_{p}' \frac{W_{M3}}{L_{M3}} \left( (V_{TN} - V_{dd} - V_{TP}) V_{TP} - \frac{V_{TP}^{2}}{2} \right).$$
(3-1)

Solving for V<sub>th</sub>, we get

$$V_{th} = V_{TN} + \sqrt{2 \cdot \frac{W_{M3}}{L_{M3}} \cdot \frac{L_{M2}}{W_{M2}} \cdot \left( (V_{TN} - V_{dd} - V_{TP}) V_{TP} - \frac{V_{TP}^2}{2} \right)}. \tag{3-2}$$

When each NMOS and PMOS pair have equal strength, Equation 3-1 and Equation 3-2 also apply to "current path 2". Notice that when the size of the latches and all the gate lengths stay constant,  $V_{th}$  is just a function of the inverters' effective width.

Now, let us study the relationship between the threshold voltage and oscillator output frequency by using the 2-stage oscillator as an example. The basic RC model and example waveforms are shown in Figure 3-5.  $R_d$  is the equivalent drive resistor and  $C_g$  is the equivalent input capacitor of a delay stage. In order for the system to oscillate,  $V_{TPB}$  must cross over the delay cell threshold voltage  $V_{th}$  at  $t = \frac{T}{4}$  after  $V_{TPB}$  starts rising, so that it can trigger the next stage. Therefore, we have the following equation:

Figure 3-5. RC model of a 2-stage ring oscillator



$$V_{th} = V_{TPB}\left(\frac{T}{4}\right) = V_{dd} \left( 1 - e^{-\frac{T}{4R_{d}C_{g}}} \right) + \frac{V_{dd}}{1 + e^{\frac{T}{2R_{d}C_{g}}}} \cdot e^{-\frac{T}{4R_{d}C_{g}}}.$$
(3-3)

Solving for T, we get

$$T = 4R_{d}C_{g} \cdot \ln\left(\frac{V_{dd} + \sqrt{(3V_{dd} - 2V_{th})(2V_{th} - V_{dd})}}{2(V_{dd} - V_{th})}\right).$$
(3-4)
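As a quick numerical cross-check, the sketch below evaluates Equation 3-4 for a few threshold voltages and substitutes the resulting period back into the right-hand side of Equation 3-3. The supply voltage follows the text; the values of `RD` and `CG` are hypothetical placeholders, since the extracted $R_d$ and $C_g$ are not given here. Note that the square root in Equation 3-4 is real only for $V_{dd}/2 < V_{th} < V_{dd}$.

```python
import math

def period_from_threshold(v_th, v_dd, rd, cg):
    """Equation 3-4: oscillation period of the 2-stage RC model."""
    root = math.sqrt((3 * v_dd - 2 * v_th) * (2 * v_th - v_dd))
    return 4 * rd * cg * math.log((v_dd + root) / (2 * (v_dd - v_th)))

def threshold_from_period(period, v_dd, rd, cg):
    """Right-hand side of Equation 3-3 for a given period."""
    tau = rd * cg
    decay = math.exp(-period / (4 * tau))
    return v_dd * (1 - decay) + v_dd / (1 + math.exp(period / (2 * tau))) * decay

V_DD = 2.5     # supply voltage in volts, as used in the text
RD = 1.0e3     # hypothetical equivalent drive resistance in ohms
CG = 50e-15    # hypothetical equivalent input capacitance in farads

for v_th in (1.5, 1.8, 2.0):
    period = period_from_threshold(v_th, V_DD, RD, CG)
    recovered = threshold_from_period(period, V_DD, RD, CG)
    print(f"Vth = {v_th:.2f} V  T = {period * 1e12:.1f} ps  Eq. 3-3 returns {recovered:.4f} V")
```

If the two equations agree, the last column reproduces the threshold voltage that was fed in.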

According to [Dai03], the circuit noise can be viewed as equivalent to a variation in the threshold voltage, $V_{th}$. Thus, to minimize noise, we want $\frac{dT}{dV_{th}}$ to be as small as possible. Solving for the corresponding $V_{th}$ from Equation 3-4, we get

$$V_{th} \simeq 0.8 V_{dd} = 2.0\ \mathrm{V}.$$
(3-5)
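The optimum can be explored numerically. The sketch below scans the ratio $v = V_{th}/V_{dd}$ and looks for the smallest fractional sensitivity $(1/T)\,dT/dV_{th}$ of Equation 3-4. Reading the noise criterion as a fractional (period-normalized) sensitivity is an assumption made here, because the text does not spell out the normalization; the function names are illustrative only.

```python
import math

def normalized_period(v):
    """T / (4 Rd Cg) from Equation 3-4, with v = Vth / Vdd and 0.5 < v < 1."""
    root = math.sqrt((3 - 2 * v) * (2 * v - 1))
    return math.log((1 + root) / (2 * (1 - v)))

def relative_sensitivity(v, dv=1e-6):
    """(1/T) dT/dv, with the derivative estimated by a central difference."""
    slope = (normalized_period(v + dv) - normalized_period(v - dv)) / (2 * dv)
    return slope / normalized_period(v)

# Scan the valid range and keep the ratio with the smallest sensitivity.
grid = [0.55 + 0.001 * i for i in range(400)]
best_ratio = min(grid, key=relative_sensitivity)

print(f"least sensitive Vth/Vdd is about {best_ratio:.3f}")
print(f"for Vdd = 2.5 V this is about {best_ratio * 2.5:.2f} V")
```

Under this assumption the scan lands close to the $0.8 V_{dd}$ quoted in Equation 3-5. The sensitivity grows without bound toward both ends of the valid range, so the minimum is an interior one.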

Combining it with Equation 3-2, we have

$$\frac{W_{M3}}{L_{M3}} \cdot \frac{L_{M2}}{W_{M2}} \cong 2.1 .$$
 (3-6)

For a 3-stage oscillator, results can be obtained using similar analysis. Optimal  $V_{th}$  for a 3-stage oscillator is  $V_{th}$ =1.8V. Therefore,

$$\frac{W_{M3}}{L_{M3}} \cdot \frac{L_{M2}}{W_{M2}} \cong 1.6 .$$
 (3-7)

Equation 3-6 and Equation 3-7 only give the optimal size relationship between the latch and inverter pairs. We thus use them as a guideline and use HSpice to find the transistor sizes that give both similar oscillator gains and optimal noise performance. Table 3-3 lists the device sizes for the VCO cells we use in our design.

| Cell                    | Parameter | M1/M7 | M2/M8 | M3/M5 | M4/M6 | M9   | M10  |
|-------------------------|-----------|-------|-------|-------|-------|------|------|
| Delay cell, 2-stage VCO | M         | 3     | 3     | 3     | 3     | 2    | 2    |
|                         | W (µm)    | 35    | 14    | 15    | 6     | 52   | 21.4 |
|                         | L (µm)    | 0.24  | 0.24  | 0.24  | 0.24  | 0.24 | 0.24 |
| Delay cell, 3-stage VCO | M         | 12    | 12    | 2     | 2     | 2    | 2    |
|                         | W (µm)    | 8.75  | 3.5   | 10    | 4     | 17.5 | 7    |
|                         | L (µm)    | 0.24  | 0.24  | 0.24  | 0.24  | 0.24 | 0.24 |

Table 3-3. Device sizes for the VCO delay cells.

Figure 3-6 shows a sample HSpice plot of the 2-stage VCO output period vs. control voltage, confirming that $\frac{\Phi_{out}}{V_{ctrl}}$ is a monotonic function within the -0.1V to 2.5V control voltage range. The corner-case $K_{vco}$ values acquired from simulation are listed in Table 3-4. Notice that the $K_{vco}/M$ values of the two VCO types are very close to each other. This allows us to use one CP/LPF for both VCO blocks in our design.

Figure 3-6. Example of the 2-stage VCO output as a function of V<sub>ctrl</sub>.



Table 3-4. Corner case K<sub>vco</sub> values.

| Sampling frequency         | 2.0 GHz     | 1.49 GHz    | 1.49 GHz    | 1.0 GHz     |
|----------------------------|-------------|-------------|-------------|-------------|
| VCO type                   | 2-stage VCO | 2-stage VCO | 3-stage VCO | 3-stage VCO |
| K<sub>vco</sub> (MHz/V)    | 120         | 540         | 90          | 440         |
| M                          | 2000        | 1490        | 1490        | 1000        |
| K<sub>vco</sub>/M (MHz/V)  | 0.06        | 0.36        | 0.06        | 0.44        |
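The last row of Table 3-4 is simply $K_{vco}$ divided by the divider ratio M. A minimal sketch that reproduces it from the table entries (the variable names are illustrative):

```python
# (VCO type, frequency in GHz, Kvco in MHz/V, divider ratio M), from Table 3-4
corners = [
    ("2-stage", 2.00, 120, 2000),
    ("2-stage", 1.49, 540, 1490),
    ("3-stage", 1.49, 90, 1490),
    ("3-stage", 1.00, 440, 1000),
]

# Kvco / M in MHz/V for every corner
normalized_gain = {(vco, freq): kvco / m for vco, freq, kvco, m in corners}

for (vco, freq), gain in normalized_gain.items():
    print(f"{vco} VCO at {freq:.2f} GHz: Kvco/M = {gain:.2f} MHz/V")

# Ratio between the largest and smallest loop-gain contribution
gain_spread = max(normalized_gain.values()) / min(normalized_gain.values())
print(f"max/min spread = {gain_spread:.1f}")
```

The two VCO types give nearly the same $K_{vco}/M$ at their low-gain corners and comparable values at their high-gain corners, which is what allows a single loop filter to serve both.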

#### **3.2.2 Design of VCO Switching Units**

As mentioned before, using one CP/LPF unit for multiple VCOs requires two extra switching units, as shown in Figure 3-1. We now analyze the design of both units. For simplicity, we use single-ended versions of the designs in the following analysis. Let us first look at Switch S1, whose schematic is shown in Figure 3-7. In the design, M1, M2, M3, and M4 form a pass-gate MUX structure. According to the select signal SEL, the input VCO control signal is routed to either the A or the B output. In order to reduce the cross-coupling noise generated by the unused VCO unit, we want to shut it off completely. Therefore, transistors M5 and M6 are added so that when a VCO is not selected, its control voltages are set to their reversed maximum values, which shuts the VCO off completely. Because S1 connects the LPF unit and the VCOs, its parasitic resistance adds to the LPF. Thus, it is extremely important to keep the channel resistance low so as not to alter the LPF gain. In the technology we are using, a minimum-size (W/L = 1), fully-on NMOS has an equivalent resistance of $13k\Omega$, and a PMOS $31k\Omega$. Using minimum-size transistors for S1 would therefore break the linear model we use.

Figure 3-7. Schematic of VCOs switch unit S1 (single-ended version).



Table 3-5. Device sizes for the switch unit S1.

| Devices               | M1/M3 | M2/M4 | M5/M6 |
|-----------------------|-------|-------|-------|
| M                     | 12    | 12    | 12    |
| W (µm)                | 10    | 25    | 2     |
| L (µm)                | 0.24  | 0.24  | 0.24  |
| $R_{eq}$ ( $\Omega$ ) | 26    | 25    | 130   |

To lower the equivalent resistance, $R_{eq}$, we can simply increase the transistors' W/L ratio, since $R_{eq}$ is inversely proportional to W/L. The target $R_{eq}$ is chosen to be less than 30 Ohm, which is comparable to the interconnect wire resistance. Increasing the device sizes also increases their parasitic capacitances. In our final design, the overall parasitic capacitance of Switch S1 is approximately 210fF, which can be neglected because $C_p$ in the LPF unit is in the picofarad range and hence dominates. The final device sizes are shown in Table 3-5. Notice that M5 and M6 are small. They are not on the LPF path, so their sizes do not affect the LPF gain and can be kept small.
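The $R_{eq}$ row of Table 3-5 follows from scaling the unit-device resistances quoted above by the total W/L. The sketch below reproduces it; which devices are NMOS and which are PMOS is an assumption inferred from the table values rather than read from the schematic, and the helper names are hypothetical.

```python
R_UNIT_NMOS = 13e3   # ohms, fully-on NMOS with W/L = 1 (from the text)
R_UNIT_PMOS = 31e3   # ohms, fully-on PMOS with W/L = 1 (from the text)

def r_eq(r_unit, width_um, length_um, multiplier):
    """On-resistance scaled by the total W/L of `multiplier` parallel fingers."""
    return r_unit * length_um / (width_um * multiplier)

def min_width_um(r_unit, r_target, length_um, multiplier):
    """Smallest finger width that keeps the on-resistance at or below r_target."""
    return r_unit * length_um / (r_target * multiplier)

# Device types below are assumed: they are the ones that match Table 3-5.
switch_s1 = {
    "M1/M3": r_eq(R_UNIT_NMOS, 10, 0.24, 12),
    "M2/M4": r_eq(R_UNIT_PMOS, 25, 0.24, 12),
    "M5/M6": r_eq(R_UNIT_NMOS, 2, 0.24, 12),
}
for name, resistance in switch_s1.items():
    print(f"{name}: Req = {resistance:.1f} ohm")

# Finger widths needed to stay under the 30 ohm target with M = 12
print(f"NMOS: W >= {min_width_um(R_UNIT_NMOS, 30, 0.24, 12):.2f} um")
print(f"PMOS: W >= {min_width_um(R_UNIT_PMOS, 30, 0.24, 12):.2f} um")
```

The same scaling shows why minimum-size devices are unusable on the LPF path: at W/L = 1 the switch would add tens of kilohms in series with the filter.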

Figure 3-8. Schematic of VCOs switch unit S2 (single-ended version).



Table 3-6. Device sizes of VCOs switch unit S2.

| Devices | M1/M3 | M2/M4 |
|---------|-------|-------|
| M       | 1     | 1     |
| W(µm)   | 2     | 5     |
| L(µm)   | 0.24  | 0.24  |

Switch unit S2 uses a pass-gate structure similar to that of S1. Its schematic is shown in Figure 3-8. Since S2 passes high-speed signals, we need to keep its input parasitic capacitance as low as possible so that it does not add extra load to the VCOs and change their gains. Thus, small devices are used in S2; their sizes are listed in Table 3-6.

Once we have constructed the VCO unit and acquired the  $K_{vco}$  values, we can move on to the charge pump and loop filter designs.

## **3.3 Charge Pump and Loop Filter**

From Equations 2-9, 2-11, and 2-16 we can see that, for fixed $\omega_{lpf}$, $\omega_z$, and $\omega_p$, the charge pump current $I_{cp}$ is inversely proportional to R, while R itself is inversely proportional to $C_p$ and $C_2$. Since, in the process technology we are using, capacitors are expensive devices to make, we want to avoid very large capacitors. To reduce the sizes of $C_p$ and $C_2$ while keeping $\omega_{lpf}$, $\omega_z$, and $\omega_p$ unchanged, we can cut down the charge pump current and increase the R value. Since resistors also use a significant amount of area, it is wise to choose the values that give the best overall area usage.

#### **3.3.1 Design Analysis**

Before carrying on with further calculations, we need to choose an appropriate loop bandwidth for the filter. A rule-of-thumb selection is to make the loop bandwidth 1/10 of the PFD update rate [Razavi02], which is equal to the input reference frequency in our case. Therefore, $\omega_{lpf} = \frac{2\pi f_{ref}}{10} = 2\pi \times 10^5\ \mathrm{rad/s}$. Thus, we have $\omega_z = \frac{1}{2}\pi \times 10^5\ \mathrm{rad/s}$ and $\omega_p = 8\pi \times 10^5\ \mathrm{rad/s}$. The lock-in time for a PLL with a PFD is approximated by Equation 3-8, in which $f_{eo}$ is the initial frequency error, N is the feedback divider ratio, and K is the loop bandwidth [Wolaver91].

$$T_{\rm P} \cong \frac{8f_{\rm eo}}{\rm NK^2}.$$
(3-8)

In our case, the worst case frequency error is 1.0 GHz, and the divider ratio is 1000. Therefore, the lock-in time is approximately  $800\mu s$ , which meets our design requirement.
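The arithmetic behind this estimate can be sketched as follows. The quoted figure is reproduced when K is taken as $10^5\ \mathrm{s^{-1}}$, that is $f_{ref}/10$ without the $2\pi$ factor; that reading of K is an assumption made here, and the function name is illustrative.

```python
def lock_in_time(f_error_hz, divider_ratio, loop_bandwidth):
    """Equation 3-8: T_P ~ 8 * f_eo / (N * K**2)."""
    return 8 * f_error_hz / (divider_ratio * loop_bandwidth ** 2)

F_ERROR = 1.0e9    # worst-case initial frequency error in Hz (from the text)
N_DIV = 1000       # feedback divider ratio (from the text)
K_LOOP = 1.0e5     # loop bandwidth, assumed to be f_ref / 10 = 1e5 per second

t_lock = lock_in_time(F_ERROR, N_DIV, K_LOOP)
print(f"estimated lock-in time: {t_lock * 1e6:.0f} us")
```

Because K enters squared, the estimate is very sensitive to the bandwidth convention: with K expressed as $2\pi \times 10^5$ rad/s, the same formula gives roughly 20 µs, so the 800 µs figure is the more conservative of the two readings.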

With all the above data given, we can now calculate the values of the R and C components and of the charge pump current. Since the maximum overshoot of the control voltage occurs when the loop gain is at its minimum, we use the minimum $K_{vco}/M$, 0.06 MHz/V, for the component calculations. Table 3-7 shows the values we use in our design.

Table 3-7. CP/LPF design values

| I<sub>cp</sub> | R      | C<sub>p</sub> | C<sub>2</sub> |
|----------------|--------|---------------|---------------|
| 10 µA          | 157 kΩ | 255 pF        | 16 pF         |

Since the CP/LPF design determines the functional correctness of a PLL, an incorrect calculation almost certainly guarantees the failure of the design. We therefore used MATLAB Simulink to verify the loop filter design. Simulink allows us to create a mathematical model of the system and simulate its behavior, and it runs far faster than a Spice simulation. It also comes with a generic PLL model that we can modify to suit our design. Figure 3-9 and Figure 3-10 show the modified PLL model we used in Simulink and one example of its output, in which fr=1.0 MHz and fq=100MHz. With all four $K_{vco}$ corner cases of Table 3-4 simulated, the choice of loop filter component values is verified. Now, we can continue with the circuit implementation.





Figure 3-10. Simulink simulation output for N=1000.



#### **3.3.2 Design Implementation**

The hardest decision to make for the LPF implementation is what type of capacitor structure to use. In the process technology we are using, we have the options of using a special Poly-Insulator-Poly (PiP) or Metal-Insulator-Metal (MiM) structure to build high-precision capacitors, or using a MOS structure for area-efficient but substrate-noise-sensitive capacitors. Since the MOS capacitor structure gives 6 times the capacitance per unit area of the PiP/MiM structure, we decided to use it in our design. Furthermore, this leads to a more portable design, since the PiP/MiM structure is not available in some processes. To reduce the noise effects, we use the following techniques in the layout design:

- place the capacitors as far away as possible from all digital units, including VCO and the PFD
- place double guard-rings around the capacitor units
- use dedicated analog power and ground supplies

A MOS capacitor structure is shown in Figure 3-11. The overall capacitance is the gate capacitance when the MOS is turned on plus the parasitic capacitances that exist on the source and drain terminals. When the gate capacitance is substantially larger than parasitic capacitances, the overall capacitance can be approximated by  $C=C_{ox}WL$ . Therefore, we want to construct the device with large gate length and small junction area.

Figure 3-11. MOS capacitor structure and its characteristic plot.



From the characteristic plot we see that the capacitance remains constant once the device is on. HSpice simulation shows that the turn-on voltage is 1V for a PMOS device and 0.7V for an NMOS device. Due to the body effect, these are larger than the devices' threshold voltages, as expected. Since the differential VCO control voltage ranges are 1.2-2.5V for $V_{ctrl}$+ and 0-1.3V for $V_{ctrl}$-, we use PMOS capacitors on the $V_{ctrl}$+ path and NMOS ones on the $V_{ctrl}$- path to ensure the devices are on during normal operation.

For the charge pump unit, we use the design suggested in [Li00]. The complete schematic of the CP/LPF unit is shown in Figure 3-12 and the device sizes are listed in Table 3-8. In this design, transistors M1 to M12 form a charge pump, which takes the differential input signals UP, UP/, DN, and DN/ from the PFD units. Controlled by the input signals, transistors M9 to M12 act as current steering switches, deciding which of the four charge-pump current paths are on. Since they do not draw much current, these four transistors are designed to be weak devices. Keeping them weak also helps reduce the load on the input signals, which allows for faster switching. The amount of current pumped into or drained from the LPF units is regulated by the current mirror circuit formed by transistors M1 to M8, with the current source shown in Figure 3-13. The current source circuit is a modified $V_t$-referenced self-biased cascode design, which offers good power supply rejection and a large voltage swing.

Figure 3-12. Differential CP/LPF unit.



Table 3-8. Device sizes of CP/LPF unit.

| Devices | M1/M2 | M3/M4 | M5/M6 | M7/M8 | M9/M10 | M11/M12 | M13  | M14  | M15  | M16  |
|---------|-------|-------|-------|-------|--------|---------|------|------|------|------|
| m       | 1     | 1     | 1     | 1     | 1      | 1       | 5    | 10   | 10   | 5    |
| W(µm)   | 24    | 24    | 9.6   | 9.6   | 0.48   | 0.48    | 24   | 50   | 20   | 9.6  |
| L(µm)   | 0.96  | 1.44  | 1.44  | 0.96  | 0.48   | 0.96    | 0.96 | 0.24 | 0.24 | 0.96 |

Figure 3-13. Current source for CP/LPF unit.



Table 3-9. Device sizes of current source.

| Devices | M1/M2 | M3/M4 | M5/M6/M8 | M7   | M9/M10 | M11/M12 | M13/M14/M16 | M15  |
|---------|-------|-------|----------|------|--------|---------|-------------|------|
| m       | 1     | 1     | 1        | 1    | 1      | 1       | 1           | 1    |
| W(µm)   | 30    | 6     | 24       | 24   | 75     | 2.4     | 9.6         | 9.6  |
| L(µm)   | 0.96  | 1.44  | 0.96     | 1.44 | 0.96   | 1.44    | 0.96        | 1.44 |

As shown in Figure 3-13, the value of the current $I_{cs}$ is given by

$$I_{cs} = I_{css} = \frac{V_{TN}}{R}.$$
 (3-9)

For a current mirror system, we have

$$I_{out} \cdot \left(\frac{W}{L}\right)_{REF} = I_{REF} \cdot \left(\frac{W}{L}\right)_{out}.$$
 (3-10)

Applying it to our design, as an example, we have $I_{out}=I_{cp}$, $I_{REF}=I_{cs}$, $M_{out}=M1/M3$ in Figure 3-12, and $M_{REF}=M8/M7$ in Figure 3-13. The relationship applies to each corresponding device pair. By choosing the correct device size ratios, a $10\mu A$ charge-pump current can be generated.
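Equation 3-10 can be checked against the listed device sizes. The sketch below pairs M1 with M8 and M3 with M7, which is one reading of the "M1/M3" and "M8/M7" notation above, and assumes a 10 µA reference current; both choices are assumptions made for illustration.

```python
def mirror_current(i_ref, wl_ref, wl_out):
    """Equation 3-10 solved for I_out: scale I_ref by the W/L ratio."""
    return i_ref * wl_out / wl_ref

def w_over_l(width_um, length_um, multiplier=1):
    return multiplier * width_um / length_um

I_CS = 10e-6   # reference current in amperes (assumed equal to the 10 uA target)

# (output device from Table 3-8, reference device from Table 3-9)
pairs = {
    "M1 vs M8": (w_over_l(24, 0.96), w_over_l(24, 0.96)),
    "M3 vs M7": (w_over_l(24, 1.44), w_over_l(24, 1.44)),
}

for name, (wl_out, wl_ref) in pairs.items():
    i_cp = mirror_current(I_CS, wl_ref, wl_out)
    print(f"{name}: I_cp = {i_cp * 1e6:.1f} uA")
```

With these pairings the mirror ratio is 1:1, so the charge-pump current simply equals the reference current.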

Continuing with the CP/LPF design, we see that a differential amplifier, U1, as well as two 281 kOhm resistors are also included in the design. They act as a common-mode feedback circuit to maintain the common mode voltage level of the differential output pair. The design schematic is shown in Figure 3-14. Large value resistors are used so that they won't draw current levels significant enough to alter the loop gain values. The final stages of the CP/LPF are two source-follower units. Their purpose is to shift the LPF output voltages to within the VCO control voltage range requirements, which are from 1.2V to 2.5V for  $V_{ctrl}$ + and from 0V to 1.3V for  $V_{ctrl}$ -. The amount of voltage shift is equal to the  $V_{gs}$  of the device, which can be found by solving the equation

$$I_{d} = \frac{K'W}{2L}(V_{gs} - V_{T})^{2}(1 + \lambda V_{ds}).$$
 (3-11)
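Equation 3-11 can be inverted in closed form for $V_{gs}$. The sketch below does so and substitutes the result back as a check. All numeric values (`K_PRIME`, `V_T`, the W/L ratio, and the bias current) are hypothetical placeholders, not process or design data from this work.

```python
import math

def drain_current(v_gs, k_prime, w_over_l, v_t, lam=0.0, v_ds=0.0):
    """Equation 3-11: saturation current with channel-length modulation."""
    return 0.5 * k_prime * w_over_l * (v_gs - v_t) ** 2 * (1 + lam * v_ds)

def gate_source_voltage(i_d, k_prime, w_over_l, v_t, lam=0.0, v_ds=0.0):
    """Equation 3-11 solved for V_gs, taking the root with V_gs > V_T."""
    overdrive_squared = 2 * i_d / (k_prime * w_over_l * (1 + lam * v_ds))
    return v_t + math.sqrt(overdrive_squared)

K_PRIME = 120e-6   # hypothetical process transconductance in A/V^2
V_T = 0.5          # hypothetical threshold voltage in volts
W_OVER_L = 20.0    # hypothetical source-follower W/L
I_BIAS = 100e-6    # hypothetical bias current in amperes

shift = gate_source_voltage(I_BIAS, K_PRIME, W_OVER_L, V_T)
print(f"level shift Vgs = {shift:.3f} V")
print(f"check: Id = {drain_current(shift, K_PRIME, W_OVER_L, V_T) * 1e6:.1f} uA")
```

For a fixed bias current, a wider device gives a smaller overdrive and therefore a smaller shift, which is the knob available for placing the outputs inside the required control voltage ranges.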

Figure 3-14. Common-mode feedback amplifier.



Table 3-10. Device sizes for common-mode feedback amplifier.

| Devices | M1   | M2   | M3/M4 | M5/M6 | M7/M8 |
|---------|------|------|-------|-------|-------|
| m       | 1    | 1    | 24    | 1     | 1     |
| W(µm)   | 40   | 40   | 9.6   | 9.6   | 9.6   |
| L(µm)   | 0.96 | 1.44 | 0.96  | 1.44  | 0.96  |

## **3.4 Phase-Frequency Detector**

Since the PFD update rate is 1.0MHz, which is in the low frequency range, the circuit implementation of the functional block shown in Figure 2-3 is rather straightforward. The design we use is from [Maneatis96] with added inverter delay units to eliminate the hazard conditions and the dead zone issue, as suggested in [Dai03]. The schematic diagram and device sizes (with L= $0.24 \mu m$ ) are shown in Figure 3-15.



Figure 3-15. Schematic for phase-frequency detector unit.

Notice that the outputs of the PFD are UP/ and DN/. Since the CP/LPF requires UP, DN, UP/, and DN/ signals, two single-ended-to-differential signal converters are used to generate the UP and DN signals. The schematic and gate sizes of the converter are shown in Figure 3-16. Again, we use L= $0.24 \mu m$  for the devices.

Figure 3-16. Schematic of single-ended-to-differential signal converter.



In the signal converter design, extra inverters and a pass gate are used to ensure minimum delay between the complementary outputs.

## **3.5 PLL Simulation**

To verify the functionality of the PLL block, we can complete the loop with fixed divider counters and then perform time-domain simulations on the system. It is worthwhile to do this before finishing the complete frequency synthesizer design in order to reduce the simulation time. Because the PLL unit is an analog device and the VCO units produce high-frequency outputs, a small simulation time step is required to achieve high accuracy. However, the PLL locking time is extremely long compared to the output period: microseconds vs. picoseconds in this case. A full time-domain simulation therefore requires a huge amount of CPU time. To cut down the simulation time, we use fixed dividers in the feedback loop rather than the more complicated programmable divider unit. Thus, the simulation can be done before the programmable divider is built. The following are some sample waveforms captured from the HSpice simulation results.





Figure 3-19. Simulated control voltage after the PLL is locked.

From Figure 3-19 we can see two noise components on the output signals, one with a 1 $\mu$s period and the other with a 10 $\mu$s period. The high-frequency component is coupled from the phase-frequency detector output pulses that are used to eliminate the "dead zone", and the low-frequency component is noise passed through the low-pass filter, which has a bandwidth of $\omega_{lpf} = \frac{2\pi f_{ref}}{10} = 2\pi \times 10^5\ \mathrm{rad/s}$.

## **3.6 Physical Layout**

As mentioned at the beginning of this chapter, the PLL is an analog system with tight constraints, including substrate and power noise rejection, power dissipation and crosstalk. Hence, the design needs to be done using a full custom design flow. The flow we use is shown in Figure 3-20.





In this custom flow, we have created scripts, including scripts for DRC, LVS, ERC, and simulation, as well as global rule files to minimize human interaction with the flow.

The following is a list of guidelines we used during the PLL design.

- fold wide transistors to reduce S/D junction area and gate resistance;
- lay out differential circuits symmetrically to suppress the effects of common-mode noise and even-order nonlinearity;
- distribute digital signals in complementary form to reduce the net amount of coupled noise;
- floorplan the noise-sensitive analog elements away from digital noise sources;
- use dedicated analog power and ground supplies for the analog elements;
- place double guard rings around all sensitive circuits and noise-source circuits;
- fill unused area with decoupling capacitors to reduce power and ground noise.

The plot of the PLL block layout is shown in Figure 3-21. It also indicates the final placement of each sub-block. Notice that the loop filter occupies more than half the PLL block area. Therefore, using one set of CP/LPF with multiple VCOs greatly reduces the overall chip size in our design.





## **3.7 Conclusions**

In this chapter, we have demonstrated the design and circuit analysis of a PLL block. Since a PLL is a complex analog device, this design provided challenges to the designer in analog component design. These included the consideration of effects from digital elements in a typical deep sub-micron CMOS process technology. The analog components design includes design of differential operational amplifiers, analog filters, a charge pump, a phase-frequency detector, current mirrors, a common-mode feedback loop, and voltage-controlled oscillators.

Because of its complexity, the design of the PLL block uses a full custom design flow. Therefore, it not only helps the designer to understand the flow structure, but also verifies the proper construction of the design flow for future uses.

# 4 COUNTERS

## **4.1 Introduction**

As we have discussed in Chapter 2, the VCO output frequency  $f_{out}$  and the feedback frequency  $f_{fb}$  have the relationship that  $f_{fb} = f_{out} / M$ , where, M is the divisor of the feedback loop. Further, when the frequency synthesizer is in the locked state,  $f_{ref} = f_{fb}$ . Therefore, when the system is locked,  $f_{out} = M \times f_{ref}$ . By changing the value of M, the synthesizer can generate a range of frequencies. A programmable frequency divider is used to generate the divisor, M. It can be as simple as a high speed digital programmable counter, which is what we use in our design.

The counter design we use for the frequency synthesizer is the so-called pulse swallow frequency divider. Its block diagram is shown in Figure 4-1. The circuit is governed by the following equation:

$$M_{count} = 2(P \cdot N + S)$$
(4-1)

In the equation, the S value is the number associated with the programmable counter. By changing the S value, the total number of counts of the system can be programmed. Since the output frequency range is divided between two VCOs, as explained in Chapter 3, two sets of P and S are needed. The calculated P and S values are shown in Table 4-1. Instead of building two separate sets of circuits, we used one general 1-256 programmable counter for S and a 249/372 variable counter for P. Based on the external inputs, a decoder control circuit chooses which counting modes the counters should be in.

Figure 4-1. Pulse swallow frequency divider.



Table 4-1. Counting values for the pulse swallow frequency divider.

| Frequency range | P (counts) | (N+1)/N (counts) | S (counts) | Total counts         |
|-----------------|------------|------------------|------------|----------------------|
| 1GHz - 1.49GHz  | 249        | 3/2              | 1 to 247   | 998 to 1490, step 2  |
| 1.49GHz - 2GHz  | 372        | 3/2              | 1 to 256   | 1490 to 2000, step 2 |
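The entries of Table 4-1 follow directly from Equation 4-1. The sketch below reproduces the end points of both ranges and shows how a swallow count S could be chosen for a target divide ratio; the helper names and the range labels are illustrative only.

```python
def total_count(p, n, s):
    """Equation 4-1: M_count = 2 * (P * N + S)."""
    return 2 * (p * n + s)

def swallow_setting(m_count, p, n=2):
    """Equation 4-1 solved for S; only even divide ratios are reachable."""
    if m_count % 2:
        raise ValueError("only even divide ratios are reachable")
    return m_count // 2 - p * n

# (P, smallest S, largest S) for the two frequency ranges of Table 4-1
RANGES = {"1GHz - 1.49GHz": (249, 1, 247), "1.49GHz - 2GHz": (372, 1, 256)}

for label, (p, s_min, s_max) in RANGES.items():
    low, high = total_count(p, 2, s_min), total_count(p, 2, s_max)
    print(f"{label}: P = {p}, total counts {low} to {high}, step 2")

# Example: the S value that gives a divide ratio of 1800 in the upper range
print("S for M_count = 1800:", swallow_setting(1800, 372))
```

Since each increment of S adds two counts, consecutive settings are spaced by two counts of the divide ratio.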

As one of our design goals, we wanted to test the performance of the standard cell set and the custom pads we were using. However, since the project was not intended to characterize the cell set, the test was limited to checking how fast the cells and the pads could be run reliably. For test purposes, it was therefore desirable to design a system that could generate a set of frequencies ranging from one at which the cells and pads would most likely work to one that would cause them to fail. For this reason, we put another programmable counter on the high frequency prescaler outputs. The complete block diagram for the counter unit with the test structure is shown in Figure 4-2. The signal  $f_{low}$  is used to test the performance of the standard cells and pads. Its value is defined as  $f_{low} = \frac{M1}{M2} \times f_{ref}$.

Figure 4-2. Block diagram of counter unit.



Currently we have two DFF designs on hand: one has a clock-to-Q delay of 180ps and the other 300ps. Since timing constraints are generally the biggest issue in a high speed digital design, to better expose these issues we chose to use the slower DFF, with a clock-to-Q delay of 300ps, in our counter design.

## 4.2 Design and Circuit Analysis

With the given  $f_{vco}(MAX)=2.0$ GHz, Figure 4-2 shows that the counters S and P are required to support an input clock of up to 500MHz. For the design of these counters, two types of counting schemes are commonly used: binary and one-hot. An implementation using the binary scheme provides greater area efficiency because it only requires n registers for  $2^n$  counts. However, due to the extra decoding logic overhead, such a design may not be suitable for high speed applications. On the other hand, a one-hot counter easily outperforms any other counter design due to its minimal decoding logic overhead. But since it requires one register for each counting state, the one-hot counting scheme becomes extremely area-expensive for a design with a large number of counting states. Because of the high speed and the number of counting states required, neither scheme provides a reasonable solution for our design. After carefully evaluating the counter's timing constraints and the characteristics of our cell set, we decided on a modified Möbius counter design that meets the performance constraints while providing good area efficiency.

### 4.2.1 Modified Möbius Counter

The bit pattern of a 4-bit Möbius counter is shown in Figure 4-3, in which each letter represents a counting state and the numbers in a state show the status of each bit. The Möbius counter has two advantages over a one-hot counter design: it occupies half the area and uses half the dynamic power. From the counting pattern we can see that an n-bit Möbius counter realizes 2n counts. Thus, using half the number of registers, a Möbius counter achieves the same number of counts as a one-hot counter, which cuts the design area in half. Further, since there is only one logic transition on each count rather than the two of the one-hot case, the Möbius counter consumes only half the switching power of the one-hot counter.





A Möbius counter can be implemented as a chain of DFFs, with the input of each DFF connected to the output of the previous one, except for one DFF, whose input is taken from the complemented output signal of the previous FF, as shown in Figure 4-4. A global reset signal shown in the schematic is required to set the counter to its initial state.
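A behavioral sketch of such a chain is given below. It models only the shift-with-inversion behavior described above, not the gate-level design; the bit ordering (index 0 is the DFF that receives the complemented feedback) and the state letters are assumptions chosen to match the description of the 4-bit counting pattern.

```python
def mobius_step(bits):
    """One clock edge: every DFF loads its predecessor; the first loads NOT(last)."""
    return [1 - bits[-1]] + bits[:-1]

def mobius_sequence(n_bits):
    """States visited over one full cycle, starting from the global reset state."""
    state = [0] * n_bits
    sequence = []
    for _ in range(2 * n_bits):
        sequence.append(tuple(state))
        state = mobius_step(state)
    return sequence

states = mobius_sequence(4)
for letter, state in zip("ABCDEFGH", states):
    print(letter, "".join(str(bit) for bit in state))

# Number of bits that toggle on each count, including the wrap-around to state A
toggles = [sum(a != b for a, b in zip(states[i], states[(i + 1) % len(states)]))
           for i in range(len(states))]
print("bits toggled per count:", toggles)
```

The run illustrates the two properties used in the text: n flip-flops give 2n distinct states, and exactly one bit changes on every count.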

Figure 4-4. Schematic of a 4-bit Möbius counter.



### **4.2.2 Circuit Implementation**

From its unique counting pattern, the current state of the counter can be found by detecting the "edge" of the bit pattern. Here we define the falling edge of the pattern as a transition in which a bit completes switching from the '1' to the '0' state, while the rising edge is the opposite. Since only one transition occurs in each count, we can simply compare two consecutive bits to detect the edge of the pattern. For instance, if we see Bit [2] = '1' and Bit [3] = '0', we know the bit pattern's rising edge has arrived at State D, and therefore the counter is in State D. As another example, when both Bit [3] and Bit [0] are in the logic '0' state, we know the falling edge has arrived at State A, and so has the counter. The detection circuit attached to each DFF can be implemented with two NAND gates, one for each edge. Two types of detection circuit are shown in Figure 4-5. The special unit is for the DFF whose input is connected to the complemented output of the previous one, and the general unit is for the other DFFs. A NAND gate compares the input and the complemented output of a DFF. If the specific pattern edge has arrived at the input, the NAND outputs '0'; otherwise, its output remains '1'. Since a NAND gate can only detect one type of edge, two gates are required to take care of both rising and falling edges. Indeed, with the added detection units, the Möbius counter's output becomes one-hot (or one-cold, in our case). This lets us directly replace an existing one-hot design with ours as long as the timing requirement is fulfilled.
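The detection scheme can be sketched behaviorally: for each DFF, one NAND looks at (input, complemented output) and a second at (complemented input, output). This is a functional model under the same assumed bit ordering as before (index 0 is the DFF fed by the complemented feedback); it does not reproduce the exact gate wiring of Figure 4-5.

```python
def nand(a, b):
    return 1 - (a & b)

def chain_states(n_bits):
    """Counting states of an n-bit Möbius chain, starting from all zeros."""
    state, sequence = [0] * n_bits, []
    for _ in range(2 * n_bits):
        sequence.append(list(state))
        state = [1 - state[-1]] + state[:-1]
    return sequence

def detect(q):
    """Return 2n active-low outputs; output i is '0' only in counting state i."""
    n = len(q)
    d = [1 - q[-1]] + q[:-1]   # D input of every flip-flop in the current state
    first_half = [nand(d[k], 1 - q[k]) for k in range(n)]    # input 1, output 0
    second_half = [nand(1 - d[k], q[k]) for k in range(n)]   # input 0, output 1
    return first_half + second_half

for letter, q in zip("ABCDEFGH", chain_states(4)):
    print(letter, q, detect(q))
```

In every state exactly one output is low, which is the one-cold behavior described above. For state D the low output comes from comparing Bit [2] = '1' with Bit [3] = '0', and for state A from Bit [3] = '0' together with Bit [0] = '0'.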

### **4.2.3 Timing Analysis**

Using extra detection circuits increases the counter's clock-to-Q delay. Since the detection circuits are identical to each other, this extra delay is independent of the present state of the counter. Thus, regardless of the counting state, the overall clock-to-Q delay is equal to a DFF's clock-to-Q delay plus a maximum of one two-input NAND gate propagation delay. With approximately 120ps of propagation delay on our NAND gate, it gives around 420ps overall clock-to-Q delay. In contrast, in a binary counter, in which the decoders are normally built using multi-level logic using gates with more than two inputs, the propagation delays are different depending on the current state. So the overall clock-to-Q delay has to accommodate the worst case delay, which is expected to be much larger than the clock-to-Q value in our design.

Figure 4-5. Pattern edge detection circuit.



Both the P and S counters in our design need to count more than 200 states. Using the Möbius counter design, each counter would still use more than 100 registers. To reduce the counter area, a two-stage, self-timed Möbius counter circuit is used. The schematic of a simple example is shown in Figure 4-6. The counter contains two Möbius chains, with one chain taking its clock signal from the flip output of the other. Therefore, this example counter counts up to 8×8=64 counts. This gain comes with a corresponding sacrifice in the counter's performance. The critical path of the system is highlighted in the schematic; it now goes from the clock input of the bottom chain to one of the top chain outputs. The delay is estimated to be about 1ns, or twice the delay of one chain plus the delay of the buffers that are used to handle the heavy fan-out on the second clock tree. With the 2-stage structure, only 16 DFFs are required for the 256-state counter and 20 DFFs for the 249/372-state counter. Therefore, the structure gives great reductions in area. The schematics of the S and P counter designs are shown in Figures 4-7 and 4-8, respectively.

Figure 4-6. 64-count 2-stage Möbius counter.
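The register counts quoted for the different schemes can be compared with a few lines of arithmetic. The two-stage figure below assumes two chains of equal length, each contributing 2n states; that split is an assumption, since other chain lengths could reach the same totals.

```python
import math

def dffs_binary(states):
    return math.ceil(math.log2(states))

def dffs_one_hot(states):
    return states

def dffs_mobius(states):
    """A single n-bit chain gives 2n states."""
    return math.ceil(states / 2)

def dffs_two_stage_mobius(states):
    """Two equal chains of n DFFs each give (2n)**2 states (assumed equal split)."""
    n = 1
    while (2 * n) ** 2 < states:
        n += 1
    return 2 * n

for states in (64, 256, 372):
    print(f"{states:>3} states: binary {dffs_binary(states)}, "
          f"one-hot {dffs_one_hot(states)}, Möbius {dffs_mobius(states)}, "
          f"two-stage Möbius {dffs_two_stage_mobius(states)}")
```

The two-stage structure sits between the binary and single-chain extremes in register count while keeping only two-input detection gates per state.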

To reduce the overall clock-to-Q delays, the outputs of the S and P counters are register buffered, as shown in Figure 4-7 and Figure 4-8. With the added registers at the output, the overall clock-to-Q delay should again be close to a register's clock-to-Q delay. However, the fastest clock rate that the system can support is still restricted by the internal critical path delay. Careful calculation and simulation showed that the overall delay meets the system speed requirement.

Figure 4-7. Schematic of 256-state programmable



To make the counter programmable, each output of the counter is compared with the corresponding external input by using an XOR gate. For a 256-state counter, two groups of 16 pins are required. We use two design schemes to reduce the number of external pins needed for counter programming: binary coding and a shift register system. In a binary coding system, a decoder converts the input binary digits into one-hot digits. Therefore, 8 I/O pins are needed for each 256-state counter. To further reduce the total number of pins, a shift register system is inserted in front of the decoder inputs. With 4 pins, TRST, TCK, TIN, and TOUT, we can then load any number of digits into the chip. Since these circuits run in the low-frequency digital regime, we decided to use a standard cell design flow for them. Detailed information is given in Section 4.4.

Figure 4-8. (249/372)-state fixed
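The pin-reduction idea can be illustrated with a small behavioral model: a serial shift register collects an 8-bit word, and two 4-to-16 decoders expand it into the two 16-line one-hot groups. The bit order, the assignment of nibbles to groups, and all function names are hypothetical; only the overall structure (shift register, then decoder) comes from the text.

```python
def one_hot(value, width=16):
    """4-to-16 decode: a list with a single '1' at position `value`."""
    if not 0 <= value < width:
        raise ValueError("value out of range")
    return [1 if i == value else 0 for i in range(width)]

def decode_program_word(word):
    """Split an 8-bit word into two nibbles and decode each one separately."""
    if not 0 <= word < 256:
        raise ValueError("word must fit in 8 bits")
    return one_hot(word & 0xF), one_hot(word >> 4)

def shift_in(bits, length=8):
    """Serial load, one bit per clock; the first bit shifted in ends up as the MSB."""
    register = 0
    for bit in bits:
        register = ((register << 1) | bit) & ((1 << length) - 1)
    return register

word = shift_in([1, 0, 1, 0, 0, 0, 1, 1])
low_group, high_group = decode_program_word(word)
print(f"word = {word:#04x}")
print("low group :", low_group)
print("high group:", high_group)
```

In this model, 32 comparison inputs are driven from a single serial data pin plus clock and reset, which is the trade made above between pin count and programming time.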

Figure 4-9. 3/2 prescaler.



Since the clock-to-Q delay of the DFF is 300ps, fast enough to carry a 2GHz clock signal, the DFF is used directly to construct the divide-by-two prescaler, whose schematic is therefore not shown here. The schematic of the 3/2 prescaler is shown in Figure 4-9. Because any glitch during mode switching is unacceptable, we have designed this counter with the unique counting pattern shown in the figure, which eliminates the glitch issue. The details of the design implementation may be obtained from the figure.

## 4.3 Simulation and Measurement

Circuit simulations for a typical high speed digital circuit should include functional simulation and timing simulation. Functional simulation is intended to verify the digital circuit's functionality under all possible input/output conditions. It is normally done at the structural level, in which circuits under test are described using an HDL and are simulated using tools such as ModelSim. These simulations are performed in the discrete time domain, and detailed information such as interconnect delays and parasitic components is usually neglected. This helps generate simulation results quickly. To further analyze the circuit behavior for issues including but not limited to race conditions and critical path delays, a detailed timing simulation is needed.

Initially, we took it for granted that the counter unit was simple enough that functional simulation could be skipped. We performed only special-case functional checks in HSpice while doing timing simulations. However, doing so greatly increases the chance of design failure: the counter unit has a large number of states, and some input/output combinations may cause the unit to malfunction. Such problems may not surface until a complete check is done. Unfortunately, we did not realize this until the final stage of the design, when the tape-out date was approaching. Since we did not have time to go back and fix the issue, we had to accept a risk that should not have existed in the first place. We ran more than 20 timing simulations with different input setups on the counter units, and all of them indicated proper functioning of the unit. Again, this cannot guarantee functional correctness under all input/output conditions; it only gives us some confidence that the final design functions properly. The following figures show a few waveforms captured from the Spice simulation output. The measured clock-to-Q delay of the system is around 340 ps, close to the clock-to-Q delay of a single DFF module, which is around 300 ps. This agrees with our previous conclusion.



Figure 4-10. Sample waveforms from 1-256 programmable counter.

## **4.4 Physical Layout**

For speed reasons, we use a custom design flow for the prescalers and the other counter designs. However, the layout of low-performance circuit blocks such as the 4-to-16 decoders is done using parts of the standard cell design flow. Since the custom design flow is described in detail in Chapter 3, we do not repeat it here. Instead, let us look at the standard cell design flow used for this design. The idea behind a standard-cell-based design is to reuse a limited library of cells and to replace labor-intensive manual placement and routing with automated CAD placement and routing. Figure 4-11 shows a typical ASIC standard cell design flow, in which the grayed blocks are the ones we used in this design. The other blocks are not used because of either software access limitations or the simplicity of the design itself, except for the construction of the cell library, which had already been done prior to this design.

Here is the list of all the tools that are used for the flow. All the scripts and rule files that were created for this project are based on these tools. However, similar tools from other CAD tool vendors may be used as well.

- RTL coding: Verilog HDL;
- RTL simulation: ModelSim;
- logic synthesis and optimization: Design Compiler;
- placement and routing: SEDSM;
- detailed routing: SEDSM and MAX;
- post-layout static timing analysis: HSpice.

#### Figure 4-11. Typical Standard Cell Design



Notice that in the "Detailed routing" step, a custom layout tool, MAX, is used. The tool is used for manually inserting extra vias for design reliability reasons. Once the ASIC block is constructed, it is used in the custom design flow "as-is" for the rest of the design.

The following figure shows the layout plot of the counter design.



Figure 4-13. Layout plot of 8-bit shift register.



Figure 4-14. Layout plot of 4-16 decoder design.


# **4.5 Conclusions**

Due to the fast timing and large counting range requirements, traditional counter designs, including binary and one-hot designs, cannot provide reasonable solutions for our design. Therefore, we have come up with a new type of counter, called the modified Möbius counter. In this chapter, we have covered the concepts involved in this new design and its implementation. In high-speed digital design, a designer or design team often has to modify traditional circuit designs or come up with a new design in order to fulfill the design goals. Our counter design is a good demonstration of this. Also, to further test the designer's problem solving skills, we have tightened the timing constraint by using a low-speed DFF design in our counter implementation. Thus, it can be expected that better counter performance may be achieved by using a faster DFF design. At the end of this chapter, we have also briefly shown the use of a standard cell design flow. It has been used for the low-speed circuit blocks, including the 4-16 decoders.

# **5** PADS AND PACKAGING

## **5.1 Introduction**

The frequency synthesizer has been planned for fabrication so that we can verify the design implementation and its performance. For this purpose, we need a set of pads. Since most pads for the TSMC 0.25um process technology are proprietary intellectual property, many regulations and restrictions apply to their use. Thus, we decided to design our own pads and to make them publicly accessible in the future. After all, pad design falls into the mixed-signal design area. Furthermore, several design constraints associated with pad design, such as ESD protection, are good to know from a designer's viewpoint.

To shorten the design cycle, we decided to use the Tanner 0.25um pad package from MOSIS as a reference. All the low-speed pad designs that we need are included in the package. However, we also need to build a new high-speed digital pad to support the signals generated by the VCO block. In our design, signals with frequencies as high as 2.0 GHz must pass out of the package. It is therefore impossible to use traditional full-swing digital output pads for these signals. The reason is as follows. Assume each package pin has 5 nH of inductance and 10 pF of capacitance. For a 2 GHz signal with a 2.5 V swing, the switching current is:

$$I = C \frac{dV}{dt} = 10 pF \cdot \frac{2.5V}{250 ps} = 100 mA.$$

Therefore, even if we use dedicated voltage supply pins for these pads, the voltage drop on the supply becomes:

$$V_{drop} = L\frac{dI}{dt} = 5nH \cdot \frac{100mA}{250ps} = 2V$$

With a 2 V supply variation, the devices would be driven out of their operating range. To solve this problem, we design the circuit to lower the output signal swing to $100mV_{pp}$, which reduces the supply noise to 160 mV. Furthermore, we use a differential structure so that the remaining supply noise appears as common-mode noise and is rejected. We first look at the implementation of the high-speed differential pad. Other pad designs are covered in the latter part of this chapter.
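The estimate above can be reproduced numerically. The short Python sketch below is illustrative: the function names are ours, and it assumes a linear edge over 250 ps and a single switching output. Under these assumptions, a 100 mV swing gives roughly 80 mV of inductive drop per output, so the 160 mV figure quoted above would correspond to two such outputs sharing the supply pins; that reading is our assumption and is not stated in the text.

```python
L_PIN = 5e-9      # package pin inductance in henries (from the text)
C_PIN = 10e-12    # package pin capacitance in farads (from the text)
T_EDGE = 250e-12  # edge time in seconds: half the period of a 2 GHz signal


def switching_current(c_load, v_swing, t_edge):
    """I = C * dV/dt for a linear voltage edge."""
    return c_load * v_swing / t_edge


def supply_drop(l_pin, i_switch, t_edge):
    """V = L * dI/dt, assuming the current ramps over the same edge time."""
    return l_pin * i_switch / t_edge


def pad_noise(v_swing):
    """Return (switching current, inductive supply drop) for one output."""
    current = switching_current(C_PIN, v_swing, T_EDGE)
    return current, supply_drop(L_PIN, current, T_EDGE)


full_swing = pad_noise(2.5)  # about 100 mA and 2 V
low_swing = pad_noise(0.1)   # about 4 mA and 80 mV per output
```

Because both expressions are linear in the swing, the supply drop scales directly with the output swing, which is why reducing the swing is so effective.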

## **5.2 High Speed Differential Pads**

Because of the noise issue discussed above, it is common practice to use low-swing differential pads for high-speed signals. The pad circuit we use is shown in Figure 5-1. It is basically an ECL circuit structure. The output voltage swing is controlled by the supply voltages, which are provided through a pair of dedicated power pins. Using dedicated supply voltages 1) allows the user to set the desired output voltage swing; 2) increases the testability of the circuit; and 3) reduces power noise interference with the rest of the chip. From the schematic, one may argue that the output signals do not track each other during mode switching and are therefore not differential. Tracking during mode switching is not important in this case: when the voltage swing of a digital signal is reduced, its switching time becomes short relative to the signal period. The switching interval can therefore be neglected, and the signal pair can be considered differential.

Figure 5-1. Schematic of high speed

As shown in the schematic, two resistors are added in the output path. They act as source terminators to reduce bouncing of the high-speed signal. Since the output signal will be connected to a 50 Ohm cable, the output impedance of the driver should be as small as possible. Recall that $R_{eq}$ is inversely proportional to the W/L ratio. Choosing $R_{eq} = 20$ Ohm gives a reasonably small output impedance within an affordable chip area. To further reduce the size of the driver, the device sizes of M2/M4/M5/M7 are reduced. This is possible because the current flowing through transistor M4 is mirrored by the combination of (M1-M2) and compared with the drain current of M3, so the sizes of M2 and M4 can be reduced proportionally while maintaining the circuit functionality. The final circuit design has a gain of around 5. High gain is not necessary since the circuit carries digital-swing signals. The final device sizes of the design are shown in Table 5-1.

| Devices | M1/M6 | M2/M5 | M3/M8 | M4/M7 |
|---------|-------|-------|-------|-------|
| M       | 28    | 4     | 28    | 4     |
| W (µm)  | 14    | 14    | 5.6   | 5.6   |
| L (µm)  | 0.24  | 0.24  | 0.24  | 0.24  |

**Table 5-1.** Device sizes of high speed differential pad drivers.
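The role of the source terminators can be illustrated with the standard reflection coefficient of a transmission line. The sketch below is not taken from the design: the function names are ours, and the assumption that the series resistor is chosen so that driver plus resistor equals the 50 Ohm line impedance is made only for illustration, since the text does not give the resistor value.

```python
def reflection_coefficient(z_source, z_line):
    """Fraction of a returning wave reflected at the source end of the line."""
    return (z_source - z_line) / (z_source + z_line)


def series_resistor_for_match(r_driver, z_line):
    """Series resistance that brings the total source impedance up to z_line."""
    return max(z_line - r_driver, 0.0)


R_EQ = 20.0  # driver output impedance in ohms (from the text)
Z0 = 50.0    # cable impedance in ohms (from the text)

# Hypothetical series value under the matching assumption stated above.
r_series = series_resistor_for_match(R_EQ, Z0)
unterminated = reflection_coefficient(R_EQ, Z0)
terminated = reflection_coefficient(R_EQ + r_series, Z0)
```

Under the matching assumption, a 20 Ohm driver would need 30 Ohm in series, and the source-end reflection would drop from about -0.43 to zero. This also shows why a small $R_{eq}$ is preferred: a series resistor can raise the source impedance toward 50 Ohm but cannot lower it.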

To comply with the TSMC process technology, an additional ESD protection circuit is added. Since it is similar to the protection circuits used in the general purpose pads, it is covered in the General Purpose Pads section. The layout plot of the differential pads is shown in Figure 5-2.

Figure 5-2. Layout plot of differential pads.



## **5.3 General Purpose Pads**

The general-purpose pads we use can be classified into three categories: regular power pads, analog power/reference pads, and digital I/O pads. To comply with the TSMC 0.25um process technology, an ESD protection circuit is added to each pad.

## **5.3.1 ESD Protection Circuit**

The so-called grounded-gate MOS (ggMOS) ESD protection circuit is used in this design since it is one of the simplest MOS ESD protection structures. The circuit schematic is shown in Figure 5-3. The ggMOS ESD protection structure has the advantage of providing "zero" leakage under normal operation and an active discharge path during ESD events. It is also preferable because the ggMOS is a natural option in CMOS technologies. A detailed explanation of the ggMOS scheme can be found in [Wang02]. To obtain higher ESD protection, high-voltage devices and multiple-finger structures are also used in this design. Figure 5-4 shows the layout of a basic pad structure. The structure dimensions comply with the MOSIS TinyChip rules.

Figure 5-3. ggMOS ESD protection scheme.


Figure 5-4. A pad structure with ggMOS ESD protection.



#### 5.3.2 Power Pads

The structure of regular power pads is shown in Figure 5-5. Corresponding ggMOS devices are removed since they are not used during ESD events.

Figure 5-5. Schematic of power pads.



#### 5.3.3 Analog Power Pads and Reference Pad

Since dedicated power pins are used for the analog circuits in this design, the interface devices between the digital and analog circuits, such as those in the PFD and VCO, become susceptible to ESD damage. To prevent this problem, we add ESD protection between the digital and analog power lines, as well as between the analog reference and power lines. The schematics of the analog power pads and the reference pad are shown in Figure 5-6.

Figure 5-6. Schematics for analog power pads and reference pad.



#### 5.3.4 Digital I/O Pads

The digital input, output, and bi-directional pads are classified as digital I/O pads. In fact, a bi-directional pad can also be used as an input or output pad by tying its control pin to logic '0' or '1'. Therefore, we only need the bi-directional pad in our design. The schematic of the bi-directional pad is shown in Figure 5-7. It is adapted from the Tanner digital bi-directional pad design. The device sizes used in our design are shown in Table 5-2. Spice simulation confirms that the pad functions properly at 10 MHz, which fulfills our design requirement. The maximum frequency this pad can support will be measured as part of our test plan. The layout plot of the bi-directional pad is shown in Figure 5-8.

Figure 5-7. Schematic of the bi-directional digital pad.



Table 5-2. Device sizes for the bi-directional pad.

| Devices | M1/M4 | M2   | M3/M6 | M5   | M7   | M8   | M9   | M10  | M11/M13 | M12/M14 |
|---------|-------|------|-------|------|------|------|------|------|---------|---------|
| M       | 5     | 4    | 5     | 4    | 1    | 1    | 4    | 1    | 6       | 6       |
| W (µm)  | 6.24  | 6.24 | 3.6   | 3.6  | 6.24 | 3.6  | 30   | 30   | 6.24    | 3.6     |
| L (µm)  | 0.36  | 0.36 | 0.36  | 0.36 | 0.36 | 0.36 | 0.48 | 0.48 | 0.36    | 0.36    |

Figure 5-8. Layout plot of the bi-directional pad.



## 5.4 Packaging

The die will be packaged in the 44-pin Ceramic Leadless Chip Carrier (CLCC) package provided by MOSIS. The ceramic package is used rather than a plastic one because its electrical characteristics are similar across all of its pins. However, the pin characteristics and model are not provided by the vendor, which becomes an issue when we try to characterize the pads. After evaluating the electrical characteristics of similar packages that are available to us, we have come up with an estimated pin model for the CLCC-44 package, which is shown in Figure 5-9.

Figure 5-9. Pin model of CLCC-44 package.



This model is used in all pad characterizations and in the top-level chip simulation. The chip is functional in simulation with this package model.

The CLCC-44 package provided by MOSIS offers a cavity size of  $7.62 \text{ mm} \times 7.62 \text{ mm}$ , and the maximum die size that can be placed in the package is  $6.985 \text{ mm} \times 6.985 \text{ mm}$ . Since our design measures  $1.5 \text{ mm} \times 1.5 \text{ mm}$ , it fits easily into the CLCC-44 package. Figure 5-10 shows the layout plot of our final design.



Figure 5-10. Layout plot of frequency synthesizer design (including pads).

## 6 OTHER DESIGN ISSUES

#### **6.1 Introduction**

In the previous three chapters, we have demonstrated the circuit design and implementation of the frequency synthesizer. We have also covered most of the design issues associated with our design and with the process technology it targets. However, some important design issues have not yet been addressed. We discuss these issues in this chapter.

#### **6.2 Power Dissipation**

It is important to reduce the energy consumption of a chip design. Power dissipation beyond the budget can lead to several serious issues. Firstly, for a battery-operated device, more power dissipation means less operating time. Secondly, the circuit may draw more current than a set of power and ground pins can supply, so that part of the circuit may not get enough power to maintain its functionality. Thirdly, circuit performance may degrade, or the circuit itself may become unstable, due to the excess heat generated. Lastly, excess heat may even cause physical damage to the chip.

In a deep sub-micron CMOS digital circuit design, the primary contributors to power dissipation are circuit switching (dynamic power dissipation) and leakage current (static power dissipation). The power consumed by circuit switching depends on the amount of charge transferred during switching. Hence, shutting down unused circuits, reducing the clock rate, or using minimum device sizes can reduce the dynamic power dissipation of the circuit. However, these techniques cannot be used to reduce the static power dissipation caused by leakage current. This is because, in a given process technology, the magnitude of a device's leakage current depends mainly on the device size. Its value can be estimated as  $I_{ds-leak} = K \cdot \frac{W}{L}$ , where K is a function of the technology. Thus, to reduce the static power dissipation, devices with larger than minimum channel length should be used. However, changing device sizes can jeopardize the performance and noise immunity of the circuit, which may not be acceptable in certain situations. Therefore, the designer has to evaluate the trade-offs during the design. Trade-offs between speed, power dissipation, and noise are more obvious in analog circuit design, since analog circuits require constant power consumption to maintain proper operation.
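The two trends discussed above can be sketched with first-order models. The Python example below is a rough illustration, not a characterization of this design: it uses the standard switching-power expression $P = \alpha C V_{dd}^2 f$ and assumes leakage proportional to W/L, consistent with the recommendation above to use longer channels. The function names and the example capacitance are hypothetical.

```python
def dynamic_power(c_switched, vdd, f_clk, activity=1.0):
    """First-order switching power: activity * C * Vdd^2 * f, in watts."""
    return activity * c_switched * vdd ** 2 * f_clk


def relative_leakage(w, l, w_ref, l_ref):
    """Leakage relative to a reference device, assuming I_leak ~ W/L."""
    return (w / l) / (w_ref / l_ref)


# Hypothetical node: 1 pF switched at 1 GHz from a 2.5 V supply.
p_full = dynamic_power(1e-12, 2.5, 1e9)    # about 6.25 mW
p_half = dynamic_power(1e-12, 2.5, 0.5e9)  # halving the clock halves the power

# Doubling the channel length at constant width halves the estimated leakage.
leak_ratio = relative_leakage(14.0, 0.48, 14.0, 0.24)
```

The clock rate and switched capacitance act only on the dynamic term, while channel length acts on the leakage term, which is why the two components call for different techniques.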

The following list shows the main techniques we used in our design to reduce the power dissipation:

- shut down the VCO block when it is not being used;
- use a smaller reference current in the current source and choose the correct ratio to generate the required supply current in each analog circuit;
- use a self-timed counter to reduce the clock rate;
- reduce the supply voltage on the low-voltage-swing differential driver.

After applying these techniques, the system is expected to consume about 175 mW of power: 50 mW from all digital circuits and digital pads; 100 uW from the current source; 50 mW from all analog circuits, including the VCOs; and 75 mW from the low-voltage-swing differential pads. Thus, the design meets the power budget of less than 200 mW.
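The budget check above is simple arithmetic, reproduced here as a small Python sketch. The dictionary keys and function names are ours; the numbers are the estimates quoted in the text.

```python
POWER_MW = {
    "digital circuits and digital pads": 50.0,
    "current source": 0.1,  # 100 uW
    "analog circuits (including VCOs)": 50.0,
    "low-swing differential pads": 75.0,
}
BUDGET_MW = 200.0


def total_power_mw(breakdown):
    """Sum the per-block power estimates, in milliwatts."""
    return sum(breakdown.values())


def within_budget(breakdown, budget_mw):
    """True if the summed estimate is below the budget."""
    return total_power_mw(breakdown) < budget_mw
```

The blocks sum to about 175.1 mW, which leaves roughly 25 mW of margin. The differential pads are the largest single contributor, so the pad supply voltage is the most direct lever if the measured power turns out higher than estimated.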

#### **6.3 Noise Reduction**

As mentioned previously, analog circuits are extremely sensitive to noise. In a mixed-signal environment, noise generated by digital circuit switching is the largest noise contributor. Such noise can reach the analog circuits through interconnect cross-talk, power lines, and the substrate, and can cause them to malfunction. Thus, it is very important to reduce the noise interference between circuit blocks. To do so, we use the following guidelines in our design:

- encircle all sensitive analog components with a double guard ring to reduce substrate noise;
- place digital blocks far away from analog blocks and encircle them with a double guard ring to keep substrate noise from escaping;
- apply separate power/ground pairs to analog and digital blocks, with ESD protection circuits connected between the two sets of power;
- use differential circuit structures in noisy environments to reduce common-mode noise;
- shut down the unused VCO to reduce switching noise;
- distribute high-speed digital signals in complementary form to reduce the net amount of coupled noise;
- fill empty areas with bypass capacitor cells to reduce the power line fluctuation caused by circuit switching;
- allocate enough power/ground pins and assign all unused pins to power or ground.

Note that some of these guidelines have been mentioned in previous chapters. We list them again here for completeness.

#### 6.4 Design for Test

During the design, we have applied several DFT techniques in order to make the frequency synthesizer IC more easily testable. We will discuss each of them in the following section.

From Chapter 2 we can see that the differential VCO control voltages carry a lot of valuable information, including the PLL settling time, proper operation of the LPF block, UP and DN current matching in the CP block, and proper function of the common-mode feedback loop. Therefore, it is useful to monitor the VCO control voltages externally. Pulling the control voltages straight out of the signal path is not an option, since the extra parasitic capacitance can alter the gain of the LPF unit. To minimize the impact on the control signals, two source follower circuits are used to generate the monitoring signals, which pass through two analog reference pads to the outside world. In this way, the VCO control voltages can be monitored externally.

The second DFT-related change is the use of separate power/ground pins for the high-speed differential pad. As mentioned in Chapter 5, one reason to use separate power/ground pins on the differential pad is better power noise rejection. More importantly, we want to be able to change the supply voltages of the pad flexibly. The differential pad is a completely new design, so we do not know exactly under what conditions it operates properly. By using separate power and ground supplies, we have the freedom to choose different setups in order to test and characterize the pad more thoroughly.

The last change is within the shift register block. We have used DFFs with an added asynchronous reset to construct the shift registers. Thus, in the event that the register block stops shifting, we can hard reset the block externally to put it into a known state, so that the frequency synthesizer may remain partially functional. Without the asynchronous reset, the programmable counter could be in any random state, and it would be impossible to evaluate the functionality of the synthesizer.

The above three changes help increase the testability of the design. Hence, they can reduce the time spent on IC test and measurement.

#### 6.5 Dummy Metal/Pad Insertion

To comply with the TSMC assembly stress relief rules, dummy metal and pads are inserted after the final placement and wiring. Dummy metal and pad insertion is important because it reduces the variation in wire width and spacing [Bernstein02] and prevents surface concavity during chemical mechanical polishing (CMP). TCL scripts are used to generate the dummy metal automatically, and the dummy pads are placed manually. Figure 6-1 shows some examples of dummy metal and pad fill.

Figure 6-1. Metal and assembly stress relief fill.



# 7 WHOLE CHIP SIMULATION AND TEST PLAN

### 7.1 Introduction

Since the frequency synthesizer chip design is planned to be fabricated, a complete chip simulation is required according to the vendor's agreement. Also, a chip test and measurement plan is necessary for design evaluation. The first part of this chapter is intended to show the whole chip simulation setup we use and its result. Then we will explain our test plan for the chip test and measurement.

#### 7.2 Whole Chip Simulation

Whole chip simulation is necessary because it is required by the vendor and, more importantly, because it is the last line of defense for a successful design. Besides testing the design functionality under normal conditions, such a simulation should also test the tolerance of the design to temperature and supply voltage variation. Before starting the simulation, we need to construct the driver and load models of the circuits that will be connected to the chip under test. Since each output of the design is planned to be connected to a 50 Ohm coaxial cable, a 50 Ohm resistor with one terminal connected to ground can be used to simulate an output load. Constructing the input drivers is also simple, since all input signals are low frequency. Thus, simple spice pulse generators can be used as input drivers. Figure 7-1 shows the actual test environment we use for the whole chip simulation.



Figure 7-1. Setup of top-level simulation.

As shown in the figure, the block "complete" is the spice model of the design with all parasitic parameters extracted from the final layout GDS file. To better monitor the behavior of the system, several test points are inserted in the spice model of the chip design. These test points include the current source outputs, the low pass filter outputs, and the outputs of each VCO cell.

Because the core of the frequency synthesizer is a PLL unit, it is one of the most difficult circuits for spice simulation. During the simulation, we ran into issues such as convergence and time-step resolution problems that caused the simulation process to fail. Thanks to a computer upgrade in Prof. Brewer's lab, and after reducing the simulation accuracy, we managed to run a few cases of detailed whole chip simulation to check the design functionality. The simulation results are listed in Table 7-1.

| Vdd (V) | 30°C, M = 998 | 30°C, M = 1490 | 30°C, M = 1492 | 30°C, M = 2000 | 80°C, M = 998 | 80°C, M = 1490 | 80°C, M = 1492 | 80°C, M = 2000 |
|---------|--------|--------|--------|--------|--------|---------|--------|---------|
| 2.7     | passed | passed | passed | passed | passed | passed  | passed | passed  |
| 2.5     | passed | passed | passed | passed | passed | passed  | passed | passed  |
| 2.3     | passed | passed | passed | passed | passed | failed* | passed | failed* |

Table 7-1. Test result of whole chip simulations.

Notice that the system failed to lock when  $TEMP = 80^{\circ}C$ , Vdd = 2.3 V, and M = 1490 or 2000. An extra set of simulations was performed with  $TEMP = 70^{\circ}C$  while keeping everything else the same. These extra simulations confirmed that the system locks properly under the new condition.

Based on the whole chip simulation results, we are confident that the frequency synthesizer circuit maintains its functionality at ambient temperatures between 30°C and 70°C with 5% supply voltage variation. Unfortunately, due to the limitations of spice, the simulation results cannot be used to validate performance parameters of the frequency synthesizer such as phase noise. Validation of performance is left to chip test and measurement.
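The simulation corners of Table 7-1 can be written down as a small data structure, which makes the failing region easy to query. The Python sketch below is illustrative: the names are ours, the pass/fail data are copied from Table 7-1, and the nominal supply of 2.5 V is an assumption.

```python
from itertools import product

TEMPS_C = (30, 80)
VDDS_V = (2.7, 2.5, 2.3)
M_COUNTS = (998, 1490, 1492, 2000)

# Corners reported as "failed*" in Table 7-1: (temperature, Vdd, M).
FAILED = {(80, 2.3, 1490), (80, 2.3, 2000)}


def corner_results():
    """Map every (temp, vdd, m) corner to True (passed) or False (failed)."""
    return {c: c not in FAILED for c in product(TEMPS_C, VDDS_V, M_COUNTS)}


def failing_corners(results):
    """Return the failing corners in sorted order."""
    return sorted(c for c, ok in results.items() if not ok)


def supply_variation_percent(vdd, nominal=2.5):
    """Deviation of vdd from the assumed nominal supply, in percent."""
    return 100.0 * (vdd - nominal) / nominal
```

With these values there are 24 corners, of which 2 fail, both at the hot, low-supply condition. Relative to the assumed 2.5 V nominal, the 2.3 V and 2.7 V corners correspond to about -8% and +8%.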

#### 7.3 Test and Measurement Plan

The frequency synthesizer design was submitted to the fabrication vendor on May 17th, 2004. Test chips will be sent back within eight weeks. We have prepared a test plan for the design, which is described in this section.

The plan is designed to validate the test chip's functionality and measure its performance. Performance measurements will include phase noise, timing jitter, operation temperature range, and power dissipation measurements. Due to the output frequency requirement, a printed circuit board (PCB) test platform is required to complete the above test and measurements. The following is the list of the guidelines we will be using on the PCB design.

- use a crystal oscillator to generate the input reference clock;
- keep the trace between the crystal oscillator and the input of the test chip short, and avoid using vias;
- route the high-speed digital outputs to surface-mount SMA connectors, again keeping the traces short and avoiding vias;
- mount the test chip on the PCB using a surface-mount LCC socket with 1 pF maximum pin capacitance;
- interface the digital inputs with a PC's parallel port;
- provide three sets of power and ground via external power supplies for flexibility.

In the test environment, a computer with a parallel port will be used to program the programmable counter and scan registers of the frequency synthesizer. Output signals will be fed into an oscilloscope and a spectrum analyzer to measure the timing jitter, settling time, and phase noise of the synthesizer. Three programmable power supplies, coordinated by the computer, will provide the supply voltages. The VCO input voltages will be monitored continuously by a digital multimeter with a PC interface, and the results will be logged by the computer.
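The host-side programming step can be sketched as a pure function that produces the sequence of pin states to be written to the parallel port. This is a hypothetical illustration: the reset polarity, the sampling of TIN on the rising edge of TCK, and the MSB-first order are all assumptions, and no actual port access is performed.

```python
def programming_sequence(word, n_bits=8):
    """Return a list of (TRST, TCK, TIN) pin states that shift `word` in MSB first."""
    steps = [(1, 0, 0), (0, 0, 0)]  # pulse the reset, then release it
    for i in range(n_bits - 1, -1, -1):
        bit = (word >> i) & 1
        steps.append((0, 0, bit))  # set up TIN while TCK is low
        steps.append((0, 1, bit))  # raise TCK so the chip samples TIN
    steps.append((0, 0, 0))  # return the clock to its idle level
    return steps


def sampled_bits(steps):
    """Extract the TIN values present while TCK is high."""
    return [tin for _, tck, tin in steps if tck == 1]
```

Keeping the sequence generation separate from the port driver allows the bit pattern to be checked in software before any hardware is connected.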

Since the timing of the synthesizer is in the picosecond range, its jitter can be difficult to measure. Courtesy of Teradyne, we will have access to a high-speed oscilloscope to complete the test and measurement. For future consideration, a built-in DFT circuit would be useful for measuring PLL jitter and other specified parameters. The complete IC test and measurement will be performed in Fall 2004, and the final result will be submitted to MOSIS as part of the fabrication agreement.

# 8 CONCLUSIONS

In this thesis, we first discussed the design and implementation of a frequency synthesizer in a deep sub-micron CMOS process technology. Frequency synthesizers and PLLs are widely used in high-speed digital and mixed-signal IC designs, so knowledge of frequency synthesis and PLLs is especially valuable for a designer working in this field. More importantly, a frequency synthesizer is a complex system that involves many general high-speed digital and mixed-signal design issues. Designing and implementing one exercises the designer's problem-solving skills in this field. In addition, the demand for IC designs in deep sub-micron CMOS technology makes it desirable to implement the design in such a technology.

We also introduced a new type of counter design based on the traditional Möbius counter structure. This design is shown to provide a better combination of area and speed. A pipeline structure is also used to further improve the counter's performance. The standard cell design flow we developed previously was used in part of the counter design. Therefore, the success of this design can be used to verify the correctness of the design flow.

Then we reviewed the construction of pads for the technology we are using. Since most pads are proprietary IP, it is valuable to provide publicly accessible pad designs to help future research using compatible process technologies. The pad set we created includes a high-speed differential pad that has yet to be characterized. Once its parameters are captured, the set will be complete for use in high-speed designs.

Finally, we provided the whole-chip simulation results. The theoretical work in this thesis is supported by extensive simulation. We also described the test and measurement plan for IC testing after the test chip is sent back from fabrication.

Due to its complexity, this frequency synthesizer design project has been quite overwhelming for a novice designer. However, after completing the circuit design and implementation, I have gained enough experience to feel comfortable with high-speed mixed-signal design.

#### References

- [Bernstein02] K. Bernstein et al., *High Speed CMOS Design Styles*, Kluwer Academic Publishers, 2002.
- [Burns01] M. Burns and G. Roberts, *An Introduction to Mixed-Signal IC Test and Measurement*, Oxford University Press, 2001.
- [Dai03] L. Dai, *Design of High-Performance CMOS Voltage-Controlled Oscillators*, Kluwer Academic Publishers, 2003.
- [Gray93] P. Gray and R. Meyer, *Analysis and Design of Analog Integrated Circuits*, 3rd ed., John Wiley and Sons, 1993.
- [Kim90] B. Kim, D. Helman, and P. Gray, "A 30MHz Hybrid Analog/Digital Clock Recovery Circuit in 2um CMOS," *IEEE Journal of Solid-State Circuits*, Vol. SC-25, no. 6, pp. 1385-1394, December 1990.
- [Li00] L. Li, L. Tee, and P. Gray, "A 1.4GHz differential low-noise CMOS frequency synthesizer using a wide band PLL architecture," *IEEE ISSCC Digest of Technical Papers*, pp. 204-205, 458, 2000.
- [Maneatis96] J. Maneatis, "Low-Jitter process-independent DLL and PLL based on self-biased techniques," *IEEE Journal of Solid-State Circuits*, Vol. 31, no. 11, pp. 1723-1732, November 1996.
- [Rabaey02] J. Rabaey, *Digital Integrated Circuits: A Design Perspective*, 2nd ed., Pearson Education, 2003.
- [Razavi02] B. Razavi, *Design of Analog CMOS Integrated Circuits*, Tata McGraw-Hill ed., Tata McGraw-Hill, 2002.
- [Wang02] A. Z. H. Wang, *On-Chip ESD Protection for Integrated Circuits: An IC Design Perspective*, Kluwer Academic Publishers, 2002.
- [Wolaver91] D. Wolaver, *Phase-Locked Loop Circuit Design*, Prentice Hall, 1991.