# Farrow Structured Variable Fractional Delay Lagrange Filters with Improved Midpoint Response Haolin Li, Guy Torfs, Tarik Kazaz, Johan Bauwelinck, and Piet Demeester Department of Information Technology, IDLab Ghent University-imec Ghent, Belgium Email: haolin.li@ugent.be Abstract—This paper describes improvements in the Farrow structured variable fractional delay (FD) Lagrange interpolation. The main idea is to replace the first sub-filter of the Farrow structure by a sinc-interpolation filter of half a sample period to achieve a superior FD approximation in the vicinity of half a sample period. Its primary advantages over classical Farrow structured FD Lagrange interpolators are the lower level of mean-square-error (MSE) over the whole FD range and the reduced implementation cost. Design examples are included, illustrating an MSE reduction of 50% compared to a classical Farrow structured interpolator while the implementation cost is halved. *Keywords*—canonical signed digit (CSD); Farrow structure (FS); Fractional delay (FD); Lagrange; mean-square-error (MSE) #### I. Introduction In digital communication systems, the propagation delay from the transmitter to the receiver is generally unknown at the receiver. Hence, symbol timing must be derived from the received signal. When designing a digital basedband receiver on field programmable gate arrays (FPGAs), the received signal is typically uniformly sampled at a fixed analog-to-digital converter (ADC) clock. Thus, the timing error is a fraction of the ADC sample period and can vary with time. The timing adjustment must be done by fractional interpolation in the digital domain before decoding the received signal. Variable fractional delay (FD) interpolation filters have been widely investigated for timing synchronization in all-digital receivers since it is desired to realize the fractional interpolation in an efficient way from the perspective of hardware implementation [1], [2]. The Farrow structure (FS) can easily accommodate adjustable fractional delays without the need of changing the filter coefficients, lowering the implementation complexity compared to alternatives such as on-line design or storage of a large number of different impulse responses [3]–[7]. Digital filters are usually divided into two classes: finite-impulse-response (FIR) filters and infinite impulse response (IIR) filters. In contrast to an IIR filter, there is no feedback in a FIR filter, making it inherently stable. The FIR filters implemented on FPGA usually use a series of delays, multipliers, and adders to generate the filter outputs. Thus, a FIR filter can Fig. 1. The general Farrow structure with adjustable fractional delay d and $C_0(z)=1$ . be easily pipelined to increase the maximal clock frequency and the effective throughput, and the maximal allowable clock frequency of a FIR filter is then limited to the speed of the FPGA building blocks. In this work, variable FD Lagrange interpolation filters are realized as a FS [8] with the pipelined structure, which can be very attractive for wideband all-digital receiver implementations on FPGA. However, the approximation of Lagrange FD interpolation filters is heavily degraded when the FD approaches $0.5T_s$ , where $T_s$ is the sample period. Our observations show that if the first sub-filter of a Farrow structure is replaced by a sinc-interpolation filter for $0.5T_s$ , a superior approximation of an ideal FD interpolation $e^{-j\omega dT_s}$ can be obtained with lower implementation cost, where d is the fractional delay (usually $0 \le d \le 1$ ). ## II. FRACTIONAL DELAY LAGRANGE INTERPOLATION The Lagrange interpolator is also known as a maximally flat FIR fractional-sample delay system. Lagrange interpolation is a widely used method in signal processing algorithms and is very accurate at low frequencies. The coefficients of an $N{\rm th}$ -order Lagrange interpolator for fractional delay d can be expressed in the following way $$h_L(n) = \prod_{\substack{k=0\\k\neq n}}^{N} \frac{D-k}{n-k}, \text{ for } n = 0, 1, 2, ..., N$$ (1) where D is the real number that corresponds to the delay d from the beginning (n = 0) of the impulse response (see [8]). Farrow suggested that every filter coefficient of a FIR FD filter could be expressed as an Mth-order polynomial in the variable delay parameter d [4], [9]. The general Farrow structure is presented in Fig. 1, where $C_m(z)$ denotes the This work was supported by the ERC Advanced Grant ATTO project and imec IoT strategic research program. Fig. 2. The pipelined Farrow structure for a polynomial-based Lagrange interpolation filter, this structure works for both even and odd order. Z-transform frequency response of the Farrow structure subfilter. Each sub-filter is a Nth-order FIR filter as depicted in Fig. 2 and its Z-transform frequency response is defined as $$C_m(z) = \sum_{n=0}^{N} C_m(n)z^{-n}, \text{ for } m = 0, 1, 2, ..., M$$ (2) where $C_m(n)$ denotes the n-th coefficient of the m-th sub-filter. The impulse response realized by the Farrow structure can be expressed as follows $$h_d(n) = \sum_{m=0}^{M} C_m(n)d^m$$ , for $n = 0, 1, 2, ..., N$ (3) The coefficient $C_m(n)$ of the Farrow structured Lagrange interpolator, can be obtained from the inverse of the $N \times N$ Vandermonde matrix $V^{-1}$ , where each row represents the subfilter $C_m(z)$ . The main advantage of the Farrow structure is that all the sub-filter coefficients are fixed, the only changeable parameter is the fractional delay d, which leads to a less computation intensive implementation. The whole filter structure is pipelined in Fig. 2 to lower the calculation intensity during a single clock cycle, therefore allowing the increase of the maximal clock frequency. # III. CASCADED FILTER STRUCTURE To evaluate the design accuracy, the mean-square-error (MSE) is introduced. The MSE is mathematically defined as $$MSE = \frac{1}{N_1} \sum_{i=1}^{N_1} (\hat{Y}_i - Y_i)^2 \tag{4}$$ where $N_1$ is the number of samples, $\hat{Y}_i$ and $Y_i$ are the interpolated sample and the ideal sample with normalized power respectively. The MSE of the truncated Lagrange interpolator increases when the FD approaches $0.5T_s$ (see Fig. 4(a), L represents the prototype filter order) [10]. To compensate this degradation and obtain a low level of MSE over the whole range of d, a cascaded sinc-Farrow filter structure is introduced, as depicted in Fig. 3. Once the variable delay d approaches $0.5T_s$ , the branch $H_1(z)$ becomes active and the new FD $(d-0.5)T_s$ is fed to the Farrow structure. The Fig. 3. The cascaded sinc-Farrow filter structure. The switching between $H_0(z)$ and $H_1(z)$ is dependent of the intercepts and the hysteresis. $h_0(n)$ and $h_1(n)$ represent the impulses response of $H_0(z)$ and $H_1(z)$ , respectively. $$h_0(n) = sinc(n - \frac{N_2 - 1}{2}), \text{ for } n = 0, 1, ...N_2$$ (5a) $$h_1(n) = sinc(n - \frac{N_2 - 1}{2} - 0.5), \text{ for } n = 0, 1, ... N_2$$ (5b) where $N_2$ is the odd-order of the sinc-interpolation filter. As shown in Fig. 4(b), at $0.5T_s$ the MSE of the cascaded sinc-Farrow filter exhibits a minimal value that is mainly determined by the order of the sinc-interpolation filter (the cascased sinc-interpolator has the same order as the Farrow in Fig. 4(b)), because, as presented in Fig. 4(a), there is no MSE caused by the Lagrange interpolation at zero delay. As the order of the sinc-interpolation filter rises, the ideal FD interpolation of $0.5T_s$ is better approximated. By properly switching the outputs between these two filter branches, the overall MSE can be reduced. The intercepts of the $H_0(z)$ -FS and $H_1(z)$ -FS correspond to the optimal switching moments but typically some hysteresis is needed to prevent oscillation. Note that the optimal switching moments vary with different choices of Lagrange filter order N, prototype filter order L and the $H_1(z)$ filter order. Consider N=11 and L=11 in Fig. 4(b), for this implementation, the optimal switching points are at $d_1=0.3$ and $d_2=0.7$ . If d is smaller than $d_1$ or if d is lager than $d_2$ , the switch in Fig. 3 is placed in position 1 and the $H_0(z)$ filter is activated. If d is between $d_1$ and $d_2$ , the switch is moved to position 2 and the Farrow filter is preceded by $H_1(z)$ . When implementing the structure in a symbol synchronization loop, a small amount of hysteresis is added around the optimal switching values $d_1$ and $d_2$ to ensure stable operation. It is also worthwhile to mention that $H_0(z)$ can simply be replaced by shift registers since it introduces no additional fractional delay. The main advantage of using the cascased structure lies in the fact that when jointly optimizing the two filtering blocks the computational complexity to generate practically the same filtering performance can be drastically decreased. ## IV. PROPOSED INTERPOLATION FILTER STRUCTURE Note that the first sub filter $C_0(n)$ is equal to 1 for all delay values in FS, as shown in Fig. 1. Thus, the dual form of the cascaded filter structure, i.e. Farrow-sinc, can be used and the first sub-filter $C_0(n)$ can be substituted by $h_1(n)$ without having to change the parameter d. The orders of the Farrow structure and $H_1(z)$ are first kept equal for simplicity. The delay line represented by $H_0(z)$ is inherently included in the pipelined structure (referred to Fig. 2). When the FD approaches $0.5T_s$ , the deviation $\Delta h(n)$ of the Lagrange interpolation from the sinc-interpolation of $0.5T_s$ is added to the column $C_0(n)$ of the FS $(N+1) \times (M+1)$ coefficient matrix in order to correct the approximation error at $0.5T_s$ . The updating of the fractional delay to $(d-0.5)T_s$ is no longer required. $$\Delta h(n) = sinc(n - \frac{N-1}{2} - 0.5) - h_{d=0.5}(n)$$ (6) $$C_0'(n) = C_0(n) + \Delta h(n), \text{ for } n = 0, 1, 2, ..., N$$ (7) It is easily noted that this proposed Farrow-sinc filter bank structure (denoted as proposed FS in Fig. 4) has the same MSE value as the cascaded sinc-Farrow filter structure at $0.5T_s$ . However, the remarkable aspect of this proposed structure is that the MSE value starts decreasing when FD deviates from $0.5T_s$ as illustrated in Fig. 4(c), because the remaining sub-filters of Farrow structure compensate the approximation error. Therefore, the useful delay range between the two intercept points is widened compared to the cascaded sinc-Farrow structure at the same implementation complexity as depicted in Fig. 4(c). It should be pointed out that the Lagrange filter has good FD approximation when d is far from 0.5Ts, even for low filter orders. This allows us to jointly optimize the order of the Farrow structured Lagrange filter and $H_1(z)$ in order to achieve a superior performance. The design procedure is slightly modified. Assuming that the order of $H_1(z)$ is now N+2K. The Farrow structure of order N is first truncated from the prototype Farrow structure of order L [10]. Second, the Farrow $(N+1)\times (M+1)$ coefficient matrix is extended to $(N+1+2K)\times (M+1)$ by adding K zeros above and below the original Farrow coefficient matrix, which is nothing else than pipelining the signal. Hence, (6) and (7) are again applicable. A design example is presented in Fig. 4(d), the ultimate MSE of the proposed structure with N=7 and K=5 is much lower than those in Fig. 4(c), in particular in the area of $0.5T_s$ FD. Moreover, this superior FD approximation is achieved even with merely half the implementation cost. The optimization map for different orders of Farrow structured Lagrange filters and $H_1(z)$ filters with L=N+30 is shown in Fig. 5 where the optimal filter orders can be chosen for a given MSE performance requirement. In addition, this optimization map reveals that Lagrange interpolation performance only increases slightly with increasing filter order, while the order of $H_1(z)$ has significant influence. Filter orders for this design example are indicated in the optimization map. ### V. IMPLEMENTATION COST The computational cost of the Farrow structure of order N is N(N+1)+N multiplications and $N^2+N$ additions per output sample. Note that $C_0(z)$ is equal to 1 for all delay values in the original Farrow structure. Thus, the implementation cost of $C_0(z)$ is discarded. A modified Farrow structure proposed in [5] takes advantage of the symmetry of the polynomial Fig. 4. Mean-square-error (MSE) curves of interpolation filters of order N and different choices of prototype filter order L. (a) Farrow-structure of truncated Lagrange. (b) Cascaded Farrow of order N=11. (c) Cascaded Farrow and proposed Farrow filter structure of order N=11 and K=0. (d) Proposed filter structure of order N=7 and K=5. Fig. 5. Optimization map showing MSE performance with different orders of Lagrange and $H_1(z)$ filters. The values in the dashed lines represent the worst MSEs over the whole FD range when the corresponding Lagrange and $H_1(z)$ filter orders are used in the proposed filter structure. TABLE I. COMPUTATIONAL COMPLEXITY COMPARISON FOR CASCADED SINC-FARROW AND PROPOSED STRUCTURE WITH (MODIFIED) FARROW, ONLY COUNTING THE NUMBER OF MULTIPLIERS AND ADDERS | | Cascaded<br>Structure<br>N=11 | Proposed<br>Structure<br>(N,K)=(11,0) | Proposed<br>Structure<br>(N,K)=(7,5) | Proposed<br>Structure<br>with<br>M. Farrow<br>(N,K)=(7,5) | |----------|-------------------------------|---------------------------------------|--------------------------------------|-----------------------------------------------------------| | | Mul + Add | Mul + Add | Mul + Add | Mul + Add | | $H_1(z)$ | 12+11 | 12+11 | 18+17 | 13 +17 | | Farrow | 143+132 | 143+132 | 63+56 | 35 +56 | | Total | 155+143 | 155+143 | 81+73 | 48 +73 | coefficients, which leads to a reduction in the number of multipliers for the proposed structure. The partial symmetry of $H_1(z)$ in Section IV yields a further reduction in the number of multipliers. Therefore, the proposed structure with modified Farrow requires (N(N+1))/2 + 2N + 1 + K multiplications and $N^2 + 2N + 2K$ additions for odd N. In Table I the computational complexities of different implementations are compared in terms of the number of multiplications (Mul) and additions (Add) for computing one output sample. These results correspond to Fig. 4(b), 4(c) and 4(d) respectively. The implementation cost of Section IV is quadratically reduced by lowering the order of the Farrow structure. In addition, the constant coefficient multiplication in the Farrow structure can be efficiently and multiplierlessly implemented on FPGA with a limited number of shifters and adders by using the canonical signed digit (CSD) format. The utilization of the CSD format can dramatically reduce the number of non-zero bits representing the constant coefficient, therefore reducing the amount of calculations. # VI. CONCLUSION This paper has shown that the proposed structure not only features the advantages of the pipelined Farrow structure in terms of variable fractional delay interpolation and high throughput, but also enhances the fractional delay approximation. It is shown in the example designs that, a overall MSE of approximately 2% is achieved with the proposed structure at half the implementation cost, compared to 4% using a traditional implementation. The considerably lower level of MSE over the whole FD range implies that this proposed structure outperforms both the truncated Lagrange with Farrow structure and the cascaded structure of sinc-Farrow. These remarkable features can be very beneficial to applications such as symbol synchronization in digital receivers. #### REFERENCES - L. Erup, F. M. Gardner, and R. A. Harris, "Interpolation in digital modems-part II: Implementation and performance," *IEEE Trans. Commun.*, vol. 41, pp. 998–1008, June 1993. - [2] M.-T. Shiue and C.-L. Wey, "Efficient implementation of interpolation technique for symbol timing recovery in DVB-T transceiver design," in *Proc. IEEE Int. Electro/Inf. Technol. Conf.*, 2006, pp. 427– 431. - [3] V. Välimäki, "Discrete-time modeling of acoustic tubes using fractional delay filters," Ph.D. dissertation, Lab. Acoust. Audio Signal Process., TKK, Espoo, Finland, Dec. 1995. - [4] C. W. Farrow, "A continuously variable digital delay element," in *Proc. IEEE Int. Symp. Circuits Systems*, Espoo, Finland, Jun. 1988, vol. 3, pp. 2641–2645. - [5] J. Vesma and T. Saramäki, "Optimization and efficient implementation of FIR filters with adjustable fractional delay," in *Proc. IEEE Int. Symp. Circuits and Systems*, Hong Kong, June 1997, pp. 2256–2259. - [6] J. Vesma, "Optimization and applications of polynomial-based interpolationfilters," Ph.D. Dissertation, Tampere Univ. Technol., Tampere, Finland, 1999, Diss. no. 254. - [7] H. Johansson and P. Löwenborg, "On the design of adjustable fractional delay FIR filters," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 50, no. 4, pp. 164–169, Apr. 2003. - [8] T. Laakso, V. Välimäki, M. Karjalainen, and U. Laine, "Splitting the unit delay," *IEEE Signal Process. Mag.*, vol. 13, no. 1, pp. 30–60, Jan. 1996. - [9] J. Vesma and T. Saramäki, "Polynomial-based interpolation filters-part I: filter synthesis," in *Circuits, Systems & Signal Processing*, Apr. 2007, vol. 26, no. 2, pp. 115–146. - [10] V. Välimäki and A. Haghparast, "Fractional Delay Filter Design Based on Truncated Lagrange Interpolation," *IEEE Signal Processing Letters*, vol. 14, no. 11, pp. 816–819, Nov. 2007.