Tests of Additional Conditional Moment Restrictions

Paulo M.D.C. Parente
Instituto Universitário de Lisboa (ISCTE-IUL)
Business Research Unit (BRU-IUL)

Richard J. Smith
cemmap, U.C.L and I.F.S.
Faculty of Economics, University of Cambridge
Department of Economics, University of Melbourne

This Draft: February 2017

Abstract

The primary focus of this article is the provision of tests for the validity of a set of conditional moment constraints additional to those defining the maintained hypothesis that are relevant for independent cross-sectional data contexts. The point of departure and principal contribution of the paper is the explicit and full incorporation of the conditional moment information defining the maintained hypothesis in the design of the test statistics. Thus, the approach mirrors that of the classical parametric likelihood setting by defining restricted tests in contradistinction to unrestricted tests that partially or completely fail to incorporate the maintained information in their formulation. The framework is quite general, allowing the parameters defining the additional and maintained conditional moment restrictions to differ and permitting the conditioning variates to differ likewise. GMM and generalized empirical likelihood test statistics are suggested. The asymptotic properties of the statistics are described under both the null hypothesis and a suitable sequence of local alternatives. An extensive set of simulation experiments explores the practical efficacy of the various test statistics in terms of empirical size and size-adjusted power, confirming the superiority of restricted over unrestricted tests. A number of restricted tests possess both sufficiently satisfactory empirical size and power characteristics to allow their recommendation for econometric practice.

JEL Classification: C12, C14, C30

Key-words: GMM; Generalized Empirical Likelihood; Series Approximations; Restricted Tests; Unrestricted Tests; Local Power.
We would like to thank an Editor, Associate Editor and referees for their helpful and constructive comments on previous versions of the paper. The authors are also grateful for comments on versions of this paper by participants at the CREST-INSEE Séminaire Malinvaud, the CEMFI-UC3M Econometrics Seminar and the Montreal Econometrics Seminar, Microdata Methods and Practice: A Cemmap Celebration, the IMS Empirical Likelihood Workshop, National University of Singapore, the Info-Metrics Institute Conference Fall 2014, the 2015 New Zealand Econometric Study Group Meeting, Econometrics Seminars at Cowles, Yale University, CORE and GREQAM, seminars at Brown and Monash Universities and the Universities of Bristol, Cambridge, Hull, Porto, Surrey and Tasmania.

1 Introduction

The primary focus of this article is the provision of tests, relevant for independent cross-sectional data, for the validity of a set of conditional moment constraints in addition to those defining the maintained hypothesis when a finite dimensional parameter vector is the object of inferential interest. Examples include moment conditional homoskedasticity and instrument validity.$^{1}$

The main point of departure and principal contribution of the paper is the explicit incorporation of the maintained conditional moment information in the formulation of the test statistics. Thus, our approach mirrors that of the classical parametric likelihood setting by defining restricted tests for these additional conditional moments in contradistinction to unrestricted tests that partially or completely fail to incorporate the maintained moment condition information in their design, with the advantage that the former dominate the latter tests in terms of asymptotic local power, cf. Aitchison (1962). Newey (1985), pp.242-244, and Eichenbaum et al.
(1988), Appendix C, pp.74-76, formulate GMM tests of additional unconditional moment constraints fully utilising maintained moment information, gaining a similar local asymptotic power advantage over tests that fail to do so. The framework adopted in this paper is quite general, allowing the parameters defining the additional and maintained conditional moment restrictions to differ and permitting the conditioning variates to differ likewise. The paper also contributes a number of new theoretical results required to address the null and local alternative asymptotic distributions of the test statistics.

The approach taken in the paper exploits an equivalence between conditional moment constraints and a countably infinite number of unconditional restrictions noted elsewhere; see Chamberlain (1987). Test statistics are consequently defined in terms of an appropriate set of additional infinite unconditional moment conditions. These tests adapt and generalise those of Donald et al. (2003), which approximate conditional moments by an appropriate finite set of unconditional moments. Tests for a finite number of unconditional moment restrictions, cf. inter alia Newey (1985), Eichenbaum et al. (1988) and Ruud (2000) for GMM, Hansen (1982), and Smith (1997, 2011) for generalized empirical likelihood (GEL), see also Kitamura and Stutzer (1997), Imbens et al. (1998) and Newey and Smith (2004), are well known not to be consistent against all alternatives implied by conditional moment conditions; see, e.g., Bierens (1990). GMM and GEL test statistics defined in Donald et al. (2003) circumvent this difficulty by allowing the number of unconditional moments to grow with sample size at an appropriate rate.$^{2}$ Likewise here both maintained and null hypothesis conditional moment constraints are approximated by corresponding sets of unconditional moment restrictions with the former a subset of the latter, both of whose dimensions grow with sample size at appropriate rates.
Restricted GMM- and GEL-based test statistics for additional conditional moment restrictions, after location and scale standardization, are asymptotically equivalent and converge in distribution to a standard normal variate under the null hypothesis. Intuitively this result reflects the implicit infinite number of unconditional moments under test, since standardised chi-square distributed statistics are asymptotically standard normally distributed when the statistic's degrees of freedom diverge to infinity. A similar result is obtained for unrestricted statistics that partially or completely neglect the maintained conditional moment information although the limit standard normal variate differs.$^{3}$ Interestingly, unlike finite dimensional test statistics, efficient parameter estimation is no longer required for test implementation. Under a suitable sequence of local alternatives, restricted and unrestricted test statistics are asymptotically non-central standard normally distributed. The non-centrality parameter of the restricted statistics exceeds those of unrestricted statistics, thereby demonstrating the deficiency of these latter tests and mirroring the results for restricted tests in the classical parametric likelihood, Aitchison (1962), and unconditional moment condition, Newey (1985) and Eichenbaum et al. (1988), settings.

$^{1}$ Instrument validity tests are the concern of the application in section 6 to a parametric specification of an Engel curve relationship discussed elsewhere in, e.g., Muellbauer (1976), Banks et al. (1997) and, more recently, Blundell and Horowitz (2007). See fn. 15 below.

$^{2}$ Consistent tests of goodness of fit in regression models have received substantial attention in the literature. See, e.g., Eubank and Spiegelman (1990) for the nonlinear regression context. See also inter alia De Jong and Bierens (1994), Hong and White (1995) and Jayasuriya (1996).
The asymptotic local power results also indicate that one-sided tests of the additional conditional moment restrictions are apposite.

The paper is organized as follows. Section 2 provides some initial definitions, details the test problem and describes moment conditional homoskedasticity and instrument validity examples that are used throughout the paper. GMM and GEL restricted test statistics are then specified in section 3; an initial discussion presents the equivalence between conditional moment restrictions and an appropriately defined infinite set of unconditional moment constraints together with the assumptions that underpin the analysis in the paper. Section 4 provides the limiting distributions of these and unrestricted statistics under the null hypothesis of the additional conditional moment validity; the large sample independence of the restricted test statistics and GMM and GEL test statistics for the maintained hypothesis is shown, which thus permits the overall test size of a sequential test of the maintained and then additional conditional moment restrictions to be controlled. Section 5 considers the local asymptotic behaviour of the restricted and unrestricted test statistics, demonstrating the one-sided nature of the tests and the relative deficiency of the latter tests. Section 6 presents a set of simulation results on the size and power of the test statistics based on an application to a parametric specification of an Engel curve relationship. Section 7 concludes. Proofs of the results in the text and certain subsidiary lemmata are given in Appendix A and the Supplement to the paper.

The paper uses the generic subscript notation "$m$" and "$a$" to denote quantities associated with the maintained hypothesis and additional moment constraints. Conditional moment indicator vectors are denoted by $u(\cdot,\cdot)$ of generic dimension $J$, with parameter vector $\beta$ of dimension $p$ and associated parameter space $\mathcal{B}$; instrument vectors are denoted as $s$ with dimension $d$.
The abbreviations a.s., f.r.r., n.s. and p.d. indicate "almost surely", "full row rank", "nonsingular" and "positive definite" respectively. $[\cdot]$ is the integer part of $\cdot$. Statistics are "asymptotically equivalent" if they differ by an $o_p(1)$ term.

$^{3}$ Alternative unrestricted tests could also be based inter alia on the approaches of Bierens (1982, 1990), Wooldridge (1992), Yatchew (1992), Härdle and Mammen (1993), Fan and Li (1996), Zheng (1996, 1998), Lavergne and Vuong (2000), Ellison and Ellison (2000) and Domínguez and Lobato (2004). The continuum of moment conditions method suggested in Carrasco and Florens (2000) offers another possible approach; see also Hsu and Kuan (2011).

2 Some Preliminaries

2.1 Definitions

The maintained hypothesis is defined in terms of the moment indicator vector $u_m(z,\beta_m)$, which is a $J_m$-vector of known functions of the $d_z$-vector of data observables $z$ and the $p_m$-vector of parameters $\beta_m$. In many cases $u_m(z,\beta_m)$ may be interpreted as an error vector. It is assumed that there exists an observable $d_m$-vector of instruments $s_m$ such that

$$E[u_m(z,\beta_{m0}) \mid s_m] = 0 \ \text{a.s. } s_m \tag{2.1}$$

for some unknown value $\beta_{m0} \in \mathcal{B}_m$ of the parameter vector $\beta_m$, where $\mathcal{B}_m$ denotes the corresponding parameter space.

The central interest of the paper is the provision of tests of the additional conditional moment restrictions

$$E[u_a(z,\beta_{a0}) \mid s_a] = 0 \ \text{a.s. } s_a \tag{2.2}$$

for some $\beta_{a0} \in \mathcal{B}_a$. Here the moment indicator vector $u_a(z,\beta_a)$ denotes a $J_a$-vector of known functions of $z$ and the unknown $p_a$-vector of parameters $\beta_a$, with $\mathcal{B}_a$ the corresponding parameter space and $s_a$ an observable $d_a$-vector of instruments. Together the parameter vectors $\beta_{m0}$ and $\beta_{a0}$ constitute the objects of inferential interest. Note that $\beta_a$ may or may not be coincident with the maintained hypothesis parameter vector $\beta_m$.
Likewise, the notation $s_a$ for the instrument vector defining the additional conditional moment constraints (2.2) explicitly permits circumstances in which the maintained instruments $s_m$ may or may not be strictly included in the additional instruments $s_a$ or vice-versa.$^{4}$

2.2 Test Problem

The maintained hypothesis is given by the conditional moment constraint $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ (2.1) and is assumed to hold throughout. The null hypothesis $H_0$ of interest is consequently defined in terms of the validity of the additional conditional moment constraints (2.2), i.e.,

$$H_0: E[u_a(z,\beta_{a0}) \mid s_a] = 0 \ \text{a.s. } s_a \ \text{and} \ E[u_m(z,\beta_{m0}) \mid s_m] = 0 \ \text{a.s. } s_m \tag{2.3}$$

with the corresponding alternative hypothesis $H_1$ given by

$$H_1: E[u_a(z,\beta_a) \mid s_a] \neq 0 \ \text{all } \beta_a \in \mathcal{B}_a,\ s_a \in \mathcal{S}_a, \ \text{and} \ E[u_m(z,\beta_{m0}) \mid s_m] = 0 \ \text{a.s. } s_m \tag{2.4}$$

for some $\mathcal{S}_a$ with non-zero probability content.

$^{4}$ Nonparametric components are excluded from the moment indicator vector definitions. The theoretical analysis of the paper could in principle be extended to deal with such models; see, e.g., Chen and Pouzo (2009, 2012).

2.3 Examples

Example 2.1 (Conditional Homoskedasticity). This example concerns the conditional homoskedasticity of the maintained conditional moment indicator vector $u_m(z,\beta_m)$; hence the maintained hypothesis and additional instrument vectors are identical, i.e., $s_m = s_a$. The additional conditional moment indicator is defined by $u_a(z,\beta_a) = \mathrm{vech}(u_m(z,\beta_m)u_m(z,\beta_m)' - \Sigma)$, where $\mathrm{vech}(\cdot)$ denotes the vectorised upper triangle of $\cdot$. Thus $J_a = J_m(J_m+1)/2$ and $\beta_a = (\beta_m', \mathrm{vech}(\Sigma)')'$ includes the maintained parameter vector $\beta_m$. Let $\Sigma_0(s_m) = E[u_m(z,\beta_{m0})u_m(z,\beta_{m0})' \mid s_m]$ and $\Sigma_0 = E[u_m(z,\beta_{m0})u_m(z,\beta_{m0})']$. Therefore the null hypothesis may be expressed as

$$H_0: \Sigma_0(s_m) = \Sigma_0 \ \text{and} \ E[u_m(z,\beta_{m0}) \mid s_m] = 0 \ \text{a.s. } s_m,$$

with alternative hypothesis $H_1: \Sigma_0(s_m) \neq \Sigma$ all p.d. $\Sigma$, $s_m \in \mathcal{S}_m$, where $\mathcal{S}_m$ has non-zero probability mass, and $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ a.s. $s_m$.

Remark 2.1. The standard instrumental variable (IV) linear regression model defines $u_m(z,\beta_m) = y - \beta_m x$, with $J_m = 1$ and thus $J_a = 1$.
With maintained unconditional moment indicator vector $s_m u_m(z,\beta_m) = s_m(y - \beta_m x)$, continuous updating estimation (CUE) of $\beta_m$, Hansen et al. (1996), uses the inverse of the sample moment matrix $\sum_{i=1}^n s_{mi}s_{mi}'(y_i - \beta_m x_i)^2/n$ as metric whereas, under conditional homoskedasticity, the LIML metric, i.e., the inverse of $\sigma_n^2(\beta_m)\sum_{i=1}^n s_{mi}s_{mi}'/n$, where $\sigma_n^2(\beta_m) = \sum_{i=1}^n (y_i - \beta_m x_i)^2/n$, is apposite.

Example 2.2 (Instrument Validity). In this example both maintained and additional conditional moment indicators coincide, i.e., $u_m(z,\beta_m) = u_a(z,\beta_a)$ with $\beta_m = \beta_a$ and, thus, $J_a = J_m$. The issue here is the validity of the additional instrument vector $s_a$. The null hypothesis is therefore defined by

$$H_0: E[u_m(z,\beta_{m0}) \mid s_a] = 0 \ \text{all } s_a, \quad E[u_m(z,\beta_{m0}) \mid s_m] = 0 \ \text{all } s_m,$$

with alternative hypothesis $H_1: E[u_m(z,\beta_m) \mid s_a] \neq 0$ all $\beta_m \in \mathcal{B}_m$, $s_a \in \mathcal{S}_a$, where $\mathcal{S}_a$ has non-zero probability content, and $E[u_m(z,\beta_0) \mid s_m] = 0$ all $s_m$.

Remark 2.2. Blundell and Horowitz (2007) define an exogeneity hypothesis $E[u_m \mid x] = 0$, i.e., $E[y \mid x] = g(x)$, for the nonparametric regression model $y = g(x) + u_m$ when the unknown structural function $g(\cdot)$ is of primary inferential interest and importance, with $x$ a vector of covariates, maintaining the identifying conditional moment restriction $E[u_m \mid s_m] = 0$. As a consequence, the structural function $g(\cdot)$ may be consistently estimated by nonparametric least squares (LS), thus avoiding the difficulties associated with nonparametric IV estimation. Given that $x$ but not $s_m$ is included in $s_a$, this hypothesis might be considered a marginal form of exogeneity hypothesis (ME). In general, under ME, the structural function $g(x)$ will vary with $s_m$ since $E[y \mid x] \neq E[y \mid x, s_m]$. Thus, if elements of the maintained instrument vector $s_m$ are also of some economic significance, the regression function $E[y \mid x, s_m]$ rather than $E[y \mid x]$ may then be of primary interest.
In such circumstances, the alternative form of exogeneity hypothesis $E[u_m \mid x, s_m] = 0$, i.e., $E[y \mid x, s_m] = g(x)$ or $y$ conditionally mean independent of $s_m$ given $x$, would be of empirical relevance; in this case, the maintained instruments $s_m$ are regarded as potentially omitted variables from the regression function of interest. Cf. Blundell and Horowitz (2007) section 2.3, p.1040. The inclusion of both $s_m$ and $x$ in $s_a$ constitutes a joint form of exogeneity (JE) and is more stringent than ME.$^{5}$ For linear regression, see Remark 2.1 above, if $E[y - \beta_{m0}x \mid s_a] = 0$, i.e., $E[y \mid s_a] = \beta_{m0}x$, LS estimation of $\beta_{m0}$ is consistent but inefficient in the presence of conditional heteroskedasticity, Cragg (1983), with IV estimation incorporating the additional $E[y - \beta_{m0}x \mid s_a] = 0$ and maintained $E[y - \beta_{m0}x \mid s_m] = 0$ conditional moments efficient.

3 GMM and GEL Test Statistics

3.1 Approximating Conditional Moment Restrictions

Conditional moment constraints of the form (2.1) and (2.2) are equivalent to a countable number of unconditional moment restrictions under certain regularity conditions; see Chamberlain (1987). The following assumption, Assumption 1, p.58, of Donald et al. (2003), henceforth DIN, provides precise conditions. The discussion is initially framed for a generic vector of instruments $s$ and moment indicator vector $u(z,\beta)$.

For each positive integer $K$, let $q^K(s) = (q_{1K}(s), \ldots, q_{KK}(s))'$ denote a $K$-vector of approximating functions.

Assumption 3.1. For all $K$, $E[q^K(s)'q^K(s)]$ is finite and for any $a(s)$ with $E[a(s)^2] < \infty$ there are $K$-vectors $\gamma_K$ such that as $K \to \infty$,

$$E[(a(s) - q^K(s)'\gamma_K)^2] \to 0.$$

Possible approximating functions which satisfy Assumption 3.1 are splines, power series and Fourier series. See inter alia DIN, Newey (1997) and Powell (1981) for further discussion.

The next result, DIN Lemma 2.1, p.58, establishes a formal equivalence between conditional moment restrictions of the type (2.1) and (2.2) and a sequence of unconditional moment restrictions.
$^{5}$ JE rather than ME has been a central concern in the literature on classical likelihood-based tests for (weak) exogeneity; see inter alia Durbin (1954), Wu (1973), Hausman (1978), Engle (1982), Engle et al. (1983) and Smith (1994).

Lemma 3.1. Suppose that Assumption 3.1 is satisfied and $E[u(z,\beta_0)'u(z,\beta_0)]$ is finite. If $E[u(z,\beta_0) \mid s] = 0$, then $E[u(z,\beta_0) \otimes q^K(s)] = 0$ for all $K$. Furthermore, if $E[u(z,\beta_0) \mid s] \neq 0$, then $E[u(z,\beta_0) \otimes q^K(s)] \neq 0$ for all $K$ large enough.

DIN defines the unconditional moment indicator vector as $u(z,\beta) \otimes q^K(s)$. By considering the moment conditions $E[u(z,\beta_0) \otimes q^K(s)] = 0$, if $K$ approaches infinity at an appropriate rate, dependent on the sample size $n$ and the estimation method, EL, IV, GMM or GEL, DIN demonstrates that under certain conditions these estimators are consistent and achieve the semi-parametric efficiency lower bound. To do so, however, requires the imposition of a normalization condition on the approximating functions, DIN Assumption 2, p.59, which now follows.

Let $\mathcal{S}$ denote the support of the random vector $s$.

Assumption 3.2. For each $K$ there is a constant scalar $\zeta(K)$ and matrix $B_K$ such that $\tilde{q}^K(s) = B_K q^K(s)$ for all $s \in \mathcal{S}$, $\sup_{s \in \mathcal{S}} \|\tilde{q}^K(s)\| \le \zeta(K)$, $E[\tilde{q}^K(s)\tilde{q}^K(s)']$ has smallest eigenvalue bounded away from zero uniformly in $K$ and $\sqrt{K} \le \zeta(K)$.

Hence to formulate a test statistic appropriate for the null hypothesis (2.3) requires that its constituent conditional moment constraints, $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ (2.1) and $E[u_a(z,\beta_{a0}) \mid s_a] = 0$ (2.2), are re-interpreted as suitably defined sequences of unconditional moment restrictions based on Assumptions 3.1 and 3.2.

The maintained conditional moment restrictions (2.1) are re-expressed as the sequence of $J_mK$ unconditional moment restrictions

$$E[u_m(z,\beta_{m0}) \otimes q_m^K(s_m)] = 0, \quad K \to \infty, \tag{3.1}$$

for approximating functions $q_m^K(s_m)$ satisfying Assumptions 3.1 and 3.2.
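As a purely illustrative sketch of the re-expression in (3.1), the following Python fragment forms the unconditional moment indicator as a Kronecker product of a moment indicator vector with a vector of approximating functions and averages it over a sample. The power series basis for a scalar instrument, the function names `power_series`, `kron` and `sample_moments`, and the toy data are assumptions made here for exposition; they are not taken from the paper, and the normalisation of Assumption 3.2 is not imposed.

```python
from typing import List, Sequence


def power_series(s: float, K: int) -> List[float]:
    """Hypothetical basis: (1, s, ..., s^(K-1)) for a scalar instrument s."""
    return [s ** k for k in range(K)]


def kron(u: Sequence[float], q: Sequence[float]) -> List[float]:
    """Kronecker product of two vectors; the elements of q vary fastest."""
    return [uj * qk for uj in u for qk in q]


def sample_moments(us: Sequence[Sequence[float]],
                   ss: Sequence[float], K: int) -> List[float]:
    """Sample average of the J*K unconditional moment indicators."""
    n = len(us)
    g = [kron(u, power_series(s, K)) for u, s in zip(us, ss)]
    return [sum(col) / n for col in zip(*g)]


# Toy data (made up): J = 1 residual per observation, K = 3 basis terms.
residuals = [[0.5], [-0.5], [1.0], [-1.0]]
instruments = [1.0, 1.0, 2.0, 2.0]
print(sample_moments(residuals, instruments, K=3))
```

With $J$ moment indicators and $K$ approximating functions the resulting vector has $JK$ elements, matching the count of unconditional restrictions stated in the text.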
Likewise let $q_a^{MK}(s_a)$ be an $MK$-vector of approximating functions that depends on $s_a$ and that also satisfies Assumptions 3.1 and 3.2, where for ease of exposition $M$ is a positive integer. Thus the additional conditional moment restrictions (2.2) are rewritten as the sequence of $J_aMK$ unconditional moment restrictions

$$E[u_a(z,\beta_{a0}) \otimes q_a^{MK}(s_a)] = 0, \quad K \to \infty. \tag{3.2}$$

The null hypothesis (2.3) is then formally equivalent to the sequence of $(J_m + J_aM)K$ unconditional moments$^{6}$

$$E[u_m(z,\beta_{m0}) \otimes q_m^K(s_m)] = 0, \quad E[u_a(z,\beta_{a0}) \otimes q_a^{MK}(s_a)] = 0, \quad K \to \infty. \tag{3.3}$$

Remark 3.1. Strictly speaking, the succeeding theoretical analysis requires that the dimension of $q_a^{MK}(\cdot)$, the integer $d_{q_a}(K)$ say, should satisfy $\lim_{K \to \infty} d_{q_a}(K)/K = M$, $M$ a positive constant, e.g., $d_{q_a}(K) = [MK]$, i.e., the same order as that of $q_m^K(\cdot)$. The multiplicative choice $MK$ with $M$ a positive integer is adopted for simplicity and for ease of implementation and exposition. Restricted test statistics for (2.3) defined in section 3.3 below are expressed as (or are asymptotically equivalent to) the difference of an unrestricted statistic and a statistic apposite for testing the maintained conditional moment restrictions (2.1); see section 4. Their respective large sample behaviours are determined by the relative number of approximating functions used to express the null and maintained hypotheses in unconditional form. If the dimension of $q_a^{MK}(\cdot)$ diverges at a rate different from that of $q_m^K(\cdot)$, the limit theory used in sections 4 and 5 to establish the asymptotic behaviour of the unrestricted statistic under null and local alternative hypotheses no longer applies.

Example 2.1 (Conditional Homoskedasticity Cont.). Recall that $u_a(z,\beta_a) = \mathrm{vech}(u_m(z,\beta_m)u_m(z,\beta_m)' - \Sigma)$ with $\beta_a = (\beta_m', \mathrm{vech}(\Sigma)')'$. In this case $s_a = s_m$ and thus the additional approximating functions are defined as $q_a^{MK}(s_a) = q_m^K(s_m)$. Therefore $M = 1$.
Hence, the null hypothesis $H_0: \Sigma_0(s_m) = \Sigma_0$, $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ is re-expressed in unconditional form as

$$E[u_a(z,\beta_{a0}) \otimes q_m^K(s_m)] = 0, \quad E[u_m(z,\beta_{m0}) \otimes q_m^K(s_m)] = 0, \quad K \to \infty.$$

Example 2.2 (Instrument Validity Cont.). Recall that $u_a(z,\beta_a) = u_m(z,\beta_m)$ with $J_m = J_a$ and $\beta_a = \beta_m$. The vector of additional approximating functions is $q_a^{MK}(s_a)$ with dimension $MK$. Thus, the null hypothesis $H_0: E[u_m(z,\beta_{m0}) \mid s_a] = 0$, $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ is re-expressed in unconditional form as

$$E[u_m(z,\beta_{m0}) \otimes q_a^{MK}(s_a)] = 0, \quad E[u_m(z,\beta_{m0}) \otimes q_m^K(s_m)] = 0, \quad K \to \infty.$$

Remark 3.2. For regression the special cases ME, $s_a = x$ with $q_a^{MK}(s_a)$ functions of $x$ only, and JE, $s_a = (s_m, x)$ with $q_a^{MK}(s_a)$ additional functions of $s_m$ and $x$, are of particular interest.

$^{6}$ To illustrate the construction of $q_m^K(s_m)$ and $q_a^{MK}(s_a)$ for polynomial approximating functions suppose $s_m$ and $s_a$ have $d_{am}$ elements in common. Let the approximating functions vector $q_m^K(s_m)$ for the maintained conditional moment restrictions (2.1) be a polynomial of order $k_m - 1$, which yields $K = k_m^{d_m}$. Thus $k_m$ could be chosen as $[K^{1/d_m}] + 1$ for given $K$. Similarly let the components of the vector of approximating functions $q_a^{MK}(s_a)$ for the additional conditional moment restrictions (2.2) corresponding to the $d_{am}$ elements in common between $s_m$ and $s_a$ be formed from a polynomial of order $k_a - 1$. Also suppose a polynomial of order $k_a$ excluding the constant term is used for those components corresponding to the $d_a - d_{am}$ unique elements in $s_a$. Then the dimension of the vector of approximating functions $q_a^{MK}(s_a)$ is $k_a^{d_{am}}((k_a + 1)^{d_a - d_{am}} - 1)$. Therefore the order of the dimension of $q_a^{MK}(s_a)$ is $k_a^{d_a}$. Examples: (a) ME: $d_{am} = 0$; thus $MK = (k_a + 1)^{d_a} - 1$, e.g., $d_a = 1$, $MK = k_a$. (b) JE: $d_{am} = d_m$; thus $MK = k_a^{d_m}((k_a + 1)^{d_a - d_m} - 1)$, e.g., $d_m = 1$, $d_a = 2$, $MK = k_a^2$. For the general case this suggests choosing $k_a = [(MK)^{1/d_a}] + 1$.
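The dimension counts in footnote 6 are simple integer arithmetic and can be checked mechanically. The sketch below is illustrative only; the helper names `integer_root`, `order_for_dimension` and `dim_qa` are hypothetical and not part of the paper. The integer root is computed by search rather than by floating-point exponentiation so that the integer part $[\cdot]$ is exact.

```python
def integer_root(K: int, d: int) -> int:
    """Largest integer r with r**d <= K, i.e. the integer part of K^(1/d)."""
    r = 0
    while (r + 1) ** d <= K:
        r += 1
    return r


def order_for_dimension(K: int, d: int) -> int:
    """The choice k = [K^(1/d)] + 1 suggested in the footnote for given K."""
    return integer_root(K, d) + 1


def dim_qa(ka: int, da: int, dam: int) -> int:
    """Dimension k_a^{d_am} * ((k_a + 1)^{d_a - d_am} - 1) of q_a^{MK}(s_a)."""
    return ka ** dam * ((ka + 1) ** (da - dam) - 1)


# (a) ME with d_a = 1 and no common elements: MK = k_a.
print(dim_qa(ka=4, da=1, dam=0))
# (b) JE with d_m = d_am = 1 and d_a = 2: MK = k_a^2.
print(dim_qa(ka=4, da=2, dam=1))
# Order suggested for a target dimension of 10 with 2 instruments.
print(order_for_dimension(10, 2))
```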
3.2 Basic Assumptions and Notation

Let $\beta$ denote the distinct elements of $\beta_m$ and $\beta_a$, with $\beta_0$ and the composite parameter space $\mathcal{B}$ defined similarly, and with $p$ the number of parameters comprising $\beta$. The vector $s$ collects the distinct elements of the maintained and additional instrument vectors $s_m$ and $s_a$. Also let $u(z,\beta)$ and $q^K(s)$ denote the non-redundant elements of $u_m(z,\beta_m)$ and $u_a(z,\beta_a)$ and of $q_m^K(s_m)$ and $q_a^{MK}(s_a)$ respectively. It will be helpful to define a number of f.r.r. selection matrices $S_m^u$, $S_a^u$ and $S_m^q$, $S_a^q$; viz., $S_m^u u(z,\beta) = u_m(z,\beta_m)$, $S_a^u u(z,\beta) = u_a(z,\beta_a)$ and $S_m^q q^K(s) = q_m^K(s_m)$, $S_a^q q^K(s) = q_a^{MK}(s_a)$.$^{7}$ Correspondingly $S_m = S_m^u \otimes S_m^q$ and $S_a = S_a^u \otimes S_a^q$ are both f.r.r. selection matrices. Importantly for the theoretical analysis underpinning the results in the paper, the unconditional forms of moment indicator vectors corresponding to the maintained and null hypotheses, cf. (3.1) and (3.3), may be expressed as $S_m(u(z,\beta) \otimes q^K(s))$ and $S(u(z,\beta) \otimes q^K(s))$ respectively, where $S = (S_m', S_a')'$. Necessarily $S$ is n.s.; otherwise either $u(z,\beta)$ or $q^K(s)$ would contain redundant elements.

Example 2.1 (Conditional Homoskedasticity Cont.). Here $u(z,\beta) = (u_m(z,\beta_m)', u_a(z,\beta_a)')'$ and $q^K(s) = q_m^K(s_m)$. Hence $S_m^q = S_a^q = I_K$ and $S_m^u = (I_{J_m}, 0_{(J_m \times J_a)})$, $S_a^u = (0_{(J_a \times J_m)}, I_{J_a})$. The unconditional form of the moment indicator vector corresponding to the null hypothesis $H_0: \Sigma_0(s_m) = \Sigma_0$, $E[u(z,\beta_0) \mid s_m] = 0$ is then

$$S(u(z,\beta) \otimes q^K(s)) = \begin{pmatrix} u_m(z,\beta_m) \\ u_a(z,\beta_a) \end{pmatrix} \otimes q_m^K(s_m), \quad K \to \infty,$$

with that for the maintained hypothesis expressed as $S_m(u(z,\beta) \otimes q^K(s)) = u_m(z,\beta_m) \otimes q_m^K(s_m)$, $K \to \infty$.

Example 2.2 (Instrument Validity Cont.). Here $u(z,\beta) = u_a(z,\beta_a) = u_m(z,\beta_m)$ with $J_m = J_a$ and $\beta = \beta_a = \beta_m$. Thus $S_m^u = S_a^u = I_{J_m}$ and $S_m^q = (I_K, 0_{(K \times MK)})$, $S_a^q = (0_{(MK \times K)}, I_{MK})$.
The unconditional moment indicator vector $u_m(z,\beta_m) \otimes (q_m^K(s_m)', q_a^{MK}(s_a)')'$ corresponding to the null hypothesis $H_0: E[u_m(z,\beta_{m0}) \mid s_a] = 0$, $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ may equivalently be re-arranged as

$$S(u(z,\beta) \otimes q^K(s)) = \begin{pmatrix} u_m(z,\beta_m) \otimes q_m^K(s_m) \\ u_m(z,\beta_m) \otimes q_a^{MK}(s_a) \end{pmatrix}, \quad K \to \infty,$$

with that for the maintained hypothesis given by $S_m(u(z,\beta) \otimes q^K(s)) = u_m(z,\beta_m) \otimes q_m^K(s_m)$, $K \to \infty$, as above.

$^{7}$ The row and column dimensions of the selection matrices $S_m^q$ and $S_a^q$ depend on $K$ but to avoid a burdensome notation this dependence is not made explicit.

Standard conditions are imposed to derive the limiting distributions of the test statistics discussed below; viz.

Assumption 3.3. (a) The data are i.i.d.; (b) there exists $\beta_0 \in \mathrm{int}(\mathcal{B})$ such that $E[u_m(z,\beta_{m0}) \mid s_m] = 0$ and $E[u_a(z,\beta_{a0}) \mid s_a] = 0$; (c) $\sqrt{n}(\hat\beta - \beta_0) = O_p(1)$; (d) $E[\sup_{\beta \in \mathcal{B}} \|u(z,\beta)\|^2 \mid s]$ is bounded.

Unlike DIN Assumption 6(b), p.67, it is unnecessary to impose $E[\sup_{\beta \in \mathcal{B}} \|u(z,\beta)\|^{\gamma}] < \infty$ for some $\gamma > 2$ for GEL; see Guggenberger and Smith (2005).$^{8}$

Remark 3.3. Assumption 3.3(c) requires only a root-$n$ consistent rather than an efficient estimator $\hat\beta$ of $\beta_0$. Global identification of $\beta_0$ and thus root-$n$ consistency of GMM and GEL are not necessarily guaranteed if based on an arbitrary finite set of unconditional moments derived from the conditional moment restrictions; see, e.g., Domínguez and Lobato (2004) and Hsu and Kuan (2011). If $\beta_0 \in \mathcal{B}$ uniquely satisfies $E[u(z,\beta) \mid s] = 0$ a.s., $\beta \in \mathcal{B}$, Lemma 3.1 guarantees global identification of $\beta_0$ for sufficiently large $K$ and root-$n$ consistency of GMM and GEL follows with the imposition of the additional assumptions described in DIN section 5, pp.64-67, if Assumptions 3.1 and 3.2 on the vector of approximating functions $q^K(s)$ are satisfied. See also Kitamura et al. (2004). Domínguez and Lobato (2004) and Hsu and Kuan (2011) also propose root-$n$ consistent GMM-type methods based on particular classes of unconditional moment constraints.
Define $u_\beta(z,\beta) = \partial u(z,\beta)/\partial\beta'$, $D(s) = E[u_\beta(z,\beta) \mid s]$ and $u_{\beta\beta j}(z,\beta) = \partial^2 u_j(z,\beta)/\partial\beta\partial\beta'$, $j = 1, \ldots, J$, where $J$ denotes the dimension of $u(z,\beta)$.$^{9}$ Also let $\mathcal{N}$ denote a neighbourhood of $\beta_0$.

Assumption 3.4. (a) $u(z,\beta)$ is twice continuously differentiable in $\mathcal{N}$, $E[\sup_{\beta \in \mathcal{N}} \|u_\beta(z,\beta)\|^2 \mid s]$ and $E[\|u_{\beta\beta j}(z,\beta_0)\|^2 \mid s]$, $(j = 1, \ldots, J)$, are bounded; (b) $\Sigma(s) = E[u(z,\beta_0)u(z,\beta_0)' \mid s]$ has smallest eigenvalue bounded away from zero; (c) $E[\sup_{\beta \in \mathcal{N}} \|u(z,\beta)\|^4 \mid s]$ is bounded; (d) for all $\beta \in \mathcal{N}$, $\|u(z,\beta) - u(z,\beta_0)\| \le \delta(z)\|\beta - \beta_0\|$ and $E[\delta(z)^2 \mid s]$ is bounded; (e) $E[D(s)'D(s)]$ is nonsingular.

3.3 Test Statistics

Let $g_{mi}(\beta_m) = S_m(u(z_i,\beta) \otimes q^K(s_i)) = u_m(z_i,\beta_m) \otimes q_m^K(s_{mi})$, $g_{ai}(\beta_a) = S_a(u(z_i,\beta) \otimes q^K(s_i)) = u_a(z_i,\beta_a) \otimes q_a^{MK}(s_{ai})$ and $g_i(\beta) = S(u(z_i,\beta) \otimes q^K(s_i))$, $(i = 1, \ldots, n)$. Write $\hat g_m(\beta_m) = \sum_{i=1}^n g_{mi}(\beta_m)/n$ and $\hat g(\beta) = \sum_{i=1}^n g_i(\beta)/n$.

GMM statistics appropriate for tests of maintained and null hypotheses expressed unconditionally in (3.1) and (3.3) take the standard forms

$$\mathcal{T}^{g_m}_{GMM} = n\,\hat g_m(\hat\beta_m)'\,\hat\Omega_m^{-1}\,\hat g_m(\hat\beta_m) \tag{3.4}$$

and

$$\mathcal{T}^{g}_{GMM} = n\,\hat g(\hat\beta)'\,\hat\Omega^{-1}\,\hat g(\hat\beta), \tag{3.5}$$

where $\hat\beta_m$ denotes the subvector of $\hat\beta$ corresponding to $\beta_m$, $\hat\Omega_m = \sum_{i=1}^n g_{mi}(\hat\beta_m)g_{mi}(\hat\beta_m)'/n$ and $\hat\Omega = \sum_{i=1}^n g_i(\hat\beta)g_i(\hat\beta)'/n$. Cf., for example, DIN section 4, pp.63-64.

$^{8}$ Supplement Lemma S.1 may be substituted for DIN Lemma A.10, p.82, rendering $\gamma = 2$ sufficient for the succeeding DIN lemmas and theorems concerned with GEL.

$^{9}$ Nonsmooth moment indicators could be accommodated by appropriately modifying the theoretical analysis. See, e.g., Chen and Pouzo (2009, 2012) and Parente and Smith (2011).

In the remainder of the paper tests that fully incorporate the information contained in the maintained hypothesis (2.1), or (3.1), in their formulation are referred to as restricted tests whereas those that partially or completely fail to do so are termed unrestricted tests.
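The quadratic forms (3.4) and (3.5) share one computational pattern: average the moment indicators, form their uncentred second-moment matrix, and evaluate $n\,\hat g'\hat\Omega^{-1}\hat g$. The following standard-library sketch illustrates that pattern for moment indicators already evaluated at an estimator; the function names `solve` and `gmm_statistic` and the toy inputs are assumptions introduced here, and $\hat\Omega$ is assumed nonsingular.

```python
from typing import List, Sequence


def solve(A: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - tail) / M[r][r]
    return x


def gmm_statistic(g: Sequence[Sequence[float]]) -> float:
    """n * gbar' Omega^{-1} gbar with Omega = sum_i g_i g_i' / n (uncentred)."""
    n, m = len(g), len(g[0])
    gbar = [sum(gi[j] for gi in g) / n for j in range(m)]
    omega = [[sum(gi[j] * gi[k] for gi in g) / n for k in range(m)]
             for j in range(m)]
    weighted = solve(omega, gbar)  # Omega^{-1} gbar without forming the inverse
    return n * sum(a * b for a, b in zip(gbar, weighted))


# Toy moment indicators (made up): n = 4 observations, 2 moments each.
g_toy = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, 2.0]]
print(gmm_statistic(g_toy))
```

Applying the same function to the rows $g_{mi}(\hat\beta_m)$ and to the rows $g_i(\hat\beta)$ would give the two statistics whose difference enters the restricted GMM test; solving the linear system rather than inverting $\hat\Omega$ is a design choice made here for numerical convenience.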
A restricted GMM statistic appropriate for testing the null hypothesis (2.3) against the maintained hypothesis (2.4) may be based on the difference of GMM criterion function statistics (3.5) and (3.4) for the respective revised hypotheses (3.3) and (3.1), cf. Eichenbaum et al. (1988), Appendix C, pp.74-76, in particular, (C.1), p.75; viz.

$$\mathcal{J}^r = \frac{\mathcal{T}^{g}_{GMM} - \mathcal{T}^{g_m}_{GMM} - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}}, \tag{3.6}$$

where $p - p_m$ is the number of additional parameters in $\beta_a$ defining the additional conditional moment conditions (2.2) as compared with the maintained hypothesis (2.1) parameters $\beta_m$.

Remark 3.4. For fixed and finite $K$, under suitable conditions, GMM, Newey (1985) and Eichenbaum et al. (1988), and GEL, Smith (2011), test statistics for the validity of additional moment restrictions, e.g., $\mathcal{T}^{g}_{GMM} - \mathcal{T}^{g_m}_{GMM}$, are asymptotically chi-square distributed with $J_aMK - (p - p_m)$ degrees of freedom. The mean location $J_aMK - (p - p_m)$ and standard deviation scale $\sqrt{2(J_aMK - (p - p_m))}$ standardisations of $\mathcal{T}^{g}_{GMM} - \mathcal{T}^{g_m}_{GMM}$ in $\mathcal{J}^r$ (3.6) mimic those introduced to render chi-square random variates with large degrees of freedom approximately standard normally distributed.

A number of alternative test statistics to GMM-based procedures for a finite number of additional moment restrictions using GEL, Newey and Smith (2004) and Smith (1997, 2011), may be adapted for the framework considered here.

As in DIN and Newey and Smith (2004) let $\rho(v)$ denote a function of a scalar $v$ that is concave on its domain, an open interval $\mathcal{V}$ containing zero. Define the respective GEL criteria under null and alternative hypotheses as

$$\hat P^{g}(\beta,\lambda) = \sum_{i=1}^n [\rho(\lambda' g_i(\beta)) - \rho_0]/n, \quad \hat P^{g_m}(\beta_m,\lambda_m) = \sum_{i=1}^n [\rho(\lambda_m' g_{mi}(\beta_m)) - \rho_0]/n, \tag{3.7}$$

where $\lambda$ and $\lambda_m = S_m\lambda$ are the corresponding $(J_m + J_aM)K$- and $J_mK$-vectors of Lagrange multipliers associated with the unconditional moment constraints (3.3) and (3.1) respectively.
Let $\rho_j(v) = \partial^j\rho(v)/\partial v^j$ and $\rho_j = \rho_j(0)$, $(j = 0, 1, 2, \ldots)$, where, without loss of generality, the normalisation $\rho_1 = \rho_2 = -1$ is imposed.$^{10}$

Let $\hat\Lambda^{g_m}_n(\beta_m) = \{\lambda_m : \lambda_m' g_{mi}(\beta_m) \in \mathcal{V},\ i = 1, \ldots, n\}$ and $\hat\Lambda^{g}_n(\beta) = \{\lambda : \lambda' g_i(\beta) \in \mathcal{V},\ i = 1, \ldots, n\}$. Given $\beta$, the respective Lagrange multiplier estimators for $\lambda_m$ and $\lambda$ are defined by

$$\hat\lambda_m(\beta_m) = \arg\max_{\lambda_m \in \hat\Lambda^{g_m}_n(\beta_m)} \hat P^{g_m}(\beta_m,\lambda_m), \quad \hat\lambda(\beta) = \arg\max_{\lambda \in \hat\Lambda^{g}_n(\beta)} \hat P^{g}(\beta,\lambda).$$

The corresponding respective Lagrange multiplier estimators for $\lambda_m$ and $\lambda$ are then defined as $\hat\lambda_m = \hat\lambda_m(\hat\beta_m)$ and $\hat\lambda = \hat\lambda(\hat\beta)$, cf. Assumption 3.3(c).

Similarly to the restricted GMM statistic $\mathcal{J}^r$ (3.6), a restricted form of GEL likelihood ratio (LR) statistic for testing the null hypothesis (2.3) against the maintained hypothesis (2.4) may be based on the difference of GEL criterion function (3.7) statistics; viz.

$$\mathcal{LR}^r = \frac{2n(\hat P^{g}(\hat\beta,\hat\lambda) - \hat P^{g_m}(\hat\beta_m,\hat\lambda_m)) - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}}. \tag{3.8}$$

Restricted Lagrange multiplier, score and Wald-type statistics are defined respectively as$^{11}$

$$\mathcal{LM}^r = \frac{n(\hat\lambda - S_m'\hat\lambda_m)'\,\hat\Omega\,(\hat\lambda - S_m'\hat\lambda_m) - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}}, \tag{3.9}$$

$$\mathcal{S}^r = \frac{\sum_{i=1}^n \rho_1(\hat\lambda_m' g_{mi}(\hat\beta_m))\,g_{ai}(\hat\beta_a)'\,S_a\hat\Omega^{-1}S_a'\,\sum_{i=1}^n \rho_1(\hat\lambda_m' g_{mi}(\hat\beta_m))\,g_{ai}(\hat\beta_a)/n - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}} \tag{3.10}$$

and

$$\mathcal{W}^r = \frac{n\,\hat\lambda' S_a'(S_a\hat\Omega^{-1}S_a')^{-1}S_a\hat\lambda - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}}. \tag{3.11}$$

An additional assumption on the GEL function $\rho(\cdot)$ is required for statistics based on GEL, as in DIN Assumption 6, p.67.

Assumption 3.5. $\rho(\cdot)$ is a twice continuously differentiable concave function with Lipschitz second derivative in a neighborhood of 0.

$^{10}$ EL is GEL with $\rho(v) = \log(1 - v)$, Imbens (1997), Qin and Lawless (1994) and Smith (2000). ET is also GEL with $\rho(v) = -\exp(v)$, Imbens et al. (1998), Kitamura and Stutzer (1997), as is CUE if $\rho(\cdot)$ is quadratic, Hansen et al. (1996); see Theorem 2.1, p.223, of Newey and Smith (2004). More generally, members of the Cressie-Read (1984) power divergence family of discrepancies discussed by Imbens et al. (2008) are GEL with $\rho(v) = -(1 + \gamma v)^{(\gamma+1)/\gamma}/(\gamma + 1)$; see Newey and Smith (2004), Section 2.1, pp.223-224.
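The normalisation $\rho_1 = \rho_2 = -1$ can be checked numerically for the GEL functions listed in footnote 10. The sketch below does so by finite differences at zero. The function names are hypothetical; the particular quadratic $-(1+v)^2/2$ used for CUE is an assumption chosen here because it satisfies the normalisation (the text states only that $\rho(\cdot)$ is quadratic), and the Cressie-Read index value $\gamma = 0.5$ is an arbitrary illustration.

```python
import math
from typing import Callable, Tuple


def rho_el(v: float) -> float:
    """Empirical likelihood: rho(v) = log(1 - v)."""
    return math.log(1.0 - v)


def rho_et(v: float) -> float:
    """Exponential tilting: rho(v) = -exp(v)."""
    return -math.exp(v)


def rho_cue(v: float) -> float:
    """Assumed normalised quadratic for CUE: rho(v) = -(1 + v)^2 / 2."""
    return -0.5 * (1.0 + v) ** 2


def rho_cr(v: float, gamma: float = 0.5) -> float:
    """Cressie-Read family: -(1 + gamma v)^((gamma + 1)/gamma) / (gamma + 1)."""
    return -((1.0 + gamma * v) ** ((gamma + 1.0) / gamma)) / (gamma + 1.0)


def derivatives_at_zero(rho: Callable[[float], float],
                        h: float = 1e-4) -> Tuple[float, float]:
    """Central finite-difference approximations to rho_1(0) and rho_2(0)."""
    d1 = (rho(h) - rho(-h)) / (2.0 * h)
    d2 = (rho(h) - 2.0 * rho(0.0) + rho(-h)) / h ** 2
    return d1, d2


for name, rho in [("EL", rho_el), ("ET", rho_et),
                  ("CUE", rho_cue), ("CR", rho_cr)]:
    print(name, derivatives_at_zero(rho))
```

Each pair printed should be close to $(-1, -1)$ up to finite-difference error.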
$^{11}$ Alternative restricted score and Wald statistics robust to estimation effects may be defined; viz.

$$\mathcal{S}^r = \frac{\sum_{i=1}^n \rho_1(\hat\lambda_m' g_{mi}(\hat\beta_m))\,g_i(\hat\beta)'\,(\hat\Omega^{-1} - \hat\Omega^{-1}\hat G(\hat G'\hat\Omega^{-1}\hat G)^{-1}\hat G'\hat\Omega^{-1})\,\sum_{i=1}^n \rho_1(\hat\lambda_m' g_{mi}(\hat\beta_m))\,g_i(\hat\beta)/n - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}},$$

$$\mathcal{W}^r = \frac{n\,\hat\lambda_a'(S_a(\hat\Omega^{-1} - \hat\Omega^{-1}\hat G(\hat G'\hat\Omega^{-1}\hat G)^{-1}\hat G'\hat\Omega^{-1})S_a')^{-1}\hat\lambda_a - (J_aMK - (p - p_m))}{\sqrt{2(J_aMK - (p - p_m))}}.$$

See Smith (1997, section II.2, pp.511-514) and Smith (2011, section 5, pp.1209-1213).

4 Asymptotic Null Distribution

The following theorem provides a statement of the limiting distribution of the restricted GMM statistic $\mathcal{J}^r$ (3.6) under the null hypothesis $H_0$ (2.3).

Theorem 4.1. If Assumptions 3.1-3.4 hold and if $K \to \infty$ and $\zeta(K)^2K^2/n \to 0$, then $\mathcal{J}^r \overset{d}{\to} N(0,1)$.

The next result details the limiting properties of the restricted GEL-based statistics for the null hypothesis (2.3) and their relationship to that of the GMM statistic $\mathcal{J}^r$ (3.6).

Theorem 4.2. Let Assumptions 3.1-3.5 hold and suppose in addition $K \to \infty$ and $\zeta(K)^2K^3/n \to 0$. Then $\mathcal{LR}^r$, $\mathcal{LM}^r$, $\mathcal{S}^r$ and $\mathcal{W}^r$ converge in distribution to a standard normal random variate. Moreover all of these statistics are asymptotically equivalent to $\mathcal{J}^r$.

Remark 4.1. The large sample analysis in section 5 of the local alternative behaviour of restricted and unrestricted statistics discussed below indicates that one-sided tests of the null hypothesis $H_0$ (2.3) are appropriate. E.g., the critical region $\{\mathcal{J}^r \ge z_\alpha\}$ for the standardised GMM statistic $\mathcal{J}^r$ (3.6) has asymptotic size $\alpha$, where $P\{N(0,1) \ge z_\alpha\} = \alpha$. Alternatively, valid critical regions based on non-standardised statistics may also be defined. E.g., for $\mathcal{T}^{g}_{GMM} - \mathcal{T}^{g_m}_{GMM}$, the critical region $\{\mathcal{T}^{g}_{GMM} - \mathcal{T}^{g_m}_{GMM} \ge \chi^2_{J_aMK - (p - p_m)}(\alpha)\}$, where $\chi^2_k(\alpha)$ is the $\alpha$-level critical value of the chi-square distribution with $k$ degrees of freedom.$^{12}$ Note that $p - p_m$ is negligible in the large $K$, large $n$ asymptotic analysis of Theorems 4.1 and 4.2.

Unrestricted statistics fail to take into account some or all of the information contained in the maintained hypothesis (2.1) in their formulation.
The standard forms of unrestricted GEL-based statistic, cf. Aitchison (1962), do not incorporate the component of the restricted statistic corresponding to the maintained hypothesis (2.1), cf. $LR^r$ (3.8), $LM^r$ (3.9) and $S^r$ (3.10); i.e.,
$$LR^u = \frac{2n\hat P_n^{g}(\hat\beta, \hat\lambda) - ((J_aMK + J_mK) - p)}{\sqrt{2((J_aMK + J_mK) - p)}}, \tag{4.1}$$
$$LM^u = \frac{n\hat\lambda'\hat\Omega\hat\lambda - ((J_aMK + J_mK) - p)}{\sqrt{2((J_aMK + J_mK) - p)}} \tag{4.2}$$
with the score form based on $\mathcal{T}^{g}_{GMM}$ (3.5)
$$S^u = \frac{n\hat g(\hat\beta)'\hat\Omega^{-1}\hat g(\hat\beta) - ((J_aMK + J_mK) - p)}{\sqrt{2((J_aMK + J_mK) - p)}}. \tag{4.3}$$
By a similar analysis to that used to establish Theorems 4.1 and 4.2 the statistics $LR^u$, $LM^u$ and $S^u$ converge in distribution to a standard normal random variate and are mutually asymptotically equivalent but not to the restricted statistics above.13

12 To see this let the statistic $S_n(k)$ be such that $S_n(k) \xrightarrow{d} \chi^2_{d(k)}$, $n \to \infty$, for fixed $k$ where $d(k)$ is the associated degrees of freedom. Define
$$Z_n(k) = \frac{S_n(k) - d(k)}{\sqrt{2d(k)}} \quad \text{and} \quad z_k(\alpha) = \frac{\chi^2_{d(k)}(\alpha) - d(k)}{\sqrt{2d(k)}}.$$
Assume that there exists a sequence $k_n \to \infty$ such that $Z_n(k_n) \xrightarrow{d} N(0, 1)$, $n \to \infty$. Consider the critical region $\{S_n(k_n) \geq \chi^2_{d(k_n)}(\alpha)\}$. Since $\lim_{n\to\infty} P_n\{Z_n(k_n) \geq z_\alpha\} = \alpha$,
$$\lim_{n\to\infty} P_n\{S_n(k_n) \geq \chi^2_{d(k_n)}(\alpha)\} = \lim_{n\to\infty} P_n\{Z_n(k_n) \geq z_{k_n}(\alpha)\} = \lim_{n\to\infty} P_n\{Z_n(k_n) \geq z_\alpha\} = \alpha.$$
The second equality follows from $Z_n(k_n) \xrightarrow{d} N(0, 1)$, the absolute continuity of the $N(0, 1)$ distribution function and $\lim_{n\to\infty} z_{k_n}(\alpha) = z_\alpha$.

Remark 4.2. Other forms of unrestricted statistics may also be defined that incorporate the maintained information (2.1) to a lesser extent than restricted statistics, e.g., a GMM statistic solely based on the additional conditional moment restrictions (2.2); viz.
$$\mathcal{J}^a = \frac{\mathcal{T}^{g_a}_{GMM} - (J_aMK - p_a)}{\sqrt{2(J_aMK - p_a)}}, \tag{4.4}$$
where $\mathcal{T}^{g_a}_{GMM} = n\hat g_a(\hat\beta_a)'\hat\Omega_a^{-1}\hat g_a(\hat\beta_a)$ with $\hat\beta_a$ the subvector of $\hat\beta$ corresponding to $\beta_a$, $\hat g_a(\beta_a) = \sum_{i=1}^n g_{ai}(\beta_a)/n$ and $\hat\Omega_a = \sum_{i=1}^n g_{ai}(\hat\beta_a) g_{ai}(\hat\beta_a)'/n$. GEL forms $LR^a$, $LM^a$ and $S^a$ follow similarly; cf. (4.1), (4.2) and (4.3) respectively.
The proofs of Theorems 4.1 and 4.2 may be adapted to demonstrate that these statistics each converge in distribution to a standard normal random variate and are mutually asymptotically equivalent but not to the restricted statistics or the unrestricted GEL class defined above.

This section concludes with an asymptotic independence result between the restricted GMM statistic $\mathcal{J}^r$ for testing (2.3) and the corresponding statistic for testing the maintained hypothesis (2.1); viz.
$$\mathcal{J}^m = \frac{\mathcal{T}^{g_m}_{GMM} - (J_mK - p_m)}{\sqrt{2(J_mK - p_m)}}. \tag{4.5}$$

Theorem 4.3. If Assumptions 3.1-3.4 hold and if $K \to \infty$ and $\zeta(K)^2 K^2/n \to 0$, then (a) $\mathcal{J}^m \xrightarrow{d} N(0, 1)$ and (b) $\mathcal{J}^r$ is asymptotically independent of $\mathcal{J}^m$.

A similar result holds for the associated restricted GEL statistics $LR^r$, $LM^r$, $S^r$ and $W^r$ and their counterparts for testing (2.1) if the additional assumption $\zeta(K)^2 K^3/n \to 0$ is imposed.

13 These unrestricted statistics are apposite for a joint test of the additional (2.2) and maintained (2.1) conditional moment restrictions. The statistics $LR^u$ and $S^u$ are forms of GMM and GEL statistics suggested in DIN section 6, pp.67-71, adapted for testing the null hypothesis (2.3).

Remark 4.3. The practical import of Theorem 4.3 is that the overall asymptotic size of the test sequence for (2.1) and (2.2) may be controlled, e.g., (a) test (2.1) using $\mathcal{J}^m$; (b) given (2.1), test (2.2) using $\mathcal{J}^r$, with overall asymptotic test size $1 - (1 - \alpha_m)(1 - \alpha_a)$, where $\alpha_m$ and $\alpha_a$ are the respective asymptotic sizes of the individual tests in (a) and (b).

Remark 4.4. The asymptotic independence of $\mathcal{J}^r$ and $\mathcal{J}^m$ mirrors that of classical and unconditional moment GMM and GEL tests for a sequence of parametric restrictions; see Newey (1985) and Smith (2011). Indeed the unrestricted statistic $\mathcal{J}^u$ is the sum of suitably rescaled restricted $\mathcal{J}^r$ and maintained hypothesis $\mathcal{J}^m$ statistics; cf. the decomposition of standard unrestricted classical or GMM and GEL statistics for parametric restrictions.
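The standardisation used throughout this section and the size calculation of Remark 4.3 can be illustrated numerically. The following is a minimal sketch in Python (standard library only), not the authors' MATLAB code; the function names and all numerical inputs are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def standardise(stat: float, dof: float) -> float:
    """Centre and scale a chi-square-type statistic: (stat - dof) / sqrt(2 * dof)."""
    if dof <= 0:
        raise ValueError("degrees of freedom must be positive")
    return (stat - dof) / sqrt(2.0 * dof)


def one_sided_reject(z_stat: float, alpha: float) -> bool:
    """Reject when the standardised statistic is at least the upper-alpha N(0,1) quantile."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return z_stat >= z_alpha


def overall_size(alpha_m: float, alpha_a: float) -> float:
    """Overall asymptotic size of the two-step sequence in Remark 4.3,
    which relies on the asymptotic independence of the two statistics."""
    return 1.0 - (1.0 - alpha_m) * (1.0 - alpha_a)


# Hypothetical inputs (assumptions for illustration, not values from the paper).
J_a, M, K, p, p_m = 1, 1, 5, 3, 3
dof_r = J_a * M * K - (p - p_m)        # centring constant of the restricted statistic
z_r = standardise(12.0, dof_r)         # 12.0 is a made-up non-standardised statistic
print("standardised statistic:", round(z_r, 3))
print("reject at 5%:", one_sided_reject(z_r, 0.05))
print("overall size with two 5% tests:", round(overall_size(0.05, 0.05), 4))
```

With two individual tests of size 0.05 the overall size is 0.0975, so an analyst targeting an overall 0.05 level would need to choose smaller individual sizes.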
5 Asymptotic Local Power

This section considers the asymptotic distribution of the statistics of the previous sections under a suitable sequence of local alternatives. Critically, this discussion demonstrates the deficiency in terms of asymptotic local power of unrestricted tests which fail to fully incorporate the maintained conditional information (2.1) and thereby the superiority of restricted tests.

The set-up is similar to that in Eubank and Spiegelman (1990) and Hong and White (1995), see also Tripathi and Kitamura (2003), utilising local alternatives to the null hypothesis (2.3) of the form
$$H_{1n} : E[u(z, \beta_{n,0})|s] = \frac{\sqrt[4]{J_aMK}}{\sqrt{n}}\,\xi(s), \tag{5.1}$$
where $\beta_{n,0} \in \mathcal{B}$ is a non-stochastic sequence such that $\beta_{n,0} \to \beta_0$. It is assumed that $E[\xi_m(s)|s_m] = 0$, where $\xi_m(s) = S^u_m \xi(s)$, thus ensuring that the maintained hypothesis $E[u_m(z, \beta_{m0})|s_m] = 0$ (2.1) is not violated.

Remark 5.1. The sequence of local alternatives (5.1) is particularly apposite for the instrumental validity Example 2.2 in which $u(z, \beta) = u_m(z, \beta_m) = u_a(z, \beta_a)$ with $\beta = \beta_m = \beta_a$. If the maintained instruments $s_m$ are a subvector of $s_a$, i.e., $s = s_a$, $E[\xi(s)|s_m] = 0$. Similarly, when $s_m$ is not a subvector of $s_a$, the relevant sequence of local alternatives to $E[u(z, \beta_0)|s_m] = 0$ is the expectation of (5.1) conditional on $s_a$, i.e.,
$$E[u(z, \beta_{n,0})|s_a] = \frac{\sqrt[4]{J_aMK}}{\sqrt{n}}\,E[\xi(s)|s_a].$$

The asymptotic local alternative distributions of the statistics described above are obtained under the following assumption.

Assumption 5.1. (a) $\beta_{n,0}$ is a non-stochastic sequence such that (5.1) holds and $\beta_{n,0} \to \beta_0$; (b) $\sqrt{n}(\hat\beta - \beta_{n,0}) = O_p(1)$; (c) for all $\beta \in \mathcal{N}$, $\Sigma(s, \beta) = E[u(z, \beta)u(z, \beta)'|s]$ and $\Sigma_m(s_m, \beta_m) = E[u_m(z, \beta_m)u_m(z, \beta_m)'|s_m]$ each have smallest eigenvalue bounded away from zero; (d) $\|\xi(s)\|$ is bounded; (e) $\Sigma(s, \beta)$, $\Sigma_m(s_m, \beta_m)$ and $D(s, \beta) = E[\partial u(z, \beta)/\partial\beta'|s]$, $D_m(s_m, \beta_m) = E[\partial u_m(z, \beta_m)/\partial\beta_m'|s_m]$ are continuous functions on a compact closure of $\mathcal{N}$.
The next result summarises the limiting distribution of the restricted statistics $\mathcal{J}^r$, $LR^r$, $LM^r$, $S^r$ and $W^r$ under the sequence of local alternatives (5.1). Let $\Sigma(s) = \Sigma(s, \beta_0)$.

Theorem 5.1. Let Assumptions 3.1-3.4 and 5.1 hold, $K \to \infty$ and $\zeta(K)^2 K^2/n \to 0$. Then $\mathcal{J}^r$ converges in distribution to a $N(\mu_r/\sqrt{2}, 1)$ random variate, where
$$\mu_r = E[\xi(s)'\Sigma(s)^{-1}\xi(s)].$$
If additionally Assumption 3.5 is satisfied and $\zeta(K)^2 K^3/n \to 0$, then $LR^r$, $LM^r$, $S^r$ and $W^r$ are asymptotically equivalent to $\mathcal{J}^r$.

Remark 5.2. Since $\mu_r \geq 0$, tests of the null hypothesis $H_0$ (2.3) based on these statistics should be one-sided. Although not discussed here, a similar analysis to that underpinning DIN Lemma 6.5, p.71, demonstrates the consistency of tests based on the statistics $\mathcal{J}^r$, $LR^r$, $LM^r$, $S^r$ and $W^r$.

The following corollary to Theorem 5.1 details the limiting distribution of the standard forms of unrestricted statistics $LR^u$ (4.1), $LM^u$ (4.2) and $S^u$ (4.3) under the same local alternative sequence (5.1).

Corollary 5.1. Let Assumptions 3.1-3.4 and 5.1 hold and $\zeta(K)^2 K^2/n \to 0$. Then $S^u$ converges in distribution to a $N(\mu_u/\sqrt{2}, 1)$ random variate, where
$$\mu_u = \sqrt{\frac{J_aM}{J_aM + J_m}}\,\mu_r.$$
If additionally Assumption 3.5 is satisfied and $\zeta(K)^2 K^3/n \to 0$, then $LR^u$, $LM^u$ are asymptotically equivalent to $S^u$.

Remark 5.3. Since $\mu_r > \mu_u$, Corollary 5.1 demonstrates that for fixed $M$ restricted tests dominate the standard unrestricted tests in terms of asymptotic local power. Other unrestricted tests that partially or completely fail to incorporate the maintained conditional moment information (2.1) in their formulation are likewise relatively deficient. For example, using a similar analysis to that for Theorem 5.1, the GMM statistic $\mathcal{J}^a$ (4.4) and associated GEL statistics $LR^a$, $LM^a$ and $S^a$ may be shown to converge in distribution under the local alternatives sequence (5.1) to a $N(\mu_a/\sqrt{2}, 1)$ random variable, where $\mu_a = E[\xi(s)'S^{u\prime}_a(S^u_a\Sigma(s)S^{u\prime}_a)^{-1}S^u_a\xi(s)]$. Hence $\mu_r - \mu_a \geq 0$.
Therefore, tests based on these and other unrestricted statistics are asymptotically less powerful relative to restricted tests.

Remark 5.4. Corollary 5.1 also shows that the difference in local asymptotic power between restricted and unrestricted tests declines with increasing $M$, since the noncentrality parameter of the unrestricted tests would then differ little from that of the restricted tests, with consequential similar discriminatory power for both standard unrestricted and restricted tests for local departures from the null hypothesis $H_0$ (2.3).

Remark 5.5. Theorem 5.1 and Corollary 5.1 provide no guidance for the choice of $M$. The effect of $M$ on power for given sample size $n$ and $K$ will depend on the specific alternative hypothesis and correspondingly the relevance of any additional unconditional moment functions included by increasing $M$. More precisely, the efficacy in terms of power of including extra elements in $q^{MK}_a(s_a)$, i.e., increasing $M$, for given $n$ and $K$, will depend on the correlation between these extra elements and the conditional expectation $E[u(z, \beta_0)|s]$. If this correlation is zero or weak then, although not strictly speaking applicable here, an asymptotic local power analysis for the unconditional moment context would indicate that power should be expected to be diminished, since test chi-square degrees of freedom will increase with $M$ but the noncentrality parameter will remain relatively unaltered. Cf. Newey (1985) section 3, pp.238-244, in particular, the discussion following Proposition 6, p.242. If this correlation is strong there will be a trade-off between increases in both degrees of freedom and noncentrality parameter with power potentially enhanced. Simulation evidence reported next in section 6 suggests that for a given sample size $n$ and fixed value of $K$ the correspondence between empirical and nominal test size deteriorates with increasing $M$; a similar deterioration is also observed for size-corrected empirical power, although, it should be emphasised, only against specific sets of alternatives.
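The power comparison in Theorem 5.1, Corollary 5.1 and Remark 5.4 can be made concrete with a small calculation. The sketch below, in Python with the standard library only, evaluates the asymptotic local power of a one-sided test whose statistic is asymptotically normal with unit variance and mean equal to the noncentrality parameter divided by the square root of two, and applies the Corollary 5.1 scaling factor for the unrestricted statistic. The function names and the noncentrality value are assumptions made for illustration; the calculation also holds the restricted noncentrality parameter fixed as $M$ varies, which Remark 5.5 notes need not be the case in practice.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def local_power(noncentrality: float, alpha: float = 0.05) -> float:
    """P{N(noncentrality / sqrt(2), 1) >= z_alpha} for a one-sided test of size alpha."""
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - _STD_NORMAL.cdf(z_alpha - noncentrality / sqrt(2.0))


def unrestricted_noncentrality(nc_restricted: float, J_a: int, J_m: int, M: int) -> float:
    """Corollary 5.1 scaling: sqrt(J_a M / (J_a M + J_m)) times the restricted value."""
    return sqrt(J_a * M / (J_a * M + J_m)) * nc_restricted


nc_r = 2.0  # hypothetical restricted noncentrality parameter
for M in (1, 2, 4, 8):
    nc_u = unrestricted_noncentrality(nc_r, J_a=1, J_m=1, M=M)
    print(M, round(local_power(nc_r), 3), round(local_power(nc_u), 3))
```

In this illustration the unrestricted power is always below the restricted power and the gap narrows as $M$ grows, which is the pattern Remark 5.4 describes.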
6 Simulation Evidence

This section reports the results from a simulation study to assess the performance of some of the tests for ME and JE forms of instrument validity in the linear regression model, see Example 2.2, based on the GMM and GEL statistics developed in previous sections. To provide a realistic setting, the investigation is based on an application to a dataset where the issue of instrument validity is of some interest and importance.

Overall these experiments revealed that nominal size is approximated relatively more closely by the empirical size of (a) the non-standardised tests, see Remark 4.1, and (b) tests based on efficient estimators, cf. Tripathi and Kitamura (2003), although Assumption 3.3(c) only requires $\sqrt{n}$-consistent estimation. Consequently, only results for these forms of statistics are presented. The Wald test statistic $W^r$ (3.11) and score test statistic $S^r$ (3.10) are also excluded for similar reasons. Likewise, only the results for restricted tests are reported as they dominate the unrestricted forms in terms of empirical power reflecting their theoretical superiority; see Corollary 5.1.14

All experiments concern a parametric specification for the Engel curve relationship between the expenditure share of leisure services $y$ and the logarithm of total expenditures $x$ and employ the same data as those in Blundell and Horowitz (2007). These data correspond to a subsample of the household-level observations from the British Family Expenditure Survey and consist of a sample of 1518 married couples with one or two children and an employed head of household. Since many parametric Engel curve specifications are often linear or quadratic in $x$, see, e.g., Muellbauer (1976) and Banks et al. (1997), the experimental basis here is the linear regression model
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + u. \tag{6.1}$$
The maintained instrument $s_m$ is the annual income from wages and salaries of the head of household.
Thus = m = a = ( 0; 1; 2) 0, Ja = Jm = 1 and u(z; ) = um(z; m) = ua(z; a) where u(z; ) = y 0 1x 2x2; see Example 2.2. Cf. Blundell and Horowitz (2007) section 5, p.1051. The regression design incorporates both ME and JE forms of additional conditional constraint restric- tions (2.2); see Remark 2.2. Therefore the hypotheses of interest are as follows. First, the maintained hypothesis (2.1) E[ujsm] = 0. Secondly, the additional conditional moment constraints (2.2): ME E[ujx] = 0, i.e., sa = x, and JE E[ujsm; x] = 0, i.e., sa = (sm; x). 6.1 Experimental Design The parameter vector is estimated using the full data set by ecient two step (2S) GMM, with weight matrix computed using two stage least squares with the single instrument sm, see DIN section 4, pp.63- 65, based on the maintained conditional moment restriction E[u(z; )jsm] = 0. The maintained 2SGMM vector of approximating functions is qKm(sm) with K = 25. 15 2SGMM estimates are denoted as e0 , e 1 and e2 with 2SGMM residual u e = y e0 e1x e2x2. The structure of the data generating process underpinning the design is similar to that in Blundell and Horowitz (2007) section 4, pp.1049-1051. To ensure that the maintained hypothesis E[u(z; )jsm] = 0 holds in the sample consider the residual from a nonparametric series regression of ue on sm for the full 14The full set of simulation results is available from the authors upon request. 15Ecient 2SGMM estimates are y^ = 1:29 (0:662) + 0:629 (0:268) x 0:0609 (0:0269) x2: Estimated standard errors are in parentheses. Tests for ME E[ujx] = 0, i.e., sa = x, and JE E[ujsm; x] = 0, i.e., sa = (sm; x), discussed in section 6.1.3 were conducted on the full data set using the value K = 8 indicated by the rule in section 6.2 below. All ME tests rejected the null hypothesis at nominal levels 0:01, 0:05 and 0:10 for M = 1 and at levels 0:05 and 0:10 when M = 2 providing further support for the results reported in section 5, p.1051, of Blundell and Horowitz (2007). 
At nominal level 0:01 for M = 2 tests based on the GEL LR-type, LM-type and Wald statistics failed to reject the ME null hypothesis whereas those based on the statistics J m, LRMcue and LRMcue(gel) evaluated at EL and ET estimators, gLMm and score statistics did reject at the 0:01 level. These latter tests are precisely those that displayed a close correspondence between empirical and nominal size in the experiments reported below. All tests for the JE null hypothesis E[ujsm; x] = 0 rejected at nominal levels 0:01, 0:05 and 0:1 for both M = 1 and 2. [17] data set, i.e., ue?sm = u eq25m (sm)0 (Q25(sm)0Q25(sm))Q25(sm)0ue, where denotes a generalised inverse and Q25(sm) = (q 25 m (sm1); :::; q 25 m (sm1518)) 0 with the vector q25m (sm) de ned below in section 6.1.1 for n = 1518. Hence E[ue?sm jsm] = 0 approximately; see, e.g., Newey (1994) section 3, pp.6-8. To impose the JE hypothesis E[u(z; )jsa] = 0, where sa = (sm; x), the error term ue?smx is obtained as the residual from the nonparametric series regression of ue on sm and x, i.e., u e? smx = u e q25(s)0 (Q25(s)0Q25(s))Q25(s)0ue, where Q25(s) = (q 25(s1); :::; q 25(s1518)) 0 with q25(s) = (q25m (sm) 0; q25a (sa) 0)0, and then generating the dependent variable as ymc = e0 + e 1x + e 2x 2 + ue?smx. Then E[u e? smxjsm] = 0 and E[ue?smxjsm; x] = 0 approximately. Deviations from the JE null hypothesis are formulated as in ymc = e0+ e 1x+ e 2x 2+uesmx, where uesmx = s smx(ue?smx + (u e? sm ue?smx))=ssmx with ssmx and ssmx the standard deviations of ue?smx and ue?smx + (u e? sm ue?smx) respectively. Experimental data are generated as random samples of size n from (smi; xi; y mc i ) , (i = 1; :::; 1518); simulation random samples are denoted by zi = (smi; xi; y mc i ) , (i = 1; :::; n), below. Empirical test size is examined for sample sizes n = 200, 500, 1000 and 1500 with nominal sizes 0:01, 0:05 and 0:10. Sample sizes of n = 200 and 500 only are considered in those experiments concerned with empirical power. 
All experiments employ 5000 replications and were programmed using MATLAB.

6.1.1 Approximating Functions

Legendre polynomials are used to form the approximating functions in the simulations because of their good collinearity properties, see Belloni et al. (2015) Example 3.1, p.8, and are defined as
$$P_0(v) = 1, \quad P_1(v) = v, \quad P_{r+1}(v) = \frac{(2r + 1)vP_r(v) - rP_{r-1}(v)}{r + 1}, \quad r = 1, 2, 3, \ldots,$$
where $v \in [-1, 1]$; see Abramowitz and Stegun (1970) eq. 8.5.3, p.334.16 Since neither $s_m$ nor $x$ has support $[-1, 1]$ the transformations
$$\tilde s_m = 2\Phi\left(\frac{s_m - \bar s_m}{s_{s_m}}\right) - 1 \quad \text{and} \quad \tilde x = 2\Phi\left(\frac{x - \bar x}{s_x}\right) - 1$$
are employed where $\Phi(\cdot)$ is the $N(0, 1)$ cumulative distribution function; for a given replication of sample size $n$, $\bar s_m = \sum_{i=1}^n s_{mi}/n$, $s_{s_m} = \sum_{i=1}^n (s_{mi} - \bar s_m)^2/n$ and $\bar x = \sum_{i=1}^n x_i/n$, $s_x = \sum_{i=1}^n (x_i - \bar x)^2/n$.17

The maintained conditional moment $E[u(z, \beta)|s_m]$, cf. (2.1), is approximated using the vector of functions $q^K_m(s_m)$ with elements $P_j(\tilde s_m)$, $(j = 0, \ldots, K - 1)$. For ME $E[u(z, \beta)|x]$ is approximated using a polynomial of order $MK$ in $x$, i.e., $q^{MK}_a(s_a)$ has elements $P_k(\tilde x)$, $(k = 1, \ldots, MK)$. The JE case $E[u(z, \beta)|s_a]$, $s_a = (s_m, x)$, uses the $[(MK)^{1/2}]^2$-vector of approximating functions $q^{MK}_a(s_a)$ with elements $P_j(\tilde s_m)P_k(\tilde x)$, $(j = 0, \ldots, [(MK)^{1/2}] - 1;\ k = 1, \ldots, [(MK)^{1/2}])$, resulting in the null hypothesis vector of approximating functions $q^K(s) = (q^K_m(s_m)', q^{MK}_a(s_a)')'$. See fn. 6.

16 Lorentz (1986) Theorem 8, p.90, establishes the requisite uniform convergence for polynomial approximating functions; cf. Assumption 3.1.

17 We are grateful to V. Chernozhukov for this suggestion.

6.1.2 Estimators

Efficient estimation methods examined include 2SGMM (gmm) with weight matrix computed as above, continuous updating (cue), empirical likelihood (el) and exponential tilting (et). The subscripts ma, me and je indicate estimation incorporating maintained, ME and JE restrictions respectively. gmm, cue and et are computed using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm of MATLAB.
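The construction of the approximating functions in section 6.1.1 can be sketched as follows. This is an illustrative Python version using only the standard library, not the authors' MATLAB implementation. It evaluates the Legendre polynomials through the three-term recurrence and maps a sample into $(-1, 1)$ through the normal distribution function. The function names are hypothetical, and scaling by the sample standard deviation (divisor $n$) is an assumption made here for the sketch.

```python
from statistics import NormalDist, fmean, pstdev
from typing import List, Sequence


def legendre_values(v: float, max_order: int) -> List[float]:
    """Return [P_0(v), ..., P_max_order(v)] using the three-term recurrence."""
    if max_order < 0:
        raise ValueError("max_order must be non-negative")
    values = [1.0]                      # P_0(v) = 1
    if max_order >= 1:
        values.append(v)                # P_1(v) = v
    for r in range(1, max_order):
        # P_{r+1}(v) = ((2r + 1) v P_r(v) - r P_{r-1}(v)) / (r + 1)
        values.append(((2 * r + 1) * v * values[r] - r * values[r - 1]) / (r + 1))
    return values


def to_unit_interval(sample: Sequence[float]) -> List[float]:
    """Map each observation to 2 * Phi((x - mean) / sd) - 1, which lies in (-1, 1)."""
    mean = fmean(sample)
    sd = pstdev(sample)                 # assumption: standard deviation with divisor n
    if sd == 0.0:
        raise ValueError("sample must not be constant")
    phi = NormalDist().cdf
    return [2.0 * phi((x - mean) / sd) - 1.0 for x in sample]


def maintained_basis(sample: Sequence[float], K: int) -> List[List[float]]:
    """Rows of P_0, ..., P_{K-1} evaluated at the transformed observations."""
    return [legendre_values(v, K - 1) for v in to_unit_interval(sample)]


# Hypothetical instrument values, for illustration only.
rows = maintained_basis([12.0, 15.5, 9.0, 20.0, 14.2], K=4)
print(len(rows), len(rows[0]))
```

Each row of the returned list holds the $K$ basis functions for one observation; the ME and JE bases would be built analogously from the transformed regressor and from products of the two sets of polynomials.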
EL is more problematic because in some samples for particular BFGS EL estimates $\hat\beta_{EL}$ the convex hull condition $\|\sum_{i=1}^n \hat\pi^{EL}_i g(z_i, \hat\beta_{EL})\| < 10^{-4}$ may not be satisfied, where the EL implied probabilities $\hat\pi^{EL}_i = 1/[n(1 + \hat\lambda_{EL}'g(z_i, \hat\beta_{EL}))]$, $(i = 1, \ldots, n)$, and the EL Lagrange multiplier $\hat\lambda_{EL} = \hat\Omega^{-1}\hat g(\hat\beta_{EL})$ with $\hat\Omega = \sum_{i=1}^n \hat\pi^{EL}_i g(z_i, \hat\beta_{EL})g(z_i, \hat\beta_{EL})'$ and $\hat g(\beta) = \sum_{i=1}^n g(z_i, \beta)/n$; see Newey and Smith (2004) Theorem 2.3, p.224. Hence el is computed using the matElike MATLAB package with the optional Zipsolver package; see Zedlewski (2008).18 In the case of non-convergence, el is computed employing BFGS applied to the EL dual problem with the Lagrange multiplier obtained using MATLAB code based on Owen (2001) eq. (12.3), p.235.19 EL estimates obtained via this procedure are only considered to be valid solutions if the convex hull condition is satisfied, otherwise no solution in the convex hull is reported. Note, however, that in the test size and power results reported in sections 6.3 and 6.4 the EL estimates satisfied the convex hull condition in all replications.20

6.1.3 Test Statistics

Restricted tests for ME $E[u|x] = 0$ and JE $E[u|s_m, x] = 0$ adopt the following notation. The superscripts $m$ and $j$ refer respectively to the ME or JE hypothesis under test, with the subscripts cue, el, et referring to which GEL criterion is used to construct the test and, as above, denoting the efficient estimator(s) employed. E.g., the non-standardised restricted GEL LR-type statistic for JE based on EL criteria and estimators is denoted as $LR^j_{el} = 2n(\hat P^{g}_{el}(\hat\beta_{el_j}, \hat\lambda_{el_j}) - \hat P^{g_m}_{el}(\hat\beta_{el_{ma}}, \hat\lambda_{el_{ma}}))$, cf. (3.8). LR-type CUE statistics evaluated at null and the maintained hypothesis EL and ET estimators are also computed using the subscript cue(gel) to denote the use of the CUE criterion and GEL estimators, e.g., for JE, $LR^j_{cue(gel)} = 2n(\hat P^{g}_{cue}(\hat\beta_{gel_j}, \hat\lambda_{gel_j}) - \hat P^{g_m}_{cue}(\hat\beta_{gel_{ma}}, \hat\lambda_{gel_{ma}}))$. The non-standardised robustified score $S$ and Wald $W$ statistics, see fn.
11, evaluated at the corresponding efficient ma estimator are also examined. Restricted ME and JE non-standardised test statistics are calibrated against chi-square distributions with $MK$ and $[(MK)^{1/2}]^2$ degrees of freedom respectively.21

18 matElike, rather than solving the dual EL problem, solves the primal EL problem directly and is chosen as the default algorithm because it is faster on average than BFGS. Both BFGS and matElike solutions are identical if each converges to a solution in the convex hull.

19 el computation requires some care since the EL criterion involves the logarithm function which is undefined for negative arguments. This difficulty is avoided by replacing logarithms with a function that is logarithmic for arguments larger than a small positive constant and quadratic below that threshold. The code is available at http://www-stat.stanford.edu/~owen/empirical/

20 In a preliminary study the convex hull condition was found to be violated for values of $K$ and $M$ larger than those considered here. The adjusted EL estimator of Chen et al. (2008) offers an alternative to EL in such circumstances.

21 A number of asymptotically equivalent test statistics for the maintained hypothesis (2.1) were also investigated. The Durbin (1954)-Wu (1973)-Hausman (1978) test based on an auxiliary regression as described in Davidson and MacKinnon

GEL LM, score and Wald ME and JE test statistics require estimators of the variance matrix $\Omega = E[g(z, \beta_0)g(z, \beta_0)']$ and Jacobian $G = E[\partial g(z, \beta_0)/\partial\beta']$. The estimators considered for $\Omega$ and $G$ are $\hat\Omega = n^{-1}\sum_{i=1}^n g(z_i, \hat\beta_{gel})g(z_i, \hat\beta_{gel})'$ and $\hat G = n^{-1}\sum_{i=1}^n \partial g(z_i, \hat\beta_{gel})/\partial\beta'$ where $\hat\beta_{gel}$ is the null hypothesis GEL estimator. Additional results are also presented for ME and JE LM tests based on the consistent estimator $\tilde\Omega_k = \hat\Omega_k\hat\Omega^{-1}\hat\Omega_k$ for $\Omega$, see fn. 22, where22
$$\hat\Omega_k = \sum_{i=1}^n \hat k_i g(z_i, \hat\beta_{gel})g(z_i, \hat\beta_{gel})', \qquad \hat k_i = \frac{\rho_1(\hat\lambda_{gel}'g(z_i, \hat\beta_{gel})) + 1}{n\hat\lambda_{gel}'g(z_i, \hat\beta_{gel})}, \quad (i = 1, \ldots, n).$$
LM statistics based on $\tilde\Omega_k$ are denoted gLM.
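The degrees of freedom just described, together with the rule $K = [Cn^{0.19}]$ with $C = 2$ adopted in section 6.2 below, are easy to tabulate. The following Python sketch (standard library only) does so; the function names are hypothetical, and reading the square brackets as the integer part is an assumption, although it reproduces the values of $K$ reported in section 6.2.

```python
from math import floor, isqrt


def num_instruments(n: int, C: float = 2.0, exponent: float = 0.19) -> int:
    """Integer part of C * n**exponent (assumes [.] denotes the integer part)."""
    return floor(C * n ** exponent)


def me_dof(M: int, K: int) -> int:
    """Chi-square degrees of freedom for the non-standardised ME tests: M * K."""
    return M * K


def je_dof(M: int, K: int) -> int:
    """Chi-square degrees of freedom for the non-standardised JE tests:
    the square of the integer part of sqrt(M * K)."""
    return isqrt(M * K) ** 2


for n in (200, 500, 1000, 1500):
    K = num_instruments(n)
    for M in (1, 2):
        print(n, K, M, me_dof(M, K), je_dof(M, K))
```

Because the JE basis is built from products of two polynomial sets, its dimension is always a perfect square, so for some combinations of $M$ and $K$ it is smaller than $MK$.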
6.2 Choice of the Number of Instruments

Implementation of the above tests requires a choice of $K$ to employ under the maintained hypothesis. Because the Donald et al. (2009) method and selection criteria such as SBC predominantly indicated choices of $K$ that varied relatively little with sample size, following DIN Table 1, p.71, $K$ was chosen to satisfy $K^5/n \to 0$ according to the rule $K = [Cn^{0.19}]$ with $C = 2$, resulting in $K = 5$, 6, 7 and 8 corresponding to sample sizes $n = 200$, 500, 1000 and 1500 respectively. To explore the effect of increased $M$ on test size and power the values $M = 1$ and 2 were examined.

6.3 Empirical Size

The results on empirical size reported here correspond to a nominal asymptotic level of 0.05; those results for nominal levels 0.01 and 0.10 are qualitatively similar and are therefore omitted.

6.3.1 ME

Table B.1 presents the empirical rejection frequencies for $M = 1$ and 2 for non-standardised restricted tests of the ME hypothesis $E[u(z, \beta_0)|x] = 0$ incorporating the maintained hypothesis moment restrictions $E[u(z, \beta_0)|s_m] = 0$.

In general, tests based on the Lagrange multiplier statistics $LM^m_{el}$, $LM^m_{et}$ and Wald statistics $W^m_{el}$, $W^m_{et}$ and, to a lesser extent, the $LR^m_{el}$, $LR^m_{et}$ tests suffer from size distortions for moderate sample sizes $n = 200$ and 500, with a serious deterioration in performance as $M$ increases from 1 to 2, i.e., as the number of unconditional moments under test increases. Of those remaining, the LR-type statistics $LR^m_{cue}$, $LR^m_{cue(el)}$, $LR^m_{cue(et)}$, the LM-type statistics $gLM^m_{el}$, $gLM^m_{et}$ and the ET robust score statistic $S^m_{et}$ have good size properties. The 2SGMM criterion $\mathcal{J}^m$ statistic tends to be undersized and the EL robust score $S^m_{el}$ statistic somewhat oversized except for the larger sample sizes.23 Generally

(1993) section 7.9, p.237, see also Wooldridge (2002) section 6.2.1, p.118, was also considered. Results are available on request from the authors.
22 Adapting Newey and Smith (2004) Theorem 2.3, p.224, the LM statistic for overidentifying moment conditions based on $\tilde\Omega_k$ is identical to the score statistic based on $\hat\Omega$, i.e., $n\hat\lambda_{gel}'\tilde\Omega_k\hat\lambda_{gel} = n\hat g(\hat\beta_{gel})'\hat\Omega^{-1}\hat g(\hat\beta_{gel})$.

23 Matsushita and Otsu (2013) obtained similar results for EL LR-type tests for overidentifying conditions to those reported here.

speaking, for a given sample size $n$ and thus fixed $K$ there is a deterioration in performance to a lesser or greater degree for larger $M$, a finding also mirrored in other experiments by increasing $K$ with fixed sample size.

In summary, tests for ME based on the statistics $LR^m_{cue}$ and $LR^m_{cue(el)}$, $LR^m_{cue(et)}$ and $gLM^m_{el}$, $gLM^m_{et}$ and $S^m_{et}$ appear to be the most reliable in terms of empirical size.

6.3.2 JE

Table B.2 presents the rejection frequencies for $M = 1$ and 2 for non-standardised restricted tests of the JE null hypothesis $E[u(z, \beta_0)|s_m, x] = 0$.

The general conclusions are quite similar to those for the ME tests. Overall performance worsens substantially for the larger $M$ for moderate sample sizes $n = 200$ and 500 for all test versions. The CUE LR-type forms $LR^j_{cue}$, $LR^j_{cue(el)}$, $LR^j_{cue(et)}$ evaluated at CUE, EL and ET estimators, the GEL LM-type statistics $gLM^j_{el}$, $gLM^j_{et}$ and the robust score statistic $S^j_{et}$ display the most satisfactory empirical size at the nominal 0.05 level whereas, as above, the 2SGMM criterion $\mathcal{J}^j$ and the EL robust score $S^j_{el}$ statistics are respectively undersized and oversized in the smaller sample sizes.

6.4 Empirical Power

Tables B.3 and B.4 present size-corrected (sc) and non size-corrected (nsc) empirical rejection frequencies at the 0.05 level of tests for the ME and JE hypotheses.24 Given their poor size performance, tests based on the Lagrange multiplier statistics $LM_{el}$, $LM_{et}$ and Wald statistics $W_{el}$, $W_{et}$ are not considered in this section.
Typically both rejection frequencies increase substantially as sample size $n$ increases from 200 to 500 but decline with increased $M$, although there are some exceptions for $n = 200$ and small deviations from the null hypothesis. In general the statistics that performed well in terms of empirical size yield similar rejection frequencies under the alternatives considered here.

6.4.1 ME

Table B.3 presents empirical rejection frequencies for non-standardised restricted ME tests for values $M = 1$ and 2 based on 0.05 level size-corrected and nominal non size-corrected critical values for nonzero deviations from the ME hypothesis $E[u(z, \beta_0)|x] = 0$.

In general, both rejection frequencies increase with the size of the deviation and sample size $n$ and decline with $M$, with some exceptions when the deviation equals 0.2. Size-corrected empirical power differences between tests are smaller at larger deviations and sample sizes $n$. Overall, tests based on the LR-type statistics $LR^m_{el}$ and $LR^m_{et}$ using the nominal 0.05 chi-square critical value are most powerful, but it is precisely these tests that display an unsatisfactory correspondence between empirical and nominal size. Empirical power is relatively low when the deviation equals 0.2 for all tests employing size-corrected or non size-corrected critical values. Generally speaking, empirical power for all tests employing size-corrected critical values, not just those with reasonable empirical size characteristics, is rather similar for both the smaller $n = 200$ and larger $n = 500$ sample sizes.

24 Horowitz and Savin (2000) argue that empirical rejection frequencies based on nominal critical values are the most relevant since size-correction is not realistically implementable in practice.

6.4.2 JE

Table B.4 presents empirical rejection frequencies for non-standardised restricted JE tests for values $M = 1$ and 2 based on 0.05 level size-corrected and nominal non size-corrected critical values for nonzero deviations from the JE hypothesis $E[u(z, \beta_0)|s_m, x] = 0$.
Similar general conclusions to those for the ME tests above broadly follow. Interestingly, given $M$, sample size $n$ and thus $K$, rejection frequencies are higher than those obtained for the ME hypothesis.

6.5 Summary

The empirical size of non-standardised tests more closely approximates nominal size than that of standardised tests. The use of efficient rather than root-$n$ consistent estimators is recommended for test construction. Restricted tests dominate unrestricted tests in terms of empirical power. Empirical power typically declines for increases in $M$ for both ME and JE tests.

For both the ME $E[u(z, \beta_0)|x] = 0$ and JE $E[u(z, \beta_0)|s_m, x] = 0$ hypotheses, empirical sizes of restricted tests based on the restricted CUE LR-type statistics $LR_{cue}$, $LR_{cue(el)}$, $LR_{cue(et)}$, evaluated at CUE, EL and ET estimates, the LM-type statistics $gLM_{el}$, $gLM_{et}$ and the robust ET score versions $S_{et}$ most closely approximate nominal size. The differences in empirical power with size-corrected critical values between these tests are rather marginal.

7 Conclusions

The primary focus of this article has been the provision of tests for additional conditional moment constraints in cross-section or short panel data contexts. The principal contribution is the explicit incorporation of conditional moment restrictions defining the maintained hypothesis in the formulation of the test statistics, mirroring test construction in the classical parametric likelihood setting.

The approach reinterprets the respective conditional moment hypotheses as infinite numbers of unconditional moment restrictions with the corresponding tests formulated as tests for additional sets of infinite numbers of unconditional moment restrictions. The limiting distributions of these test statistics are derived under the null hypothesis and suitable sequences of local alternatives.
These results suggest that restricted tests that fully incorporate maintained moment constraints in their construction should dominate in terms of power unrestricted tests that fail to do so. [22] The simulation experiments undertaken to explore the ecacy of the various tests proposed in the paper indicate a number of restricted tests possess both suciently satisfactory empirical size and power characteristics to allow their recommendation for econometric practice. The methods proposed in this paper are also relevant for short panel data models with independent cross sections and strictly exogenous instruments. The development of results pertinent for conditional moment constraints involving di erent instruments in di erent time periods is the subject of current research; cf. Holtz-Eakin et al. (1988), Arellano and Bond (1991) and Chamberlain (1992). Appendix A: Proofs of Results Throughout the Appendix, C will denote a generic positive constant that may be di erent in di erent uses with CS, T and cr Cauchy-Schwarz, triangle and Loeve cr, Davidson (1994), p.140, inequalities respectively. Also we write w.p.a.1 for \with probability approaching 1". A.1 Asymptotic Null Distribution Proof of Theorem 4.1. See Supplement Proof of Theorem 4.1. Proof of Theorem 4.2. See Supplement Proof of Theorem 4.2. Proof of Theorem 4.3. The proof uses the Cramer-Wold device. Consider the linear combination J c = rJ r + mJm: where r and m are arbitrary nite scalars such that 2 r + 2 m > 0. The desired result is obtained if J c d! N(0; 2r + 2m). First, by DIN Lemma 6.1, p.69, J r ng^( 0) 0 1g^( 0) ng^m( m0)0 1m g^m( m0) JaMKp 2JaMK p! 0: Likewise Jm ng^m( m0) 0 1m g^m( m0) JmKp 2JmK p! 0: Therefore, J c 1p JaM ng^( 0) 0Qg^( 0) ( rJaM + mJm p JaM=Jm)Kp 2K p! 0; where Q = r 1 ( r m p JaM=Jm)Sm 1 m S 0 m. To prove p JaMJ c d! N(0; v), where v = ( 2r + 2m)JaM , the conditions Supplement Lemma S.3(a)- (f) are veri ed below. Condition (a): immediate. 
[23] Condition (b): tr(Q ) = rtr(I(Jm+JaM)K) ( r m p JaM=Jm)tr(IJmK) = r(Jm + JaM)K ( r m p JaM=Jm)JmK = r(JaM + mJm p JaM=Jm)K = aK: Condition (c): note that (Q )2 = ( rI(Jm+JaM)K ( r m p JaM=Jm)Sm 1 m S 0 m ) 2 = 2rI(Jm+JaM)K ( 2r 2m(JaM=Jm))Sm 1m S0m : Hence tr[(Q )2] = ( 2r + 2 m)JaMK = vK: Condition (d): (Q )4 = ( 2rI(Jm+JaM)K ( 2r 2m(JaM=Jm))Sm 1m S0m )2 = 4rI(Jm+JaM)K ( 4r 4m(JaM=Jm)2)Sm 1m S0m : Thus tr[(Q )4] = ( 4r + 4 mJaMJm)JaMK = o(K2): Condition (e): from DIN Lemma A.6, p.78, 1=C  min()  max()  C and 1=C  min( )  max( )  C. Therefore, using Assumption 3.2 E[(g(z; 0) 0( r 1 ( r m p JaM=Jm)Sm 1 m S 0 m)g(z; 0)) 2]  C(K)2K = o(nK) since (K)2K2=n! 0. Condition (f): by a similar reasoning to that for Condition (e) E[(g(z; 0) 0 1g(z; 0))2]  C(K)2K: Also Q Q = ( r 1 ( r m p JaM=Jm)Sm 1 m S 0 m) ( r 1 ( r m p JaM=Jm)Sm 1 m S 0 m) = 2r( 1 Sm 1m S0m) + 2m(JaM=Jm)Sm 1m S0m: Thus, cf. Condition (e), E[(g(z; 0) 0Q Qg(z; 0))2]  C(K)2K: [24] A.2 Asymptotic Local Alternative Distribution Let ui( ) = u(zi; ), umi( m) = S u mui( ) = um(zi; m), gi( ) = S(ui( ) qi), gmi( ) = umi( m) qmi, where qi = q K(si) and qmi = q K m(smi), g^i = gi( ^), g^mi = gmi( ^m) and gi;n = gi( 0;n), gmi;n = gmi( m0;n), (i = 1; :::; n). Also let ui;n = ui( 0;n), umi;n = umi( m0;n), i( ) = E[ui( )ui( ) 0jsi], mi( ) = E[umi( m)umi( m) 0jsmi], i;n = i( 0;n) = E[ui;nu0i;njsi], mi;n = mi( m0;n) = E[umi;nu0mi;njsmi], (i = 1; :::; n), together with ^ = X i g^ig^ 0 i=n; ~ n = X i gi;ng 0 i;n=n;  n = S( X i i;n qiq0i)S0=n; n = E[gi;ng0i;n]: and ^m = X i g^mig^ 0 mi=n; ~ mn = X i gmi;ng 0 mi;n=n;  mn = ( X i mi;n qmiq0mi)=n; mn = E[gmi;ng0mi;n]: Proof of Theorem 5.1. The result is established rst for the GMM statistic J r. Let g^mn = g^m( mn;0) and g^n = g^( n;0). Note mn = Sm nS 0 m. Then, by Supplement Lemma S.8, ng^( ^)0 ^1g^( ^) ng^0n 1n g^np 2JaMK p! 0; ng^m( ^m) 0 ^1m g^m( ^m) ng^0mn 1mng^mnp 2JaMK p! 0: Hence J r (ng^0n( 1n S0m 1mnSm)g^n JaMK)= p 2JaMK p! 
0: It remains to prove that ng^0n( 1 n S0m 1mnSm)g^n JaMKp 2JaMK d! N(r= p 2; 1): Let gi;n = E[gi;njsi] and ~gi;n = gi;n gi;n, (i = 1; :::; n). Also let gn = Pn i=1 gi;n=n and ~gn =Pn i=1 ~gi;n=n. Write Pn = 1 n S0m 1mnSm. Then, g^0nPng^n = ~g 0 nPn~gn + 2g 0 nPn~gn + g 0 nPngn: The rst step demonstrates g0nPngn = p JaMK n (r + op(1)): Let i = (si) and mi = m(si), (i = 1; :::; n). It follows by Supplement Lemma S.4 that g0n  1 n gn = p JaMK n Xn i;j=1 (i qi)0S0  1n S(j qj)=n2 = p JaMK n (E[(s)0(s)1(s)] + op(1)): [25] Next, note Sm(i qi) = mi qmi, (i = 1; :::; n). Thus, again using Supplement Lemma S.4, g0nS 0 m  1mnSmgn = p JaMK n Xn i;j=1 (mi qmi)0  1mn(mj qmj)=n2 = p JaMK n (E[E[m(s)jsm]0m(sm)1E[m(s)jsm]] + op(1)) = p JaMK n op(1); since E[m(s)jsm] = 0 by hypothesis. It remains to show that np 2JaMK g0n( 1 n  1n )gn p! 0; np 2JaMK g0nS 0 m( 1 mn  1mn)Smgn p! 0: Similarly to DIN Proof of Lemma 6.1, pp.87-88, from Supplement Lemma S.6, ng0n( 1n  1n )gn =p2JaMK  n 1n gn 2 ( n  n + C n  n 2)=p2JaMK = n 1n gn 2Op((K)pK=n)=p2JaMK = op(1) since 1n gn 2 = g0n 2n gn  Cg0n 1n gn = Op(pK=n). Likewise ng0nS0m( 1mn  1mn)Smgn =p2JaMK = op(1). Therefore, g0nPngn = p JaMK n (r + op(1)): Secondly, it is shown that ng0nPn~gn= p 2JaMK = op(1): Noting kik2 bounded and i;n(si)1 bounded for n large enough, by cr E[kui;n E[ui;njsi]k4]  8(E[kui;nk4] + E[kE[ui;njsi]k4]) = 8(E[E[kui;nk4 jsi]) + E[JaMK n2 kik4]]  C for n large enough as E[kui;nk4 jsi]  C and K=n2 ! 0. Hence, by Supplement Lemma S.5, g0n  1 n ~gn = 4 p JaMK n Xn i;j=1 (i qi)0S0  1n ~gj;n=n p n = Op( 4 p JaMK=n): Next, by hypothesis, ng0n( 1n  1n )~gn =p2JaMK  n 1n gn 1n ~gn ( n  n + C n  n 2)=p2JaMK = n 1n gn 1n ~gn Op((K)pK=n)=p2JaMK = op(1) since 1n gn 2 = Op(pK=n) from above and 1n ~gn  1n g^n + 1n gn = Op(pK=n) + Op( 4 p K=n2). A similar analysis yields ng0nS 0 m 1 mnSm~gn= p 2JaMK = op(1). [26] Let gmi;n = E[gmi;njsi], ~gmi;n = gmi;n gmi;n, (i = 1; :::; n). Finally to prove n~g0nPn~gn JaMKp 2JaMK d! 
N(0; 1) it is rst established that n~g0n(Pn P n)~gnp 2JaMK = op(1) where P n = 1 n S0m( mn)1Sm with n = E[~gi;n~g0i;n] and mn = E[~gmi;n~g0mi;n]. By T jn~g0n(Pn P n)~gnj  n~g0n( 1n ( n)1)~g0n + n~g0n(S0m 1mnSm S0m( mn)1Sm)~g0n The rst term n~g0n( 1n ( n)1)~g0n  n 1n ~gn 2 (k n nk+ C k n nk2): Therefore, noting n = n E[gi;ng0i;n], from eq.(5.1) k n nk = 4 p JaMKp n E[kik2 kqik2]1=2 = Op( 4 p K3p n ): Consequently, since 1n ~gn = Op(pK=n) +Op( 4pK=n2), n~g0n( 1n 1n )~g0n p 2JaMK  Op(K) +Op( p K)p 2JaMK (O( 4 p K3p n ) + p K3 n ) = op(1): Similarly n~g0nS0m( 1mn ( mn)1)Sm~g0np2JaMK = op(1): Therefore n~g0n(Pn P n)~gnp 2JaMK = op(1) Note that 1=C  min ( n)  max ( n)  C because j (A)  (B)j  kABk, jmin ( n) min ( n)j = o (1) and jmax ( n) max ( n)j = o (1). Similarly 1=C  min ( mn)  max ( mn)  C: Supplement Lemma S.2 is now invoked to prove n~g0nP  n~gn JaMKp 2JaMK d! N(0; 1): First, tr( nP  n) = JaMK. Secondly, to establish E[(~g0i;nP  n~gi;n) 2] = op(K p n); by cr E[(~g0i;nP  n~gi;n) 2]  2E[(~g0i;n( n)1~gi;n)2] + 2E[(~g0i;nS0m( mn)1Sm~gi;n)2]: [27] Again using cr E[(~g0i;n(  n) 1~gi;n)2]  3E[(g0i;n( n)1gi;n)2] + 12E[(g0i;n( n)1gi;n)2] + 3E[(g0i;n( n)1gi;n)2]: Now, for n large enough, E[(g0i;n(  n) 1gi;n)2]  CE[kgi;nk4]. Since n;0 2 N for n large enough, by Assumption 3.4(c), similarly to DIN Proof of Theorem 6.3, pp.89-90, E[kgi;nk4]  E[kqik4E[kui;nk4 jsi]]  CE[kqik4]  C(K)2K: Next, E[(g0i;n(  n) 1gi;n)2]  C( p K=n)E[kik2 kqik2]  CK p K=n: Lastly, E[(g0i;n(  n) 1gi;n)2]  C(K=n2)E[kik4 kqik4]  C(K)2K2=n2: Hence, E[(~g0i;n(  n) 1~gi;n)2] = op(K p n) as required. Likewise, E[(~g0i;nS 0 m(  mn) 1Sm~gi;n)2] = op(K p n). Thirdly, P n  nP  n = P  n . Therefore, n~g0nPn~gn JaMKp 2JaMK d! N(0; 1): The conclusion of the theorem for J r then follows. The proof structure for the restricted GEL statistics LRr, LMr, Sr and Wr is similar to that for Theorem 4.2 demonstrating their mutual asymptotic equivalence to the GMM statistic J r under the local alternatives (5.1). 
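The argument for $\mathcal{J}^r$ above rests on two algebraic facts about a matrix of the form $P = \Omega^{-1} - S_m'\Omega_m^{-1}S_m$ with $\Omega_m = S_m\Omega S_m'$: the trace of $\Omega P$ equals the difference between the two moment dimensions, and $P\Omega P = P$. The following is a minimal numerical sanity check of that algebra only, not part of the proof. The dimensions, the random positive definite matrix and the choice of selection matrix are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

# Hypothetical dimensions, chosen only for illustration:
# d plays the role of (J_m + J_a M) K and d_m the role of J_m K.
d, d_m = 7, 3

rng = np.random.default_rng(0)
A = rng.standard_normal((d, d))
Omega = A @ A.T + d * np.eye(d)      # symmetric positive definite stand-in for Omega

# Assumption: S_m is a selection matrix picking the first d_m moments (g_m = S_m g).
S_m = np.eye(d)[:d_m, :]
Omega_m = S_m @ Omega @ S_m.T        # Omega_m = S_m Omega S_m'

P = np.linalg.inv(Omega) - S_m.T @ np.linalg.inv(Omega_m) @ S_m

# tr(Omega P) should equal d - d_m, the analogue of J_a M K.
trace_gap = abs(np.trace(Omega @ P) - (d - d_m))
# P Omega P should reproduce P.
idempotency_gap = np.max(np.abs(P @ Omega @ P - P))

print(f"|tr(Omega P) - (d - d_m)| = {trace_gap:.2e}")
print(f"max |P Omega P - P|       = {idempotency_gap:.2e}")
```

Both gaps should be at the level of floating point rounding error for any positive definite `Omega` and any full row rank selection matrix.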
The proofs for LMr, Sr and Wr are omitted for brevity. First apply the decomposition for LRr in Supplement eq. (S.3). A similar argument to that in Supple- ment Proof of Theorem 4.2 establishes kg^m g^m0k  Op( p K=n). Thus, from T and Supplement Lemma S.9, kg^mk = Op( p K=n) and, therefore, ^m = Op(pK=n) by Supplement Lemma S.10. Consequently, since ^m 2 ^mn ( ^m) and the rst order conditions for m are satis ed w.p.a.1, an expansion around m = 0 gives g^m( ^m) _ m^m = 0 where _ m = Pn i=1 2( _0mg^mi)g^mig^ 0 mi=n and _m lies between ^m and 0. Thus, w.p.a.1, ^m = _ 1m g^m( ^m) and 2nP^ gm ( ^m; ^m) = ng^m( ^m)0(2 _ 1m _ 1m  m _ 1m )g^m( ^m) where  m = Pn i=1 2( 0mg^mi)g^mig^ 0 mi=n and m lies between ^m and 0. It remains to prove that 2 _ 1 m _ 1m  m _ 1 m ^1m = op(1= p K). Now, by Supplement Lemmata S.1 and S.6, ^m mn = op(1=pK), _ m mn = op(1=pK) and  m mn = op(1=pK). Consequently, 2 _ m  m mn p! 0 and max((2 _ m  m)1)  C w.p.a.1. Thus, by T, as (2 _ 1m _ 1m  m _ 1m )1 = _ m(2 _ m  m) 1 _ m, _ m(2 _ m  mn)1 _ m mn(2 _ m  m)1 mn  op(1=pK). Also, as max ( mn)  C, mn(2 _ m  m)1 mn mn  op(1=pK) yielding _ 1m (2 _ m  m) _ 1m 1mn = op(1=pK). Therefore, as ^1m 1mn = op(1=pK), the third term in the decomposition for LRr in Supplement eq. (S.3) is op(1). Likewise, the second term in Supplement eq. (S.3) is op(1). Therefore, from the rst term in Supplement eq. (S.3), LRr d! N(r=p2; 1). [28] References Abramowitz, M. and I.A. Stegun, 1972, Handbook of mathematical functions with formulas, graphs, and mathematical tables, Applied Mathematics Series 55, 10th. Edition. Dover Publications, New York. Aitchison, J., 1962, Large-sample restricted parametric tests. Journal of the Royal Statistical Society (Series B) 24, 234-250. Arellano, M. and S. Bond, 1991, Some tests of speci cation for panel data: Monte Carlo edidence and an application to employment equations. Review of Economic Studies 58, 277-297. Banks, J., Blundell, R.W. and A. 
Lewbel, 1997, Quadratic Engel curves, indirect tax reform and welfare measurement. Review of Economics and Statistics 79, 527-539.

Belloni, A., Chernozhukov, V., Chetverikov, D. and K. Kato, 2015, Some new asymptotic theory for least squares series: Pointwise and uniform results. Forthcoming in Journal of Econometrics.

Bierens, H.J., 1982, Consistent model specification tests. Journal of Econometrics 20, 105-134.

Bierens, H.J., 1990, A consistent conditional moment test of functional form. Econometrica 58, 1443-1458.

Blundell, R.W. and J.L. Horowitz, 2007, A non-parametric test of exogeneity. Review of Economic Studies 74, 1035-1058.

Carrasco, M. and J.-P. Florens, 2000, Generalization of GMM to a continuum of moment conditions. Econometric Theory 16, 797-834.

Chamberlain, G., 1987, Asymptotic efficiency in estimation with conditional moment restrictions. Journal of Econometrics 34, 305-334.

Chamberlain, G., 1992, Comment: Sequential moment restrictions in panel data. Journal of Business and Economic Statistics 10, 20-26.

Chen, J., Variyath, A.M. and B. Abraham, 2008, Adjusted empirical likelihood and its properties. Journal of Computational and Graphical Statistics 17, 426-443.

Chen, X. and D. Pouzo, 2009, Efficient estimation of semiparametric conditional moment models with possibly nonsmooth residuals. Journal of Econometrics 152, 46-60.

Chen, X. and D. Pouzo, 2012, Estimation of nonparametric conditional moment models with possibly nonsmooth generalized residuals. Econometrica 80, 277-321.

Cragg, J.G., 1983, More efficient estimation in the presence of heteroscedasticity of unknown form. Econometrica 51, 751-763.

Cressie, N. and T. Read, 1984, Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society (Series B) 46, 440-464.

Davidson, J.E.H., 1994, Stochastic limit theory. Oxford University Press, Oxford.

Davidson, R. and J.G. MacKinnon, 1993, Estimation and inference in econometrics. Oxford University Press, Oxford.

De Jong, R.M. and H.J. Bierens, 1994, On the limit behavior of a chi-square type test if the number of conditional moments tested approaches infinity. Econometric Theory 10, 70-90.

Domínguez, M.A. and I.N. Lobato, 2004, Consistent estimation of models defined by conditional moment restrictions. Econometrica 72, 1601-1615.

Donald, S.G., Imbens, G.W. and W.K. Newey, 2003, Empirical likelihood estimation and consistent tests with conditional moment restrictions. Journal of Econometrics 117, 55-93.

Donald, S.G., Imbens, G.W. and W.K. Newey, 2009, Choosing instrumental variables in conditional moment restriction models. Journal of Econometrics 152, 28-36.

Durbin, J., 1954, Errors in variables. Review of the International Statistical Institute 22, 23-32.

Eichenbaum, M.S., Hansen, L.P. and K.J. Singleton, 1988, A time series analysis of representative agent models of consumption and leisure choice under uncertainty. Quarterly Journal of Economics 103, 51-78.

Ellison, G. and S.F. Ellison, 2000, A simple framework for nonparametric testing. Journal of Econometrics 96, 1-23.

Engle, R.F., 1982, A general approach to Lagrange multiplier model diagnostics. Journal of Econometrics 20, 83-104.

Engle, R.F., Hendry, D.F. and J.-F. Richard, 1983, Exogeneity. Econometrica 51, 277-304.

Eubank, R. and C. Spiegelman, 1990, Testing the goodness of fit of a linear model via nonparametric regression techniques. Journal of the American Statistical Association 85, 387-392.

Fan, Y. and Q. Li, 1996, Consistent model specification tests: Omitted variables and semi-parametric functional forms. Econometrica 64, 865-890.

Guggenberger, P. and R.J. Smith, 2005, Generalized empirical likelihood estimators and tests under partial, weak and strong identification. Econometric Theory 21, 667-709.

Hansen, L.P., 1982, Large sample properties of generalized method of moments estimators. Econometrica 50, 1029-1054.

Hansen, L.P., Heaton, J. and A. Yaron, 1996, Finite-sample properties of some alternative GMM estimators. Journal of Business and Economic Statistics 14, 262-280.

Härdle, W. and E. Mammen, 1993, Comparing non-parametric versus parametric regression fits. Annals of Statistics 21, 1926-1947.

Hausman, J.A., 1978, Specification tests in econometrics. Econometrica 46, 1251-1271.

Holtz-Eakin, D., Rosen, H. and W.K. Newey, 1988, Estimating vector autoregressions with panel data. Econometrica 56, 1371-1396.

Hong, Y. and H. White, 1995, Consistent specification testing via nonparametric series regression. Econometrica 63, 1133-1159.

Horowitz, J.L. and N.E. Savin, 2000, Empirically relevant critical values for hypothesis tests: A bootstrap approach. Journal of Econometrics 95, 375-389.

Hsu, S.-H. and C.-M. Kuan, 2011, Estimation of conditional moment restrictions without assuming parameter identifiability in the implied unconditional moments. Journal of Econometrics 165, 87-99.

Imbens, G.W., 1997, One-step estimators for over-identified generalized method of moments models. Review of Economic Studies 64, 359-383.

Imbens, G.W., Spady, R.H. and P. Johnson, 1998, Information theoretic approaches to inference in moment condition models. Econometrica 66, 333-357.

Jayasuriya, B.R., 1996, Testing for polynomial regression using nonparametric regression techniques. Journal of the American Statistical Association 91, 1626-1631.

Kitamura, Y. and M. Stutzer, 1997, An information-theoretic alternative to generalized method of moments estimation. Econometrica 65, 861-874.

Kitamura, Y., Tripathi, G. and H. Ahn, 2004, Empirical likelihood-based inference in conditional moment restriction models. Econometrica 72, 1667-1714.

Lavergne, P. and Q. Vuong, 2000, Nonparametric significance testing. Econometric Theory 16, 576-601.

Lorentz, G.G., 1986, Approximation of Functions. Chelsea Publishing Company, New York.

Matsushita, Y. and T. Otsu, 2013, Second-order refinement of empirical likelihood for testing overidentifying restrictions. Econometric Theory 29, 324-353.

Muellbauer, J., 1976, Community preferences and the representative consumer. Econometrica 44, 525-543.

Newey, W.K., 1985, Generalized method of moments specification testing. Journal of Econometrics 29, 229-256.

Newey, W.K., 1994, Series estimation of regression functionals. Econometric Theory 10, 1-28.

Newey, W.K., 1997, Convergence rates and asymptotic normality for series estimators. Journal of Econometrics 79, 147-168.

Newey, W.K. and R.J. Smith, 2004, Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 72, 219-255.

Owen, A., 1990, Empirical likelihood confidence regions. Annals of Statistics 18, 90-120.

Owen, A., 2001, Empirical likelihood. Chapman and Hall, New York.

Parente, P.M.D.C. and R.J. Smith, 2011, GEL methods for nonsmooth moment indicators. Econometric Theory 27, 74-113.

Powell, M.J.D., 1981, Approximation theory and methods. Cambridge University Press, Cambridge.

Qin, J. and J. Lawless, 1994, Empirical likelihood and general estimating equations. Annals of Statistics 22, 300-325.

Ruud, P., 2000, An introduction to classical econometric theory. Oxford University Press, Oxford.

Smith, R.J., 1994, Asymptotically optimal tests using limited information and testing for exogeneity. Econometric Theory 10, 53-69.

Smith, R.J., 1997, Alternative semi-parametric likelihood approaches to generalized method of moments estimation. Economic Journal 107, 503-519.

Smith, R.J., 2000, Empirical likelihood estimation and inference, in: M. Salmon and P. Marriott, (Eds.), Applications of differential geometry to econometrics. Cambridge University Press, Cambridge, pp. 119-150.

Smith, R.J., 2011, GEL methods for moment condition models. Econometric Theory 27, 1192-1235.

Tripathi, G. and Y. Kitamura, 2003, Testing conditional moment restrictions. Annals of Statistics 31, 2059-2095.

Wooldridge, J.M., 1992, A test of functional form against nonparametric alternatives. Econometric Theory 8, 452-475.
Wooldridge, J.M., 2002, Econometric analysis of cross section and panel data. MIT Press, Cambridge.

Wu, D.-M., 1973, Alternative tests of independence between stochastic regressors and disturbances. Econometrica 41, 733-750.

Yatchew, A.J., 1992, Nonparametric regression tests based on least squares. Econometric Theory 8, 435-451.

Zedlewski, J., 2008, Practical empirical likelihood estimation using matElike. Manuscript, Harvard University.

Zheng, J.X., 1996, A consistent test of functional form via nonparametric estimation techniques. Journal of Econometrics 75, 263-289.

Zheng, J.X., 1998, A consistent nonparametric test of parametric regression model under conditional quantile restrictions. Econometric Theory 14, 123-138.

Table B.1. ME Tests: Null Hypothesis Empirical Rejection Frequencies

M  Statistic      n=200   n=500  n=1000  n=1500
1  J^m             2.74    4.04    3.92    5.02
1  LR^m_cue        4.32    4.92    4.62    5.46
1  LR^m_el         9.48    7.10    5.78    6.42
1  LR^m_cue(el)    4.94    5.06    4.64    5.50
1  LM^m_el        15.68    8.66    6.18    6.52
1  ~LM^m_el        6.06    5.64    4.86    5.56
1  S^m_el          6.88    6.10    5.44    6.08
1  W^m_el         13.62    7.80    5.86    6.00
1  LR^m_et         8.54    7.08    5.90    6.64
1  LR^m_cue(et)    4.56    5.02    4.62    5.50
1  LM^m_et        16.94   10.90    7.86    8.46
1  ~LM^m_et        5.76    5.70    5.14    5.94
1  S^m_et          5.66    5.46    5.06    5.88
1  W^m_et         15.24   10.56    8.32    8.88
2  J^m             2.64    3.22    4.06    4.14
2  LR^m_cue        3.40    4.30    4.86    5.26
2  LR^m_el        17.06   10.80    8.22    7.50
2  LR^m_cue(el)    4.20    4.74    5.06    5.30
2  LM^m_el        38.34   19.00   10.64    9.22
2  ~LM^m_el        5.74    5.54    5.56    5.76
2  S^m_el          6.64    6.14    5.92    6.16
2  W^m_el         35.54   18.12   10.06    9.16
2  LR^m_et        13.32    9.82    8.08    7.68
2  LR^m_cue(et)    3.68    4.52    4.92    5.24
2  LM^m_et        37.84   22.30   14.16   12.64
2  ~LM^m_et        4.82    5.52    5.64    5.90
2  S^m_et          5.08    5.20    5.38    5.82
2  W^m_et         35.78   21.42   13.92   13.12

Table B.2. JE Tests: Null Hypothesis Empirical Rejection Frequencies

M  Statistic      n=200   n=500  n=1000  n=1500
1  J^j             2.42    3.98    4.66    4.62
1  LR^j_cue        4.22    5.00    4.94    4.86
1  LR^j_el         8.64    6.50    5.86    5.66
1  LR^j_cue(el)    4.56    5.24    5.02    4.90
1  LM^j_el        14.30    7.86    5.98    5.66
1  ~LM^j_el        5.80    5.68    5.40    5.24
1  S^j_el          6.56    6.06    5.58    5.54
1  W^j_el         12.36    7.26    5.66    5.28
1  LR^j_et         7.82    6.62    5.90    5.72
1  LR^j_cue(et)    4.40    5.10    4.94    4.88
1  LM^j_et        15.52    9.32    7.00    6.66
1  ~LM^j_et        5.38    5.84    5.44    5.28
1  S^j_et          5.38    5.46    5.20    5.28
1  W^j_et         15.08    9.70    7.06    6.56
2  J^j             1.54    3.06    3.84    3.20
2  LR^j_cue        2.88    4.48    4.86    4.44
2  LR^j_el        14.22    9.12    6.82    7.66
2  LR^j_cue(el)    3.94    5.02    4.96    4.66
2  LM^j_el        34.58   15.40    8.76   11.10
2  ~LM^j_el        4.90    5.54    5.24    4.98
2  S^j_el          5.52    5.78    5.66    5.34
2  W^j_el         33.38   14.82    8.28   10.96
2  LR^j_et        10.82    8.70    6.98    7.82
2  LR^j_cue(et)    3.26    4.82    4.90    4.60
2  LM^j_et        33.38   17.22   11.10   14.66
2  ~LM^j_et        4.26    5.66    5.42    5.16
2  S^j_et          3.98    5.20    5.24    4.90
2  W^j_et         33.46   18.14   11.62   19.18

Table B.3. ME Tests: Alternative Hypothesis Empirical Rejection Frequencies

n         M                J^m     LR^m_cue      LR^m_el LR^m_cue(el)     ~LM^m_el       S^m_el      LR^m_et LR^m_cue(et)     ~LM^m_et       S^m_et
200  sc   1  0.2           7.26         6.58         6.14         6.60         6.74         6.44         6.30         6.60         6.46         6.30
200  sc   1  0.4          11.92        11.38        11.00        11.38        11.26        10.64        11.02        11.60        11.06        10.60
200  sc   1  0.6          15.76        15.88        16.10        15.90        16.14        15.18        16.22        15.96        15.80        15.06
200  sc   1  0.8          18.66        19.48        20.26        19.54        19.68        18.62        20.20        19.68        19.26        18.68
200  sc   1  1.0          21.16        21.50        24.42        21.78        22.30        21.64        23.26        21.80        21.58        21.30
200  sc   2  0.2           6.30         6.44         5.78         6.82         6.60         6.90         6.06         6.64         6.60         6.46
200  sc   2  0.4           9.26         9.44         8.98        10.00         9.86         9.68         8.86         9.74         9.50         9.00
200  sc   2  0.6          12.44        12.00        12.30        12.96        12.52        12.78        12.38        12.34        12.26        11.76
200  sc   2  0.8          15.36        15.38        15.60        16.24        15.72        15.70        15.24        15.96        15.40        14.66
200  sc   2  1.0          17.38        17.94        19.24        18.88        18.00        18.02        18.42        18.38        17.90        17.06
200  nsc  1  0.2           3.96         5.80        11.90         6.56         8.02         9.00        10.76         6.12         7.28         7.16
200  nsc  1  0.4           6.98        10.20        18.50        11.16        13.54        14.32        17.54        10.68        12.54        12.12
200  nsc  1  0.6          10.14        14.54        25.02        15.62        18.40        19.72        23.02        14.96        17.56        16.66
200  nsc  1  0.8          12.48        17.56        30.72        19.18        22.32        23.82        28.24        18.24        20.84        20.56
200  nsc  1  1.0          14.24        20.04        33.98        21.34        25.24        26.98        31.74        20.76        23.84        23.24
200  nsc  2  0.2           3.34         4.46        19.58         5.82         7.58         8.86        15.80         5.08         6.36         6.72
200  nsc  2  0.4           5.44         6.96        24.56         8.70        10.70        12.16        20.30         7.74         9.20         9.14
200  nsc  2  0.6           7.34         9.16        31.30        11.70        13.92        15.32        25.28        10.34        11.90        11.98
200  nsc  2  0.8           9.34        11.42        37.12        14.34        17.24        18.64        30.24        12.52        15.00        14.88
200  nsc  2  1.0          10.88        13.26        41.78        16.84        19.62        21.70        34.40        14.94        17.46        17.26
500  sc   1  0.2          10.50        10.36        10.30        10.48        10.38        10.00        10.36        10.36        10.44        10.22
500  sc   1  0.4          23.12        23.70        23.56        23.68        23.60        23.56        23.62        23.60        23.64        23.50
500  sc   1  0.6          35.66        35.78        36.50        36.00        36.10        36.08        36.48        35.70        36.06        35.66
500  sc   1  0.8          45.72        45.94        47.00        46.40        45.96        45.98        46.78        46.02        45.88        45.82
500  sc   1  1.0          52.86        52.80        54.42        53.10        52.74        53.36        54.30        52.78        53.00        52.80
500  sc   2  0.2           9.16         8.62         7.96         8.50         8.44         8.84         7.96         8.40         8.60         8.64
500  sc   2  0.4          19.54        18.42        17.82        18.54        18.30        18.86        17.26        18.34        18.40        18.32
500  sc   2  0.6          30.22        30.02        29.62        30.36        30.08        30.68        28.74        29.72        29.64        29.64
500  sc   2  0.8          39.32        38.88        39.34        39.28        39.18        39.66        38.72        38.62        38.92        38.70
500  sc   2  1.0          45.74        45.06        46.50        45.34        45.80        46.20        45.90        44.74        45.10        44.98
500  nsc  1  0.2           8.64        10.24        13.70        10.50        11.38        12.00        13.60        10.36        11.44        10.92
500  nsc  1  0.4          20.40        23.46        28.56        23.82        25.30        26.12        28.34        23.64        25.28        24.94
500  nsc  1  0.6          31.46        35.62        43.42        36.04        37.94        39.64        43.10        35.70        37.72        37.48
500  nsc  1  0.8          40.98        45.62        54.20        46.48        48.36        50.24        53.76        46.02        48.44        48.06
500  nsc  1  1.0          48.10        52.50        61.28        53.14        55.32        57.58        60.80        52.80        55.32        55.02
500  nsc  2  0.2           6.40         7.66        14.98         8.24         9.24        10.04        14.06         7.84         9.02         8.80
500  nsc  2  0.4          14.14        16.78        29.08        17.86        19.84        20.84        27.32        17.38        19.50        18.76
500  nsc  2  0.6          23.38        27.44        43.12        29.34        31.84        33.10        41.52        28.32        30.84        30.26
500  nsc  2  0.8          31.62        36.30        53.96        38.52        40.92        42.50        51.84        37.08        40.20        39.52
500  nsc  2  1.0          37.76        42.58        60.82        44.46        47.30        49.06        59.28        43.26        46.24        45.54

Table B.4. JE Tests: Alternative Hypothesis Empirical Rejection Frequencies

n         M                J^j     LR^j_cue      LR^j_el LR^j_cue(el)     ~LM^j_el       S^j_el      LR^j_et LR^j_cue(et)     ~LM^j_et       S^j_et
200  sc   1  0.2           8.60         9.06         9.20         8.94         9.00         8.54         8.88         8.94         9.00         8.74
200  sc   1  0.4          16.36        17.20        18.34        16.90        16.40        15.78        17.38        16.80        16.50        16.06
200  sc   1  0.6          24.40        24.98        28.42        24.86        23.94        23.34        26.34        24.70        23.94        23.80
200  sc   1  0.8          30.40        31.08        37.34        30.94        30.18        29.70        34.20        30.74        30.28        29.84
200  sc   1  1.0          35.16        35.42        43.54        35.36        35.46        34.04        40.28        35.06        34.94        33.96
200  sc   2  0.2           8.68         8.20         9.64         8.68         8.66         8.98         8.94         8.14         8.10         8.50
200  sc   2  0.4          13.34        12.22        17.58        13.18        13.14        12.84        15.44        12.26        12.20        12.78
200  sc   2  0.6          18.56        16.48        26.76        17.74        17.36        17.42        22.34        16.38        16.34        16.78
200  sc   2  0.8          23.42        20.82        35.26        21.92        21.90        21.98        29.14        20.56        20.70        21.36
200  sc   2  1.0          27.38        24.32        41.54        25.54        25.54        25.42        34.46        23.72        23.98        24.70
200  nsc  1  0.2           4.32         8.00        13.68         8.50         9.86        10.76        12.70         8.14         9.32         9.26
200  nsc  1  0.4          10.40        15.08        24.94        16.14        18.32        19.16        23.46        15.18        17.22        16.92
200  nsc  1  0.6          16.12        21.98        36.88        23.62        26.32        27.96        34.04        22.54        24.80        24.84
200  nsc  1  0.8          21.12        27.86        45.84        29.82        32.76        34.24        42.30        28.44        31.20        31.06
200  nsc  1  1.0          25.00        31.92        51.52        33.88        37.98        39.74        48.54        32.58        35.94        35.36
200  nsc  2  0.2           3.08         5.02        22.46         6.74         8.48         9.58        17.96         5.76         6.96         7.08
200  nsc  2  0.4           5.32         8.32        35.22        10.62        12.82        14.04        27.16         9.14        10.82        10.82
200  nsc  2  0.6           7.88        10.92        46.66        14.26        17.00        18.74        36.50        12.12        14.36        14.52
200  nsc  2  0.8          10.70        13.84        55.60        18.24        21.66        23.40        43.94        15.38        18.42        18.22
200  nsc  2  1.0          13.34        16.68        62.12        21.06        25.18        27.22        49.84        18.18        21.56        21.28
500  sc   1  0.2          15.74        15.56        14.94        15.52        15.58        15.42        15.04        15.38        14.92        15.50
500  sc   1  0.4          40.18        39.26        39.70        39.20        38.94        38.84        39.46        39.08        38.06        38.84
500  sc   1  0.6          62.08        61.26        62.38        61.08        61.20        60.50        62.02        60.96        59.96        60.54
500  sc   1  0.8          74.22        72.82        73.98        72.74        72.82        72.52        74.06        72.60        71.90        72.58
500  sc   1  1.0          80.80        79.80        80.74        79.72        79.96        79.46        81.40        79.50        78.70        79.44
500  sc   2  0.2          14.84        15.86        17.04        15.88        16.00        16.14        16.36        15.58        15.74        16.28
500  sc   2  0.4          32.36        32.74        39.52        33.14        33.56        32.94        36.96        32.72        32.64        32.88
500  sc   2  0.6          48.58        47.52        57.34        48.02        48.60        48.28        53.70        47.40        47.64        47.94
500  sc   2  0.8          60.54        58.66        68.58        58.94        59.48        59.60        65.30        58.42        58.34        59.40
500  sc   2  1.0          68.58        66.54        76.58        66.54        67.40        67.84        73.42        66.18        66.70        67.24
500  nsc  1  0.2          13.42        15.58        18.78        15.82        16.94        17.74        18.84        15.66        17.08        16.76
500  nsc  1  0.4          36.04        39.32        45.56        39.64        40.92        42.42        45.28        39.48        41.14        40.92
500  nsc  1  0.6          57.72        61.32        68.10        61.82        63.28        64.34        67.56        61.52        63.38        62.86
500  nsc  1  0.8          70.28        72.84        78.60        73.16        74.56        75.42        78.84        72.94        74.42        74.20
500  nsc  1  1.0          77.32        79.80        85.00        80.10        81.52        82.36        85.46        79.96        81.44        81.04
500  nsc  2  0.2           9.88        14.94        26.66        15.90        16.90        18.24        24.62        15.36        16.84        16.60
500  nsc  2  0.4          24.78        31.40        50.94        33.20        35.16        35.70        47.34        32.30        34.16        33.58
500  nsc  2  0.6          39.44        45.98        67.46        48.02        50.30        51.34        64.16        46.90        49.24        48.78
500  nsc  2  0.8          51.86        57.02        77.86        58.94        61.08        62.52        74.54        57.94        60.30        60.20
500  nsc  2  1.0          60.18        65.24        83.80        66.54        68.90        69.96        81.24        65.80        68.28        67.86
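When reading the empirical rejection frequencies in Tables B.1-B.4, a rough idea of the simulation noise in each entry can be useful. The sketch below computes the binomial standard error of a rejection frequency and flags entries that differ from the nominal level by more than about two standard errors. Several inputs are assumptions rather than facts stated in this appendix: that entries are percentages from independent replications, that the nominal size is 5%, and that the number of replications is 5000 (a hypothetical value suggested by the entries appearing to be multiples of 0.02). The function names are illustrative only.

```python
import math


def rejection_se(p_percent: float, replications: int) -> float:
    """Binomial standard error, in percentage points, of a rejection frequency."""
    p = p_percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / replications)


def differs_from_nominal(p_percent: float, replications: int,
                         nominal_percent: float = 5.0, z: float = 1.96) -> bool:
    """True if the entry lies more than z standard errors from the nominal level.

    The standard error is evaluated at the nominal level, i.e. under the null.
    """
    se = rejection_se(nominal_percent, replications)
    return abs(p_percent - nominal_percent) > z * se


# Assumed (hypothetical) number of Monte Carlo replications.
R = 5000

# Two entries from the first row of Table B.1 (J^m, M = 1): n = 200 and n = 1500.
for entry in (2.74, 5.02):
    print(entry, differs_from_nominal(entry, R))
```

Under these assumptions the standard error at the nominal level is roughly 0.3 percentage points, so deviations from 5.00 smaller than about 0.6 are within simulation noise.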