The Open Mathematics, Statistics and Probability Journal


The Open Statistics & Probability Journal

(Discontinued)

ISSN: 2666-1489 ― Volume 10, 2020
RESEARCH ARTICLE

Asymptotic Relative Efficiencies of the Score and Robust Tests in Genetic Association Studies



Ao Yuan1, *, Ruzong Fan1, Jinfeng Xu2, Yuan Xue3, Qizhai Li4
1 Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington DC, 20057, USA
2 Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong
3 School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100190, China
4 LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China

Abstract

Introduction:

The score statistic Z(θ) and the maximin efficient robust test statistic ZMERT are commonly used in genetic association studies, but to our knowledge they have not been formally compared.

Methods:

In this report, we compare the asymptotic behavior of Z(θ) and ZMERT by computing their Asymptotic Relative Efficiencies (AREs) relative to each other. Four commonly used ARE measures are considered: the Pitman, Chernoff, Hodges-Lehmann and Bahadur AREs. Some modifications of these methods are made to simplify the computations. We found that the Chernoff, Hodges-Lehmann and Bahadur AREs are suitable for our setting.

Results and Conclusion:

Based on our study, the efficiencies of the two test statistics vary with the criterion used, and with the parameter values under the same criterion, so each test has advantages and disadvantages depending on the criterion and the parameters involved, as described in the text. Numerical examples are given to illustrate the use of the two statistics in genetic association studies.

Keywords: Asymptotic relative efficiency, Genetic association study, Maximin efficient robust test, Score test Z(θ), Test statistic ZMERT, Pitman ARE, Chernoff ARE.


Article Information


Identifiers and Pagination:

Year: 2018
Volume: 9
First Page: 26
Last Page: 41
Publisher Id: TOSPJ-9-26
DOI: 10.2174/1876527001809010026

Article History:

Received Date: 7/3/2018
Revision Received Date: 23/7/2018
Acceptance Date: 2/10/2018
Electronic publication date: 28/12/2018
Collection year: 2018

© 2018 Yuan et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


* Address correspondence to this author at the Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington DC, 20057, USA, Tel: +91 22 33611111, E-mail: ay312@georgetown.edu





1. INTRODUCTION

In genetic association studies, several test statistics are often used, including the score test Z(θ) and the maximin efficient robust test statistic ZMERT. Although the numerical behavior of the two tests has been reported in various simulation-based genetic association studies, to our knowledge a formal theoretical comparison of the two tests has not appeared in the literature, so it is worthwhile to compare their asymptotic performance. For a likelihood ratio based test of a simple null versus a simple alternative hypothesis, there is a uniformly most powerful test under some regularity conditions. However, most test statistics are not constructed directly from the likelihood ratio, the hypotheses are composite, and there is generally no such optimal test. Therefore, the classical method to compare any two test statistics is to evaluate the Asymptotic Relative Efficiency (ARE) between them.

The ARE is a well-studied area, with a vast literature and numerous different definitions. However, the computation of the ARE is often very difficult in the general case: some of the classical methods require that the test statistics have standard forms, for example that they have the same asymptotic distribution or are i.i.d. summations. In practice, as in genetic association studies, some test statistics do not have these forms. Sitlani and McKnight [1C.M. Sitlani, and B. McKnight, "Relative efficiency of trend tests with misspecified genetic models in stratified analyses of case-control or cohort data", Hum. Hered., vol. 71, no. 4, pp. 246-255.
[http://dx.doi.org/10.1159/000328858] [PMID: 21811075]
] studied AREs for the trend test under different models and stratifications. In this communication, we compare the asymptotic behavior of two test statistics that commonly arise in case-control genetic association studies, the score statistic Z(θ) and the maximin efficient robust test statistic ZMERT, as given in Zheng, Li and Yuan [2G. Zheng, "Some statistical properties of efficiency robust tests with applications to genetic association studies", Scand. J. Stat., pp. 762-774.
[http://dx.doi.org/10.1111/sjos.12060]
], hereafter ZLY, by evaluating their AREs relative to each other. Four commonly used ARE measures are considered: the Pitman, Chernoff, Hodges-Lehmann and Bahadur AREs. Pitman’s ARE does not apply directly. We found that the Chernoff, Hodges-Lehmann and Bahadur AREs are suitable for our setting. Some modifications of these methods are made to simplify the computations.

Existing studies on ARE fall mainly into two categories. One compares the efficiencies of estimators of the same parameter; the other compares test statistics for the same hypothesis, where the test statistics need not estimate the same parameter. The latter comparison is often made under the assumption that the test statistics are asymptotically normal, in which case the AREs can usually be computed easily. There are also methods for comparing test statistics in general, where different test statistics for the same hypothesis may have different asymptotic distributions. For this general case, Pitman, Bahadur, and Hodges and Lehmann proposed different ways to compute the ARE, which are often difficult to apply. We also give a simple definition of ARE that can be computed when the asymptotic distributions differ, as long as the asymptotic distributions of the test statistics are known.

In Section 2, we describe the background of the genetic association study problem and briefly review the classical definitions of ARE. In Section 3, we compare the AREs of the test statistics arising from our genetic association study. We found that the efficiencies of the two test statistics vary with the criterion used, and with the parameter values under the same criterion, as described in the text. Section 4 gives brief numerical examples, a simulation and an application of the two tests in genetic association studies drawn from our previous work, to illustrate their use.

2. BACKGROUND

Denote the log-likelihood function as l_n(λ1, λ2, η), where Yi is the outcome, (λ1, λ2) ∈ R² are the parameters of interest, η is a vector of parameters (m ≥ 0) for the covariate Xi = (xi1, ..., xim)T, and n is the sample size. The goal is to test the null hypothesis H0: (λ1, λ2) = (1, 1) against the alternative H1: (λ1, λ2) ∈ Λ \ {(1, 1)}, where the parameter region, denoted Λ here, has two edges with known slopes θ0 and θ1, and the null point (1, 1) is on the boundary of Λ. We assume -∞ < θ0 < θ1 < ∞ and that the endpoints θ0 and θ1 satisfy some constraints as specified in ZLY. If θ1 = ∞, which corresponds to a vertical edge, we can switch λ1 and λ2 and define new slopes (θ0*, θ1*) that are both finite. For example, we can write λ1 = 1 + (λ2 - 1)/θ1 = 1 + θ1*(λ2 - 1) and λ1 = 1 + (λ2 - 1)/θ0 = 1 + θ0*(λ2 - 1), where θ0* and θ1* are finite.

Assume θ0 and θ1 are known from the problem of interest and/or scientific knowledge. Given λ1 = λ ≥ 1, λ2 can be written as λ2 = 1 - θ + θλ. We treat θ as a nuisance parameter that is not estimable under H0: λ = 1 (where λ2 = 1 for every θ), whereas η is estimable under H0. Then the log-likelihood becomes l_n(λ, η, θ). The score test statistic for H0: λ = 1 is given by;

(1)

where η̂ is the MLE of η under H0. It would be difficult to deal with l_n(λ, η, θ) because θ in Z(θ) is implicitly expressed.

So we work with l_n(λ, 1 - θ + θλ, η), where θ is explicitly expressed. It is convenient to view l_n as a tri-variate function with variables x1 = λ, x2 = 1 - θ + θλ and x3 = η. Denote l_n,u = ∂l_n/∂xu for u = 1, 2, 3, l_n,uv = ∂²l_n/∂xu∂xv for u = 1, 2 and v = 1, 2, 3, and l_n,33 = ∂²l_n/∂x3∂x3T. Assume and v = 3. Denote Lvu (η) = EHnllvu (1, 1, η).

Suppose we have a family of asymptotically normally distributed tests T0 = {Z(θ): θ0 ≤ θ ≤ θ1}, where Z(θ) is asymptotically standard normal under H0: λ = 1 for a given θ, which determines the data-generating model under H1. When θ = θ(0) is the true value, Z(θ(0)) is asymptotically most powerful (optimal). In this case, when θ(1) ≠ θ(0) is used, the Pitman ARE of Z(θ(1)) relative to Z(θ(0)) is given by (Gastwirth [3J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948.
[http://dx.doi.org/10.1080/01621459.1966.10482185]
, 4J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384.
[http://dx.doi.org/10.1080/01621459.1985.10478127]
])

e(Z(θ(1)), Z(θ(0))) = ρ²(θ(0), θ(1))   (2)

where ρ(θ(0), θ(1)) is the asymptotic null correlation coefficient between Z(θ(0)) and Z(θ(1)). Consider the set of all convex linear combinations of the tests in T0. A simple robust test derived under efficiency robust theory (Gastwirth [3J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948.
[http://dx.doi.org/10.1080/01621459.1966.10482185]
, 4J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384.
[http://dx.doi.org/10.1080/01621459.1985.10478127]
]; Birnbaum and Laska [5A. Birnbaum, "Efficiency robust 2-sample rank tests", J. Am. Stat. Assoc., pp. 1241-1251.
[http://dx.doi.org/10.1080/01621459.1967.10500929]
]) is the maximin efficient robust test (MERT), denoted as ZMERT. When T0 has only two members, Z(θi) and Z(θj), ZMERT is given by;

ZMERT = [Z(θi) + Z(θj)] / [2(1 + ρ(θi, θj))]^(1/2)   (3)

When T0 has more than two members, the MERT generally exists and is unique (Gastwirth [3J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948.
[http://dx.doi.org/10.1080/01621459.1966.10482185]
]), but its computation needs quadratic programming methods (Rosen [6J. Rosen, The gradient projection method for non-linear programming Part I: Linear constraints., SIAM J., pp. 181-217.]). However, when there is an extreme pair (Z(θi), Z(θj)) in T0, i.e. a pair whose null correlation ρ(θi, θj) is the smallest among all pairs in T0, the MERT for the pair is the MERT for T0 if and only if (Gastwirth [7J. Gastwirth, "On robust rank test", Nonparametric Techniques in Statistical Inference, Cambridge University Press: London, .]).

and thus

(4)

That is, the MERT reaches the maximin ARE due to model uncertainty. The MERT was first derived for linear rank tests for the two-sample problem (Gastwirth [3J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948.
[http://dx.doi.org/10.1080/01621459.1966.10482185]
]; Birnbaum and Laska [5A. Birnbaum, "Efficiency robust 2-sample rank tests", J. Am. Stat. Assoc., pp. 1241-1251.
[http://dx.doi.org/10.1080/01621459.1967.10500929]
],) and later extended to a family of asymptotically normally distributed tests (Gastwirth [4J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384.
[http://dx.doi.org/10.1080/01621459.1985.10478127]
]).
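The two-test case reviewed above can be sketched numerically. The Python sketch below assumes the standard form from the efficiency robustness literature: for two asymptotically standard normal statistics with null correlation ρ, the MERT is their sum rescaled to unit variance, its ARE relative to either member of the pair is (1 + ρ)/2, and the Pitman ARE of a mis-specified test relative to the optimal one is ρ². The function names and the numerical inputs are hypothetical and are not taken from the paper.

```python
import math

def mert_pair(z_i, z_j, rho):
    """MERT of two standardized statistics with null correlation rho (assumes -1 < rho < 1)."""
    return (z_i + z_j) / math.sqrt(2.0 * (1.0 + rho))

def pitman_are_from_correlation(rho):
    """Pitman ARE of a mis-specified test relative to the optimal one: rho squared."""
    return rho ** 2

def mert_pair_efficiency(rho):
    """ARE of the two-test MERT relative to either member of the pair: (1 + rho) / 2."""
    return (1.0 + rho) / 2.0

# Hypothetical values, for illustration only.
rho = 0.3
z_mert = mert_pair(2.1, 0.9, rho)
worst_case_single = pitman_are_from_correlation(rho)   # using the wrong extreme test
worst_case_mert = mert_pair_efficiency(rho)            # using the MERT instead
```

With these illustrative inputs the MERT loses less efficiency in the worst case than committing to the wrong extreme test, which is the sense in which it is robust to model uncertainty.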

The Z (θ) statistic has the following property (ZLY): Let. Then where and.

Let η̂ be the MLE of η under H0, and consider likewise the MLE of (η, λ) under H1. For given θ, the χ² likelihood ratio test statistic T is twice the difference between the log-likelihood maximized under H1 and that maximized under H0. For fixed θ, the number of parameters under H1 is just 1 more than that under H0, so by Wilks’ theorem, under H0, T converges in distribution to the chi-squared distribution with one degree of freedom. The likelihood ratio test is also widely used in genetic association studies; its properties, including its ARE, are well studied in the literature, so we will not investigate it here.
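As a small illustration of the chi-squared calibration with one degree of freedom, the following Python sketch converts a likelihood ratio statistic into a p-value. It relies on the identity P(χ²₁ > x) = erfc(√(x/2)), so only the standard library is needed. The function names and the log-likelihood values are hypothetical and serve only as an example.

```python
import math

def lrt_statistic(loglik_h1, loglik_h0):
    """Twice the gain in maximized log-likelihood from H0 to H1."""
    return 2.0 * (loglik_h1 - loglik_h0)

def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-squared variable with one degree of freedom."""
    if stat < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical maximized log-likelihoods, for illustration only.
stat = lrt_statistic(-690.2, -692.6)
p_value = chi2_1df_pvalue(stat)
```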

Let the MLE here 0 represents a vector of 0’s. Let η0 be the (unknown) true value of η under either H0 or H1. We define the score function as;

and the test statistic for H 0 as;

(5)

where “~” means asymptotically equivalent, in the above is replaced by it is approximated by n-1ln, vu (1.1, η).

Denote . For a vector v (v1, v2, v3)T, denote . be the true density of the data y. The null model f (1, 1, η) is and the alternative model is . The following notation is also used under H1. For fixed, (λ, θ) let;

(6)

Under H1, the empirical version of η 0 is just . We denote the Fisher information and its inverse in the blocked forms as;

Let


by is replaced by Note that with defined in the above,

Below we give a brief review of the notions of ARE for test statistics in the general case; a more detailed account can be found in Serfling (1980) [8R. Serfling, Approximation Theorems of mathematical Statistics Wiley., Wiley, .
[http://dx.doi.org/10.1002/9780470316481]
] and Nikitin (2011) [9Y. Nikitin, "Asymptotic relative efficiency in testing (version 5)", Stat Prob: The Encyclopedia Sponsored by Statistics and Probability Societies, .].

The calculation of the existing versions of ARE is generally not easy, as in the examples (Serfling, 1980 [8R. Serfling, Approximation Theorems of mathematical Statistics Wiley., Wiley, .
[http://dx.doi.org/10.1002/9780470316481]
]; Nikitin, 1995 [10Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests., Cambridge University Press, .
[http://dx.doi.org/10.1017/CBO9780511530081]
]; van der Vaart, 1998 [11A. van der Vaart, Asymptotic Statistics., Cambridge University Press, .
[http://dx.doi.org/10.1017/CBO9780511802256]
]). We only point out that the Pitman ARE is based on the central limit theorem for test statistics, that the Bahadur ARE requires the large deviation asymptotics of test statistics under the null hypothesis, while the Hodges-Lehmann ARE is connected with large deviation asymptotics under the alternative. Each type of ARE has its own advantages and disadvantages, and the different notions of ARE do not always give consistent conclusions.

If the condition of asymptotic normality (or common asymptotic distribution) fails, considerable difficulties arise in calculating the Pitman ARE, as it may not exist at all or may depend on α and β. Usually one considers the limiting Pitman ARE as α → 0. Wieand (1976) [12H. Wieand, "A condition under which the Pitman and Bahadur", Ann. Stat., pp. 1003-1011.
[http://dx.doi.org/10.1214/aos/1176343600]
] established the correspondence between this kind of ARE and the limiting approximate Bahadur efficiency which is easy to compute.

The Bahadur (1960) [13R. Bahadur, "Stochastic comparison of tests", Ann. Math. Stat., pp. 276-295.
[http://dx.doi.org/10.1214/aoms/1177705894]
] ARE fixes the power of the tests and compares the exponential rate of decrease of their sizes as the number of observations increases, for a fixed alternative. Its computation is always non-trivial, and depends heavily on advances in large deviation theory, as in Dembo and Zeitouni (1998) [14A. Dembo, Large deviation techniques and applications 2nd., Springer: New York, .
[http://dx.doi.org/10.1007/978-1-4612-5320-4]
] and Deuschel and Stroock (1989) [15J. Deuschel, Large deviations, Academic Press: Boston, .].

It has been proved that, under some regularity conditions, the likelihood ratio statistic is asymptotically optimal in the Bahadur sense (Bahadur, 1967 [16R. Bahadur, "Rates of convergence of estimates and test statistics", Ann. Math. Stat., pp. 303-324.
[http://dx.doi.org/10.1214/aoms/1177698949]
]; Arcones, 2005 [17M. Arcones, "Bahadur efficiency of the likelihood ratio test", Math. Methods Stat., pp. 163-179.]). The Bahadur ARE is often difficult to compute for an arbitrary alternative, but it is possible to calculate its limit as θ approaches the null hypothesis, which gives the local Bahadur efficiency.

The Hodges-Lehmann ARE, in contrast to the Bahadur efficiency, fixes the level of the tests and compares the exponential rate of decrease of their type II errors as the number of observations increases, for a fixed alternative. The computation of the Hodges-Lehmann ARE is also difficult, as it requires large deviation asymptotics of test statistics under the alternative.

The drawback of the Hodges-Lehmann efficiency is that most two-sided tests, such as the Kolmogorov and Cramér-von Mises tests, are all asymptotically optimal, and hence one cannot discriminate among them. On the other hand, under some regularity conditions, one-sided tests such as linear rank tests can be compared, and their Hodges-Lehmann efficiency coincides locally with the Bahadur efficiency (Nikitin, 1995 [10Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests., Cambridge University Press, .
[http://dx.doi.org/10.1017/CBO9780511530081]
]).

The Chernoff ARE minimizes, asymptotically, a linear combination of type I and type II errors, and it depends on neither the nominal level nor the power. However, it essentially applies only to test statistics that are i.i.d. summations.

The local ARE is much easier to compute than the previous ones, but it only applies to test statistics that are asymptotically normal with rate . We will see that some test statistics used in genetic association studies do not satisfy this condition.

Besides the four commonly used AREs for hypothesis tests described above, there are some other interesting methods. Hoeffding’s (1965) ARE [18W. Hoeffding, "Asymptotically optimal tests for multinomial distributions (with discussion)", Ann. Math. Stat., pp. 369-408.
[http://dx.doi.org/10.1214/aoms/1177700150]
], based on the work of Sanov (1957) [19I. Sanov, On the probability of large deviations of random variables.Sel., Transl. Math. Statist. Prob, pp. 213-244.], is theoretically appealing, but only applies to multinomial data; the Rubin and Sethuraman ARE (1965) [20H. Rubin, "Bayes risk efficiency", Sankhya, A., pp. 325-346.] is based on Bayes risk; others include the Kallenberg ARE (1983) [21W. Kallenberg, "Intermediate efficiency, theory and examples", Ann. Stat., pp. 170-182.
[http://dx.doi.org/10.1214/aos/1176346067]
], and the Borovkov-Mogulskii ARE (1993) [22A. Borovkov, "Large deviations and testing of statistical hypotheses", Sib. Adv. Math., .], etc.

3. ARE OF TWO TESTS IN GENETIC ASSOCIATION STUDIES

In this section, we investigate the use of the Pitman, Chernoff, Hodges-Lehmann and Bahadur AREs for the statistics commonly used in genetic association analysis. We focus on the statistics used in ZLY, Z(θ) and ZMERT, and follow the notation there. There are other commonly used test statistics in genetic association studies, such as the likelihood ratio statistic (chi-squared statistic), but we will not discuss them here, as most of them are well studied in the literature.

Pitman ARE. Consider testing the null hypothesis H0: λ = λ0. Let Sn be a test statistic based on data of size n, with mean µn(λ) and standard deviation σn(λ). To use this method the following conditions are needed.

(P1). For some continuous strictly increasing distribution function F independent of λ, and some, δ > 0 as n → ∞,

(P2). For , is k times differentiable, with µn(1) 0) = ... =

(P3). For d(n) → ∞ some and some constant

(P4). For

Pitman appears to have been the first to introduce the notion of ARE for tests, in his unpublished lectures, and the following result was stated in Noether’s work.

(Pitman, 1949 [23E. Pitman, Lecture Notes on Nonparametric Statistical Inference, Columbia University: Mimeographed., 1949.]; Noether, 1950 [24G. Noether, "Asymptotic properties of the Wald-Wolfowitz test of randomness", Ann. Math. Stat., pp. 231-246.
[http://dx.doi.org/10.1214/aoms/1177729841]
]). Assume (P1)-(P4), that αn = Pλ 0 (Sn > then , if and only if

(7)

(ii) Let S1,n and S2,n each satisfy (P1)-(P4) with common F and K, and let n1 and n2 be the sample sizes required for S1,n and S2,n to have the same asymptotic power 1 - β; then

Thus, if d(n) = n^q (q > 0), then the Pitman ARE is given by;

and Pitman ARE is then;

(8)

Let I(λ0) be the Fisher information at λ0. Under some additional conditions, Rao (1963) [25C. Rao, "Criteria of estimation in large samples", Sankhya Ser. A, pp. 189-206.] proved that

Any test statistic Sn that achieves equality in the above is called Pitman efficient.

Under suitable conditions, Pitman ARE can be expressed in terms of correlation coefficient between the two test statistics in their standardized form, as given below.

(P5) are asymptotic joint normal uniformly in a neighborhood of λ0.

Denote p(λ)the asymptotic correlation coefficient between them under, and and be the distribution and density function of. The following result is true.

(van Eden, 1963 [26C. van Eden, "The relationship between Pitman’s asymptotic relative efficiency of two tests and correlation coefficient between their test statistics", Ann. Math. Stat., pp. 1442-1451.
[http://dx.doi.org/10.1214/aoms/1177703876]
]). Assume that S1,n and S2,n satisfy (P1)-(P5) in their standardized form with , and that p(λn) → p(λ λn): = p as λn → λ 0 Then;

(i) For 0 ≤ λ ≤ 1, tests of the form satisfy (P1)-(P5), and the “best” Syn which maximizes is the one with;

and

(9)

(ii) If S1n is the best test satisfying (P1)-(P5), then;

(10)

In the typical case, Sn is an i.i.d. summation (up to scale), so that µn(λ) = nµ(λ)

Note does not (α, β) depend on , thus if or, C1 > C2 then {S1n} is better than {S2n} for all (α, β).

The Pitman AREs given by (3) or (4) are easy to use. However, they require that the two test statistics being compared have the same asymptotic distribution (after standardization), and (4) further requires that they be jointly asymptotically normal. In practice, these conditions sometimes cannot be satisfied. For example, the chi-squared test and Z(θ(0)) have different asymptotic distributions. Below we give a generalized version of (3) for the case in which the two test statistics being compared do not necessarily have the same asymptotic distribution (after standardization). Similar generalizations may already exist in the literature; we still state our version to see what form it takes in this case. Let Fi be the asymptotic distribution of the standardized Sin (i = 1, 2). We have;

Assume (P1)-(P4) for Sin with µin, σin and Fi separately, but with the same K and nominal level α, and let n1 and n2 be the sample sizes required for S1n and S2n to have the same asymptotic power 1 - β (0 < β < 1 - α); then

Thus for d(n) = n^q (q > 0), we define the generalized Pitman ARE as;

(11)

In the typical case q = 1/2, or 1/q = 2, and;

Note that, unlike the case F1 = F2, here Pitman’s ARE depends on the values of the level α and the power β, so the comparison of two tests may not give a consistent result.

Can we have a form corresponding to (10) when S1n and S2n have different asymptotic distributions? For this we checked the proof of (4) and found that, although in principle there is a relationship among the asymptotic correlation coefficient ρ between S1n and S2n, the asymptotic distributions Fi, the level α and the power β, it is mathematically intractable. Below we give its actual value.

Proposition 1.

Remark: When some of the conditions (P1)-(P5) are not satisfied, ARE may not be characterized by correlation coefficient. For example, T1 = Z is an estimate of θ = 0 under H 0, and Z is symmetrically distributed around 0, so EHo (Z) = 0 and suppose VARHo (Z) = 1 . Let, is an estimate of can also be used to test H 0. However , but we cannot say that T2 is a ‘bad’ test statistic, and .

Chernoff ARE. This notion only considers test statistic of the form with the s i.i.d. with be the moment generating function of Y, and;

Let and (assume µ 0 ≤ µ1), (i = 0,1), and is called the Chernoff index . be a linear combination of type I and type II errors evaluated at the critical value t, and Qn = infµ0 ≤ t ≤ µ,Qn (t) be the minimum of these errors for test statistic Sn. Chernoff (1952) [27H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on sums of observations", Ann. Math. Stat., pp. 493-507.
[http://dx.doi.org/10.1214/aoms/1177729330]
] showed that Qn tends to 0 at exponential rate, (so the faster the rate, or the larger absolute value of logQn, the better the test statistic), and established.

the result is independent of γ.

Let {S1,n} and {S2,n} both be i.i.d. summations with Chernoff indices ρ1 and ρ2 respectively, and let n1 and n2 be the corresponding sample sizes for which Q1,n1 ~ Q2,n2. The Chernoff ARE of {S1,n} relative to {S2,n} is defined and given by;

(12)
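To make the Chernoff index concrete, the following Python sketch works through a textbook case that is not taken from the paper: i.i.d. normal observations with known common variance, mean µ0 under the null and µ1 > µ0 under the alternative. It assumes the usual form of the index, the minimum over the cutoff t in [µ0, µ1] of the larger of the two exponential error rates, and it assumes the ARE is the limiting sample-size ratio n2/n1, which equals log ρ1 / log ρ2 when ρ1^n1 matches ρ2^n2. All function names and numerical inputs are hypothetical.

```python
import math

def chernoff_index_normal(mu0, mu1, sigma, grid=2001):
    """Chernoff index for i.i.d. N(mu, sigma^2) data, by grid search over the cutoff t."""
    best = 1.0
    for k in range(grid):
        t = mu0 + (mu1 - mu0) * k / (grid - 1)
        m0 = math.exp(-((t - mu0) ** 2) / (2.0 * sigma ** 2))  # type I error rate base
        m1 = math.exp(-((mu1 - t) ** 2) / (2.0 * sigma ** 2))  # type II error rate base
        best = min(best, max(m0, m1))
    return best

def chernoff_are(rho1, rho2):
    """Limiting sample-size ratio n2/n1 for which rho1**n1 matches rho2**n2."""
    return math.log(rho1) / math.log(rho2)

# Two hypothetical statistics: B is built from noisier observations than A.
rho_a = chernoff_index_normal(0.0, 1.0, 1.0)
rho_b = chernoff_index_normal(0.0, 1.0, 2.0)
are_ab = chernoff_are(rho_a, rho_b)
```

In this normal case the minimizing cutoff is the midpoint of µ0 and µ1, and the index has the closed form exp(-(µ1 - µ0)²/(8σ²)), so doubling the standard deviation requires four times the sample size.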

For a test statistic that is not an i.i.d. summation, the Chernoff index is difficult to compute. The following result is sometimes very helpful in this case, and gives an upper bound of the Chernoff index.

(Kallenberg, 1982 [28W. Kallenberg, "Chernoff efficiency and deficiency", Ann. Stat., pp. 583-594.
[http://dx.doi.org/10.1214/aos/1176345799]
]) Let for some

Then

In the case of simple null vs simple alternative, Kallenberg (1982) [28W. Kallenberg, "Chernoff efficiency and deficiency", Ann. Stat., pp. 583-594.
[http://dx.doi.org/10.1214/aos/1176345799]
] also gives an upper bound of the Chernoff index, and any test statistic that achieves this bound is said to be Chernoff efficient. As this bound itself is not easy to compute, we do not pursue it here; interested readers can consult the paper mentioned or the book by Nikitin (1995) [10Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests., Cambridge University Press, .
[http://dx.doi.org/10.1017/CBO9780511530081]
].

As another way to simplify the computation, we consider a modified version of this Chernoff index. Let S be the weak limit of Sn, be the distribution function of S, and Hn: λn = λ0 + n^(-1/2) be a sequence of local alternatives. As the sample size increases, the test statistic Sn is expected to be able to distinguish the local alternatives from the null. Let (assume µ1 ≥ µ0), and be the asymptotic linear combination of type I and local type II errors evaluated at t, and . The smaller is , the better Sn is as a test statistic for H0 vs. H1. For two test statistics S1n and S2n with we define the modified Chernoff ARE as;

(13)

Let, ;

Below we give the values of ρZ(θ(0)) and ρZMERT so that their Chernoff ARE can be obtained. We also give their modified Chernoff indices, so that their modified Chernoff ARE can be obtained. For the chi-squared test T, under H1 its asymptotic distribution is a non-central chi-squared distribution, which has no closed form, so its modified Chernoff index is not directly computable. Let , where gi is the observed genotype of the i-th individual, xi is the corresponding covariate vector, and let;

Let, and

Proposition 2. (i) Assume is normal with mean and variance . Then, for E to denote expectation with respect to (xi, gi), we have;

Hodges-Lehmann ARE. Consider testing the null hypothesis be given a level α test statistic Sn with critical value the type II error at λ is βn (λ) = Typically, βn (λ) tends to zero at exponential rate, the faster the better Sn is. Hodges and Lehmann (1956) [29J. Hodges, "The efficiency of some nonparametric competitors of the t-test", Ann. Math. Stat., pp. 324-335.
[http://dx.doi.org/10.1214/aoms/1177728261]
] proposed;

as a measure of the performance of Sn, called the Hodges-Lehmann index of the statistic Sn. For two test statistics S1n and S2n for the same H0 vs. H1, with indices d1(λ) and d2(λ), the Hodges-Lehmann ARE of {S1n} relative to {S2n} at λ is defined as;

(14)

For probability density functions f and g, let K(f, g) = ∫ f(x) log[f(x)/g(x)] dx be the Kullback-Leibler divergence between f(·) and g(·). For any test statistic Sn(X1, ..., Xn) based on X1, ..., Xn i.i.d. with density , the Hodges-Lehmann index has the following property;

and any test statistic that achieves equality in the above is said to be Hodges-Lehmann efficient.
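Since the bound on the Hodges-Lehmann index is expressed through the Kullback-Leibler divergence, a short Python sketch of that quantity for discrete distributions may help. It assumes the usual definition, the sum over x of f(x) log[f(x)/g(x)], and evaluates it for genotype distributions under Hardy-Weinberg equilibrium. The function names and allele frequencies are hypothetical, and the argument order (f first, g second) is an assumption rather than the paper's convention.

```python
import math

def kl_divergence(f, g):
    """Kullback-Leibler divergence sum_x f(x) * log(f(x) / g(x)) for discrete distributions."""
    if len(f) != len(g):
        raise ValueError("f and g must have the same support size")
    total = 0.0
    for fx, gx in zip(f, g):
        if fx == 0.0:
            continue              # a term with f(x) = 0 contributes nothing
        if gx == 0.0:
            return math.inf       # f puts mass where g has none
        total += fx * math.log(fx / gx)
    return total

def hwe_genotype_freqs(p):
    """Frequencies of 0, 1 and 2 copies of the minor allele under Hardy-Weinberg equilibrium."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]

# Hypothetical allele frequencies, for illustration only.
k_fg = kl_divergence(hwe_genotype_freqs(0.30), hwe_genotype_freqs(0.25))
k_gf = kl_divergence(hwe_genotype_freqs(0.25), hwe_genotype_freqs(0.30))
```

The two values differ because the divergence is not symmetric in its arguments, which is why the bounds for the Hodges-Lehmann and Bahadur indices, taken in opposite directions, need not coincide.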

Compared to the Pitman and Chernoff AREs, the Hodges-Lehmann ARE does not require that the test statistics being compared have the same asymptotic distribution, nor that they be i.i.d. summations, so it has a wider scope of application.

Proposition 3. Under conditions of Theorem 4 in Zheng et al. (2010) [30Y. Zang, "Simple algorithms to calculate asymptotic Simple algorithms to calculate asymptotic", J. Stat. Softw., pp. 1-24.], with , given in (2), for λ > 1, we have;

For the chi-squared test T, under H1 its asymptotic distribution is a non-central chi-squared distribution, which has no closed form, so its Hodges-Lehmann ARE is not directly available.

Bahadur ARE. Consider testing the null hypothesis be Let Fn,λ(.) be the distribution function of a test statistic Sn under pλ, and for , let;

the p-value of the observed Sn under the distribution pλ, and;

if the limit exists. Typically, Ln tends to one and Ln tends to zero exponentially fast, and the faster, or the bigger c(.), the better Sn is. For two test statistics Si,n (l = 1,2) for the same hypothesis with Ln, Ci (λ), and sample size ni, to perform “equivalently” in the sense lim n1-1 log L2,n2 = lim n1-1 Log L1,n1, the Bahadur ARE of S1,n log L1n, the Bahadur ARE of relative to S2,n, at , is defined as, and has the property

(15)

The limit C can be computed under the following conditions.

(B1). For

(B2). For the interval , there is a function g on l, such that;

(Bahadur, 1960 [13R. Bahadur, "Stochastic comparison of tests", Ann. Math. Stat., pp. 276-295.
[http://dx.doi.org/10.1214/aoms/1177705894]
]). If Sn satisfies (B1)-(B2), then for ,

For any test statistic Sn (X1,....,X2) based on X1,....,Xn i.i.d. density Bahadur (1967) [16R. Bahadur, "Rates of convergence of estimates and test statistics", Ann. Math. Stat., pp. 303-324.
[http://dx.doi.org/10.1214/aoms/1177698949]
] obtained the following;

Note although the above relationship is regarded as a dual to that of the Hodges-Lehmann index, the two are not equivalent as A test statistic is said to be Bahadur efficient if for each limn, log

Bahadur efficiency of likelihood ratio test has been studied by a number of researchers for some special distribution families. Arcones (2005 [17M. Arcones, "Bahadur efficiency of the likelihood ratio test", Math. Methods Stat., pp. 163-179.], Theorem 3.3) proved that, under some regularity conditions, the likelihood ratio statistic is Bahadur efficient. Let be the density function of the data, under his conditions of Theorem 3.3, for each fixed λ > 1 and θ, we have;

Like the Hodges-Lehmann ARE, the Bahadur ARE does not require that the test statistics being compared have the same asymptotic distribution, nor that they be i.i.d. summations, so it has a wide scope of application.

For ease of computation, we consider a local version of the Bahadur ARE. Consider testing H0: λ = λ0 vs. the local alternative Hn: λ = λ0 + n^(-1/2). Let F0 be the asymptotic distribution function of Sn under H0; we define;

Typically, 0 < <1. The smaller , the better Sn is. For two test statistics Si,n(i = 1,2) for the same hypothesis with Gi,n and , we define the local Bahadur ARE of S1,n relative to S2,nas;

(16)

Proposition 4. (i) with µMERT (λ) given in Proposition 3, we have;

(ii) Under conditions of Theorem 4 in ZLY, µMERT (λ) with be the derivative of µMERT (λ), θ 0 be the value of θ H 0 under, we have;

4. SIMULATION AND APPLICATION TO GENETIC ASSOCIATION STUDIES

4.1. Simulation Study

Let P be the Minor Allele Frequency (MAF) of a marker of interest. We consider case-control data with r = 500 cases and s = 500 controls, and the disease prevalence K = 0.05. We generate 1000 datasets, and compute the means and standard deviations of the three AREs. For ZMERT, we choose θi = 0 and θj = 1.
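One replicate of a simulation of this kind can be sketched in Python as follows. This is a hedged illustration, not the authors' code: it assumes Hardy-Weinberg equilibrium in the population, hypothetical genotype relative risks lam1 and lam2 and a hypothetical MAF of 0.3, and it uses the Cochran-Armitage trend statistic with genotype scores (0, x, 1), with x = 0 and x = 1 standing in, by assumption, for the two extreme statistics that enter ZMERT. The sketch stops at the two statistics and does not compute any ARE.

```python
import math
import random

def genotype_probs(maf, prevalence, lam1, lam2):
    """Genotype distributions in cases and controls under HWE and relative risks lam1, lam2."""
    g = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    f0 = prevalence / (g[0] + lam1 * g[1] + lam2 * g[2])   # baseline penetrance
    pen = [f0, lam1 * f0, lam2 * f0]
    cases = [g[i] * pen[i] / prevalence for i in range(3)]
    controls = [g[i] * (1 - pen[i]) / (1 - prevalence) for i in range(3)]
    return cases, controls

def draw_counts(rng, size, probs):
    """Multinomial genotype counts for `size` subjects."""
    counts = [0, 0, 0]
    for genotype in rng.choices([0, 1, 2], weights=probs, k=size):
        counts[genotype] += 1
    return counts

def trend_statistic(case_counts, control_counts, x):
    """Cochran-Armitage trend statistic with genotype scores (0, x, 1)."""
    scores = [0.0, x, 1.0]
    r, s = sum(case_counts), sum(control_counts)
    n = r + s
    totals = [case_counts[i] + control_counts[i] for i in range(3)]
    num = sum(scores[i] * (s * case_counts[i] - r * control_counts[i]) for i in range(3))
    var = r * s * (n * sum(scores[i] ** 2 * totals[i] for i in range(3))
                   - sum(scores[i] * totals[i] for i in range(3)) ** 2)
    if var <= 0:
        return 0.0   # degenerate table: the scores do not vary across subjects
    return math.sqrt(n) * num / math.sqrt(var)

rng = random.Random(2018)   # fixed seed so the sketch is reproducible
case_p, control_p = genotype_probs(maf=0.3, prevalence=0.05, lam1=1.25, lam2=1.5)
cases = draw_counts(rng, 500, case_p)
controls = draw_counts(rng, 500, control_p)
z_x0 = trend_statistic(cases, controls, 0.0)
z_x1 = trend_statistic(cases, controls, 1.0)
```

Repeating the last four lines over many replicates and summarizing the results would give the kind of means and standard deviations reported for the simulation.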

Table 1 shows the results: the means of the AREs, with the standard deviations in brackets. First, the means of all three AREs are less than 1, which shows that Z(θ(0)) is consistently better than ZMERT. This corresponds to the fact that when θ = θ(0) is the true value, Z(θ(0)) is asymptotically most powerful. Second, the three AREs increase as P or λ increases. Third, ep has the lowest variance among the three AREs, next is , last is

Table 1
The AREs of ZMERT and Z(θ(0)).


4.2. Application

We use 6 reported SNPs associated with breast cancer 2 (Hunter et al., 2007 [31]; Li et al., 2008 [32]) to illustrate the ARE of ZMERT. These 6 SNPs are rs10510126, rs12505080, rs17157903, rs1219648, rs7696175, and rs2420946. The genotype counts in cases and controls are shown in Table 2, where (r0, r1, r2) are the counts of the three genotypes in cases and (s0, s1, s2) are the counts of the three genotypes in controls. From the table, the three AREs Ep, Ec and Eb are all higher than 75% and can reach 97%. For example, for SNP rs17157903, the three AREs are 0.8255, 0.8453 and 0.7642. This shows that ZMERT is a robust test.
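For readers who want to evaluate the two statistics on counts of the form (r0, r1, r2) and (s0, s1, s2), a minimal sketch follows. It assumes the usual trend-test form of Z(θ) with genotype scores (0, θ, 1), and ZMERT = (Z(0) + Z(1)) / sqrt(2(1 + ρ)) with ρ the null correlation between Z(0) and Z(1) estimated from the pooled genotype frequencies; these forms are taken from the general literature on efficiency robust tests and are not confirmed by this section. The counts in the example are hypothetical and are not the Table 2 data.

```python
import math


def trend_z(cases, controls, theta):
    """Trend statistic with genotype scores (0, theta, 1) (assumed form)."""
    r, s = sum(cases), sum(controls)
    n = r + s
    scores = (0.0, theta, 1.0)
    totals = [ri + si for ri, si in zip(cases, controls)]
    num = math.sqrt(n) * sum(
        x * (s * ri - r * si) for x, ri, si in zip(scores, cases, controls)
    )
    sx = sum(x * ni for x, ni in zip(scores, totals))
    sxx = sum(x * x * ni for x, ni in zip(scores, totals))
    return num / math.sqrt(r * s * (n * sxx - sx * sx))


def zmert(cases, controls):
    """Return (ZMERT, rho) built from the extreme models theta = 0 and theta = 1."""
    totals = [ri + si for ri, si in zip(cases, controls)]
    n = sum(totals)
    p0, p2 = totals[0] / n, totals[2] / n
    # Null correlation between the recessive (theta = 0) and dominant
    # (theta = 1) statistics, estimated from pooled genotype frequencies.
    rho = math.sqrt(p0 * p2 / ((1 - p0) * (1 - p2)))
    z0 = trend_z(cases, controls, 0.0)
    z1 = trend_z(cases, controls, 1.0)
    return (z0 + z1) / math.sqrt(2.0 * (1.0 + rho)), rho


# Hypothetical genotype counts, for illustration only.
cases = (200, 220, 80)
controls = (250, 200, 50)
print("Z(0.5) =", trend_z(cases, controls, 0.5))
print("ZMERT, rho =", zmert(cases, controls))
```

With θ = 1 the scores collapse genotypes 1 and 2, so Z(1) squared coincides with the Pearson chi-square statistic of the resulting 2 x 2 table, which gives a quick sanity check on the implementation.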

Table 2
Three AREs for 6 reported SNPs associated with breast cancer 2.


APPENDIX

Derivation of : From (P3), we have . Also, as in the proof in Serfling (1980 [8], pp. 317-318), if and only if

Thus, for βi,nn) → β, we must have;

Proof of Proposition 1: We use (4) to compute ep(ZMERT, Z(θ(0))). By the definition of Z(θ(0)) and the CLT we have , and by Theorem 3 in ZLY, Also, Z(θ(0)) and ZMERT are jointly asymptotically normal with correlation . Thus the conditions of (4) are satisfied, and it gives;

Proof of Proposition 2. (i) By assumption As in the proof of Theorem 4 in ZLY, we have that where the Vi = Vi(θ) are i.i.d. with;

Under with and Under with and So we have

By Example A in Serfling (1980 [8], p. 330), we have;

similar to that for (Z(θ(0))).

(ii). We first compute . In this case, let be the weak limit of (Z(θ(0))). Then,

Proof of Proposition 3. Since under , we have tn(α) → Φ^(-1)(1 − α); and under is continuous on (−∞, ∞), the distribution function of converges uniformly to Φ(·). Note that µ(λ, θ) > 0, so for λ > 1 we have;

Letting , and using L'Hôpital's rule twice, we get;

Similarly, under where In the same way we get;

Proof of Proposition 4. (i) In our case and when uniformly in Sn. From the proof of Theorem 4 in ZLY, we have that for (a.s.). Now we compute, for

Let , and use L'Hôpital's rule,

Since , and by L'Hôpital's rule, , so applying L'Hôpital's rule to the above again,

Thus by Bahadur's (1960) [13] Theorem,

is similarly computed;

Similarly, under , (a.s.), so
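The repeated L'Hôpital steps in the proofs of Propositions 3 and 4 are of the same kind as the standard computation of the logarithmic tail rate of the standard normal distribution. As an illustration only (it is an assumption that the omitted displays take this form), with Φ and φ the standard normal distribution and density functions:

```latex
\begin{align*}
\lim_{x\to\infty}\frac{\log\{1-\Phi(x)\}}{-x^{2}/2}
 &= \lim_{x\to\infty}\frac{-\phi(x)/\{1-\Phi(x)\}}{-x}
  = \lim_{x\to\infty}\frac{\phi(x)}{x\{1-\Phi(x)\}}
  && \text{(L'H\^opital, form $\infty/\infty$)}\\
 &= \lim_{x\to\infty}\frac{-x\,\phi(x)}{\{1-\Phi(x)\}-x\,\phi(x)}
  && \text{(L'H\^opital, form $0/0$, using $\phi'(x)=-x\,\phi(x)$)}\\
 &= \lim_{x\to\infty}\frac{1}{1-\dfrac{1-\Phi(x)}{x\,\phi(x)}} = 1,
\end{align*}
% The last step uses the Mills-ratio bound 0 <= {1 - Phi(x)} / {x phi(x)} <= 1/x^2 -> 0.
```

Consequently log{1 − Φ(x)} = −(x²/2)(1 + o(1)) as x → ∞, so for a statistic whose null distribution is asymptotically standard normal and any t > 0, −(2/n) log{1 − Φ(√n t)} → t², which is the type of limit that enters a Bahadur slope.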

CONSENT FOR PUBLICATION

Not applicable.

CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1] C.M. Sitlani, and B. McKnight, "Relative efficiency of trend tests with misspecified genetic models in stratified analyses of case-control or cohort data", Hum. Hered., vol. 71, no. 4, pp. 246-255.
[http://dx.doi.org/10.1159/000328858] [PMID: 21811075]
[2] G. Zheng, "Some statistical properties of efficiency robust tests with applications to genetic association studies", Scand. J. Stat., pp. 762-774.
[http://dx.doi.org/10.1111/sjos.12060]
[3] J. Gastwirth, "On robust procedures", J. Am. Stat. Assoc., pp. 929-948.
[http://dx.doi.org/10.1080/01621459.1966.10482185]
[4] J. Gastwirth, "The use of maximin efficiency robust tests in combining contingency tables and survival analysis", J. Am. Stat. Assoc., pp. 380-384.
[http://dx.doi.org/10.1080/01621459.1985.10478127]
[5] A. Birnbaum, "Efficiency robust 2-sample rank tests", J. Am. Stat. Assoc., pp. 1241-1251.
[http://dx.doi.org/10.1080/01621459.1967.10500929]
[6] J. Rosen, "The gradient projection method for non-linear programming. Part I: Linear constraints", SIAM J., pp. 181-217.
[7] J. Gastwirth, "On robust rank test", Nonparametric Techniques in Statistical Inference, Cambridge University Press: London.
[8] R. Serfling, Approximation Theorems of Mathematical Statistics, Wiley.
[http://dx.doi.org/10.1002/9780470316481]
[9] Y. Nikitin, "Asymptotic relative efficiency in testing (version 5)", Stat Prob: The Encyclopedia Sponsored by Statistics and Probability Societies.
[10] Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests, Cambridge University Press.
[http://dx.doi.org/10.1017/CBO9780511530081]
[11] A. van der Vaart, Asymptotic Statistics, Cambridge University Press.
[http://dx.doi.org/10.1017/CBO9780511802256]
[12] H. Wieand, "A condition under which the Pitman and Bahadur", Ann. Stat., pp. 1003-1011.
[http://dx.doi.org/10.1214/aos/1176343600]
[13] R. Bahadur, "Stochastic comparison of tests", Ann. Math. Stat., pp. 276-295.
[http://dx.doi.org/10.1214/aoms/1177705894]
[14] A. Dembo, Large Deviations Techniques and Applications, 2nd ed., Springer: New York.
[http://dx.doi.org/10.1007/978-1-4612-5320-4]
[15] J. Deuschel, Large Deviations, Academic Press: Boston.
[16] R. Bahadur, "Rates of convergence of estimates and test statistics", Ann. Math. Stat., pp. 303-324.
[http://dx.doi.org/10.1214/aoms/1177698949]
[17] M. Arcones, "Bahadur efficiency of the likelihood ratio test", Math. Methods Stat., pp. 163-179.
[18] W. Hoeffding, "Asymptotically optimal tests for multinomial distributions (with discussion)", Ann. Math. Stat., pp. 369-408.
[http://dx.doi.org/10.1214/aoms/1177700150]
[19] I. Sanov, "On the probability of large deviations of random variables", Sel. Transl. Math. Statist. Prob., pp. 213-244.
[20] H. Rubin, "Bayes risk efficiency", Sankhya, A., pp. 325-346.
[21] W. Kallenberg, "Intermediate efficiency, theory and examples", Ann. Stat., pp. 170-182.
[http://dx.doi.org/10.1214/aos/1176346067]
[22] A. Borovkov, "Large deviations and testing of statistical hypotheses", Sib. Adv. Math.
[23] E. Pitman, Lecture Notes on Nonparametric Statistical Inference, Columbia University: Mimeographed., 1949.
[24] G. Noether, "Asymptotic properties of the Wald-Wolfowitz test of randomness", Ann. Math. Stat., pp. 231-246.
[http://dx.doi.org/10.1214/aoms/1177729841]
[25] C. Rao, "Criteria of estimation in large samples", Sankhya Ser. A, pp. 189-206.
[26] C. van Eden, "The relationship between Pitman’s asymptotic relative efficiency of two tests and correlation coefficient between their test statistics", Ann. Math. Stat., pp. 1442-1451.
[http://dx.doi.org/10.1214/aoms/1177703876]
[27] H. Chernoff, "A measure of asymptotic efficiency for tests of a hypothesis based on sums of observations", Ann. Math. Stat., pp. 493-507.
[http://dx.doi.org/10.1214/aoms/1177729330]
[28] W. Kallenberg, "Chernoff efficiency and deficiency", Ann. Stat., pp. 583-594.
[http://dx.doi.org/10.1214/aos/1176345799]
[29] J. Hodges, "The efficiency of some nonparametric competitors of the t-test", Ann. Math. Stat., pp. 324-335.
[http://dx.doi.org/10.1214/aoms/1177728261]
[30] Y. Zang, "Simple algorithms to calculate asymptotic", J. Stat. Softw., pp. 1-24.
[31] D.J. Hunter, P. Kraft, K.B. Jacobs, D.G. Cox, M. Yeager, S.E. Hankinson, S. Wacholder, Z. Wang, R. Welch, A. Hutchinson, J. Wang, K. Yu, N. Chatterjee, N. Orr, W.C. Willett, G.A. Colditz, R.G. Ziegler, C.D. Berg, S.S. Buys, C.A. McCarty, H.S. Feigelson, E.E. Calle, M.J. Thun, R.B. Hayes, M. Tucker, D.S. Gerhard, J.F. Fraumeni Jr, R.N. Hoover, G. Thomas, and S.J. Chanock, "A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer", Nat. Genet., vol. 39, no. 7, pp. 870-874.
[http://dx.doi.org/10.1038/ng2075] [PMID: 17529973]
[32] Q. Li, G. Zheng, Z. Li, and K. Yu, "Efficient approximation of P-value of the maximum of correlated tests, with applications to genome-wide association studies", Ann. Hum. Genet., vol. 72, no. Pt 3, pp. 397-406.
[http://dx.doi.org/10.1111/j.1469-1809.2008.00437.x] [PMID: 18318785]