INTRODUCTION
The problem of estimating the mode of a probability density function (pdf)
is a matter of both theoretical and practical interest. Parzen
(1962) considered the problem of estimating the mode of a univariate pdf.
Parzen (1962) and Nadaraya (1965)
have shown that under regularity conditions the estimator of the population
mode obtained by maximizing a kernel estimator of the pdf is strongly consistent
and asymptotically normally distributed. Samanta (1973)
has given multivariate versions of Parzen’s results. Samanta
and Thavaneswaran (1990) considered the problem of estimating the mode of
a conditional pdf and they have shown under regularity conditions that the estimator
of the population conditional mode is strongly consistent and asymptotically
normally distributed. Salha and Ioannides (2004) generalized
these results by considering the conditional mode evaluated at distinct conditional
points. Vieu (1996) presented and compared four mode
estimation procedures. Recently, for random design models, Ziegler
(2002) proposed a kernel estimator of the mode, whose asymptotic normality
was shown by Ziegler (2003). In addition, Ziegler (2004) presented an
adaptive kernel estimator for the mode.
Assume that (X_{1}, Y_{1}),…,(X_{n}, Y_{n}) are i.i.d. random variables with joint pdf f(x, y) and conditional pdf f(y|x) of Y_{1} given X_{1} = x. We assume that, for each x, f(y|x) is uniformly continuous in y and possesses a unique conditional mode M(x) defined by:

f(M(x)|x) = max_{y} f(y|x)
Samanta and Thavaneswaran (1990) considered the problem
of estimating the conditional mode using the Nadaraya-Watson (NW) estimator
of the conditional density function, but this estimator has the disadvantages of
producing a rather large bias and boundary effects. To overcome these difficulties,
Hall et al. (1999) proposed the Reweighted Nadaraya-Watson
(RNW) estimator, a weighted version of the NW estimator that combines
the advantages of the Local Linear (LL) estimator, such as bias reduction
and the absence of boundary effects, while preserving the property of the NW
estimator that it is always a distribution function.
Let τ_{i}(x) denote probability-like weights with the properties
that τ_{i}(x) ≥ 0,

Σ_{i=1}^{n} τ_{i}(x) = 1 and Σ_{i=1}^{n} τ_{i}(x)(X_{i} − x)K_{h}(X_{i} − x) = 0,

where K(.) is a kernel function, K_{h}(u) = h^{−1}K(u/h)
and h = h_{n} > 0 is the bandwidth. The role of τ_{i}(x)
is to adjust the NW weights so that the resulting conditional density estimator
resembles that of the LL estimator. The RNW conditional density estimator
is defined as follows:

f_{n}(y|x) = Σ_{i=1}^{n} τ_{i}(x)K_{h}(x − X_{i})K_{h}(y − Y_{i}) / Σ_{i=1}^{n} τ_{i}(x)K_{h}(x − X_{i})
If K(u) is chosen such that K(u) tends to zero as u tends to ±∞, then for every sample sequence and for each x, f_{n}(y|x) is a continuous function of y and tends to zero as y tends to ±∞. Consequently, there is a random variable M_{n}(x), called the sample conditional mode, such that:

f_{n}(M_{n}(x)|x) = max_{y} f_{n}(y|x)
In this study, the conditional mode is estimated using the RNW estimator of the conditional pdf; the asymptotic normality of this estimator is proved and its performance is examined in two applications.
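As an illustration of the estimation procedure, the RNW weights and the grid search for the sample conditional mode can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes a Gaussian kernel, uses a single bandwidth h in both directions, and computes τ_i(x) in the empirical-likelihood form τ_i = 1/[n(1 + λ(X_i − x)K_h(X_i − x))] with the Lagrange multiplier λ found by bisection, in the spirit of Cai (2002) and De Gooijer and Zerom (2003); all function names are ours.

```python
import numpy as np

def gauss_kernel(u):
    """Standard normal kernel K(u)."""
    return np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)

def rnw_weights(X, x, h):
    """RNW weights tau_i(x): maximize sum(log tau_i) subject to
    sum(tau_i) = 1 and sum(tau_i * (X_i - x) * K_h(X_i - x)) = 0.
    Solution: tau_i = 1 / (n * (1 + lam * c_i)), c_i = (X_i - x) K_h(X_i - x),
    with lam the root of g(lam) = sum(c_i / (1 + lam * c_i))."""
    n = len(X)
    c = (X - x) * gauss_kernel((X - x) / h) / h
    if c.min() >= 0.0 or c.max() <= 0.0:
        return np.full(n, 1.0 / n)      # constraint cannot bind; fall back to NW
    lo = -1.0 / c.max() + 1e-12         # keep 1 + lam * c_i > 0 for all i
    hi = -1.0 / c.min() - 1e-12
    for _ in range(200):                # g is strictly decreasing on (lo, hi)
        mid = 0.5 * (lo + hi)
        if np.sum(c / (1.0 + mid * c)) > 0.0:
            lo = mid
        else:
            hi = mid
    tau = 1.0 / (n * (1.0 + 0.5 * (lo + hi) * c))
    return tau / tau.sum()

def rnw_density(y, x, X, Y, h):
    """RNW conditional density estimate f_n(y | x)."""
    tau = rnw_weights(X, x, h)
    w = tau * gauss_kernel((X - x) / h) / h
    return np.sum(w * gauss_kernel((y - Y) / h) / h) / np.sum(w)

def rnw_mode(x, X, Y, h, grid):
    """Sample conditional mode M_n(x): maximize f_n(. | x) over a grid."""
    vals = np.array([rnw_density(y, x, X, Y, h) for y in grid])
    return grid[int(np.argmax(vals))]
```

The bisection bracket is exactly the interval on which all candidate weights remain positive, so the root always exists and the resulting τ_i(x) are nonnegative and sum to one by construction.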
CONDITIONS
Consider the following conditions:
Condition 1
The kernel function K(u) is a symmetric and bounded probability density
function such that
• 
The first two derivatives of K(u), (K^{(i)}(u), i = 1, 2), are
functions of bounded variation. 
• 

• 

Condition 2
The marginal density g(x) is uniformly continuous and is bounded from below
by a positive constant.
Condition 3
The partial derivatives ∂^{i+j}f(x, y)/∂x^{i}∂y^{j}
exist and are bounded for 1 ≤ i + j ≤ 3.
Condition 4
The bandwidth h = h_{n} > 0 satisfies the following:
• 

• 

MAIN RESULTS
Here, the two main theorems of this study, Theorems 1 and 2, will be presented and proved. For proving these theorems, the following lemmas are required.
Lemma 1
Under the conditions 1, 2 and 4(i), the following is true:
Where:
Proof
The proof of this lemma is a part of the proof of theorem 1 by De
Gooijer and Zerom (2003).
Let
Where:
Lemma 2
Under the conditions 1, 3 and 4, the following holds:
• 

• 

Proof
Using Taylor expansion and integration by parts,
Then,
Then,
From Eq. 1 and 2, we get:
• 

• 

This completes the proof of the lemma.
Now,
This implies that:
Lemma 3
Under the conditions 1, 3 and 4, the following holds:
Proof
Let .
.
Since E(ε_{i}|x) = 0, then E(Δ) = 0, which implies that E(J_{1}) = 0.
This implies that,
To show that,
we will use Lyapunov's Theorem (Sen and Singer,
1993). It is sufficient to show that:
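For reference, Lyapunov's condition for a triangular array of independent, zero-mean summands Z_{n1}, …, Z_{nn} takes the following general form (Sen and Singer, 1993); this is the general statement, not the paper's specific display:

```latex
\frac{\sum_{i=1}^{n} \mathbb{E}\,|Z_{ni}|^{2+\delta}}
     {\left(\sum_{i=1}^{n} \mathbb{E}\,Z_{ni}^{2}\right)^{1+\delta/2}}
\;\longrightarrow\; 0
\quad \text{for some } \delta > 0 .
```

When this ratio vanishes, the suitably normalized sum converges in distribution to a standard normal random variable.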
Since, .
Therefore, the following holds:
Since,
Therefore, ρ_{n}→ ∞.
This implies that:
which leads to:
Since,
we get that:
Theorem 1
Under the conditions 1, 3 and 4, the following is true:
where;
Proof
A combination of lemma 3 and Eq. 3 completes the proof
of theorem 1.
Now, using Taylor expansion:
This implies that:
where;
Therefore,
Lemma 4
Under the conditions 1-4, the following holds:
Proof
The proof follows by the same techniques as those used for lemma 4 of Samanta
and Thavaneswaran (1990).
Theorem 2
Under the conditions 1-4, the following is true:
Proof
The proof of the theorem follows directly by using Eq. 4,
lemma 4 and theorem 1.
Note that Bias(f_{n}(M(x)|x)) → 0 if we assume that the second
moment of the kernel function K vanishes, that is, ∫u²K(u)du = 0,
as assumed by Samanta and Thavaneswaran (1990).
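For context, this can be seen from the standard leading-term bias expansion for a second-order kernel smoother; writing μ_{2}(K) = ∫u²K(u)du, the expansion has the general form (a general sketch, not the paper's exact display):

```latex
\operatorname{Bias}\bigl(f_n(y \mid x)\bigr)
\;\approx\; \frac{h^{2}}{2}\,\mu_{2}(K)\,
\frac{\partial^{2}}{\partial y^{2}} f(y \mid x),
\qquad \mu_{2}(K) = \int u^{2} K(u)\,du .
```

When μ_{2}(K) = 0, this leading term drops out, leaving only higher-order bias.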
APPLICATIONS
The proposed RNW estimator is applied to find the conditional
mode of different data sets. The standardized normal kernel function is used and
the weights τ_{i}(x) are calculated as described by De
Gooijer and Zerom (2003) and Cai (2002).
Example 1
This application depends on simulated data. A sample of size 200 is simulated
from the model y = sin(2π(1 − x)²) + xe, where x∼N(0,1)
and e∼Uniform[0,1]. A perfect smooth would recapture the original signal
y = sin(2π(1 − x)²) exactly. For a direct comparison of the perfect
smooth and the conditional mode estimate, a scatter plot of the original data,
the perfect smooth and the estimated conditional mode curve is shown in Fig.
1. The performance of the estimator can be tested using R²_{y,Ŷ},
the squared correlation coefficient between Ŷ (the predicted values) and y (the
actual values):

R²_{y,Ŷ} = 1 − SSE/SSTO,

where SSE = Σ_{i}(y_{i} − ŷ_{i})²
denotes the error sum of squares, SSTO = Σ_{i}(y_{i} − ȳ)²
denotes the total sum of squares and ȳ denotes
the mean of the actual values y_{i}. For the current data, SSE = 1.3958, which is
small relative to SSTO = 15.3209, and R²_{y,Ŷ} = 0.9089,
which is close to 1 and indicates that the correlation between the actual and
predicted values is very strong. This comparison indicates that the proposed
estimator of the conditional mode is reasonably good.
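The goodness-of-fit computation above is straightforward to reproduce (a minimal sketch; the function name is ours):

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 between actual values y and predicted values y_hat: 1 - SSE/SSTO."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)        # error sum of squares
    ssto = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
    return 1.0 - sse / ssto
```

Plugging in the values reported above, 1 − 1.3958/15.3209 ≈ 0.9089, matching the stated R².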
Fig. 1: Comparison between the mode estimation and the perfect curve

Fig. 2: Three different estimations for the ethanol data
Example 2
Consider the ethanol data, which describe the relationship between the
predictor E (ethanol) and the response NOx (nitric oxide). Clearly, the relationship
is not linear. The regression relation is estimated using three different estimators:
the proposed (conditional mode) estimator and two estimators from the S-Plus
program, the locally weighted regression (loess) estimator and the kernel estimator.
A scatter plot of the data together with the graphs of the three estimators
is shown in Fig. 2. It is clear that the proposed estimator
is reasonably good.