GDP and the more similar economically, the higher the probability of the formation of an
RTA between two countries. The probability of the formation of an RTA is also higher when
the difference between two countries with respect to factor endowments is larger. These factors
are regarded as sources of potential trade creation that encourage governments to form
RTAs, but they are not included in the estimated equation. Baier and Bergstrand (2007)
pointed out that IV methods applied to cross-section data to address endogeneity bias
are not reliable because appropriate IVs are difficult to select. Instead of using IVs,
they suggested applying the fixed effects model to panel data, because the source
of the endogeneity bias in the gravity equation could be unobserved time-invariant heterogeneity.
As in Baier and Bergstrand (2007), we apply the fixed effects model using panel data
to control for endogeneity bias.
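
To illustrate why fixed effects can absorb this kind of bias, the following is a minimal sketch on simulated data, not the estimation used in this paper. It assumes that an unobserved, time-invariant pair effect (`alpha`) raises both trade and the chance of signing an RTA. All names, sample sizes and the true RTA coefficient of 0.5 are hypothetical, and only pair effects are modelled, with the within (demeaning) estimator standing in for the fixed effects model.

```python
import numpy as np


def simulate_panel(n_pairs=400, n_years=8, beta_rta=0.5, seed=0):
    """Simulate log trade for country pairs; all values are hypothetical."""
    rng = np.random.default_rng(seed)
    pair = np.repeat(np.arange(n_pairs), n_years)
    year = np.tile(np.arange(n_years), n_pairs)
    # Unobserved, time-invariant pair heterogeneity.
    alpha = rng.normal(0.0, 1.0, n_pairs)
    # Assumption: pairs with high alpha are more likely to sign an RTA,
    # which makes the RTA dummy endogenous in a pooled regression.
    signs = (alpha + rng.normal(0.0, 1.0, n_pairs)) > 0
    start = rng.integers(1, n_years, n_pairs)  # entry year in 1..n_years-1
    rta = (signs[pair] & (year >= start[pair])).astype(float)
    noise = rng.normal(0.0, 0.5, n_pairs * n_years)
    log_trade = beta_rta * rta + alpha[pair] + noise
    return pair, rta, log_trade


def pooled_ols(x, y):
    """Slope from a pooled regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[1])


def within_estimator(pair, x, y):
    """Slope after subtracting pair means, which removes the pair effect."""
    n = pair.max() + 1
    counts = np.bincount(pair, minlength=n)
    x_dm = x - (np.bincount(pair, weights=x, minlength=n) / counts)[pair]
    y_dm = y - (np.bincount(pair, weights=y, minlength=n) / counts)[pair]
    return float(x_dm @ y_dm / (x_dm @ x_dm))


if __name__ == "__main__":
    pair, rta, log_trade = simulate_panel()
    print("pooled OLS:", pooled_ols(rta, log_trade))
    print("within (FE):", within_estimator(pair, rta, log_trade))
```

Under these assumptions, the pooled estimate picks up the correlation between `rta` and `alpha` and overstates the RTA effect, while the within estimate stays close to the assumed value.
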
The zero trade flow problem is also a matter of serious concern. In particular, zero trade
flows may appear frequently in sectoral and product-level trade data.5 For instance, zero trade
flows account for about 50 per cent of the observations in our sample. Most
studies that estimate the impact of RTAs on foreign trade with the gravity equation have omitted
zero trade flows because the log of zero is not defined. However, omitting zero trade
flows yields biased estimates if zero trade flows do not occur randomly. It seems appropriate
to assume that factors such as distance, lack of political and cultural links, and large differences
in production structures lead to zero trade between countries.
Several studies have dealt with this problem using the Tobit model, the Heckman sample selection
model and the PPML method. Trade flows cannot be negative, and import values from the
UN COMTRADE database, which we use for the estimation, are reported only when they are
greater than US$1. Consequently, the gravity model with zero trade flows is categorised as a
type 2 Tobit model, namely the sample selection model.6 Linders and de Groot (2006) employ
the Heckman sample selection model, which incorporates the decision on trade. Given that
zero trade flows usually do not appear randomly, the sample selection model is appropriate.
However, theoretically founded models of the firm’s decision on trade have not been well
developed, and thus, it is difficult to find specific explanatory variables for the selection
model. When the same variables are used in both the selection and the outcome
equations, multicollinearity may arise and the error variances increase. Santos Silva and
Tenreyro (2006), on the other hand, propose the PPML estimation technique, which provides a
way to deal with zero trade values. They also show that PPML estimation can provide
robust estimates in the presence of heteroscedasticity. Although Martin and Pham (2008) demonstrated
that the PPML estimator yields severely biased estimates when zero trade flows are frequent,
Santos Silva and Tenreyro (2011) extended their simulations to the case of a large
proportion of zeros and confirmed that the PPML estimator is generally well behaved even
when the dependent variable, that is, the trade flow in the gravity model, has a large proportion of
zeros in overall bilateral trade.
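
The contrast between dropping zeros and keeping them under PPML can be sketched on simulated data. The example below is an illustration under stated assumptions, not the specification estimated in this paper. The single regressor (a standardised log distance), the coefficient values, the rule that makes zeros more likely at long distances, and the function names are all hypothetical. The PPML solver is a bare Newton iteration written for this sketch and assumes the first column of `X` is a constant.

```python
import numpy as np


def simulate_trade(n=20000, b0=1.0, b1=-1.0, sigma=0.5, seed=0):
    """Simulate trade with E[y | x] = exp(b0 + b1 * x) and many zeros."""
    rng = np.random.default_rng(seed)
    log_dist = rng.normal(0.0, 1.0, n)  # standardised log distance
    mu = np.exp(b0 + b1 * log_dist)
    # Assumption: the chance of positive trade falls with distance,
    # so zeros are not random (about half the sample is zero).
    p_trade = 1.0 / (1.0 + np.exp(log_dist))
    trades = rng.random(n) < p_trade
    # Multiplicative error with mean one.
    v = rng.lognormal(mean=-0.5 * sigma**2, sigma=sigma, size=n)
    # Dividing by p_trade keeps the conditional mean equal to mu.
    y = np.where(trades, mu * v / p_trade, 0.0)
    return log_dist, y


def ppml(X, y, n_iter=100, tol=1e-10):
    """Poisson pseudo-maximum-likelihood by Newton iterations."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start the constant at the log mean of y
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)           # score of the Poisson likelihood
        hess = X.T @ (X * mu[:, None])  # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def log_ols_slope(x, y):
    """Log-linear OLS slope using only observations with positive trade."""
    keep = y > 0
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    coef, *_ = np.linalg.lstsq(X, np.log(y[keep]), rcond=None)
    return float(coef[1])


if __name__ == "__main__":
    x, y = simulate_trade()
    X = np.column_stack([np.ones_like(x), x])
    print("share of zeros:", np.mean(y == 0))
    print("PPML slope:", ppml(X, y)[1])
    print("log-OLS slope (zeros dropped):", log_ols_slope(x, y))
```

In this setup, the log-linear regression on positive flows understates the distance effect because the zeros it drops are concentrated among distant pairs, whereas the PPML slope stays close to the assumed value of -1 despite roughly half the observations being zero.
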