April 30, 2026

Adventure Awaits Journeyers

Discovering the World Anew

Measurement of tourism eco-efficiency, spatial distribution, and influencing factors in China

Case study introduction

The study area comprises 30 provincial-level administrative units in China (Hong Kong, Macau, Taiwan, and Tibet are excluded). China has abundant resource endowments and diverse regional characteristics, and tourism development is markedly unbalanced across provinces. This imbalance bears directly on improvements in tourism eco-efficiency (TEE) and on practical pathways toward sustainable development. As a strategic pillar industry in China, tourism drives economic growth but also places considerable pressure on environmental carrying capacity. Since the early 21st century, China has steadily intensified its focus on building an ecological civilization, shifting from an extensive growth model to a high-quality development model, and improving TEE has been stated as an explicit goal. During the 12th and 13th Five-Year Plans (2011–2020), China’s economy grew rapidly and the tourism industry entered a phase of comprehensive development. This period marks an important transitional stage for the tourism industry, and its data have both representative and policy reference value. The research period therefore spans 2011 to 2020. The locations of the case study areas are shown in Fig. 1.

Fig. 1 Overview of the Case Study Area.

Methodology

Technical roadmap

First, TEE was measured using the super-SBM model with undesirable outputs. Its temporal evolution and spatial distribution were then described using the Theil index, nonparametric kernel density estimation, standard deviational ellipse analysis, and hotspot analysis. Finally, an impact framework for TEE was constructed on the basis of the technology–organization–environment (TOE) framework, and the optimal parameters-based geographical detector (OPGD) was introduced to explore the driving effects of single factors and of factor interactions within that framework. The detailed process is shown in Fig. 2.

Fig. 2
Super-SBM model with undesirable outputs

Owing to the limitations and biases introduced by the choice of radial and oriented measures in traditional data envelopment analysis (DEA) models, Tone proposed an improved DEA model, the slacks-based measure (SBM) model (Tone 2001). However, the SBM model cannot compare or rank provinces (cities) whose relative efficiency equals 1. The super-efficiency SBM model was proposed to address this issue. We selected 30 provinces (including municipalities directly under the central government) as decision-making units. Provinces are economically and statistically independent, which makes it easier to obtain comprehensive and reliable input and output data (Charnes et al. 1997). One advantage of the super-efficiency SBM model is that it handles multiple inputs and outputs. In terms of the production function’s input‒output indicators, labor input, fixed asset investment, tourism energy consumption, and tourism revenue all satisfy the requirements for normal production activities, and each indicator can represent the conditions of a province. The model is as follows:

$$\begin{array}{c}\min \rho =\displaystyle\frac{\frac{1}{m}\sum\nolimits_{i=1}^{m}\bar{x}_{i}/x_{ik}}{\frac{1}{r_{1}+r_{2}}\left(\sum\nolimits_{s=1}^{r_{1}}\bar{y}_{s}^{d}/y_{sk}^{d}+\sum\nolimits_{q=1}^{r_{2}}\bar{y}_{q}^{u}/y_{qk}^{u}\right)}\\ \text{s.t.}\ \left\{\begin{array}{l}\bar{x}_{i}\ge \sum\limits_{j=1,j\ne k}^{n}x_{ij}\lambda_{j};\ \bar{y}_{s}^{d}\le \sum\limits_{j=1,j\ne k}^{n}y_{sj}^{d}\lambda_{j};\ \bar{y}_{q}^{u}\ge \sum\limits_{j=1,j\ne k}^{n}y_{qj}^{u}\lambda_{j};\\ \bar{x}_{i}\ge x_{ik};\ \bar{y}_{s}^{d}\le y_{sk}^{d};\ \bar{y}_{q}^{u}\ge y_{qk}^{u};\\ \lambda_{j}\ge 0;\ i=1,2,\cdots ,m;\ j=1,2,\cdots ,n,\ j\ne k;\ s=1,2,\cdots ,r_{1};\ q=1,2,\cdots ,r_{2}\end{array}\right.\end{array}$$

(1)

The model assumes that there are n decision-making units (DMUs), each with m inputs, r1 desired outputs, and r2 undesired outputs; x, yd, and yu represent the elements of the corresponding input, desired output, and undesired output matrices, respectively. The subscript k denotes the DMU under evaluation, ρ represents the TEE value, and λ is the column vector of weights.
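To make the objective in Eq. (1) more concrete, the following minimal sketch evaluates ρ for one DMU at a given candidate reference point. It does not solve the optimization itself, which would need a linear-programming solver; the function name `super_sbm_ratio` and all numbers are hypothetical and are not taken from the study.

```python
def super_sbm_ratio(x_k, yd_k, yu_k, x_bar, yd_bar, yu_bar):
    """Evaluate the objective of Eq. (1) for DMU k at a candidate reference point.

    x_k, yd_k, yu_k       -- observed inputs, desired outputs, undesired outputs of DMU k
    x_bar, yd_bar, yu_bar -- candidate reference values (hypothetical, not optimized)
    """
    m, r1, r2 = len(x_k), len(yd_k), len(yu_k)
    # Numerator: mean ratio of reference inputs to observed inputs.
    input_term = sum(xb / x for xb, x in zip(x_bar, x_k)) / m
    # Denominator: mean ratio over desired and undesired outputs together.
    output_term = (
        sum(yb / y for yb, y in zip(yd_bar, yd_k))
        + sum(ub / u for ub, u in zip(yu_bar, yu_k))
    ) / (r1 + r2)
    return input_term / output_term


# Hypothetical province: 3 inputs, 2 desired outputs, 1 undesired output.
x_k, yd_k, yu_k = [10.0, 20.0, 5.0], [100.0, 50.0], [8.0]

# Reference point identical to the DMU itself: rho equals 1.
rho_self = super_sbm_ratio(x_k, yd_k, yu_k, x_k, yd_k, yu_k)

# Reference point using 20% more of every input with the same outputs: rho is about 1.2.
x_bar = [1.2 * x for x in x_k]
rho_super = super_sbm_ratio(x_k, yd_k, yu_k, x_bar, yd_k, yu_k)
```

Values of ρ at or above 1 are what allow efficient DMUs to be ranked against one another, which the ordinary SBM model cannot do.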

Standard deviation ellipse

The standard deviational ellipse reveals the spatial distribution characteristics of geographic elements (Wang et al. 2022c). It describes the spatial distribution of the observed variables quantitatively through the centroid, orientation, major axis, and minor axis of the ellipse.

$$\overline{X}_{w}=\frac{\sum\nolimits_{i=1}^{n}w_{i}x_{i}}{\sum\nolimits_{i=1}^{n}w_{i}};\quad \overline{Y}_{w}=\frac{\sum\nolimits_{i=1}^{n}w_{i}y_{i}}{\sum\nolimits_{i=1}^{n}w_{i}}$$

(2)

$$\tan \theta =\frac{\left(\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{x}_{i}^{2}-\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{y}_{i}^{2}\right)+\sqrt{\left(\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{x}_{i}^{2}-\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{y}_{i}^{2}\right)^{2}+4\left(\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{x}_{i}\tilde{y}_{i}\right)^{2}}}{2\sum\nolimits_{i=1}^{n}w_{i}^{2}\tilde{x}_{i}\tilde{y}_{i}}$$

(3)

$${\sigma }_{x}=\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{({w}_{i}{\tilde{x}}_{i}\cos \theta -{w}_{i}{\tilde{y}}_{i}\sin \theta )}^{2}/\mathop{\sum }\nolimits_{i=1}^{n}{w}_{i}^{2}}$$

(4)

$${\sigma }_{y}=\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{({w}_{i}{\tilde{x}}_{i}\sin \theta +{w}_{i}{\tilde{y}}_{i}\cos \theta )}^{2}/\mathop{\sum }\nolimits_{i=1}^{n}{w}_{i}^{2}}$$

(5)

where (xi, yi) are the spatial coordinates of the observed variables; wi is the spatial weight; (X̄w, Ȳw) is the weighted mean center; θ is the orientation of the standard deviational ellipse; x̃i and ỹi are the coordinate deviations from the weighted mean center; and σx and σy are the standard deviations along the x- and y-axes, respectively.
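The ellipse parameters can be computed directly from coordinates and weights. The sketch below is one possible reading of Eqs. (2) to (5), under two stated assumptions: the cross term under the square root is taken as the squared sum of weighted cross products, and σy is computed with the orthogonal rotation (x̃ sin θ + ỹ cos θ). The function name and the sample points are hypothetical.

```python
import math


def standard_deviational_ellipse(points, weights):
    """Return (center_x, center_y, theta, sigma_x, sigma_y) for weighted points."""
    w_sum = sum(weights)
    # Weighted mean center, Eq. (2).
    cx = sum(w * x for (x, _), w in zip(points, weights)) / w_sum
    cy = sum(w * y for (_, y), w in zip(points, weights)) / w_sum

    # Coordinate deviations from the weighted mean center.
    dx = [x - cx for x, _ in points]
    dy = [y - cy for _, y in points]

    w2 = [w * w for w in weights]
    a = sum(q * u * u for q, u in zip(w2, dx)) - sum(q * v * v for q, v in zip(w2, dy))
    b = sum(q * u * v for q, u, v in zip(w2, dx, dy))

    # tan(theta) = (a + sqrt(a^2 + 4 b^2)) / (2 b); atan2 also handles b == 0.
    theta = math.atan2(a + math.sqrt(a * a + 4.0 * b * b), 2.0 * b)

    cos_t, sin_t = math.cos(theta), math.sin(theta)
    w2_sum = sum(w2)
    # Standard deviations along the rotated axes (assumed orthogonal rotation).
    sigma_x = math.sqrt(
        sum((w * u * cos_t - w * v * sin_t) ** 2 for w, u, v in zip(weights, dx, dy)) / w2_sum
    )
    sigma_y = math.sqrt(
        sum((w * u * sin_t + w * v * cos_t) ** 2 for w, u, v in zip(weights, dx, dy)) / w2_sum
    )
    return cx, cy, theta, sigma_x, sigma_y


# Hypothetical example: three equally weighted points on the line y = x.
cx, cy, theta, sigma_x, sigma_y = standard_deviational_ellipse(
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], [1.0, 1.0, 1.0]
)
```

For these collinear points the orientation comes out at 45 degrees and one of the two standard deviations collapses to zero, which is the expected degenerate ellipse.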

Hotspot analysis

The Getis‒Ord Gi* statistic is used to identify clusters of high and low values within a spatial region and reveal the spatial distribution patterns of hot spots and cold spots (Van der Zee et al. 2020). It calculates a standardized Z score, where higher values indicate hot spot areas and lower values indicate cold spot areas. A Z value approaching zero indicates that the spatial clustering characteristics of the region are not significant. The formula is as follows:

$$G_{i}^{\ast }=\frac{\sum\nolimits_{j=1}^{n}W_{i,j}X_{j}-\bar{X}\sum\nolimits_{j=1}^{n}W_{i,j}}{S\sqrt{\frac{n\sum\nolimits_{j=1}^{n}W_{i,j}^{2}-\left(\sum\nolimits_{j=1}^{n}W_{i,j}\right)^{2}}{n-1}}},\quad \bar{X}=\frac{\sum\nolimits_{j=1}^{n}X_{j}}{n},\quad S=\sqrt{\frac{\sum\nolimits_{j=1}^{n}X_{j}^{2}}{n}-{(\bar{X})}^{2}}$$

(6)

In Eq. (6), Xj represents the attribute value of spatial feature j, Wi,j represents the spatial weight between feature i and feature j, n represents the total number of spatial features, X̄ represents the mean of the attribute values, and S represents their standard deviation. The Gi* statistic is itself a z score. A higher z score indicates tighter clustering of high attribute values among the spatial features, whereas a lower z score indicates tighter clustering of low attribute values.
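As a small worked illustration of the Gi* statistic, the sketch below computes the z score of every feature from a list of attribute values and a full weight matrix. The function name, the binary weights, and the four sample values are hypothetical; the diagonal of the weight matrix is assumed to be included, so each feature counts as its own neighbour.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return the Gi* z score of every feature.

    values  -- attribute value X_j of each feature
    weights -- n x n spatial weight matrix; weights[i][j] is W_ij
               (diagonal included, an assumption of this sketch)
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)

    scores = []
    for row in weights:
        w_sum = sum(row)
        w_sq_sum = sum(w * w for w in row)
        numerator = sum(w * v for w, v in zip(row, values)) - mean * w_sum
        # Undefined when all values are equal or a row weights every feature equally.
        denominator = s * math.sqrt((n * w_sq_sum - w_sum * w_sum) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical example: two high-value neighbours and two low-value neighbours.
values = [10.0, 10.0, 1.0, 1.0]
weights = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
z_scores = getis_ord_gi_star(values, weights)
```

The first two features receive positive scores (a hot spot) and the last two receive negative scores of the same magnitude (a cold spot).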

Geodetector analysis for optimal parameters

The OPGD model extends the geographic detector model by optimizing its parameters. Parameter optimization detects the scale and zoning effects of spatial data, which improves the quality and accuracy of the results. Determining the optimal scale of spatial stratified heterogeneity through spatial data discretization is a key step in applying a geographic detector, and the effectiveness of a discretization can be evaluated by the magnitude of the q statistic. Using the GD package in R, equal, natural, quantile, geometric, and standard deviation breaks are applied, with the number of classes set between 3 and 7. The optimal parameters are the best combination of discretization method, number of breakpoints, and spatial scale: the q values of the candidate combinations are calculated, and the combination with the highest q value is selected. In this study, because the research scale is fixed, the optimal parameters were determined from the discretization method and the number of breakpoints calculated with the R package (Song et al. 2020). With the optimal parameters selected, the geographic detector is used to reveal the driving forces behind the spatial distribution of TEE in China (Wang et al. 2010). A larger q value indicates that a driving factor has stronger explanatory power for changes in TEE. The formula is as follows:

$$q=1-\frac{\sum\nolimits_{h=1}^{L}N_{h}\sigma_{h}^{2}}{N\sigma^{2}}=1-\frac{\mathrm{SSW}}{\mathrm{SST}}$$

(7)

$$\mathrm{SSW}=\sum\nolimits_{h=1}^{L}N_{h}\sigma_{h}^{2},\quad \mathrm{SST}=N\sigma^{2}$$

(8)

where q denotes the explanatory power of the factor, with a value range of [0, 1] and values closer to 1 indicating greater explanatory power; h = 1, …, L indexes the strata of the explanatory or explained variable; Nh and N represent the number of units in stratum h and in the whole region, respectively; σh² and σ² represent the variance of the Y values in stratum h and in the whole region, respectively; and SSW and SST represent the within-strata sum of squares and the total sum of squares, respectively.
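The q statistic in Eqs. (7) and (8) is simple enough to compute by hand. The following minimal sketch assumes population variances and takes a stratum label per observation; the function name and the sample data are hypothetical, and the sketch is not the R GD package used in the study.

```python
from collections import defaultdict
from statistics import pvariance


def q_statistic(y, strata):
    """Return q = 1 - SSW / SST for values y grouped by stratum labels."""
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    # SSW: sum over strata of N_h times the within-stratum (population) variance.
    ssw = sum(len(group) * pvariance(group) for group in groups.values())
    # SST: N times the (population) variance of all values.
    sst = len(y) * pvariance(y)
    return 1.0 - ssw / sst


# Strata that separate the values perfectly explain all of the variance.
q_perfect = q_statistic([1.0, 1.0, 1.0, 5.0, 5.0, 5.0], ["a", "a", "a", "b", "b", "b"])

# Strata that mix high and low values equally explain none of it.
q_none = q_statistic([1.0, 5.0, 1.0, 5.0], ["a", "a", "b", "b"])
```

In an optimal-parameter search, a function of this kind would be evaluated once per candidate discretization (method and number of classes), and the candidate with the largest q would be kept.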

The interaction detector determines the characteristics of the interaction between two variables by comparing the q values of a single factor with those of the interaction of two factors. The interaction of driving factors is identified by examining the q values of the detection results to determine whether the combined effect between driving factors increases or decreases the explanatory power of the analyzed variable (see Table 1).

Table 1 Interaction Detection.
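The comparison performed by the interaction detector can be sketched as a small classification function. The five categories below follow the scheme commonly used with the geographic detector (Wang et al. 2010); the exact labels, the tolerance `tol`, and the treatment of boundary cases are assumptions of this sketch rather than a transcription of Table 1.

```python
def classify_interaction(q1, q2, q12, tol=1e-9):
    """Classify the interaction of two factors from their q values.

    q1, q2 -- q values of the single factors
    q12    -- q value of the two factors combined (overlaid strata)
    """
    low, high = min(q1, q2), max(q1, q2)
    if q12 < low:
        return "nonlinear weakening"
    if q12 < high:
        return "single-factor nonlinear weakening"
    if abs(q12 - (q1 + q2)) <= tol:
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    # Remaining case: max(q1, q2) <= q12 < q1 + q2 (equality with the max is lumped in here).
    return "bi-factor enhancement"


# Hypothetical q values for two factors and their interaction.
label = classify_interaction(0.3, 0.4, 0.8)
```

With the hypothetical values shown, the combined q exceeds the sum of the single-factor q values, so the pair is classified as nonlinear enhancement.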

Measurement

TEE measurement

The measurement indicators of TEE vary depending on the research context and data availability, but they always include input and output indicators. Considering the new requirements of ecological civilization construction for high-quality tourism development, this study refers to existing research indicator systems (Guo et al. 2022; Wang et al. 2022a) and proposes the following indicator system (see Table 2).

Table 2 Measurement Indicators for TEE.

First, in terms of input indicators, labor, capital, and land are regarded as the basic production factors of tourism economic activities (Zha et al. 2020). Because the statistical standards for tourism labor differed considerably during the study period and the data fluctuate widely, this study uses the number of people employed in the tertiary industry as a proxy for labor input. Capital refers to the investment resources associated with tourism economic activities, specifically fixed asset investment. Because data on tourism fixed asset investment are limited, and drawing on existing research, this study uses the sum of the numbers of star-rated hotels, travel agencies, and weighted scenic spots to represent fixed asset investment. Tourism energy consumption is also selected as an input indicator (Perch-Nielsen et al. 2010); like tourism carbon emissions, it is estimated indirectly (Becken and Patterson 2006; Perch-Nielsen et al. 2010). Land refers to the land resources used for tourism development, such as land for scenic spots, hotels, and transportation facilities. However, because tourism satellite accounts do not include tourism land, this factor is not considered in this study.

Second, in terms of desired output indicators, the tourism industry aims to meet tourist demand and create social and economic output. Economic output is represented by total tourism revenue. To eliminate the impact of price fluctuations, total tourism revenue is converted into constant prices using the consumer price index (CPI), with 2011 as the base year. The total number of tourist arrivals is used as a proxy for social output. Tourist satisfaction is also an ideal output indicator for a tourism destination, but it is not considered here because of data limitations (Peng et al. 2017). Third, in terms of undesired output indicators, pollutant emissions are inevitable when production factors are used to produce tourism goods and services. Tourism carbon emissions reflect the impact of tourism development on the ecological environment, so they are treated as the undesired output indicator. Following previous studies (Wang et al. 2022a), this study adopts a bottom-up measurement method (see the supplemental material).
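The constant-price conversion mentioned above amounts to one line of arithmetic. The sketch below assumes a fixed-base CPI series (any scale) keyed by year; if the CPI is published as a year-on-year index, it would need to be chained into a fixed-base series first. The function name and all figures are hypothetical.

```python
def to_constant_prices(nominal, cpi, base_year=2011):
    """Deflate nominal values to base-year prices.

    nominal -- dict of year -> nominal value (e.g. total tourism revenue)
    cpi     -- dict of year -> fixed-base consumer price index
    """
    base = cpi[base_year]
    return {year: value * base / cpi[year] for year, value in nominal.items()}


# Hypothetical revenue series and CPI, for illustration only.
nominal_revenue = {2011: 100.0, 2012: 110.0}
cpi = {2011: 100.0, 2012: 110.0}
real_revenue = to_constant_prices(nominal_revenue, cpi)
```

In this toy series the nominal growth is entirely due to prices, so revenue in 2011 prices is unchanged between the two years.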

Factor measurement

We constructed a framework of influencing factors on the basis of the technology–organization–environment (TOE) framework (Baker 2012; Zeng et al. 2023) (see Table 3). The TOE framework was proposed by Tornatzky and Fleischer and later applied to macro-level research (Zeng et al. 2023). Its advantage is that it considers the combined static effects of three dimensions (technology, organization, and environment) simultaneously, and so reflects the complexity and diversity of TEE improvement more fully. The technological dimension includes regional innovation capability and the digitalization level; the organizational dimension encompasses government attention and human resources; and the environmental dimension covers resource endowment, transportation convenience, and the economic development level.

(1) Regional Innovation Capability: The supply side of tourism can use technologies such as big data and blockchain to innovate cultural and tourism industry products, formats, and scenarios (Yunita et al. 2019). Technological innovation also provides strong support for innovative regulation of integrated market phenomena, which favors tourism development (Gan et al. 2023; Pascual-Fernández et al. 2021). Following existing research (Wang et al. 2022b), we use the knowledge creation index disclosed in the regional innovation report for measurement. Specific indicators can be found in the China Regional Innovation Capability Evaluation Report (see the supplemental material).

(2) Digitalization Level: Digitalization promotes cross-industry exchange and collaboration among tourism market entities on the basis of organizational innovation (Aghaei et al. 2021). The data analysis capabilities provided by digital technology help tourism market entities respond quickly and precisely to the production and market processes that link supply and demand (Hadjielias et al. 2022). The level of digitalization typically involves digital infrastructure (Okafor et al. 2023) and digital service capabilities (Zhang et al. 2024). We therefore constructed a digitalization measurement system with seven indicators covering these two aspects.

(3) Government Support Level: The government can increase research and financial investment to support regional technological upgrading (Li et al. 2023; Sun et al. 2024). A sound institutional mechanism also mitigates the rent-seeking behaviors caused by market failure (Greenwood et al. 2011). Drawing on existing research (Zhang 2025), this study measures government support by the frequency of the word “tourism” in government reports and the number of officially issued policies.

(4) Human Resources: High-quality labor integrates explicit skills and tacit knowledge (Tandon et al. 2023). It can apply uncoded tacit knowledge and experiential skills to the tourism development process, promoting organizational innovation, technological innovation, and tourism product innovation in the cultural and tourism sectors.

(5) Transportation Convenience: The transportation system facilitates the sharing of knowledge, technology, and talent across administrative regions and industries in the cultural and tourism sectors (Tang 2021). Its externalities may accelerate the diffusion of production factors and create conditions for interregional and interindustry innovation (Li and Liu 2022).

(6) Economic Development Level: The economic development level promotes effective resource allocation through price mechanisms and increases the aggregation of production factors such as technology and high-quality labor (Zhang 2023).

Table 3 Measurement Indicators for the Independent Variables.
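The keyword-frequency part of the government support measure can be illustrated with a very small counting function. This is only a sketch under stated assumptions: the function name and the sample sentence are hypothetical, matching is a plain case-insensitive substring count (so compound words containing the keyword are also counted), and for reports written in Chinese the keyword would be the corresponding Chinese term.

```python
def keyword_frequency(text, keyword="tourism"):
    """Count non-overlapping, case-insensitive occurrences of keyword in text."""
    return text.lower().count(keyword.lower())


# Hypothetical excerpt standing in for a government work report.
report = "Tourism revenue grew steadily; rural tourism and cultural services expanded."
count = keyword_frequency(report)
```

Counting by substring keeps the example short; a stricter variant could tokenize the text first so that only whole-word matches are counted.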
