Vol. 6, Iss. 2, 2020
Generating reliable tourist accommodation statistics: Bootstrapping regression model for overdispersed long-tailed data
Nguyen Van Truong (ORCiD), University of Transport and Communication, Vietnam & Japan Transport and Tourism Research Institute, Japan; Tetsuo Shimizu (ORCiD), Tokyo Metropolitan University, Japan; Takeshi Kurihara (ORCiD), Toyo University, Japan; Sunkyung Choi (ORCiD), Tokyo Institute of Technology & Japan Transport and Tourism Research Institute, Japan
Published online: 30 May 2020, JTHSM, 6(2), pp.30-37.
URN: urn:nbn:de:0168-ssoar-66291-9, DOI: 10.5281/zenodo.3837608
DataCite XML Export
<?xml version='1.0' encoding='utf-8'?> <resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd"> <identifier identifierType="DOI">10.5281/zenodo.3837608</identifier> <creators> <creator> <creatorName>Van Truong, Nguyen</creatorName> <givenName>Nguyen</givenName> <familyName>Van Truong</familyName> <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-2095-7462</nameIdentifier> <affiliation>University of Transport and Communication, Vietnam &amp; Japan Transport and Tourism Research Institute</affiliation> </creator> <creator> <creatorName>Shimizu, Tetsuo</creatorName> <givenName>Tetsuo</givenName> <familyName>Shimizu</familyName> <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-3592-1139</nameIdentifier> <affiliation>Tokyo Metropolitan University</affiliation> </creator> <creator> <creatorName>Kurihara, Takeshi</creatorName> <givenName>Takeshi</givenName> <familyName>Kurihara</familyName> <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-2513-9338</nameIdentifier> <affiliation>Toyo University</affiliation> </creator> <creator> <creatorName>Choi, Sunkyung</creatorName> <givenName>Sunkyung</givenName> <familyName>Choi</familyName> <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-6727-0668</nameIdentifier> <affiliation>Tokyo Institute of Technology &amp; Japan Transport and Tourism Research Institute</affiliation> </creator> </creators> <titles> <title>Generating reliable tourist accommodation statistics: Bootstrapping regression model for overdispersed long-tailed data</title> </titles> <publisher>Zenodo</publisher> <publicationYear>2020</publicationYear> <subjects> <subject>tourism statistics</subject> <subject>bootstrap</subject> <subject>count regression</subject>
<subject>heterogeneity</subject> <subject>over dispersed data</subject> <subject>zero-inflated data</subject> </subjects> <dates> <date dateType="Issued">2020-05-30</date> </dates> <language>en</language> <resourceType resourceTypeGeneral="Text">Journal article</resourceType> <alternateIdentifiers> <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3837608</alternateIdentifier> </alternateIdentifiers> <relatedIdentifiers> <relatedIdentifier relatedIdentifierType="ISSN" relationType="IsPartOf" resourceTypeGeneral="Text">2529-1947</relatedIdentifier> <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3835846</relatedIdentifier> </relatedIdentifiers> <rightsList> <rights rightsURI="http://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights> <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights> </rightsList> <descriptions> <description descriptionType="Abstract"><p><strong><em>Purpose</em></strong><em>: Few studies have applied count data analysis to tourist accommodation data. This study was undertaken to investigate the characteristics of tourist accommodation data and to identify the best-fitting models for population total estimation.</em></p> <p><strong><em>Methods</em></strong><em>: Using data on 10,503 hotels obtained from a nationwide Japanese survey, the bootstrap resampling method was applied to re-randomise the data. Training and test sets were derived by randomly splitting each of the bootstrap samples. Six count models were fitted to the training set and validated with the test set. Bootstrap distributions for parameters of significance were used for model evaluation.</em></p> <p><strong><em>Results</em></strong><em>: The outcome variable (number of guests) was found to be heterogeneous, overdispersed and long-tailed, with excess zero counts.
The hurdle negative binomial and zero-inflated negative binomial models outperformed the other models. The accuracy (se) of the estimated total number of guests ranged from 3.7 to 0.4 for training sets of 5% to 85%, respectively. The results appear slightly overestimated.</em></p> <p><strong><em>Implications</em></strong><em>: The findings indicate that integrating the bootstrap resampling method with count regression provides a statistical tool for generating reliable tourist accommodation statistics. The use of the bootstrap would help to detect and correct the bias of the estimation.</em></p></description> <description descriptionType="Other">SUBMITTED: MAR. 2019, REVISION SUBMITTED: OCT. 2019, 2nd REVISION SUBMITTED: JAN. 2020, ACCEPTED: MAR. 2020, REFEREED ANONYMOUSLY, PUBLISHED ONLINE: 30 MAY 2020</description> </descriptions> </resource>