Modeling and Analysis of Information Systems

Unified Classification Model for Geotagging Websites

https://doi.org/10.18255/1818-1015-2013-2-80-91

Abstract

The paper presents a novel approach to determining the regional scope of websites (geotagging). Traditional approaches generally train a separate classification model for each class (region); the proposed method instead trains a single model that is used for all regions of the same type (e.g., cities). This is made possible by "relative" features, which indicate how a selected region compares with the other regions for a given website. The classification system combines a variety of features of different kinds that have not previously been used together for machine-learning-based regional classification of websites. The evaluation demonstrates the advantage of the "one model per region type" method over the traditional "one model per region" approach. A separate experiment shows that the proposed classifier can successfully detect regions that were not present in the training set, which is impossible for traditional approaches.
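The idea of "relative" features can be illustrated with a minimal sketch. The abstract does not list the actual features used in the paper, so everything below is an assumption made for illustration: the function name `relative_features`, the choice of three features (share of total, ratio to the best region, normalized rank), and the toponym mention counts are all hypothetical.

```python
from typing import Dict, List


def relative_features(raw_scores: Dict[str, float]) -> Dict[str, List[float]]:
    """Turn per-region raw scores for ONE website into region-agnostic features.

    Hypothetical feature set (not taken from the paper):
    [share of total score, ratio to the best region, normalized rank].
    """
    total = sum(raw_scores.values())
    best = max(raw_scores.values(), default=0.0)
    # Rank regions by descending score; rank 0 is the strongest candidate.
    ordered = sorted(raw_scores, key=raw_scores.get, reverse=True)
    rank = {region: i for i, region in enumerate(ordered)}
    n = len(raw_scores)

    features: Dict[str, List[float]] = {}
    for region, score in raw_scores.items():
        share = score / total if total > 0 else 0.0
        ratio_to_best = score / best if best > 0 else 0.0
        rank_fraction = rank[region] / (n - 1) if n > 1 else 0.0
        features[region] = [share, ratio_to_best, rank_fraction]
    return features


# Hypothetical toponym mention counts for one website.
mentions = {"Moscow": 30.0, "Yaroslavl": 10.0, "Kazan": 0.0}
feats = relative_features(mentions)
```

Because each vector describes a region only in relation to the other candidates, rows for all regions of one type could in principle be pooled into a single training set, and a region absent from training would still receive a valid feature vector at prediction time.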

About the Author

A. N. Volkov
Yandex LLC
Russian Federation

Software developer

Leo Tolstoy St., 16, Moscow, 119021, Russia




For citations:


Volkov A.N. Unified Classification Model for Geotagging Websites. Modeling and Analysis of Information Systems. 2013;20(2):80-91. (In Russ.) https://doi.org/10.18255/1818-1015-2013-2-80-91


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-1015 (Print)
ISSN 2313-5417 (Online)