Analysis of Interaction Patterns and Users' Query Reformulation Strategies in Persian Search Engine

Abstract

Web search is integrated into modern online life, which is partly reflected in users’ queries. Search engines receive users’ queries and retrieve a limited number of relevant documents from a pool of billions of web pages. Their search logs, collected over long periods of time, can therefore reveal users’ behavior patterns. Such patterns can be identified in query expansion, query suggestion, and spelling correction. In the present study, query logs of the Parsijoo Persian search engine were examined to reveal users’ interaction patterns, specifically the lexical and temporal properties of the query logs as well as reformulation patterns. Query term distribution, variation in query length across the hours of the day, and patterns of users’ query reformulation were analyzed. The results of these analyses were used to compare users’ web search behaviors with the behaviors reported in studies that used the search logs of international search engines. The results of the present study demonstrated that the Persian native users show similar behavior patterns
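As a rough illustration of the three analyses named in the abstract (query term distribution, query length by hour of day, and query reformulation patterns), the following minimal sketch runs over a tiny in-memory log. It is not the method used in the study: the record layout, the sample queries, the whitespace tokenization, and the five reformulation labels are all assumptions made for illustration. Real Persian query logs would also need language-specific normalization that is omitted here.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (user_id, ISO timestamp, query text).
LOG = [
    ("u1", "2016-01-01T09:00:00", "tehran weather"),
    ("u1", "2016-01-01T09:01:00", "tehran weather tomorrow"),
    ("u1", "2016-01-01T09:02:30", "weather tomorrow"),
    ("u2", "2016-01-01T21:15:00", "persian poetry"),
    ("u2", "2016-01-01T21:16:00", "hafez poems"),
]


def term_distribution(log):
    """Count how often each whitespace-delimited term occurs across all queries."""
    counts = Counter()
    for _, _, query in log:
        counts.update(query.split())
    return counts


def mean_query_length_by_hour(log):
    """Average number of terms per query, grouped by hour of day."""
    lengths = defaultdict(list)
    for _, ts, query in log:
        lengths[datetime.fromisoformat(ts).hour].append(len(query.split()))
    return {hour: sum(v) / len(v) for hour, v in lengths.items()}


def classify_reformulation(prev, curr):
    """Label a consecutive query pair by comparing term sets (assumed taxonomy)."""
    a, b = set(prev.split()), set(curr.split())
    if a == b:
        return "repeat"
    if a < b:  # every previous term kept, at least one added
        return "add_terms"
    if b < a:  # at least one term dropped, none added
        return "remove_terms"
    if a & b:  # some terms shared, some changed
        return "substitute_terms"
    return "new_query"


def reformulation_counts(log):
    """Tally reformulation labels over consecutive queries of the same user."""
    by_user = defaultdict(list)
    # ISO timestamps sort chronologically as plain strings.
    for user, _, query in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append(query)
    counts = Counter()
    for queries in by_user.values():
        for prev, curr in zip(queries, queries[1:]):
            counts[classify_reformulation(prev, curr)] += 1
    return counts
```

A production analysis would additionally split each user's queries into sessions (for example, by an inactivity timeout) before pairing consecutive queries; the sketch treats each user's whole history as one session to stay short.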


 


