Users' Interaction Patterns and Query Reformulation Strategies in a Persian Search Engine

Article type: Research article

Authors

1 PhD student in Computer Engineering, Yazd University

2 Yazd University

Abstract

Searching the web has become interwoven with today's online life. Search engines receive users' queries and retrieve a limited number of relevant documents from among several billion web pages. By logging users' queries over the long term, search engines can therefore obtain a body of information about users' behavior patterns. These patterns can be used in processes such as query expansion, query suggestion, and spelling correction. This article examines users' interaction patterns and the reformulation of the queries they submit to the Persian search engine Parsijoo; specifically, it examines the lexical and temporal properties of users' queries as well as their reformulation patterns. The statistical distribution of query terms, the variation in query length at different times of day, and users' query reformulation behaviors are the most important statistical analyses in this article. Analyses of users' interaction patterns based on these results are then presented, and the results are compared with international studies. This study shows that the behavior of Persian-speaking users is consistent with that of search engine users worldwide.

Keywords


Article title [English]

Analysis of Interaction Patterns and Users' Query Reformulation Strategies in Persian Search Engine

Authors [English]

  • Fatemeh Kaveh-Yazdy 1
  • Ali-Mohammad Zareh-Bidoki 2
  • Mohammadreza Pajoohan 2

Abstract [English]

Web search is integral to modern online life, and this is partly reflected in users' queries. Search engines receive users' queries and retrieve a limited number of relevant documents from a pool of billions of web pages. By logging these queries over long periods, they can therefore uncover users' behavior patterns. These patterns can be used in query expansion, query suggestion, and spelling correction. In the present study, query logs of the Parsijoo Persian search engine were examined to reveal users' interaction patterns, specifically the lexical and temporal properties of the queries as well as reformulation patterns. Query term distribution, variation in query length across the hours of the day, and users' query reformulation patterns were analyzed. The results of these analyses were used to compare users' web search behavior with the behavior reported in studies based on search logs of international search engines. The results show that native Persian-speaking users exhibit behavior patterns similar to those reported in these studies.
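
The three analyses named in the abstract (query term distribution, query length by hour of day, and reformulation patterns) can be illustrated with a minimal sketch. It is not the authors' method: the log format, the sample records, the whitespace tokenization, and the reformulation labels (`add_terms`, `remove_terms`, `substitute_terms`, `new_query`, `repeat`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log records: (session_id, ISO timestamp, query text).
# The records are assumed to be sorted by time within each session.
LOG = [
    ("s1", "2016-01-01T09:15:00", "yazd weather"),
    ("s1", "2016-01-01T09:16:00", "yazd weather tomorrow"),
    ("s1", "2016-01-01T09:18:00", "weather"),
    ("s2", "2016-01-01T21:05:00", "persian search engine"),
    ("s2", "2016-01-01T21:06:00", "football results"),
]


def term_frequencies(log):
    """Count how often each whitespace-separated term occurs in the log."""
    counts = Counter()
    for _, _, query in log:
        counts.update(query.split())
    return counts


def mean_length_by_hour(log):
    """Return the mean number of terms per query for each hour of the day."""
    totals = defaultdict(lambda: [0, 0])  # hour -> [term total, query count]
    for _, timestamp, query in log:
        hour = datetime.fromisoformat(timestamp).hour
        totals[hour][0] += len(query.split())
        totals[hour][1] += 1
    return {hour: terms / n for hour, (terms, n) in totals.items()}


def reformulation_type(previous, current):
    """Label a query pair by comparing term sets (an assumed, simplified taxonomy)."""
    p, c = set(previous.split()), set(current.split())
    if p == c:
        return "repeat"
    if p < c:
        return "add_terms"
    if c < p:
        return "remove_terms"
    if p & c:
        return "substitute_terms"
    return "new_query"


def reformulations(log):
    """Label each pair of consecutive queries within the same session."""
    last_query = {}
    labels = []
    for session_id, _, query in log:
        if session_id in last_query:
            labels.append(reformulation_type(last_query[session_id], query))
        last_query[session_id] = query
    return labels
```

On a real log, the sorted values of `term_frequencies` would be the input to a power-law fit, and the session boundaries would need a more careful definition than a shared session identifier.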


 

Keywords [English]

  • Persian Search Engine
  • Query
  • Reformulation Patterns
  • Click
  • Power-law Distribution

