<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title type="main" level="a">Challenges in archiving the personalized web</title>
        <author>
          <persName n="1" ref="https://orcid.org/0000-0001-8344-2135" type="ORCID">
            <forename>Erwan</forename>
            <surname>Le Merrer</surname>
            <placeName type="affiliation">CNRS, France</placeName>
          </persName>
          <persName n="2">
            <forename>Camilla</forename>
            <surname>Penzo</surname>
            <placeName type="affiliation">PEReN, France</placeName>
          </persName>
          <persName n="3" ref="https://orcid.org/0000-0003-4473-4332" type="ORCID">
            <forename>Gilles</forename>
            <surname>Tredan</surname>
            <placeName type="affiliation">CNRS, France</placeName>
          </persName>
          <persName n="4" ref="https://orcid.org/0000-0002-1361-1703" type="ORCID">
            <forename>Lucas</forename>
            <surname>Verney</surname>
            <placeName type="affiliation">PEReN, France</placeName>
          </persName>
        </author>
        <respStmt>
          <resp>This is a section of <title>Exploring the Archived Web during a Highly Transformative Age</title> (DOI: <idno type="DOI">10.36253/979-12-215-0413-2</idno>) by </resp>
          <name>Sophie Gebeil, Jean-Christophe Peyssard</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <publisher>Firenze University Press</publisher>
        <pubPlace>Florence</pubPlace>
        <date when="2024">2024</date>
        <idno type="DOI">https://doi.org/10.36253/979-12-215-0413-2.10</idno>
        <availability>
          <p>Available for academic research purposes</p>
          <p>Open Access</p>
          <p>Copyright Author(s)</p>
          <licence source="text" target="https://creativecommons.org/licenses/by/4.0/legalcode">
            <p>Content licence CC BY 4.0</p>
          </licence>
          <licence source="metadata" target="https://creativecommons.org/publicdomain/zero/1.0/legalcode">
            <p>Metadata licence CC0 1.0</p>
          </licence>
        </availability>
      </publicationStmt>
      <sourceDesc>
        <p>This is original content, published for academic research purposes</p>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <appInfo>
        <application version="2.2" ident="Booksflow">
          <desc>Digital edition XML powered by Booksflow</desc>
        </application>
      </appInfo>
    </encodingDesc>
    <profileDesc>
      <abstract xml:lang="en">
        <p>The decision-making algorithms embedded within online platforms are determining content shown to users. This personalization steers the dissemination of information, in contrast with the idea of a universal World Wide Web. Personalization thus generates a combinatorial explosion of different versions of the web, rendering each user’s experience distinct. This raises critical questions: what elements of a personalized web should be archived? How can the collected user journeys capture a representative picture of our times? Navigating personalization is essential to capture the contemporary web experience, yet it presents methodological and technical challenges. In this chapter, we identify key challenges in performing a representative sampling of personalization within online platforms.</p>
      </abstract>
      <textClass>
        <keywords>
          <list>
            <item>personalization</item>
            <item>archival</item>
            <item>YouTube</item>
            <item>2022 French presidential election</item>
          </list>
        </keywords>
      </textClass>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <p>It is available online at <ref target="https://doi.org/10.36253/979-12-215-0413-2.10">https://doi.org/10.36253/979-12-215-0413-2.10</ref></p>
      <div>
        <listBibl>
          <head>References</head>
          <bibl n="154717">
            <bibl>Azcoitia, Santiago Andrés, and Nikolaos Laoutaris. 2022. “A Survey of Data Marketplaces and Their Business Models.”</bibl>
            <idno type="DOI">10.48550/arXiv.2201.04561</idno>
          </bibl>
          <bibl n="154580">
            <bibl>Bandy, Jack, and Nicholas Diakopoulos. 2021. “Curating Quality? How Twitter’s Timeline Algorithm Treats Different Types of News.” Social Media + Society 7 (3).</bibl>
            <idno type="DOI">10.1177/2056305121104164</idno>
          </bibl>
          <bibl n="154521">Cloudflare. 2023. “What is rate limiting? | Rate limiting and bots.” &lt;https://web.archive.org/web/20240424000000*/https://www.cloudflare.com/learning/bots/what-is-rate-limiting/&gt;</bibl>
          <bibl n="154524">
            <bibl>Covington, Paul, Jay Adams, and Emre Sargin. 2016. “Deep Neural Networks for YouTube Recommendations.” In Proceedings of the 10th ACM Conference on Recommender Systems, 191–98.</bibl>
            <idno type="DOI">10.1145/2959100.2959190</idno>
          </bibl>
          <bibl n="154762">
            <bibl>Cresci, Stefano. 2020. “A Decade of Social Bot Detection.” Commun. ACM 63 (10): 72–83.</bibl>
            <idno type="DOI">10.1145/3409116</idno>
          </bibl>
          <bibl n="154358">
            <bibl>Eg, Ragnhild, Özlem Demirkol Tønnesen, and Merete Kolberg Tennfjord. 2023. “A Scoping Review of Personalized User Experiences on Social Media: The Interplay Between Algorithms and Human Factors.” Computers in Human Behavior Reports 9: 100253.</bibl>
            <idno type="DOI">10.1016/j.chbr.2022.100253</idno>
          </bibl>
          <bibl n="154238">
            <bibl>Eslami, Motahhare, Aimee Rickman, Kristen Vaccaro, Amirhossein Aleyasen, Andy Vuong, Karrie Karahalios, Kevin Hamilton, and Christian Sandvig. 2015. “‘I Always Assumed That I Wasn’t Really That Close to [Her]’: Reasoning About Invisible Algorithms in News Feeds.” In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 153–62. CHI ’15. New York, NY, USA: Association for Computing Machinery.</bibl>
            <idno type="DOI">10.1145/2702123.2702556</idno>
          </bibl>
          <bibl n="154416">NOYB European Center for Digital Rights. 2023. “How Mobile Apps Illegally Share Your Personal Data.” &lt;https://web.archive.org/web/20240424000000*/https://noyb.eu/en/how-mobile-apps-illegally-share-your-personal-data&gt;</bibl>
          <bibl n="154488">
            <bibl>Fang, Minghong, Neil Zhenqiang Gong, and Jia Liu. 2020. “Influence Function Based Data Poisoning Attacks to Top-N Recommender Systems.” In Proceedings of the Web Conference 2020, 3019–25.</bibl>
            <idno type="DOI">10.1145/3366423.3380072</idno>
          </bibl>
          <bibl n="154411">
            <bibl>Farseev, Aleksandr, Qi Yang, Andrey Filchenkov, Kirill Lepikhin, Yu-Yi Chu-Farseeva, and Daron-Benjamin Loo. 2020. “SoMin.ai: Personality-Driven Content Generation Platform.” arXiv E-Prints, November, arXiv:2011.14615.</bibl>
            <idno type="DOI">10.48550/arXiv.2011.14615</idno>
          </bibl>
          <bibl n="154276">
            <bibl>Gupta, Udit, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, et al. 2020. “The Architectural Implications of Facebook’s DNN-Based Personalized Recommendation.” In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), 488–501. IEEE.</bibl>
            <idno type="DOI">10.1109/HPCA47549.2020.00047</idno>
          </bibl>
          <bibl n="154393">
            <bibl>Gustarini, Mattia, Marcello Paolo Scipioni, Marios Fanourakis, and Katarzyna Wac. 2016. “Differences in Smartphone Usage: Validating, Evaluating, and Predicting Mobile User Intimacy.” Pervasive and Mobile Computing 33: 50–72.</bibl>
            <idno type="DOI">10.1016/j.pmcj.2016.06.003</idno>
          </bibl>
          <bibl n="154361">
            <bibl>Hosseinmardi, Homa, Amir Ghasemian, Aaron Clauset, Markus Mobius, David M Rothschild, and Duncan J Watts. 2021. “Examining the Consumption of Radical Content on YouTube.” Proceedings of the National Academy of Sciences 118 (32): e2101967118.</bibl>
            <idno type="DOI">10.1073/pnas.2101967118</idno>
          </bibl>
          <bibl n="154292">Business Insider. 2019. “The Cambridge Analytica Whistleblower Explains How the Firm Used Facebook Data to Sway Elections.” &lt;https://web.archive.org/web/20240424000000*/https://www.businessinsider.com/cambridge-analytica-whistleblower-christopher-wylie-facebook-data-2019-10?r=US&amp;IR=T&gt;</bibl>
          <bibl n="154525">
            <bibl>Kelly, Mat, Justin F Brunelle, Michele C Weigle, and Michael L Nelson. 2013. “A Method for Identifying Personalized Representations in Web Archives.” D-Lib Magazine 19 (11-12).</bibl>
            <idno type="DOI">10.1045/november2013-kelly</idno>
          </bibl>
          <bibl n="154514">Kiesel, Johannes, Arjen P de Vries, Matthias Hagen, Benno Stein, and Martin Potthast. 2018. “WASP: Web Archiving and Search Personalized.” &lt;https://ceur-ws.org/Vol-2167/paper6.pdf&gt;</bibl>
          <bibl n="154676">
            <bibl>Ledwich, Mark, and Anna Zaitsev. 2020. “Algorithmic Extremism: Examining YouTube’s Rabbit Hole of Radicalization.” First Monday.</bibl>
            <idno type="DOI">10.5210/fm.v25i3.10419</idno>
          </bibl>
          <bibl n="154691">
            <bibl>Le Merrer, Erwan, Ronan Pons, and Gilles Tredan. 2023. “Algorithmic Audits of Algorithms, and the Law.” AI and Ethics, 1–11.</bibl>
            <idno type="DOI">10.1007/s43681-023-00343-z</idno>
          </bibl>
          <bibl n="154319">Le Merrer, Erwan, and Gilles Tredan. 2018. “The Topological Face of Recommendation.” In Complex Networks &amp; Their Applications VI: Proceedings of Complex Networks 2017 (the Sixth International Conference on Complex Networks and Their Applications), 897–908. Springer.</bibl>
          <bibl n="154643">Le Merrer, Erwan, Gilles Tredan, and Ali Yesilkanat. 2023. “Modeling Rabbit-Holes on YouTube.” Social Network Analysis and Mining 13 (1): 100.</bibl>
          <bibl n="154408">
            <bibl>Milligan, Ian, Nick Ruest, and Jimmy Lin. 2016. “Content Selection and Curation for Web Archiving: The Gatekeepers vs. the Masses.” In Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 107–10.</bibl>
            <idno type="DOI">10.1145/2910896.2910913</idno>
          </bibl>
          <bibl n="154505">Mozilla. 2020. “Political Advertisements from Facebook.” &lt;https://web.archive.org/web/20240424000000*/https://foundation.mozilla.org/en/blog/step-inside-someone-elses-youtube-bubble&gt;</bibl>
          <bibl n="154732">
            <bibl>Ohme, Jakob, and Theo Araujo. 2022. “Digital Data Donations: A Quest for Best Practices.” Patterns 3 (4).</bibl>
            <idno type="DOI">10.1016/j.patter.2022.100467</idno>
          </bibl>
          <bibl n="154688">Pariser, Eli. 2012. The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think. Penguin Books.</bibl>
          <bibl n="154765">
            <bibl>Powers, Elia. 2017. “My News Feed Is Filtered?” Digital Journalism 5 (10): 1315–35.</bibl>
            <idno type="DOI">10.1080/21670811.2017.1286943</idno>
          </bibl>
          <bibl n="154558">Exodus Privacy. “Exodus Privacy Analyzes Privacy Concerns in Android Applications.” &lt;https://web.archive.org/web/20240424000000*/https://exodus-privacy.eu.org/&gt;</bibl>
          <bibl n="154522">ProPublica. 2017. “Political Advertisements from Facebook.” &lt;https://web.archive.org/web/20240424000000*/https://www.propublica.org/article/help-us-monitor-political-ads-online&gt;</bibl>
          <bibl n="154277">Rastegarpanah, Bashir, Krishna Gummadi, and Mark Crovella. 2021. “Auditing Black-Box Prediction Models for Data Minimization Compliance.” Advances in Neural Information Processing Systems 34: 20621–32. &lt;https://proceedings.neurips.cc/paper_files/paper/2021/file/ac6b3cce8c74b2e23688c3e45532e2a7-Paper.pdf&gt;</bibl>
          <bibl n="154267">Digital Services Act. 2022. Regulation (EU) 2022/2065 of the European Parliament and of the Council of 19 October 2022 on a Single Market for Digital Services and Amending Directive 2000/31/EC (Text with EEA Relevance). OJ L. &lt;https://web.archive.org/web/20240424000000*/http://data.europa.eu/eli/reg/2022/2065/oj/eng&gt;</bibl>
          <bibl n="154439">
            <bibl>Salganik, Matthew J., and Duncan J. Watts. 2008. “Leading the Herd Astray: An Experimental Study of Self-Fulfilling Prophecies in an Artificial Cultural Market.” Social Psychology Quarterly 71 (4): 338–55.</bibl>
            <idno type="DOI">10.1177/0190272508071004</idno>
          </bibl>
          <bibl n="154452">
            <bibl>Schafer, Valérie, Gérôme Truc, Romain Badouard, Lucien Castex, and Francesca Musiani. 2019. “Paris and Nice Terrorist Attacks: Exploring Twitter and Web Archives.” Media, War &amp; Conflict 12 (2): 153–70.</bibl>
            <idno type="DOI">10.1177/1750635219839382</idno>
          </bibl>
          <bibl n="154253">Schmidt, Jan-Hinrik, Lisa Merten, Uwe Hasebrink, Isabelle Petrich, and Amelie Rolfs. 2019. “How Do Intermediaries Shape News-Related Media Repertoires and Practices? Findings from a Qualitative Study.” International Journal of Communication 13 (0). &lt;https://web.archive.org/web/20240424000000*/https://ijoc.org/index.php/ijoc/article/view/9080&gt;</bibl>
          <bibl n="154442">
            <bibl>Siano, Alfonso, Agostino Vollero, Francesca Conte, and Sara Amabile. 2017. “‘More Than Words’: Expanding the Taxonomy of Greenwashing After the Volkswagen Scandal.” Journal of Business Research 71: 27–37.</bibl>
            <idno type="DOI">10.1016/j.jbusres.2016.11.002</idno>
          </bibl>
          <bibl n="154460">“Teens, Social Media and Technology.” 2023. Pew Research Center. &lt;https://web.archive.org/web/20240424000000*/https://www.pewresearch.org/internet/2023/12/11/teens-social-media-and-technology-2023/&gt;</bibl>
          <bibl n="154367">“The Christchurch Call to Action to Eliminate Terrorist and Violent Extremist Content Online.” n.d. &lt;https://web.archive.org/web/20240424000000*/https://www.christchurchcall.com/assets/Documents/Christchurch-Call-full-text-English.pdf&gt;</bibl>
          <bibl n="154394">
            <bibl>Xu, Runhua, Remo Manuel Frey, Elgar Fleisch, and Alexander Ilic. 2016. “Understanding the Impact of Personality Traits on Mobile App Adoption – Insights from a Large-Scale Field Study.” Computers in Human Behavior 62: 244–56.</bibl>
            <idno type="DOI">10.1016/j.chb.2016.04.011</idno>
          </bibl>
          <bibl n="154434">
            <bibl>Zhao, Sha, Shijian Li, Julian Ramos, Zhiling Luo, Ziwen Jiang, Anind K. Dey, and Gang Pan. 2019. “User Profiling from Their Use of Smartphone Applications: A Survey.” Pervasive and Mobile Computing 59: 101052.</bibl>
            <idno type="DOI">10.1016/j.pmcj.2019.101052</idno>
          </bibl>
        </listBibl>
      </div>
    </body>
  </text>
</TEI>