Artificial Intelligence and Machine Learning at the Intersection of Privacy and Archives

Publication date: 03.12.2024

Archeion, 2024, 125, pp. 55–78

https://doi.org/10.4467/26581264ARC.24.006.20201

Authors

Iori Khuhro
University of British Columbia, Canada
https://orcid.org/0009-0002-6403-4149

Erin Gilmore
San José State University, United States of America
https://orcid.org/0009-0008-0249-1954

Jim Suderman
InterPARES Trust AI Project, Canada

Darra L. Hofman
San José State University, United States of America
https://orcid.org/0000-0002-1772-6268

Abstract

As records are increasingly born digital – and thus, at least ostensibly, potentially much more accessible – archivists find themselves struggling to enable general access while providing appropriate privacy protections for the torrent of records being transferred to their care. In this article, the authors report the results of an integrative literature review study, examining the intersection of AI, archives, and privacy in terms of how archives are currently coping with these challenges and what role(s) AI might play in addressing privacy in archival records. The study revealed three major themes: 1) the challenges of – and possibilities beyond – defining “privacy” and “AI”; 2) the need for context-sensitive ways to manage privacy and access decisions; and 3) the lack of adequate “success measures” for ensuring the actual fitness for purpose of privacy AI solutions in the archival context.

Acknowledgements

The authors are grateful that this work was supported by International Research on Permanent Authentic Records in Electronic Systems (InterPARES) Trust AI, an international research partnership led by Drs. Luciana Duranti and Muhammad Abdul-Mageed, University of British Columbia. InterPARES Trust AI is supported in part by funding from the Social Sciences and Humanities Research Council of Canada (SSHRC). The authors would like to thank Kisun Kim (Okanagan College) and Carlos Quevedo, previous InterPARES Trust AI Graduate Research Assistants, for their contribution to this work.

Information

Article type: Original research article

Titles:

English: Artificial Intelligence and Machine Learning at the Intersection of Privacy and Archives
Polish: Sztuczna inteligencja i uczenie maszynowe na styku prywatności i archiwów

Article status: Open access

License: CC BY-NC-ND

Author contribution percentages:

Iori Khuhro (Author) - 25%
Erin Gilmore (Author) - 25%
Jim Suderman (Author) - 25%
Darra L. Hofman (Author) - 25%

Article corrections:

-

Publication languages:

English