Pangeanic develops Multimodal and Multilingual Anonymization
In today’s society, insights drawn from the personal information of a platform’s customers or users are key to guiding the decision-making of companies, governments, utilities, and legal and financial institutions. These institutions often have a wealth of valuable information at their fingertips. This data can be used to make better business decisions and to understand how specific customer segments behave or how customers use a particular platform, service or utility.
Data can be structured, such as customer databases of text or numeric fields. It can also be unstructured, such as letters to and from users, CVs, medical records, simple text, images, speech or video.
This valuable data is coming under increasing protection, as shown by the adoption of privacy regulations around the world: GDPR in Europe, APPI in Japan, LGPD in Brazil, and CCPA in California. In each case, the goal is to ensure consumers’, users’ and citizens’ right to privacy.
To monetize and benefit from the large pools of data at their disposal while complying with these privacy regulations, organizations are turning to anonymization software. This software identifies and masks personal data, such as names, addresses and bank details. Still images, such as photos, and moving images can also be anonymized, for example by masking vehicle license plates and faces that appear in videos. Some applications also require data masking in speech. When data is anonymized, the identifiers that can link it to data subjects are removed or coded, which greatly reduces the risk of personal information leaks. Because the structure of the original data is maintained, data analytics remain possible after anonymization.
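As a rough illustration of what "removed or coded" can mean for text, the minimal Python sketch below replaces detected identifiers with coded placeholders while leaving the rest of the sentence intact. It is not Pangeanic's Masker and does not reflect how that product works: the `PATTERNS` table, the `mask` function and the placeholder format are all hypothetical, and the two regular expressions stand in for the trained language models a real anonymization system would be expected to use.

```python
import re

# Hypothetical, illustrative patterns only. Real anonymization software
# would typically rely on trained entity-recognition models, not regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){3,7}\b"),
}


def mask(text):
    """Replace each detected identifier with a coded placeholder."""
    codes = {}     # (label, original value) -> placeholder
    counters = {}  # label -> number of distinct values seen so far

    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            key = (label, match.group(0))
            if key not in codes:
                counters[label] = counters.get(label, 0) + 1
                codes[key] = f"[{label}_{counters[label]}]"
            # The same original value always maps to the same code.
            return codes[key]

        text = pattern.sub(replace, text)
    return text


sample = ("Contact ana@example.com or bob@example.org; "
          "ana@example.com paid from ES91 2100 0418 4502 0005 1332.")
print(mask(sample))
# Contact [EMAIL_1] or [EMAIL_2]; [EMAIL_1] paid from [IBAN_1].
```

Because a repeated value always receives the same code, the masked text keeps the structure of the original: an analyst can still see that the first and third addresses belong to the same person without learning who that person is.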
Pangeanic has built its Masker software with these privacy needs in mind. Drawing on decades of translation experience, Pangeanic now offers Multilingual Anonymization. In addition to Spanish and English, Pangeanic has developed specific models for anonymization of Japanese and Brazilian Portuguese. Anonymization in German, French, Russian, Polish, Lithuanian, Italian and Catalan is also possible.
Pangeanic has also developed multimodal capabilities for anonymizing images, such as barcodes, QR codes, signatures and company logos, contained in .pdf or .docx documents.
In addition to Cloud services, organizations with strict privacy policies can benefit from On-Premises capabilities, so that data for anonymization never leaves the client’s environment.
Another key feature is the integration with Pangeanic’s PECAT tool, which uses AI and human expertise to refine and rate the quality of results from current models, and feeds those results back to continuously improve the quality of the output.
With these new developments, Pangeanic is well-positioned to offer the right solutions to the privacy requirements of organizations in 2023.