By Mark W. Datysgeld, Director at Governance Primer, UASG Member
In partnership with the Governance Primer and the Brazilian Association of Software Companies (ABES), the Universal Acceptance Steering Group (UASG) completed a pilot evaluation (UASG033) to investigate the usage of libraries and frameworks by applications in GitHub, the largest repository of open-source code, with the objective of evaluating strategies to maximize Universal Acceptance (UA)-readiness in software. GitHub hosts a great number of libraries of code that developers can use to add features to their own projects. If such code does not support UA, it can potentially cause the entire application using the code to fail to handle UA properly.
This pilot focused on evaluating libraries in GitHub written in the top coding languages used by the open-source community – Java and Python – to study UA-relevant actions, such as the validation of email addresses and domain names. The UASG has plans to promote the use of the UA-ready libraries and reach out to the ones that are not yet ready to inform them of the potential UA issues. Additionally, this pilot’s full findings and datasets have been made publicly available, so any interested group may advance the pilot initiative and subsequent testing.
Universal Acceptance is a competitive differentiator every software developer should have in their skill set. It ensures online systems accept all valid domain names and email addresses equally, regardless of language, script, or character length. Not only is UA the cornerstone of a more inclusive and multilingual Internet, but it also offers a $9.8+ billion business opportunity for developers who want to be at the forefront of their industry and keep pace with the global Internet. To develop UA-compliant systems using GitHub, developers must choose libraries that properly process domain names and email addresses. As the UASG continues to evaluate the best path forward to work with GitHub’s most relevant libraries, developers are welcome to evaluate and report UA-readiness in their own coding environments, as this effort is meant to start conversations and increase awareness of UA.
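To illustrate what "regardless of language, script, or character length" means in practice, here is a minimal sketch of how an ASCII-only check can fall short. The pattern `LEGACY_EMAIL_RE`, the helper `legacy_accepts`, and the sample addresses are hypothetical and chosen only for illustration; they are not taken from any library evaluated in the pilot.

```python
import re

# Hypothetical legacy pattern, for illustration only: it allows ASCII
# characters exclusively and caps the top-level domain at four letters.
LEGACY_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")


def legacy_accepts(address: str) -> bool:
    """Return True if the legacy pattern accepts the whole address."""
    return LEGACY_EMAIL_RE.fullmatch(address) is not None


SAMPLES = [
    "user@example.com",          # short ASCII top-level domain
    "user@example.photography",  # long ASCII top-level domain
    "用户@例子.世界",              # non-ASCII mailbox name and domain
]

for address in SAMPLES:
    print(address, legacy_accepts(address))
```

Under these assumptions, only the first address passes. The other two are rejected because of TLD length and script rather than any real defect, which is the kind of behavior UA-readiness is meant to eliminate.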
Pilot evaluation overview:
A custom crawler application was used to identify the relevant projects in GitHub. Then, an evaluation of usage was performed by means of metadata analysis, a feature not offered by GitHub. This generated a baseline of the most in-demand components for future remediation and engagement work. The complete dataset containing the output of the Java and Python crawl procedures is available to the public, with more details in the UASG033 report.
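The pilot's actual crawler is documented in the UASG033 report and is not reproduced here. As a loose illustration of the general idea only, the sketch below assumes that usage can be inferred from dependency declarations such as the text of Python `requirements.txt` files. The function names and sample projects are hypothetical; the library names come from the Python table in this post.

```python
import re
from collections import Counter

# Library names taken from the Python table in this post.
UA_LIBRARIES = {"idna", "pyicu", "idna_ssl", "email_validator", "validators"}


def normalize(name: str) -> str:
    """Lowercase the name and fold runs of '-', '_' and '.' into '_'."""
    return re.sub(r"[-_.]+", "_", name.strip().lower())


def declared_packages(requirements_text: str) -> set:
    """Extract normalized package names from requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


def count_usage(projects: dict) -> Counter:
    """Count how many projects declare each UA-associated library."""
    usage = Counter()
    for requirements_text in projects.values():
        usage.update(declared_packages(requirements_text) & UA_LIBRARIES)
    return usage


# Hypothetical sample data standing in for crawled project metadata.
sample_projects = {
    "project-a": "idna==3.4\nrequests>=2.0\n",
    "project-b": "email-validator  # address checks\nIDNA\n",
    "project-c": "flask\n",
}
print(count_usage(sample_projects))
```

Ranking libraries by such counts is one plausible way to arrive at a baseline of in-demand components, though the report should be consulted for the method actually used.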
GitHub libraries relevant to UA:
Overall, the evaluation identified the following as the most common Java and Python libraries that facilitate UA-associated actions.
Java UA-associated libraries and their status

| Short name | Long name | Status (Source) |
|---|---|---|
| icu4j | International Components for Unicode | IDNA2008 (UASG018A) |
| libidn | GNU IDN | IDNA2003, deprecated and ported to the Java language as “java.net.IDN”. (Documentation) |
| commons-validator | Apache Commons Validator | Relies on a static list of TLDs from 2017. (UASG018A) |
| validation-api | Jakarta Bean Validation | IDNA2003 implied, RegEx via annotations. (Documentation) |
| springfox-bean-validators | SpringFox Bean Validators for Swagger | IDNA2003 implied, RegEx via annotations; SpringFox implementation of validation-api. (Documentation) |
| hibernate-validator | Hibernate Validator | IDNA2003 implied, RegEx via annotations; Hibernate implementation of validation-api. (Documentation) |
Python UA-associated libraries and their status

| Short name | Long name | Status (Source) |
|---|---|---|
| idna | Internationalized Domain Names in Applications (IDNA) | IDNA2008 (UASG018A) |
| pyicu | International Components for Unicode | IDNA2008 (Documentation) |
| idna_ssl | IDNA SSL | IDNA2008 (Documentation) |
| email_validator | Email Validator | IDNA2008 (UASG018A) |
| validators | Python Data Validation for Humans | Email validation based on the Django validator, not compliant; URL validation based on regex-weburl.js, which is a RegEx. |
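To make the IDNA2003 versus IDNA2008 distinction in the tables above more concrete, the following minimal sketch uses only Python's standard library, whose built-in `idna` codec implements the older IDNA2003 rules. The helper name `to_ascii_idna2003` and the sample domains are illustrative assumptions, not part of the pilot.

```python
def to_ascii_idna2003(domain: str) -> str:
    """Convert a domain name to ASCII using the stdlib IDNA2003 codec."""
    return domain.encode("idna").decode("ascii")


# A label with "ü" is converted to its Punycode ("xn--") form.
print(to_ascii_idna2003("bücher.de"))  # xn--bcher-kva.de

# IDNA2003 maps "ß" to "ss", so the name silently becomes a different domain.
print(to_ascii_idna2003("faß.de"))     # fass.de
```

Under IDNA2008, "ß" is kept as a distinct character, so "faß.de" would instead convert to "xn--fa-hia.de". A library still on IDNA2003 can therefore resolve or validate a different name than the one the user typed, which is one reason such libraries are candidates for remediation.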
Now that the pilot evaluation has been completed, the following next steps are recommended:
- The IDNA2003 UA-associated libraries identified should be prioritized for remediation. Once adaptations are made, developers should incorporate these libraries into their projects as a step towards UA-readiness.
- Individual GitHub projects that are built using the identified libraries can also be considered a priority for evaluation to ensure proper implementation.
- Finally, developers whose GitHub projects are found to use libraries that are not UA-compliant will be encouraged to update their projects with a more suitable option.
Complete and detailed testing information for this pilot evaluation, and recommendations from the Governance Primer and the Brazilian Association of Software Companies, can be found in the full UASG033 report.