Match organization names to ROR IDs
If you have organization names or full affiliation strings stored in your system as text, there are several approaches to matching those text strings to ROR IDs.
The approaches given below are for cases when you have a list, a spreadsheet, or a database with organization information as text strings, often with "extra" text such as a department or address as well as the organization's name, but no corresponding organization IDs.
Examples of organization names and affiliation information as text strings and the corresponding ROR ID:
- University of Haifa > https://ror.org/02f009v59
- University of ManchesterGreater Manchester Mental Health NHS Foundation TrustNational Institute for Health Research (NIHR) Greater Manchester Patient Safety Translational Research Centre > https://ror.org/05sb89p83
- Atmospheric Sciences and Global Change Division, Pacific Northwest National Laboratory, Richland, WA, USA > https://ror.org/05h992307
Mapping names and IDs to ROR
If you have organization names along with another global organization ID (Crossref Funder ID, GRID, ISNI, or Wikidata), you may find quicker, more accurate results by mapping those other IDs to ROR. See Map other organization ID types to ROR.
If you need to match ROR IDs to user input in a web application, see Create ROR-powered typeaheads in forms.
Match organization names to ROR IDs using the ROR API
Query vs affiliation parameter
The ROR API offers 2 ways to search ROR records, which work slightly differently and return different results:
- Affiliation parameter (/organizations?affiliation=): Searches the name, aliases, and labels fields using different search algorithms and returns the best match(es) based on matching score. Includes a matching score and a true/false indicator of whether the score is high enough to be considered a reliable match. Produces about 85% correct matches in our tests; individual implementations may see better or worse results.
- Query parameter (/organizations?query=): Searches the name, aliases, labels, acronyms, and external_ids fields in ROR records and returns all matching records. Does NOT include a matching score or a true/false indicator of whether the score is high enough to be considered a reliable match.
For cases where you have relatively unique organization names or full affiliation strings, use the affiliation parameter approach. It works for both English and non-English name variations. Examples:
- Incorporated Research Institutions for Seismology (IRIS)
- Universitätsbibliothek der Ludwig-Maximilians-Universität München
- Department of Civil and Industrial Engineering, University of Pisa, Largo Lucio Lazzarino 2, Pisa 56126, Italy
For cases where you have very common organization names, use the query parameter approach to look for keywords from the organization's name or the exact name of the organization surrounded by double quotation marks, and consider using filters for organization type and country. Examples:
- Ministry of Health
- National Research Council
- York University
Retrieving active and inactive organizations
By default, both the affiliation parameter and the query parameter return only records with an active status (status: "active"). Consider whether you also want to retrieve records with an inactive status; inactive records generally represent organizations that no longer operate. See API filtering for details.
Be aware too that an inactive organization may be succeeded by a new organization under a different name with a different ROR ID. If you do retrieve inactive organizations, check the relationships field of an inactive record to see if it has a Successor organization.
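As a minimal sketch of the successor check described above, the function below scans a record's relationships field for entries whose type is Successor. It assumes the record layout shown in the sample responses in this guide (a list of objects with id, label, and type keys); the function name and the example record are hypothetical and only for illustration.

```python
def successor_ids(record):
    """Return the ROR IDs listed as successors of the given record."""
    # Assumption: "relationships" is a list of objects with "id", "label",
    # and "type" keys, as in the sample responses in this guide.
    return [
        rel["id"]
        for rel in record.get("relationships", [])
        if rel.get("type", "").lower() == "successor"
    ]


# Hypothetical inactive record for illustration only (not real ROR data).
example_record = {
    "id": "https://ror.org/example-old",
    "status": "inactive",
    "relationships": [
        {"id": "https://ror.org/example-new", "label": "New Org", "type": "Successor"},
        {"id": "https://ror.org/example-parent", "label": "Parent Org", "type": "Parent"},
    ],
}

print(successor_ids(example_record))
```

An empty list means the record names no successor, so the inactive ID has no obvious replacement to map to.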
Affiliation parameter approach
Searches the name, aliases, and labels fields using different search algorithms and returns the best match(es) based on matching score, using the format https://api.ror.org/organizations?affiliation=<URL-encoded string>. For the full list of fields returned, see Fields and sub-fields.
curl 'https://api.ror.org/v1/organizations?affiliation=university+of+wisconsin+madison' | json_pp
If there's a high-confidence match, it will appear as the first result, with "chosen": true. Other results and their matching scores will also be returned, but if you're automating this process using a script, we recommend using results with "chosen": true rather than relying on the score. Results are paginated: see API paging for details.
{
"items" : [
{
"chosen" : true,
"matching_type" : "EXACT",
"organization" : {
"acronyms" : [
"UW"
],
"addresses" : [
{
"city" : "Madison",
"country_geonames_id" : null,
"geonames_city" : {
"city" : "Madison",
"geonames_admin1" : {
"ascii_name" : null,
"code" : "US.WI",
"id" : null,
"name" : "Wisconsin"
},
"geonames_admin2" : {
"ascii_name" : null,
"code" : null,
"id" : null,
"name" : null
},
"id" : 5261457,
"license" : {
"attribution" : "Data from geonames.org under a CC-BY 3.0 license",
"license" : "http://creativecommons.org/licenses/by/3.0/"
},
"nuts_level1" : {
"code" : null,
"name" : null
},
"nuts_level2" : {
"code" : null,
"name" : null
},
"nuts_level3" : {
"code" : null,
"name" : null
}
},
"lat" : 43.07305,
"line" : null,
"lng" : -89.40123,
"postcode" : null,
"primary" : false,
"state" : null,
"state_code" : null
}
],
"aliases" : [
"UW–Madison"
],
"country" : {
"country_code" : "US",
"country_name" : "United States"
},
"email_address" : null,
"established" : 1848,
"external_ids" : {
"FundRef" : {
"all" : [
"100007015",
"100008959",
"100005996",
"100007870",
"100008301",
"100008028",
"100008237",
"100008161",
"100010495",
"100009627",
"100010284",
"100005911",
"100007925",
"100005902",
"100012787"
],
"preferred" : "100007015"
},
"GRID" : {
"all" : "grid.14003.36",
"preferred" : "grid.14003.36"
},
"ISNI" : {
"all" : [
"0000 0001 2167 3675"
],
"preferred" : null
},
"Wikidata" : {
"all" : [
"Q838330",
"Q33122195",
"Q7662222"
],
"preferred" : "Q838330"
}
},
"id" : "https://ror.org/01y2jtd41",
"ip_addresses" : [],
"labels" : [
{
"iso639" : "es",
"label" : "Universidad de Wisconsin-Madison"
},
{
"iso639" : "fr",
"label" : "Université du Wisconsin à Madison"
}
],
"links" : [
"http://www.wisc.edu/"
],
"name" : "University of Wisconsin–Madison",
"relationships" : [
{
"id" : "https://ror.org/05cb4rb43",
"label" : "Morgridge Institute for Research",
"type" : "Child"
},
{
"id" : "https://ror.org/04gq8q482",
"label" : "North Temperate Lakes Long Term Ecological Research",
"type" : "Child"
},
{
"id" : "https://ror.org/03ydkyb10",
"label" : "University of Wisconsin System",
"type" : "Parent"
},
{
"id" : "https://ror.org/03b8vas82",
"label" : "National Atmospheric Deposition Program",
"type" : "Related"
},
{
"id" : "https://ror.org/04r3s7465",
"label" : "McMurdo Dry Valleys Long Term Ecological Research",
"type" : "Related"
},
{
"id" : "https://ror.org/02kj3rm24",
"label" : "Wisconsin Geological and Natural History Survey",
"type" : "Child"
}
],
"status" : "active",
"types" : [
"Education",
"Funder"
],
"wikipedia_url" : "http://en.wikipedia.org/wiki/University_of_Wisconsin%E2%80%93Madison"
},
"score" : 1,
"substring" : "university of wisconsin madison"
},
{
"chosen" : false,
"matching_type" : "EXACT",
"organization" : {
"acronyms" : [
"UWHC"
],
"addresses" : [
{
"city" : "Madison",
"country_geonames_id" : null,
"geonames_city" : {
"city" : "Madison",
"geonames_admin1" : {
"ascii_name" : null,
"code" : "US.WI",
"id" : null,
"name" : "Wisconsin"
},
"geonames_admin2" : {
"ascii_name" : null,
"code" : null,
"id" : null,
"name" : null
},
"id" : 5261457,
"license" : {
"attribution" : "Data from geonames.org under a CC-BY 3.0 license",
"license" : "http://creativecommons.org/licenses/by/3.0/"
},
"nuts_level1" : {
"code" : null,
"name" : null
},
"nuts_level2" : {
"code" : null,
"name" : null
},
"nuts_level3" : {
"code" : null,
"name" : null
}
},
"lat" : 43.07305,
"line" : null,
"lng" : -89.40123,
"postcode" : null,
"primary" : false,
"state" : null,
"state_code" : null
}
],
"aliases" : [
"UW Hospital and Clinics",
"University of Wisconsin Health University Hospital",
"University of Wisconsin Hospital and Clinics",
"University of Wisconsin-Madison Hospital",
"Wisconsin General Hospital"
],
"country" : {
"country_code" : "US",
"country_name" : "United States"
},
"email_address" : null,
"established" : 1924,
"external_ids" : {
"GRID" : {
"all" : "grid.412647.2",
"preferred" : "grid.412647.2"
},
"ISNI" : {
"all" : [
"0000 0000 9209 0955"
],
"preferred" : null
},
"Wikidata" : {
"all" : [
"Q7896631"
],
"preferred" : null
}
},
"id" : "https://ror.org/02mqqhj42",
"ip_addresses" : [],
"labels" : [],
"links" : [
"https://www.uwhealth.org/locations/university-hospital-170"
],
"name" : "UW Health University Hospital",
"relationships" : [
{
"id" : "https://ror.org/01e4byj08",
"label" : "University of Wisconsin Carbone Cancer Center",
"type" : "Child"
},
{
"id" : "https://ror.org/03e3qgk42",
"label" : "University of Wisconsin Health",
"type" : "Parent"
}
],
"status" : "active",
"types" : [
"Healthcare"
],
"wikipedia_url" : "https://en.wikipedia.org/wiki/University_of_Wisconsin_Hospital_and_Clinics"
},
"score" : 0.87,
"substring" : "university of wisconsin madison"
},
{
"chosen" : false,
"matching_type" : "EXACT",
"organization" : {
"acronyms" : [
"UWCCC"
],
"addresses" : [
{
"city" : "Madison",
"country_geonames_id" : null,
"geonames_city" : {
"city" : "Madison",
"geonames_admin1" : {
"ascii_name" : null,
"code" : "US.WI",
"id" : null,
"name" : "Wisconsin"
},
"geonames_admin2" : {
"ascii_name" : null,
"code" : null,
"id" : null,
"name" : null
},
"id" : 5261457,
"license" : {
"attribution" : "Data from geonames.org under a CC-BY 3.0 license",
"license" : "http://creativecommons.org/licenses/by/3.0/"
},
"nuts_level1" : {
"code" : null,
"name" : null
},
"nuts_level2" : {
"code" : null,
"name" : null
},
"nuts_level3" : {
"code" : null,
"name" : null
}
},
"lat" : 43.07305,
"line" : null,
"lng" : -89.40123,
"postcode" : null,
"primary" : false,
"state" : null,
"state_code" : null
}
],
"aliases" : [
"Carbone Cancer Center",
"UW Carbone",
"UW Carbone Cancer Center",
"UW Carbone Comprehensive Cancer Center",
"UWCCC Madison",
"University of Wisconsin Cancer Center",
"University of Wisconsin Carbone Comprehensive Cancer Center",
"University of Wisconsin Comprehensive Cancer Center",
"University of Wisconsin-Madison Carbone Cancer Center"
],
"country" : {
"country_code" : "US",
"country_name" : "United States"
},
"email_address" : null,
"established" : 1940,
"external_ids" : {
"FundRef" : {
"all" : [
"100007923"
],
"preferred" : "100007923"
},
"GRID" : {
"all" : "grid.412639.b",
"preferred" : "grid.412639.b"
},
"ISNI" : {
"all" : [
"0000 0001 2191 1477"
],
"preferred" : null
},
"Wikidata" : {
"all" : [
"Q7876154"
],
"preferred" : null
}
},
"id" : "https://ror.org/01e4byj08",
"ip_addresses" : [],
"labels" : [],
"links" : [
"https://cancer.wisc.edu"
],
"name" : "University of Wisconsin Carbone Cancer Center",
"relationships" : [
{
"id" : "https://ror.org/02mqqhj42",
"label" : "UW Health University Hospital",
"type" : "Parent"
}
],
"status" : "active",
"types" : [
"Funder",
"Healthcare"
],
"wikipedia_url" : "https://en.wikipedia.org/wiki/University_of_Wisconsin_Carbone_Cancer_Center"
},
"score" : 0.76,
"substring" : "university of wisconsin madison"
}
],
"number_of_results" : 3
}
Don't automatically select the first "unchosen" result of an ?affiliation query with no "chosen": true result
When no result has "chosen": true, the first result is not necessarily the best match for a given affiliation string. In that case, several results may have the exact same score and/or there may be no match with a high score. Because the affiliation service breaks a given affiliation string into multiple substrings and performs searches of each substring on its own as well as in combination with other substrings, a high-scoring match that is not selected as chosen may be a match to only a small portion of the entire affiliation string. In these cases, it is best to respect the absence of "chosen": true and leave the string unmatched or add a layer of human or machine assignment, at your discretion.
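The rule above can be sketched in a few lines of Python. This is an illustrative helper, not part of any ROR library: the function name is hypothetical, and it assumes a parsed affiliation response shaped like the sample shown earlier (an items list whose entries carry chosen, score, and organization).

```python
def chosen_ror_id(response):
    """Return the ROR ID of the result marked "chosen": true, or None."""
    for item in response.get("items", []):
        if item.get("chosen") is True:
            return item["organization"]["id"]
    # No chosen result: leave the string unmatched instead of falling back
    # to the first or highest-scoring item.
    return None


# Trimmed version of the sample response shown earlier in this guide.
with_chosen = {
    "items": [
        {"chosen": True, "score": 1, "organization": {"id": "https://ror.org/01y2jtd41"}},
        {"chosen": False, "score": 0.87, "organization": {"id": "https://ror.org/02mqqhj42"}},
    ]
}
# The same results with no item marked as chosen (illustrative only).
without_chosen = {
    "items": [
        {"chosen": False, "score": 0.87, "organization": {"id": "https://ror.org/02mqqhj42"}},
        {"chosen": False, "score": 0.76, "organization": {"id": "https://ror.org/01e4byj08"}},
    ]
}

print(chosen_ror_id(with_chosen))
print(chosen_ror_id(without_chosen))
```

Returning None rather than a best guess keeps unmatched strings visible, so they can be routed to manual review or another matching step.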
Additional request examples:
Incorporated Research Institutions for Seismology (IRIS)
curl 'https://api.ror.org/v1/organizations?affiliation=Incorporated%20Research%20Institutions%20for%20Seismology%20(IRIS)' | json_pp
Universitätsbibliothek der Ludwig-Maximilians-Universität München
curl 'https://api.ror.org/v1/organizations?affiliation=Universit%C3%A4tsbibliothek%20der%20Ludwig-Maximilians-Universit%C3%A4t%20M%C3%BCnchen' | json_pp
Department of Civil and Industrial Engineering, University of Pisa, Largo Lucio Lazzarino 2, Pisa 56126, Italy
curl 'https://api.ror.org/v1/organizations?affiliation=Department%20of%20Civil%20and%20Industrial%20Engineering%2C%20University%20of%20Pisa%2C%20Largo%20Lucio%20Lazzarino%202%2C%20Pisa%2056126%2C%20Italy' | json_pp
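If you build these request URLs in a script rather than by hand, the affiliation string needs to be URL-encoded first. The sketch below uses Python's standard urllib.parse.quote for that; the function name and the API_URL constant are illustrative, and the base URL is the one used in the curl examples above.

```python
from urllib.parse import quote

# Base URL taken from the curl examples in this guide.
API_URL = "https://api.ror.org/v1/organizations"


def affiliation_url(affiliation):
    """Build an affiliation request URL with the string percent-encoded."""
    # safe="" also encodes "/" so the whole string stays inside the parameter.
    return f"{API_URL}?affiliation={quote(affiliation, safe='')}"


print(affiliation_url("Universitätsbibliothek der Ludwig-Maximilians-Universität München"))
```

Non-ASCII characters are encoded as UTF-8 bytes, so ä becomes %C3%A4 and ü becomes %C3%BC, as in the request example above.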
Query parameter approach
Searches all indexed fields in ROR records using the format https://api.ror.org/organizations?query=<search term>&filter=<filters>. We strongly recommend using organization type and country filters. For more information, see the REST API guide.
curl 'https://api.ror.org/v1/organizations?query=%22Ministry+of+Health%22&filter=types:Government,country.country_code:NZ' | json_pp
The response is a JSON object containing full records for the first 20 search results. Results are paginated: see API paging for details. Matching scores are not included, but closest matches are at the beginning of the list.
Query parameter searches may return many results, so some amount of human intervention may be needed to determine the matching ROR ID (if there is one).
{
"items" : [
{
"acronyms" : [],
"addresses" : [
{
"city" : "Wellington",
"country_geonames_id" : null,
"geonames_city" : {
"city" : "Wellington",
"geonames_admin1" : {
"ascii_name" : null,
"code" : "NZ.WGN",
"id" : null,
"name" : "Wellington Region"
},
"geonames_admin2" : {
"ascii_name" : null,
"code" : null,
"id" : null,
"name" : null
},
"id" : 2179537,
"license" : {
"attribution" : "Data from geonames.org under a CC-BY 3.0 license",
"license" : "http://creativecommons.org/licenses/by/3.0/"
},
"nuts_level1" : {
"code" : null,
"name" : null
},
"nuts_level2" : {
"code" : null,
"name" : null
},
"nuts_level3" : {
"code" : null,
"name" : null
}
},
"lat" : -41.28664,
"line" : null,
"lng" : 174.77557,
"postcode" : null,
"primary" : false,
"state" : null,
"state_code" : null
}
],
"aliases" : [
"Manatū Hauora"
],
"country" : {
"country_code" : "NZ",
"country_name" : "New Zealand"
},
"email_address" : null,
"established" : 1903,
"external_ids" : {
"FundRef" : {
"all" : [
"501100001504"
],
"preferred" : null
},
"GRID" : {
"all" : "grid.415708.f",
"preferred" : "grid.415708.f"
},
"ISNI" : {
"all" : [
"0000 0004 0483 5988"
],
"preferred" : null
},
"Wikidata" : {
"all" : [
"Q16933991"
],
"preferred" : null
}
},
"id" : "https://ror.org/00vjb5165",
"ip_addresses" : [],
"labels" : [],
"links" : [
"http://www.health.govt.nz/"
],
"name" : "Ministry of Health",
"relationships" : [],
"status" : "active",
"types" : [
"Government",
"Funder"
],
"wikipedia_url" : "https://en.wikipedia.org/wiki/Ministry_of_Health_(New_Zealand)"
}
],
"meta" : {
"countries" : [
{
"count" : 1,
"id" : "nz",
"title" : "New Zealand"
}
],
"statuses" : [
{
"count" : 1,
"id" : "active",
"title" : "active"
}
],
"types" : [
{
"count" : 1,
"id" : "funder",
"title" : "Funder"
},
{
"count" : 1,
"id" : "government",
"title" : "Government"
}
]
},
"number_of_results" : 1,
"time_taken" : 6
}
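A query request like the one above can also be assembled programmatically. The following is a minimal sketch using Python's standard urllib.parse.urlencode; the function name and its parameters are hypothetical, and the filter names (types and country.country_code) are the ones used in the curl example above.

```python
from urllib.parse import urlencode

# Base URL taken from the curl examples in this guide.
API_URL = "https://api.ror.org/v1/organizations"


def query_url(name, org_type=None, country_code=None):
    """Build a query request URL for an exact name, with optional filters."""
    # Double quotation marks around the name request an exact-phrase search.
    params = {"query": f'"{name}"'}
    filters = []
    if org_type:
        filters.append(f"types:{org_type}")
    if country_code:
        filters.append(f"country.country_code:{country_code}")
    if filters:
        params["filter"] = ",".join(filters)
    # Keep ":" and "," readable in the filter value; both forms decode the same.
    return f"{API_URL}?{urlencode(params, safe=':,')}"


print(query_url("Ministry of Health", org_type="Government", country_code="NZ"))
```

With these arguments the function reproduces the URL used in the curl example above.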
Matching a list of organization names to ROR IDs using a script
You can write your own script to match a list of organization names to ROR IDs via the ROR API, though some manual intervention will likely be needed to make sure that matches are correct. If you are writing a script, remember the following:
- Try the affiliation parameter approach first instead of the query parameter approach.
- Look for "chosen" results, where "chosen": true.
- If no results are found, try alternate names/acronyms (if you have them) or try the query parameter approach.
- API results are paginated: see API paging for details.
- By default, API requests will return only records with an active status (status: "active"). Consider whether you want to retrieve records with an inactive status as well; inactive records generally represent organizations that no longer operate. See API filtering for details.
See, use, and clone examples of Python scripts that match organization names to ROR IDs in the ror-utilities GitHub repository.
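To make the checklist above concrete, here is one possible shape for such a script. It is a sketch under stated assumptions, not the code from the ror-utilities repository: all function names are hypothetical, the response layout is assumed to match the affiliation sample earlier in this guide, and only the first page of results is inspected.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl examples in this guide.
API_URL = "https://api.ror.org/v1/organizations"


def fetch_affiliation(text):
    """Call the affiliation endpoint and return the parsed JSON response."""
    with urlopen(f"{API_URL}?affiliation={quote(text, safe='')}") as resp:
        return json.load(resp)


def match_name(name, alternates=(), fetch=fetch_affiliation):
    """Try the name, then each alternate; return (ror_id, matched_text).

    Only results with "chosen": true are accepted. Returns (None, None)
    when no candidate produces one.
    """
    for candidate in (name, *alternates):
        for item in fetch(candidate).get("items", []):
            if item.get("chosen") is True:
                return item["organization"]["id"], candidate
    return None, None


def match_all(rows, fetch=fetch_affiliation):
    """Map each (name, alternates) pair in rows to a ROR ID or None."""
    return {name: match_name(name, alternates, fetch)[0] for name, alternates in rows}
```

Calling match_all([("University of Wisconsin Madison", ["UW–Madison"])]) would send live requests to the API. The fetch parameter exists so that a stub can be substituted when testing, or a rate-limited or cached client when running over a long list. Names that map to None are candidates for the query parameter approach or manual review.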
Match organization names to ROR IDs using the data dump
Instead of using the ROR API, you can use your own scripts or processing tools on the ROR data dump to match organization names to ROR IDs. Advantages of this approach include:
- Fine-grained control over matching criteria
- Faster processing in cases where you have many IDs to map
- No chance of error responses due to network interruptions or API outages
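One simple way to use the data dump is to build a lookup table from every known name variant to its ROR ID. The sketch below does exact matching after light normalization (case, accents, whitespace); it is not fuzzy matching. It assumes the dump is a JSON array of records with the same fields as the API responses shown in this guide (name, aliases, acronyms, labels), and all function names are hypothetical.

```python
import json
import unicodedata
from collections import defaultdict


def normalize(text):
    """Strip accents, fold case, and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())


def record_names(record):
    """Yield every name variant stored on a record."""
    yield record["name"]
    yield from record.get("aliases", [])
    yield from record.get("acronyms", [])
    for label in record.get("labels", []):
        yield label["label"]


def build_index(records):
    """Map each normalized name variant to the set of ROR IDs that use it."""
    index = defaultdict(set)
    for record in records:
        for name in record_names(record):
            index[normalize(name)].add(record["id"])
    return index


def lookup(index, text):
    """Return all ROR IDs whose name variants equal the text when normalized."""
    return sorted(index.get(normalize(text), ()))


def load_dump(path):
    """Load records from a data dump file (assumed to be a JSON array)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A lookup that returns more than one ID signals an ambiguous name (for example, a common name such as "Ministry of Health" or a shared acronym), which needs a country or type check, or human review, before assignment.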
Match organization names to ROR using OpenRefine
The ROR OpenRefine Reconciler is a fairly labor-intensive way of matching organization names to ROR IDs, but it works well for those who have a relatively small list of organization names, those who want to have a high degree of control and oversight over the matching process, and those who do not want to write code.
OpenRefine (formerly Google Refine) is a free, open source desktop tool for cleaning up messy data stored in common formats like CSV, JSON, XML, XLS. You can even use it to connect to SQL-based databases and Google Sheets.
See ROR OpenRefine Reconciler for written usage instructions, screenshots, and a tutorial video.
Match organization names to ROR IDs using third-party tools
Several projects and researchers have developed scripts and/or machine learning and artificial intelligence tools that match textual organization information to ROR IDs. Several of these tools are fast and can work with large amounts of data with accuracy rates before human intervention ranging from about 85% to 95%. These tools are not officially supported by ROR, but we list them here in case you find them useful.
- OpenAlex Institution Parsing by OurResearch
- S2AFF - Semantic Scholar Affiliations Linker by the Allen AI Institute
- RORRetriever by Metadata Game Changers
- EMBL-EBI ROR Predictor prototype by EMBL-EBI for Project FREYA
- dataESR affiliation matcher developed by Anne L'Hôte and Eric Jeangirard for the French Ministry of Higher Education
- OpenAlex ROR Predictor (gpu-based) by ROR Curation Lead Adam Buttrick
- fastText ROR Predictor (cpu-based) by ROR Curation Lead Adam Buttrick
- lr_predictor by ROR Curation Lead Adam Buttrick
- ROR experimental affiliation matching - A collection of data and code for training models and experimenting with automatically matching affiliation strings to ROR IDs. Not production code, and not officially supported by ROR.
Testing and training data
ROR is currently experimenting with strategies for automatically matching affiliation strings to ROR IDs, and we have collected sets of data from Springer Nature, the American Physical Society, Crossref, and OpenAlex for testing and training matching strategies that are openly available at https://github.com/ror-community/affiliation-matching-experimental/tree/main/test_data. These datasets include affiliation text strings from production systems that have been matched to ROR IDs with varying levels of human review.