OpenRefine reconciler
What is OpenRefine?
OpenRefine (formerly Google Refine) is a free, open source desktop tool for cleaning up messy data stored in common formats like CSV, JSON, XML, XLS. You can even connect to SQL-based databases and Google Sheets. OpenRefine is a popular tool for tasks like normalizing text values in a dataset because it has a simple user interface and doesn't require coding.
How does ROR integrate with OpenRefine?
OpenRefine integrates with many external services that support the W3C Reconciliation Service API protocol for matching data on the Web, including ROR.
We've built a reconciliation API extension - the ROR OpenRefine Reconciler - that allows matching organization names in an OpenRefine project to ROR IDs using the ROR REST API, but with no coding needed!
Note that the ROR OpenRefine Reconciler matches names to ROR records with a status of active only. ROR records with a status of inactive or withdrawn will not be displayed as possible matches.
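For readers curious about what OpenRefine sends on your behalf, here is a minimal sketch of a batch of reconciliation queries. It assumes the service follows the Reconciliation Service API convention of a form-encoded `queries` parameter holding a JSON object keyed `q0`, `q1`, and so on; the helper name `build_reconcile_body` is ours, and the snippet only builds the request body without sending anything.

```python
import json
from urllib.parse import parse_qs, urlencode

# Service URL taken from the usage instructions on this page.
RECONCILE_URL = "https://reconcile.ror.org/reconcile"


def build_reconcile_body(names):
    """Build a form-encoded body carrying one reconciliation query per name.

    Assumption: the service accepts a `queries` form parameter containing a
    JSON object of the form {"q0": {"query": "..."}, "q1": {...}}.
    """
    queries = {f"q{i}": {"query": name} for i, name in enumerate(names)}
    return urlencode({"queries": json.dumps(queries)})


body = build_reconcile_body(["University of Oxford", "CERN"])

# Decode the body again to show the structure that would be POSTed.
decoded = json.loads(parse_qs(body)["queries"][0])
print(decoded)
```

OpenRefine builds and sends these batches for you, so this is only useful if you want to understand or debug the exchange.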
What use cases is this tool best for?
The Reconciler requires manually confirming matches between organization names and ROR IDs, so it's useful for cases where you have a relatively small number of organization names (up to hundreds, or perhaps several thousand if you have time and patience).
For large organization lists with many thousands of records, we recommend using the REST API or data dump, but this will require some coding. See our guide Match organization names to ROR IDs for tips and code examples.
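As a rough idea of what the coding route looks like, the sketch below builds a request URL for the REST API and picks the match the API marks as chosen. The endpoint path, the `affiliation` parameter, and the response fields (`items`, `chosen`, `score`, `organization`) are assumptions based on our understanding of the ROR API and may differ between API versions; the sample response is illustrative, not real output, and no request is sent.

```python
from urllib.parse import urlencode

# Assumed endpoint; check the ROR REST API documentation for the current path.
ROR_API = "https://api.ror.org/organizations"


def affiliation_url(name):
    """Return the URL for an affiliation-matching request for one name."""
    return f"{ROR_API}?{urlencode({'affiliation': name})}"


def chosen_match(response):
    """Return (ror_id, score) for the item flagged as chosen, or None.

    Assumption: each item carries a boolean `chosen`, a numeric `score`,
    and an `organization` object with an `id`.
    """
    for item in response.get("items", []):
        if item.get("chosen"):
            return item["organization"]["id"], item["score"]
    return None


# Illustrative response shape only (the first ID is a placeholder).
sample_response = {
    "items": [
        {"chosen": False, "score": 0.6,
         "organization": {"id": "https://ror.org/0xxxxxx00"}},
        {"chosen": True, "score": 1.0,
         "organization": {"id": "https://ror.org/03yrm5c26"}},
    ]
}

print(affiliation_url("California Digital Library"))
print(chosen_match(sample_response))
```

In a real script you would fetch each URL, pass the parsed JSON to `chosen_match`, and review any names that return `None` by hand.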
Using the ROR OpenRefine Reconciler
Prerequisites
- Download and install OpenRefine on your computer
- Create a project by importing data that contains a column with organization names
Usage instructions
- Click the arrow beside the heading of the organization names column and choose Reconcile > Start reconciling...
- In the window that opens, click Add standard service..., enter https://reconcile.ror.org/reconcile and click Add service
- Leave the other settings as they are and click Start reconciling
- Processing may take a few minutes, especially for long lists
- A list of possible ROR matches (if available) is displayed below the original organization name value in each cell. Hover over each match to see more information from ROR. Choose your preferred ROR match by clicking the checkbox beside it. Click the double checkbox to assign your chosen ROR match to the current cell and any identical cells in the same column.
When you select an organization match from ROR, OpenRefine will change the original value in your organization name column to the name in the corresponding ROR record. If you want to retain the original names, make a copy of your organization names column in OpenRefine before you start using the ROR Reconciler.
- In cases where no match was found, you can search ROR for name variations by clicking Search for match and entering variations in the search box. If you find a good match, choose it from the dropdown and click Match. If not, click Don't reconcile cell.
If you're not able to find a ROR ID for a particular research organization, you can suggest additions, which are handled through the ROR community curation process. Learn how to suggest additions and changes to ROR.
- Next, we'll copy just the ROR IDs to a new column. Click the arrow beside the heading of the organization names column and choose Edit column > Add column based on this column...
- Enter a name for your ROR IDs column in the New column name field, enter cell.recon.match.id in the Expression field, and click OK.
- You should now have a list of organization names and the corresponding ROR IDs that you selected. Export your project to your desired format following the directions in the OpenRefine User Manual: Exporting your work
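If you export to CSV, a quick check like the sketch below can separate rows that received a ROR ID from rows that still need attention. The column name `ror_id`, the sample data, and the ID pattern are assumptions for illustration: the pattern reflects our understanding of the ROR ID format and accepts values with or without the https://ror.org/ prefix, and unmatched cells are assumed to export as blank.

```python
import csv
import io
import re

# Assumed ROR ID shape: a leading 0, six base32-style characters, two digits,
# optionally preceded by the https://ror.org/ prefix.
ROR_ID = re.compile(r"^(?:https://ror\.org/)?0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")


def split_rows(csv_text, id_column="ror_id"):
    """Split exported rows into those with a valid-looking ROR ID and the rest."""
    matched, unmatched = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(id_column) or "").strip()
        (matched if ROR_ID.match(value) else unmatched).append(row)
    return matched, unmatched


# Hypothetical export with one reconciled row and one unmatched row.
sample = (
    "organization,ror_id\n"
    "California Digital Library,https://ror.org/03yrm5c26\n"
    "Unknown Institute,\n"
)

matched, unmatched = split_rows(sample)
print(len(matched), len(unmatched))  # prints: 1 1
```

To run this on a real export, read the file's contents into a string and pass the name you gave your ROR IDs column as `id_column`.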
Tutorial video
Additional OpenRefine resources