OpenRefine (formerly Google Refine) is a free, open source desktop tool for cleaning up messy data stored in common formats like CSV, JSON, XML, and XLS. You can also connect it to SQL-based databases and Google Sheets. OpenRefine is a popular tool for tasks like normalizing text values in a dataset because it has a simple user interface and doesn't require coding.
OpenRefine integrates with many external services that support the W3C Reconciliation Service API protocol for matching data on the Web, including ROR.
We've built a reconciliation API extension - the ROR OpenRefine Reconciler - that allows matching organization names in an OpenRefine project to ROR IDs using the ROR REST API, but with no coding needed!
The Reconciler requires manually confirming matches between organization names and ROR IDs, so it's useful for cases where you have a relatively small number of organization names (up to hundreds or perhaps several thousand, if you have time and patience).
For large organization lists with many thousands of records, we recommend using the REST API or data dump, although this requires some coding. See our guide Match organization names to ROR IDs for tips and code examples.
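As a rough illustration of what the coding route involves, the sketch below builds a query URL for affiliation matching against the ROR REST API and picks the result flagged as chosen from a response. The endpoint path, the `affiliation` parameter, and the `items`, `chosen`, and `organization` response fields are assumptions that should be checked against the current ROR API documentation. The helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; check the ROR API docs for the current version path.
ROR_API_URL = "https://api.ror.org/organizations"


def build_affiliation_url(name, base_url=ROR_API_URL):
    """Return a URL asking the API to match a free-text organization name."""
    return base_url + "?" + urlencode({"affiliation": name})


def pick_chosen_id(response):
    """Return the ROR ID of the item flagged as chosen, or None if there is none."""
    for item in response.get("items", []):
        if item.get("chosen"):
            return item["organization"]["id"]
    return None


def match_name(name):
    """Query the API for one name (requires network access)."""
    with urlopen(build_affiliation_url(name)) as resp:
        return pick_chosen_id(json.load(resp))


# Hypothetical response, trimmed to the fields this sketch reads.
sample_response = {
    "items": [
        {"score": 0.7, "chosen": False,
         "organization": {"id": "https://ror.org/0example01"}},
        {"score": 1.0, "chosen": True,
         "organization": {"id": "https://ror.org/0example00"}},
    ]
}
```

Calling `match_name` in a loop over a column of names would return one ROR ID (or `None`) per name. Unlike the Reconciler, nothing here asks you to confirm a match, so results from a script like this still deserve a manual spot check.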
- Download and install OpenRefine on your computer
- Create a project by importing data that contains a column with organization names
- Click the arrow beside the heading of the organization names column and choose Reconcile > Start reconciling...
- In the window that opens, click Add standard service..., enter https://reconcile.ror.org/reconcile and click Add service
- Leave the other settings as they are and click Start reconciling
- Processing may take a few minutes, especially for long lists
- A list of possible ROR matches (if available) is displayed below the original organization name value in each cell. Hover over each match to see more information from ROR. Choose your preferred ROR match by clicking the checkbox beside it. Click the double checkbox to assign your chosen ROR match to the current cell and any identical cells in the same column.
When you select an organization match from ROR, OpenRefine will change the original value in your organization name column to the name in the corresponding ROR record. If you want to retain the original names, make a copy of your organization names column in OpenRefine before you start using the ROR Reconciler.
- In cases where no match was found, you can search ROR for name variations by clicking Search for match and entering variations in the search box. If you find a good match, choose it from the dropdown and click Match. If not, click Don't reconcile cell.
If you're not able to find a ROR ID for a particular research organization, you can suggest additions, which are handled through the ROR community curation process. See How to suggest additions and changes to ROR.
- Next, we'll copy just the ROR IDs to a new column. Click the arrow beside the heading of the organization names column and choose Edit column > Add column based on this column...
- Enter a name for your ROR IDs column in the New column name field, enter cell.recon.match.id in the Expression field, and click OK.
- You should now have a list of organization names and the corresponding ROR IDs that you selected. Export your project to your desired format following the directions in the OpenRefine User Manual: Exporting your work
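Behind these steps, OpenRefine exchanges JSON with the reconciliation endpoint you entered. The sketch below shows roughly what that exchange looks like under the Reconciliation Service API drafts: a batch of queries keyed by arbitrary IDs, and a response holding a candidate list per query with `id`, `name`, `score`, and `match` fields. Whether the ROR Reconciler accepts exactly this payload is an assumption. The helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode

RECONCILE_URL = "https://reconcile.ror.org/reconcile"  # endpoint from the steps above


def build_queries(names):
    """Key each organization name as q0, q1, ... in one query batch."""
    return {f"q{i}": {"query": name} for i, name in enumerate(names)}


def encode_form(queries):
    """Encode the batch as the form body a client would POST to RECONCILE_URL."""
    return urlencode({"queries": json.dumps(queries)}).encode("utf-8")


def best_candidates(response):
    """Return {query key: (id, name)} for the top-scoring candidate of each query."""
    best = {}
    for key, payload in response.items():
        results = payload.get("result", [])
        if results:  # queries with no candidates are left out
            top = max(results, key=lambda r: r.get("score", 0))
            best[key] = (top["id"], top["name"])
    return best


# Hypothetical response for two queries; the second has no candidates.
sample_response = {
    "q0": {"result": [
        {"id": "https://ror.org/0example01", "name": "Example College",
         "score": 0.6, "match": False},
        {"id": "https://ror.org/0example00", "name": "University of Example",
         "score": 1.0, "match": True},
    ]},
    "q1": {"result": []},
}
```

The top-scoring candidate is only a suggestion, which is why the steps above ask you to confirm each match by hand. As far as we can tell, the `id` of the candidate you confirm is the value that cell.recon.match.id copies into the new column.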