Dedupe extension
  • 16 Nov 2022


The dedupe extension helps you find and manage duplicate records in a table. You can delete duplicates or merge individual fields from duplicate records together.


Overview

Try using the dedupe extension when you want to:

  • Convert a messy CSV into an Airtable base
  • Clean up redundant form submissions
  • Consolidate leads that were accidentally added to your CRM multiple times

To get started, first add a dedupe extension to your base by opening up the extensions panel and clicking the + Add an extension button.


Find duplicate records

Next, you'll need to pick the table in which you'll be looking for duplicates. If you'd like, you can also choose to limit your search to a specific view to only find duplicates within a subset of records.

4405948086807addapp.jpg

After that, you'll need to pick the field or fields that you want to use to find duplicates. If you pick multiple fields to find duplicates, the extension will find sets of records where the values match for all of the selected fields.
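The "match on all of the selected fields" rule can be pictured with a small sketch. This is only an illustration, not the extension's actual implementation: the function `find_duplicate_sets` and the sample records are hypothetical, and records are modeled as plain dictionaries.

```python
from collections import defaultdict

def find_duplicate_sets(records, fields):
    """Group records whose values match on every selected field (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        # The grouping key is the tuple of values for all selected fields.
        key = tuple(record.get(field) for field in fields)
        groups[key].append(record)
    # Only groups with two or more records count as a set of duplicates.
    return [group for group in groups.values() if len(group) > 1]

# Made-up sample data for illustration.
records = [
    {"id": 1, "Name": "Ada", "Email": "ada@example.com"},
    {"id": 2, "Name": "Ada", "Email": "ada@example.com"},
    {"id": 3, "Name": "Ada", "Email": "ada@other.example"},
]

by_name = find_duplicate_sets(records, ["Name"])
by_name_and_email = find_duplicate_sets(records, ["Name", "Email"])
```

In this sketch, matching on Name alone puts all three records in one set, while adding Email narrows the set to records 1 and 2, because record 3 differs on one of the selected fields.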

360013893034FieldsforDuplicates.gif

For text field types, you can choose to use exact matching, similar matching, or fuzzy matching. Below is an explanation of what will be found by each matching type:

  • Exact: Has the same, case-sensitive value
    • Ex) J. K. Meowling and J. K. Meowling
  • Similar: Has the same value, but may have different capitalization, punctuation, accents, whitespace, or ordering of words.
    • Ex) J. K. Meowling and j k meôwling
  • Fuzzy: Looks for typos and transposition errors (where characters are swapped such as ei vs ie in the word field)
    • Ex) J. K. Meowling and J. K. Lemonwig

Be careful! Fuzzy matching can often result in false positive matches, since its search is so broad.
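To make the difference between the three matching types more concrete, here is a rough sketch. It is an assumption-laden illustration, not how the extension works internally: the helpers `exact_match`, `similar_key`, `similar_match`, `osa_distance`, and `fuzzy_match` are hypothetical, and the `max_edits` threshold is an arbitrary choice. The fuzzy part uses a standard edit distance that counts a swap of two adjacent characters as a single edit.

```python
import re
import unicodedata

def exact_match(a, b):
    # Exact: same value, including capitalization.
    return a == b

def similar_key(text):
    # Strip accents, lowercase, drop punctuation, and ignore word order.
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"\w+", no_accents.lower())
    return " ".join(sorted(words))

def similar_match(a, b):
    return similar_key(a) == similar_key(b)

def osa_distance(a, b):
    # Edit distance where swapping two adjacent characters costs one edit.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]

def fuzzy_match(a, b, max_edits=2):
    # max_edits is an assumed threshold; a larger value finds more
    # candidates but also more false positives.
    return osa_distance(a.lower(), b.lower()) <= max_edits
```

With these definitions, `similar_match("J. K. Meowling", "j k meôwling")` is true even though `exact_match` is false for the same pair, and `osa_distance("field", "feild")` is 1 because the swapped "ie" counts as a single transposition. Raising `max_edits` shows why a broad fuzzy search produces false positives: more and more unrelated values fall within the threshold.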

360013938833DedupeExactFuzzySimilar.gif

Once you've picked your field or fields for duplicate finding and determined how strict you want the duplicate search to be, the dedupe extension will show how many sets of duplicates are in your table or view, plus a preview of those duplicates. Click Review duplicates when you're ready to start deduping your records.


Resolve duplicate records

This will bring you to the Resolve duplicate records screen, which compares the identified duplicates in a set side by side. Note that at this time you must go through each set of duplicate records individually to ensure the final record contains the correct information.

For each set of duplicates, you can choose to Exclude a record from the set if it is not a duplicate. You can also choose a primary record, which will mark all remaining records for deletion. If there are fields from the other records that you would like to keep, you can merge them with the primary record.

Note

Selecting a primary record means that you are specifically preserving that record's comments and revision history. All field information, including the primary field, can then be merged into the primary record you selected.

Once you choose a primary record, you will see a green check mark appear in the header. All other records will appear crossed out. For each field, you can choose one value from the available duplicates. The chosen field will turn green, indicating that it will be included in the merged record. If you'd like to combine values from two or more fields, you can do so by selecting Edit record and then manually updating the desired field.

You can merge values from as many records as you want. The example below shows information being merged from 3 different records. The merge preview area on the right shows what the new record will look like.

360013893154Merge3.png
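The merge described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the extension's code: `merge_duplicates`, the `chosen` mapping, and the sample records are hypothetical, and the sketch assumes that any field you don't explicitly pick keeps the primary record's value.

```python
def merge_duplicates(records, primary_id, chosen):
    """Return (merged_record, ids_marked_for_deletion).

    chosen maps a field name to the id of the record whose value to keep.
    Assumption: fields not listed in chosen keep the primary record's value.
    """
    by_id = {record["id"]: record for record in records}
    # Start from a copy of the primary record so the input is not modified.
    merged = dict(by_id[primary_id])
    for field, source_id in chosen.items():
        merged[field] = by_id[source_id][field]
    # Every record other than the primary is marked for deletion.
    to_delete = [record["id"] for record in records if record["id"] != primary_id]
    return merged, to_delete

# Made-up set of three duplicates.
duplicates = [
    {"id": 1, "Name": "J. K. Meowling", "Phone": "", "City": "Paris"},
    {"id": 2, "Name": "J. K. Meowling", "Phone": "555-0100", "City": ""},
    {"id": 3, "Name": "J. K. Meowling", "Phone": "", "City": "Lyon"},
]

merged, to_delete = merge_duplicates(
    duplicates, primary_id=1, chosen={"Phone": 2, "City": 3}
)
```

Here record 1 is the primary record, the Phone value comes from record 2, the City value comes from record 3, and records 2 and 3 are the ones marked for deletion.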

When comparing duplicates, you can choose to sort the identified potential duplicates alphabetically, by created time, by the number of comments on a record, or by the number of fields filled out. You can also choose to hide fields that contain identical values for all the identified duplicates, which can save time when comparing records with many fields (identical fields will be merged into the remaining record after you merge and delete).
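As a rough picture of what hiding identical fields means, the sketch below keeps only the fields whose values differ somewhere in the set. The helper `fields_that_differ` and the sample data are hypothetical and serve only as an illustration.

```python
def fields_that_differ(records, fields):
    """Return the fields with at least two different values across the set."""
    differing = []
    for field in fields:
        values = [record.get(field) for record in records]
        # A field is worth showing only if some record disagrees with the first.
        if any(value != values[0] for value in values[1:]):
            differing.append(field)
    return differing

# Made-up set of duplicates.
duplicate_set = [
    {"Name": "Ada", "Email": "ada@example.com", "City": "Paris"},
    {"Name": "Ada", "Email": "ada@example.com", "City": "Lyon"},
]

visible = fields_that_differ(duplicate_set, ["Name", "Email", "City"])
```

In this example only City would remain visible, since Name and Email are identical in both records.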

If you want to start from scratch on a given set of duplicates, you can always hit the Reset button at the bottom left, which restores all records excluded from the set, unsets the primary record, and resets all selected fields.


Choosing multiple cells to include in the final result

For some field types that support multiple values (multiple select, multiple collaborator, linked records, and attachments), you can choose to keep multiple values in the final result.

You can use the "+" button on the cell, or Ctrl + click, to select one additional cell to add to the results.

360051340393dedupe0.png

If you click any cell without holding the Ctrl key, the clicked cell will replace the entire selection. As an example, in the screenshot above, clicking "Agency Team" in the third record would exclude the "Agency Team" cells from the first two records and only include the cell in the third record, as shown in the next screenshot.

360050433174mceclip0.png

One common flow is to combine the cell values from all duplicates. To merge all the cells for a particular field, you can use Shift + click to bulk-select a range of cells.

360050433234mceclip1.png

To do this, you must already have a primary record selected. Shift + clicking another record will select all cells between the primary record and the clicked record. For example, in the screenshot above, Shift + clicking the third record selects records 1-3. Keep in mind that the primary record isn't necessarily the first record in the set, so the selected range always runs between the primary record and the record you click.
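The range rule can be expressed as a tiny sketch. The function `shift_click_range` is hypothetical and uses zero-based positions within the set; it only illustrates the "everything between the primary record and the clicked record" behavior described above.

```python
def shift_click_range(primary_index, clicked_index):
    """Return the positions selected by Shift + click, inclusive on both ends."""
    # The range runs between the two positions regardless of which comes first.
    start, end = sorted((primary_index, clicked_index))
    return list(range(start, end + 1))
```

For instance, with the primary record in the first position, `shift_click_range(0, 2)` covers the first three records. If the primary record is in the third position and you Shift + click the first record, `shift_click_range(2, 0)` covers the same three records.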

When combining the values from these field types, any duplicates will be removed so that only unique values will be included in the final merged record.

360051340513mceclip2.png
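Removing repeated values while combining cells can be sketched as follows. This is an illustration only: `combine_unique` is a hypothetical helper, each cell is modeled as a list of values, and keeping values in first-seen order is an assumption rather than documented behavior.

```python
def combine_unique(cells):
    """Combine the values of several multi-value cells, keeping each value once."""
    seen = set()
    combined = []
    for cell in cells:
        for value in cell:
            # Skip values that an earlier cell already contributed.
            if value not in seen:
                seen.add(value)
                combined.append(value)
    return combined

# Made-up cells from three duplicate records.
cells = [["Agency Team", "Design"], ["Agency Team"], ["Sales", "Design"]]
result = combine_unique(cells)
```

In this sketch the combined cell contains "Agency Team", "Design", and "Sales" once each, even though two of those values appear in more than one record.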

Cells that don't show the "+" button can't hold multiple values, so this kind of combining isn't available for them.


Shortcuts

To speed up the deduplication process further, you can use keyboard shortcuts:

  • Use the arrow keys to select a record. You can also select a record by pressing its corresponding number key
  • A will use the currently selected record as the primary record
  • S will exclude the currently selected record from the merge process, and will not delete it
  • Space will open up the currently selected record for editing
  • Ctrl + Enter will merge your selections

The following shortcuts can be used with fields that can have more than one value (multiple collaborator, linked records, multiple select, and attachments).

  • Ctrl + click will add the current selection to those already selected
  • Shift + click will add all the values between the primary record and the clicked record


Merge selections to records

When you're confident in your merge selections, select the "Merge records" button at the bottom right. You will then be asked to confirm your changes. Click "Merge" to confirm. The dedupe extension will then show the next set of duplicates to work through.

Once you're finished working through the sets of duplicates, the extension will return you to the main selection screen where you can pick the field or fields that you want to use to find more duplicates. If you're totally done deduping, you can go ahead and press Cancel or the X in the top right to close the extension and return to your base.


FAQs

What is happening when I see the “You don’t have permission to apply these changes” message?

Typically, there is some kind of editing permission setting that is blocking you from merging (deleting) records. You can learn more in this article.

