- 10 May 2023
- 18 Minutes to read
Data sets and verifying data in Airtable
- Updated on 10 May 2023
Using Airtable's existing sync feature, creators or admins in your Enterprise organization can configure shareable data sets that appear in a library alongside other data sets. Admins can verify individual data sets so that your organization can trust them as a single source of truth. Users with creator permissions can then sync that verified data into bases across their organization's Airtable ecosystem. This article covers the permission settings and configuration options involved.
Introduction
Plan availability | Enterprise workspaces only |
Permissions | |
Platform(s) | Web/Browser, Mac app, and Windows app |
Related reading | |
Airtable terminology |
- Data set(s) - Specific data that has been shared to an organization’s library, where other people in the organization can find and access it. A data set may additionally be marked as verified by an admin, although this is not a requirement for a data set to be included in the library.
- Verified data set(s) - Specific data that has been approved by admins within an organization and shared for use by other people in that organization. Verified data is shared to a data library, where other members of the organization can find and access it.
- Data library - Available to an entire organization, the data library represents the collection of data sets that have been published for that organization. People in an organization who want to use a data set must navigate to the data library when adding a new table to a base.
- Data set owner - A data set owner is manually designated as a point of contact for a data set that’s been published to an organization’s data library. Although data set owners don’t have any additional access or permissions in-product, it is required to designate a data set owner before a data set can be published.
Why use data sets in Airtable?
Organizations with many departments and teams often run into content sprawl, which is what happens when multiple versions of similar data are used across an organization. As a result, data becomes hard to trust, difficult to find, and tedious or inefficient to keep updated. Here are a few examples of data that is prone to content sprawl:
- Product roadmaps
- Marketing campaigns
- Company-wide KPIs and OKRs
- Org chart / Human resources directory
By using data sets in Airtable, your organization can reduce content sprawl by empowering Enterprise admins to verify the data sets that meet your organization's standards. Each verified data set then effectively becomes a single source of truth for the entire organization. This means that your organization's data can only be updated from a single base location, but can be synced and viewed from multiple base locations throughout your organization's Airtable ecosystem.
Publishing a data set
The following steps can be performed by Enterprise admins or other users (depending on admin panel settings) who have creator permissions in the base where the data is stored. Publishing a data set makes it visible to other members of your organization and lets admins verify it if they choose.
1. Navigate to the base, then the table, and finally the view that you would like to add to your organization's data library.
2. Click the Share and sync option, and then click the Publish to data library option.
3. Give the data set a name and a description. We highly recommend describing the data in as much detail as is necessary to help creators across your organization understand exactly what the data set contains.
4. By default, the user currently publishing the data set (you) will be assigned as the data set's "owner." However, you can also assign ownership to another user with creator permissions in the base where the data set is sourced from.
5. By default, any user with base creator permissions will be able to see and add the data set from your organization's data library. However, by toggling on this option, you can choose particular user groups to share the data set with. This is particularly helpful when there is sensitive data or data only relevant to certain groups that you would like to hide from the larger organization.
6. Here you can choose whether to lock the view. By default, this option is toggled on. Remember that records can still be modified in other views in the table where the data is stored. The main reason for this feature is to prevent accidental changes to the configuration of the view being used as the source of the data set.
7. On the right portion of the setup screen, you'll see a general preview of how the data set will appear in the data library to other users. Once you have determined that the data set configuration is complete, click Publish. After publishing, the data set will appear in your organization's data library and can be verified by an Enterprise admin.
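The publishing options above can be summarized as a small data model. The following is a minimal Python sketch, not Airtable's implementation or API: the `DataSetConfig` class, its field names, and `is_visible_to` are hypothetical names chosen for illustration. It assumes only what the steps above describe: the publisher becomes the owner unless another owner is chosen, the view is locked by default, and the data set is visible to everyone unless it is restricted to specific user groups. It does not model creator permissions or organization membership.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DataSetConfig:
    """Hypothetical model of the publishing form; not an Airtable API object."""
    name: str
    description: str
    published_by: str
    owner: Optional[str] = None  # falls back to the publisher
    restricted_groups: Set[str] = field(default_factory=set)  # empty = whole organization
    lock_view: bool = True  # toggled on by default

    def __post_init__(self) -> None:
        # An owner is required before publishing, so default to the publisher.
        if self.owner is None:
            self.owner = self.published_by

    def is_visible_to(self, user_groups: Set[str]) -> bool:
        """Return True if a user in `user_groups` would see this data set."""
        if not self.restricted_groups:
            return True
        return bool(self.restricted_groups & user_groups)


config = DataSetConfig(
    name="Company OKRs",
    description="Quarterly objectives and key results for every team.",
    published_by="ana@example.com",
    restricted_groups={"Leadership"},
)
print(config.owner)                          # ana@example.com
print(config.is_visible_to({"Leadership"}))  # True
print(config.is_visible_to({"Marketing"}))   # False
```

Keeping the restriction as a set of group names makes the default case (an empty set, visible to the whole organization) explicit.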
Using a data set
The following steps can be performed by any users from your organization who have Creator permissions in the base where they want to sync that data set. It's important to note that certain data sets may not be visible to every user depending upon the settings outlined in Step 5 of the section above.
Navigate to the base where you would like to add the data set. Remember that this will essentially create a synced table in the base that cannot be edited. However, the data can be viewed and enriched with additional fields in the table that is created.
First, click the + Add or import option to create a new table. You'll find this option next to the furthest right table in the base that is open. Clicking this will open a dropdown menu where you will see a section titled "Add from data library." In this section, you'll see 3 suggested data sets that are popular in your organization as well as the option to view x more data sets >. Clicking that option will open the data library.
If you clicked to view more data sets, then a popup will appear with more choices from your organization's data library.
Verified data sets that have specifically been marked as trustworthy by your organization's admins will show a Verified badge next to the name of the data set.
Clicking on any data set will open a preview of the data set containing:
- An interactive embedded preview of the data set
- A description of the data set
- The number of records and fields in the data set
- The number of other bases currently syncing to this data set
- The owner of the data set
- When the data set was last updated
When you have confirmed that this is the correct data set to use, click Add this data. Depending on the size of the source, it may take a few moments for the source data to sync over. You'll see a new synced table appear in the base when it is finished processing.
Data sets in the Admin Panel
Within the Admin Panel, Enterprise admins can access the Data sets page for their organization's instance from the left side menu. On this page, admins are able to see a list of data sets across their organization. Admins will find vital information about each data set and be able to perform various actions such as editing, verifying, and previewing data sets, and more. Additionally, there is a toggle to only allow admins to add data sets to your organization's data library.
You'll find the Data sets page on the left side of the Admin Panel.
- Total # of published data sets - This number represents the total number of published data sets across an organization.
- Toggle for restricting data set publishing - By default, this option is toggled off, which means that users with creator/owner permissions in a base can publish data sets. When toggled on, only Enterprise admins will be able to publish data sets in bases where they have creator/owner permissions.
- Active/Inactive - Here, admins can click between a list of active or inactive data sets. Inactive data sets represent data sets that are not currently added to any other bases across the organization utilizing the data sets feature. You can find more information about reactivating a data set below.
- Find a data set - Search for a data set by its given name in the data library.
- Data sets filter - Filter the list of data sets to show All datasets, Published only (data sets that are not currently verified), or Published and verified data sets.
- CSV - Download a CSV file containing the information listed above.
- Name - The name of the data set.
- Verified - A check mark indicates that the data has been verified. The process of verifying data is covered below.
- Data set owner - The user currently assigned as the "owner" of the data set.
- Bases using this data - The total number of bases that have included this data set as a table.
- Published for - The audience(s) that have access to see this data set in the data library. Either All org members or a list of specific user groups chosen in step 5 of the publishing flow outlined above.
- Published by - The user who originally published the data set. This user might be different than the "owner" of the data set.
- ... (Additional options) - Here, admins will find more options that allow them to:
  - Preview - Preview the data set as if you were looking at it in the data library. Good for gaining context about what is contained within the data set.
  - Verify data set - Mark the data as verified so that users across the organization know they can trust the data set. On previously verified data sets this option will appear as No longer verify data set, which essentially allows admins to "unverify" the data set. Note that "unverifying" a data set will not cause previously added data sets to be removed from connected bases.
  - Edit - Edit the name, description, data set owner, and/or user group access of the data set.
  - Go to data source - This option links to the original base view of the data published to the data library.
  - Remove from library - Remove this data set from your organization's data library.
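Two of the controls listed above, the publishing-restriction toggle and the data sets filter, follow simple rules that can be written out. This is an illustrative Python sketch only: `DataSetRow`, `filter_data_sets`, and `can_publish` are hypothetical names, and nothing here calls a real Airtable API. The filter labels are taken from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSetRow:
    """Hypothetical row in the Data sets list; field names are illustrative."""
    name: str
    verified: bool


def filter_data_sets(rows: List[DataSetRow], mode: str) -> List[DataSetRow]:
    """Mimic the three filter choices on the Data sets page."""
    if mode == "All datasets":
        return list(rows)
    if mode == "Published only":  # published but not currently verified
        return [r for r in rows if not r.verified]
    if mode == "Published and verified":
        return [r for r in rows if r.verified]
    raise ValueError(f"unknown filter: {mode!r}")


def can_publish(is_admin: bool, has_creator_access: bool, admins_only: bool) -> bool:
    """Apply the publishing-restriction toggle (off by default)."""
    if not has_creator_access:
        return False  # creator/owner permissions in the base are always needed
    return is_admin or not admins_only


rows = [DataSetRow("Product roadmap", True), DataSetRow("Campaign calendar", False)]
print([r.name for r in filter_data_sets(rows, "Published only")])
# ['Campaign calendar']
print(can_publish(is_admin=False, has_creator_access=True, admins_only=True))
# False
```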
Verifying or removing verification of a data set
You'll need to have Airtable Enterprise admin permissions to be able to perform the steps listed below.
After logging into Airtable on the browser of your choice, visit this link to open your organization's admin panel.
After opening the Admin Panel, click the Data sets option on the left sidebar.
From here you can search for a specific data set by name or scroll through the list of data sets to find it. To filter the list to only show data sets that are not currently verified, set the filter next to the search bar to Published only.
On the right side of the data set you are viewing, click the three-dot ... icon to open a menu of additional options. Next, click the option to Verify data set. The screen will refresh after a moment and you should see a check appear in the Verified column next to the data set you just verified.
Removing the verification badge from a data set follows the same steps. The main difference is that the option to Verify data set will instead appear as No longer verify data set. Additionally, you may want to filter the Data sets page to only show data sets that are currently verified by choosing the Published and verified option in the filter dropdown menu.
Making inactive data sets active
Certain actions will deactivate a data set. These include:
- Turning off sync for the share
- Disabling the share link
- Setting a password for the share
- Turning on two-way sync for the share
- Enabling email domain restrictions for the share
We've included messaging in-product to help prevent changes like these from being made, but issues may still arise over time.
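The deactivation conditions listed above amount to a checklist over a view's share settings. The Python sketch below is a hedged illustration of that checklist, not Airtable's implementation: `ShareSettings`, its field names, and the issue messages are hypothetical, and the only rules encoded are the five conditions from the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShareSettings:
    """Hypothetical snapshot of a view's share settings; not an Airtable API object."""
    link_enabled: bool = True
    sync_enabled: bool = True
    password_protected: bool = False
    two_way_sync: bool = False
    email_domain_restricted: bool = False


def blocking_issues(settings: ShareSettings) -> List[str]:
    """List every setting that would keep the data set inactive."""
    issues = []
    if not settings.sync_enabled:
        issues.append("sync is turned off for the share")
    if not settings.link_enabled:
        issues.append("the share link is disabled")
    if settings.password_protected:
        issues.append("the share is password-protected")
    if settings.two_way_sync:
        issues.append("two-way sync is turned on")
    if settings.email_domain_restricted:
        issues.append("email domain restrictions are enabled")
    return issues


def is_active(settings: ShareSettings) -> bool:
    return not blocking_issues(settings)


settings = ShareSettings(password_protected=True)  # password set by accident
print(blocking_issues(settings))  # ['the share is password-protected']
settings.password_protected = False                # password removed
print(is_active(settings))        # True
```

Returning the full list of issues, rather than a single boolean, mirrors the way several problems may need to be resolved before a data set can be made active again.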
You'll need to have Airtable Enterprise admin permissions to be able to perform the steps listed below.
After logging into Airtable on the browser of your choice, visit this link to open your organization's admin panel.
After opening the Admin Panel, click the Data sets option on the left sidebar.
Below the toggle to Allow only admins to publish data sets you'll see the option to show Inactive data sets. Click this and move on to the next step.
From here you can search for a specific data set by name or scroll through the list of data sets to find it. You'll see an option, Make active, that when clicked on will take you to the base where the issue is occurring. As a reminder, we list those potential issues at the top of this section.
Once in the base, you'll see one or more warning messages explaining what will need to be resolved in order to reactivate the data set.
Any issues will need to be resolved before the data set can be reactivated. Using the example from the last step, let's say that a password was accidentally enabled for the data set. To resolve this, you'll need to:
- Click the blue back arrow that is highlighted.
- This will return you to the main Share and sync settings menu.
- From here, click Link settings.
- Then toggle off the Access is password-protected option.
- This will open a confirmation pop-up. Click Remove password.
Now that you've resolved the issue or issues that caused the data set to become inactive you can either click back to the Link settings or click the Share and sync button to open the view share settings menu again. From here:
- Click the Publish to data library option.
- This will open the next window. Confirm that All issues have been resolved.
- Finally, click Make active to reactivate the data set.
Data set dependencies
- Being an Organization member means that a user has been claimed by a single enterprise account. More about Organizations can be found in this support article.
- Users who are not members of the organization will not see data sets as an option in the data library while adding a new synced table, even if that base is enterprise owned.
Example: Personal accounts won’t see data sets in the data library. They will see existing synced tables though.
Data sets are ordered in the data library as:
- Verified data sets first, then unverified data sets.
- Within those two larger grouping orders, data sets will then appear in the order of most recently published to least recently published.
As a workaround to bump a data set to the top of the view order, an admin can:
- Remove the data set from the data library.
- Then, publish that data set again.
- Lastly, verify the data set again.
- In data set previews, the view's group settings will appear the same as they do in the source of the original data set.
- However, when a data set is added to another base those group settings may not be respected.
- In these cases, you may need to "rebuild" the group settings that were used in the original data set. If you are not the data set owner, we recommend contacting them so that they can advise you on how to recreate those settings.
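The library ordering rule described earlier in this section (verified data sets first, then most recently published within each group) and the re-publish workaround can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Airtable's code: `LibraryEntry` and `library_order` are hypothetical names, and the dates are made up for the example.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class LibraryEntry:
    """Hypothetical data library entry; field names are illustrative."""
    name: str
    verified: bool
    published_at: datetime


def library_order(entries: List[LibraryEntry]) -> List[LibraryEntry]:
    """Verified first; within each group, most recently published first."""
    newest_first = sorted(entries, key=lambda e: e.published_at, reverse=True)
    # Python's sort is stable, so the date order survives inside each group.
    return sorted(newest_first, key=lambda e: not e.verified)


entries = [
    LibraryEntry("Org chart", True, datetime(2023, 1, 10)),
    LibraryEntry("Product roadmap", True, datetime(2023, 3, 5)),
    LibraryEntry("Campaign calendar", False, datetime(2023, 4, 20)),
]
print([e.name for e in library_order(entries)])
# ['Product roadmap', 'Org chart', 'Campaign calendar']

# Workaround: remove, publish again (new publish date), then verify again.
republished = replace(entries[0], published_at=datetime(2023, 5, 1))
bumped = [republished] + entries[1:]
print([e.name for e in library_order(bumped)])
# ['Org chart', 'Product roadmap', 'Campaign calendar']
```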
FAQs
We have not included any limits at this time.
- By default, data sets can be published by any Organization member with creator permissions or higher. Admins can choose to allow only admins to publish data sets in the Admin Panel.
- Only the organization's admin(s) can verify a data set.
Users must be Organization members (i.e. claimed by a single Organization, read more here) to have access to an Organization’s data library. If a user is not an organization member or an admin, they will not see the library while adding a new synced table in a base, even if that base is owned by the Organization.
By default, a data set is published to the entire Organization, but publishers can restrict its visibility to specific user groups, both in the publishing flow and when editing a published data set in the library.
Admins can see the list of data sets in the Admin Panel “Data sets” tab and take various actions to manage them (e.g. search, filter, verify, stop verifying, change metadata).
Data verification can be performed by Airtable admins.
Managing and choosing which data sets should be marked as verified is highly dependent upon your organization's individual needs.
- It’s important for at least one admin to take the lead on this program – they should be responsible for verifying existing datasets and ensuring that they stay up to date.
- It’s also important that admins feel comfortable with the quality of this data without feeling like they have an unreasonable upkeep responsibility. That’s the reason the ability to establish an owner on a data set exists.
Users must enable a link and ensure that the following are disabled:
- Two-way sync
- Password protected
- Email domain access at the view level (org-wide domain sharing restrictions are fine)
If a user has published a data set and then enables one of these settings, the published data set will move into an inactive state and a user or admin must reconfigure sync settings to reactivate it.
Data sets are stored in the original bases they are published from. You can view all published datasets in the admin panel, including the bases that they have been synced to.
Yes! This is the recommended approach for data sets that are already used broadly. Publishing to the data library will work retroactively:
- Publishing a synced view already in use to the data library doesn’t change anything for bases already using it
- It immediately gives admins more visibility into their data ecosystem as these bases will be listed as using the published dataset in the Admin Panel.
No, existing syncs will not be affected. The feature to publish a dataset to specific groups is only about setting the visibility of the data set in the library, so if a user is already using that sync, it doesn't change anything for them.
Airtable recommends that the owner of a data set is clearly defined within your organization.
We will only track a source and all bases syncing to that source. This means that syncs of syncs (aka a chained sync) will not show up in the admin panel.
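To make the distinction concrete, here is a small Python sketch of direct versus chained syncs. It is only an illustration: the `syncs` mapping, the base names, and `bases_tracked_for` are hypothetical, and the function simply encodes the rule stated above that only bases syncing directly from the source are tracked.

```python
from typing import Dict, List, Set


def bases_tracked_for(source: str, syncs: Dict[str, List[str]]) -> Set[str]:
    """Return only the bases that sync directly from `source`."""
    return set(syncs.get(source, []))


# Each key is a base; each value lists the bases syncing from it.
syncs = {
    "HR base": ["Finance base", "Marketing base"],
    "Marketing base": ["Agency base"],  # a sync of a sync (chained)
}
print(sorted(bases_tracked_for("HR base", syncs)))
# ['Finance base', 'Marketing base']
```

In this example, "Agency base" receives the data through "Marketing base", so it would not be counted against the original "HR base" source.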
Yes. Published datasets update automatically, just like regular syncs.
Verified data sets come first, followed by published (unverified) data sets. Within each of those groups, data sets are shown from most recently published to least recently published.
Yes, because this is how Airtable sync works.