Using data from Google BigQuery with the BigQuery data source

The following sections describe how to use data from Google BigQuery with the BigQuery data source:

Note: You can use data from Google BigQuery with the BigQuery data source (as described in this topic) or with Connected Sheets. To compare the two options, see Using data from Google BigQuery with AppSheet.

Overview

The BigQuery data source allows app creators to access BigQuery datasets in read-only mode only.

AppSheet does not initially support write/update/delete access to BigQuery datasets because BigQuery is a data warehouse database, not an online transaction processing (OLTP) database that you would normally use as the read/write data store for an application (like Cloud SQL). The intent of a data warehouse is to provide read-only datasets for reporting and data analytics.

For many use cases, we expect BigQuery datasets will be refreshed by the system of record (or authoritative source) in a separate process outside of AppSheet (like a nightly ETL batch process or a stream of data changes from the system of record using a tool like Cloud Data Fusion).

Licensing requirements for a BigQuery data source

BigQuery is a Google Cloud service that many enterprise customers use for data reporting. Because the volume of data that must be processed can be significant, BigQuery is classified as an Advanced Data Connector and requires an Enterprise-level plan.

The following AppSheet subscription plans include access to a BigQuery data source:

  • AppSheet Enterprise Standard

  • AppSheet Enterprise Plus

Limits when using a BigQuery data source

Because BigQuery has the potential to have massive amounts of data (potentially tens of millions of rows or more and up to 10,000 columns), we have put some product limits in place to ensure your apps will be able to load data and perform well.

AppSheet enforces the following limits for a BigQuery data source:

  • Data is read-only - BigQuery is a data warehouse; we do not support write operations from AppSheet at this time.

  • Maximum of 100,000 rows - BigQuery datasets exceeding the maximum number of rows will be truncated.

  • Partitioned tables are not supported - Any BigQuery tables that have partitioned columns will report an error during configuration in AppSheet. Use a BigQuery View that omits the partitioned column to work around this limitation.

  • Google Cloud access required - Service accounts are the only access method AppSheet supports. You will need to have access to create service accounts and keys in your Google Cloud project, or you will need to request assistance from an IT administrator with the appropriate level of access. The IT administrator can create the data source as a team data source and share it with you.

The next section provides best practices for setting up your app to work within the product limits.

Best practices for setting up your app to use a BigQuery data source

1. Use security filters.

Include AppSheet expressions in the security filters, which will be translated to BigQuery SQL and executed on the server before the data is downloaded to the app.

Examples:

  • Filter by USEREMAIL() to get rows owned or viewable by the currently logged-in app user.

  • Filter by DateTime values in the last week or month to get only the most recent data. For example, assuming there is a column in the dataset called DateCreated, you could use security filter expressions like:

    • For the last week: [DateCreated] >= TODAY() - 7

    • For the last month: [DateCreated] >= TODAY() - 30

  • Use a combination of user settings and security filters to allow your app user to limit the data. For example, using the user setting column named Year and a column in the dataset called StartTime, you can get data from the year 2018 or later (assuming the user setting Year is set to 2018), as follows: [StartTime] >= DATETIME("1/1/" & USERSETTINGS("YEAR"))

2. Use BigQuery Views

You can create specific views for AppSheet in BigQuery that have potentially complex SQL queries to create a limited number of rows from the dataset. When setting-up the BigQuery data source as a table in your app, select from the list of views in your project as well as from the base datasets.

3. Create a new BigQuery Table

BigQuery allows you to set up a copy of a table that can be re-created periodically based on a scheduled query. This is similar to a view, but it is a physical table that contains the subset of data. This can also be used to reduce the number of columns in order to simplify the table structure.


Configuring BigQuery in Google Cloud

In order to connect AppSheet to BigQuery, a service account must be created in Google Cloud with the correct role to access BigQuery data. A service account is a special kind of account that is used to grant system-to-system level access, rather than grant access to individual users. AppSheet (a system) will use the service account user to access BigQuery (a system) which will establish the system-to-system communications. Additional user-level access can be configured in AppSheet through security filters on the data source.

Accessing a public dataset only requires a BigQuery Job User role. Accessing a private dataset requires a custom role in Google Cloud, as described in Accessing private datasets using custom roles. (You can skip this section if you only plan to use a public dataset. )

Creating a service account

The following steps describe how to create a service account using the Google Cloud console.

  1. Go to the Google Cloud console.

  2. In the navigation menu, select IAM & Admin > Service Accounts to access the Service accounts page.

  3. Click + Create Service Account.

  4. Create service account with the BigQuery Job User role

    1. Enter the a Service account name and Service account description and click Create.

    2. Grant the Bigquery Job User role.

    3. Type in Bigquery Job in the text box to filter to the list of roles and select BigQuery Job User.

    4. Click Done.

Alternatively, the BigQuery Job User role can be set with the gcloud command-line interface, as shown in the following example. Change [PROJECTID] and [SERVICE_ACCOUNT_ID] to the values appropriate for your account.

gcloud projects add-iam-policy-binding [PROJECTID] \
--member= "serviceAccount:[SERVICE_ACCOUNT_ID]" \
--role="roles/bigquery.jobUser"

Create a JSON key for the service account

  1. Click the More menu (three vertical dots) in the Actions column of the new service account you created, and select Manage keys.

  2. Click Add Key and select Create new key.

  3. Accept the default selection of JSON and click Create.

  4. Note the file name. You will find this in your browser’s download folder. Depending on your browser type, you may see the downloaded file in the bottom bar of your browser window. Click Close.

  5. Open the file with your favorite file editor on your desktop so that you can cut and paste contents when Adding a BigQuery data source to your AppSheet app.

Accessing private datasets using custom roles

To access a private dataset, you need to create a custom role in Google Cloud IAM. If you do not need to access a private datasets (that is, you only need to access public datasets), skip to Adding the BigQuery data source to an AppSheet app.

Each Google Cloud account will have different security policies set up according to the security needs of the customer account. The instructions in this section should work for most Google Cloud projects with a highly restrictive set of policies. You may not need all of the permissions in the custom role in your specific Google Cloud project, so you can also try removing some of the permissions if you would like, then add them back in until it works for your project.

Google Cloud provides a shell feature in the Google Cloud Console called Cloud Shell which allows you to use a command0line interface. This is the quickest way to create a custom role.

  1. Click the Cloud Shell icon in the upper right of your Google Cloud Console window.

  2. Click Continue when prompted, and then wait for the Terminal window to boot up in the bottom portion of your screen.

  3. Click the Open Editor icon in the Cloud Shell Navigation bar (or use vi if you prefer).

  4. Click the icon and select New File in the menu.

  5. Enter a file name of your choice with a .yaml extension, like: BigQuery.AppSheet.Roles.yaml

  6. Cut and paste the following text into the file (modify the title and description values as needed for your use case).

    title: "bigquery.appsheet"
    description: "BigQuery ro role for AppSheet"
    stage: "GA"
    includedPermissions:
    - bigquery.datasets.get
    - bigquery.jobs.create
    - bigquery.routines.get
    - bigquery.routines.list
    - bigquery.tables.get
    - bigquery.tables.getData
    - bigquery.tables.list
    - resourcemanager.projects.get

  7. Remove any blank lines at the end of the file by using the Backspace or Delete keys on your keyboard. Your file should contain 12 lines, as shown below:

  8. Click on the x in the File tab to close the file (it will be saved automatically). Then, click Open Terminal to return to the shell.

  9. The terminal window reappears at the bottom of your screen.

  10. Cut and paste the following commands, one at a time. Replace the <bracketed> text with your own values. The backslash (\) tells the shell that the command will continue on the next line. The --file option tells the command to read the role definitions from the .yaml file you created above.

    gcloud iam roles create <your new role name> \

    --project=<your-project-id> \

    --file=<your-YAML-File-Name.yaml>

    Below is an example using actual values:

    gcloud iam roles create bigquery.appsheet3 \

    --project=appsheet-scott \

    --file=BigQuery.AppSheet.Roles.yaml

  11. Click Authorize to authorize the shell to call the API which will create the role.

    You should see the resulting text which confirms that the role was successfully created:

  12. Now, you need to share your private dataset with the role that you created so that all the permissions are available to the AppSheet BigQuery data source. You need the user name that was assigned to your service account. Navigate to the Service Accounts page of Google Cloud Console using the navigation menu or going to this URL: https://console.cloud.google.com/iam-admin/serviceaccounts

  13. Copy the email address from the Email column in the BigQuery service account that you created earlier. You will need this information in the next few steps. (You may want to open a new tab for the next step so that you can re-copy the service account email, if necessary.)

  14. Open a new tab in your browser and go to the BigQuery service in the Google Cloud Console. Use the navigation menu or go to this URL: https://console.cloud.google.com/bigquery

  15. Find your private dataset and click the More menu (three vertical dots) and select Open.

  16. Click Share Dataset in the BigQuery navigation bar:

  17. Paste the service account email address (copied in step 13) into the Add members text box.

  18. Select the Select a role menu and select Custom.

  19. Select the new custom role you created above, like bigquery.appsheet.

  20. Click Add.

  21. Note the new role now has a member, which is your service account. Click Done.

  22. Now, you are ready to add the BigQuery data source to an AppSheet app.


Adding the BigQuery data source to an AppSheet app

  1. Go to appsheet.com and log in to your account.

  2. Go to My Account > My Account.

  3. Click + New Data Source.

  4. Select Cloud Database and enter the data source name to use for your new data source in the text box, such as BigQuery-NYC-Citibikes.

  5. Choose BigQuery from the drop-down list of available database types.

    Note: If you don’t see BigQuery in the list, your account may not have this feature rolled out yet. We recommend that you wait and try again tomorrow. You do not have to repeat the previous steps; the information is saved. You can resume from here.

  6. Next, you need to obtain the following three values from the Google Cloud Console and the key file you downloaded previously: the BigQuery DataSet ID, the Google Cloud Project ID, and the Service Account key

  7. To obtain the BigQueryDataSet ID, go to the Google Cloud Console and select BigQuery under the Big Data section in the navigation menu. You may need to scroll down to find it.

  8. Find the BigQuery dataset that you’d like to use. We’ll use a public dataset called bigquery-public-data:new_york_citibike in our demo, but you should select the dataset that you would like to use in your app. If you don’t see the dataset you want, click + Add Data and add it.

  9. Open the dataset by clicking on the More menu (three vertical dots) to the right of the dataset name. Make sure you are at the top level. That is, the dataset level, indicated by this icon:

    Not on a Table level, indicated by this icon:

  10. Copy the Dataset ID so that you can paste it into AppSheet.

  11. Paste the Dataset ID into the DataSet ID field in AppSheet's Add database connection information window.

  12. To obtain the Google Cloud Project ID, go back to the Google Cloud Console. Click on the project name in the top navigation bar, and then copy the Project ID in the Select a project dialog so that you can paste it into AppSheet.

  13. Paste the Project Id into the Google Cloud Project ID field in AppSheet's Add database connection information window.

  14. To obtain the Service Account key, go to the JSON file that was downloaded when you created the key on the Service Account in Google Cloud Console (IAM & Admin). Find the file that was downloaded (it will be in your browser’s download directory) and open it with any text editor (if it is not already open) Copy the entire contents of the file.

  15. Paste the contents of the file into the Service Account key field in AppSheet's Add database connection information window.

  16. Click Test and then Authorize Access.


Adding a BigQuery table to your AppSheet app

  1. In AppSheet, go to Data > Tables and click + New Table.

  2. Select the data store you created (like big query NYC citibikes) on the Get data from... page.

  3. Select Tables or Views. (In this demo, we select Tables.)

  4. Select the table that you want for your App. (In this demo, we select citibike_stations.)

  5. Mark the table as Read-Only to improve performance as the app will not try to recache as often. Click Add This Table.

  6. Now, your app will have access to this data as if it were any other table.

  7. Important: Don’t forget to add security filters to limit the number of rows that will be used with AppSheet.

Did this answer your question?