AWS Feed
BIOps: Amazon QuickSight object migration and version control
DevOps is a set of practices that combines software development and IT operations. It aims to shorten the systems development lifecycle and provide continuous delivery with high software quality. Similarly, BIOps (business intelligence and IT operations) can help your Amazon QuickSight admin team automate assets migration and version control. Your team can design the migration and version control strategy in your organization by following the suggested pattern in this post. You can utilize the suggested framework and sample scripts to reduce your daily workload.
In this post, we discuss the following:
- The suggested automation process of QuickSight assets migration across accounts or Regions
- The suggested workflow of dashboard version control in one QuickSight account
- The advanced usage of Amazon QuickSight APIs
For migration across accounts, we provide two options and sample code packages:
- Amazon SageMaker notebooks with migration Python scripts for data scientists or engineers. These Python scripts can do batch migration (migrate all assets of the source account to the target account) and on-demand incremental migration (migrate specific assets across accounts).
- An application to perform migrations with a QuickSight embedded website as an UI. The backend of this application contains an Amazon API Gateway endpoint, several AWS Lambda functions, an Amazon Simple Queue Service (Amazon SQS) queue, and an Amazon Simple Storage Service (Amazon S3) bucket. This application is packed in AWS Cloud Development Kit (AWS CDK) stacks and can be easily deployed into your environment.
Migrate across accounts and Regions automatically
Let’s assume that we have two QuickSight accounts: development and production. Both accounts are configured to connect to valid data sources. The following diagram illustrates our architecture.
The architecture contains the following workflow:
- The Python scripts (SageMaker notebooks or Lambda functions) call QuickSight APIs (
list_datasources
) to get the data source list in the development account. - The scripts call the QuickSight
describe_data_source
API to describe the data source. The response of thedescribe_data_source
API is a JSON object. The scripts update the JSON object with production account information, for instance, Amazon Redshift credentials or cluster ID. - The scripts create the data source in the production account and share the data source with the BI admin team.
- The scripts perform the same procedure to the datasets.
- The scripts create a template of the dashboard or analysis that the BI admin wants to migrate. (A template only can be created from an analysis or an existing template. When we create a template of a dashboard, we have to create the template from the underlying analysis of this dashboard. The version of the published dashboard might be behind the underlying analysis.)
- The scripts call the
create_analysis
orcreate_dashboard
API in the production account to create the analysis or the dashboard from the remote template in the development account, and apply the theme. - The scripts share the analysis or dashboard to some specific groups or users.
- The scripts log the success messages and errors messages to Amazon CloudWatch Logs.
For migration across Regions, the BI admin can follow the same procedure to migrate assets from the source Region to the target Region. Instead of changing the account ID in the ARN of assets, change the Region name of the ARN.
We provide sample Python scripts later in this post.
Dashboard version control in one account
Under some conditions, the BI team might want to perform version control of the dashboard development in one account. The following diagram illustrates our architecture.
The workflow includes the following steps:
- The BI developer creates an analysis and a template of this analysis. Let’s call the analysis and template version 1 assets.
- The BI developer publishes the analysis as a dashboard, and the QA team runs tests on this dashboard.
- After the QA test, the BI developer continues to develop the analysis to be version 2.
- The BI team publishes version 2 of the dashboard.
- The QA team tests version 2 of dashboard again, and takes the following action based on the result:
- If the test is successful, the BI admin can update the template to be version 2.
- If the tests detect errors, the BI developer has to edit the analysis to fix the issues. However, some issues in the analysis may be unfixable. The BI admin can roll back the analysis or dashboard to be version 1 with the backup template. QuickSight allows authors to roll back analysis to previous version using an undo button. In case the undo history was reset (with user’s confirmation) due to an event like dataset swap, or authors want to go back to a confirmed V1 starting point, you can use the V1 template in an update-analysis API call to reset the analysis to V1 state.
- The BI developer works on the version 1 analysis to repeat the development cycle.
This workflow is the best practice we suggest to QuickSight users. You can modify the sample code packages we provide to automate this suggested process.
QuickSight API
For more information about the QuickSight API, see the QuickSight API reference and Boto3 QuickSight documentation.
Option 1: SageMaker notebooks of migration scripts
In this section, we present the first migration option for data scientists and engineers: using SageMaker notebooks with migration scripts.
Solution overview
We provide the sample Python scripts for migrating across accounts in three SageMaker notebooks:
- functions – Provides all the functions, including describe objects, create objects, and so on. The supportive functions are developed to perform the tasks to automate the whole process. For example, update the data source connection information, get the dashboard ID from dashboard name, and write logs.
- batch migration – Provides the sample automation procedure to migrate all the assets from the source account to the target account.
- incremental migration – Provides on-demand incremental migration to migrate specific assets across accounts.
The following diagram illustrates the functions of each notebook.
You can download the notebooks from the GitHub repo.
Prerequisites
For this solution, you should have the following prerequisites:
- Access to the following AWS services:
- QuickSight
- SageMaker
- AWS Identity and Access Management (IAM)
- Two different QuickSight accounts, for instance, development and production
- Basic knowledge of Python
- Basic AWS SDK knowledge
Create resources
Create your resources in the source account by completing the following steps:
- Download the notebooks from the GitHub repository.
- Create a notebook instance.
- Edit the IAM role of this instance to add an inline policy called
qs-admin-source
:
- On the notebook instance page, on the Actions menu, choose Open JupyterLab.
- Upload the three notebooks into the notebook instance.
Implement the solution
In this section, we walk you through the steps to implement the solution.
AssumeRole
To use AssumeRole
, complete the following steps:
- Create an IAM role in the target (production) account that can be used by the source (development) account.
- On the IAM console, choose Roles in the navigation pane.
- Choose Create role.
- Choose the Another AWS account role type.
- For Account ID, enter the source (development) account ID.
- Create an IAM policy called
qs-admin-target
:
- Grant the IAM role the
qs-admin-target
IAM policy. - Provide the
qs-admin-source
andqs-admin-target
role name in the Assume Role cells of the notebooks.
Static profile
To use a static profile, complete the following steps:
- Create the IAM user
qs-admin-source
with policyqs-admin-source
in the source account. - Create the IAM user
qs-admin-target
with policyqs-admin-target
in the target account. - Get the
aws_access_key_id
andsecret_access_key
of these two IAM users. - In the terminal of the SageMaker notebook, go to the directory
/home/ec2-user/.aws
. - Edit the config and credential file to add a profile named source with the
aws_access_key_id
andsecret_access_key
ofqs-admin-source
. - Edit the config and credential file to add a profile named target with the
aws_access_key_id
andsecret_access_key
ofqs-admin-target
. - Provide the source and target profile name in the Static Profile cell of the notebook.
The tutorials of these notebooks are provided as comments inside the notebooks. You can run it cell by cell. If you want to schedule the notebooks to run automatically, you can schedule the Jupyter notebooks on SageMaker ephemeral instances.
In this solution, we assume that the name of dashboard and dataset are unique in the target (production) account. If you have multiple dashboards or datasets with the same name, you will encounter an error during the migration. Every dashboard has its own business purpose, so we shouldn’t create multiple dashboards with the same name in the production environment to confuse the dashboard viewers.
Option 2: Dashboard as UI to enter the migration workflow
In this section, we present the second migration option with the use of a QuickSight embedded website as an UI.
Solution overview
The following diagram illustrates our solution architecture.
The following diagram illustrates the resources deployed in the central account to facilitate the migration process.
The resources include the following:
- Dashboard as UI – The QuickSight dashboard is based on a ticketing backend, QuickSight assets information, and migration status data. You can use the bottom banner of the dashboard to trigger a migration of resources. Choosing Submit sends the migration request and required parameters (asset name, source environment, and target environment) to API Gateway. The dashboard also displays the migration results, which are stored in an S3 bucket.
- S3 bucket – An S3 bucket hosts a static website to present you with a simple embedded dashboard that shows all active dashboards, analyses, datasets, data sources, and migration status.
- API Gateway – API Gateway provides endpoints for embedding a QuickSight dashboard and accepting POST requests to perform migrations:
- quicksight-embed – Embeds the migration status QuickSight dashboard. The API endpoint invokes the Lambda backend to generate a short-lived QuickSight embed URL, and presents the dashboard in an iFrame.
- quicksight-migration-sqs – Presents a footer form that allows the user to submit migration details with POST requests, which invoke the QuickSight migration Lambda function.
- SQS queue – We use an SQS queue between the QuickSight migration API endpoint and the backend Lambda function to perform the migration. Messages are deleted after a migration is complete.
- Lambda functions – We use three different functions:
- QuickSight migration – This function is invoked by the SQS queue, and it performs the necessary migration tasks depending on the parameters it receives. This function can perform both batch and incremental migration of QuickSight resources by querying the QuickSight service API, AWS Systems Manager Parameter Store, and AWS Secrets Manager.
- QuickSight embed URL – When invoked, this function fetches an embed URL of a given dashboard and returns an HTTP payload to the caller.
- QuickSight status – This function periodically queries the QuickSight API for details about dashboards, datasets, data sources, analyses, and themes, and uploads the results to Amazon S3. This S3 bucket is then used as a data source for a QuickSight dashboard to display a centralized view of all relevant resources.
Prerequisites
For this walkthrough, you should have the following prerequisites:
- Access to the following AWS services:
- API Gateway
- Amazon Athena
- Lambda
- QuickSight
- Amazon SQS
- Amazon S3
- Amazon CloudFront
- Parameter Store
- Secrets Manager
- Two different QuickSight accounts, such as development and production
- Basic knowledge of Python
- Basic AWS SDK knowledge
- Git and npm installed
- The AWS CDK installed (see AWS CDK Intro Workshop: Python Workshop)
Create resources
Create your resources by cloning the following AWS CDK stack from the GitHub repo:
Implement the solution
The following diagram illustrates the services deployed to our central and target accounts.
Deploy to the central account
We use the following stacks to deploy resources to the central account:
- QuicksightStatusStack – Deploys the Lambda functions and related resources to populate the S3 bucket with active QuickSight dashboard details
- QuicksightMigrationStack – Deploys the Lambda function, SQS queue, S3 bucket, and the API Gateway endpoint for initiating migration of QuickSight resources
- QuicksightEmbedStack – Deploys the API Gateway endpoint, CloudFront distribution, and Lambda functions to process the embed URL requests
The migration scripts require a QuickSight user to be created with the name quicksight-migration-user
. This user is given permissions to the migrated resources in the destination. However, another QuickSight user or group can be used in place of quicksight-migration-user
by replacing the following:
- The parameter in
~/amazon-quicksight-sdk-proserve/Migration-scripts/cdk/lambda/quicksight_migration/quicksight_migration/lambda_function.py
(line 66) - The QUICKSIGHT_USER_ARN variable in
~/amazon-quicksight-sdk-proserve/Migration-scripts/cdk/cdk/quicksight_embed_stack.py
(line 81)
Creating VPC connections in QuickSight allows QuickSight to access your private data resources and enhances your security. Create this connection in the central account with the VPC connection name set to the VPC ID.
Set up your environment
Set up your environment with the following code:
Deploy QuickSight status and migration stacks
Deploy the QuickSight status and migration stacks with the following code:
Note down the API Gateway endpoint from the output for a future step.
Create a dashboard
After the AWS CDK is deployed, run the Lambda function quicksight_status manually and then two files, group_membership.csv
and object_access.csv
, are created in the S3 bucket quicksight-dash-CENTRAL_ACCOUNT_ID
. By default, this Lambda function is invoked hourly.
In the source account, you can run the following SQL query to create two Athena tables (group_membership
and object_access
):
You can create two SPICE datasets in QuickSight with the two new Athena tables, and then create a dashboard based on these two datasets. For more information, see Using administrative dashboards for a centralized view of Amazon QuickSight objects.
Deploy the QuickSight embed stack
Update DASHBOARD_ID
in cdk/quicksight_embed_stack.py
(line 80) with the dashboard ID that you just created with the two joined Athena tables.
Update basic authentication credentials in lambda/embed_auth/index.js
(lines 9–10) with your preferred username and password.
Deploy the QuickSight embed stack with the following code:
In the html/index.html
file, update the following values to the output values from the QuicksightMigrationStack
and QuicksightEmbedStack
deployment steps, then upload it to the S3 bucket (quicksight-embed-CENTRAL_ACCOUNT_ID
) created by this stack:
- quicksight-embed-stack.EmbedAPIGatewayURL output value (line 85) –
apiGatewayUrl: '<quicksight-embed-stack.EmbedAPIGatewayURL>'
- quicksight-migration-stack.MigrationAPIGatewayURL output value (line 38) –
const apiGatewayUrl = '<quicksight-migration-stack.MigrationAPIGatewayURL>';
The index.html
file should be placed in the root of the S3 bucket with no prefix.
In the event index.html
was updated after CloudFront was deployed, and is displaying the wrong content, the CloudFront cache may need to be invalidated. The URI /index.html
should be invalidated in the central account CloudFront distribution.
The URL for the CloudFront distribution we created is outputted when the QuicksightEmbedStack
stack is deployed. In QuickSight, add the output value for quicksight-embed-stack.EmbedCloudFrontURL
to allow dashboard embedding, and select Include subdomains.
Deploy to the target account
We use the following stacks to deploy resources to the target account:
- InfraTargetAccountStack – Deploys an IAM role that can be assumed by the migration Lambda role. This stack should also be deployed to any target accounts that contain QuickSight resources.
- OptionalInfraTargetAccountStack – Deploys Amazon VPC, Amazon Redshift cluster, and Amazon Aurora cluster. This stack is optional and can be ignored if you have existing infrastructure for this proof of concept.
Deployment of target resources to the target account can either be done from the central account Amazon Elastic Compute Cloud (Amazon EC2) instance with the appropriate cross-account permissions or from an EC2 instance provisioned within the target account.
For this post, Amazon Redshift and Amazon Relational Database Service (Amazon RDS) clusters are required to perform migrations. Amazon Redshift and Amazon RDS aren’t necessary for migrating QuickSight resources that depend on Amazon S3 or Athena, for example. The stack optional-infra-stack
deploys both Amazon Redshift and Amazon RDS clusters in the target account. Although deploying this stack isn’t necessary if you already have these resources provisioned in your target account, it does set up the environment correctly for the example migrations. To deploy, use the following command:
Deploy the target account stack
Update line 16 in cdk/infra_target_account_stack.py
:
self.central_account_id = "123456789123"
with the central account ID.
If OptionalInfraTargetAccountStack
was deployed, update the /infra/config
AWS Systems Manager parameter found in cdk/infra_target_account_stack.py
(lines 67–77) file with the values of the newly created Amazon Redshift or Amazon RDS clusters. All values are provided as AWS CloudFormation outputs.
However, if you already have clusters deployed in your environment, update the /infra/config
Systems Manager parameter found in the cdk/infra_target_account_stack.py
(lines 67–77) file with the values of your existing Amazon Redshift or Amazon RDS clusters. Set redshiftPassword
and rdsPassword
to the name of the secret found in Secrets Manager for these resources. These secrets for Amazon Redshift and Amazon RDS should include username
and password
values, as shown in the following screenshot.
The following are example values for the /infra/config
parameter:
After the values have been updated in cdk/infra_target_account_stack.py file run the following shell commands:
Creating an Amazon VPC connection in QuickSight allows QuickSight to access your private data resources, and enhances your security. Create this connection in the target account with the VPC connection name set to the VPC ID. This is required because the data sources created in the OptionalInfraTargetAccountStack
stack are within a VPC.
Trigger a migration
The URL for the CloudFront distribution we created is outputted when the QuicksightEmbedStack
stack is deployed. Navigate in your browser to the output value for quicksight-embed-stack.EmbedCloudFrontURL
; the CloudFront distribution prompts you for basic authentication credentials, then redirects to the embedded QuickSight dashboard.
You can trigger a migration via the migration tool at the bottom of the dashboard.
As of this writing, this tool supports two types of migrations:
- Batch – Attempts to migrate all resources, including themes, dashboards, analyses, datasets, and data sources
- Incremental – Allows you to select a migration resource of the dashboard, analysis, or theme, and provide names of these items in the Migration Items field.
Choosing Submit sends a message to the SQS queue, which triggers the migration Lambda function on the backend. The embedded dashboard should reflect the status of the dashboard migration after it performs its periodic refresh.
Clean up
Finally, to clean up the resources created in this post, perform the following cleanup steps, depending on the solution option you used.
Option 1 cleanup
If you implemented the Option 1 solution, complete the following steps:
- Delete the SageMaker notebooks running the migration scripts.
- Delete the IAM role attached to the SageMaker notebooks.
Option 2 cleanup
If you implemented the Option 2 solution, complete the following steps:
- Destroy the resources created by the AWS CDK:
- Destroy the resources created by the AWS CDK in the target accounts:
- Manually delete S3 buckets created in both central and target accounts.
Things to consider
This solution will help you with QuickSight object migration and version control. Here are few limitations to consider:
- If there are deleted datasets used in the QuickSight analysis or dashboard, then consider deleting such datasets.
- If there are duplicate object names then consider naming them different.
- If there are file based data sources then consider converting them to S3 based data sources.
Pricing
For the pricing details of the services used in this post, see the following:
- Amazon S3 pricing
- AWS Lambda pricing
- Amazon SQS pricing
- Amazon API Gateway pricing
- AWS Secrets Manager pricing
Conclusion
Object migration and version control in a programmable method is always highly demanded by the BI community. This post provides the best practices and practical code package to address QuickSight object migration and version control. This solution can readily fit into a ticketing system or CI/CD pipelines.
If you have any feedback or questions, please leave them in the comments section. You can also start a new thread on the Amazon QuickSight forum.
About the Authors
Ying Wang is a Senior Data Visualization Architect with the Data & Analytics Global Specialty Practice in AWS Professional Services.
Vamsi Bhadriraju is a Data Architect at AWS. He works closely with enterprise customers to build data lakes and analytical applications on the AWS Cloud.
Samruth Reddy is a DevOps Consultant in the AWS ProServe Global Delivery team working on automation tooling, security and infrastructure implementations, and promoting DevOps methodologies and practices to his customers.