Build a data story app using Amazon Redshift Serverless and Toucan


This is a guest article co-written by Django Bouchez, a solution engineer at Toucan, and Amazon Cloud Technology.

Business Intelligence (BI) with dashboards, reporting and analytics remains one of the most popular data and analytics use cases. It provides business analysts and managers with visualization of the past and current state of the enterprise, helping leaders make strategic decisions that will affect the future of the enterprise. However, customers still want better ways to tell stories with their data to increase adoption of their BI tools.

Most BI tools on the market offer an exhaustive set of customization options for building data visualizations. This might seem like a good idea, but it ends up burdening business analysts with exploring endless possibilities before they can write a report. Analysts are not graphic designers, and a poorly designed data visualization may hide the insights it intends to convey or even mislead the viewer. To get more value from your data, you should focus on building data visualizations that tell a story and are easy for your audience to understand. This is where guided analytics comes in. It doesn’t offer unlimited customization options, but rather intentionally limits choices by enforcing design best practices. The simplicity of the guided experience allows business analysts to spend more time generating actual insights instead of worrying about how to present them.

This post clarifies the concept of guided analytics and shows you how to build data storytelling applications using Amazon Redshift Serverless and Amazon Cloud Technology Partner Toucan. Toucan integrates natively with Redshift Serverless, which enables you to deploy scalable data stacks in minutes without having to manage any infrastructure components.

Amazon Redshift is a fully managed cloud data warehouse service that lets you analyze large amounts of structured and semi-structured data. It scales from data warehouses of a few gigabytes to petabyte-scale deployments. Amazon Cloud Technology recently announced the general availability of Redshift Serverless, which makes Amazon Redshift one of the best options for storing data and running ad hoc analyses in a scalable and cost-effective way.

With Redshift Serverless, you can gain data insights without having to manage data warehouse infrastructure by running standalone SQL queries or using data visualization tools such as Amazon QuickSight, Toucan, or other third-party options.

Toucan is a cloud-based guided analytics platform purpose-built to reduce the complexity of delivering data insights to business users. To do this, Toucan provides a no-code, comprehensive user experience at every stage of a data storytelling application, including connecting data, building visualizations, and distributing on any device.

If you’re in a hurry and want to see what you can do with this integration, check out the shark attack visualization built by Amazon Cloud Technology and Toucan (https://louishourcade.github.io/aws-toucan-website/), which shows how Redshift Serverless and Toucan can help you understand the evolution of shark attacks around the world.

Solution Overview

There are several BI tools available in the market, each offering an increasing set of features and customization options to stand out from the competition. Paradoxically, this does not appear to increase BI tool adoption within the enterprise. With more complex tools, data owners spend time building fancy visuals and tend to pack as much information as possible into dashboards rather than presenting simple, understandable information to business users.

In this post, we explain the concept of guided analytics from the perspective of a data engineer who needs to communicate stories to business users through data visualizations. The fictional data engineer must create a dashboard to understand how shark attacks have evolved over the past 120 years. After loading the shark attack dataset into Redshift Serverless, we guide you through writing stories with Toucan to better illustrate how shark attacks have evolved over time. With Toucan, you can natively connect to datasets in Redshift Serverless, transform data using a no-code interface, build storytelling visuals, and then publish them for consumption by business users. The shark attack visualization example (https://louishourcade.github.io/aws-toucan-website/) illustrates what can be achieved by following the instructions in this post.

Also, we recorded a video tutorial (https://youtu.be/rIcUjUKkz20) on how to connect Toucan with Redshift Serverless and start building graphs.

Solution Architecture

The diagram below depicts our solution architecture.

dac8ed1123396d740a4c28f586c41f39.png

We use an Amazon CloudFormation stack to deploy all the resources you need in your Amazon account:

  • Network components – This includes a VPC, three public subnets, an internet gateway, and a security group for hosting the Redshift Serverless endpoint. In this article, we use public subnets to facilitate data access from external sources such as Toucan instances. In this case, the data in Redshift Serverless is still protected by a security group that restricts incoming traffic and by database credentials. For production workloads, it is recommended to keep traffic within the Amazon network. You can do this by setting up the Redshift Serverless endpoint in a private subnet and deploying Toucan in your Amazon account through Amazon Marketplace.

  • Redshift Serverless components – This includes a Redshift Serverless namespace and workgroup. The Redshift Serverless workgroup is publicly accessible for easy connections from Toucan instances. The database name and admin username are defined as parameters when deploying the CloudFormation stack, and the admin password is created in Amazon Secrets Manager. In this article, we use database credentials to connect to Redshift Serverless, but Toucan also supports connecting with Amazon credentials and Amazon Identity and Access Management (IAM) profiles.

  • Custom resource – The CloudFormation stack contains a custom resource, which is an Amazon Lambda function that automatically loads the shark attack data into a Redshift Serverless database when the CloudFormation stack is created.

  • IAM Roles and Permissions – Finally, the CloudFormation stack includes all the IAM roles associated with the previously mentioned services to interact with other Amazon resources in your account.

In the following sections, we’ll provide all the instructions for connecting Toucan to your data in Redshift Serverless and guide you through building a data storytelling application.

Example dataset

In this article, we use a custom dataset that lists all known shark attacks worldwide starting in 1900. You don’t need to import the data yourself; we use the Amazon Redshift COPY command to load the data when deploying the CloudFormation stack. The COPY command is one of the fastest and most scalable methods for loading data into Amazon Redshift. For more information, see Loading Data Using the COPY Command.
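As a rough illustration of what such a load looks like, the sketch below assembles a COPY statement in Python. The bucket path, IAM role ARN, and CSV options are placeholders chosen for this example; the statement actually run by the CloudFormation stack is not shown in this article and may differ.

```python
def build_copy_statement(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads a CSV file with a header row."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# Hypothetical values, for illustration only.
statement = build_copy_statement(
    table="public.shark_attacks",
    s3_uri="s3://example-bucket/shark_attacks.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

COPY reads the file directly from Amazon S3, which is why it needs an IAM role that is allowed to read the bucket.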

The dataset contains 4900 records with the following columns:

  • date

  • Year

  • decade

  • century

  • type

  • Zone_Type

  • area

  • country / region

  • Activity

  • gender

  • age

  • major

  • time

  • type

  • href (link to PDF with contextual description)

  • Case_Number

Prerequisites

For this solution, you should have the following prerequisites:

  • An Amazon account. If you don’t have an account yet, see the instructions in Signing up for Amazon.

  • An IAM user or role that has permissions to the Amazon resources used in this solution.

  • Free trial of Toucan for building data storytelling applications.

Setting up Amazon resources

You can launch a CloudFormation stack in any region where Redshift Serverless is available.

1. Choose Launch Stack to begin creating the Amazon resources required for this article: (https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=aws-toucan&templateURL=https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/BDB-2389/template.json)

272bbbb71934ec5fbe650d52f7ff789d.png

2. Specify the database name in Redshift Serverless (default is dev).

3. Specify the administrator username (default is admin).

You don’t need to specify the database administrator password because it is created in Secrets Manager by the CloudFormation stack. The name of this secret is AWS-Toucan-Redshift-Password. We will use the secret value in the next steps.
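The Launch Stack link targets us-east-1. If you prefer another Region, the link can be rebuilt with a different region value, as in the minimal sketch below. The stack name and template URL are taken from the link above; the function name is just for illustration, and the chosen Region is assumed to support Redshift Serverless.

```python
TEMPLATE_URL = (
    "https://aws-blogs-artifacts-public.s3.amazonaws.com"
    "/artifacts/BDB-2389/template.json"
)


def build_launch_url(region: str, stack_name: str, template_url: str) -> str:
    """Compose a CloudFormation console link that pre-fills the stack name and template."""
    return (
        f"https://console.aws.amazon.com/cloudformation/home?region={region}"
        f"#/stacks/new?stackName={stack_name}&templateURL={template_url}"
    )


# Same stack and template as the Launch Stack link, in a different Region.
url = build_launch_url("eu-west-1", "aws-toucan", TEMPLATE_URL)
print(url)
```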

Test deployment

It takes a few minutes to deploy the CloudFormation stack. After the deployment is complete, you can confirm that the resources were created. To access your data, you need to obtain Redshift Serverless database credentials.

1. On the Outputs tab of the CloudFormation stack, note the name of the Secrets Manager secret.

32d8f5b0dfd196ae37ce2421a757b625.png

2. On the Secrets Manager console, navigate to the Amazon Redshift database secret and choose Retrieve secret value to obtain the database administrator’s username and password.

4be59ab68db23fda52f10b1c0c3a9b29.png

3. To ensure that your Redshift Serverless database is available and contains the shark attack dataset, open the Redshift Serverless workgroup on the Amazon Redshift console and choose Query data to open the query editor.

4. Note also the Redshift Serverless endpoint, which you need to connect to Toucan.

5689692b91417f4df6a3e203c79d6eb1.png

5. In the Amazon Redshift query editor, run the following SQL query to view the shark attack data:

SELECT * FROM "dev"."public"."shark_attacks";


c7b8515b326da86ad9e2d4380cfebffa.png

Note that if you changed the default database name when launching the CloudFormation stack, you need to change the name of the database in the SQL query accordingly.
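The same check can be scripted instead of run in the query editor. The sketch below counts the rows of the table through the Redshift Data API; it takes the client as a parameter so that it can be exercised without AWS access. The workgroup name in the usage comment is a placeholder, and the caller is assumed to have IAM permissions for the Data API.

```python
import time


def count_rows(client, workgroup: str, database: str, table: str,
               poll_seconds: float = 1.0) -> int:
    """Run SELECT COUNT(*) through the Redshift Data API and return the count."""
    submitted = client.execute_statement(
        WorkgroupName=workgroup,
        Database=database,
        Sql=f"SELECT COUNT(*) FROM {table};",
    )
    statement_id = submitted["Id"]

    # The Data API is asynchronous: poll until the statement reaches a final state.
    while True:
        status = client.describe_statement(Id=statement_id)["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"Query ended with status {status}")
        time.sleep(poll_seconds)

    result = client.get_statement_result(Id=statement_id)
    # One row, one column; integer values come back under the "longValue" key.
    return result["Records"][0][0]["longValue"]


# Usage sketch (requires boto3 and AWS credentials; "my-workgroup" is a placeholder):
#   import boto3
#   client = boto3.client("redshift-data")
#   print(count_rows(client, "my-workgroup", "dev", '"public"."shark_attacks"'))
```

If the load succeeded, the count should match the number of records given in the dataset description.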

You have configured Redshift Serverless in your Amazon account and loaded the shark attack dataset. Now is the time to put this data to use by building a storytelling app.

Start Toucan Free Trial

The first step is to access the Toucan platform through the Toucan free trial.

Fill out the form and complete the registration steps. Afterwards, you will enter Storytelling Studio in Staging Mode. Feel free to browse what has been created.

04421a7cdfbd27fad557131218ea7dcd.png

Connecting Redshift Serverless to Toucan

To connect Redshift Serverless with Toucan, complete the following steps:

1. Select Datastore at the bottom of Toucan Storytelling Studio.

2. Select Connectors.

Toucan natively integrates with Redshift Serverless via AnyConnect.

3. Search for the Amazon Redshift connector and fill out the form with the following information:

  • Name – The name of the connector in Toucan.

  • Host – your Redshift Serverless endpoint.

  • Port – The listening port (5439) of your Amazon Redshift database.

  • Default Database – The name of the database to connect to (defaults to dev unless edited in the CloudFormation stack parameters).

  • Authentication Method – The authentication mechanism used to connect to Redshift Serverless. In this case we use database credentials.

  • User – The username used to authenticate to Redshift Serverless (defaults to admin unless edited in the CloudFormation stack parameters).

  • Password – The password used to authenticate to Redshift Serverless (you should retrieve this from Secrets Manager; the name of the secret is AWS-Toucan-Redshift-Password).

da6448f315414a873ac7696ccc18dffe.png
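To avoid copying credentials by hand, the values for the form above can be gathered with a short script. This is a minimal sketch: it assumes the secret value is a JSON document with `username` and `password` keys, which is not confirmed by this article, and the host in the usage comment is a placeholder rather than a real endpoint.

```python
import json


def fetch_secret_string(client, secret_id: str = "AWS-Toucan-Redshift-Password") -> str:
    """Read the raw secret value using a Secrets Manager client."""
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


def connector_fields(host: str, secret_string: str, name: str = "redshift-serverless",
                     database: str = "dev", port: int = 5439) -> dict:
    """Map the stack outputs and secret onto the fields of the Toucan connector form."""
    secret = json.loads(secret_string)  # assumed shape: {"username": ..., "password": ...}
    return {
        "Name": name,
        "Host": host,
        "Port": port,
        "Default Database": database,
        "Authentication Method": "Database credentials",
        "User": secret["username"],
        "Password": secret["password"],
    }


# Usage sketch (requires boto3 and AWS credentials; the host is a placeholder):
#   import boto3
#   secret_string = fetch_secret_string(boto3.client("secretsmanager"))
#   fields = connector_fields(
#       "example-workgroup.123456789012.us-east-1.redshift-serverless.amazonaws.com",
#       secret_string,
#   )
```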

Create Live Query

You are now connected to Redshift Serverless. Complete the following steps to create a query:

1. On the home page, select Add tile to create a new visualization.

bd6ccee564d704eac6334899dc5317e3.png

2. Choose the Live Connections tab, and choose the Amazon Redshift connector you created in the previous step.

a189d334851ae38a64b8817bed16e98a.png

The Toucan trial will guide you through building your first live query, where you can use the Toucan YouPrep module to transform your data without writing code.

For example, as shown in the following screenshot, you can use this no-code interface to sum the number of significant shark attacks by activity, keep the top five, and then calculate each activity’s percentage of the total.

515a1af924b2d7ca2d2340122babf5e0.png
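For readers who think in SQL, the transformation above corresponds roughly to the query in the sketch below. It runs against an in-memory SQLite table with made-up rows so that it works without a Redshift connection; a similar query should also be valid on Redshift. Two assumptions are made here: the percentage is computed relative to the top-five total, and the filter on significant attacks is left out because the exact column it relies on is not certain from this article.

```python
import sqlite3

# Made-up sample rows, for illustration only; not taken from the real dataset.
SAMPLE_ACTIVITIES = (
    ["Surfing"] * 5 + ["Swimming"] * 4 + ["Fishing"] * 3
    + ["Diving"] * 2 + ["Wading"] + ["Kayaking"]
)

TOP_FIVE_SQL = """
WITH by_activity AS (
    SELECT activity, COUNT(*) AS attacks
    FROM shark_attacks
    GROUP BY activity
),
top_five AS (
    SELECT activity, attacks
    FROM by_activity
    ORDER BY attacks DESC, activity
    LIMIT 5
)
SELECT activity,
       attacks,
       ROUND(100.0 * attacks / (SELECT SUM(attacks) FROM top_five), 1) AS pct
FROM top_five
ORDER BY attacks DESC, activity;
"""


def top_five_activities(activities):
    """Load the activities into SQLite and return (activity, attacks, pct) rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE shark_attacks (activity TEXT)")
        conn.executemany(
            "INSERT INTO shark_attacks (activity) VALUES (?)",
            [(activity,) for activity in activities],
        )
        return conn.execute(TOP_FIVE_SQL).fetchall()
    finally:
        conn.close()


for activity, attacks, pct in top_five_activities(SAMPLE_ACTIVITIES):
    print(f"{activity:<10} {attacks:>3} {pct:>5}%")
```

Ties are broken alphabetically by activity so that the result is deterministic.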

Build your first chart

Once the data is ready, select the Tile tab and fill out the form that helps you build the chart.

For example, you can configure a leaderboard of the five most dangerous activities and add a highlighting effect to activities with more than 100 attacks.

Select Save Changes to save your work and return to the home page.

7a94027d6fd92661c3c32acc52827066.png
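The logic behind such a leaderboard is simple enough to sketch in a few lines. The function below is not Toucan's API; it only illustrates the ranking and highlighting rule described above, using made-up counts.

```python
def build_leaderboard(counts: dict, top_n: int = 5, highlight_above: int = 100) -> list:
    """Rank activities by attack count and flag those above the highlight threshold."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]
    return [
        {
            "rank": rank,
            "activity": activity,
            "attacks": attacks,
            "highlight": attacks > highlight_above,
        }
        for rank, (activity, attacks) in enumerate(ranked, start=1)
    ]


# Made-up counts, for illustration only.
sample_counts = {
    "Surfing": 240, "Swimming": 180, "Fishing": 95,
    "Diving": 60, "Wading": 30, "Kayaking": 12,
}
for entry in build_leaderboard(sample_counts):
    marker = "*" if entry["highlight"] else " "
    print(f'{entry["rank"]}. {marker} {entry["activity"]} ({entry["attacks"]})')
```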

Publish and share your work

Until this stage, you have been working in Staging mode. For everyone to see your work, you need to publish it to Production.

In the lower right corner of the home page, select the eye icon to preview your work from the perspective of a future end user. You can then select Publish to make your work available for everyone to see.

e48e242f90f7fa46bc099e57d707d466.png

Toucan also offers several embedding options to make your charts more accessible to end users (for example, on mobile phones and tablets).

5d60b4bbce333ab1b98ba303ed9375a2.png

By following these steps, you have connected to Redshift Serverless, transformed data using the Toucan no-code interface, and built data visualizations for business end users. The Toucan trial guides you at every stage of the process to help you get started.

Redshift Serverless and Toucan guided analytics provide an effective way to increase adoption of BI tools by reducing infrastructure work for data engineers and making dashboards easier to understand for business end users. This article covers only a small portion of what Redshift Serverless and Toucan have to offer, so feel free to explore other features in the Amazon Redshift Serverless documentation and Toucan documentation.

Cleanup

Some of the resources deployed through CloudFormation templates in this article incur charges when used. Please be sure to remove these assets and clean up your work when finished to avoid unnecessary charges.

On the CloudFormation console, select Delete stack to delete all resources.
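If you prefer to script the cleanup, the sketch below deletes the stack and waits for the deletion to finish. The client is passed in as a parameter; the default stack name `aws-toucan` is the one pre-filled by the Launch Stack link and should be adjusted if you chose another name.

```python
def delete_stack_and_wait(client, stack_name: str = "aws-toucan") -> None:
    """Delete a CloudFormation stack and block until the deletion has completed."""
    client.delete_stack(StackName=stack_name)
    # The waiter polls the stack status and raises if the deletion fails.
    client.get_waiter("stack_delete_complete").wait(StackName=stack_name)


# Usage sketch (requires boto3 and AWS credentials):
#   import boto3
#   delete_stack_and_wait(boto3.client("cloudformation"))
```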

Summary

This article shows you how to set up an end-to-end architecture for guided analytics using Redshift Serverless and Toucan.

The solution benefits from the scalability of Redshift Serverless, allowing you to cost-effectively store, transform, and expose data without having to manage any infrastructure. Redshift Serverless is natively integrated with Toucan, a guided analytics tool available to anyone on any device.

Guided analytics focuses on communicating stories through data reporting. By setting intentional constraints on custom options, Toucan makes it easy for data owners to build meaningful dashboards that convey clear and concise information to end users. It works for your internal and external customers, and for an unlimited number of usage scenarios.

Get started today with our CloudFormation templates and free Toucan trial!

Original URL:

https://aws.amazon.com/blogs/big-data/build-a-data-storytelling-application-with-amazon-redshift-serverless-and-toucan/

About the authors

aa5b5f47171f0565b305df8d02e0332f.png

Louis Hourcade

Data Scientist on the Amazon Professional Services team. He works with Amazon customers across a variety of industries to help them accelerate business results through innovative technologies. In his spare time, he enjoys running, bouldering, and surfing (as long as the waves are not too big).

f73320a017b264a9aacc9374af5771ec.png

Benjamin Menuet

Data Architect at Amazon Professional Services. He helps clients develop big data and analytics solutions to accelerate business results. Outside of work, Benjamin is a trail runner and has completed some very prestigious races, such as the UTMB.

0f84167e09048b2c08510e7f048e8c7a.png

Xavier Naunay

Data Architect at Amazon Professional Services. He is a member of the Amazon ProServe team, helping enterprise customers solve complex problems using Amazon services. In his free time, he is either traveling or learning about technology and other cultures.

d3630231abd6877d9bc7177fa7d3b4a8.png

Django Bouchez

Toucan Solutions Engineer. He works with the sales team to provide support on technical and functional verification and certification, and also helps R&D demonstrate new features with cloud partners such as Amazon. Outside of work, Django enjoys homebrewing, scuba diving, and sport climbing.
