Use Apache Superset to query and visualize datasets loaded in Spice.
Apache Superset is a modern, enterprise-ready business intelligence web application. It is fast, lightweight, intuitive, and loaded with options that make it easy for users of all skill sets to explore and visualize their data, from simple pie charts to highly detailed deck.gl geospatial charts.
(Video: Connecting Spice.AI to Apache Superset)
This recipe shows how to configure a BI dashboard (Apache Superset) to use Spice as the data source for sales data stored in a Parquet file on Amazon S3.

Spice.ai can accelerate data from connected data sources by keeping an automatically updated copy of the data in an optimized local format. This accelerated data can power a BI dashboard that refreshes quickly.
This recipe requires Docker and Docker Compose to be installed.
Superset requires a Python DB API 2 database driver and a SQLAlchemy dialect for each connected datastore. Spice implements a Flight SQL server that understands the DataFusion SQL dialect; the `flightsql-dbapi` library for Python provides the required DB API 2 driver and SQLAlchemy dialect.
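As a rough sketch, connecting any SQLAlchemy client (Superset included) to Spice via `flightsql-dbapi` comes down to building a connection URI for the Flight SQL endpoint. The `datafusion+flightsql` scheme, the `spiceai` hostname (the Docker Compose service name), and port 50051 are assumptions based on this recipe's setup; verify them against your deployment.

```python
# Minimal sketch of the SQLAlchemy connection URI used to point Superset at Spice.
# The scheme, host, and port below are assumptions for this recipe's Docker
# Compose setup, not universal defaults.
host = "spiceai"  # Docker Compose service name of the Spice runtime (assumed)
port = 50051      # Spice Flight SQL port (assumed)
uri = f"datafusion+flightsql://{host}:{port}?insecure=True"

# With flightsql-dbapi installed, an engine could then be created like:
#   from sqlalchemy import create_engine
#   engine = create_engine(uri)
print(uri)
```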
This recipe consists of a Docker Compose file with the following components:
- A Spice runtime accelerating the data from the parquet file in S3.
- An Apache Superset instance to visualize the data, connected to the Spice instance.
Clone the `spiceai/cookbook` repository and navigate to the `sales-bi` directory:

```bash
git clone https://github.com/spiceai/cookbook.git
cd cookbook/sales-bi
```
Run the following command to start the components in the Docker Compose file:

```bash
make
```
This will start the Spice runtime and Apache Superset. The Spice runtime loads two datasets from the same Parquet file in S3, one accelerated and one not:

```yaml
version: v1
kind: Spicepod
name: sales-bi
datasets:
  # Accelerated dataset: Spice keeps a local, optimized copy of the data.
  - from: s3://spiceai-demo-datasets/cleaned_sales_data.parquet
    name: cleaned_sales_data_accelerated
    acceleration:
      enabled: true
      refresh_check_interval: 10s  # re-check the source every 10 seconds
      refresh_mode: full           # replace the whole local copy on each refresh
  # Federated dataset: every query is forwarded to the Parquet file in S3.
  - from: s3://spiceai-demo-datasets/cleaned_sales_data.parquet
    name: cleaned_sales_data
```
Queries against `cleaned_sales_data` will always request data from the Parquet file in S3, while queries against `cleaned_sales_data_accelerated` will run against the locally accelerated copy.
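The latency gap between the two datasets can be measured with a small timing harness like the sketch below. The `run_query` callable is a placeholder for however you execute SQL against Spice (for example, through a `flightsql-dbapi`/SQLAlchemy connection); all names here are illustrative, not part of the recipe.

```python
import time


def time_query(run_query, sql):
    """Execute `sql` via the supplied callable and return (rows, elapsed_seconds).

    `run_query` is a placeholder for any function that sends SQL to Spice
    and returns the result rows.
    """
    start = time.perf_counter()
    rows = run_query(sql)
    elapsed = time.perf_counter() - start
    return rows, elapsed


# Against a live Spice runtime this might look like (hypothetical `execute`):
#   _, t_fed = time_query(execute, "SELECT COUNT(*) FROM cleaned_sales_data")
#   _, t_acc = time_query(execute, "SELECT COUNT(*) FROM cleaned_sales_data_accelerated")
#   print(f"federated: {t_fed:.3f}s, accelerated: {t_acc:.3f}s")
```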
The output of the `make` command should look like:

```
✔ Container superset-sales-bi-demo Started 0.0s
✔ Container spiceai-sales-bi-demo Started 0.0s
Connection to localhost port 8088 [tcp/radan-http] succeeded!
Initializing Superset...
Superset is running at http://localhost:8088, login with admin/admin
```
Navigate to http://localhost:8088 to access the Apache Superset dashboard and log in with the credentials `admin`/`admin`.
Once logged in, navigate to the Sales Dashboard (Federated) to view the sales data, queried against the non-accelerated dataset. Click on a product line, e.g. Vintage Cars, to view the sales data for that product line.
Notice that the dashboard takes a few seconds to load: the data is being queried directly from the Parquet file in S3.
Navigate to the Sales Dashboard (Accelerated) to view the same sales data queried against the accelerated dataset. Notice that the dashboard feels much more responsive.
In the top navigation bar, hover over the SQL menu and click SQL Lab to open a query editor. Explore the data in the Spice.ai runtime by running SQL queries, e.g. `SELECT * FROM cleaned_sales_data LIMIT 10` or `SELECT * FROM cleaned_sales_data_accelerated LIMIT 10`.
To view the connection details for the Spice.ai runtime, hover over the Settings menu and click Database Connections, then hover over the Spice.ai entry and click the pencil icon. The connection details for the Spice.ai runtime are shown.
In addition to Apache Superset, queries can be run using the Spice SQL REPL to explore the data in the Spice.ai runtime:
```bash
docker exec -it spiceai-sales-bi-demo spiced --repl
```

```
Welcome to the interactive Spice.ai SQL Query Utility! Type 'help' for help.

sql> show tables; -- list available tables
+---------------+--------------+--------------------------------+------------+
| table_catalog | table_schema | table_name                     | table_type |
+---------------+--------------+--------------------------------+------------+
| spice         | runtime      | task_history                   | BASE TABLE |
| spice         | public       | cleaned_sales_data             | BASE TABLE |
| spice         | public       | cleaned_sales_data_accelerated | BASE TABLE |
+---------------+--------------+--------------------------------+------------+

Time: 0.01025875 seconds. 3 rows.

sql> select order_date, sales from cleaned_sales_data limit 3;
+---------------------+---------+
| order_date          | sales   |
+---------------------+---------+
| 2003-02-24T00:00:00 | 2871.0  |
| 2003-05-07T00:00:00 | 2765.9  |
| 2003-07-01T00:00:00 | 3884.34 |
+---------------------+---------+

Time: 0.610236 seconds. 3 rows.

sql> select order_date, sales from cleaned_sales_data_accelerated limit 3;
+---------------------+---------+
| order_date          | sales   |
+---------------------+---------+
| 2003-02-24T00:00:00 | 2871.0  |
| 2003-05-07T00:00:00 | 2765.9  |
| 2003-07-01T00:00:00 | 3884.34 |
+---------------------+---------+

Time: 0.006600542 seconds. 3 rows.
```

Note the reported query times: the federated query took ~0.61 seconds, while the same query against the accelerated copy took ~0.007 seconds.
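Queries can also be scripted against Spice's HTTP SQL API. The helper below only builds the request so it can be inspected without a running container; the `/v1/sql` endpoint path and port 8090 are assumptions based on Spice's HTTP API defaults and should be verified against your deployment.

```python
import urllib.request


def build_sql_request(sql, base_url="http://localhost:8090"):
    """Build an HTTP request for Spice's SQL endpoint.

    Assumes the runtime accepts a plain-text SQL body POSTed to /v1/sql on
    port 8090 (Spice's documented HTTP API defaults); adjust as needed.
    """
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


# With the Docker Compose stack running, the request could be sent like:
#   req = build_sql_request("SELECT COUNT(*) FROM cleaned_sales_data")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```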
To stop and remove the Docker containers and volumes that were created, run:

```bash
make clean
```