How to get up and running with PredictHQ's event data using Snowflake in minutes

Peter Jansen
Head of Product

Snowflake users can access PredictHQ data quickly and seamlessly through the Snowflake Secure Data Share. This is one of the easiest ways to get our demand intelligence into your data lake or machine learning models with minimal tech team time. This blog post walks you through the four steps, which take only a few minutes.

Why access our data through Snowflake’s Secure Data Share?

Snowflake Secure Data Share enables companies to access and share PredictHQ’s data in a controlled and efficient way. You access a Secure Share of our events through your familiar, high-performance SQL interface, and work with an up-to-date, clean, and complete set of PredictHQ’s data.

This means you can immediately incorporate the data into your models, removing or greatly reducing the need for ELT/ETL processes to pull event data into your data warehouse. You can check out the Introduction to Secure Data Sharing page if you'd like to read more about Snowflake's Secure Data Sharing.

The four-step process for accessing PredictHQ data via Snowflake’s Secure Data Share

The steps are:

  1. Log into your Snowflake account and pick which PredictHQ data sample you would like to access.

  2. Create a database.

  3. Access and query the data for your project or test.

  4. Reach out to the PredictHQ team for questions, support or further data access.

Step 1: Log into your Snowflake account and pick which PredictHQ data sample you would like to access

Once you are in your Snowflake account, you will be able to choose from three sample data sets. You can find out more about the key terms for accessing PredictHQ in Snowflake here.

  1. Attended events in Seattle: Sports, concerts, festivals, expos, conferences, performing arts and community events

    1. This includes one year of historical data and 30 days of future-facing data.

    2. This data enables companies to begin to understand the impact of events on their demand.

    3. Our customers in accommodation, transport, retail (grocery and QSR) and more find this to be a highly impactful event type, enabling better staffing, stocking and pricing.

    4. See the Snowflake listing here.

  2. Non-Attended events in Seattle: Public holidays, school holidays, academic dates, politics and daylight savings.

    1. This includes one year of historical data and 30 days of future-facing data.

    2. These events are high impact and vary significantly by school district and tertiary institution, and by city for public holidays. Including them is critical to understanding how events impact demand.

    3. This event type is impactful for all our customers, but especially those in accommodation, transport and retail.

    4. See the Snowflake listing here.

  3. Unscheduled events in California: Severe weather warnings, natural disasters, public health warnings and terror attacks

    1. This includes one year of historical data and 30 days of future-facing data.

    2. Apart from severe weather events, these events can’t be used in forward-looking demand forecasting or planning, because they are by nature unscheduled, breaking events. But large companies use them to investigate the impact of such events on historical demand, so they can gauge that impact and have data-driven response plans ready to go when similar events occur.

    3. See the Snowflake listing here.

Step 2: Create a database within Snowflake

Snowflake’s Secure Data Sharing is designed so that all users need to do is create a read-only database within Snowflake to access PredictHQ’s sample demand intelligence. No data is transferred or copied to the user, so it doesn’t take up bandwidth or cost the trial user anything beyond the compute resources used within Snowflake.

Snowflake describes this step in the process as:

“Shares are named Snowflake objects that encapsulate all of the information required to share a database. Each share consists of:

  • The privileges that grant access to the database(s) and the schema containing the objects to share.

  • The privileges that grant access to the specific objects in the database.

  • The consumer accounts with which the database and its objects are shared.

Once a database is created (in a consumer account) from a share, all the shared objects are accessible to users in the consumer account.”

Once you have added a shared database, you can begin exploring and experimenting with PredictHQ’s event data.
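In SQL terms, creating a database from a share comes down to a single `CREATE DATABASE ... FROM SHARE ...` statement. The sketch below only builds that statement as a string, so you can see its shape before running it in a worksheet. The database name, provider account and share name are hypothetical placeholders, not PredictHQ's real identifiers; take the actual values from the Snowflake listing. If you get the data through the listing page, Snowflake may create the database for you, in which case you won't need to run this yourself.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore first,
# then letters, digits, underscores or dollar signs.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def create_database_from_share_sql(database: str, provider_account: str, share: str) -> str:
    """Build the statement that creates a read-only database from a share."""
    for name in (database, provider_account, share):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a simple unquoted identifier: {name!r}")
    return f"CREATE DATABASE {database} FROM SHARE {provider_account}.{share};"


# All three names are made-up placeholders for illustration only.
statement = create_database_from_share_sql(
    "PHQ_EVENTS_SAMPLE", "EXAMPLE_PROVIDER", "EXAMPLE_SHARE"
)
print(statement)
```

The identifier check is deliberately strict: it rejects anything that is not a plain unquoted name, which keeps the sketch from producing a malformed statement if a value is pasted in with spaces or quotes.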

Step 3: Access and query the data for your project or test

While you can only create one database of PredictHQ’s data (one database per share), you can use the supplied data in many ways.

We prioritized sharing a year of historical data, as most companies begin their analyses with this. Here are some common starting points that companies find useful:

  • Filter the Attended Events sample by category, and focus on expos and conferences. These are mostly high-impact events, as they tend to be large and often bring many people who wouldn’t otherwise be there into a city or state (and, once international travel has returned, country), driving spikes in demand.

  • Begin with the Attended Events sample, and filter these by the location of your stores (all events in PredictHQ’s database are geolocated with verified latitude and longitude). Identify which locations have more events nearby. We recommend you also filter using PHQ_Attendance to identify impactful events. Most of our customers filter out events with fewer than 300 attendees, while some use a lower threshold of 100 (rideshare and mobility companies find these smaller events still impactful).

  • Use the Non-Attended events sample and focus on Academic Dates. Identify which of your locations are near colleges and universities, and analyze how these locations are impacted by session dates, break periods and key events.

  • Using the Unscheduled events sample, analyze your data to see if significant demand anomalies (increased or decreased demand) correlate with severe weather events, to begin uncovering impact patterns you can act on when similar events occur.
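To make the second starting point concrete, here is a minimal Python sketch of filtering events by attendance and by distance from a store, once you have pulled rows out of the shared database. The field names (`phq_attendance`, `lat`, `lon`), the 5 km radius and the sample rows are all assumptions invented for illustration; only the 300-attendee threshold comes from the text above. Check the actual column names in the sample you are using.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def impactful_events_near(store, events, min_attendance=300, radius_km=5.0):
    """Keep events that meet the attendance threshold and fall within the radius."""
    return [
        event
        for event in events
        if event["phq_attendance"] >= min_attendance
        and haversine_km(store["lat"], store["lon"], event["lat"], event["lon"]) <= radius_km
    ]


# Made-up rows for illustration; these are not real PredictHQ records.
store = {"name": "Downtown store", "lat": 47.6062, "lon": -122.3321}
events = [
    {"title": "Large expo", "phq_attendance": 5000, "lat": 47.6120, "lon": -122.3316},
    {"title": "Small meetup", "phq_attendance": 80, "lat": 47.6070, "lon": -122.3330},
    {"title": "Distant concert", "phq_attendance": 12000, "lat": 47.2529, "lon": -122.4443},
]

nearby = impactful_events_near(store, events)
print([event["title"] for event in nearby])
```

Running the same function once per store and counting the results is one simple way to see which locations have more impactful events nearby. Lowering `min_attendance` to 100 mirrors the smaller threshold that rideshare and mobility companies tend to use.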

Step 4: Reach out to the PredictHQ team for questions, support or further data access

The above examples are only starting points. You probably accessed the data with a specific goal in mind, and our team is here to help you make the most of it.

We work with companies of all sizes, across most industries. This means we have probably assisted teams working on projects like yours before, and can save you hours (or weeks and even months) as you identify the role events play in your demand and how to make the most of it.

Get in touch with our team for support and to expedite your experiments and access to demand intelligence today.