Feature engineering guide available in PredictHQ technical docs

Published on October 5, 2022
Valerie Williams
Content Marketing Manager

The goal of the Feature Engineering notebook is to give data science teams a step-by-step guide and hands-on experience querying different event-based features from the PredictHQ Features API. This guide outlines recommended features per event category with clear and simple examples. Data science teams can use these examples to create features easily and include them in their own demand forecasting models or any other applicable models.

Building machine learning features for the forecast is essential

Features represent the impact of events on a given location and can be used to train a model so it learns how events affect a customer's demand. For example, the Features API returns a feature called phq_attendance_sports. With the sum aggregation, this feature gives the total number of people attending sports events in the selected location (for example, within 5 miles of your store or hotel), which shows the impact of sports events on that location. Use features for several categories so your model can learn how each category of event affects your location.
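
To make the sum aggregation concrete, here is a minimal local sketch of what an attendance total within a radius represents. It does not call the Features API: the Event record, the coordinates, and the attendance numbers are all made up for illustration, and the sketch ignores dates and any filtering the API may apply on its side.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    """Hypothetical event record, not a PredictHQ response object."""
    category: str
    lat: float
    lon: float
    attendance: int


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def attendance_sum(events, category, lat, lon, radius_miles):
    """Total attendance for one category within radius_miles of (lat, lon)."""
    return sum(
        e.attendance
        for e in events
        if e.category == category
        and haversine_miles(lat, lon, e.lat, e.lon) <= radius_miles
    )


# Made-up store location and events for a single day.
STORE = (40.7580, -73.9855)
EVENTS = [
    Event("sports", 40.7505, -73.9934, 18000),   # under a mile from the store
    Event("sports", 40.9500, -73.9900, 40000),   # more than 10 miles away
    Event("concerts", 40.7590, -73.9845, 3000),  # close by, different category
]

sports_5mi = attendance_sum(EVENTS, "sports", *STORE, radius_miles=5)
print(sports_5mi)
```

Only the first sports event falls inside the 5-mile radius, so the value a model would see for this location and day is that event's attendance. Widening the radius pulls in the second event, which is why the choice of radius matters.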

Previously, our documentation on how to build features was spread over different notebooks and technical docs. We've consolidated it into a single guide to make it easy for you to get up and running with our data using the Features API. 

Use the feature engineering guide to build basic features from the Features API for:

  • Attended events (events such as concerts, sports games, conferences, and more) 

  • Non-attended events (such as public holidays, observances and more) 

  • Severe weather events (such as floods, hurricanes, and more) 
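
As a rough illustration of how features from more than one of these categories might be requested together, the sketch below only builds a request body and prints it; it does not send anything. Apart from phq_attendance_sports, every field and feature name here ("active", "location", "geo", "stats", phq_attendance_concerts, phq_rank_public_holidays) is an assumption about the request shape and should be checked against the Features API reference before use. Severe weather features are left out because their names are not given in this post.

```python
import json


def build_features_request(lat, lon, radius_miles, start, end):
    """Build a request body for one location and date range.

    The structure below is an assumed shape, not a confirmed schema.
    """
    return {
        # Assumed: date range the features should cover (ISO dates).
        "active": {"gte": start, "lte": end},
        # Assumed: a point plus a radius such as "5mi".
        "location": {
            "geo": {"lat": lat, "lon": lon, "radius": f"{radius_miles}mi"}
        },
        # Attended events: ask for the sum aggregation of attendance.
        "phq_attendance_sports": {"stats": ["sum"]},
        "phq_attendance_concerts": {"stats": ["sum"]},  # assumed feature name
        # Non-attended events: assumed to be rank-based rather than summed.
        "phq_rank_public_holidays": True,  # assumed feature name
    }


body = build_features_request(40.7580, -73.9855, 5, "2022-10-01", "2022-10-31")
print(json.dumps(body, indent=2))
```

Keeping the body in a small helper like this makes it easy to reuse the same location and date range while adding or removing features per category.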

Over time, we'll add more features to this guide, including features for academic events and other categories, as well as guidance on how to determine what radius to use around specific locations using the simple suggested radius labs API.

The guide can be found under the data science section of our tech docs.