Coronavirus Recovery: How We’re Calculating Our COVID-19 Recovery Rate and Ensuring Accurate Predicted Attendance for Conferences, Sports Games and More

Dr. Xuxu Wang
Chief Data Officer

China, South Korea and New Zealand are exiting their lockdowns, and many more countries are heading steadily towards easing restrictions. As restrictions ease, suppressed demand will begin to be released for many companies.

Yet we all know the return of demand will be fragmented and varied, with new catalysts to uncover along the way, which makes demand planning key. Data scientists all over the world are hard at work in coronavirus recovery teams, trying to identify the rate at which their company’s demand may recover, while their strategy peers identify additional levers they can use to drive up demand directly.

Identifying your coronavirus demand recovery rate is complex. I wanted to share how my team is approaching it in case it would be useful for your recovery team.

Why companies need a coronavirus recovery rate

Building the data science capability to identify and iterate on coronavirus recovery rates is essential. For us, once our systems aggregate and verify millions of events, our unique models rank them all by predicted impact. While we have the world’s deepest database of historical events, it is very unlikely a conference that used to draw 15,000 people will return at the same scale immediately. So we need to adjust our predicted attendance and our impact rankings models by adding a recovery ratio.
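To make the idea of a recovery ratio concrete, here is a minimal sketch, assuming the ratio is a single number between 0 and 1 that scales a pre-pandemic attendance figure. The function name, the 0-to-1 range and the 0.4 value are illustrative assumptions, not details of the production models.

```python
def adjusted_attendance(historical_attendance: int, recovery_ratio: float) -> int:
    """Scale a pre-pandemic attendance figure by a recovery ratio.

    Assumption for this sketch: the ratio lies in [0, 1], where 1 means
    the event returns at its full historical scale.
    """
    if historical_attendance < 0:
        raise ValueError("historical_attendance must be non-negative")
    if not 0.0 <= recovery_ratio <= 1.0:
        raise ValueError("recovery_ratio must be between 0 and 1")
    return round(historical_attendance * recovery_ratio)


# A conference that used to draw 15,000 people, with a hypothetical ratio of 0.4.
print(adjusted_attendance(15_000, 0.4))  # 6000
```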

This is important because our customers use the combined impact of our event data. Rather than focusing only on the largest events (as identified by our rankings, which are log scaled from 0 to 100), many will look at all events, large and small, happening in aggregate on a specific day in their company’s key locations. This aggregate event impact enables teams to quickly understand demand on each day and match their strategies to future demand, for example with smarter stocking or staffing decisions.
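One way a team might build such a daily aggregate is sketched below. It assumes each event record carries a day, a location and a predicted attendance, and it sums attendance rather than rankings, because adding log-scaled scores together would understate the largest events. The field names and sample figures are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records; the field names are assumptions for this sketch.
events = [
    {"day": date(2020, 6, 1), "location": "Las Vegas", "attendance": 6000},
    {"day": date(2020, 6, 1), "location": "Las Vegas", "attendance": 250},
    {"day": date(2020, 6, 1), "location": "San Francisco", "attendance": 400},
    {"day": date(2020, 6, 2), "location": "Las Vegas", "attendance": 1200},
]


def aggregate_by_day(records):
    """Sum predicted attendance for every (day, location) pair."""
    totals = defaultdict(int)
    for record in records:
        totals[(record["day"], record["location"])] += record["attendance"]
    return dict(totals)


daily_totals = aggregate_by_day(events)
print(daily_totals[(date(2020, 6, 1), "Las Vegas")])  # 6250
```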

This article focuses mostly on attended events – but we also track, verify and rank impactful events that are still taking place regardless of the lockdown orders, such as public holidays, observances, school holidays and closures, natural disasters, severe weather and terrorism. All of these events create demand impact, whether it’s incremental, decremental or suppressed demand, and can cause sudden and often unexpected spikes in demand. Tracking and preparing for suppressed demand is of particular importance right now, as many businesses are experiencing huge surges in demand as restrictions lift. As these restrictions vary considerably by state in the US and by country more broadly, we have created a new feature to track the status of these restrictions at scale.

Many of our customers ingest our data directly into their demand forecasting models to significantly improve their accuracy, so we needed to get it right. During the coronavirus recovery, we need a robust, iterative and reliable way to update our event rankings at scale. We couldn’t rely on the attendance figures from many of our event sources, as those are usually based on venue capacity. Assuming every event reaches maximum capacity was misleading even before the pandemic, which is why we draw on so many more factors from our knowledge graph to calculate our rankings.

Our customers need our updated intelligent event data to inform their own recovery rate identification and iteration. Many companies will need to think about workforce optimization strategies and decide whether to re-hire or retrain staff to meet demand, as well as engage their supply chains before demand returns to ensure their inventory levels are optimized.

The challenges of identifying a coronavirus recovery rate

The impact of COVID-19 is unprecedented and varies by country, state and industry. In total, we are adding well over 50 functions to our models to keep our event impact rankings accurate in the post-COVID-19 era. This breadth is required because we track many different kinds of events, and there is a range of factors that will impact demand.

Government restrictions and flight bans will be easier for our systems to track than, say, people’s willingness to spend or to attend events without a vaccine.

Human factors, such as fear or hesitancy to attend events, or the impact of the economic downturn on people’s ability and willingness to attend them, will only become knowable as markets open, so your models will need to update rapidly based on new information. This will affect not only attended events such as conferences and sports games, but also non-attendance based events, such as how people celebrate public holidays, observances and school holidays.

The impact of COVID-19 on events will vary substantially. Smaller attended events are coming back faster, such as local concerts or fun runs. Events attended by a lot of international visitors may look very different for a while, as international travel is likely to be low for some time.

These are all complex factors to decompose into discrete problems that models can solve. And this is only the tip of the iceberg: we are building new models to source, verify and incorporate substantial amounts of new data such as public transport data, trends data and much more. While almost every company out there is watching its spending, chief data officers shouldn’t be reluctant to invest in high-quality data sources – this is exactly the time when your strategies need to be data-driven to filter out all of the noise.

Identifying a recovery ratio for each event subgroup in each category at a state or country level

Like many businesses, we offer a range of products or types. In our case, we offer 19 categories of events and they have been impacted by the novel coronavirus in different ways. For example, some people mistakenly claim that events aren’t happening. While it is true scheduled attended events such as conferences, concerts and sports games are postponed with a small proportion cancelled, other events that impact demand continue.

This includes school closures and holidays, public holidays and observances, as well as unscheduled events that impact demand such as severe weather, terrorism or natural disasters. It also increasingly includes community events such as farmers markets, which are continuing, albeit in a more socially distanced way for the time being. These events can cause incremental, displaced or decremental demand and should be tracked.

These complexities are why we have a custom-built ranking for each event category. It’s also why we needed to invest time into creating models that could identify each event category’s specific recovery ratio starting point, paying particular attention to which events were driven by international attendees.

Fundamental to identifying impact is building models that can accurately identify and sort every attended event into one of three buckets: mostly domestic attendees, mostly international attendees and events with a good mix of both. Our proprietary entities system and extensive verified events metadata mean we are able to sort millions of events by the percentage of international attendees. This is critical because international travel bans are likely to remain for some time and the airline industry has been hit particularly hard, so we anticipate that events attended mostly by international visitors will take longer to recover.
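The three-bucket sort can be sketched as a simple threshold rule on an event’s share of international attendees. The 20% and 60% cut-offs and the bucket labels below are assumptions chosen purely for illustration; the article does not state the actual thresholds.

```python
def attendee_bucket(international_share: float,
                    domestic_max: float = 0.2,
                    international_min: float = 0.6) -> str:
    """Assign an event to one of three buckets by its international share.

    The thresholds are hypothetical defaults, not values from the article.
    """
    if not 0.0 <= international_share <= 1.0:
        raise ValueError("international_share must be between 0 and 1")
    if international_share <= domestic_max:
        return "mostly_domestic"
    if international_share >= international_min:
        return "mostly_international"
    return "mixed"


print(attendee_bucket(0.05))  # mostly_domestic
print(attendee_bucket(0.40))  # mixed
print(attendee_bucket(0.80))  # mostly_international
```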

Even then, not all events attended mostly by international visitors will recover at the same rate, so we are building recovery rates per event subgroup for each state and country to ensure our rankings and predicted attendance are accurate.
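A per-subgroup, per-region recovery rate could be held in a lookup table that falls back from state level to country level and then to a default. This is a minimal sketch under that assumption; the subgroup names, the ratios and the fallback order are all hypothetical.

```python
from typing import Optional

# Hypothetical ratios keyed by (subgroup, country, state).
# A state of None marks a country-wide value.
RECOVERY_RATIOS = {
    ("conferences/mostly_international", "US", "CA"): 0.10,
    ("conferences/mostly_international", "US", None): 0.15,
    ("concerts/mostly_domestic", "NZ", None): 0.70,
}
DEFAULT_RATIO = 0.25  # assumed fallback when nothing more specific is known


def recovery_ratio(subgroup: str, country: str, state: Optional[str] = None) -> float:
    """Return the most specific ratio available: state, then country, then default."""
    for key in ((subgroup, country, state), (subgroup, country, None)):
        if key in RECOVERY_RATIOS:
            return RECOVERY_RATIOS[key]
    return DEFAULT_RATIO


print(recovery_ratio("conferences/mostly_international", "US", "CA"))  # 0.1
print(recovery_ratio("conferences/mostly_international", "US", "NV"))  # 0.15
```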

By contrast, events made up of mostly domestic attendees, such as community events but also massive sports events and concerts, are likely to recover earlier. We will be carefully tracking event limits and including these in our models. We are also on track to launch a new event category of TV events soon, for the many businesses that are impacted by televised sports games, whether they receive a surge in demand (such as grocery, CPG and home delivery businesses) or a wave of decremental demand (such as restaurants).

It is also worth noting that for attended events such as conferences and expos, we anticipate many larger events may move to territories with higher attendee limits. For example, if a recovered San Francisco has a lower maximum attendee limit than Las Vegas, we may see a shift of events into Las Vegas for the recovery.
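Including event limits in a model can be as simple as capping predicted attendance at the local maximum. The sketch below assumes one limit per city and leaves attendance unchanged where no limit is known; the limits shown are invented for illustration and are not real regulations.

```python
# Hypothetical per-city attendee limits, not real regulations.
ATTENDEE_LIMITS = {"San Francisco": 1_000, "Las Vegas": 5_000}


def capped_attendance(predicted: int, city: str) -> int:
    """Cap predicted attendance at the city's limit, if a limit is known."""
    limit = ATTENDEE_LIMITS.get(city)
    return predicted if limit is None else min(predicted, limit)


# The same 6,000-person expo is capped differently in each city.
print(capped_attendance(6_000, "San Francisco"))  # 1000
print(capped_attendance(6_000, "Las Vegas"))      # 5000
```

Under these assumed limits, the gap between the two capped figures is the kind of signal that could prompt an organizer to relocate.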

Coronavirus recovery rate models need to be updated constantly

The data science and analyst team at PredictHQ makes up more than half the company. This is good news as we are going to be even busier than usual in the coming months! As we build out this new logic and these enhancements, we recognize that they will need to evolve as soon as new information is available.

That is why we are focusing on developing data science models that can iterate swiftly, and we will be reviewing every model each week. For attended events, this will involve noting the differences between search volume, ticket sales and actual attendance figures so we can build models that start to quantify the more human and fluid variables. These will of course need to keep iterating per market, based on the latest information. For non-attendance based events, we will be drawing on similar data sources as well as some additional factors, with custom-built models for each category.
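One simple way to iterate a recovery ratio each week is to blend the current value with the ratio actually observed at recent events. The sketch below uses exponential smoothing, which is a stand-in chosen for illustration and not a method described in this article; the function name and the smoothing weight alpha are assumptions.

```python
def update_recovery_ratio(current_ratio: float,
                          baseline_attendance: float,
                          actual_attendance: float,
                          alpha: float = 0.5) -> float:
    """Blend the current ratio with the newly observed one.

    alpha is a hypothetical smoothing weight: 1.0 trusts only the latest
    observation, while values near 0 change the ratio slowly.
    """
    if baseline_attendance <= 0:
        raise ValueError("baseline_attendance must be positive")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    observed_ratio = actual_attendance / baseline_attendance
    return (1 - alpha) * current_ratio + alpha * observed_ratio


# An event with a pre-pandemic baseline of 15,000 drew 7,500 this week,
# so the observed ratio is 0.5 and the estimate moves from 0.4 towards it.
new_ratio = update_recovery_ratio(0.4, 15_000, 7_500)
print(round(new_ratio, 2))  # 0.45
```

Running an update like this per market and per subgroup keeps each estimate tied to the latest information available for that segment.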

As we prepare to bring these models online, our data assurance team will be manually checking the logic and updating major events. And as we start to scale these models, we will continue to crosscheck their impact manually.

I hope this article is helpful, and wish I could share more details with you all. If you are a PredictHQ customer, get in touch with our team to find out more. And if you are not yet a customer, please think carefully about how powerful intelligent event data would be to help you prepare for the recovery.