EXPEDIA GROUP TECHNOLOGY — DATA

The Data Science behind the Egencia SmartMix Flight-Ranking Model

When sorting flights by price needs an upgrade

Photo: Ryan Pemberton/Boeing

Sorting flights by price is the default used by most online travel agencies and business travel management companies, but does it make sense for business travelers? Is price the most efficient ranking criterion in a business trip context? Let’s take the example of a transatlantic route from Paris (France) to San Francisco (United States). We observe in the figure below that we have hundreds of flight possibilities to be ranked:

Price-based sort on a transatlantic route from Paris (France) to San Francisco (United States)

And this is just one example.

Are we completely different when booking a flight?

At Egencia™, part of Expedia Group™, innovation is a key value, aiming to merge technology and traveler services for a better business travel management experience. Until this project, we had always considered sorting by lowest price the most obvious way to rank flights on our platform.

Photo by Briana Tozour on Unsplash

The goal of our business travel solution is to be customer centric. Helping travelers and travel managers find a suitable flight quickly is essential. After taking a closer look at the user experience and at customer feedback, we decided to analyse in depth how Egencia sorts flights in search results. We set out to show that there is a better alternative to the default price-based sort.

Ranking is a challenging problem studied in disciplines such as Information Retrieval, Machine Learning and Operations Research. Regardless of the approach, the ranking task is to give preference to the documents most relevant to a query, user or context. Major players have made considerable progress in this direction for web search engines, e-commerce applications and optimised leisure travel solutions.

Let’s apply the same idea to flight ranking in business travel solutions and look at the path we took when we decided to move from the classic price-based sort to a smarter sort.

Photo by Markus Spiske on Unsplash

Origin: Sort by price → Destination: SmartMix sort | “operated by” relevance

Our goal is to sort flights in a convenient way for all business travel users, based on their general booking behaviour observed on our platform. We want to facilitate the user’s flight booking process by prioritizing the more relevant results.

The key is to define how the relevance function could be expressed in this case.

1. Check-in | Confirm the need of a smarter sort

Photo by Nick Fewings on Unsplash

One important factor that lets us offer a better online experience today is the data we have collected. Looking at historical booking data (under the default price-based sort) for the most popular routes in each region of the world where we’re present, we observe the following:

  • We need to scroll down to position #30 in order to cover cumulatively 90% of the booked flights of our users;
  • The distribution of the positions of the picked flights decreases exponentially;
  • Users often selected the same or very similar flights (for a given route), even when that flight was not on the first page of results (and thus required scrolling).
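
The 90% coverage figure above can be derived from a log of booked positions. The following is a minimal sketch, assuming each booking is recorded as the 1-based rank of the chosen flight in the displayed list. The function name `position_covering` and the sample positions are hypothetical and are not Egencia data.

```python
from collections import Counter

def position_covering(booked_positions, share=0.9):
    """Smallest rank p such that at least `share` of bookings sit at rank <= p."""
    counts = Counter(booked_positions)
    total = len(booked_positions)
    cumulative = 0
    for position in sorted(counts):
        cumulative += counts[position]
        if cumulative / total >= share:
            return position
    raise ValueError("no bookings given")

# Hypothetical booked positions (1 = top of the list), for illustration only.
booked_positions = [1, 1, 1, 2, 2, 3, 5, 8, 13, 30]
print(position_covering(booked_positions, 0.9))  # 13
```

A lower value means users find their flight closer to the top of the list, which is the quantity the bullets above track.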

This first step highlighted the rationale for an “intelligent” approach to sorting flight results over price-based sorting, and led us to an in-depth analysis of booking behaviour on our platform. We then combined these insights with feedback received from Egencia’s customers, which confirmed a general preference for options that suit travelers better than simply the lowest fare.

2. Boarding restrictions | Product requirements

Decorative logo of a suitcase and the words “intuitive”, “transparent”, and “universal”

Now that we’ve validated customers’ need for a more convenient ordering of search results, the next step is to capture the criteria that affect a flight’s relevance to users in general, given all the historical bookings.
At this phase, we aimed for a ranking model that is intuitive, transparent and universal (the same for any user).

2.1 Allowed items | Feature engineering

Given the requirements above, in this first iteration we decided to roll up from the single-user level to the global level of the ranking problem and focus on features such as price, duration, distance, number of stops or departure time (in certain cases). These all have a similar meaning for most users.

Even though these features are general, they don’t have the same meaning in all contexts:

  • A EUR500 flight from Paris (FR) to Nice (FR) doesn’t have the same meaning as a EUR500 flight from Paris (FR) to San Francisco (US).
  • Likewise, a EUR500 flight in a list in which all the others cost less than EUR300 doesn’t have the same meaning as a EUR500 flight in a list in which all the others cost more than EUR700.

There are many other scenarios in which the absolute value itself does not provide enough information.

We want to be sure that we are not overfitting the future ranking model. For this reason, we compute and normalize relative scores based on the context of each search results set. For example, one possible approach is to normalize the features within each search results set using MinMax normalization. In this way, every feature value within a results set falls in a fixed range [min, max], e.g. [0, 1]. After a fair number of experiments, we ended up with a tuple of such scores for each flight displayed to our users at the moment of booking, e.g. F:=(rel_price_score, rel_duration_score, rel_stops_score).
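
As an illustration of the MinMax idea, here is a minimal sketch that rescales one feature within a single search results set. The helper name `min_max` and the prices are hypothetical. The handling of a results set where every flight has the same value is an assumption, since the article does not say how that case is treated.

```python
def min_max(values):
    """Rescale values to [0, 1] within one search results set."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: if all flights share the same value, score them all 0.0.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical prices (EUR) for two different search results sets.
short_haul_prices = [100, 300, 500]
long_haul_prices = [500, 700, 900]

rel_short = min_max(short_haul_prices)  # [0.0, 0.5, 1.0]
rel_long = min_max(long_haul_prices)    # [0.0, 0.5, 1.0]

# The same EUR500 fare is the most expensive option in one list
# and the cheapest option in the other.
print(rel_short[2], rel_long[0])  # 1.0 0.0
```

The relative score carries the context that the absolute price lacks, which is why the same fare can be a poor choice in one list and the best one in another.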

2.2 Cabin luggage | Ranking model

Smart weights
Given the labelled dataset of booked and not-booked flights together with the new score-based format of the features, we can compute the importance of each of these scores using a logistic regression modeling technique.

More precisely, for a given independent variable such as the “price score”, we quantify its power in relation to a dependent variable “booked” (Boolean which tells whether the flight was booked or not). Repeating the process for all the independent variables in our dataset enables us to compute weights of importance for each score.
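
The per-variable procedure described above might look like the following minimal sketch. It fits a one-variable logistic regression for each score with plain gradient descent, standing in for whatever library was actually used. The data is hypothetical, and turning the absolute slopes into weights that sum to 1 is an assumption made for illustration, not the article's stated formula.

```python
import math

def fit_slope(xs, ys, lr=0.5, epochs=2000):
    """Fit booked ~ sigmoid(b + w * x) by gradient descent and return w."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = 1.0 / (1.0 + math.exp(-(b + w * x))) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w

SCORES = ("rel_price_score", "rel_duration_score", "rel_stops_score")

# Hypothetical relative scores for ten displayed flights and their labels.
rows = [
    (0.0, 0.8, 1.0), (0.1, 0.2, 0.0), (0.2, 0.0, 0.0), (0.3, 0.5, 0.5),
    (0.5, 0.1, 0.0), (0.6, 0.9, 1.0), (0.8, 0.0, 0.0), (1.0, 0.3, 0.5),
    (0.1, 0.6, 0.5), (0.4, 0.1, 0.0),
]
booked = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]

# One regression per score, repeated for every independent variable.
slopes = {
    name: fit_slope([row[j] for row in rows], booked)
    for j, name in enumerate(SCORES)
}

# Assumption: importance is the slope's magnitude, rescaled to sum to 1.
total = sum(abs(s) for s in slopes.values())
weights = {name: abs(s) / total for name, s in slopes.items()}
print(slopes)
print(weights)
```

On this toy data each slope comes out negative, meaning a higher relative price, duration or number of stops makes a booking less likely.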

MCDA model
These smart weights are the perfect binders for a traditional, yet very effective, operations research approach in the form of a multi-criteria decision aid (MCDA hereafter) model. Some of the most common MCDA models are Weighted Sum, Weighted Product and TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution).

We thus replicate the users’ flight booking behaviour in Egencia by means of such an MCDA model. To put the approach in context: the decision problem here is flight ranking, and we have multiple alternatives (flights) to compare. Every alternative is described by several attributes, e.g. “price score”, and a criterion is an attribute associated with a preference relation, e.g. to be minimized.
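
To make the MCDA formulation concrete, here is a minimal sketch of the Weighted Sum variant, in which every criterion is minimized. The article does not state which MCDA variant SmartMix uses, so Weighted Sum is chosen here only for illustration. The weights, flight labels and relative scores are all hypothetical.

```python
# Hypothetical weights; in the article they are learnt from historical bookings.
WEIGHTS = {
    "rel_price_score": 0.6,
    "rel_duration_score": 0.25,
    "rel_stops_score": 0.15,
}

# Hypothetical alternatives, already normalized within one results set.
flights = {
    "A": {"rel_price_score": 0.00, "rel_duration_score": 1.0, "rel_stops_score": 1.0},
    "B": {"rel_price_score": 0.02, "rel_duration_score": 0.5, "rel_stops_score": 1.0},
    "C": {"rel_price_score": 0.13, "rel_duration_score": 0.0, "rel_stops_score": 0.0},
    "D": {"rel_price_score": 1.00, "rel_duration_score": 0.0, "rel_stops_score": 0.0},
}

def weighted_sum(scores, weights):
    """Aggregate the criteria into one value; lower is better."""
    return sum(weights[name] * scores[name] for name in weights)

def rank(flights, weights):
    """Flight ids ordered from most to least relevant."""
    return sorted(flights, key=lambda fid: weighted_sum(flights[fid], weights))

print(rank(flights, WEIGHTS))  # ['C', 'B', 'A', 'D']
```

With all the weight on price, the same function reproduces the price-based sort (`A`, `B`, `C`, `D`), so price sort is one special case of the model.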

Offline evaluation
In our offline experiments, this operations research approach was 17% better than the price-based sort at ranking the booked flight in 1st position, despite the expected position bias (loosely speaking, the built-in advantage of the price-based sort, since the historical bookings were made from lists displayed in that order). One of the main ranking performance metrics we tracked offline is Precision@k.
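
For reference, here is a minimal sketch of Precision@k using its standard information retrieval definition: the fraction of the top k results that are relevant, where "relevant" here means "booked". How the metric was computed in the offline experiments is not detailed in this article, and the search data below is hypothetical.

```python
def precision_at_k(ranked_ids, booked_ids, k):
    """Fraction of the top-k ranked flights that were booked."""
    top_k = ranked_ids[:k]
    return sum(1 for fid in top_k if fid in booked_ids) / k

def mean_precision_at_k(searches, k):
    """Average Precision@k over (ranked list, booked set) pairs."""
    return sum(precision_at_k(r, b, k) for r, b in searches) / len(searches)

# Hypothetical searches: the displayed order and the flight that was booked.
searches = [
    (["C", "B", "A", "D"], {"C"}),
    (["A", "B", "C"], {"B"}),
    (["X", "Y"], {"X"}),
    (["P", "Q", "R"], {"R"}),
]

print(mean_precision_at_k(searches, 1))  # 0.5
print(mean_precision_at_k(searches, 2))  # 0.375
```

With a single booked flight per search, Precision@1 is the share of searches in which the booked flight is ranked first.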

Compared with the more complex approaches we have been exploring separately, such as Learning to Rank (12% better than our MCDA model at ranking the booked flight in 1st position), the classical MCDA model had the significant advantage of being entirely explainable, transparent and intuitive for our customers on this journey (Origin: Sort by price → Destination: SmartMix sort), while still delivering a very good improvement in ranking performance.

To quickly recap, so far, we answered the following questions:

  • How to represent a flight “document”: a suite of relatively normalized flight scores within a search results set;
  • What is the importance of each flight score: the weights learnt from the historical booking data;
  • What is the model that best suits our requirements for this smooth transition from Sort by price to a more “intelligent” sort: an easy-to-explain operations research model.

Toy flights results set representation with 3 simple criteria

Given the toy example of four flights represented above by three simple criteria (based on price, overall trip duration and number of segments) we obtain the following new order:

Table showing order of flights selected, along with prices, durations, and segments
Toy flights results sorted by SmartMix

We observe that F3, the most convenient flight in terms of number of segments and duration, and for which the price trade-off is reasonable, is pushed to the top of the list. Next, F2, which has a shorter layover for only $10 more, is pushed above the longer F1. The most expensive flight, F4, which is 122.2% more expensive than the cheapest flight, remains in last position.

Robustness. Smooth transition from sort by price
During the offline evaluation, we wanted to test the model’s robustness on a validation set. The objective was to see how sensitive the model is to a change in the weights associated with each criterion.

3D chart of weights and ranking improvement

We analyzed the vicinity of the smart weights and picked the most convenient values from the optimal plateau that are closest to the departing price-based sort point. This enabled a natural transition from a price-based sort to a new sort for our users. Once this was done, all that was left was to A/B test how the new ranking model performed in production.
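
The weight selection described above can be sketched as a small grid search. This is a simplified illustration with only two criteria (price and duration), a hypothetical validation set and a hypothetical tolerance of 0.05 for what counts as the "plateau". The real analysis covered more criteria and its exact procedure is not detailed in this article.

```python
def top_flight(flights, w_price):
    """Index of the best flight under a two-criterion weighted sum."""
    w_duration = 1.0 - w_price
    scores = [w_price * p + w_duration * d for p, d in flights]
    return scores.index(min(scores))

def hit_rate(searches, w_price):
    """Share of searches whose booked flight is ranked first."""
    hits = sum(
        1 for flights, booked in searches
        if top_flight(flights, w_price) == booked
    )
    return hits / len(searches)

# Hypothetical validation set: (rel_price, rel_duration) per flight,
# plus the index of the flight that was booked.
searches = [
    ([(0.0, 1.0), (0.2, 0.0)], 1),
    ([(0.0, 1.0), (0.5, 0.0)], 1),
    ([(0.0, 0.3), (1.0, 0.0)], 0),
    ([(0.0, 0.2), (0.6, 0.0)], 0),
]

grid = [i / 10 for i in range(11)]  # candidate price weights 0.0 .. 1.0
rates = {w: hit_rate(searches, w) for w in grid}
best = max(rates.values())

# Plateau: every weight whose hit rate is within the tolerance of the best.
plateau = [w for w in grid if rates[w] >= best - 0.05]

# w_price = 1.0 is the pure price sort, so the largest plateau weight
# is the one closest to the starting point.
chosen = max(plateau)
print(chosen, rates[chosen], rates[1.0])  # 0.6 1.0 0.5
```

Picking the plateau point nearest the price sort keeps the new order as familiar as possible without giving up ranking performance.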

3. Take off | Deployment

All that was left was to enjoy the new optimized flight booking experience and assess its impact on Egencia’s customers. Do you remember the example of a transatlantic route from Paris (France) to San Francisco (United States) from the very beginning? With the new MCDA-based ranking model, we managed to bring the reasonably priced direct flights to the top of the search results. In addition to what we can observe in the figure below, flights with a more convenient connection time for only a EUR2 difference are displayed above the other long-layover options.

The new SmartMix model proved to be a huge success in our last A/B Tests and improved the user experience significantly:

✔ It confirmed the improvement expected from offline evaluation in the percentage of booked flights ranked at position #1, compared with the previous price-based sort. (We could also observe the consequences of position bias.)
✔ It also reduced the position that covers 90% of the booked flights of our users from #30 in our previous price-based sort to position #13 now.
✔ It reduced the usage of other sort and filter widgets by 8.51%, as well as the average time spent on the booking page by 3 seconds.

4. Landing | Conclusions

Through a thorough evaluation and an in-depth data analysis, we confirmed our initial assumption that business travel users need a more complex ranking model that is not purely price oriented. SmartMix has been our default flight sort model worldwide since late 2019, and its MCDA formulation enabled a smooth transition from the previous price-based approach.

Our data science team is actively working on continuous improvements of the user experience with more complex ranking models and metrics that can capture finer ranking scenarios. Before approaching personalization, we strongly believe, and have confirmed offline, that there is a solid common behaviour that all our business travel users share.

One of the main objectives of future iterations of SmartMix in Egencia remains the interpretability and explicability of the changes that a ranking model might have on the search results list. Being able to have both a quantitative (e.g. percentage of booked flights higher in the list) and qualitative (“makes sense”) improvement of the flights ranking performance is the key for our next journey, this time departing from SmartMix v1.0. Given the current COVID-19 situation we are continuously adapting the model to possible changes in the airlines offer as well.

This project started as part of my final year internship at Egencia between February 2018 and August 2018, during which I had the opportunity to work on my master thesis “Ranking flights for business travel agency users”, coordinated by Nacim Rahal and supervised by Prof. Nacera Bennacer.
In that work, together with my colleagues in our Data Science team here in Paris (France) and Seattle (US), we explored three main approaches for improving the flight ranking model for business travel agency users: One-class classification, Multi-criteria decision aid and Learning to Rank. We showed that both the Multi-criteria decision aid approach presented today and a new Learning to Rank approach significantly improve on the performance of the price-based model used within Egencia’s flights service.
Many thanks to my colleague PhD Begoña Ascaso for the guidance and the continuous contribution to the improvement of the initial version of the ranking models.

Learn more about technology at Expedia Group
