Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning | by Dillon Davis | The Airbnb Tech Blog

How Airbnb leverages machine learning and reinforcement learning techniques to solve a unique information retrieval task in order to provide guests with unique, affordable, and differentiated accommodations around the world.

By: Dillon Davis, Huiji Gao, Thomas Legrand, Weiwei Guo, Malay Haldar, Alex Deng, Han Zhao, Liwei He, Sanjeev Katariya

Airbnb has transformed the way people travel around the globe. As Airbnb’s inventory spans diverse locations and property types, providing guests with relevant options in their search results has become increasingly complex. In this blog post, we’ll discuss shifting from simple heuristics to advanced machine learning and reinforcement learning techniques to transform what we call location retrieval in order to address this challenge.

Guests typically start searching by entering a destination in the search bar and expect the most relevant results to be surfaced. These destinations can be countries, states, cities, neighborhoods, streets, addresses, or points of interest. Unlike traditional travel accommodations, Airbnb listings are spread across different neighborhoods and surrounding areas. For example, a family searching for a vacation rental in San Francisco might find better options in nearby cities like Daly City, where there are larger single-family homes. Thus, the system needs to account for not just the searched location but also nearby areas that might offer better options for the guest. This is evidenced by the locations of booked listings when searching for San Francisco, shown below.

Given Airbnb’s scale, we cannot rank every listing for every search. This presented a challenge: to create a system that dynamically infers a relevant map area for a query. This system, called location retrieval, needed to balance including a wide variety of listings to appeal to all guests’ needs while still being relevant to the query. Our search ranking models can then efficiently rank the subset of our inventory that is within the relevant map area and surface the most relevant inventory to our guests. This system and more are outlined below.

Initially, Airbnb relied on heuristics to define map areas based on the type of search. For example, if a guest searched for a country, the system would use administrative boundaries to filter listings within that country. If they searched for a city, the system would create a 25-mile radius around the city center to retrieve listings.
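
As a rough illustration of the city heuristic, the sketch below turns a 25-mile radius around a city center into a latitude/longitude bounding box that can be used as a retrieval filter. The function names, the 69-miles-per-degree approximation, and the example coordinates are assumptions made for this sketch and are not taken from Airbnb's system; it also ignores the antimeridian and the poles.

```python
import math

# Approximate miles per degree of latitude (an assumption of this sketch).
MILES_PER_DEGREE_LAT = 69.0


def radius_bounding_box(center_lat, center_lng, radius_miles=25.0):
    """Return (south, west, north, east) for a box enclosing the given radius."""
    dlat = radius_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlng = radius_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(center_lat)))
    return (center_lat - dlat, center_lng - dlng, center_lat + dlat, center_lng + dlng)


def in_box(box, lat, lng):
    """True if the point lies inside the (south, west, north, east) box."""
    south, west, north, east = box
    return south <= lat <= north and west <= lng <= east


# Hypothetical example: a box around downtown San Francisco.
sf_box = radius_bounding_box(37.7749, -122.4194)
daly_city_retrieved = in_box(sf_box, 37.6879, -122.4702)   # nearby city
los_angeles_retrieved = in_box(sf_box, 34.0522, -118.2437)  # far away
```

A fixed radius like this treats every city and every guest identically, which is the limitation the later approaches try to remove.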

Improving these heuristics proved to be profoundly impactful. One such example is the introduction of a log-scale parameterized smooth function to compute an expansion factor for the diagonal size of the administrative bounds of the searched destination. We applied this to very precise locations like addresses, buildings, and POIs, resulting in a 0.35% increase in uncancelled bookers on the platform when tested in an online A/B experiment against the baseline heuristics. The figures below demonstrate how search results for a building in Ibiza, Spain improved dramatically with this heuristic by surfacing significantly more and higher-quality inventory.
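
The post does not give the exact form of this function, so the following is only one plausible shape: a smoothstep interpolation on a log scale that expands tiny bounds (a single building) by a large factor and leaves large bounds unchanged. The function name, the thresholds `small_km` and `large_km`, and `max_factor` are all hypothetical parameters chosen for illustration.

```python
import math


def expansion_factor(diagonal_km, small_km=0.05, large_km=50.0, max_factor=40.0):
    """Hypothetical smooth, log-scale expansion factor for a bounds diagonal.

    Diagonals at or below small_km get max_factor, diagonals at or above
    large_km get 1.0, and values in between are blended smoothly in log space.
    """
    log_pos = (math.log10(max(diagonal_km, 1e-6)) - math.log10(small_km)) / (
        math.log10(large_km) - math.log10(small_km)
    )
    t = min(1.0, max(0.0, log_pos))    # clamp position to [0, 1]
    smooth = t * t * (3.0 - 2.0 * t)   # smoothstep: zero slope at both ends
    return max_factor + (1.0 - max_factor) * smooth


# The expanded diagonal is the original diagonal times the factor.
building_diagonal_km = 0.1
expanded_diagonal_km = building_diagonal_km * expansion_factor(building_diagonal_km)
```

Working in log space means the factor changes gradually across several orders of magnitude of destination size, rather than jumping at a hard threshold between "address" and "city".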

These heuristics were simple and worked well enough to start, but they had limitations. They couldn’t differentiate between different types of searches (e.g., a family looking for a large home versus a solo traveler looking for a small apartment), and they didn’t adapt well to new data as Airbnb’s inventory and guest preferences evolved.

With more data available over time from these intuition-based heuristics, we thought there might be a way to take advantage of this historical user booking behavior to improve location retrieval. We built a dataset for each travel destination that recorded where guests booked listings when searching for that destination. Based on this data, the system could create retrieval map areas that included 96% of the nearest booked listings for a given destination.
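
One way to picture this statistical approach is sketched below: rank a destination's historical bookings by distance from the destination center, keep the nearest 96%, and take their bounding box as the retrieval map area. How the areas were actually constructed is not described in the post, so the bounding-box construction, the function names, and the sample data are assumptions.

```python
import math


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def statistical_map_area(center, booked_points, coverage=0.96):
    """Bounding box (south, west, north, east) of the nearest `coverage` share of bookings."""
    ranked = sorted(
        booked_points,
        key=lambda p: haversine_km(center[0], center[1], p[0], p[1]),
    )
    keep = max(1, round(coverage * len(ranked)))
    kept = ranked[:keep]
    lats = [p[0] for p in kept]
    lngs = [p[1] for p in kept]
    return (min(lats), min(lngs), max(lats), max(lngs))


# Hypothetical data: 24 bookings near the destination plus one distant outlier.
center = (37.77, -122.42)
bookings = [(37.77 + 0.001 * i, -122.42 - 0.001 * i) for i in range(24)]
bookings.append((40.0, -120.0))
area = statistical_map_area(center, bookings)
```

Trimming the farthest few percent keeps a handful of outlier bookings from stretching the map area across a whole region.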

We tested these newly constructed retrieval map areas in place of the intuition-based heuristics outlined above, based on the hypothesis that they would show guests a more bookable selection of inventory. While this statistical approach was more aligned with guest booking behavior, it still had limitations. It treated all searches for a location the same, regardless of specific search parameters like group size or travel dates. This uniform approach meant that some guests might not see the best listings for their particular needs. As a result, this statistics-based method showed no detectable increase in uncancelled bookers on the platform when tested against the heuristics outlined above in an online A/B experiment. This led us to believe that location retrieval may require more advanced techniques such as machine learning.

Instead of solely relying on past booking data, the new system could learn from various search parameters, such as the number of guests and stay duration. By analyzing this data, a model could predict more relevant map areas for each search, rather than applying a one-size-fits-all approach.

For example, a group of ten travelers searching for a San Francisco vacation rental might prefer larger homes in the suburbs, while solo travelers might prioritize central locations. The machine learning model could distinguish between these different preferences and adjust the retrieval map areas accordingly, providing more tailored results.

We built our machine learning model in the following manner. This is the result of three iterations that introduced the machine learning model, expanded its feature set, and expanded search attribution. The architecture is depicted in the figure below.

Training Examples: Searches issued by a booker, either by entering a destination in the search bar or by manipulating the map, that contained the booked listing in their search results on the same day or one day before the booking. We discard any bookings that are canceled 7 days after booking.
Training Features: We derive features directly from the search request, such as location name, stay length, number of guests, price filters, location country, etc. There are 9 continuous features and 19 categorical features in total.
Training Labels: The latitude and longitude coordinates of the booked listing attributed to the search.
Architecture: A two-layer neural network of size 256 was chosen in order to have more flexibility for loss formulation compared to traditional regression and decision-tree-based approaches.
Model Output: 4 floats that define the latitude and longitude offsets from the center latitude and longitude coordinates of the searched destination that represent the relevant map area.
Loss: Trained to predict map areas that contain their associated booked listing while minimizing the size of the predicted map area and the incidence of predictions that cannot construct a valid rectangular map area.
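
The exact loss is not given in the post. As a minimal sketch of how the three stated objectives could be combined, the code below turns four predicted offsets into a rectangle and adds a penalty for missing the booked listing, a penalty for area, and a penalty for offsets that do not form a valid rectangle. The offset ordering, the hinge-style terms, and the weights are assumptions of this sketch, not the authors' formulation.

```python
def box_from_offsets(center, offsets):
    """Build (south, west, north, east) from (north, south, east, west) offsets."""
    north_off, south_off, east_off, west_off = offsets
    lat, lng = center
    return (lat - south_off, lng - west_off, lat + north_off, lng + east_off)


def relu(x):
    return max(0.0, x)


def retrieval_loss(center, offsets, booked, area_weight=0.1, validity_weight=10.0):
    """Hypothetical loss: containment miss + weighted area + weighted invalidity."""
    south, west, north, east = box_from_offsets(center, offsets)
    b_lat, b_lng = booked
    # Distance by which the booked listing falls outside each edge of the box.
    miss = (
        relu(south - b_lat) + relu(b_lat - north)
        + relu(west - b_lng) + relu(b_lng - east)
    )
    height, width = north - south, east - west
    size = relu(height) * relu(width)        # area of a valid box, else 0
    invalid = relu(-height) + relu(-width)   # positive when edges are inverted
    return miss + area_weight * size + validity_weight * invalid


# Hypothetical search centered at (0, 0) with a booking at (0.1, 0.1).
center, booked = (0.0, 0.0), (0.1, 0.1)
tight = retrieval_loss(center, (0.2, 0.2, 0.2, 0.2), booked)      # contains booking
too_small = retrieval_loss(center, (0.05, 0.05, 0.05, 0.05), booked)
too_large = retrieval_loss(center, (1.0, 1.0, 1.0, 1.0), booked)
inverted = retrieval_loss(center, (-0.3, 0.2, 0.2, 0.2), booked)  # north below south
```

Under these assumed weights, the tight box that still contains the booking scores lowest, which is the trade-off the loss description points to.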

The machine learning system increased the recall of booked listings (i.e., how often the system retrieved a listing that was eventually booked) by 7.12% and reduced the size of the retrieval map area by 40.83%. It had a cumulative impact of +1.8% in uncancelled bookers on the platform. The initial model was evaluated against the baseline, and each subsequent model iteration was evaluated against the previous outgoing model.

The figures below demonstrate how search results for a specific street in Lima, Peru improved dramatically with the model by surfacing results that are much closer to the searched street.

Before

After

While machine learning improved the system’s ability to differentiate search results, there was still room for improvement, particularly in learning whether locations that had never been surfaced before were relevant to guests for a search. To address this, Airbnb introduced reinforcement learning to the location retrieval process.

Reinforcement learning allowed the system to continuously learn from guest interactions by surfacing new locations for a given destination and adjusting the retrieval map area based on guest booking behavior. This approach, known as a contextual multi-armed bandit problem, involved balancing exploration (surfacing new locations) with exploitation (surfacing previously successful locations). The system could actively experiment with different retrieval map areas, learning from guest bookings to refine its predictions.

Applying a contextual multi-armed bandit traditionally requires defining an active contextual estimator, a method for uncertainty estimation, and an exploration strategy. We took the following approach given product constraints, system constraints, and the nature of our model formulation. The architecture is depicted in the figure below.

Active contextual estimation: We employed our existing machine learning model for location retrieval, retrained daily to regularly learn from any new bookings data that we collect while surfacing previously unshown locations.
Uncertainty estimation: We modified our model architecture with a random dropout layer to generate 32 distinct predictions for a given search (Monte Carlo Dropout). This allows us to measure the mean and standard deviation of our prediction while minimizing negative impact to system performance and changes to our existing model formulation.
Exploration Strategy: We compute an upper confidence bound using the mean and standard deviation of our prediction in order to construct larger retrieval map areas based on the model’s confidence in its prediction for the search.
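
The uncertainty and exploration steps above can be sketched as follows, under heavy simplification: a toy one-layer model stands in for the real network, dropout is applied to its inputs, and 32 stochastic passes give a per-offset mean and standard deviation from which an upper confidence bound is formed. The toy model, the dropout rate, the multiplier `k`, and all names are hypothetical; only the 32 samples and the mean-plus-deviation idea come from the description above.

```python
import random
import statistics


def dropout_forward(features, weights, drop_prob, rng):
    """One stochastic pass of a toy linear model that outputs 4 offsets."""
    keep = 1.0 - drop_prob
    # Inverted dropout: zero each input with probability drop_prob and
    # rescale the survivors so the expected activation is unchanged.
    masked = [x / keep if rng.random() < keep else 0.0 for x in features]
    return [sum(w * x for w, x in zip(row, masked)) for row in weights]


def ucb_offsets(features, weights, n_samples=32, drop_prob=0.2, k=1.0, seed=0):
    """Return (mean offsets, upper-confidence-bound offsets) via Monte Carlo Dropout."""
    rng = random.Random(seed)
    samples = [dropout_forward(features, weights, drop_prob, rng) for _ in range(n_samples)]
    means, ucbs = [], []
    for per_offset in zip(*samples):   # iterate over the 4 output dimensions
        mean = statistics.fmean(per_offset)
        std = statistics.pstdev(per_offset)
        means.append(mean)
        ucbs.append(mean + k * std)    # less certain predictions give a larger area
    return means, ucbs


# Hypothetical search features and weights for the toy model.
features = [1.0, 0.5, 2.0]
weights = [
    [0.10, 0.20, 0.05],
    [0.08, 0.10, 0.06],
    [0.12, 0.05, 0.07],
    [0.09, 0.15, 0.04],
]
mean_offsets, ucb = ucb_offsets(features, weights)
```

Because the offsets extend outward from the destination center, using the upper bound in place of the mean enlarges the rectangle most where the sampled predictions disagree most.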

This strategy successfully explored more for less-traveled locations where it was less confident and explored less for locations that are often searched and booked. For example, pictured below are the mean (inner) and upper confidence bound (outer) estimates of retrieval map areas for San Francisco, CA (left) and Smith Mountain Lake, Virginia (right). San Francisco is searched almost 25x more than Smith Mountain Lake, with proportionately more bookings as well. As a result, the model is more confident in its retrieval map area estimate for San Francisco than for Smith Mountain Lake, resulting in 2–3x less exploration for San Francisco queries.

The reinforcement learning system was also tested against the outgoing machine learning model in online A/B experiments, showing a cumulative 0.51% increase in uncancelled bookers and a 0.71% increase in 5-star trip rate over two iterations that introduced reinforcement learning and optimized scoring of the more complex model.

Airbnb’s journey from simple heuristics to sophisticated machine learning and reinforcement learning models demonstrates the power of data-driven approaches in transforming complex systems. By continually iterating and improving its location retrieval process, Airbnb has not only enhanced the relevance of its search results but also helped guests experience more 5-star trips.

This transformation cumulatively results in a 2.66% increase in uncancelled bookers, a major achievement for a company operating at Airbnb’s scale. More details can be found in our technical paper. As Airbnb continues to innovate, we’re continuously evaluating and introducing more advanced features and retrieval mechanisms like retrieving with complex polygons. These will further refine and enhance the search experience for millions of guests worldwide.

If this type of work interests you, check out some of our related positions and more at Careers at Airbnb!

All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.