Establishing a Large Scale Learned Retrieval System at Pinterest | by Pinterest Engineering | Pinterest Engineering Blog

Bowen Deng | Machine Learning Engineer, Homefeed Candidate Generation; Zhibo Fan | Machine Learning Engineer, Homefeed Candidate Generation; Dafang He | Machine Learning Engineer, Homefeed Relevance; Ying Huang | Machine Learning Engineer, Curation; Raymond Hsu | Engineering Manager, Homefeed CG Product Enablement; James Li | Engineering Manager, Homefeed Candidate Generation; Dylan Wang | Director, Homefeed Relevance; Jay Adams | Principal Engineer, Pinner Curation & Growth

At Pinterest, our mission is to bring everyone the inspiration to create a life they love. Finding the right content online and serving it to the right audience plays a key role in this mission. Modern large-scale recommendation systems usually include multiple stages, where retrieval aims to retrieve candidates from pools of billions of items, and ranking predicts which items a user is likely to engage with from the trimmed candidate set retrieved by the earlier stages [2]. Fig 1 illustrates a general multi-stage recommendation funnel design at Pinterest.

Fig 1. General multi-stage recommendation system design at Pinterest. We retrieve candidates from a corpus of billions of Pins and narrow it down to hundreds of candidates for the ranking model to score, and finally generate the feeds for Pinners. “CG” is short for candidate generation and “LWS” is short for Lightweight Scoring, which is our pre-ranking model.

The Pinterest ranking model is a powerful transformer-based model learned from raw user engagement sequences and served with mixed-device serving [3]. It is powerful at capturing users’ long- and short-term engagement and gives instant predictions. However, Pinterest’s retrieval systems have historically been different, as many of them are based on heuristic approaches such as Pin-Board graphs or user-followed interests. This work describes our effort in successfully building an internal embedding-based retrieval system for organic content at Pinterest, one that is learned purely from logged user engagement events and serves in production. We have deployed our system for homefeed as well as notifications.

Fig. 2. Two-Tower Models for Training and Serving.

A two-tower approach has been widely adopted in industry [6], where one tower learns the query embedding and the other tower learns the item embedding. Online serving is cheap because it reduces to a nearest neighbor search between the query embedding and the item embeddings. This section describes the current design of the two-tower machine learning model for learned retrieval at Pinterest.
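To make the two-tower idea concrete, here is a minimal NumPy sketch. Everything in it is an illustrative assumption rather than the production model: the tiny one-hidden-layer towers, the feature dimensions, the random weights, and the L2 normalization of the outputs. It only shows that the two towers share an output dimension so that a dot product can score any query against any item.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_tower(in_dim, hidden_dim, out_dim):
    """Random weights for a tiny one-hidden-layer tower (illustrative only)."""
    return {
        "w1": rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
        "w2": rng.normal(scale=0.1, size=(hidden_dim, out_dim)),
    }

def embed(tower, features):
    """Map raw features to an L2-normalized embedding (normalization is an assumption)."""
    hidden = np.maximum(features @ tower["w1"], 0.0)  # ReLU
    out = hidden @ tower["w2"]
    norm = np.linalg.norm(out, axis=-1, keepdims=True)
    return out / np.maximum(norm, 1e-12)

# The towers take different inputs but emit embeddings of the same size (8 here).
query_tower = make_tower(in_dim=16, hidden_dim=32, out_dim=8)
item_tower = make_tower(in_dim=24, hidden_dim=32, out_dim=8)

query_emb = embed(query_tower, rng.normal(size=(2, 16)))  # 2 queries
item_emb = embed(item_tower, rng.normal(size=(5, 24)))    # 5 items

# scores[i, j] is the dot product between query i and item j.
scores = query_emb @ item_emb.T
```

Because the item tower does not depend on the query, item embeddings can be computed ahead of time and only the query tower has to run at request time.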

The general two-tower model architecture, with its training objective and serving setup, is illustrated in Fig 2.

To train an efficient retrieval model, many works model retrieval as an extreme multi-class classification problem. While in practice we cannot compute a softmax over the entire item corpus, we can easily leverage in-batch negatives, which provide a memory-efficient way of sampling negatives. To put it more formally, a retrieval model should optimize the softmax likelihood of the true labels over the full corpus, where C is the entire corpus and T is the set of all true labels.

However, in practice we can only compute a sampled softmax over a set of negative items S.

Given a sampled set D, the sampled softmax replaces the normalization over the entire corpus C with a normalization over D.
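As a sketch in standard notation, the two objectives can be written as follows. The scoring function $s(x, y)$, the query symbol $x$, and the reading of $D$ as the positive item together with the sampled negatives $S$ are assumptions made for illustration, not necessarily the authors' exact formulation.

```latex
% Full softmax: normalize over the entire corpus C, maximize over true labels T
\max_{\theta} \sum_{(x, y) \in T} \log P(y \mid x),
\qquad
P(y \mid x) = \frac{\exp\big(s(x, y)\big)}{\sum_{y' \in C} \exp\big(s(x, y')\big)}

% Sampled softmax: normalize over a sampled set D = \{y\} \cup S instead of C
P_D(y \mid x) = \frac{\exp\big(s(x, y)\big)}{\sum_{y' \in D} \exp\big(s(x, y')\big)}
```

With in-batch negatives, $S$ for a given query is simply the set of items paired with the other queries in the same training batch.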

Because we sample items from our training set, which may have popularity bias, it is important to correct for the sampling probability [1]. We use a simple logit adjustment based on the estimated probability of each item.

$$L(\text{user}, \text{item}) = e_{\text{user}} \cdot e_{\text{item}} - \log P(\text{item is in the batch})$$

where $e_{\text{user}}$ and $e_{\text{item}}$ are the user embedding and the item embedding, respectively.
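The corrected logit above can be plugged into an in-batch sampled softmax as in the following NumPy sketch. It is a minimal illustration under stated assumptions: the function name `in_batch_softmax_loss` is hypothetical, the estimate of log P(item is in the batch) is assumed to be supplied by the caller, and row i of each embedding matrix is assumed to form a positive (user, item) pair, so every other item in the batch acts as a negative for that user.

```python
import numpy as np

def in_batch_softmax_loss(user_emb, item_emb, log_p_in_batch):
    """In-batch sampled softmax with a log-probability correction on the logits.

    user_emb:        (B, d) user embeddings
    item_emb:        (B, d) item embeddings; row i is the positive item for user i
    log_p_in_batch:  (B,) estimated log P(item is in the batch), assumed given
    """
    # logits[i, j] = e_user_i . e_item_j - log P(item j is in the batch)
    logits = user_emb @ item_emb.T - log_p_in_batch[None, :]
    # Numerically stable log-softmax over each row (each user's candidate set).
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for user i sits on the diagonal.
    return -np.mean(np.diag(log_probs))

# Usage with random embeddings and a uniform in-batch probability estimate.
rng = np.random.default_rng(0)
users = rng.normal(size=(4, 8))
items = rng.normal(size=(4, 8))
loss = in_batch_softmax_loss(users, items, np.log(np.full(4, 0.25)))
```

Subtracting the log-probability lowers the logits of items that appear in batches often, which keeps popular items from being over-penalized simply because they are sampled as negatives more frequently.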

In our model design, we encode the user's long-term engagement [11], user profile, and context as input [2] in the user tower (as shown later in Fig 4).

Fig 3. User sequence modeling in the two-tower architecture. PinnerSage [11] encodes long-term user representations, while the real-time user sequence, modeled with a sequence transformer, enables the model to capture instant user intent.

As Pinterest serves over 500 million MAUs, designing and implementing an ANN-based retrieval system is not trivial. At Pinterest, we have an in-house ANN serving system built on the algorithms in [5, 7]. To serve the item embeddings online, we break the system down into two pieces: online serving and offline indexing. In online serving, the user embedding is computed at request time so that it can leverage the most up-to-date features for personalized retrieval. In offline indexing, hundreds of thousands of item embeddings are computed and pushed to our in-house Manas serving system for online serving. Fig. 4 illustrates the system architecture for embedding-based retrieval with auto retraining.
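The split between offline indexing and online serving can be sketched as below. This is a deliberately simplified stand-in: it uses exact brute-force inner-product search in NumPy rather than an HNSW index, and the names `build_index` and `retrieve` are hypothetical, not part of Manas.

```python
import numpy as np

def build_index(item_embeddings):
    """Offline step: stack precomputed item embeddings into one matrix.

    A production system would build an ANN index (for example HNSW) here;
    a plain matrix is an exact-search stand-in for illustration.
    """
    return np.asarray(item_embeddings, dtype=np.float32)

def retrieve(index, user_embedding, k):
    """Online step: return indices of the k items with the highest inner product."""
    scores = index @ np.asarray(user_embedding, dtype=np.float32)
    k = min(k, scores.shape[0])
    top = np.argpartition(-scores, k - 1)[:k]   # unordered top-k
    return top[np.argsort(-scores[top])]        # order by descending score

# The index is built ahead of time; the user embedding arrives at request time.
index = build_index([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
top_items = retrieve(index, user_embedding=[1.0, 0.0], k=2)
```

An approximate index trades a small amount of recall for much lower latency, which is what makes this pattern workable at large corpus sizes.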

Fig 4. Full Serving Pipeline of Learned Retrieval with Auto Retraining

In a real-world recommendation system, it is necessary to retrain the models frequently to refresh the learned knowledge of users and capture recent trends. We established an auto retraining workflow that retrains the models periodically and validates model performance before deploying them to the model and indexing services.

However, unlike ranking models, two-tower models are split into two model artifacts and deployed to separate services. When a new model is retrained, we need to ensure that the serving model version is synchronized between the two services. If we do not consider version synchronization, the difference in deployment speed (the Pin indexing pipeline usually takes much longer than it takes for the viewer model to be ready) leads to mismatched embedding spaces, and candidate quality drops drastically. From the infrastructure perspective, any rollback on either service is likewise detrimental. Moreover, when a new index is built and rolled out to production, the hosts of the ANN search service do not all switch over at once, so different index versions coexist during the rollout period, and the traffic served in that window must not suffer from a model version mismatch either.

To tackle the problem, we attach a piece of model version metadata to each ANN search service host, which contains a mapping from model name to the latest model version. The metadata is generated together with the index. At serving time, the homefeed backend first gets the version metadata from its assigned ANN service host and uses the model of the corresponding version to compute the user embeddings. This ensures “anytime” model version synchronization: even if some ANN hosts have model version N and others have version N+1 during the index rollout period, the model version is still synchronized. In addition, to ensure rollback capability, we keep the latest N versions of the viewer model so that we can still compute the user embeddings from the right model even if the ANN service is rolled back to its last build.
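The synchronization scheme can be sketched in a few lines of Python. All names here (`AnnHost`, `ViewerModelStore`, `user_embedding_for_host`, the model name `"lr"`) are hypothetical and the models are toy callables; the sketch only illustrates that the backend picks the viewer model version from the metadata of the ANN host it is about to query, and that a bounded number of older versions is retained for rollback.

```python
from dataclasses import dataclass, field

@dataclass
class AnnHost:
    """Hypothetical ANN host carrying the metadata generated with its index."""
    version_metadata: dict  # model name -> model version the index was built with

@dataclass
class ViewerModelStore:
    """Hypothetical store keeping the latest `max_versions` viewer models (>= 1)."""
    max_versions: int
    models: dict = field(default_factory=dict)  # (name, version) -> callable

    def register(self, name, version, model):
        self.models[(name, version)] = model
        versions = sorted(v for (n, v) in self.models if n == name)
        # Evict everything older than the newest `max_versions` versions.
        for old in versions[:-self.max_versions]:
            del self.models[(name, old)]

    def get(self, name, version):
        return self.models[(name, version)]

def user_embedding_for_host(host, store, model_name, features):
    """Compute the user embedding with the model version the host's index expects."""
    version = host.version_metadata[model_name]
    return store.get(model_name, version)(features), version

# Toy models: version v scales the features by v, so mismatches are easy to see.
store = ViewerModelStore(max_versions=2)
for v in (1, 2, 3):
    store.register("lr", v, lambda f, v=v: [v * x for x in f])

old_host = AnnHost({"lr": 2})  # not yet upgraded, or rolled back
new_host = AnnHost({"lr": 3})  # already serving the new index
emb_old, ver_old = user_embedding_for_host(old_host, store, "lr", [1.0])
emb_new, ver_new = user_embedding_for_host(new_host, store, "lr", [1.0])
```

Because the version is resolved per request from the host's own metadata, hosts on different index versions can coexist during a rollout without any global switch.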

Homefeed at Pinterest is probably the most complicated system, as it needs to retrieve items for different cases: Pinner engagement, content exploration, interest diversification, and so on. It has over 20 candidate generators served in production with different retrieval strategies. Currently, the learned retrieval candidate generator aims to drive user engagement. It has the highest user coverage and a top-three save rate. Since launch, it has helped deprecate two other candidate generators with huge overall site engagement wins.

In this blog, we presented our work in building our learned retrieval system across different surfaces at Pinterest. The machine learning based approach enables fast feature iteration and further consolidates our system.

We would like to thank all of our collaborators across Pinterest: Zhaohui Wu, Yuxiang Wang, Tingting Zhu, Andrew Zhai, Chantat Eksombatchai, Haoyu Chen, Nikil Pancha, Xinyuan Gui, Hedi Xia, Jianjun Hu, Daniel Liu, Shenglan Huang, Dhruvil Badani, Liang Zhang, Weiran Li, Haibin Xie, Yaonan Huang, Keyi Chen, Tim Koh, Tang Li, Jian Wang, Zheng Liu, Chen Yang, Laksh Bhasin, Xiao Yang, Anna Kiyantseva, Jiacheng Hong.

References:

[1] On the Effectiveness of Sampled Softmax Loss for Item Recommendation

[2] Deep Neural Networks for YouTube Recommendations

[3] TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest

[4] Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time

[5] Manas HNSW Streaming Filters

[6] Pinterest Home Feed Unified Lightweight Scoring: A Two-tower Approach

[7] Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs

[8] Sample Selection Bias Correction Theory

[9] PinnerFormer: Sequence Modeling for User Representation at Pinterest