Models
In recent years, many neural networks have been proposed for CTR prediction and continue to outperform existing state-of-the-art approaches. Well-known examples include FM, DeepFM, Wide&Deep, DCN, PNN, etc. DT provides most of these models and will keep introducing the latest research findings in the future.
Wide&Deep
Cheng, Heng-Tze, et al. "Wide & deep learning for recommender systems." Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. 2016.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/2988450.2988454
Wide & Deep learning jointly trains wide linear models and deep neural networks to combine the benefits of memorization and generalization for recommender systems. The system was productionized and evaluated on Google Play, a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models.
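The joint training boils down to summing a wide linear logit and a deep MLP logit before the sigmoid, with one loss driving gradients into both parts. Below is a minimal NumPy forward-pass sketch of that combination; the layer sizes, ReLU activations, and names are illustrative assumptions, not DT's actual implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def wide_and_deep_forward(x_wide, x_deep, w_wide, deep_layers):
    """Sum the wide (linear) logit and the deep (MLP) logit, then sigmoid.

    x_wide: (batch, n_wide) raw and crossed features for the wide part.
    x_deep: (batch, n_deep) concatenated embeddings for the deep part.
    deep_layers: list of (W, b) pairs; the last pair maps to one logit.
    """
    wide_logit = x_wide @ w_wide                     # (batch,)
    h = x_deep
    for W, b in deep_layers[:-1]:                    # hidden layers
        h = relu(h @ W + b)
    W_out, b_out = deep_layers[-1]
    deep_logit = (h @ W_out + b_out).squeeze(-1)     # (batch,)
    # Joint training: one loss backpropagates into both components.
    return 1.0 / (1.0 + np.exp(-(wide_logit + deep_logit)))

rng = np.random.default_rng(0)
deep_layers = [(rng.normal(size=(32, 16)), np.zeros(16)),
               (rng.normal(size=(16, 1)), np.zeros(1))]
p = wide_and_deep_forward(rng.normal(size=(4, 10)), rng.normal(size=(4, 32)),
                          rng.normal(size=10), deep_layers)
print(p.shape)  # (4,) predicted click probabilities
```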
DCN (Deep & Cross Network)
Wang, Ruoxi, et al. "Deep & cross network for ad click predictions." Proceedings of the ADKDD'17. 2017. 1-7.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/3124749.3124754
Deep & Cross Network (DCN) keeps the benefits of a DNN model, and beyond that, it introduces a novel cross network that is more efficient in learning certain bounded-degree feature interactions. In particular, DCN explicitly applies feature crossing at each layer, requires no manual feature engineering, and adds negligible extra complexity to the DNN model.
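The cross network is the novel part: each layer computes x_{l+1} = x_0 (x_l · w_l) + b_l + x_l, so layer l explicitly captures degree-(l+1) feature crossings with only O(d) parameters per layer, which is why the extra complexity over the DNN branch is negligible. A minimal NumPy sketch of that recurrence (shapes and names are illustrative, not DT's code):

```python
import numpy as np

def cross_network(x0, weights, biases):
    """Cross network: x_{l+1} = x0 * (x_l . w_l) + b_l + x_l.

    Layer l explicitly forms degree-(l+1) feature crossings while
    adding only 2*d parameters (w_l, b_l) per layer.
    """
    x = x0
    for w, b in zip(weights, biases):
        # (x @ w) is a scalar per sample; multiplying x0 by it
        # realizes the rank-one term x0 (x_l^T w_l) cheaply.
        x = x0 * (x @ w)[:, None] + b + x
    return x

rng = np.random.default_rng(0)
d, n_layers = 8, 3
x0 = rng.normal(size=(4, d))                 # stacked embeddings + dense
weights = [rng.normal(size=d) for _ in range(n_layers)]
biases = [np.zeros(d) for _ in range(n_layers)]
print(cross_network(x0, weights, biases).shape)   # (4, 8)
```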
PNN
Qu, Yanru, et al. "Product-based neural networks for user response prediction." 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016.
Retrieved from: https://ieeexplore.ieee.org/abstract/document/7837964/
Product-based Neural Networks (PNN) use an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions.
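In the inner-product variant (IPNN), the product layer computes the inner product of every pair of field embeddings and feeds those pairwise signals, together with the linear signals, to the fully connected layers. A toy NumPy sketch of such a layer, with assumed shapes and no claim to match DT's implementation:

```python
import numpy as np

def inner_product_layer(field_emb):
    """IPNN-style product layer: inner products of all field pairs.

    field_emb: (batch, n_fields, k) embedding of each categorical field.
    Returns (batch, n_fields * (n_fields - 1) // 2) pairwise signals.
    """
    n = field_emb.shape[1]
    # Gram matrix of inner products between every pair of fields.
    gram = np.einsum('bik,bjk->bij', field_emb, field_emb)
    i, j = np.triu_indices(n, k=1)         # keep each pair (i < j) once
    return gram[:, i, j]

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 5, 8))           # 5 fields, dim-8 embeddings
print(inner_product_layer(emb).shape)      # (4, 10)
```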
DeepFM
Guo, Huifeng, et al. "DeepFM: An end-to-end wide & deep learning framework for CTR prediction." arXiv preprint arXiv:1804.04950 (2018).
Retrieved from: https://arxiv.org/abs/1804.04950
DeepFM combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared raw feature input to both its "wide" and "deep" components, with no need for feature engineering besides raw features. As a general learning framework, DeepFM can incorporate various network architectures in its deep component.
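The FM ("wide") component scores all pairwise interactions over the shared embeddings in linear time using the classic sum-of-squares identity: sum_{i<j} <v_i, v_j> = 0.5 * ((sum_i v_i)^2 - sum_i v_i^2). A minimal NumPy sketch of that term (shapes assumed for illustration):

```python
import numpy as np

def fm_second_order(field_emb):
    """FM pairwise interaction via the sum-of-squares identity,
    computed element-wise and summed over the embedding dimension:
    O(n * k) cost instead of the naive O(n^2 * k).

    field_emb: (batch, n_fields, k), shared with the deep component.
    """
    sum_sq = np.sum(field_emb, axis=1) ** 2        # (batch, k)
    sq_sum = np.sum(field_emb ** 2, axis=1)        # (batch, k)
    return 0.5 * np.sum(sum_sq - sq_sum, axis=1)   # (batch,) FM logit

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 6, 8))                   # 6 fields, dim 8
print(fm_second_order(emb).shape)                  # (4,)
```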
xDeepFM
Lian, Jianxun, et al. "xDeepFM: Combining explicit and implicit feature interactions for recommender systems." Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/3219819.3220023
The eXtreme Deep Factorization Machine (xDeepFM) introduces a novel Compressed Interaction Network (CIN), which generates feature interactions in an explicit fashion at the vector-wise level. The CIN shares some functionalities with convolutional neural networks (CNNs) and recurrent neural networks (RNNs). xDeepFM combines a CIN and a classical DNN into one unified model.
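Concretely, one CIN layer takes the Hadamard product of every (base field, previous-layer feature map) pair per embedding dimension, then compresses that interaction map with learned weights, which is what "explicit, vector-wise" means in practice. A toy NumPy sketch under assumed shapes (not xDeepFM's or DT's actual code):

```python
import numpy as np

def cin_layer(x0, xk, W):
    """One Compressed Interaction Network layer.

    x0: (batch, m, d)       base field embeddings.
    xk: (batch, h_prev, d)  previous CIN layer output.
    W:  (h_next, h_prev, m) compression weights.
    """
    # z: (batch, h_prev, m, d) -- all pairwise Hadamard interactions,
    # taken per embedding dimension d (vector-wise, not bit-wise).
    z = np.einsum('bhd,bmd->bhmd', xk, x0)
    # Compress the (h_prev, m) interaction map into h_next feature maps,
    # analogous to 1x1 convolution over channels in a CNN.
    return np.einsum('bhmd,nhm->bnd', z, W)

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 5, 8))           # 5 fields, embedding dim 8
W1 = rng.normal(size=(6, 5, 5))           # 6 feature maps in layer 1
x1 = cin_layer(x0, x0, W1)                # first layer interacts x0 with itself
print(x1.shape)                           # (4, 6, 8)
# xDeepFM sum-pools each layer over d and concatenates across layers.
```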
AFM
Xiao, Jun, et al. "Attentional factorization machines: Learning the weight of feature interactions via attention networks." arXiv preprint arXiv:1708.04617 (2017).
Retrieved from: https://arxiv.org/abs/1708.04617
Attentional Factorization Machine (AFM) learns the importance of each feature interaction from data via a neural attention network. Extensive experiments on two real-world datasets demonstrate the effectiveness of AFM. Empirically, on the regression task AFM outperforms FM with an 8.6% relative improvement, and it consistently outperforms the state-of-the-art deep learning methods Wide&Deep and DeepCross with a much simpler structure and fewer model parameters.
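Concretely, AFM forms the element-wise product of every embedding pair, scores each pair with a small one-layer attention network, and sums the products with the resulting softmax weights. A minimal NumPy sketch; W, b, h, p loosely follow the paper's notation, while the shapes and toy data are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def afm_forward(field_emb, W, b, h, p):
    """Attentional pooling of pairwise interactions (AFM core).

    field_emb: (batch, n, k); W: (k, t); b: (t,); h: (t,); p: (k,).
    Each pairwise element-wise product is scored by a one-layer
    attention net; the products are then summed with those weights.
    """
    n = field_emb.shape[1]
    i, j = np.triu_indices(n, k=1)
    pair = field_emb[:, i, :] * field_emb[:, j, :]    # (batch, P, k)
    score = np.maximum(pair @ W + b, 0.0) @ h         # (batch, P)
    alpha = softmax(score, axis=1)                    # interaction weights
    pooled = (alpha[:, :, None] * pair).sum(axis=1)   # (batch, k)
    return pooled @ p                                 # (batch,) logit

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 5, 8))                      # 5 fields, dim 8
logit = afm_forward(emb, rng.normal(size=(8, 4)), np.zeros(4),
                    rng.normal(size=4), rng.normal(size=8))
print(logit.shape)  # (4,)
```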
AutoInt
Song, Weiping, et al. "AutoInt: Automatic feature interaction learning via self-attentive neural networks." Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/3357384.3357925
AutoInt can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. Afterwards, a multi-head self-attentive neural network with residual connections is proposed to explicitly model the feature interactions in the low-dimensional space. With different layers of the multi-head self-attentive neural networks, different orders of feature combinations of input features can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion.
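Each interacting layer is essentially self-attention over the field axis plus a residual projection; stacking layers yields higher-order combinations. For brevity, the NumPy sketch below is single-head rather than multi-head, with assumed shapes and initialization, and is not DT's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interacting_layer(E, Wq, Wk, Wv, Wres):
    """One (single-head) self-attentive interacting layer.

    E: (batch, n_fields, k) -- all features, numerical and categorical,
    already mapped into the same k-dim space. Attention between fields
    models feature interactions; the residual keeps lower-order ones.
    """
    Q, K, V = E @ Wq, E @ Wk, E @ Wv                   # (batch, n, k')
    att = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(Q.shape[-1]), axis=-1)
    out = att @ V                                      # (batch, n, k')
    return np.maximum(out + E @ Wres, 0.0)             # residual + ReLU

rng = np.random.default_rng(0)
E = rng.normal(size=(4, 5, 8))
Wq, Wk, Wv, Wres = (rng.normal(size=(8, 8)) * 0.1 for _ in range(4))
print(interacting_layer(E, Wq, Wk, Wv, Wres).shape)    # (4, 5, 8)
```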
FiBiNet
Huang, Tongwen, Zhiqi Zhang, and Junlin Zhang. "FiBiNET: Combining feature importance and bilinear feature interaction for click-through rate prediction." Proceedings of the 13th ACM Conference on Recommender Systems. 2019.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/3298689.3347043
FiBiNET, short for Feature Importance and Bilinear feature Interaction NETwork, is proposed to dynamically learn feature importance and fine-grained feature interactions. On the one hand, FiBiNET dynamically learns the importance of features via the Squeeze-Excitation network (SENET) mechanism; on the other hand, it effectively learns feature interactions via a bilinear function.
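The SENET block squeezes each field embedding to a summary statistic, passes the per-field vector through two small FC layers, and rescales the embeddings with the resulting importance weights. A toy NumPy sketch; mean pooling and ReLU are one set of choices (the paper discusses alternatives), and the shapes are assumptions:

```python
import numpy as np

def senet_reweight(field_emb, W1, W2):
    """SENET-style feature-importance re-weighting of field embeddings.

    Squeeze: mean-pool each field's embedding to one summary scalar.
    Excitation: two small FC layers produce one weight per field.
    Re-weight: scale each field embedding by its learned importance.
    field_emb: (batch, n, k); W1: (n, n_r); W2: (n_r, n).
    """
    z = field_emb.mean(axis=2)                  # squeeze: (batch, n)
    a = np.maximum(z @ W1, 0.0)                 # excitation: reduce
    a = np.maximum(a @ W2, 0.0)                 # excitation: restore
    return field_emb * a[:, :, None]            # re-weighted embeddings

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 6, 8))                # 6 fields, dim 8
W1, W2 = rng.normal(size=(6, 2)), rng.normal(size=(2, 6))
print(senet_reweight(emb, W1, W2).shape)        # (4, 6, 8)
# FiBiNET feeds both the original and the SENET-scaled embeddings
# through bilinear interaction layers of the form (v_i @ W) * v_j.
```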
FGCNN
Liu, Bin, et al. "Feature generation by convolutional neural network for click-through rate prediction." The World Wide Web Conference. 2019.
Retrieved from: https://dl.acm.org/doi/abs/10.1145/3308558.3313497
The Feature Generation by Convolutional Neural Network (FGCNN) model has two components: Feature Generation and Deep Classifier. Feature Generation leverages the strength of CNNs to generate local patterns and recombine them into new features. Deep Classifier adopts the structure of IPNN to learn interactions from the augmented feature space. Experimental results on three large-scale datasets show that FGCNN significantly outperforms nine state-of-the-art models. Moreover, when state-of-the-art models are applied as the Deep Classifier, better performance is consistently achieved, showing the great compatibility of the FGCNN model. This work explores a novel direction for CTR prediction: it is quite useful to reduce the learning difficulty of DNNs by automatically identifying important features.
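Feature Generation stacks convolution, pooling, and recombination: convolutions over neighbouring fields find local patterns, max-pooling keeps the strongest responses, and a fully connected "recombination" layer mixes them into new features that augment the raw ones before the Deep Classifier. A deliberately tiny, single-channel NumPy sketch of one such step (the real FGCNN uses multiple convolution channels per layer; all shapes here are assumptions):

```python
import numpy as np

def fgcnn_generate(field_emb, conv_kernel, W_recombine, pool=2):
    """One toy FGCNN feature-generation step (single conv channel).

    field_emb:   (batch, n, k) field embeddings.
    conv_kernel: (height,) weights convolved along the field axis.
    W_recombine: ((n - height + 1) // pool, n_new) recombination weights.
    Returns the original fields concatenated with n_new generated ones.
    """
    b, n, k = field_emb.shape
    h = conv_kernel.shape[0]
    # 'valid' 1-D convolution over neighbouring fields, per embedding dim.
    windows = [(field_emb[:, i:i + h, :] * conv_kernel[None, :, None]).sum(axis=1)
               for i in range(n - h + 1)]
    conv = np.maximum(np.stack(windows, axis=1), 0.0)   # (b, n-h+1, k)
    m = (conv.shape[1] // pool) * pool
    pooled = conv[:, :m, :].reshape(b, m // pool, pool, k).max(axis=2)
    # Recombination: mix pooled patterns into n_new generated features.
    new_feats = np.einsum('bmk,mn->bnk', pooled, W_recombine)
    return np.concatenate([field_emb, new_feats], axis=1)

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 9, 8))                        # 9 fields, dim 8
out = fgcnn_generate(emb, rng.normal(size=3), rng.normal(size=(3, 2)))
print(out.shape)   # (4, 11, 8): 9 original + 2 generated fields
```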