new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Jan 13

Memorize, Factorize, or be Naïve: Learning Optimal Feature Interaction Methods for CTR Prediction

Click-through rate prediction is one of the core tasks in commercial recommender systems. It aims to predict the probability of a user clicking a particular item given user and item features. As feature interactions bring in non-linearity, they are widely adopted to improve the performance of CTR prediction models. Therefore, effectively modelling feature interactions has attracted much attention in both the research and industry field. The current approaches can generally be categorized into three classes: (1) na\"ive methods, which do not model feature interactions and only use original features; (2) memorized methods, which memorize feature interactions by explicitly viewing them as new features and assigning trainable embeddings; (3) factorized methods, which learn latent vectors for original features and implicitly model feature interactions through factorization functions. Studies have shown that modelling feature interactions by one of these methods alone are suboptimal due to the unique characteristics of different feature interactions. To address this issue, we first propose a general framework called OptInter which finds the most suitable modelling method for each feature interaction. Different state-of-the-art deep CTR models can be viewed as instances of OptInter. To realize the functionality of OptInter, we also introduce a learning algorithm that automatically searches for the optimal modelling method. We conduct extensive experiments on four large datasets. Our experiments show that OptInter improves the best performed state-of-the-art baseline deep CTR models by up to 2.21%. Compared to the memorized method, which also outperforms baselines, we reduce up to 91% parameters. In addition, we conduct several ablation studies to investigate the influence of different components of OptInter. Finally, we provide interpretable discussions on the results of OptInter.

  • 7 authors
·
Aug 2, 2021

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning

Large Language Models (LLMs), despite their remarkable progress across various general domains, encounter significant barriers in medicine and healthcare. This field faces unique challenges such as domain-specific terminologies and the reasoning over specialized knowledge. To address these obstinate issues, we propose a novel Multi-disciplinary Collaboration (MC) framework for the medical domain that leverages role-playing LLM-based agents who participate in a collaborative multi-round discussion, thereby enhancing LLM proficiency and reasoning capabilities. This training-free and interpretable framework encompasses five critical steps: gathering domain experts, proposing individual analyses, summarising these analyses into a report, iterating over discussions until a consensus is reached, and ultimately making a decision. Our work particularly focuses on the zero-shot scenario, our results on nine data sets (MedQA, MedMCQA, PubMedQA, and six subtasks from MMLU) establish that our proposed MC framework excels at mining and harnessing the medical expertise in LLMs, as well as extending its reasoning abilities. Based on these outcomes, we further conduct a human evaluation to pinpoint and categorize common errors within our method, as well as ablation studies aimed at understanding the impact of various factors on overall performance. Our code can be found at https://github.com/gersteinlab/MedAgents.

  • 7 authors
·
Nov 16, 2023