[PDF] A Gang of Bandits | Semantic Scholar (2024)

Figures and Tables from this paper

  • figure 1
  • table 1
  • table 2
  • figure 4

Topics

  • Gang Of Bandits
  • Linear Bandit Algorithm
  • Bandit Algorithms
  • Recommendation Systems
  • Contextual Bandits
  • Network Structure
  • Clusters
  • Real-world Datasets

151 Citations

Contextual Bandits in a Collaborative Environment
    Qingyun Wu, Huazheng Wang, Quanquan Gu, Hongning Wang

    Computer Science

    SIGIR

  • 2016

This paper develops a collaborative contextual bandit algorithm in which the adjacency graph among users is leveraged to share context and payoffs among neighboring users during online updating, and rigorously proves an improved upper regret bound (a minimal sketch of this sharing idea follows the entry).

  • 105
  • Highly Influenced
  • PDF
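
The payoff-sharing mechanism described in the entry above lends itself to a compact illustration. The following is a minimal, hypothetical sketch (the class name GraphSharingLinUCB and the share parameter are mine, not the authors'): each user keeps LinUCB-style statistics, and every observed reward also updates the statistics of that user's graph neighbors with a discount.

    # Hypothetical sketch, not the authors' algorithm: per-user LinUCB statistics
    # with discounted sharing of each observation along the user adjacency graph.
    import numpy as np

    class GraphSharingLinUCB:
        def __init__(self, n_users, dim, adjacency, alpha=1.0, share=0.5):
            self.A = np.stack([np.eye(dim) for _ in range(n_users)])  # per-user Gram matrices
            self.b = np.zeros((n_users, dim))                         # per-user reward vectors
            self.adj = adjacency                                      # n_users x n_users 0/1 array
            self.alpha, self.share = alpha, share

        def select(self, user, arm_features):
            # arm_features: (n_arms, dim) context matrix for the candidate arms.
            A_inv = np.linalg.inv(self.A[user])
            theta = A_inv @ self.b[user]
            bonus = np.sqrt(np.einsum("ad,de,ae->a", arm_features, A_inv, arm_features))
            return int(np.argmax(arm_features @ theta + self.alpha * bonus))

        def update(self, user, x, reward):
            # Full update for the acting user, discounted update for neighbors.
            self.A[user] += np.outer(x, x)
            self.b[user] += reward * x
            for v in np.flatnonzero(self.adj[user]):
                self.A[v] += self.share * np.outer(x, x)
                self.b[v] += self.share * reward * x
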
Networked bandits with disjoint linear payoffs
    Meng Fang, D. Tao

    Computer Science, Mathematics

    KDD

  • 2014

This paper formalizes the networked bandit problem and proposes an algorithm that considers not only the selected arm but also the relationships between arms, choosing arms based on integrated confidence sets constructed from historical data.

  • 27
Replicating A Gang of Bandits
    Bryce Bern, Dawson D'almeida, Will Knospe, Paul Reich

    Computer Science

  • 2020

This paper explores the process of replicating the experiments and results from A Gang of Bandits in order to validate the paper, formalizing the recommendation problem as a multi-armed bandit problem.

  • Highly Influenced
  • PDF
Parallel Online Clustering of Bandits via Hedonic Game
    Xiaotong Cheng, Cheng Pan, S. Maghsudi

    Computer Science

    ICML

  • 2023

This work proposes CLUB-HG, a novel algorithm that integrates a game-theoretic approach into clustering inference and discovers the underlying user clusters in contextual bandit algorithms.

Structured bandits and applications : exploiting problem structure for better decision-making under uncertainty
    Sharan Vaswani

    Computer Science, Mathematics

  • 2018

This thesis goes beyond the well-studied multi-armed bandit model to consider structured bandit settings and their applications, proposes a bootstrapping approach, and establishes theoretical regret bounds for it.

  • Highly Influenced
Exploring Clustering of Bandits for Online Recommendation System
    Liu Yang, Bo Liu, Leyu Lin, Feng Xia, Kai Chen, Qiang Yang

    Computer Science

    RecSys

  • 2020

The proposed ClexB policy for online recommender systems explores knowledge transfer to aid inference about user interests, estimating user clusters more accurately and with less uncertainty via explorable clustering.

  • 16
  • PDF
A Gang of Adversarial Bandits
    M. Herbster, Stephen Pasteris, Fabio Vitale, M. Pontil

    Computer Science

    NeurIPS

  • 2021

Two learning algorithms are presented, GABA-I and GABA-II, which exploit the network structure to bias towards functions of low Ψ values and highlight improvements of both algorithms over running independent standard MABs across users.

  • 9
  • PDF
Revealing Graph Bandits for Maximizing Local Influence
    A. Carpentier, Michal Valko

    Computer Science

    AISTATS

  • 2016

BARE is proposed, a bandit strategy for which a regret guarantee is proved that scales with the detectable dimension, a problem-dependent quantity that is often much smaller than the number of nodes.

  • 38
  • PDF
Multi-agent Heterogeneous Stochastic Linear Bandits
    A. Ghosh, Abishek Sankararaman, K. Ramchandran

    Computer Science

    ECML/PKDD

  • 2022

This paper seeks to theoretically understand the problem of minimizing regret in an N-user heterogeneous stochastic linear bandit framework.

  • PDF
Dynamic Clustering of Contextual Multi-Armed Bandits
    Trong-The Nguyen, Hady W. Lauw

    Computer Science

    CIKM

  • 2014

This work proposes an algorithm that divides the population of users into multiple clusters and customizes the bandits to each cluster; the clustering is dynamic, i.e., users can switch from one cluster to another as their preferences change (sketched informally after this entry).

  • 52
  • Highly Influenced
  • PDF
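
As a rough illustration of the dynamic clustering idea in the entry above (all names and constants here are hypothetical, not taken from the paper): one bandit per cluster, with each user periodically reassigned to the cluster whose empirical reward profile best matches the user's own.

    # Hypothetical sketch, not the paper's algorithm: one UCB1 learner per cluster,
    # with users dynamically reassigned to the closest cluster profile.
    import numpy as np

    class ClusteredUCB1:
        def __init__(self, n_clusters, n_arms):
            self.counts = np.ones((n_clusters, n_arms))   # per-cluster pull counts (1 avoids /0)
            self.means = np.zeros((n_clusters, n_arms))   # per-cluster mean rewards
            self.profiles = {}                            # user -> slowly updated arm-mean profile
            self.assignment = {}                          # user -> cluster index

        def select(self, user):
            c = self.assignment.get(user, 0)
            t = self.counts[c].sum()
            ucb = self.means[c] + np.sqrt(2.0 * np.log(t) / self.counts[c])
            return int(np.argmax(ucb))

        def update(self, user, arm, reward):
            c = self.assignment.get(user, 0)
            self.counts[c, arm] += 1
            self.means[c, arm] += (reward - self.means[c, arm]) / self.counts[c, arm]
            profile = self.profiles.setdefault(user, np.zeros(self.means.shape[1]))
            profile[arm] += 0.1 * (reward - profile[arm])  # slow per-user estimate
            # Dynamic reassignment: users may switch clusters as preferences drift.
            self.assignment[user] = int(np.argmin(np.linalg.norm(self.means - profile, axis=1)))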

...

...

25 References

Gossip-based distributed stochastic bandit algorithms
    Balázs Szörényi, R. Busa-Fekete, István Hegedüs, Róbert Ormándi, Márk Jelasity, B. Kégl

    Computer Science

    ICML

  • 2013

This work shows that, for iterations t = Ω(log N), the probability of playing a suboptimal arm at a peer is proportional to 1/(Nt), where N denotes the number of peers participating in the network (a back-of-the-envelope consequence is restated after this entry).

  • 102
  • PDF
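
Taken at face value, the per-peer bound stated above has a simple network-wide consequence (a back-of-the-envelope reading under the assumption that the bound holds uniformly over the N peers once t = Ω(log N), not a claim quoted from the paper):

    % Back-of-the-envelope reading of the stated bound (assumed to hold
    % uniformly over the N peers once t = \Omega(\log N)):
    \Pr\big[\text{a given peer plays a suboptimal arm at round } t\big] = O\!\left(\tfrac{1}{Nt}\right)
    \;\Longrightarrow\;
    \mathbb{E}\big[\text{suboptimal plays across all peers up to } T\big]
      \le \sum_{t=\Omega(\log N)}^{T} N \cdot O\!\left(\tfrac{1}{Nt}\right) = O(\log T).
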
Contextual Bandits with Similarity Information
    Aleksandrs Slivkins

    Mathematics, Computer Science

    COLT

  • 2011

This work considers similarity information in the setting of contextual bandits, a natural extension of the basic MAB problem, and presents algorithms that are based on adaptive partitions, and take advantage of "benign" payoffs and context arrivals without sacrificing the worst-case performance.

Leveraging Side Observations in Stochastic Bandits
    S. Caron, B. Kveton, M. Lelarge, Smriti Bhagat

    Computer Science

    UAI

  • 2012

This paper considers stochastic bandits with side observations, a model that accounts for both the exploration/exploitation dilemma and relationships between arms, and provides efficient algorithms based on upper confidence bounds that leverage this additional information, deriving new bounds that improve on standard regret guarantees (sketched after this entry).

  • 105
  • PDF
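
The side-observation mechanism the entry above refers to can be illustrated with a small sketch (the pull callback and the graph layout are hypothetical interfaces, not from the paper): pulling an arm also reveals rewards for its neighbors in the side-observation graph, and all of these observations feed the UCB statistics.

    # Hypothetical sketch of UCB1 with side observations: pulling an arm also
    # yields reward observations for its graph neighbors, all of which are
    # absorbed into the empirical means.
    import math
    import random

    def ucb_with_side_observations(graph, pull, horizon):
        """graph: dict arm -> list of neighboring arms.
        pull(arm): returns a dict {observed_arm: reward} covering the arm and its neighbors."""
        arms = list(graph)
        counts = {a: 0 for a in arms}
        means = {a: 0.0 for a in arms}

        def absorb(a, r):
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]

        for t in range(1, horizon + 1):
            unseen = [a for a in arms if counts[a] == 0]
            if unseen:
                chosen = random.choice(unseen)
            else:
                chosen = max(arms, key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
            for a, r in pull(chosen).items():   # chosen arm plus its side observations
                absorb(a, r)
        return means
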
Multi-armed bandits in the presence of side observations in social networks
    Swapna Buccapatnam, A. Eryilmaz, N. Shroff

    Computer Science

    52nd IEEE Conference on Decision and Control

  • 2013

This work reveals the significant gains that can be obtained even through static network-aware policies, and proposes a randomized policy that explores actions for each user at a rate that is a function of her network position.

  • 38
  • PDF
Bandit problems in networks: Asymptotically efficient distributed allocation rules
    S. Kar, H. Poor, Shuguang Cui

    Computer Science, Mathematics

    IEEE Conference on Decision and Control and…

  • 2011

A collaborative and adaptive distributed allocation rule DA is proposed and is shown to achieve the lower bound on the expected average regret for a connected inter-bandit communication network.

  • 32
  • PDF
A contextual-bandit approach to personalized news article recommendation
    Lihong Li, Wei Chu, J. Langford, R. Schapire

    Computer Science

    WWW '10

  • 2010

This work models personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks (a minimal selection/update sketch follows this entry).
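
The article-selection rule described above follows the disjoint linear UCB idea; a minimal sketch of one selection/update step looks like the following (variable names and the alpha value are illustrative, not the paper's code).

    # Minimal sketch of a disjoint-model linear UCB step, in the spirit of the
    # entry above (names and alpha are illustrative).
    import numpy as np

    def linucb_choose(A, b, contexts, alpha=1.0):
        """A: dict arm -> d x d design matrix, b: dict arm -> d response vector,
        contexts: dict arm -> d-dimensional feature vector for the current round."""
        best_arm, best_score = None, -np.inf
        for arm, x in contexts.items():
            A_inv = np.linalg.inv(A[arm])
            theta = A_inv @ b[arm]
            score = float(theta @ x + alpha * np.sqrt(x @ A_inv @ x))  # mean + exploration bonus
            if score > best_score:
                best_arm, best_score = arm, score
        return best_arm

    def linucb_update(A, b, arm, x, reward):
        A[arm] += np.outer(x, x)   # rank-one design update
        b[arm] += reward * x       # accumulate click/reward feedback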

Contextual Bandits with Linear Payoff Functions
    Wei Chu, Lihong Li, L. Reyzin, R. Schapire

    Computer Science, Mathematics

    AISTATS

  • 2011

An O(√(Td ln³(KT ln(T)/δ))) regret bound is proved that holds with probability 1 − δ for the simplest known upper confidence bound algorithm for this problem.

  • 944
  • Highly Influential
  • PDF
Improved Algorithms for Linear Stochastic Bandits
    Yasin Abbasi-Yadkori, D. Pál, Csaba Szepesvari

    Computer Science, Mathematics

    NIPS

  • 2011

A simple modification of Auer's UCB algorithm achieves constant regret with high probability and improves the regret bound by a logarithmic factor, while experiments show a vast improvement in practice (the underlying confidence set is restated after this entry).

  • 1,549
  • PDF
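
For context, the self-normalized confidence ellipsoid at the heart of this line of work has the following familiar form (an illustrative restatement from memory, not a quotation: R is the sub-Gaussian noise level, S bounds the norm of θ∗, and λ is the ridge regularizer):

    % Standard form of the self-normalized confidence set (illustrative restatement):
    \bar{V}_t = \lambda I + \sum_{s=1}^{t} x_s x_s^\top, \qquad
    \big\|\hat{\theta}_t - \theta_*\big\|_{\bar{V}_t}
      \le R \sqrt{2 \log\!\left(\frac{\det(\bar{V}_t)^{1/2} \det(\lambda I)^{-1/2}}{\delta}\right)}
      + \sqrt{\lambda}\, S .
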
Stochastic Linear Optimization under Bandit Feedback
    Varsha Dani, Thomas P. Hayes, S. Kakade

    Mathematics, Computer Science

    COLT

  • 2008

A nearly complete characterization of stochastic linear optimization under bandit feedback is given, in terms of both upper and lower bounds on the regret, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.

  • 831
  • PDF
From Bandits to Experts: On the Value of Side-Observations
    Shie Mannor, Ohad Shamir

    Computer Science, Mathematics

    NIPS

  • 2011

Practical algorithms with provable regret guarantees are developed, which depend on non-trivial graph-theoretic properties of the information feedback structure and partially-matching lower bounds are provided.

...

...
