Search Results - "Zhang, Shangtong"

  1.

    Growth and Survival of Fusarium solani-F. oxysporum Complex on Stressed Multipurpose Contact Lens Care Solution Films on Plastic Surfaces In Situ and In Vitro by Zhang, Shangtong, Ahearn, Donald G, Noble-Wang, Judith A, Stulting, R Doyle, Schwam, Brian L, Simmons, Robert B, Pierce, George E, Crow, Sidney A

    Published in Cornea (01-12-2006)
    “…PURPOSE: To analyze factors implicating the association of ReNu with MoistureLoc (ReNu ML) multipurpose contact lens solution (MPS) with the increased incidence…”
    Journal Article
  2.

    IMGA: Efficient In-Memory Graph Convolution Network Aggregation with Data Flow Optimizations by Wei, Yuntao, Wang, Xueyan, Zhang, Shangtong, Yang, Jianlei, Jia, Xiaotao, Wang, Zhaohao, Qu, Gang, Zhao, Weisheng

    “…Aggregating features from neighbor vertices is a fundamental operation in Graph Convolution Network (GCN). However, the sparsity in graph data creates poor…”
    Journal Article
  3.

    Breaking the Deadly Triad in Reinforcement Learning by Zhang, Shangtong

    Published 01-01-2022
    “…Reinforcement Learning (RL) is a promising framework for solving sequential decision making problems emerging from agent-environment interactions via trial and…”
    Dissertation
  4.

    Toxic Effects of Ag(I) and Hg(II) on Candida albicans and C. maltosa: a Flow Cytometric Evaluation by Zhang, S, Crow, Jr, S A

    Published in Applied and Environmental Microbiology (01-09-2001)
    Journal Article
  5.

    Almost Sure Convergence of Average Reward Temporal Difference Learning by Blaser, Ethan, Zhang, Shangtong

    Published 29-09-2024
    “…Tabular average reward Temporal Difference (TD) learning is perhaps the simplest and the most fundamental policy evaluation algorithm in average reward…”
    Journal Article
  6.

    Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features by Wang, Jiuqi, Zhang, Shangtong

    Published 18-09-2024
    “…Temporal difference (TD) learning with linear function approximation, abbreviated as linear TD, is a classic and powerful prediction algorithm in reinforcement…”
    Journal Article
  7.

    Direct Gradient Temporal Difference Learning by Qian, Xiaochi, Zhang, Shangtong

    Published 02-08-2023
    “…Off-policy learning enables a reinforcement learning (RL) agent to reason counterfactually about policies that are not executed and is one of the most…”
    Journal Article
  8.

    Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design by Liu, Shuze, Zhang, Shangtong

    Published 31-01-2023
    “…Most reinforcement learning practitioners evaluate their policies with online Monte Carlo estimators for either hyperparameter tuning or testing different…”
    Journal Article
  9.
  10.

    Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning by Chen, Claire, Liu, Shuze, Zhang, Shangtong

    Published 07-10-2024
    “…In reinforcement learning, classic on-policy evaluation methods often suffer from high variance and require massive online data to attain the desired accuracy…”
    Journal Article
  11.

    Doubly Optimal Policy Evaluation for Reinforcement Learning by Liu, Shuze, Chen, Claire, Zhang, Shangtong

    Published 03-10-2024
    “…Policy evaluation estimates the performance of a policy by (1) collecting data from the environment and (2) processing raw data into a meaningful estimate. Due…”
    Journal Article
  12.

    Truncated Emphatic Temporal Difference Methods for Prediction and Control by Zhang, Shangtong, Whiteson, Shimon

    Published 11-08-2021
    “…Emphatic Temporal Difference (TD) methods are a class of off-policy Reinforcement Learning (RL) methods involving the use of followon traces. Despite the…”
    Journal Article
  13.

    Efficient Multi-Policy Evaluation for Reinforcement Learning by Liu, Shuze, Chen, Yuxin, Zhang, Shangtong

    Published 16-08-2024
    “…To unbiasedly evaluate multiple target policies, the dominant approach among RL practitioners is to run and evaluate each target policy separately. However,…”
    Journal Article
  14.

    The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise by Liu, Shuze, Chen, Shuhang, Zhang, Shangtong

    Published 15-01-2024
    “…Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gradient…”
    Journal Article
  15.

    CRISP: Triangle Counting Acceleration via Content Addressable Memory-Integrated 3D-Stacked Memory by Zhang, Shangtong, Wang, Xueyan, Zhao, Weisheng, Jin, Yier

    “…Triangle Counting is a fundamental problem in graph analysis, which usually needs to traverse the graph and perform set-intersections of neighbor sets…”
    Conference Proceeding
  16.

    Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning by Wang, Jiuqi, Blaser, Ethan, Daneshmand, Hadi, Zhang, Shangtong

    Published 22-05-2024
    “…In-context learning refers to the learning ability of a model during inference time without adapting its parameters. The input (i.e., prompt) to the model…”
    Journal Article
  17.

    In vitro interactions of Fusarium and Acanthamoeba with drying residues of multipurpose contact lens solutions by Ahearn, Donald G, Zhang, Shangtong, Stulting, R Doyle, Simmons, Robert B, Ward, Michael A, Pierce, George E, Crow, Jr, Sidney A

    “…To examine in vitro effects of evaporation and drying of multipurpose contact lens solutions on survival of Fusarium and Acanthamoeba. Conidia of…”
    Journal Article
  18.

    DAC: The Double Actor-Critic Architecture for Learning Options by Zhang, Shangtong, Whiteson, Shimon

    Published 29-04-2019
    “…We reformulate the option framework as two parallel augmented MDPs. Under this novel formulation, all policy optimization algorithms can be used off the shelf…”
    Journal Article
  19.

    On the Convergence of SARSA with Linear Function Approximation by Zhang, Shangtong, Tachet, Remi, Laroche, Romain

    Published 14-02-2022
    “…SARSA, a classical on-policy control algorithm for reinforcement learning, is known to chatter when combined with linear function approximation: SARSA does not…”
    Journal Article
  20.

    Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch by Zhang, Shangtong, Tachet, Remi, Laroche, Romain

    Published 04-11-2021
    “…In this paper, we establish the global optimality and convergence rate of an off-policy actor critic algorithm in the tabular setting without using density…”
    Journal Article