Search Results - "Zhang, Shangtong"
1
Growth and Survival of Fusarium solani-F. oxysporum Complex on Stressed Multipurpose Contact Lens Care Solution Films on Plastic Surfaces In Situ and In Vitro
Published in Cornea (01-12-2006)“…PURPOSE: To analyze factors implicating the association of ReNu with MoistureLoc (ReNu ML) multipurpose contact lens solution (MPS) with the increased incidence…”
Journal Article
2
IMGA: Efficient In-Memory Graph Convolution Network Aggregation with Data Flow Optimizations
Published in IEEE transactions on computer-aided design of integrated circuits and systems (01-12-2023)“…Aggregating features from neighbor vertices is a fundamental operation in Graph Convolution Network (GCN). However, the sparsity in graph data creates poor…”
Journal Article
3
Breaking the Deadly Triad in Reinforcement Learning
Published 01-01-2022“…Reinforcement Learning (RL) is a promising framework for solving sequential decision making problems emerging from agent-environment interactions via trial and…”
Dissertation
4
Toxic Effects of Ag(I) and Hg(II) on Candida albicans and C. maltosa: a Flow Cytometric Evaluation
Published in Applied and Environmental Microbiology (01-09-2001)
Journal Article
5
Almost Sure Convergence of Average Reward Temporal Difference Learning
Published 29-09-2024“…Tabular average reward Temporal Difference (TD) learning is perhaps the simplest and the most fundamental policy evaluation algorithm in average reward…”
Journal Article
6
Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features
Published 18-09-2024“…Temporal difference (TD) learning with linear function approximation, abbreviated as linear TD, is a classic and powerful prediction algorithm in reinforcement…”
Journal Article
7
Direct Gradient Temporal Difference Learning
Published 02-08-2023“…Off-policy learning enables a reinforcement learning (RL) agent to reason counterfactually about policies that are not executed and is one of the most…”
Journal Article
8
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
Published 31-01-2023“…Most reinforcement learning practitioners evaluate their policies with online Monte Carlo estimators for either hyperparameter tuning or testing different…”
Journal Article
9
mlpack 3: a fast, flexible machine learning library
Published in Journal of open source software (18-06-2018)
Journal Article
10
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Published 07-10-2024“…In reinforcement learning, classic on-policy evaluation methods often suffer from high variance and require massive online data to attain the desired accuracy…”
Journal Article
11
Doubly Optimal Policy Evaluation for Reinforcement Learning
Published 03-10-2024“…Policy evaluation estimates the performance of a policy by (1) collecting data from the environment and (2) processing raw data into a meaningful estimate. Due…”
Journal Article
12
Truncated Emphatic Temporal Difference Methods for Prediction and Control
Published 11-08-2021“…Emphatic Temporal Difference (TD) methods are a class of off-policy Reinforcement Learning (RL) methods involving the use of followon traces. Despite the…”
Journal Article
13
Efficient Multi-Policy Evaluation for Reinforcement Learning
Published 16-08-2024“…To unbiasedly evaluate multiple target policies, the dominant approach among RL practitioners is to run and evaluate each target policy separately. However,…”
Journal Article
14
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
Published 15-01-2024“…Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gradient…”
Journal Article
15
CRISP: Triangle Counting Acceleration via Content Addressable Memory-Integrated 3D-Stacked Memory
Published in 2024 IEEE International Test Conference in Asia (ITC-Asia) (18-08-2024)“…Triangle Counting is a fundamental problem in graph analysis, which usually needs to traverse the graph and perform set-intersections of neighbor sets…”
Conference Proceeding
16
Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning
Published 22-05-2024“…In-context learning refers to the learning ability of a model during inference time without adapting its parameters. The input (i.e., prompt) to the model…”
Journal Article
17
In vitro interactions of Fusarium and Acanthamoeba with drying residues of multipurpose contact lens solutions
Published in Investigative ophthalmology & visual science (28-03-2011)“…To examine in vitro effects of evaporation and drying of multipurpose contact lens solutions on survival of Fusarium and Acanthamoeba. Conidia of…”
Journal Article
18
DAC: The Double Actor-Critic Architecture for Learning Options
Published 29-04-2019“…We reformulate the option framework as two parallel augmented MDPs. Under this novel formulation, all policy optimization algorithms can be used off the shelf…”
Journal Article
19
On the Convergence of SARSA with Linear Function Approximation
Published 14-02-2022“…SARSA, a classical on-policy control algorithm for reinforcement learning, is known to chatter when combined with linear function approximation: SARSA does not…”
Journal Article
20
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Published 04-11-2021“…In this paper, we establish the global optimality and convergence rate of an off-policy actor critic algorithm in the tabular setting without using density…”
Journal Article