dc.rights.license	Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
dc.contributor	Mitchell, John E.
dc.contributor	Bennett, Kristin P.
dc.contributor	Ecker, Joseph G.
dc.contributor	Sharkey, Thomas C.
dc.contributor	Wallace, William A., 1935-
dc.contributor.author	Hammel, Erik
dc.date.accessioned	2021-11-03T07:59:44Z
dc.date.available	2021-11-03T07:59:44Z
dc.date.created	2013-09-09T14:55:05Z
dc.date.issued	2013-05
dc.identifier.uri	https://hdl.handle.net/20.500.13015/894
dc.description	May 2013
dc.description	School of Science
dc.description.abstract	Our goal is to determine and optimize the efficacy of reinforcing an existing flow network to prevent unmet demand caused by imminent disruptions. We are given failure probabilities for the edges of the network and must choose the edges whose reinforcement best preserves the durability of the network after the event. The problem is extended to multiple time steps to capture the trade-off between available resources and the quality of installations: the farther from the event decisions are made, the more resources are available, but the less reliable the forecast information. This sequential decision-making process is a classic example of dynamic programming. To avoid the "curses of dimensionality", we formulate an approximate dynamic program. To improve performance, especially as applied to flow networks, we derive several innovative adaptations of reinforcement learning concepts. This involves developing a policy, a function that makes installation decisions given current forecast information, in a two-step process: policy evaluation and policy improvement.
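
A minimal sketch of the two-step process described above, on a toy network. The graph, failure probabilities, budget, and demand are invented for illustration, and Monte Carlo evaluation with a greedy improvement sweep is only one plausible realization of the idea, not the method developed in the thesis; all function names are hypothetical.

    import random
    from collections import defaultdict, deque

    def max_flow(edges, source, sink):
        """Edmonds-Karp max flow; edges is a dict {(u, v): capacity}."""
        cap = defaultdict(dict)
        for (u, v), c in edges.items():
            cap[u][v] = cap[u].get(v, 0) + c
            cap[v].setdefault(u, 0)              # residual (reverse) edge
        flow, total = defaultdict(int), 0
        while True:
            parent, queue = {source: None}, deque([source])
            while queue and sink not in parent:  # BFS for an augmenting path
                u = queue.popleft()
                for v in cap[u]:
                    if v not in parent and cap[u][v] - flow[(u, v)] > 0:
                        parent[v] = u
                        queue.append(v)
            if sink not in parent:
                return total
            path, v = [], sink
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            push = min(cap[u][v] - flow[(u, v)] for u, v in path)
            for u, v in path:
                flow[(u, v)] += push
                flow[(v, u)] -= push
            total += push

    def evaluate(edges, fail_prob, reinforced, demand, source, sink, n=200, seed=0):
        """Policy evaluation: Monte Carlo estimate of expected unmet demand.
        Reinforced edges are assumed to survive; others fail independently."""
        rng, shortfall = random.Random(seed), 0.0
        for _ in range(n):
            surviving = {e: c for e, c in edges.items()
                         if e in reinforced or rng.random() > fail_prob[e]}
            shortfall += max(0, demand - max_flow(surviving, source, sink))
        return shortfall / n

    def improve(edges, fail_prob, budget, demand, source, sink):
        """Policy improvement: greedily add the edge whose reinforcement
        most reduces the evaluated expected shortfall."""
        reinforced = set()
        for _ in range(budget):
            best = min((e for e in edges if e not in reinforced),
                       key=lambda e: evaluate(edges, fail_prob,
                                              reinforced | {e}, demand, source, sink))
            reinforced.add(best)
        return reinforced

    # Toy instance: two s-t paths with different failure risks.
    edges = {('s', 'a'): 10, ('a', 't'): 8, ('s', 'b'): 10, ('b', 't'): 8}
    fail_prob = {('s', 'a'): 0.4, ('a', 't'): 0.3, ('s', 'b'): 0.1, ('b', 't'): 0.2}
    print(improve(edges, fail_prob, budget=2, demand=15, source='s', sink='t'))
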
dc.description.abstract	Once a policy is trained, we compare its performance against traditional two-stage stochastic programs with recourse that use a sample average approximation model. We consider several implementations of the stochastic problem to gauge performance in a variety of ways. The material presented here is developed in the context of preparing urban infrastructure against damage caused by disasters; however, it is applicable to any flow network. This work contributes to both multistage stochastic programming and approximate dynamic programming by introducing elements of each field to the other. We also apply innovative reinforcement learning techniques to flow networks, an application that, as of this writing, has yet to be addressed.
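
As a rough stand-in for the sample average approximation baseline mentioned above (not the stochastic-program implementations used in the thesis), one can draw a fixed set of failure scenarios up front and score every candidate first-stage reinforcement set by its average second-stage shortfall. The sketch below reuses the max_flow helper from the previous sketch; saa_baseline and its arguments are hypothetical names.

    import itertools
    import random

    def saa_baseline(edges, fail_prob, budget, demand, source, sink,
                     n_scenarios=100, seed=1):
        """Brute-force sample average approximation: fix a scenario sample,
        then pick the first-stage reinforcement set with the lowest average
        second-stage (recourse) shortfall over those scenarios."""
        rng = random.Random(seed)
        scenarios = [{e for e in edges if rng.random() > fail_prob[e]}
                     for _ in range(n_scenarios)]       # surviving edges per scenario

        def avg_shortfall(reinforced):
            total = 0.0
            for survive in scenarios:
                surviving = {e: c for e, c in edges.items()
                             if e in survive or e in reinforced}
                total += max(0, demand - max_flow(surviving, source, sink))
            return total / n_scenarios

        return min(itertools.combinations(edges, budget), key=avg_shortfall)

    # Example (using the toy instance above):
    # print(saa_baseline(edges, fail_prob, budget=2, demand=15, source='s', sink='t'))
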
dc.description.abstract	The primary solution technique takes forecast samples from a Monte Carlo simulation in the style of stochastic programming. Once a forecast is obtained, the problem is set up by taking additional samples from the forecast probabilities to determine edge capacities for the given time step; these form the state information used by the approximate dynamic program. The sampled outcome information defines the network constraints for the policy improvement step. The approximation of future costs is then refined by comparing the improved policy against a desirable target objective, and this process is repeated over several iterations. Lastly, we provide empirical evidence that corroborates basic convergence theorems for simpler forms of the reinforcement learning process.
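
The iterative refinement in this paragraph can be summarized as a generic stochastic-approximation loop: sample a forecast, form the state, improve the decision against the current future-cost approximation, and nudge that approximation toward the observed target. The callables and the linear feature-weight form below are placeholders for illustration, not the representation used in the thesis.

    import random

    def refine_future_costs(sample_forecast, features, act, observed_cost,
                            n_iterations=500, step=0.05, seed=2):
        """Generic iterative refinement of a linear future-cost approximation.
        sample_forecast(rng)           -> a sampled forecast (the state here)
        features(state)                -> feature vector of the state
        act(state, value_fn)           -> decision from the improved policy
        observed_cost(state, d, rng)   -> sampled outcome used as the target"""
        rng = random.Random(seed)
        theta = None                               # weights of the approximation
        for _ in range(n_iterations):
            state = sample_forecast(rng)           # forecast defines the state
            phi = features(state)
            if theta is None:
                theta = [0.0] * len(phi)
            value_fn = lambda s: sum(w * f for w, f in zip(theta, features(s)))
            decision = act(state, value_fn)        # policy improvement step
            target = observed_cost(state, decision, rng)
            error = value_fn(state) - target       # compare with the target objective
            theta = [w - step * error * f for w, f in zip(theta, phi)]
        return theta
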
dc.language.iso	ENG
dc.publisher	Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof	Rensselaer Theses and Dissertations Online Collection
dc.subject	Mathematics
dc.title	Using reinforcement learning to improve network durability
dc.type	Electronic thesis
dc.type	Thesis
dc.digitool.pid	167187
dc.digitool.pid	167189
dc.digitool.pid	167190
dc.rights.holder	This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree	PhD
dc.relation.department	Dept. of Mathematical Sciences

