Expected Usefulness…………………………………………. 7
1.6 Synopsis of Thesis……………………………………………. 7
II BACKGROUND THEORY……………………………………… 9
2.1 Markov Processes…………………………………………….. 9
2.1.1 Discrete-Time Markov Chain……………………….. 10
2.1.2 Markov Decision Process…………………………… 11
2.2 Reinforcement Learning…………………………………… 12
2.2.1 Monte Carlo Method………………………………… 14
2.2.2 Monte Carlo Estimation of Action Values…………… 15
2.2.3 Monte Carlo Control………………………………… 16
2.3 On-Policy Monte Carlo Method……………………………… 17
III SECURE ROUTING IN MANETS: A REINFORCEMENT
LEARNING PROBLEM………………………………………… 20
3.1 Introduction…………………………………………………. 20
3.2 Reputation Method………………………………………….. 22
3.3 Reputation as a Reinforcement Learning Problem…………… 25
3.4 Problem Formulation………………………………………… 27
3.5 Experimental Results………………………………………… 28
3.5.1 Accumulated Reward per Episode…………………… 30
3.5.2 Number of Packets Arrived at the Destination………. 31
3.5.3 Relative Throughput…………………………………. 32
3.5.4 Effect of Varying the Maximum Allowed Packets….. 33
3.6 Conclusions………………………………………………… 34
IV PERFORMANCE STUDY OF RL-BASED SECURE
ROUTING IN MANETS UNDER M/M/1/K MODEL…………... 36
4.1 Introduction………………………………………………….. 36