Reliability is defined as the extent to which a questionnaire, test, observation, or any other measurement procedure produces the same results on repeated trials. In short, it is the stability or consistency of scores over time or across raters. Keep in mind that reliability pertains to scores, not people. Thus, in research we would never say that someone was reliable. As an example, consider judges in a platform diving competition. The extent to which they agree on the scores for each contestant is an indication of reliability. Similarly, the degree to which an individual's responses (i.e., their scores) on a survey would stay the same over time is also a sign of reliability.
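To make "consistency of scores over time" concrete, here is a minimal sketch that treats the correlation between two administrations of the same survey as an index of stability. The choice of the Pearson correlation is an assumption made for illustration (it is one common way to quantify this kind of consistency, not something prescribed above), and the scores are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical survey scores for the same five respondents,
# collected on two occasions (illustrative data only).
time_1 = [12, 18, 25, 31, 40]
time_2 = [13, 17, 26, 30, 41]

test_retest_r = pearson_r(time_1, time_2)
print(f"Correlation between the two administrations: {test_retest_r:.3f}")
```

A value close to 1 means respondents kept roughly the same relative standing across the two occasions, which is the sense of stability described above. Note that the calculation is performed on the scores, not on the people who produced them.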
An important point to understand is that a measure can be perfectly reliable and yet not be valid. Consider a bathroom scale that always weighs you as being 5 lbs. heavier than your true weight. This scale (though invalid, as it incorrectly assesses weight) is perfectly reliable, as it consistently weighs you as being 5 lbs. heavier than you truly are. A research example of this phenomenon would be a questionnaire designed to assess job satisfaction that asked questions such as, “Do you like to watch ice hockey games?”, “What do you like to eat more, pizza or hamburgers?” and “What is your favorite movie?”. As you can readily imagine, the responses to these questions would probably remain stable over time, thus demonstrating highly reliable scores. However, are the questions valid when one is attempting to measure job satisfaction? Of course not, as they have nothing to do with an individual's level of job satisfaction. While this example may seem just a tad far-fetched, I hope that you grasp the underlying difference between reliability and validity.
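The bathroom scale example can be sketched in a few lines. This is only an illustration under simplifying assumptions: the true weight of 150 lbs. is invented, the scale is modeled as adding a fixed 5 lbs. with no random error, and the names `weigh`, `spread`, and `bias` are hypothetical.

```python
# Hypothetical true weight of the person stepping on the scale.
true_weight_lbs = 150.0

# The scale in the example always reads 5 lbs. too heavy.
SCALE_OFFSET_LBS = 5.0


def weigh(true_weight):
    """Simulate one reading from the miscalibrated scale."""
    return true_weight + SCALE_OFFSET_LBS


# Step on the scale ten times.
readings = [weigh(true_weight_lbs) for _ in range(10)]

# Consistency: how much the repeated readings differ from one another.
spread = max(readings) - min(readings)

# Accuracy: how far the average reading is from the true weight.
bias = sum(readings) / len(readings) - true_weight_lbs

print(f"Spread across repeated readings: {spread} lbs.")
print(f"Average error relative to true weight: {bias} lbs.")
```

The spread of zero reflects perfect consistency across trials, while the constant 5 lb. error shows that the readings are still wrong. Consistency alone therefore says nothing about whether the scale measures what it is supposed to measure.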