Non-Theists Are No Less Moral Than Theists: Some Preliminary Results

The longstanding stereotype that non-theists are less moral than theists is not empirically supported. To test this commonplace assumption, 114 undergraduate participants were evaluated to draw comparisons about religious identity and altruism levels. Participants were placed into one of two groups, theists or non-theists. The theist group was then further divided: weakly religious, moderately religious, and highly religious. Non-theists and theists as a whole, as well as theist subgroup assessments, were compared. Data were collected through self-report surveys. Additionally, to test moral decision-making abilities, participants answered questions based on situational dilemmas. Using Kohlberg’s coding schema, scores were assigned for the participant’s global moral reasoning rather than for the content of their answers. Based on independent-groups t-tests, ANOVA, and post-hoc tests, our findings provide no support for the stereotype that non-theists are less moral than theists. Religious identity did not conclusively determine whether or not an individual was more moral or more altruistic.


Introduction
In the United States, individuals who do not self-identify with any organized religion are generally seen in a negative light by those who do self-identify with organized religion (e.g., Harper, 2007; see also D'Andrea & Sprenger, 2007; Koproske, 2006). For example, Edgell, Gerteis, and Hartmann (2006, p. 281; see also Caldwell-Harris, Wilson, LoTempio, & Beit-Hallahmi, 2011) found that approximately half of Americans "Would Disapprove if My Child Wanted to Marry" an atheist. Yet non-theists are a growing minority, expanding from 8.2% of the US population in 1990, to 14.1% in 2001, to 15.0% in 2008 (Kosmin & Keysar, 2009, p. 1). Over the last decade, religious individuals have become more tolerant of members of other religions; however, they have not become more tolerant of non-religious individuals (Edgell et al. 2006). Such lack of tolerance possibly stems from a well-known stereotype that links elevated levels of religiosity with elevated levels of morality; that is, many people believe that non-theists are less moral than theists, on average (Zuckerman, 2009). Additionally, we know from interview research that non-religious individuals from rural states in the US, perhaps because of these negative stereotypes, typically hide their non-religious identity unless explicitly confronted by family and friends (Charles, Rowland, Long, & Yarrison, 2012; Rowland, Long, & Yarrison, forthcoming).
Generally, the justification for this widespread prejudice is the belief that in the absence of spiritual guidance by organized religion or a higher power, people cannot overcome their base impulses, which results in a lawless, mendacious, and otherwise amoral individual (Zuckerman, 2009). It is beyond the scope of this paper to speculate on the origins of this "amoral atheist" stereotype. It may be the case that most people are simply misinformed about the correlation between moral reasoning abilities and belief in God. At this point in history, it appears that the label of atheist has come to be associated with amorality and a general rejection of societal values, and not simply the lack of a belief in God. 2 The possibility of morality without the benefit of religious belief is an ongoing debate, both publicly (e.g., Kaminer, 1997) and among scholars (Epstein, 2009). Still, there is little empirical evidence that can be evaluated to determine whether theists and non-theists differ in the capacity for moral reasoning. This preliminary study provides such evidence by measuring moral reasoning and self-reported altruistic behavior in a modest sample of college students.

Prior Research
While studies have examined beliefs about theists and non-theists, little empirical work directly compares theists to non-theists. The exceptions generally find little basis for bias against non-theists. For example, a survey by the Pew Forum on Religion and Public Life (2009) found that non-theist Americans are the group least supportive of governmental use of torture. Caldwell-Harris et al. (2011) sent questionnaires to theists and non-theists to analyze their levels of wellbeing, spirituality, self-compassion, interpersonal reactivity, religious background, and to what extent they believe in magic. They found no difference between theists and non-theists in levels of compassion, empathic concern, perspective taking, fantasy, or personal distress. Additionally, while some studies have found that religion impedes criminal behavior (Bair & Wright, 2001; Powell, 1997; Bainbridge, 1989; Elifson et al. 1983; Peek et al. 1985, as cited in Zuckerman, 2009), several other studies found that religiosity has no significant effect on inhibiting criminal activity, and that murder rates are lower in more secular nations and higher in more religious nations (e.g., Jensen, 2006; Paul, 2005; Fox & Levin, 2000, as cited in Zuckerman, 2009).

Measuring Morality
Morality has numerous ambiguous definitions. Commonly conceptualized as compassion toward fellow human beings or sympathy for others experiencing difficult situations, especially those involving physical or mental pain, morality can also be observed and measured when thinking about what to do in various situations. There are two common types of measures for morality, namely, measures of altruistic behavior and measures of moral reasoning. These measures capture participants' past moral behavior and their current moral reasoning abilities. 3 Altruistic behavior is generally considered moral, as it involves placing others' needs before one's own. Altruistic behavior can be measured using self-report scales, for example, Rushton, Chrisjohn, and Fekken's (1981) Self-Report Altruism scale composed of a 20-question survey that asks participants how often they have engaged in various altruistic behaviors. With any scale that requires participants to reflect on past behavior, we know that data collected with these instruments suffer from retrospective bias, normative bias, and the potential for exaggeration or self-aggrandizement. However, this particular scale has been shown to correlate well with performance of altruistic behaviors.
Moral reasoning is not a measure of moral behavior per se, but a proxy reflecting one's intellectual approach to situations with respect to the moral consequences of different actions. The classic method for measuring moral reasoning is an interview designed by Kohlberg (1969), which can be adapted to other data gathering formats, such as self-reported surveys, to streamline coding (see Appendix). Kohlberg's theory of stages of moral reasoning measures people's reasoning ability by evaluating their reactions to hypothetical moral dilemmas (e.g., Colby et al. 1983; Kohlberg, 1969). In this system of measurement, participants are not evaluated based on their ability to select the "moral answer" for each dilemma, but rather on the type of reasoning they use to justify the behavioral option they selected (Crain, 1985). Based on their justifications, participants can be placed on a six-point scale indicating the stage of moral reasoning they have reached, or the stages they are between. Unlike the previous measure, which focused on altruistic behavior and has been shown to reflect verifiable behaviors, this measure examines moral reasoning. This is of particular interest because the stereotype of the "amoral atheist" pertains to both moral reasoning and moral behavior. 4 Despite the immense utility of Kohlberg's dilemmas, we agreed with critics that his dilemmas are gender-biased and dated. 5 To address this concern, we replaced one of Kohlberg's dilemmas in order to tap into contemporary issues regarding abortion, and revised two dilemmas in order to update their language and reflect contemporary dollar values. Only the Heinz dilemma remained intact. (See Appendix for the complete dilemmas.) Additionally, while Kohlberg collected data through face-to-face interviews, we did not. Time-intensive interviews were streamlined through the use of on-line survey data collection techniques.
In the spirit of an interactive interview situation, we also collected responses to a number of questions including open-ended questions such as "what would you do?" and "why would you do that?" (see Appendix).

Research Hypotheses
Based on the stereotype that non-theists are less moral than theists, we generated and tested the following two hypotheses:
1. Non-theists will score lower than theists in terms of altruistic behavior.
2. Non-theists will score lower than theists in terms of moral reasoning.
More specifically, we hypothesized that:
3. Non-theists will score lower than religious individuals in any category (highly, moderately, and weakly religious) regarding altruistic behavior.
4. Non-theists will score lower than religious individuals in any category (highly, moderately, and weakly religious) regarding moral reasoning.

Method
The first author (an undergraduate thesis student) gained IRB approval for the study, oversaw the recruitment of participants from Introductory Psychology courses, was responsible for gaining informed consent from participants, oversaw the collection of data, and led a team of assistants during the analysis and preparation of results. Participants came to a research laboratory to complete a survey administered through Checkbox, an online survey system (www.checkbox.com, 2002) at their own pace with complete confidentiality. Participants commonly arrived in small groups, but completed the questions in individual rooms so that other participants could not view their answers or the pace at which their surveys were completed.

Instruments, Scales, and Procedures
First, participants responded to the content of four moral dilemmas (see Appendix). Moral dilemmas are paragraph-long scenarios, in which the main character is forced to make a (potentially difficult) decision in a (potentially difficult) situation. For each dilemma, participants answered 10 questions. The first, third, and fourth dilemmas (or scenarios) were Kohlberg's (Colby et al. 1983), which are the "Joe," "Louise," and "Heinz" dilemmas, respectively. The second dilemma, the "Courtney" dilemma, was original (i.e., we designed it for the purpose of this study). Participants were asked to answer four different types of multiple choice questions after reading each dilemma: (type-1) what would you do?; (type-2) what should the protagonist do?; (type-3) why elect that option?; and (type-4) another group of questions that tie in to relevant social norms.
For example, in the Appendix, question 10, a type-1 question, asked whether or not the participants would have an abortion if they found themselves in "Courtney's" position. By comparison, question 19, a type-2 question, asked "Thinking back over this dilemma, what would you say is the most responsible thing for Courtney to do in this situation?" Questions 11, 16, and 18 were type-3 questions, asking the participants to explain why they made the particular decision that they did. In each of these questions, the multiple choice answer always included an option that allowed participants to indicate religion as their primary reason for making their decision (e.g., "I would feel religiously obligated not to have an abortion" or "God would be upset"). Questions 12, 13, 14, 15, and 17 were type-4 questions, asking about different social norms (e.g., "Is it wrong to consider having an abortion?", or "Is it acceptable for unmarried people to be having sex?"). These questions allowed us to take a more in-depth look at participants' reasoning.
After completing the moral dilemmas, participants completed five surveys: the Self-Report Altruism Scale (Rushton et al. 1981); the Attitudes Toward God Scale (ATGS 9, Wood, Worthington, Exline, Yali, & Aten, 2010) 6; the Attachment to God Inventory (Beck & McDonald, 2004); a Religious Upbringing Scale (RUS, Charles, Rowland, & Didyoung, unpublished); and the Religious Commitment Inventory (Worthington, Wade, Hight, Ripley, & McCullough, 2003). Data from the ATGS 9, the Attachment to God Inventory, and the RUS are not analyzed here. Participants also completed four demographic questions: gender, age, religious preference (if any), and belief in God (yes/no). We did not ask about race or socioeconomic status.

Scoring and Coding
The open-ended questions in the moral dilemmas were scored in compliance with Kohlberg's published coding instructions (Colby et al. 1983). Participants' responses to each dilemma were coded, and the resulting scores were combined to give each participant a Moral Maturity Score (MMS). Roughly speaking, the process of scoring is as follows: the coder identifies the theme being taken into account by the participant (e.g., concern for other), which comes from a list prepared by Colby and Kohlberg (1983). The coder then rates the sophistication of the participant's reasoning about each theme (e.g., it is "stage 3", or "between 3 and 4"), according to Colby and Kohlberg's (1983) instructions, and then a weighted average is taken. Thus, an MMS score of 1 indicates the lowest possible score (i.e., only obedience and punishment driven reasoning), while a score of 3 indicates a higher level of moral reasoning (i.e., the search for interpersonal accord and compliance with general social norms). In principle, Kohlberg's scale has a ceiling of 6, which implies the highest level of moral reasoning (i.e., reasoning driven entirely by "universal principles"). The first author, along with two research assistants working under his supervision, coded every survey. The group met to determine the rate of consistency between independently derived codes. Because of intensive training prior to the coding process, nearly unanimous coding took place (more than 95% consistency between coding). In the few cases where disagreement arose, the group met and consensus was reached. This process lasted nearly five months in the fall of 2011. The other five surveys were coded by summing participants' responses, including a number of reverse-scored items.
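The summing step for the survey scales can be illustrated with a minimal sketch. The function name, the 1-to-5 response range, and the item positions are hypothetical and are not taken from the instruments themselves; the only idea carried over from the text is that reverse-scored items are flipped before the responses are summed.

```python
def score_scale(responses, reverse_items, low=1, high=5):
    """Sum Likert-type responses, flipping reverse-scored items first.

    responses: list of integer answers, one per item.
    reverse_items: set of 0-based item positions that are reverse-scored.
    low, high: endpoints of the response scale (hypothetical defaults).
    """
    total = 0
    for position, answer in enumerate(responses):
        if not low <= answer <= high:
            raise ValueError(f"answer {answer} is outside {low}-{high}")
        # A reverse-scored item maps the lowest answer to the highest and vice versa.
        total += (low + high - answer) if position in reverse_items else answer
    return total


# Hypothetical three-item scale in which the second item is reverse-scored.
example_total = score_scale([1, 5, 3], reverse_items={1})
```

On a 1-to-5 scale, a reverse-scored answer of 5 contributes 1 to the total, so agreement with a negatively worded item lowers the scale score rather than raising it.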

Assignment of Religious Categories
The demographic information and data from the Religious Commitment Inventory were used to categorize participants into groups based on their level of religiosity. First, participants who said they did not believe in God or who listed their affiliation as Atheist or Agnostic were categorized "non-theist." We performed a quartile split on the Religious Commitment Inventory scores of the remaining participants. Participants in the bottom quartile were labeled "weakly religious," those in the top quartile were labeled "highly religious," and those in the middle were labeled "moderately religious." 7 In this study, "non-theist" included atheists, who lack belief in the existence of a god or supreme being, and agnostics, who believe that the existence of a god or supreme being cannot be determined (Miovic, 2004). Following Kosmin and Keysar's (2009) categories, non-theists also included anyone self-identifying as Secularists, Humanists, Ethical Culturalists, and individuals with no religious preference. These groups are commonly referred to as "Nones" because, when asked "What is your religion, if any?" members of these groups respond "None" (Kosmin & Keysar, 2009, p. 2). In this study, theists included individuals who self-identify with a major religion, which, in our sample, includes Catholics, Jews, and Protestants. We recruited participants from Introduction to Psychology classes at a rural college and 114 student participants joined the study (representing roughly half of the participant pool).
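The two-step assignment described above (non-theists first, then a quartile split of the remaining Religious Commitment Inventory scores) can be sketched as follows. This is an illustration under stated assumptions, not the procedure as actually run: the dictionary keys, the function name, and the treatment of scores that fall exactly on a cut point are hypothetical, and the set of non-theist affiliations would need to be extended to cover the other "None" categories.

```python
from statistics import quantiles

# Affiliations treated as non-theist in this sketch (extend as needed).
NON_THEIST_AFFILIATIONS = {"Atheist", "Agnostic"}


def assign_religious_categories(participants):
    """Return one category label per participant, in input order.

    Each participant is a dict with hypothetical keys:
    "believes_in_god" (bool), "affiliation" (str), "rci" (numeric score).
    """
    def is_non_theist(p):
        return (not p["believes_in_god"]) or p["affiliation"] in NON_THEIST_AFFILIATIONS

    # Quartile cut points are computed on the theists only.
    theist_scores = [p["rci"] for p in participants if not is_non_theist(p)]
    q1, _median, q3 = quantiles(theist_scores, n=4)

    labels = []
    for p in participants:
        if is_non_theist(p):
            labels.append("non-theist")
        elif p["rci"] <= q1:  # assumption: ties at the cut go to the outer group
            labels.append("weakly religious")
        elif p["rci"] >= q3:
            labels.append("highly religious")
        else:
            labels.append("moderately religious")
    return labels
```

Because the middle category spans two quartiles, it will typically hold about twice as many participants as either outer category.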
To add further validity to our religious categories, we analyzed the type-3 questions in the moral dilemmas survey to determine the number of times members of a given religiosity level claimed religion as their primary reason for behavioral decisions. Significant differences were found between the groups (F (3, 110) = 21.35, p < .05). Non-theists (M = .000, SD = .000) did not differ from weakly religious participants (M = .250, SD = .676); however, both those groups selected religious justifications less frequently than did the moderately religious (M = 1.04, SD = 1.160); and the moderately religious selected religious justifications less frequently than did the highly religious (M = 2.40, SD = 1.658).
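For readers less familiar with the F statistic reported here, the following minimal sketch shows how a one-way ANOVA F value and its degrees of freedom are computed from raw group scores. The function name and the example scores are hypothetical; no raw data from the study are reproduced, and the sketch does not compute p values or post-hoc comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of scores around their own group mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical counts of religious justifications for four small groups.
f_stat, df_between, df_within = one_way_anova(
    [[0, 0, 0, 1], [0, 1, 0, 0], [1, 2, 0, 1], [3, 2, 4, 1]]
)
```

With four groups and 114 participants, the degrees of freedom are 4 - 1 = 3 and 114 - 4 = 110, which matches the F (3, 110) notation used throughout the results.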

First Hypotheses: Theists and Non-theists
The data did not support our first hypotheses. Theists did not significantly differ from non-theists in terms of self-reported altruism (t (112) = -1.157, p > .05) or moral reasoning (t (112) = -0.038, p > .05). While we are aware of the difficulties in interpreting null results, our analysis has adequate power to detect small effect sizes, leading us to conclude that any undetected effects are quite small. The analyses below indicate we were able to detect several other effects. These results are admittedly preliminary.

Social Norm Decisions Relevant to Abortion Scenario
Because groups only differed in their projected behaviors indicated in scenario 2, we further analyzed type-4 questions for scenario 2, the "Courtney" dilemma. Question 12 asked whether it is acceptable for the characters of the dilemma to be involved in premarital sex. A one-way ANOVA found a statistically significant difference between the groups (F (3, 110) = 10.647, p < .05). A Tukey HSD test indicated that the mean for non-theists (M = 0.89, SD = 1.451) differed significantly from the mean for the moderately religious (M = 3.13, SD = 2.525) and from the mean for the highly religious (M = 4.64, SD = 3.094); the mean for the weakly religious (M = 1.71, SD = 1.805) differed significantly from the mean for the highly religious (see Figure 3a).
Religious individuals were more accepting of premarital sex when it was emphasized that the couple had been in a relationship for two years. For Question 17 there was still a significant difference between groups (F (3, 110) = 3.543, p < .05). However, for that question the only difference indicated by a Tukey HSD was that the mean for the highly religious (M = 6.88, SD = 3.180) differed from those of all other groups: non-theists (M = 9.17, SD = 1.465), weakly religious (M = 8.25, SD = 2.327), and moderately religious (M = 8.30, SD = 2.176) (see Figure 3b).
Figure 3. Box plots showing attitudes towards premarital sex. Questions 12 and 17 can be summarized as follows (A) 12: Is premarital sex acceptable? (B) 17: Given that they have been together for 2 years? Question 12 has been reverse coded so that higher scores in both plots indicate sex is acceptable, while lower scores indicate it is not. Note that there is (A) a fairly linear trend between degree of religiosity and the belief that even premarital sex is wrong, unless (B) you point out that there is a long-term relationship, in which case Weak and Moderate Religious people are (as a group) indistinguishable from Non-Theists, and that even the majority of Highly Religious people think premarital sex is acceptable.

Higher degrees of religiosity were associated with lower levels of tolerance for Courtney's consideration of the abortion, which was observable from analysis of Question 15. There was a significant difference across groups (F (3, 110) = 11.249, p < .05). A Tukey HSD test indicated that the mean for non-theists (M = 1.33, SD = 2.497) differed from the mean for weakly religious (M = 3.46, SD = 2.654), the mean for weakly religious did not differ from the mean for moderately religious (M = 4.15, SD = 2.956), and all groups' means differed from the mean for highly religious (M = 6.28, SD = 2.865) (see Figure 4a).
Analysis of Question 14 indicated that theists thought that Courtney should bear the "burden" of being a teenage mother, whereas non-theists did not (F (3, 110) = 13.892, p < .05). A Tukey HSD test indicated that the mean for non-theists (M = 2.22, SD = 2.157) differed significantly from the means for all theist groups: weakly religious (M = 6.00, SD = 2.670), moderately religious (M = 6.02, SD = 2.739), and highly religious (M = 7.16, SD = 2.461). However, the theist groups did not differ from each other.
In contrast, Question 13 indicated agreement about who to blame for the circumstances of the dilemma. There was no significant difference between groups (F (3, 110) = 2.257, p > .05), and all groups indicated that fault rested on Courtney (see Figure 4c).

Figure 4. Box plots showing additional social-norm judgments from Scenario 2. Questions 15, 14, and 13 can be summarized as follows (A) 15: Is considering an abortion wrong? (B) 14: Must Courtney accept being a teen parent? (C) 13: Is it Courtney's fault for forgetting her birth control? Note that there is (A) a fairly linear trend between degree of religiosity and the belief that even considering an abortion is wrong, (B) a bifurcation with Theists thinking that Courtney must have the child, and Non-theists at least admitting her a choice, but that there is (C) agreement across all groups (on average) that it is Courtney's fault that she is pregnant.

Discussion
The fool hath said in his heart, there is no God. They are corrupt, they have done abominable works, there is none that doeth good (Psalm 14.1, The Holy Bible, King James Version) 8
Social acceptance of non-religious individuals in the United States has not kept pace with their growing numbers. The stereotype that non-theists are amoral endures among religious individuals, partly justified by particular readings of classic works, such as the Psalms; however, this tension is also readily seen in contemporary works, especially those locked in debate such as The Case for God (Armstrong, 2009) and Atheism: The Case Against God (Smith, 1979).
Social scientific and behavioral research, however, has not taken up this issue empirically. To the best of our knowledge, these preliminary results are the first empirical test of this stereotype, and the results evidence no difference in altruistic behavior or moral reasoning abilities between theists and non-theists (see Figure 1). Additionally, close inspection of the few observed differences between groups suggests that the cultural boundary separating morality from amorality may have changed over time in such a way that keeps non-theists ever outside the bounds of morality in this ongoing conversation about which groups are and are not moral.
On point, the behavioral choices that theists and non-theists considered to be culturally appropriate solutions to moral dilemmas were surprisingly similar in a number of cases. In Scenario 1, for example, all groups indicated it was acceptable to disobey direct instructions from a parent (see Figure 2a). In Scenario 3, all groups indicated it was acceptable to bear false witness to a parent (see Figure 2c). In Scenario 4, all groups expressed ambivalence about the acceptability of theft (see Figure 2d). The only significant difference in behavioral choices across the dilemmas was in Scenario 2, in which non-theists were, on average, more likely than were any of the theist sub-groups to indicate that they would have an abortion in the context of the situation described in the dilemma (see Figure 2b). Thus, any attempt to emphasize the differences in moral decision making between theists and non-theists can hold only if it focuses on the very narrow points of disagreement (e.g., abortion), while ignoring the wide swaths of similarity between the groups (as indicated by the other three dilemmas).
Even emphasis on the very limited moral question of whether abortion is acceptable overstates the differences between theists and non-theists. While the groups studied seemed to have different opinions about the acceptability of premarital sex (see Figure 3a), the groups all judged premarital sex acceptable when the question was rephrased to emphasize the ongoing and "committed" nature of the relationship (see Figure 3b). While group opinions differed in terms of the acceptability of Courtney even considering an abortion, all but the highly religious erred on the side of it being acceptable (see Figure 4a). Further, all groups agreed that responsibility for the precarious situation was Courtney's (see Figure 4c). The only question that suggested a clean divide between theists and non-theists was whether Courtney's responsibility for the situation obligated her to keep the child (see Figure 4b).

Limitations
We acknowledge that there is ongoing debate about whether or not Kohlberg's morality model is indicative of actual human behavior (Fishbein & Ajzen, 1975). Reasoning and action are obviously not the same. There is also the potential lack of generalizability. The religious affiliation among participants was limited. Further, our data were collected in a rural US context, and the participants were primarily traditionally aged college students.
We also recognize that with such a modest sample in our preliminary study, the error associated with measures of group means can be quite large. As mentioned earlier, this implies that null results are likely to be found when small differences are still present. We believe there is no substantial change to our conclusions if such small differences exist, though they might be important for other purposes. After all, the prominent stereotype that our data argue against would predict large differences, which we would have easily detected if they were present.

A comment on the shifting cultural boundaries of morality
In closing, it is worth noting that much of the interesting data in this study was derived from our transformation of Kohlberg's dilemmas. We used three dilemmas from his monograph (Colby et al. 1983) and a new dilemma created for the study about abortion. We believe that if we used more of Kohlberg's original dilemmas, we would have continued to observe similarities between theists and non-theists. While it is an open empirical question, we do not know whether Kohlberg's dilemmas ever distinguished theists from non-theists in a culturally meaningful way; Kohlberg's original analysis from the 1950s did not report on this comparison. What we do know is that our new dilemma regarding abortion was the only one in this study to show a clear statistical difference between the groups. Perhaps not by coincidence, abortion is one of the most prominent issues in modern debates about morality.
We now hazard a hypothesis: Non-theists are consistently deemed amoral by theists, but for different reasons depending on the social and historical context. To illustrate, consider the cultural boundary that is now drawn among our participants regarding when premarital sexual relations are appropriate. We suspect that even the weakly religious participants in Kohlberg's studies from the 1950s would have judged premarital sex morally inappropriate after two years of dating. In contrast, our contemporary study shows that even highly religious participants considered pre-marital sexual relations acceptable, especially after two years in a committed relationship. Previously, cultural boundaries surrounding pre-marital sexual relations may have differentiated between "moral theists" and "amoral non-theists"; however, the same issue no longer maintains the boundary. Under such circumstances, either the stereotype must fall away or a new criterion must be established to maintain the boundary. In this particular case, if US theists are motivated to maintain their perceptions of moral superiority, the historic trend toward moral homogeneity puts ever growing pressure on whatever points of contrast remain to cleanly separate them from non-theists. The entire weight of the negative stereotype comes to fall on an increasingly truncated subset of issues.
This may help explain the particular vehemence with which people argue about issues such as abortion. As one of the few current boundaries keeping the negative stereotype about non-theists alive, opinions about abortion have become a crucial component of religious adherents' self-identity as moral arbitrators. When abortion inevitably fails to polarize morality debates, those seeking to maintain their identities as moral arbitrators will, out of necessity, find some new issue to hoist up as firm proof that non-theists are (still) amoral.

Scenario II
Courtney is a 17-year-old girl. Her boyfriend and her have been dating for two years. They are sexually active, only with each other. Courtney started taking birth control in the second year of their relationship. Courtney started taking birth control because she was afraid to become pregnant as a teenager. She feels that could destroy her life because she is so young. Courtney tends to be forgetful and accidentally stopped taking her pill for 2 weeks. During this period she had intercourse with her boyfriend and became pregnant. She is afraid to tell her mother, and horrified at the thought of having this child. She and her boyfriend feel as if they are not ready to be parents, and that they have too much to lose in having this child. Courtney is considering having an abortion. Should she go through with this idea? Why? ___________________

Scenario III
Judy was a twelve-year-old girl. Her mother promised her that she could go to a special rock concert coming to their town if she saved up from baby-sitting and lunch money to buy a ticket to the concert. She managed to save up the 50 dollars the ticket cost plus another five dollars. But then her mother changed her mind and told Judy that she had to spend the money on new clothes for school. Judy was disappointed and decided to go to the concert anyway. She bought a ticket and told her