In sum, our findings highlight both the potential and the boundaries of LLMs in climate communication: they can shift attitudes modestly, especially when the information is delivered in a naturalistic and personalized format, but they are no silver bullet. As such, their use in climate communication should be approached with critical scrutiny and integrated thoughtfully alongside more traditional forms of science communication and engagement.
We initially collected data from 1,047 US participants via Prolific, using the prescreening option to obtain a representative sample across age, gender and political ideology. Participants were compensated an equivalent of £7.49 per hour for their time (preregistration https://osf.io/v7d9c/). The initial recruitment number was somewhat higher than the 1,000 participants who were preregistered because some participants were timed out by the system and so did not count against the 1,000 quota we requested (all participants were compensated for their time regardless). As preregistered, participants who expressed no view on climate change or who failed an attention check were removed from analysis. This left 949 usable participants (male, 49.7%; female, 49.2%; non-binary/other, 1.1%) with an average age of 45.3 years (s.d. = 15.6 years). We measured economic and social political ideology using a two-item measure on a seven-point scale (1, left/liberal; 7, right/conservative; r = 0.86). Average scores approximated the midpoint (M = 3.82). The sample skewed well-educated: 56.0% of participants reported having at least a bachelor's degree.
In line with our preregistration (https://osf.io/97ay4) we paid 500 US participants sourced from Prolific; the final dataset included an additional 14 participants who timed out or did not complete and therefore did not count against this quota. Unlike study 1, for which we aimed for a representative sample, the goal of study 2 was to focus on climate sceptics only. As such, we used the prescreening filters of Prolific to invite only participants who responded to the question "Do you believe in climate change?" with "No". However, relying exclusively on participants' self-categorization as sceptical on the Prolific system is unreliable (1) because participants may have changed their mind since registering with Prolific and (2) because participants may strategically tick boxes for niche populations to maximize their chances of being invited for survey work. Given this -- and given that the interpretation of the confidence measure is contingent on the initial statements being sceptical -- we coded participants' initial statements to screen for signs of climate scepticism. Of the initial sample, 91 revealed no signs of scepticism and were removed from further analysis. A further 23 participants failed an attention check, leaving 400 participants who completed the initial intervention. Of these, 333 completed the 2-week follow-up survey and this was the sample on which we conducted the main analyses. We note that some participants who completed the pre-intervention surveys did not complete the post-intervention surveys, which helps account for the minor discrepancies between the sample sizes listed here and the degrees of freedom reported in the analyses.
This sample comprised 198 female and 135 male participants, with an average age of 45.6 years. Overall, 49.5% of participants reported having at least a bachelor's degree. Unsurprisingly given the focus on climate sceptics, the sample skewed conservative (M = 7.74). Participants who completed the follow-up were somewhat older than non-completers (P = 0.028), but there were no other statistically significant differences between completers and non-completers with respect to demographics or pre-intervention scores on the four dependent measures.
We report the three-level analyses in the main paper (pre-intervention, post-intervention and follow-up). We note that the preregistration only featured the pre- and post-intervention waves: analyses of these scores are summarized separately in Supplementary Table 7 and Supplementary Fig. 1. The pattern of results is the same in these analyses as for the main analyses, with effects of time across all measures and no effects of treatment either as a main effect or as part of an interaction with time. Participants were compensated an equivalent of £7.50 per hour for completing the initial survey and £12 per hour for completing the follow-up survey.
For both experiments we adapted the procedure developed by ref. with several notable differences. Participants were first asked to enter their opinion on climate change in an open-ended text box: "We'd like you to share your overall perspective on climate change in a few sentences. Consider the following points in your response: How do you generally view climate change? Do you believe humans have a significant impact on the environment? There are no right or wrong answers -- we're interested in your honest thoughts and opinions". After writing their initial response, participants were asked "Could you share more about what led you to this opinion? For instance, are there specific pieces of evidence, events, sources of information, or personal experiences that have particularly influenced your perspective? Please describe these in as much detail as you feel comfortable". Participants then reported their age, gender, education and political ideology before the experimental manipulation.
We piped both responses related to climate change (the initial opinion and the elaboration) into GPT 4o-Turbo, which was prompted to summarize each participant's views on climate change in a single sentence (see Supplementary Note for the prompt). We presented this summary to participants and asked them to indicate their level of confidence on a 0-100% scale that this statement was true (see Fig. 1 for an example). As in ref. , our paradigm was facilitated by the online AI aggregator OpenRouter, which was integrated into Qualtrics to pipe participant responses between Qualtrics and GPT 4o-Turbo. Also consistent with ref. , we disabled the copy-and-paste function to prevent participants from using LLMs to create responses.
In study 1, participants were randomly assigned to one of two conditions. Participants in the AI condition engaged in a three-round conversation with GPT 4o-Turbo about climate change. Unlike the method used by ref. , GPT 4o-Turbo was unprompted for the duration of this conversation. Apart from being given the participants' two initial text entries on climate change, it was not instructed to persuade participants about climate change beyond its default directives and so responded with its default behaviour. In short, these three-round interactions were as similar as pragmatically possible to a participant accessing GPT 4o-Turbo themselves on their own computer and typing their views on climate change into the chatbox.
We used OpenRouter to facilitate conversations between participants and ChatGPT. OpenRouter provided our interface for accessing LLMs using default settings (temperature parameter of 1) that mirror the OpenAI API configuration. Our Qualtrics survey was designed to constrain the number of conversational rounds to three. We did not include instructions to the LLM to constrain the discussion to a set number of exchanges. For the sake of continuity, we incorporated the complete history (both AI and human messages) in API calls for each conversational round. We allowed AI responses without token limits, resulting in comprehensive replies often featuring several paragraphs and markdown formatting. Instead of word-by-word streaming, participants received fully constructed AI messages, with loading screens displayed during response generation intervals. This approach created an interaction paradigm that closely approximated the experience of conversing with ChatGPT through its standard web interface, maintaining ecological validity while satisfying our experimental requirements.
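The round-by-round piping described above can be sketched as follows. This is an illustrative reconstruction rather than the study's actual code: the request shape follows OpenRouter's OpenAI-compatible chat-completions endpoint, and the model identifier (`openai/gpt-4o`) and helper names (`with_turn`, `next_round`) are assumptions for the sketch.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "openai/gpt-4o"  # assumed OpenRouter model id for the GPT 4o-Turbo family

def with_turn(history, role, content):
    """Return a new message list with one turn appended (the input is not mutated)."""
    return history + [{"role": role, "content": content}]

def next_round(history, user_message, api_key):
    """One conversational round: resend the complete history (AI and human
    messages) plus the new participant turn, then append the model's reply so
    that the next round sees the whole exchange."""
    messages = with_turn(history, "user", user_message)
    payload = {
        "model": MODEL,
        "messages": messages,   # full conversation so far, for continuity
        "temperature": 1,       # default setting mirroring the OpenAI API
        # no max_tokens key: reply length is unconstrained, as in the study
    }
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)["choices"][0]["message"]["content"]
    return with_turn(messages, "assistant", reply)
```

Limiting the exchange to three (or, in study 2, six) rounds is then a matter of calling `next_round` a fixed number of times from the survey logic, which matches the description above of constraining round count in Qualtrics rather than instructing the LLM.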
In the alternative information intervention condition of study 1, participants read a press release from the AR6 Synthesis Report released by the IPCC on 20 March 2023. We selected the IPCC AR6 Synthesis Report press release as our comparison condition because it represents one of the most authoritative, widely publicized and policy-relevant examples of international climate science communication. As a formal summary of the Sixth Assessment Report, it distils key findings intended for broad dissemination, including to policy-makers, media and the public. The press release version was chosen because it is a public-facing, accessible format that parallels the function of ChatGPT responses (both are concise, text-based explanations of climate issues aimed at lay audiences).
In study 2, all participants completed a conversation with ChatGPT, but they were randomly allocated to complete a three-round or a six-round conversation. Again, the number of rounds was determined through coding in Qualtrics; the AI was not instructed to keep the conversation to a particular length.
After the interventions, participants were presented with their original AI-summarized view on climate change and asked again to indicate their confidence. Participants were also asked to complete the same climate scepticism, policy support and pro-environmental action intentions scales used in the pre-intervention survey.
The same measures were used in both studies 1 and 2 and are itemized in Supplementary Table 1. We first presented participants with a paraphrased version of their open-ended statements about climate change, before asking them to rate their 'level of confidence that this statement is true' (0, definitely false; 100, definitely true).
Climate change scepticism was measured using a 12-item scale developed by ref. . Three items captured each of four dimensions of scepticism: trend scepticism (for example, "I am not sure that climate change is actually occurring"), attribution scepticism (for example, "The climate change we are observing is just a natural process"), impact scepticism (for example, "I believe that most of the concerns about climate change have been exaggerated") and response scepticism (for example, "There is not much we can do that will help solve environmental problems"). Participants responded on a slider scale (0, strongly disagree; 100, strongly agree). Although the scale canvasses four subdimensions of scepticism, exploratory factor analysis of the pre-intervention scores revealed strong evidence for a one-factor solution in both studies, and so we averaged all scores into a single scale (study 1, α = 0.96; study 2, α = 0.84).
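The internal-consistency check and scale construction can be sketched with a generic Cronbach's alpha computation; this is a minimal illustration on toy data (the function names are ours, and the real analysis used the full samples of 0-100 slider scores).

```python
def cronbach_alpha(items):
    """Cronbach's alpha. items: one list of scores per item, all of equal
    length (one score per participant per item)."""
    k, n = len(items), len(items[0])

    def var(xs):  # population variance; only the ratio of variances matters
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Total score per participant across the k items
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

def composite_scores(items):
    """Average the k item scores into a single scale score per participant."""
    k, n = len(items), len(items[0])
    return [sum(item[i] for item in items) / k for i in range(n)]
```

An alpha near 1 (as observed here, 0.84-0.96) indicates that the items move together closely enough to justify averaging them into one scepticism score per participant.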
Policy support was measured using nine items developed by the authors. Items canvassed diverse elements of policy support: support for renewable energy (for example, "Investing in renewable energy should be a priority for our country"), support for regulation and legislation (for example, "I am in favour of strict regulations to limit carbon emissions from factories and vehicles"), personal and community action (for example, "I support local initiatives to reduce waste and promote recycling") and support for global and national agreements (for example, "Our country should adhere to international agreements aimed at reducing climate change"). Participants responded on a slider scale (0, strongly disagree; 100, strongly agree). The items were highly correlated and were averaged to form a single scale (study 1, α = 0.96; study 2, α = 0.90).
Pro-environmental action was measured using a three-item scale developed by ref. . The three items were: "I want to change my lifestyle in ways that help to address climate change", "I am not at all motivated to help reduce climate change" (reverse coded) and "I am prepared to greatly reduce my energy use to help tackle climate change" (0, strongly disagree; 100, strongly agree; study 1, α = 0.87; study 2, α = 0.81). Data for both studies are publicly available (https://osf.io/mtv39/).
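The reverse-coding step for the negatively worded item can be sketched as follows; the function names are illustrative, and the rule shown (reflecting a 0-100 slider score about the scale maximum before averaging) is the standard treatment of reverse-keyed items.

```python
def reverse_key(score, scale_max=100):
    """Reflect a reverse-keyed item about the scale maximum (0-100 slider)."""
    return scale_max - score

def action_intention(lifestyle, not_motivated_raw, reduce_energy, scale_max=100):
    """Average the three items after reverse-keying the 'not motivated' item."""
    items = [lifestyle, reverse_key(not_motivated_raw, scale_max), reduce_energy]
    return sum(items) / len(items)
```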
In study 1, we coded participants' initial beliefs about climate change (that is, their open-text responses to the questions that asked for their views) into three categories: contains no scepticism, 0; contains some scepticism, 1; and expresses no view on climate change, 2. When coding responses for signs of scepticism in the open-ended comments, we instructed coders to focus on the dimensions of scepticism outlined in the introduction: trend scepticism, attribution scepticism, impact scepticism and response scepticism. Specifically, open-ended comments were considered to contain scepticism if they expressed sentiments that climate change is not real; that it is not primarily caused by humans; that the effects of climate change will not be as extreme as the science predicts; or that scientifically endorsed solutions do not work. There was substantial agreement (89.26%) between two coders, one of whom was not part of the research team. In cases where there was disagreement between the two raters, a third rater adjudicated. The analyses reported in study 1 focus only on participants who were coded as sceptical (1) or non-sceptical (0). An independent coder also examined open-ended comments in study 2, using the same coding scheme. Analyses reported in study 2 focus only on participants who were coded as sceptical.
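The double-coding step above can be sketched as a simple agreement check; this is a hypothetical reconstruction (the function name and label encoding of 0 = no scepticism, 1 = some scepticism, 2 = no view are taken from the coding scheme described, but the code itself is ours).

```python
def coder_agreement(coder_a, coder_b):
    """Percentage agreement between two coders' labels, plus the indices of
    the statements a third rater must adjudicate."""
    pairs = list(zip(coder_a, coder_b))
    hits = sum(a == b for a, b in pairs)
    disputes = [i for i, (a, b) in enumerate(pairs) if a != b]
    return 100 * hits / len(pairs), disputes
```

In this scheme, the study's reported 89.26% agreement corresponds to the first return value, and the `disputes` list identifies the cases passed to the third rater.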
We adhered to ethical guidelines and obtained ethical approval for all studies reported in this paper (studies 1 and 2: 2024/HE001753). All participants provided informed consent.
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.