The Unseen Hand of the Machine: How Chatbot Writing Shapes Psychological Text Analysis
Kadir Uludag’s article, *Exploring the Association Between Textual Parameters and Psychological and Cognitive Factors*, offers a timely starting point for a deeper conversation. The study sets out to provide a practical toolkit: a Python-based framework for quantifying linguistic features—sentence length, polarity, pronoun ratios, tense distribution, and more—that might correlate with psychological states. To test the code, the author asks ChatGPT 3.5 Turbo to “create a random text comprising 1000 words divided into 5 paragraphs,” then runs the analysis on the machine-generated output. The stated goal is a methodological foundation for future behavioral research. But the real revelation lies less in the code and more in what the generated text inadvertently teaches us about the nature of chatbot writing and its growing entanglement with psychological science.
The paper’s findings are quietly striking. The ChatGPT-generated text exhibited a consistent positive sentiment, a heavy bias toward the past tense (34 instances versus only 4 in the present tense and none in the future), zero location words, a moderate average sentence length of 16.41 words, and no duplicate sentences. The author speculates, with appropriate caution, that the positive emotional valence may reflect a deliberate design choice by OpenAI to keep interactions pleasant and safe. I would argue it reflects something larger: the statistical backbone of a model trained on vast corpora in which narrative and descriptive prose (often past-tense, positively valenced, and geographically untethered) predominates.
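To make these parameters concrete, here is a minimal Python sketch of three of them: average sentence length, a first-person pronoun ratio, and a duplicate-sentence count. It is not the code from the paper’s Supplementary Materials. The function names, the naive regex sentence splitter, and the short pronoun list are all assumptions chosen for illustration, and tense or sentiment counts would need a proper NLP library rather than regular expressions.

```python
import re
from collections import Counter

# Hypothetical, deliberately small pronoun list; not taken from the paper.
FIRST_PERSON = {"i", "me", "my", "mine", "myself",
                "we", "us", "our", "ours", "ourselves"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(text):
    """Lowercased word tokens; an apostrophe is kept inside a word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def text_parameters(text):
    """Return a few surface-level parameters for one text."""
    sentences = split_sentences(text)
    words = tokenize(text)
    # A sentence seen n times contributes n - 1 duplicates.
    counts = Counter(s.lower() for s in sentences)
    duplicates = sum(n - 1 for n in counts.values() if n > 1)
    first_person = sum(w in FIRST_PERSON for w in words)
    return {
        "sentence_count": len(sentences),
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "first_person_ratio": first_person / len(words) if words else 0.0,
        "duplicate_sentences": duplicates,
    }
```

Calling `text_parameters(generated_text)` on any string returns a dictionary of these values. Because the splitter is so simple, abbreviations such as "e.g." will be miscounted as sentence boundaries, which is one reason to treat figures from a sketch like this as rough.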
This seemingly mundane observation has profound implications for the impact of chatbot writing on research. If a researcher were to use such generated text as a neutral “control” or a proxy for human baseline data, they would be building their inferences on a foundation that is anything but neutral. The “random” text is not random; it is a smoothed, bias-laden reconstruction of the language patterns most probabilistically common in the model’s training data. It lacks the noise, the idiosyncrasy, the emotional volatility, and the groundedness of real human language. It does not reference a physical location because it has no body and no situated experience. It avoids the future tense because it has no plans. It leans positive because it was fine-tuned to be helpful, harmless, and honest. These are not flaws in the model; they are features—features that, if unrecognized, can silently corrupt psychological measurement.
Consider the consequences for the very textual parameters the paper catalogs. A dataset composed entirely of chatbot responses would show artificially low variance in sentiment, an unnatural distribution of tenses, and a suspiciously uniform pronoun ratio. Studies that rely on such data to establish “norms” for human linguistic behavior would be calibrating their instruments on a synthetic phantom. When the paper notes that “the observed positive sentiment in the study could be attributed to the coding of ChatGPT,” it understates the epistemic danger: it is not merely that this particular chatbot leans positive, but that every large language model carries a hidden fingerprint shaped by its training objectives, which remains invisible to a researcher who treats AI text as interchangeable with human expression.
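One way to check for the "artificially low variance" described above is to compare the spread of a per-text score (for example, sentiment polarity) in a candidate corpus against a human reference corpus. The sketch below assumes such scores have already been computed. The function names and the 0.5 threshold are hypothetical choices for illustration, not values drawn from the paper or from any established norm.

```python
from statistics import variance


def variance_ratio(candidate, reference):
    """Sample variance of candidate scores divided by that of reference scores."""
    if len(candidate) < 2 or len(reference) < 2:
        raise ValueError("need at least two scores per corpus")
    reference_variance = variance(reference)
    if reference_variance == 0:
        raise ValueError("reference scores have zero variance")
    return variance(candidate) / reference_variance


def looks_homogeneous(candidate, reference, threshold=0.5):
    """Flag a corpus whose scores vary much less than the reference.

    The threshold is an arbitrary illustrative cut-off (an assumption).
    """
    return variance_ratio(candidate, reference) < threshold
```

A ratio far below 1 would suggest the candidate texts are more uniform than the human baseline on that parameter. It would not by itself show that the texts are machine-generated, since a narrowly sampled human corpus could produce the same pattern.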
The impact extends beyond data contamination. As chatbots are increasingly embedded in mental health apps, journaling platforms, and therapeutic chat services, their linguistic style leaks into the human communication that researchers then analyze. Users adopt the phrasing, the emotional tone, and even the grammatical structures of the models they interact with. If a therapeutic chatbot consistently frames reflections in the past tense, never marks future goals, and avoids naming concrete places, it may subtly discourage the very cognitive processes—future-oriented thinking, spatial grounding, autobiographical specificity—that are therapeutic. Uludag’s simple tense count becomes a canary in the coal mine: an AI therapist that never speaks of tomorrow might inadvertently train its users to stop doing so as well.
Moreover, the paper’s own methodology embodies a blurring of boundaries that is increasingly common. It uses a chatbot to generate the raw material, then applies a human-designed interpretive lens. The author is simultaneously researcher and prompter, but the “participant” is a black-box neural network. This configuration creates a closed loop in which the analytical framework and the object of analysis share a common ancestry in machine learning paradigms. The risk is a kind of academic tautology: using AI to generate data, then using AI-inspired metrics to analyze that data, while believing we are studying human psychology.
That is not to dismiss the paper’s utility. On the contrary, the Python code provided in the Supplementary Materials is a valuable resource for any researcher who wishes to subject AI-generated or human text to systematic scrutiny. The real lesson, however, is that such tools must be used with acute critical awareness. Every textual parameter measured reflects an interaction between the prompt, the model’s training distribution, and whatever post-processing the platform applies. The absence of location words in Uludag’s generated text, for instance, is plausibly a product of the model rather than of chance: a system with no situated experience, answering a prompt that names no setting, has little statistical reason to mention a particular place. A human asked to write randomly would very likely introduce a place name within a thousand words. The difference is not noise; it is a signal about the ontological gap between human and machine language.
Looking forward, the integration of large language models into psychological research demands a new layer of methodological transparency. When a chatbot generates text for a study, researchers should report not only the prompt and the model version, but also the known biases of that model: its sentiment tendencies, its temporal orientation, its typical use of self-reference. Journals in psychology and behavior research might consider requiring an “AI influence disclosure” akin to the conflict-of-interest statements already standard. And studies that extend Uludag’s approach by comparing the output of multiple chatbots on the same prompt could become essential calibration tools, mapping the “personalities” of different models before they are deployed in sensitive human contexts.
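As a rough illustration of what such a disclosure could look like in machine-readable form, the sketch below defines a small record and serializes it to JSON. The class name and every field name are hypothetical; no journal currently prescribes this format. The example values are the ones reported earlier in this essay for Uludag’s ChatGPT-generated text.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AIInfluenceDisclosure:
    """Hypothetical record of how AI-generated text entered a study."""
    model_name: str
    model_version: str
    prompt: str
    measured_parameters: dict = field(default_factory=dict)
    notes: str = ""

    def to_json(self):
        """Serialize the record so it can accompany a manuscript or dataset."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


# Values as reported in this essay; field names are illustrative assumptions.
example = AIInfluenceDisclosure(
    model_name="ChatGPT",
    model_version="3.5 Turbo",
    prompt="create a random text comprising 1000 words divided into 5 paragraphs",
    measured_parameters={
        "past_tense_count": 34,
        "present_tense_count": 4,
        "future_tense_count": 0,
        "location_word_count": 0,
        "avg_sentence_length": 16.41,
    },
    notes="Sentiment consistently positive; no duplicate sentences.",
)
```

Calling `example.to_json()` produces a JSON document that could be archived alongside the generated text, so that later readers can see which model and prompt shaped the data.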
In conclusion, the quiet lesson of this paper is that the impact of chatbot writing is no longer a distant speculation. It is already woven into our research instruments, our therapeutic interfaces, and our assumptions about what language can reveal about the mind. The challenge is not to retreat from these tools—they are too powerful and too pervasive—but to develop the literacy required to see the machine’s signature for what it is. Uludag has given us a lens; it is now our responsibility to look through it with open eyes.
Reference:
Uludag K. Exploring the Association Between Textual Parameters and Psychological and Cognitive Factors. Psychol Res Behav Manag. 2024;17:1139-1150.
Link to study: https://doi.org/10.2147/PRBM.S460503
