Science "Trust the Science" - A Growing Problem - Usage of certain words in scientific papers has skyrocketed since ChatGPT

Word of the day: apophenia. Why? Because apophenia means a tendency to perceive connections or meaningful patterns between unrelated, random things. While I do not personally believe that this article is an example of apophenia, I think every reader should be aware of what it means before reading this and carefully evaluate their own opinion on the subject.

A while ago I talked about the degradation of Internet quality as a direct result of generative AI, but today I have a much different approach, one that more adequately showcases what's happening right now in academia. I'm referring to medical science, journal publications, and the world of scholastic literature in general. It matters because we as human beings on planet Earth are consistently seeking new knowledge. Sure, modern generations seem to be looking for that knowledge on platforms like TikTok, which is brain dead, but even those of us who consider ourselves to be relatively well informed - and there is certainly a degree of hubris in that statement alone - are consistently using appeals to authority.

For anyone who doesn't recognize that term, appeal to authority describes a very common logical fallacy in argumentation where, in order to support a given statement, someone will connect that statement to some sort of authoritative figure. The idea is that if an authority figure has made some sort of claim, the claim itself is more likely to be true because of who made it. While it's actually a bit more complicated than that, generally speaking, the truth here is that we often cite authoritative sources in an effort to understand the world or to prove ourselves correct.
Those authoritative sources come from a very wide background. Some consider journalists to be authoritative, some consider talk show hosts to be authoritative - scientists, politicians, law enforcement, etc. You get the point. However, the consistently cited highest authority is the world of science. That world is obviously broken down further into categories such as biological, physical, chemical, environmental, social, etc. But the world of scientific research and subsequent publication is very often considered the highest authority we have.

So where's the problem? Well, to properly explain that, I need to talk a little bit about generative AI and showcase a few patterns, because this new online world, so to speak, facilitated by ChatGPT and similar programs, is rapidly infecting every corner of our digital experience, even so far as to now be extensively involved in the world of academic research and publication behind the scenes. Let me explain.

First, ChatGPT became publicly available on November 30th, 2022. This makes 2023 the first complete calendar year that ChatGPT was widely available, which is very important for us to remember. Second, over the course of its lifespan so far, many people have begun noticing certain patterns. ChatGPT may produce lifelike imitations of human writing, but no matter what you ask it to do, it consistently favors certain words and phrases. Explaining exactly how and why the program selectively chooses those words and phrases is extremely complicated, but a variety of sources, ranging from research teams to individuals who generate tons of outputs, have broken down their lists of the most common outputs from ChatGPT, which is where I decided to start looking.

According to Twixify, these right here are 124 of the most common words and phrases that ChatGPT chooses, starting off with "meticulous" and "meticulously". This is a list of 10 high-frequency outputs from an individual researcher, starting with "delve" and "dive". And this is an article from Ideapod showcasing top phrases to avoid for yourself as an individual if you don't want to sound like an AI, which has things like "let's delve in" or "let's uncover".

There's absolutely no shortage here of people noticing patterns, from publications to Reddit threads to full-blown peer-reviewed research. What's more, these common words are very often shared between lists. This is the part where I urge everyone reading this to do their own individual testing. After browsing these papers for an extensive period of time and observing which words or phrases appear most frequently across them, I put together a truncated list of words that rank highly on most of these different lists. Words highlighted in green are some of the most common outputs by consensus, and words in yellow are also extremely common, appearing on multiple lists even if they aren't the number one or number two result and sometimes sit further down.

Obviously I could increase the size of this list pretty easily and add a lot more words and phrases to analyze, but for the sake of demonstration, this is probably more than sufficient for today. Now, having this list composed, we need to actually look at what's been happening. And that's where things get really strange.

I'll start with PubMed before moving over to a platform called OpenAlex, which makes it a lot easier to showcase what I'm talking about directly. But here's the gist of it. Let's start with the very first word on my list: meticulous. Searching PubMed, which draws on the National Library of Medicine's database of over 36 million citations for biomedical literature, we see that the word "meticulous" appears with slowly increasing frequency until 2023, where it suddenly doubles. Coincidentally, that is the first full year that ChatGPT was commercially available. And we also see that in 2024, it's on track to more than quadruple.
In 2021, there were roughly 1,200 papers using the term. In 2022, there were 1,100. But in 2023, that number suddenly jumps to over 2,000, on track now in the first quarter of 2024 to see nearly 5,000 instances by the end of the year. Keep in mind, "meticulous" is one of the most common outputs from ChatGPT, expressed by multiple studies, lists, and publications analyzing the subject, which means that the frequency of a ChatGPT favorite word is doubling, then quadrupling inside medical research literature during the two years directly after the program comes out.
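Anyone can reproduce these per-year counts themselves rather than taking my screenshots on faith. Here is a minimal sketch using NCBI's public E-utilities `esearch` endpoint, which PubMed's own search runs on; the word and years are just the examples from above, and live counts will drift as the database is updated:

```python
# Sketch: count PubMed records per year matching a search term,
# via NCBI's E-utilities esearch endpoint with JSON output.
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term: str, year: int) -> str:
    """Build an esearch URL restricted to one publication year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",   # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "retmode": "json",
        "retmax": "0",        # we only want the count, not the ID list
    }
    return EUTILS + "?" + urllib.parse.urlencode(params)

def yearly_count(term: str, year: int) -> int:
    """Fetch the number of matching records for a single year."""
    with urllib.request.urlopen(esearch_url(term, year)) as resp:
        data = json.load(resp)
    return int(data["esearchresult"]["count"])

if __name__ == "__main__":
    for year in (2021, 2022, 2023):
        print(year, yearly_count("meticulous", year))
```

Note that this counts papers whose indexed text matches the term at all, the same caveat that applies to the PubMed search box, so it measures frequency of use, not anything about authorship by itself.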

Hopefully everyone can see what's happening here, because this requires the question to be asked: Is ChatGPT writing these papers? Is it translating these papers? Is it reformatting them? Is ChatGPT being used at all here, and if so, in what capacity? Obviously, that's just one singular example, so let's continue. "Delve". This is another commonly accepted "spam word" from ChatGPT, and also one that other users have searched before on PubMed, which is effectively what sent me down this road to begin with, because the results are staggering. So let's demonstrate.

"Delve", from 2020 up until 2022, was increasingly used, seemingly on pace with the general rise in publication frequency overall. But in 2023, the number of uses for this word just completely explodes, more than quadrupling in that one year from 627 to almost 3,000, and is currently on track to be used in more than 10,000 different published works during 2024. That rise is unbelievable, and once again requires the question: Is ChatGPT being used to write scientific research?

Let's keep going. "Seamlessly". Not nearly as shocking compared to the previous two, but a dramatic rise in 2023 usage, beyond any single-year jump in its history. Next up, "realm". Once again, an enormous spike in usage during 2023 alone, far surpassing any previous year-over-year increase by incredible margins. "Unwavering". A much less commonly used word in general, according to this search, but again, a dramatic spike in 2023, larger than any previous single-year increase on record.

Next up, "additionally". This one is a bit more interesting, because it was basically growing exponentially up until 2021 already. But in 2023, a larger than expected increase based on historical patterns. I could keep going, which I will in the background, I guess, but this pattern is frequently occurring if you analyze words and phrases that are commonly associated with ChatGPT.
It's not universally true, mind you. There are plenty of words that don't showcase a surprising or unexpected increase in usage like this. But the number of times where it is true, and the scale to which we see it demonstrated, well, let's just say it makes me absolutely certain that ChatGPT is being heavily used in medical research papers.
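What counts as a "surprising" increase can be made a little more concrete. One rough heuristic (my own formalization, not anything from the original video) is to project the next year's count from the recent year-over-year trend and flag a word whose observed count blows well past that projection; the numbers below reuse the approximate "meticulous" counts quoted earlier:

```python
# Sketch: flag a year whose count far exceeds a simple linear
# extrapolation of the preceding years. A rough heuristic only.
def expected_next(counts: list[int]) -> float:
    """Project the next value by extending the average year-over-year change."""
    deltas = [b - a for a, b in zip(counts, counts[1:])]
    avg_delta = sum(deltas) / len(deltas)
    return counts[-1] + avg_delta

def is_spike(counts: list[int], observed: int, factor: float = 1.5) -> bool:
    """True if the observed count exceeds the trend projection by `factor`."""
    return observed > factor * max(expected_next(counts), 1.0)

# Approximate per-year PubMed counts for "meticulous" cited above:
history = [1200, 1100]           # 2021, 2022
print(is_spike(history, 2000))   # 2023 count: prints True (flagged)
print(is_spike(history, 1200))   # a trend-consistent count: prints False
```

A real analysis would want more history and a baseline of neutral control words (the article itself notes that many words show no such jump), but even this crude threshold captures what the charts show: flat or declining usage, then a step change in 2023.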


Now, here's the thing. That's one way to showcase what I'm talking about, and it works, it definitely does. But there's another way. This is OpenAlex. It's named after the ancient Library of Alexandria, funnily enough, and contains over 250 million articles, dissertations, books, data sets, you name it. The interesting thing that OpenAlex lets us do is combine keyword searches one after the other, which paints an entirely new picture.

Let's begin by going one at a time. "Crucial", "harness", "unlock" - three of the words from our list of most common ChatGPT outputs. We start with "crucial", and there's not really any noticeable spike in usage during 2023. This is a much wider set of academic literature compared to just PubMed, so maybe that's to be expected. I don't know. But let's keep going. If we add "harness" now, we can see - well, wait a minute, there actually is an increase, a pretty dramatic spike, if we're being honest, in the usage of these two words simultaneously during 2023. And if we then add "unlock", another hyper-common ChatGPT output, the results become very obvious. The usage of these three words together within a single paper or article more than doubles from 2022 onward, far outside normal historical patterns.

Let's try it again. "Seamlessly", "meticulous", "realm". Starting with "seamlessly" by itself, we have a pretty normal-looking distribution compared to other keywords that are not commonly associated with ChatGPT. Good sign, I guess. Add on "meticulous", and - wait a second - suddenly there's a huge spike. Now, I should be clear here, "meticulous" by itself does not have the same enormous discrepancy. So what we can see now, very clearly demonstrated, is that from 2022 to 2023, the usage of these two ChatGPT outputs in tandem has tripled. Add on one more, "realm" in this case, and what do we find? Yep, as expected, once again, the usage of these words has spiked by a completely disproportionate amount during 2023 alone, far beyond any normal pattern for each of them separately, and far beyond any historical increase before that.
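This kind of combined-keyword, per-year query is also easy to reproduce against OpenAlex's public API, which supports a `search` parameter and a `group_by=publication_year` aggregation. A minimal sketch follows; note that exactly how OpenAlex combines multiple search terms (and which fields it searches) is up to its search semantics, so counts will not match my screenshots exactly:

```python
# Sketch: per-year counts of OpenAlex works matching a combined
# keyword search, using the works endpoint's group_by parameter.
import json
import urllib.parse
import urllib.request

API = "https://api.openalex.org/works"

def openalex_url(*terms: str) -> str:
    """Build a works query grouped by publication year."""
    params = {
        "search": " ".join(terms),
        "group_by": "publication_year",
    }
    return API + "?" + urllib.parse.urlencode(params)

def counts_by_year(*terms: str) -> dict[int, int]:
    """Map publication year -> number of matching works."""
    with urllib.request.urlopen(openalex_url(*terms)) as resp:
        data = json.load(resp)
    return {int(g["key"]): g["count"] for g in data["group_by"]}

if __name__ == "__main__":
    years = counts_by_year("seamlessly", "meticulous", "realm")
    for year in sorted(years):
        print(year, years[year])
```

Swapping in "crucial", "harness", "unlock" or any other combination from the list is just a matter of changing the arguments, which is exactly the one-word-at-a-time stacking demonstrated above.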

I could keep going for quite a long time, but hopefully, it's becoming clear now: ChatGPT is being used in some capacity, widely, across academic research. It's being used in a capacity where, at the very least, it is writing or rewriting this academic research. And while some of that usage may be completely innocuous, the scale at which we see it demonstrated here, in my opinion, is extremely alarming.
Keep in mind, alongside all this, scientific papers have already been experiencing record growth in the number of retractions required during 2023, before we've even seen the effects that ChatGPT will have on the entire field. People are faking research, faking citations, and getting caught more than ever before, disproportionately higher than the natural growth rate of research itself.
But adding in a phenomenon where anyone with a computer can falsify these things and create fake articles, well, the AI-ification of the internet is happening faster and more invasively than almost anyone could have predicted.

Bottom line, simple keyword research indicates to me that science has a really, really serious problem growing inside it right now. Maybe a lot of this is just AI translation, or assisted research with an AI writer to try and eliminate mundane or boring tasks. But that needs to be disclosed. There needs to be very, extremely, unbelievably strict regulation on these things, at the very least mandating a detailed disclosure of what you used AI for, when you used it, and what precisely it did to that research. Or the credibility of all of your work goes down to absolute zero, and you cannot be taken seriously, and no one should listen to you.

The problem I see is that scientific research and academic publications are very often subjected to the same sort of corrupt profit motive as any other industry. The same way journalists will use AI, and are using AI now more and more, for cheap, easy, instant articles, scientists will use it for easy, hassle-free write-ups and citations. Hell, even lawyers are now using this to just fake citations in court cases.
If you can't see how badly this is all going downhill right now, I don't know what to tell you. In the end, this is just a demonstration of how far, and how fast I might add, the problems are spreading. But it remains to be seen what anyone will actually do to stop, regulate, clarify, or control it.

Originally from a YouTube video by UpperEchelon and reformatted into an article by Claude LLM.
 
"Smoke cigarettes! It's good for your health." (Lung Cancer)
Nicotine has been shown to stave off Crohn's and other chronic inflammatory diseases.
We've also seen obesity balloon since nicotine use has subsided.
Everything has a tradeoff.
 
I just want to add: During a recent science conference I attended (biology, chemistry, etc., not psych or social studies), the professor/speaker actually directly encouraged us to use ChatGPT and other AI to help "speed up writing papers" and "make it easier to do and figure out what to write about". The worst thing was the general positive reaction to this.
Science is going to get harder and harder to understand and use because of this slop and the further increase of retards that infiltrate the field.
 
Nicotine has been shown to stave off Crohn's and other chronic inflammatory diseases.
Nicotine gum is a powerful antidepressant, so I use it to break out of particularly dark moods.
The short-term memory boosts help too.
 
Nicotine has been shown to stave off Crohn's and other chronic inflammatory diseases.
We've also seen obesity balloon since nicotine use has subsided.
Everything has a tradeoff.
When (or if, but hopefully when) I finally quit smoking, I will absolutely not be quitting nicotine. Gum or lozenges will likely be my route, since they seem to have the best ratio of good vs bad effects.
 
I don't like how this is just another "AI fearmongering article" that acts like shit saying "trust the science" only started being a cancerous marketing term after ChatGPT started writing science articles. Even actual legit problems get roped down the hole of "UHHH NO SEE IT'S DUE TO THE RISE OF AI GUYS WE NEED TO REGULATE AND BAN AI FOR COMMON USAGE". Fucking hell you were ALMOST THERE but ended up being as absolutely shitbrained as the people with actual degrees that are treating a glorified text generator as some sort of scientific oracle.
 