
We ran a test: How well can AI create texts?

Of course, we did it last year too: we tested how good ChatGPT was at writing articles. Our verdict at the time was mixed. Yes, a homework assignment on a topic that is neither too current nor too difficult is definitely possible, so the concerns of some teachers are justified. For a review article on a scientific topic, however, it was not good enough in our opinion. Good for us: our job will stay with us for a while yet.

But what left us feeling more than a little uneasy at the time was the way ChatGPT handled references. For the most part, it simply made them up. No joke: these "references" were pieced together from fragments of real author names, titles, journals, and so on. At first glance, it wasn't even noticeable; it only became apparent once we checked, which is part of the job of a science journalist.

The good news is that artificial intelligence is indeed evolving. The same query sent to ChatGPT today is answered with completely correct references. Nevertheless, such missteps remind us that we tend to give AI the benefit of the doubt, which may not be justified.

Now one of the online spell-checking platforms, rechtschreibpruefung24.de, has also added AI. It is not primarily intended to improve the grammar check, but rather to provide rewording suggestions. Here is an example of what that looks like:

Original:

What left me feeling more than a little uncomfortable at the time, however, was the way ChatGPT handled references. For the most part, it simply made them up. No joke: these "references" were put together from set pieces of real existing author names, titles, journals, etc. At first glance, it wasn't even noticeable, but only when the checking started, which is part of the job for a science journalist.

Simpler:

At the time, I was particularly bothered by the way ChatGPT handled references. For the most part, they were simply made up by combining real author names, titles and journal names. I didn't notice this at first, only on closer inspection.

Formal:

What left an extremely unpleasant feeling, however, was ChatGPT's approach to references. A large part of it was actually just constructed. These "references" consisted of fragments of real author names, titles and journals. This went unnoticed at first and only became apparent on closer inspection, something that is essential for a science journalist.

Alternative:

A very unsettling feeling arose at the time from the way ChatGPT handled references. In fact, most of them were simply made up by throwing together real author names, titles and journal names. This was not noticeable at first; only on closer inspection, which is important for a science journalist, did it become clear.


Our opinion on the result

Not bad in parts, but the meaning is slightly distorted in some places and some content is missing. In further tests with other texts, the actual meaning sometimes changed considerably in both the "Simpler" and "Alternative" versions.

Conclusion: AI currently seems to be a nice tool for complex and demanding texts. But please take a close look and keep your trust in the black box within narrow limits!

P.S. For these trials, we only tested the free versions.