
We ran a test: How well can AI create texts?

Of course, we did it last year too: we tested how good ChatGPT was at writing articles. Our verdict at the time was mixed. Yes, a homework assignment on a topic that is neither too current nor too difficult is definitely possible, so the concerns of some teachers are justified. For a review article on a scientific topic, however, it was not good enough in our opinion. Good for us: our job will stay with us for a while yet.

But what left us feeling more than a little uneasy at the time was the way ChatGPT handled references. For the most part, it simply made them up. No joke: these "references" were pieced together from fragments of real author names, titles, journals, and so on. At first glance, it wasn't even noticeable; it only became apparent once we checked, which is part of the job of a science journalist.

The good news is that artificial intelligence is indeed evolving. The same query sent to ChatGPT today is answered with completely correct references. Nevertheless, such missteps remind us that we tend to give AI the benefit of the doubt, which may not be justified.

Now one of the online spell-checking platforms, rechtschreibpruefung24.de, has also added AI. It is not primarily intended to improve the grammar check, but rather to provide rewording suggestions. Here is an example of what that looks like:

Original:

What left me feeling more than a little uncomfortable at the time, however, was the way ChatGPT handled references. For the most part, it simply made them up. No joke: these "references" were put together from set pieces of real existing author names, titles, journals, etc. At first glance, it wasn't even noticeable, but only when the checking started, which is part of the job for a science journalist.

Simpler:

At the time, I was particularly bothered by the way ChatGPT handled references. For the most part, they were simply made up by combining real author names, titles and journal names. I didn't notice this at first, only on closer inspection.

Formal:

What left an extremely unpleasant feeling, however, was ChatGPT's approach to references. A large part of it was actually just constructed. These "references" consisted of fragments of real author names, titles and journals. This went unnoticed at first and only became apparent on closer inspection, something that is essential for a science journalist.

Alternative:

A very unsettling feeling arose at the time from the way ChatGPT handled references. In fact, most of them were simply made up by throwing together real author names, titles and journal names. This was not noticeable at first; only on closer inspection, which is important for a science journalist, did it become clear.


Our opinion on the result

Not bad in parts, but the meaning is slightly distorted in some places and some content is missing. In further tests with other texts, the actual meaning sometimes changed considerably in both the "Simpler" and "Alternative" versions.

Conclusion: AI currently seems to be a nice tool for complex and demanding texts. But please take a close look and keep your trust in the black box within narrow limits!

P.S. For these trials, we only tested the free versions.