I Asked ChatGPT to Review a Book and the Results Were Horrible

It Is a Mistake to Use ChatGPT to Evaluate Texts

Setting Up the Experiment

I conducted an experiment with ChatGPT Pro today. I took an incomplete manuscript for a book (approximately 100 pages, double-spaced) and uploaded the *.docx file to ChatGPT. Then I gave it simple instructions: give me chapter-by-chapter feedback and analysis.

The Results

ChatGPT began strong with succinct summaries of the first chapters, along with some basic analysis. After Chapter 10, however, it transitioned to more overarching themes and listed some basic areas for improvement. So far, so good.

I assumed it stopped after 10 chapters due to some constraint on response length. That’s understandable. The service isn’t free to run, and each of those words comes at a cost. To skirt the issue, I gave it a follow-up query: do the same for Chapters 11-20.

That’s where things got weird.

Giving Feedback on Text that Does Not Exist

There’s a chapter in the book titled “A Visitor from Another Dimension”. It is entirely blank. Here’s the feedback:

Love, loss, and addiction do appear in other chapters that I had fed to ChatGPT individually, but that was months ago.

Another chapter is titled “The Buchanans”. It is also entirely blank at the moment.

Ignoring the Text and Giving Feedback on Whatever ChatGPT Wanted Instead

Chapter 15, “Hell”, is supposed to be a lighthearted chapter about being high in The Container Store. It is an unserious chapter meant to break up the heavier events. Here is the beginning of the chapter, for reference. The tone does not change, nor does anything bad happen at the store.

The Implications

ChatGPT Does Not Read Everything All the Way Through

It seems ChatGPT stopped reading after about ten pages, once it got the “feel” of the manuscript. That might not seem so bad when evaluating a manuscript, but imagine you wanted it to analyze a data file. Would you want the machine to stop reading after a certain number of rows? Or what if you needed ChatGPT to summarize some legal documents for you? Do you think getting a vague “feel” for a legal text would be sufficient?

ChatGPT’s Feedback Is Generic at Best, Completely Made Up at Worst

The feedback the machine provides could be applied to nearly any piece of writing. “Improve the pacing” is nebulous; you could interpret it to fit whatever you already thought the piece needed. “Make the characters more distinct” also sounds plausible, but it too could apply to any piece of writing with characters.

Spare yourself the $20/month. Don’t use ChatGPT for writing feedback or critique.

Luke, Data Engineer

A man on a crusade against apathy. I created this website on April Fool’s Day 2024, after noticing the sharp uptick in garbage writing on the internet. My day job is as a data something. I also do consulting work for small businesses trying to modernize their data situations, and I make a buck off them. I write a lot about technology, economics, my own antics, and opinions. If my writing has entertained, informed, aggravated, or made you reconsider anything, then I consider this blog a fantastic success.
