Proofreading an entire book with GPT-4 for…

May 2, 2023

How I automated a part of my job

8 Comments

May 3, 2023

Nice post! You mention an edit.py script, which is not the script in the gist (that one is called gpt-proofread.py). Would you mind sharing that too?

Christoph Molnar

May 3, 2023

It's the same script. I just renamed it for the gist so that the name is a bit more telling about what the script does.

Valentino Zocca 🇮🇹 🇪🇺

May 2, 2023

This is great. However, your book’s content is now in the OpenAI database, corpus, training set, or whatever you want to call it, even before being published. Aren’t you concerned about that? You have no copyright protection yet.

Christoph Molnar

May 2, 2023

With the API, OpenAI doesn't use your data to train its models: https://openai.com/policies/api-data-usage-policies

Training on user data only applies to ChatGPT, and you can opt out there as well.

Valentino Zocca 🇮🇹 🇪🇺

May 2, 2023

I thought you could only opt out for GPT-4. Do they charge for using the GPT-4 API?

mstrap

Nov 4, 2023 (edited)

While searching for like-minded developers, I came across your post. My proofreading workflow mainly consists of emails, markdown (GitHub pages), and some company-internal tools. GPT-3.5 gets it right almost every time, but having a diff and selectively applying changes is crucial for me. I have approached this problem with a small open-source desktop app that requires an OpenAI API token. It is still in the experimental stage in terms of functionality and design, but it already operates quite reliably at its core:

https://www.aibtra.dev
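
To illustrate the "diff and selectively apply" idea mentioned above, here is a minimal sketch using Python's standard `difflib`. It is not how the linked app works internally (that is not described here); the function names `list_hunks` and `apply_selected` are hypothetical, and the sketch assumes that line-level hunks are fine-grained enough for the text being proofread.

```python
import difflib


def _opcodes(original_lines, edited_lines):
    # Compare the two texts line by line; autojunk is disabled so that
    # frequently repeated lines are not silently ignored.
    matcher = difflib.SequenceMatcher(a=original_lines, b=edited_lines, autojunk=False)
    return matcher.get_opcodes()


def list_hunks(original: str, edited: str):
    """Return (old_lines, new_lines) for every changed region, in order."""
    a, b = original.splitlines(), edited.splitlines()
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in _opcodes(a, b)
        if tag != "equal"
    ]


def apply_selected(original: str, edited: str, accept: set) -> str:
    """Rebuild the text, using the edited lines only for accepted hunks.

    `accept` holds 0-based hunk indices, numbered as in list_hunks().
    """
    a, b = original.splitlines(), edited.splitlines()
    out, hunk = [], 0
    for tag, i1, i2, j1, j2 in _opcodes(a, b):
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        # Changed region: keep the model's version only if it was accepted.
        out.extend(b[j1:j2] if hunk in accept else a[i1:i2])
        hunk += 1
    return "\n".join(out)


if __name__ == "__main__":
    original = "Teh cat sat.\nA fine day.\nIt was hapy."
    edited = "The cat sat.\nA fine day.\nIt was happy."
    for index, (old, new) in enumerate(list_hunks(original, edited)):
        print(index, old, "->", new)
    # Accept only the first suggested change.
    print(apply_selected(original, edited, {0}))
```

Passing an empty set to `apply_selected` returns the original text unchanged, and accepting every hunk reproduces the proofread text, so a reviewer can step through the suggestions one at a time.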

Christoph Molnar

Nov 5, 2023

That looks really interesting. Thanks for building and sharing.

Jürgen Richard Plasser

May 2, 2023

I don't have access to the gpt-4 model in the API, so I tried gpt-3.5-turbo, and it simply summarises the first lines of text and removes the rest. I only have a few lines for testing in one document, with the lines of text separated by blank lines. I use plain text, not markdown, but that shouldn't be an issue.

Maybe I'll have to wait for gpt-4 in this case.
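
A failure like the one described above (the model summarising the start and dropping the rest) can at least be detected automatically before the output overwrites the original. The following is a minimal sketch, not part of the script from the post: the names `split_paragraphs` and `looks_truncated` and the `min_ratio` threshold of 0.7 are assumptions chosen for illustration, and the check assumes paragraphs are separated by blank lines, as in the test document described here.

```python
def split_paragraphs(text: str) -> list[str]:
    """Split text into paragraphs separated by one or more blank lines."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append("\n".join(current))
            current = []
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


def looks_truncated(original: str, edited: str, min_ratio: float = 0.7) -> bool:
    """Heuristic: flag output that lost paragraphs or shrank a lot.

    Proofreading should change wording only slightly, so fewer paragraphs
    or far fewer words suggests the model summarised instead of editing.
    """
    if len(split_paragraphs(edited)) < len(split_paragraphs(original)):
        return True
    original_words = len(original.split())
    edited_words = len(edited.split())
    return original_words > 0 and edited_words < min_ratio * original_words


if __name__ == "__main__":
    original = "One too three.\n\nFour five six.\n\nSeven eight nine."
    proofread = "One two three.\n\nFour five six.\n\nSeven eight nine."
    summary = "One two three."
    print(looks_truncated(original, proofread))
    print(looks_truncated(original, summary))
```

When the check fires, the safest reaction is probably to keep the original text and retry or review by hand; the threshold would need tuning for real documents.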

Mindful Modeler