cnr.sh

AI at the Helm: Building an Entire Open Source Project With GPT-4

I've mostly been an LLM and GPT skeptic. Every so often I'd bang my head against ChatGPT, and it usually gave me junk. I'd wander off grumbling the things jaded engineers grumble.

Then I paid for OpenAI's GPT-4 upgrade. GPT-4 actually seemed to work, so I decided to see how far I could push it. Could I write an entire open source project with GPT-4? Turns out, I could.

I had GPT-4 build me Twister, a Java library that converts Avro and Protobuf data to and from Java POJOs. Nearly every element--code, docs, commit messages, tests, its README.md, even this blog post--was written by GPT-4.

This post isn't about Twister, though. It's about how I learned to use ChatGPT effectively to write not just tests and documentation, but code.

Key Lessons from Building with GPT-4

Here's what I learned:

Start Small

Break down tasks. Ask GPT-4 to build simple code changes at first. With Twister, I would have it build support for basic primitives (int, float, string, etc.) first. Once I got that working, I'd ask it to add support for complex types (maps, lists, enums, etc.).
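To make the primitives-first idea concrete, here is a minimal sketch of what a first, deliberately narrow request might produce. It is not Twister's code or API. The class `PrimitiveFirstConverter`, its `toMap` method, and the `User` and `Team` classes are all hypothetical names made up for illustration. The sketch handles only primitives, boxed values, and strings, and fails loudly on anything else, which marks the spot where the next request (maps, lists, enums) would plug in.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Twister's API.
final class PrimitiveFirstConverter {
    private PrimitiveFirstConverter() {}

    /** Copies simple fields of a POJO into a map; rejects complex types for now. */
    static Map<String, Object> toMap(Object pojo) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            // Skip compiler-generated and static fields.
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            if (!isSimple(field.getType())) {
                // Step two of the "start small" plan would replace this branch.
                throw new UnsupportedOperationException(
                    "Not supported yet: " + field.getType().getName());
            }
            field.setAccessible(true);
            try {
                out.put(field.getName(), field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return out;
    }

    /** True for primitives, their boxed forms, other Number types, and String. */
    static boolean isSimple(Class<?> type) {
        return type.isPrimitive()
            || type == String.class
            || type == Boolean.class
            || type == Character.class
            || Number.class.isAssignableFrom(type);
    }

    // Hypothetical POJOs used only for the demo below.
    static final class User {
        int id = 7;
        String name = "ada";
        boolean active = true;
    }

    static final class Team {
        List<String> members = Collections.emptyList();
    }

    public static void main(String[] args) {
        System.out.println(toMap(new User()));
        try {
            toMap(new Team());
        } catch (UnsupportedOperationException e) {
            System.out.println("Next request for GPT-4: " + e.getMessage());
        }
    }
}
```

Keeping the unsupported case as an explicit exception gives the follow-up prompt a precise target: paste the class, name the failing type, and ask for that one addition.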

Code, Test, Repeat

The best pattern I found was:

  1. Ask GPT-4 to write code.
  2. Ask GPT-4 to write a test for the code it'd written.
  3. Ask GPT-4 to fix the failed test(s) if they don't pass.

Flipping 1 and 2 (TDD) works, too.

If you spot a bug, ask it to write a test that exposes the bug. Then ask it to fix the code so the test passes. The template I used for this loop was:

Given this code:
...

And this test:
...

I get this error:
...

Can you fix this?
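If you run this loop often, the template above is easy to fill in mechanically. The following is a small sketch, assuming you already have the code, the test, and the error output as strings. `FixPrompt` and `build` are hypothetical names; nothing here calls OpenAI, it only assembles the text you would paste into ChatGPT.

```java
// Hypothetical helper: fills in the "code / test / error" prompt template.
final class FixPrompt {
    private FixPrompt() {}

    /** Builds the prompt text from the three pieces pasted on each iteration. */
    static String build(String code, String test, String error) {
        return String.join("\n",
            "Given this code:",
            code.trim(),
            "",
            "And this test:",
            test.trim(),
            "",
            "I get this error:",
            error.trim(),
            "",
            "Can you fix this?");
    }

    public static void main(String[] args) {
        // Made-up inputs, just to show the shape of the generated prompt.
        String prompt = build(
            "int add(int a, int b) { return a - b; }",
            "assertEquals(5, add(2, 3));",
            "expected: <5> but was: <-1>");
        System.out.println(prompt);
    }
}
```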

GPT-4 Forgets

GPT-4 has a bad long-term memory. I found it did much better when I kept re-pasting my code into the prompt on every iteration. There is a limit to the input size, so you sometimes have to get creative to fit the relevant snippets in.
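One way to "get creative" is to rank snippets by relevance and keep as many as fit. The sketch below assumes a plain character budget as a rough stand-in for the real limit, which is measured in tokens, not characters. `SnippetPacker`, `pack`, and the budget value are hypothetical; pick a budget that works for your own prompts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: chooses which snippets to re-paste into a prompt.
final class SnippetPacker {
    private SnippetPacker() {}

    /**
     * Walks snippets in priority order (most relevant first) and keeps each
     * one that still fits in the remaining character budget. A snippet that
     * is too large is skipped, so a smaller, lower-priority one may still fit.
     */
    static List<String> pack(List<String> snippetsByPriority, int maxChars) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String snippet : snippetsByPriority) {
            int cost = snippet.length() + 1; // +1 for the newline separator
            if (used + cost > maxChars) {
                continue;
            }
            kept.add(snippet);
            used += cost;
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up snippets; the character budget is an assumption, not a GPT-4 limit.
        List<String> snippets = Arrays.asList(
            "class Converter { /* the method being fixed */ }",
            "class HugeUtility { /* imagine several hundred lines here */ }",
            "class ConverterTest { /* the failing test */ }");
        for (String s : pack(snippets, 100)) {
            System.out.println(s);
        }
    }
}
```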

Show Changes Only

GPT-4 can be slow. To speed it up, ask it to show you only the code that has changed.

Iterate

Don't dismiss GPT-4 at the first error, even if it says silly things. Ask for corrections.

Great for Grunt Work

GPT-4 excels at mundane tasks. I'm convinced everyone should use it for tests, doc strings, and commit messages.

Best for 'Pure' Projects

Twister is a pretty pure computer science project; there isn't any real "business logic". I think GPT-4 does better at this kind of work.

Keep GPT-3.5 Handy

Use GPT-3.5 for simpler tasks. Saves you on your GPT-4 quota.

Why Not GitHub Copilot?

Copilot employs GPT-3.5. Compared to GPT-4, it is noticeably less effective. Copilot X offers improvements, yet it's tied to Visual Studio and VS Code. I find IntelliJ IDEA far more pleasant for Java than VS Code, so I just used ChatGPT. In the long run, IDE integration will certainly improve. For now, though, GPT-4 was the best option for Twister.

Future Work

The Twister library is a small project right now. I want to add:

I'll continue having GPT write the code in this library. The experiment continues!

Conclusion

Though this post is about building with GPT-4, I want to re-iterate that Twister is a real project that I actually want people to use. It's a pretty cool library. If you're a Java developer dealing with Protobuf or Avro, check it out! Contributions are welcome, too (whether from GPT-4 or humans).

Addendum

Here's the prompt I used to generate this blog post:

The blog post should focus on my experience using GPT-4 to write an entire library.

The post should also talk about GPT-4 tricks I learned while building this project:

The blog post style should be:

  1. Matter-of-fact.
  2. No sentence should be more than 12 words.
  3. Include links to external sites where appropriate.
  4. Written in first-person.
  5. The post should include a bullet-point list of the tips near the introduction.
  6. The post should be written in markdown.
  7. The post should include a catchy title that will get attention on Hacker News.
  8. The post should include a section for each bullet point in the intro.
  9. The intro should say that the code, docs, git commit messages, tests, README.md, and even this blog post were all written with GPT-4.

I pasted in a rough outline with some notes before this prompt.

