AI at the Helm: Building an Entire Open Source Project With GPT-4

Chris Riccomini on May 18, 2023

I've mostly been an LLM and GPT skeptic. Every so often I'd bang my head against ChatGPT, and it usually gave me junk. I'd wander off grumbling things jaded engineers grumble.

Then I paid for OpenAI's GPT-4 upgrade. GPT-4 actually seemed to work, so I decided to see how far I could push it. Could I write an entire open source project with GPT-4? Turns out, I could.

I had GPT-4 build me Twister, a Java library that converts Avro and Protobuf data to and from Java POJOs. Nearly every element (code, docs, commit messages, tests, its README.md, even this blog post) was written by GPT-4.

This post isn't about Twister, though. It's about how I learned to use ChatGPT effectively to write not just tests and documentation, but code.

Key Lessons from Building with GPT-4

Here's what I learned:

Start Small

Break down tasks. Ask GPT-4 to build simple code changes at first. With Twister, I would have it build support for basic primitives (int, float, string, etc.) first. Once I got that working, I'd ask it to add support for complex types (maps, lists, enums, etc.).

Code, Test, Repeat

The best pattern I found was:

  1. Ask GPT-4 to write code.
  2. Ask GPT-4 to write a test for the code it wrote.
  3. If any tests fail, ask GPT-4 to fix them.

Flipping steps 1 and 2 (TDD) works, too.

If you spot a bug, ask it to write a test that exposes the bug. Then ask it to fix the failing test. The template I used for this loop was:

Given this code:
...

And this test:
...

I get this error:
...

Can you fix this?
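
That template is easy to fill in mechanically. Here's a minimal Java sketch of the idea. The `FixPrompt` class and its `build` method are hypothetical names used only for illustration; they are not part of Twister or any OpenAI API. The code just assembles the text to paste into ChatGPT.

```java
// Hypothetical helper (not part of Twister): fills in the
// "Given this code / And this test / I get this error" template.
class FixPrompt {
    static String build(String code, String test, String error) {
        return String.join("\n",
            "Given this code:",
            code,
            "",
            "And this test:",
            test,
            "",
            "I get this error:",
            error,
            "",
            "Can you fix this?");
    }

    public static void main(String[] args) {
        // Made-up inputs, purely to show the shape of the prompt.
        System.out.println(build(
            "int add(int a, int b) { return a - b; }",
            "assertEquals(3, add(1, 2));",
            "expected:<3> but was:<-1>"));
    }
}
```

Keeping the three parts in a fixed order makes each iteration of the loop a copy-and-paste job.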

GPT-4 Forgets

GPT-4 has a bad long-term memory. I found it did much better when I kept re-pasting my code into the prompt on every iteration. There is a limit to the input size, so you sometimes have to get creative to fit the relevant snippets in.
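
One way to work within the limit is to rank snippets by relevance and keep only what fits. The sketch below is an illustration under stated assumptions: `PromptBudget`, `fit`, and the budget value are all hypothetical, and a character count is only a rough stand-in for GPT-4's real limit, which is measured in tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: picks which snippets to re-paste under a size budget.
class PromptBudget {
    // Walks the snippets in priority order (most relevant first) and keeps
    // each one that still fits; snippets that would overflow are skipped.
    static List<String> fit(List<String> snippets, int maxChars) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String snippet : snippets) {
            if (used + snippet.length() <= maxChars) {
                kept.add(snippet);
                used += snippet.length();
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed budget of 60 characters, chosen only for the demo.
        List<String> snippets = Arrays.asList(
            "class Converter { /* the class being changed */ }",
            "class HugeUnrelatedHelper { /* imagine hundreds of lines here */ }",
            "enum Kind { A }");
        System.out.println(fit(snippets, 60));
    }
}
```

Skipping an oversized snippet instead of stopping lets smaller, lower-priority snippets still make it into the prompt.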

Show Changes Only

GPT-4 can be slow. To speed it up, ask it to show only the code that's changed.

Iterate

Don't dismiss GPT-4 at the first error, even if it says silly things. Ask for corrections.

Great for Grunt Work

GPT-4 excels at mundane tasks. I'm convinced everyone should use it for tests, doc strings, and commit messages.

Best for 'Pure' Projects

Twister is a pretty pure computer science project; there isn't any real "business logic". I think GPT-4 does better at this kind of work.

Keep GPT-3.5 Handy

Use GPT-3.5 for simpler tasks. Saves you on your GPT-4 quota.

Why Not GitHub Copilot?

Copilot uses GPT-3.5. It is noticeably less effective than GPT-4. Copilot X offers improvements, but it's tied to Visual Studio and VS Code, and IntelliJ IDEA is way more pleasant for Java than VS Code. So I just used ChatGPT. In the long run, IDE integration will certainly improve. For now, GPT-4 was the best option for Twister.

Future Work

The Twister library is a small project right now. I want to add:

I'll continue having GPT write the code in this library. The experiment continues!

Conclusion

Though this post is about building with GPT-4, I want to reiterate that Twister is a real project that I actually want people to use. It's a pretty cool library. If you're a Java developer dealing with Protobuf or Avro, check it out! Contributions are welcome, too (whether from GPT-4 or humans).

Addendum

Here's the prompt I used to generate this blog post:

The blog post should focus on my experience using GPT-4 to write an entire library.

The post should also talk about GPT-4 tricks I learned while building this project:

  • Start small (don't ask GPT-4 to write all features in a class at once)
  • Ask GPT for a basic class, then ask it to write a test for the class. If the tests fail, tell GPT, and have it fix the tests. Then go back to the basic class and ask GPT-4 to add the next feature. Rinse and repeat.
  • Always re-paste the code you want GPT-4 to update. It has bad long-term memory. I frequently use a 2 part template: "Given this code: ... Can you update it to ..."
  • Tell it to just show changes, not the complete code. GPT-4 is slow, so telling it to skip unchanged code helps get you answers faster.
  • Don't be afraid to iterate. Many people get a response from GPT-4, and if it's wrong, they declare that it sucks. Instead, keep asking it to fix things.
  • GPT-4 is really good for grunt work (tests, docs, commit messages)
  • GPT-4 is also really good for more "pure" projects like Twister, where it doesn't have to understand a lot of business logic.
  • Keep a GPT-4 and a GPT-3.5 window open, so you can bounce to the GPT-3.5 window for more simple work. This will save you on your GPT-4 quota (currently 25 prompts per-3h window).

The blog post style should be:

  1. Matter-of-fact.
  2. No sentence should be more than 12 words.
  3. Include links to external sites where appropriate.
  4. Written in first-person.
  5. The post should include a bullet-point list of the tips near the introduction.
  6. The post should be written in markdown.
  7. The post should include a catchy title that will get attention on Hacker news.
  8. The post should include a section for each bullet point in the intro.
  9. The intro should say that the code, docs, git commit messages, tests, README.md, and even this blog post were all written with GPT-4.

I pasted in a rough outline with some notes before this prompt.
