AI at the Helm: Building an Entire Open Source Project With GPT-4
Chris Riccomini on May 18, 2023
I've mostly been an LLM and GPT skeptic. Every so often I'd bang my head against ChatGPT, and it usually gave me junk. I'd wander off grumbling things jaded engineers grumble.
Then I paid for OpenAI's GPT-4 upgrade. GPT-4 actually seemed to work, so I decided to see how far I could push it. Could I write an entire open source project with GPT-4? Turns out, I could.
I had GPT-4 build me Twister, a Java library that converts Avro and Protobuf data to and from Java POJOs. Nearly every element (code, docs, commit messages, tests, its README.md, even this blog post) was written by GPT-4.
This post isn't about Twister, though. It's about how I learned to use ChatGPT effectively to write not just tests and documentation, but code.
Key Lessons from Building with GPT-4
Here's what I learned:
- Start Small: Don't aim for complex features right away.
- Code, Test, Repeat: Follow the same patterns you use to write code.
- GPT-4 Forgets: Always re-paste the code you want to update.
- Show Changes Only: Helps you get answers faster. GPT-4 can be slow.
- Iterate: Don't write GPT-4 off at the first mistake. Keep asking it to correct itself.
- Great for Grunt Work: GPT-4 shines with tests, docs, and commit messages.
- Best for "Pure" Projects: Less business logic, like Twister, suits GPT-4.
- Keep GPT-3.5 Handy: Good for simpler tasks. Saves on GPT-4 quota.
Start Small
Break down tasks. Ask GPT-4 to build simple code changes at first. With Twister, I'd have it build support for basic primitives (int, float, string, etc.) first. Once I got that working, I'd ask it to add support for complex types (maps, lists, enums, etc.).
Code, Test, Repeat
The best pattern I found was:
1. Ask GPT-4 to write code.
2. Ask GPT-4 to write a test for the code it wrote.
3. Ask GPT-4 to fix the failed test(s) if they don't pass.
Flipping 1 and 2 (TDD) works, too.
If you spot bugs, ask it to write tests exposing the bug. Then ask it to fix the test. The template I used for this loop was:
Given this code:
...
And this test:
...
I get this error:
...
Can you fix this?
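The template is easy to fill in with a small script. Below is a minimal sketch in Python. The function name `build_fix_prompt` and the sample Java strings are hypothetical, made up for illustration. It only assembles text to paste into ChatGPT and calls no API.

```python
def build_fix_prompt(code: str, test: str, error: str) -> str:
    """Fill the three-part template with the current code, test, and error."""
    return (
        "Given this code:\n\n"
        f"{code.strip()}\n\n"
        "And this test:\n\n"
        f"{test.strip()}\n\n"
        "I get this error:\n\n"
        f"{error.strip()}\n\n"
        "Can you fix this?"
    )


if __name__ == "__main__":
    # Hypothetical inputs; paste your real code, test, and error output instead.
    print(build_fix_prompt(
        code="public class Adder { int add(int a, int b) { return a - b; } }",
        test="assertEquals(3, new Adder().add(1, 2));",
        error="expected: <3> but was: <-1>",
    ))
```

Pasting all three parts each time keeps the loop self-contained, so the prompt does not rely on earlier chat history.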
GPT-4 Forgets
GPT-4 has a bad long-term memory. I found it did much better when I kept re-pasting my code into the prompt on every iteration. There is a limit to the input size, so you sometimes have to get creative to fit the relevant snippets in.
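One way to fit snippets under the input limit is a simple greedy pick. The sketch below is an illustration, not what was used for Twister. It assumes you list snippets with the most relevant first. It also assumes a character budget as a rough stand-in, since the real limit is counted in tokens. The name `fit_snippets` is hypothetical.

```python
def fit_snippets(snippets: list[str], max_chars: int) -> list[str]:
    """Keep snippets, most relevant first, while they fit in the character budget."""
    kept = []
    used = 0
    for snippet in snippets:
        if used + len(snippet) > max_chars:
            continue  # too big for the remaining budget; a later, smaller one may fit
        kept.append(snippet)
        used += len(snippet)
    return kept


if __name__ == "__main__":
    # Hypothetical snippets, ordered by relevance to the change being requested.
    candidates = [
        "class under change",
        "a very long helper class that is only loosely related",
        "failing test",
    ]
    print(fit_snippets(candidates, max_chars=40))
```

Skipping an oversized snippet rather than stopping lets smaller, lower-priority snippets still make it into the prompt.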
Show Changes Only
GPT-4 can be slow. To speed it up, ask it to show you only the code that's changed.
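As a sketch, the request can be appended to the two-part "Given this code: … Can you update it to …" template quoted in the addendum. The function name `build_update_prompt` is hypothetical. The wording of the changes-only instruction is an assumption, not the exact phrasing used for Twister.

```python
def build_update_prompt(code: str, request: str, changes_only: bool = True) -> str:
    """Build an update request, optionally asking for changed code only."""
    prompt = (
        "Given this code:\n\n"
        f"{code.strip()}\n\n"
        f"Can you update it to {request.strip()}?"
    )
    if changes_only:
        # Assumed wording; adjust it if the model keeps reprinting the whole file.
        prompt += "\n\nShow only the code that changed, not the complete file."
    return prompt


if __name__ == "__main__":
    # Hypothetical code and request, for illustration only.
    print(build_update_prompt("public class Converter { }", "support enums"))
```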
Iterate
Don't dismiss GPT-4 at first error, even if it says silly things. Ask for corrections.
Great for Grunt Work
GPT-4 excels at mundane tasks. I'm convinced everyone should use it for tests, doc strings, and commit messages.
Best for "Pure" Projects
Twister is a pretty pure computer science project; there isn't any real "business logic". I think GPT-4 does better at this kind of work.
Keep GPT-3.5 Handy
Use GPT-3.5 for simpler tasks. Saves you on your GPT-4 quota.
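If you want to track the quota yourself, a small counter is enough. The sketch below assumes the limit quoted in the addendum prompt (25 prompts per 3 hours). It also assumes a sliding window, which may not match how ChatGPT actually counts. `QuotaTracker` and `pick_model` are hypothetical names. The returned strings are plain labels, not API model identifiers.

```python
from collections import deque


class QuotaTracker:
    """Sliding-window counter for prompts sent to a rate-limited model."""

    def __init__(self, max_prompts: int = 25, window_seconds: float = 3 * 60 * 60):
        self.max_prompts = max_prompts
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps (seconds) of prompts inside the window

    def remaining(self, now: float) -> int:
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        return self.max_prompts - len(self._sent)

    def record(self, now: float) -> None:
        self._sent.append(now)


def pick_model(tracker: QuotaTracker, now: float, simple_task: bool) -> str:
    """Use GPT-3.5 for simple tasks or when the GPT-4 quota is spent."""
    if simple_task or tracker.remaining(now) <= 0:
        return "gpt-3.5"
    tracker.record(now)
    return "gpt-4"
```

Pass `time.time()` as `now` in real use. Taking the timestamp as an argument just makes the behavior easy to check by hand.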
Why Not GitHub Copilot?
Copilot employs GPT-3.5. Compared to GPT-4, it is noticeably less effective. Copilot X offers improvements, yet it's tied to Visual Studio and VS Code. However, the IntelliJ IDE is way more pleasant for Java than VS Code. So I just used ChatGPT. In the long run, IDE integration will certainly improve. Yet, for now, GPT-4 offered the optimal solution for Twister.
Future work
The Twister library is a small project right now. I want to add:
- Avro default support
- Avro logical type support
- Protobuf WKT support
- Avro Record ➡️ Map wrapper
- Protobuf Message ➡️ Map wrapper
- .proto ➡️ Protobuf Descriptor converter
- JDBC row ➡️ Map wrapper
I'll continue having GPT write the code in this library. The experiment continues!
Conclusion
Though this post is about building with GPT-4, I want to reiterate that Twister is a real project that I actually want people to use. It's a pretty cool library. If you're a Java developer dealing with Protobuf or Avro, check it out! Contributions are welcome, too (whether from GPT-4 or humans).
Addendum
Here's the prompt I used to generate this blog post:
The blog post should focus on my experience using GPT-4 to write an entire library.
The post should also talk about GPT-4 tricks I learned while building this project:
- Start small (don't ask GPT-4 to write all features in a class at once)
- Ask GPT for a basic class, then ask it to write a test for the class. If the tests fail, tell GPT, and have it fix the tests. Then go back to the basic class and ask GPT-4 to add the next feature. Rinse and repeat.
- Always re-paste the code you want GPT-4 to update. It has bad long-term memory. I frequently use a 2 part template: "Given this code: … Can you update it to …"
- Tell it to just show changes, not the complete code. GPT-4 is slow, so telling it to skip unchanged code helps get you answers faster.
- Don't be afraid to iterate. Many people get a response from GPT-4, and if it's wrong, they declare that it sucks. Instead, keep asking it to fix things.
- GPT-4 is really good for grunt work (tests, docs, commit messages)
- GPT-4 is also really good for more "pure" projects like Twister, where it doesn't have to understand a lot of business logic.
- Keep a GPT-4 and a GPT-3.5 window open, so you can bounce to the GPT-3.5 window for more simple work. This will save you on your GPT-4 quota (currently 25 prompts per-3h window).
The blog post style should be:
- Matter-of-fact.
- No sentence should be more than 12 words.
- Include links to external sites where appropriate.
- Written in first-person.
- The post should include a bullet-point list of the tips near the introduction.
- The post should be written in markdown.
- The post should include a catchy title that will get attention on Hacker news.
- The post should include a section for each bullet point in the intro.
- The intro should say that the the code, docs, git commit messages, tests, README.md, and even this blog post were all written with GPT-4.
I pasted in a rough outline with some notes before this prompt.