Using GPT to Compile Pseudocode: A New Approach to Programming
Like everyone on the internet, I've been playing around with large language models (LLMs), and it has changed the way I think about and write code.
Today, I'm sharing how I start most of my coding sessions, by using GPT as a compiler to translate pseudocode into real code.
All examples below were generated using chatgpt-3.5.
NOTE: While you generally get better results running chatgpt-4 instead of 3.5, I find that the much higher response latency isn't worth the incremental improvements that chatgpt-4 provides.
What is Pseudocode
Pseudocode is a plain language description of the steps in an algorithm or another system. Pseudocode often uses structural conventions of a normal programming language but is intended for human reading rather than machine reading. Source: Wikipedia
Pseudocode uses plain language to describe a particular algorithm to humans. The output is prose that is easily understood by humans but not executable by machines.
Large language models are trained to take plain language intended for humans and generate effective outputs from the description. Code, aka instructions that can be executed by machines, is one of those outputs.
So what if we combine them? Let's have our cake and eat it too!
Why Pseudocode
Normally, when you see examples of GPT-generated code, they come in the form of simple directives like "fetch the top 50 items from hacker news" or "get the stock price for Google".
While these directives do generate executable code (most of the time), they do not enforce fine-grained constraints. Constraints arise in most real-world applications - maybe you're working with an existing code base that limits what dependencies you can use or you have to accomplish a complicated workflow in a particular sequence.
Pseudocode works well in these situations and allows you, the programmer, to have all the benefits of LLMs while still retaining control of the output.
Case Study: Fetching All the AWS Docs
Today's example goes over the actual workflow I used to create AWS Reference Notes - this project attempts to consolidate all known gotchas, limitations and caveats of various AWS services.
AWS docs are available in public git repositories. As part of the project, I needed to implement logic that could clone these repos into a specific directory layout. I also needed the code to use the `isomorphic-git` library, which allows git to be run both on the web and on a server.
Prompt 1
My initial prompt to GPT:
Convert the given pseudocode into valid typescript.
The following third-party libraries will be available:
- isomorphic-git as git
The following operators are natural language operations. Translate these directives into valid typescript. Make sure to implement all logic.
- `%{action}%`: implement the desired {action} using what you know about the language and available libraries
- `%{library}|{action}%`: implement the desired {action} using the suggested {library}. you MUST use functions available in {library}. if none exist, create a helper function that accomplishes the desired {action}
Pseudocode below:
```ts
export main(services: string[]) {
  %set basedir to `__dirname/..`%
  basedir =
  await Promise.all services.map( async service => {
    await upsertDevGuide({service, basedir})
  })
}

awsgitrepo(service: string) {
  return `https://github.com/awsdocs/${service}-developer-guide.git`
}

upsertDevGuide(opts: {service: string, basedir: string}) {
  guidePath = path.join(basedir, "docs", service, "developer-guide")
  if !(
    %git|check if "guidePath" exists and is a git repo%
  ) {
    %ensure that the directory for "guidePath" exists, create if it does not%
    %git|run git clone at "guidePath" using output of "awsgitrepo(service)"%
  } else {
    %git|run git pull at "guidePath"%
  }
}
```
This prompt has 3 directives:
- specify the desired output format (`typescript`)
- specify the availability of third-party libraries (`isomorphic-git`)
- specify natural language operations (NLOs) that we want GPT to translate into code
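To make the three directives concrete, here is a minimal sketch of a helper that assembles them into a single prompt string. `buildPrompt` and `PromptOptions` are hypothetical names introduced for illustration; the wording of each line is copied from the prompt above.

```typescript
// Hypothetical helper (not part of any library): assembles the three
// directives into one prompt string.
interface PromptOptions {
  language: string; // directive 1: desired output format
  libraries: { name: string; alias: string }[]; // directive 2: available libraries
  pseudocode: string; // pseudocode containing NLOs
}

function buildPrompt(opts: PromptOptions): string {
  const libraryLines = opts.libraries
    .map((lib) => `- ${lib.name} as ${lib.alias}`)
    .join("\n");

  return [
    `Convert the given pseudocode into valid ${opts.language}.`,
    "The following third-party libraries will be available:",
    libraryLines,
    // directive 3: the NLO definitions
    `The following operators are natural language operations. Translate these directives into valid ${opts.language}. Make sure to implement all logic.`,
    "- `%{action}%`: implement the desired {action} using what you know about the language and available libraries",
    "- `%{library}|{action}%`: implement the desired {action} using the suggested {library}. you MUST use functions available in {library}. if none exist, create a helper function that accomplishes the desired {action}",
    "Pseudocode below:",
    opts.pseudocode,
  ].join("\n");
}

const examplePrompt = buildPrompt({
  language: "typescript",
  libraries: [{ name: "isomorphic-git", alias: "git" }],
  pseudocode: '%git|run git pull at "guidePath"%',
});
console.log(examplePrompt);
```

Keeping the directives in one function makes it easy to swap the target language or library list without retyping the NLO definitions.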
There are two NLOs in this example:
- `%{action}%`: perform any action (wildcard action NLO)
- `%{library}|{action}%`: perform an action using the suggested library (library action NLO)
The wildcard NLO is useful when you want a specific side effect or state to happen that requires multiple steps (eg. ensuring a directory exists). The library NLO is useful when you know a library has a particular function and you can't be bothered to fill in all the boilerplate.
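The two NLO forms are regular enough to pick out mechanically, which can be handy for checking a prompt before sending it. The sketch below lists the NLOs in a piece of pseudocode. `parseNlos` and the `Nlo` type are hypothetical names, and the sketch assumes `%` is not also used as a modulo operator on the same line.

```typescript
// Hypothetical helper: lists the NLOs found in a block of pseudocode.
interface Nlo {
  kind: "wildcard" | "library";
  library?: string; // only set for library NLOs
  action: string;
}

function parseNlos(pseudocode: string): Nlo[] {
  const nlos: Nlo[] = [];
  // An NLO is any run of text between two % signs on a single line.
  const nloPattern = /%([^%\n]+)%/g;
  let match: RegExpExecArray | null;
  while ((match = nloPattern.exec(pseudocode)) !== null) {
    const body = match[1].trim();
    // A library NLO starts with an identifier-like name followed by "|".
    const libraryMatch = /^([\w-]+)\|(.+)$/.exec(body);
    if (libraryMatch) {
      nlos.push({
        kind: "library",
        library: libraryMatch[1],
        action: libraryMatch[2].trim(),
      });
    } else {
      nlos.push({ kind: "wildcard", action: body });
    }
  }
  return nlos;
}

const samplePseudocode = [
  'if !(%git|check if "guidePath" exists and is a git repo%) {',
  '  %ensure that the directory for "guidePath" exists, create if it does not%',
  "}",
].join("\n");

const foundNlos = parseNlos(samplePseudocode);
console.log(foundNlos);
```

Here the first NLO is reported as a library NLO for `git` and the second as a wildcard NLO.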
Response 1
Here's the conversion of the given pseudocode into valid TypeScript, taking into account the natural language operations:
```typescript
import * as git from 'isomorphic-git';
import * as path from 'path';

export async function main(services: string[]) {
  const basedir = path.join(__dirname, '..');
  await Promise.all(services.map(async (service) => {
    await upsertDevGuide({ service, basedir });
  }));
}

function awsgitrepo(service: string) {
  return `https://github.com/awsdocs/${service}-developer-guide.git`;
}

async function upsertDevGuide(opts: { service: string, basedir: string }) {
  const guidePath = path.join(opts.basedir, 'docs', opts.service, 'developer-guide');
  const isGitRepo = await git.isGitRepo({ fs, dir: guidePath }).catch(() => false);

  if (!isGitRepo) {
    await ensureDirectoryExists(guidePath);
    await git.clone({ dir: guidePath, url: awsgitrepo(opts.service) });
  } else {
    await git.pull({ fs, dir: guidePath });
  }
}

async function ensureDirectoryExists(directoryPath: string) {
  const fs = git.plugins.get('fs');
  const dirPath = path.dirname(directoryPath);

  // Create directory if it doesn't exist
  await fs.promises.mkdir(dirPath, { recursive: true });
}
```
Please note that in order to use `isomorphic-git` and its functions, you need to have it installed as a dependency in your TypeScript project.
The initial response from GPT doesn't compile but gets us most of the way there. It has the following issues:
- `git` operations require additional arguments (`http` and `fs` parameters)
- `git.isGitRepo` is a hallucination and not a real function
- `ensureDirectoryExists` uses `git.plugins.get('fs')` (which might be hallucinated) instead of the builtin `fs` module
Prompt 2
When first encountering issues, it's worthwhile to ask GPT to make corrections.
isGitRepo isn't a function
Note that the follow-up is much more concise than the initial prompt. GPT already has the prior context so only minimal prompting is needed to "fine-tune" the results.
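In the ChatGPT UI the prior context is carried along for you. If the same flow were driven through a chat-style API, which typically expects earlier turns to be resent with each request, the short follow-up would simply be one more message appended to the history. A minimal sketch, where the `ChatMessage` shape and the `addCorrection` helper are assumptions modeled on common chat APIs rather than any specific SDK:

```typescript
// Assumed message shape, modeled on common chat-style APIs.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Returns a new history with the correction appended as the next user turn.
// The original array is left untouched.
function addCorrection(messages: ChatMessage[], correction: string): ChatMessage[] {
  return [...messages, { role: "user", content: correction }];
}

// Placeholders stand in for the full prompt and response shown earlier.
const firstRound: ChatMessage[] = [
  { role: "user", content: "<Prompt 1: directives + pseudocode>" },
  { role: "assistant", content: "<Response 1: generated TypeScript>" },
];

const secondRound = addCorrection(firstRound, "isGitRepo isn't a function");
console.log(secondRound.map((m) => `${m.role}: ${m.content}`).join("\n"));
```

The correction itself stays one line long; everything GPT needs to interpret it is already in the earlier turns.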
Response 2
Apologies for the confusion. You're correct that `isGitRepo` is not a built-in function in `isomorphic-git`. Instead, we can check if the `.git` directory exists in the specified path to determine if it's a git repository. Here's the updated code:
```typescript
import * as git from 'isomorphic-git';
import * as fs from 'fs';
import * as path from 'path';

export async function main(services: string[]) {
  const basedir = path.join(__dirname, '..');
  await Promise.all(services.map(async (service) => {
    await upsertDevGuide({ service, basedir });
  }));
}

function awsgitrepo(service: string) {
  return `https://github.com/awsdocs/${service}-developer-guide.git`;
}

async function upsertDevGuide(opts: { service: string, basedir: string }) {
  const guidePath = path.join(opts.basedir, 'docs', opts.service, 'developer-guide');
  const isGitRepo = await checkIfGitRepo(guidePath);

  if (!isGitRepo) {
    await ensureDirectoryExists(guidePath);
    await git.clone({ dir: guidePath, url: awsgitrepo(opts.service) });
  } else {
    await git.pull({ dir: guidePath });
  }
}

async function checkIfGitRepo(directoryPath: string): Promise<boolean> {
  const gitDirPath = path.join(directoryPath, '.git');

  return new Promise<boolean>((resolve) => {
    fs.access(gitDirPath, fs.constants.F_OK, (err) => {
      resolve(!err);
    });
  });
}

async function ensureDirectoryExists(directoryPath: string) {
  const dirPath = path.dirname(directoryPath);

  // Create directory if it doesn't exist
  await fs.promises.mkdir(dirPath, { recursive: true });
}
```
This updated code checks if the `.git` directory exists in the specified path to determine if it's a git repository. If the `.git` directory exists, it is considered a git repository; otherwise, it is not.
Since GPT generates the entire code with the corrections, it can be hard to spot the changes. I've highlighted them in the diff below.
diff --git a/src/pseudocode.gpt35.1.ts b/src/pseudocode.gpt35.1.ts
index 6977cae..425cc2a 100644
--- a/src/pseudocode.gpt35.1.ts
+++ b/src/pseudocode.gpt35.1.ts
@@ -1,4 +1,5 @@
import * as git from 'isomorphic-git';
+import * as fs from 'fs';
import * as path from 'path';
export async function main(services: string[]) {
@@ -16,20 +17,29 @@ function awsgitrepo(service: string) {
async function upsertDevGuide(opts: { service: string, basedir: string }) {
const guidePath = path.join(opts.basedir, 'docs', opts.service, 'developer-guide');
- const isGitRepo = await git.isGitRepo({ fs, dir: guidePath }).catch(() => false);
+ const isGitRepo = await checkIfGitRepo(guidePath);
if (!isGitRepo) {
await ensureDirectoryExists(guidePath);
await git.clone({ dir: guidePath, url: awsgitrepo(opts.service) });
} else {
- await git.pull({ fs, dir: guidePath });
+ await git.pull({ dir: guidePath });
}
}
+async function checkIfGitRepo(directoryPath: string): Promise<boolean> {
+ const gitDirPath = path.join(directoryPath, '.git');
+
+ return new Promise<boolean>((resolve) => {
+ fs.access(gitDirPath, fs.constants.F_OK, (err) => {
+ resolve(!err);
+ });
+ });
+}
+
async function ensureDirectoryExists(directoryPath: string) {
- const fs = git.plugins.get('fs');
const dirPath = path.dirname(directoryPath);
-
+
// Create directory if it doesn't exist
await fs.promises.mkdir(dirPath, { recursive: true });
}
Note that GPT was able to fix the issue with the hallucinated `git.isGitRepo` function. In the process, it also simplified `ensureDirectoryExists` and switched to the node `fs` dependency. That said, it also took a step back by removing the (required) `fs` parameter from the `git.pull` call.
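If the two responses aren't saved to files where `git diff` can get at them, a small line diff is enough to surface what changed. The sketch below is a basic longest-common-subsequence diff. `diffLines` is a hypothetical helper, and real diff tools produce more compact output with hunks and context.

```typescript
// Minimal line-based diff using a longest-common-subsequence (LCS) table.
// Output lines are prefixed with "  " (unchanged), "- " (removed) or "+ " (added).
function diffLines(before: string, after: string): string[] {
  const a = before.split("\n");
  const b = after.split("\n");

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) row.push(0);
    lcs.push(row);
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both inputs from the start, emitting one output line per step.
  const out: string[] = [];
  let ai = 0;
  let bi = 0;
  while (ai < a.length && bi < b.length) {
    if (a[ai] === b[bi]) {
      out.push(`  ${a[ai]}`);
      ai++;
      bi++;
    } else if (lcs[ai + 1][bi] >= lcs[ai][bi + 1]) {
      out.push(`- ${a[ai]}`);
      ai++;
    } else {
      out.push(`+ ${b[bi]}`);
      bi++;
    }
  }
  // Whatever is left on either side is a pure removal or addition.
  while (ai < a.length) out.push(`- ${a[ai++]}`);
  while (bi < b.length) out.push(`+ ${b[bi++]}`);
  return out;
}

// Two lines taken from Response 1 and Response 2.
const response1Lines = [
  "const isGitRepo = await git.isGitRepo({ fs, dir: guidePath }).catch(() => false);",
  "if (!isGitRepo) {",
].join("\n");
const response2Lines = [
  "const isGitRepo = await checkIfGitRepo(guidePath);",
  "if (!isGitRepo) {",
].join("\n");

const changes = diffLines(response1Lines, response2Lines);
console.log(changes.join("\n"));
```

The table is quadratic in the number of lines, which is fine for responses of this size.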
Finishing Up
With more prompting, we can get GPT to fix the remaining issues with the code. I find that this isn't worthwhile as it becomes simpler to manually apply the remaining fixes at this point.
index 425cc2a..3bff3d9 100644
--- a/src/pseudocode.gpt35.1.ts
+++ b/src/pseudocode.gpt35.1.ts
@@ -1,6 +1,7 @@
import * as git from 'isomorphic-git';
import * as fs from 'fs';
import * as path from 'path';
+import http from 'isomorphic-git/http/node';
export async function main(services: string[]) {
const basedir = path.join(__dirname, '..');
@@ -21,9 +22,9 @@ async function upsertDevGuide(opts: { service: string, basedir: string }) {
if (!isGitRepo) {
await ensureDirectoryExists(guidePath);
- await git.clone({ dir: guidePath, url: awsgitrepo(opts.service) });
+ await git.clone({ dir: guidePath, url: awsgitrepo(opts.service), fs, http });
} else {
- await git.pull({ dir: guidePath });
+ await git.pull({ fs, dir: guidePath, http });
}
}
As you can see, only 3 lines needed to be changed to arrive at the working code.
Next Steps
There's a lot more to talk about that didn't make it into this article. Some things I'm currently experimenting with:
- enforcing high-level styles (eg. prefer object-based arguments vs positional arguments)
- additional operators (eg. `A := B`: extract anywhere from object operator. This extracts property "A" from "B". This requires inferring a path that will yield the value "A" from "B")
- automatically loading in additional context from the existing codebase
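As an illustration of the `A := B` operator mentioned above, here is one possible reading of what it could expand to: a breadth-first search for the first property named "A" anywhere inside "B". This is a guess at the semantics rather than a settled design. `extractAnywhere` is a hypothetical name, and the sketch assumes acyclic data such as parsed JSON.

```typescript
// One possible expansion of `A := B`: find the first property named `key`
// anywhere inside `source`, preferring shallower matches.
// Assumes `source` has no cycles (e.g. parsed JSON).
function extractAnywhere(source: unknown, key: string): unknown {
  const queue: unknown[] = [source];
  while (queue.length > 0) {
    const current = queue.shift();
    // Skip primitives and null: they cannot hold properties.
    if (current === null || typeof current !== "object") continue;
    const record = current as Record<string, unknown>;
    if (Object.prototype.hasOwnProperty.call(record, key)) {
      return record[key];
    }
    // Queue nested values so the search proceeds level by level.
    for (const childKey of Object.keys(record)) {
      queue.push(record[childKey]);
    }
  }
  return undefined;
}

// Made-up sample object for illustration.
const apiResponse = {
  meta: { requestId: "abc" },
  data: { repo: { defaultBranch: "main" } },
};

// Roughly what `defaultBranch := apiResponse` might compile to.
const defaultBranch = extractAnywhere(apiResponse, "defaultBranch");
console.log(defaultBranch);
```

In generated code, GPT would more likely emit the direct path (`apiResponse.data.repo.defaultBranch`) when it can infer it from the types; the search above is the fallback when it cannot.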
There's no end to the things you can do here; context size is the limit. If you have interesting examples of using GPT for code, please share them either in the comments below or reach out at kevin(a t)dendron.so. I'm currently thinking about putting together a repo to collect examples of how people code using GPT, and I'm looking for folks who are interested in contributing 🌱