LLM Batch Processing with Code Interpreter
As my previous articles demonstrate, Code Interpreter is capable of much more than any other AI model I know of. Its awesome LLM (let's call it GPT4.5), combined with code interpretation (let's call it CI) via file upload, storage, and sandboxed code execution, makes it an LLM agent that can iteratively (1) generate GPT4.5 output, including code, (2) use CI to store data and produce output from code execution, and (3) reflect on the result to gain information for the next iteration, either to perform the current subtask better or to move on to the next subtask.
However, GPT4.5 has a token limit (i.e. context length) of 8192 tokens, which limits the number of iterations the LLM agent can perform in one go. Of course, GPT4.5's token limit only applies to GPT4.5's own output, so as long as GPT4.5 generates code that in turn iterates over the uploaded and stored data without printing too much output, CI can run many iterations without reaching GPT4.5's token limit. But sometimes you want to iterate over your data and use the LLM in each iteration. This kind of LLM batch processing is very useful, e.g. for evaluations, such as running HumanEval on Code Interpreter itself, or for data generation, such as creating a medical tiny story for each ICD-10 code. In these cases, GPT4.5 itself (as opposed to the code that CI executes) needs to iterate over the data.
For the HumanEval benchmark evaluation in my previous post, I could easily fit 5 iterations in one go, i.e. within one GPT4.5 output. You can further increase the number of iterations GPT4.5 can do in one go by reducing GPT4.5's output, e.g. by using "notalk;justgo". But in my experiments, GPT4.5 thinks less step by step and yields much worse results if you limit its verbosity. Thus, you get the best results if you make GPT4.5 do a single iteration per output and automate the chat.
Lacking access to https://www.multion.ai, I came up with the following JavaScript code, to be executed in the browser, to automate the chat, with the help of the browser developer tools and, of course, Code Interpreter:
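The sketch below shows the shape of such an automation script, not my exact snippet. The DOM selectors, the iteration count, and the prompt builder `nextPrompt` are assumptions for illustration; inspect the live page with the developer tools and adapt them to your task.

```javascript
const TIME_INTERVAL = 60 * 1000;   // ms between messages; tune to your task and usage cap
const TOTAL_ITERATIONS = 164;      // hypothetical: number of data items to process

// Build the instruction for iteration `i` (hypothetical prompt; adapt to your task).
function nextPrompt(i) {
  return `Process line ${i} of the uploaded file and append the result to results.csv.`;
}

// Type the text into the chat box and click send.
// The selectors are assumptions; inspect the page and adjust them.
function sendMessage(text) {
  const box = document.querySelector('textarea');
  box.value = text;
  box.dispatchEvent(new Event('input', { bubbles: true })); // make the UI register the change
  document.querySelector('button[type="submit"]').click();
}

// Start the loop only when running inside a browser page.
if (typeof document !== 'undefined') {
  let i = 1;
  const timer = setInterval(() => {
    if (i > TOTAL_ITERATIONS) { clearInterval(timer); return; }
    sendMessage(nextPrompt(i));
    i += 1;
  }, TIME_INTERVAL);
}
```

Paste this into the browser console on the chat page; `setInterval` then sends one iteration's prompt per tick until all items are processed.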
If you set TIME_INTERVAL high enough, you avoid creating too many messages, which would exceed your usage cap and lead to the following error:
You've reached the current usage cap for GPT-4. You can continue with the default model now, or try again after 10:59 PM. Learn more
But after you have idled for about 30 minutes, CI drops all of its storage. In my experiments with a relatively hard task that takes about 2 minutes per iteration (after a few more minutes, CI would time out), I set TIME_INTERVAL to around 1 minute and did not run into the usage cap (which is supposed to be 50 messages every 3 hours).
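As a back-of-envelope check on these numbers (assuming the cap really is 50 messages per 3 hours), the minimum average spacing between messages works out as follows; note that the ~2 minutes CI spends working per iteration add to the effective spacing, which is why a TIME_INTERVAL well below this minimum can still keep you near or under the cap:

```javascript
// Minimum average spacing between messages implied by the usage cap.
const CAP_MESSAGES = 50;                           // messages allowed...
const CAP_WINDOW_MS = 3 * 60 * 60 * 1000;          // ...per 3-hour window (10,800,000 ms)
const minSpacingMs = CAP_WINDOW_MS / CAP_MESSAGES; // 216000 ms = 3.6 minutes per message
```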
Even if you have set TIME_INTERVAL high enough, it is advisable to sporadically tell GPT4.5 to offer its data as a download, and to download that backup. If multiple files are involved, you can tell GPT4.5 to create a zip file. If you happen to exceed your usage cap in spite of a high TIME_INTERVAL, you can simply re-upload the data you backed up and resume with the first iteration not included in your backup, e.g.:
I lost the last couple of outputs, please continue from line 95. I uploaded `backup5.zip` with the relevant data (the original upload and derived data up to line 94).