OpenAI Dev Day launch

Benjamin D. Brodie · 2 min read

This week, OpenAI launched support for built-in retrieval-augmented generation (RAG), part of a larger strategy to let non-developers create custom GPTs for specific use cases.

Let's take a quick look at the functionality:

The new Assistants API, available in beta, makes it possible to create and reference files within the context of a user conversation. This is great for processing intermediate results of calculations, or for caching information retrieved from functions or APIs. It also adds a new form of observability for the end user, who can simply ask to see the data contained in a file and optionally download it for use outside the assistant. Note: not all file types are supported.
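Because not every file type is supported, it can be worth failing fast before uploading. The allow-list below is a hypothetical subset for illustration only; check OpenAI's documentation for the authoritative list:

```python
from pathlib import Path

# Hypothetical allow-list: a subset of common formats. The real list of
# supported file types lives in OpenAI's docs and may differ.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".csv", ".json", ".docx"}

def is_supported_file(filename: str) -> bool:
    """Return True if the extension is on our (assumed) allow-list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported_file("report.PDF"))   # True: the check is case-insensitive
print(is_supported_file("archive.tar"))  # False: not on the allow-list
```

A check like this keeps unsupported uploads out of the request path entirely, rather than surfacing the failure as an API error.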

From a developer perspective, the new Assistants API adds long-requested support for persistent threads. Until now, a developer had to resend every previous message, from both the user and the assistant, with each new message. Now each thread is assigned an ID, and new messages are appended to a stateful thread stored on OpenAI's servers. An added benefit is that OpenAI intelligently handles threads that exceed the context window limit, truncating earlier messages as needed.
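To make the shift concrete, here is a toy in-memory model of that behavior. This is not the OpenAI SDK: `StatefulThread`, the token budget, and the crude token estimate are all invented for illustration. The point is that the client now sends only the new message plus a thread ID, and truncation of old messages happens server-side:

```python
from dataclasses import dataclass, field

@dataclass
class StatefulThread:
    """Toy model of a server-side thread (names invented for illustration)."""
    thread_id: str
    max_tokens: int                       # stand-in for the context window limit
    messages: list[dict] = field(default_factory=list)

    def append(self, role: str, content: str) -> None:
        # The client sends only the new message plus the thread ID ...
        self.messages.append({"role": role, "content": content})
        # ... and the server truncates the oldest messages as needed.
        while self._token_count() > self.max_tokens and len(self.messages) > 1:
            self.messages.pop(0)

    def _token_count(self) -> int:
        # Crude token estimate: roughly one token per four characters.
        return sum(len(m["content"]) // 4 + 1 for m in self.messages)

thread = StatefulThread(thread_id="thread_abc123", max_tokens=8)
thread.append("user", "What is a persistent thread?")
thread.append("assistant", "Server-side message history.")
print(len(thread.messages))  # the earliest message was truncated to fit
```

With the real API the truncation policy is OpenAI's, not yours; this sketch just shows why the client no longer needs to manage the full history itself.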

Finally, the new support for Files means that assistant developers can maintain a persistent store of files that is available to all conversations by default. This makes it incredibly easy to get started with a RAG flow. The downside is that the RAG implementation is not very customizable, at least not yet.
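As a mental model (again, not the actual SDK; these names are invented), assistant-level files behave like defaults visible to every conversation, while files created inside a thread stay scoped to that thread:

```python
# Toy model of the two file scopes described above; names are invented.
assistant_files = {"file_handbook", "file_pricing"}  # attached to the assistant

def files_visible_in_thread(thread_files: set[str]) -> set[str]:
    """Every conversation sees the assistant's default files plus its own."""
    return assistant_files | thread_files

print(sorted(files_visible_in_thread({"file_upload_42"})))
print(sorted(files_visible_in_thread(set())))  # a fresh thread still sees the defaults
```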

For our use case, where we are experimenting with different prompts for different tasks, retaining control over the retrieval step is well worth the extra setup effort. But given OpenAI's focus on letting non-developers build assistants without writing any code, I can see why they offer this option.
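For a sense of what "retaining control over retrieval" can mean, here is a minimal sketch of a custom retrieval step. Every name here is invented, and the keyword-overlap scoring is deliberately naive (real pipelines typically rank by embedding similarity); the structure is what matters: score chunks against the query, keep the top matches, and splice them into the prompt yourself.

```python
def score(chunk: str, query: str) -> int:
    """Count query words that appear in the chunk (toy relevance score)."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for w in query.lower().split() if w in chunk_words)

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return up to k chunks ranked by keyword overlap, dropping zero scores."""
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    return [c for c in ranked[:k] if score(c, query) > 0]

def build_prompt(chunks: list[str], query: str) -> str:
    """Splice the retrieved context into the prompt we control."""
    context = "\n".join(retrieve(chunks, query))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Threads persist messages on the server.",
    "Files can be attached to an assistant.",
    "Bananas are yellow.",
]
print(build_prompt(docs, "how do threads persist messages"))
```

Owning this step is what lets you vary the prompt template, the chunking, and the ranking per use case, which is exactly the flexibility the built-in retrieval currently trades away for simplicity.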