While it's a cool idea on its own, I don't get why they're trying to reinvent it as a new system. We've got Swagger, OpenAPI, GraphQL and many other systems that already describe APIs. They mostly include documentation too. Why not just expose those for the same effect? (If I were cynical, I'd guess Vercel wants a proprietary thing of its own, at the cost of portability.)
OpenAPI, JsonSchema, GraphQL all describe *Data*.
This describes *User Interfaces*. The closest alternative would be to just return React JS code directly.
But this adds a layer of constraint and control, preventing LLMs from generating e.g. malicious React JS code.
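To make the "constraint layer" point concrete, here's a minimal sketch (illustrative only, not Vercel's actual API): the model emits a JSON tree, and the renderer refuses any component type outside a fixed catalog, so arbitrary code can never sneak in.

```python
# Hypothetical sketch: validate an LLM-generated UI tree against a
# fixed component catalog before rendering. All names are illustrative.
ALLOWED_COMPONENTS = {"Card", "Row", "Text", "Button"}

def validate(node: dict) -> None:
    """Recursively reject any component type outside the catalog."""
    if node["type"] not in ALLOWED_COMPONENTS:
        raise ValueError(f"unknown component: {node['type']}")
    for child in node.get("children", []):
        validate(child)

tree = {
    "type": "Card",
    "children": [
        {"type": "Text", "props": {"value": "Hello"}},
        {"type": "Button", "props": {"label": "OK"}},
    ],
}
validate(tree)  # passes; a {"type": "Script"} node would raise
```

The point is that a JSON tree can only name things, never define behavior, so the trust boundary stays in the renderer.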
It's the Vercel way. There have been plenty of experiments leading up to this (even by Vercel employees before they joined), but it's re-packaged as "the" solution rather than just a renderer that turns props (the tool schema) into UI.
OpenAPI is great, there are a lot of tools to go from OpenAPI to UI, but you don't have a lot of control over the presentation.
For instance, you can specify that "first_name", "last_name" and "email" are strings, but not that first/last name should sit next to each other with email on its own row.
There are supersets of OpenAPI that can control the look and feel more. JSON Forms, for instance.
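For what it's worth, JSON Forms does this by pairing the data schema with a separate UI schema. A rough sketch of the first/last-name layout from the comment above (simplified, as Python dicts; the real thing is plain JSON):

```python
# The data schema only says what the fields are...
data_schema = {
    "type": "object",
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "email": {"type": "string", "format": "email"},
    },
}

# ...while a separate UI schema says how to lay them out.
ui_schema = {
    "type": "VerticalLayout",
    "elements": [
        {   # first/last name side by side
            "type": "HorizontalLayout",
            "elements": [
                {"type": "Control", "scope": "#/properties/first_name"},
                {"type": "Control", "scope": "#/properties/last_name"},
            ],
        },
        # email on its own row
        {"type": "Control", "scope": "#/properties/email"},
    ],
}
```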
This is cool, too, though.
those describe server APIs
how would it relate to ui?
OpenAPI is a superset of JSON Schema. You can look at properties in JSON Schema and turn that into UIs.
For instance, strings would get a text box, enums would get a dropdown, etc., with validation and everything.
Check this out as an example: https://prismatic.io/docs/jsonforms/playground/
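The schema-to-widget mapping those tools do is roughly this (a simplified sketch; real libraries handle formats, nesting, and validation too):

```python
# Sketch: derive a default widget for each property of a JSON Schema
# object, the way schema-to-form tools typically do.
def widget_for(prop: dict) -> str:
    if "enum" in prop:
        return "dropdown"
    if prop.get("type") == "boolean":
        return "checkbox"
    if prop.get("type") in ("integer", "number"):
        return "number_input"
    return "text_box"  # default for strings etc.

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "role": {"type": "string", "enum": ["admin", "viewer"]},
        "active": {"type": "boolean"},
    },
}
widgets = {k: widget_for(v) for k, v in schema["properties"].items()}
# widgets == {"name": "text_box", "role": "dropdown", "active": "checkbox"}
```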
While this is interesting, I'd suggest looking into `yaml` output for LLMs; it's cheaper in terms of tokens and more closely aligned with plain English / markdown.
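A rough illustration of the token-cost point: YAML drops the braces, quotes and commas JSON needs, so the same flat record serializes shorter (hand-rolled here for a flat dict to stay dependency-free; nested data would want PyYAML).

```python
import json

record = {"first_name": "Ada", "last_name": "Lovelace", "role": "admin"}

as_json = json.dumps(record)
# Minimal YAML-style rendering for a flat dict.
as_yaml = "\n".join(f"{k}: {v}" for k, v in record.items())

assert len(as_yaml) < len(as_json)
```

Shorter text doesn't map one-to-one onto fewer tokens, but the punctuation JSON adds does tend to cost extra.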
I’ve had some success building “text to dashboard” with this using vercel.
I use bash-tool and Vercel sandbox to generate charts (ECharts) or tables (TanStack Table) from JSON data, and then json-render to render the charts, tables and markdown into a dashboard.
Please share as I would like to see what you have built.
What I like about this is the idea of a catalog, which is what most business systems have in the form of their records and objects. Giving an AI an accessible structure for this gets AI into the realm of the various 4GLs back in the late 90s, which made user-created forms so much easier. Anybody remember Informix 4GL for building simple apps from the db schema?
Is it reliable/robust?
It is more robust than when I tried the exact same thing with the structured outputs API and GPT-4 era models. It's not perfect, but surprisingly good.
Also see: https://repalash.com/uiconfig.js/
Neat. I’ve actually been planning on building a cut rate version of this for a project that I’ve been working on. Hopefully I can just use this instead :-)
This would be a dev time dependency I imagine? Team A provides the catalogue of components and product devs can vibe code their UI. This would also be good for prototype/design. Makes sense.
The JSON here is to ease the machine's ability to generate UI, but reciprocally it feels like this could also be a useful render tree that AI could read & fire actions on too.
There's some early exploration of using accessibility APIs to empower LLMs. This also feels like a super simple & direct intermediate format, one that could maybe serve as a more direct app model for LLMs to use.
More broadly it feels like we have a small crisis forming, of computing having too many forms. We had CLI tools, Unix. Then we made GUIs, which are yet another way to invoke tools (and more). Then webapps, where the page is what expresses tools (and more). Then React virtualized the page and supplanted the DOM. Now we have JSON that expresses views & tool calling. Also, tools now need to be expressed as MCP as well, for AI to use them. Serverless and HTTP and endless tRPC and Cap'n Proto and protobuf ways to call functions/invoke tools. We keep making new ways to execute! Is there value in each one being distinct, in them all having their own specific channels of execution?
MCPs are a dead end. CLIs are just better, already did all the things MCPs struggle with, and are human usable. Plus you can use bash or nushell to do all sorts of fun things with command output.
> There's some early exploration of using accessibility APIs to empower LLMs.
any examples come to mind?
The popular Playwright MCP uses the Chrome accessibility tree to help agents navigate websites: https://github.com/microsoft/playwright/blob/ed176022a63add8...
I tried to have Cursor use the playwright MCP to click a few buttons on my project as a test and while it did do what I asked successfully, it burned through like 150 premium requests in 5 minutes.
I guess if you’re totally insensitive to the cost you can use this.
Chris Shank & Orion Reed's work is always excellent. https://bsky.app/profile/chrisshank.com/post/3m3q23xpzkc2u