Why not use LLM as the backend for anything?

My vision for a no-code app builder follows these four principles:

  1. No-code: builders shouldn’t have to interact with code when building the application, as that would impose a steep learning curve

  2. Never a blank slate: instead of a blank canvas, we will guide the builder and provide some auto-generated frontend and backend per the builder’s requirements

  3. Respect non-technical mental models of applications: instead of introducing programming concepts to builders, we make use of models that builders are familiar with when interacting with existing applications, like screens, flows, inputs, data rendering, etc.

  4. Customizable: once the frontend and backend are generated, builders should be empowered to modify them to their specific requirements

But just like any other application, the apps that builders create in this no-code app builder will have the same working components, namely a frontend and a backend.

So there are two major parts to making this app builder work: generating a frontend, and generating a backend that connects with the frontend, per the builder’s requirements.

(There are also other smaller moving parts, like building a preview & live update engine, deploying the application, provisioning & live-updating database infrastructure, etc.)

Why focus on the backend

I want to tackle how we might automagically generate the backend that connects with the frontend because creating a backend automatically is by far the harder challenge.

The precision of the frontend generated does not need to be high, because I expect builders would want to modify the frontend regardless, and non-technical builders are more confident playing around with the frontend.

The backend, by contrast, needs extremely high precision, because builders often will not and cannot modify it without some coding knowledge.

However, I have more confidence that AI can play a big part in generating simple backends because backends are more “taste-agnostic”. There is a lot of personal preference in how a frontend is implemented, mostly down to layout and styles. Therefore the code that is used to train LLMs may not be enough to make these taste and stylistic choices.

Backend code, on the other hand, is function over form. While there are still various ways to implement the same function, the variance is arguably lower, particularly for simple backends that perform CRUD (create, read, update, delete) operations. The code used for LLM training should contain enough of these cases to mimic and adapt to the builder’s specific requirements.

So for autogenerating backends, it is important to be highly accurate as builders would not want to or cannot modify it, but it also presents the biggest possibility for AI to fill the skill gap of a non-technical builder, and differentiate this from other no-code app builders.

My two approaches for automagical backend

My automagical backend has a few limitations. It relies on understanding the frontend and uses that understanding to support all frontend interactions, such as displaying information and handling form submissions. It also supports only CRUD operations against a database, plus some data transformation after data is retrieved or before it is sent to the database, because that is all most applications need.

I am considering two paradigms for automagically generating a backend, given the frontend:

  1. Pre-built Backend: Generate all the backend code at build time, and serve it at runtime as a static codebase

  2. Dynamic Backend: Have backend logic that takes in any API request and payload and, at runtime, dynamically determines what operations are needed to handle them
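
To make the contrast concrete, here is a minimal sketch of the two paradigms in plain JavaScript. It is not the actual implementation: the in-memory `db`, the route names, and the `stubDecide` function (which stands in for the LLM call) are all hypothetical and exist only so the sketch runs on its own.

```javascript
// Hypothetical in-memory "database" standing in for MongoDB.
const db = { users: [] };

// Paradigm 1: Pre-built Backend.
// One handler per frontend interaction, generated at build time.
const prebuiltRoutes = {
  "POST /signup": (payload) => {
    db.users.push({ email: payload.email });
    return { ok: true };
  },
  "GET /users": () => ({ ok: true, users: db.users }),
};

function handlePrebuilt(method, path, payload) {
  const handler = prebuiltRoutes[`${method} ${path}`];
  if (!handler) return { ok: false, error: "route not implemented" };
  return handler(payload);
}

// Paradigm 2: Dynamic Backend.
// A single handler for every request; `decideOperation` stands in for the
// LLM call that picks the CRUD operation at runtime.
function handleDynamic(method, path, payload, decideOperation) {
  const op = decideOperation(method, path, payload);
  if (op.type === "insert") {
    db[op.collection] = db[op.collection] || [];
    db[op.collection].push(op.document);
    return { ok: true };
  }
  if (op.type === "find") {
    return { ok: true, [op.collection]: db[op.collection] || [] };
  }
  return { ok: false, error: `unsupported operation: ${op.type}` };
}

// Hard-coded stub so the sketch runs without an LLM (an assumption, not the real logic).
function stubDecide(method, path, payload) {
  const collection = path.replace(/^\//, "");
  return method === "POST"
    ? { type: "insert", collection, document: payload }
    : { type: "find", collection };
}
```

The pre-built version rejects any route it was not generated for, while the dynamic version accepts any route and leaves the decision to whatever `decideOperation` returns.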

In essence, the differences between the two approaches are:

  1. Time of generation: for Pre-built Backend, the backend is automagically generated while the builder is building the application. For Dynamic Backend, the backend logic is generated at runtime, depending on which API call and payload the users pass.

  2. Code complexity: for Pre-built Backend, every frontend interaction (e.g. a button click) gets its own backend logic function (e.g. an API route), which increases the complexity of the backend codebase and exposes more points of breakage. Dynamic Backend has only one universal backend logic function that every frontend interaction triggers. It then uses AI to determine the CRUD operation and how to transform the payload.

  3. AI generation scaling: for Pre-built Backend, I ask the AI to generate code once for every UI interaction the builder wants to implement, plus any re-prompting to change the logic of that interaction. So it scales with the number of UI interaction points. For Dynamic Backend, I need to generate the backend logic every time the user (not the builder) interacts with any interactive UI element. So it scales with user activity. This is important because every code generation costs credits.
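
A back-of-the-envelope sketch of the scaling difference in point 3 follows. Every number here is made up purely for illustration, and the function names are hypothetical; only the shape of the two formulas comes from the description above.

```javascript
// Pre-built Backend: one generation per UI interaction point,
// plus the re-prompts needed to adjust each one.
function prebuiltGenerations(interactionPoints, repromptsPerInteraction) {
  return interactionPoints * (1 + repromptsPerInteraction);
}

// Dynamic Backend: one generation per end-user request,
// for as long as the app is running.
function dynamicGenerations(endUsers, requestsPerUserPerDay, days) {
  return endUsers * requestsPerUserPerDay * days;
}

// Illustrative inputs only (assumptions, not measurements).
const prebuilt = prebuiltGenerations(20, 3);     // 20 * (1 + 3) = 80
const dynamic = dynamicGenerations(100, 10, 30); // 100 * 10 * 30 = 30000

console.log({ prebuilt, dynamic });
```

The pre-built count is bounded by the size of the app, while the dynamic count keeps growing with usage.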

I know the gist of how to build the Pre-built Backend engine, so Dynamic Backend was more interesting for me to explore because of its one-function-to-rule-them-all nature. But after two days of building a proof-of-concept, it turned out to be a bad implementation of an automagical backend.

How does Dynamic Backend work and how to build it?

Inspired by a hackathon project at Scale AI in January 2023, a Dynamic Backend is essentially an API server that has a catch-all route. It accepts any API endpoint name and any payload and asks the LLM to determine what the state change to the existing database looks like.

Every time the application calls an API, the backend will:

  1. Take the API route name and the payload as the context

  2. Fetch the schema of the database as another component of the context

  3. Ask the LLM which CRUD operation to perform based on the context

  4. Perform the CRUD operation on a database (in my case, MongoDB)

In fewer than 40 lines of code, the backend is done and it supports any CRUD operation on any type of data it receives. That’s super cool 😎

// Excerpt: assumes `app` (Express with a JSON body parser), `fs`, `openai` (v3 SDK client),
// `schemaPath`, `getSystemPrompt`, and `getUserPrompt` are defined elsewhere in the file.
app.all("/*", async function (req, res) {
  // Rebuild the requested path from the wildcard route parameters
  const endpoint = `/${Object.values(req.params).join("/")}`;
  const body = req.body;

  // BUILD THE PROMPTS FROM THE ROUTE, PAYLOAD, AND CURRENT SCHEMA
  const bodyString = JSON.stringify(body, null, 2);
  const systemPrompt = getSystemPrompt(process.env.MONGO_DB_DATASOURCE, process.env.MONGO_DB_DATABASE);
  const existingSchema = fs.readFileSync(schemaPath, "utf8");
  const userPrompt = getUserPrompt(endpoint, bodyString, existingSchema);

  const chatCompletion = await openai.createChatCompletion({
    model: "gpt-4",
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userPrompt },
    ],
  });
  const response = chatCompletion.data.choices[0].message.content;

  // PARSE THE RESPONSE (throws if the LLM strays from the expected format)
  // Assumption: the route parsed here is the path appended to the MongoDB base URL below
  const remainingRoute = response.match(/!API route!:([^]+)!API body!:/)[1].trim().match(/\/.*/)[0];
  const apiBody = JSON.parse(response.match(/!API body!:([\s\S]*?)!Updated schema!:/)[1].trim().replace(/^```json/, "").replace(/```$/, ""));
  const updatedSchema = JSON.parse(response.match(/!Updated schema!:([\s\S]+)/)[1].trim().replace(/^```json/, "").replace(/```$/, ""));

  // MAKE THE MONGODB CALL
  const mongoAPIEndpoint = `${process.env.MONGO_DB_URL}${remainingRoute}`;
  const mongoAPIHeaders = { "Content-Type": "application/json", "api-key": process.env.MONGO_DB_API_KEY };
  const mongoResponse = await fetch(mongoAPIEndpoint, {
    method: "POST",
    headers: mongoAPIHeaders,
    body: JSON.stringify(apiBody),
  });
  const data = await mongoResponse.json();

  // WRITE THE SCHEMA BACK TO FILE
  fs.writeFileSync(schemaPath, JSON.stringify(updatedSchema));
  res.send(data);
});

Behind the scenes, the hardest part is writing the prompts (built by getSystemPrompt and getUserPrompt in the code above) that can

  1. Take the API name, the payload, and the existing data schema as the context

  2. Determine the right MongoDB API to use to perform the right CRUD operation on the database

  3. Return the MongoDB API endpoint and body in the right format for the backend to call it
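
To give a feel for what such a prompt has to do, here is a hypothetical version of `getUserPrompt`. The wording is my own illustration and not the prompt from the repository; only the function signature and the three `!...!:` section markers are taken from the handler code, and the example action path assumes the MongoDB Atlas Data API.

```javascript
// Hypothetical prompt builder: the wording below is illustrative only.
function getUserPrompt(endpoint, bodyString, existingSchema) {
  return [
    `The application called the API route: ${endpoint}`,
    `Request payload:\n${bodyString}`,
    `Current database schema:\n${existingSchema}`,
    "",
    "Decide which single MongoDB operation fulfils this request.",
    "Reply in exactly this format and nothing else:",
    // Assumed example path; the real prompt may describe the API differently.
    "!API route!: <MongoDB API path, e.g. /action/insertOne>",
    "!API body!: <JSON body for that MongoDB call>",
    "!Updated schema!: <JSON schema of the database after the operation>",
  ].join("\n");
}

// Example usage with made-up inputs.
const examplePrompt = getUserPrompt(
  "/signup",
  JSON.stringify({ email: "a@b.c" }, null, 2),
  JSON.stringify({ users: { email: "string" } })
);
console.log(examplePrompt);
```

The three markers matter because the handler splits the LLM response on them, so any drift in the output format breaks the parsing step.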

Why doesn’t it work

Technically, it works. It actually feels magical that it can figure out how to sign up and log in a user without having any specific API route implemented.

However, there are a few things that made it not suitable for production:

  1. Because the backend logic that handles each API request is generated by the LLM at runtime, and an LLM’s responses are not deterministic, it will sometimes generate output that cannot be parsed into a MongoDB database operation, which breaks the backend

  2. Every API request incurs a call to the LLM to translate the API name and body into the appropriate MongoDB database operation, so the LLM cost scales with the number of user requests, making it unsustainable as a backend service

  3. On top of point 2, because every API request incurs a call to the LLM, it adds to the request time. Since LLMs currently take a while to respond, and we need the whole response in order to make the correct MongoDB database operation, I ended up adding 2-5 seconds per request just for the LLM to translate the API name and payload into the right database operation
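
Point 1 can at least be made to fail gracefully. The sketch below is one possible defensive parser, not the code from the repository: `parseLlmResponse` and `stripFence` are hypothetical names, and it returns an error object instead of throwing when the LLM output drifts from the `!API route!:` / `!API body!:` / `!Updated schema!:` format.

```javascript
// Three backticks, built indirectly so the sketch can sit inside a fenced block.
const FENCE = "`".repeat(3);

// Remove an optional fenced-JSON wrapper around a JSON payload.
function stripFence(text) {
  let t = text.trim();
  if (t.startsWith(FENCE)) t = t.slice(FENCE.length).replace(/^json/, "");
  if (t.endsWith(FENCE)) t = t.slice(0, -FENCE.length);
  return t.trim();
}

// Hypothetical helper: parse the LLM response without ever throwing.
function parseLlmResponse(response) {
  const match = response.match(
    /!API route!:([\s\S]*?)!API body!:([\s\S]*?)!Updated schema!:([\s\S]+)/
  );
  if (!match) return { ok: false, error: "missing section markers" };

  // Take the first path-like token in the route section.
  const route = (match[1].match(/\/\S*/) || [])[0];
  if (!route) return { ok: false, error: "no route found" };

  try {
    return {
      ok: true,
      route,
      apiBody: JSON.parse(stripFence(match[2])),
      updatedSchema: JSON.parse(stripFence(match[3])),
    };
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
}
```

With something like this, the catch-all route could answer with an error status when `ok` is false rather than crashing, although it does nothing about the cost and latency in points 2 and 3.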

What I learned in building this dynamic backend is that, because an LLM is non-deterministic, asking it to generate code works better than asking it to output a response in a set schema. Code can be written in many different ways and still work as intended. But if the dynamic backend needs to parse a response in a certain format, the LLM may not be able to generate responses in that format every single time.

The limitation of an LLM is that the prompt highly influences the response, and the response is non-deterministic. Therefore, we often need to re-prompt in order to modify the output, to make the generated code do what we intended. But in a runtime scenario, like the dynamic backend I built, the end user has no opportunity to re-prompt to make the generated code do what they intended. If code generation is done during the building process instead, builders have the opportunity to modify the generated code by re-prompting.

The final lesson I learned is that simplicity on the surface may actually incur a lot of complexity in the end. The reason I wanted to try implementing a dynamic backend is that in my last experiment, fixing code in an existing codebase, I found it unreliable for an LLM to edit code in a codebase whose functions and modules depend on each other. That’s why the one-route-to-catch-them-all model was appealing: it removes all dependencies between functions because it only has one function.

But as I implemented it, I found it is actually more complex to use one function to generalize all the backend functions of an application. I ended up having to build many ways to handle all the cases, especially edge cases. All of these add to the complexity of the implementation and introduce many more vulnerabilities and possible points of failure that a normal backend with many API route implementations would not have. Turns out there is value in the separation of concerns.


The code for LLM-Backend is on my GitHub for anyone to play around with. All you need is to bring your own OpenAI key and create a MongoDB Atlas database cluster.