I’m NOT excited about coding by AI [venting]

luciole (he/him)@beehaw.org · 1 year ago

vampatori@feddit.uk · 1 year ago

The issues with LLMs for coding are numerous - they don’t produce good results in my experience, and there are plenty of articles on their flaws.

But… they do highlight something very important that I think we as developers have been guilty of for decades… a large chunk of what we do is busy work: the model definitions, the API to wrap the model, the endpoint to expose the model, the client to connect to the endpoint, the UI that links to the client, the server-side validation, the client-side validation, etc. On and on… so much of it is just busy work. No wonder LLMs can offer up solutions to these things so easily - we’ve all been re-inventing the wheel over and over and over again.

Busy work is the worst and it played a big part in why I took a decade-long break from professional software development. But now I’m back running my own business and I’m spending significant time reducing busy work - for profit but also for my own personal enjoyment of doing the work.

I have two primary high-level goals:

Maximise reuse - As much as possible should be re-usable both within and between projects.
Minimise definition - I should only use the minimum definition possible to provide the desired solution.

When you look at projects with these in mind, you realise that so many “fundamentals” of software development are terrible and inherently lead to busy work.

I’ll give a simple example… let’s say I have the following definition for a model of a simple blog:

User:
  id: int generate primary-key
  name: string

Post:
  id: int generate primary-key
  user_id: int foreign-key(User.id)
  title: string
  body: string

Seems fairly straightforward, we’ve all done this before - it can be in SQL, Prisma, etc. But there are some fundamental flaws right here:

We’ve tightly coupled Post to User through the user_id field. That means Post is instantly far less reusable.
We’ve forced an id scheme that might not be appropriate for different solutions - for example a blogging site with millions of bloggers with a distributed database backend may prefer bigint or even some form of UUID.
This isn’t true for everything, but is for things like SQL, Prisma, etc. - we’ve defined the model in a data-definition language that doesn’t support many reusability features like importing, extending, mixins, overriding, etc.
We’re going to have to define this model again in multiple places… our API that wraps the database, any clients that consume that API, any endpoints that serve that API up, in the UI, the validation, and so on.

Now this is just a really simple, almost superficial example - but even then it highlights these problems.
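To make the last flaw concrete, here is a minimal sketch of what the hand-written repetition tends to look like for the Post model. The names `validatePost` and `rowToPost` are hypothetical and only for illustration; the point is that the same four fields are spelled out again in every layer.

```typescript
// Illustration only: the Post shape from the definition above, repeated by hand.

// 1. The type the API layer works with.
interface Post {
    id: number;
    user_id: number;
    title: string;
    body: string;
}

// 2. The same fields again, as server-side validation of incoming data.
function validatePost(input: Record<string, unknown>): string[] {
    const errors: string[] = [];
    const userId = input["user_id"];
    const title = input["title"];
    const body = input["body"];
    if (typeof userId !== "number") errors.push("user_id must be a number");
    if (typeof title !== "string" || title.length === 0) errors.push("title is required");
    if (typeof body !== "string") errors.push("body must be a string");
    return errors;
}

// 3. And once more, mapping a database row onto the API type.
function rowToPost(row: Record<string, unknown>): Post {
    return {
        id: Number(row["id"]),
        user_id: Number(row["user_id"]),
        title: String(row["title"]),
        body: String(row["body"]),
    };
}
```

Adding or renaming a single field means touching all three places, plus the client and UI copies that are not shown here.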

So I’m working on a “pattern” to help solve these kinds of problems, but with a reference implementation in TypeScript. Let’s look at the same example above in my reference implementation:

export const user = new Entity({
    name: "User",
    fields: [
        new NameField(),
    ],
});

export const post = new Entity({
    name: "Post",
    fields: [
        new NameField("title", { maxLength: 100 }),
        new TextField("body"),
    ],
});

export const userPosts = new ContentCreator({
    name: "UserPosts",
    author: user,
    content: post,
});

export const blogSchema = new Schema({
    relationships: [
        userPosts,
    ],
});

So there are several things to note:

Entities are defined in isolation without coupling to each other.
We have sane defaults, no need to specify an id field for each entity (though you can).
You can’t see it here because of the above, but there are abstract id field definitions: IDField and AutoIDField. It’s the specific implementation of this schema where you specify the type of ID you want to use, e.g. IntField, BigIntField, UUIDField, etc.
Relationships are defined separately and used to link together entities.
Relationships can bestow meaning - the ContentCreator relationship just extends OneToMany, but adds meta-data from which we can infer things in our UI, authorization, etc.
Fields can be extended to provide meaning and to abstract implementations - for example the NameField extends TextField, but adds meta-data so we know it’s the name of this entity, and that it’s unique, so we can therefore have UI that uses that for links to this entity, or use it for a slug, etc.
Everything is a separately exported variable which can be imported into any project, extended, overridden, mixed in, etc.
When defining the relationship, sane defaults are used so we don’t need to explicitly define the entity fields we’re using to make the link, though we can if we want.
We don’t need to explicitly add both our entities and relationships to our schema (though we can) as we can infer the entities from the relationships.
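The points above can be sketched with a handful of small classes. This is a guess at the shape of such a library, assuming the class names used in the example; the internals (the `options` object, the `isEntityName` flag, the `one`/`many` properties) are hypothetical and not taken from the reference implementation.

```typescript
// Hypothetical minimal versions of the classes used in the example above.
interface TextOptions {
    maxLength?: number;
    unique?: boolean;
}

class TextField {
    readonly name: string;
    readonly options: TextOptions;
    constructor(name: string, options: TextOptions = {}) {
        this.name = name;
        this.options = options;
    }
}

// NameField extends TextField and adds meta-data: it marks the field as the
// entity's name and forces it to be unique.
class NameField extends TextField {
    readonly isEntityName = true;
    constructor(name = "name", options: TextOptions = {}) {
        super(name, { ...options, unique: true });
    }
}

// Entities know nothing about each other.
class Entity {
    readonly name: string;
    readonly fields: TextField[];
    constructor(def: { name: string; fields: TextField[] }) {
        this.name = def.name;
        this.fields = def.fields;
    }
}

// Relationships live outside the entities they link.
class OneToMany {
    readonly name: string;
    readonly one: Entity;
    readonly many: Entity;
    constructor(def: { name: string; one: Entity; many: Entity }) {
        this.name = def.name;
        this.one = def.one;
        this.many = def.many;
    }
}

// ContentCreator is a OneToMany with extra meaning attached.
class ContentCreator extends OneToMany {
    readonly kind = "content-creator";
    constructor(def: { name: string; author: Entity; content: Entity }) {
        super({ name: def.name, one: def.author, many: def.content });
    }
}

// The schema infers its entities from the relationships it is given.
class Schema {
    readonly relationships: OneToMany[];
    readonly entities: Entity[];
    constructor(def: { entities?: Entity[]; relationships?: OneToMany[] }) {
        this.relationships = def.relationships ?? [];
        this.entities = (def.entities ?? []).slice();
        for (const rel of this.relationships) {
            for (const entity of [rel.one, rel.many]) {
                if (this.entities.indexOf(entity) === -1) this.entities.push(entity);
            }
        }
    }
}

const user = new Entity({ name: "User", fields: [new NameField()] });
const post = new Entity({
    name: "Post",
    fields: [new NameField("title", { maxLength: 100 }), new TextField("body")],
});
const userPosts = new ContentCreator({ name: "UserPosts", author: user, content: post });
const blogSchema = new Schema({ relationships: [userPosts] });

// Logs the inferred entity names: User, then Post.
console.log(blogSchema.entities.map((e) => e.name));
```

Note that no id fields appear anywhere in this sketch; choosing a concrete id type is deliberately left to a later layer.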

There is another layer beyond this, which is where you define an Application. That lets you specify the code generation components that do all the busy work for you, along with settings like the ID scheme you want to use, etc.
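As a rough illustration of that layer, here is a sketch of one code generation component that turns a schema plus an ID scheme setting into PostgreSQL-flavoured table definitions. Everything in it is an assumption made for the example: `IdScheme`, `AppSettings`, `generateSql` and the plain-object entity shape are hypothetical names, not the reference implementation's API.

```typescript
// Hypothetical sketch: the ID scheme is an application setting, not part of the model.
type IdScheme = "int" | "bigint" | "uuid";

interface FieldDef { name: string; maxLength?: number; unique?: boolean }
interface EntityDef { name: string; fields: FieldDef[] }
interface OneToManyDef { one: EntityDef; many: EntityDef }
interface AppSettings {
    idScheme: IdScheme;
    entities: EntityDef[];
    relationships: OneToManyDef[];
}

// Concrete column types for the abstract id, chosen per scheme.
const ID_COLUMN: Record<IdScheme, string> = {
    int: "SERIAL",
    bigint: "BIGSERIAL",
    uuid: "UUID DEFAULT gen_random_uuid()",
};
const FK_COLUMN: Record<IdScheme, string> = {
    int: "INTEGER",
    bigint: "BIGINT",
    uuid: "UUID",
};

function generateSql(app: AppSettings): string {
    const tables = app.entities.map((entity) => {
        const columns = [`id ${ID_COLUMN[app.idScheme]} PRIMARY KEY`];
        // Foreign keys come from the relationships, not from the entity itself.
        for (const rel of app.relationships) {
            if (rel.many === entity) {
                const owner = rel.one.name.toLowerCase();
                columns.push(`${owner}_id ${FK_COLUMN[app.idScheme]} NOT NULL REFERENCES "${owner}"(id)`);
            }
        }
        for (const field of entity.fields) {
            const type = field.maxLength ? `VARCHAR(${field.maxLength})` : "TEXT";
            columns.push(`${field.name} ${type} NOT NULL${field.unique ? " UNIQUE" : ""}`);
        }
        return `CREATE TABLE "${entity.name.toLowerCase()}" (\n  ${columns.join(",\n  ")}\n);`;
    });
    return tables.join("\n\n");
}

const userDef: EntityDef = { name: "User", fields: [{ name: "name", unique: true }] };
const postDef: EntityDef = {
    name: "Post",
    fields: [{ name: "title", maxLength: 100, unique: true }, { name: "body" }],
};
const blogApp: AppSettings = {
    idScheme: "bigint",
    entities: [userDef, postDef],
    relationships: [{ one: userDef, many: postDef }],
};

console.log(generateSql(blogApp));
```

Switching `idScheme` to `"uuid"` changes every primary and foreign key in the output without touching the entity definitions, which is the kind of decoupling the ID point in the list above describes.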

It’s early days, I’m still refining things, and there is a ton of work yet to do - but I am now using it in anger on commercial projects and it’s saving me time - generating types/interfaces/classes, database definitions, APIs, endpoints, UI components, etc.

But it’s less about this specific implementation and more about the core idea - can we maximise reuse and minimise what we need to define for a given solution?

There are so many things that come off the back of it - so much config that isn’t reusable (e.g. docker compose files), so many things that can be automatically determined based on data (e.g. database optimisations), so many things that can be abstracted (e.g. deployment/scaling strategies).

So much busy work that needs to be eliminated, allowing us to give LLMs a run for their money!

SebKra@feddit.de · 1 year ago

Building abstractions and tools to reduce busy-work has been the goal of computer science since the moment we created assembly. The difficulty lies in finding methods that provide enough value for enough use-cases to outweigh the cost of learning, documenting, and maintaining them. Finding a solution that works for your narrow use-case is easy - every overly eager junior has done it. However, building solutions that truly advance CS takes time, effort, and many, many failures. I don’t mean to discourage you, but always be aware of the cost of your abstraction. Sometimes, the busy work is actually better.