AI chatbots were tasked to run a tech company. They built software in under seven minutes — for less than $1.

shish_mish@lemmy.world · 1 year ago

AI chatbots were tasked to run a tech company. They built software in under seven minutes — for less than $1.

Puzzle_Sluts_4Ever@lemmy.world · edit-2 1 year ago

While I do agree that management is genuinely important in software dev:

If you can rewrite the codebase quickly enough, versioning matters a lot less. Its the idea of “is it faster to just rewrite this function/package than to debug it?” but at a much larger scale. And while I would be concerned about regressions from full rewrites of the code… have you ever used software? Regressions happen near constantly even with proper version control and testing…

As for testing and documentation: This is actually what AI-enhanced tools are good for today. These are the simple tasks you give to junior staff.

Conflicting requests and iterating on descriptions: Have you ever futzed around with chatgpt? That is what it lives off of. Ask a question, then ask a follow up question, and so forth.

I am still skeptical of having no humans in the loop. But all of this is very plausible even with today’s technology and training sets.

Just to add a bit more to that. I don’t think having an AI operated company is a good idea. Even ignoring the legal aspects of it, there is a lot of value to having a human who can make irrational decisions because one customer will pay more in the long run and so forth.

But I can definitely see entire departments being a node in a rack. Customers talk to humans (or a different LLM) which then talk to the “Network Stack” node and the “UI/UX” node and so forth.

Vlyn@lemmy.zip · 1 year ago

If you just let it do a full rewrite again and again, what protects against breaking changes in the API? Software doesn’t exist in a vacuum, there might be other businesses or people using a certain API and relying on it. A breaking change could be as simple as the same endpoint now being named slightly differently.

So if you now start to mark every API method as “please no breaking changes for this” at what point do you need a full software developer again to take care of the AI?

I’ve also never seen AI modify an existing code base, it’s always new code getting spit out (80% correct or so, it likes to hallucinate functions that don’t even exist). Sure, for run of the mill templates you can use it, but even a developer who told me on here they rely heavily on ChatGPT said they need to verify all the code it spits out, because sometimes it’s garbage.

In the end it’s a damn language model that uses probability on what the next word should be. It’s fantastic for what it does, but it has no consistent internal logic and the way it works it never will.

Puzzle_Sluts_4Ever@lemmy.world · 1 year ago

You are literally describing constraints. They can be applied to an LLM the same way they can be applied to a dev team. And if you have never had to report an API change that breaks functionality… I wish I was you.

And if your full time software engineers are just running a unit test suite all day? … are you hiring?

As for modifcations: Again, have you ever used an LLM? Have a conversation with chatgpt. It will iterate on its responses. That is iterating on code.

In the end it’s a damn language model that uses probability on what the next word should be. It’s fantastic for what it does, but it has no consistent internal logic and the way it works it never will.

And that is demonstrably false and mostly just highlights that you don’t know what you are talking about. Or what language is, for that matter.

Vlyn@lemmy.zip · 1 year ago

Mate, I’ve used ChatGPT before, it straight up hallucinates functions if you want anything more complex than a basic template or a simple program. And as things are in programming, if even one tiny detail is wrong, things straight up don’t work. Also have fun putting ChatGPT answers into a real program you might have to compile, are you going to copy code into hundreds of files?

My example was public APIs, you might have an endpoint /v2/device that was generated the first time around. Now external customers/businesses built their software to access this endpoint. Next run around the AI generates /v2/appliance instead, everything breaks (while the software itself and unit tests still seem to work for the AI, it just changed a name).

If you don’t want that change you now have to tell the AI what to name things (or what to keep consistent), who is going to do that? The CEO? The intern? Who writes the perfect specification?

Puzzle_Sluts_4Ever@lemmy.world · edit-2 1 year ago

Yes. ChatGPT is not perfect. Because it is a general purpose LLM. Stuff like Github CoPilot and other software specific approaches are a LOT better at avoiding all the noise from bad answers on stack overflow and proposals. In large part because they have more focused training data.

But it can still do a remarkably good job so long as you have a human looking at it after the fact. Which… is how I would describe most software engineers I have ever worked with. Even the SSEs need someone to review their code. Which… is what is being described here. Combine that with a gitlab runner and you got yourself a stew.

As for APis and the like: Again, it feels like nobody here has ever actually worked with public software and think regressions don’t exist. But this is literally constraints and would be put in the requirements document that you give either the dev team or the LLM.

As for who is going to make that document: The same people who already do? Management.

Vlyn@lemmy.zip · 1 year ago

Management and sound technical specifications, that sounds to me like you’ve never actually worked in a real software company.

You just said what the main problem is: ChatGPT is not perfect. Code that isn’t perfect (compiles + has consistent logic) is worthless. If you need a developer to look over it you’ve already lost and it would be faster to have that developer write the code themselves.

Have you ever gotten a pull request with 10k lines of code? The AI could spit out so much code in an instant, no developer would be able to debug this mess or do a code review. They’ll just click “Approve” and throw it on the giant garbage heap whatever the AI decided to spit out.

If there’s a bug down the line (if you even get the whole thing to run), good luck finding it if no one in your developer team even wrote the code in the first place.

Puzzle_Sluts_4Ever@lemmy.world · edit-2 1 year ago

Management and sound technical specifications, that sounds to me like you’ve never actually worked in a real software company.

Worked at quite a few. Once you get out of college and start engaging with companies beyond “Ugh, how dare they want me to waste my precious time by talking to people” you start to learn the value of a strong management team.

And, more importantly, where those jira tickets come from.

A bog standard development flow is “all pull requests are linked to a documented issue/ticket. All pull requests require tests to pass, code coverage to not decrease, and approval by a code owner”

How does that work in reality?

Issues/tickets (just going to say issues from here on out) are created by a combination of customer feedback, identified issues by the development team, and directives from on high (which is generally related to the overall roadmap). One or more developers work on a merge request, the person who best understands the appropriate code looks it over, it is tested, and it is merged in. After enough of those cycles happen, a release is prepared and a manager signs off on it.

How does that map to an “AI” based workflow?

Issues/tickets (just going to say issues from here on out) are created by a combination of customer feedback, identified issues by the development team, and directives from on high (which is generally related to the overall roadmap). Because LLMs can provide feedback and uncertainty measurements once you get past Google Bard. And regression testing and nightly performance testing can highlight deficiencies. The issue is put into a template, that includes all existing constraints, and the LLM generates a solution. Someone who understands the code checks to make sure that looks sane, it is tested, and it is merged in. After enough of those cycles happen, a release is prepared and a manger signs off on it.

And then it becomes a question of what level you start requiring humans. Because when I do a code review prior to a Release? I am relying VERY heavily on my team to have been doing their due diligence. I skim through the MRs and look for a few hot spots but it is mostly “Well, Fred and Nancy said this was good and it passes all the tests so…”

You just said what the main problem is: ChatGPT is not perfect. Code that isn’t perfect (compiles + has consistent logic) is worthless. If you need a developer to look over it you’ve already lost and it would be faster to have that developer write the code themselves.

I VEHEMENTLY disagree with this. If you don’t have developers looking over your code then you are not a software engineer. And if it takes them the same amount of time to review code as it does to write it? You aren’t working on interesting problems and are wasting vast amounts of money.

I can farm out a general task of “improve our code coverage” to an intern. They can spend a few days (or even weeks) doing that, and I can review their MRs in a few minutes. If something looks weird, I leave a comment and wait for them to get back to me. All the time I am working on much more interesting problems… or doing the same for my SSEs.

Once you stop worshiping the ground that “developers” walk on (which mostly comes from time and experience) you start to realize how many people spend most of their lives just filling out tickets with no understanding of “Why”. And how much work your management is putting in so that you don’t throw a temper tantrum or break the code base. Which… maps pretty well to an LLM.

Just to make it clear. I am not saying that all developers should strive to be managers. I actively disagree with that.

But if you aren’t interested in how management works? Whether it is because you want a heads up when crunch is coming or want to understand the big picture or just figure out when it is time to get getting? Then you aren’t growing as a developer and are not an engineer. You are a monkey with a typewriter in the basement.

Vlyn@lemmy.zip · 1 year ago

You misunderstood, I never said management is worthless. The product managers know what customers want. The product owners keep 8 out of 10 dumb ideas away from the development team. And management again leans on the development team to find out what is actually technically possible and in what time frame.

If management just threw every customer wish into a magic black box to get code out, even if that code was perfect, you wouldn’t have a product. You’d have a pile of steaming crap.

I’ve done plenty of code reviews, they only work if they are small human readable increments. Like they say: A code review of 100 lines might take an hour. A code review of 10000 lines takes thirty minutes.

AI would spit out so much code with missing context for the developer, it would be impossible to properly review.

Puzzle_Sluts_4Ever@lemmy.world · edit-2 1 year ago

Again: No

if it takes you the same amount of time to review 10k lines versus write 10k lines? Either you are bad at your job or you aren’t working on a meaningful problem. One of the most valuable things an engineer can learn is to ask questions. If this MR is hard to parse? Leave a comment and make the developer improve the documentation or restructure a function or two. And you can do that with LLMs.

And, again, there is no difference between assigning “Implement Feature X” ticket to Stan versus StanAI. If StanAI is writing 500x the amount of code that Stan would? StanAI sucks and needs to be retrained.

And, as it stands? Using tools like CoPilot or even ChatGPT, “StanAI” tends to write more concise AND more readable code. In large part because its training data is weighted by the code that has already gone through code review, was accepted, and may even be part of the production stack on half the planet.

AI chatbots were tasked to run a tech company. They built software in under seven minutes — for less than $1.

AI chatbots were tasked to run a tech company. They built software in under seven minutes — for less than $1.

AI chatbots were tasked to run a tech company. They built software in under seven minutes —&nbsp;for less than $1.

AI chatbots were tasked to run a tech company. They built software in under seven minutes — for less than $1.