I don’t get the “just spend more time with AI” argument. It’s not a skill, stop trying to make it one. Why should I spend 30 days with it? The only thing that would accomplish is taking the soul and joy out of everything. Everyone just sounds like they don’t like coding.
Of course using AIs is a skill, just like effectively writing search queries used to be a skill back in the day. When I first actually tried getting something done with AI models, rather than just kicking the tires with the implicit motivation of showing how useless they were, it took far more iterations to get a satisfactory output than it did a week later.
The kinds of things you'll learn are:
- What's even worth asking for? What categories of requests just won't work, what scope is too large, what kinds of things are going to just be easier to do yourself?
- Just how do you phrase the request, what kind of constraints should you give up front, what kind of things do you need to tell it that should be self-evident but aren't?
- How do you deal with sub-optimal output? When do you fix it yourself, when do you get the AI to iterate on it, and when do you just throw out the entire session and start afresh?
The only way for it not to be a skill would be if how you use an AI did not matter for the quality of the output, or if getting better results were just a natural talent some people have and some don't. Both of those seem like pretty unrealistic ideas.
I think there's probably a discussion to be had about how deep or transferable the skill is, but your opening gambit of "it's not a skill, stop trying to make it one" is not a productive starting point for such a discussion.
> What's even worth asking for?
That seems to be a struggle for many. A friend of my wife turned 50 and we went to her birthday party. Two speeches and one song were AI generated; two speeches were written by actual humans. Guess which should never have been created, let alone performed?
More and more I struggle to see the point of LLMs. I can sort of convince myself that there are niches where LLMs are really useful, but it's getting harder to maintain that illusion. There are cases where AI technologies are truly impressive and transformative, but they are rarely based on a chat interface.
But why would you devote so much time and energy to massaging an AI when you could instead apply that effort directly to the problem, with a likely more satisfying process and result? You paint it as if that were some prejudice.
No, I was pretty careful to address only the very specific claim the OP made about how effective AI use is not a skill. If you're reading anything more than that into the comment, I think you're projecting. I really don't care at all whether you or the OP use AIs, and am not trying to convince you of that either way.
My personal experience is that it might be called a skill the way learning to use a dull knife can be called a skill. I might be mistaken, but I need to see a clear process and result, not lengthy comments like "no, it's still useful, but you need to approach it deliberately".
And rest assured I don't care about you either (why such a tone, lol).
I agree, it is absolutely not a skill. LLMs are a black box and the models keep changing under you, and their output can change if you try the exact same input more than once.
People claiming it's a skill should read up on experiments on behavior adaptation to stochastic rewards. Subjects develop elaborate "rain dances" in the belief that they can influence the outcome. Not unlike sports fans superstitions.
This. If there was some stability in the space, you could empirically develop good practices that probably beat naive practices. But since everything changes every couple of months and since you'll usually want to try different models on an ongoing basis, I found I'm doing just fine with a very small bag of tricks.
Sure, by definition, prompting is a skill. But it's a skill that really isn't hard to learn, and the gap between a beginner and a master is pretty narrow. The real differentiator is deeply understanding the domain you're prompting for, e.g. software development or visual design. Most value comes from knowing what to ask for and knowing how to evaluate the results.
The analogy would only hold if prompting didn't influence the output (which I hope you agree is not the case).
And yes, the model keeps changing under you, much like a horse changes under a jockey, forcing them to adapt. Or like Formula drivers and different car brands.
You can absolutely improve the results by experimenting with prompting, by building a mental model of what happens inside the "black box", by learning what kinds of context it has or does not have, how (not) to overburden it with instructions, etc.
And yet prompts can be optimized.
You can optimize a prompt for a particular LLM, and this can be done only through experimentation. If you take your heavily optimized prompt and apply it to a different model, there is a good chance you'll need to start from scratch.
What you need to do every few weeks or months, depending on when the last model was released, is reevaluate your bag of tricks.
At some point it becomes roulette: you try this, you try that, and maybe it works or maybe it doesn't...
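A minimal sketch of what that experimentation loop can look like, assuming a hypothetical callModel wrapper around whichever model API you use (it is a stand-in, not a real library call): score each prompt variant against a few cases you can check automatically, and rerun the harness whenever the model underneath you changes.

```go
package main

import (
	"fmt"
	"strings"
)

// callModel is a hypothetical stand-in for your model client; the real
// call depends on which provider and SDK you use.
func callModel(prompt, input string) string {
	// ... send prompt + input to the model, return its reply ...
	return ""
}

type testCase struct {
	input    string
	mustHave string // substring the reply must contain to count as a pass
}

// scorePrompt runs one prompt variant over all cases and counts passes.
func scorePrompt(prompt string, cases []testCase) int {
	passes := 0
	for _, c := range cases {
		if strings.Contains(callModel(prompt, c.input), c.mustHave) {
			passes++
		}
	}
	return passes
}

func main() {
	variants := []string{
		"Convert the value to milliseconds. Answer with the number only.",
		"You are a unit-conversion tool. Output only the converted number.",
	}
	cases := []testCase{{input: "2.5 s", mustHave: "2500"}}
	for _, v := range variants {
		fmt.Printf("%d/%d passes: %q\n", scorePrompt(v, cases), len(cases), v)
	}
}
```

The point isn't the harness itself; it's that a "bag of tricks" becomes testable, and retesting after each model release is cheap once the cases exist.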
> Everyone just sounds like they don’t like coding.
It’s no secret that a lot of people (I’d love an accurate percentage) got into coding because of the money. When you view it from that perspective, everything becomes clearer: those people don’t care about the craft or correctness. They don’t understand or care when something is wrong, and it’s in their best interest to convince themselves and everyone else that AI is as good as or better than any expert programmer, because it means they don’t need to improve or care any more than they already do; they can just concentrate on the getting-rich part.
There are experts (programmers or otherwise) who use these tools as guidelines and always verify the output. But too often they defend LLMs as unambiguously good because they fail to understand that the overwhelming majority of humans aren’t experts or sufficiently critical of what they read, taking whatever the LLM spits out as gospel. Which is what makes them dangerous.
> It’s not a skill, stop trying to make it one.
Using it efficiently is absolutely a skill. Just like google-fu is a skill. Or reading fast / skimming is a skill. Or like working with others is a skill. And so on and so on.
Agreed it’s a skill in the same way walking is a skill.
I'd use bicycling as the analogy: some people never learn, and thus don't understand the gains it provides.
Yeah. It’s a skill. I used walking to basically say it’s a universal skill that’s dead easy to learn. So easy that it doesn't even feel like a skill.
Bicycling is slightly harder than walking.
So strange, I haven't had this much fun coding in a long time. It's amazing.
Why is it strange? Different people enjoy different things. Seems normal to me.
It's a skill. The more time (and intentional practice) you invest in it, the better you'll get at it.
> It's like having a thoughtful and impossibly fast colleague who's always available to help me develop and sharpen my ideas.
More like an absolute bumbling idiot of a colleague to whom you have to explain things over and over again and whom you can't ever trust to get anything right.
User error.
When it makes up a PowerShell command, is that user error?
When it takes longer to prompt it with the details you would want in an email than to just write the email, is that user error?
Like, I get the use case for summarization or translation, but I can't trust the output 100% when I know it could be complete nonsense.
Getting errors is not in itself user error. Google will also return bad results, but I'd still consider it user error if someone can't avoid the bad results well enough to find some use for the tool.
Yes, and I already got those
Now, running with scissors
> More like an absolute bumbling idiot
Sam Altman said AI would "clone his brain" by 2026. He is wrong, it already has.
I've listened to him speak many times and that's an accurate description. Seriously, has he ever said even one interesting thing?
Still unclear why OpenAI wanted him back in power. He’s lost their lead and their top talent.
Because the alternative wouldn't push so hard for profits; their shares would go down in value, and very few people want that.
Didn't he once say he really wanted to start a religion, but it's easier to start a business?
"AI will solve all of physics "
these guys are making shit up on the fly now. anything goes.
A lot of the same kind of skill goes into prompting AI and delegating work to other humans. Delegation requires building intellectual empathy for the task recipient, giving them an instruction they can verifiably follow. It requires building trust, and more often than not requires a certain degree of trial/error/watching others work before one can delegate reliably. A lot goes into delegation, and much of this stuff is hard! It's also hard to be delegated to -- especially by someone you haven't worked with before, what is it that they mean when they ask for "more sparkles in the UI" or "I tried C and it didn't work"? Can I guess their background to meet them where they are? The list goes on.
In some ways it's easier to delegate to an AI because you don't have to care for anyone's feelings but your own, and you lose nothing but your own time when things don't go well and you have to reset. On the other hand, when the delegation does not go well, you've still got yourself to blame first.
This is very accurate imo - it really is the skill of proper delegation. Same for asking AI questions in an unbiased way so it doesn’t just try to please you - this has made me better at asking questions to people as well!
It’s like a slightly over-eager junior-mid developer who nevertheless doesn’t mind rewriting 30k lines of tests from one framework to another. This means I can let it handle that dirty work while focusing on the fun and/or challenging parts myself.
I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those that primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter I expect are and will mostly be elated.
> It’s like a slightly over-eager junior-mid developer
One with brain damage, maybe. I tried having Claude & Gemini modify a Go program with an absolutely trivial change (changing the units displayed in an output type), and it got one of the four lines of code correct (the actual math for the unit conversion); the rest was incorrect.
In the end, I integrated the helper function it output myself.
SOTA models can generate two or three lines of code accurately at a time, and you have to describe them with such specificity that I've usually already done the hard part of the thinking by the time I have a specific enough prompt; it's easier to just type out the code.
At best they save me looking up a unit conversion formula, which makes them about as useful as a search engine.
That sounds very unlike my experience. I frequently get it to modify / create large parts of files at a time, successfully.
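For scale, the change in the anecdote above amounts to something like the following hypothetical helper; the actual program and units aren't given in the thread, so the names and units here are invented for illustration.

```go
package main

import "fmt"

// formatDistance is a hypothetical stand-in for the kind of trivial
// unit-conversion change described above; the real program and units
// aren't given in the thread.
func formatDistance(meters float64) string {
	feet := meters * 3.28084            // the conversion math itself
	return fmt.Sprintf("%.1f ft", feet) // label updated for the new unit
}

func main() {
	fmt.Println(formatDistance(100)) // "328.1 ft"
}
```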
> you lose nothing but your own time when things don't go well and you have to reset
Crucially, you lose money with a lot of these models when they output the wrong thing, because you pay by token whether the tokens coming out are what you want or not.
It's a bit like a slot machine. You write your prompt, insert some money, and pull the lever. Sometimes it saves you a lot of time! Sometimes, not so much. Sometimes it gets 80% of the way there and you think: oh, let me just put in another coin, tweak my prompt, and pull the lever again; this time it will get me to 100%.
Listening to people justify pulling the lever over and over again is a little bit like listening to an addict excusing their behavior.
I realize there are flat-rate plans like the ones Kagi offers, but the API offerings and IDE integrations all feature the slot-machine and sunk-cost effects that I describe.
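A back-of-the-envelope sketch of the pay-per-pull arithmetic; the per-token prices below are assumptions for illustration, not any provider's actual rates.

```go
package main

import "fmt"

func main() {
	// Assumed illustrative prices, not any provider's actual rates.
	const inPerMTok, outPerMTok = 3.0, 15.0 // USD per million tokens

	// Suppose each "pull of the lever" sends a 4k-token prompt+context
	// and gets back a 2k-token answer, and it takes five tries to get
	// something usable. You pay for the four discarded answers too.
	const pulls, inTok, outTok = 5, 4000, 2000
	cost := pulls * (inTok*inPerMTok + outTok*outPerMTok) / 1e6
	fmt.Printf("$%.2f for %d pulls\n", cost, pulls) // $0.21 for 5 pulls
}
```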
> The problem is that most people misunderstand what AI is good at. They talk about it "taking over" writing, planning, and problem-solving—as if these were simple, mechanical tasks that could be fully automated without any loss in quality.
Because that’s the claim of all the AI companies. Right next to the claim that AGI is in reach.
The question is whether, if everyone uses AI, all text will become too similar.
Talking to some younger colleagues over drinks the other evening, I was shown their Instagram feeds. It's all AI slop. Machine-generated jokes.
For all the talk about jobs and art, LLMs seem to love shitposting.
Making racist memes is like the only thing generative AI is better at than real humans. (Makes sense if you consider that "stereotypes" is just another word for "likelihood estimates".)
It's surprising to me that these things are so hard to use well. If you had asked me before ChatGPT to guess what the user experience with this kind of technology would be, I would have said I expect it to be as intuitive as talking, almost no friction. I think this is a natural expectation that, when violated, turns a lot of people off.
> intuitive as talking
Except talking is not intuitive. It's an unbelievably hard skill. How many years did you spend on talking before you could communicate like an adult? To convey complicated political, philosophical, or technical ideas? To express your feelings honestly without offending others?
For most people it takes from 20 years to a lifetime. Personally, I can't even describe a simple (but not commonly known) algorithm to another programmer without a whiteboard.
I was speaking two languages at two years old, and debating political systems by ten. I'm not really sure that talking is actually that hard, depending on your cultural background. The more diverse, the easier you may find it to convey incredibly complex concepts. I'm not an outlier - I'm a boring statistical point.
I've heard plenty of overly complicated explanations of what a monad is. It's also not a complicated concept. Return a partial binding until all argument slots are filled, then return the result of the function. Jargon gets in the way of simple explanations. Ask a kid to explain something, and it will probably be a hell of a lot clearer.
The more experience you have, the harder it often is to draw out something untainted by that experience to give to someone else. We are the sum of our experience, and so it's so darn easy to get lost in that, rather than to speak from where the other person is standing.
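A sketch of the "partial binding until all argument slots are filled" mechanism described above; strictly speaking, that mechanism is usually called partial application or currying, and the example is illustrative only.

```go
package main

import "fmt"

// curriedAdd binds its arguments one at a time; only when the last
// slot is filled does it produce the final result.
func curriedAdd(a int) func(int) func(int) int {
	return func(b int) func(int) int {
		return func(c int) int {
			return a + b + c
		}
	}
}

func main() {
	addOneTwo := curriedAdd(1)(2) // partial binding: two slots filled
	fmt.Println(addOneTwo(3))     // last slot filled: prints 6
}
```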
> I would have said I expect it to be as intuitive as talking, almost no friction
There is so much friction when you try to do anything technical by talking to someone who doesn't know you; you have to know each other extremely well for there to be no friction.
This is why people prefer communicating in pseudocode rather than natural language when discussing programming; it's really hard to describe what you want in words.
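As an illustration (my example, not from the thread): compare the sentence "add up the totals of the paid orders" with its code equivalent, which pins down the types, the field being summed, and what happens to unpaid orders.

```go
package main

import "fmt"

// Illustrative types; the point is that code makes exact what
// "add up the totals of the paid orders" leaves ambiguous.
type Order struct {
	Paid        bool
	AmountCents int
}

func sumPaid(orders []Order) int {
	total := 0
	for _, o := range orders {
		if o.Paid {
			total += o.AmountCents // unpaid orders contribute nothing
		}
	}
	return total
}

func main() {
	fmt.Println(sumPaid([]Order{{true, 500}, {false, 300}})) // 500
}
```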
For me this is exactly one of the biggest developments since LLMs became available: they 'get it' much more than the previous tech (search engines) and fill in the blanks much more than previously thought possible.
Sure, if you leave out too much context you get generic responses, but that isn't too surprising.
So it sounds like the proposed use-case is to use LLMs for generating feedback on your own work.
But if we accept that LLMs generally (in other use-cases) produce output that looks deceptively similar to what you ask for (i.e. it seems to work) but is actually worthless junk if carefully inspected (i.e. it doesn't actually work), why would you think they are able to generate accurate feedback?
> illustrative 30-day calendar of exercises
That there is such a calendar for using ChatGPT, in the style of topics like "how to eat healthy", "how to stay fit" or "how to be more confident", says more to me than anything about the impact AI is having on our society.
It shows the desperation of the bubble, the miracles it must sell to keep the "Star Trek soon" delusion going, the customers be damned.
> The people who are most skeptical of AI are often those with the highest standards for quality.
From Anger to Denial to Bargaining. And we are starting out with flattery. Masterful gambit!
Instead of participating in slop coding (sorry, "AI collaboration"), I think I'll just wait for the author and their ilk to make their way across Depression and Acceptance.
While the current state of generative AI isn't yet capable of full automation, it soon will be. But it's also important to understand that the candle is burning from both ends. Here is what I mean by that: the acceptance rate of lower-quality goods, work, and pretty much everything has gone up. People, corporations, and everything in between have been "enshittified", and that is where the crucial miscalculation happens for people who claim that AI will never replace people in this or that field. Your food is 60 percent lower quality, your goods are worse quality, your entertainment is worse quality, and your children have been inundated with brain rot for at least a whole generation now. The standards are lower than they have ever been on pretty much everything, and so are the requirement standards for whatever it is you consider your career. And very soon AI will fill that shitty position just fine.
The problem is that current AI companies are ignoring domain expertise in favor of overly generalist models. "Meh, we have AGI planned for tomorrow anyway; it will sort everything out by itself. Somehow." This is understandable (see the "Bitter Lesson"), but particular knowledge domains are so deep that you can't just ignore them; you'll produce a metric ton of crap if you stay oblivious. No matter how advanced your model is, without consulting actual experts on the fundamentals it will always miss the mark and look off.
Anthropic used to do this with Claude's character until Claude 3, but then dropped it. OAI's image generation is consistently ahead in prompt understanding and abstraction, but they famously don't give a flying turd about nuances. Current models are produced by ML nerds who handwave the complexity away, not by experts in what they're trying to solve. If they want it to be usable now, they need to listen to people like this [1]. But I don't think they really care.
[1] https://yosefk.com/blog/the-state-of-ai-for-hand-drawn-anima...
But what kind of magic sauce are experts really based on, in your opinion? Something that hasn't been written down in the thousands of books on any technical subject?
In my opinion it is ridiculous to still say that there is anything fundamentally different between human intelligence and LLMs scaled another 10x or 100x.
Valid question, and yes I don't think there's any difference in performance.
However, I'm not talking about technical tasks with objectively measurable criteria of success (which is a super narrow subset; not even coding is entirely like this). I'm saying that you have to transfer some kind of human preference to the model, as unsupervised learning will never be able to infer an accurate reference point for what you subjectively want from the pretraining data on its own, no matter the scale. Even if I'm wrong on that somehow, we're currently at 1x scale, and model finetuning right now is a pretty hands-on process. It's clear that the ML people who usually curate this process have only a vague idea of what looks/reads/feels good. Which is why they produce slop.
TFA is talking about that:
>AI doesn’t understand why something matters, can’t independently prioritize what’s most important, and doesn’t bring the accountability or personal investment that gives work its depth and resonance.
Of course it doesn't, because it's not trained to understand it. Claude was finetuned for "human likeness" up to version 3, and Opus had a really deep understanding of why something matters; it had better agency than any current model and a great reference point for your priorities. That's what happens when you give the curation to a non-ML-adjacent person who knows what she's doing (AFAIK she has since left Anthropic, and Anthropic seemingly dropped that "character training" policy).
Check 4o's image generation as well: it has a terrible yellow tint by default, thick constant-width linework in "hand-drawn" pictures, etc. You can somewhat steer it with a prompt and references, but it's pretty clear that the people who have been finetuning it didn't have a good idea of whether their result was any good, so they made something instantly recognizable as slop. This is not just a botched training run or a dataset-preparation bug; it's a recurring pattern for OpenAI. They simply do not care about this. The recurring pattern for Midjourney, for example, is to finetune their models on kitsch.
This all could be fixed in no time, making these models way more usable as products, right now, not someday when they maybe reach the 100x scale (which is neither likely to happen nor likely to change anything).
Thanks for your reply. Well reasoned.
I am with you that the current dichotomy of training vs. inference seems unsustainable in the long run. We need ways for LLMs to learn from the interactions they are having, we might need introspection and self-modification.
I am not sure we need more diversity. Part of your argument sounds to me like we do. Slop (to me) is primarily the result of over-generalizing to everyone's taste. We get generic replies and generic images rather than consistently unique outcomes which we could call a personality.
>AI doesn’t understand why something matters.
I beg to differ. LLMs have seen all the reasons why something could matter. This is how they do everything. This is also how the brain works: you excite neurons with two concepts at around the same time and they become linked. For causality/correlation/memory...
I also agree with you that too much reliance on RLHF has not been the best idea. We are overfitting to what people want rather than what people would want if they knew better. LLMs are too eager to please and haven't yet learned how much teenage rebellion is needed for progress.
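A toy sketch of the "fire together, wire together" idea the comment invokes; this is a deliberately minimal Hebbian-style update, for illustration only, not a claim about how LLMs are trained.

```go
package main

import "fmt"

func main() {
	weight := 0.0  // link strength between two "concept" neurons
	const lr = 0.1 // learning rate

	// Each pair is the activity of the two neurons at one moment.
	activations := [][2]float64{{1, 1}, {1, 1}, {1, 0}}

	for _, a := range activations {
		weight += lr * a[0] * a[1] // strengthens only on co-activation
	}
	fmt.Printf("link strength: %.1f\n", weight) // 0.2
}
```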
> The only thing that I have seen convince people (and it always does)
...when anyone starts talking in universals like this, they're usually deep in some hype cycle.
This is a problematic approach that many people take; they posit that:
1) AI is fundamentally transformative.
2) People who don't acknowledge that simply haven't tried it.
However, I posit that:
3) People who think that haven't actually used it in a serious capacity, or are deliberately misrepresenting things.
The problem is that:
> In reality, I go back and forth with AI constantly—sometimes dozens of times on a single piece of work. I refine, iterate, and improve each part through ongoing dialogue. It's like having a thoughtful and impossibly fast colleague who's always available to help me develop and sharpen my ideas.
...is only true for trivial problems.
The author calls this out, saying:
> It won't excel at consistently citing specific papers, building codes, or case law correctly. (Advanced techniques exist for these tasks, but they're not worth learning when you're just starting out. For now, consider them out of scope.)
...but, this is really the heart of everything.
What are those advanced techniques? Seriously, after 30 days of using AI, if all you're doing is:
> Prepare for challenging conversations by using ChatGPT to simulate potential scenarios, helping you approach interpersonal dynamics with empathy and grace.
Then what the absolute heck are you doing?
Stop gaslighting everyone.
Those 'advanced techniques' are all anyone cares about, because they are the things that are hard, and don't work.
In reality, it doesn't matter how much time you spend learning; the technology is fundamentally limited. It can't do some things.
Spending time learning how to do trivial things will never enable you to do hard things.
It's not missing the 'human touch'.
It's the crazy hallucinations, invalid logic, failure to do as told, flat-out incorrect information or citations, and inability to perform a task (e.g. as an agent) without messing some other thing up.
There are a few techniques that can help you have an effective workflow; but seriously, if you're a skeptic about AI, spending a month doing trivial stuff like asking for '10 ideas about X' is an insult to your intelligence and doesn't address any of the concerns that, I would argue, skeptics and real people actually have about AI.
> This is a problematic approach that many people take; they posit that
It’s like the people who think that everyone who opposes cryptocurrencies only do so because they are jealous they didn’t invest early.
Let’s take vim and emacs, or bash. People do not spend years on them only for pleasure or fun; it’s because they’re trying to eliminate tedious aspects of their previous workflows.
That’s the function of a tool. To help do something in a more relaxed manner. Learning to use it can take some time, but the acquired proficiency will compensate for that.
General-public LLMs have been around for two years, and still today there are no concrete use cases that meet the definition of a tool. It’s "trust me bro!" and warnings in small print.
> there are no concrete use cases that meet the definition of a tool
There are some, but you won't like them. Three big examples:
a) Automating human interactions. (E.g., "write some birthday wishes for my coworker".)
b) Offensive jokes and memes.
c) Autogenerated NPCs for role-playing games.
So, generally things that don't require actual intelligence. (Weird that empathy is the first thing we managed to automate away with "AI".)