If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
Hallelujah, that's always been my position. To the static typing folks: leave my dynamically typed languages alone and go coding with something that really suits your needs. If the answer is that Python, Ruby, JS, whatever are really much more pleasant to code with, my reply is that they are so precisely because we don't have to type type definitions. Tradeoffs.
I think types are particularly valuable for libraries. A library author using copious types really helps the downstream user to know "Ok, this function returns a dict(Foo, Bar)". But after that, it's a matter of preference if you want to add those types to your own code or not.
Having the types in the libraries makes it a lot easier for your tools/IDEs to give good suggestions and catch bugs that you might otherwise miss.
Part of it is due to the clunky `_NoArg.NO_ARG` business for optional params. Pretty-printing it would also go a long way, but that technology seems too advanced for any language circa 2026.
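For readers who haven't met it, here is a minimal sketch of the sentinel idiom that `_NoArg.NO_ARG` appears to refer to, assuming it is the usual single-member enum used to tell "argument omitted" apart from an explicit `None`. The function name `describe` and its parameter are made up for illustration.

```python
import enum
from typing import Optional, Union


class _NoArg(enum.Enum):
    # Single-member enum: type checkers can narrow on `is _NoArg.NO_ARG`.
    NO_ARG = enum.auto()


def describe(nickname: Union[Optional[str], _NoArg] = _NoArg.NO_ARG) -> str:
    """Distinguish an omitted argument from an explicit None (hypothetical example)."""
    if nickname is _NoArg.NO_ARG:
        return "nickname not provided"
    if nickname is None:
        return "nickname explicitly cleared"
    return f"nickname set to {nickname}"
```

The clunkiness is visible in the signature: the sentinel type leaks into every annotation that uses it.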
This is a big part of the reason that I've embraced the sqlc (d/re)evolution.
Writing queries in SQL and then generating for the target language also provides a flexibility that has reduced rewrite cost. Add to this ease of organization and layout, and I'm not going back.
It's probably hard to come up with something messier than SQLAlchemy here. Not an expert, but spent more than enough time spelunking queries in the debugger. I much prefer bugs that can be surfaced at compile-time rather than run-time.
I do think it is somewhat of an all or nothing thing. I can write dynamic languages, sure; I prefer having static types, but I have written a lot of dynamically typed code. However if I'm working in an editor with LSP integration, the experience is much worse when some things are missing types.
As an example, I may have a variable with types:
const something = someLibrary.getSomething();
and I can type `something.` and see methods and properties. However, if my own code doesn't use types consistently, it's so easy to lose type info. For example:
function doSomethingWith(something) {
...
}
doSomethingWith(someLibrary.getSomething());
or any number of other patterns that accidentally strip type info from the variable if you don't use types everywhere.
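The same loss of type info can be sketched in Python (a rough analogue of the snippet above; `Something`, `get_something` and both helper functions are hypothetical names). With the unannotated parameter an editor has nothing to offer after `something.`; with the annotation it does.

```python
from typing import get_type_hints


class Something:
    def __init__(self, name: str) -> None:
        self.name = name


def get_something() -> Something:
    # Stand-in for a typed library call.
    return Something("widget")


def do_something_untyped(something):
    # No annotation: tools treat `something` as unknown, so no completions here.
    return something.name.upper()


def do_something_typed(something: Something) -> str:
    # Annotated: tools know `something` has a `name: str` attribute.
    return something.name.upper()


# Both behave identically at runtime; only the recorded type info differs.
print(get_type_hints(do_something_untyped))
print(get_type_hints(do_something_typed))
```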
I would much rather have a language where the compiler complains if some variable doesn't have a static type, than a language where I can accidentally leave something untyped. I don't understand in which case I would want a variable or function to not have associated static type information.
It kind of is? All the partial-typing systems are too complex and usually broken in various ways. Compare to eg Elm or Gleam which are typed and super simple.
It kind of isn't. We are talking about using types in type optional languages. We aren't talking about the quality of those type systems or whether or not they are good type systems.
If I was comparing type systems then it'd be relevant to talk about statically typed languages like Elm or Gleam.
This is even worse because you attempt to try to sell why types SOMETIMES make sense. But you aim with this for a language that did not have nor need types to begin with. People don't seem to understand that this is an issue.
The library-situation is really not different from having types everywhere, and some people will do that too.
> catch bugs that you might otherwise miss.
People repeat this a lot. In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types. I don't understand why people keep on repeating this. Repetition does not make it anymore true.
Think in the opposite way: if types would have been necessary to begin with, why would ruby have been successful back in 2006? It was successful without types already. And types were never needed - they came because some people THINK they are needed. This is the biggest problem - the thinking part. They think they are right and all who do not use types, must be wrong and very foolish people.
Have you considered these people in general aren't some outsiders out to attack you or your favorite language?
The people who do end up making and using type checkers are people who have or are actively using these dynamic languages and found out that they CAN help THEM with preventing bugs.
Also, really? 22 years in which not one type-related error happened? Never? I don't want to say I don't believe you, but I really don't.
> In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types
You must be the world's greatest programmer with perfect memory. Every nil pointer exception is a bug a (good) type checker could have caught. You've never had a NameError or NoMethodError in Ruby?
SRE here, I've had multiple outages caused by lack of typing in both Ruby and Python where bad types get passed, something doesn't catch it, and the result is either data corruption or constant crashes. A couple cost us big money because it screwed up billing and we were forced to eat the billing cost.
>In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types.
I've definitely run into that, although it's much less common at places with good test discipline.
I think the related and often conflated problem is errors caught by compilers which you don't hit until runtime in Ruby/Python without good test coverage. For example, referencing an undefined variable.
This is perhaps the least believable comment I have seen on HN, ever. It would be more believable for someone using C to say "In about 22 years of writing C code, I have never ran into a memory bug".
Avoiding "memory bugs" in C is trivial, but tedious, so too many C programmers fail to use an appropriate programming style. Nonetheless, there are some who have never encountered a "memory bug" in programs written by them.
I agree that a programming language should enforce such features, instead of counting on competent programmers.
Personally I like having my TypeScript cake and eating it.
I also truly believe those who design type systems would benefit from taking a look what kind of code people programming in dynamically-typed languages produce.
I do too, but I feel like TypeScript stands alone as an unusually effective and pleasant to use bolted-on type system. I've not seen any other approach come close. (My sample size is Python, Ruby and Elixir)
I really like PHP's type hints (I think they were the first I used) though it's somewhat limited (can't type hint complex/nested structures last time I checked).
Flow for Javascript was okay but Typescript I've found to be much nicer (last used flow years ago but occasionally I'd encounter bugs in Flow).
You don't think the Elixir type system is effective? I've never seen a bolted-on type system get so much acceptance from the hardcore "you can add types into my dead hands" crowd.
I find it funnysad that Python people coined the phrase duck typing and then ended up designing what they have now. Meanwhile TS manages to embody duck typing far better despite coming from a very different background.
Does Python need its own TypeScript moment? Many times, while writing Python and deeply frustrated with its weak(er) type system, I have dreamed of something like TypeScript or VB/VBA from the early 2000s (where the type system was surprisingly strict!). However, there are so many Python libraries written in pure C that it is way harder to create a TypeScript equivalent.
Could you point me towards the kind of code people programming in dynamically-typed languages produce?
I have lived in statically typed languages almost all of my life, and even when I don't, I pretend I do, just without having a typechecker. So I'm very curious about what I'm missing.
I hate TS's tooling with a burning, deep passion. But its type system is actually pretty incredible for what it is.
There are times that I yearn for TS's ability to do duck type reasoning in e.g. Rust (despite that not being feasible) when working with very large data types.
The only reason I gave up resisting and started writing any significant code in Python at all was that it got some kind of type system, and thus became less unpleasant to code with.
"Pleasant to code with" does not describe getting "AttributeError: 'NoneType' object has no attribute 'foo'" 25 levels deep in a stack trace already obfuscated by dynamic object-oriented nonsense. In production, because it's an unusual case and testing missed it. Not that test cases aren't way more work than types anyway.
> To the static typing folks: leave my dynamically typed languages alone
Surely you understand that the push to add types to dynamically-typed languages comes from dynamic-typing folks, not from static-typing folks. People who are deeply into static typing have little incentive to consider e.g. Python, whose support for types is relatively weak, loosely-defined, and rarely-enforced compared to the statically-typed languages that exist today.
Let me paraphrase the summary of Eloquent Ruby by Russ Olsen:
It’s easy to write correct Ruby code, but to gain the fluency needed to
write great Ruby code, you must go beyond syntax and absorb the “Ruby way”
of thinking and problem solving
> People who are deeply into static typing have little incentive
Except when their boss tells them to use Python, or they rely on one of the Python libraries that their pet language couldn't provide via its powerful type system.
My engineers write better code when we enforce types.
It's easier to do this than retrain everyone on Go and rewrite all our code.
New stuff is often in Go now, but prototyping quickly in Python and then enforcing types when we have to get it ready for production has been working decently
I am totally with you and I am glad I am not the only one who is totally against those type-addictions leaking into languages that did not need them in the first place.
Types in ruby are even worse than in python, because the type systems in use really make ruby turn very ugly. In python it is not as much of a problem with regards to syntax, as python has a stricter syntax (e. g. mandating foo.bar() whereas in ruby you can typically omit the (), among other syntax sugar examples).
We need to keep the type people out of those languages.
Many years ago, on IRC, on #haskell, they said they don't want everyone to use Haskell. Back then I did not understand it. After the type-addicted people emerged out of nowhere, I now begin to understand why Haskell is so snobbish. If you let every idea float, you end up ruining languages - and then those who wanted this, will retire and move away too. Ultimate damage factor caused as outcome here.
So far I have been avoiding Pydantic as a huge-ass dependency. Instead I am relying on standard type annotations, lots of typed dicts and at service/program boundaries use a jsonschema. I like being able to specify the type of most functions, and get some hints, completions and so on, but I don't want to _have to_ specify every darn type. I also don't want to write a class for everything. Typing dicts is good and usually sufficient. If I wanted to write types for everything, then I could also just write Java or Rust or similar.
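A rough sketch of that style, using only the standard library: a `TypedDict` for the data shape and one validation step at the boundary. The hand-written check in `parse_order` is only a stand-in for a real JSON Schema validator, and `Order`, `parse_order` and `label` are invented names.

```python
from typing import Any, Dict, TypedDict


class Order(TypedDict):
    # Plain dict at runtime; the shape exists only for the type checker.
    order_id: int
    customer: str


def parse_order(payload: Dict[str, Any]) -> Order:
    """Boundary check (assumption: stands in for a jsonschema validation step)."""
    order_id = payload.get("order_id")
    customer = payload.get("customer")
    if not isinstance(order_id, int) or isinstance(order_id, bool):
        raise ValueError("order_id must be an integer")
    if not isinstance(customer, str):
        raise ValueError("customer must be a string")
    return {"order_id": order_id, "customer": customer}


def label(order: Order) -> str:
    # Inside the program the dict is trusted and keys are checked statically.
    return f"#{order['order_id']} for {order['customer']}"


print(label(parse_order({"order_id": 7, "customer": "ada"})))
```

No class per concept and no extra dependency, at the cost of writing (or generating) the boundary check yourself.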
Unfortunately, I think the kingdom of nouns faction has long invaded the Python world and I see more and more companies demanding Pydantic and similar things. They are dragging us all the way to Java land, it seems.
I tend to get triggered when TypeScript is painted as “JS with type hints”. Coming from Python background, TS and Python with type hints are just so different.
With Python I can’t see myself type-annotating everything (or bringing in pydantic anymore for that matter, it is indeed becoming a blight), but with TypeScript my process is turned on its head: I find it natural and easy to start writing with types and have everything fully typed, and I find the fact that it simply won’t compile if anything is off (compared to Python where it’s more like “one of my N type checkers/linters failed, oh well it still runs though”) a useful constraint that gives peace of mind.
I started using types with Python in 2018-ish, and I never looked back.
I am not that good a programmer, so maybe I am wrong, but I just like being able to tell what the data is that's moving through the system. Typed function signatures, a little shift+k here and there, a warning that I am trying to add int and a string. I don't see what's the harm in having that?
At the end of the day, if you don't want to use Python with types -- do not. Unless somebody at work is forcing you, and it feels like putting lipstick on a pig (especially with something like numpy that doesn't easily support types)? Then condolences.
For personal projects, I don't want to learn Rust just so I can do `def add(a: int, b: int) -> int`.
For work, I don't really get a choice. I work on brownfield projects. We do use TypeScript, thankfully, for all the browser bits. But nobody is going to stop to refactor a 5 year old production code base from Python to Go just for better types. And -- pepega -- definitely not our codebase that's full of data sciency stuff (numpy/opencv/pandas). So we live with a not-as-good-as-it-could-have-been type system.
Meh, the amount of effort required to keep up to date with the python ecosystem churn is around the same as learning rust. More so if you are starting from scratch.
I quit python after realizing the amount of effort it required to just implement the tooling for a project… when all of that comes included with rust. I have spent maybe an hour in the last year thinking about tooling. Glorious.
But yeah, I feel for you. It is an impossible sell when the payoff is impossible to understand without a Time Machine and the only thing known about the cost is that it’s high. But for new people and projects, I can’t imagine starting with python in 2026.
Which Python tooling? I know that uv is replacing pip but all of my customers' projects still use pip. One of them installed python with asdf. I can't think about any other tool we are using except Claude, but I don't think that's the kind of tool we are writing about. We deploy with a custom bash script resembling Ruby's Capistrano. Those projects are web apps with server generated HTML.
Historically (80s/90s) I started using Perl because I didn't have to write all those malloc and free I spent years writing in C and I could perform string operations much more easily. Then in the mid 90s because of that wonderful CGI.pm Perl module. But the plus of all those languages, and Java, was managed memory. Then in the mid 2000 I learned Rails, and after Rails I learned Ruby. It was like Perl but much easier to understand and again no types to type. Basically what I did in Java but in a fraction of the time and in a fraction of the lines of code. Then a customer asked me to work with Python on a Django app so I learned Python. It looks like a Ruby designed by Klingons but it's OKish.
All those bugs I constantly read about, they don't happen very often and are a good tradeoff. Maybe Rails and Django are shielding me from some bug scenarios.
I'm not trying to be cute here, but it seems like you have mostly been using scripting languages without static typing. How do you know what it is like to develop in languages with strong typing that produce binaries and that these are unpleasant to use?
And dabbling doesn't really count. It takes time to actually learn a language. Much longer than most people are willing to admit.
Typing "type definitions" makes you type less, not more, because you type the definition only once, instead of writing many tests wherever values of that type are used.
In a decent programming language, one would frequently avoid the need to declare the type of a variable, whenever the type can be deduced from the value used to initialize the variable.
Having types in a language has 2 purposes, one is to enable the compiler to check at compile time or at run time that all the subsequent uses of an identifier after its first occurrence are consistent.
I cannot imagine which are the benefits for the programmers who are against this rule, i.e. who want to reuse the same identifier for multiple purposes in the same scope (N.B. reusing an identifier in the same scope has nothing to do with data types that are disjoint unions or virtual types, which can be used in any type-enforcing language, or with reusing the same identifier in different scopes).
The second purpose of data types is that when the type of a variable or parameter is known at compile-time, that allows more efficient implementations, which are especially important for aggregated data, e.g. arrays.
Again, I also do not understand why anyone would want to have inefficient data representations, to avoid the need of data types.
There has been some argument that when a language does not use data types you might avoid having to rewrite some library functions if you want to change the parameter types at invocation. However this is a problem that has been better solved for more than half a century, by providing various means to write generic functions that can be specialized at compile-time or by using disjoint union types or virtual types, for using the same function for many different data types, while still ensuring that other data types, whose use would be erroneous, are rejected.
A language without data types saves writing effort only when the programmer omits the run-time checks for correct values, which would be needed to avoid bugs when such checks are not done automatically by the compiler.
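As a small illustration of the two mechanisms named above, here is a sketch using Python's optional annotations (only the checking side; Python does not specialize generics at compile time). `first_or`, `total` and `Number` are made-up names.

```python
from typing import Sequence, TypeVar, Union

T = TypeVar("T")


def first_or(items: Sequence[T], default: T) -> T:
    # Generic: one definition serves any element type, and the checker
    # ties the type of `default` to the element type of `items`.
    return items[0] if items else default


# Union type: exactly these alternatives are accepted, nothing else.
Number = Union[int, float]


def total(values: Sequence[Number]) -> float:
    # A checker is expected to reject total(["a", "b"]) before the code runs.
    return float(sum(values))


print(first_or([1, 2], 0), first_or([], "x"), total([1, 2.5]))
```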
I agree that several very popular programming languages with type checking, including C and C++, are very poor examples about how a type system should be implemented, because they require the writing of a great amount of superfluous boilerplate that is completely unnecessary (e.g. writing headers with function declarations instead of extracting automatically the interfaces of a package a.k.a. module or writing explicit type names in a lot of places where they can be deduced automatically from the context).
Such languages are strawmen in a discussion about whether a language must enforce type checking or not.
God, I hope you never touch a production code base with real users. Strong, static typing has won. It is a must for serious software development. The time and money saved in stupid errors that are caught and avoided before the software even runs is enormous. For internal tools and one-off scripts, sure, go nuts with dynamic stuff, but there is no reason to use a dynamic language for code paths that actually make the business money. If you cannot specify precisely the type of every piece of data you touch and all the operations that can be performed on it, you're not doing software engineering, you're spitballing. And if you want to spitball, LLMs are here and they're great. You can type in a very loose description indeed and get properly typed Rust out the other end.
Nothing can beat the Python numpy/ML ecosystem. There's a lot of value in just being able to run a Python script as well without any compilation step. The typing isn't perfect right now but it's usable.
For vectorizable problems there also won't be huge performance gains from switching to a compiled language because all the hard stuff is already done in highly optimized native code. The only time it really makes a difference is if you have to write a custom for loop or traversal.
Running more type checkers isn't really about strictness. The main benefit to library maintainers is to make sure that their APIs are compatible with whatever tools their users run.
This wouldn't really be an issue for most other languages, but Python's typing ecosystem is uniquely fragmented, with only partial standardization between several popular tools.
GP's point is obvious: performance is immaterial to the discussion. Static code analysis is about preventing bugs. Therefore OP fails to make any sort of point, as it's a straw man argument.
I think those of us who work in compiled languages are just snooty about them.
I'm a compiled language snoot, and happen to be working over the past couple days in typed Python for the first time. It's kind of nice. I like it. It's a huge improvement for me over ordinary Python/Ruby/Javascript; it materially improves the experience of working in the language.
> happen to be working over the past couple days in typed Python for the first time. It's kind of nice. I like it.
I like me a good type system and have always hated about everything about types in Python. What do you find nice and like about it?
(My experience with Python: all the type checkers are broken, there are false positives and false negatives everywhere. The LSPs are likewise broken, I have not found one that knew the types at least somewhat reliably...)
Lack of typing is my biggest problem with Python, Ruby, and ES6 Javascript; I have to write everything twice, once to do the stuff I want, and once to double check that it's actually doing stuff, because a single typo blows the program up despite it parsing fine.
Python typing is easy to dip in and out of. It handles None nicely; not as nicely as a true Optional, but enough for daily driving. The annotations are readable and simple. What more could I ask for, without asking for an entirely different language? Python typing catches a lot of bugs I'd otherwise have to tediously unit-test for.
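The None handling mentioned above can be sketched roughly as follows (hypothetical names throughout; the comments describe what checkers such as mypy or pyright are generally expected to report, not verified output).

```python
from typing import Optional


class User:
    def __init__(self, foo: str) -> None:
        self.foo = foo


_USERS = {1: User("bar")}  # made-up lookup table for the example


def find_user(user_id: int) -> Optional[User]:
    # Optional in the return type advertises that None is a possible result.
    return _USERS.get(user_id)


def foo_unguarded(user_id: int) -> str:
    # A checker should flag this attribute access because the value may be None;
    # at runtime it raises AttributeError for unknown ids.
    return find_user(user_id).foo


def foo_guarded(user_id: int) -> str:
    user = find_user(user_id)
    if user is None:
        return "<unknown>"
    # After the None check the checker narrows `user` to User.
    return user.foo


print(foo_guarded(1), foo_guarded(2))
```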
The only thing I don't like about it is that it feels like it relies a lot on importing stuff from the swamp of the Python stdlib.
I believe you are right to point out how you feel about some of these features of a particular language, and to let that guide your decision in how and whether you use them.
It does not, however, say anything about the actual productivity gain or loss of using types in a language like Python which does not require them — that should be the ultimate objective measure of whether they make sense or not.
With most languages, I get annoyed if I need to create separate types for every variant of a basic type (eg. let's have a firstNameString, familyNameString, CountryCodeString, CountryNameString... when is it too much?) - I do not think there is any way someone can prove going this deep improves maintainability long term. Eg. imagine you introduced validation of CountryCodeString based on ISO-3166, and there has been a change to ISO-3166 which happen every couple of years — how do you start supporting new codes? How do you deprecate and remove old ones? All of those are not helped with a type being very strict, you still have the persistent data to worry about, code actually supporting any of those, etc — the basic type check is trivial with a couple of small unit tests in comparison. You also quickly venture into a territory of complex types with complex interaction rules (this subvalue can be one of A if another one is X; but B if another is Y).
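To make the CountryCodeString case concrete, a minimal sketch with `typing.NewType` (the three-entry code set is a deliberately tiny, hypothetical stand-in for ISO 3166, and the function names are invented). The distinct type exists only for the checker; the value is still a plain `str` at runtime, and adding or retiring codes is a change to the data, not to the type.

```python
from typing import NewType

CountryCode = NewType("CountryCode", str)

# Hypothetical subset; a real list would have to be maintained as the standard changes.
_KNOWN_CODES = frozenset({"DE", "FR", "JP"})


def parse_country_code(raw: str) -> CountryCode:
    # The only place a CountryCode is created, so validation lives here.
    code = raw.strip().upper()
    if code not in _KNOWN_CODES:
        raise ValueError(f"unknown country code: {raw!r}")
    return CountryCode(code)


def shipping_label(code: CountryCode) -> str:
    # A checker rejects a bare str here; at runtime any str would pass.
    return f"Ship to {code}"


print(shipping_label(parse_country_code(" de ")))
```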
So for me covering the basic invariants with a unit test is not much more effort and — especially with Python — does not stop one from refactoring effectively and building stable, long-running systems.
Really, complex relationships in data are complex, and encoding it in a declarative way using a complex schema does not guarantee correctness (see Pydantic); if you want just very basic data conformance (type) checking, it's mostly a question of ergonomics.
Basically, you need to keep to some principles of code structure and architecture, but they are a simple set of principles — perhaps the fact that most Python projects do not abide by these should be a knock against Python? I only attribute it to the approachability of Python, but I am open to being wrong and this being the latent forced idiomatic use that projects always evolve into?
What I like about Python types is that they accommodate both styles of programming. I happen to be completely sold on at least some baseline level of type safety (I don't need my type system to be a complete modeling toolkit for everything in my problem domain, just enough for basic sanity), but if you're an old-school Python type, you and I can work on the same codebase without types ruining your life.
The modern approach seems to be to require full typing on items seen from outside the function or object. Within functions, have the compiler infer as much as it can.
Newer languages (Go, Rust) seem to be converging on this approach.
Function parameters need type info as guidance for people and LLMs calling the function. Even though cross-function type inference is technically possible, it's too confusing. Long-distance inference failures tend to generate poor messages.
Within a function, if you have typed parameters, the type inference engine has a local starting point and a good chance of success on most local variables.
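In Python terms that split might look like the sketch below: annotations on the signature only, with the locals left for the checker to infer. The inferred types in the comments are what mypy or pyright would typically report; `average_word_length` is a made-up example.

```python
from typing import List


def average_word_length(words: List[str]) -> float:
    # Signature is fully annotated; nothing below needs an annotation.
    lengths = [len(word) for word in words]  # inferred: list[int]
    total = sum(lengths)                     # inferred: int
    count = len(lengths) or 1                # inferred: int (avoids division by zero)
    return total / count


print(average_word_length(["ab", "abcd"]))
```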
Unchecked advisory typing in Python was a terrible idea. All the work of writing type declarations with none of the benefits.
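A tiny demonstration of what "unchecked" means here: the interpreter stores annotations but never enforces them, so a mismatched call runs unless a separate checker is part of the workflow. `double` is a made-up function.

```python
def double(x: int) -> int:
    return x * 2


# The annotations are recorded on the function object...
print(double.__annotations__)

# ...but nothing enforces them at runtime: a str goes straight through.
print(double(21))
print(double("ab"))
```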
> If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
What statically typed language would you suggest for machine learning and large data pipelines? I don't love Python, but it has by far the largest ecosystem.
It’s still dynamic in nature. But you can tune how much staticity you want. The spectrum goes from Python to C in terms of staticity. And with tools like JETLS.jl maturing you get a lot of the benefits of static analysis.
The data pipeline ecosystem is starting to rival that of R and Python. The fact that you can just use Julia functions while keeping the performance allows you to avoid those weird vectorization gymnastics. The ML ecosystem is also in a great state. JuMP.jl, Turing.jl, the whole SciML ecosystem, autodiffing and GPU computing are all close to best in class in terms of quality.
The NN side of ML is a bit weaker, but just for lack of developer time investment into that side of the ecosystem.
I use Julia! I like it a lot, and add type parameters to all application code. But JET.jl does not feel anywhere close to the assurances I can get from a statically typed language (yet)
It is brutal. I can say with first hand experience: The APIs for Pandas and NumPy are awful and insanely dynamic. As a result, it is frequently difficult to know what is allowed with calling a method. It is exhausting. Since many methods are "hyper-dynamic", many of the error messages are unhelpful.
Well, that's the curse of machine learning: since everyone uses Python you have to deal with Python. Even though Python isn't very nice when things start to get serious and you don't want to spend your time fiddling with noise just to make something work at scale.
I'd wish the ML/AI/LLM crowd would see that it is in their interest to get better developer ergonomics at scale. (I don't want to have to turn to C++)
The ML/AI ecosystem is a minefield, and pure Rust rewrites (Candle, Burn, ...) are still immature and incomplete. But I'm pretty sure we're eventually going to see the same uptake that's already happening in the data processing world.
The performance is not the (only) issue. The issue is the death by a thousand cuts involved in distributing Python programs without a two page set of instructions that have to be followed to make it work. It is rarely "just works" unless you can make a lot of assumptions about the environment it runs in. It's why I generally steer away from any application written in Python. It is going to be painful.
However my experience might be a bit different since I actually have to deal with Python at scale and in a fairly dynamic environment.
I need to run hundreds of Python programs, written by dozens of programmers, over many years, that speaks to custom hardware, runs on a remote site, in a production environment that has to work and with new versions of things coming in all the time. Some of these Python programs not only link with C libraries, but run external binaries because developers didn't have time to integrate them as libraries because it takes forever to make it work on all the different os/arch combinations (easier to just run the C code in subprocesses).
This runs on three different CPU architectures (we're trying to eliminate one of them), two different operating systems and a pretty wide mix of hardware and system configurations I need to insulate Python from. Much of the hardware being custom built stuff. Because Python has a lot of exposed surface to the OS compared to a statically linked binary. (Roughly 100x the surface of statically linked binaries that don't link with libc -- which is evident by the insanely bloated OCI images that result from packaging what you need to run)
Modern compiled languages that have sorted toolchains make it pretty easy to produce "production grade" os/arch specific binaries that can survive almost everywhere. You build a statically linked binary for each architecture to overcome the challenges of varied Linux runtime environments (see Linus T's frustrations with Linux and software distribution - it's not like it is easy to begin with). Go and Rust do this well.
So you end up having to containerize everything in ephemeral containers to lock down the execution environment while retaining some speed. But of course it isn't that simple, because if you depend on access to weird hardware and/or you run on custom built machines you have to detect this and ensure the application inside the container gets access to the things it needs from the container. So you have to fix that.
In a way that is almost completely invisible to the developer.
All of this has to be understandable and _reduce_ complexity for developers and operators so at the very least you don't follow the Python philosophy of "just throw another layer of complexity on it and make the instructions another page longer".
40-50kLOC later (in Go and Python, I have lost count) of code to try to make the problems go away, and I have something that is on the verge of actually being usable in a production environment for taming wayward Python code.
The easiest fix? If people could stop using Python because they don't want to learn a language that can produce something that is easier to distribute to users.
Believe me, I have spent months now trying to make Python work properly in a challenging environment. The only way this "worked" before was by just lowering standards to where the definition of "works" is flexible enough to count daily dumpster fires as "nominal". And of course people don't care. Python fosters a "it works for me" mentality where people don't know and don't care what it is like to be on the receiving end.
90% of problems I have because of Python would just disappear if people used languages that can produce robust binaries with limited exposure to system peculiarities. But that kind of requires people to understand why it is a problem in the first place. And people generally don't bother to know.
No, you are right about it not being limited to Python. But for python the common courtesies I am used to right out of the box tend to require extra effort on part of the programmer. And «extra» doesn’t usually happen.
Even C, with its ancient, haphazard, ugly, fragile, awkward toolchain, can often trivially produce binaries that will just work with very little effort.
I have spent decades of my life writing tooling, libraries and infrastructure. And no matter where you go, developers only do the bare minimum if they can get away with it. That doesn’t mean they are bad people. It means tools and infrastructure have to be designed with acute awareness of reality.
Python has been around for 35 years. And it still hasn’t evolved things we should take for granted today despite its increase in adoption. To me that’s pretty fucking awful project governance.
Cython is a niche language for writing perf-critical bits inside your Python codebase. It's like C for people who don't want to learn C. At least that's how I treated it, when I had to write some stuff to make some numpy ops faster.
Cython is not in any real sense a replacement for a modern data/ml stack.
However, I think an ML designed for machine learning would be nice, especially if the type system is extended to multidimensional arrays shapes. Pattern matching on array shapes would be rather nice. Ocaml style interactive mode for exploration and compiling for performance would be nice too.
LLMs are leveling the developer experience and productivity in a way that makes Python's strengths almost irrelevant, while it's still suffering from bad tooling (even with uv and friends) and poor performance.
AI/ML: interfacing with C++ libraries directly (or in Rust) is now a real option. For everything else, even 5 years ago I wouldn't have used Python, now there are even fewer reasons to do so. As far as I'm concerned the remaining use cases are notebooks and one-shot scripts.
> With writing code in english now, why have it use a slow weak language?
Because the feedback loop of writing few lines of Python inside Jupyter cell is much shorter than with your currently favorite AI tool. It costs less too.
> What statically typed language would you suggest for machine learning and large data pipelines? I don't love Python, but it has by far the largest ecosystem.
Pay no attention to OP. It's nonsensical to even suggest you should migrate away from a whole tech stack just because you want to run static code analysis, especially when the argument is based on having too many static analysis tools to choose from. Utter nonsense.
Yes, but unfortunately Python has invaded everything, and one must adapt.
Python is going to be preinstalled on almost any machine I use, with a reasonable assortment of libraries. And even if they're not preinstalled, the libraries I want are likely to exist. They'll have unstable APIs and weird quirks, and I'll have to take my choice of bad packaging systems to install them, and everything will just generally be a pain, but they'll exist and largely work. That's not true for any language I actually want to code in. I mean, I'm not going to deny that Python is better than shell scripts or (usually) C.
It's not like it's a pleasant language to code in, especially if you actually want to use the type support, which is weird and irregular and keeps changing and has to work around fundamental design problems at the core of the language.
strict type checking is an incredibly useful tool for cases when you really want to make sure your code is correct and behaving as expected (one of many tools).
There are lots of people who like Python and want to use it for things where incorrect code has serious consequences. Type checking is helpful in these contexts.
Type checking remains optional for the masses and is not practical in many cases. Still, pushing away people who want to use all available tools for writing correct python only hurts the community.
That is why I'm using C# and Rust more now than Python. You get far better RoI on types, and they are so much faster and can use all cores so much more easily.
> If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
I don't understand your question. The whole point of static code analysis is preventing bugs. Don't you like Python code to not have bugs that are easily caught with static code analysis, or is preventing bugs a foreign idea that is better left to other languages?
Yes, but using five type checkers on Python is the equivalent of buying a Nissan Leaf and trying to turn it into a Lamborghini by adding an internal combustion engine which is only supposed to produce a nice roar but no thrust.
Often, when I code in Python, it's because there are some libraries that aren't available in whatever other language would have been my first pick. Then, typing and type-checking are useful tools to stave off the codebase turning into the unruly mess that all Python projects eventually become.
Don't projects in all languages turn into an unruly mess by default?
It requires special care to not let long-running projects evolve into it.
Python is only special in that it is extremely productive and allows lots of easy evolutions of a project with not a care in the world, so the timeline is probably shorter on getting to the "unruly mess" if no special care is put in to make it survive many evolutions.
Perhaps a bit special too in that it looks welcoming to the masses who have no idea how software systems evolve and thus do not even know there are some special patterns to introduce for code to survive many changes in the future. I am undecided if this is a pro or a con of the language itself.
Yeah, I can't say I really get the appeal of gradual typing. It's commented/documented code at best and outright lies at worst. Sure, you can build tooling around it and improve your DX a bit but isn't it always a house of cards?
If you add one of these type checkers into your CI or a pre-commit hook, it provides the same guarantees you get from a compiler along with the same tooling benefits. It gives you the option of using the structure when you need it, but not being forced to use it when you want to take advantage of some of the more dynamic features of the language.
But none of the runtime benefits of having static types in the compiler, since the runtime still can't trust the types. Still, half a loaf is better than none.
Python has other, bigger problems that make it a constant headache. One of them being the dismissive attitude towards any and all of problems that come from versioning, dependencies and quirks that make it challenging to have robustness.
Criticisms are typically dismissed by suggesting heaping yet another "solution" onto the growing pile of "solutions" that you have to drag around with you. That people have to learn. That you have to install tooling for. That has to be vetted. That has to become part of the toolbox to get even seemingly simple things done. This attitude is a big part of the reason that I strongly advise people against using Python in production. On top of all the problems presented in a real-world setting.
Almost all of the time, people who are fond of Python are more interested in defending Python, disparaging me, downvoting me, etc. than in listening to why I make that recommendation.
(I get it. People like Python. What I think of Python as a language is irrelevant. In fact I don't have that much against it. But I do have a lot against it in a setting where you need reliability and repeatability)
I have spent the last month of my life building a system that can run Python tooling reliably in a business critical application. I knew this was going to be a pretty big job when I started, but for every problem I solve, a bunch of new problems arise. I am starting to see light at the end of the tunnel but it hasn't exactly been smooth sailing. I'm almost there for a first version, but there are a bunch of problems still to solve. Mostly because I care about developer ergonomics and that things should "just work". One important goal is that my solution shouldn't impose any significant cognitive burden on people who use it. That's really hard.
(I don't think the solution will be open source since my contract wouldn't allow for it. But I'll make the case at some point for why it should be open sourced)
And yes. There are statically typed languages available today that have decent tooling that provides superior developer ergonomics. I can understand that people don't want to learn new languages, but if you have the capacity to do so I would recommend trying to move on if the code you write has to run outside your own workstation. If an old fart like me can learn and adopt new languages, so can you.
Python the language is pretty nice. It has its warts, but I make my living in PHP which is practically made of warts. But the python ecosystem still seems to be trying to figure out this whole package management and project setup thing. In most languages I can do some form of `$blub install` where $blub is the language's official package manager or some close equivalent. It's just python that always screams at me that I have to set up and "enter" a virtualenv. I get what venv is for, but it's still a weird hack of hardlinks and relative paths that no other language seems to need, and a clumsy two-step dance of a UX that hasn't improved in like 20 years.
I am not sure you get what virtualenvs are: Python is never screaming at you to set up a virtualenv, it must be a particular package recommending use of virtualenv for easy set up without interfering with the rest of your system.
Virtualenv allows you to seamlessly run multiple Python ecosystems simultaneously, even within the same project directory. It's basically a primitive containerisation mechanism that predates any actual containerisation systems on Linux.
You do not have to use it, but then you can easily slip into a sort of "DLL hell" (multiple incompatible library versions installed system-wide) with multiple projects — or need to bundle all dependencies within your project directly. None of this is specific to Python, really — any shared library system has the same challenge. How many other systems are there in active use making it as easy to use multiple incompatible versions of shared libraries per project or within the same project?
When in doubt, you can always retreat to the basics in Python world: put packages you need in a path of your choice, and point PYTHONPATH (sys.path) at it.
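A minimal sketch of that fallback, assuming a throwaway directory and a made-up module name (`hello_vendored`); inserting into `sys.path` at runtime has the same effect as launching the interpreter with `PYTHONPATH` pointing at that directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for "a path of your choice".
vendor_dir = Path(tempfile.mkdtemp())

# `hello_vendored` is a hypothetical module written only for this demo.
(vendor_dir / "hello_vendored.py").write_text(
    "def greet():\n    return 'hi from the vendor dir'\n"
)

# Same effect as starting Python with PYTHONPATH=<vendor_dir>.
sys.path.insert(0, str(vendor_dir))
importlib.invalidate_caches()

import hello_vendored

message = hello_vendored.greet()
```

For real packages, `pip install --target <dir> <package>` is one way to populate such a directory.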
I thought I'd given sufficient clues that I actually do in fact know what virtualenv is. I was hip-deep into python when it was introduced, and thought it was a clever hack, but it's relying on python being compiled to find its libraries in a path relative to the python binary, so rather than use something sensible like a launcher that sets PYTHONPATH to a local dir like every other language does, it hardlinks the python install into the local dir in order to pretend everything is a system-wide global install. I always figured that hack was a temporary workaround, but it's going on a couple decades now.
As for the screaming, I'm talking about pip, which has the unique property of just refusing to work by default unless you use --user (what most other package managers now call "global") or are in one of these pretend-global environments. Ultimately it's just a paper cut, not a show-stopper, but it points to not just a failure to address the pain points of the package ecosystem, but a seeming refusal to do so.
Usually running pip as root overcame that problem — but it did cause all the other problems virtualenvs were introduced to work around (so I am not suggesting this as the solution).
You are right that there are many problems with packaging in Python, and yet I feel like virtualenvs are the smallest of those.
I believe we also need to compare tech this old to tech from the same era: obviously newer ecosystems had the benefit of hindsight, but how does managing dependencies compare between Perl and Python, for instance?
The biggest problem with Python packaging is, IMO at least, that it is actually attracting so many evolutions and proposals that it is hard to stay on top of if you do not make Python packaging your core interest.
For managing dependencies in Perl, it was originally a similar story to Python: everything was system-installed, but many people would install things to their home dir and set PERL5LIB in their .bashrc. The cpan client was smart enough to detect and use the home install when writing its initial config, so you could call it a day. Later there was local::lib which fiddled @INC for the use of a project-local directory, and cpanminus defaulted to using it, and then Carton came around which is more or less a clone of Bundler from Ruby, also using local::lib under the covers.
What are the problems you have with tooling? Imo it's no worse than most other languages besides a very small handful of recent ones (rust, go) where everything is included
The easy approach is usually just throw it in an OCI container
There's not much concrete to go on here besides "I don't like the ergonomics"
Throwing it in an OCI container is not the "easy" approach. It is the beginning of opening a can of worms. I know, because I've been developing tooling to do that at scale for the last few months and behind every layer of worms there are uglier and bigger worms that you need to deal with.
And yeah, I have written a lot of code to insulate Python in containers while allowing meaningful access to hardware and services. While at the same time not heaping more complexity and cognitive surface at the developer. Including writing my own container software to actually understand what's involved at a more detailed level so I know what I'm doing when trying to make this work with existing container software. (No, I don't run any of my container managers in production since I don't want to maintain it -- but this also means a bit more complexity in using existing ones)
It may be "easy" in trivial cases, but it is very far from easy if you want to make something that can cater to a wide range of scenarios.
Do you have examples of issues? OCI is a tarball of dependencies that doesn't fight with the OS userland.
Why do you need to write code to insulate Python in containers?
At the simplest level, you can add the flags to the container runtime (network host, host ipc, host process namespace) to turn off all the namespacing besides filesystem and the Python container runs just like a non containerized process.
And even there most of the custom code is just running a ton of combinations of inputs against docker build. The OCI container gets rid of "wide range of scenarios" for you standardizing the runtime environment
The fact that this article seems to honestly recommend people run 5 different type checkers on library test suites really reflects the tacked-on feeling of Python typing.
I am not sure it is recommending more than it is commenting on the current state of developing public-facing APIs in Python.
The downstream users that import the package either have to ignore checking its exported types altogether, manually stub it, or have a subpar development experience to varying degrees.
This is something I saw the other day with some package that provided comprehensive stubs for an untyped library. The .pyi file was littered with comments about quirks from the numerous type checkers (five now).
It's ridiculous. They should have made it an explicit part of the language. The interpreter knows about types already, it's crazy that they couldn't just let the user make the types explicit rather than implicit, and have the interpreter enforce that.
The interpreter knows types at runtime, not at parse/compile time. The interpreter already does a lot of dynamic type checking. It has a much stricter type system than e.g JavaScript; JavaScript will pretty much always convert operands to produce some result (even if it's just NaN or the string "object Object"), while Python will often just give you a type error.
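A small sketch of that runtime strictness: where JavaScript evaluates `1 + "1"` to the string `"11"`, Python refuses to guess and raises `TypeError`. The variable names are only for illustration.

```python
# Mixing unrelated types raises instead of silently coercing.
try:
    result = 1 + "1"
except TypeError as exc:
    result = f"TypeError: {exc}"

# An explicit conversion is required to pick a meaning.
as_number = 1 + int("1")   # integer addition
as_text = str(1) + "1"     # string concatenation
```

Note that this check happens only when the line executes, which is the runtime-versus-static distinction being drawn here.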
The interpreter doesn't know about static types.
I agree that they should've made typing more a proper part of the language and not left it in this weird half-defined state of "standard syntax and some standard typing imports but undefined semantics". But it's not just a matter of enforcing existing types.
No, it reflects the nature of misunderstanding Python by people who think their system is better, have no idea how Python in production actually works, and just publish things like the article to make themselves feel better.
Typing is not a huge issue, period. In Python, if you pass a wrong type to something, program just throws exceptions. Exceptions are not the end of the world like people make it seem. Functionally, finding errors during the process of taking code and compiling it with type checking is no different than taking code and just running it against a set of tests, which every production code has (or should have)
The only way typing ever saves you from it is by being absolutely strict: every type defined has a finite range of values, and every operation has a bounded domain and range. I.e. if you have a string field, it's not enough that it's a string; you also must define the total number of characters that string can have, and the values for each character, along with more complex rules on sequences of characters.
If you have this system (something like Coq comes close), then if your program compiles, it's by definition correct. But even the strongest proponents of typing don't really want to do this, because they realize how long it would take to write code.
The simple truth is that Python is easy and flexible enough to work in that you don't even need type checking. An LLM can effectively function as a type checker for you if you care enough. For any errors that you encounter due to lack of typing, it's ultimately way faster to fix them in Python than it is to spend time writing in a strongly typed language.
> "In Python, any method __eq__ is expected to return bool, and if it doesn't, then we need to explicitly tell type-checkers to ignore the type error. This function in Polars can also return different types depending on the inputs, thus requiring overloads."
Why would you ever want a == b to not return a bool??
EDIT: Yes, I understand that you can do element-wise equality checks on numpy arrays now
There are examples like ORM query builders (something like `User.id == user_id` should not return a boolean, but rather some inspectable query part), multi-value comparisons (e.g. numpy arrays and views which could also be used as masks for indexing)
In general, when you get your hands on operator overloading you get a bunch of various quirky applications for each. Some dunder methods have strict runtime-level rules (e.g. __hash__ or __len__), some don't
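A rough sketch of the query-builder case, under the assumption of two made-up classes (`Column` and `Eq`); this is not SQLAlchemy's actual implementation, only the general shape of an `__eq__` that returns an inspectable object:

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class Eq:
    """A deferred 'column = value' comparison (hypothetical class)."""
    column: str
    value: object

    def to_sql(self):
        # Return a parameterised fragment plus its bound values.
        return f"{self.column} = ?", (self.value,)


class Column:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Build an expression object instead of answering True/False.
        return Eq(self.name, other)

    # Defining __eq__ sets __hash__ to None, so restore identity hashing.
    __hash__ = object.__hash__


user_id = Column("users.id")
expr = user_id == 42
sql, params = expr.to_sql()
```

The return type of `==` here is `Eq`, not `bool`, which is exactly what a type checker flags against the `object.__eq__` signature.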
That's one of the things I truly didn't get from my (very limited) experience with SQLAlchemy. Why not just have a method like orm.eq(User.id, user_id)? Much more readable.
Elementwise equality! Given two dataframe columns or ndarrays, users often expect `==` to give out a column or ndarray of bools (like `+`, `*`, `&`, and just about every other binary operator).
What I love about operator overloading is that now you can't use operators without looking at their definition, in which case.. you could have done numpy.equals(a, b) anyway.
Does a == b return true if all elements are the same? Does it return an array of booleans? It's anyone's guess!
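To make the two readings concrete, here is a toy sketch with a made-up `ToyColumn` class (not the numpy or Polars API) where `==` is elementwise and whole-object equality gets an explicit name:

```python
class ToyColumn:
    """Hypothetical column type used only to contrast the two meanings."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Elementwise: one bool per position, not a single verdict.
        return [x == y for x, y in zip(self.values, other.values)]

    def equals(self, other):
        # Whole-object equality under an explicit method name.
        return self.values == other.values

    __hash__ = None  # mutable container, so leave it unhashable


a = ToyColumn([1, 2, 3])
b = ToyColumn([1, 0, 3])
mask = a == b        # a list of per-element results
same = a.equals(b)   # a single bool
```

Which of the two `==` should mean is a library design decision, so the reader does have to know the convention of the library in use.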
The function call approach can be a lot less readable.
Consider using Shamir secret sharing to share a secret, D, among several people with two people required to recover the secret. D is a positive integer, such as a randomly generated 128 bit AES key you are using to encrypt your launch codes or credit card database.
For anyone not familiar with Shamir secret sharing what you do is pick a prime number, p, that is larger than D and another random positive integer A, that is less than p. Then give each person a pair of numbers, (i, (Ai + D) % p), where each person gets a different i (which should be a positive integer less than p...it is OK to simply use 1, 2, 3, ...). Let's let Di = (Ai + D) % p.
(This is for the case where you want any two people to be able to launch your missiles or decrypt your database. If you wanted 3 required instead of giving out (i, (Ai + D) % p) you would give out (i, (Bi^2 + Ai + D) % p) where B is a randomly chosen positive integer less than p. For 4 required add on a Ci^3 term, and so on).
Given (i, Di) and (j, Dj) and p it is possible to recover A and D.
Here's what that looks like in a language where the big int library uses an accumulator style, i.e., operations are of the form X = X op Y, where the ops are methods on the big int objects. Assume Bi and Bj are big int objects initialized from i and j, and Di and Dj are already big int objects, as is p. This particular example is using Perl. (This is very old code. Since 2002 you can add a "use bigint" pragma to Perl code and then it would look a lot more like the second Python example below).
my $A = $Dj->copy()->bsub($Di); # Dj-Di
$Di->bmul($Bj); # j*Di
$Dj->bmul($Bi); # i*Dj
$Di->bsub($Dj); # j*Di-i*Dj
$Bj->bsub($Bi); # j-i
$Bj->bmodinv($p); # (j-i)'
$Di->bmul($Bj); # (j*Di-i*Dj)*(j-i)'
$Di->bmod($p); # (j*Di-i*Dj)*(j-i)' mod p
$A->bmul($Bj); # (Dj-Di)*(j-i)'
$A->bmod($p); # (Dj-Di)*(j-i)' mod p
At this point, the recovered A is in $A and the recovered D is in $Di
Here's what it looks like in a language with the ops as function calls taking the big int objects as arguments. This example is Python without using operator overloading.
import operator as op
def recover(i, j, Di, Dj, p):
    # pow(x, -1, p) computes the modular inverse (Python 3.8+)
    j_i_inv = pow(op.sub(j, i), -1, p)
    A = op.mod(op.mul(op.sub(Dj, Di), j_i_inv), p)
    D = op.mod(op.mul(op.sub(op.mul(j, Di), op.mul(i, Dj)), j_i_inv), p)
    return A, D
Probably more readable than accumulator style. Here it is in Python using its built-in operator overloading for big ints:
def recover(i, j, Di, Dj, p):
    j_i_inv = pow(j - i, -1, p)
    A = ((Dj - Di) * j_i_inv) % p
    D = ((j*Di - i*Dj) * j_i_inv) % p
    return A, D
I'd sure rather come across that than either of the earlier examples.
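As a sanity check, here is a round-trip sketch for the two-person case described above. The `split` helper, the demo prime (2**127 - 1, a Mersenne prime) and the small demo secret are my own assumptions for illustration; `recover` is repeated so the snippet runs on its own.

```python
import secrets


def split(D, p, shares=3):
    # One random coefficient A gives a line A*i + D, so any 2 shares recover D.
    A = secrets.randbelow(p - 1) + 1
    return A, [(i, (A * i + D) % p) for i in range(1, shares + 1)]


def recover(i, j, Di, Dj, p):
    j_i_inv = pow(j - i, -1, p)  # modular inverse, Python 3.8+
    A = ((Dj - Di) * j_i_inv) % p
    D = ((j * Di - i * Dj) * j_i_inv) % p
    return A, D


p = 2**127 - 1          # demo prime; it must be larger than the secret
secret = 123456789      # small stand-in for a real key
A, points = split(secret, p)

# Use the first and third shares; any two distinct shares would do.
(i, Di), _, (j, Dj) = points
recovered_A, recovered_D = recover(i, j, Di, Dj, p)
```

The recovery works because Dj - Di equals A(j - i) mod p, and j*Di - i*Dj equals (j - i)*D mod p, so multiplying each by the inverse of (j - i) isolates A and D.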
OT: this reminds me of something I started to do once but never finished. I was going to write for each language we used at work that had a big int library but that did not support operator overloading a class that implemented a big int RPN calculator. Java, for example. Then recover would look something like this:
calc = new BigRPNCalc();
calc.do(j, i, "-", p, "modinv dup");
calc.do(Dj, Di, "- *", p, "mod swap");
calc.do(j, Di, "*", i, Dj, "* - *", p, "mod");
D = calc.pop();
A = calc.pop();
But I never ended up needing big ints in any of those languages so never really got past some initial design work.
I can see the first one making sense, but why would you need a representation of equality other than "yes, these are equal" and "no, these are not equal"?
The first use case that comes to mind is if you want a DSL to build expressions that are evaluated later in some different context e.g. when using `polars`:
Well personally I’m not a fan of turning everything into an object, but if you have properties or methods that exist upon the concept of Equality you might want to encode directly onto a class. Maybe in a domain where “Equality” is an important concept, like mathematics or even something like accounting.
Could enable a different interface into approximate equality for floating point numbers: Equality.approximate(iota: float) -> bool
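For what it's worth, the standard library already covers the approximate case as a plain function rather than an Equality object. A short sketch using `math.isclose`:

```python
import math

# Exact comparison trips over binary rounding error.
exact = (0.1 + 0.2) == 0.3

# isclose compares within a relative tolerance (default rel_tol=1e-09).
approx = math.isclose(0.1 + 0.2, 0.3)

# The tolerance is a parameter, so "how equal" stays explicit at the call site.
loose = math.isclose(1.0, 1.05, rel_tol=0.1)
```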
IIRC, SQLAlchemy overloads this to return an object that represents an equality check in SQL. Because it was returning an object, it was always evaluating to True, because of another of Python’s footguns: truthiness/falsiness. This was a decade ago, and these particular footguns were not even remotely the biggest culprits in our bug backlogs (another honorable mention includes accidentally calling a sync function in an async context, causing timeouts in unrelated endpoints and leading to cascading system failure).
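A generic sketch of that footgun, using made-up `Field`, `Expr` and `StrictExpr` classes rather than SQLAlchemy itself (whose current behaviour I am not asserting here): an object with no `__bool__` or `__len__` is always truthy, so `if a == b:` silently passes unless the expression type refuses to be used as a boolean.

```python
class Expr:
    """Stand-in for a deferred comparison (hypothetical)."""

    def __init__(self, left, right):
        self.left, self.right = left, right


class StrictExpr(Expr):
    def __bool__(self):
        # One possible guard: refuse to be used in an `if`.
        raise TypeError("expression has no truth value; execute it instead")


class Field:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return Expr(self, other)

    __hash__ = object.__hash__


# Both comparisons are "true", whatever the right-hand side is.
always_true = bool(Field("id") == 1) and bool(Field("id") == 2)

try:
    bool(StrictExpr(Field("id"), 1))
    guarded = False
except TypeError:
    guarded = True
```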
It could return a vector or a deferred expression? In polars, for example, operations on `pl.col` return `Expr` objects that are used to build queries, not immediately evaluated:
df.filter(pl.col("status") == "active")
In numpy, `x == y` returns a boolean vector of the same shape as x and y, comparing them element-wise.
Never understood this complaint about operator overloading.
In any language, a function called `isEqual` could wipe your hard drive and replace your wallpaper with a photo of a penguin. Therefore, letting programmers pick the names of their functions is bad? No, obviously naming things for least surprise is the programmer's responsibility.
But when it's the symbols `==` instead of an ASCII name, it's a problem in language design?
(FWIW in Javascript, being unable to override == is actually a problem when you want to use objects as Map keys)
Python never met a footgun it didn’t need to adopt. In this case, however, it’s not equality checks, but operator overloading. I was a Python developer for a decade before switching to Go and life on this side is so much better.
Operator overloading has never been an issue for me, but terminating a line with a comma creating a tuple, or white space (including new lines) between strings to concatenate have cost me days of work over the years.
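For anyone who hasn't been bitten yet, a short sketch of both; the variable names and host names are invented for the example:

```python
# A stray trailing comma turns a scalar into a one-element tuple.
timeout = 30,

hosts = [
    "alpha.example",
    "beta.example"     # missing comma here...
    "gamma.example",   # ...so these two literals are concatenated
]
```

Both lines are valid Python, so nothing fails until `timeout` is used as a number or the list turns out to be one host short.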
I understand why those exist, but they’re pure evil.
Dynamically typed languages are going to decline with the rise of AI coding.
Statically typed languages provide the determinism necessary to efficiently anchor probabilistic coding agents.
You can throw as much type checking at dynamic languages after the fact, but you're just going to burn energy (and tokens) doing what another language gets 'for free'.
No matter your preference, programs in dynamically typed languages are still very much deterministic.
To be able to reason about the output of LLMs (though it is debatable how often this will be needed), you want the output from your imprecise human language spec to a deterministic spec (code) to be as easy to review as possible (for correctness, but mostly for any glaring errors). With proper setup, ensuring correctness of one deterministic output (Python) in comparison with another (e.g. a typed language like Rust) is just a deterministic run away (a test suite) that should not use any tokens from the LLM, and should have no practical differences in compute use.
Static type checking catches a whole class of programming errors for free. Writing tests costs tokens or human time, so you end up needing more code (and probably more CPU time) to achieve the same level of error-checking in a dynamic language.
Very simplistic look, IMO. I'll add another one mostly as a counterpoint (not that I believe it is strictly true, but largely yes!).
Python is a lot more expressive than other languages and has a very terse syntax, and thus requires LLM to output much fewer tokens to achieve the same job compared to other languages.
Adding a few more tests to ensure data conformity on top of what you have to do anyway with a statically typed language still results in fewer tokens overall.
The expressiveness of a language and whether it does static/dynamic type checking are orthogonal. Type inference and generics are a thing in most modern languages. I also think you're underestimating the amount of code you need to guarantee that there are no type errors in your Python code.
Does it have a terse syntax? I main F#, and when I have to work with Python I generally find myself complaining about how verbose it is. (Needing intermediate variables for what should have been a pipeline, the ceremony around parallelism, having to store constructor parameters as object fields, etc.)
It sure does in comparison with most mainstream statically typed languages — if you feel that way about Python, I wonder what you say about Java, C++ or even Rust or Go?
Checking some examples at https://learn.microsoft.com/en-us/dotnet/fsharp/tour, I'd say it's quite similar in verbosity — eg. no need to declare a module in Python since the code already lives in a file that is the module (plus one less indentation level for module level functions); inline function declaration and calling is thereabouts with F# slightly more terse (let vs lambda + spaces vs parenthesis); if-then more verbose in F# (no then in Python, just "if x:"); F# does not seem to need "return"...
In many cases you can avoid intermediate variables: inline ifs, list comprehensions, lambdas, etc. Constructor arguments are a good point, but this is mostly about idiomatic use rather than the language itself: you can simply do
def __init__(self, **kwargs):
    self.args = kwargs
I'll give you one on the ceremony around concurrency, though! I have different ideas of how it should have been done to shift the cost to the language runtime instead of the developer, but alas... :)
Maybe, but AI models are better at writing python and JS than any other language. Probably because they are the most common and thus had the most code available for training.
This is probably the most intellectualism I've seen anyone put into a comment that is so very, very, obviously wrong.
Yeah, in the age of AI where the whole goal is to not have to think, type as fast as you can with misspellings, and copy paste stuff without thinking, it's TOTALLY a better system to worry about the types of whatever you are feeding into LLMs.
The gymnastics people are putting their ops teams through in order to validate oceans of generated slop is insane. Just use Rust and half of that work goes away.
> Prioritise running as many type-checkers as possible on your test suite. Run at least one on your source code.
There are two types of tests: those that test against the public API, and those that test internal codes with various mocks and fakes. I think the vast majority of unit tests is the latter one, in which case the suggestion does not really make sense.
Why anyone would still use mypy besides legacy infrastructure is beyond me. It is dog slow as well as being the laziest of all, not catching many mistakes.
Unfortunately for Django apps switching to any alternative leads to the dreaded “wall of errors” issue. If anyone got to work this out in the past, I’d gladly take advice.
We switched our very large Django monolith codebase over to ty — the trick for us was generating stubs for Django models and having tooling keep those stubs in sync with the actual models.
Went from type checking taking ~10 minutes in CI to now taking ~15 seconds and runs on pre-commit.
Absolute game changer, I think we spent $10k in claude credits and did the entire mypy -> ty refactor in about 3 weeks.
Since we are talking about Python type-checkers: we've built a (non-AI based) type assistant for Python called RightTyper (https://github.com/RightTyper/RightTyper). Below is a brief description; a technical paper describing RightTyper is here: https://arxiv.org/abs/2507.16051, "Getting Python Types Right with RightTyper"
RightTyper is a Python tool that automatically generates type annotations for your code. It monitors your program as it runs and records the types of function arguments, return values, local variables, and class fields — with only about 25% runtime overhead. This makes it easy to integrate into your existing tests and development workflow, and lets a type checker like mypy catch type mismatches in your code.
Five different type-checkers and even type-checking projects think adding multiple "ignores" is sound code. Typescript would allow overloads without ignores, for example.
The whole type checking experience in python has disappointed me deeply and is seriously affecting my work.
I see the appeal for type-checking and yeah it has caught many bugs. But the language is quickly running blindly to the worst of all worlds in regards to typing.
1. You have to exhaustively write types in many cases where they can be obviously inferred.
2. The type checking is just a lint step. i.e. we are still paying for the duck typed typing system.
3. We no longer get to use the duck typed typing system making a lot of generic code require obscure annotation incantations to pass the lint check while it's correct python code.
My ideal typing system would be around constraints introduced by the code and completely inferred unless the user wants to tighten the constraints. i.e
Instead of
def foo(a: int, b: int) -> int:
    return a + b
You would write:
def foo(a, b):
    return a + b
And upon checking if you tried to do foo(5, {})
It would tell you that there is no + operator for int and dictionary that is required by the foo function.
My ideal typing system would allow you to constrain the types as well, like so
def foo(a: int, b: int):
    return a + b
The return type is not required in this case because it can be inferred from the function definition. In other cases it could be declared as well, for example to add the constraint that we don't want None.
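The closest thing in today's Python to "the only constraint is that + must exist" is probably a structural protocol. The sketch below is an approximation of the idea, not the fully inferred system described above: `SupportsAdd` is a hypothetical protocol written for this example, and the claim about what a checker reports is an expectation rather than a guarantee for every tool.

```python
from typing import Any, Protocol, TypeVar

class SupportsAdd(Protocol):
    """Hypothetical protocol: any type that defines the + operator."""
    def __add__(self, other: Any, /) -> Any: ...

# Both arguments must resolve to one type that satisfies the protocol.
T = TypeVar("T", bound=SupportsAdd)

def foo(a: T, b: T) -> T:
    return a + b

foo(2, 3)        # fine: int defines __add__
foo("a", "b")    # fine: str defines __add__
# foo(5, {}) is expected to be rejected by a checker such as mypy, because
# no single type covering int and dict defines __add__. At runtime it
# raises TypeError.
```

The annotation still has to be written by hand, which is exactly the complaint: the constraint is expressible, but not inferred.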
Declaring types only where one wants to introduce a constraint would be nice, but unfortunately calls with obvious wrong types like "foo(5, {})" are a minority; in most cases types in a call depend on types of the calling function's parameters, and the type checker can only deduce unknown type from unconstrained type.
Moreover, the extremely dynamic execution and lack of compilation in Python means that determining in a lint step whether "there is no + operator for int and dictionary" is essentially impossible: if you find some __add__ or __radd__, you need to trust its declaration that it accepts certain types or not and hope it's still there at runtime; if you don't find it, it might exist at runtime when operator + is actually used.
A compiled and sufficiently static language, on the other hand, can inspect source code and libraries and reason about potentially infinite sets of definitions and prove that functions exist or don't exist.
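A small sketch of why the "it might exist at runtime" case is hard for a lint step. The `Money` class and `try_add` helper are made up for this example; the point is only that the same expression fails before a runtime patch and succeeds after it.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

def try_add(left, right):
    """Return left + right, or None if no + operator applies."""
    try:
        return left + right
    except TypeError:
        return None

# At this point neither int nor Money knows how to add the two.
before = try_add(5, Money(100))

# Monkey patch: add reflected addition to the class at runtime.
def _radd(self, other):
    return Money(self.cents + other)

Money.__radd__ = _radd

# The identical call now succeeds, because int.__add__ returns
# NotImplemented and Python falls back to Money.__radd__.
after = try_add(5, Money(100))
```

A checker reading only the class body sees no `__radd__` and has to either trust that nothing will add one later or give up on proving the absence.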
I have the sneaking suspicion that for the vast majority of code it could be done.
I don't think the operator overloading is such a big issue since you can do the same kind of tracing for the overloaded operator.
The main issues would be metaclasses and all sort of monkey patching etc. i.e. anything that could mutate a type at runtime because the linter can't know if the mutation has happened.
Unfortunately I do not have the time nor patience to introduce the sixth type checker.
You need to distinguish the prevention of actual type errors during the execution of a Python program, which would require mutilating the Python language, and linting Python code to find proven or likely defects, which is merely fraught with complications (open world and closed world assumptions, expecting more or less adversarial behaviour from callers of a function, etc.) and necessarily dependent on adding optional type declarations.
From my experience with Python, both personal and professional, I find it immature and not well-suited for large codebases. Typing should have become part of the language a long time ago; it is clear that users want it.
Take, for example, PHP… look at the features released in the last 6 or so years, starting with PHP 7, and how mature the language has become.
With the advance of AI-assisted programming, I feel like Python is always a bad choice.
I run pyright in CI and mypy locally. They catch different things - pyright is stricter on overloads, mypy catches more None issues in our codebase. Annoying, but I have not found one that covers both.
Why would users care if you're using the same type checker as them? Surely they're not expecting all their imports to be instrumented for running redundant types checks?
Users do not care about that, but they want to not see type errors or warnings when they integrate your API in their code.
That’s why you want to run their type checker on your API. You cannot know what “their type checker” is, so you want to run all popular type checkers on your API.
Sounds like a them-problem. Their type checker can accept my declared typings for my public API, or they can override it with their own custom type stubs if they have objections.
what are ppl's impressions of pyrefly? i've become completely captive to uv's tooling. it has allowed me to think only about coding versus tooling. don't feel like giving another typechecker a chance unless it offers something i'm not getting from ty.
The blog entry fits ruby too, to some extent; while the situation is nowhere near as bad as in python, you have the same question marks about why types suddenly emerge out of nowhere. Almost ... almost as if some people have a specific agenda, and try to push it through.
Well, there you have it - the type-addicted people are ruining python.
With agents it no longer makes sense to tie yourself to Python's archaic development experience. How many type checkers are there? Package managers? Don't even get me started on cross-platform deployment.
Strongly typed, compiled languages have never been easier to use, and agents reap huge benefits from the tight feedback loop that the compiler provides. Moreover the benefits of the Python ecosystem are less significant today than anytime in the past 20 years. Need something that's only available in Python? Just point some agents at it and you can port it.
I've noticed this a lot in LLM generated Java. Since it doesn't know what can or can't be null it tends to wrap everything in Optional<T>. Super strong type systems are becoming even more important.
> Just point some agents at it and you can port it.
Don’t think we’re there yet, otherwise we would see a bunch of forks of major libraries to alternative languages - and not just Python. There’s still too much risk of insidious errors and bugs.
I've done this a few times for stuff in the < 10,000 LOC space. It works great.
There's something particularly satisfying about shipping a 1-10MB static rust binary instead of a 2GiB docker python environment.
(I'm talking about just porting simple applications, or maybe a missing package/crate at a time. Not both at once, and not typical 100K-10M line internal legacy sprawl)
If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
Hallelujah, that's always been my position. To the static typing folks: leave my dynamically typed languages alone and go coding with something that really suit your needs. If the answer is that Python, Ruby, JS, whatever are really much more pleasant to code with, my reply is that they are so precisely because we don't have to type type definitions. Tradeoffs.
It's not an all or nothing thing.
I think types are particularly valuable for libraries. A library author using copious types really helps the downstream user to know "Ok, this function returns a dict(Foo, Bar)". But after that, it's a matter of preference if you want to add those types to your own code or not.
Having the types in the libraries makes it a lot easier for your tools/IDEs to give good suggestions and catch bugs that you might otherwise miss.
Yes, where would I be without the _RelationshipBackPopulatesArgument type of
It's not for you, it's for your IDE. And if you aren't using an IDE then you can pretty much ignore it anyways.
You are in exactly the same position as if you knew or didn't know that type.
If you're not using an IDE nor an LLM
>> I think types are particularly valuable for libraries.
> Yes, where would I be without the _RelationshipBackPopulatesArgument type of ...
(proceeds to list a signature with over 40 parameters)
You would be left wondering which of the 40+ arguments provided to a given invocation is not what was allowed without a compiler to tell you.
Have fun tracking down which one, or ones, is causing the problem.
What function signature isn't going to look messy with 36 keyword arguments.
https://github.com/sqlalchemy/sqlalchemy/blob/0798e6cbe11b30...
Part of it is due to the clunky `_NoArg.NO_ARG` business for optional params. Pretty-printing it would also go a long way, but that technology seems too advanced for any language circa 2026.
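For readers who haven't met the pattern: a sentinel like `_NoArg.NO_ARG` exists so that "argument not passed" can be told apart from "argument passed as None". The sketch below shows the general technique with hypothetical names (`_Missing`, `describe`); it is not SQLAlchemy's actual definition.

```python
import enum
from typing import Optional, Union

class _Missing(enum.Enum):
    """Hypothetical sentinel enum standing in for a NO_ARG-style marker."""
    MISSING = enum.auto()

def describe(nickname: Union[Optional[str], _Missing] = _Missing.MISSING) -> str:
    # Three distinct states: omitted, explicitly None, or a real value.
    if nickname is _Missing.MISSING:
        return "nickname not provided"
    if nickname is None:
        return "nickname explicitly cleared"
    return f"nickname set to {nickname}"
```

Using an enum member rather than a bare `object()` lets the sentinel appear in the annotation, which is also why every optional parameter's type ends up carrying the extra union member.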
This is a big part of the reason that I've embraced the sqlc (d/re)evolution.
Writing queries in sql and then generating for the target language also provides a flexibility that has reduced rewrite cost. Add to this ease of organization and layout, and I'm not going back.
It's probably hard to come up with something messier than SqlAlchemy here. Not an expert, but spent more than enough time spelunking queries in the debugger. I much prefer bugs that can be surfaced at compile-time rather than run-time.
_RelationshipBackPopulatesArgument = Union[
    str,
    PropComparator[Any],
    Callable[[], Union[str, PropComparator[Any]]],
]
I do think it is somewhat of an all or nothing thing. I can write dynamic languages, sure; I prefer having static types, but I have written a lot of dynamically typed code. However if I'm working in an editor with LSP integration, the experience is much worse when some things are missing types.
As an example, I may have a variable with types:
const something = somelibrary.getSomething();
and I can type `something.` and see methods and properties. However, if my own code doesn't use types consistently, it's so easy to lose type info. For example: or: or any number of other patterns which accidentally strip type info from the variable if you don't use types everywhere.
I would much rather have a language where the compiler complains if some variable doesn't have a static type than a language where I can accidentally leave something untyped. I don't understand in which case I would want a variable or function to not have associated static type information.
> It's not an all or nothing thing.
It kind of is? All the partial-typing systems are too complex and usually broken in various ways. Compare to eg Elm or Gleam which are typed and super simple.
It kind of isn't. We are talking about using types in type optional languages. We aren't talking about the quality of those type systems or whether or not they are good type systems.
If I was comparing type systems then it'd be relevant to talk about statically typed languages like Elm or Gleam.
This is even worse because you attempt to try to sell why types SOMETIMES make sense. But you aim with this for a language that did not have nor need types to begin with. People don't seem to understand that this is an issue.
The library-situation is really not different from having types everywhere, and some people will do that too.
> catch bugs that you might otherwise miss.
People repeat this a lot. In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types. I don't understand why people keep on repeating this. Repetition does not make it anymore true.
Think in the opposite way: if types would have been necessary to begin with, why would ruby have been successful back in 2006? It was successful without types already. And types were never needed - they came because some people THINK they are needed. This is the biggest problem - the thinking part. They think they are right and all who do not use types, must be wrong and very foolish people.
Have you considered these people in general aren't some outsiders out to attack you or your favorite language?
The people who do end up making and using type checkers are people who have or are actively using these dynamic languages and found out that they CAN help THEM with preventing bugs.
Also, really? 22 years in which not one type-related error happened? Never? I don't want to say I don't believe you, but I really don't.
> In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types
You must be the world's greatest programmer with perfect memory. Every nil pointer exception is a bug a (good) type checker could have caught. You've never had a NameError or NoMethodError in Ruby?
SRE here, I've had multiple outages caused by lack of typing in both Ruby and Python where bad types get passed, something doesn't catch it and either data corruption or constant crashes. Couple cost us big money because it screwed up billing and we were forced to eat the billing cost.
>In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types.
I've definitely ran into that although much less common at places with good test discipline.
I think the related and often conflated problem is errors caught by compilers which you don't hit til runtime in Ruby/Python without good test coverage. For example, referencing an undefined variable
This is perhaps the least believable comment I have seen on HN, ever. It would be more believable for someone using C to say "In about 22 years of writing C code. I have never ran into a memory bug".
This is not as uncommon as you may think.
Avoiding "memory bugs" in C is trivial, but tedious, so too many C programmers fail to use an appropriate programming style. Nonetheless, there are some who have never encountered a "memory bug" in programs written by them.
I agree that a programming language should enforce such features, instead of counting on competent programmers.
> In about 22 years of writing ruby code, I have never ran into a situation once where I would have caught a bug through types.
In 22 years you have never seen `nil` show up in places it wasn't expected? Really?
Your app didn't silently break when you upgrade rails or any other gem?
If ruby was statically typed the typechecker would have caught it.
Personally I like having my TypeScript cake and eating it.
I also truly believe those who design type systems would benefit from taking a look at what kind of code people programming in dynamically-typed languages produce.
I do too, but I feel like TypeScript stands alone as an unusually effective and pleasant to use bolted-on type system. I've not seen any other approach come close. (My sample size is Python, Ruby and Elixir)
I really like PHP's type hints (I think they were the first I used) though it's somewhat limited (can't type hint complex/nested structures last time I checked).
Flow for Javascript was okay but Typescript I've found to be much nicer (last used flow years ago but occasionally I'd encounter bugs in Flow).
Python's is okay but it feels clunky.
you don't think the elixir type system is effective? I've never seen a bolted-on type system get so much acceptance from the hardcore "you can add types into my dead hands" crowd
I find it funnysad that python people coined the phrase duck typing and then ended up designing what they have now. Meanwhile TS manages to embody duck typing far better even though coming from very different background.
Does Python need its own TypeScript moment? Many times, while writing Python and deeply frustrated with its weak(er) type system, I have dreamed of something like TypeScript or VB/VBA from the early 2000s (where the type system was surprisingly strict!). However, there are so many Python libraries written in pure C that it is way harder to create a TypeScript equivalent.
Could you point me towards the kind of code people programming in dynamically-typed languages produce?
I have lived in statically typed languages almost all of my life, and even when I don't, I pretend I do, just without having a typechecker. So I'm very curious about what I'm missing.
Any Rails app. A random one: Redmine. You can look at this file and browse the rest of the repository.
https://github.com/redmine/redmine/blob/master/app/controlle...
I hate TS's tooling with a burning, deep passion. But its type system is actually pretty incredible for what it is.
There are times that I yearn for TS's ability to do duck type reasoning in e.g. Rust (despite that not being feasible) when working with very large data types.
The only reason I gave up resisting and started writing any significant code in Python at all was that it got some kind of type system, and thus became less unpleasant to code with.
"Pleasant to code with" does not describe getting "AttributeError: 'NoneType' object has no attribute 'foo'" 25 levels deep in a stack trace already obfuscated by dynamic object-oriented nonsense. In production, because it's an unusual case and testing missed it. Not that test cases aren't way more work than types anyway.
> To the static typing folks: leave my dynamically typed languages alone
Surely you understand that the push to add types to dynamically-typed languages comes from dynamic-typing folks, not from static-typing folks. People who are deeply into static typing have little incentive to consider e.g. Python, whose support for types is relatively weak, loosely-defined, and rarely-enforced compared to the statically-typed languages that exist today.
Doesn't it come from folks that are forced to work with dynamically-typed languages but can't be arsed to understand them?
understand?
It's really easy to understand that everything is typed as Any/Object/whatever upper bound type your statically-typed language of choice uses.
Desiring something better does not mean a lack of understanding of the status quo.
Can you elaborate?
Let me paraphrase the summary of Eloquent Ruby by Russ Olsen:
> People who are deeply into static typing have little incentive
Except when their boss tells them to use Python, or they rely on one of the Python libraries that their pet language couldn't provide via its powerful type system.
My engineers write better code when we enforce types.
It's easier to do this than to retrain everyone on Go and rewrite all our code.
New stuff is often in Go now, but prototyping quickly in Python and then enforcing types when we have to get it ready for production has been working decently
> my reply is that they are so precisely because we don't have to type type definitions.
My reply is that no, that's not why they're pleasant. If that were the only criteria, we could conclude Python is only as pleasant as Forth.
Totally agree. I hear a lot of rust makes it hard to write incorrect programs. In my experience it makes it hard to write programs in general.
I am totally with you and I am glad I am not the only one who is totally against those type-addictions leaking into languages that did not need them in the first place.
Types in ruby are even worse than in python, because the type systems in use really make ruby turn very ugly. In python it is not as much of a problem with regards to syntax, as python has a stricter syntax (e. g. mandating foo.bar() whereas in ruby you can typically omit the (), among other syntax sugar examples).
We need to keep the type people out of those languages.
Many years ago, on IRC, on #haskell, they said they don't want everyone to use Haskell. Back then I did not understand it. After the type-addicted people emerged out of nowhere, I now begin to understand why Haskell is so snobbish. If you let every idea float, you end up ruining languages - and then those who wanted this, will retire and move away too. Ultimate damage factor caused as outcome here.
So far I have been avoiding Pydantic as a huge-ass dependency. Instead I am relying on standard type annotations, lots of typed dicts and at service/program boundaries use a jsonschema. I like being able to specify the type of most functions, and get some hints, completions and so on, but I don't want to _have to_ specify every darn type. I also don't want to write a class for everything. Typing dicts is good and usually sufficient. If I wanted to write types for everything, then I could also just write Java or Rust or similar.
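A sketch of what that style can look like: a `TypedDict` for the interior of the program, and a validation step at the boundary. To keep the example dependency-free, the boundary check is hand-rolled and only stands in for the jsonschema validation the comment mentions; `Order`, `parse_order` and `total_in_dollars` are hypothetical names.

```python
from typing import TypedDict

class Order(TypedDict):
    order_id: int
    customer: str
    total_cents: int

def parse_order(raw: dict) -> Order:
    """Boundary check; a stand-in for validating against a JSON schema."""
    expected = {"order_id": int, "customer": str, "total_cents": int}
    for key, expected_type in expected.items():
        if not isinstance(raw.get(key), expected_type):
            raise ValueError(f"{key!r} must be {expected_type.__name__}")
    return Order(
        order_id=raw["order_id"],
        customer=raw["customer"],
        total_cents=raw["total_cents"],
    )

def total_in_dollars(order: Order) -> float:
    # Inside the program the dict is plain data, but keys and value
    # types are known to the checker and the editor.
    return order["total_cents"] / 100
```

At runtime an `Order` is an ordinary dict, so nothing new has to be learned or serialized; the annotation only exists for tooling.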
Unfortunately, I think the kingdom of nouns faction has long invaded the Python world and I see more and more companies demanding Pydantic and similar things. They are dragging us all the way to Java land, it seems.
I tend to get triggered when TypeScript is painted as “JS with type hints”. Coming from Python background, TS and Python with type hints are just so different.
With Python I can’t see myself type-annotating everything (or bringing in pydantic anymore for that matter, it is indeed becoming a blight), but with TypeScript my process is turned on its head: I find it natural and easy to start writing with types and have everything fully typed, and I find the fact that it simply won’t compile if anything is off (compared to Python where it’s more like “one of my N type checkers/linters failed, oh well it still runs though”) a useful constraint that gives peace of mind.
I started using types with Python in 2018-ish, and I never looked back.
I am not that good a programmer, so maybe I am wrong, but I just like being able to tell what the data is that's moving through the system. Typed function signatures, a little shift+k here and there, a warning that I am trying to add int and a string. I don't see what's the harm in having that?
At the end of the day, if you don't want to use Python with types -- do not. Unless somebody at work is forcing you, and it feels like putting lipstick on a pig (especially with something like numpy that doesn't easily support types)? Then condolences.
But why does your appreciation of type systems not lead you to something like TypeScript, which is a lot more robust? Or Rust? C#?
I guess my speculation is that not every language is good at everything. Sure you might want a better type system with Rust. But for data science?
In practice, inertia is stopping me.
For personal projects, I don't want to learn Rust just so I can do `def add(a: int, b: int) -> int`.
For work, I don't really get a choice. I work on brownfield projects. We do use TypeScript, thankfully, for all the browser bits. But nobody is going to stop to refactor a 5 year old production code base from Python to Go just for better types. And -- pepega -- definitely not our codebase that's full of data sciency stuff (numpy/opencv/pandas). So we live with a not-as-good-as-it-could-have-been type system.
Compromises, man %) One of the constants in life.
Meh, the amount of effort required to keep up to date with the python ecosystem churn is around the same as learning rust. More so if you are starting from scratch.
I quit python after realizing the amount of effort it required to just implement the tooling for a project… when all of that comes included with rust. I have spent maybe an hour in the last year thinking about tooling. Glorious.
But yeah, I feel for you. It is an impossible sell when the payoff is impossible to understand without a time machine and the only thing known about the cost is that it’s high. But for new people and projects, I can’t imagine starting with python in 2026.
Which Python tooling? I know that uv is replacing pip but all of my customers' projects still use pip. One of them installed python with asdf. I can't think about any other tool we are using except Claude, but I don't think that's the kind of tool we are writing about. We deploy with a custom bash script resembling Ruby's Capistrano. Those projects are web apps with server generated HTML.
What specifically makes them more pleasant? (not a rhetorical question, I want to know what's important to you)
Historically (80s/90s) I started using Perl because I didn't have to write all those malloc and free I spent years writing in C and I could perform string operations much more easily. Then in the mid 90s because of that wonderful CGI.pm Perl module. But the plus of all those languages, and Java, was managed memory. Then in the mid 2000 I learned Rails, and after Rails I learned Ruby. It was like Perl but much easier to understand and again no types to type. Basically what I did in Java but in a fraction of the time and in a fraction of the lines of code. Then a customer asked me to work with Python on a Django app so I learned Python. It looks like a Ruby designed by Klingons but it's OKish.
All those bugs I constantly read about, they don't happen very often and are a good tradeoff. Maybe Rails and Django are shielding me from some bug scenarios.
I'm not trying to be cute here, but it seems like you have mostly been using scripting languages without static typing. How do you know what it is like to develop in languages with strong typing that produce binaries and that these are unpleasant to use?
And dabbling doesn't really count. It takes time to actually learn a language. Much longer than most people are willing to admit.
> because we don't have to type type definitions
Typing "type definitions" makes you type less, not more, because you type the definition only once, instead of writing many tests wherever values of that type are used.
In a decent programming language, one would frequently avoid the need to declare the type of a variable, whenever the type can be deduced from the value used to initialize the variable.
Having types in a language has two purposes. One is to enable the compiler to check, at compile time or at run time, that all subsequent uses of an identifier after its first occurrence are consistent.
I cannot imagine which are the benefits for the programmers who are against this rule, i.e. who want to reuse the same identifier for multiple purposes in the same scope (N.B. reusing an identifier in the same scope has nothing to do with data types that are disjoint unions or virtual types, which can be used in any type-enforcing language, or with reusing the same identifier in different scopes).
The second purpose of data types is that when the type of a variable or parameter is known at compile-time, that allows more efficient implementations, which are especially important for aggregated data, e.g. arrays.
Again, I also do not understand why anyone would want to have inefficient data representations, to avoid the need of data types.
There has been some argument that when a language does not use data types you might avoid having to rewrite some library functions if you want to change the parameter types at invocation. However, this is a problem that has had better solutions for more than half a century: various means of writing generic functions that can be specialized at compile time, or disjoint union types or virtual types, which let the same function serve many different data types while still ensuring that other data types, whose use would be erroneous, are rejected.
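For reference, the two mechanisms named above look roughly like this in annotated Python. This is only an analogue: Python does not specialize anything at compile time, and the rejection of wrong types happens in an external checker, not in the interpreter. The function names are invented for the example.

```python
from typing import Sequence, TypeVar, Union

T = TypeVar("T")

def first_or(items: Sequence[T], default: T) -> T:
    """Generic: one definition for any element type; result type follows the input."""
    return items[0] if items else default

def byte_length(value: Union[str, bytes]) -> int:
    """Union: accepts exactly these two types; a checker is expected to flag others."""
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    return len(value)
```

Neither function had to be rewritten per type, and a call such as `byte_length(42)` is the kind of erroneous use a checker is expected to report before the code runs.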
A language without data types saves writing effort only when the programmer omits the run-time checks for correct values, which would be needed to avoid bugs when such checks are not done automatically by the compiler.
I agree that several very popular programming languages with type checking, including C and C++, are very poor examples about how a type system should be implemented, because they require the writing of a great amount of superfluous boilerplate that is completely unnecessary (e.g. writing headers with function declarations instead of extracting automatically the interfaces of a package a.k.a. module or writing explicit type names in a lot of places where they can be deduced automatically from the context).
Such languages are strawmen in a discussion about whether a language must enforce type checking or not.
God, I hope you never touch a production code base with real users. Strong, static typing has won. It is a must for serious software development. The time and money saved in stupid errors that are caught and avoided before the software even runs is enormous. For internal tools and one-off scripts, sure, go nuts with dynamic stuff, but there is no reason to use a dynamic language for code paths that actually make the business money. If you cannot specify precisely the type of every piece of data you touch and all the operations that can be performed on it, you're not doing software engineering, you're spitballing. And if you want to spitball, LLMs are here and they're great. You can type in a very loose description indeed and get properly typed Rust out the other end.
Nothing can beat the Python numpy/ML ecosystem. There's a lot of value in just being able to run a Python script as well without any compilation step. The typing isn't perfect right now but it's usable.
For vectorizable problems there also won't be huge performance gains from switching to a compiled language because all the hard stuff is already done in highly optimized native code. The only time it really makes a difference is if you have to write a custom for loop or traversal.
Running more type checkers isn't really about strictness. The main benefit to library maintainers is to make sure that their APIs are compatible with whatever tools their users run.
This wouldn't really be an issue for most other languages, but Python's typing ecosystem is uniquely fragmented, with only partial standardization between several popular tools.
Hmm... that doesn't answer the question?
> Hmm... that doesn't answer the question?
GP's point is obvious: performance is immaterial to the discussion. Static code analysis is about preventing bugs. Therefore OP fails to make any sort of point, as it's a straw man argument.
No? One has nothing to do with the other.
I think those of us who work in compiled languages are just snooty about them.
I'm a compiled language snoot, and happen to be working over the past couple days in typed Python for the first time. It's kind of nice. I like it. It's a huge improvement for me over ordinary Python/Ruby/Javascript; it materially improves the experience of working in the language.
> happen to be working over the past couple days in typed Python for the first time. It's kind of nice. I like it.
I like me a good type system and have always hated about everything about types in Python. What do you find nice and like about it?
(My experience with Python: all the type checkers are broken, there are false positives and false negatives everywhere. The LSPs are likewise broken, I have not found one that knew the types at least somewhat reliably...)
Lack of typing is my biggest problem with Python, Ruby, and ES6 Javascript; I have to write everything twice, once to do the stuff I want, and once to double check that it's actually doing stuff, because a single typo can blow the program up despite it parsing fine.
Python typing is easy to dip in and out of. It handles None nicely; not as nicely as a true Optional, but enough for daily driving. The annotations are readable and simple. What more could I ask for, without asking for an entirely different language? Python typing catches a lot of bugs I'd otherwise have to tediously unit-test for.
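A minimal sketch of the None handling being described, with hypothetical names (`User`, `USERS`, `find_user`). The comment about what a checker reports is an expectation about tools such as mypy or pyright, not output copied from one.

```python
from typing import Optional

class User:
    def __init__(self, name: str) -> None:
        self.name = name

USERS = {1: User("ada")}

def find_user(user_id: int) -> Optional[User]:
    # dict.get returns None for unknown ids, and the annotation says so.
    return USERS.get(user_id)

def greeting(user_id: int) -> str:
    user = find_user(user_id)
    # Without this check, a checker is expected to report that `user`
    # may be None before `.name` is accessed.
    if user is None:
        return "hello, stranger"
    return f"hello, {user.name}"
```

The `is None` test narrows the type for the rest of the function, so no wrapper type or unwrapping call is needed.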
The only thing I don't like about it is that it feels like it relies a lot on importing stuff from the swamp of the Python stdlib.
I believe you are right to point out how you feel about some of these features of a particular language, and to let that guide your decision in how and whether you use them.
It does not, however, say anything about the actual productivity gain or loss of using types in a language like Python which does not require them — that should be the ultimate objective measure of whether they make sense or not.
With most languages, I get annoyed if I need to create separate types for every variant of a basic type (eg. let's have a firstNameString, familyNameString, CountryCodeString, CountryNameString... when is it too much?) - I do not think there is any way someone can prove going this deep improves maintainability long term. Eg. imagine you introduced validation of CountryCodeString based on ISO-3166, and there has been a change to ISO-3166 which happen every couple of years — how do you start supporting new codes? How do you deprecate and remove old ones? All of those are not helped with a type being very strict, you still have the persistent data to worry about, code actually supporting any of those, etc — the basic type check is trivial with a couple of small unit tests in comparison. You also quickly venture into a territory of complex types with complex interaction rules (this subvalue can be one of A if another one is X; but B if another is Y).
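For context, this is roughly what a "CountryCodeString" looks like in Python with `typing.NewType`. The sketch also shows the limitation raised above: the type only marks a string as already validated, while the list of valid codes is runtime data that still has to be maintained. `KNOWN_CODES` is an assumed three-entry sample, not a real ISO 3166 table, and the function names are hypothetical.

```python
from typing import NewType

CountryCode = NewType("CountryCode", str)

# Assumed sample for illustration; a real list would come from ISO 3166 data.
KNOWN_CODES = {"DE", "FR", "JP"}

def parse_country_code(raw: str) -> CountryCode:
    """The only place that turns a plain str into a CountryCode."""
    code = raw.strip().upper()
    if code not in KNOWN_CODES:
        raise ValueError(f"unknown country code: {raw!r}")
    return CountryCode(code)

def shipping_label(code: CountryCode) -> str:
    # A checker is expected to flag shipping_label("de"), since a plain
    # str is not a CountryCode. At runtime the value is an ordinary str.
    return f"Ship to {code}"
```

Adding or retiring a code means editing `KNOWN_CODES` and dealing with stored data; the type annotation does not change and does not help with either.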
So for me covering the basic invariants with a unit test is not much more effort and — especially with Python — does not stop one from refactoring effectively and building stable, long-running systems.
Really, complex relationships in data are complex, and encoding it in a declarative way using a complex schema does not guarantee correctness (see Pydantic); if you want just very basic data conformance (type) checking, it's mostly a question of ergonomics.
Basically, you need to keep to some principles of code structure and architecture, but they are a simple set of principles — perhaps the fact that most Python projects do not abide by these should be a knock against Python? I only attribute it to the approachability of Python, but I am open to being wrong and this being the latent forced idiomatic use that projects always evolve into?
What I like about Python types is that they accommodate both styles of programming. I happen to be completely sold on at least some baseline level of type safety (I don't need my type system to be a complete modeling toolkit for everything in my problem domain, just enough for basic sanity), but if you're an old-school Python type, you and I can work on the same codebase without types ruining your life.
The modern approach seems to be to require full typing on items seen from outside the function or object. Within functions, have the compiler infer as much as it can. Newer languages (Go, Rust) seem to be converging on this approach.
Function parameters need type info as guidance for people and LLMs calling the function. Even though cross-function type inference is technically possible, it's too confusing. Long-distance inference failures tend to generate poor messages.
Within a function, if you have typed parameters, the type inference engine has a local starting point and a good chance of success on most local variables.
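A small Python sketch of that style, assuming a checker such as mypy or pyright is in use (the function name is made up): only the signature is annotated, and the locals are left for the checker to infer.

```python
def mean_word_length(words: list[str]) -> float:
    # The signature is the typed boundary that callers and tools see.
    lengths = [len(word) for word in words]  # a checker can infer list[int]
    total = sum(lengths)                     # a checker can infer int
    if not lengths:
        return 0.0
    return total / len(lengths)
```

No local variable needs an annotation because the typed parameter gives inference a local starting point.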
Unchecked advisory typing in Python was a terrible idea. All the work of writing type declarations with none of the benefits.
> If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
You can use type-checking to get better performance already, without leaving Python. See https://blog.glyph.im/2022/04/you-should-compile-your-python...
What statically typed language would you suggest for machine learning and large data pipelines? I don't love Python, but it has by far the largest ecosystem.
You could take a stab at Julia.
It’s still dynamic in nature. But you can tune how much staticity you want. The spectrum goes from Python to C in terms of staticity. And with tools like JETLS.jl maturing you get a lot of the benefits of static analysis.
The data pipeline ecosystem is starting to rival that of R and Python. The fact that you can just use Julia functions while keeping the performance allows you to avoid those weird vectorization gymnastics. The ML ecosystem is also in a great state. JuMP.jl, Turing.jl, the whole SciML ecosystem, autodiffing and GPU computing are all close to best in class in terms of quality. The NN side of ML is a bit weaker, but just for lack of developer time investment into that side of the ecosystem.
I use Julia! I like it a lot, and add type parameters to all application code. But JET.jl does not feel anywhere close to the assurances I can get from a statically typed language (yet)
It is brutal. I can say from first-hand experience: the APIs for Pandas and NumPy are awful and insanely dynamic. As a result, it is frequently difficult to know what is allowed when calling a method. It is exhausting. Since many methods are "hyper-dynamic", many of the error messages are unhelpful.
Well, that's the curse of machine learning: since everyone uses Python you have to deal with Python. Even though Python isn't very nice when things start to get serious and you don't want to spend your time fiddling with noise just to make something work at scale.
I wish the ML/AI/LLM crowd would see that it is in their interest to get better developer ergonomics at scale. (I don't want to have to turn to C++)
Anything performance sensitive ends up being an extension in compiled code anyway
Python is mostly just glue
And now with LLMs, writing glue in Rust is cheap.
The ML/AI ecosystem is a minefield, and pure Rust rewrites (Candle, Burn, ...) are still immature and incomplete. But I'm pretty sure we're eventually going to see the same uptake that's already happening in the data processing world.
The performance is not the (only) issue. The issue is the death by a thousand cuts involved in distributing Python programs without a two page set of instructions that have to be followed to make it work. It is rarely "just works" unless you can make a lot of assumptions about the environment it runs in. It's why I generally steer away from any application written in Python. It is going to be painful.
However my experience might be a bit different since I actually have to deal with Python at scale and in a fairly dynamic environment.
I need to run hundreds of Python programs, written by dozens of programmers over many years, that speak to custom hardware and run on a remote site, in a production environment that has to work, with new versions of things coming in all the time. Some of these Python programs not only link with C libraries but also run external binaries, because developers didn't have time to integrate them as libraries: it takes forever to make that work on all the different os/arch combinations (easier to just run the C code in subprocesses).
This runs on three different CPU architectures (we're trying to eliminate one of them), two different operating systems and a pretty wide mix of hardware and system configurations, much of the hardware being custom-built. I need to insulate Python from all of it, because Python has a lot of exposed surface to the OS compared to a statically linked binary. (Roughly 100x the surface of statically linked binaries that don't link with libc -- which is evident from the insanely bloated OCI images that result from packaging what you need to run.)
Modern compiled languages with sorted toolchains make it pretty easy to produce "production grade" os/arch-specific binaries that can survive almost everywhere. You build a statically linked binary for each architecture to overcome the challenges of varied Linux runtime environments (see Linus T's frustrations with Linux and software distribution - it's not like it is easy to begin with). Go and Rust do this well.
So you end up having to containerize everything in ephemeral containers to lock down the execution environment while retaining some speed. But of course it isn't that simple, because if you depend on access to weird hardware and/or you run on custom built machines you have to detect this and ensure the application inside the container gets access to the things it needs from the container. So you have to fix that.
In a way that is almost completely invisible to the developer.
All of this has to be understandable and _reduce_ complexity for developers and operators so at the very least you don't follow the Python philosophy of "just throw another layer of complexity on it and make the instructions another page longer".
40-50kLOC later (in Go and Python, I have lost count) of code to try to make the problems go away, and I have something that is on the verge of actually being usable in a production environment for taming wayward Python code.
The easiest fix? If people could stop using Python because they don't want to learn a language that can produce something that is easier to distribute to users.
Believe me, I have spent months now trying to make Python work properly in a challenging environment. The only way this "worked" before was by just lowering standards to where the definition of "works" is flexible enough to count daily dumpster fires as "nominal". And of course people don't care. Python fosters a "it works for me" mentality where people don't know and don't care what it is like to be on the receiving end.
90% of problems I have because of Python would just disappear if people used languages that can produce robust binaries with limited exposure to system peculiarities. But that kind of requires people to understand why it is a problem in the first place. And people generally don't bother to know.
> Python fosters a "it works for me" mentality where people don't know and don't care what it is like to be on the receiving end.
In my experience, that's not limited to Python.
> And people generally don't bother to know.
Yes, that's the real problem and transcends ecosystem and toolchain
No, you are right about it not being limited to Python. But for python the common courtesies I am used to right out of the box tend to require extra effort on part of the programmer. And «extra» doesn’t usually happen.
Even C, with its ancient, haphazard, ugly, fragile, awkward toolchain, can often trivially produce binaries that will just work with very little effort.
I have spent decades of my life writing tooling, libraries and infrastructure. And no matter where you go, developers only do the bare minimum if they can get away with it. That doesn’t mean they are bad people. It means tools and infrastructure have to be designed with acute awareness of reality.
Python has been around for 35 years. And it still hasn’t evolved things we should take for granted today despite its increase in adoption. To me that’s pretty fucking awful project governance.
You could try Cython and Lush. An ML dialect for ML would have been nice, but doesn't exist.
Cython is a niche language for writing perf-critical bits inside your Python codebase. It's like C for people who don't want to learn C. At least that's how I treated it, when I had to write some stuff to make some numpy ops faster.
Cython is not in any real sense a replacement for a modern data/ml stack.
True, but it's a really nice way to get the benefit of type checking in Python.
Just like you, I had started using Cython for performance but then realized that I could get rid of the bulk of type errors if I used it for type checking.
The other benefit is that the Python library ecosystem stays available.
As funny as it would be, ML isn't really a great fit for ML, I don't think.
That's true for current ML offerings.
However, I think an ML designed for machine learning would be nice, especially if the type system is extended to multidimensional arrays shapes. Pattern matching on array shapes would be rather nice. Ocaml style interactive mode for exploration and compiling for performance would be nice too.
ML is basically the one use case for Python anymore.
And that even shrinks by the day
How so?
LLMs are leveling the developer experience and productivity in a way that makes Python's strengths almost irrelevant, while it's still suffering from bad tooling (even with uv and friends) and poor performance.
AI/ML: interfacing with C++ libraries directly (or in Rust) is now a real option. For everything else, even 5 years ago I wouldn't have used Python, now there are even fewer reasons to do so. As far as I'm concerned the remaining use cases are notebooks and one-shot scripts.
With writing code in english now, why have it use a slow weak language?
ML still has a depth of libraries that can't be replicated easily but ML work is decreasing by the day with LLMs.
Because the English to code translation step is fallible
Which is precisely a reason for not using Python, despite LLMs being good at it.
Why is that?
> With writing code in english now, why have it use a slow weak language?
Because the feedback loop of writing a few lines of Python inside a Jupyter cell is much shorter than with your current favorite AI tool. It costs less too.
Implementing whole features is? What are we talking about?
> What statically typed language would you suggest for machine learning and large data pipelines? I don't love Python, but it has by far the largest ecosystem.
Pay no attention to OP. It's nonsensical to even suggest you should migrate away from a whole tech stack just because you want to run static code analysis, especially when the argument is based on having too many static analysis tools to choose from. Utter nonsense.
Definitely, and statically typed languages have REPLs as well.
I have known Python since 1.6, and it has always been mostly for OS scripting.
I do see a value in it being the new BASIC, and like BASIC, building full business applications with it comes with gotchas.
Additionally, it appears the C libraries as "Python" libraries culture will never go away.
Yes, but unfortunately Python has invaded everything, and one must adapt.
Python is going to be preinstalled on almost any machine I use, with a reasonable assortment of libraries. And even if they're not preinstalled, the libraries I want are likely to exist. They'll have unstable APIs and weird quirks, and I'll have to take my choice of bad packaging systems to install them, and everything will just generally be a pain, but they'll exist and largely work. That's not true for any language I actually want to code in. I mean, I'm not going to deny that Python is better than shell scripts or (usually) C.
It's not like it's a pleasant language to code in, especially if you actually want to use the type support, which is weird and irregular and keeps changing and has to work around fundamental design problems at the core of the language.
Strict type checking is an incredibly useful tool for cases when you really want to make sure your code is correct and behaving as expected (one of many tools).
There are lots of people who like Python and want to use it for things where incorrect code has serious consequences. Type checking is helpful in these contexts.
Type checking remains optional for the masses and is not practical in many cases. Still, pushing away people who want to use all available tools for writing correct python only hurts the community.
That is why I'm using C# and Rust more now than Python. You get far better RoI on types, and they are so much faster and can use all cores so much more easily.
You can even add F# into the loop, with REPL and type inference much better than those two.
F# is a really nice language, I just wish I could get paid for writing it.
> If you are going to be super-strict with type-checking, wouldn’t it be best to switch to a statically typed language and get the performance gains as well?
I don't understand your question. The whole point of static code analysis is preventing bugs. Don't you like Python code to not have bugs that are easily caught with static code analysis, or is preventing bugs a foreign idea that is better left to other languages?
I don't understand your question. Are you saying static code analysis is impossible without type declarations? None of it?
Shhh! You're not supposed to say that part out loud.
That depends on your needs and goals. Super strict type checking might not be the most important feature.
It's like saying "if you want a car that is a bit sporty, wouldn't it be best to buy a Lamborghini?"
Yes, but using five type checkers on Python is the equivalent of buying a Nissan Leaf and trying to turn it into a Lamborghini by adding an internal combustion engine which is only supposed to produce a nice roar but no thrust.
Personally, because I'm making a Blender add-on that only uses Python, and it's at the complexity where having types catches a ton of bugs easily.
Often, when I code in Python, it's because there are some libraries that aren't available in whatever other language would have been my first pick. Then, typing and type-checking are useful tools to stave off the codebase turning into the unruly mess that all Python projects eventually become.
Don't projects in all languages turn into an unruly mess by default?
It requires special care to not let long-running projects evolve into it.
Python is only special in that it is extremely productive and allows lots of easy evolutions of a project with not a care in the world, so the timeline is probably shorter on getting to the "unruly mess" if no special care is put in to make it survive many evolutions.
Perhaps a bit special too in that it looks welcoming to the masses who have no idea how software systems evolve and thus do not even know there are some special patterns to introduce for code to survive many changes in the future. I am undecided if this is a pro or a con of the language itself.
Yes. If you have a choice.
For people who don't have a choice, type checked Python is better than nothing.
The goal is to be strict, with explicit exceptions.
You don't use a static language because you want the exceptions, but the type checking can still statically validate most of your code.
Yeah, I can't say I really get the appeal of gradual typing. It's commented/documented code at best and outright lies at worst. Sure, you can build tooling around it and improve your DX a bit but isn't it always a house of cards?
If you add one of these type checkers into your CI or a pre-commit hook, it provides the same guarantees you get from a compiler along with the same tooling benefits. It gives you the option of using the structure when you need it, but not being forced to use it when you want to take advantage of some of the more dynamic features of the language.
But none of the runtime benefits of having static types in the compiler, since the runtime still can't trust the types. Still, half a loaf is better than none.
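A minimal illustration of that gap (the function is made up): the annotations are stored as metadata, but the interpreter does not act on them, so a call that a checker would reject still runs.

```python
def double(x: int) -> int:
    return x * 2


# A static checker would flag this call (str passed where int is declared).
# The interpreter does not look at the annotation and simply runs it.
result = double("ab")

# The annotations are kept only as metadata on the function object.
declared = double.__annotations__
```

This is why the checker has to run in CI or a pre-commit hook to mean anything: nothing at run time enforces the declared types.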
> commented/documented code at best
Machine-checked documentation is always valuable, IMO
Python has other, bigger problems that make it a constant headache. One of them is the dismissive attitude towards any and all of the problems that come from versioning, dependencies and quirks that make it challenging to have robustness.
Criticisms are typically dismissed by suggesting heaping yet another "solution" onto the growing pile of "solutions" that you have to drag around with you. That people have to learn. That you have to install tooling for. That has to be vetted. That has to become part of the toolbox to get even seemingly simple things done. This attitude is a big part of the reason that I strongly advise people against using Python in production. On top of all the problems presented in a real-world setting.
Almost all of the time, people who are fond of Python are more interested in defending Python, disparaging me, downvoting me etc. than in listening to why I make that recommendation.
(I get it. People like Python. What I think of Python as a language is irrelevant. In fact I don't have that much against it. But I do have a lot against it in a setting where you need reliability and repeatability)
I have spent the last month of my life building a system that can run Python tooling reliably in a business critical application. I knew this was going to be a pretty big job when I started, but for every problem I solve, a bunch of new problems arise. I am starting to see light at the end of the tunnel but it hasn't exactly been smooth sailing. I'm almost there for a first version, but there are a bunch of problems still to solve. Mostly because I care about developer ergonomics and that things should "just work". One important goal is that my solution shouldn't impose any significant cognitive burden on people who use it. That's really hard.
(I don't think the solution will be open source since my contract wouldn't allow for it. But I'll make the case at some point for why it should be open sourced)
And yes. There are statically typed languages available today that have decent tooling that provides superior developer ergonomics. I can understand that people don't want to learn new languages, but if you have the capacity to do so I would recommend trying to move on if the code you write has to run outside your own workstation. If an old fart like me can learn and adopt new languages, so can you.
Python the language is pretty nice. It has its warts, but I make my living in PHP which is practically made of warts. But the python ecosystem still seems to be trying to figure out this whole package management and project setup thing. In most languages I can do some form of `$blub install` where $blub is the language's official package manager or some close equivalent. It's just python that always screams at me that I have to set up and "enter" a virtualenv. I get what venv is for, but it's still a weird hack of hardlinks and relative paths that no other language seems to need, and a clumsy two-step dance of a UX that hasn't improved in like 20 years.
I am not sure you get what virtualenvs are: Python is never screaming at you to set up a virtualenv, it must be a particular package recommending use of virtualenv for easy set up without interfering with the rest of your system.
Virtualenv allows you to seamlessly run multiple Python ecosystems simultaneously, even within the same project directory. It's basically a primitive containerisation mechanism that predates any actual containerisation systems on Linux.
You do not have to use it, but then you can easily slip into a sort of "DLL hell" (multiple incompatible library versions installed system-wide) with multiple projects — or need to bundle all dependencies within your project directly. None of this is specific to Python, really — any shared library system has the same challenge. How many other systems are there in active use making it as easy to use multiple incompatible versions of shared libraries per project or within the same project?
When in doubt, you can always retreat to the basics in Python world: put packages you need in a path of your choice, and point PYTHONPATH (sys.path) at it.
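A self-contained sketch of that fallback. The directory and the module name `vendored_example` are invented for the demo; in practice the directory would hold packages you copied or installed there yourself, and setting `PYTHONPATH` before starting the interpreter has the same effect as the `sys.path` line.

```python
import importlib
import pathlib
import sys
import tempfile

# Stand-in for "a path of your choice" holding the packages you need.
vendor_dir = pathlib.Path(tempfile.mkdtemp())
(vendor_dir / "vendored_example.py").write_text("ANSWER = 42\n")

# Equivalent to listing the directory in PYTHONPATH before startup.
sys.path.insert(0, str(vendor_dir))
importlib.invalidate_caches()

vendored = importlib.import_module("vendored_example")
```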
I thought I'd given sufficient clues that I actually do in fact know what virtualenv is. I was hip-deep into python when it was introduced, and thought it was a clever hack, but it's relying on python being compiled to find its libraries in a path relative to the python binary, so rather than use something sensible like a launcher that sets PYTHONPATH to a local dir like every other language does, it hardlinks the python install into the local dir in order to pretend everything is a system-wide global install. I always figured that hack was a temporary workaround, but it's going on a couple decades now.
As for the screaming, I'm talking about pip, which has the unique property of just refusing to work by default unless you use --user (what most other package managers now call "global") or are in one of these pretend-global environments. Ultimately it's just a paper cut, not a show-stopper, but it points to not just a failure to address the pain points of the package ecosystem, but a seeming refusal to do so.
Usually running pip as root overcame that problem — but it did cause all the other problems virtualenvs were introduced to work around (so I am not suggesting this as the solution).
You are right that there are many problems with packaging in Python, and yet I feel like virtualenvs are the smallest of those.
I believe we also need to compare tech this old to tech from the same era: obviously newer ecosystems had the benefit of hindsight, but how does managing dependencies compare between Perl and Python, for instance?
The biggest problem with Python packaging is, IMO at least, that it attracts so many evolutions and proposals that it is hard to stay on top of them unless you make Python packaging your core interest.
For managing dependencies in Perl, it was originally a similar story to Python: everything was system-installed, but many people would install things to their home dir and set PERL5LIB in their .bashrc. The cpan client was smart enough to detect and use the home install when writing its initial config, so you could call it a day. Later there was local::lib which fiddled @INC for the use of a project-local directory, and cpanminus defaulted to using it, and then Carton came around which is more or less a clone of Bundler from Ruby, also using local::lib under the covers.
What are the problems you have with tooling? Imo it's no worse than most other languages besides a very small handful of recent ones (rust, go) where everything is included
The easy approach is usually just throw it in an OCI container
There's not much concrete to go on here besides "I don't like the ergonomics"
Throwing it in an OCI container is not the "easy" approach. It is the beginning of opening a can of worms. I know, because I've been developing tooling to do that at scale for the last few months and behind every layer of worms there are uglier and bigger worms that you need to deal with.
And yeah, I have written a lot of code to insulate Python in containers while allowing meaningful access to hardware and services. While at the same time not heaping more complexity and cognitive surface at the developer. Including writing my own container software to actually understand what's involved at a more detailed level so I know what I'm doing when trying to make this work with existing container software. (No, I don't run any of my container managers in production since I don't want to maintain it -- but this also means a bit more complexity in using existing ones)
It may be "easy" in trivial cases, but it is very far from easy if you want to make something that can cater to a wide range of scenarios.
Do you have examples of issues? OCI is a tarball of dependencies that doesn't fight with the OS userland.
Why do you need to write code to insulate Python in containers?
At the simplest level, you can add the flags to the container runtime (network host, host ipc, host process namespace) to turn off all the namespacing besides filesystem and the Python container runs just like a non containerized process.
An extreme example
https://github.com/home-assistant/docker-base (Debian, Ubuntu, Alpine base images with Python for arm64 and amd64)
& https://github.com/home-assistant/core/blob/dev/Dockerfile (Python app built on those with >1000 deps)
And even there most of the custom code is just running a ton of combinations of inputs against docker build. The OCI container gets rid of the "wide range of scenarios" for you by standardizing the runtime environment.
Show us the language and we'll all switch tomorrow.
ML
Data tooling
Talent pool
Libraries for customers
Brownfield codebases
Academics
I can keep going…
Seriously, just switch to Go or something
The fact that this article seems to honestly recommend people run 5 different type checkers on library test suites really reflects the tacked-on feeling of Python typing.
I am not sure it is recommending more than it is commenting on the current state of developing public-facing APIs in Python.
The downstream users that import the package either have to ignore checking its exported types altogether, manually stub it, or have a subpar development experience to varying degrees.
This is something I saw the other day with some package that provided comprehensive stubs for an untyped library. The .pyi file was littered with comments about quirks from the numerous type checkers (five now).
It's ridiculous. They should have made it an explicit part of the language. The interpreter knows about types already, it's crazy that they couldn't just let the user make the types explicit rather than implicit, and have the interpreter enforce that.
The interpreter knows types at runtime, not at parse/compile time. The interpreter already does a lot of dynamic type checking. It has a much stricter type system than e.g. JavaScript; JavaScript will pretty much always convert operands to produce some result (even if it's just NaN or the string "[object Object]"), while Python will often just give you a type error.
The interpreter doesn't know about static types.
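A tiny example of the difference (function name made up): the definition is accepted without complaint, and the type error only appears when the offending operation actually executes.

```python
def add_suffix(count, suffix):
    # Nothing is checked when this function is defined.
    return count + suffix


error_message = None
try:
    add_suffix(1, "1")  # int + str is rejected, but only at call time
except TypeError as exc:
    error_message = str(exc)
```

Python refuses to coerce here, which is the strict dynamic checking described above, but it has no way to report the problem before the line runs.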
I agree that they should've made typing more a proper part of the language and not left it in this weird half-defined state of "standard syntax and some standard typing imports but undefined semantics". But it's not just a matter of enforcing existing types.
No, it reflects the nature of misunderstanding Python by people who think their system is better, have no idea how Python in production actually works, and just publish things like the article to make themselves feel better.
Typing is not a huge issue, period. In Python, if you pass a wrong type to something, program just throws exceptions. Exceptions are not the end of the world like people make it seem. Functionally, finding errors during the process of taking code and compiling it with type checking is no different than taking code and just running it against a set of tests, which every production code has (or should have)
The only way typing ever saves you from it is by being absolutely strict - every type defined has a finite range of values, and every operation has bounded domain and range. I.e. if you have a string field, it's not enough that it's a string, you also must define the total number of characters that string can have, and values for each character, along with more complex rules on sequences of characters.
If you have this system (something like Coq comes close), then if your program compiles, it's by definition correct. But even the strongest proponents of typing don't really want to do this, because they realize how long it would take to write code.
The simple truth is that Python is easy and flexible enough to work in that you don't even need type checking. An LLM can effectively function as a type checker for you if you care enough. For any errors that you encounter due to lack of typing, it's ultimately way faster to fix them in Python than it is to spend time writing in a strongly typed language.
> "In Python, any method __eq__ is expected to return bool, and if it doesn't, then we need to explicitly tell type-checkers to ignore the type error. This function in Polars can also return different types depending on the inputs, thus requiring overloads."
Why would you ever want a == b to not return a bool??
EDIT: Yes, I understand that you can do element-wise equality checks on numpy arrays now
There are examples like ORM query builders (something like `User.id == user_id` should not return a boolean, but rather some inspectable query part), multi-value comparisons (e.g. numpy arrays and views which could also be used as masks for indexing)
In general, when you get your hands on operator overloading you get a bunch of various quirky applications for each. Some dunder methods have strict runtime-level rules (e.g. __hash__ or __len__), some don't
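A stripped-down sketch of the query-builder case. `Column` and `Comparison` are hypothetical classes written for this example, not SQLAlchemy's or Polars' actual API; the point is only that `==` hands back an inspectable object instead of a bool.

```python
class Comparison:
    """Inspectable record of a comparison, to be rendered later."""

    def __init__(self, column, operator, value):
        self.column = column
        self.operator = operator
        self.value = value

    def to_sql(self):
        # Returns a parameterised fragment plus its bound values.
        return f"{self.column.name} {self.operator} ?", [self.value]


class Column:
    def __init__(self, name):
        self.name = name

    # Returning a non-bool conflicts with object.__eq__'s declared type,
    # hence the explicit ignore for type checkers.
    def __eq__(self, other) -> Comparison:  # type: ignore[override]
        return Comparison(self, "=", other)

    # Defining __eq__ would otherwise make instances unhashable.
    __hash__ = object.__hash__


clause = Column("id") == 42
```

`clause.to_sql()` gives the fragment and its parameters, which is what lets a library build the query first and evaluate it somewhere else.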
That's one of the things I truly didn't get from my (very limited) experience with SQLAlchemy. Why not just have a method like orm.eq(User.id, user_id)? Much more readable.
Elementwise equality! Given two dataframe columns or ndarrays, users often expect `==` to give out a column or ndarray of bools (like `+`, `*`, `&`, and just about every other binary operator).
What I love about operator overloading is that now you can't use operators without looking at their definition, in which case.. you could have done numpy.equals(a, b) anyway.
Does a == b return true if all elements are the same? Does it return an array of booleans? It's anyone's guess!
What's fun is that it could return an array of false if all elements are different, and then that value is truthy.
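The truthiness trap is easy to reproduce with a toy class (hypothetical, standard library only) whose `==` returns a plain list of per-element results:

```python
class Vector:
    def __init__(self, *items):
        self.items = list(items)

    def __eq__(self, other):
        # Elementwise comparison: one bool per position.
        return [a == b for a, b in zip(self.items, other.items)]


mask = Vector(1, 2, 3) == Vector(4, 5, 6)

# Every element differs, yet the result is a non-empty list and so truthy.
looks_equal = bool(mask)

# The reduction has to be spelled out to get a meaningful answer.
really_equal = all(mask)
```

NumPy avoids this particular trap by raising `ValueError` when an array with more than one element is truth-tested, which forces an explicit `.any()` or `.all()`.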
The function call approach can be a lot less readable.
Consider using Shamir secret sharing to share a secret, D, among several people with two people required to recover the secret. D is a positive integer, such as a randomly generated 128 bit AES key you are using to encrypt your launch codes or credit card database.
For anyone not familiar with Shamir secret sharing what you do is pick a prime number, p, that is larger than D and another random positive integer A, that is less than p. Then give each person a pair of numbers, (i, (Ai + D) % p), where each person gets a different i (which should be a positive integer less than p...it is OK to simply use 1, 2, 3, ...). Let's let Di = (Ai + D) % p.
(This is for the case where you want any two people to be able to launch your missiles or decrypt your database. If you wanted 3 required instead of giving out (i, (Ai + D) % p) you would give out (i, (Bi^2 + Ai + D) % p) where B is a randomly chosen positive integer less than p. For 4 required add on a Ci^3 term, and so on).
Given (i, Di) and (j, Dj) and p it is possible to recover A and D.
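For reference, a minimal Python sketch of the two-person scheme just described, using the built-in integers and their operators. The function names and the choice of the Mersenne prime 2^521 - 1 for p are assumptions made for this example, not part of the description above; `pow(x, -1, p)` for the modular inverse needs Python 3.8 or later.

```python
import secrets


def make_shares(secret, p, count):
    """Return the pairs (i, (A*i + D) % p) for i = 1..count."""
    a = secrets.randbelow(p - 1) + 1  # random A with 0 < A < p
    return [(i, (a * i + secret) % p) for i in range(1, count + 1)]


def recover(share_i, share_j, p):
    """Recover (A, D) from two distinct shares."""
    i, d_i = share_i
    j, d_j = share_j
    # Di - Dj = A*(i - j) (mod p), so divide by (i - j) modulo p.
    a = (d_i - d_j) * pow((i - j) % p, -1, p) % p
    d = (d_i - a * i) % p
    return a, d


p = 2**521 - 1               # a known prime, larger than any 128-bit secret
D = secrets.randbits(128)    # stand-in for the AES key
shares = make_shares(D, p, 3)
A_recovered, D_recovered = recover(shares[0], shares[2], p)
```

Any two of the three shares give back the same D; a single share on its own does not determine it.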
Here's what that looks like in a language where the big int library uses an accumulator style, i.e., operations are of the form X = X op Y, where the ops are methods on the big int objects. Assume Bi and Bj are big int objects initialized from i and j, and Di and Dj are already big int objects, as is p. This particular example is using Perl. (This is very old code. Since 2002 you can add a "use bigint" pragma to Perl code and then it would look a lot more like the second Python example below).
At this point, the recovered A is in $A and the recovered D is in $Di.
Here's what it looks like in a language with the ops as function calls taking the big int objects as arguments. This example is Python without using operator overloading.
Probably more readable than accumulator style. Here it is in Python using its built-in operator overloading for big ints:
I'd sure rather come across that than either of the earlier examples.
OT: this reminds me of something I started to do once but never finished. For each language we used at work that had a big int library but did not support operator overloading (Java, for example), I was going to write a class that implemented a big int RPN calculator. Then recover would look something like this:
But I never ended up needing big ints in any of those languages so never really got past some initial design work.
One example is if a and b are arrays (e.g. numpy arrays) it’s not unreasonable for dunder eq to return an array of booleans.
Another example might be if you have a domain specific representation of equality (e.g. class Equality)
I can see the first one making sense, but why would you need a representation of equality other than "yes, these are equal" and "no, these are not equal"?
The first use case that comes to mind is if you want a DSL to build expressions that are evaluated later in some different context e.g. when using `polars`:
```python
df.filter(
    pl.col("foo") == pl.col("bar"),
)
```
Sqlalchemy does something equivalent too, and I'm sure there are many others.
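A toy version of that pattern, using only the standard library: the `Col` and `Eq` classes below are hypothetical and are not how polars or SQLAlchemy are implemented. `==` builds a description of the comparison, and the caller decides later which rows to evaluate it against.

```python
class Eq:
    """Deferred 'left == right' comparison between two named columns."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, row):
        # Only here is an actual bool produced.
        return row[self.left.name] == row[self.right.name]


class Col:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Build an expression object instead of comparing immediately.
        return Eq(self, other)


def filter_rows(rows, expr):
    return [row for row in rows if expr.evaluate(row)]


rows = [{"foo": 1, "bar": 1}, {"foo": 2, "bar": 3}]
expr = Col("foo") == Col("bar")
print(filter_rows(rows, expr))  # [{'foo': 1, 'bar': 1}]
```

Note that `expr` is an ordinary object and therefore truthy, so accidentally writing `if Col("foo") == Col("bar"):` always takes the branch.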
Well personally I’m not a fan of turning everything into an object, but if you have properties or methods that exist upon the concept of Equality you might want to encode directly onto a class. Maybe in a domain where “Equality” is an important concept, like mathematics or even something like accounting.
Could enable a different interface into approximate equality for floating point numbers: Equality.approximate(iota: float) -> bool
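One way that interface might look, as a rough sketch: `Measured` and `Equality` are made-up names, the parameter is called `tolerance` here rather than `iota`, and `math.isclose` does the actual comparison.

```python
import math


class Equality:
    """Result of comparing two floats; keeps both operands for later queries."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def __bool__(self):
        # Plain truthiness falls back to exact equality.
        return self.left == self.right

    def approximate(self, tolerance: float) -> bool:
        # Absolute tolerance only; the relative tolerance is switched off.
        return math.isclose(self.left, self.right, rel_tol=0.0, abs_tol=tolerance)


class Measured:
    def __init__(self, value: float):
        self.value = value

    def __eq__(self, other):
        return Equality(self.value, other.value)


eq = Measured(0.1 + 0.2) == Measured(0.3)
print(bool(eq))              # False
print(eq.approximate(1e-9))  # True
```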
IIRC, SQLAlchemy overloads this to return an object that represents an equality check in SQL. Because it was returning an object, it was always evaluating to True, because of another of Python’s footguns: truthiness/falsiness. This was a decade ago, and these particular footguns were not even remotely the biggest culprits in our bug backlogs (another honorable mention includes accidentally calling a sync function in an async context, causing timeouts in unrelated endpoints and leading to cascading system failure).
It could return a vector or a deferred expression? In polars, for example, operations on `pl.col` return `Expr` objects that are used to build queries, not immediately evaluated:
In numpy, `x == y` returns a boolean vector of the same shape as x and y, comparing them element-wise.
Primarily, because Python doesn't have quasi-quoting. You can't pass an expression without workarounds like this.
I thought JavaScript's equality quirks were seen as problematic, not as a feature missing from Python.
At least in javascript, it tells you if things are equal or not. In python, apparently you could answer if A is equal to B with "beans" or 17 or ['a']
Never understood this complaint about operator overloading.
In any language, a function called `isEqual` could wipe your hard drive and replace your wallpaper with a photo of a penguin. Therefore, letting programmers pick the names of their functions is bad? No, obviously naming things for least surprise is the programmer's responsibility.
But when it's the symbols `==` instead of an ASCII name, it's a problem in language design?
(FWIW in Javascript, being unable to override == is actually a problem when you want to use objects as Map keys)
Python never met a footgun it didn’t need to adopt. In this case, however, it’s not equality checks, but operator overloading. I was a Python developer for a decade before switching to Go and life on this side is so much better.
Operator overloading has never been an issue for me, but terminating a line with a comma creating a tuple, or white space (including new lines) between strings to concatenate have cost me days of work over the years.
I understand why those exist, but they’re pure evil.
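For anyone who hasn't been bitten by these, a short illustration of both (the variable names are arbitrary):

```python
# A trailing comma turns the right-hand side into a one-element tuple.
timeout = 30,
print(timeout)        # (30,)
print(type(timeout))  # <class 'tuple'>

# Adjacent string literals are concatenated, so a missing comma
# silently merges two list items into one.
hosts = [
    "alpha.example.com",
    "beta.example.com"   # <- missing comma
    "gamma.example.com",
]
print(len(hosts))  # 2
print(hosts[1])    # beta.example.comgamma.example.com
```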
Dynamically typed languages are going to decline with the rise of AI coding.
Statically typed languages provide the determinism necessary to efficiently anchor probabilistic coding agents.
You can throw as much type checking at dynamic languages after the fact, but you're just going to burn energy (and tokens) doing what another language gets 'for free'.
I am not so sure it's that simple.
No matter your preference, programs in dynamically typed languages are still very much deterministic.
To be able to reason about the output of LLMs (though it is debatable how often this will be needed), you want the translation from your imprecise human-language spec to a deterministic spec (code) to be as easy to review as possible (for correctness, but mostly for any glaring errors). With proper setup, checking the correctness of one deterministic output (Python) against another (e.g. a typed language like Rust) is just a deterministic run (a test suite) away; that run should not use any LLM tokens and should make no practical difference in compute use.
Static type checking catches a whole class of programming errors for free. Writing tests costs tokens or human time, so you end up needing more code (and probably more CPU time) to achieve the same level of error-checking in a dynamic language.
Very simplistic look, IMO. I'll add another one mostly as a counterpoint (not that I believe it is strictly true, but largely yes!).
Python is a lot more expressive than other languages and has a very terse syntax, and thus requires an LLM to output far fewer tokens to achieve the same job compared to other languages.
Adding a few more tests to ensure data conformity on top of what you have to do anyway with a statically typed language still results in fewer tokens overall.
The expressiveness of a language and whether it does static/dynamic type checking are orthogonal. Type inference and generics are a thing in most modern languages. I also think you're underestimating the amount of code you need to guarantee that there are no type errors in your Python code.
You were talking about token economics: expressiveness surely matters there?
Does it have a terse syntax? I main F#, and when I have to work with Python I generally find myself complaining about how verbose it is. (Needing intermediate variables for what should have been a pipeline, the ceremony around parallelism, having to store constructor parameters as object fields, etc.)
It sure does in comparison with most mainstream statically typed languages — if you feel that way about Python, I wonder what you say about Java, C++ or even Rust or Go?
Checking some examples at https://learn.microsoft.com/en-us/dotnet/fsharp/tour, I'd say it's quite similar in verbosity. E.g., there is no need to declare a module in Python since the code already lives in a file that is the module (plus one less indentation level for module-level functions); inline function declaration and calling are about the same, with F# slightly more terse (let vs lambda, spaces vs parentheses); if-then is more verbose in F# (no then in Python, just "if x:"); F# does not seem to need "return"...
In many cases you can avoid intermediate variables: inline ifs, list comprehensions, lambdas, etc... Constructor arguments are a good point, but this is mostly about idiomatic use instead of language itself: you can simply do
I'll give you one on the ceremony around concurrency, though! I have different ideas of how it should have been done to shift the cost to the language runtime instead of the developer, but alas... :)
Maybe, but AI models are better at writing Python and JS than any other language. Probably because they are the most common and thus had the most code available for training.
Or attach an LSP server with a type checker and require the model to produce strict strongly typed python code
Python is dynamically, strongly typed. It cares a lot about the types of its objects, you can't just mix and match at will.
Perhaps you meant statically typed?
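A quick illustration of the distinction, as a minimal sketch: values keep their types strictly, while names can be rebound to anything at runtime.

```python
try:
    result = "1" + 1  # no implicit conversion between str and int
except TypeError as exc:
    result = None
    print(f"TypeError: {exc}")

# The name itself has no fixed type, though: rebinding is fine at runtime.
value = "1"
value = int(value) + 1
print(value)  # 2
```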
Oops thank you, I did mean static. I should know better as I am a cpp fan. Updated.
This is probably the most intellectualism I've seen anyone put into a comment that is so very, very, obviously wrong.
Yeah, in the age of AI where the whole goal is to not have to think, type as fast as you can with misspellings, and copy paste stuff without thinking, it's TOTALLY a better system to worry about the types of whatever you are feeding into LLMs.
It's about limiting surface area.
The gymnastics people are putting their ops teams through in order to validate oceans of generated slop is insane. Just use Rust and half of that work goes away.
Please stop trying to make Rust happen. You guys are reaching SO hard now.
There is no way that Rust will be faster than simply specifying tests for an agent to run after it has generated code.
If I had said c++ would that have triggered you less?
> Prioritise running as many type-checkers as possible on your test suite. Run at least one on your source code.
There are two types of tests: those that test against the public API, and those that test internal code with various mocks and fakes. I think the vast majority of unit tests are the latter kind, in which case the suggestion does not really make sense.
Why anyone would still use mypy besides legacy infrastructure is beyond me. It is dog slow as well as being the laziest of all, not catching many mistakes.
Unfortunately, for Django apps, switching to any alternative leads to the dreaded “wall of errors” issue. If anyone has managed to work this out in the past, I’d gladly take advice.
I use pyright with a 50k LOC Django REST API codebase. I haven't really had problems. From my pyproject.toml:
django==4.2.30
djangorestframework==3.16.1
---
django-types==0.15.0
djangorestframework-types==0.8.0
pyright==1.1.390
My dj version is pretty old, but I'd assume things have only gotten better since v 4?
The Django mypy plugin can inspect Django model types, project settings and much more context.
We switched our very large Django monolith codebase over to ty — the trick for us was generating stubs for Django models and having tooling keep those stubs in sync with the actual models.
Went from type checking taking ~10 minutes in CI to now taking ~15 seconds and runs on pre-commit.
Absolute game changer, I think we spent $10k in claude credits and did the entire mypy -> ty refactor in about 3 weeks.
Since we are talking about Python type-checkers: we've built a (non-AI based) type assistant for Python called RightTyper (https://github.com/RightTyper/RightTyper). Below is a brief description; a technical paper describing RightTyper is here: https://arxiv.org/abs/2507.16051, "Getting Python Types Right with RightTyper"
RightTyper is a Python tool that automatically generates type annotations for your code. It monitors your program as it runs and records the types of function arguments, return values, local variables, and class fields — with only about 25% runtime overhead. This makes it easy to integrate into your existing tests and development workflow, and lets a type checker like mypy catch type mismatches in your code.
Everything that isn't uv, ty, ruff is wrong and deprecated
Will you update the list for us, a month from now?
i doubt uv and ruff are going anywhere that quickly at least
Five different type-checkers and even type-checking projects think adding multiple "ignores" is sound code. Typescript would allow overloads without ignores, for example.
Python's type checking ecosystem truly is a mess.
The whole type checking experience in python has disappointed me deeply and is seriously affecting my work.
I see the appeal for type-checking and yeah it has caught many bugs. But the language is quickly running blindly to the worst of all worlds in regards to typing.
1. You have to exhaustively write types in many cases where they can be obviously inferred.
2. The type checking is just a lint step. i.e. we are still paying for the duck typed typing system.
3. We no longer get to use the duck typed typing system making a lot of generic code require obscure annotation incantations to pass the lint check while it's correct python code.
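As an illustration of point 3, here is a two-line function and one way to annotate it so that it stays generic. The `SupportsAdd` protocol is a name made up for this sketch, not something the standard library provides, and a given type checker may want it spelled differently.

```python
from typing import Protocol, TypeVar

T = TypeVar("T")


class SupportsAdd(Protocol[T]):
    def __add__(self, other: T, /) -> T: ...


# Untyped: works for anything that supports +.
def foo(a, b):
    return a + b


# Annotated: same body, but the signature needs a protocol and a type variable.
def foo_typed(a: SupportsAdd[T], b: T) -> T:
    return a + b


print(foo_typed(1, 2))        # 3
print(foo_typed("ab", "cd"))  # abcd
```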
My ideal typing system would be around constraints introduced by the code and completely inferred unless the user wants to tighten the constraints. i.e
Instead of
You would write:
And upon checking, if you tried to do foo(5, {}), it would tell you that there is no + operator for int and dictionary, which is required by the foo function.
My ideal typing system would allow you to constrain the types as well, like so:
The return type is not required in this case because it can be inferred from the function definition. For other cases it could be declared as well, to express, for example, that we don't want None.
Declaring types only where one wants to introduce a constraint would be nice, but unfortunately calls with obviously wrong types like "foo(5, {})" are a minority; in most cases types in a call depend on types of the calling function's parameters, and the type checker can only deduce unknown type from unconstrained type.
Moreover, the extremely dynamic execution and lack of compilation in Python means that determining in a lint step whether "there is no + operator for int and dictionary" is essentially impossible: if you find some __add__ or __radd__, you need to trust its declaration that it accepts certain types or not and hope it's still there at runtime; if you don't find it, it might exist at runtime when operator + is actually used.
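A small demonstration of why that is hard to check statically: the `Money` class below (hypothetical) has no `+` support when it is defined, and gains it later through an assignment that a linter cannot reliably see coming.

```python
class Money:
    def __init__(self, cents):
        self.cents = cents


wallet = Money(250)

try:
    5 + wallet
except TypeError:
    print("no + for int and Money yet")

# Somewhere else, possibly in another module or behind a condition:
Money.__radd__ = lambda self, other: Money(self.cents + other)

total = 5 + wallet  # now succeeds via the reflected operator
print(total.cents)  # 255
```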
A compiled and sufficiently static language, on the other hand, can inspect source code and libraries and reason about potentially infinite sets of definitions and prove that functions exist or don't exist.
I have the sneaking suspicion that for the vast majority of code it could be done.
I don't think the operator overloading is such a big issue since you can do the same kind of tracing for the overloaded operator.
The main issues would be metaclasses and all sort of monkey patching etc. i.e. anything that could mutate a type at runtime because the linter can't know if the mutation has happened.
Unfortunately I do not have the time nor patience to introduce the sixth type checker.
You need to distinguish the prevention of actual type errors during the execution of a Python program, which would require mutilating the Python language, and linting Python code to find proven or likely defects, which is merely fraught with complications (open world and closed world assumptions, expecting more or less adversarial behaviour from callers of a function, etc.) and necessarily dependent on adding optional type declarations.
Ocaml is probably closest to that. The ML language family has roots in generic code with global type inference.
This is what the typing spec says about type narrowing[1]:
"Type checkers should narrow the types of expressions in certain contexts. This behavior is currently largely unspecified."
Have fun.
[1] https://typing.python.org/en/latest/spec/narrowing.html#type...
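For readers who haven't met the term, the most common case of narrowing looks like this. The `describe` function is just an example; the comments describe what checkers such as mypy and pyright infer in practice.

```python
from typing import Union


def describe(value: Union[int, str, None]) -> str:
    if value is None:
        return "nothing"
    # From here on, checkers treat value as int | str.
    if isinstance(value, int):
        # Narrowed to int, so arithmetic is accepted.
        return f"number {value + 1}"
    # Narrowed to str, so str methods are accepted.
    return f"text {value.upper()}"


print(describe(None))  # nothing
print(describe(41))    # number 42
print(describe("ok"))  # text OK
```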
From my experience with Python, both personal and professional, I find it immature and not well-suited for large codebases. Typing should have become part of the language a long time ago; it is clear that users want it.
Take, for example, PHP… look at the features released in the last 6 or so years, starting with PHP 7, and how mature the language has become.
With the advance of AI-assisted programming, I feel like Python is always a bad choice.
I'm happy w/ ty right now. My agents run it fast and it seems to provide great guardrails.
What is this saying differently from https://peps.python.org/pep-0827/ ?
Everything? The blog post and PEP have almost nothing to do with each other lol
I run pyright in CI and mypy locally. They catch different things - pyright is stricter on overloads, mypy catches more None issues in our codebase. Annoying, but I have not found one that covers both.
I use ty with Zed.
But PyCharm's built-in type checker is far and away the best that I've used with proper type inference through multiple class inheritance hoops.
Why would users care if you're using the same type checker as them? Surely they're not expecting all their imports to be instrumented for running redundant types checks?
Users do not care about that, but they want to not see type errors or warnings when they integrate your API in their code.
That’s why you want to run their type checker on your API. You cannot know what “their type checker” is, so you want to run all popular type checkers on your API.
Sounds like a them-problem. Their type checker can accept my declared typings for my public API, or they can override it with their own custom type stubs if they have objections.
What are people's impressions of pyrefly? I've become completely captive to uv's tooling. It has allowed me to think only about coding rather than tooling. I don't feel like giving another type checker a chance unless it offers something I'm not getting from ty.
That blog needs to run an AI checker. Content aside, a lot of the writing is pure AI style.
> The type checking that matters most (and why you've probably got it backwards)
Honestly, I don’t care if the author got some AI help. But that click-bait style is ubiquitous and obnoxious.
The type-lovers will be angry! :)
The blog entry fits Ruby too, to some extent; while the situation is nowhere near as bad as in Python, you have the same question marks about why types suddenly emerge out of nowhere. Almost ... almost as if some people have a specific agenda, and try to pull through with it.
Well, there you have it - the type-addicted people are ruining python.
With agents it no longer makes sense to tie yourself to Python's archaic development experience. How many type checkers are there? Package managers? Don't even get me started on cross-platform deployment.
Strongly typed, compiled languages have never been easier to use, and agents reap huge benefits from the tight feedback loop that the compiler provides. Moreover the benefits of the Python ecosystem are less significant today than anytime in the past 20 years. Need something that's only available in Python? Just point some agents at it and you can port it.
What about the several people worldwide who don't want to use LLMs to program?
They also "reap huge benefits from the tight feedback loop that the compiler provides".
When something is easier/requires less context, it tends to work well for both human and LLM.
I've noticed this a lot in LLM generated Java. Since it doesn't know what can or can't be null it tends to wrap everything in Optional<T>. Super strong type systems are becoming even more important.
You probably need to tell it to rip as many of those out as possible (and replace them with null annotations).
I've noticed LLMs sometimes pick a documented anti-pattern (passing Optional around in Java is not recommended), then amplify it (like a human might).
That's because LLMs suck.
> Just point some agents at it and you can port it.
Don’t think we’re there yet, otherwise we would see a bunch of forks of major libraries to alternative languages - and not just Python. There’s still too much risk of insidious errors and bugs.
I've done this a few times for stuff in the < 10,000 LOC space. It works great.
There's something particularly satisfying about shipping a 1-10MB static Rust binary instead of a 2GiB Docker Python environment.
(I'm talking about just porting simple applications, or maybe a missing package/crate at a time. Not both at once, and not typical 100K-10M line internal legacy sprawl)
What have you ported, for example? Any 3PL or all internal code?