The person who eventually fixed the issue, mkitti, had to push through a lot of "institutional" friction to do so, and the eventual fix is the result of his determined efforts.
while I certainly agree that bug was a bit frustrating, in all that
* it happened in the first place
* it took so long to get noticed
* it took so long to merge the fix after being reported
I do feel like I should push back on the term "institutional friction." it was more of a bus factor problem; there were not enough (aka zero) maintainer eyeballs on the proposed fix. but there wasn't exactly anybody saying not to fix it, which is what I think of as friction.
Julia is a very powerful and flexible language. With very powerful tools you can get a lot done quickly, including shooting yourself in the foot. Julia's type system allows you to easily compose different elements of Julia's vast package ecosystem in ways that were possibly never tested, intended, or even foreseen by the authors of those packages. If you don't do that, you may have a much better experience than the author. My own Julia code generally does not feed the custom type of one package into the algorithms of another package.
One can hardly call using the canonical autograd library and getting incorrect gradients or using arrays whose indices aren't 1:len and getting OOB errors “shooting oneself in the foot” — these things are supposed to Just Work, but they don't. Interfaces would go a long way towards codifying interoperability expectations (although wouldn't help with plain old correctness bugs).
With regard to power and flexibility, homoiconicity and getting to hook into compiler passes does make Julia powerful and flexible in a way that most other languages aren't. But I'm not sure if that power is what results in bugs — more likely it's the function overloading/genericness, whose power and flexibility I think is a bit overstated.
Zygote hasn’t been the “canonical” autodiff library for some time now; the community recognized its problems long ago. Enzyme and Mooncake are the major efforts, and both have a serious focus on correctness.
In this debate there seems to be a pervasive impulse to point out "that specific problem doesn't exist anymore," and to their credit the developers are generally good at responding to serious problems.
However, the spirit of the original post was about the lack of safeguards and of cohesive direction by the community to find ways to preempt such errors. It's not an easy problem to solve, since Julia's composability and flexibility add complexity not encountered in other languages. The current solution is 'users beware', while a few people work on ways to enforce correct composability. I think it's best to acknowledge that this is an ongoing issue, rather than claim it's no longer a problem just because the specific bugs pointed out have been fixed.
One of the basic marketing claims of the language developers is that one author's algorithm can be composed with another author's custom data type. If that's not really true in general, even for some of the most popular libraries and data types, maybe the claims should be moderated a bit.
> one author's algorithm can be composed with another author's custom data type
This is true, and it's a powerful part of the language - but you can implement it incorrectly when you compose elements together that expect some attributes from the custom data type. There is no way to formally enforce that, so you can end up with correctness bugs.
My experience is that it is true, if you thoroughly implement the interfaces that your types are supposed to respect. If you don't, well, that's not the language's fault.
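For instance, the documented `AbstractArray` interface asks for little more than `size` and `getindex`, and generic algorithms then compose with your type. A minimal sketch with a hypothetical wrapper type (`MyVec` is illustrative, not from any real package):

```julia
# Hypothetical wrapper type, purely for illustration.
struct MyVec <: AbstractVector{Int}
    data::Vector{Int}
end

# The minimal AbstractArray interface for 1-based linear indexing:
Base.size(v::MyVec) = size(v.data)
Base.getindex(v::MyVec, i::Int) = v.data[i]

v = MyVec([10, 20, 30])
println(sum(v))      # generic algorithms now work: prints 60
println(maximum(v))  # prints 30
```

Nothing checks that these two methods are a complete or consistent implementation of the interface; that contract lives only in the documentation.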
I'm reminded of the recent post about R's (CRAN's) somewhat radical approach to integration testing [1] and I wonder if something like that would help with the composition issues described here.

[1] https://news.ycombinator.com/item?id=45259623
The Julia world is already quite careful with testing and CI. Apart from the usual unit testing, many packages do employ integration testing. The Julia project itself (compiler, etc) is tested against the package ecosystem quite often (regularly and for select pull requests).
I certainly didn't mean to imply that Julia's community was incompetent or that they were not doing integration testing. CRAN's approach (which is mandatory integration testing against all known dependents enforced by the packaging authority - the global and mandatory nature being what makes it different) is genuinely innovative and radical. I don't think that's an approach that should be adopted lightly or by most ecosystems, but I do observe that a.) these languages have similar goals and b.) it's an approach intended to solve problems of much the same shape as described in the article.
Again, I think this approach is too radical for most ecosystems, but Julia is pursuing a similarly radical level of composability/reusability and is evidently encountering difficulties with it, so I think there may be a fit there.
I don't think testing against every existing dependent would make sense currently. The issue is the lack of tooling for mechanically checking whether the dependent accesses implementation details of the dependency, in which case it would be valid for the dependency to break the dependent.
There are some proposals to forbid the registration of a package release which trespasses on the internals of another package, though.
I hope someone tackles the above sooner or later, but another issue is the approach of testing every known dependent package might be very costly, both in terms of compute and manual labor, the latter because someone would have to do the work of maintaining a blacklist for packages with flaky unit tests. The good news is that this work might considerably overlap with the already existing PkgEval infrastructure. We'll see.
> Has anything changed since then? What are y'all's thoughts about correctness in Julia in 2025?

I've been using Julia for seven years, four years as my main language at work.
It's my view that all the major points in the blog post are true, and the problem persists. It's slightly better now, because Julia has more usage in industry and less usage by hobby hackers.
I'm convinced it's caused by two factors: The first is the duck-typed, dynamic nature of the language, which, like Python, gives the developer no tools to check or enforce correctness.
More fundamentally, the culture of Julia is a cowboy hacking culture where we just start writing, and then we can always kick the code around once bugs appear. There seems to be an almost complete disinterest in careful documentation of behaviour and edge cases, or even actual descriptions of what some abstraction is supposed to do. The natural result is that people read all kinds of meaning into any abstraction and use it in slightly different ways. It's madness.
As an example, consider [the definition of Base.seek](https://docs.julialang.org/en/v1/base/io-network/#Base.seek). There is no description of what the position is, what type it can be, or what operations it's supposed to support. Nor that the seek position is typically zero-indexed. There is no description of what should happen for out-of-bounds seeks, or how behaviour differs for files open in reading versus writing mode. Nor any description of the errors it can throw.
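For what it's worth, experimenting with an `IOBuffer` suggests the position is a zero-based byte offset, but one has to discover that empirically rather than from the docstring:

```julia
io = IOBuffer("hello")

seek(io, 0)
println(read(io, Char))   # 'h': so position 0 is the first byte
seek(io, 1)
println(read(io, Char))   # 'e'
println(position(io))     # 2: position advanced past the byte just read
```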
I must emphasize that this kind of documentation is the norm, not the exception.
This kind of indifference towards actually specifying behaviour is not a foundation you can build a language on. And it's very hard to change in retrospect, because by now, seek means a bunch of different things in Base Julia and the ecosystem, and it would be breaking to change.
I've several times seen a core dev change some behaviour of some code because they clearly thought the behaviour was always meant to be X, even though it actually did Y, arguing that Y was an implementation detail. No shit - everything is an implementation detail when nothing is documented.
I think Julia needs to grow up and begin taking its documentation and interfaces seriously.
improving, still not perfect. it was true then, and is even more true now, that a large fraction of "correctness bugs" (maybe even the majority) arise from `OffsetArrays.jl`, so a simple solution besides "avoid Julia" is "avoid that package"
The simplest approach is to always read the interface of the packages one wants to use, and if one isn't provided, look at the code or open an issue to interact with the developers about their input assumptions. One should also write tests to ensure the interface behaves in the expected manner when working with your code.
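A minimal sketch of that habit using the Test stdlib; the checks are illustrative, with a plain Vector standing in for the unfamiliar third-party type:

```julia
using Test

v = [1.0, 2.0, 3.0]   # stand-in for a value of some third-party type

# Verify the assumptions your own code makes before trusting them:
@test firstindex(v) == 1          # do we really get 1-based indices?
@test eltype(v) <: Real           # are elements the kind of number we expect?
@test sum(v) ≈ 6.0                # does a generic algorithm give the right answer?
@test length(v) == lastindex(v)   # are length and last index consistent?
```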
Using this approach since 2017, I've never really encountered the types of issues mentioned in Yuri's blog post. The biggest issue I've had is when some user package makes a change that is effectively breaking but doesn't flag the associated release as breaking. But this isn't really a Julia issue so much as a user-space issue, and it can happen in any language when relying on others' libraries.
It doesn't happen so frequently in practice, which is maybe why it's not felt equally by everyone in the community. I've followed some discussions on this topic and the current solution is, as you say, to place the onus entirely on the user. There is another faction that wants to support the user through the IDE by warning them of errors they might be making, and others who want to implement formal interface specifications that catch such errors up front. It's not an easy problem to fix, and therefore the "simplest approach" as you describe it remains the leading solution.
It baffles me that they dug this hole in the first place. I have feelings on the zero-indexing vs one-indexing debate, but at the end of the day you can write correct code in either, as long as you know which one you're using.
But Julia fucked it up to where it's not clear what you're using, and library writers don't know which one has been passed! It's insane. They chose style over consistency and correctness and it's caused years of suffering.
Technically you don't need to know what array indexing is being used if you iterate using firstindex(arr):lastindex(arr). AFAIK the issue was that this wasn't consistently done across the Julia ecosystem including parts of the standard library at the time. No clue as to whether this still holds true, but I don't worry about it because I don't use OffsetArrays.
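A sketch of the difference; `total` and `total_fragile` are hypothetical helpers, and with an OffsetArrays.jl array `firstindex(a)` need not be 1:

```julia
# Index-agnostic: correct for any axes, offset or not.
function total(a)
    s = zero(eltype(a))
    for i in eachindex(a)   # or firstindex(a):lastindex(a)
        s += a[i]
    end
    return s
end

# Hard-coding 1-based axes is the pattern that breaks on offset arrays:
function total_fragile(a)
    s = zero(eltype(a))
    for i in 1:length(a)    # silently assumes indices are 1:length(a)
        s += a[i]
    end
    return s
end

println(total([10, 20, 30]))   # 60; either version works for a plain Vector
```

For a plain Vector both give the same answer, which is exactly why the fragile version survives testing until someone passes an offset array.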
I say this as a huge Julia fan, but the point is not the specific bugs in the article, but the culture of not prioritizing correctness in computation. The initial response by many (not all) in the community was look, those specific bugs are fixed; all languages have bugs; more importantly - look at the benchmark speeds of these computations! Which only reinforced this negative perception.
My understanding is that it's a difficult problem to solve, and there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge. In practice, composability problems arise seldom, but there is no formal way to guard against them yet. I believe there was some work done at Northeastern U. [1] toward this goal, but it's still up to the user to "be careful", essentially.
> the culture of not prioritizing correctness in computation
On the contrary, it is my impression that experienced Julia programmers, including those involved in JuliaLang/julia, take correctness seriously. More so than in many other PL communities.
> there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge
What exactly do you mean by "traits" or "interfaces"? Why do you think these "traits" would help with the issues that bug you?
True, actually they are good at following up on numerical correctness, so I should rephrase 'correctness in computation' as 'correctness in composition': the types of bugs that arise from mashing a lot of modules together. On the one hand it's not a Julia issue but a package-ecosystem issue.
I think you're actually even more active in the Julia community, so maybe I don't have to summarize the debate, but these are the types of traits and interface packages being developed that are meant to formalize how modules can be used and extended by others.
What I wanted to say is that I'm skeptical regarding "interfaces", either as a language feature or as a package. Although TBH I have not yet given any specific "interfaces" design more than a cursory glance, so my position is not really justified.
> the culture of not prioritizing correctness in computation.
In a language with pervasive use of generic methods, I don't know what that actually means. If I write a function like:
    function add3(x, y, z)
        x + y + z
    end
Is it correct or not? What does "correct" even mean here? If you call it with values where `+` is defined on them and does what you expect, then my function probably does what you expect too. But if you pass values where `+` does something weird, then `add3()` does something weird too.
Is that correct? What expectations should someone have about a function whose behavior is defined in terms of calls to other open-ended generic functions?
I should have been more precise in my language - it's not numerical correctness but composability correctness. Such issues won't appear in a simple example like the one you provided, but more complicated ones are given in the original post - one example partially centers on how getindex should be used with a particular struct, and so on.
There's nothing numerical in my comment. There's just arguments and function calls, which happened to be named "+".
My point is that there is an implied contract that `add3()` only does what you expect if you pass it values where `+` happens to do what you expect. When you have a language with fully open generic methods like Julia, it's very powerful, but the trade-off is that every function is effectively middleware: all it can really say is "if you give me things that delegate to the right things, I'll do the right thing too".
When I'm writing `add3()`, I don't know what `+` does. I'm writing a function in terms of open-ended abstractions that I don't control, so it's very hard to make any promises about the semantics of my function.
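To make that concrete, here is a toy type whose `+` is deliberately not ordinary addition (`Clock` is made up for illustration); `add3` faithfully does whatever `+` does:

```julia
add3(x, y, z) = x + y + z

# `+` as modular clock arithmetic rather than arithmetic addition:
struct Clock
    h::Int
end
Base.:+(a::Clock, b::Clock) = Clock(mod(a.h + b.h, 12))

println(add3(1, 2, 3))                       # 6, as expected
println(add3(Clock(5), Clock(5), Clock(5)))  # Clock(3), because 15 mod 12 == 3
```

Neither result is a bug in `add3`; its semantics are simply inherited from whatever `+` means for its arguments.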
Your perspective tracks with mine. Without contracts, either specified in documentation or as static guarantees, it is hard or impossible to build robust programs.
In Julia it's almost as if every function is an interface, with (usually quite terse) documentation as its only semantic constraint. For example, here is the full documentation for `+`: https://docs.julialang.org/en/v1/base/math/#Base.:+
I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
> Without contracts, either specified in documentation or as static guarantees, it is hard or impossible to build robust programs.
Right. I think a big part of this is expectation management. Julia lets you compose unrelated libraries much more freely than most other languages do. That's very powerful, but if you come into it expecting all of those compositions to magically work, I think you just have an unrealistic expectation.
There's no silver bullet when it comes to code reuse and Conway's Law can't be entirely avoided.
> I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
> Julia lets you compose unrelated libraries much more freely than most other languages do. That's very powerful, but if you come into it expecting all of those compositions to magically work, I think you just have an unrealistic expectation.
Yep, and it is unfortunate that this unrealistic expectation is explicitly encouraged by the creators of the language:
> It is actually the case in Julia that you can take generic algorithms that were written by one person and custom types that were written by other people and just use them together efficiently and effectively.
It seems worth reiterating that on a personal level I really like and appreciate the vast majority of the folks I’ve met in the Julia community. I’m glad I got to hang out with them and learn from them. But in my opinion setting expectations like this fosters bad science.
I've nearly exclusively used Julia since 2017. I don't think this is a perverse use of such functions -- long ago I naturally guessed I could use `cumsum!` on the same input and output and it would correctly overwrite the values (which now gives a similar warning in the documentation). However, when I first used it that way I tested if it did what I expected to verify my assumption.
It is good the documentation is now explicit that the behavior is not guaranteed in this case, but even better would be if aliasing were detected and handled (at least for base Julia arrays, so that the warning would only be needed for non-base types).
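The safe pattern looks like this; the aliased call is shown only as a comment, since its behavior is documented as unspecified:

```julia
x = [1, 2, 3, 4]

# Safe: write into a distinct output buffer.
y = similar(x)
cumsum!(y, x)
println(y)   # [1, 3, 6, 10]

# Risky: cumsum!(x, x) aliases output and input; the docstring now warns
# that behavior is unspecified in that case, so don't rely on it.
```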
Still, the lesson is that when using generic functions one should look at what they expect of their input, and if this isn't documented, one should at least thoroughly test what one is giving them and not assume it just works. I've always worked this way, and never run into surprises like the types of issues reported in the blog post.
Currently there is no documentation on what properties an input to `sum!` must support in the doc string, so one needs to test its correctness when using it outside of base Julia data types (I haven't checked the broader docs for an interface specification, but if there is one it really should be linked in the docstring).
But your use of cumsum!() seems natural; I can see using it that way, and might have done so myself. The use of sum!() under discussion seems weird, though.
I use julia intensively and have done so for 5 years or so. I have never encountered anything I would call a correctness bug. I guess it depends on what you count as a correctness bug, and what you mean by "julia". The core language has no obvious bugs, but there are packages of dubious quality.
Say you use some package for numerical integration. One day you cook up your own floating point type, and use the same package with success. Then you change your floating point type subtly, and suddenly weird things start to happen. Is it a correctness bug? Whose bug?
Surely, the author of the integration package didn't have your weird floating type in mind, but it still worked. Until you made it even weirder. These are the things some people think are correctness bugs in julia. It's mostly poor coding.
I've found that I'm most productive in Julia when minimizing the number of third party dependencies for this reason, even more so than other languages. That's not to say there are not many high quality packages available but rather the benefits of the type system align better when I have a strong understanding or control over the most pertinent interfaces. As a language Julia definitely rewards you heavily for this type of thing. Coming from Python my first instinct was to try to solve as many problems as possible with third party packages and filling in between the lines. Unsurprisingly this was the worst of both worlds.
If there was one thing I could change about Julia it most certainly wouldn't be correctness issues in my own experience. Filling in the ecosystem in terms of boring glue type stuff like a production grade gRPC client would be amazing. This was the type of problem that almost got me to give up on the language.
Which is a bummer; however, the basics are always critical. Numerical stability and known correctness are still (or were) a reason a lot of old Fortran libraries were used for so long.
I'm really surprised by the list of issues as some of those are pretty recent (2024) and pretty important parts of the ecosystem like ordereddict.
Julia is not without warts, but this blog post is kinda rubbish. The post claims vague but scary "correctness issues", trying to support this with a collection of unrelated issue tickets from all across Julia and the Julia package ecosystem. Not all of which were even bugs in the first place, and many of which have long been resolved.
The fact that bugs happen in software should not surprise anyone. Even software of critical importance, such as GCC or LLVM, whose correctness is relied upon by the implementations of many programming languages (including C, C++ and Julia itself), is buggy.
Instead the post could have focused more on actual design issues, such as some of the Base interfaces being underspecified:
> the nature of many common implicit interfaces has not been made precise (for example, there is no agreement in the Julia community on what a number is)
The underspecified nature of Number (or Real, or IO) is an issue, albeit not related with the rest of the blog post. It does not excuse the scaremongering in the blog post, however.
Number isn’t an interface—there are no operations common to all numbers. Subtyping Number is a way to opt into numeric promotion and a few other useful generic behaviors. That’s it. The fact that some abstract types are interfaces with expected behaviors, while others are dispatch points to opt into behaviors is a double edged sword: powerful and flexible, but only explicitly expressed/explained in documentation.
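For example (with a made-up `Money` type), subtyping Number by itself provides essentially no operations; every generic behavior is an explicit opt-in:

```julia
# Made-up type for illustration.
struct Money <: Number
    cents::Int
end

println(Money(1) isa Number)   # true, but...
# Money(1) + Money(2) throws: "+ not defined for Money"
# (the generic Number fallback tries promotion, changes nothing, and errors)

# Each behavior is opted into piecemeal:
Base.:+(a::Money, b::Money) = Money(a.cents + b.cents)
println((Money(100) + Money(250)).cents)   # 350
```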
> Number isn’t an interface—there are no operations common to all numbers.
When creating a new type, it should be more clear-cut when subtyping Number (or Real, etc.) is valid. Should unitful quantities be numbers? Should intervals be numbers? Related: I think there are some attempts by Tim Holy and others to create/document "thick numbers".
Furthermore, I believe it might be good to align the Number type hierarchy with math/abstract algebra as much as possible without breaking backwards compatibility, which might mean making Number, or some subtypes of it, actual interfaces.
> Subtyping Number is a way to opt into numeric promotion and a few other useful generic behaviors. That’s it.
> My conclusion after using Julia for many years is that there are too many correctness and composability bugs throughout the ecosystem to justify using it in just about any context where correctness matters.
Does this have any impact on the cosmological emulator written in Julia?
The OP shows examples of people being unable to solve a problem in Julia that they solve quickly after switching to PyTorch, Jax, or TensorFlow, so the OP is implicitly recommending those alternatives.
He doesn't recommend any alternatives. Looking at his GitHub profile (https://github.com/yurivish?tab=repositories), it looks like he's using a lot of Rust and Go these days, though mainly for projects unrelated to the sorts of data crunching suitable for Julia, R, etc.
Scrolling through this list, it’s clear that many are “correctness issues.”
I do not link this to argue that scipy bugs are more serious or more frequent. I don’t think that kind of statistical comparison is meaningful.
However, I think a motivated reasoner could write a very similar blog post to the OP, but about $arb_python_lib instead of $arb_julia_lib.
I suppose my position is closer to “only Kahan, Boyd and Higham write correct numerical algorithms” (a hyperbole, but illustrative of the difficulty level).
Explicit errors are not correctness issues. Some numerical instability issues in certain algorithms in certain corner cases might be considered as such though.
Regardless, overall, these are of a different order of complexity and seriousness than the base sum function being just wrong, or cultural issues among developers such as not verifying inputs "for performance", or things of that nature. The scientific Python community has, in my experience, much higher adherence to good standards than that.
I have found multiple "correctness bugs" of equal seriousness in Polars, and one of them is still open. that is not to throw shade at polars --- I love that package! but my point is that these things happen everywhere in software.
I like the design of the language, but I eventually went through too many cycles of "fast compilation and/or module caching is coming in the next release" and "ahead of time compilation is coming soon" and got burned out. I remember believing the same stuff from java for years until forgetting about it.
The pre-compilation speed/caching performance ("time to first plot") has practically been solved since 2024, when Julia 1.10 became the current LTS version. The current focus is on improving the generation of reasonably-sized stand-alone binaries.
I heard it for 10 years, I gave it too many chances. Each time it was solved, then it was going to be solved in a new release right around the corner, again and again. Maybe it is now, I don't care anymore.
@dang: I'm not sure exactly when this was posted since it seems to have no date, but it's at least (2022) per HN's link from that year: https://news.ycombinator.com/item?id=31396861
I mention this because this is definitely the sort of content that can age poorly. I have no direct experience, I've never so much as touched Julia.
Edit: since there are (again? I seem to remember this last time) complaints about the title being a bit too baity, I've pilfered that previous title for this thread as well
As an experiment, I would be interested to see somebody make a 1-based Python list-like data structure (or a 0-based R array), to check how many 3rd-party (or standard library) functions would no longer work.
We really don't need all the hall monitoring here, it's lame when you see a submission has comments but then it's meta stuff like complaints about dates and titles.
I don't think 'hall monitoring' is fair here - it's standard for HN titles to include the year that an article dates from, and it's standard for users to point out when such a year hasn't been added to the title yet.
The culture of "hall monitoring" is one of the best things about HN, IMO. It's one of the few places on the internet where people - including/not just the mods - care about maintaining a high quality of discourse.
"This Is A Really Cool Insect" doesn't really need (1998) in it, but for things like deficiencies of programming language it's helpful to know when the article comes from. "C++ Really Sucks (1995)" is a very different article from "C++ Really Sucks (2025)", and has very different takeaways for the reader.
I've seen this article several times, and I'm sure Jerf has as well.
Our instinct, with years of being here, is it isn't a good fit for HN, at least at its current age and as labelled.
It is not conducive to healthy discussion to have an aged[1] blanket dismissal of a language coupled to an assertion that saying "the issues looked fixed?" is denial of people's lived experience.
[1] We can infer it was written in 2021, as the newest issue they created is from then, and they avowed never to use the language again.
I'm the author of the original post. I still like and miss my friends in the Julia community, but the technical and cultural issues persist.
I may refresh the post with more recent information at some point. In the meantime, those curious can find a short story of one newer correctness bug here: https://discourse.julialang.org/t/why-is-it-reliable-to-use-...
While mkitti's part of the story mostly played out in venues outside of the Discourse forum, some of it is on display in this thread: https://discourse.julialang.org/t/csv-jl-findmax-and-argmax-...
The Julia world is already quite careful with testing and CI. Apart from the usual unit testing, many packages do employ integration testing. The Julia project itself (compiler, etc) is tested against the package ecosystem quite often (regularly and for select pull requests).
I certainly didn't mean to imply that Julia's community was incompetent or that they were not doing integration testing. CRAN's approach (which is mandatory integration testing against all known dependents enforced by the packaging authority - the global and mandatory nature being what makes it different) is genuinely innovative and radical. I don't think that's an approach that should be adopted lightly or by most ecosystems, but I do observe that a.) these languages have similar goals and b.) it's an approach intended to solve problems of much the same shape as described in the article.
Again, I think this approach is too radical for most ecosystems, but Julia is pursuing a similarly radical level of composability/reusability and evidently encountering difficulties with it, so I think there may be a fit there.
I don't think testing against every existing dependent would make sense currently. The issue is the lack of tooling for mechanically checking whether the dependent accesses implementation details of the dependency, in which case it would be valid for the dependency to break the dependent.
There are some proposals to forbid the registration of a package release which trespasses on the internals of another package, though.
I hope someone tackles the above sooner or later, but another issue is that testing every known dependent package might be very costly, both in terms of compute and manual labor - the latter because someone would have to maintain a blacklist of packages with flaky unit tests. The good news is that this work might considerably overlap with the already existing PkgEval infrastructure. We'll see.
Has anything changed since then? What are y'all's thoughts about correctness in Julia in 2025?
I've been using Julia for seven years, four years as my main language at work.
It's my view that all the major points in the blog post are true, and the problem persists. It's slightly better now, because Julia has more usage in industry and less usage by hobby hackers.
I'm convinced it's caused by two factors: The first is the duck-typed, dynamic nature of the language, which, like Python, gives the developer no tools to check or enforce correctness.
More fundamentally, the culture of Julia is a cowboy hacking culture where we just start writing, and then we can always kick the code around once bugs appear. There seems to be an almost complete disinterest in careful documentation of behaviour and edge cases, or even actual descriptions of what some abstraction is supposed to do. The natural result is that people read all kinds of meaning into any abstraction and use it in slightly different ways. It's madness.
As an example, consider [the definition of Base.seek](https://docs.julialang.org/en/v1/base/io-network/#Base.seek). There is no description of what the position is, what type it can be, or what operations it's supposed to support. Nor that the seek position is typically zero-indexed. There is no description of what should happen for out-of-bounds seeks, or how behaviour differs for files opened in reading versus writing mode. Nor any description of the errors it can throw.
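For instance, none of this is in the docstring; you have to discover by experiment that the offset is zero-based. A quick sketch with an `IOBuffer` (files behave the same way):

```julia
io = IOBuffer("hello")
seek(io, 1)     # the offset is zero-based: this points at the second byte
read(io, Char)  # 'e', not 'h'
```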
I must emphasize that this kind of documentation is the norm, not the exception.
This kind of indifference towards actually specifying behaviour is not a foundation you can build a language on. And it's very hard to change in retrospect, because by now, seek means a bunch of different things in Base Julia and the ecosystem, and it would be breaking to change.
I've several times seen a core dev change some behaviour of some code because they clearly thought the behaviour was always meant to be X, even though it actually did Y, arguing that Y was an implementation detail. No shit - everything is an implementation detail when nothing is documented.
I think Julia needs to grow up and begin taking its documentation and interfaces seriously.
improving, still not perfect. it was true then, and is even more true now, that a large fraction of "correctness bugs" (maybe even the majority) arise from `OffsetArrays.jl`, so a simple solution besides "avoid Julia" is "avoid that package"
The greater issue is that there is still no way to prevent those types of composability bugs.
The simplest approach is to always read the interface of packages one wants to use, and if one isn't provided, look at the code or open an issue to ask the developers about their input assumptions. One should also write tests to ensure the interface behaves in the expected manner when working with your code.
Using this approach since 2017, I've never really encountered the types of issues mentioned in Yuri's blog post. The biggest issue I've had is when some user package makes a change that is effectively breaking but doesn't flag the associated release as breaking. But this isn't really a Julia issue so much as a user-space issue, and it can happen in any language when relying on others' libraries.
It doesn't happen so frequently in practice, which is maybe why it's not felt equally by everyone in the community. I've followed some discussions on this topic and the current solution is, as you say, to place the onus entirely on the user. There is another faction that wants to support the user through the IDE by warning them of errors they might be making, and others who want to implement formal specifications that catch these errors. It's not an easy problem to fix, and therefore the "simplest approach" as you describe remains the leading solution.
sure there are ways. they're just not employed as diligently as they should be. that's more of a social problem than a technical problem.
It baffles me that they dug this hole in the first place. I have feelings on the zero-indexing vs one-indexing debate, but at the end of the day you can write correct code in either, as long as you know which one you're using.
But Julia fucked it up to where it's not clear what you're using, and library writers don't know which one has been passed! It's insane. They chose style over consistency and correctness and it's caused years of suffering.
Technically you don't need to know what array indexing is being used if you iterate using firstindex(arr):lastindex(arr). AFAIK the issue was that this wasn't consistently done across the Julia ecosystem, including parts of the standard library at the time. No clue as to whether this still holds true, but I don't worry about it because I don't use OffsetArrays.
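A quick sketch of the difference, using a made-up zero-based vector type in place of OffsetArrays (`ZeroVec` is hypothetical, implementing just enough of the AbstractArray interface to demonstrate the point):

```julia
# A toy zero-based vector (hypothetical stand-in for OffsetArrays.jl).
struct ZeroVec{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::ZeroVec) = size(v.data)
Base.axes(v::ZeroVec) = (0:length(v.data) - 1,)
Base.getindex(v::ZeroVec, i::Int) = v.data[i + 1]

v = ZeroVec([10, 20, 30])           # valid indices are 0:2

first_naive(a) = a[1]               # silently assumes one-based indexing
first_generic(a) = a[firstindex(a)] # correct for any index convention

first_naive(v)    # 20 - the wrong element, with no error raised
first_generic(v)  # 10 - the actual first element
```

The naive version is the kind of code that passes every test against `Vector` and quietly returns wrong answers the first time someone hands it a custom array type.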
This is a reasonable article, but way out of date now. Almost all issues raised were solved a while ago.
I say this as a huge Julia fan, but the point is not the specific bugs in the article, but the culture of not prioritizing correctness in computation. The initial response by many (not all) in the community was look, those specific bugs are fixed; all languages have bugs; more importantly - look at the benchmark speeds of these computations! Which only reinforced this negative perception.
My understanding is that it's a difficult problem to solve, and there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge. In practice, composability problems arise seldom, but there is no formal way to guard against them yet. I believe there was some work done at Northeastern U. [1] toward this goal, but it's still up to the user to "be careful", essentially.
[1] https://repository.library.northeastern.edu/files/neu:4f20cn...
> the culture of not prioritizing correctness in computation
On the contrary, it is my impression that experienced Julia programmers, including those involved in JuliaLang/julia, take correctness seriously. More so than in many other PL communities.
> there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge
What exactly do you mean by "traits" or "interfaces"? Why do you think these "traits" would help with the issues that bug you?
True, actually they are good at following up on numerical correctness, so I should rephrase 'correctness in computation' as 'correctness in composition' - the types of bugs that arise from mashing a lot of modules together. In one sense it's not a Julia issue but a package ecosystem issue.
I think you're actually even more active in the Julia community, so maybe I don't have to summarize the debate, but these are the types of traits and interfaces packages being developed that are meant to formalize how modules can be used and extended by others.
https://github.com/rafaqz/Interfaces.jl
https://discourse.julialang.org/t/interfaces-traits-in-julia...
What I wanted to say is that I'm skeptical regarding "interfaces", either as a language feature or as a package. Although TBH I have not yet given any specific "interfaces" design more than a cursory glance, so my position is not really justified.
> the culture of not prioritizing correctness in computation.
In a language with pervasive use of generic methods, I don't know what that actually means. If I write a function like:
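(The snippet seems to have been lost here; reconstructed from the `add3` references below, it was presumably something like:)

```julia
# Generic three-argument addition - its behavior is defined entirely
# by whatever `+` happens to do for the arguments it receives.
add3(a, b, c) = a + b + c
```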
Is it correct or not? What does "correct" even mean here? If you call it with values for which `+` is defined and does what you expect, then my function probably does what you expect too. But if you pass values where `+` does something weird, then `add3()` does something weird too. Is that correct? What expectations should someone have about a function whose behavior is defined in terms of calls to other open-ended generic functions?
I should have been more precise in my language - it's not numerical correctness but composability correctness. Such bugs won't appear in a simple example like the one you provided, but more complicated ones are given in the original post - one example partially centers on how getindex should be used with a particular struct, and so on.
There's nothing numerical in my comment. There's just arguments and function calls, which happened to be named "+".
My point is that there is an implied contract that `add3()` only does what you expect if you pass it values where `+` happens to do what you expect. When you have a language with fully open generic methods like Julia, it's very powerful, but the trade-off is that every function is effectively middleware, where all it can really say is "if you give me things that delegate to the right things, I'll do the right thing too".
When I'm writing `add3()`, I don't know what `+` does. I'm writing a function in terms of open-ended abstractions that I don't control, so it's very hard to make any promises about the semantics of my function.
Your perspective tracks with mine. Without contracts, either specified in documentation or as static guarantees, it is hard or impossible to build robust programs.
In Julia it's almost as if every function is an interface, with (usually quite terse) documentation as its only semantic constraint. For example, here is the full documentation for `+`: https://docs.julialang.org/en/v1/base/math/#Base.:+
I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
> Without contracts, either specified in documentation or as static guarantees, it is hard or impossible to build robust programs.
Right. I think a big part of this is expectation management. Julia lets you compose unrelated libraries much more freely than most other languages do. That's very powerful, but if you come into it expecting all of those compositions to magically work, I think you just have an unrealistic expectation.
There's no silver bullet when it comes to code reuse and Conway's Law can't be entirely avoided.
> I love Game Programming Patterns, by the way! Laughed out loud when I first saw the back cover.
:D
> Julia lets you compose unrelated libraries much more freely than most other languages do. That's very powerful, but if you come into it expecting all of those compositions to magically work, I think you just have an unrealistic expectation.
Yep, and it is unfortunate that this unrealistic expectation is explicitly encouraged by the creators of the language:
> It is actually the case in Julia that you can take generic algorithms that were written by one person and custom types that were written by other people and just use them together efficiently and effectively.
It seems worth reiterating that on a personal level I really like and appreciate the vast majority of the folks I’ve met in the Julia community. I’m glad I got to hang out with them and learn from them. But in my opinion setting expectations like this fosters bad science.
When I look, "the base `sum!` is wrong" is still an open issue. Which, I think, is a bit ridiculous.
This is a typical example:
• The documentation (currently) of the function warns not to use it this way;
• This is a rather perverse use of the function(s) that would be unlikely unless you’re trying to break things;
• The discussion on the issue page demonstrates the exact opposite of a culture not caring about correctness;
• This kind of stuff doesn’t matter to all the scientists who are actually using Julia to do real work.
Nevertheless, sum!() and friends certainly should, somehow, be made to avoid this problem.
I've nearly exclusively used Julia since 2017. I don't think this is a perverse use of such functions -- long ago I naturally guessed I could use `cumsum!` on the same input and output and it would correctly overwrite the values (which now gives a similar warning in the documentation). However, when I first used it that way I tested if it did what I expected to verify my assumption.
It is good the documentation is now explicit that the behavior is not guaranteed in this case, but even better would be if aliasing were detected and handled (at least for base Julia arrays, so that the warning would only be needed for non-base types).
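A minimal sketch of the two cases - the aliased call is exactly what the docstring now warns about, so no particular result should be relied upon:

```julia
x = [1.0, 2.0, 3.0]
out = similar(x)
cumsum!(out, x)   # distinct output buffer: out == [1.0, 3.0, 6.0]
cumsum!(x, x)     # aliased output and input: result is explicitly unspecified
```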
Still, the lesson is that when using generic functions one should look at what they expect of their input, and if this isn't documented one should at least test what they are giving thoroughly and not assume it just works. I've always worked this way, and never run into surprises like the types of issues reported in the blog post.
Currently there is no documentation on what properties an input to `sum!` must support in the doc string, so one needs to test its correctness when using it outside of base Julia data types (I haven't checked the broader docs for an interface specification, but if there is one it really should be linked in the docstring).
But your use of cumsum!() seems natural; I can see using it that way, and might have done so myself. The use of sum!() under discussion seems weird, though.
I quit Julia after running into serious bugs in basic CSV package functionality a few years back.
The language is elegant, intuitive and achieves what it promises 99% of the time, but that’s not enough compared to other programming languages.
I use julia intensively and have done so for 5 years or so. I have never encountered anything I would call a correctness bug. I guess it depends on what you count as a correctness bug, and what you mean by "julia". The core language has no obvious bugs, but there are packages of dubious quality.
Say you use some package for numerical integration. One day you cook up your own floating point type, and use the same package with success. Then you change your floating point type subtly, and suddenly weird things start to happen. Is it a correctness bug? Whose bug?
Surely, the author of the integration package didn't have your weird floating-point type in mind, but it still worked. Until you made it even weirder. These are the things some people consider correctness bugs in julia. It's mostly poor coding.
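To make this concrete, here's a hypothetical minimal custom real type: it works for any code path that only needs the two methods it defines, and falls over the moment a generic algorithm reaches for anything else:

```julia
# A toy wrapper type (made up for illustration) that only partially
# implements what generic numeric code tends to expect of a Real.
struct MyFloat <: Real
    x::Float64
end
Base.:+(a::MyFloat, b::MyFloat) = MyFloat(a.x + b.x)
Base.:*(a::MyFloat, b::MyFloat) = MyFloat(a.x * b.x)

MyFloat(1.0) + MyFloat(2.0)   # works: MyFloat(3.0)
# MyFloat(1.0) + 2.0          # errors: no promote_rule/convert defined -
#                             # the kind of wall an integration package can
#                             # hit once your type gets "weirder"
```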
I've found that I'm most productive in Julia when minimizing the number of third party dependencies for this reason, even more so than other languages. That's not to say there are not many high quality packages available but rather the benefits of the type system align better when I have a strong understanding or control over the most pertinent interfaces. As a language Julia definitely rewards you heavily for this type of thing. Coming from Python my first instinct was to try to solve as many problems as possible with third party packages and filling in between the lines. Unsurprisingly this was the worst of both worlds.
If there was one thing I could change about Julia it most certainly wouldn't be correctness issues in my own experience. Filling in the ecosystem in terms of boring glue type stuff like a production grade gRPC client would be amazing. This was the type of problem that almost got me to give up on the language.
Which is a bummer; the basics are always critical. Numerical stability and known correctness are still (or were) a reason a lot of old Fortran libraries were used for so long.
I'm really surprised by the list of issues, as some of those are pretty recent (2024) and touch pretty important parts of the ecosystem, like OrderedDict.
Julia is not without warts, but this blog post is kinda rubbish. The post claims vague but scary "correctness issues", trying to support this with a collection of unrelated issue tickets from all across Julia and the Julia package ecosystem. Not all of which were even bugs in the first place, and many of which have long been resolved.
The fact that bugs happen in software should not surprise anyone. Even software of critical importance, such as GCC or LLVM, whose correctness is relied upon by the implementations of many programming languages (including C, C++, and Julia itself), is buggy.
Instead the post could have focused more on actual design issues, such as some of the Base interfaces being underspecified:
> the nature of many common implicit interfaces has not been made precise (for example, there is no agreement in the Julia community on what a number is)
The underspecified nature of Number (or Real, or IO) is an issue, albeit not related with the rest of the blog post. It does not excuse the scaremongering in the blog post, however.
Number isn’t an interface—there are no operations common to all numbers. Subtyping Number is a way to opt into numeric promotion and a few other useful generic behaviors. That’s it. The fact that some abstract types are interfaces with expected behaviors, while others are dispatch points to opt into behaviors, is a double-edged sword: powerful and flexible, but only explicitly expressed/explained in documentation.
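A sketch of the opt-in (`Percent` is a made-up type; the promotion hooks are the standard Base ones):

```julia
struct Percent <: Number
    p::Float64
end
# Subtyping Number buys nothing by itself; defining these hooks opts the
# type into the generic promotion machinery, after which mixed arithmetic
# flows through Base's `+(x::Number, y::Number) = +(promote(x, y)...)`.
Base.promote_rule(::Type{Percent}, ::Type{Float64}) = Float64
Base.convert(::Type{Float64}, q::Percent) = q.p / 100

Percent(50.0) + 0.25   # promoted to Float64 arithmetic: 0.75
```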
> Number isn’t an interface—there are no operations common to all numbers.
When creating a new type, it should be more clear-cut when subtyping Number (or Real, etc.) is valid. Should unitful quantities be numbers? Should intervals be numbers? Related: I think there are some attempts by Tim Holy and others to create/document "thick numbers".
Furthermore, I believe it might be good to align the Number type hierarchy with math/abstract algebra as much as possible without breaking backwards compatibility, which might mean making Number, or some subtypes of it, actual interfaces.
> Subtyping Number is a way to opt into numeric promotion and a few other useful generic behaviors. That’s it.
OK, but I think that's not documented either.
Yeah, probably should be documented.
I hear similar bugs exist in Python libraries.
Any recommended libraries (or languages) that have thoroughly verified libraries?
Do you mean any specific library?
Any really, library, language etc
In my professional experience, the older numerics libraries tend to be more reliable, with the notable exception of Intel’s MKL.
How to quickly kill a language primarily meant for technical computing.
> My conclusion after using Julia for many years is that there are too many correctness and composability bugs throughout the ecosystem to justify using it in just about any context where correctness matters.
Does this have any impact on the cosmological emulator written in Julia?
https://news.ycombinator.com/item?id=45346538
I wonder if you can even distinguish correctness issues caused by these bugs in Julia from just the underlying ML model behaving weirdly.
It certainly would if it were a timely and justified conclusion. Since it’s not, no, it has no impact.
I think I may have missed what alternative is being recommended instead, after scrolling through the whole article.
The OP shows examples of people being unable to solve a problem in Julia that they solve quickly after switching to PyTorch, Jax, or TensorFlow, so the OP is implicitly recommending those alternatives.
He doesn't recommend any alternatives. Looking at his GitHub profile (https://github.com/yurivish?tab=repositories), looks like he's using a lot of Rust and Go these days, though looks like mainly for projects unrelated to the sorts of data crunching suitable for Julia, R, etc
We’ve had a lot of scientists use R and the “tidyverse” collection of packages. Ggplot2 is fantastic for graphing.
I think a lot of them used “RStudio” to browse the data.
https://www.tidyverse.org/
It's hinted that Python doesn't have these issues.
Python also suffers from such problems. For example, here’s scipy’s issue tracker, filtered for bugs only:
https://github.com/scipy/scipy/issues?q=is%3Aissue%20state%3...
Scrolling through this list, it’s clear that many are “correctness issues.”
I do not link this to argue that scipy bugs are more serious or more frequent. I don’t think that kind of statistical comparison is meaningful.
However, I think a motivated reasoner could write a very similar blog post to the OP, but about $arb_python_lib instead of $arb_julia_lib.
I suppose my position is closer to “only Kahan, Boyd and Higham write correct numerical algorithms” (a hyperbole, but illustrative of the difficulty level).
Explicit errors are not correctness issues. Some numerical instability issues in certain algorithms in certain corner cases might be considered as such though.
Regardless, overall, these are of a grossly different complexity and seriousness than the base sum function being just wrong, or cultural issues among developers of not verifying inputs "for performance", or things of that nature. The scientific Python community has, in my experience, a much higher adherence to good standards than that.
I have found multiple "correctness bugs" of equal seriousness in Polars, and one of them is still open. that is not to throw shade at polars --- I love that package! but my point is that these things happen everywhere in software.
> Explicit errors are not correctness issues.
Yes, of course. I am not conflating the two.
> The scientific Python community has, in my experience, a much higher adherence to good standards than that.
Not in my experience. Nor am I defending Julia.
I like the design of the language, but I eventually went through too many cycles of "fast compilation and/or module caching is coming in the next release" and "ahead of time compilation is coming soon" and got burned out. I remember believing the same stuff from java for years until forgetting about it.
The pre-compilation speed/caching performance ("time to first plot") has practically been solved since 2024, when Julia 1.10 became the current LTS version. The current focus is on improving the generation of reasonably-sized stand-alone binaries.
I heard it for 10 years, I gave it too many chances. Each time it was solved, then it was going to be solved in a new release right around the corner, again and again. Maybe it is now, I don't care anymore.
Sad to hear that, CyberDildonics
@dang: I'm not sure exactly when this was posted since it seems to have no date, but it's at least (2022) per HN's link from that year: https://news.ycombinator.com/item?id=31396861
I mention this because this is definitely the sort of content that can age poorly. I have no direct experience, I've never so much as touched Julia.
It cites an example from 2024 so the text has definitely been updated since 2022.
Usually when an article is substantively the same but has been updated, we use the original year. I've put 2022 in the title now.
The previous HN thread:
Correctness and composability bugs in the Julia ecosystem - https://news.ycombinator.com/item?id=31396861 - May 2022 (407 comments)
Edit: since there are (again? I seem to remember this last time) complaints about the title being a bit too baity, I've pilfered that previous title for this thread as well
@dang, thank you for doing that.
When I posted the OP, I considered changing the title, but decided not to editorialize it, per the guidelines.
The incorrect example of julia's documentation was fixed 2021: https://github.com/JuliaLang/julia/commit/f31ef767ef9cb0eb1d...
As an experiment, I would be interested to see if somebody would make a 1-based Python list-like data structure (or a 0-based R array), to check how many 3rd-party (or standard library) functions would no longer work.
We really don't need all the hall monitoring here, it's lame when you see a submission has comments but then it's meta stuff like complaints about dates and titles.
I don't think 'hall monitoring' is fair here - it's standard for HN titles to include the year that an article dates from, and it's standard for users to point out when such a year hasn't been added to the title yet.
The culture of "hall monitoring" is one of the best things about HN, IMO. It's one of the few places on the internet where people - including/not just the mods - care about maintaining a high quality of discourse.
"This Is A Really Cool Insect" doesn't really need (1998) in it, but for things like deficiencies of programming language it's helpful to know when the article comes from. "C++ Really Sucks (1995)" is a very different article from "C++ Really Sucks (2025)", and has very different takeaways for the reader.
Jerf's been here for 17 years, me, 16 years.
I've seen this article several times, and I'm sure Jerf has as well.
Our instinct, with years of being here, is it isn't a good fit for HN, at least at its current age and as labelled.
It is not conducive to healthy discussion to have an aged[1] blanket dismissal of a language coupled to an assertion that saying "the issues looked fixed?" is denial of people's lived experience.
[1] We can infer it was written in 2021, as the newest issue they created is from then, and they avowed never to use the language again.