I'm always shocked at how much performance we leave on the table in these places...

rand_r · on Aug 9, 2024

Development speed and conciseness is what sells Python. Something simple like being able to do `breakpoint()` anywhere and get an interactive debugger is unmatched in Go.

randomdata · on Aug 9, 2024

Development speed in Go blows Python out of the water when approaching the project as an engineering project, but Python has it beat when approaching the project as a computer science project.

And therein lies the rub for anything that is pushing the modernity of computer science (AI/ML, DSP, etc.) When born in the computer science domain, much of the preexisting work able to be leveraged is going to be built around Python, not Go (or any other language, for that matter), and it takes a lot of work to bring that to another system.

It is pretty clear that just about anything other than Python is used in engineering when the science end is already well explored and the aforementioned work has already been done, but the available libraries are Python's selling feature otherwise.

igouy · on Aug 9, 2024

Back in the day:

https://cuis-smalltalk.github.io/TheCuisBook/Halt_0021.html

maccard · on Aug 9, 2024

Go's compile times are comparable to a decent sized python app's startup time.

> Something simple like being able to do `breakpoint()` anywhere and get an interactive debugger is unmatched in Go.

As opposed to using a debugger and setting a break and/or tracepoint, which you can do to a running program?

necovek · on Aug 10, 2024

In a Python debugger, you can trivially augment the state of the program by executing Python code right there while debugging.

I don't remember what you can do with Go's debugger, but generally for compiled programs and GDB, you can do a lot, but you can't really execute snippets of new code in-line.

The simplicity is in the fact that Python debugger syntax is really mostly Python syntax, so you already know that if you've written the Python program you are debugging.

imtringued · on Aug 10, 2024

This objection only works in comparisons with python Vs Java, C++ and perhaps Rust (compile times). It doesn't really hold up against more modern languages such as Kotlin or heck even Groovy, which is an ancient relic in JVM land.

pjmlp · on Aug 9, 2024

Yeah, but Go isn't the only alterantive, and something like `breakpoint()` anywhere goes back to Smalltalk and Lisp, with performance Python doesn't offer.

heavyset_go · on Aug 9, 2024

Sounds about right, you can get a ~4x speed up by compiling Python with Nuitka or Mypyc.

bongodongobob · on Aug 9, 2024

Prototype in Python so you can get your ideas on the page then rewrite in something else has been my method for a while now.

pjmlp · on Aug 9, 2024

I rather do it in a language that has JIT in the box, or even AOT as alternative, then I hardly bother with rewrites.

Something I learned by rewriting too much Tcl into C.

serjester · on Aug 9, 2024

Cpython deliberately chooses to prioritize simplicity, stability and easy C extensions over raw speed. If you run the benchmark in PyPy (the performance optimized implementation) it’d be 10X faster. You could argue anything that’s performance bottlenecked shouldn’t be implemented in Python in the first place and therefore C compatibility is critical.

maccard · on Aug 9, 2024

I don't have easy access to pypy so I can't run the benchmark on that I'm afraid. I'd love to see some relative results though.

> You could argue anything that’s performance bottlenecked shouldn’t be implemented in Python in the first place and therefore C compatibility is critical.

Honestly, I think I'm heading towards "python should only be used for glue where performance does not matter". The sheer amount of time lost to waiting for slow runtimes is often brought up with node and electron, but I ran the benchmark in node and it's about halfway between go and python.

igouy · on Aug 9, 2024

CPython 3.12.3 vs Node.js v22.2.0

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

camdenreslink · on Aug 9, 2024

Node isn't the slow part of electron. Node is actually pretty darn fast.

pjmlp · on Aug 10, 2024

Indeed, shipping a full browser is.

sgarland · on Aug 9, 2024

Write it in Python, profile it, and move the bottlenecks into C, called with ctypes.

bbkane · on Aug 9, 2024

If you're going to rewrite large chunks anyway, I'd use Go from the start.

carlmr · on Aug 9, 2024

This is something I find quite fascinating. I've seen so much time wasted on Python this way it's incredible.

Insanely smart optimizations done. But then you rewrite the whole thing in vanilla Rust without profiling once and it's still 20x faster, and much more readable than the mix of numpy vectorization and C-routines.

It's kind of knowingly entering a sunk-cost fallacy for convenience. Before the cost is even sunk.

Python is often just the wrong choice. And it's not even that much simpler if you ask me.

In a language where you use value-based errors, have powerful sum types and strong typing guarantees, you can often get there faster, too IMO, because you can avoid a lot of mistakes by using the strictness of the compiler in your favor.

lpapez · on Aug 9, 2024

> Insanely smart optimizations done. But then you rewrite the whole thing in vanilla Rust without profiling once and it's still 20x faster, and much more readable than the mix of numpy vectorization and C-routines.

I'd argue that if you can rewrite the whole thing in Rust (or any other language), then it's not really a large project and it doesn't matter what you originally wrote it in.

On multiple occassions I had to port parts of Python projects to C++ for performance reasons, and the amount of code you have to write (and time you need to spend) is mind blowing. Sometimes a single line of Python would expand to several files of C++, and you always had to redesign it somehow so that memory and lifetime management becomes feasible.

Python is often the right choice IMO, because most of the time the convenience you mention trumps all other concerns, including performance. I bet you would get tons of examples from people if you "Ask HN" about "how many quickly written Python prototypes are still out there years later because their C++ replacement could never be implemented on time?"

KolenCh · on Aug 9, 2024

I have an example to support: https://github.com/hpc4cmb/toast/pull/380/commits/a38d1d6dbc...

A one-liner in Python is replaced by hundreds of lines of changes to implement in C++. (The one-liner has some boilerplates around it too, but the C++ function itself is longer and the boilerplates around is even more.)

Edit: 230 additions and 36 deletions in 14 files.

maccard · on Aug 9, 2024

That’s replacing one line which calls a a third party dependency with a manual implementation in native code.

KolenCh · on Aug 9, 2024

Technically right, but the original code only optionally depends on Numba, and Numpy is ubiquitous in scientific Python.

The problem I was trying to solve in all those boilerplates is basically that Numpy still has overhead and dropping it to lower level is trivial if Numba is used, but the package maintainers doesn’t want to introduce Numba dependency for various reasons. And the diff is showing that change from Numpy/Numba to C++ with pybind11.

maccard · on Aug 9, 2024

That doesn't change the fact that it's not representative of how many lines something takes in python vs C++. If you replaced it with a call to native that used a third party library it would be back to being one line.

KolenCh · on Aug 10, 2024

First, no sane mind will think one single example will be representative to anything.

Second, if you want to focus on 3rd party or not, sure. But you could also think Numba jit vs C++ and in this case it puts them on a more equal footing to compare them.

As far as Numba is concerned, numpy is in its stdlib as it aren’t calling Numpy but jit compile it.

You may say Numba is not Python, true, but you could consider Numba an implementation of a subset of Python.

We can argue about the details but the example is not that long to read. My summary and your summary are both at best distilled through our own point of view a.k.a. biases. Anyone can read the diff itself to judge. That’s why I link to it.

maccard · on Aug 10, 2024

My point is that you replaced calling a third party library from python with an implementation in native. You could easily replace a c++ third party library with a full python implementation of the same thing and come to the opposite conclusion.

The example matters because it’s bad - you’re not comparing like for like.

maccard · on Aug 9, 2024

There's a huge swathe of languages between Python and C++. On threads about Node's performance (which is somewhere in between) people regularly say that it's unfair of developers to solely focus on devex, which is the argument here.

> On multiple occassions I had to port parts of Python projects to C++ for performance reasons, and the amount of code you have to write (and time you need to spend) is mind blowing. Sometimes a single line of Python would expand to several files of C++, and you always had to redesign it somehow so that memory and lifetime management becomes feasible.

Part of that is because it's designed for python. Using this benchmark example, the python code (without imports) is 36 lines and the go code is 70. Except, we're doing a 1:1 copy of python to go here, not writing it like you would write go. Most of the difference is braces, and the test scaffold.

I'd also argue the go code is more readable than the python code, doubly so when it scales.

rbanffy · on Aug 9, 2024

> it's unfair of developers to solely focus on devex, which is the argument here.

For most companies and applications, developer time is many orders of magnitude more expensive than machine time. I agree that, if are running applications that run across thousands of servers, then it's worth writing it in the most performant way possible. If your app runs on half a dozen average instances in a dozen-node K8s cluster, it's often cheaper to add more metal than to rewrite a large app.

PhilipRoman · on Aug 9, 2024

There is more to it than just time. If your app is customer facing (or even worse, runs on your customer's own resources), high latency and general slowness is a great way to make your users hate it. That can cost a lot more than few days of developer time for optimization.

rbanffy · on Aug 9, 2024

There is, of course, a limit. OTOH, if your devex is so terrible you can’t push out improvements at the same cadence of your competitors, you are doomed as well.

maccard · on Aug 9, 2024

No, it’s more expensive to you, and you get to ignore the externalities. It’s like climate change - you don’t care because it doesn’t affect you.

How many hundreds of hours are wasted every day by Jira’s lazy loaded UI? I spend seconds waiting for UI transitions to happen on that app. How much of my life do I spend waiting for applications to load and show a single string that should take 50ms to fetch but instead takes 30 seconds to load a full application?

lpapez · on Aug 9, 2024

I don't know, but personally I've spent orders of magnitude more time debugging memory leaks in languages without GC than the time I've wasted waiting for shitty apps to load.

Software bloat is a real thing, I agree, but it's a fact of life and in my opinion not a hill worth dying on. People just don't care.

If you want proof of this, try to find a widely used Jira alternative which doesn't suck and doesn't need a terabyte cluster to run. There isn't one because most people just don't care about it, they care about the bottom line.

sgarland · on Aug 9, 2024

> People just don’t care.

Devs don’t care, because they aren’t being forced to care, and they have loaded M3 MBPs, so they don’t notice a difference.

I firmly believe that we’ll see more of a swing back to self-hosting in some form or another, be it colo or just more reliance on VPS, and when we do, suddenly performance will matter. Turns out when you can’t just magically get 20 more cores on a whim, you might care.

rbanffy · on Aug 9, 2024

Not only that - when your ticket system becomes a Jira because of all the feature creep required, it’ll perform like a Jira.

Perhaps a little better if you can implement more in less crazy ways (it seems to keep a lot of logic into a relational database, which is a big no-no for the long term evolution of your system).

rbanffy · on Aug 9, 2024

How many years of our lives would be saved if Jira was rewritten in Verilog and ran on the fastest FPGAs?

Not much. I bet it spends most of the time looking up data on their massive databases, which do the best they can to cope with the data structures Jira uses internally. You could, conceivably, write a competitor from scratch in highly optimised Rust (or C++) and, by the time you have covered all the millions of corner cases that caused Atlassian to turn a ticketing system into Jira, you’ll have a similar mess.

maccard · on Aug 11, 2024

This ad absurdem argument is ridiculous. Nobody is talking about writing web apps in verilog except you.

I can’t claim to know why Jura is so slow - I don’t work for atlassian. It was an example of an app I use daily that I spend minutes per day staring at skeletons, loading throbbers and interstitials on.

Core architectural choices like “programming language” set a baseline of performance that you can’t really fix. Python is unbelievably slow and your users pay that cost. There are lots of options in between that offer fast iteration speeds and orders of magnitude faster runtime speeds - go and .net are widely popular, have excellent ecosystems and don’t require you to flash an FPGA.

carlmr · on Aug 9, 2024

>For most companies and applications, developer time is many orders of magnitude more expensive than machine time.

For me the context was mostly working in DevOps CI/CD. Slow CI scripts slow down developer feedback and cost machine time as well as developer time.

If your customers are your developers, performance engineering is about saving the expensive developer time.

And if you ask me the effects are non-linear, too.

If your build system takes 1 minute or 10 minutes for a diff build the time lost won't be a mere factor of 10. Any sufficiently long time and your developers will lose focus and their mental model, will start making a coffee, and maybe they might come back later to that work after 3 meetings that got scheduled in between.

igouy · on Aug 9, 2024

> Using this benchmark example, the python code (without imports) is 36 lines and the go code is 70.

Look at the gz values and notice that longer programs might be faster programs:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

~

"How source code size is measured"

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

visarga · on Aug 9, 2024

It all depends on execution time. If you run it just a few times or the total execution time doesn't take days, you can do it faster in Python end-to-end.

Quothling · on Aug 9, 2024

> I've seen so much time wasted on Python this way it's incredible.

It sort of goes both ways. You'll see a lot of time wasted on Python when the developers probably could have know it would need better performance. On the flip side you also see teams choosing a performant language for something which will never need it. One of the reasons I've come to prefer Go over Python is that you sort of get the best of both worlds with an amazing STL. The biggest advantage is probably that it's harder to fuck things up with Go. I've seen a lot of Python code where the developer used a List rather than a generator, loops instead of build in functions and so on.

That being said, I think we'll see a lot more Rust/Zig/C/C++ as things like ESG slowly makes its way into digitalisation decisions in some organisations. Not that it'll necessarily making things more performant but it sounds good to say that you're using the climate friendly programming language. With upper management rarely caring about IT... well...

maccard · on Aug 11, 2024

That reasoning is why I mentioned go - .net is another ecosystem that offers excellent bang for buck.

albrewer · on Aug 9, 2024

> But then you rewrite the whole thing in vanilla Rust without profiling once and it's still 20x faster

The main reason Python is used is because developer time is expensive and computer time is cheap. When that balance shifts the other way (i.e. the compute slowdown is more expensive then the dev time), Python is no longer the right choice.

There's no point in developing at half speed or less in Rust compared to Python if your code only runs a few times, or if the 20x performance gains only results in 60 minutes of total compute time saved.

carlmr · on Aug 9, 2024

>There's no point in developing at half speed or less in Rust compared to Python

In a lot of cases the Rust development speed is faster IME, because you can really get the correctness on the first go much more often than with Python. Also proc-macros make dealing with serialization and deserialization or CLI parameters a breeze.

albrewer · on Aug 12, 2024

> because you can really get the correctness on the first go much more often than with Python

Helpful only when you have a full-ish picture of the problem ahead of you and know what's correct when you start programming.

sgarland · on Aug 9, 2024

A lot comes down to familiarity, to be fair. I’m not a SWE by trade, I’m an SRE / DBRE. I can read code in just about any language, but the only one I consider myself really good in is Python.

I could and should learn another (Rust seems very appealing, yes), but at the same time, I find joy in getting Python to be faster than people expect. C being leaned on is cheating a bit, of course, but even within pure Python, there are plenty of ways to gain speedups that aren’t often considered.

v3ss0n · on Aug 9, 2024

You can enforce strict typing by using MyPy and Rust. Those are part of modern python development stack. If your program can be rewritten in python you are definitely not developing Fullstack Web Development or DataScience / Machine Learning. Rust isn't good at those since there are no good framework for DL/ML stuff - and working with unstructured data is very much necessary in those field where it is against Rust .

v3ss0n · on Aug 10, 2024

i meant Ruff in first line.

sgarland · on Aug 9, 2024

I dislike Go as a language for a variety of reasons, but nothing really objective.

I can agree with your sentiment that if performance is important from the outset, that writing everything in a language more performant by default is a smart move.

In a particular case for me, the project evolved organically, and I learned performance bottlenecks along the way. The actual functions I rewrote in C were fairly small.

maccard · on Aug 9, 2024

If you dislike go, then c# is another great alternative. I chose go because it’s fast, simple and the iteration times for go are often quicker than python apps in my experience