Julia, the One Programming Language to Rule Them All (wired.com)
163 points by luu on Feb 3, 2014 | 137 comments


I'm dismayed that this article portrays Julia so much as my creation and downplays the roles of Jeff and Viral – if anyone deserves the lion's share of the credit for the language, it's Jeff. I'm guessing this happened because I had a publication-quality photo available for use and told a relatable narrative in the interview we did, whereas Jeff stuck somewhat more to technical matters.

Oh, and Julia 1.0 hasn't been released yet. We're currently working on 0.3, which should be out in about a month.


The "One Programming Language to Rule Them All" headline was pretty horrible too. I don't hold it against you though – Julia is an exciting language and I really hope you guys do well.


I think it was a poor interpretation of: a single language that is good for prototyping and also good for writing performant code.


Precious...


You are bold and confident (on the internet, at least), qualities often associated with leadership, which can give the impression that you are the one in charge.


> You are bold and confident

That is what it said in the fortune cookie I got the other day.


A broken clock gives the right time twice a day :-)


So how often does a broken cookie give the right fortune?


Only once I'm afraid.


Anytime you feel like it.


Not if it's a 24-hour clock!

I'll show myself out. >.<


Wired in "inaccurate gloss of complicated technical subject" shocker! Seriously, they're like the Daily Mail of Silicon Valley.

P.S. well done on the language, sounds interesting.


You have to give them points for trying though.


Rewarding effort? In this meritocracy?!


I'm just loving Julia. Thanks for the work you've put in.

What is your timeline for 1.0, if that's on the near horizon?


The main dynamic this article fails to describe clearly is that the status quo for scientific computing is to do your nice, high-level modeling in MATLAB or R, and then, when you get good results, port it over to C/C++/Java for production. It's usually a write-only process, since the code you end up with in Java, for example, completely obfuscates the clean algorithm you designed in MATLAB. Julia's core value proposition (if I understand it correctly) is that you can build your models and algorithms in a MATLAB-like environment (with macros!) and then, lo and behold, you can actually just tune and deploy this code right where it is, without having to do a port.


(Note: I haven't tried Julia yet)

This would save so many man hours... At my shop, for every MATLAB engineer (mostly older guys) you need a C++ guru (mostly younger guys) to translate his stuff into something that will run fast.

On the other hand having seen the code the MATLAB folks write... I'd be worried. It's possible to write MATLAB in a way that is clean and makes the core algorithm presentable but that's rarely a priority. The real advantage of MATLAB is the quick development cycle. With the right packages you can make C++ look comparable... sorta.

Fundamentally the problem with MATLAB engineers is that they don't understand how to optimize code b/c they don't understand what's going on under the hood and they don't understand how to organize code b/c they don't have a concept of object oriented development, writing to interfaces, design patterns etc. etc.

EDIT: Just to clarify, I don't mean to denigrate MATLAB engineers. I'm just saying that in the current system you have two groups that specialize in different things. One focuses on algorithm development, and one on programming. I'm just speculating that maybe there is a good reason for it, and it may be unreasonable to ask one person to do both.


This is why one of the design goals of Julia is to have a fairly transparent performance model. In particular, if you write C-like code, you get C-like performance. A lot of the most twisted and impenetrable Matlab code exists because Matlab forces you to vectorize everything to get acceptable performance. Generic programming and design are also very important in Julia. It takes a while to get the hang of it, but people do seem to after a little bit.
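To make that vectorization pressure concrete, here's a quick NumPy sketch (illustrative Python, not Julia code) of why users of interpreted array languages contort their code into whole-array operations:

```python
import time

import numpy as np

x = np.random.rand(100_000)

def loop_sum(a):
    # Interpreted loop: every iteration pays the interpreter's dispatch cost.
    total = 0.0
    for v in a:
        total += v
    return total

t0 = time.perf_counter()
s_loop = loop_sum(x)
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
s_vec = float(np.sum(x))  # one call into compiled C
t_vec = time.perf_counter() - t0

# Same answer, wildly different cost -- hence the pressure to vectorize
# everything, and the resulting "twisted" code when an algorithm doesn't
# map naturally onto whole-array operations.
print(np.isclose(s_loop, s_vec), t_vec < t_loop)
```

The claim above is that Julia's compiler removes this pressure: a plain loop compiles down to machine code, so you only vectorize when it's the natural expression of the algorithm.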


Fortran 90 and later versions of Fortran have array operations, as in Matlab, but you can also write loops and have the compiler optimize the code. There already is a high-level language for scientific computing with C-like performance.


People often sneer at Fortran, but there's a lot that can be done with it in scientific computing, including very performant GPGPU computing. That being said, the dream of having a matlab like prototyping environment directly result in production level HPC code is probably never going to be realised in Fortran - maybe Julia will get there? All I can say is that the tooling needs to be top notch for that to happen and the performance can't just be in the order of magnitude of Fortran/C - it needs to be within 20% or so. Once you run software on clusters that cost millions, programmer time becomes rather cheap, especially in academic settings.


Is there a guide to performance optimisation in Julia anywhere?



Sure, but I guess the question is if the the MATLAB-oriented engineers don't have these skills because of the way things work now. If they had Julia, they could still throw together their hacky prototype but then, in the same environment, slowly learn how to refactor it into an appropriate structure for long term maintainability and production use. Currently, the path to learn this facet of software engineering has a huge gating function, "Learn C++", so it's unsurprising that the path of least resistance is for them to just throw their MATLAB scripts over the fence to the C++ folks.


I totally see your point and for a lot of people what you're saying will resonate (I can see this really catching on with grad students), but the question really boils down to: do you really want your engineers spending time learning these things?

From what I've seen - it's incredibly hard to find a good engineer that knows his stuff. They generally have PhDs and years of training in math and whatever particular field they work on. If you get one that is actually doing novel algorithm development, each one is a golden goose that will bring in revenue for the company for years to come.

By contrast if you offer a decent salary you can get a good C++ programmer with 3-4 years of experience to do all your C++ grunt work. (we hire CS students from the local University, and bring them up to speed within 6 months)

While not incredibly hard, the problem with learning "appropriate structure for long term maintainability and production use" is that it takes an awfully long amount of time. You need to go through a lot of trial and error and learning from your mistakes. You need to read programming textbooks, and keep up with the latest trends. It's actually a huge time investment.


    > it's incredibly hard to find a good engineer that knows 
    > his stuff. They generally have PhDs and years of 
    > training in math and whatever particular field they work 
    > on. If you get one that is actually doing novel 
    > algorithm development, each one is a golden goose that 
    > will bring in revenue for the company for years to come.
Like ritchiea, I admit I don't have a huge breadth of experience, but I've worked with enough PhDs to say with confidence that, more often than not, they represent a net negative contribution to an engineering team. Novel algorithms aren't useful if they can't run in production, and the PhDs I've known, with one or two exceptions, lacked both the ability and desire to produce production-quality code. I grew to deeply resent those "engineers" who would read papers for half the day, make buggy commits to prototype repos that never got released, and refuse requests from both peers and managers alike to contribute to the team's extant backlogs.


Sounds like a CS PhD! hahaha (maybe Math?) I want to preface by saying that for most work that goes on in Silicon Valley a PhD adds nothing.

I know what you're talking about.

However outside of SV there are a lot of places where that's simply not the case. What you have is basically a cross disciplinary collaboration where you just don't have the educational background nor the expertise to do the engineering yourself.

The resentment you're describing is really the same resentment you can feel towards your boss - "All he does is tell us what to do. He's not in the trenches like us!" - and it's directly related to the amount of mutual respect. In essence, how hard do you feel the other person is working relative to you.

Your description of your coworkers seems to indicate that you didn't feel like they're pulling their weight, and that's definitely a big problem.

I've heard second hand that in the valley there are a lot of incompetent PhDs that use their educational background as leverage to slack off. People will often hire graduates solely based on a degree, thinking that if someone has a Math PhD "they're smart, they can learn on the job, and they'll contribute in some magical way just by being there". But actually these guys have just spent 5 years in poverty getting an advanced degree and don't want to be code monkeys with a bunch of fresh-out-of-college 21-year-olds. And, oh yeah - reality check, no one cares about their degree in abstract algebra.

9/10 of these PhDs are crappy engineers and 9/10 times they are thrown at problems they don't really have a background in (so they can't even be crappy and regurgitate equations they've memorized).


What kind of work do you do where you find that PhD level education is requisite for good work and an algorithm designer is essential to your revenue? Hope this doesn't come off as accusatory, I'm very curious.


There is a whole world of programming outside of web frameworks and cell phone apps haha.

Think of the people that for example made the traction control system for your car. You think the same person that worked out the math in some simulation in MATLAB wrote the code that went on the controller circuit? It's not impossible, but it's not likely.

Closer to home, you can think of companies like Leap Motion, the Oculus folks, the Kinect people. Often companies that deal with video processing or control systems depend on a core set of algorithms to set them apart from the competition. The engineers behind it generally have a high level of understanding of linear algebra and often statistics and 15+ years of experience in a field.

It's very hard (though in many areas not impossible) for a programmer to learn that stuff to a level where they can innovate and contribute - so you end up relying on specially trained people who don't know about inheritance and how to use git hahaha

Where I work algo development is done by about 50% PhDs, 25% MAs and 25% BA/BS.


Thanks for replying. It's not so much that I don't realize there is a huge world of programming outside of web frameworks and cell phone apps, as it's nice to hear about those jobs. I don't get the opportunity very often.


I don't think the bar needs to be that high. Just write a decent API for your magic algorithm, choose decent names and types, tune the inner loops and grok the performance edge cases, write a few unit tests, and I think the 80/20 rule applies for anything much after that. A much better scenario for a handoff to another engineer than "here is my .m file, enjoy porting it to C++." They can refactor it up more and introduce interfaces, etc, from there.


These good engineers who are worth their weight in gold spend a lot of hours jumping through hoops to get Matlab to run in an acceptable time frame. And it makes it hard for anyone to figure out what they've done (with weird index tricks, for example).


What about the code generation tools that Matlab has been providing for the past few years? When I left the automotive industry, the adoption of these tools seemed like a sure thing.


This is a good point. The article puts up a straw man ("one language to rule them all") when it would be just as effective to lay out the actual situation in scientific computing -- it's uncomfortably heterogeneous at the moment.

Having one language that spans more of the space from the bare metal (C-like fine-grained numerical manipulation) to the top level (say, training a complex model on a large dataset in a REPL) would be very helpful.


Well articulated!

I'm actually working on related prototyping then deploy tools myself for numerical computing. Albeit using ghc Haskell as the substrate so I'm making a different set of tradeoffs than julia.


Also, having Macros and a built-in C-FFI.

Now give me some plotting and good Qt bindings, and I'm all set!


I see that the Julia page says that Julia can call out to C functions[0], but can C functions call into Julia? I.e., make a program "myprog" like:

  cc -o myprog myprog.c -ljulia
and be able to call julia_domath()?

edit: reference to julia documentation

[0] http://docs.julialang.org/en/latest/manual/calling-c-and-for...


It is a bit hidden, but calling julia from C is possible in the 0.3 prerelease: http://docs.julialang.org/en/latest/manual/embedding/


I don't see an 0.3.0rc or obvious github branch -- what's the best way to follow 0.3.0?


It is developed on the master branch, and there's a v0.3 milestone for the related issues. https://github.com/JuliaLang/julia/issues?milestone=7&page=1...

You can find recent binaries here: http://status.julialang.org/

You can also check the dev mailing list: https://groups.google.com/forum/?fromgroups#!forum/julia-dev


You can get compiled nightlies at http://status.julialang.org/. Otherwise, building the master branch will get you the prerelease.


No RC yet, 0.3-pre is the github trunk.


That is _exactly_ what I was imagining, and very exciting. Let the experimenting begin!


> Julia can call out to C functions[0], but can C functions call into Julia?

There are several aspects to this:

- Julia cannot generate linkable shared libraries, yet.

- There is a fairly small "libjulia" C API (the embedding reference in another comment), but this is a means to interact with/control the program "julia" (for dynamic interop. or IDE use, for example). "libjulia" does not export symbols for arbitrary Julia code.

- It is possible to generate a C-callable function pointer from a Julia function at runtime and to pass this into a C routine controlled by Julia (for example, to pass a comparator to a sorting routine, or an objective function to an optimizer).
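That last mechanism (a C-callable function pointer generated at runtime) has a close analogue in Python's ctypes, which may make the idea concrete. Here a Python comparator is handed to libc's qsort; this is an illustrative sketch of the concept, not Julia's API:

```python
import ctypes
import ctypes.util

# Load the C library (lookup is platform-dependent; CDLL(None) falls
# back to symbols already linked into the process on POSIX).
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Build a C-callable function pointer from a Python function -- the same
# idea as passing a high-level comparator into a C sorting routine.
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)

def py_cmp(a, b):
    # qsort-style comparator: negative / zero / positive.
    return a[0] - b[0]

arr = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), CMPFUNC(py_cmp))
print(list(arr))  # qsort sorts in place -> [1, 2, 3, 4, 5]
```

The callback crosses the language boundary on every comparison, which is exactly the pattern described for optimizers and sorting routines controlled from Julia.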


There are a few plotting libraries for Julia; I'm not sure how robust they are though. http://julialang.org/downloads/


Sounds like a fluff piece. It first tells us that this language is the "One Programming Language to Rule Them All", but then it adds:

> "That said, it isn’t for everyone. Bezanson says it’s not exactly ideal for building desktop applications or operating systems, and though you can use it for web programming, it’s better suited to technical computing.".

So what is it? Is this the supposed "One true language", or is it just another language with a narrow scientific focus?


I'd agree about it being a fluff piece, although I think the binary options you present aren't fair.

I'm not a core developer on Julia, just a user and package creator. It is geared towards technical/scientific computing, and excels in that task - it has replaced the combination of Python and C++ I used previously. I think what Jeff Bezanson is saying is that there is nothing intrinsically unsuited about Julia for doing web programming or desktop applications. It benefits from a fresh start in this respect, and also by using best-of-breed libraries from OpenBLAS to libuv. This is in contrast to other languages for technical computing like MATLAB or R which I wouldn't consider to be practical choices for making, for example, a math-as-a-service application (although R is getting into this territory with things like shiny).


After all the Julia hype on HN I thought I'd try it out.

I was surprised to discover how slow it actually is to use. Here's a quick demo of booting up the Julia repl with two libraries loaded and then immediately exiting:

    $ time julia -e 'using Distributions; using Gadfly; exit()'
    real	0m22.778s
With behavior like that it's not going to replace R as my go-to tool for graphically exploring data.

I'm sure they'll improve this eventually. Much like with Rust, my hopes are high. But also much like with Rust, I wish the hype came after the promises were met.

PS: this article claims that Julia 1.0 was released in 2012, but as best as I can tell the most recent code you can get calls itself 0.3. The above was with Version 0.3.0-prerelease+1174 (2014-01-23 16:58 UTC).


You literally picked the single package that loads the most other packages. Clearly this needs work, but just running vanilla julia without loading packages is not bad at all these days:

    $ time julia -e 'println("Hello, world.")'
    Hello, world.

    real    0m0.209s
    user    0m0.299s
    sys     0m0.104s
Julia now pre-compiles the jitted code for its base system, and the plan is to allow the same thing for packages. As others have noted, starting up Matlab is extremely slow and starting up a full R environment isn't all that fast either, nor is starting a JVM all that speedy. Startup time alone doesn't prevent a system from being useful.


Thanks for replying! I didn't anticipate that my comment would generate so much controversy. :( I was only relating my own experience in trying out the tool, and comparing it to the tool I currently use frequently despite not liking its semantics very much: R.

When I had encountered the slowness I was complaining about above, I poked around online and saw that you all were certainly aware of it and had plans to improve it. I think I even downloaded a 0.2 binary and then tried out trunk to see if it had already improved. All I had meant to say was (like Rust) that I anticipate I'll need to wait for a more complete version.


>> Startup time alone doesn't prevent a system from being useful.

Well, not completely... If I run a simulation that will take 10 hours, 2 seconds is not much at all.

However, it will severely reduce its usefulness in many other cases (quick & dirty scripting, testing)... Just for the record, on an i5 system:

  $ time julia --version
  julia version 0.3.0-prerelease
  
  real    0m2.156s
  user    0m2.119s
  sys     0m0.030s
  
  $time julia -e 'println("Hello, world.")'
  Hello, world.
  
  real    0m2.119s
  user    0m2.074s
  sys     0m0.036s

In comparison, starting an equivalent ipython script takes <0.1s on the same machine.


Where is this "use" part of which you speak?

Aggressive JIT languages are historically slow at startup. The typical tradeoff (similar to the server/client Java VM flags) is to amortize the optimizations over the length of the application if startup time is a factor.

If you start Julia (or Mathematica, Excel, or whatever) up every time you need to 'graphically explore data', you're doing it wrong. Unless you fubar your environment (not that I ever do this on a regular basis), there isn't a great reason to restart your exploratory environment.


> Where is this "use" part of which you speak

Maybe you don't want to use the old repl because there is something else going on in it and you don't want to mess it up. Maybe I want to show someone an example, or try something out quickly.

I use and spawn ipython windows all the time. Sometimes I have 4 open at the same time.

> Aggressive JIT languages are historically slow at startup.

Explaining the reason for the slow start-up doesn't make the slow startups faster.


>> Aggressive JIT languages are historically slow at startup.

> Explaining the reason for the slow start-up doesn't make the slow startups faster.

No, but insisting that you want startups faster, when the technical reasons they are slow have been explained to you, doesn't make them faster either. To me it sounds like:

"This ball shouldn't fall to earth".

"It fell because of gravity".

"Explaining the reason it fell, doesn't make it go upwards".


> No, but insisting that you want startups faster, when you've been explained the technical reasons why they are slow, doesn't make them faster either.

Oh please. "We've explained the technical reasons to you" excuse. As a customer the response to that is "ok, good luck with that, I am not using your product".

> "This ball shouldn't fall to earth" / "It fell because of gravity" / "Explaining the reason it fell, doesn't make it go upwards".

No the answer to that is, screw this planet, I'll go fly into space and find another planet. Which is the response people have with software they don't like.

It is more like:

"Oh, your database is slow" and getting the response of "Well of course it is, the locks at level 3 in our API have not been optimized because John is still on vacation. Geez, how dare you not accept that as a valid answer?"

See, you've equated software features with hard laws of physics, so escaping them is "ridiculous".

To be more serious, having an interpreter JIT a module only when you use it, on demand, or save the compiled trace of it across executions may be hard, but it doesn't require changing the fine-structure constant.


In fact, this is how a lot of heavy iron operates. You can use the same "binaries" ('TIMI') on your System i/AS400/POWER7 and the system will recompile what is essentially bytecode and save the machine-specific object code.

Alpha did something similar, and a number of feedback-directed/profile-based optimizing compilers will perform similar optimizations, but they are less flexible unless repeated cycles are performed in the deployed environment.


Pure JIT compilation is historically slow at startup. There are hybrid approaches, such as caching code once it's been compiled, and just loading the cached version on the next startup. IBM Java 7 does this (not just for shared classes, which client-side Java has done for a long time, but for an application's classes, running on a server vm), and it cuts about 70% of the startup time out of my app.


Seriously? Startup time as a deal-breaker for using this? What about other, much more important and relevant metrics? If you have to spend 23 seconds but save, let's suppose, thirty minutes with faster development and runtime, isn't the tradeoff worth it? I'm not saying that's guaranteed with Julia, as I can't speak to that, but I think of all the things to care about, startup time would be a relatively trivial concern.

Also, "one programming language to rule them all" is quite a fanciful stretch, considering people have been trying to write that since the first programming language.


It's a Wired article, please don't confuse it with the Julia community itself[1] - in my experience the latter is very enthusiastic, very ambitious, but also very willing to admit the current limitations and that it's in a very early open beta stage.

[1] Upon rereading, I don't think you are, but just making sure.


Thanks for charitably interpreting my maybe-ambiguous comments! My general experience (seen also in e.g. Haskell) is that the core communities are usually quite reasonable, and it falls on journalists or newbies to create hype, which is what produces confusing articles like this one.


Note that one of Julia's main competitors, Matlab, is laughably bad at this too.

Of course Julia has a slow startup, because it is JIT-compiled and will therefore run fast once started. That excuse does not extend to Matlab.


Startup times are not fantastic with packages. This hasn't been a big area of focus yet, but there is a roadmap for getting this down.

It's definitely a typo. The stable release is currently 0.2 (released near the end of 2013) and the upcoming version is 0.3 (a month or so away).


Unless Julia switches to an ahead-of-time compilation model, startup times will fundamentally stay slow.

LLVM can be used as a JIT, but it's really tuned for ahead-of-time compilation, and while the LLVM devs want to support JIT, their focus will always favor ahead-of-time compilation.


There's no need to switch Julia to an AOT model, it suffices to supplement it with one, which is exactly what happens for the Julia standard library on master. It's only a question to extend that capability to packages, which is ongoing work.


Couldn't the language simply cache the bytecode compiled when a package runs, and only reload if the timestamp on the compiled code is older than that on the source? I believe this is what python does with its .pyc files. Then it could go a step further and cache the binary generated by compiling the bytecode to machine code... seems like there's a lot of things that could be done to speed up package loading.
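For reference, the .pyc mechanism mentioned above can be demonstrated directly with the standard library; this sketch (file and variable names made up) shows the compile-once, timestamp-checked cache that the comment proposes as a model:

```python
import importlib.util
import os
import py_compile
import tempfile

# Write a tiny module, compile it to bytecode, and inspect the cache --
# the same compile-and-cache pattern proposed above for Julia packages.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "pkg_demo.py")
    with open(src, "w") as f:
        f.write("ANSWER = 42\n")

    # Compiles to __pycache__/pkg_demo.<tag>.pyc and returns its path.
    cached = py_compile.compile(src)
    cache_exists = os.path.exists(cached)

    # The cache path is derived deterministically from the source path;
    # the cache header records the source mtime, so a newer source file
    # triggers recompilation on the next import.
    cache_path_matches = (cached == importlib.util.cache_from_source(src))

print(cache_exists, cache_path_matches)
```

As the sibling comment notes, Julia's situation differs in that instruction selection dominates, so caching machine code rather than an IR-level "bytecode" is the more useful target.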


This is basically what I meant. Julia doesn't really have a bytecode. There is a lower, type-inferred form of the AST as well as the llvm IR representation. However, instruction selection itself takes quite a bit of time, so if we're caching the IR, we might as well cache the generated machine code (the challenges are basically the same). As I said, it's work-in-progress (it's not that it's hard or anything, but it's a bit of work do it properly, which isn't done yet).


So it becomes an AOT model with good jit support? Neat! I look forward to seeing how the design crystallizes.


Downvoted because:

1) Julia's performance is great. What you're measuring is precompilation time.

2) I hate that the top comment on HN is usually negative. Is that some stupid rule or something?


@1) Is it? [0] seems to indicate that it is usually about twice as slow as C and sometimes fares much worse. Of course, that article is over a year old by now so maybe there are improvements, but if I can get a 2x performance gain ‘just’ by writing in C/C++, I will certainly do so.

[0] http://justindomke.wordpress.com/2012/09/17/julia-matlab-and...


Wait, what? 2x slower than C is slow? And 2x performance gain is worth dropping from high to low-level programming? In what world, I'd like to ask? Do you frequently "just" write programs in asm to get additional performance gains?

It could well be that you really need every bit of performance; what I want to say is that the areas where "2x slower than C" matters are few and far between.


Numerical calculations (e.g. DMRG) on large systems easily take days, if not weeks, to converge sensibly. At the same time, the algorithms can be implemented in a relatively small amount of code (500 sloc in the Python prototype with some help from Numpy and Scipy, about 10k in C++).

So, yes, if code can be reused later on and spending a month rather than a week writing it will save you many thousand hours of CPU time, it might well make sense not to use some high-level language (although I wouldn’t classify C++11 as low-level, albeit sometimes a bit verbose).


I really don't get debates like this. You should simply account for the (a) time it takes to write the program AND (b) the time it takes to run the program. Selecting a language only based on (b) is clearly suboptimal.


Note that one of the benchmarks takes 5 ms in C -- if that's all you're trying to compute, why care about speed at all? And of course JIT will be a huge penalty on that time scale. There are a bunch of benchmarks on the front page of http://julialang.org/ that show only a small penalty for using Julia over C.


Compared to Python or MATLAB? Yes. 2x slower than C is amazing for a high-level and dynamic language. But of course, if performance is critical, C/C++ may be the better choice.


There's almost always a performance tradeoff when choosing to use some other language than C/C++; how much of a performance cost is the key question. If it's consistently within a factor of 2 of C performance, that's actually quite good, considering all of the high-level benefits you get out of it.


Are you making a case here that Matlab and R should never be used?


I don't think startup time, on its own, is a valuable metric for most use cases.


Julia's JIT makes startup a little slow.


I'm not really familiar with the intricacies of JIT, but wouldn't it be possible to pre-compile just the startup portion of the code, thereby dramatically reducing startup time? This part of the code would seem to almost always run in exactly the same way each time (with respect to each individual package being loaded, the startup of each being also pre-compiled), so there doesn't seem to be any value gained from JIT-compiling that process each time.


Yes, this is done now for the standard library, which has reduced time-to-REPL from ~3s to ~.5s. The same idea will eventually be extended to packages.


Sweet, sounds good. I look forward to seeing this. Or maybe contributing to it :)


Yes, Lisp systems have been doing exactly this for decades.


1) "Ruby and Python are good general purpose languages, beloved by web developers because they make coding faster and easier. But they don’t run as quickly as languages like C and Java." --- sorry to spoil the party, but NumPy is quite a bit faster than Julia in real-world applications. Something as trivial as summing the elements of an array runs much slower in Julia (unless they fixed it; I last checked about 4 months ago). I'm not even gonna get started on pypy / llvmpy / numba / cython. R, albeit a horrendous language, can hardly be beaten by anything at the moment due to its epic plotting tools like ggplot2 and the sheer amount of statistical packages created by scientists from all over the world; it's like github for many of the stats guys. Except unlike github, most of the packages are not open-source and hence can't be directly ported to python/julia/whatever..

2) Lack of existing infrastructure for scientists and general purpose stuff like various SQL adapters and bindings. Take any data-oriented/scientific field; there's gonna be a Python library or a C++ library with Python bindings for it. Data munging? pandas. Plotting? matplotlib. Machine learning? scikit-learn. MCMC? PyMC or PySTAN. Optimization? scipy. That's just to name a few... those are all huge libraries with many thousands of commits in their repos and many thousands of human hours behind them -- it will be pretty hard for anyone to catch up fast. /* it's worth mentioning here though that many libraries in the scientific Python stack were developed by aggravated R / Matlab users who loved the tools but hated the language => matplotlib/pyplot was initially a direct port of Matlab's plotting tools, pandas was initially a direct port of R's data.frame, etc. */

3) 1-based indexing. Seriously, wtf? /* insert an arbitrary Dijkstra joke here */

Not trying to bash Julia deliberately, but I'd say that by publishing a fluff piece like this, Wired in fact discredits the Julia community, who are nice and enthusiastic folks.


1) is probably still true at the moment as long as you really need that large array, but the performance gap is likely to disappear once we have better garbage collection and SIMD support, which are both open pull requests. On the other hand, if your algorithm can be expressed in a loop that touches less memory, in my experience, devectorized code in Julia beats vectorized code in NumPy or MATLAB, and absolutely destroys devectorized code in those languages. This is vitally important in my work, since the data sets I work with are so large that expressing the algorithm in terms of vector operations would require more memory than my computer actually has (and it has 64 GB of memory).
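The memory point can be sketched in NumPy itself (illustrative Python; the Julia equivalent would be a plain loop over a preallocated array): every step of a vectorized expression allocates a temporary the size of the data, while in-place "devectorized" style reuses one buffer.

```python
import numpy as np

n = 1_000_000
x = np.random.rand(n)

# Vectorized expression: `2.0 * x` allocates a fresh million-element
# temporary, and `+ 1.0` allocates the final result -- peak memory is a
# multiple of the data size, which is what breaks on huge data sets.
y_vec = 2.0 * x + 1.0

# In-place / devectorized style: one preallocated output buffer, reused
# for both steps via the ufunc `out=` parameter.
y_inplace = np.empty(n)
np.multiply(x, 2.0, out=y_inplace)
np.add(y_inplace, 1.0, out=y_inplace)

print(np.allclose(y_vec, y_inplace))
```

In NumPy the in-place form is awkward to write; the argument above is that in Julia the straightforward loop form has both the low memory footprint and the speed.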

Numba and especially cython are bolted on hacks to deal with the poor performance of the interpreter. They are not conducive to the kind of modularity you can get with Julia or plain Python, and they have their own set of rules for achieving high performance. Their main upside is that, as you say below, people already have a lot of Python code, but if you were starting from scratch you would probably not design them into the language. NumPyPy still doesn't implement all of NumPy, although I think it's been under development for roughly as long as Julia.

2) would be a problem for any new language, but it is partially mitigated by the PyCall package, which makes it easy to call almost any Python code from Julia. There are even Julia wrappers on top of pandas and matplotlib that allow you to use them while following standard Julia programming conventions. I've played around with calling PySTAN from Julia, and it's not substantially harder than calling PySTAN from Python.


> once we have better garbage collection

IMO the importance of the garbage collector needs to be underscored.

My experience with Julia was that if I pre-allocated arrays and wrote in a low-level (C-like) way, then it would be fairly fast (also, you need to avoid closures, but that's another issue).

If, however, I generated any new values inside loops (and using proper abstractions, this is hard to avoid), then garbage collection would often massively slow down computations and make performance hard to predict.
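As an illustrative sketch (hypothetical function names, assuming nothing beyond base Julia), the usual workaround is to hoist the allocation out of the loop and mutate a reused buffer in place:

```julia
# Allocating inside the loop creates garbage on every iteration:
#   for _ in 1:n
#       tmp = a .* b        # fresh array each time -> GC pressure
#       total += sum(tmp)
#   end

# Preallocated version: one buffer, reused every iteration.
function accumulate_products(a, b, n)
    tmp = similar(a)        # allocate once, outside the loop
    total = 0.0
    for _ in 1:n
        for j in 1:length(a)
            tmp[j] = a[j] * b[j]   # write into the reused buffer
        end
        total += sum(tmp)
    end
    return total
end
```

It works, but as you say, doing this everywhere fights against the abstractions you'd otherwise want.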

Unfortunately, I think having a truly high performance garbage collector is a massive undertaking. I hope I'm wrong about that.


See https://github.com/JuliaLang/julia/pull/5227 for the open PR. The benchmark there suggests 2-3x improvement. There is a decent amount of low-hanging fruit, although "truly high performance garbage collection" means different things for different applications and LLVM is apparently not a good platform for generational GC.

There are further possible improvements that are not strictly related to GC. We could determine that an array of the same size is being created on every iteration of the loop and allocate that memory only once. Ideally we'd automatically devectorize code to remove temporaries, but that is quite hard.


> Numba and especially cython are bolted on hacks to deal with the poor performance of the interpreter.

Could you explain this a little -- what is so fundamentally broken about pypy/numba? From the little I have read [1], it looks like they are inferring types to enhance performance and doing a JIT (just like Julia).

I am of course wondering if the magic of Julia cannot be ported on top of Numba or something.

[1] http://continuum.io/blog/numba_performance


pypy is generally incompatible with CPython libraries -- most notably plain vanilla numpy. There are ongoing attempts to port numpy to pypy (numpypy), but other than that you're just left with pure python and pure python libraries.

Numba/Cython restrict what kind of code you can write so they can (jit-)compile it. Cython is less restrictive since it's more mature but it uses special syntax and is technically not python anymore.


> 3) 1-based indexing. Seriously, wtf? /* insert an arbitrary Dijkstra joke here */

There are arguments in favor of 1-based indexing, especially given one of Julia's target replacement languages is R.


1) Most of NumPy is written in C. Pypy is slower in some scenarios than CPython (when the JIT doesn't kick in). Numba and Cython are supersets of subsets of Python.

2) Lack of libraries should never be a show stopper when it comes to adopting new languages.

3) Lua.


> "What we need, Karpinski realized after struggling to build his network simulation tool, is a single language that does everything well."

and later:

> "Bezanson says it’s not exactly ideal for building desktop applications or operating systems, and though you can use it for web programming, it’s better suited to technical computing."

So, Julia is not what we need? ;)

Besides this contradiction I'm not entirely convinced that we need one language to rule them all. I'm not even sure I want multi-paradigm languages (C++ anyone?).

Aren't the problems explained in the introduction of this article caused by the tools rather than the programming languages? I don't see how using multiple languages complicates patching (unless the article is talking about monkey patching, which I doubt), but I would understand how going through 4 different debuggers could drive you crazy.

Disclaimer: I'm not familiar with Julia and I will not judge it based on a Wired article.


> What we need, Karpinski realized after struggling to build his network simulation tool, is a single language that does everything well.

That wasn't quoted in the article, so I'm assuming it was the author's words, not Karpinski's. The Julia devs seem to be more interested in solving the (prototype) -> (translate to low level language) -> (optimize for production) inefficiency rather than making one language to solve every conceivable problem.


Me reading the Julia documentation: "Interesting... hmm, don't quite understand that, but moving on... oh that's neat... wait, arrays are 1-indexed, not 0-indexed? ABORT!"

EDIT: and I consider this to be all the proof I ever need: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EW...


1-indexed arrays are pretty common in languages targeted at scientific/mathematical usage. Also the case in R, Fortran, Matlab, and Mathematica, among others.


To be fair to Fortran, you can easily choose whatever indexing you want. 1 is just the default.


That's even worse!


"This is the way we've always done it" is not a convincing argument.


The argument is more along the lines of 'the users for these systems use 1-based subscripts in their technical literature so we'll make conversion from formula-in-physics-paper to running code easier by having the corresponding data structure match the user's expectation.'

0-based arrays are easier for computer folks who occasionally have to implement the code that translates 'a[i]' into a memory access. 1-based is easier for physicists and mathematicians who were subscripting variables before there were computer implementations to worry about and had a pile of literature already written using that idiom. It's just a clash of conventions.


Except that's not the argument.


... for or against 1 or 0 indexed arrays? ;)


A lot of mathematics (and real life, but let's not go there) is written 1-based, which makes it quite a pleasure to translate algorithms into Julia. But I don't think it's a big criticism OR selling point really -- sometimes it's good, sometimes it's bad. I personally enjoy it, and I make fewer indexing errors than I do when writing array/matrix-intensive code in C++.


"This is the way we've always done it" is not a convincing argument. Dijkstra makes all of the convincing arguments for why 0-based is better than 1-based.

I haven't made an indexing-based error in over 10 years, and the only reason I ever had it before was from having to switch back and forth between C and VB6 years ago.


I'm guessing you don't translate algorithms in scientific papers. The reality is that having one notation for mathematics and programming is a convincing argument to many. More broadly, this indexing thing is trivial. Let's not pretend Dijkstra's paper is proof of anything. At the end of the day his argument is just style: "That is ugly, so for the upper bound we prefer < as in a) and d)"

http://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831....


> "This is the way we've always done it" is not a convincing argument.

You're fighting a straw man; that isn't the argument. 1-based indexing is elegant for many mathematical uses. Technical computing often implements algorithms that are best written down using mathematical notation. In such a case the largest conceptual difficulty is not the origin of the index but the successful translation of the algorithm. To minimize the possibility of error, the original indexing (especially when it involves nontrivial mathematical maps) is often preserved.

> Dijkstra makes all of the convincing arguments for why 0-based is better than 1-based.

He essentially makes only one point. Namely that, when using 0-based indexing we can easily determine the length of a sequence by only its upper index. Though it's not a profound observation since the index bias is zero, it often gets treated as such.

Either way, it's a convention. A competent programmer should be able to handle any well-specified indexing convention (including those starting at negative indices). Different circumstances will confer different benefits on different conventions, and you should use the one that's most appropriate for the task at hand rather than religiously promoting The One True Way (TM).


The big drawback of 1-based indexing is that with 0-based indexing you can reference elements relative to the tail. If 0 is the first index and -1 the last index, then 1 is the second index, -2 the second last, 2 the third, -3 the third last and so on. 1..-1 would be all elements from the second up to and including the second last.

I don't see how you could have indexing from the tail being internally consistent if the arrays are 1-indexed.


Julia uses the keyword `end` to indicate the last index.

1 and end for the first and last, 2 and end-1 for the second and second to last, and so on.
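A quick sketch of what that looks like in practice:

```julia
a = [10, 20, 30, 40, 50]

a[1]        # first element  -> 10
a[end]      # last element   -> 50
a[end-1]    # second to last -> 40
a[2:end-1]  # everything but the first and last -> [20, 30, 40]
```

So the tail-relative addressing is there; it's just spelled `end-k` rather than `-k`.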


One counterargument is that in natural language, we tend to talk about the numbers 2 to 12 and not the interval 2 <= x < 13. The argument for the latter versus the former is probably Dijkstra's weakest: the claim is merely that this makes the empty range "unnatural." The former convention implies one-based indexing for the same reason that the latter implies zero-based indexing.

In Julia, the natural numbers 2, 3, ..., 12 are expressed as 2:12 and the empty range starting at 2 is expressed as 2:1.


This is also typically written in algorithms like:

  N = rows(data)
  for i in 1:N
    blah
So even though 1:0 may seem unnatural as an empty range, it actually reads perfectly in most code.


For me, the biggest positive is avoiding indexing errors between 0-indexed languages (Python) and 1-indexed data (almost everything I get). Patient IDs, simulation code, etc. hardly ever start with "0" as the first ID unless they were created by a computer scientist, and having to make sure everything is looking at ID-1 instead of ID is a pain. And it makes the code less useful to the people who use it rather than write it.



And, once again,

"And that’s the most coherent argument I can find; there are dozens of other arguments for zero-indexing involving 'natural numbers' or 'elegance'[2] or some other unresearched hippie voodoo nonsense that are either wrong or too dumb to rise to the level of wrong."

I still believe this is the only time EWD's work has ever been referred to as "unresearched hippie voodoo nonsense".

Is this the same Mike Hoye who hacks Zelda?[potato]

[potato] http://www.huffingtonpost.com/2012/11/16/mike-hoye-hacks-zel...

[2] http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EW...


That article assumes that history is the most important consideration. It doesn't actually evaluate the pros/cons of 0-vs-1-based indexing at all, it just traces the story, as if the path we took to get here is more important than the content of the argument.

History can inform the argument, but history does not make the argument. Observing the effects of 0-vs-1-based indexing in existing languages can give us data about the pragmatic effects of either decision. But the precise chain of cause-and-effect of decisions made in the 1960s in an environment where both the hardware and social structures around computing were radically different than they are now is really not that relevant to the question of which approach is better overall.

The article is also snide and openly insulting to nearly everybody (including Dijkstra); combined with its lack of a convincing argument, this article does not deserve to be treated as a credible position piece in support of 1-based indexing.


Thanks. Thoroughly enjoyed that article. :)


1-indexing has its drawbacks (ranges) but it also has benefits. From the projects I've done in Julia, my impression was that the net effect of 1-indexing was positive.


When you say, "1-based is more natural to how humans count", You're really saying, "I'm a child who can't think abstractly, so I probably shouldn't be let anywhere near a computer programming language."


I'm a very good programmer and I agree with him.


OT: are you the same RivieraKid who used to post in the Gamedev.net Lounge several years ago?


No, I'm surprised my nick isn't unique.


Julia seems nice, but the fact that the developers don't care about TCO is annoying and vaguely reminiscent of Python. If this were added, not only would many algorithms become more natural to express, but it'd go to the top of my list as a recommended first language (instead of Scheme or Lua).


Wasn't one of Bjarne Stroustrup's reasons for building C++ to make the network simulations he was working on more tractable?


More clickbait and erroneous reporting from Wired. Typically, I stay away from most of their articles, besides the Threat Level column, which actually hosts engaging pieces of journalism.


I keep seeing Julia promoted as the grand new language but honestly, I am incredibly sceptical. Working in the scientific programming community, I am keenly aware of how slow things move. Only in the last few years have I seen a push towards even using Python as a viable scientific tool. The problem is that science moves slowly and, for the most part, scientists aren't interested in the programming aspect too much, but in getting the results. They have one system that works and they stick with it.

One problem that I don't see mentioned is the verification of correctness. There is a reason people use decades-old packages and code: they know that the tool is reliable and correct. My supervisor's code is based off his supervisor's code, which has been around since the 80s. At least 50+ papers have been published based off this specific code, so we know that it is correct. The amount of time it would take to re-write and test the code in a new language would be a very large time-sink that no one wants to take on, because they could spend that time doing research. I think this is part of the reason why things change slowly in the scientific computing community as a whole. There have been tens of thousands of papers published that use Fortran/C/C++/Matlab, and this new kid Julia comes along -- and how many scientific papers have been published using Julia code? I would imagine barely even a fraction. To scientists, this is reason to be sceptical. I'm not saying I agree with this, but it is how the scientific computing community thinks.

Also, on a personal note, the marketing needs to change. Since one of the creators is reading this: if you want to win over the scientists, drop the grandiose claims of "one language to rule them all", although that's probably not your fault. I immediately smell bullshit a mile away when I hear that. Too many times in the history of scientific programming languages have people made similar claims only to see them crash and burn. This is why many of the older folks won't bother.

Reading Julia's homepage, I would be unconvinced about why I should choose Julia over Matlab. Heck, why not use Python? Python is open source and has been around a lot longer, and people have certainly heard of it.

Also, your speed tests -- well, fix them. For example, Matlab vs Julia. Calculating the Fibonacci numbers? Honestly, who cares. Plus, why are you even computing them recursively? That's a silly way to do it. I understand it demonstrates recursion, but it's a pretty useless thing in my opinion. I only ever see the Fibonacci numbers come up in programming contest problems. I can honestly say I've never seen them used in real applications that scientists do, and if they were, you would never implement them this way. Also the implementation of quicksort: why did you write your own? Use the built-in one for a fairer comparison. No one writes their own sorting algorithm; people use the language's built-in one. Do that here. It is disingenuous to Matlab to not use the built-in functions.

The real test that I would care about is the random matrix multiplication. That's something that is way more common than calculating Fibonacci numbers. And here Julia is a little faster than Matlab, but not by much. I'd be sceptical of why I should switch. Similarly, a single number tells me nothing. Just because it says Julia is faster -- well, that's one test. How well does Julia scale? Is it true that Julia is faster than Matlab for multiplying NxN random matrices at every size? A single test tells me nothing. I'm obviously biased towards matrix tests since my research used a lot of matrix operations, but if you read papers about various codes, they almost always include scaling tests. This is what my colleagues would care about more than anything, since they have massive datasets, so scalability is critical. Remember you are trying to sell this to scientists who are, obviously, scientists, so having good-quality data to show them is critical. I discussed Julia with my colleagues and this was one of the things we immediately mentioned. The tests shown didn't convince us at all that it was even worth considering.

Plus, I'll be honest, I feel the homepage is marketed more towards computer scientists than numerical scientists. The first feature is "Multiple dispatch: providing ability to define function behavior across many combinations of argument types". The hell is that? Why do I care? Remember who your audience is! To give you an example, I will tell you about my office mate. He is a PhD student working on turbulence and uses Matlab for all his data processing afterwards. His data comes from simulations written in Fortran that run on the local supercomputing cluster. They are stored in the NetCDF format. After the simulations are done, he opens up Matlab, reads in the NetCDF file and runs a bunch of data-processing functions to compute various quantities from the raw data. He then plots these things. From the Julia homepage, I don't even know if Julia can do graphics. Can it? If it can, the homepage doesn't mention any plotting libraries. So, I repeat, what exactly does multiple dispatch mean and why is it so important? If you want him to be using Julia, then you need to explain to him why he should be using it over Matlab. Look at the numpy homepage. It is short and to the point. Plus it has a nice link to "NumPy for Matlab users", which is something I would want to know immediately. With Julia, as far as I can tell, you have to go to docs, Julia manual, Noteworthy differences from other languages (which is at the bottom of the screen!!).

I could probably come up with more criticisms and suggestions but I don't want to write a book. I should say, I completely and fully support what Julia is trying to achieve. I find that Python is just too awkward to be a good Matlab replacement. To me, Python is a very general tool that also has the ability to do similar things to Matlab. However since Matlab is designed to do numerical computing, and Python isn't, I find Python to be awkward for some things. But, at least it can run on the computing clusters. Matlab is stupid with licensing and trying to get it to run on the supercomputing clusters is near impossible. Definitely in the future I am going to try Julia as a data-processing step since it is free and open-source and not subject to any of that licensing crap. Plus the parallel aspect interests me although I have no idea how good it is since I don't see any graphs showing what sorts of advantages I might get. Plus the documentation page is rather dense text-wise and not that appealing to read right now.

But I think the tl;dr is: remember who you are marketing to. Remember that scientific computing moves slowly. I know colleagues who have just discovered version control and think it's amazing.


Why not use Python? Do you mean Python 2.7 or Python 3.x?

Multiple dispatch lets the user define several functions with the same name and have the compiler pick the right one based on the arguments. For example, you could have three versions of the foo() function, one for reals, another for complex, and a third for matrices. You could define your own types and write additional versions of foo()—the compiler will pick the right one based on the underlying types. Julia's type system makes a lot of sense once you've gotten the hang of it.
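A minimal sketch of that (the `describe` function here is hypothetical, just for illustration):

```julia
# One generic function `describe`, with a separate method per argument type.
describe(x::Real) = "a real number"
describe(x::Complex) = "a complex number"
describe(x::AbstractMatrix) = "a matrix"

# Julia dispatches to the most specific matching method:
describe(3.0)          # -> "a real number"
describe(1 + 2im)      # -> "a complex number"
describe([1 2; 3 4])   # -> "a matrix"
```

The practical payoff for numerical work is that generic algorithms automatically pick up efficient specialized methods when you hand them your own types.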

I'd suggest fiddling with Julia if you have some time, as I think it rolls together some good ideas. But it still has a way to go IMHO. Redefining functions with dependencies in the interactive toplevel doesn't work quite right, and you can't delete entries from the symbol table. It concerns me that these things don't work. I figure that I'll give it another look as a tool for serious work when 1.0 gets released.

I'll stick with R in the mean time -- less chance of unpleasant surprises. But I could see Julia replacing R for me in a few years if it's done right.


I'll admit that is an interesting feature, but honestly, the first thing on the list? It's certainly a cool and interesting idea, but I'll admit it's not a super important feature in my view. Thinking back on all the research I've done, I can't think of a single time where I would need such a feature. I never use any data structures beyond matrices, which Matlab handles by default. Since 1x1 matrices are just numbers, every function that works for matrices works for single numbers.

I do use Python, but it came into it late on my project. Future projects will definitely use Python since it's good enough for data processing stuff. Hard core numerics will still be done in Fortran/C.

I definitely do see Julia being the better version of Matlab but it certainly has a ways to go.


> I'd suggest fiddling with Julia if you have some time

This is my biggest problem with Julia. I'd love to tinker with Julia and port some code over, especially some slow Python code, but it's hard to do while trying to shove papers out the door, knowing it'll be days/weeks before I'm up to speed and that there are libraries I may have to reimplement (probably poorly, what with being a scientist rather than a programmer).


>On the Julia homepage I don't even know if Julia can even do graphics. Can it?

Well, you could scroll to the bottom of the page...

My personal experience with Julia is this: I rewrote an algorithm from MATLAB to Julia, and received a 20x speed boost without any effort. That, and a very clean language design are why I'm using it a lot more these days. I would encourage you to just try it out.


Haha, completely true. Still, it's not as explicit as it could be. Am I being nitpicky? Sure, but the word "graphical" only appears in the context of the IJulia notebook. Again, if you want to really sell this to scientists, explain the graphical stuff, since plotting is an essential part of doing computational mathematics.

It is interesting your example about Matlab. Did you design it with vectorisation in mind? Not to dismiss your coding skills, but I've seen a lot of poorly-designed Matlab code that just doesn't exploit the Matlab language. The office mate I mentioned above was one of them. I took a look at his code once and vectorised it and something that was taking 10 minutes took 30 seconds.

I am definitely going to use it on future projects. Since right now I'm finishing up stuff it wouldn't make sense to re-write all my Matlab processing routines into Julia. I think a good specifically designed open-source version of Matlab is needed. Yes I know about Octave but I've never seen anyone really use Octave since it never seems to work for more complicated things.


> It is interesting your example about Matlab. Did you design it with vectorisation in mind?

Sure did. I've been using MATLAB since before it had a JIT. In fact, for the Julia code I didn't even devectorize which would play to its relative strengths. I also didn't have to pull out a profiler for all of this, it just happened with the first stab at it.

>Since right now I'm finishing up stuff it wouldn't make sense to re-write all my Matlab processing routines into Julia.

No need to, just do what I did and pick one function that's called in a tight loop and try testing that. If you don't find the performance that you'd like, post a code example on the julia-users mailing list (under the Community link on the website). Folks will either help you to 1) improve your code with Julia's design in mind, or 2) use your code as a test case to improve Julia itself. The community is very helpful.

A peek at the performance tips section in the manual is a good idea. Don't forget to wrap your code in a function for testing, and to "warm up" your code (= give the JIT a chance) before calling the @time macro on your function. Good luck!
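A sketch of that benchmarking pattern (hypothetical function name, just to show the shape of it):

```julia
# Put the hot code in a function so the JIT can specialize it.
function sum_squares(n)
    s = 0.0
    for i in 1:n
        s += i^2
    end
    return s
end

sum_squares(10)          # warm-up call: triggers JIT compilation
@time sum_squares(10^7)  # now the timing reflects the compiled code
```

Without the warm-up call, the first `@time` would include compilation overhead and make the language look far slower than it is.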


Working for a small company that does a lot of parallel combustion CFD modeling and simulation, I'd love to see our engineers try out Julia and see how much easier it could be on them.

But, from my experience trying to establish rules for using version control, I also realize that they're unlikely to ever rewrite (or even modularize) their codebase to take advantage of any other language.


FTA: "Julia is also designed parallelism." Did the author even proofread the article?


I'm guessing the best predictor of whether Julia will be the "one language to rule them all", is the difficulty of writing macros in it. If they somehow figured out a way to make writing lisp-like macros in a language with infix notation easy, Julia will easily become the next big thing. Otherwise I would have to say that the trade offs between syntax and macros would make lisp a better choice (see pg's argument for lisp macros[1]).

For someone who has used both Julia and lisp macros, how do they compare?

[1] http://www.paulgraham.com/avg.html


Way back in the heyday of Byte magazine, I recall ads for the pending release of The Last One, a programming language to end all programming languages. Once released, it disappeared. (Anyone remember it?)


Is it easy to create non-blocking TLS socket connections with Julia? How about making an HTTPS request? If Julia has async operations, are they inline, promises, events or callbacks?


Yes, actually, this is quite easy. Basically the I/O model uses cooperative coroutines. For TLS, you'll want to look at https://github.com/loladiro/GnuTLS.jl. Then, after setting up the stream with the underlying socket, protocol information, etc. (see the README for how to specify this), you can just use it as any other IO stream...

    # Set up stream `sock` here
    @async begin
        println(sock,"Hello World")
        line = readline(sock)
        println(sock,"You typed ",line)
    end
Basically the encryption happens on the fly and the stream can be used just like any other socket. For example, in the http client code here: https://github.com/loladiro/Requests.jl/blob/master/src/Requ..., the only difference between HTTP and HTTPS is how the socket is set up, the actual reading and writing is the same. Hope that helps!


I got to play with Julia at the UChicago workshop. Good stuff, especially if you want to move away from MATLAB.


The one programming language to rule them all that isn't for everyone.

I had to laugh at the irony of the juxtaposition of the title and the final sentiment! I like languages so I'll definitely be checking out Julia.



The article contradicts itself resulting in nothing more than a silly, uninformed rant.

If:

"Together they fashioned a general purpose programming language that was also suited to advanced mathematics and statistics and could run at speeds rivaling C, the granddaddy of the programming world."

Then:

"That said, it isn’t for everyone. Bezanson says it’s not exactly ideal for building desktop applications or operating systems, and though you can use it for web programming, it’s better suited to technical computing."

does not follow based on the definition of "general purpose."



