
Interactive programming for Machine Learning in 2017

Introduction

I always aim to reduce the time between writing code and actually seeing the results. Not only does this let me test rarely used APIs faster and catch bugs earlier, it also enables faster prototyping. In some areas, like machine learning and data science, it is not merely a matter of convenience but a necessity.

We can see the rise of IDEs and tools that make exploratory analysis easier, and more and more languages include a REPL (read-eval-print loop) in their toolset. These tools are designed to give you feedback about your code and your data as soon as possible. They give you a feel for how your data is structured and how changing simple parameters (or the entire algorithm) affects your solution, and they reduce the time between adding new code and seeing it in action.

I am a big fan of Bret Victor's work, especially his talk Inventing on Principle and related pieces like Learnable Programming. His ideas inspired projects like Light Table and Khan Academy's live coding environment. Light Table was positioned as an IDE that runs code inline and gives instant feedback; it also supported many different languages.

Light Table watches

It had a really successful Kickstarter campaign but later lost steam. Before that, however, it inspired (alongside Bret Victor’s work, of course) many great things, like Apple's powerful Swift Playgrounds and... Hydrogen, which I will show you today.

Hydrogen

Hydrogen is a package for the Atom editor that allows interactive programming in different languages. I would call it a bridge, or even a sweet spot, between Jupyter Notebooks and a full-blown IDE (like IntelliJ IDEA). The former is used by many people purely as an exploratory tool (even if it was not designed only for that), the latter is an application development environment. With Hydrogen you get the best of both worlds and more: you can test everything right away and still have room to organize your code as you would in an IDE for normal applications.

But what exactly do interactive programming and instant feedback mean in practice? For starters, you can do something like this:

Run parts of the code and get instant results

Select the code you are interested in and just press ⌘+Enter. Instant feedback.

You can execute and see anything inline, not only the parts of code that output text:

See graphics and plots inline in Atom

Watch expressions

This is really nice: you can tweak your algorithm, and with every change you send to the underlying kernel, the watcher re-executes your watch expression. Whenever you change something, for example to plot new values for precision, recall, accuracy and so on, you can browse the history of the results, like this:

Watch expressions in action
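To make this more concrete, here is a minimal sketch of the kind of code a watch expression could track. The labels, the predictions and the classification_metrics helper are all made up for illustration; in real work they would come from your own model.

```python
# Hypothetical ground-truth labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]


def classification_metrics(y_true, y_pred):
    """Return precision, recall and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# This is the expression you would put in the watch: it is re-evaluated
# after each run, so you see the metrics move as you tweak the model.
classification_metrics(y_true, y_pred)
```

Change y_pred (or the code that produces it), run the cell again, and the watched expression shows the new numbers next to the previous ones.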

Precise autocompletion

You get precise autocompletion for your live objects because Hydrogen does not need to guess from the source code, as static tools must for dynamic languages; it can inspect the objects in the running kernel and get all the information from them.

Autocompletion right from live objects

Object inspection
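The idea behind completion and inspection on live objects can be illustrated with plain Python introspection. This is only a sketch of the principle, not Hydrogen's actual implementation (Hydrogen asks the Jupyter kernel for this information); the Model class and the public_names helper are hypothetical.

```python
import inspect


class Model:
    """Hypothetical stand-in for any live object in the kernel."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = [0.0, 0.0]

    def fit(self, X, y, epochs=10):
        """Pretend to train; returns self so calls can be chained."""
        return self


model = Model()


def public_names(obj):
    """List attribute names of a live object, skipping private ones."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))


# Completion candidates come from the object itself, including
# attributes that only exist after __init__ has run.
completions = public_names(model)

# Inspection: the call signature and docstring of a live method.
signature = str(inspect.signature(model.fit))
doc = inspect.getdoc(model.fit)
```

Because the object already exists, attributes created at run time (like weights here) show up too, which a purely static analysis may miss.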

More features

The tool has even more fancy features. With these examples you can check what Hydrogen can do: an interactive JSON browser, image rendering, HTML iframes and even interactive plots with Plotly. You can also connect to remote kernels if you need more power. Yes, Hydrogen in fact connects to an underlying Jupyter kernel, so you have all of its magic at your disposal. And you can use different kernels... even in one file.

Different kernels in one file

So right from your code you can do things like ! pip install numpy or execute Jupyter’s magic commands. You can also divide your code into separate cells (like in Jupyter or RStudio) with # %% and execute all the code in a cell with one shortcut. If you want to see just your code, press ⌥+⌘+Backspace to clear all results, plots and so on; you can even move all outputs to the right dock and focus on the code.
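A file split into cells might look like the sketch below. The data and variable names are made up; the only convention taken from above is the # %% marker. The file is still plain Python, so it also runs top to bottom as an ordinary script.

```python
# %% Cell 1: prepare some (hypothetical) data
values = [3, 1, 4, 1, 5, 9, 2, 6]

# %% Cell 2: compute summary statistics
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)

# %% Cell 3: inspect the result
# When this cell is run interactively, the value of the last
# expression is shown inline; in a plain script it is simply ignored.
(mean, variance)
```

Lines starting with ! (such as the pip example) are a feature of the IPython kernel rather than Python syntax, so I left them out to keep the file runnable outside the editor as well.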

Ecosystem

Everything is done without leaving the hackable, highly configurable Atom, and this is why I think Hydrogen will succeed. Its developers do not need to write everything themselves, as the folks at Light Table did; they simply reuse Jupyter's kernels. They also have the entire Atom ecosystem at their disposal, including everything written by the community: linters, scripts, simple utilities and the configurable editor itself.

Useful Atom packages

Here is my subjective list of useful Atom packages:

If you liked the editor in IntelliJ IDEA, here are interesting ones:

Remember that you can install them right from the terminal with apm install PACKAGE_NAME; afterwards, choose Window: Reload from Atom's command palette to get everything installed and refreshed.

Similar environments

There is nothing exactly like Hydrogen, but you can think of some similar tools:

  • Rodeo: More similar to RStudio. Unfortunately, the company behind it (Yhat) was acquired and there have been no new commits for half a year; without further improvement it is unusable for me.
  • JupyterLab: Another thing worth watching, but for now it is nowhere near completion according to their site.

There are other IDEs similar to RStudio, such as Wingware and Spyder. They are worth checking out, but for me they were not configurable enough to make my work pleasant (Atom raised the bar significantly), and they lack some necessary shortcuts.

BONUS

  1. As a bonus, you could also try nteract desktop, a replacement for the Jupyter Notebook front end made by the folks behind Hydrogen.
  2. If you are interested in what the founder of Light Table is doing these days, check out his new project, Eve. It is not even close to commercial use, but it presents some really interesting concepts.