ninepints.co

The last two years have been, to put it mildly, politically frustrating. Not a week has gone by without some terrible policy decision or profound ethical lapse (often several of both!) emanating from the current administration. I've had to dial back my news consumption for the sake of my time and my sanity.

With midterm elections approaching, we finally have a chance to do something about it. It goes without saying that you should vote, whatever your political affiliation. It's your duty as a member of this democracy to help us determine the future of the country. But if you live in a state where your favored candidates are overwhelmingly likely to win, as I do, casting your vote might not feel particularly satisfying.

The obvious solution is to work towards changing other people's votes. I recently discovered an organization called Tech Solidarity, and they're funding a slate of thirteen House candidates dubbed the Great Slate for the upcoming midterms. You can learn more about the project, meet the candidates, and donate by visiting this page. I also encourage you to read this 2017 meeting transcript for a better look at what Tech Solidarity is about. I think they're doing good work that deserves my support, and I hope you feel the same way.

The end of the transcript mentions the importance of early investment. Campaign funding tends to pick up as election day approaches and the closer races become apparent, but it's most needed at the beginning of the election cycle. Intuitively, a strong base is built on early, persistent, ground-level voter engagement, not a deluge of last-minute ads. I wish I'd thought about all of this last year, but I figure better late than never.

All of these candidates have pledged to accept no money from corporations or associated political action committees. They're running campaigns powered by individual donations, which means your dollar really counts.

One of my work projects is publicly available! We call it Reflow. Written in Java, it's a library for composing individual units of work into a directed acyclic graph. My team has started using it to drive a bunch of our data processing, and now you can try it out too. Source and documentation can be found on GitHub, and as of today, we're also publishing build artifacts to JCenter.

Once a dependency graph has been defined, Reflow enables you to run it end to end in a single method call, with multiple tasks executing in parallel when possible. And there's more:

  • Each task can declare that it will produce some output (database tables, files on local disk, etc.), and those tasks can be skipped when the output is already present.
  • Task definitions are extremely flexible—in fact, the only real requirement is that each task be representable with a Java object. If your tasks happen to implement the Runnable interface, it's easy to get them scheduled on an Executor of your choosing, but you can opt to handle scheduling yourself and even schedule tasks outside of the JVM.
  • If you do schedule tasks externally, the state of the overall workflow can be serialized even while tasks are running. This allows you to bring down one “coordinator” process and bring up another without missing a beat.
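To make the idea concrete, here's a minimal sketch of the general technique using only the JDK's CompletableFuture. To be clear, this is not Reflow's API: the DagSketch class, its Task type, and the runAll method are hypothetical names invented for illustration. It shows tasks wired into a dependency graph, independent tasks running in parallel on an Executor, and a task being skipped because its output is (pretend) already present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names come from Reflow.
public class DagSketch {

    // A unit of work, a check for existing output, and the tasks it depends on.
    static final class Task {
        final String name;
        final Runnable work;
        final BooleanSupplier outputPresent;
        final List<Task> dependencies;

        Task(String name, Runnable work, BooleanSupplier outputPresent, Task... dependencies) {
            this.name = name;
            this.work = work;
            this.outputPresent = outputPresent;
            this.dependencies = List.of(dependencies);
        }
    }

    // Runs each task once all of its dependencies have finished.
    // Assumption: the list is ordered so dependencies appear before dependents.
    static void runAll(List<Task> tasks, ExecutorService executor) {
        Map<Task, CompletableFuture<Void>> futures = new IdentityHashMap<>();
        for (Task task : tasks) {
            CompletableFuture<?>[] upstream = task.dependencies.stream()
                    .map(futures::get)
                    .toArray(CompletableFuture[]::new);
            Runnable step = () -> {
                // Skip the work when the task's output already exists.
                if (!task.outputPresent.getAsBoolean()) {
                    task.work.run();
                }
            };
            futures.put(task, CompletableFuture.allOf(upstream).thenRunAsync(step, executor));
        }
        // Block until every task in the graph has completed.
        CompletableFuture.allOf(futures.values().toArray(new CompletableFuture[0])).join();
    }

    public static void main(String[] args) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Task extract = new Task("extract", () -> log.add("extract"), () -> false);
        Task clean = new Task("clean", () -> log.add("clean"), () -> false, extract);
        // Pretend this task's output is already on disk, so it gets skipped.
        Task index = new Task("index", () -> log.add("index"), () -> true, extract);
        Task load = new Task("load", () -> log.add("load"), () -> false, clean, index);

        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            runAll(List.of(extract, clean, index, load), executor);
        } finally {
            executor.shutdown();
        }
        System.out.println(log); // prints [extract, clean, load]
    }
}
```

Here clean and index both depend only on extract, so they're free to run at the same time, while load waits for both. A real library has much more to worry about (failure handling, cycle detection, serializing state), which this sketch leaves out.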

Change of gears! Time for a math post.

I've been working through various online machine learning courses over the last eighteen months, beginning with the Stanford/Coursera ML course taught by Andrew Ng. It opens with three weeks on linear and logistic regression, covering the structure of the linear/logistic regression models, the specific loss functions involved in each, and how to minimize said functions via gradient descent.
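For reference, these are the loss functions in question as they're usually written. The notation is the standard textbook form and is an assumption here, not something defined in this post: m training examples (x^(i), y^(i)), coefficients θ, hypothesis h_θ, and learning rate α.

```latex
% Linear regression: h_\theta(x) = \theta^T x, with squared-error loss
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

% Logistic regression: h_\theta(x) = 1 / (1 + e^{-\theta^T x}), with log loss
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
          + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]

% Gradient descent update, applied to every coefficient simultaneously
\theta_j \leftarrow \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}
```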

The lectures are well executed, but I could have gone for more background on the loss functions, which are sort of handed down from above. Where did they come from? Why do they produce good regression coefficients? I think the answers to these questions are pretty neat—there's actually a straightforward statistical interpretation of what's happening.