Consuming, Contributing, Collaborating: Open Source from all sides by Ron Minsky

Automatic Summary

Engaging with Open Source Communities: A Business Perspective

Today, we're exploring how businesses can actively participate in open source communities. The discussion draws on my own experience at Jane Street, a trading firm whose business is deeply technological. Open source might not seem a direct fit for a trading firm's business model, but the benefits are substantial. In this article, we walk through our open source engagements: what we did, how we approached it, and why it mattered.

Jane Street: A Quick Overview

Before we dive into the benefits of open source, it's important to understand the technological context of our operations at Jane Street. As liquidity providers, we act as middlemen for investments, trying to profit from the gap between the price we buy at and the price we sell at. That sounds pedestrian, but it is actually complex: we have been doing this for 20 years, have nearly 2,000 employees, and operate in dozens of venues across many time zones and regulatory regimes, with many types of securities and strategies. As a result, our business is intensely technology-driven.

We invest heavily in building software that simplifies and scales our operations, optimizing the delivery of data and making it easier for people to discover new ideas. This foundation is extensively built on open source software, like many other technologically-driven endeavors.

Our Open Source Journey

Over the years, we have invested more deeply in various open source ecosystems, supporting pre-existing projects and launching our own open source software. Here's a glimpse into our open source journey:

  • Mercurial:
  • We initially used Mercurial, a version control system, directly off the shelf. As our operations grew, we hit more stress points: we first hired consultants from the Mercurial ecosystem, and later took a more hands-on role ourselves, rewriting parts of it in Rust and working to upstream those changes.

  • OCaml:
  • We began as fairly naive users of OCaml, a programming language, and progressively became major contributors to the OCaml community. Today, our team of compiler developers lands changes internally first, refines them with feedback from internal users, and then submits them upstream to benefit the larger OCaml community.

  • Exporting Internal Code:
  • To grow the OCaml ecosystem and align it more closely with our internal needs, we began exporting parts of our internal code. Releasing that code on GitHub has brought our internal software and the external OCaml ecosystem into closer alignment.

  • Dune:
  • To build a real community around a piece of software, we organized Dune, the OCaml build system, as a community-focused project hosted on GitHub. Today there are more people contributing to Dune from outside Jane Street than from inside it.

  • Magic Trace:
  • Magic Trace, another of our projects, was built as open-source-first software. Its site, magic-trace.org, shows a detailed analysis of exactly what Hello World does when you run it.

    Key Takeaways: Engaging in Open Source

    Getting onboard the open source train can be a challenging yet rewarding journey. Here are some insights to aid your transition:

  • Align Open Source with Organizational Goals:
  • Whether the aim is growing an ecosystem you depend on, supporting recruiting, or giving outside developers code to integrate with, aligning open source work with organizational goals is what makes a large, sustained investment justifiable.

  • Engage Diversely:
  • Funding people to work on projects, contributing to discussion forums, submitting bug reports, writing documentation, submitting patches, and releasing your own projects are all ways you can engage. Which of these you choose affects how you set things up and what you can achieve, so make the choice intentionally.

  • Consider the Developer Experience:
  • The tooling you use greatly affects the developer experience. Choices about where a project is hosted and how contributions are managed determine the friction for both internal developers and external contributors, so think about them early.

    Open source contributions, therefore, go a long way in not just fostering developer communities but also propelling your business growth. So, embark on your open source journey today!


    Video Transcription

    Hi, everybody. I'm here today to talk about open source, and in particular what it is like, and how you can approach, engaging in open source communities from a company and business perspective. Folks should feel free to throw questions into the chat as we go; I'll be paying attention to that and will try to pick off some questions along the way. The whole talk is basically going to be from the perspective of my experience working at Jane Street, so I want to spend a few minutes talking about who Jane Street is (whoops, sorry, I lost my video there for a second), just so you can get a sense of where the talk is coming from, the perspective that this all comes out of. Jane Street is a trading firm, and the basic business we have is that we are liquidity providers. We are kind of middlemen: if you want to make an investment and you buy some security, there's a good chance that we are on the other side of that purchase.

    If you decide to sell a security, there's a decent chance that we are on the other side there as well. It's sort of the moral equivalent of working in a grocery store: you buy from one, you sell to the other, and you try to make money from the gap between the price you buy at and the price you sell at. That all sounds pretty pedestrian, but it's actually really complicated. We have been doing this for 20 years. We have nearly 2,000 employees. We operate in dozens of venues across many time zones and regulatory regimes, with lots of different types of securities and tons of different strategies, and the business is, as a result, enormously and intensely technological.

    We spend a lot of time and effort building software to simplify and scale, to optimize the delivery of data, to make all of the things that we do more scalable and easier, and to make it easier for people to discover new ideas and build up new trades. That is not, on the face of it, an open source business model, right? So how does open source fit into that? Well, in some sense, open source fits into any serious technological enterprise. The foundations that we work on, the stones we stand on, are open source: right from the operating system on up, we're using lots of open source software. But that's just the same as everybody else.

    In addition to that, we've spent a lot of time and effort over the years investing more deeply in various open source ecosystems, including supporting and engaging in pre-existing projects and creating and launching some of our own open source software. That's really what I want to talk about, both how we've approached it and why we've done it, and I'll do that by walking through a few examples.

    To start with, I want to talk about an example where we started out as a mere consumer of the software in question and then moved on to a more deeply engaged role over time. This is with a piece of software called Mercurial. Mercurial is a version control system, a competitor to Git. It was actually designed at around the same time, by people from almost the same community as Git: both Git and Mercurial were built by Linux kernel developers. Git is far more popular and pervasive than Mercurial, but Mercurial has an important niche, which is that it's better suited to working with very large repos. That's good for what we sometimes call a monorepo-style workflow where, instead of having lots of little repositories, one for each project, as you might in a common open source context, you have one enormous repository that you put lots of different projects into. Monorepos are useful, especially in large companies, because they simplify the versioning story. Instead of having to think about which version you need of each project you want to include, every revision carries a complete and consistent decision as to the versions of all your dependencies. A lot of big companies use monorepos, and a number of those in fact use Mercurial for some part of their version control to manage them: Google, Meta, and Mozilla all use Mercurial materially.

    We started using Mercurial a long time ago, back in 2005. At first we just downloaded it; it was software we got off the shelf, we used it, it was fine, it solved our problems. But we kept growing the scope and scale of our efforts and writing more software, and over time we started hitting more stress points. We didn't want to build out a big effort to work on Mercurial as a first-class thing, so we tried to do something small: we hired some consultants. This is kind of a common thing.

    Lots of open source projects have people who've taken on a role as consultants in that part of the ecosystem, whom you can hire to help you with adoption and also to drive changes that are useful and necessary into the project itself. So we worked with those consultants both to advise us on how we approached things and to help build up more functionality in Mercurial as a whole to support the kinds of uses we wanted. That worked pretty well for a while. But eventually, as we continued to grow and do more, we needed to move faster than the consultants could move on their own. The limitation for the consultants was mostly that they weren't inside: they didn't see our problems, they didn't see our repo, they weren't directly hooked into our workflows. So we started getting more directly involved, and we ended up rewriting significant bits of Mercurial in Rust. Most of Mercurial is implemented in Python.

    So for speed, we implemented some pieces in Rust, and we did it in a narrow way that was fit to our needs. But we kept the consultants on, and we worked with them to take that work, generalize it, improve it, and eventually upstream it, because in the long run we didn't want to have our own fork of the software. We wanted to be able to make the advances that we needed for our own use case, but also to integrate nicely with open source. So that's one example of how we've engaged. Another case where we engaged more deeply, starting out as an engaged user and going on to become a primary stakeholder, is OCaml. OCaml is a programming language that's not used incredibly widely, but it's one that we're relatively well known for using. We've used it for almost 20 years at this point, and we were pretty deeply embedded in the OCaml community from the beginning of our use of the language.

    But as with Mercurial, we started out as fairly naive users, at first just picking up the language and using it, over time submitting bug reports and feature requests, and eventually starting to hire people to work on OCaml internally, submitting PRs and trying to get them accepted by the upstream maintainers.

    That allowed us to make a lot of important changes that made our own internal lives better. But there was still a lot of friction there, because we went through the ordinary release process: it takes time to get your PR reviewed and accepted, and then there's the next stable release, which happens twice a year. So it was a pretty long cycle. Today our engagement is much deeper. We have a really big team of compiler developers, about nine people, and we have our own fork of the language where we're constantly making changes. Changes we make now land internally first, and then we iterate and refine and try to make those features better, quickly getting feedback from our internal users.

    But we don't want to deviate and break away from the larger OCaml world. So when those features stabilize, we try to submit them upstream and convince people that they make sense as an addition to the language in general. Today, if you look at any given OCaml release, a large fraction of the improvements, surely not all of them, but a lot of the work that's done there, comes from people who work at Jane Street. So we're now really one of the primary people involved in that ecosystem. Mercurial and OCaml are both cases where there's a pre-existing software ecosystem and we were joining into it, but that's not all we've done. We've also spent time taking our internal code and exporting it for the outside world. If you go to ocaml.org and search for things that are authored by us, you'll see 240-ish packages up there that came from Jane Street, and that's in all almost a million lines of code. That code is not a bunch of independent projects that are hosted on GitHub and developed fully out in the open.

    What's really going on is that we're taking the work that we do internally, taking a subset of that work, and exporting it to GitHub where other people can see and use it. The reason we do that is that we get a lot of benefits from working in our internal world, with our internal monorepo, but it just doesn't make any sense to take that monorepo and export its contents wholesale. First of all, it's got a lot of stuff that's private, stuff that we don't want to export. Also, it would be kind of useless and unhelpful to people to have all of that stuff there. Instead, we have a process that takes our internal software as it's developed and periodically, typically once or twice a week, exports it onto GitHub: just the subset that we want to export, broken into a bunch of individual repos for individual projects. We tie things together in the other direction as well: when a new issue is filed or a PR is submitted, those are imported and kicked into our internal systems, so developers can see that something is happening and respond to it in a reasonable way. This is all a lot of work. The tooling to hook these things up is not trivial; it's not stuff that people just have out there in the outside world, it's stuff you have to build for yourself.
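
    Editor's sketch: the export step described above (pick a subset of the monorepo and break it into per-project repositories) can be illustrated roughly as follows. This is a minimal Python sketch under stated assumptions, not Jane Street's tooling: the directory layout, the `EXPORT_MAP` table, and the `plan_export` function are all hypothetical names invented for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from an internal monorepo directory to the name of
# the public repository it is exported to. Anything not listed stays private.
EXPORT_MAP = {
    "lib/project_a": "project-a",
    "lib/project_b": "project-b",
}


def plan_export(monorepo_files, export_map):
    """Group monorepo file paths by public repo, skipping unexported files.

    Returns a dict: public repo name -> list of paths relative to that repo.
    """
    plan = {repo: [] for repo in export_map.values()}
    for path in monorepo_files:
        p = PurePosixPath(path)
        for prefix, repo in export_map.items():
            try:
                # relative_to matches whole path components, so
                # "lib/project_abc" does not match "lib/project_a".
                rel = p.relative_to(prefix)
            except ValueError:
                continue  # file is not under this exported directory
            plan[repo].append(str(rel))
            break
    return plan


if __name__ == "__main__":
    files = [
        "lib/project_a/src/list.ml",
        "lib/project_b/main.ml",
        "trading/strategy.ml",       # private: not in EXPORT_MAP
        "lib/project_abc/other.ml",  # different directory, not exported
    ]
    for repo, paths in plan_export(files, EXPORT_MAP).items():
        print(repo, paths)
```

    A real pipeline would also have to rewrite history or commit metadata, push to each repository, and import issues and PRs in the other direction, none of which this sketch attempts.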

    You might ask why we bother doing it, given that it's a lot of work. There are a couple of reasons. One is that, from our perspective, it's really important to grow the OCaml ecosystem. We get a lot of benefit from all the software and work that's done on the outside in OCaml that we want to be able to use internally. Also, from a recruiting and hiring point of view, people here largely work in OCaml, and we want that to be an exciting ecosystem, an exciting set of code for people to dive into. So the more vibrant and healthy the community is, the better off we are. Part of what we did in releasing all that software was to try to put out more stuff that makes OCaml more useful, encourages more people to use it, and grows that ecosystem. The other, somewhat less obvious motivation is that by open sourcing all of this stuff, we provide something for other people to integrate with. There are lots of people on the outside who build useful libraries and useful tools and in fact extend the compiler in useful ways.

    The more of our software that's out there for people to compile against and integrate with, the better those things will be matched to our needs, right? So in some ways, by open sourcing all this stuff, we align the software that we build internally more closely with the external ecosystem, and that has a lot of knock-on benefits as well. OK. So that's a case of taking stuff that we have and exporting it. But there's a clear first-class, second-class distinction: the first-class development experience is inside of our walls, and there's this kind of second-class GitHub export, which is pretty good but limited. There are also cases where we've decided to really go out and try to build whole communities around pieces of software. Maybe the most interesting example of this is Dune. Dune is, these days, the preeminent build system for OCaml. If you have a big OCaml program and you want to do all the compilation, linking, and preprocessing steps that are required, you can use Dune for that, and it gives you a nice, flexible, efficient system. It was developed at Jane Street.

    But we made a decision relatively early on to organize it as a community-focused project, which in part meant picking the community's tools over our own tools. So instead of hosting it internally and exporting it to GitHub, we host it on GitHub and bring it in internally periodically, right? That changes where the frictions are. It means that if you want to work on Dune and you're in the outside world, you have basically the same experience doing that work as anyone else does. I think this was really important for making Dune a place that was attractive for other people to come and contribute, and it's really worked. Today, there are more people contributing to Dune outside of Jane Street than there are inside of Jane Street, right? I think of that as a pretty big success. I think it also helped with adoption. If you look at opam, the package manager, you'll see that a larger and larger fraction of the packages over time are built using Dune. That has other benefits when we want to import someone's project internally.

    The fact that it's built with Dune, which is a build system that we understand very well, makes it way easier for us to pull it into our internal systems. OK. So those are three different projects, or actually four: four different stories of open source work that we've done. They're not the only ones; there's a bunch of other neat things that you could look at. One of my favorite examples is a project called Magic Trace. If you go to magic-trace.org, you can see some beautiful analysis of exactly what Hello World does when you run it, in a kind of elaborate detail that highlights that Hello World is way more complicated than you might ever have imagined.

    That's another example of a piece of software that we've built as an open-source-first piece of software. But anyway, these are examples; there's lots more stuff that we've done over the years to engage with the open source world. So what are the takeaways from all that? What are the things to remember? The first one, which I think is really important, is that if you're going to do open source work in the context of a business, in the context of a larger organization, you should align that work with the needs, goals, and requirements of the organization.

    I want to be clear: personally, I love open source software. It's what I cut my teeth on, how I learned to be a software engineer. I get a lot of personal satisfaction from doing stuff that contributes in this public way, both in terms of contributing to existing projects and open sourcing our own stuff. I find it enormously satisfying. But building open source software and doing it well and sustainably is really hard. In some sense, the excitement that you might have around open source can get you a certain distance, but I think it's hard to justify a large and sustained level of investment unless you really think about how that open source work can line up with the goals of the company. Part of what I think is interesting about the story here is that even if your business model isn't fundamentally about open source, active contribution to and collaboration with open source communities can still benefit the business in a useful and tangible way.

    The other thing that I hope these examples highlight is that there's a bunch of different ways that you can engage, right? You can fund people to work on open source projects, you can communicate on forums, submit bug reports, and write documentation. You can write your own patches, and you can create your own freestanding open source projects to contribute to the world. You should think about which approach you want as part of the open source strategy you're picking. There are a ton of different choices, and which choice you pick matters: it affects how you set things up, and it affects what you can achieve. So you really want to be intentional when you're making that choice. The other thing, which is maybe less obvious, is that the tooling that you use matters an enormous amount. Every time you make a choice about where you're going to host a project, how you're going to manage it, and how you're going to manage contributions, there's a bunch of tooling implications, and that affects the friction for your internal developers and the friction for external people who are going to participate.

    So again, when you're thinking about your strategy for how to engage in open source, stop, take a deep breath, and think: what is the user experience, what is the developer experience going to be like for people who are participating, and what should we do on the tooling side to make that as good as possible for the approach we're taking? I think that's a really important, fundamental question you should be asking early on in the process. OK. So that's basically what I wanted to talk about. We have a few minutes left, so if folks have questions, I would be happy to answer them. I see a lot of hellos but not much in the way of questions, I think. So, yeah, if anyone has stuff to say, I would be happy to answer. OK, and while we're doing that, I'm going to... oh, never mind. I was going to pull up a different one of our projects, called Magic Trace, to show people, but I think I can't quite do that now with how I have things set up. All right. Ah, here: what are some best practices for open source contribution? That's a good question.

    I'm not sure I have a concise set of best practices, but I can say something about mindset. When you engage with open source communities, I think it's important to really think hard about what is best for that community, because an important thing to remember is that contributing to open source is a repeated game, right?

    You're not just trying to get one patch in or change one thing. You want people over time to trust and respect the things you're doing, so that when you go in to make contributions, people are actually willing to accept them. One thing I've found to be really important is to not just throw stuff up there without thinking, but to really make the effort to think hard about the engineering and the quality of the changes you propose, and to do that in a way that leads people on the side of the open source project to think of your contributions as high-quality things that they would be happy to accept.

    OK. Another question is about materials I can point you to. There are a few things. We have a tech blog that covers a lot of our work and is more focused on some of the open source stuff. If you go to blog.janestreet.com, you'll see a bunch of posts that cover all sorts of different aspects of the work we do, with a lot of focus on the open source side of the work. You can also see all the concrete stuff we have: there's another website, opensource.janestreet.com, that has links to some of the major projects. And then github.com/janestreet is our organization on GitHub, which has all of that there. All right. Well, I think we're at the end of our time. Thank you very much.