Posted on

Microsoft launches Open Service Mesh

Microsoft today announced the launch of a new open-source service mesh based on the Envoy proxy. The Open Service Mesh is meant to be a reference implementation of the Service Mesh Interface (SMI) spec, a standard interface for service meshes on Kubernetes that has the backing of most of the players in this ecosystem.

The company plans to donate Open Service Mesh to the Cloud Native Computing Foundation (CNCF) to ensure that it is community-led and has open governance.

“SMI is really resonating with folks and so we really thought that there was room in the ecosystem for a reference implementation of SMI where the mesh technology was first and foremost implementing those SMI APIs and making it the best possible SMI experience for customers,” Microsoft partner program manager (and CNCF board member) Gabe Monroy told me.

Image Credits: Microsoft

He also added that, because SMI provides the lowest common denominator API design, Open Service Mesh gives users the ability to “bail out” to raw Envoy if they need some more advanced features. This “no cliffs” design, Monroy noted, is core to the philosophy behind Open Service Mesh.

As for its feature set, SMI handles all of the standard service mesh features you’d expect, including securing communications between services using mTLS, managing access control policies, service monitoring and more.

Image Credits: Microsoft

There are plenty of other service mesh technologies in the market today, though. So why would Microsoft launch this?

“What our customers have been telling us is that solutions that are out there today, Istio being a good example, are extremely complex,” he said. “It’s not just me saying this. We see the data in the AKS support queue of customers who are trying to use this stuff — and they’re struggling right here. This is just hard technology to use, hard technology to build at scale. And so the solutions that were out there all had something that wasn’t quite right and we really felt like something lighter weight and something with more of an SMI focus was what was going to hit the sweet spot for the customers that are dabbling in this technology today.”

Monroy also noted that Open Service Mesh can sit alongside other solutions like Linkerd, for example.

A lot of pundits expected Google to also donate its Istio service mesh to the CNCF. That move didn’t materialize. “It’s funny. A lot of people are very focused on the governance aspect of this,” he said. “I think when people over-focus on that, you lose sight of how are customers doing with this technology. And the truth is that customers are not having a great time with Istio in the wild today. I think even folks who are deep in that community will acknowledge that and that’s really the reason why we’re not interested in contributing to that ecosystem at the moment.”

Read More

Posted on

LA-based Replicated adds former GitLab head of product as its chief product officer

Replicated, the Los Angeles-based company pitching monitoring and management services for Kubernetes-based applications, has managed to bring on the former head of product of the $2.75 billion-valued programming giant GitLab as its new chief product officer. 

Mark Pundsack is joining the company as it moves to scale its business. At GitLab, Pundsack saw the company grow from 70 employees to 1,300 as it scaled its business through its on-premise offerings.

Replicated is hoping to bring the same kind of on-premise services to a broad array of enterprise clients, according to company chief executive Grant Miller.

First introduced to Replicated while working with CircleCI, it was the company’s newfound traction since the launch of its Kubernetes deployment management toolkit that caused him to take a second look.

“The momentum that Replicated has created with their latest offering is tremendous; really changing the trajectory of the company,” said Pundsack in a statement. “When I was able to get close to the product, team, and customers, I knew this was something that I wanted to be a part of. This company is in such a unique position to create value throughout the entire enterprise software ecosystem; this sort of reach is incredibly rare. The potential reminds me a lot of the early days of GitLab.”

It’s a huge coup for Replicated, according to Miller.

“Mark created the core product strategy at GitLab; transforming GitLab from a source control company to a complete DevOps platform, with incredible support for Kubernetes,” said Miller. “There really isn’t a better background for a product leader at Replicated; Mark has witnessed GitLab’s evolution from a traditional on-prem installation towards a Kubernetes-based installation and management experience. This is the same transition that many of our customers are going through and Mark has already done it with one of the best. I have so much confidence that his involvement with our product will lead to more success for our customers.”

Pundsack is the second new executive hire from Replicated in six months, as the company looks to bring more muscle to its C-suite and expand its operations.

Read More

Posted on

AWS and Facebook launch an open-source model server for PyTorch

AWS and Facebook today announced two new open-source projects around PyTorch, the popular open-source machine learning framework. The first of these is TorchServe, a model-serving framework for PyTorch that will make it easier for developers to put their models into production. The other is TorchElastic, a library that makes it easier for developers to build fault-tolerant training jobs on Kubernetes clusters, including AWS’s EC2 spot instances and Elastic Kubernetes Service.

In many ways, the two companies are taking what they have learned from running their own machine learning systems at scale and are putting this into the project. For AWS, that’s mostly SageMaker, the company’s machine learning platform, but as Bratin Saha, AWS VP and GM for Machine Learning Services, told me, the work on PyTorch was mostly motivated by requests from the community. And while there are obviously other model servers like TensorFlow Serving and the Multi Model Server available today, Saha argues that it would be hard to optimize those for PyTorch.

“If we tried to take some other model server, we would not be able to quote optimize it as much, as well as create it within the nuances of how PyTorch developers like to see this,” he said. AWS has lots of experience in running its own model servers for SageMaker that can handle multiple frameworks, but the community was asking for a model server that was tailored toward how they work. That also meant adapting the server’s API to what PyTorch developers expect from their framework of choice, for example.

As Saha told me, the server that AWS and Facebook are now launching as open source is similar to what AWS is using internally. “It’s quite close,” he said. “We actually started with what we had internally for one of our model servers and then put it out to the community, worked closely with Facebook, to iterate and get feedback — and then modified it so it’s quite close.”

Bill Jia, Facebook’s VP of AI Infrastructure, also told me, he’s very happy about how his team and the community has pushed PyTorch forward in recent years. “If you look at the entire industry community — a large number of researchers and enterprise users are using AWS,” he said. “And then we figured out if we can collaborate with AWS and push PyTorch together, then Facebook and AWS can get a lot of benefits, but more so, all the users can get a lot of benefits from PyTorch. That’s our reason for why we wanted to collaborate with AWS.”

As for TorchElastic, the focus here is on allowing developers to create training systems that can work on large distributed Kubernetes clusters where you might want to use cheaper spot instances. Those are preemptible, though, so your system has to be able to handle that, while traditionally, machine learning training frameworks often expect a system where the number of instances stays the same throughout the process. That, too, is something AWS originally built for SageMaker. There, it’s fully managed by AWS, though, so developers never have to think about it. For developers who want more control over their dynamic training systems or to stay very close to the metal, TorchElastic now allows them to recreate this experience on their own Kubernetes clusters.

AWS has a bit of a reputation when it comes to open source and its engagement with the open-source community. In this case, though, it’s nice to see AWS lead the way to bring some of its own work on building model servers, for example, to the PyTorch community. In the machine learning ecosystem, that’s very much expected, and Saha stressed that AWS has long engaged with the community as one of the main contributors to MXNet and through its contributions to projects like Jupyter, TensorFlow and libraries like NumPy.

Read More