“SMI is really resonating with folks and so we really thought that there was room in the ecosystem for a reference implementation of SMI where the mesh technology was first and foremost implementing those SMI APIs and making it the best possible SMI experience for customers,” Microsoft partner program manager (and CNCF board member) Gabe Monroy told me.
He also added that, because SMI provides the lowest common denominator API design, Open Service Mesh gives users the ability to “bail out” to raw Envoy if they need some more advanced features. This “no cliffs” design, Monroy noted, is core to the philosophy behind Open Service Mesh.
As for its feature set, Open Service Mesh handles all of the standard service mesh features you’d expect, including securing communications between services using mTLS, managing access control policies, service monitoring and more.
There are plenty of other service mesh technologies in the market today, though. So why would Microsoft launch this?
“What our customers have been telling us is that solutions that are out there today, Istio being a good example, are extremely complex,” he said. “It’s not just me saying this. We see the data in the AKS support queue of customers who are trying to use this stuff — and they’re struggling right here. This is just hard technology to use, hard technology to build at scale. And so the solutions that were out there all had something that wasn’t quite right and we really felt like something lighter weight and something with more of an SMI focus was what was going to hit the sweet spot for the customers that are dabbling in this technology today.”
Monroy also noted that Open Service Mesh can sit alongside other solutions like Linkerd, for example.
A lot of pundits expected Google to also donate its Istio service mesh to the CNCF. That move didn’t materialize. “It’s funny. A lot of people are very focused on the governance aspect of this,” he said. “I think when people over-focus on that, you lose sight of how are customers doing with this technology. And the truth is that customers are not having a great time with Istio in the wild today. I think even folks who are deep in that community will acknowledge that and that’s really the reason why we’re not interested in contributing to that ecosystem at the moment.”
Replicated, the Los Angeles-based company pitching monitoring and management services for Kubernetes-based applications, has managed to bring on the former head of product of the $2.75 billion-valued programming giant GitLab as its new chief product officer.
Mark Pundsack is joining the company as it moves to scale its business. At GitLab, Pundsack saw the company grow from 70 employees to 1,300 as it scaled its business through its on-premise offerings.
Replicated is hoping to bring the same kind of on-premise services to a broad array of enterprise clients, according to company chief executive Grant Miller.
Pundsack was first introduced to Replicated while working with CircleCI, but it was the company’s newfound traction since the launch of its Kubernetes deployment management toolkit that caused him to take a second look.
“The momentum that Replicated has created with their latest offering is tremendous; really changing the trajectory of the company,” said Pundsack in a statement. “When I was able to get close to the product, team, and customers, I knew this was something that I wanted to be a part of. This company is in such a unique position to create value throughout the entire enterprise software ecosystem; this sort of reach is incredibly rare. The potential reminds me a lot of the early days of GitLab.”
It’s a huge coup for Replicated, according to Miller.
“Mark created the core product strategy at GitLab; transforming GitLab from a source control company to a complete DevOps platform, with incredible support for Kubernetes,” said Miller. “There really isn’t a better background for a product leader at Replicated; Mark has witnessed GitLab’s evolution from a traditional on-prem installation towards a Kubernetes-based installation and management experience. This is the same transition that many of our customers are going through and Mark has already done it with one of the best. I have so much confidence that his involvement with our product will lead to more success for our customers.”
Pundsack is the second new executive hire from Replicated in six months, as the company looks to bring more muscle to its C-suite and expand its operations.
Cloud hosting company Scaleway has launched Kubernetes Kapsule, a new service that lets you manage Kubernetes clusters on Scaleway’s infrastructure. The service works with a wide range of Scaleway instances and lets you create large clusters that scale depending on demand.
Kubernetes is an open-source platform to manage containers and …
By now we know that Kubernetes is a wildly popular container management platform, but if you want to use it, you pretty much have to choose between having someone manage it for you or building it yourself. Spectro Cloud emerged from stealth today with a $7.5 million investment to give you …
VMware describes the shift to the cloud as the most significant shift in enterprise architecture of the decade, predicting that 500 million new apps will be launched in the next five years. Thanks to containers, compute costs are falling by 50%, and resources can be provisioned 450X faster. With Kubernetes going mainstream — …
Machine learning inference requires investment to create a reliable and efficient service. For an XGBoost model, developers have to create an application, such as with Flask, that loads the model and runs the endpoint, which requires them to think about queue management, faultless deployment, and reloading of newly trained models. The serving container then has to be pushed to a Docker repository that Kubernetes can be configured to pull from and deploy on the cluster. These steps require your data scientist to work on tasks unrelated to improving model accuracy, or require bringing in a DevOps engineer, which adds to development schedules and slows iteration.
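As a rough illustration of that manual path, here is a minimal Flask sketch. The route name and the stub model are placeholders (a real service would load an `xgboost.Booster` from disk), not SageMaker or operator conventions:

```python
# Minimal sketch of the hand-rolled serving approach described above:
# a Flask app that loads a model at startup and exposes one endpoint.
from flask import Flask, request, jsonify

class StubModel:
    """Stand-in for a loaded XGBoost booster (illustrative only)."""
    def predict(self, features):
        return [sum(features)]  # placeholder inference logic

app = Flask(__name__)
model = StubModel()  # in practice: load from disk, reload on retrain

@app.route("/invocations", methods=["POST"])
def invocations():
    # Parse the feature vector from the request body and run inference.
    features = request.get_json()["features"]
    return jsonify({"prediction": model.predict(features)})
```

Everything after this point — container builds, repository pushes, rolling out new model versions — still falls on the team, which is the overhead the operators remove.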
With the SageMaker Operators, developers only need to write a YAML file that specifies the S3 locations of the saved models, and live predictions become available through a secure endpoint. Reconfiguring the endpoint is as simple as updating the YAML file. On top of being easy to use, the service also has the following features:
Multi-model endpoints – Hosting dozens or more models can be challenging to configure and can lead to many machines operating at low utilization. Multi-model endpoints set up one instance with on-the-fly loading of model artifacts for serving
Elastic Inference – Run your smaller workloads on split GPUs that you can deploy at low cost
High Utilization & Dynamic Auto Scaling – Endpoints can run with 100% utilization and add replicas based on custom metrics you define, such as invocations per second. Alternatively, automatic scaling can be configured on predefined metrics for client performance
Availability Zone Transfer – If there is an outage, Amazon SageMaker will automatically move your endpoint to another Availability Zone within your VPC
A/B Testing – Set up multiple models, and direct traffic proportional to the amount that you set on a single endpoint
Security – Endpoints are created with HTTPS and can be configured to be run in a private VPC (no internet egress) and accessed through AWS PrivateLink
Compliance Ready – Amazon SageMaker has been certified compliant with HIPAA, PCI DSS, and SOC (1, 2, 3) rules and regulations
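For illustration, the YAML file mentioned above is a `HostingDeployment` custom resource along these lines. Field names follow our understanding of the operator’s CRD and may differ by version; the role ARN, S3 path, and container image URI are placeholders:

```yaml
apiVersion: sagemaker.aws.amazon.com/v1
kind: HostingDeployment
metadata:
  name: hosting-deployment
spec:
  region: us-east-2
  productionVariants:
    - variantName: AllTraffic
      modelName: xgboost-model
      initialInstanceCount: 1
      instanceType: ml.r5.large
      initialVariantWeight: 1
  models:
    - name: xgboost-model
      executionRoleArn: arn:aws:iam::123456789012:role/sagemaker-execution-role
      containers:
        - containerHostname: xgboost
          modelDataUrl: s3://my-bucket/model.tar.gz
          image: 123456789012.dkr.ecr.us-east-2.amazonaws.com/xgboost:latest
```

Applying this spec with `kubectl apply` is all it takes for the operator to create the SageMaker model and endpoint on your behalf.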
Packaged together, the features available in Kubernetes through the SageMaker Operators shorten the time to launch model serving and reduce the development resources needed to set up and maintain production infrastructure. This can cut total cost of ownership by 90% compared to EKS or EC2 alone.
This post demonstrates how to set up Amazon SageMaker Operators for Kubernetes to create and update endpoints for a pre-trained XGBoost model completely from kubectl. The solution contains the following steps:
Create an IAM role that gives Amazon SageMaker the permissions needed to serve your model
Prepare a YAML file that deploys your model to Amazon SageMaker
Deploy your model to Amazon SageMaker
Query the endpoint to obtain predictions
Perform an eventually consistent update to the deployed model
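The first step can be sketched with the AWS CLI. The role name here is a placeholder, and the trust policy simply allows the SageMaker service to assume the role:

```shell
# Create a role that the SageMaker service can assume (name is illustrative).
aws iam create-role \
  --role-name sagemaker-execution-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"Service": "sagemaker.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }]
  }'

# Attach the managed SageMaker policy so the role can serve models.
aws iam attach-role-policy \
  --role-name sagemaker-execution-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
```

The resulting role ARN is what goes into the `executionRoleArn` field of your deployment YAML.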
This post assumes you have the following prerequisites:
A Kubernetes cluster
The Amazon SageMaker Operators installed on your cluster
This bash command connects to the HTTPS endpoint using AWS CLI. The model you created is based on the MNIST digit dataset, and your predictor reads what number is in the image. When you make this call, it sends an inference payload that contains 784 features in CSV format, which represent pixels in an image. You see the predicted number that the model believes is in the payload. See the following code:
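The original command is not reproduced here; a reconstruction consistent with the description looks like the following. The `jsonpath` field for the operator-generated endpoint name is an assumption and may differ by operator version, and `payload.csv` is assumed to hold the 784 comma-separated pixel values:

```shell
# Look up the endpoint name the operator generated for this deployment
# (status field path is an assumption; verify with kubectl describe).
ENDPOINT_NAME=$(kubectl get hostingdeployment hosting-deployment \
  -o jsonpath='{.status.endpointName}')

# Send one MNIST image as a CSV row of 784 pixel values and
# print the model's predicted digit.
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name "$ENDPOINT_NAME" \
  --content-type text/csv \
  --body file://payload.csv \
  response.json
cat response.json
```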
This confirms that your endpoint is up and running.
Eventually consistent updates
After you deploy a model, you can make changes to the Kubernetes YAML and the operator updates the endpoint. The updates propagate to Amazon SageMaker in an eventually consistent way. This enables you to configure your endpoints declaratively and lets the operator handle the details.
To demonstrate this, you can change the instance type of the model from ml.r5.large to ml.c5.2xlarge. Complete the following steps:
Modify the instance type in hosting.yaml to be ml.c5.2xlarge. See the following code:
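The original excerpt is not reproduced here; assuming the variant layout sketched earlier, the edited portion of hosting.yaml would look roughly like this:

```yaml
  productionVariants:
    - variantName: AllTraffic
      modelName: xgboost-model
      initialInstanceCount: 1
      instanceType: ml.c5.2xlarge   # changed from ml.r5.large
```

Re-applying the file with `kubectl apply -f hosting.yaml` is enough; the operator detects the change and updates the SageMaker endpoint, so no traffic-handling or migration logic is needed on your side.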
This post demonstrated how Amazon SageMaker Operators for Kubernetes supports real-time inference. It also supports training and hyperparameter tuning.
As always, please share your experience and feedback, or submit additional example YAML specs or operator improvements. You can share how you’re using Amazon SageMaker Operators for Kubernetes by posting on the AWS forum for Amazon SageMaker, creating issues in the GitHub repo, or sending it through your AWS Support contacts.
About the authors
Cade Daniel is a Software Development Engineer with AWS Deep Learning. He develops products that make training and serving DL/ML models more efficient and easy for customers. Outside of work, he enjoys practicing his Spanish and learning new hobbies.
Alex Chung is a Senior Product Manager with AWS in Deep Learning. His role is to make AWS Deep Learning products more accessible and cater to a wider audience. He’s passionate about social impact and technology, getting his regular gym workout, and cooking healthy meals.
Epsagon, an Israeli startup that wants to help monitor modern development environments like serverless and containers, announced a $16 million Series A today.
U.S. Venture Partners (USVP), a new investor, led the round. Previous investors Lightspeed Venture Partners and StageOne Ventures also participated. Today’s investment brings the total raised to $20 million, according to the company.
CEO and co-founder Nitzan Shapira says that the company has been expanding its product offerings in the last year to cover not just its serverless roots, but also giving deeper insights into a number of forms of modern development.
“So we spoke around May when we launched our platform for microservices in the cloud products, and that includes containers, serverless and really any kind of workload to build microservices apps. Since then we have had several significant announcements,” Shapira told TechCrunch.
For starters, the company announced support for tracing and metrics for Kubernetes workloads, including native Kubernetes along with managed Kubernetes services like AWS EKS and Google GKE. “A few months ago, we announced our Kubernetes integration. So, if you’re running any Kubernetes workload, you can integrate with Epsagon in one click, and from there you get all the metrics out of the box, then you can set up tracing in a matter of minutes. So that opens up a very big number of use cases for us,” he said.
The company also announced support for AWS AppSync, a no-code programming tool on the Amazon cloud platform. “We are the only provider today to introduce tracing for AppSync and that’s [an area] where people really struggle with the monitoring and troubleshooting of it,” he said.
The company hopes to use the money from today’s investment to expand the product offering further with support for Microsoft Azure and Google Cloud Platform in the coming year. He also wants to expand the automation of some tasks that have to be manually configured today.
“Our intention is to make the product as automated as possible, so the user will get an amazing experience in a matter of minutes, including advanced monitoring, identifying different problems and troubleshooting,” he said.
Shapira says the company has around 25 employees today, and plans to double headcount in the next year.