Demystifying Deployment of Nephio Workload Clusters on Multi-Cloud
Introduction
Welcome back to our technical deep-dive series on Nephio! Following our previous discussion on the capabilities of Nephio, this blog post focuses on how we can extend Nephio to deploy workload clusters across multiple clouds. We will walk through creating and deploying a new workload-cluster kpt package for Azure AKS, using the Cluster API Kubernetes operator.
Nephio Tool Chain
Nephio's fundamental elements comprise Nephio controllers and a collection of open-source tools. The primary tools integrated into the framework are kpt, porch, gitea, and configsync. Before delving into the deployment of the workload cluster package, we will go through these core tools and their key functionality.
1 - Kpt
Kpt is a tool that simplifies the way we manage and customize Kubernetes configurations. It treats configuration as data: declarative YAML files that define exactly how things should work in a Kubernetes cluster. kpt organizes these configurations into packages, which are directories containing all the instructions Kubernetes needs, written in YAML files.
A kpt package is identified by a special file called a Kptfile which holds information about the package, kind of like a table of contents. Just as you can have folders within folders on your computer, kpt packages can contain subpackages, allowing for complex but organized configurations.
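To make the Kptfile concrete, here is a minimal sketch of what one might look like. The package name, description, and function image version here are illustrative, not taken from the Nephio repositories:

```yaml
# Minimal illustrative Kptfile: identifies the package and
# declares a pipeline of KRM functions to run on its resources.
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: example-package        # hypothetical package name
info:
  description: Sample package showing the Kptfile structure
pipeline:
  mutators:
    # apply-setters substitutes setter values into the package YAML
    - image: gcr.io/kpt-fn/apply-setters:v0.2.0
      configPath: setters.yaml
```

The `pipeline` section is the "table of contents" for transformations: every time the package is rendered, kpt runs these functions in order against all the resources in the package.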
Nephio relies on kpt for these capabilities:
- Automation: kpt brings the automation needed for managing configurations at scale.
- Customization: It allows the customization of configurations that real-world deployments commonly require. This is also possible with tools like Helm, but those rely on templates and often suffer from over-parameterization.
- Version Control: kpt uses Git for version control, providing a familiar workflow for managing changes.
- Interoperability: With its function-based approach, kpt ensures that different tools and processes can work together smoothly.
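The capabilities above map onto a handful of kpt subcommands. As a sketch (the repository URL and package name are placeholders, and the commands assume kpt v1 and a reachable cluster):

```shell
# Fetch a package from a Git repository at a given ref (URL is illustrative)
kpt pkg get https://github.com/example/blueprints.git/simple-pkg@v0.1 simple-pkg

# Run the function pipeline declared in the package's Kptfile
kpt fn render simple-pkg

# Initialize inventory metadata, then apply the package to the current cluster
kpt live init simple-pkg
kpt live apply simple-pkg --reconcile-timeout=2m
```

`pkg get` covers version control and customization (you edit the fetched copy), `fn render` covers the function-based interoperability, and `live apply` covers actuation with pruning and status tracking.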
2 - Porch
Porch is one of the key components within the Nephio project aimed at simplifying the management of Kubernetes configurations. Its approach to package orchestration underscores Nephio’s intent-driven automation by ensuring that the high-level objectives of network deployments are translated into effective, actionable configurations. It facilitates the creation, tracking, and updating of KRM files as kpt packages. These packages contain configuration data that defines how software should run in a Kubernetes cluster. Porch also provides a WYSIWYG (what you see is what you get) UI experience and backs the main Nephio UI in the R1 release.
In essence, Porch is designed to handle these major tasks:
- Package Versioning: It keeps track of different versions of configuration packages, allowing for easy updates and rollbacks.
- Repository Management: Porch helps manage repositories where these packages are stored, making it easier to organize and access the configurations needed for deploying network services.
- Package Lifecycle Management: From creating new configurations to proposing changes and publishing final versions, Porch automates and streamlines the entire process.
- Deployment Readiness: It ensures that once configurations are deemed ready, they can be deployed to the actual Kubernetes environments whether they're in the cloud or at the network edge.
A natural question at this point is why we need both porch and kpt. In simple terms, kpt is the client-side tool, while porch is the server-side service that manages packages and their lifecycle.
3 - Configsync
Nephio R1 relies on configsync to implement Gitops capability, but it can easily be replaced with any other Gitops tool. It keeps the KRM resources on the cluster in sync with the kpt package revision in the Git repository. Porch manages the lifecycle of the packages, but it is configsync that applies and actuates the Kubernetes resources. Later in this post, we will detail how argocd can be used as a Gitops tool to sync packages to a cluster.
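For context, a configsync RootSync object is what binds a cluster to a Git repository. A minimal sketch, assuming a gitea repository at a hypothetical URL and a pre-created token secret:

```yaml
# Illustrative RootSync: configsync watches this repo/branch and
# continuously applies its KRM resources to the cluster.
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: http://gitea.example.com/nephio/mgmt.git   # hypothetical URL
    branch: main
    auth: token
    secretRef:
      name: git-creds        # assumed pre-created Git credentials secret
```

Because the sync is continuous, any drift between the cluster and the approved package revision in Git is reconciled automatically.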
4 - Gitea
Gitea is the primary Git tool that ships with the R1 release of Nephio. This is where the repositories are created, which will then be registered and managed through porch. There are two types of repositories: a blueprint repository, which holds model packages, and a deployment repository, which contains package instances. Gitea can also be replaced with any other Git service, such as GitHub.
Deploying Workload Clusters on Azure
Nephio’s primary capabilities include the ability to provision and configure multi-vendor cloud infrastructure as workload clusters. While it natively supports kind and GCP as cluster platforms, it can be adapted to other cloud providers by installing the respective custom resources and creating the necessary kpt packages. The upcoming sections will guide you through creating the Nephio blueprint upstream package and provisioning the workload cluster downstream package on Azure.
Pre-requisites:
- An active Azure account.
- An existing Kubernetes cluster.
- Access to a GitHub account for repository management.
In this demo we used Nephio’s install script to install Nephio on a Google Cloud VM. This installs a Kubernetes cluster as well as the Nephio core components. If you do not have access to GCP, you can convert any Kubernetes cluster into a Nephio management cluster by installing these components separately, as shown in the diagram below.
Installing Nephio Components (Optional)
Follow the steps below to install the Nephio components. This will convert any Kubernetes cluster into a Nephio management cluster.
Install the Cluster API Azure Provider
To allow the Nephio management cluster to provision a workload cluster on Azure, we need to install the Cluster API Azure provider (CAPZ). Below is a snippet of an Azure installation.
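As a sketch of that installation, following the CAPZ quick-start flow (all credential values are placeholders for your own service principal):

```shell
# Azure service-principal credentials (placeholders)
export AZURE_SUBSCRIPTION_ID="<subscription-id>"
export AZURE_TENANT_ID="<tenant-id>"
export AZURE_CLIENT_ID="<client-id>"
export AZURE_CLIENT_SECRET="<client-secret>"

# clusterctl expects base64-encoded copies of the credentials
export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

# Install the CAPI core components plus the Azure infrastructure provider
# onto the management cluster pointed to by the current kubeconfig context
clusterctl init --infrastructure azure
```

After this completes, the management cluster can reconcile Azure-specific CAPI resources such as AzureManagedControlPlane and AzureManagedMachinePool.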
We can generate the Azure AKS CAPI cluster configuration using the clusterctl generate cluster command with the aks flavor, and then modify the template to suit our site-specific requirements. The generated configuration contains definitions for all the Azure custom resources (AzureManagedControlPlane, AzureManagedMachinePool, and the CAPI Cluster), which the provider uses to create the AKS cluster.
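A minimal sketch of that generation step (cluster name, location, version, and node count are illustrative choices, not Nephio defaults):

```shell
# Template variables consumed by the CAPZ aks flavor (values illustrative)
export CLUSTER_NAME="workload-aks"
export AZURE_LOCATION="westeurope"
export KUBERNETES_VERSION="v1.27.3"
export WORKER_MACHINE_COUNT=2

# Render the managed (AKS) cluster manifest from the aks flavor template
clusterctl generate cluster "${CLUSTER_NAME}" \
  --flavor aks \
  --kubernetes-version "${KUBERNETES_VERSION}" \
  --worker-machine-count "${WORKER_MACHINE_COUNT}" \
  > aks-cluster.yaml
```

The resulting aks-cluster.yaml is the raw material for the blueprint package: we replace its site-specific values with kpt setters before committing it.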
KPT Package Structure
We maintain blueprints that are common to all Azure CAPI clusters in a single folder. We create an upstream package, "cluster-capi-aks", which contains the YAML configuration to create the Azure cluster and a Kptfile with setters that need to be updated when the final package is created. nephio-workload-cluster-aks is the downstream kpt package that will be cloned and deployed as a workload cluster.
Below is a high-level overview of the files in the kpt package.
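As a sketch, a layout for such a package might look like the following (file names are illustrative and may differ from the actual repository contents):

```
nephio-workload-cluster-aks/
├── Kptfile                  # package metadata and function pipeline
├── package-variant.yaml     # PackageVariant referencing cluster-capi-aks
├── azure-context.yaml       # Azure-specific values (subscription, tenant, ...)
├── setters.yaml             # key-value setters consumed by apply-setters
└── apply-replacements.yaml  # rules for propagating values across resources
```

Each of these files is described in the sections that follow.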
PackageVariant Configuration
The PackageVariant lets you automate the creation and lifecycle management of a specific configuration variant derived from an upstream source package (cluster-capi-aks in the blueprints-infra-aks repo). Key aspects include:
- Upstream & Downstream: Specifies the source (upstream) of the configuration and where the modified configuration (downstream) will be stored.
- Injectors: Uses the ConfigMap named azure-context to inject specific Azure configuration details into the package.
- Pipeline Mutators: Defines a sequence of KRM functions (set-annotations, apply-replacements, apply-setters) for transforming the package. These functions modify the package based on the provided configuration data.
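A minimal sketch of such a PackageVariant, with repository names taken from this post and revision, downstream package name, and function versions as illustrative assumptions:

```yaml
# Illustrative PackageVariant: clones cluster-capi-aks into the
# deployment repo and runs extra mutators on the cloned package.
apiVersion: config.porch.kpt.dev/v1alpha1
kind: PackageVariant
metadata:
  name: aks-workload-cluster       # hypothetical name
spec:
  upstream:
    repo: blueprints-infra-aks
    package: cluster-capi-aks
    revision: v1                   # assumed published revision
  downstream:
    repo: mgmt
    package: workload-aks          # hypothetical downstream package name
  injectors:
    - name: azure-context          # ConfigMap injected into the package
  pipeline:
    mutators:
      # Annotate every resource with the target cluster name (illustrative)
      - image: gcr.io/kpt-fn/set-annotations:v0.1.4
        configMap:
          nephio.org/cluster-name: workload-aks
```

The Nephio package variant controller watches this resource, performs the clone, runs the combined pipeline, and keeps the downstream package in sync with its upstream.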
Azure Context (ConfigMap)
The azure-context ConfigMap includes essential Azure-related configuration data such as the subscription ID, client ID, tenant ID, and more. Marked as required for config injection (kpt.dev/config-injection), this data is crucial for tailoring the package to a specific Azure environment.
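A sketch of that ConfigMap, with placeholder values and the injection annotation described above (the exact data keys are illustrative):

```yaml
# Illustrative azure-context ConfigMap: the injection source for
# environment-specific Azure values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: azure-context
  annotations:
    kpt.dev/config-injection: required      # must be satisfied by an injector
    config.kubernetes.io/local-config: "false"  # keep in rendered output
data:
  subscriptionID: "<subscription-id>"   # placeholders, never commit secrets
  tenantID: "<tenant-id>"
  clientID: "<client-id>"
  location: "westeurope"                # illustrative region
```

Because injection is marked required, the PackageVariant will not produce a valid downstream package unless a matching azure-context object is injected.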
Setters (ConfigMap)
The setters ConfigMap holds key-value pairs for various configuration settings, allowing dynamic modification of package resources. This includes the subscription ID, client ID, tenant ID, and all other environment-specific values that are unique to the deployment and need to be replaced inside the Azure cluster configuration by the setters function.
Apply Replacements
The ApplyReplacements configuration specifies how to propagate certain values throughout the package. It ensures consistency across components by dynamically updating fields based on the values defined in the azure-context and setters ConfigMaps.
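A minimal sketch of an ApplyReplacements resource, propagating one value from azure-context into a CAPZ resource (the source key and target field path are illustrative assumptions):

```yaml
# Illustrative ApplyReplacements config for the apply-replacements
# kpt function: copy the subscription ID from azure-context into
# the AzureManagedControlPlane resource.
apiVersion: fn.kpt.dev/v1alpha1
kind: ApplyReplacements
metadata:
  name: propagate-azure-values
  annotations:
    config.kubernetes.io/local-config: "true"  # config only, never deployed
replacements:
  - source:
      kind: ConfigMap
      name: azure-context
      fieldPath: data.subscriptionID
    targets:
      - select:
          kind: AzureManagedControlPlane
        fieldPaths:
          - spec.subscriptionID
```

Each additional value (tenant ID, location, and so on) gets its own replacement entry, so a single edit to azure-context flows to every resource that needs it.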
Kptfile
The Kptfile is central to managing the package lifecycle with kpt. It describes the package (nephio-workload-cluster-aks) and defines the pipeline of functions to be applied to the package, ensuring the desired transformations are executed.
The user-defined values are configured in the azure-context.yaml file. Because azure-context.yaml is committed to Git and must be injected into the deployed package, its config.kubernetes.io/local-config annotation is set to "false".
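Putting the pipeline together, a sketch of the downstream package's Kptfile might look like this (function image versions are illustrative):

```yaml
# Illustrative Kptfile for the downstream package: runs setters first,
# then propagates values with apply-replacements.
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: nephio-workload-cluster-aks
info:
  description: Deploys an AKS workload cluster via Cluster API
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/apply-setters:v0.2.0
      configPath: setters.yaml
    - image: gcr.io/kpt-fn/apply-replacements:v0.1.1
      configPath: apply-replacements.yaml
```

Running kpt fn render (or letting porch render on the server side) executes these two mutators in order against every resource in the package.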
Registering Blueprint repo in Porch
To integrate the above blueprint repository with Porch for cloning and deployment, apply the following YAML to create the repository and the associated secret:
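A sketch of that registration, assuming a GitHub-hosted blueprint repository and a personal access token (organization name and secret values are placeholders):

```yaml
# Illustrative basic-auth secret for the blueprint repository
apiVersion: v1
kind: Secret
metadata:
  name: blueprints-infra-aks-auth
  namespace: default
type: kubernetes.io/basic-auth
stringData:
  username: "<github-username>"   # placeholder
  password: "<github-token>"      # placeholder personal access token
---
# Illustrative porch Repository registration for the blueprint repo
apiVersion: config.porch.kpt.dev/v1alpha1
kind: Repository
metadata:
  name: blueprints-infra-aks
  namespace: default
spec:
  type: git
  content: Package
  deployment: false               # blueprint repo, not a deployment repo
  git:
    repo: https://github.com/<org>/blueprints-infra-aks.git  # placeholder
    branch: main
    secretRef:
      name: blueprints-infra-aks-auth
```

Setting deployment: false marks this as a blueprint repository; the mgmt deployment repository would be registered the same way with deployment: true.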
Once the repo is registered, you should see all the revisions and tags available as remote packages.
Create Package Revisions
We need to create package revisions so that configsync can execute the PackageVariant definitions on the management cluster.
The first step is to clone the nephio-workload-cluster-aks package to the target repository, then propose and approve it. Nothing needs to be modified locally, as all rendering is done by the PackageVariant based on the pipeline setup.
The usual flow of a package is depicted below.
When you first register the repo, only the main branch and its revision will be visible. When creating package revisions for the first time, use the clone command; for any subsequent revision, use the copy command.
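The clone/propose/approve flow above can be sketched with the kpt rpkg commands (the package-revision names are placeholders; list them first with rpkg get to find the real ones):

```shell
# List package revisions that porch knows about
kpt alpha rpkg get

# First revision: clone the blueprint into the mgmt deployment repository
kpt alpha rpkg clone <upstream-package-revision> workload-aks \
  --repository mgmt -n default

# Move the draft through the lifecycle: propose, then approve.
# On approval, configsync picks up the published revision and applies it.
kpt alpha rpkg propose <package-revision-name> -n default
kpt alpha rpkg approve <package-revision-name> -n default

# Subsequent revisions: copy the latest revision into a new workspace
kpt alpha rpkg copy <package-revision-name> -n default --workspace v2
```

Each command operates against porch on the management cluster rather than a local checkout, which is exactly the client/server split between kpt and porch described earlier.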
In this demo, our target repository is “mgmt”, and we want the package to be executed on the mgmt cluster. Once the package is approved in porch, configsync, which has a 1:1 mapping with the mgmt repository, will apply the packages as each new revision becomes available.
This package revision clones the upstream package, performs kpt function rendering, and creates the Azure workload cluster.
We can use PackageVariantSet to create multiple clusters at the same time based on selectors if required. It has a template similar to PackageVariant for actuating upstream packages.
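As a sketch of that fan-out, a PackageVariantSet might select target objects by label and stamp out one variant per match (the selector kind, label, and API versions here are illustrative assumptions):

```yaml
# Illustrative PackageVariantSet: one downstream package per
# WorkloadCluster object matching the label selector.
apiVersion: config.porch.kpt.dev/v1alpha2
kind: PackageVariantSet
metadata:
  name: aks-edge-clusters           # hypothetical name
spec:
  upstream:
    repo: blueprints-infra-aks
    package: cluster-capi-aks
    revision: v1                    # assumed published revision
  targets:
    - objectSelector:
        apiVersion: infra.nephio.org/v1alpha1   # assumed CRD
        kind: WorkloadCluster
        matchLabels:
          nephio.org/region: edge   # illustrative label
      template:
        downstream:
          repo: mgmt
```

The controller expands this into individual PackageVariant resources, so the per-cluster lifecycle behaves exactly as in the single-cluster case.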
It is important to mention that additional NF-related CRDs and network configurations may be required on the workload cluster to deploy the free5GC core and OAI RAN; these are not covered in this blog.
Gitops with argoCD in Nephio
We can also use argocd as a Gitops tool in Nephio to sync deployment repositories to workload clusters. However, it is important to note that argocd does not support the full feature set of kpt packages, such as kpt functions.
Follow these steps to install and configure argocd on Kubernetes clusters.
There are a few steps involved in creating a sync between the workload repo and the newly created workload cluster.
- Get the cluster kubeconfig. In this demo we created an Azure cluster, so we used the Azure CLI to fetch and save the kubeconfig.
- Add the cluster to argocd.
- Add the repo to argocd. For the demo, we placed a simple nginx deployment in the repo.
- Create an application that defines the source repo and the destination Kubernetes cluster. More detailed configuration is available for controlling how the sync is performed; in this demo we create a simple config without the autosync option enabled.
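The steps above can be sketched with the Azure and argocd CLIs (server addresses, resource-group, repo URL, and path are all placeholders):

```shell
# Fetch and save the AKS kubeconfig into the local kubeconfig file
az aks get-credentials --resource-group <rg> --name <cluster-name>

# Log in to the Argo CD API server (address/credentials illustrative)
argocd login <argocd-server> --username admin --password <password>

# Register the workload cluster using its kubeconfig context name
argocd cluster add <kube-context-name>

# Register the deployment repository containing the nginx manifest
argocd repo add https://github.com/<org>/workload-deployments.git

# Create the application mapping repo path -> destination cluster,
# without autosync (sync is triggered manually from the UI or CLI)
argocd app create nginx-demo \
  --repo https://github.com/<org>/workload-deployments.git \
  --path nginx \
  --dest-server https://<cluster-api-endpoint> \
  --dest-namespace default
```

Adding `--sync-policy automated` to app create is the declarative way to enable autosync, as mentioned below.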
Once the application is created, click Sync to perform the synchronization. Autosync can be enabled either via declarative configuration or through the UI.
Nephio: A Game-Changer in 5G Network Automation
Deploying and managing 5G network functions and edge applications at massive scale across multiple cloud vendors and edge clusters has been a daunting challenge for many of our customers. This process involves extensive planning, spanning months, to identify and provision the necessary infrastructure, network functions, and their configurations. The management of deployments, including day-0, day-1, and day-2 configurations, becomes a substantial task, particularly when dealing with systems at scale, each having its own set of automation tools.
Before any significant deployment, we typically observe product design and customer service teams dispersed across various regions collaborating through numerous spreadsheets, Word documents, and other tools to document configuration values. These values eventually find their way into Helm charts or other YAML files, serving as inputs to standalone scripts or localized pipelines. Coordinating between different teams to ensure the correct configuration is applied to production deployments proves to be a significant challenge.
There is often some coupling between a network function and the cloud platform or workload cluster it can run on, so different network vendors employ unique provisioning mechanisms, each with its own APIs and SDKs. As service providers expand to additional cloud providers and workload clusters, the complexity multiplies. Despite the adoption of various cloud-native technologies, we have observed customers spending months deploying network functions to production.
So, having a unified automation framework with common components and common workflows with standardized templates can significantly reduce many of the above problems and that solution is Nephio!
What is Nephio?
Nephio’s stated goal is to deliver carrier-grade, simple, and open Kubernetes-based cloud-native intent automation. It achieves this by implementing a single unified platform for automation using intent-based, declarative configuration with active reconciliation.
Features of Nephio
- Kubernetes as underlying platform – Kubernetes as the fundamental underlying platform with its orchestration capabilities.
- Intent Driven – An approach based on high-level goals rather than detailed instructions: for instance, “deploy a network function” rather than step-by-step instructions to provision an NF. Nephio enables users to articulate high-level objectives, transitioning away from manual, granular configuration. This intent-driven approach simplifies network function automation, making it user-friendly and less error-prone.
- Declarative & CaD – The automation is declarative with the help of Configuration-as-Data (CaD); it understands the user’s intent and helps set up the cloud and edge workloads. Configurations are managed in a standard way through kpt packaging. A declarative system continuously evaluates the current state against the intended state and reconciles to realize the intent.
- Reconciliation – Control loops ensure the running state always matches the declaration, avoiding configuration drift. This is achieved through Kubernetes CRD extensions and the operator pattern.
- Distributed Actuation – Distributed management of workloads so that the system is resilient. For example, CRD in edge cluster to manage workloads deployed on edge.
- Gitops at heart – Nephio enables version control, collaboration and compliance, making network function management both efficient and transparent by embracing the principles of Gitops.
Nephio’s approach to tackling complexity
Nephio is designed to handle the complexity of multi-vendor, multi-site deployments efficiently. Its Kubernetes-based framework acts as a unified control plane, offering a standardized method to manage different network functions irrespective of the underlying vendor or site-specific peculiarities. This uniform approach eliminates the need for bespoke solutions for each vendor, streamlining the entire process.
By centralizing the management of network functions across various sites, Nephio enables seamless coordination and deployment, ensuring consistency and reducing the risk of configuration drift.
One of the key innovations of Nephio is its adoption of machine-manipulable configurations. This approach facilitates automated, programmable, and repeatable configurations, which are crucial in managing complex network environments.
How does Nephio fit with existing orchestration solutions?
Nephio primarily focuses on domain and infrastructure orchestration in conformance with O-RAN and 3GPP, with the help of Kubernetes and its CRDs. It complements many existing open-source projects, such as ONAP in service orchestration, to provide end-to-end automation in telecommunication networks.
Changing every existing automation layer is not feasible. However, integration with Nephio is possible for any Kubernetes-based system that works with KRM. Today most cloud-native deployments use Helm charts, which produce manifests that cannot be altered at runtime, making reconciliation difficult. With Nephio, network vendors can still use their Helm charts to provision network functions by implementing a Helm operator; though this will not reap all the benefits of Nephio, it can certainly help organizations adopt Nephio swiftly.
Zinkworks and Nephio
Zinkworks is currently engaged in dynamic research efforts focused on Nephio, which involves the implementation and execution of a diverse range of use cases across multiple cloud platforms. Our partnership with Google enables us to provide unparalleled expertise and support for Nephio projects. Through our collaboration with industry leaders, we are well-positioned to deliver cutting-edge solutions and drive innovation in the field of Nephio. Learn how your business can leverage Nephio by speaking with our Zinkworks team, contact marketing@zinkworks.com
References - https://nephio.org/