Category Archives: Kubernetes

GKE with VPN – Networking options

While working on a recent hybrid GCP plus on-premise customer architecture, we needed to connect a GKE cluster running in GCP to a service running on-premise through a VPN. There were a few unique requirements, like needing to expose only a small IP range to the on-premise network and having full control over the IP addresses exposed. In this blog, I will talk about the different approaches possible from a networking perspective when connecting a GKE cluster to an on-premise service. The following options are covered in this blog:

  • Flexible pod addressing scheme
  • Connecting using NAT service running on VM
  • Using IP masquerading at the GKE node level

I did not explore the Cloud NAT managed service as it works only with private clusters and does not work through VPN. I have used VPC-native clusters as that has become the default networking scheme and is more straightforward to use than route-based clusters. For more information on VPC-native clusters and IP aliasing, please refer to my earlier blog series here.

Requirements

The following was the high-level architecture:

Architecture diagram

The on-premise application exposes its service on a specific TCP port that we needed to access from the pods running in the GKE cluster. We needed to expose only a few specific GCP IP addresses to on-premise.
For this use case, I have used VPN with dynamic routing. The on-premise firewall needs to be opened up for the source IP addresses that are accessed from GCP. To try this example when you don't have an on-premise network, you can set up 2 VPCs and make one of them simulate on-premise.

Flexible pod addressing scheme

In VPC-native clusters, there are separate IP address ranges allocated for GKE nodes, pods and services. The node IP address is allocated from the VPC subnet range. There are 2 ways to allocate IP addresses to pods and services.

GKE managed secondary address

In this scheme, GKE manages the secondary address ranges. When the cluster is created, GKE automatically creates 2 IP alias ranges, 1 for the pods and another for the services. The user has a choice to enter the IP address ranges for the pods and services or let GKE pick the address ranges. Following are the default, minimum and maximum subnet range sizes for the pods and services:

  • Pods: default /14 (2^18 pod IP addresses), minimum /9 (2^23 pod IP addresses), maximum /21 (2^11 pod IP addresses)
  • Services: default /20 (2^12 service IP addresses), minimum /16 (2^16 service IP addresses), maximum /27 (2^5 service IP addresses)

There is another important parameter called number of pods per node. By default, GKE reserves a /24 block, or 256 IP addresses, per node. Considering IP address reuse among pods as pods get autoscaled, 110 pods share the 256 IP addresses, and so the number of pods per node is set by default to 110. This number can be user configured.

For example, taking /21 for pods, we can have a total of 2048 pod addresses. Assuming the default of 110 pods (/24 address range for pods) on each node, we can have only a maximum of 2^(24-21) = 8 nodes. This limit is irrespective of the subnet range reserved for the nodes. If we reduce the number of IP addresses for pods per node to 64 (/26 range), then we can have a maximum of 2^(26-21) = 32 nodes.

User managed secondary address

For my use case, the GKE managed secondary address scheme did not help, since the smallest pod IP range it allows is /21 and the customer was not willing to expose a /21 IP range in their on-premise firewall. The customer was willing to provide a /25 or /27 IP range for the pods. We settled on the configuration below:

  • /25 range for the pods, 8 pods per node. A /25 range gives us 128 pod addresses. 8 pods per node needs 16 IP addresses (4 bits) per node. This gives us a maximum of 2^(7-4) = 8 nodes in the cluster.
  • /27 range for the services. It was not needed to expose the service IP range to on-premise, as service IP addresses are used more for egress from on-premise.
  • /27 range for the nodes. Even though we could have created 32 nodes with this range, we are limited to 8 nodes because of the first point above.

Following are the steps to create a cluster with user managed secondary addresses (a gcloud sketch of these steps follows the list):

  • Create IP alias ranges for the pods and services from the VPC section in the console.
  • When creating the cluster, disable the option “automatically create secondary range” and select the pre-created IP alias ranges for the pods and services.
  • Set maximum number of pods per node to 8.
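The same steps can also be expressed with gcloud. The sketch below is illustrative only; the range names, CIDRs and cluster name are assumptions based on the configuration described above, reusing the network and subnet names from the later examples in this post.

# Create the two secondary (alias) ranges on the existing subnet (names/CIDRs are examples)
gcloud compute networks subnets update subnet-a \
    --region us-east1 \
    --add-secondary-ranges pod-range=10.10.0.0/25,service-range=10.10.1.0/27

# Create the cluster using the pre-created ranges and 8 pods per node
gcloud container clusters create gke-vpn-cluster \
    --zone us-east1-b \
    --network gcp-vpc --subnetwork subnet-a \
    --enable-ip-alias \
    --cluster-secondary-range-name pod-range \
    --services-secondary-range-name service-range \
    --default-max-pods-per-node 8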

The following picture shows the 2 IP alias ranges created along with the VPC subnet:

VPC subnet with 2 alias IP ranges

The following picture shows the networking section of cluster creation, where we have specified the primary and secondary IP ranges for pod and service IP addresses.

Cluster with custom IP ranges for pods and services

Connecting using NAT service running on VM

Rather than exposing individual pod IP addresses to the on-premise service, we can expose a single IP address using a NAT service running in GCP. With this approach, all the pod IP addresses get translated to the single NAT IP address. We only need to expose the single NAT IP address to the on-premise firewall.

The following picture shows how the architecture would look:

Connecting to on-premise using NAT

Following are the steps needed:

  • Create a NAT instance on Compute Engine. As I mentioned earlier, the Cloud NAT managed service could not be used as it's not integrated with VPN. We can either create a standalone NAT instance or HA NAT instances as described here.
    I used the following command to create the NAT instance:
 gcloud compute instances create nat-gateway --network gcp-vpc \
     --subnet subnet-a \
     --can-ip-forward \
     --zone us-east1-b \
     --image-family debian-9 \
     --image-project debian-cloud \
     --tags nat 
  • Log in to the NAT instance and set up iptables rules for the NAT.
sudo sysctl -w net.ipv4.ip_forward=1
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
  • Create the GKE cluster with a network tag. I was not able to specify a network tag when creating the GKE cluster from the console; the only way is to use the gcloud CLI. The network tag is needed so that the route entry forwarding to the NAT applies only to the instances that are part of the GKE cluster.
gcloud container clusters create mygcpr-cluster --tags use-nat \
 --zone us-east1-b \
 --network gcp-vpc --subnetwork subnet-a --enable-ip-alias 
  • Create a route entry to forward traffic from the GKE cluster destined to the on-premise service through the NAT gateway. Please make sure that the priority of this route entry supersedes other route entries. (Note that priority works in reverse: a lower number means higher priority.)
gcloud compute routes create nat-vpn-route1 \
     --network gcp-vpc \
     --destination-range 192.168.0.0/16 \
     --next-hop-instance nat-gateway \
     --next-hop-instance-zone us-east1-b \
     --tags use-nat --priority 50 

To test this, I created a pod on the GKE cluster, pinged an on-premise instance and verified with “tcpdump” that the source IP address of the ping request is not a pod IP but the NAT gateway IP address.
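For reference, the verification can be done roughly as shown below. This is a sketch: the pod name, busybox image and the on-premise address 192.168.1.10 (picked from the 192.168.0.0/16 destination range above) are placeholders.

# Start a throwaway pod in the GKE cluster and ping the on-premise host
kubectl run nat-test --image=busybox --rm -it --restart=Never -- ping -c 3 192.168.1.10

# On the on-premise host (or on the NAT gateway), check the source address of the ICMP packets
sudo tcpdump -n -i eth0 icmp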

Using masquerading at the GKE node

An alternative to using a NAT gateway is to do masquerading at the node level. This translates the pod IP address to the node IP address when packets egress from the GKE node. In this case, only the node IP addresses need to be exposed to on-premise and it's not needed to expose pod IP addresses. A masquerading agent runs on each GKE node to achieve this.

Following are the steps to set up masquerading:

  • The 2 basic requirements for the masquerading agent to run on each node are to enable network policy and to have the pod IP addresses outside the RFC 1918 range 10.0.0.0/8. Network policy can be enabled when creating the GKE cluster.
  • By default, masquerading is set up to avoid masquerading RFC 1918 addresses (10.0.0.0/8, 192.168.0.0/16, 172.16.0.0/12) as well as link-local addresses (169.254.0.0/16). This can be overridden with a config file at the cluster level.
  • When the config file changes, the agent on each node periodically re-reads the config and updates its rules. This interval can also be configured.

Following is the config file I used:

nonMasqueradeCIDRs:
resyncInterval: 60s
masqLinkLocal: true

In my case, since I used RFC 1918 addresses for my pods, I wanted those IP addresses to also be masqueraded. The config file works in a negative direction with respect to specifying IP addresses: since I have not specified any addresses under nonMasqueradeCIDRs, all RFC 1918 addresses get masqueraded with this configuration. You can add the specific IP addresses that you do not want to be masqueraded.
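For example, a config like the one below (the CIDR is purely illustrative) would keep masquerading for everything except the listed range:

# Traffic to 10.200.0.0/16 keeps the pod IP as source; everything else gets masqueraded
nonMasqueradeCIDRs:
  - 10.200.0.0/16
resyncInterval: 60s
masqLinkLocal: true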

To apply the config, we can use the following kubectl command:

kubectl create configmap ip-masq-agent --from-file config --namespace kube-system
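To check that the agent picked up the change, something like the following sketch can help. The DaemonSet name and label below are what GKE typically uses for the ip-masq-agent, but verify them in your own cluster.

# Confirm the ip-masq-agent DaemonSet is running and inspect its logs
kubectl get daemonset ip-masq-agent -n kube-system
kubectl logs -n kube-system -l k8s-app=ip-masq-agent --tail=20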

In the customer scenario, we went with the user managed secondary address approach and that worked fine. The other 2 options I described above would also have worked. We hit a few other issues with GCP VPN working with Cisco ASA, which we were eventually able to overcome; more details in a different blog…


Migrate for Anthos

Anthos is a hybrid/multi-cloud platform from GCP. Anthos allows customers to build their application once and run it in GCP or in any other private or public cloud. Anthos unifies the control, management and data planes when running a container based application across on-premise and multiple clouds. Anthos was launched in last year's NEXT18 conference and was made generally available recently. VMware integration is available now; integration with other clouds is planned in the roadmap. One of the components of Anthos is called “Migrate for Anthos”, which allows direct migration of VMs into containers running on GKE. This blog will focus on “Migrate for Anthos”. I will cover the need for “Migrate for Anthos”, the platform architecture, and moving a simple application from a GCP VM into a GKE container. Please note that “Migrate for Anthos” is in BETA now and it is not ready for production.

Need for “Migrate for Anthos”

Modern application development typically uses microservices and containers to improve the application's agility. Containers, Docker and Kubernetes provide the benefits of agility and portability to applications. It is easier to build a greenfield application using microservices and containers. What should we do with applications that already exist as monoliths? Enterprises typically spend a lot of effort in modernizing their applications, which can mean a long journey for many of them. What if we had an automatic way to convert VMs to containers? Does this sound like magic? Yes, “Migrate for Anthos” (earlier called V2K) does quite a bit of magic underneath to automatically convert VMs to containers.

The following diagram shows the different approaches that enterprises take in their modernization and cloud journey. The X-axis shows classic and cloud native applications, the Y-axis shows on-prem and cloud.

Picture borrowed from “Migrate for Anthos” presentations to customers

Migrate and Modernize:
In this approach, we first do a lift and shift of the VMs to the cloud and then modernize the application into containers. Velostrata is GCP's tool to do lift and shift VM migration.

Modernize and Migrate:
In this approach, we first modernize the application on-prem and then migrate the modernized application to the cloud. If the on-prem application is modernized using Docker and Kubernetes, then it can be migrated easily to GKE.

Migrate for Anthos:
Both of the above approaches are 2-step approaches. With “Migrate for Anthos”, migration and modernization happen in the same step. The modernization is not fully complete in this approach: even though the VM is migrated to containers, the monolith application is not broken down into microservices.

You might be wondering why migrate to containers if the monolith application is not converted to microservices. There are some basic advantages that we get by containerizing the monolith application, including portability, better packing and integration with other container services like Istio. As a next step, the monolith container application can be broken down into microservices. There are some roadmap items in “Migrate for Anthos” that will facilitate this.

For some legacy applications, it might not make sense to break them down into microservices, and they can live as a single monolithic container for a long time using this approach. In a typical VM environment, we need to worry about patching, security, networking, monitoring, logging and other infrastructure components, which come out of the box with GKE and Kubernetes after doing the migration to containers. This is another advantage of “Migrate for Anthos”.

“Migrate for Anthos” Architecture

“Migrate for Anthos” converts the source VMs to system containers running in GKE. System containers, compared to application containers, run multiple processes and applications in a single container. Initial support for “Migrate for Anthos” is available with VMware VMs or GCE VMs as the source. The following changes are done to convert a VM to a container:

  • The VM operating system is converted into a kernel supported by GKE.
  • VM system disks are mounted inside the container using a persistent volume (PV) and a StatefulSet.
  • Networking, logging and monitoring use GKE constructs.
  • Applications running inside the VM using systemd scripts run in the container user space.
  • During the initial migration phase, storage is streamed to the container using CSI. The storage can then be migrated to any storage class supported by GKE.

Following are the components of “Migrate for Anthos”:

  • “Migrate for Compute Engine” (formerly Velostrata) – the Velostrata team has enhanced the VM migration tool to also convert VMs to containers and then do the migration. The fundamentals of Velostrata, including the agentless and streaming technologies, remain the same for “Migrate for Anthos”. The Velostrata manager and cloud extensions need to be installed in the GCP environment to do the migration. Because Velostrata uses streaming technology, the complete VM storage need not be migrated before the container can run in GKE, which speeds up the entire migration process.
  • GKE cluster – “Migrate for Anthos” runs in the GKE cluster as application containers and can be installed from the GKE marketplace.
  • Source VM – the source VM can be in GCE or in a VMware environment. In a VMware environment, a “Migrate for Anthos” component needs to be installed in VMware as well.

The following picture shows the different components in the VM and how they will look when they are migrated.

Picture borrowed from “Migrate for Anthos” presentations to customers

The second column in the picture shows what exists currently when the VM is migrated to a GKE container. The only option currently is to do vertical scaling when capacity is reached. The yellow components leverage Kubernetes and the green components run inside containers. The third column in the picture shows how the future would look, where we can have multiple containers with horizontal pod autoscaling.

“Migrate for Anthos” hands-on

I did a migration of a GCE VM to a container running in GKE using “Migrate for Anthos”. The GCE VM has a base Debian OS with the nginx web server installed.

The following is a summary of the steps to do the migration:

  • Create service account for Velostrata manager and cloud extension.
  • Install Velostrata manager from marketplace with the service accounts created in previous step.
  • Create cloud extension from Velostrata manager.
  • Create GKE cluster.
  • Install “Migrate for Anthos” from GKE marketplace on the GKE cluster created in previous step.
  • Create source VM in GCE and install needed application in the source VM.
  • Create the YAML configuration (persistent volume, persistent volume claim, StatefulSet) from the source VM.
  • Stop source VM.
  • Apply the YAML configuration on top of the GKE cluster.
  • Create Kubernetes service configuration files to expose the container services.

Service account creation:
I created service accounts for Velostrata manager and cloud extension using steps listed here. I used the single project configuration example.

Velostrata manager installation:
I used the steps listed here to install Velostrata manager from the marketplace and to do the initial configuration. Velostrata manager provides the management interface where all migrations can be managed. I have used the “default” network for my setup. We need to remember the API password for future steps.

Create cloud extension:
I used the steps here to install the cloud extension from Velostrata manager. The cloud extension takes care of storage caching in GCP.

Create GKE cluster:
I used the steps here to create the GKE cluster. The GKE nodes and the source VM need to be in the same zone. Because of this restriction, it is better to create a regional cluster so that we have GKE nodes in all the zones of the region. When I first tried the migration, I got an error like the one below:

Events:
  Type     Reason             Age                 From                Message
  ----     ------             ----                ----                -------
  Normal   NotTriggerScaleUp  1m (x300 over 51m)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added):
  Warning  FailedScheduling   1m (x70 over 51m)   default-scheduler   0/9 nodes are available: 9 node(s) had volume node affinity conflict.

Based on discussion with the Velostrata engineering team, I understood that the problem was that the pod could not be scheduled since none of the GKE nodes were in the same zone as the source VM. In my case, I created a regional cluster in us-central1, but it created nodes only in 3 zones instead of the 4 zones available in us-central1. My source VM unfortunately resided in the 4th zone, where no GKE node was present. This looks like a bug in GKE regional cluster creation where GKE nodes are not created in all zones. After I recreated the source VM in one of the zones where GKE nodes were present, the problem was resolved.
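One way to work around this is to pin the node locations explicitly when creating the regional cluster. The sketch below assumes the us-central1 region and an illustrative cluster name:

# Create a regional cluster with nodes in all four us-central1 zones
gcloud container clusters create migrate-cluster \
    --region us-central1 \
    --node-locations us-central1-a,us-central1-b,us-central1-c,us-central1-f \
    --enable-ip-alias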

Install “Migrate for Anthos”:
I used the steps here to install “Migrate for Anthos” in the GKE cluster. We need to provide the Velostrata manager IP address and the cloud extension name that we created in the previous steps.

Create source VM:
I created a Debian VM and installed the nginx web server.

sudo apt-get update 
sudo apt-get install -y nginx 
sudo service nginx start 
sudo sed -i -- 's/nginx/Google Cloud Platform - '"\$HOSTNAME"'/' /var/www/html/index.nginx-debian.html

Create YAML configuration from source VM:
I used the steps here. This is the command I used to create the Kubernetes configuration. The configuration contains details to create the persistent volume claim (PVC), persistent volume (PV) and StatefulSet.

python3 /google/migrate/anthos/gce-to-gke/clone_vm_disks.py \
-p sreemakam-anthos `#Your GCP project name` \
-z us-central1-b `#GCP Zone that hosts the VM, for example us-central1-a` \
-i webserver `#Name of the VM. For example, myapp-vm` \
-A webserver `#Name of workload that will be launched in GKE` \
-o webserver.yaml `#Filename of resulting YAML configuration`

Apply YAML configuration:
Before applying the YAML config, we need to stop the source VM; this ensures a consistent snapshot (a gcloud sketch for stopping it is shown below). Applying the generated YAML, as in this link, then creates the persistent volume claim (PVC), persistent volume (PV) and StatefulSet. The volume uses a GCE persistent disk.
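A minimal sketch for stopping the source VM, reusing the VM name and zone from the clone command above:

# Stop the source VM so the disk snapshot is consistent
gcloud compute instances stop webserver --zone us-central1-b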

kubectl apply -f webserver.yaml

Create Kubernetes service configuration:
To expose the container service running on port 80, we can create a Kubernetes service as shown below.

kind: Service
apiVersion: v1
metadata:
  name: webserver
spec:
  type: LoadBalancer
  selector:
    app: webserver
  ports:
  - name: http
    protocol: TCP
    port: 80
    targetPort: 80

After applying the service, a load balancer with an external IP address is created, through which we can access the nginx web service.
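Assuming the manifest above is saved as webserver-service.yaml (the filename is illustrative), applying and testing it looks roughly like this:

kubectl apply -f webserver-service.yaml

# Wait for the external IP to be provisioned, then access the nginx web server
kubectl get service webserver --watch
curl http://<EXTERNAL-IP>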

The above example shows a migration of a simple VM to a container. The link here talks about how to migrate a two-tier application involving an application and a database. Examples of applications that can be migrated include web applications, middleware frameworks and any applications built on Linux. Supported operating systems are mentioned here.

I want to convey special thanks to Alon Pildus from the “Migrate for Anthos” team, who helped review and suggest improvements to this blog.


Devops with Kubernetes

I gave the following presentation, “Devops with Kubernetes”, at the Kubernetes Sri Lanka inaugural meetup earlier this week. Kubernetes is one of the most popular open source projects in the IT industry currently. Kubernetes abstractions, design patterns, integrations and extensions make it very elegant for DevOps. The slides delve a little deeper into these topics.

NEXT 100 Webinar – Top 3 reasons why you should run your enterprise workloads on GKE

I presented the webinar “Top 3 reasons why you should run your enterprise workloads on GKE” at the NEXT100 CIO forum earlier this week. Businesses are increasingly moving to containers and Kubernetes to simplify and speed up their application development and deployment. The slides and demo cover the top reasons why Google Kubernetes Engine (GKE) is one of the best container management platforms for enterprises to deploy their containerized workloads.

Following are the slides and recording:

Recording link


Container Conference Presentation

This week, I gave a presentation at the Container Conference, Bangalore. The conference was well conducted and was attended by 400+ quality attendees. I enjoyed some of the sessions and also had fun talking to attendees. The topic I presented was “Deep dive into Kubernetes Networking”. Other than covering Kubernetes networking basics, I also touched on network policy, the Istio service mesh, hybrid cloud and best practices.

Slides:

Recording:

Demo code and Instructions:

Github link

Recording of the Istio section of the demo (this recording was not done at the conference):

As always, feedback is welcome.

I was out of blogging action for the last 9 months as I was settling into my new job at Google and I also had to take care of some personal stuff. Things are getting a little clearer now and I am hoping to start blogging again soon…


Comparing Docker deployment options in public cloud

A few weeks back, I gave a presentation at the Container Conference, Bangalore, comparing the different solutions available to deploy Docker in the public cloud.

Slides are available here. I have also put the necessary steps, along with a short video for each of the options, in the GitHub page here.

Abstract of the talk:

Containers provide portability for applications across private and public clouds. Since there are many options to deploy Docker containers in the public cloud, customers can get confused in the decision making process. I will compare Docker Machine, Docker Cloud, Docker Datacenter, Docker for AWS, Azure and Google Cloud, AWS ECS, Google Container Engine and Azure Container Service. A sample multi-container application will be deployed using the different options. The deployment differences, including technical internals for each option, will be covered. At the end of the session, the user will be able to choose the right Docker deployment option for their use case.

Note:

  • I have focused mainly on Docker-centric options in the comparison.
  • There are a few CaaS platforms like Tectonic and Rancher that I have not included since I did not get a chance to try them.
  • Since all the solutions are under active development, some of the gaps will be covered by the solutions in the future.

Kubernetes CRI and Minikube

Kubernetes CRI (Container Runtime Interface) was introduced in experimental mode in the Kubernetes 1.5 release. CRI introduces a common container runtime layer that allows the Kubernetes orchestrator to work with multiple container runtimes like Docker, Rkt, Runc, Hypernetes etc. CRI makes it easy to plug a new container runtime into Kubernetes. The Minikube project simplifies Kubernetes installation for development and testing purposes. Minikube allows the Kubernetes master and worker components to run in a single VM, which makes it easy for developers and users of Kubernetes to try out Kubernetes. In this blog, I will cover the basics of Minikube usage, an overview of CRI and the steps to try out CRI with Minikube.

Minikube

Kubernetes is composed of multiple components, and beginners normally get overwhelmed by the installation steps. It is also useful to have a lightweight Kubernetes environment for development and testing purposes. Minikube runs all Kubernetes components in a single VM on the local laptop; both master and worker functionality are combined in that single VM.
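As a rough illustration (assuming minikube and kubectl are already installed locally, and the deployment name is just an example), getting the single-VM cluster up looks like this:

# Start the single-VM Kubernetes cluster and verify the node is ready
minikube start
kubectl get nodes

# Run a test workload and open the dashboard
kubectl run hello-nginx --image=nginx
minikube dashboard

# Minikube can also be started with a different container runtime via --container-runtime;
# the supported values depend on the minikube version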

Following are some major features present in Minikube:

Continue reading Kubernetes CRI and Minikube