AKS
On this page
- What are Kubernetes and Azure Kubernetes Services (AKS)?
- Deploy AKS
- Node pools sizing and configuration
- Node pools
- Node Selectors
- Requests and Limits
- Disks
What are Kubernetes and Azure Kubernetes Services (AKS)?
Kubernetes is open-source software that helps deploy and manage containerized applications at scale. It orchestrates a cluster of Azure virtual machines, schedules containers, automatically manages service discovery, incorporates load balancing, and tracks resource allocation. It also checks the health of individual resources and heals apps with auto-restart and auto-replication. AKS provides a managed Kubernetes service with automated provisioning, upgrading, monitoring, and on-demand scaling. (Source: https://azure.microsoft.com/en-us/services/kubernetes-service/#faq)
There are several ways to install an AKS cluster: ARM templates, Azure CLI, Terraform - the choice is yours.
To deploy Azure resources for CluedIn, you need to provide CluedIn Partner GUID (e64d9978-e282-4d1c-9f2e-0eccb50582e4
). The way you provide the Partner GUID depends on the way you deploy AKS:
Deploy AKS
Role-Based Access Control (RBAC)
To deploy and manage Azure resources, you need sufficient access rights. You can read more about it in Microsoft Documentation: Manage access to your Azure environment with Azure role-based access control Azure built-in roles You need a Contributor role on the Subscription level. If it’s not possible to have this role, you need to ask someone with enough permissions to create an AKS cluster. When you create a new AKS cluster in a particular resource group, Microsoft Azure automatically creates an infrastructure resource group (with “MC_” prefix) to keep AKS-related resources: disks, public IP, identity, etc. Therefore, you should have enough permissions to create resource groups in a given subscription to create a cluster. Then, to manage the cluster, you need to be a Contributor in two AKS resource groups - the group where you have created the AKS and the related infrastructure group.
Azure CLI
Walkthrough (Microsoft Docs): https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough Microsoft’s instructions to deploy the Partner GUID: https://docs.microsoft.com/en-us/azure/marketplace/azure-partner-customer-usage-attribution#example-azure-cli
To install the Partner GUID, you need to add an environment variable to your terminal session.
Bash:
export AZURE_HTTP_USER_AGENT='pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4' ;
echo AZURE_HTTP_USER_AGENT # should print pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4
PowerShell:
$env:AZURE_HTTP_USER_AGENT='pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4' ;
$env:AZURE_HTTP_USER_AGENT # should print pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4
ARM Template
Walkthrough (Microsoft Docs): https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough-rm-template
Walkthrough with Partner GUID (CluedIn Docs): https://documentation.cluedin.net/kb/azure-customer-usage-attribution.
Microsoft’s instructions to deploy the Partner GUID: https://docs.microsoft.com/en-us/azure/marketplace/azure-partner-customer-usage-attribution#add-a-guid-to-a-resource-manager-template
To deploy with the Partner GUID, you only need to add this deployment to the resources section:
{
"apiVersion": "2020-06-01",
"name": "pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4",
"type": "Microsoft.Resources/deployments",
"properties": {
"mode": "Incremental",
"template": {
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources": []
}
}
},
Node pools sizing and configuration
When you install CluedIn from Azure Marketplace, the AKS cluster is properly configured during the installation. However, you can use this setup as a reference for custom installs.
Node pools
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. These node pools contain the underlying VMs that run your applications. https://learn.microsoft.com/en-us/azure/aks/use-multiple-node-pools
See also:
- https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
- https://azureprice.net/
The CluedIn configuration comes with a set of node pools listed below. The YAML snippets for each node type include only their unique features to make them more brief.
system
Node count: 1
apiVersion: v1
kind: Node
metadata:
labels:
agentpool: system
kubernetes.azure.com/mode: system
kubernetes.cluedin.com/pooltype: system
node.kubernetes.io/instance-type: Standard_DS2_v2 # General purpose compute, 2 vCPUs, 7 GiB RAM.
spec:
taints:
- key: CriticalAddonsOnly
value: 'true'
effect: NoSchedule
The node pool runs critical add-on pods only.
general
Node count: 2
apiVersion: v1
kind: Node
metadata:
labels:
agentpool: general
kubernetes.azure.com/agentpool: general
kubernetes.azure.com/mode: user
kubernetes.cluedin.com/pooltype: general
node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.
The node pool runs more or less lightweight microservices but not databases or processing pods.
data
Node count: 2
apiVersion: v1
kind: Node
metadata:
labels:
agentpool: data
kubernetes.azure.com/agentpool: data
kubernetes.azure.com/mode: user
kubernetes.cluedin.com/pooltype: data
node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.
spec:
taints:
- key: kubernetes.cluedin.com/pool
value: data
effect: NoSchedule
The node pool runs databases and the message broker: SQL Server, Elasticsearch, RabbitMQ, Redis, but not Neo4j.
dataneo
Node count: 1
apiVersion: v1
kind: Node
metadata:
labels:
agentpool: dataneo
kubernetes.azure.com/agentpool: dataneo
kubernetes.azure.com/mode: user
kubernetes.cluedin.com/pooltype: data-neo
node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.
spec:
taints:
- key: kubernetes.cluedin.com/pool
value: data-neo
effect: NoSchedule
The node pool is dedicated to running Neo4j.
processing
Node count: 1
apiVersion: v1
kind: Node
metadata:
labels:
agentpool: processing
kubernetes.azure.com/agentpool: processing
kubernetes.azure.com/mode: user
kubernetes.cluedin.com/pooltype: processing
node.kubernetes.io/instance-type: Standard_F8s_v2 # Compute optimized VMs, 8 vCPUs, 16 GiB RAM.
spec:
taints:
- key: kubernetes.cluedin.com/pool
value: processing
effect: NoSchedule
The node pool runs processing pods. You can scale this node pool horizontally when needed. For example, during historical data loads or full reprocessing.
Node Selectors
infrastructure:
elasticsearch:
nodeSelector:
kubernetes.cluedin.com/pooltype: data
neo4j:
nodeSelector:
kubernetes.cluedin.com/pooltype: data-neo
rabbitmq:
nodeSelector:
kubernetes.cluedin.com/pooltype: data
redis:
master:
nodeSelector:
kubernetes.cluedin.com/pooltype: data
mssql:
nodeSelector:
kubernetes.cluedin.com/pooltype: data
application:
cluedin:
roles:
processing:
nodeSelector:
kubernetes.cluedin.com/pooltype: processing
Requests and Limits
infrastructure:
neo4j:
core:
resources:
requests:
cpu: "7"
memory: "28Gi"
limits:
cpu: "7"
memory: "28Gi"
elasticsearch:
resources:
requests:
cpu: "1"
memory: "2Gi"
limits:
cpu: "6"
memory: "26Gi"
redis:
master:
resources:
limits:
cpu: "1"
memory: "2Gi"
mssql:
resources:
requests:
cpu: "1"
memory: "2Gi"
limits:
cpu: "5"
memory: "22Gi"
rabbitmq:
resources:
limits:
cpu: "2"
memory: "6Gi"
application:
cluedin:
roles:
main:
resources:
limits:
cpu: "2"
memory: "12Gi"
processing:
resources:
limits:
cpu: "15"
memory: "28Gi"
crawling:
resources:
limits:
cpu: "2"
memory: "12Gi"
cluedincontroller:
resources:
limits:
cpu: "0.5"
memory: "512Mi"
annotation:
resources:
limits:
cpu: "0.75"
memory: "512Mi"
prepare:
resources:
limits:
cpu: "0.75"
memory: "512Mi"
datasource:
resources:
limits:
cpu: "1"
memory: "4Gi"
submitter:
resources:
limits:
cpu: "0.5"
memory: "512Mi"
gql:
resources:
limits:
cpu: "0.5"
memory: "2Gi"
ui:
resources:
limits:
cpu: "0.5"
memory: "2Gi"
webapi:
resources:
limits:
cpu: "0.5"
memory: "512Mi"
Disks
infrastructure:
elasticsearch:
volumeClaimTemplate:
resources:
requests:
storage: "500Gi"
mssql:
persistence:
dataSize: "750Gi"
transactionLogSize: "750Gi"
masterSize: "128Gi"
neo4j:
core:
persistentVolume:
size: "500Gi"
rabbitmq:
persistence:
size: "150Gi"
redis:
master:
persistence:
size: "32Gi"