AKS

On this page

  1. What are Kubernetes and Azure Kubernetes Services (AKS)?
  2. Deploy AKS
    1. Role-Based Access Control (RBAC)
    2. Azure CLI
    3. ARM Template
  3. Node pools sizing and configuration
  4. Node pools
    1. system
    2. general
    3. data
    4. dataneo
    5. processing
  5. Node Selectors
  6. Requests and Limits
  7. Disks

What are Kubernetes and Azure Kubernetes Services (AKS)?

Kubernetes is open-source software that helps deploy and manage containerized applications at scale. It orchestrates a cluster of Azure virtual machines, schedules containers, automatically manages service discovery, incorporates load balancing, and tracks resource allocation. It also checks the health of individual resources and heals apps with auto-restart and auto-replication. AKS provides a managed Kubernetes service with automated provisioning, upgrading, monitoring, and on-demand scaling. (Source: https://azure.microsoft.com/en-us/services/kubernetes-service/#faq)

There are several ways to install an AKS cluster, including ARM templates, Azure CLI, and Terraform. The choice is yours.

To deploy Azure resources for CluedIn, you need to provide the CluedIn Partner GUID (e64d9978-e282-4d1c-9f2e-0eccb50582e4). How you provide the Partner GUID depends on how you deploy AKS, as described in the sections below.

Deploy AKS

Role-Based Access Control (RBAC)

To deploy and manage Azure resources, you need sufficient access rights. You can read more about them in the Microsoft documentation: “Manage access to your Azure environment with Azure role-based access control” and “Azure built-in roles”.

You need the Contributor role at the subscription level. If you cannot get this role, ask someone who has sufficient permissions to create the AKS cluster for you.

When you create a new AKS cluster in a resource group, Microsoft Azure automatically creates an infrastructure resource group (with the “MC_” prefix) to hold AKS-related resources: disks, public IP, identity, etc. Therefore, to create a cluster, you need enough permissions to create resource groups in the subscription.

Then, to manage the cluster, you need to be a Contributor in two resource groups: the group where you created the AKS cluster and the related infrastructure group.
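
The two resource groups that need the Contributor role can be worked out before checking access. The following is a minimal sketch, assuming the cluster uses the default infrastructure group name (`MC_<resource group>_<cluster>_<location>`) and hypothetical names (`my-rg`, `my-aks`, `westeurope`). The `az` call is wrapped in a function, so nothing touches Azure until you call it.

```shell
# Hypothetical names: replace with your own values.
RESOURCE_GROUP="my-rg"
CLUSTER_NAME="my-aks"
LOCATION="westeurope"

# Default name of the infrastructure resource group that AKS creates.
# Assumption: no custom node resource group name was set at cluster creation.
node_resource_group() {
  printf 'MC_%s_%s_%s\n' "$1" "$2" "$3"
}

# List the role assignments of one principal in one resource group.
# $1 = user or service principal (sign-in name or object ID), $2 = resource group
list_roles_in_group() {
  az role assignment list \
    --assignee "$1" \
    --resource-group "$2" \
    --include-inherited \
    --output table
}

INFRA_GROUP="$(node_resource_group "$RESOURCE_GROUP" "$CLUSTER_NAME" "$LOCATION")"
echo "Check Contributor on: $RESOURCE_GROUP and $INFRA_GROUP"
```

For example, `list_roles_in_group user@example.com "$INFRA_GROUP"` lists the roles that the (hypothetical) user holds in the infrastructure group. If the cluster was created with a custom infrastructure group name, `az aks show --resource-group my-rg --name my-aks --query nodeResourceGroup --output tsv` returns the actual name.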

Azure CLI

Walkthrough (Microsoft Docs): https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough

Microsoft’s instructions to deploy the Partner GUID: https://docs.microsoft.com/en-us/azure/marketplace/azure-partner-customer-usage-attribution#example-azure-cli

To provide the Partner GUID, add an environment variable to your terminal session before you run Azure CLI commands.

Bash:

export AZURE_HTTP_USER_AGENT='pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4' ;
echo "$AZURE_HTTP_USER_AGENT" # should print pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4

PowerShell:

$env:AZURE_HTTP_USER_AGENT='pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4' ;
$env:AZURE_HTTP_USER_AGENT # should print pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4

ARM Template

Walkthrough (Microsoft Docs): https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough-rm-template

Walkthrough with Partner GUID (CluedIn Docs): https://documentation.cluedin.net/kb/azure-customer-usage-attribution

Microsoft’s instructions to deploy the Partner GUID: https://docs.microsoft.com/en-us/azure/marketplace/azure-partner-customer-usage-attribution#add-a-guid-to-a-resource-manager-template

To deploy with the Partner GUID, you only need to add this deployment to the resources section:

    { 
      "apiVersion": "2020-06-01",
      "name": "pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4",
      "type": "Microsoft.Resources/deployments",
      "properties": {
          "mode": "Incremental",
          "template": {
              "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
              "contentVersion": "1.0.0.0",
              "resources": []
          }
      }
    }, 
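
Before deploying, it can help to confirm that the template really contains the Partner GUID resource. The following is a minimal sketch, assuming the Azure CLI is installed. The function names are hypothetical helpers, and the template file name in the usage note is a placeholder.

```shell
PARTNER_PID="pid-e64d9978-e282-4d1c-9f2e-0eccb50582e4"

# Return 0 when the template file ($1) contains a resource named after the Partner GUID.
template_has_partner_guid() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$PARTNER_PID\"" "$1"
}

# Deploy the template to a resource group, but only if the GUID resource is present.
# $1 = resource group, $2 = template file
deploy_with_partner_guid() {
  if ! template_has_partner_guid "$2"; then
    echo "Template $2 has no $PARTNER_PID deployment resource" >&2
    return 1
  fi
  az deployment group create --resource-group "$1" --template-file "$2"
}
```

For example, `deploy_with_partner_guid my-rg azuredeploy.json` refuses to deploy a template that lacks the GUID resource. The check is a plain text match on the `name` property, not a JSON parse, so treat it as a quick safeguard only.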

Node pools sizing and configuration

When you install CluedIn from Azure Marketplace, the AKS cluster is configured during the installation. For custom installations, you can use the setup described here as a reference.

Node pools

In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. These node pools contain the underlying VMs that run your applications. (Source: https://learn.microsoft.com/en-us/azure/aks/use-multiple-node-pools)

The CluedIn configuration comes with the node pools listed below. To keep them brief, the YAML snippets for each node pool include only the properties that are specific to that pool.

system

Node count: 1

apiVersion: v1
kind: Node
metadata:
  labels:
    agentpool: system
    kubernetes.azure.com/mode: system
    kubernetes.cluedin.com/pooltype: system
    node.kubernetes.io/instance-type: Standard_DS2_v2 # General purpose compute, 2 vCPUs, 7 GiB RAM.
spec:
  taints:
    - key: CriticalAddonsOnly
      value: 'true'
      effect: NoSchedule

The node pool runs critical add-on pods only.

general

Node count: 2

apiVersion: v1
kind: Node
metadata:
  labels:
    agentpool: general
    kubernetes.azure.com/agentpool: general
    kubernetes.azure.com/mode: user
    kubernetes.cluedin.com/pooltype: general
    node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.

The node pool runs relatively lightweight microservices, but not databases or processing pods.

data

Node count: 2

apiVersion: v1
kind: Node
metadata:
  labels:
    agentpool: data
    kubernetes.azure.com/agentpool: data
    kubernetes.azure.com/mode: user
    kubernetes.cluedin.com/pooltype: data
    node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.
spec:
  taints:
    - key: kubernetes.cluedin.com/pool
      value: data
      effect: NoSchedule

The node pool runs databases and the message broker: SQL Server, Elasticsearch, RabbitMQ, Redis, but not Neo4j.

dataneo

Node count: 1

apiVersion: v1
kind: Node
metadata:
  labels:
    agentpool: dataneo
    kubernetes.azure.com/agentpool: dataneo
    kubernetes.azure.com/mode: user
    kubernetes.cluedin.com/pooltype: data-neo
    node.kubernetes.io/instance-type: Standard_D8s_v4 # General purpose compute, 8 vCPUs, 32 GiB RAM.
spec:
  taints:
    - key: kubernetes.cluedin.com/pool
      value: data-neo
      effect: NoSchedule

The node pool is dedicated to running Neo4j.

processing

Node count: 1

apiVersion: v1
kind: Node
metadata:
  labels:
    agentpool: processing
    kubernetes.azure.com/agentpool: processing
    kubernetes.azure.com/mode: user
    kubernetes.cluedin.com/pooltype: processing
    node.kubernetes.io/instance-type: Standard_F8s_v2 # Compute optimized VMs, 8 vCPUs, 16 GiB RAM.
spec:
  taints:
    - key: kubernetes.cluedin.com/pool
      value: processing
      effect: NoSchedule

The node pool runs processing pods. You can scale this node pool horizontally when needed, for example, during historical data loads or full reprocessing.
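
For a custom install, the user node pools described above could be recreated with Azure CLI. The following is a minimal sketch, assuming hypothetical resource group and cluster names (`my-rg`, `my-aks`). It only prints the commands so that you can review them before running them. The system pool is left out because it is created together with the cluster, and labels such as `agentpool` and `node.kubernetes.io/instance-type` are set by AKS itself.

```shell
# Hypothetical names: replace with your own values.
RESOURCE_GROUP="my-rg"
CLUSTER_NAME="my-aks"

# Print (not run) the az command that adds one user node pool.
# $1 = pool name, $2 = node count, $3 = VM size,
# $4 = pooltype label value, $5 = taint value (optional)
nodepool_add_cmd() {
  cmd="az aks nodepool add --resource-group $RESOURCE_GROUP --cluster-name $CLUSTER_NAME"
  cmd="$cmd --name $1 --mode User --node-count $2 --node-vm-size $3"
  cmd="$cmd --labels kubernetes.cluedin.com/pooltype=$4"
  if [ -n "${5:-}" ]; then
    cmd="$cmd --node-taints kubernetes.cluedin.com/pool=${5}:NoSchedule"
  fi
  echo "$cmd"
}

# Pool definitions taken from the sections above.
nodepool_add_cmd general    2 Standard_D8s_v4 general
nodepool_add_cmd data       2 Standard_D8s_v4 data       data
nodepool_add_cmd dataneo    1 Standard_D8s_v4 data-neo   data-neo
nodepool_add_cmd processing 1 Standard_F8s_v2 processing processing

# Scaling the processing pool for a heavy load (3 is an arbitrary example count).
echo "az aks nodepool scale --resource-group $RESOURCE_GROUP --cluster-name $CLUSTER_NAME --name processing --node-count 3"
```

Printing instead of executing keeps the sketch safe to run anywhere. To apply a command, copy it into a session where you are signed in to Azure.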

Node Selectors

infrastructure:
  elasticsearch:
    nodeSelector:
      kubernetes.cluedin.com/pooltype: data

  neo4j:
    nodeSelector:
      kubernetes.cluedin.com/pooltype: data-neo

  rabbitmq:
    nodeSelector:
      kubernetes.cluedin.com/pooltype: data

  redis:
    master:
      nodeSelector:
        kubernetes.cluedin.com/pooltype: data

  mssql:
    nodeSelector:
      kubernetes.cluedin.com/pooltype: data

application:
  cluedin:
    roles:
      processing:
        nodeSelector:
          kubernetes.cluedin.com/pooltype: processing
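
To confirm that the selectors above place pods on the intended pool, you can list the pods running on the nodes that carry a given `kubernetes.cluedin.com/pooltype` label. The following is a minimal sketch, assuming `kubectl` is configured for the cluster. The function name is a hypothetical helper.

```shell
POOL_LABEL="kubernetes.cluedin.com/pooltype"

# Print "pod node" pairs for every pod running on the nodes of one pool.
# $1 = pooltype label value, for example: data, data-neo, processing
pods_on_pool() {
  nodes="$(kubectl get nodes -l "$POOL_LABEL=$1" -o jsonpath='{.items[*].metadata.name}')"
  for node in $nodes; do
    kubectl get pods --all-namespaces --no-headers \
      --field-selector "spec.nodeName=$node" \
      -o custom-columns=POD:.metadata.name,NODE:.spec.nodeName
  done
}
```

For example, `pods_on_pool data-neo` should list the Neo4j pod. System pods that run on every node, if any, appear in the output too.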

Requests and Limits

infrastructure:
  neo4j:
    core:
      resources:
        requests:
          cpu: "7"
          memory: "28Gi"
        limits:
          cpu: "7"
          memory: "28Gi"

  elasticsearch:
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "6"
        memory: "26Gi"

  redis:
    master:
      resources:
        limits:
          cpu: "1"
          memory: "2Gi"

  mssql:
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "5"
        memory: "22Gi"

  rabbitmq:
    resources:
      limits:
        cpu: "2"
        memory: "6Gi"


application:
  cluedin:
    roles:
      main:
        resources:
          limits:
            cpu: "2"
            memory: "12Gi"
      processing:
        resources:
          limits:
            cpu: "15"
            memory: "28Gi"
      crawling:
        resources:
          limits:
            cpu: "2"
            memory: "12Gi"

  cluedincontroller:
    resources:
      limits:
        cpu: "0.5"
        memory: "512Mi"

  annotation:
    resources:
      limits:
        cpu: "0.75"
        memory: "512Mi"

  prepare:
    resources:
      limits:
        cpu: "0.75"
        memory: "512Mi"

  datasource:
    resources:
      limits:
        cpu: "1"
        memory: "4Gi"

  submitter:
    resources:
      limits:
        cpu: "0.5"
        memory: "512Mi"

  gql:
    resources:
      limits:
        cpu: "0.5"
        memory: "2Gi"

  ui:
    resources:
      limits:
        cpu: "0.5"
        memory: "2Gi"

  webapi:
    resources:
      limits:
        cpu: "0.5"
        memory: "512Mi"
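
One way to sanity-check these values is to add up the limits of the workloads that share a pool and compare the totals with the pool size. The following is a minimal sketch for the data pool, assuming one replica of each workload and using raw VM sizes rather than allocatable capacity.

```shell
# CPU (cores) and memory (Gi) limits of the workloads that the node selectors
# place on the data pool, copied from the values above.
# Assumption: one replica of each workload.
DATA_POOL_LIMITS="
elasticsearch 6 26
mssql 5 22
redis 1 2
rabbitmq 2 6
"

# Raw capacity of the data pool: 2 nodes of Standard_D8s_v4 (8 vCPUs, 32 GiB each).
# Allocatable capacity is lower, because AKS reserves resources on every node.
POOL_CPU=$((2 * 8))
POOL_MEM=$((2 * 32))

# Sum one column of the table above ($1: 2 = CPU, 3 = memory).
sum_limits() {
  echo "$DATA_POOL_LIMITS" | awk -v col="$1" 'NF { total += $col } END { print total }'
}

CPU_LIMITS="$(sum_limits 2)"
MEM_LIMITS="$(sum_limits 3)"
echo "CPU limits: $CPU_LIMITS of $POOL_CPU cores"
echo "Memory limits: ${MEM_LIMITS}Gi of ${POOL_MEM}Gi"
```

The same arithmetic applies to the dataneo pool, where Neo4j requests 7 of the 8 vCPUs and 28 of the 32 GiB on its single node.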

Disks

infrastructure:
  elasticsearch:
    volumeClaimTemplate:
      resources:
        requests:
          storage: "500Gi"

  mssql:
    persistence:
      dataSize: "750Gi"
      transactionLogSize: "750Gi"
      masterSize: "128Gi"

  neo4j:
    core:
      persistentVolume:
        size: "500Gi"

  rabbitmq:
    persistence:
      size: "150Gi"

  redis:
    master:
      persistence:
        size: "32Gi"
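
For quota and cost planning, it can be useful to total the requested volume sizes. The following is a minimal sketch, assuming one replica (one set of volumes) per workload. The entry names in the table are labels chosen for this example.

```shell
# Requested volume sizes in Gi, copied from the values above.
# Assumption: one replica (one set of volumes) per workload.
DISK_SIZES="
elasticsearch 500
mssql-data 750
mssql-transaction-log 750
mssql-master 128
neo4j 500
rabbitmq 150
redis 32
"

# Sum the size column of the table above.
total_disk_gi() {
  echo "$DISK_SIZES" | awk 'NF { total += $2 } END { print total }'
}

echo "Total requested storage: $(total_disk_gi)Gi"
```

After installation, `kubectl get pvc --all-namespaces` shows the volumes that were actually claimed, which you can compare with this total.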