Documentation

You're encouraged to make pull requests to fill any holes you may find in the documentation.

Welcome to the LambdaStack documentation center. At this moment, the documentation lives at https://github.com/lambdastack/lambdastack and is updated frequently.

1 - Overview

Find out if LambdaStack is for you!

LambdaStack is NOT a customized version of Kubernetes! LambdaStack IS a complete Kubernetes Automation Platform that also applies best practices when setting up Kafka, Postgres, RabbitMQ, Elasticsearch (Open Distro), HAProxy, Vault, KeyCloak, Apache Ignite, storage, HA, DR, multicloud, and more.

Open Source License

LambdaStack is fully open source under the Apache-2.0 license, so you may use it as you see fit. Because of the Apache-2.0 license, scans such as Black Duck show no licensing issues for enterprises.

Automation

LambdaStack uses Terraform, Ansible, Docker, and Python, and it is fully data driven: all of the configuration data is stored in YAML files. Some templates are even auto-generated from the YAML data files before being passed to the automation engines, allowing you to customize everything without modifying source code!

How does it differ from others?

Most, if not all, competitors only set up basic automation of Kubernetes. They may use YAML data files, but they:

  • Have a number of options that may be hard-coded
  • Only install Kubernetes
  • Do not install and set up default production-like environments
  • Do not install and set up Kafka, HAProxy, Postgres, RabbitMQ, KeyCloak, Elasticsearch, Apache Ignite, advanced logging, enhanced monitoring and alerting, persistent storage volumes, and much more
  • Do not build their own repo environment for air-gapped systems (used in highly secured and mission critical areas)
  • Cannot run on a single node
  • Are not architected for hybrid and multicloud systems
  • May not be able to run the same on-premise as in the cloud
  • May not be able to build out IaaS and cloud vendor specific managed Kubernetes (e.g., EKS, AKS, GKE)

Why was LambdaStack created?

Enterprises and industry are moving toward digital transformation, but the journey is not simple or fast. There may be many product lines of legacy applications that need to be modernized and/or re-architected to take advantage of the benefits of true microservices. Also, some industry domains are so specialized that the core group of software engineers are more domain specialists than generalists.

LambdaStack was architected to remove the burden of:

  • Learning how to set up and manage Kubernetes
  • Building true microservices where needed
  • Understanding high-performing event processing
  • Auto-scaling
  • Knowing how to build hybrid clusters
  • Knowing how to build true multicloud systems
  • Knowing how to build Kubernetes and all of the above in air-gapped environments
  • Knowing how to build hardened, secure systems
  • Knowing how to operate all of the above in production-like environments
  • A lot more...

Basically, LambdaStack lets development teams focus on adding business value and reducing time-to-market!

Next steps

  • Getting Started: Get started with LambdaStack
  • Examples: Check out some examples of building clusters and building applications to be used on the clusters

2 - Getting Started

It's simple to get started with LambdaStack

LambdaStack comes with a number of simple defaults that only require a cloud vendor key/secret or user ID/password!


Prerequisites (Runtime only - no development)

LambdaStack works on macOS (OSX), Windows, and Linux. You can launch it from your desktop/laptop or from build/jump servers. The following are the basic requirements (a quick check of them is shown after the list):

  • Docker
  • A directory on your hard drive (laptop or build server, depending on where you want to store your LambdaStack-generated manifest files)
  • Git (only if using GitHub fork/cloning to download the source code)
  • Internet access (can be executed in an air-gapped environment - details in documentation)
  • Python 3.x is NOT required on the host; it is listed here only to make that explicit. The LambdaStack container already has it built in
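
A quick way to verify the runtime prerequisites on Linux/macOS before you start (the directory path is only an example, matching the one used later in this section):

docker --version                 # any recent Docker release
git --version                    # only needed if you plan to fork/clone the GitHub repo
mkdir -p /projects/lambdastack   # example location for the LambdaStack-generated manifest files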

Prerequisites (Development)

If you plan to contribute to the LambdaStack project, you will need a development and build environment. LambdaStack works on macOS (OSX), Windows, and Linux. You can launch it from your desktop/laptop or from build/jump servers. The following are the basic requirements for development:

  • Docker
  • Git
  • GitHub account - Free or paid. You will need to Fork LambdaStack to your GitHub account, make changes, commit, and issue a pull request. The development documentation details this for you with examples
  • Internet access (can be executed in an air-gapped environment - details in documentation). If your environment requires proxies, see the documentation on how to set that up
  • Python 3.x
  • IDE (Visual Studio Code or PyCharm are good environments). We use Visual Studio Code since it's open source. We recommend a few plugin extensions, but they get automatically installed if you follow the instructions in the documentation on setting up your development environment using Visual Studio Code
  • Forked and cloned LambdaStack source code - start contributing! (example clone commands are shown below)
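
As a sketch of that last step, after forking on GitHub you would clone your fork locally (YOUR-GITHUB-USERNAME is a placeholder for illustration only):

git clone https://github.com/YOUR-GITHUB-USERNAME/lambdastack.git
cd lambdastack
git remote add upstream https://github.com/lambdastack/lambdastack.git   # keeps your fork easy to sync with the main repo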


Try it out!

As of LambdaStack v1.3, there are two ways to get started:

  1. Simply issue a docker run ... command
  2. Fork/Clone the LambdaStack GitHub repo at https://github.com/lambdastack/lambdastack

The upcoming pre-release version, LambdaStack v2.0.0beta, will have a full Admin UI that will use the default APIs/CLI to manage the full automation of Kubernetes.

The easiest is option #1:

Most of the values below are defaults, but they are explained here for your reference.

  1. Decide where to store your LambdaStack-generated manifest files and your SSH key pair (private and public). For example, you may decide to launch from your laptop if you're leaving the default use_public_ips: True. So, create a directory if one does not already exist:
  • mkdir -p /projects/lambdastack/build/<whatever name you give it>/keys/ssh - Linux/Mac (note: build would also be created automatically by lambdastack init..., but by creating it and <whatever name you give it>/keys/ssh here you don't have to exit the Docker container). Then run lambdastack init -p <cloud provider> -n <the name you gave it above> (e.g., lambdastack init -p aws -n demo). The init command will automatically append /build/<name you give it>/keys/ssh if it is not already present. So, using the example, projects/lambdastack/ will have build/demo/keys/ssh appended to form projects/lambdastack/build/demo/keys/ssh (the only hardcoded values are build at the beginning and keys/ssh at the end; the rest is up to you)
  2. Create your SSH keys unless you already have a pair you would like to use (note: if you don't have a key pair, the following will create one, and this is only required once):
  • ssh-keygen - It will prompt you for a name and a few other items. At the end it will generate a private and a public key (give it the name and directory - using the example above, give it /projects/lambdastack/build/demo/keys/ssh/ and it will create the key pair there). If you kept the default key pair name of lambdastack-operations, you would see projects/lambdastack/build/demo/keys/ssh/lambdastack-operations and another file called lambdastack-operations.pub in the .../keys/ssh/ directory
  3. If doing AWS - simply copy and paste the key and secret from the AWS settings console into the <name you give it>.yml file that was generated by lambdastack init.... Using the example from above, the file would be demo.yml located in the /projects/lambdastack/build/demo directory, i.e., /projects/lambdastack/build/demo/demo.yml. Simply exchange the XXXXXXX values for the corresponding key: and secret: values
  4. Run cd <wherever the base of projects is located>/projects/lambdastack/
  5. Launch Docker as follows (using the example above):
  • docker run -it -v $PWD:/shared --rm lambdastack/lambdastack:latest. -v $PWD:/shared is very important: it mounts the drive volume you wish to associate with the container (containers can't persist data, so you have to mount a volume [some drive location] to it). $PWD simply means Present Working Directory (Linux/Mac). :/shared is the name of the volume LambdaStack is looking for. -it tells Docker to run an interactive container that lets you work at the command line of the Linux container. --rm tells Docker to stop and remove the container after you type exit at the container command line. lambdastack/lambdastack:latest is the latest version of LambdaStack from the public lambdastack registry at https://hub.docker.com/lambdastack
  6. The docker command above will put you into the /shared directory, which shows build/demo/keys/ssh. The commands below summarize these steps.
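
Putting the steps above together, a typical first session looks roughly like this (demo, the /projects/lambdastack path, and the lambdastack-operations key name are just the example values used above; adjust them to your own setup):

mkdir -p /projects/lambdastack/build/demo/keys/ssh
ssh-keygen      # when prompted, save the key pair as /projects/lambdastack/build/demo/keys/ssh/lambdastack-operations
cd /projects/lambdastack
docker run -it -v $PWD:/shared --rm lambdastack/lambdastack:latest
# inside the container you land in /shared, which maps to /projects/lambdastack
lambdastack init -p aws -n demo            # generates build/demo/demo.yml
# edit build/demo/demo.yml and add your cloud credentials (key/secret for AWS)
lambdastack apply -f build/demo/demo.yml   # builds the cluster and generates the final manifest.yml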

Option #2:

If you wish to pull down the open source repo and execute from there then simply do the following:

  1. Click the Fork button in the top right of the https://github.com/lambdastack/lambdastack repo. This assumes you already have a GitHub account. It will ask where you want the forked version to be copied to
  2. Clone your newly forked LambdaStack repo onto your local hard drive
  3. Go to the root directory of the newly cloned repo on your hard drive
  4. Run the simple bash script lsio; a default clusters subdirectory will automatically be created and all of the build information for the cluster will reside there:

./lsio - this bash script is located in the root directory and uses clusters as the required /shared directory for the docker run -it -v $PWD/clusters:/shared --rm lambdastack/lambdastack:{tag} command that it executes (a rough sketch is shown below). An improvement would be to allow passing in a parameter with a specific location for LambdaStack to store the build information in - a great opportunity for a Pull Request (PR)!
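
For reference, a minimal sketch of what lsio does; the real script lives in the repo root, and pinning the image to latest (instead of the {tag} it actually uses) is an assumption for illustration:

#!/bin/bash
# lsio (sketch): run the LambdaStack container with ./clusters mounted as the required /shared volume
mkdir -p "$PWD/clusters"
docker run -it -v "$PWD/clusters:/shared" --rm lambdastack/lambdastack:latest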

3 - Concepts

Core concepts of LambdaStack

The Concepts section helps you learn about the parts of the LambdaStack system and the abstractions LambdaStack uses to represent your cluster, and helps you obtain a deeper understanding of how LambdaStack works.

3.1 - Configurations

Configurations of LambdaStack

Different configuration options for LambdaStack. There is a minimal and a full option when calling lambdastack init.... ALL yml configurations are managed inside the /schema directory of the LambdaStack GitHub repo. They are broken down by cloud provider and on-premise (any).

The following are the breakdowns:

  • Any - On-premise, or cloud providers not listed below (e.g., Oracle)
  • AWS - Amazon Web Services
  • Azure - Microsoft Azure
  • Common - Anything common among the providers
  • GCP - Google Cloud Platform

Links to each provider's configuration options follow below.
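
For orientation, the /schema directory in the repo is organized per provider; the subdirectory names below are assumed to mirror the providers listed above:

schema/
  any/     # on-premise and generic/unlisted cloud providers
  aws/
  azure/
  common/  # anything common among the providers
  gcp/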

3.1.1 - Any

Minimal and Full configuration options

3.1.1.1 - Minimal

Minimal configuration options

Minimal Configuration

This option is mainly for on-premise solutions. However, it can be used in a generic way for other clouds like Oracle Cloud, etc.

There are a number of changes to be made so that it can fit your on-premise or non-standard cloud provider environment.

  1. prefix: staging - (optional) Change this to something else like production if you like
  2. name: operations - (optional) Change the user name to anything you like
  3. key_path: lambdastack-operations - (optional) Change the SSH key pair name if you like
  4. hostname: ... - (optional/required) If you're good with keeping the default hostname then leave it or change it to support your environment for each host below
  5. ip: ... - (optional/required) If you're good with keeping the default 192.168.0.0 IP range then leave it or change it to support your environment for each host below
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: any
name: "default"
build_path: "build/path" # This gets dynamically built
specification:
  name: lambdastack
  prefix: staging  # Can be anything you want that helps quickly identify the cluster
  admin_user:
    name: operations # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  components:
    repository:
      count: 1
      machines:
        - default-repository
    kubernetes_master:
      count: 1
      machines:
        - default-k8s-master1
    kubernetes_node:
      count: 2
      machines:
        - default-k8s-node1
        - default-k8s-node2
    logging:
      count: 1
      machines:
        - default-logging
    monitoring:
      count: 1
      machines:
        - default-monitoring
    kafka:
      count: 2
      machines:
        - default-kafka1
        - default-kafka2
    postgresql:
      count: 1
      machines:
        - default-postgresql
    load_balancer:
      count: 1
      machines:
        - default-loadbalancer
    rabbitmq:
      count: 1
      machines:
        - default-rabbitmq
---
kind: infrastructure/machine
provider: any
name: default-repository
specification:
  hostname: repository # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.112 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-k8s-master1
specification:
  hostname: master1 # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.101 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-k8s-node1
specification:
  hostname: node1 # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.102 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-k8s-node2
specification:
  hostname: node2 # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.103 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-logging
specification:
  hostname: elk # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.105 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-monitoring
specification:
  hostname: prometheus # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.106 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-kafka1
specification:
  hostname: kafka1 # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.107 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-kafka2
specification:
  hostname: kafka2 # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.108 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-postgresql
specification:
  hostname: postgresql # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.109 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-loadbalancer
specification:
  hostname: loadbalancer # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.110 # YOUR-MACHINE-IP
---
kind: infrastructure/machine
provider: any
name: default-rabbitmq
specification:
  hostname: rabbitmq # YOUR-MACHINE-HOSTNAME
  ip: 192.168.100.111 # YOUR-MACHINE-IP
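
Once the hostnames and IPs above match your environment, the workflow is the same as for the cloud providers; a short sketch, assuming the provider flag accepts any (as used in the YAML above) and a cluster named demo:

lambdastack init -p any -n demo
# edit build/demo/demo.yml: set admin_user, then adjust the hostname and ip of each machine document as described above
lambdastack apply -f build/demo/demo.yml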

3.1.1.2 - Full

Full configuration options

Full Configuration

This option is mainly for on-premise solutions. However, it can be used in a generic way for other clouds like Oracle Cloud, etc.

There are a number of changes to be made so that it can fit your on-premise or non-standard cloud provider environment.

  1. prefix: staging - (optional) Change this to something else like production if you like
  2. name: operations - (optional) Change the user name to anything you like
  3. key_path: lambdastack-operations - (optional) Change the SSH key pair name if you like
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: any
name: "default"
build_path: "build/path" # This gets dynamically built
specification:
  prefix: staging  # Can be anything you want that helps quickly identify the cluster
  name: lambdastack
  admin_user:
    name: operations # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  components:
    kubernetes_master:
      count: 1
      machine: kubernetes-master-machine
      configuration: default
    kubernetes_node:
      count: 2
      machine: kubernetes-node-machine
      configuration: default
    logging:
      count: 1
      machine: logging-machine
      configuration: default
    monitoring:
      count: 1
      machine: monitoring-machine
      configuration: default
    kafka:
      count: 2
      machine: kafka-machine
      configuration: default
    postgresql:
      count: 0
      machine: postgresql-machine
      configuration: default
    load_balancer:
      count: 1
      machine: load-balancer-machine
      configuration: default
    rabbitmq:
      count: 0
      machine: rabbitmq-machine
      configuration: default
    ignite:
      count: 0
      machine: ignite-machine
      configuration: default
    opendistro_for_elasticsearch:
      count: 0
      machine: logging-machine
      configuration: default
    repository:
      count: 1
      machine: repository-machine
      configuration: default
    single_machine:
      count: 0
      machine: single-machine
      configuration: default

3.1.2 - AWS

Minimal and Full configuration options

3.1.2.1 - Minimal

Minimal configuration

As of v1.3.4, LambdaStack requires you to change the following attributes in either the minimal or full configuration YAML. Beginning in v2.0, you will have the option to pass in these parameters to override whatever is present in the yaml file. v2.0 is in active development.

All but the last two options are defaults. The last two are AWS Key and AWS Secret - these two are required

Attributes to change for the minimal configuration after you run lambdastack init -p aws -n <whatever name you give the cluster>:

  • prefix: staging - Staging is a default prefix. You can use whatever you like (e.g., production). This value can help group your AWS clusters in the same region for easier maintenance
  • name: ubuntu - This attribute is under specification.admin_user.name. For ubuntu on AWS the default user name is ubuntu. For Redhat we default to operations
  • key_path: lambdastack-operations - This is the default SSH key file(s) name. This is the name of your SSH public and private key pairs. For example, in this example, one file (private one) would be named lambdastack-operations. The second file (public key) typically has a .pub suffix such as lambdastack-operations.pub
  • use_public_ips: True - This is the default public IP value. Important: this attribute by default allows AWS to build your clusters with a public IP interface. We also build a private (non-public) interface using private IPs for internal communication between the nodes. With this attribute set to public it simply allows you easy access to the cluster so you can SSH into it using the name attribute value from above. This is NOT RECOMMENDED for production, nor as a general rule. You should have a VPN or direct connect route for the cluster
  • region: us-east-1 - This is the default region setting. This means that your cluster and storage will be created in AWS' us-east-1 region. Important - If you want to change this value in any way, you should use the full configuration and then change ALL references of region in the yaml file. If you do not then you may have services in regions you don't want and that may create problems for you
  • key: XXXXXXXXXX - This is very important. This, along with secret are used to access your AWS cluster programmatically which LambdaStack needs. This can be found at specification.cloud.credentials.key. This can be found under your AWS Account menu option in Security Credentials
  • secret: XXXXXXXXXXXXX - This is very important. This, along with key are used to access your AWS cluster programmatically which LambdaStack needs. This can be found at specification.cloud.credentials.secret. This can be found under your AWS Account menu option in Security Credentials. This can only be seen at the time you create it so use the download option and save the file somewhere safe. DO NOT save the file in your source code repo!

Now that you have made your changes to the .yml, run lambdastack apply -f build/<whatever you name your cluster>/<whatever you name your cluster>.yml. The building of the LambdaStack cluster will begin. The apply option generates a final manifest.yml file that is used by Terraform, Ansible, and the LambdaStack Python code. The manifest.yml combines the values from below plus ALL yaml configuration files for each service.
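
For example, using demo as the cluster name (the key pair location and admin user name follow the defaults described above; the public IP value is a placeholder):

lambdastack init -p aws -n demo
# edit build/demo/demo.yml: replace the XXXXXXX values for key: and secret: with your AWS credentials
lambdastack apply -f build/demo/demo.yml
# with use_public_ips: True you can then reach a node directly, e.g.:
ssh -i build/demo/keys/ssh/lambdastack-operations ubuntu@<public-ip-of-node>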

---
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: aws
name: "default"
build_path: "build/path"  # This gets dynamically built
specification:
  name: lambdastack
  prefix: staging  # Can be anything you want that helps quickly identify the cluster
  admin_user:
    name: ubuntu # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  cloud:
    k8s_as_cloud_service: False
    use_public_ips: True # When not using public IPs you have to provide connectivity via private IPs (VPN)
    region: us-east-1
    credentials:
      key: XXXXXXXXXX # AWS Subscription Key
      secret: XXXXXXXXX # AWS Subscription Secret
    default_os_image: default
  components:
    repository:
      count: 1
    kubernetes_master:
      count: 1
    kubernetes_node:
      count: 2
    logging:
      count: 1
    monitoring:
      count: 1
    kafka:
      count: 2
    postgresql:
      count: 1
    load_balancer:
      count: 1
    rabbitmq:
      count: 1

3.1.2.2 - Full

Full configuration

As of v1.3.4, LambdaStack requires you to change the following attributes in either the minimal or full configuration YAML. Beginning in v2.0, you will have the option to pass in these parameters to override whatever is present in the yaml file. v2.0 is in active development.

All but the last two options are defaults. The last two are AWS Key and AWS Secret - these two are required

Attributes to change for the full configuration after you run lambdastack init -p aws -n <whatever name you give the cluster>:

  • prefix: staging - Staging is a default prefix. You can use whatever you like (e.g., production). This value can help group your AWS clusters in the same region for easier maintenance
  • name: ubuntu - This attribute is under specification.admin_user.name. For ubuntu on AWS the default user name is ubuntu. For Redhat we default to operations
  • key_path: lambdastack-operations - This is the default SSH key file(s) name. This is the name of your SSH public and private key pairs. For example, in this example, one file (private one) would be named lambdastack-operations. The second file (public key) typically has a .pub suffix such as lambdastack-operations.pub
  • use_public_ips: True - This is the default public IP value. Important: this attribute by default allows AWS to build your clusters with a public IP interface. We also build a private (non-public) interface using private IPs for internal communication between the nodes. With this attribute set to public it simply allows you easy access to the cluster so you can SSH into it using the name attribute value from above. This is NOT RECOMMENDED for production, nor as a general rule. You should have a VPN or direct connect route for the cluster
  • region: us-east-1 - This is the default region setting. This means that your cluster and storage will be created in AWS' us-east-1 region. Important - If you want to change this value in any way, you should use the full configuration and then change ALL references of region in the yaml file. If you do not then you may have services in regions you don't want and that may create problems for you
  • key: XXXXXXXXXX - This is very important. This, along with secret are used to access your AWS cluster programmatically which LambdaStack needs. This can be found at specification.cloud.credentials.key. This can be found under your AWS Account menu option in Security Credentials
  • secret: XXXXXXXXXXXXX - This is very important. This, along with key are used to access your AWS cluster programmatically which LambdaStack needs. This can be found at specification.cloud.credentials.secret. This can be found under your AWS Account menu option in Security Credentials. This can only be seen at the time you create it so use the download option and save the file somewhere safe. DO NOT save the file in your source code repo!

Now that you have made your changes to the .yml, run lambdastack apply -f build/<whatever you name your cluster>/<whatever you name your cluster>.yml. The building of the LambdaStack cluster will begin. The apply option generates a final manifest.yml file that is used by Terraform, Ansible, and the LambdaStack Python code. The manifest.yml combines the values from below plus ALL yaml configuration files for each service.

---
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: aws
name: "default"
build_path: "build/path"  # This gets dynamically built
specification:
  prefix: staging  # Can be anything you want that helps quickly identify the cluster
  name: lambdastack
  admin_user:
    name: ubuntu # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  cloud:
    k8s_as_cloud_service: False
    vnet_address_pool: 10.1.0.0/20
    region: us-east-1
    use_public_ips: True # When not using public IPs you have to provide connectivity via private IPs (VPN)
    credentials:
      key: XXXXXXXXXXX # AWS Subscription Key
      secret: XXXXXXXXXXXX # AWS Subscription Secret
    network:
      use_network_security_groups: True
    default_os_image: default
  components:
    kubernetes_master:
      count: 1
      machine: kubernetes-master-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.1.0/24
        - availability_zone: us-east-1b
          address_pool: 10.1.2.0/24
    kubernetes_node:
      count: 2
      machine: kubernetes-node-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.1.0/24
        - availability_zone: us-east-1b
          address_pool: 10.1.2.0/24
    logging:
      count: 1
      machine: logging-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.3.0/24
    monitoring:
      count: 1
      machine: monitoring-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.4.0/24
    kafka:
      count: 2
      machine: kafka-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.5.0/24
    postgresql:
      count: 0
      machine: postgresql-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.6.0/24
    load_balancer:
      count: 1
      machine: load-balancer-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.7.0/24
    rabbitmq:
      count: 0
      machine: rabbitmq-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.8.0/24
    ignite:
      count: 0
      machine: ignite-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.9.0/24
    opendistro_for_elasticsearch:
      count: 0
      machine: logging-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.10.0/24
    repository:
      count: 1
      machine: repository-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.11.0/24
    single_machine:
      count: 0
      machine: single-machine
      configuration: default
      subnets:
        - availability_zone: us-east-1a
          address_pool: 10.1.1.0/24
        - availability_zone: us-east-1b
          address_pool: 10.1.2.0/24

3.1.3 - Azure

Minimal and Full configuration options

3.1.3.1 - Minimal

Minimal configuration

As of v1.3.4, LambdaStack requires you to change the following attributes in either the minimal or full configuration YAML. Beginning in v2.0, you will have the option to pass in these parameters to override whatever is present in the yaml file. v2.0 is in active development.

All options are defaults. Azure will require you to log in to the Azure portal before you can run LambdaStack for Azure unless you use the service principal option in the Full configuration. With the Full configuration you can specify your subscription name and service principal so that the process is machine-to-machine oriented and requires no interaction.

Attributes to change for the minimal configuration after you run lambdastack init -p azure -n <whatever name you give the cluster>:

  • prefix: staging - Staging is a default prefix. You can use whatever you like (e.g., production). This value can help group your Azure clusters in the same region for easier maintenance
  • name: operations - This attribute is under specification.admin_user.name, and operations is the default
  • key_path: lambdastack-operations - This is the default SSH key file(s) name. This is the name of your SSH public and private key pairs. For example, in this example, one file (private one) would be named lambdastack-operations. The second file (public key) typically has a .pub suffix such as lambdastack-operations.pub
  • use_public_ips: True - This is the default public IP value. Important: this attribute by default allows Azure to build your clusters with a public IP interface. We also build a private (non-public) interface using private IPs for internal communication between the nodes. With this attribute set to public it simply allows you easy access to the cluster so you can SSH into it using the name attribute value from above. This is NOT RECOMMENDED for production, nor as a general rule. You should have a VPN or direct connect route for the cluster
  • region: East US - This is the default region setting. This means that your cluster and storage will be created in Azure's East US region. Important - If you want to change this value in any way, you should use the full configuration and then change ALL references of region in the yaml file. If you do not then you may have services in regions you don't want and that may create problems for you

Now that you have made your changes to the .yml, run lambdastack apply -f build/<whatever you name your cluster>/<whatever you name your cluster>.yml. The building of the LambdaStack cluster will begin. The apply option generates a final manifest.yml file that is used by Terraform, Ansible, and the LambdaStack Python code. The manifest.yml combines the values from below plus ALL yaml configuration files for each service.

---
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: azure
name: "default"
build_path: "build/path"  # This gets dynamically built
specification:
  name: lambdastack
  prefix: staging  # Can be anything you want that helps quickly identify the clusterprefix
  admin_user:
    name: operations # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  cloud:
    k8s_as_cloud_service: False
    use_public_ips: True # When not using public IPs you have to provide connectivity via private IPs (VPN)
    region: East US
    default_os_image: default
  components:
    repository:
      count: 1
    kubernetes_master:
      count: 1
    kubernetes_node:
      count: 2
    logging:
      count: 1
    monitoring:
      count: 1
    kafka:
      count: 2
    postgresql:
      count: 1
    load_balancer:
      count: 1
    rabbitmq:
      count: 1

3.1.3.2 - Full

Full configuration

As of v1.3.4, LambdaStack requires you to change the following attributes in either the minimal or full configuration YAML. Beginning in v2.0, you will have the option to pass in these parameters to override whatever is present in the yaml file. v2.0 is in active development.

All but the last two options are defaults. The last two are subscription_name and use_service_principal - these two are required

Attributes to change for the full configuration after you run lambdastack init -p azure -n <whatever name you give the cluster>:

  • prefix: staging - Staging is a default prefix. You can use whatever you like (e.g., production). This value can help group your Azure clusters in the same region for easier maintenance
  • name: operations - This attribute is under specification.admin_user.name
  • key_path: lambdastack-operations - This is the default SSH key file(s) name. This is the name of your SSH public and private key pairs. For example, in this example, one file (private one) would be named lambdastack-operations. The second file (public key) typically has a .pub suffix such as lambdastack-operations.pub
  • use_public_ips: True - This is the default public IP value. Important: this attribute by default allows Azure to build your clusters with a public IP interface. We also build a private (non-public) interface using private IPs for internal communication between the nodes. With this attribute set to public it simply allows you easy access to the cluster so you can SSH into it using the name attribute value from above. This is NOT RECOMMENDED for production, nor as a general rule. You should have a VPN or direct connect route for the cluster
  • region: East US - This is the default region setting. This means that your cluster and storage will be created in Azure's East US region. Important - If you want to change this value in any way, you should use the full configuration and then change ALL references of region in the yaml file. If you do not then you may have services in regions you don't want and that may create problems for you
  • subscription_name: <whatever the sub name is> - This is very important. This, along with use_service_principal, is used to access your Azure cluster programmatically, which LambdaStack needs. This can be found at specification.cloud.subscription_name. This can be found under your Azure Account menu option in settings
  • use_service_principal: True - This is very important. This, along with subscription_name, is used to access your Azure cluster programmatically, which LambdaStack needs. This can be found at specification.cloud.use_service_principal. This can be found under your Azure Account menu option in Security Credentials.

Now that you have made your changes to the .yml, run lambdastack apply -f build/<whatever you name your cluster>/<whatever you name your cluster>.yml. The building of the LambdaStack cluster will begin. The apply option generates a final manifest.yml file that is used by Terraform, Ansible, and the LambdaStack Python code. The manifest.yml combines the values from below plus ALL yaml configuration files for each service.

---
kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: azure
name: "default"
build_path: "build/path"  # This gets dynamically built
specification:
  prefix: staging  # Can be anything you want that helps quickly identify the cluster
  name: lambdastack
  admin_user:
    name: operations # YOUR-ADMIN-USERNAME
    key_path: lambdastack-operations # YOUR-SSH-KEY-FILE-NAME
    path: "/shared/build/<name of cluster>/keys/ssh/lambdastack-operations" # Will get dynamically created
  cloud:
    k8s_as_cloud_service: False
    subscription_name: <YOUR-SUB-NAME>
    vnet_address_pool: 10.1.0.0/20
    use_public_ips: True # When not using public IPs you have to provide connectivity via private IPs (VPN)
    use_service_principal: False
    region: East US
    network:
      use_network_security_groups: True
    default_os_image: default
  components:
    kubernetes_master:
      count: 1
      machine: kubernetes-master-machine
      configuration: default
      subnets:
        - address_pool: 10.1.1.0/24
    kubernetes_node:
      count: 2
      machine: kubernetes-node-machine
      configuration: default
      subnets:
        - address_pool: 10.1.1.0/24
    logging:
      count: 1
      machine: logging-machine
      configuration: default
      subnets:
        - address_pool: 10.1.3.0/24
    monitoring:
      count: 1
      machine: monitoring-machine
      configuration: default
      subnets:
        - address_pool: 10.1.4.0/24
    kafka:
      count: 2
      machine: kafka-machine
      configuration: default
      subnets:
        - address_pool: 10.1.5.0/24
    postgresql:
      count: 0
      machine: postgresql-machine
      configuration: default
      subnets:
        - address_pool: 10.1.6.0/24
    load_balancer:
      count: 1
      machine: load-balancer-machine
      configuration: default
      subnets:
        - address_pool: 10.1.7.0/24
    rabbitmq:
      count: 0
      machine: rabbitmq-machine
      configuration: default
      subnets:
        - address_pool: 10.1.8.0/24
    ignite:
      count: 0
      machine: ignite-machine
      configuration: default
      subnets:
        - address_pool: 10.1.9.0/24
    opendistro_for_elasticsearch:
      count: 0
      machine: logging-machine
      configuration: default
      subnets:
        - address_pool: 10.1.10.0/24
    repository:
      count: 1
      machine: repository-machine
      configuration: default
      subnets:
        - address_pool: 10.1.11.0/24
    single_machine:
      count: 0
      machine: single-machine
      configuration: default
      subnets:
        - address_pool: 10.1.1.0/24

3.1.4 - Common

Minimal and Full configuration options

ALL yaml configuration options listed in this section are for the internal use of LambdaStack only

3.1.4.1 - Cluster Applications

Applications that run in the LambdaStack cluster. These are not applications that application developers create

The content of the applications.yml file is listed for reference only

---
kind: configuration/applications
title: "Kubernetes Applications Config"
name: default
specification:
  applications:

## --- ignite ---

  - name: ignite-stateless
    enabled: false
    image_path: "lambdastack/ignite:2.9.1" # it will be part of the image path: {{local_repository}}/{{image_path}}
    use_local_image_registry: true
    namespace: ignite
    service:
      rest_nodeport: 32300
      sql_nodeport: 32301
      thinclients_nodeport: 32302
    replicas: 1
    enabled_plugins:
    - ignite-kubernetes # required to work on K8s
    - ignite-rest-http

# Abstract these configs to separate default files and add
# the ability to add custom application roles.

## --- rabbitmq ---

  - name: rabbitmq
    enabled: false
    image_path: rabbitmq:3.8.9
    use_local_image_registry: true
    #image_pull_secret_name: regcred # optional
    service:
      name: rabbitmq-cluster
      port: 30672
      management_port: 31672
      replicas: 2
      namespace: queue
    rabbitmq:
      #amqp_port: 5672 #optional - default 5672
      plugins: # optional list of RabbitMQ plugins
        - rabbitmq_management
        - rabbitmq_management_agent
      policies: # optional list of RabbitMQ policies
        - name: ha-policy2
          pattern: ".*"
          definitions:
            ha-mode: all
      custom_configurations: #optional list of RabbitMQ configurations (new format -> https://www.rabbitmq.com/configure.html)
        - name: vm_memory_high_watermark.relative
          value: 0.5
      cluster:
        #is_clustered: true #redundant in in-Kubernetes installation, it will always be clustered
        #cookie: "cookieSetFromDataYaml" #optional - default value will be random generated string

## --- auth-service ---

  - name: auth-service # requires PostgreSQL to be installed in cluster
    enabled: false
    image_path: lambdastack/keycloak:14.0.0
    use_local_image_registry: true
    #image_pull_secret_name: regcred
    service:
      name: as-testauthdb
      port: 30104
      replicas: 2
      namespace: namespace-for-auth
      admin_user: auth-service-username
      admin_password: PASSWORD_TO_CHANGE
    database:
      name: auth-database-name
      #port: "5432" # leave it when default
      user: auth-db-user
      password: PASSWORD_TO_CHANGE

## --- pgpool ---

  - name: pgpool # this service requires PostgreSQL to be installed in cluster
    enabled: false
    image:
      path: bitnami/pgpool:4.2.4
      debug: false # ref: https://github.com/bitnami/minideb-extras/#turn-on-bash-debugging
    use_local_image_registry: true
    namespace: postgres-pool
    service:
      name: pgpool
      port: 5432
    replicas: 3
    pod_spec:
      affinity:
        podAntiAffinity: # prefer to schedule replicas on different nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - pgpool
                topologyKey: kubernetes.io/hostname
      nodeSelector: {}
      tolerations: {}
    resources: # Adjust to your configuration, see https://www.pgpool.net/docs/41/en/html/resource-requiremente.html
      limits:
        # cpu: 900m # Set according to your env
        memory: 310Mi
      requests:
        cpu: 250m # Adjust to your env, increase if possible
        memory: 310Mi
    pgpool:
      # https://github.com/bitnami/bitnami-docker-pgpool#configuration + https://github.com/bitnami/bitnami-docker-pgpool#environment-variables
      env:
        PGPOOL_BACKEND_NODES: autoconfigured # you can use custom value like '0:pg-node-1:5432,1:pg-node-2:5432'
        # Postgres users
        PGPOOL_POSTGRES_USERNAME: ls_pgpool_postgres_admin # with SUPERUSER role to use connection slots reserved for superusers for K8s liveness probes, also for user synchronization
        PGPOOL_SR_CHECK_USER: ls_pgpool_sr_check # with pg_monitor role, for streaming replication checks and health checks
        # ---
        PGPOOL_ADMIN_USERNAME: ls_pgpool_admin # Pgpool administrator (local pcp user)
        PGPOOL_ENABLE_LOAD_BALANCING: true # set to 'false' if there is no replication
        PGPOOL_MAX_POOL: 4
        PGPOOL_CHILD_LIFE_TIME: 300 # Default value, read before you change: https://www.pgpool.net/docs/42/en/html/runtime-config-connection-pooling.html
        PGPOOL_POSTGRES_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_postgres_password
        PGPOOL_SR_CHECK_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_sr_check_password
        PGPOOL_ADMIN_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_admin_password
      secrets:
        pgpool_postgres_password: PASSWORD_TO_CHANGE
        pgpool_sr_check_password: PASSWORD_TO_CHANGE
        pgpool_admin_password: PASSWORD_TO_CHANGE
      # https://www.pgpool.net/docs/41/en/html/runtime-config.html
      pgpool_conf_content_to_append: |
        #------------------------------------------------------------------------------
        # CUSTOM SETTINGS (appended by LambdaStack to override defaults)
        #------------------------------------------------------------------------------
        # num_init_children = 32
        connection_life_time = 900
        reserved_connections = 1        
      # https://www.pgpool.net/docs/42/en/html/runtime-config-connection.html
      pool_hba_conf: autoconfigured

## --- pgbouncer ---

  - name: pgbouncer
    enabled: false
    image_path: bitnami/pgbouncer:1.16.0
    init_image_path: bitnami/pgpool:4.2.4
    use_local_image_registry: true
    namespace: postgres-pool
    service:
      name: pgbouncer
      port: 5432
    replicas: 2
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 128Mi
    pgbouncer:
      env:
        DB_HOST: pgpool.postgres-pool.svc.cluster.local
        DB_LISTEN_PORT: 5432
        MAX_CLIENT_CONN: 150
        DEFAULT_POOL_SIZE: 25
        RESERVE_POOL_SIZE: 25
        POOL_MODE: session
        CLIENT_IDLE_TIMEOUT: 0

## --- istio ---

  - name: istio
    enabled: false
    use_local_image_registry: true
    namespaces:
      operator: istio-operator # namespace where operator will be deployed
      watched: # list of namespaces which operator will watch
        - istio-system
      istio: istio-system # namespace where istio control plane will be deployed
    istio_spec:
      profile: default # Check all possibilities https://istio.io/latest/docs/setup/additional-setup/config-profiles/
      name: istiocontrolplane

3.1.4.2 - Backups

Cluster backup options

The content of the backup.yml file is listed for reference only

---
kind: configuration/backup
title: Backup Config
name: default
specification:
  components:
    load_balancer:
      enabled: false
    logging:
      enabled: false
    monitoring:
      enabled: false
    postgresql:
      enabled: false
    rabbitmq:
      enabled: false
# Kubernetes recovery is not supported by LambdaStack at this point.
# You may create backup by enabling this below, but recovery should be done manually according to Kubernetes documentation.
    kubernetes:
      enabled: false

3.1.4.3 - ElasticSearch-Curator

Elasticsearch Curator options

The content of the elasticsearch-curator.yml file is listed for reference only

---
kind: configuration/elasticsearch-curator
title: Elasticsearch Curator
name: default
specification:
  delete_indices_cron_jobs:
    - description: Delete indices older than N days
      cron:
        hour: 1
        minute: 0
        enabled: true
      filter_list:
        - filtertype: age
          unit_count: 30
          unit: days
          source: creation_date
          direction: older
    - description: Delete the oldest indices to not consume more than N gigabytes of disk space
      cron:
        minute: 30
        enabled: true
      filter_list:
        - filtertype: space
          disk_space: 20
          use_age: True
          source: creation_date

3.1.4.4 - Feature-Mapping

Feature mapping options

The content of the feature-mapping.yml file is listed for reference only

---
kind: configuration/feature-mapping
title: "Feature mapping to roles"
name: default
specification:
  available_roles:
    - name: repository
      enabled: true
    - name: firewall
      enabled: true
    - name: image-registry
      enabled: true
    - name: kubernetes-master
      enabled: true
    - name: kubernetes-node
      enabled: true
    - name: helm
      enabled: true
    - name: logging
      enabled: true
    - name: opendistro-for-elasticsearch
      enabled: true
    - name: elasticsearch-curator
      enabled: true
    - name: kibana
      enabled: true
    - name: filebeat
      enabled: true
    - name: logstash
      enabled: true
    - name: prometheus
      enabled: true
    - name: grafana
      enabled: true
    - name: node-exporter
      enabled: true
    - name: jmx-exporter
      enabled: true
    - name: zookeeper
      enabled: true
    - name: kafka
      enabled: true
    - name: rabbitmq
      enabled: true
    - name: kafka-exporter
      enabled: true
    - name: postgresql
      enabled: true
    - name: postgres-exporter
      enabled: true
    - name: haproxy
      enabled: true
    - name: haproxy-exporter
      enabled: true
    - name: vault
      enabled: true
    - name: applications
      enabled: true
    - name: ignite
      enabled: true

  roles_mapping:
    kafka:
      - zookeeper
      - jmx-exporter
      - kafka
      - kafka-exporter
      - node-exporter
      - filebeat
      - firewall
    rabbitmq:
      - rabbitmq
      - node-exporter
      - filebeat
      - firewall
    logging:
      - logging
      - kibana
      - node-exporter
      - filebeat
      - firewall
    load_balancer:
      - haproxy
      - haproxy-exporter
      - node-exporter
      - filebeat
      - firewall
    monitoring:
      - prometheus
      - grafana
      - node-exporter
      - filebeat
      - firewall
    postgresql:
      - postgresql
      - postgres-exporter
      - node-exporter
      - filebeat
      - firewall
    custom:
      - repository
      - image-registry
      - kubernetes-master
      - node-exporter
      - filebeat
      - rabbitmq
      - postgresql
      - prometheus
      - grafana
      - node-exporter
      - logging
      - firewall
    single_machine:
      - repository
      - image-registry
      - kubernetes-master
      - helm
      - applications
      - rabbitmq
      - postgresql
      - firewall
      - vault
    kubernetes_master:
      - kubernetes-master
      - helm
      - applications
      - node-exporter
      - filebeat
      - firewall
      - vault
    kubernetes_node:
      - kubernetes-node
      - node-exporter
      - filebeat
      - firewall
    ignite:
      - ignite
      - node-exporter
      - filebeat
      - firewall
    opendistro_for_elasticsearch:
      - opendistro-for-elasticsearch
      - node-exporter
      - filebeat
      - firewall
    repository:
      - repository
      - image-registry
      - firewall
      - filebeat
      - node-exporter

3.1.4.5 - Filebeat

Filebeat options

The content of the filebeat.yml file is listed for reference only

---
kind: configuration/filebeat
title: Filebeat
name: default
specification:
  kibana:
    dashboards:
      index: filebeat-*
      enabled: auto
  disable_helm_chart: false
  postgresql_input:
    multiline:
      pattern: >-
                '^\d{4}-\d{2}-\d{2} '
      negate: true
      match: after

3.1.4.6 - Firewall

Firewall options

The content of the firewall.yml file is listed for reference only

---
kind: configuration/firewall
title: OS level firewall
name: default
specification:
  Debian:                         # On RHEL on Azure firewalld is already in VM image (pre-installed)
    install_firewalld: false      # false to avoid random issue "No route to host" even when firewalld service is disabled
  firewall_service_enabled: false # for all inventory hosts
  apply_configuration: false      # if false only service state is managed
  managed_zone_name: LambdaStack
  rules:
    applications:
      enabled: false
      ports:
        - 30104/tcp       # auth-service
        - 30672/tcp       # rabbitmq-amqp
        - 31672/tcp       # rabbitmq-http (management)
        - 32300-32302/tcp # ignite
    common: # for all inventory hosts
      enabled: true
      ports:
        - 22/tcp
    grafana:
      enabled: true
      ports:
        - 3000/tcp
    haproxy:
      enabled: true
      ports:
        - 443/tcp
        - 9000/tcp # stats
    haproxy_exporter:
      enabled: true
      ports:
        - 9101/tcp
    ignite:
      enabled: true
      ports:
        - 8080/tcp  # REST API
        - 10800/tcp # thin client connection
        - 11211/tcp # JDBC
        - 47100/tcp # local communication
        - 47500/tcp # local discovery
    image_registry:
      enabled: true
      ports:
        - 5000/tcp
    jmx_exporter:
      enabled: true
      ports:
        - 7071/tcp # Kafka
        - 7072/tcp # ZooKeeper
    kafka:
      enabled: true
      ports:
        - 9092/tcp
      # - 9093/tcp # encrypted communication (if TLS/SSL is enabled)
    kafka_exporter:
      enabled: true
      ports:
        - 9308/tcp
    kibana:
      enabled: true
      ports:
        - 5601/tcp
    kubernetes_master:
      enabled: true
      ports:
        - 6443/tcp      # API server
        - 2379-2380/tcp # etcd server client API
        - 8472/udp      # flannel (vxlan backend)
        - 10250/tcp     # Kubelet API
        - 10251/tcp     # kube-scheduler
        - 10252/tcp     # kube-controller-manager
    kubernetes_node:
      enabled: true
      ports:
        - 8472/udp  # flannel (vxlan backend)
        - 10250/tcp # Kubelet API
    logging:
      enabled: true
      ports:
        - 9200/tcp
    node_exporter:
      enabled: true
      ports:
        - 9100/tcp
    opendistro_for_elasticsearch:
      enabled: true
      ports:
        - 9200/tcp
    postgresql:
      enabled: true
      ports:
        - 5432/tcp
        - 6432/tcp #PGBouncer
    prometheus:
      enabled: true
      ports:
        - 9090/tcp
        - 9093/tcp # Alertmanager
    rabbitmq:
      enabled: true
      ports:
        - 4369/tcp    # peer discovery service used by RabbitMQ nodes and CLI tools
        # - 5671/tcp  # encrypted communication (if TLS/SSL is enabled)
        - 5672/tcp    # AMQP
        # - 15672/tcp # HTTP API clients, management UI and rabbitmqadmin (only if the management plugin is enabled)
        - 25672/tcp   # distribution server
    zookeeper:
      enabled: true
      ports:
        - 2181/tcp # client connections
        - 2888/tcp # peers communication
        - 3888/tcp # leader election

3.1.4.7 - Grafana

Grafana options

The content of the grafana.yml file is listed for reference only

---
kind: configuration/grafana
title: "Grafana"
name: default
specification:
  grafana_logs_dir: "/var/log/grafana"
  grafana_data_dir: "/var/lib/grafana"
  grafana_address: "0.0.0.0"
  grafana_port: 3000

  # Should the provisioning be kept synced. If true, previous provisioned objects will be removed if not referenced anymore.
  grafana_provisioning_synced: false
  # External Grafana address. Variable maps to "root_url" in grafana server section
  grafana_url: "https://0.0.0.0:3000"

  # Additional options for grafana "server" section
  # This section WILL omit options for: http_addr, http_port, domain, and root_url, as those settings are set by variables listed before
  grafana_server:
    protocol: https
    enforce_domain: false
    socket: ""
    cert_key: "/etc/grafana/ssl/grafana_key.key"
    cert_file: "/etc/grafana/ssl/grafana_cert.pem"
    enable_gzip: false
    static_root_path: public
    router_logging: false

  # Variables correspond to ones in grafana.ini configuration file
  # Security
  grafana_security:
    admin_user: admin
    admin_password: PASSWORD_TO_CHANGE
  #  secret_key: ""
  #  login_remember_days: 7
  #  cookie_username: grafana_user
  #  cookie_remember_name: grafana_remember
  #  disable_gravatar: true
  #  data_source_proxy_whitelist:

  # Database setup
  grafana_database:
    type: sqlite3
  #  host: 127.0.0.1:3306
  #  name: grafana
  #  user: root
  #  password: ""
  #  url: ""
  #  ssl_mode: disable
  #  path: grafana.db
  #  max_idle_conn: 2
  #  max_open_conn: ""
  #  log_queries: ""

  # Default dashboards predefined and available in online & offline mode
  grafana_external_dashboards: []
    #   # Kubernetes cluster monitoring (via Prometheus)
    # - dashboard_id: '315'
    #   datasource: 'Prometheus'
    #   # Node Exporter Server Metrics
    # - dashboard_id: '405'
    #   datasource: 'Prometheus'
    #   # Postgres Overview
    # - dashboard_id: '455'
    #   datasource: 'Prometheus'
    #   # Node Exporter Full
    # - dashboard_id: '1860'
    #   datasource: 'Prometheus'
    #   # RabbitMQ Monitoring
    # - dashboard_id: '4279'
    #   datasource: 'Prometheus'
    #   # Kubernetes Cluster
    # - dashboard_id: '7249'
    #   datasource: 'Prometheus'
    #   # Kafka Exporter Overview
    # - dashboard_id: '7589'
    #   datasource: 'Prometheus'
    #   # PostgreSQL Database
    # - dashboard_id: '9628'
    #   datasource: 'Prometheus'
    #   # RabbitMQ cluster monitoring (via Prometheus)
    # - dashboard_id: '10991'
    #   datasource: 'Prometheus'
    #   # 1 Node Exporter for Prometheus Dashboard EN v20201010
    # - dashboard_id: '11074'
    #   datasource: 'Prometheus'

  # Get dashboards from https://grafana.com/dashboards. Only for online mode
  grafana_online_dashboards: []
    # - dashboard_id: '4271'
    #   revision_id: '3'
    #   datasource: 'Prometheus'
    # - dashboard_id: '1860'
    #   revision_id: '4'
    #   datasource: 'Prometheus'
    # - dashboard_id: '358'
    #   revision_id: '1'
    #   datasource: 'Prometheus'

  # Deployer local folder with dashboard definitions in .json format
  grafana_dashboards_dir: "dashboards" # Replace with your dashboard directory if you have dashboards to include

  # User management and registration
  grafana_welcome_email_on_sign_up: false
  grafana_users:
    allow_sign_up: false
    # allow_org_create: true
    # auto_assign_org: true
    auto_assign_org_role: Viewer
    # login_hint: "email or username"
    default_theme: dark
    # external_manage_link_url: ""
    # external_manage_link_name: ""
    # external_manage_info: ""

  # grafana authentication mechanisms
  grafana_auth: {}
  #  disable_login_form: false
  #  disable_signout_menu: false
  #  anonymous:
  #    org_name: "Main Organization"
  #    org_role: Viewer
  #  ldap:
  #    config_file: "/etc/grafana/ldap.toml"
  #    allow_sign_up: false
  #  basic:
  #    enabled: true

  grafana_ldap: {}
  #  verbose_logging: false
  #  servers:
  #    host: 127.0.0.1
  #    port: 389 # 636 for SSL
  #    use_ssl: false
  #    start_tls: false
  #    ssl_skip_verify: false
  #    root_ca_cert: /path/to/certificate.crt
  #    bind_dn: "cn=admin,dc=grafana,dc=org"
  #    bind_password: grafana
  #    search_filter: "(cn=%s)" # "(sAMAccountName=%s)" on AD
  #    search_base_dns:
  #      - "dc=grafana,dc=org"
  #    group_search_filter: "(&(objectClass=posixGroup)(memberUid=%s))"
  #    group_search_base_dns:
  #      - "ou=groups,dc=grafana,dc=org"
  #    attributes:
  #      name: givenName
  #      surname: sn
  #      username: sAMAccountName
  #      member_of: memberOf
  #      email: mail
  #  group_mappings:
  #    - name: Main Org.
  #      id: 1
  #      groups:
  #        - group_dn: "cn=admins,ou=groups,dc=grafana,dc=org"
  #          org_role: Admin
  #        - group_dn: "cn=editors,ou=groups,dc=grafana,dc=org"
  #          org_role: Editor
  #        - group_dn: "*"
  #          org_role: Viewer
  #    - name: Alternative Org
  #      id: 2
  #      groups:
  #        - group_dn: "cn=alternative_admins,ou=groups,dc=grafana,dc=org"
  #          org_role: Admin

  grafana_session: {}
  #  provider: file
  #  provider_config: "sessions"

  grafana_analytics: {}
  #  reporting_enabled: true
  #  google_analytics_ua_id: ""

  # Set this for mail notifications
  grafana_smtp: {}
  #  host:
  #  user:
  #  password:
  #  from_address:

  # Enable grafana alerting mechanism
  grafana_alerting:
    execute_alerts: true
  #  error_or_timeout: 'alerting'
  #  nodata_or_nullvalues: 'no_data'
  #  concurrent_render_limit: 5

  # Grafana logging configuration
  grafana_log: {}
  # mode: 'console file'
  # level: info

  # Internal grafana metrics system
  grafana_metrics: {}
  #  interval_seconds: 10
  #  graphite:
  #    address: "localhost:2003"
  #    prefix: "prod.grafana.%(instance_name)s"

  # Distributed tracing options
  grafana_tracing: {}
  #  address: "localhost:6831"
  #  always_included_tag: "tag1:value1,tag2:value2"
  #  sampler_type: const
  #  sampler_param: 1

  grafana_snapshots: {}
  #  external_enabled: true
  #  external_snapshot_url: "https://snapshots-origin.raintank.io"
  #  external_snapshot_name: "Publish to snapshot.raintank.io"
  #  snapshot_remove_expired: true
  #  snapshot_TTL_days: 90

  # External image store
  grafana_image_storage: {}
  #  provider: gcs
  #  key_file:
  #  bucket:
  #  path:


  #######
  # Plugins from https://grafana.com/plugins
  grafana_plugins: []
  #  - raintank-worldping-app
  #


  # Alert notification channels to configure
  grafana_alert_notifications: []
  #   - name: "Email Alert"
  #     type: "email"
  #     isDefault: true
  #     settings:
  #       addresses: "example@example.com"

  # Datasources to configure
  grafana_datasources:
    - name: "Prometheus"
      type: "prometheus"
      access: "proxy"
      url: "http://localhost:9090"
      basicAuth: false
      basicAuthUser: ""
      basicAuthPassword: ""
      isDefault: true
      editable: true
      jsonData:
        tlsAuth: false
        tlsAuthWithCACert: false
        tlsSkipVerify: true

  # API keys to configure
  grafana_api_keys: []
  #  - name: "admin"
  #    role: "Admin"
  #  - name: "viewer"
  #    role: "Viewer"
  #  - name: "editor"
  #    role: "Editor"

  # Logging options to configure
  grafana_logging:
    log_rotate: true
    daily_rotate: true
    max_days: 7

3.1.4.8 - HAProxy-Exporter

HAProxy-Exporter options

The content of the haproxy-exporter.yml file is listed for reference only

---
kind: configuration/haproxy-exporter
title: "HAProxy exporter"
name: default
specification:
  description: "Service that runs HAProxy Exporter"

  web_listen_port: "9101"

  config_for_prometheus: # configuration that will be written to Prometheus to allow scraping metrics from this exporter
    exporter_listen_port: "9101"
    prometheus_config_dir: /etc/prometheus
    file_sd_labels:
      - label: "job"
        value: "haproxy-exporter"

3.1.4.9 - HAProxy

HAProxy options

The content of the haproxy.yml file is listed for reference only

---
kind: configuration/haproxy
title: "HAProxy"
name: default
specification:
  logs_max_days: 60
  self_signed_certificate_name: self-signed-fullchain.pem
  self_signed_private_key_name: self-signed-privkey.pem
  self_signed_concatenated_cert_name: self-signed-test.tld.pem
  haproxy_log_path: "/var/log/haproxy.log"

  stats:
    enable: true
    bind_address: 127.0.0.1:9000
    uri: "/haproxy?stats"
    user: operations
    password: your-haproxy-stats-pwd
  frontend:
    - name: https_front
      port: 443
      https: true
      backend:
      - http_back1
  backend: # example backend config below
    - name: http_back1
      server_groups:
      - kubernetes_node
      # servers: # Definition for server to that hosts the application.
      # - name: "node1"
      #   address: "lambdastack-vm1.domain.com"
      port: 30104

3.1.4.10 - Helm-Charts

Helm-Charts options

The content of the helm-charts.yml file is listed for reference only

---
kind: configuration/helm-charts
title: "Helm charts"
name: default
specification:
  apache_lsrepo_path: "/var/www/html/lsrepo"

3.1.4.11 - Helm

Helm options - Internal for LambdaStack

The content of the helm.yml file is listed for reference only

---
kind: configuration/helm
title: "Helm"
name: default
specification:
  apache_lsrepo_path: "/var/www/html/lsrepo"

3.1.4.12 - Apache Ignite

Ignite caching options

The content of the ignite.yml file is listed for reference only

---
kind: configuration/ignite
title: "Apache Ignite stateful installation"
name: default
specification:
  enabled_plugins:
  - ignite-rest-http
  config: |
    <?xml version="1.0" encoding="UTF-8"?>

    <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements.  See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License.  You may obtain a copy of the License at
          http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->

    <beans xmlns="http://www.springframework.org/schema/beans"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="
          http://www.springframework.org/schema/beans
          http://www.springframework.org/schema/beans/spring-beans.xsd">

        <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
          <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
              <!-- Set the page size to 4 KB -->
              <property name="pageSize" value="#{4 * 1024}"/>
              <!--
              Sets a path to the root directory where data and indexes are
              to be persisted. It's assumed the directory is on a separated SSD.
              -->
              <property name="storagePath" value="/var/lib/ignite/persistence"/>

              <!--
                  Sets a path to the directory where WAL is stored.
                  It's assumed the directory is on a separated HDD.
              -->
              <property name="walPath" value="/wal"/>

              <!--
                  Sets a path to the directory where WAL archive is stored.
                  The directory is on the same HDD as the WAL.
              -->
              <property name="walArchivePath" value="/wal/archive"/>
            </bean>
          </property>

          <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
              <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                  <property name="addresses">
    IP_LIST_PLACEHOLDER
                  </property>
                </bean>
              </property>
              <property name="localPort" value="47500"/>
              <!-- Limit number of potentially used ports from 100 to 10 -->
              <property name="localPortRange" value="10"/>
            </bean>
          </property>

          <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
              <property name="localPort" value="47100"/>
              <!-- Limit number of potentially used ports from 100 to 10 -->
              <property name="localPortRange" value="10"/>
            </bean>
          </property>

          <property name="clientConnectorConfiguration">
            <bean class="org.apache.ignite.configuration.ClientConnectorConfiguration">
              <property name="port" value="10800"/>
              <!-- Limit number of potentially used ports from 100 to 10 -->
              <property name="portRange" value="10"/>
            </bean>
          </property>

          <property name="connectorConfiguration">
            <bean class="org.apache.ignite.configuration.ConnectorConfiguration">
              <property name="port" value="11211"/>
              <!-- Limit number of potentially used ports from 100 to 10 -->
              <property name="portRange" value="10"/>
            </bean>
          </property>

        </bean>
    </beans>    

3.1.4.13 - Image-Registry

Image Registry options

The content of the image-registry.yml file is listed for reference only

---
kind: configuration/image-registry
title: "LambdaStack image registry"
name: default
specification:
  description: "Local registry with Docker images"
  registry_image:
    name: "registry:2"
    file_name: registry-2.tar
  images_to_load:
    x86_64:
      generic:
        - name: "lambdastack/keycloak:14.0.0"
          file_name: keycloak-14.0.0.tar
        - name: "rabbitmq:3.8.9"
          file_name: rabbitmq-3.8.9.tar
        - name: "lambdastack/ignite:2.9.1"
          file_name: ignite-2.9.1.tar
        - name: "kubernetesui/dashboard:v2.3.1"
          file_name: dashboard-v2.3.1.tar
        - name: "kubernetesui/metrics-scraper:v1.0.7"
          file_name: metrics-scraper-v1.0.7.tar
        - name: "vault:1.7.0"
          file_name: vault-1.7.0.tar
        - name: "hashicorp/vault-k8s:0.10.0"
          file_name: vault-k8s-0.10.0.tar
        - name: "istio/proxyv2:1.8.1"
          file_name: proxyv2-1.8.1.tar
        - name: "istio/pilot:1.8.1"
          file_name: pilot-1.8.1.tar
        - name: "istio/operator:1.8.1"
          file_name: operator-1.8.1.tar
        # postgres
        - name: bitnami/pgpool:4.2.4
          file_name: pgpool-4.2.4.tar
        - name: bitnami/pgbouncer:1.16.0
          file_name: pgbouncer-1.16.0.tar
      current:
        - name: "haproxy:2.2.2-alpine"
          file_name: haproxy-2.2.2-alpine.tar
        # K8s v1.20.12 - LambdaStack 1.3 (transitional version)
        # https://github.com/kubernetes/kubernetes/blob/v1.20.12/build/dependencies.yaml
        - name: "k8s.gcr.io/kube-apiserver:v1.20.12"
          file_name: kube-apiserver-v1.20.12.tar
        - name: "k8s.gcr.io/kube-controller-manager:v1.20.12"
          file_name: kube-controller-manager-v1.20.12.tar
        - name: "k8s.gcr.io/kube-proxy:v1.20.12"
          file_name: kube-proxy-v1.20.12.tar
        - name: "k8s.gcr.io/kube-scheduler:v1.20.12"
          file_name: kube-scheduler-v1.20.12.tar
        - name: "k8s.gcr.io/coredns:1.7.0"
          file_name: coredns-1.7.0.tar
        - name: "k8s.gcr.io/etcd:3.4.13-0"
          file_name: etcd-3.4.13-0.tar
        - name: "k8s.gcr.io/pause:3.2"
          file_name: pause-3.2.tar
        # flannel
        - name: "quay.io/coreos/flannel:v0.14.0-amd64"
          file_name: flannel-v0.14.0-amd64.tar
        - name: "quay.io/coreos/flannel:v0.14.0"
          file_name: flannel-v0.14.0.tar
        # canal & calico
        - name: "calico/cni:v3.20.2"
          file_name: cni-v3.20.2.tar
        - name: "calico/kube-controllers:v3.20.2"
          file_name: kube-controllers-v3.20.2.tar
        - name: "calico/node:v3.20.2"
          file_name: node-v3.20.2.tar
        - name: "calico/pod2daemon-flexvol:v3.20.2"
          file_name: pod2daemon-flexvol-v3.20.2.tar
      legacy:
        # K8s v1.18.6 - LambdaStack 0.7.1 - 1.2
        - name: "k8s.gcr.io/kube-apiserver:v1.18.6"
          file_name: kube-apiserver-v1.18.6.tar
        - name: "k8s.gcr.io/kube-controller-manager:v1.18.6"
          file_name: kube-controller-manager-v1.18.6.tar
        - name: "k8s.gcr.io/kube-proxy:v1.18.6"
          file_name: kube-proxy-v1.18.6.tar
        - name: "k8s.gcr.io/kube-scheduler:v1.18.6"
          file_name: kube-scheduler-v1.18.6.tar
        - name: "k8s.gcr.io/coredns:1.6.7"
          file_name: coredns-1.6.7.tar
        - name: "k8s.gcr.io/etcd:3.4.3-0"
          file_name: etcd-3.4.3-0.tar
        # flannel
        - name: "quay.io/coreos/flannel:v0.12.0-amd64"
          file_name: flannel-v0.12.0-amd64.tar
        - name: "quay.io/coreos/flannel:v0.12.0"
          file_name: flannel-v0.12.0.tar
        # canal & calico
        - name: "calico/cni:v3.15.0"
          file_name: cni-v3.15.0.tar
        - name: "calico/kube-controllers:v3.15.0"
          file_name: kube-controllers-v3.15.0.tar
        - name: "calico/node:v3.15.0"
          file_name: node-v3.15.0.tar
        - name: "calico/pod2daemon-flexvol:v3.15.0"
          file_name: pod2daemon-flexvol-v3.15.0.tar
    aarch64:
      generic:
        - name: "lambdastack/keycloak:14.0.0"
          file_name: keycloak-14.0.0.tar
        - name: "rabbitmq:3.8.9"
          file_name: rabbitmq-3.8.9.tar
        - name: "lambdastack/ignite:2.9.1"
          file_name: ignite-2.9.1.tar
        - name: "kubernetesui/dashboard:v2.3.1"
          file_name: dashboard-v2.3.1.tar
        - name: "kubernetesui/metrics-scraper:v1.0.7"
          file_name: metrics-scraper-v1.0.7.tar
        - name: "vault:1.7.0"
          file_name: vault-1.7.0.tar
        - name: "hashicorp/vault-k8s:0.10.0"
          file_name: vault-k8s-0.10.0.tar
      current:
        - name: "haproxy:2.2.2-alpine"
          file_name: haproxy-2.2.2-alpine.tar
        # K8s v1.20.12 - LambdaStack 1.3 (transition version)
        - name: "k8s.gcr.io/kube-apiserver:v1.20.12"
          file_name: kube-apiserver-v1.20.12.tar
        - name: "k8s.gcr.io/kube-controller-manager:v1.20.12"
          file_name: kube-controller-manager-v1.20.12.tar
        - name: "k8s.gcr.io/kube-proxy:v1.20.12"
          file_name: kube-proxy-v1.20.12.tar
        - name: "k8s.gcr.io/kube-scheduler:v1.20.12"
          file_name: kube-scheduler-v1.20.12.tar
        - name: "k8s.gcr.io/coredns:1.7.0"
          file_name: coredns-1.7.0.tar
        - name: "k8s.gcr.io/etcd:3.4.13-0"
          file_name: etcd-3.4.13-0.tar
        - name: "k8s.gcr.io/pause:3.2"
          file_name: pause-3.2.tar
        # flannel
        - name: "quay.io/coreos/flannel:v0.14.0-arm64"
          file_name: flannel-v0.14.0-arm64.tar
        - name: "quay.io/coreos/flannel:v0.14.0"
          file_name: flannel-v0.14.0.tar
        # canal & calico
        - name: "calico/cni:v3.20.2"
          file_name: cni-v3.20.2.tar
        - name: "calico/kube-controllers:v3.20.2"
          file_name: kube-controllers-v3.20.2.tar
        - name: "calico/node:v3.20.2"
          file_name: node-v3.20.2.tar
        - name: "calico/pod2daemon-flexvol:v3.20.2"
          file_name: pod2daemon-flexvol-v3.20.2.tar
      legacy:
        # K8s v1.18.6 - LambdaStack 0.7.1 - 1.2
        - name: "k8s.gcr.io/kube-apiserver:v1.18.6"
          file_name: kube-apiserver-v1.18.6.tar
        - name: "k8s.gcr.io/kube-controller-manager:v1.18.6"
          file_name: kube-controller-manager-v1.18.6.tar
        - name: "k8s.gcr.io/kube-proxy:v1.18.6"
          file_name: kube-proxy-v1.18.6.tar
        - name: "k8s.gcr.io/kube-scheduler:v1.18.6"
          file_name: kube-scheduler-v1.18.6.tar
        - name: "k8s.gcr.io/coredns:1.6.7"
          file_name: coredns-1.6.7.tar
        - name: "k8s.gcr.io/etcd:3.4.3-0"
          file_name: etcd-3.4.3-0.tar
        # flannel
        - name: "quay.io/coreos/flannel:v0.12.0-arm64"
          file_name: flannel-v0.12.0-arm64.tar
        - name: "quay.io/coreos/flannel:v0.12.0"
          file_name: flannel-v0.12.0.tar
        # canal & calico
        - name: "calico/cni:v3.15.0"
          file_name: cni-v3.15.0.tar
        - name: "calico/kube-controllers:v3.15.0"
          file_name: kube-controllers-v3.15.0.tar
        - name: "calico/node:v3.15.0"
          file_name: node-v3.15.0.tar
        - name: "calico/pod2daemon-flexvol:v3.15.0"
          file_name: pod2daemon-flexvol-v3.15.0.tar

3.1.4.14 - JMX-Exporter

JMX-Exporter options

The content of the jmx-exporter.yml file is listed for reference only

---
kind: configuration/jmx-exporter
title: "JMX exporter"
name: default
specification:
  file_name: "jmx_prometheus_javaagent-0.14.0.jar"
  jmx_path: /opt/jmx-exporter/jmx_prometheus_javaagent.jar # Changing it also requires changing the same variable in the Kafka and Zookeeper configs. # TODO: make Zookeeper and Kafka use this variable
  jmx_jars_directory: /opt/jmx-exporter/jars
  jmx_exporter_user: jmx-exporter
  jmx_exporter_group: jmx-exporter

3.1.4.15 - Kafka-Exporter

Kafka-exporter options

The content of the kafka-exporter.yml file is listed for reference only

---
kind: configuration/kafka-exporter
title: "Kafka exporter"
name: default
specification:
  description: "Service that runs Kafka Exporter"

  web_listen_port: "9308"
  config_flags:
    - "--web.listen-address=:9308" # Address to listen on for web interface and telemetry.
    - '--web.telemetry-path=/metrics' # Path under which to expose metrics.
    - '--log.level=info'
    - '--topic.filter=.*' # Regex that determines which topics to collect.
    - '--group.filter=.*' # Regex that determines which consumer groups to collect.
    #- '--tls.insecure-skip-tls-verify' # If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure.
    - '--kafka.version=2.0.0'
    #- '--sasl.enabled' # Connect using SASL/PLAIN.
    #- '--sasl.handshake' # Only set this to false if using a non-Kafka SASL proxy
    #- '--sasl.username=""'
    #- '--sasl.password=""'
    #- '--tls.enabled' # Connect using TLS
    #- '--tls.ca-file=""' # The optional certificate authority file for TLS client authentication
    #- '--tls.cert-file=""' # The optional certificate file for client authentication
    #- '--tls.key-file=""' # The optional key file for client authentication

  config_for_prometheus: # configuration that will be written to Prometheus to allow scraping metrics from this exporter
    exporter_listen_port: "9308"
    prometheus_config_dir: /etc/prometheus
    file_sd_labels:
      - label: "job"
        value: "kafka-exporter"

3.1.4.16 - Kafka

Kafka options

The content of the kafka.yml file is listed for reference only

---
kind: configuration/kafka
title: "Kafka"
name: default
specification:
  kafka_var:
    enabled: True
    admin: kafka
    admin_pwd: LambdaStack
    # javax_net_debug: all # uncomment to activate debugging, other debug options: https://colinpaice.blog/2020/04/05/using-java-djavax-net-debug-to-examine-data-flows-including-tls/
    security:
      ssl:
        enabled: False
        port: 9093
        server:
          local_cert_download_path: kafka-certs
          keystore_location: /var/private/ssl/kafka.server.keystore.jks
          truststore_location: /var/private/ssl/kafka.server.truststore.jks
          cert_validity: 365
          passwords:
            keystore: PasswordToChange
            truststore: PasswordToChange
            key: PasswordToChange
        endpoint_identification_algorithm: HTTPS
        client_auth: required
      encrypt_at_rest: False
      inter_broker_protocol: PLAINTEXT
      authorization:
        enabled: False
        authorizer_class_name: kafka.security.auth.SimpleAclAuthorizer
        allow_everyone_if_no_acl_found: False
        super_users:
          - tester01
          - tester02
        users:
          - name: test_user
            topic: test_topic
      authentication:
        enabled: False
        authentication_method: certificates
        sasl_mechanism_inter_broker_protocol:
        sasl_enabled_mechanisms: PLAIN
    sha: "b28e81705e30528f1abb6766e22dfe9dae50b1e1e93330c880928ff7a08e6b38ee71cbfc96ec14369b2dfd24293938702cab422173c8e01955a9d1746ae43f98"
    port: 9092
    min_insync_replicas: 1 # Minimum number of replicas (ack write)
    default_replication_factor: 1 # Minimum number of automatically created topics
    offsets_topic_replication_factor: 1 # Minimum number of offsets topic (consider higher value for HA)
    num_recovery_threads_per_data_dir: 1 # Minimum number of recovery threads per data dir
    num_replica_fetchers: 1 # Minimum number of replica fetchers
    replica_fetch_max_bytes: 1048576
    replica_socket_receive_buffer_bytes: 65536
    partitions: 8 # 100 x brokers x replicas for reasonable size cluster. Small clusters can be less
    log_retention_hours: 168 # The minimum age of a log file to be eligible for deletion due to age
    log_retention_bytes: -1 # -1 is no size limit only a time limit (log_retention_hours). This limit is enforced at the partition level, multiply it by the number of partitions to compute the topic retention in bytes.
    offset_retention_minutes: 10080 # Offsets older than this retention period will be discarded
    heap_opts: "-Xmx2G -Xms2G"
    opts: "-Djavax.net.debug=all"
    jmx_opts:
    max_incremental_fetch_session_cache_slots: 1000
    controlled_shutdown_enable: true
    group: kafka
    user: kafka
    conf_dir: /opt/kafka/config
    data_dir: /var/lib/kafka
    log_dir: /var/log/kafka
    socket_settings:
      network_threads: 3 # The number of threads handling network requests
      io_threads: 8 # The number of threads doing disk I/O
      send_buffer_bytes: 102400 # The send buffer (SO_SNDBUF) used by the socket server
      receive_buffer_bytes: 102400 # The receive buffer (SO_RCVBUF) used by the socket server      
      request_max_bytes: 104857600 # The maximum size of a request that the socket server will accept (protection against OOM)
  zookeeper_set_acl: false
  zookeeper_hosts: "{{ groups['zookeeper']|join(':2181,') }}:2181"
  jmx_exporter_user: jmx-exporter
  jmx_exporter_group: jmx-exporter
  prometheus_jmx_path: /opt/jmx-exporter/jmx_prometheus_javaagent.jar
  prometheus_jmx_exporter_web_listen_port: 7071
  prometheus_jmx_config: /opt/kafka/config/jmx-kafka.config.yml
  prometheus_config_dir: /etc/prometheus
  prometheus_kafka_jmx_file_sd_labels:
    "job": "jmx-kafka"

3.1.4.17 - Kibana

Kibana options

The content of the kibana.yml file is listed for reference only

---
kind: configuration/kibana
title: "Kibana"
name: default
specification:
  kibana_log_dir: /var/log/kibana

3.1.4.18 - Kubernetes-ControlPlane

Kubernetes ControlPlane (aka kubernetes-master) options

The content of the kubernetes-master.yml file is listed for reference only

---
kind: configuration/kubernetes-master
title: Kubernetes Control Plane Config
name: default
specification:
  version: 1.20.12
  cni_version: 0.8.7
  cluster_name: "kubernetes-lambdastack"
  allow_pods_on_master: False
  storage:
    name: lambdastack-cluster-volume # name of the Kubernetes resource
    path: / # directory path in mounted storage
    enable: True
    capacity: 50 # GB
    data: {} #AUTOMATED - data specific to cloud provider
  advanced: # modify only if you are sure what value means
    api_server_args: # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
      profiling: false
      enable-admission-plugins: "AlwaysPullImages,DenyEscalatingExec,NamespaceLifecycle,ServiceAccount,NodeRestriction"
      audit-log-path: "/var/log/apiserver/audit.log"
      audit-log-maxbackup: 10
      audit-log-maxsize: 200
      secure-port: 6443
    controller_manager_args: # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/
      profiling: false
      terminated-pod-gc-threshold: 200
    scheduler_args:  # https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/
      profiling: false
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
      plugin: flannel # valid options: calico, flannel, canal (due to lack of support for calico on Azure - use canal)
    imageRepository: k8s.gcr.io
    certificates:
      expiration_days: 365 # values greater than 24855 are not recommended
      renew: false
    etcd_args:
      encrypted: true
    kubeconfig:
      local:
        api_server:
          # change if you want a custom hostname (you can use jinja2/ansible expressions here, for example "{{ groups.kubernetes_master[0] }}")
          hostname: 127.0.0.1
          # change if you want a custom port
          port: 6443
#  image_registry_secrets:
#  - email: emaul@domain.com
#    name: secretname
#    namespace: default
#    password: docker-registry-pwd
#    server_url: docker-registry-url
#    username: docker-registry-user

3.1.4.19 - Kubernetes Nodes

Kubernetes Nodes options

The content of the kubernetes-nodes.yml file is listed for reference only

---
kind: configuration/kubernetes-node
title: Kubernetes Node Config
name: default
specification:
  version: 1.20.12
  cni_version: 0.8.7
  node_labels: "node-type=lambdastack"

3.1.4.20 - Logging

Logging options

The content of the logging.yml file is listed for reference only

---
kind: configuration/logging
title: Logging Config
name: default
specification:
  cluster_name: LambdaStackElastic
  admin_password: PASSWORD_TO_CHANGE
  kibanaserver_password: PASSWORD_TO_CHANGE
  kibanaserver_user_active: true
  logstash_password: PASSWORD_TO_CHANGE
  logstash_user_active: true
  demo_users_to_remove:
  - kibanaro
  - readall
  - snapshotrestore
  paths:
    data: /var/lib/elasticsearch
    repo: /var/lib/elasticsearch-snapshots
    logs: /var/log/elasticsearch
  jvm_options:
    Xmx: 1g # see https://www.elastic.co/guide/en/elasticsearch/reference/7.9/heap-size.html
  opendistro_security:
    ssl:
      transport:
        enforce_hostname_verification: true

3.1.4.21 - Logstash

Logstash options

The content of the logstash.yml file is listed for reference only

---
kind: configuration/logstash
title: "Logstash"
name: default
specification: {}

3.1.4.22 - Node-Exporter

Node-exporter options

The content of the node-exporter.yml file is listed for reference only

---
kind: configuration/node-exporter
title: "Node exporter"
name: default
specification:
  disable_helm_chart: false
  helm_chart_values:
    service:
      port: 9100
      targetPort: 9100 
  files:
    node_exporter_helm_chart_file_name: node-exporter-1.1.2.tgz
  enabled_collectors:
    - conntrack
    - diskstats
    - entropy
    - filefd
    - filesystem
    - loadavg
    - mdadm
    - meminfo
    - netdev
    - netstat
    - sockstat
    - stat
    - textfile
    - time
    - uname
    - vmstat
    - systemd

  config_flags:
    - "--web.listen-address=:9100"
    - '--log.level=info'
    - '--collector.diskstats.ignored-devices=^(ram|loop|fd)\d+$'
    - '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/)'
    - '--collector.netdev.device-blacklist="^$"'
    - '--collector.textfile.directory="/var/lib/prometheus/node-exporter"'
    - '--collector.systemd.unit-whitelist="(kafka\.service|zookeeper\.service)"'

  web_listen_port: "9100"
  web_listen_address: ""

  config_for_prometheus: # configuration that will be written to Prometheus to allow scraping metrics from this exporter
    exporter_listen_port: "9100"
    prometheus_config_dir: /etc/prometheus
    file_sd_labels:
      - label: "job"
        value: "node"

3.1.4.23 - Opendistro-for-ElasticSearch

Opendistro options

The content of the opendistro-for-elasticsearch.yml file is listed for reference only

---
kind: configuration/opendistro-for-elasticsearch
title: Open Distro for Elasticsearch Config
name: default
specification:
  cluster_name: LambdaStackElastic
  clustered: true
  admin_password: PASSWORD_TO_CHANGE
  kibanaserver_password: PASSWORD_TO_CHANGE
  kibanaserver_user_active: false
  logstash_password: PASSWORD_TO_CHANGE
  logstash_user_active: false
  demo_users_to_remove:
  - kibanaro
  - readall
  - snapshotrestore
  - logstash
  - kibanaserver
  paths:
    data: /var/lib/elasticsearch
    repo: /var/lib/elasticsearch-snapshots
    logs: /var/log/elasticsearch
  jvm_options:
    Xmx: 1g # see https://www.elastic.co/guide/en/elasticsearch/reference/7.9/heap-size.html
  opendistro_security:
    ssl:
      transport:
        enforce_hostname_verification: true

3.1.4.24 - Postgres-Exporter

Postgres-Exporter options

The content of the postgres-exporter.yml file is listed for reference only

---
kind: configuration/postgres-exporter
title: Postgres exporter
name: default
specification:
  config_flags:
  - --log.level=info
  - --extend.query-path=/opt/postgres_exporter/queries.yaml
  - --auto-discover-databases
  # Please see optional flags: https://github.com/prometheus-community/postgres_exporter/tree/v0.9.0#flags
  config_for_prometheus:
    exporter_listen_port: '9187'
    prometheus_config_dir: /etc/prometheus
    file_sd_labels:
    - label: "job"
      value: "postgres-exporter"

3.1.4.25 - PostgreSQL

PostgreSQL options

The content of the postgresql.yml file is listed for reference only

---
kind: configuration/postgresql
title: PostgreSQL
name: default
specification:
  config_file:
    parameter_groups:
      - name: CONNECTIONS AND AUTHENTICATION
        subgroups:
          - name: Connection Settings
            parameters:
              - name: listen_addresses
                value: "'*'"
                comment: listen on all addresses
          - name: Security and Authentication
            parameters:
              - name: ssl
                value: 'off'
                comment: to have the default value also on Ubuntu
      - name: RESOURCE USAGE (except WAL)
        subgroups:
          - name: Kernel Resource Usage
            parameters:
              - name: shared_preload_libraries
                value: AUTOCONFIGURED
                comment: set by automation
      - name: ERROR REPORTING AND LOGGING
        subgroups:
          - name: Where to Log
            parameters:
              - name: log_directory
                value: "'/var/log/postgresql'"
                comment: to have standard location for Filebeat and logrotate
              - name: log_filename
                value: "'postgresql.log'"
                comment: to use logrotate with common configuration
      - name: WRITE AHEAD LOG
        subgroups:
          - name: Settings
            parameters:
              - name: wal_level
                value: replica
                when: replication
          # Changes to archive_mode require a full PostgreSQL server restart,
          # while archive_command changes can be applied via a normal configuration reload.
          # See https://repmgr.org/docs/repmgr.html#CONFIGURATION-POSTGRESQL
          - name: Archiving
            parameters:
              - name: archive_mode
                value: 'on'
                when: replication
              - name: archive_command
                value: "'/bin/true'"
                when: replication
      - name: REPLICATION
        subgroups:
          - name: Sending Server(s)
            parameters:
              - name: max_wal_senders
                value: 10
                comment: maximum number of simultaneously running WAL sender processes
                when: replication
              - name: wal_keep_size
                value: 500
                comment: the size of WAL files held for standby servers (MB)
                when: replication
          - name: Standby Servers # ignored on master server
            parameters:
              - name: hot_standby
                value: 'on'
                comment: must be 'on' for repmgr needs, ignored on primary but recommended in case primary becomes standby
                when: replication
  extensions:
    pgaudit:
      enabled: false
      shared_preload_libraries:
        - pgaudit
      config_file_parameters:
        log_connections: 'off'
        log_disconnections: 'off'
        log_statement: 'none'
        log_line_prefix: "'%m [%p] %q%u@%d,host=%h '"
        # pgaudit specific, see https://github.com/pgaudit/pgaudit/tree/REL_13_STABLE#settings
        pgaudit.log: "'write, function, role, ddl, misc_set'"
        pgaudit.log_catalog: 'off # to reduce overhead of logging' # default is 'on'
        # the following first 2 parameters are set to values that make it easier to access audit log per table
        # change their values to the opposite if you need to reduce overhead of logging
        pgaudit.log_relation: 'on # separate log entry for each relation' # default is 'off'
        pgaudit.log_statement_once: 'off' # same as default
        pgaudit.log_parameter: 'on' # default is 'off'
    pgbouncer:
      enabled: false
    replication:
      enabled: false
      replication_user_name: ls_repmgr
      replication_user_password: PASSWORD_TO_CHANGE
      privileged_user_name: ls_repmgr_admin
      privileged_user_password: PASSWORD_TO_CHANGE
      repmgr_database: ls_repmgr
      shared_preload_libraries:
        - repmgr
  logrotate:
    pgbouncer:
      period: weekly
      rotations: 5
    # Configuration partly based on /etc/logrotate.d/postgresql-common provided by 'postgresql-common' package from Ubuntu repo.
      # PostgreSQL from Ubuntu repo:
        # By default 'logging_collector' is disabled, so 'log_directory' parameter is ignored.
        # Default log path is /var/log/postgresql/postgresql-$version-$cluster.log.
      # PostgreSQL from SCL repo (RHEL):
        # By default 'logging_collector' is enabled and there is up to 7 files named 'daily' (e.g. postgresql-Wed.log)
        # and thus they can be overwritten by built-in log facility to provide rotation.
    # To have similar configuration for both distros (with logrotate), 'log_filename' parameter is modified.
    postgresql: |-
      /var/log/postgresql/postgresql*.log {
          maxsize 10M
          daily
          rotate 6
          copytruncate
          # delaycompress is for Filebeat
          delaycompress
          compress
          notifempty
          missingok
          su root root
          nomail
          # to have multiple unique filenames per day when dateext option is set
          dateformat -%Y%m%dH%H
      }      

3.1.4.26 - Prometheus

Prometheus options

The content of the prometheus.yml file is listed for reference only

---
kind: configuration/prometheus
title: "Prometheus"
name: default
specification:
  config_directory: "/etc/prometheus"
  storage:
    data_directory: "/var/lib/prometheus"
  config_flags:                                                            # Parameters that Prometheus service will be started with.
    - "--config.file=/etc/prometheus/prometheus.yml"                       # Directory should be the same as "config_directory"
    - "--storage.tsdb.path=/var/lib/prometheus"                            # Directory should be the same as "storage.data_directory"
    - "--storage.tsdb.retention.time=180d"                                 # Data retention time for metrics
    - "--storage.tsdb.retention.size=20GB"                                 # Data retention size for metrics
    - "--web.console.libraries=/etc/prometheus/console_libraries"          # Directory should be the same as "config_directory"
    - "--web.console.templates=/etc/prometheus/consoles"                   # Directory should be the same as "config_directory"
    - "--web.listen-address=0.0.0.0:9090"                                  # Address that Prometheus console will be available
    - "--web.enable-admin-api"                                             # Enables administrative HTTP API
  metrics_path: "/metrics"
  scrape_interval : "15s"
  scrape_timeout: "10s"
  evaluation_interval: "10s"
  remote_write: []
  remote_read: []
  alertmanager:
    enable: false # To make Alertmanager working, you have to enable it and define receivers and routes
    alert_rules:
      common: true
      container: false
      kafka: false
      node: false
      postgresql: false
      prometheus: false
    # config: # Configuration for Alertmanager, it will be passed to Alertmanager service.
    #   # Full list of configuration fields https://prometheus.io/docs/alerting/configuration/
    #   global:
    #     resolve_timeout: 5m
    #     smtp_from: "alert@test.com"
    #     smtp_smarthost: "smtp-url:smtp-port"
    #     smtp_auth_username: "your-smtp-user@domain.com"
    #     smtp_auth_password: "your-smtp-password"
    #     smtp_require_tls: True
    #   route:
    #     group_by: ['alertname']
    #     group_wait: 10s
    #     group_interval: 10s
    #     repeat_interval: 1h
    #     receiver: 'email' # Default receiver, change if another is set to default
    #     routes: # Example routes, names need to match 'name' field of receiver
    #       - match_re:
    #           severity: critical
    #         receiver: opsgenie
    #         continue: true
    #       - match_re:
    #           severity: critical
    #         receiver: pagerduty
    #         continue: true
    #       - match_re:
    #           severity: info|warning|critical
    #         receiver: slack
    #         continue: true
    #       - match_re:
    #           severity: warning|critical
    #         receiver: email
    #   receivers: # example configuration for receivers # api_url: https://prometheus.io/docs/alerting/configuration/#receiver
    #     - name: 'email'
    #       email_configs:
    #         - to: "test@domain.com"
    #     - name: 'slack'
    #       slack_configs:
    #         - api_url: "your-slack-integration-url"
    #     - name: 'pagerduty'
    #       pagerduty_configs:
    #         - service_key: "your-pagerduty-service-key"
    #     - name: 'opsgenie'
    #       opsgenie_config:
    #         api_key: <secret> | default = global.opsgenie_api_key
    #         api_url: <string> | default = global.opsgenie_api_url

3.1.4.27 - RabbitMQ

RabbitMQ options

The content of the rabbitmq.yml file is listed for reference only

---
kind: configuration/rabbitmq
title: "RabbitMQ"
name: default
specification:
  rabbitmq_user: rabbitmq
  rabbitmq_group: rabbitmq
  stop_service: false

  logrotate_period: weekly
  logrotate_number: 10
  ulimit_open_files: 65535

  amqp_port: 5672
  rabbitmq_use_longname: AUTOCONFIGURED # true/false/AUTOCONFIGURED
  rabbitmq_policies: []
  rabbitmq_plugins: []
  custom_configurations: []
  cluster:
    is_clustered: false

3.1.4.28 - Recovery

Recovery options

The content of the recovery.yml file is listed for reference only

---
kind: configuration/recovery
title: Recovery Config
name: default
specification:
  components:
    load_balancer:
      enabled: false
      snapshot_name: latest
    logging:
      enabled: false
      snapshot_name: latest
    monitoring:
      enabled: false
      snapshot_name: latest
    postgresql:
      enabled: false
      snapshot_name: latest
    rabbitmq:
      enabled: false
      snapshot_name: latest

3.1.4.29 - Repository

Repository options

The content of the repository.yml file is listed for reference only

---
kind: configuration/repository
title: "LambdaStack requirements repository"
name: default
specification:
  description: "Local repository of binaries required to install LambdaStack"
  download_done_flag_expire_minutes: 120
  apache_lsrepo_path: "/var/www/html/lsrepo"
  teardown:
    disable_http_server: true # whether to stop and disable Apache HTTP Server service
    remove:
      files: false
      helm_charts: false
      images: false
      packages: false

3.1.4.30 - Shared-Config

Shared-Config options

The content of the shared-config.yml file is listed for reference only

---
kind: configuration/shared-config
title: "Shared configuration that will be visible to all roles"
name: default
specification:
  custom_repository_url: '' # leave it empty to use local repository or provide url to your repo
  custom_image_registry_address: '' # leave it empty to use local registry or provide address of your registry (hostname:port). This registry will be used to populate K8s control plane and should contain all required images.
  download_directory: /tmp # directory where files and images will be stored just before installing/loading
  vault_location: '' # if empty "BUILD DIRECTORY/vault" will be used
  vault_tmp_file_location: SET_BY_AUTOMATION
  use_ha_control_plane: False
  promote_to_ha: False

3.1.4.31 - Vault

Hashicorp Vault options

The content of the vault.yml file is listed for reference only

---
kind: configuration/vault
title: Vault Config
name: default
specification:
  vault_enabled: false
  vault_system_user: vault
  vault_system_group: vault
  enable_vault_audit_logs: false
  enable_vault_ui: false
  vault_script_autounseal: true
  vault_script_autoconfiguration: true
  tls_disable: false
  kubernetes_integration: true
  kubernetes_configuration: true
  kubernetes_namespace: default
  enable_vault_kubernetes_authentication: true
  app_secret_path: devwebapp
  revoke_root_token: false
  secret_mount_path: secret
  vault_token_cleanup: true
  vault_install_dir: /opt/vault
  vault_log_level: info
  override_existing_vault_users: false
  certificate_name: fullchain.pem
  private_key_name: privkey.pem
  selfsigned_certificate:
    country: US
    state: state
    city: city
    company: company
    common_name: "*"
  vault_tls_valid_days: 365
  vault_users:
    - name: admin
      policy: admin
    - name: provisioner
      policy: provisioner
  files:
    vault_helm_chart_file_name: v0.11.0.tar.gz
  vault_helm_chart_values:
    injector:
      image:
        repository: "{{ image_registry_address }}/hashicorp/vault-k8s"
      agentImage:
        repository: "{{ image_registry_address }}/vault"
    server:
      image:
        repository: "{{ image_registry_address }}/vault"

3.1.4.32 - Zookeeper

Zookeeper options

The content of the zookeeper.yml file is listed for reference only

---
kind: configuration/zookeeper
title: "Zookeeper"
name: default
specification:
  static_config_file:
    # This block is injected to $ZOOCFGDIR/zoo.cfg
    configurable_block: |
      # Limits the number of concurrent connections (at the socket level) that a single client, identified by IP address,
      # may make to a single member of the ZooKeeper ensemble. This is used to prevent certain classes of DoS attacks,
      # including file descriptor exhaustion. The default is 60. Setting this to 0 removes the limit.
      maxClientCnxns=0

      # --- AdminServer configuration ---

      # By default the AdminServer is enabled. Disabling it will cause automated test failures.
      admin.enableServer=true

      # The address the embedded Jetty server listens on. Defaults to 0.0.0.0.
      admin.serverAddress=127.0.0.1

      # The port the embedded Jetty server listens on. Defaults to 8080.
      admin.serverPort=8008      

3.1.5 - GCP

Minimal and Full configuration options

WIP - Coming Soon!

3.2 - How-To

LambdaStack how-tos

3.2.1 - Backup

LambdaStack how-tos - Backup

LambdaStack backup and restore

Introduction

LambdaStack provides a solution to create full or partial backups, and to restore them, for the following components:

  • Load balancer
  • Logging (Elasticsearch and Kibana)
  • Monitoring (Prometheus and Grafana)
  • PostgreSQL
  • RabbitMQ
  • Kubernetes (backup only)

A backup is created directly on the machine where the component is running and then moved to the repository host via rsync. On the repository host, backup files are stored under /lsbackup/mounted, which is mounted on a local filesystem. See the How to store backup chapter.

1. How to perform backup

Backup configuration

Copy the default backup configuration from defaults/configuration/backup.yml into a newly created backup.yml config file, and enable backup for the chosen components by setting their enabled parameter to true.

This config may also be attached to cluster-config.yml or whatever you named your cluster yaml file.

kind: configuration/backup
title: Backup Config
name: default
specification:
  components:
    load_balancer:
      enabled: true
    logging:
      enabled: false
    monitoring:
      enabled: true
    postgresql:
      enabled: true
    rabbitmq:
      enabled: false
# Kubernetes recovery is not supported at this point.
# You may create backup by enabling this below, but recovery should be done manually according to Kubernetes documentation.
    kubernetes:
      enabled: false

Run lambdastack backup command:

lambdastack backup -f backup.yml -b build_folder

If the backup config is attached to cluster-config.yml, use that file instead of backup.yml.

2. How to store backup

The backup location is defined in the backup role by backup_destination_host and backup_destination_dir. The default backup location is on the repository host in /lsbackup/mounted/. Use the mounted location as a mount point and mount the storage you want to use there (an example NFS mount is sketched after the list below). This might be:

  • Azure Blob Storage
  • Amazon S3
  • GCP Blob Storage
  • NAS
  • Any other attached storage
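
For example, mounting an NFS share exported by a NAS could look like the sketch below. The host name and export path are placeholders, and the NFS client packages (nfs-utils / nfs-common) are assumed to be installed; other storage types (Azure Blob Storage, Amazon S3, GCP) have their own mount tooling.

# on the repository host
sudo mkdir -p /lsbackup/mounted
sudo mount -t nfs nas.example.com:/exports/lsbackup /lsbackup/mounted
# optionally persist the mount across reboots
echo 'nas.example.com:/exports/lsbackup /lsbackup/mounted nfs defaults 0 0' | sudo tee -a /etc/fstab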

Ensure that the mounted location has enough space, is reliable, and is well protected against disaster.


NOTE

If you don't attach any storage to the mount point location, be aware that backups will be stored on the repository host's local disk. This is not recommended.


3. How to perform recovery

Recovery configuration

Copy the default configuration from defaults/configuration/recovery.yml into a newly created recovery.yml config file, and set the enabled parameter to true for each component you want to recover. You can select a specific snapshot by passing the date and time part of its name as snapshot_name (a sketch of listing the available snapshots follows the example config below); if no snapshot name is provided, the latest one is restored.

This config may also be attached to cluster-config.yml

kind: configuration/recovery
title: Recovery Config
name: default
specification:
  components:
    load_balancer:
      enabled: true
      snapshot_name: latest           #restore latest backup
    logging:
      enabled: true
      snapshot_name: 20200604-150829  #restore selected backup
    monitoring:
      enabled: false
      snapshot_name: latest
    postgresql:
      enabled: false
      snapshot_name: latest
    rabbitmq:
      enabled: false
      snapshot_name: latest
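
To see which snapshots are available, you can list the backup files on the repository host. This is only a sketch assuming the default /lsbackup/mounted location described earlier; the exact file names depend on the component, and the date-time part of a file name (e.g. 20200604-150829) is the value to use as snapshot_name.

# on the repository host
ls -1 /lsbackup/mounted/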

Run lambdastack recovery command:

lambdastack recovery -f recovery.yml -b build_folder

If the recovery config is attached to cluster-config.yml, use that file instead of recovery.yml.

4. How backup and recovery work

Load Balancer

Load balancer backup includes:

  • Configuration files: /etc/haproxy/
  • SSL certificates: /etc/ssl/haproxy/

Recovery includes all backed up files.

Logging

Logging backup includes:

  • Elasticsearch database snapshot
  • Elasticsearch configuration /etc/elasticsearch/
  • Kibana configuration /etc/kibana/

Only single-node Elasticsearch backup is supported. A solution for multi-node Elasticsearch clusters will be added in a future release.

Monitoring

Monitoring backup includes:

  • Prometheus data snapshot
  • Prometheus configuration /etc/prometheus/
  • Grafana data snapshot

Recovery includes all backed up configurations and snapshots.

Postgresql

Postgresql backup includes:

  • Database data and metadata dump using pg_dumpall
  • Configuration files: *.conf

When a multi-node configuration is used and a failover action has changed the database cluster status (one node down, switchover), it is still possible to create a backup. However, before restoring the database, the cluster needs to be recovered by running lambdastack apply, followed by lambdastack recovery to restore the database data. By default, restoring the database configuration from backup is not supported, since this needs to be done with lambdastack apply or manually by copying the backed up files according to the cluster state. The reason is that restoring configuration files across different database cluster configurations is very risky.

RabbitMQ

RabbitMQ backup includes:

  • RabbitMQ definitions (queues, exchanges, users, etc.)
  • Configuration files: /etc/rabbitmq/

Backup does not include RabbitMQ messages.

Recovery includes all backed up files and configurations.

Kubernetes

LambdaStack backup provides:

  • Etcd snapshot
  • Public Key Infrastructure /etc/kubernetes/pki
  • Kubeadm configuration files

The following features are not yet supported (use the related documentation to perform them manually):

  • Kubernetes cluster recovery
  • Backup and restore of data stored on persistent volumes described in persistent storage documentation

3.2.2 - Cluster

LambdaStack how-tos - Cluster

How to enable/disable LambdaStack repository VM

Enable for Ubuntu (default):

  1. Enable "repository" component:

    repository:
      count: 1
    

Enable for RHEL on Azure:

  1. Enable "repository" component:

    repository:
      count: 1
      machine: repository-machine-rhel
    
  2. Add repository VM definition to main config file:

    kind: infrastructure/virtual-machine
    name: repository-machine-rhel
    provider: azure
    based_on: repository-machine
    specification:
      storage_image_reference:
        publisher: RedHat
        offer: RHEL
        sku: 7-LVM
        version: "7.9.2021051701"
    

Enable for RHEL on AWS:

  1. Enable "repository" component:

    repository:
      count: 1
      machine: repository-machine-rhel
    
  2. Add repository VM definition to main config file:

    kind: infrastructure/virtual-machine
    title: Virtual Machine Infra
    name: repository-machine-rhel
    provider: aws
    based_on: repository-machine
    specification:
      os_full_name: RHEL-7.9_HVM-20210208-x86_64-0-Hourly2-GP2
    

Enable for CentOS on Azure:

  1. Enable "repository" component:

    repository:
      count: 1
      machine: repository-machine-centos
    
  2. Add repository VM definition to main config file:

    kind: infrastructure/virtual-machine
    name: repository-machine-centos
    provider: azure
    based_on: repository-machine
    specification:
      storage_image_reference:
        publisher: OpenLogic
        offer: CentOS
        sku: "7_9"
        version: "7.9.2021071900"
    

Enable for CentOS on AWS:

  1. Enable "repository" component:

    repository:
      count: 1
      machine: repository-machine-centos
    
  2. Add repository VM definition to main config file:

    kind: infrastructure/virtual-machine
    title: Virtual Machine Infra
    name: repository-machine-centos
    provider: aws
    based_on: repository-machine
    specification:
      os_full_name: "CentOS 7.9.2009 x86_64"
    

Disable:

  1. Disable "repository" component:

    repository:
      count: 0
    
  2. Prepend "kubernetes_master" mapping (or any other mapping if you don't deploy Kubernetes) with:

    kubernetes_master:
      - repository
      - image-registry
    

How to create a LambdaStack cluster on existing infrastructure

Please read first prerequisites related to hostname requirements.

LambdaStack has the ability to set up a cluster on infrastructure provided by you. The machines can be either bare metal or VMs and should meet the following requirements:

Note. Hardware requirements are not listed since this depends on use-case, component configuration etc.

  1. The cluster machines/VMs are connected by a network (or virtual network of some sort) and can communicate with each other. At least one of them (the one with the repository role) has Internet access in order to download dependencies. If there is no Internet access, you can use the air gap feature (offline mode).
  2. The cluster machines/VMs are running one of the following Linux distributions:
    • RedHat 7.6+ and < 8
    • CentOS 7.6+ and < 8
    • Ubuntu 18.04
  3. The cluster machines/VMs are accessible through SSH with a set of SSH keys you provide and configure on each machine yourself (key-based authentication).
  4. The user used for the SSH connection (admin_user) has passwordless root privileges through sudo (a minimal setup sketch follows this list).
  5. A provisioning machine that:
    • Has access to the SSH keys
    • Is on the same network as your cluster machines
    • Has LambdaStack running. Note. To run LambdaStack check the Prerequisites
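
A minimal sketch of preparing such an admin user on each cluster machine is shown below. The user name lsadmin and the key file id_rsa.pub are placeholders (use the user and key you reference in admin_user), and your hardening policy may call for a more restrictive sudoers entry:

# create the admin user and install the SSH public key that key_path will point at
sudo useradd -m lsadmin
sudo mkdir -p /home/lsadmin/.ssh
cat id_rsa.pub | sudo tee -a /home/lsadmin/.ssh/authorized_keys
sudo chown -R lsadmin:lsadmin /home/lsadmin/.ssh
sudo chmod 700 /home/lsadmin/.ssh && sudo chmod 600 /home/lsadmin/.ssh/authorized_keys
# grant passwordless sudo
echo 'lsadmin ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/lsadmin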

To set up the cluster do the following steps from the provisioning machine:

  1. First generate a minimal data yaml file:

    lambdastack init -p any -n newcluster
    

    The any provider will tell LambdaStack to create a minimal data config which does not contain any cloud provider related information. If you want full control you can add the --full flag which will give you a configuration with all parts of a cluster that can be configured.

  2. Open the configuration file and set up the admin_user data:

    admin_user:
      key_path: id_rsa
      name: user_name
      path: # Dynamically built
    

    Here you should specify the path to the SSH keys and the admin user name which will be used by Ansible to provision the cluster machines.

  3. Define the components you want to install and link them to the machines you want to install them on:

    Under the components tag you will find a bunch of definitions like this one:

    kubernetes_master:
      count: 1
      machines:
      - default-k8s-master
    

    The count specifies how many machines you want to provision with this component. The machines tag is the array of machine names you want to install this component on. Note that the count and the number of machines defined must match. If you don't want to use a component, you can set the count to 0 and remove the machines tag. Finally, a machine can be used by multiple components, since multiple components can be installed on one machine if desired.

    You will also find a bunch of infrastructure/machine definitions like below:

    kind: infrastructure/machine
    name: default-k8s-master
    provider: any
    specification:
      hostname: master
      ip: 192.168.100.101
    

    Each machine name used when setting up the component layout earlier must have such a definition, where the name tag matches the one used in the components. The hostname and ip fields must be filled in to match the actual cluster machines you provide. Ansible uses this to match each machine to a component, which in turn determines which roles to install on that machine.

  4. Finally, start the deployment with:

    lambdastack apply -f newcluster.yml --no-infra
    

    This will create the inventory for Ansible based on the component/machine definitions made inside newcluster.yml and let Ansible deploy it. Note that the --no-infra flag is important since it tells LambdaStack to skip the Terraform part.

How to create a LambdaStack cluster on existing air-gapped infrastructure

Please read first prerequisites related to hostname requirements.

LambdaStack has the ability to set up a cluster on air-gapped infrastructure provided by you. These can be either bare metal machines or VMs and should meet the following requirements:

Note. Hardware requirements are not listed since this depends on use-case, component configuration etc.

  1. The air-gapped cluster machines/VMs are connected by a network or virtual network of some sorts and can communicate with each other.
  2. The air-gapped cluster machines/VMs are running one of the following Linux distributions:
    • RedHat 7.6+ and < 8
    • CentOS 7.6+ and < 8
    • Ubuntu 18.04
  3. The cluster machines/VMs are accessible through SSH with a set of SSH keys you provide and configure on each machine yourself (key-based authentication).
  4. The user used for SSH connection (admin_user) has passwordless root privileges through sudo.
  5. A requirements machine that:
    • Runs the same distribution as the air-gapped cluster machines/VMs (RedHat 7, CentOS 7, Ubuntu 18.04)
    • Has access to the internet. If you don't have access to a similar machine/VM with internet access, you can also try to download the requirements with a Docker container. More information here.
  6. A provisioning machine that:
    • Has access to the SSH keys
    • Is on the same network as your cluster machines
    • Has LambdaStack running. Note. To run LambdaStack check the Prerequisites

To set up the cluster do the following steps:

  1. First we need to get the tooling to prepare the requirements. On the provisioning machine run:

    lambdastack prepare --os OS
    

    Where OS should be centos-7, redhat-7, ubuntu-18.04. This will create a directory called prepare_scripts with the needed files inside.

  2. The scripts in the prepare_scripts will be used to download all requirements. To do that copy the prepare_scripts folder over to the requirements machine and run the following command:

    download-requirements.sh /requirementsoutput/
    

    This will start downloading all requirements and put them in the /requirementsoutput/ folder. Once run successfully the /requirementsoutput/ needs to be copied to the provisioning machine to be used later on.
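
    If both machines can reach each other over SSH, the copy steps can be done with scp, for example (a sketch only; user names, host names and paths are placeholders):

    # copy the prepared scripts from the provisioning machine to the requirements machine
    scp -r prepare_scripts user@requirements-machine:~/
    # ... run download-requirements.sh on the requirements machine as described above ...
    # copy the downloaded requirements back to the provisioning machine
    scp -r user@requirements-machine:/requirementsoutput/ /requirementsoutput/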

  3. Then generate a minimal data yaml file on the provisioning machine:

    lambdastack init -p any -n newcluster
    

    The any provider will tell LambdaStack to create a minimal data config which does not contain any cloud provider related information. If you want full control you can add the --full flag which will give you a configuration with all parts of a cluster that can be configured.

  4. Open the configuration file and set up the admin_user data:

    admin_user:
      key_path: id_rsa
      name: user_name
      path: # Dynamically built
    

    Here you should specify the path to the SSH keys and the admin user name which will be used by Ansible to provision the cluster machines.

  5. Define the components you want to install and link them to the machines you want to install them on:

    Under the components tag you will find a bunch of definitions like this one:

    kubernetes_master:
      count: 1
      machines:
      - default-k8s-master
    

    The count specifies how many machines you want to provision with this component. The machines tag is the array of machine names you want to install this component on. Note that the count and the number of machines defined must match. If you don't want to use a component you can set the count to 0 and remove the machines tag. Finally, a machine can be used by multiple components since multiple components can be installed on one machine if desired.

    You will also find a bunch of infrastructure/machine definitions like below:

    kind: infrastructure/machine
    name: default-k8s-master
    provider: any
    specification:
      hostname: master
      ip: 192.168.100.101
    

    Each machine name used when setting up the component layout earlier must have such a configuration where the name tag matches with the defined one in the components. The hostname and ip fields must be filled to match the actual cluster machines you provide. Ansible will use this to match the machine to a component which in turn will determine which roles to install on the machine.

  6. Finally, start the deployment with:

    lambdastack apply -f newcluster.yml --no-infra --offline-requirements /requirementsoutput/
    

    This will create the inventory for Ansible based on the component/machine definitions made inside the newcluster.yml and let Ansible deploy it. Note that the --no-infra is important since it tells LambdaStack to skip the Terraform part. The --offline-requirements tells LambdaStack it is an air-gapped installation and to use the /requirementsoutput/ requirements folder prepared in steps 1 and 2 as source for all requirements.

How to create a LambdaStack cluster using a custom system repository and Docker image registry

LambdaStack has the ability to use external repository and image registry during lambdastack apply execution.

Custom urls need to be specified inside the configuration/shared-config document, for example:

kind: configuration/shared-config
title: Shared configuration that will be visible to all roles
name: default
specification:
  custom_image_registry_address: "10.50.2.1:5000"
  custom_repository_url: "http://10.50.2.1:8080/lsrepo"
  use_ha_control_plane: true

The repository and image registry implementation must be compatible with the existing Ansible code:

  • the repository data (including the apt or yum repository) is served from an HTTP server and structured exactly as in the offline package
  • the image registry data is loaded into and served from a standard Docker registry implementation
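
For illustration only, the registry and the repository could be exposed with standard tooling like in the sketch below (this is not an official LambdaStack procedure; the host path and published ports are placeholders matching the example addresses above):

# serve a standard Docker registry on port 5000
docker run -d --name registry -p 5000:5000 registry:2

# serve the repository data over HTTP on port 8080 under the /lsrepo path (any HTTP server will do)
docker run -d --name lsrepo -p 8080:80 -v /path/to/offline/package:/usr/share/nginx/html/lsrepo:ro nginx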

Note. If both custom repository/registry and offline installation are configured then the custom repository/registry is preferred.

Note. You can switch between custom repository/registry and offline/online installation methods. Keep in mind that this will change the "imageRegistry" setting in Kubernetes, which in turn may cause a short downtime.

By default, LambdaStack creates a "repository" virtual machine for cloud environments. When a custom repository and registry are used, there is no need for this additional VM. The following config snippet illustrates how to avoid creating it:

kind: lambdastack-cluster
title: LambdaStack Cluster Config
provider: <provider>
name: default
specification:
  ...
  components:
    repository:
      count: 0
    kubernetes_master:
      count: 1
    kubernetes_node:
      count: 2
---
kind: configuration/feature-mapping
title: "Feature mapping to roles"
provider: <provider>
name: default
specification:
  roles_mapping:
    kubernetes_master:
      - repository
      - image-registry
      - kubernetes-master
      - helm
      - applications
      - node-exporter
      - filebeat
      - firewall
      - vault
---
kind: configuration/shared-config
title: Shared configuration that will be visible to all roles
provider: <provider>
name: default
specification:
  custom_image_registry_address: "<ip-address>:5000"
  custom_repository_url: "http://<ip-address>:8080/lsrepo"
  1. Disable "repository" component:

    repository:
      count: 0
    
  2. Prepend "kubernetes_master" mapping (or any other mapping if you don't deploy Kubernetes) with:

    kubernetes_master:
      - repository
      - image-registry
    
  3. Specify custom repository/registry in configuration/shared-config:

    specification:
      custom_image_registry_address: "<ip-address>:5000"
      custom_repository_url: "http://<ip-address>:8080/lsrepo"
    

How to create a LambdaStack cluster on a cloud provider

Please read first prerequisites related to hostname requirements.

LambdaStack has the ability to set up a cluster on one of the following cloud providers:

  • AWS
  • Azure
  • GCP - WIP

Under the hood it uses Terraform to create the virtual infrastructure before it applies our Ansible playbooks to provision the VMs.

You need the following prerequisites:

  1. Access to one of the supported cloud providers, aws, azure or gcp.
  2. Adequate resources to deploy a cluster on the cloud provider.
  3. A set of SSH keys you provide.
  4. A provisioning machine that:
    • Has access to the SSH keys
    • Has LambdaStack running.

      Note. To run LambdaStack check the Prerequisites

To set up the cluster do the following steps from the provisioning machine:

  1. First generate a minimal data yaml file:

    lambdastack init -p aws/azure -n newcluster
    

    The provider flag should be either aws or azure and will tell LambdaStack to create a data config which contains the specifics for that cloud provider. If you want full control you can add the --full flag which will give you a config with all parts of a cluster that can be configured.

  2. Open the configuration file and set up the admin_user data:

    admin_user:
      key_path: id_rsa
      name: user_name
      path: # Dynamically built
    

    Here you should specify the path to the SSH keys and the admin user name which will be used by Ansible to provision the cluster machines.

    For AWS the admin name is already specified and depends on the Linux distro image you are using for the VMs:

    • Username for Ubuntu Server: ubuntu
    • Username for Redhat: ec2-user

    On Azure the name you specify will be configured as the admin name on the VMs.

    On GCP-WIP the name you specify will be configured as the admin name on the VMs.

  3. Set up the cloud specific data:

    To let Terraform access the cloud providers you need to set up some additional cloud configuration.

    AWS:

    cloud:
      region: us-east-1
      credentials:
        key: aws_key
        secret: aws_secret
      use_public_ips: false
      default_os_image: default
    

    The region lets you choose the most optimal place to deploy your cluster. The key and secret are needed by Terraform and can be generated in the AWS console. More information about that here

    Azure:

    cloud:
      region: East US
      subscription_name: Subscription_name
      use_service_principal: false
      use_public_ips: false
      default_os_image: default
    

    The region lets you choose the most optimal place to deploy your cluster. The subscription_name is the Azure subscription under which you want to deploy the cluster.

    Terraform will ask you to sign in to your Microsoft Azure subscription when it prepares to build/modify/destroy the infrastructure on Azure. In case you need to share cluster management with other people you can set the use_service_principal tag to true. This will create a service principal and use it to manage the resources.

    If you already have a service principal and don't want to create a new one you can do the following. Make sure the use_service_principal tag is set to true. Then before you run lambdastack apply -f yourcluster.yml create the following folder structure from the path you are running LambdaStack:

    /build/clustername/terraform
    

    Where the clustername is the name you specified under specification.name in your cluster yaml. Then in terraform folder add the file named sp.yml and fill it up with the service principal information like so:

    appId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
    displayName: "app-name"
    name: "http://app-name"
    password: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
    tenant: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
    subscriptionId: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx"
    

    LambdaStack will read this file and automatically use it for authentication for resource creation and management.
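
    If you still need to create the service principal itself, the Azure CLI can generate one with matching fields (a sketch; the app name is a placeholder and the exact output fields can differ between Azure CLI versions):

    # create a service principal; the output contains appId, password and tenant
    az ad sp create-for-rbac --name "app-name"
    # print the subscriptionId of the currently selected subscription
    az account show --query id --output tsv
    # create the expected folder structure and paste the values into sp.yml
    mkdir -p build/<clustername>/terraform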

    GCP-WIP:

    NOTE: GCP-WIP values may or may not be correct until official GCP release

    cloud:
      region: us-east-1
      credentials:
        key: gcp_key
        secret: gcp_secret
      use_public_ips: false
      default_os_image: default
    

    The region lets you choose the most optimal place to deploy your cluster. The key and secret are needed by Terraform and can be generated in the GCP console.

    For aws, azure and gcp the following cloud attributes overlap:

    • use_public_ips: When true, the VMs will also have a direct interface to the internet. While this is easy for setting up a cluster for testing, it should not be used in production. A VPN setup should be used which we will document in a different section (TODO).
    • default_os_image: Lets you more easily select LambdaStack team validated and tested OS images. When one is selected, it will be applied to every infrastructure/virtual-machine document in the cluster regardless of user defined ones. The following values are accepted:
      - default: Applies user defined infrastructure/virtual-machine documents when generating a new configuration.
      - ubuntu-18.04-x86_64: Applies the latest validated and tested Ubuntu 18.04 image to all infrastructure/virtual-machine documents on x86_64 on Azure and AWS.
      - redhat-7-x86_64: Applies the latest validated and tested RedHat 7.x image to all infrastructure/virtual-machine documents on x86_64 on Azure and AWS.
      - centos-7-x86_64: Applies the latest validated and tested CentOS 7.x image to all infrastructure/virtual-machine documents on x86_64 on Azure and AWS.
      - centos-7-arm64: Applies the latest validated and tested CentOS 7.x image to all infrastructure/virtual-machine documents on arm64 on AWS. Azure currently doesn't support arm64.
      The images used for these values will be updated and tested on a regular basis.
  4. Define the components you want to install:

    Under the components tag you will find a bunch of definitions like this one:

    kubernetes_master:
      count: 1
    

    The count specifies how many VMs you want to provision with this component. If you don't want to use a component you can set the count to 0.

    Note that for each cloud provider LambdaStack already has a default VM configuration for each component. If you need more control over the VMs, generate a config with the --full flag. Then each component will have an additional machine tag:

    kubernetes_master:
      count: 1
      machine: kubernetes-master-machine
      ...
    

    This links to an infrastructure/virtual-machine document which can be found inside the same configuration file. It gives you full control over the VM configuration (size, storage, provision image, security etc.). More details on this will be documented in a different section (TODO).

  5. Finally, start the deployment with:

    lambdastack apply -f newcluster.yml
    

Note for RHEL Azure images

LambdaStack currently supports RHEL 7 LVM partitioned images attached to standard RHEL repositories. For more details, refer to Azure documentation.

LambdaStack uses cloud-init custom data in order to merge small logical volumes (homelv, optlv, tmplv and varlv) into the rootlv and extend it (with the underlying filesystem) by the current free space in its volume group. The usrlv LV, which has 10G, is not merged since that would require a reboot. The merging is required to deploy a cluster; however, since it performs some administrative tasks (such as remounting filesystems or restarting services), it can be disabled for troubleshooting.

NOTE: RHEL 7 LVM images require at least 64 GB for OS disk.

Example config:

kind: infrastructure/virtual-machine
specification:
  storage_image_reference:
    publisher: RedHat
    offer: RHEL
    sku: "7-LVM"
    version: "7.9.2021051701"
  storage_os_disk:
    disk_size_gb: 64

Note for CentOS Azure images

LambdaStack supports CentOS 7 images with RAW partitioning (recommended) and LVM as well.

Example config:

kind: infrastructure/virtual-machine
specification:
  storage_image_reference:
    publisher: OpenLogic
    offer: CentOS
    sku: "7_9"
    version: "7.9.2021071900"

How to disable merging LVM logical volumes

In order to not merge logical volumes (for troubleshooting), use the following doc:

kind: infrastructure/cloud-init-custom-data
title: cloud-init user-data
provider: azure
name: default
specification:
  enabled: false

How to delete a LambdaStack cluster on a cloud provider

LambdaStack has a delete command to remove a cluster from a cloud provider (AWS, Azure). With LambdaStack run the following:

lambdastack delete -b /path/to/cluster/build/folder

From the defined cluster build folder it will take the information needed to remove the resources from the cloud provider.

Single machine cluster

Please read first prerequisites related to hostname requirements.


NOTE

Single machine cannot be scaled up or deployed alongside other types of cluster.


Sometimes it might be desirable to run a LambdaStack cluster on a single machine. For this purpose LambdaStack ships with a single_cluster component configuration. This cluster comes with the following main components:

  • kubernetes-master: Untainted so pods can be deployed on it
  • rabbitmq: Rabbitmq for messaging instead of Kafka
  • applications: For deploying the Keycloak authentication service
  • postgresql: To provide a database for Keycloak

Note that components like logging and monitoring are missing since they do not provide much benefit in a single machine scenario. Also, RabbitMQ is included over Kafka since that is much less resource intensive.

To get started with a single machine cluster you can use the following template as a base. Note that some configurations are omitted:

kind: lambdastack-cluster
title: LambdaStack Cluster Config
name: default
build_path: # Dynamically built
specification:
  prefix: dev
  name: single
  admin_user:
    name: operations
    key_path: id_rsa
    path: # Dynamically built
  cloud:
    ... # add other cloud configuration as needed
  components:
    kubernetes_master:
      count: 0
    kubernetes_node:
      count: 0
    logging:
      count: 0
    monitoring:
      count: 0
    kafka:
      count: 0
    postgresql:
      count: 0
    load_balancer:
      count: 0
    rabbitmq:
      count: 0
    ignite:
      count: 0
    opendistro_for_elasticsearch:
      count: 0
    single_machine:
      count: 1
---
kind: configuration/applications
title: "Kubernetes Applications Config"
name: default
specification:
  applications:
  - name: auth-service
    enabled: yes # set to yes to enable authentication service
    ... # add other authentication service configuration as needed

To create a single machine cluster using the "any" provider (with extra load_balancer config included) use the following template below:

kind: lambdastack-cluster
title: "LambdaStack Cluster Config"
provider: any
name: single
build_path: # Dynamically built
specification:
  name: single
  admin_user:
    name: ubuntu
    key_path: id_rsa
    path: # Dynamically built
  components:
    kubernetes_master:
      count: 0
    kubernetes_node:
      count: 0
    logging:
      count: 0
    monitoring:
      count: 0
    kafka:
      count: 0
    postgresql:
      count: 0
    load_balancer:
      count: 1
      configuration: default
      machines: [single-machine]
    rabbitmq:
      count: 0
    single_machine:
      count: 1
      configuration: default
      machines: [single-machine]
---
kind: configuration/haproxy
title: "HAProxy"
provider: any
name: default
specification:
  logs_max_days: 60
  self_signed_certificate_name: self-signed-fullchain.pem
  self_signed_private_key_name: self-signed-privkey.pem
  self_signed_concatenated_cert_name: self-signed-test.tld.pem
  haproxy_log_path: "/var/log/haproxy.log"

  stats:
    enable: true
    bind_address: 127.0.0.1:9000
    uri: "/haproxy?stats"
    user: operations
    password: your-haproxy-stats-pwd
  frontend:
    - name: https_front
      port: 443
      https: yes
      backend:
      - http_back1
  backend: # example backend config below
    - name: http_back1
      server_groups:
      - kubernetes_node
      # servers: # Definition for server to that hosts the application.
      # - name: "node1"
      #   address: "lambdastack-vm1.domain.com"
      port: 30104
---
kind: infrastructure/machine
provider: any
name: single-machine
specification:
  hostname: x1a1
  ip: 10.20.2.10

How to create custom cluster components

LambdaStack gives you the ability to define custom components. This allows you to define a custom set of roles for a component you want to use in your cluster. It can be useful when you for example want to maximize usage of the available machines you have at your disposal.

The first thing you will need to do is define it in the configuration/feature-mapping configuration. To get this configuration you can run the lambdastack init ... --full command. In the available_roles section you can see all the available roles that LambdaStack provides. The roles_mapping is where all the LambdaStack components are defined and where you need to add your custom components.

Below are parts of an example configuration/feature-mapping where we define a new single_machine_new component. We want to use Kafka instead of RabbitMQ and don't need applications and postgresql since we don't want a Keycloak deployment:

kind: configuration/feature-mapping
title: Feature mapping to roles
name: default
specification:
  available_roles: # All entries here represent the available roles within LambdaStack
  - name: repository
    enabled: yes
  - name: firewall
    enabled: yes
  - name: image-registry
  ...
  roles_mapping: # All entries here represent the default components provided with LambdaStack
  ...
    single_machine:
    - repository
    - image-registry
    - kubernetes-master
    - applications
    - rabbitmq
    - postgresql
    - firewall
    # Below is the new single_machine_new definition
    single_machine_new:
    - repository
    - image-registry
    - kubernetes-master
    - kafka
    - firewall
  ...

Once defined, the new single_machine_new component can be used inside the lambdastack-cluster configuration:

kind: lambdastack-cluster
title: LambdaStack Cluster Config
name: default
build_path: # Dynamically built
specification:
  prefix: new
  name: single
  admin_user:
    name: operations
    key_path: id_rsa
    path: # Dynamically built
  cloud:
    ... # add other cloud configuration as needed
  components:
    ... # other components as needed
    single_machine_new:
      count: x

Note: After defining a new component you might also need to define additional configurations for virtual machines and security rules depending on what you are trying to achieve.

How to scale or cluster components

Not all components support these actions. A number of known issues are referenced in the table below.

LambdaStack has the ability to automatically scale and cluster certain components on cloud providers (AWS, Azure). To upscale or downscale a component the count number must be increased or decreased:

components:
  kubernetes_node:
    count: ...
    ...

Then, when applying the changed configuration using LambdaStack, additional VMs will be spawned and configured, or existing ones removed. The following table shows which operations each component supports:

Component                    | Scale up | Scale down | HA  | Clustered | Known issues
-----------------------------|----------|------------|-----|-----------|--------------
Repository                   | yes      | yes        | no  | no        | ---
Monitoring                   | yes      | yes        | no  | no        | ---
Logging                      | yes      | yes        | no  | no        | ---
Kubernetes master            | yes      | no         | yes | yes       | #1579
Kubernetes node              | yes      | no         | yes | yes       | #1580
Ignite                       | yes      | yes        | yes | yes       | ---
Kafka                        | yes      | yes        | yes | yes       | ---
Load Balancer                | yes      | yes        | no  | no        | ---
Opendistro for elasticsearch | yes      | yes        | yes | yes       | ---
Postgresql                   | no       | no         | yes | yes       | #1577
RabbitMQ                     | yes      | yes        | no  | yes       | #1578, #1309
RabbitMQ K8s                 | yes      | yes        | no  | yes       | #1486
Keycloak K8s                 | yes      | yes        | yes | yes       | ---
Pgpool K8s                   | yes      | yes        | yes | yes       | ---
Pgbouncer K8s                | yes      | yes        | yes | yes       | ---
Ignite K8s                   | yes      | yes        | yes | yes       | ---

Additional notes:

  • Repository:
    In standard LambdaStack deployment only one repository machine is required.
    Scaling up the repository component will create a new standalone VM.
    Scaling down will remove it in LIFO order (Last In, First Out).
    However, even if you create more than one VM, by default all other components will use the first one.

  • Kubernetes master:
    Scaling up will set up additional control plane nodes, but in the case of a non-HA Kubernetes cluster the existing control plane node must be promoted first.
    Scaling down is not possible at the moment.

  • Kubernetes node:
    Scaling up will set up an additional node and join it to the Kubernetes cluster.
    Scaling down is not possible.

  • Load balancer:
    Scaling up the load_balancer component will create a new standalone VM.
    Scaling down will remove it in LIFO order (Last In, First Out).

  • Logging:
    Scaling up will create a new VM with both Kibana and ODFE components inside.
    ODFE will join the cluster but Kibana will be a standalone instance.
    Scaling down will delete the VM.

  • Monitoring:
    Scaling up the monitoring component will create a new standalone VM.
    Scaling down will remove it in LIFO order (Last In, First Out).

  • Postgresql:
    Scaling up is not supported at the moment. Check known issues.
    Scaling down is not supported at the moment. Check known issues.

  • RabbitMQ:
    If the instance count is changed, then additional RabbitMQ nodes will be added or removed.
    Scaling up will create a new VM and add it to the RabbitMQ cluster.
    Scaling down will, at the moment, just remove the VM. All data not yet processed on this VM will be purged. Check known issues.
    Note that clustering requires a change in the configuration/rabbitmq document:

    kind: configuration/rabbitmq
    ...
    specification:
      cluster:
        is_clustered: true
    ...
    
  • RabbitMQ K8s: Scaling is controlled via replicas in the StatefulSet. RabbitMQ on K8s uses the rabbitmq_peer_discovery_k8s plugin to work in a cluster.

Additional known issues:

  • #1574 - Disks are not removed after downscale of any LambdaStack component on Azure.

Multi master cluster

LambdaStack can deploy HA Kubernetes clusters (since v0.6). To achieve that, it is required that:

  • the master count must be higher than 1 (proper values should be 1, 3, 5, 7):

    kubernetes_master:
      count: 3
    
  • the HA mode must be enabled in configuration/shared-config:

    kind: configuration/shared-config
    ...
    specification:
      use_ha_control_plane: true
      promote_to_ha: false
    
  • the regular lambdastack apply cycle must be executed

LambdaStack can promote / convert older single-master clusters to HA mode (since v0.6). To achieve that, it is required that:

  • the existing cluster is legacy single-master cluster

  • the existing cluster has been upgraded to Kubernetes 1.17 or above first

  • the HA mode and HA promotion must be enabled in configuration/shared-config:

    kind: configuration/shared-config
    ...
    specification:
      use_ha_control_plane: true
      promote_to_ha: true
    
  • the regular lambdastack apply cycle must be executed

  • since it is one-time operation, after successful promotion, the HA promotion must be disabled in the config:

    kind: configuration/shared-config
    ...
    specification:
      use_ha_control_plane: true
      promote_to_ha: false
    

Note: It is not supported yet to reverse HA promotion.

LambdaStack can scale-up existing HA clusters (including ones that were promoted). To achieve that, it is required that:

  • the existing cluster must be already running in HA mode

  • the master count must be higher than previous value (proper values should be 3, 5, 7):

    kubernetes_master:
      count: 5
    
  • the HA mode must be enabled in configuration/shared-config:

    kind: configuration/shared-config
    ...
    specification:
      use_ha_control_plane: true
      promote_to_ha: false
    
  • the regular lambdastack apply cycle must be executed

Note: It is not supported yet to scale-down clusters (master count cannot be decreased).

Build artifacts

The LambdaStack engine produces build artifacts during each deployment. Those artifacts contain:

  • Generated terraform files.
  • Generated terraform state files.
  • Generated cluster manifest file.
  • Generated ansible files.
  • Azure login credentials for service principal if deploying to Azure.

Artifacts contain sensitive data, so it is important to keep them in a safe place like a private Git repository or storage with limited access. The generated build is also needed in case of scaling or updating the cluster - you will need it in the build folder in order to edit your cluster.

LambdaStack creates (or uses, if you specified not to create one) a service principal account which can manage all resources in the subscription, so please store the build artifacts securely.

Kafka replication and partition setting

When planning a Kafka installation you have to think about the number of partitions and replicas since it is strongly related to Kafka's throughput and reliability. By default, Kafka's replicas number is set to 1 - you should change it in core/src/ansible/roles/kafka/defaults in order to have partitions replicated to many virtual machines.

  ...
  replicas: 1 # Default to at least 1 (1 broker)
  partitions: 8 # 100 x brokers x replicas for reasonable size cluster. Small clusters can be less
  ...

You can read more here about planning the number of partitions.

NOTE: LambdaStack does not use Confluent. The above reference is simply for documentation.
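
To illustrate how the replication factor and partition count apply per topic, a topic could also be created manually with Kafka's own CLI tools (a sketch; the broker address and topic name are placeholders, and the exact flags depend on your Kafka version):

# create a topic with 8 partitions replicated across 3 brokers
kafka-topics.sh --create --bootstrap-server localhost:9092 --topic example-topic --partitions 8 --replication-factor 3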

RabbitMQ installation and setting

To install RabbitMQ in single mode just add the rabbitmq role to your data.yaml for your server and in the general roles section. All configuration of RabbitMQ, e.g. creating users other than guest, should be performed manually.

How to use Azure availability sets

In your cluster yaml config, declare as many objects of kind infrastructure/availability-set as required, like in the example below, and change the name field as you wish.

---
kind: infrastructure/availability-set
name: kube-node  # Short and simple name is preferred
specification:
# The "name" attribute is generated automatically according to LambdaStack's naming conventions
  platform_fault_domain_count: 2
  platform_update_domain_count: 5
  managed: true
provider: azure

Then set it also in the corresponding components section of the kind: lambdastack-cluster doc.

  components:
    kafka:
      count: 0
    kubernetes_master:
      count: 1
    kubernetes_node:
# This line tells LambdaStack to generate the availability-set terraform template
      availability_set: kube-node  # Short and simple name is preferred
      count: 2

The example below shows a complete configuration. Note that it's recommended to have a dedicated availability set for each clustered component.

# Test availability set config
---
kind: lambdastack-cluster
name: default
provider: azure
build_path: # Dynamically built
specification:
  name: test-cluster
  prefix: test
  admin_user:
    key_path: id_rsa
    name: di-dev
    path: # Dynamically built
  cloud:
    region: Australia East
    subscription_name: <your subscription name>
    use_public_ips: true
    use_service_principal: true
  components:
    kafka:
      count: 0
    kubernetes_master:
      count: 1
    kubernetes_node:
# This line tells LambdaStack to generate the availability-set terraform template
      availability_set: kube-node  # Short and simple name is preferred
      count: 2
    load_balancer:
      count: 1
    logging:
      count: 0
    monitoring:
      count: 0
    postgresql:
# This line tells LambdaStack to generate the availability-set terraform template
      availability_set: postgresql  # Short and simple name is preferred
      count: 2
    rabbitmq:
      count: 0
title: LambdaStack Cluster Config
---
kind: infrastructure/availability-set
name: kube-node  # Short and simple name is preferred
specification:
# The "name" attribute (omitted here) is generated automatically according to LambdaStack's naming conventions
  platform_fault_domain_count: 2
  platform_update_domain_count: 5
  managed: true
provider: azure
---
kind: infrastructure/availability-set
name: postgresql  # Short and simple name is preferred
specification:
# The "name" attribute (omitted here) is generated automatically according to LambdaStack's naming conventions
  platform_fault_domain_count: 2
  platform_update_domain_count: 5
  managed: true
provider: azure

Downloading offline requirements with a Docker container

This paragraph describes how to use a Docker container to download the requirements for air-gapped/offline installations. At this time we don't officially support this, and we still recommend using a full distribution which is the same as the air-gapped cluster machines/VMs.

A few points:

  • This only describes how to set up the Docker containers for downloading. The rest of the steps are similar to those in the paragraph here.
  • The main reason why you might want to give this a try is to download arm64 architecture requirements on an x86_64 machine. More information on the current state of arm64 support can be found here.

Ubuntu 18.04

For Ubuntu, you can use the following command to launch a container:

docker run -v /shared_folder:/home <--platform linux/amd64 or --platform linux/arm64> --rm -it ubuntu:18.04

As the ubuntu:18.04 image is multi-arch you can include --platform linux/amd64 or --platform linux/arm64 to run the container as the specified architecture. The /shared_folder should be a folder on your local machine containing the required scripts.

When you are inside the container run the following commands to prepare for the running of the download-requirements.sh script:

apt-get update # update the package manager
apt-get install sudo # install sudo so we can make the download-requirements.sh executable and run it as root
sudo chmod +x /home/download-requirements.sh # make the requirements script executable

After this you should be able to run the download-requirements.sh from the home folder.
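
Assuming the script keeps the signature shown earlier in this document, a run inside the container could look like this (the output folder is a placeholder and ends up in the shared folder on your host):

sudo /home/download-requirements.sh /home/requirements-output/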

RedHat 7.x

For RedHat you can use the following command to launch a container:

docker run -v /shared_folder:/home <--platform linux/amd64 or --platform linux/arm64> --rm -it registry.access.redhat.com/ubi7/ubi:7.9

As the registry.access.redhat.com/ubi7/ubi:7.9 image is multi-arch you can include --platform linux/amd64 or --platform linux/arm64 to run the container as the specified architecture. The /shared_folder should be a folder on your local machine containing the requirement scripts.

For running the download-requirements.sh script you will need a RedHat developer subscription to register the running container and make sure you can access the official RedHat repositories for the packages needed. More information on getting this free subscription here.

When you are inside the container run the following commands to prepare for the running of the download-requirements.sh script:

subscription-manager register # will ask for the credentials of your RedHat developer subscription and set up the container
subscription-manager attach --auto # will enable the RedHat official repositories
chmod +x /home/download-requirements.sh # make the requirements script executable

After this you should be able to run the download-requirements.sh from the home folder.

CentOS 7.x

For CentOS, you can use the following command to launch a container:

arm64:

docker run -v /shared_folder:/home --platform linux/arm64 --rm -it arm64v8/centos:7.9.2009

x86_64:

docker run -v /shared_folder:/home --platform linux/amd64 --rm -it amd64/centos:7.9.2009

The /shared_folder should be a folder on your local machine containing the requirement scripts.

When you are inside the container run the following commands to prepare for the running of the download-requirements.sh script:

chmod +x /home/download-requirements.sh # make the requirements script executable

After this you should be able to run the download-requirements.sh from the home folder.

3.2.3 - Configuration

LambdaStack how-tos - Configuration

Configuration file

Named lists

LambdaStack uses a concept called named lists in the configuration YAML. Every item in a named list has the name key to identify it and make it unique for merge operation:

...
  list:
  - name: item1
    property1: value1
    property2: value2
  - name: item2
    property1: value3
    property2: value4
...

By default, a named list in your configuration file will completely overwrite the defaults that LambdaStack provides. This behaviour is intentional: when you, for example, define a list of users for Kafka inside your configuration, it completely overwrites the users defined in the Kafka defaults.

In some cases, however, you don't want to overwrite a named list. A good example would be the application configurations.

You don't want to re-define every item just to make sure LambdaStack has all default items needed by the Ansible automation. That is where the _merge metadata tag comes in. It will let you define whether you want to overwrite or merge a named list by setting it to true or false.

For example you want to enable the auth-service application. Instead of defining the whole configuration/applications configuration you can do the following:

kind: configuration/applications
title: "Kubernetes Applications Config"
name: default
provider: azure
specification:
  applications:
  - _merge: true
  - name: auth-service
    enabled: true

The _merge item set to true will tell LambdaStack to merge the application list and only change the enabled: true setting inside the auth-service entry, taking the rest of the configuration/applications configuration from the defaults.

3.2.4 - Databases

LambdaStack how-tos - Databases

How to configure PostgreSQL

To configure PostgreSQL, log in to the server using SSH and switch to the postgres user with the command:

sudo -u postgres -i

Then configure the database server using psql according to your needs and the PostgreSQL documentation.
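
For example, an application database and user could be created like this (plain PostgreSQL administration shown only as a sketch; names and the password are placeholders):

# run as the postgres user on the database server
psql -c "CREATE USER app_user WITH PASSWORD 'PASSWORD_TO_CHANGE';"
psql -c "CREATE DATABASE app_db OWNER app_user;"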

PostgreSQL passwords encryption

LambdaStack sets up MD5 password encryption. Although PostgreSQL since version 10 is able to use SCRAM-SHA-256 password encryption, LambdaStack does not support this encryption method because the recommended production configuration uses more than one database host in an HA configuration (repmgr) cooperating with PgBouncer and Pgpool. Pgpool is not able to parse a list of SCRAM-SHA-256 hashes while this encryption is enabled. Due to limited Pgpool authentication options, it is not possible to refresh the pool_passwd file automatically. For this reason, MD5 password encryption is set up and this is not configurable in LambdaStack.

How to set up PostgreSQL connection pooling

PostgreSQL connection pooling in LambdaStack is served by PgBouncer application. It is available as Kubernetes ClusterIP or standalone package. The Kubernetes based installation works together with PgPool so it supports PostgreSQL HA setup. The standalone installation (described below) is deprecated and will be removed in the next release.


NOTE

PgBouncer extension is not supported on ARM.


PgBouncer is installed only on the PostgreSQL primary node. It needs to be enabled in the configuration yaml file:

kind: configuration/postgresql
specification:
  extensions:
    ...
    pgbouncer:
      enabled: yes
    ...

PgBouncer listens on the standard port 6432. The basic configuration is just a template with very limited access to the database, for security reasons. The configuration needs to be tailored according to the component documentation, sticking to security rules and best practices.
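
A quick connectivity check against PgBouncer could look like the sketch below (the host, user and database names are placeholders and must match your tailored configuration):

# connect through PgBouncer on the PostgreSQL primary node (port 6432)
psql -h <postgresql-primary-node> -p 6432 -U app_user -d app_db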

How to set up PostgreSQL HA replication with repmgr cluster


NOTE 1

Replication (repmgr) extension is not supported on ARM.



NOTE 2

Changing the number of PostgreSQL nodes is not supported by LambdaStack after the first apply. Before cluster deployment, think over what kind of configuration you need and how many PostgreSQL nodes will be needed.


This component can be used as a part of PostgreSQL clustering configured by LambdaStack. In order to configure PostgreSQL HA replication, add a block similar to the one below to the core section of your configuration file:

---
kind: configuration/postgresql
name: default
title: PostgreSQL
specification:
  config_file:
    parameter_groups:
      ...
      # This block is optional, you can use it to override default values
    - name: REPLICATION
      subgroups:
      - name: Sending Server(s)
        parameters:
        - name: max_wal_senders
          value: 10
          comment: maximum number of simultaneously running WAL sender processes
          when: replication
        - name: wal_keep_size
          value: 500
          comment: the size of WAL files held for standby servers (MB)
          when: replication
      - name: Standby Servers
        parameters:
        - name: hot_standby
          value: 'on'
          comment: must be 'on' for repmgr needs, ignored on primary but recommended
            in case primary becomes standby
          when: replication
  extensions:
    ...
    replication:
      enabled: true
      replication_user_name: ls_repmgr
      replication_user_password: PASSWORD_TO_CHANGE
      privileged_user_name: ls_repmgr_admin
      privileged_user_password: PASSWORD_TO_CHANGE
      repmgr_database: ls_repmgr
      shared_preload_libraries:
      - repmgr
    ...

If enabled is set to true for the replication extension, LambdaStack will automatically create a cluster of a primary and a secondary server with a replication user with the name and password specified in the configuration file. This is only possible for configurations containing two PostgreSQL servers.

The privileged user is used to perform a full backup of the primary instance and initially replicate it to the secondary node. After that, only the replication user with limited permissions is used for WAL replication.

How to stop PostgreSQL service in HA cluster

Sometimes the PostgreSQL service needs to be stopped for maintenance work. Before this action the repmgr service needs to be paused, see the manual page first. When the repmgr service is paused, the steps from the PostgreSQL manual page may be applied, or the service may be stopped as a regular systemd service.
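
A possible maintenance sequence is sketched below (the systemd unit name depends on the distribution and PostgreSQL version, and the repmgr commands may additionally require the -f <repmgr.conf> option):

# pause repmgrd so it does not react to the planned outage
repmgr service pause
# stop and later start PostgreSQL as a regular systemd service (unit name is distribution specific)
sudo systemctl stop postgresql-10
# ... perform maintenance work ...
sudo systemctl start postgresql-10
# resume repmgrd monitoring
repmgr service unpause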

How to register database standby in repmgr cluster

If one of the database nodes has been recovered to the desired state, you may want to re-attach it to the database cluster. Execute these steps on the node which will be attached as standby:

  1. Clone data from the current primary node:
     repmgr standby clone -h CURRENT_PRIMARY_ADDRESS -U ls_repmgr_admin -d ls_repmgr --force
  2. Register the node as standby:
     repmgr standby register

You may use the --force option if the node was registered in the cluster before. For more options, see the repmgr manual: https://repmgr.org/docs/5.2/repmgr-standby-register.html

How to switchover database nodes

At some point you may want to switch over the database nodes (promote the standby to primary and demote the existing primary to standby).

  1. Configure passwordless SSH communication for postgres user between database nodes.

  2. Test and run initial login between nodes to authenticate host (if host authentication is enabled).

Execute the commands listed below on the actual standby node:

  3. Confirm that the standby you want to promote is registered in the repmgr cluster:
     repmgr cluster show
  4. Run the switchover:
     repmgr standby switchover
  5. Run the command from step 3 again and check the status. For more details or troubleshooting, see the repmgr manual: https://repmgr.org/docs/5.2/repmgr-standby-switchover.html

How to set up PgBouncer, PgPool and PostgreSQL parameters

This section describes how to set up connection pooling and load balancing for highly available PostgreSQL cluster. The default configuration provided by LambdaStack is meant for midrange class systems but can be customized to scale up or to improve performance.

To adjust the configuration to your needs, you can refer to the following documentation:

Component Documentation URL
PgBouncer https://www.pgbouncer.org/config.html
PgPool: Performance Considerations https://www.pgpool.net/docs/41/en/html/performance.html
PgPool: Server Configuration https://www.pgpool.net/docs/41/en/html/runtime-config.html
PostgreSQL: connections https://www.postgresql.org/docs/10/runtime-config-connection.html
PostgreSQL: resources management https://www.postgresql.org/docs/10/runtime-config-resource.html

Installing PgBouncer and PgPool


NOTE

PgBouncer and PgPool Docker images are not supported for ARM. If these applications are enabled in configuration, installation will fail.


PgBouncer and PgPool are provided as K8s deployments. By default, they are not installed. To deploy them you need to add configuration/applications document to your configuration yaml file, similar to the example below (enabled flags must be set as true):

---
kind: configuration/applications
version: 1.2.0
title: "Kubernetes Applications Config"
provider: aws
name: default
specification:
  applications:
  ...

## --- pgpool ---

  - name: pgpool
    enabled: true
    ...
    namespace: postgres-pool
    service:
      name: pgpool
      port: 5432
    replicas: 3
    ...
    resources: # Adjust to your configuration, see https://www.pgpool.net/docs/42/en/html/resource-requiremente.html
      limits:
        # cpu: 900m # Set according to your env
        memory: 310Mi
      requests:
        cpu: 250m # Adjust to your env, increase if possible
        memory: 310Mi
    pgpool:
      # https://github.com/bitnami/bitnami-docker-pgpool#configuration + https://github.com/bitnami/bitnami-docker-pgpool#environment-variables
      env:
        PGPOOL_BACKEND_NODES: autoconfigured # you can use custom value like '0:pg-node-1:5432,1:pg-node-2:5432'
        # Postgres users
        PGPOOL_POSTGRES_USERNAME: ls_pgpool_postgres_admin # with SUPERUSER role to use connection slots reserved for superusers for K8s liveness probes, also for user synchronization
        PGPOOL_SR_CHECK_USER: ls_pgpool_sr_check # with pg_monitor role, for streaming replication checks and health checks
        # ---
        PGPOOL_ADMIN_USERNAME: ls_pgpool_admin # Pgpool administrator (local pcp user)
        PGPOOL_ENABLE_LOAD_BALANCING: false # set to 'false' if there is no replication
        PGPOOL_MAX_POOL: 4
        PGPOOL_CHILD_LIFE_TIME: 300
        PGPOOL_POSTGRES_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_postgres_password
        PGPOOL_SR_CHECK_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_sr_check_password
        PGPOOL_ADMIN_PASSWORD_FILE: /opt/bitnami/pgpool/secrets/pgpool_admin_password
      secrets:
        pgpool_postgres_password: PASSWORD_TO_CHANGE
        pgpool_sr_check_password: PASSWORD_TO_CHANGE
        pgpool_admin_password: PASSWORD_TO_CHANGE
      # https://www.pgpool.net/docs/42/en/html/runtime-config.html
      pgpool_conf_content_to_append: |
        #------------------------------------------------------------------------------
        # CUSTOM SETTINGS (appended by LambdaStack to override defaults)
        #------------------------------------------------------------------------------
        # num_init_children = 32
        connection_life_time = 600
        reserved_connections = 1        
      # https://www.pgpool.net/docs/41/en/html/auth-pool-hba-conf.html
      pool_hba_conf: autoconfigured

## --- pgbouncer ---

  - name: pgbouncer
    enabled: true
    ...
    namespace: postgres-pool
    service:
      name: pgbouncer
      port: 5432
    replicas: 2
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 128Mi
    pgbouncer:
      env:
        DB_HOST: pgpool.postgres-pool.svc.cluster.local
        DB_LISTEN_PORT: 5432
        MAX_CLIENT_CONN: 150
        DEFAULT_POOL_SIZE: 25
        RESERVE_POOL_SIZE: 25
        POOL_MODE: session
        CLIENT_IDLE_TIMEOUT: 0

Default setup - main parameters

This chapter describes the default setup and main parameters responsible for the performance limitations. The limitations can be divided into 3 layers: resource usage, connection limits and query caching. All the configuration parameters can be modified in the configuration yaml file.

Resource usage

Each of the components has hardware requirements that depend on its configuration, in particular on the number of allowed connections.

PgBouncer
replicas: 2
resources:
  requests:
    cpu: 250m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 128Mi
PgPool
replicas: 3
resources: # Adjust to your configuration, see https://www.pgpool.net/docs/41/en/html/resource-requiremente.html
  limits:
    # cpu: 900m # Set according to your env
    memory: 310Mi
  requests:
    cpu: 250m # Adjust to your env, increase if possible
    memory: 310Mi

By default, each PgPool pod requires 176 MB of memory. This value has been determined based on the PgPool docs, however after stress testing we needed to add several extra megabytes to avoid the "failed to fork a child" issue. You may need to adjust resources after changing the num_init_children or max_pool (PGPOOL_MAX_POOL) settings. Such changes should be synchronized with the PostgreSQL and PgBouncer configuration.

PostgreSQL

Memory related parameters have PostgreSQL default values. If your setup requires performance improvements, you may consider changing values of the following parameters:

  • shared_buffers
  • work_mem
  • maintenance_work_mem
  • effective_cache_size
  • temp_buffers

The default settings can be overridden by LambdaStack using configuration/postgresql doc in the configuration yaml file.

Connection limits

PgBouncer

There are connection limitations defined in PgBouncer configuration. Each of these parameters is defined per PgBouncer instance (pod). For example, having 2 pods (with MAX_CLIENT_CONN = 150) allows for up to 300 client connections.

    pgbouncer:
      env:
        ...
        MAX_CLIENT_CONN: 150
        DEFAULT_POOL_SIZE: 25
        RESERVE_POOL_SIZE: 25
        POOL_MODE: session
        CLIENT_IDLE_TIMEOUT: 0

By default, POOL_MODE is set to session to be transparent to the PgBouncer client. This section should be adjusted depending on your desired configuration. Rotating connection modes are well described in the official PgBouncer documentation.
If your client application doesn't manage sessions you can use CLIENT_IDLE_TIMEOUT to force a session timeout.

PgPool

By default, PgPool service is configured to handle up to 93 active concurrent connections to PostgreSQL (3 pods x 31). This is because of the following settings:

num_init_children = 32
reserved_connections = 1

Each pod can handle up to 32 concurrent connections but one is reserved. This means that the 32nd connection from a client will be refused. Keep in mind that canceling a query creates another connection to PostgreSQL, thus, a query cannot be canceled if all the connections are in use. Furthermore, for each pod, one connection slot must be available for K8s health checks. Hence, the real number of available concurrent connections is 30 per pod.

If you need more active concurrent connections, you can increase the number of pods (replicas), but the total number of allowed concurrent connections should not exceed the value defined by PostgreSQL parameters: (max_connections - superuser_reserved_connections).

In order to change PgPool settings (defined in pgpool.conf), you can edit pgpool_conf_content_to_append section:

pgpool_conf_content_to_append: |
  #------------------------------------------------------------------------------
  # CUSTOM SETTINGS (appended by LambdaStack to override defaults)
  #------------------------------------------------------------------------------
  connection_life_time = 900
  reserved_connections = 1  

The content of pgpool.conf file is stored in K8s pgpool-config-files ConfigMap.

For detailed information about connection tuning, see "Performance Considerations" chapter in PgPool documentation.

PostgreSQL

PostgreSQL uses the max_connections parameter to limit the number of client connections to the database server. The default is typically 100 connections. Generally, PostgreSQL on a sufficient amount of hardware can support a few hundred connections.
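
To check the configured limit and the current usage you can query the server directly, for example (standard PostgreSQL statements, shown as a sketch):

# run as the postgres user
psql -c "SHOW max_connections;"
psql -c "SELECT count(*) AS active_connections FROM pg_stat_activity;"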

Query caching

Query caching is not available in PgBouncer.

PgPool

Query caching is disabled by default in PgPool configuration.

PostgreSQL

PostgreSQL is installed with default settings.

How to set up PostgreSQL audit logging

Audit logging of database activities is available through the PostgreSQL Audit Extension: PgAudit. It provides session and/or object audit logging via the standard PostgreSQL log.

PgAudit may generate a large volume of logging, which has an impact on performance and log storage. For this reason, PgAudit is not enabled by default.

To install and configure PgAudit, add to your configuration yaml file a doc similar to the following:

kind: configuration/postgresql
title: PostgreSQL
name: default
provider: aws
version: 1.0.0
specification:
  extensions:
    pgaudit:
      enabled: yes
      config_file_parameters:
        ## postgresql standard
        log_connections: 'off'
        log_disconnections: 'off'
        log_statement: 'none'
        log_line_prefix: "'%m [%p] %q%u@%d,host=%h '"
        ## pgaudit specific, see https://github.com/pgaudit/pgaudit/blob/REL_10_STABLE/README.md#settings
        pgaudit.log: "'write, function, role, ddl' # 'misc_set' is not supported for PG 10"
        pgaudit.log_catalog: 'off # to reduce overhead of logging'
        # the following first 2 parameters are set to values that make it easier to access audit log per table
        # change their values to the opposite if you need to reduce overhead of logging
        pgaudit.log_relation: 'on # separate log entry for each relation'
        pgaudit.log_statement_once: 'off'
        pgaudit.log_parameter: 'on'

If the enabled property for the PgAudit extension is set to yes, LambdaStack will install the PgAudit package and add the PgAudit extension to be loaded in shared_preload_libraries. Settings defined in the config_file_parameters section are populated into the LambdaStack managed PostgreSQL configuration file. Using this section, you can also set any additional parameter if needed (e.g. pgaudit.role), but keep in mind that these settings are global.

To configure PgAudit according to your needs, see PgAudit documentation.

Once the LambdaStack installation is complete, there is one manual action at database level (per database). Connect to your database using a client (like psql) and load the PgAudit extension into the current database by running:

CREATE EXTENSION pgaudit;

To remove the extension from database, run:

DROP EXTENSION IF EXISTS pgaudit;

How to work with PostgreSQL connection pooling

PostgreSQL connection pooling is described in the design documentation page. To benefit from the fully HA configuration, a properly configured application (Kubernetes service) should be set up to connect to the pgbouncer service (Kubernetes) instead of directly to a database host. This configuration provides all the benefits of using PostgreSQL in clustered HA mode (including database failover). Both pgbouncer and pgpool store database users and passwords in configuration files and need to be restarted (pods) in case of PostgreSQL authentication changes such as creating a user or altering a username or password. During the restart process the pods refresh the stored database credentials automatically.
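
For example, an application running in Kubernetes would point its connection at the PgBouncer service rather than at a database host (a sketch; the service name and namespace follow the configuration/applications example earlier in this document, while the user and database names are placeholders):

# in-cluster connection targeting the PgBouncer Kubernetes service
psql "host=pgbouncer.postgres-pool.svc.cluster.local port=5432 user=app_user dbname=app_db"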

How to configure PostgreSQL replication

Note

PostgreSQL native replication is now deprecated and removed. Use PostgreSQL HA replication with repmgr instead.

How to start working with OpenDistro for Elasticsearch

OpenDistro for Elasticsearch is an Apache 2.0-licensed distribution of Elasticsearch enhanced with enterprise security, alerting and SQL. In order to start working with OpenDistro, change the machines count to a value greater than 0 in your cluster configuration:

kind: lambdastack-cluster
...
specification:
  ...
  components:
    kubernetes_master:
      count: 1
      machine: aws-kb-masterofpuppets
    kubernetes_node:
      count: 0
    ...
    logging:
      count: 1
    opendistro_for_elasticsearch:
      count: 2

An installation with more than one node will always be clustered - configuring a non-clustered installation of more than one node for Open Distro is not supported.

kind: configuration/opendistro-for-elasticsearch
title: OpenDistro for Elasticsearch Config
name: default
specification:
  cluster_name: LambdaStackElastic

By default, Kibana is deployed only for the logging component. If you want to deploy Kibana for opendistro_for_elasticsearch, you have to modify the feature mapping. Use the configuration below in your manifest.

kind: configuration/feature-mapping
title: "Feature mapping to roles"
name: default
specification:
  roles_mapping:
    opendistro_for_elasticsearch:
      - opendistro-for-elasticsearch
      - node-exporter
      - filebeat
      - firewall
      - kibana

Filebeat running on opendistro_for_elasticsearch hosts will always point to the centralized logging hosts (./LOGGING.md).
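
Once the cluster is up, a quick way to confirm that the nodes formed a single cluster is to query the Elasticsearch health endpoint from one of the opendistro_for_elasticsearch hosts. This is a sketch assuming the default HTTPS port 9200; replace the credentials with the ones configured for your installation:

# Expect "number_of_nodes" to match the machine count and "cluster_name" to be LambdaStackElastic
curl -k -u <admin_user>:<admin_password> "https://localhost:9200/_cluster/health?pretty"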

How to start working with Apache Ignite Stateful setup

Apache Ignite can be installed in LambdaStack if the count property for the ignite feature is greater than 0. Example:

kind: lambdastack-cluster
specification:
  components:
    load_balancer:
      count: 1
    ignite:
      count: 2
    rabbitmq:
      count: 0
    ...

A configuration like the one in this example will create virtual machines with an Apache Ignite cluster installed. It is possible to modify the configuration for Apache Ignite and the plugins used:

kind: configuration/ignite
title: "Apache Ignite stateful installation"
name: default
specification:
  version: 2.7.6
  file_name: apache-ignite-2.7.6-bin.zip
  enabled_plugins:
  - ignite-rest-http
  config: |
    <?xml version="1.0" encoding="UTF-8"?>

    <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements.  See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License.  You may obtain a copy of the License at
          http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->

    <beans xmlns="http://www.springframework.org/schema/beans"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="
          http://www.springframework.org/schema/beans
          http://www.springframework.org/schema/beans/spring-beans.xsd">

        <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
          <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
              <!-- Set the page size to 4 KB -->
              <property name="pageSize" value="#{4 * 1024}"/>
              <!--
              Sets a path to the root directory where data and indexes are
              to be persisted. It's assumed the directory is on a separated SSD.
              -->
              <property name="storagePath" value="/var/lib/ignite/persistence"/>

              <!--
                  Sets a path to the directory where WAL is stored.
                  It's assumed the directory is on a separated HDD.
              -->
              <property name="walPath" value="/wal"/>

              <!--
                  Sets a path to the directory where WAL archive is stored.
                  The directory is on the same HDD as the WAL.
              -->
              <property name="walArchivePath" value="/wal/archive"/>
            </bean>
          </property>

          <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
              <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                  <property name="addresses">
                  IP_LIST_PLACEHOLDER
                  </property>
                </bean>
              </property>
            </bean>
          </property>
        </bean>
    </beans>    

The enabled_plugins property contains a list of plugin names that will be enabled. The config property contains the XML configuration for Apache Ignite. The important placeholder variable is IP_LIST_PLACEHOLDER, which will be replaced by the automation with the list of Apache Ignite nodes used for self-discovery.
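
With the ignite-rest-http plugin enabled, a basic liveness check can be done against any Ignite VM over its REST API. This is a sketch assuming the default REST port 8080 and network access to the node address (a placeholder below):

# Returns a JSON document with the Ignite version if the node is up
curl "http://<ignite_node_address>:8080/ignite?cmd=version"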

How to start working with Apache Ignite Stateless setup

The stateless setup of Apache Ignite is done using Kubernetes deployments. This setup uses LambdaStack's standard applications feature (similar to auth-service, rabbitmq). To enable a stateless Ignite deployment, use the following document:

kind: configuration/applications
title: "Kubernetes Applications Config"
name: default
specification:
  applications:
  - name: ignite-stateless
    image_path: "lambdastack/ignite:2.9.1" # it will be part of the image path: {{local_repository}}/{{image_path}}
    namespace: ignite
    service:
      rest_nodeport: 32300
      sql_nodeport: 32301
      thinclients_nodeport: 32302
    replicas: 1
    enabled_plugins:
    - ignite-kubernetes # required to work on K8s
    - ignite-rest-http

Adjust this config to your requirements, setting the number of replicas and the plugins that should be enabled.
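
After the deployment is applied, the REST API is exposed through the rest_nodeport defined above, so a quick topology check could look like this sketch (the Kubernetes node address is a placeholder):

# Lists the Ignite cluster topology; with replicas: 1 you should see a single server node
curl "http://<k8s_node_address>:32300/ignite?cmd=top"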

3.2.5 - Helm

LambdaStack how-tos - Helm

Helm "system" chart repository

LambdaStack provides a Helm repository for internal usage inside our Ansible codebase. Currently only the "system" repository is available, and it's not designed to be used by regular users. In fact, regular users must not reuse it for any purpose.

LambdaStack developers can find it at roles/helm_charts/files/system. To add a chart to the repository, it's enough to put the unarchived chart directory tree inside that location (in a separate directory) and re-run epcli apply.

When the repository Ansible role is run, it copies all unarchived charts to the repository host, creates the Helm repository (index.yaml) and serves all these files from the Apache HTTP server.

Installing Helm charts from the "system" repository

LambdaStack developers can reuse the "system" repository from any place inside the Ansible codebase. Moreover, it's the responsibility of a particular role to call the helm upgrade --install command.

There is a helper task file that can be reused for this purpose: roles/helm/tasks/install-system-release.yml. It's only responsible for installing already existing "system" Helm charts from the "system" repository.

This helper task expects the following parameters/facts:

- set_fact:
    helm_chart_name: <string>
    helm_chart_values: <map>
    helm_release_name: <string>

- helm_chart_values is a standard YAML map; the values defined there replace the chart's default config (values.yaml).

Our standard practice is to place those values inside the specification document of the role that deploys the Helm release in Kubernetes.

Example config:

kind: configuration/<mykind-used-by-myrole>
name: default
specification:
  helm_chart_name: mychart
  helm_release_name: myrelease
  helm_chart_values:
    service:
      port: 8080
    nameOverride: mychart_custom_name

Example usage:

- name: Mychart
  include_role:
    name: helm
    tasks_from: install-system-release.yml
  vars:
    helm_chart_name: "{{ specification.helm_chart_name }}"
    helm_release_name: "{{ specification.helm_release_name }}"
    helm_chart_values: "{{ specification.helm_chart_values }}"

By default all installed "system" Helm releases are deployed inside the ls-charts namespace in Kubernetes.
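
To inspect what was installed, you can list the releases in that namespace from a host with helm/kubectl access. This is a minimal check, not an official LambdaStack command:

# Show all "system" Helm releases and their pods
helm list --namespace ls-charts
kubectl get pods --namespace ls-charts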

Uninstalling "system" Helm releases

To uninstall a Helm release, roles/helm/tasks/delete-system-release.yml can be used. For example:

- include_role:
    name: helm
    tasks_from: delete-system-release.yml
  vars:
    helm_release_name: myrelease

3.2.6 - Istio

LambdaStack how-tos - Istio

Istio

Istio is an open source platform that lets you run a service mesh for a distributed microservice architecture. It allows you to connect, manage and secure connections between microservices, and brings features such as load balancing, monitoring and service-to-service authentication without any changes to service code. Read more about Istio here.

Installing Istio

Istio in LambdaStack is provided as a K8s application. By default, it is not installed. To deploy it, you need to add a "configuration/applications" document to your configuration yaml file, similar to the example below (the enabled flag must be set to true):

Istio is installed using the Istio Operator. The Operator is a software extension to the Kubernetes API which has deep knowledge of how Istio deployments should look and how to react if any problem appears. It also makes it easy to perform upgrades and to automate tasks that would normally be executed by a user/admin.

---
kind: configuration/applications
version: 0.8.0
title: "Kubernetes Applications Config"
provider: aws
name: default
specification:
  applications:
  ...

## --- istio ---

  - name: istio
    enabled: