Magento 2 on Kubernetes – how do I do that?

In this article, I’ll take a closer look at what it takes to run Magento 2 on Kubernetes. Let’s dive in!

Prerequisites

This article assumes you have a fundamental knowledge of operating Magento 2, working with containers (Docker), and basic Kubernetes concepts.

You’ll need a running Kubernetes cluster.

Tip

kind, Minikube, and Docker Desktop are all viable options for local development on Kubernetes.
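For instance, assuming the kind CLI is installed, spinning up a local cluster takes a single command (the cluster name here is arbitrary):

kind create cluster --name magento2
kubectl cluster-info --context kind-magento2

Creating a local Kubernetes cluster with kind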

Additionally, we’ll be using the following tools:

  • kubectl with the correct context configured
  • (optional but recommended) make

Once the cluster and the tools are in place, we can start deploying Magento. You'll find all necessary files in our Magento 2 on Kubernetes repository on GitHub.

Let’s go through the deployment process step by step.

Step 1: Create a minimal Magento 2 deployment

Magento

We’ll need a container running Magento itself, so we might just as well start with that.

But first, we need to go through some of the aspects one needs to consider when running any PHP web application on Kubernetes, to shed light on some of the architectural choices made in this article.

PHP web application pod patterns

There are different patterns for deploying a PHP web application on Kubernetes – from single-process, single-container through multi-process containers to multiple single-process ones.

All-in-one

The most straightforward arrangement is a single container running Apache 2 with mod_php – an arrangement quite commonly used in tutorials. While an all-in-one container is the easiest to configure and manage, you might still want to consider using NGINX to serve static content – either as a dedicated pod or as a caching reverse-proxy.

Apache with mod_php in a single container

NGINX + PHP-FPM in a single container

If you decide to run NGINX, you’ll need PHP-FPM with it. You’ll also need either a custom script or a process manager (e.g., supervisord) to run them both in a single container.
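For illustration, a minimal supervisord configuration along these lines could keep both services running in the foreground (program names and command paths are assumptions – adjust them to your image):

[supervisord]
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0

[program:php-fpm]
command=php-fpm -F
autorestart=true

[program:nginx]
command=nginx -g 'daemon off;'
autorestart=true

Illustrative supervisord configuration running NGINX and PHP-FPM in a single container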

It's best practice to separate areas of concern by using one service per container. That service may fork into multiple processes (for example, Apache web server starts multiple worker processes). It’s ok to have multiple processes but to get the most benefit out of Docker, avoid one container being responsible for multiple aspects of your overall application. You can connect multiple containers using user-defined networks and shared volumes.

While in this configuration there will be more than one process running in the container, it maintains separation of concerns. The container is still stateless, disposable, exports its services via port binding, and has all backing services (including storage) as attached resources.

NGINX and PHP-FPM in a single container

This is the configuration we're using.

Single Pod running two containers

In this configuration, NGINX and PHP-FPM run in separate containers, communicating over the network instead of a socket. This way, we don’t need supervisord anymore, can assign separate readiness and liveness probes to each container, and have more control over resource allocation.

There is one caveat, though: we need to make sure NGINX can access static assets.

This can be achieved in two ways: either by creating a custom NGINX image with the project files inside, or by sharing the project files between the NGINX and PHP containers via a volume.

The latter requires creating a volume shared between containers in the Pod (emptyDir would be just fine here) and copying the files from the PHP container to the volume upon pod initialization (i.e., in an init container).
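A sketch of that volume-sharing variant could look as follows (the image name and paths are assumptions for illustration):

initContainers:
- name: copy-static
  # Assumed application image that contains the built static assets
  image: kiweeteam/magento2
  command: ["/bin/sh", "-c", "cp -a /var/www/html/pub/static/. /mnt/static/"]
  volumeMounts:
  - name: static
    mountPath: /mnt/static
containers:
- name: nginx
  image: nginx:stable
  volumeMounts:
  - name: static
    mountPath: /var/www/html/pub/static
    readOnly: true
volumes:
- name: static
  emptyDir: {}

Sketch of sharing static assets between containers via an emptyDir volume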

Note

This may be a viable option in some cases, but we found it to be overly complex.

NGINX and PHP-FPM in separate containers, but in a single pod

Web server and PHP in separate Pods

This pattern is quite similar to the previous one, except it allows us to scale the PHP and NGINX Pods independently. We cannot use an emptyDir volume here to share files between the pods, and we need to configure proper persistent volumes for static assets instead.

NGINX and PHP-FPM in separate pods

Which is best?

On the one hand, single-process Apache+PHP containers are easier to manage; on the other, NGINX has a reputation for superior performance when serving static content, and putting it and PHP-FPM in separate Pods allows you to scale them independently.

Even so, this comes at the cost of higher complexity. NGINX adds little enough overhead that scaling it along with PHP shouldn't be an issue.

Tip

It’s often best to run benchmarks yourself, taking factors such as expected traffic patterns, CDN use, and caching into consideration.

Magento container image

As discussed above, we’ll use a Docker image based on the FPM variant of the official PHP image with added NGINX and Supervisord.

Configuration from environment

When deploying Kubernetes applications, it’s usually best to configure each program by setting environment variables on its container.

While it’s possible to mount ConfigMaps or Secrets to containers as regular configuration files, this isn’t ideal. Different applications use different configuration formats, and frequently the values must match across multiple applications. Managing configuration this way quickly becomes unnecessarily complicated.

Conversely, with environment variables, you only need to define everything once, as key-value pairs.

You can then pass each value to any container that needs it by referring to its key (variable name). This way, you have a single source of truth for every configuration value.
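As a sketch, the shared values could live in a single ConfigMap (the names and values below are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: config
data:
  DB_HOST: db
  DB_NAME: magento2

Illustrative ConfigMap holding shared configuration values

Containers can then consume the whole map at once via envFrom, as shown in the initContainer manifest later in this article.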

Environment variables in Magento 2

One of the features of Magento 2 is the ability to pick up settings from the environment – setting an environment variable of the form CONFIG__<SCOPE>__<SYSTEM__VARIABLE__NAME> has the same effect as writing the setting in app/etc/config.php.

For example, if one wants to configure Elasticsearch as the search engine, setting an environment variable CONFIG__DEFAULT__CATALOG__SEARCH__ENGINE=elasticsearch7 instructs Magento to set the catalog search engine option to “Elasticsearch 7” for the Default Scope. It also locks this setting in the admin panel to prevent accidental changes.

Unfortunately, this feature cannot be used to control environment-specific settings like database credentials. There are a few ways to work around it, though:

  • Mount app/etc/env.php from a ConfigMap or a Secret
  • Use bin/magento to pick configuration up from environment variables, and pass it to Magento during Pod initialization. It’s mostly the same as configuring a Magento instance via CLI, but automated. It takes quite some time, though, to save the configuration, which considerably prolongs the time each Magento Pod takes to start.
  • Modify app/etc/env.php and include it in the container image. Since env.php is a regular PHP file that must return an array with the configuration, PHP’s built-in getenv() function is perfect for taking values from the environment at runtime, e.g., 'dbname' => getenv('DB_NAME') – see the sketch below.
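To illustrate the last approach, here is what the database section of such an env.php might look like (a minimal sketch – the variable names are our own convention, and the rest of the file is omitted):

<?php
// Fragment of app/etc/env.php reading database credentials from the environment
return [
    'db' => [
        'connection' => [
            'default' => [
                'host' => getenv('DB_HOST'),
                'dbname' => getenv('DB_NAME'),
                'username' => getenv('DB_USER'),
                'password' => getenv('DB_PASSWORD'),
            ],
        ],
    ],
    // ... remaining configuration
];

Fragment of app/etc/env.php taking values from environment variables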

Managing logs

One more issue to consider when deploying Magento 2 on Kubernetes is making sure all relevant logs survive container restarts and are easily accessible.

The simplest solution would be to use a PersistentVolume for the var/log and var/report directories. A volume solves the issue of log persistence but may cause performance issues with many Magento instances writing to the same files. Moreover, the logs themselves quickly grow too large to navigate efficiently.

To satisfy both requirements, multiple options are available.

Sidecar container

When using the sidecar pattern, a separate container within the Pod is responsible for reading log files as they grow and outputting their contents to stdout.

Tip

Using one container per log file, each running tail -f to stream that file to stdout, works quite well for a vanilla Magento deployment, but it doesn’t scale very well as the number of files to process grows.

containers:
- image: kiweeteam/magento2
  name: magento-web
  volumeMounts:
  - name: logs
    mountPath: /var/www/html/var/log
- image: busybox
  name: system-log
  command: ["/bin/sh"]
  args:
  - -c
  - |
    touch /var/www/html/var/log/system.log
    chown 33:33 /var/www/html/var/log/system.log
    tail -n+1 -f /var/www/html/var/log/system.log
  resources:
    limits:
      cpu: 5m
      memory: 64Mi
    requests:
      cpu: 5m
      memory: 64Mi
  volumeMounts:
  - name: logs
    mountPath: /var/www/html/var/log
volumes:
- name: logs
  emptyDir: {}
Part of the Magento Deployment manifest where sidecars are defined

Logging directly to stdout

With Magento’s PSR-3 compatibility, it’s possible to configure all relevant log handlers to log directly to stdout. In this example, we’re using the graycoreio/magento2-stdlogging module to do exactly that.

Note

Logging directly to stdout would satisfy Factor XI of the Twelve-factor App methodology and make sidecar containers obsolete.

In either case, a dedicated tool (e.g., the Elastic Stack or Fluentd) collects the output of all containers running in the cluster and stores the logs for future access and analysis.

Cron

Magento relies on cron jobs for several of its features.

In a bare-metal deployment scenario, you’d assign one of the hosts to run cron jobs and configure them directly via crontab. However, such a setup would not work with Kubernetes, since there’s no single Magento instance (container) that is guaranteed to always be running. That’s why we’ll use a Kubernetes CronJob to execute bin/magento cron:run every minute.

This way, we delegate the responsibility of scheduling and running cron jobs to Kubernetes, which starts a new Pod for each execution and runs the given commands until completion. Additionally, these Pods are assigned resources independently, which reduces the risk of decreased performance when big jobs are running.

---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: magento-cron
spec:
  schedule: '* * * * *'
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 600
  failedJobsHistoryLimit: 20
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: magento-cron
            k8s-app: magento
        spec:
          containers:
          - name: magento-cron
            image: kiweeteam/magento2
            command: ["/bin/sh"]
            args:
            - -c
            - php bin/magento cron:run
          restartPolicy: Never
Abbreviated CronJob manifest to run Magento 2 cron jobs on Kubernetes

Caveats

When running Magento cron as a Kubernetes CronJob, make sure that all Magento cron jobs are configured to run as a single process. This can easily be done by setting the following environment variables:

CONFIG__DEFAULT__SYSTEM__CRON__INDEX__USE_SEPARATE_PROCESS=0
CONFIG__DEFAULT__SYSTEM__CRON__DEFAULT__USE_SEPARATE_PROCESS=0
CONFIG__DEFAULT__SYSTEM__CRON__CONSUMERS__USE_SEPARATE_PROCESS=0
CONFIG__DEFAULT__SYSTEM__CRON__DDG_AUTOMATION__USE_SEPARATE_PROCESS=0
Environment variables disabling Magento cron forking

Otherwise, the cron container may be terminated before it completes running all scheduled jobs.

Additional Jobs

Another Kubernetes Object type we’ll make use of in this project is a Job.

We're using a magento-install Job to automate the application installation. It installs the database schema, generates the performance fixtures we use as sample data for demonstration, and ensures all indexes are in the “On Schedule” update mode.

containers:
- name: magento-setup
  image: kiweeteam/magento2
  command: ["/bin/sh"]
  args:
  - -c
  - |
      set -o errexit
      ./bin/install.sh
      php bin/magento setup:perf:generate-fixtures setup/performance-toolkit/profiles/ce/mok.xml
      magerun --no-interaction index:list | awk '{print $2}' | tail -n+4 | xargs -I{} magerun --no-interaction index:set-mode schedule {}
      magerun --no-interaction index:reset
      magerun --no-interaction cache:flush
Magento installation command – part of magento-install Job manifest
Tip

Bear in mind that since Job’s pod template field is immutable, it’s impossible to update the Jobs with each new release. Instead, you need to make sure to delete the old ones and create new ones for each revision deployed.
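In practice, recreating the Job can be as simple as deleting it before applying the manifests again – a sketch, assuming the Job name from this article and a Kustomize-managed layout:

kubectl delete job magento-install --ignore-not-found
kubectl apply -k .

Recreating the installation Job before deploying a new revision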

While such automation is useful for the initial Magento installation, it's not required for subsequent deployments. Instead, the magento-web Pod runs an initContainer responsible for running php bin/magento setup:upgrade or php bin/magento app:config:import when required.

initContainers:
- name: setup
  image: kiweeteam/magento2
  command:
  - /bin/bash
  args:
  - -c
  - |
    set -o errexit
    # Update database schema if needed
    php bin/magento setup:db:status || php bin/magento setup:upgrade --keep-generated
    # Fail if database schema is not up-to-date after setup:upgrade
    php bin/magento setup:db:status
    # Import config if needed
    php bin/magento app:config:status || php bin/magento app:config:import
    # Fail if config is not up-to-date after app:config:import
    php bin/magento app:config:status
  envFrom:
  - configMapRef:
      name: config
  - configMapRef:
      name: additional
initContainer configuration to run Magento config and database schema upgrades

Database

For the database, we’ll simply use a StatefulSet running Percona with the data stored in a PersistentVolume.
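A minimal sketch of such a StatefulSet could look as follows (the image tag, storage size, and the Secret holding the root password are assumptions):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: percona
        image: percona:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              # Assumes a Secret named "db" holding the root password
              name: db
              key: root-password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Sketch of a single-instance Percona StatefulSet with persistent storage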

A plain StatefulSet works well for a small local/development cluster, but you might consider setting up an Xtradb cluster (e.g., using Percona Kubernetes Operator) for larger deployments. Such a solution requires more resources and adds complexity, so make sure to run appropriate benchmarks to ensure benefits are worth the investment.

Elasticsearch

The final requirement is to run Elasticsearch.

Due to constrained resources (development cluster on a laptop), we’ll go for a simple, single-node Elasticsearch cluster with somewhat limited resources. Since it is not exposed to any public networks, we can disable authentication and TLS for simplicity.
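The relevant part of the Elasticsearch container spec could look like this (the version tag and heap size are examples; discovery.type and xpack.security.enabled are standard Elasticsearch settings):

containers:
- name: elasticsearch
  image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9
  env:
  # Run as a single-node cluster and skip authentication/TLS
  - name: discovery.type
    value: single-node
  - name: xpack.security.enabled
    value: "false"
  # Keep the JVM heap small for a local cluster
  - name: ES_JAVA_OPTS
    value: "-Xms512m -Xmx512m"
  ports:
  - containerPort: 9200

Sketch of a single-node Elasticsearch container configuration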

This setup is enough for our purposes, but for a larger project with higher traffic, you might consider setting up an Elastic cluster with more nodes and more resources each. And again, it’s always a good idea to run benchmarks specific to your project, to make sure you have the configuration that works for you.

Tip

While it's often sufficient to deploy a single-instance Elasticsearch StatefulSet, Elastic Cloud on Kubernetes simplifies the process of deploying more sophisticated Elastic Cluster configurations.

Magento configuration

All that’s left is to point Magento to the newly created Elasticsearch instance. We can easily do this by extending additional.env configuration with:

CONFIG__DEFAULT__CATALOG__SEARCH__ELASTICSEARCH7_SERVER_HOSTNAME=elasticsearch
CONFIG__DEFAULT__CATALOG__SEARCH__ELASTICSEARCH7_SERVER_PORT=9200
CONFIG__DEFAULT__CATALOG__SEARCH__ENGINE=elasticsearch7
Example of Elasticsearch configuration using environment variables in Magento

We're using Kustomize to merge configuration files together and pass them to Magento as environment variables.

Ingress

At this point, all that we’re missing to have a working Magento 2 instance on Kubernetes is a way to access it from the outside.

We could simply expose the magento-web Service by making its type either NodePort to expose it on a specific port or LoadBalancer to expose it via an external load balancer.

In this case, however, we’ll use an Ingress resource – this way, we’ll get TLS termination out-of-the-box, along with the possibility to manage the TLS certificates in a declarative manner (e.g., using cert-manager). We could even expose additional services with routing based on paths or (sub-)domains, should we decide to do so.

Assuming that NGINX Ingress Controller is already installed, all we need to do here is to create an Ingress definition that will proxy all traffic to the HTTP port of the magento-web Service.

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.org/proxy-buffer-size: "256k"
    nginx.org/proxy-buffers: "4 256k"
    nginx.org/proxy-read-timeout: "60s"
spec:
  defaultBackend:
    service:
      name: magento-web
      port:
        name: http
  rules:
    - host: magento2.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: magento-web
                port:
                  name: http

Ingress manifest
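Should we later decide to enable TLS termination, it would only take extending this manifest – for example, with cert-manager, a hypothetical addition could look like this (the issuer and secret names are placeholders):

# Hypothetical additions to the Ingress manifest above
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  tls:
  - hosts:
    - magento2.local
    secretName: magento2-tls

Sketch of TLS configuration managed by cert-manager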

Putting it all together

To deploy the stack discussed so far, run make step-1 in the companion repository.

Now you should have a working, although bare-bones, Magento deployment on Kubernetes. We’re still missing some essential parts – they're coming up in the following sections.

Step 2: Redis and auto-scaling

Having deployed all functionality-related pieces in the previous section, you might have noticed that Magento’s performance is less than stellar. Worry not – we’ve barely taken advantage of any caching yet!

In other words: we made it work, now let’s make it work fast with Redis and auto-scaling.

Redis

Redis plays two essential roles in any performant Magento 2 deployment:

  • Fast session storage to allow multiple application instances to keep track of session information between requests
  • Cache storage for internal Magento cache (e.g., configuration, layout, HTML fragments)

Here again, we’ll use a simple StatefulSet to run a single Redis instance with separate databases for sessions and cache. We don’t need to attach any PersistentVolumes, so we won’t.
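The container definition itself can stay minimal – a sketch (the image tag and memory limit are illustrative):

containers:
- name: redis
  image: redis:6
  # Cap memory usage; databases 0 and 2 will hold Magento's cache and sessions
  args: ["--maxmemory", "512mb"]
  ports:
  - containerPort: 6379

Sketch of the Redis container definition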

As with Elasticsearch, the last thing we need to do is instruct Magento to use the newly deployed Redis instance. Just like before, we’ll add a few keys to additional.env and let Kustomize handle merging the pieces:

REDIS_CACHE_HOST=redis
REDIS_CACHE_PORT=6379
REDIS_CACHE_DB=0
REDIS_SESSION_HOST=redis
REDIS_SESSION_PORT=6379
REDIS_SESSION_DB=2
Environment variables used to configure Redis as cache and session storage backend in Magento

Horizontal Pod Autoscalers

With Redis in place, we can now run multiple instances of Magento sharing session information and cache. While we could simply increase the replicas count in Magento’s Deployment manifest when necessary, why not make use of the full potential running Magento 2 on Kubernetes gives us?

Let’s create Horizontal Pod Autoscalers instead and let Kubernetes figure out the optimum number at any given time.

To do so, we’ll create a new HorizontalPodAutoscaler. It monitors the resource usage of the Pods behind the target defined in scaleTargetRef and starts new ones whenever average CPU utilization exceeds targetCPUUtilizationPercentage, up to maxReplicas.

---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: magento-web
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: magento-web
  targetCPUUtilizationPercentage: 75
HorizontalPodAutoscaler manifest
Note

In this article, we’ve purposely assigned limited resources to Magento Pods to make it easier to show auto-scaling in action. When deploying Magento 2 on Kubernetes in a real-life scenario, you should make sure to tune both PHP settings and Pod resource constraints, as well as scaling rules in the HorizontalPodAutoscaler configuration.

Like before, to deploy Redis and the auto-scalers, simply run make step-2.

Step 3: Varnish

The last piece of the puzzle is adding a caching reverse-proxy to take some of the load off Magento. Naturally, we’ll use Varnish, as it’s supported out-of-the-box.

As in the previous steps, we’ll start by creating a Varnish Deployment. Two things are notable here: we expose not one but two ports, and we run a custom command to start the container – first starting Varnish in daemon mode, then running varnishncsa.

Exposing two ports allows us to configure simple access rules in Varnish VCL, letting Magento clear cache using one, while the other can be safely exposed to the outside world.
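The port-based access rule can be expressed as a VCL fragment along these lines (a sketch only – the exact handling depends on the VCL Magento generates for you):

vcl 4.1;

import std;

sub vcl_recv {
    if (req.method == "PURGE") {
        # Allow purging only via the internal port (80);
        # the Ingress exposes port 6091 to the outside world
        if (std.port(server.ip) != 80) {
            return (synth(405, "Method not allowed."));
        }
        return (purge);
    }
}

Sketch of port-based purge access control in VCL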

Running varnishncsa as part of the custom command provides an access log directly on the container's stdout.

Next, we need to tell Magento how to connect to Varnish, by extending additional.env configuration file as before:

VARNISH_HOST=varnish
VARNISH_PORT=80
Environment variables used to configure Varnish cache in Magento
containers:
- image: varnish
  name: varnish
  command: ["/bin/sh"]
  args:
    - -c
    - |
      varnishd -a :80 -a :6091 -f /etc/varnish/default.vcl -s default,512M;
      varnishncsa -F '%h %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %{Varnish:handling}x'
  ports:
  - containerPort: 80
  - containerPort: 6091
Custom command and port configuration – part of the Varnish Deployment manifests

Lastly, we want the Ingress to route all incoming requests to Varnish. Doing so requires changing the destination Service specified in the Ingress definition from before. A straightforward way to update the Ingress definition is to use Kustomize’s patchesJson6902 feature – the same kind of patch also applies to the backends listed under rules.

---
- op: replace
  path: /spec/defaultBackend/service/name
  value: varnish
- op: replace
  path: /spec/defaultBackend/service/port
  value:
    number: 6091
JSON Patch to update Ingress backend configuration
Tip

While Varnish excels at taking some of the load off the other components and improving the performance of rarely changing pages, it does not improve the performance of interactive elements such as the shopping cart, checkout, or customer area.

To deploy and configure Varnish, run make step-3.

Summary

So there you have it: an overview of all the essentials needed to run Magento 2 on Kubernetes – a Magento deployment configured via environment variables, cron jobs, Elasticsearch, Redis, auto-scaling, and Varnish.

All manifests and configuration files are managed by Kustomize so that they can be conveniently adjusted to the needs of any particular project.

While we wouldn’t recommend you run it on production as-is, it should give you a good starting point for creating a production-ready configuration specific to your project.


Maciej Lewkowicz

Senior Full-stack Developer & DevOps Engineer

I believe there's love in new things and in old things there is wisdom. I love exploring all sorts of systems, especially putting enterprise-grade open source tools to use in small-scale projects, all the while being on a personal mission to make the Internet a little better for everyone.