Using OpenTelemetry with Shopware on Kubernetes

The first step in recovering from a system failure is identifying what caused it in the first place. The more relevant information about the system's inner state is available, the quicker one can identify the root cause of a problem. The amount, relevance, and quality of this information fall under the term observability. When running Shopware on Kubernetes, observability requires the right tools to keep track of the store's stability and performance.

In this article we explain where good observability comes from and how to choose the best tools. Additionally, we show how to set up OpenTelemetry for Shopware running on Kubernetes.

What is observability?

Observability is a system property that defines the degree to which the system can generate actionable insights. It allows users to understand a system's state from these external outputs and take (corrective) action.

https://glossary.cncf.io/observability/

The better a system's observability, the easier it is to fix issues even with little prior knowledge of the system.

The three pillars of observability

Observability is based on three types of data: logs, metrics, and traces. Each type provides a different kind of information about the system.

The three pillars of observability are logs, metrics, and traces
Logs, metrics, and traces—each adds to observability

The first pillar: logs

Logs are records of events that occurred within a system. They can be anything from information about ongoing operations to details of encountered errors.

NGINX logs visualized in Grafana
Shopware web server (NGINX) logs. Each entry contains a timestamp, request method, request path, response status code, and response size.

In Kubernetes, logs are aggregated by dedicated tools such as Loki. They are then stored for further processing and inspection.

The second pillar: metrics

Metrics are measurements of a certain system state at a particular point in time. They convey a wide variety of information. Commonly used metrics include the number of successful and failed HTTP responses, an application's memory usage, or available disk space on a server.

A Grafana dashboard showing current CPU and memory usage, and a graph of CPU usage per Kubernetes namespace
Overview of metrics from a system running Shopware on Kubernetes. The dashboard shows how much of the available resources is allocated and actually used. The graph below shows how much CPU was used by workloads in different namespaces over time.

Metrics are numeric, which makes them easy to process, correlate, and visualize over time. However, they ultimately provide only a high-level overview of system health—not detailed enough to aid deeper analysis.

The third pillar: traces

Traces are records of a program's inner workings. They capture which parts of a program were executed, in what order, and how long each part took. With traces, developers can identify parts of the code that are slow or unstable under real-world conditions.

Application trace for a 'GET /' request visualized in Grafana
Trace of a request to a Shopware store homepage.

This is the most detailed information you can get about software running in production. While traces are an invaluable aid when you need to get down to the root cause of a problem, they're simply too detailed to serve as an overview of overall system health. One must use all three kinds of observability data to get the full picture.

Logging and monitoring tools are among the first things you set up when creating a Kubernetes cluster. Major cloud providers often include them in their managed Kubernetes offerings.

It's different with traces though.

As tracing is closely tied to the programming language used by the application, it requires a specialized tool for processing. That's when one needs to choose between using one of the many available services and setting up one's own observability toolchain.

When to use OpenTelemetry and Grafana over an observability platform as a service

At Kiwee we've been using New Relic for many years now. However, after we started using Kubernetes back in 2019, it became evident that we needed more flexibility than a 3rd-party observability platform could provide, and at a lower cost.

That's when Grafana came into the picture. It has quickly become our go-to tool for processing Kubernetes metrics and logs. For a while though, we still relied on New Relic for tracing. This changed, however, with the introduction of Grafana Tempo and Shopware's native support for OpenTelemetry.

Here's how New Relic compares to Grafana with OpenTelemetry.

Core characteristics

All components needed for processing traces in this article are open-source projects. They use the OpenTelemetry Protocol (OTLP), which is part of the OpenTelemetry project governed by the Cloud Native Computing Foundation.

The use of a common, open standard makes it possible to build your own, highly customized observability stack. In this setup each component specializes in a particular task, such as log, metric, and trace processing, data visualization, or alerting.

Conversely, commercial platforms aim to provide an all-in-one tool for observability. They combine data processing, visualization, and alerting in a single service. This means that after you send data to the platform, it takes over control. It decides how the data is processed and presented, who can access it, and which other services it can integrate with, e.g. to send alerts.

Getting started

With a guided installation, setting up New Relic instrumentation is quite straightforward. After creating an account, you install the agent and provide it with an application name and New Relic API credentials. Once this is done, the traces start appearing in New Relic.

On the other hand, visualizing application traces in Grafana requires setting up a few separate components: the OTel exporter for PHP, the OpenTelemetry Collector, Grafana Tempo, and Grafana itself.

Building your own observability stack is indeed a more involved process than integrating an external service. However, in the case of Kubernetes, the setup is largely simplified by the use of Helm charts and Operators.

Using New Relic with OpenTelemetry

It is worth noting that it is possible to use New Relic with OpenTelemetry.

As stated in the documentation, there's significant overlap between New Relic's own instrumentation and OpenTelemetry, but “[w]ith New Relic instrumentation, there are inherent advantages to developing instrumentation and platform features which work together, and New Relic integrations tend to work better out of the box.”

That being said, OpenTelemetry is still a viable option for New Relic users looking for a more vendor-neutral approach.

Flexibility

Open-source tools allow users to build an observability stack that is specific to their organization's needs. Being a tool for data visualization, Grafana combines data from a variety of sources. These include Grafana Tempo for traces, but also Loki for logs, Prometheus for metrics, and many more.

Conversely, all-in-one 3rd-party services must sacrifice some flexibility for ease of use. Even with its rich feature set, a platform is ultimately an opinionated tool. As such, it sometimes falls short of fulfilling all the observability needs of a particular project.

Data security

When self-hosting your own observability stack, you keep full control over, but also responsibility for, when, where, and how telemetry data is stored, who can access it, and how to keep it secure.

With 3rd-party solutions such as New Relic, you rely on the vendor to provide adequate levels of security.

When running Shopware, you should already have the processes and infrastructure to protect sensitive data. After all, they're needed to comply with data protection regulations such as the GDPR. This can make it easier to integrate a self-hosted observability stack into your system than to ensure a 3rd-party integration's compliance.

Cost

New Relic's free tier includes a single full user account and 100 GB of data ingest. This is usually enough for a small, low-traffic eCommerce store. Anything more than that requires a paid plan with pricing based on the number of users and the amount of data processed.

After the initial setup, the cost of self-hosting your own Shopware on Kubernetes observability stack boils down to infrastructure costs and time spent on maintenance.

In our experience, the observability stack consumes a fraction of the compute resources needed to run Shopware. Additionally, the time spent on configuration tweaks and occasional updates is comparable to that spent on tweaking 3rd-party service configuration.

All in all, as a project evolves, so must the tooling supporting it.

How to choose?

Key considerations when choosing between an all-in-one observability platform as a service and a self-hosted observability stack include flexibility, data security, and cost.

Services like New Relic are invaluable for quickly adding telemetry to existing projects—especially those running on bare metal or a VPS.

Then again, Grafana is a more natural choice for cloud-native applications.

Shopware on Kubernetes observability in practice—visualizing traces in Grafana

The full observability stack has a few components for handling traces. First, the OTel exporter generates traces and sends them to the OpenTelemetry Collector. Then, the collector batches individual traces together and sends them to Grafana Tempo. Finally, Grafana queries Tempo for traces and visualizes them in a human-readable form.

Application traces go from the Shopware container to Grafana through the OpenTelemetry Collector and Grafana Tempo
The flow of application trace data from Shopware to Grafana.

Let's go through the full setup, starting with exporting traces from the Shopware container. One step at a time, we'll send the traces to the OpenTelemetry Collector, forward them to Tempo, and finally visualize them in Grafana.

Prerequisites

A Kubernetes cluster with Shopware deployed

For this article I used our Shopware on Kubernetes setup, which provides everything needed to deploy Shopware to Kubernetes.

I also used Skaffold and Kind to facilitate local cluster setup.
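
If you want to follow along on a local cluster, the whole loop can be as simple as creating a Kind cluster and letting Skaffold build and deploy everything. A minimal sketch—the cluster name here is arbitrary and not taken from the repository:

# Create a local Kubernetes cluster and deploy the project with Skaffold
kind create cluster --name shopware
skaffold dev
Commands for spinning up a local Kind cluster and deploying with Skaffold. Skaffold then rebuilds and redeploys on every change, which is convenient while iterating on the observability setup.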

Configure the Shopware container to produce traces

If all goes well, at the end of this step messages similar to the following should start appearing in Shopware container logs.

10.244.0.1 - - [19/Dec/2024:17:25:33 +0000] "GET /api/_info/health-check HTTP/1.1" 200 5 "-" "kube-probe/1.27" "-"
127.0.0.1 -  19/Dec/2024:17:25:33 +0000 "GET /index.php" 200
{
    "resource": {
        "attributes": {
            "host.name": "shopware-web-57fc87c8d9-n8w8r",
            "host.arch": "x86_64",
            "os.type": "linux",
            "os.description": "6.8.0-49-generic",
            "os.name": "Linux",
            "os.version": "#49-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov  4 02:06:24 UTC 2024",
            "process.pid": 94,
            "process.executable.path": "\/usr\/local\/sbin\/php-fpm",
            "process.owner": "application",
            "process.runtime.name": "fpm-fcgi",
            "process.runtime.version": "8.3.14",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.language": "php",
            "telemetry.sdk.version": "1.1.2",
            "telemetry.distro.name": "opentelemetry-php-instrumentation",
            "telemetry.distro.version": "1.1.0",
            "service.name": "shopware",
            "service.version": "1.0.0+no-version-set"
        },
        "dropped_attributes_count": 0
    },
    "scopes": [
An example of OpenTelemetry traces in Shopware container logs

Install and activate required PHP OpenTelemetry extension

First, the opentelemetry auto-instrumentation PHP extension is installed into the Shopware container image.

# Dockerfile

RUN MAKEFLAGS="-j $(nproc)" pecl install \
    opentelemetry-1.1.0 \
    ;

RUN docker-php-ext-enable \
    opentelemetry \
    ;
Part of the Shopware Dockerfile responsible for installing OpenTelemetry extension for PHP.
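
Before moving on, it's worth confirming that the extension is actually loaded. A quick sanity check, assuming the Shopware web Deployment is called shopware-web, as the pod names in the logs above suggest:

# List loaded PHP extensions in a running Shopware pod and look for opentelemetry
kubectl exec deploy/shopware-web -- php -m | grep opentelemetry
Command verifying that the opentelemetry PHP extension is loaded. If it's missing from the output, the exporter configured later will have nothing to hook into.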

Install OpenTelemetry Composer packages

Next, a set of Composer packages is installed to enable generating application traces in Shopware.

composer require \
    shopware/opentelemetry \
    open-telemetry/exporter-otlp \
    open-telemetry/opentelemetry-logger-monolog \
    ;
Composer command to install OpenTelemetry packages for Shopware.
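
To double-check that the packages made it into the project, Composer can list them. This is only a sanity check and not required for the setup:

composer show | grep -i telemetry
Command listing the installed OpenTelemetry-related Composer packages.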

Configure OpenTelemetry exporter

Finally, the OTel exporter is configured to print traces to the container logs. This makes it easier to confirm that this part of the setup works correctly.

# deploy/bases/app/config/otel.env

OTEL_PHP_AUTOLOAD_ENABLED=true
OTEL_PHP_LOG_DESTINATION=stderr
OTEL_PHP_INTERNAL_METRICS_ENABLED=true
OTEL_SERVICE_NAME=shopware
OTEL_LOGS_EXPORTER=console
OTEL_METRICS_EXPORTER=console
OTEL_TRACES_EXPORTER=console
.env file containing configuration for the OTel exporter. It's converted to a Kubernetes ConfigMap by Kustomize, with the contents exposed to Shopware containers as environment variables.
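
The conversion mentioned above is handled by a Kustomize configMapGenerator. A rough sketch of what it might look like; the file path and ConfigMap name here are assumptions, not copied from the repository:

# deploy/bases/app/kustomization.yaml (sketch)
configMapGenerator:
  - name: shopware-otel
    envs:
      - config/otel.env
Example configMapGenerator turning otel.env into a ConfigMap, which can then be wired into the Shopware containers via envFrom.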

Collect traces for further processing

Now we need a way to collect traces from Shopware and forward them for further processing. For that, we'll use the OpenTelemetry Collector. We'll deploy it using an Operator and a Custom Resource.

After completing this part, you should have a StatefulSet called shopware-collector running in the cluster. In its container logs you should see entries similar to the example below.

ResourceSpans #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.30.0
Resource attributes:
    -> host.name: Str(shopware-web-855d46dd5d-cbzsv)
    -> host.arch: Str(x86_64)
    -> os.type: Str(linux)
    -> os.description: Str(6.8.0-52-generic)
    -> os.name: Str(Linux)
    -> os.version: Str(#53-Ubuntu SMP PREEMPT_DYNAMIC Sat Jan 11 00:06:25 UTC 2025)
    -> process.pid: Int(114)
    -> process.executable.path: Str(/usr/local/sbin/php-fpm)
    -> process.owner: Str(application)
    -> process.runtime.name: Str(fpm-fcgi)
    -> process.runtime.version: Str(8.3.14)
    -> telemetry.sdk.name: Str(opentelemetry)
    -> telemetry.sdk.language: Str(php)
    -> telemetry.sdk.version: Str(1.2.2)
    -> telemetry.distro.name: Str(opentelemetry-php-instrumentation)
    -> telemetry.distro.version: Str(1.1.0)
    -> service.name: Str(shopware)
    -> service.version: Str(1.0.0+no-version-set)
An example of OpenTelemetry debug logs
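
Once the steps below are done, a quick way to confirm the collector is running and to tail its logs, assuming everything lives in the default namespace:

kubectl get statefulset shopware-collector
kubectl logs statefulset/shopware-collector --follow
Commands for checking the shopware-collector StatefulSet and following its logs.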

Deploy OpenTelemetry Kube Stack

First, let's install the OpenTelemetry Kube Stack. It includes the OpenTelemetry Operator and a couple of built-in collectors to gather data from Kubernetes.

For simplicity, we'll temporarily disable the built-in collectors. They can be enabled later for a more comprehensive OpenTelemetry setup.

OpenTelemetry Kube Stack is distributed as a Helm chart, which makes it convenient to install using Skaffold.

# skaffold.yaml

deploy:
  statusCheck: true
  kubectl: {}
  helm:
    releases:
    - name: opentelemetry-kube-stack
      remoteChart: opentelemetry-kube-stack
      repo: https://open-telemetry.github.io/opentelemetry-helm-charts
      wait: true
      setValues:
        collectors:
          cluster:
            enabled: false
          daemon:
            enabled: false
Part of the skaffold.yaml file responsible for installing OpenTelemetry Kube Stack.
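
For reference, the same installation without Skaffold boils down to a couple of Helm commands using the chart and values shown above:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-kube-stack open-telemetry/opentelemetry-kube-stack \
    --set collectors.cluster.enabled=false \
    --set collectors.daemon.enabled=false
Equivalent plain Helm installation of the OpenTelemetry Kube Stack.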

Deploy an instance of OpenTelemetry Collector

Now that the Operator is up and running, let's create an OpenTelemetryCollector Custom Resource. With this, the Operator will create a new instance of OpenTelemetry Collector. It will be a StatefulSet called shopware-collector.

For now, the collector will print all incoming data into its logs to make it easier to confirm that it's receiving traces from Shopware.

# deploy/overlays/local/opentelemetry-collector.yaml

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: shopware
spec:
  mode: statefulset
  replicas: 1
  managementState: managed
  config:
    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 400
        # 25% of limit up to 2G
        spike_limit_mib: 100
        check_interval: 5s

    exporters:
      nop: {}
      debug:
        verbosity: detailed
        sampling_initial: 5
        sampling_thereafter: 200
        use_internal_logger: false
      otlp:
        endpoint: tempo.default.svc.cluster.local:4317
        tls:
          insecure: true

    service:
      telemetry:
        logs:
          level: debug
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [debug]
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [debug]
        logs:
          receivers: [otlp]
          processors: []
          exporters: [debug]
OpenTelemetryCollector Custom Resource manifest
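
After the manifest is applied, the Operator creates the StatefulSet along with Services exposing the collector's receivers. Listing them is an easy way to find the OTLP endpoint used in the next step; a simple check, assuming the default namespace:

kubectl get opentelemetrycollector shopware
kubectl get svc | grep shopware-collector
Commands for checking the OpenTelemetryCollector resource and the Services created for it.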

Update OpenTelemetry configuration in Shopware container to send data to OpenTelemetry Collector

Finally, let's update OTel exporter configuration to forward all data to the OpenTelemetry Collector instead of printing to Shopware container logs as before. This is done by updating environment variables controlling OTel exporter configuration in the Shopware container.

# deploy/bases/app/config/otel.env

OTEL_PHP_AUTOLOAD_ENABLED=true
OTEL_PHP_LOG_DESTINATION=stderr
OTEL_PHP_INTERNAL_METRICS_ENABLED=true
OTEL_SERVICE_NAME=shopware
OTEL_LOGS_EXPORTER=otlp
OTEL_METRICS_EXPORTER=otlp
OTEL_TRACES_EXPORTER=otlp
OTEL_EXPORTER_OTLP_PROTOCOL=http/json
OTEL_EXPORTER_OTLP_ENDPOINT=http://shopware-collector:4318
Updated OTel exporter environment variables. Traces are now forwarded to the shopware-collector Service using the OTLP protocol.
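
If traces don't show up in the collector logs, it may help to verify that the Shopware pods can reach the collector's OTLP/HTTP endpoint at all. A hedged example, assuming curl is available in the image and the Deployment is named shopware-web; an empty request should return HTTP 200:

kubectl exec deploy/shopware-web -- curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST -H "Content-Type: application/json" -d "{}" \
    http://shopware-collector:4318/v1/traces
Connectivity check against the collector's OTLP/HTTP traces endpoint.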

Visualize traces in Grafana

Once OpenTelemetry Collector starts receiving traces from Shopware, all that's left is to deploy Grafana and Grafana Tempo to process and visualize the data.

Deploy Grafana Operator and Tempo

Both Grafana Tempo and the Grafana Operator are distributed as Helm charts, so they can be installed with Skaffold in the same way as the OpenTelemetry Kube Stack before.

# skaffold.yaml

deploy:
  statusCheck: true
  kubectl: {}
  helm:
    releases:
      - name: tempo
        remoteChart: tempo
        repo: https://grafana.github.io/helm-charts
        wait: true
        version: 1.12.0
        setValues:
          tempoQuery:
            enabled: true
      - name: grafana-operator
        remoteChart: oci://ghcr.io/grafana/helm-charts/grafana-operator
        version: v5.15.1
        wait: true
        setValues:
          installCRDs: true
Part of the skaffold.yaml file responsible for installing Grafana Tempo and the Grafana Operator.
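
One piece that's easy to miss: the Grafana Operator only starts a Grafana instance once a Grafana Custom Resource exists, and Grafana needs a Tempo data source to query traces. The sketch below shows roughly what those two resources can look like; the names, labels, admin credentials, and the Tempo URL are assumptions based on chart defaults, not taken from the repository:

# Sketch: Grafana instance and Tempo data source for the Grafana Operator
apiVersion: grafana.integreatly.org/v1beta1
kind: Grafana
metadata:
  name: grafana
  labels:
    dashboards: grafana
spec:
  config:
    security:
      admin_user: admin
      admin_password: admin
---
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: tempo
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana
  datasource:
    name: Tempo
    type: tempo
    access: proxy
    url: http://tempo.default.svc.cluster.local:3100
Sketch of Grafana and GrafanaDatasource Custom Resources connecting Grafana to Tempo.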

Send traces from OpenTelemetry Collector to Grafana Tempo

When Grafana Tempo is up and running, the last step is to forward all data from OpenTelemetry Collector to Grafana Tempo.

# deploy/overlays/local/otel-collector.yaml

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: shopware
spec:
  mode: statefulset
  replicas: 1
  managementState: managed
  config:
    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
      memory_limiter:
        # 80% of maximum memory up to 2G
        limit_mib: 400
        # 25% of limit up to 2G
        spike_limit_mib: 100
        check_interval: 5s

    exporters:
      nop: {}
      debug:
        verbosity: detailed
        sampling_initial: 5
        sampling_thereafter: 200
        use_internal_logger: false
      otlp:
        endpoint: tempo.default.svc.cluster.local:4317
        tls:
          insecure: true

    service:
      telemetry:
        logs:
          level: debug
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [otlp]
        logs:
          receivers: [otlp]
          processors: []
          exporters: [otlp]
Updated OpenTelemetryCollector Custom Resource manifest. From now on, all data received by the collector will be sent to Grafana Tempo.
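
To open Grafana from a local cluster, you may first need to expose it, for example with a port-forward. The Service name below is an assumption; the Grafana Operator derives it from the name of the Grafana resource:

kubectl port-forward svc/grafana-service 3000:3000
Port-forward making Grafana available at http://localhost:3000.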

Once this last change is reconciled, open Grafana in your web browser and select Explore from the main menu. Select Search and use shopware for the Service Name filter.

There you'll see Shopware application traces coming in.

A list of traces filtered by Service Name 'shopware' in Grafana
Shopware application traces in the Explore view in Grafana.

What’s next?

Now that traces from Shopware are available in Grafana, you get valuable insight into Shopware's inner workings. This alone will help identify performance bottlenecks, and enable you to better understand how your online store behaves in the real world.

When using Kubernetes, you probably already have tools for handling logs and metrics, e.g. Fluentd or Loki, and Prometheus. Now is the time to make this observability stack your own. With these tools, you can correlate and analyze logs, metrics, and traces from different services. You can visualize all that information on custom dashboards and set up alerts for when things go wrong.

Conclusion

For a stable deployment of Shopware on Kubernetes, observability is crucial. Running a stable, performant online store in the cloud is next to impossible without proper tools to investigate its inner state.

While commercial observability platforms offer a quick way to get insights into production systems, they have inherent limitations. Their flexibility and customizability must make way for ease of use. Moreover, using them to provide the necessary insight into even small projects can get quite expensive.

As shown here, adding tracing to the standard Kubernetes monitoring tools doesn't take too much effort. The time investment is in fact often comparable to that required to integrate an external service. With that, OpenTelemetry becomes a natural part of the Shopware on Kubernetes observability toolkit.

Looking to improve the stability and performance of your online store? Contact us! We'll be happy to help you choose the best solutions for your organization.


Maciej Lewkowicz

Senior Full-stack Developer & DevOps Engineer

I believe there's love in new things and in old things there is wisdom. I love exploring all sorts of systems, especially putting enterprise-grade open source tools to use in small-scale projects, all the while being on a personal mission to make the Internet a little better for everyone.