<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Apache SkyWalking – Cloud Native</title>
    <link>/tags/cloud-native/</link>
    <description>Recent content in Cloud Native on Apache SkyWalking</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Thu, 02 Apr 2026 00:00:00 +0000</lastBuildDate>
    
	  <atom:link href="/tags/cloud-native/feed.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Blog: Monitoring Envoy AI Gateway with Apache SkyWalking</title>
      <link>/blog/2026-04-02-envoy-ai-gateway-monitoring/</link>
      <pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate>
      <guid>/blog/2026-04-02-envoy-ai-gateway-monitoring/</guid>
      <description>
        
        
        &lt;h2 id=&#34;the-problem-flying-blind-with-llm-traffic&#34;&gt;The Problem: Flying Blind with LLM Traffic&lt;/h2&gt;
&lt;p&gt;LLM traffic is becoming a first-class citizen in production infrastructure. Teams are calling OpenAI, Anthropic,
AWS Bedrock, Azure OpenAI, Google Gemini — often multiple providers at once. But most organizations have
no unified visibility into this traffic:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Token costs spiral&lt;/strong&gt; while nobody knows which teams, models, or providers drive the spend.
A single misconfigured prompt template can burn through thousands of dollars before anyone notices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provider outages cause cascading failures.&lt;/strong&gt; When a provider such as OpenAI has a bad hour, your application goes down
with it, and you have neither the visibility to understand what happened nor a way to switch providers automatically.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No unified metrics&lt;/strong&gt; across heterogeneous LLM calls. Latency, Time to First Token (TTFT),
Time Per Output Token (TPOT), token usage, error rates — each provider reports these differently,
if at all. There is no single dashboard to compare them.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the same observability gap that microservices faced a decade ago. The solution then was
service meshes and API gateways with built-in telemetry. For AI workloads, the answer is an AI gateway.&lt;/p&gt;
&lt;h2 id=&#34;why-an-ai-gateway&#34;&gt;Why an AI Gateway&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; is an open-source AI gateway built on top of
&lt;a href=&#34;https://www.envoyproxy.io/&#34;&gt;Envoy Proxy&lt;/a&gt; and &lt;a href=&#34;https://gateway.envoyproxy.io/&#34;&gt;Envoy Gateway&lt;/a&gt;.
It is not a standalone SaaS product or a Python proxy — it is infrastructure-grade software built on
the same Envoy that already handles traffic for a large portion of cloud-native deployments.&lt;/p&gt;
&lt;p&gt;Key capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multi-provider routing&lt;/strong&gt; — supports 16+ AI providers (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI,
Google Gemini, Mistral, Cohere, DeepSeek, and more) behind a unified API.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Token-based rate limiting&lt;/strong&gt; — rate limit by token consumption, not just request count.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Provider fallback&lt;/strong&gt; — automatic failover when a provider is down or slow.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model virtualization&lt;/strong&gt; — abstract model names so applications are decoupled from specific providers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Two-tier architecture&lt;/strong&gt; — a reference architecture with a centralized entry gateway (Tier 1) for
auth and global routing, and per-cluster gateways (Tier 2) for inference optimization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CNCF ecosystem native&lt;/strong&gt; — runs on Kubernetes, composes with existing Envoy filters, WASM plugins,
and standard Kubernetes Gateway API resources.&lt;/li&gt;
&lt;/ul&gt;
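&lt;p&gt;To make the provider fallback idea concrete, here is a minimal Python sketch of ordered failover. It is an illustration only and not how Envoy AI Gateway implements the feature; the function names and the two stand-in providers are hypothetical.&lt;/p&gt;

```python
def call_with_fallback(providers, prompt):
    """Try each (name, call) pair in order and return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would match specific failure types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for provider clients; neither is a real SDK call.
def flaky_primary(prompt):
    raise TimeoutError("primary timed out")


def steady_backup(prompt):
    return "echo: " + prompt


providers = [("primary", flaky_primary), ("backup", steady_backup)]
print(call_with_fallback(providers, "hello"))
```

&lt;p&gt;With a gateway in place, this logic moves out of every application and into one shared layer, which is also where the telemetry is recorded.&lt;/p&gt;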
&lt;p&gt;Because Envoy AI Gateway natively emits GenAI metrics and access logs via OTLP following
&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt;,
it plugs directly into any OpenTelemetry-compatible backend.&lt;/p&gt;
&lt;p&gt;Starting from SkyWalking 10.4.0, the OAP server natively receives and analyzes Envoy AI Gateway&amp;rsquo;s
OTLP metrics and access logs — no OpenTelemetry Collector needed in between.&lt;/p&gt;
&lt;h2 id=&#34;data-flow&#34;&gt;Data Flow&lt;/h2&gt;
&lt;p&gt;The AI Gateway pushes telemetry directly to SkyWalking via OTLP gRPC:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;workflow.jpg&#34; alt=&#34;Data flow&#34;&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Application&lt;/strong&gt; sends LLM API requests through the Envoy AI Gateway.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway&lt;/strong&gt; routes requests to AI providers (or local models like Ollama)
and records GenAI metrics (token usage, latency, TTFT, TPOT) and access logs.&lt;/li&gt;
&lt;li&gt;The gateway pushes metrics and logs via &lt;strong&gt;OTLP gRPC&lt;/strong&gt; directly to &lt;strong&gt;SkyWalking OAP&lt;/strong&gt; on port 11800.&lt;/li&gt;
&lt;li&gt;SkyWalking OAP parses metrics with MAL rules and access logs with LAL rules,
then stores everything in &lt;strong&gt;BanyanDB&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;No OpenTelemetry Collector is needed. SkyWalking OAP&amp;rsquo;s built-in OTLP receiver handles everything.&lt;/p&gt;
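&lt;p&gt;As a rough illustration of what the TTFT and TPOT metrics measure, the sketch below computes both from three timestamps of a single streaming response. It uses one common definition (TPOT averaged over the tokens after the first); the gateway may compute these values differently, and the function names and sample timestamps are assumptions made for this example.&lt;/p&gt;

```python
def ttft_seconds(request_start, first_token_at):
    """Time to First Token: delay between sending the request and the first streamed token."""
    return first_token_at - request_start


def tpot_seconds(first_token_at, last_token_at, output_tokens):
    """Time Per Output Token: average gap between tokens after the first one."""
    # Only tokens after the first contribute intervals; max() guards against division by zero.
    intervals = max(output_tokens - 1, 1)
    return (last_token_at - first_token_at) / intervals


# Hypothetical timestamps (seconds) for one streaming response with 21 output tokens.
start, first, last = 0.0, 0.5, 1.5
print("TTFT:", ttft_seconds(start, first))
print("TPOT:", tpot_seconds(first, last, 21))
```

&lt;p&gt;Both values only make sense for streaming responses, which is why the demo later mixes streaming and non-streaming requests.&lt;/p&gt;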
&lt;h2 id=&#34;try-it-locally&#34;&gt;Try It Locally&lt;/h2&gt;
&lt;p&gt;This demo uses &lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt; as a local LLM backend so you can try
everything without an API key. The &lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt;
(&lt;code&gt;aigw&lt;/code&gt;) provides a standalone mode that runs outside Kubernetes — perfect for local testing.&lt;/p&gt;
&lt;h3 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Docker and Docker Compose&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt; installed on your host&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;step-1-start-ollama&#34;&gt;Step 1: Start Ollama&lt;/h3&gt;
&lt;p&gt;Start Ollama listening on all interfaces (&lt;code&gt;0.0.0.0&lt;/code&gt;) so that the Docker containers can reach it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#953800&#34;&gt;OLLAMA_HOST&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;0.0.0.0 ollama serve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Pull a small model for testing:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ollama pull llama3.2:1b
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;step-2-start-the-stack&#34;&gt;Step 2: Start the Stack&lt;/h3&gt;
&lt;p&gt;Create a &lt;code&gt;docker-compose.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;services&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-banyandb:0.10.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17912:17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;wget -qO- http://localhost:17913/api/healthz || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;3s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;10&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-oap-server:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;oap&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;11800:11800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;12800:12800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE_BANYANDB_TARGETS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb:17912&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;bash -c &amp;#39;echo &amp;gt; /dev/tcp/localhost/12800&amp;#39; || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;start_period&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;60s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ui&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-ui:10.4.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ui&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;8080:8080&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_OAP_ADDRESS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;http://oap:12800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;aigw&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;envoyproxy/ai-gateway-cli:latest&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;aigw&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_BASE_URL=http://host.docker.internal:11434/v1&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OPENAI_API_KEY=unused&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_SERVICE_NAME=my-ai-gateway&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_ENDPOINT=http://oap:11800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_EXPORTER_OTLP_PROTOCOL=grpc&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRICS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_LOGS_EXPORTER=otlp&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_METRIC_EXPORT_INTERVAL=5000&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- OTEL_RESOURCE_ATTRIBUTES=job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;1975:1975&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;extra_hosts&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;host.docker.internal:host-gateway&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;run&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Start everything:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose up -d
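&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Wait for the stack to come up. BanyanDB starts first, OAP starts once BanyanDB is healthy, and the UI and AI Gateway start once OAP is healthy. Only BanyanDB and OAP define health checks, so the other two simply show as running:&lt;/p&gt;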
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose ps
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The key OTLP configuration on the &lt;code&gt;aigw&lt;/code&gt; service:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Env Var&lt;/th&gt;
          &lt;th&gt;Value&lt;/th&gt;
          &lt;th&gt;Purpose&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_SERVICE_NAME&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;my-ai-gateway&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Service name in SkyWalking&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;http://oap:11800&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;SkyWalking OAP gRPC endpoint&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_EXPORTER_OTLP_PROTOCOL&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;grpc&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;OTLP transport&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_METRICS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Enable metrics push&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;&lt;code&gt;OTEL_LOGS_EXPORTER&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;&lt;code&gt;otlp&lt;/code&gt;&lt;/td&gt;
          &lt;td&gt;Enable access log push&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;code&gt;OTEL_RESOURCE_ATTRIBUTES&lt;/code&gt; value must include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;job_name=envoy-ai-gateway&lt;/code&gt; — routing tag for MAL/LAL rules&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.instance.id=&amp;lt;id&amp;gt;&lt;/code&gt; — instance identity&lt;/li&gt;
&lt;li&gt;&lt;code&gt;service.layer=ENVOY_AI_GATEWAY&lt;/code&gt; — routes logs to AI Gateway LAL rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The MAL and LAL rules are enabled by default in SkyWalking OAP. No OAP-side configuration is needed.&lt;/p&gt;
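&lt;p&gt;Because a missing resource attribute means the telemetry will not be matched by the AI Gateway rules, a quick pre-flight check can help. The following is a minimal sketch of such a check; the helper names are hypothetical and it is not part of SkyWalking or &lt;code&gt;aigw&lt;/code&gt;. It only parses the comma-separated &lt;code&gt;key=value&lt;/code&gt; string and reports which of the three required keys are absent.&lt;/p&gt;

```python
# The three keys this post lists as required in OTEL_RESOURCE_ATTRIBUTES.
REQUIRED = ("job_name", "service.instance.id", "service.layer")


def parse_resource_attributes(raw):
    """Split a comma-separated key=value string into a dict."""
    attrs = {}
    for pair in raw.split(","):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def missing_attributes(raw):
    """Return the required keys that are absent or empty."""
    attrs = parse_resource_attributes(raw)
    return [key for key in REQUIRED if not attrs.get(key)]


# Value copied from the docker-compose.yaml above.
raw = "job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY"
print(missing_attributes(raw))
```

&lt;p&gt;An empty list means all three attributes are present. The check does not validate the values themselves.&lt;/p&gt;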
&lt;h3 id=&#34;step-3-run-the-demo-app&#34;&gt;Step 3: Run the Demo App&lt;/h3&gt;
&lt;p&gt;Create a simple Python application, &lt;code&gt;app.py&lt;/code&gt;, that sends requests through the AI Gateway.
It mixes normal requests, streaming requests (for TTFT/TPOT metrics), and error requests
(a non-existent model returns HTTP 404, which the LAL sampling policy always captures):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;import&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;time&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;random&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#24292e&#34;&gt;requests&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;GATEWAY &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;http://localhost:1975&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;HEADERS &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Authorization&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Bearer unused&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Content-Type&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;application/json&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;questions &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Apache SkyWalking? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What is Envoy Proxy used for? Answer in one sentence.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;What are the benefits of an AI gateway? Answer in two sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;Explain observability in three sentences.&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#6639ba&#34;&gt;chat&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;False&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    resp &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; requests&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;post&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;GATEWAY&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;/v1/chat/completions&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        json&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; model&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;messages&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; question&lt;span style=&#34;color:#1f2328&#34;&gt;}],&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;stream&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        headers&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;HEADERS&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; timeout&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;60&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;stream&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; stream&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        chunks &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;for&lt;/span&gt; line &lt;span style=&#34;color:#0550ae&#34;&gt;in&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;iter_lines&lt;span style=&#34;color:#1f2328&#34;&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; line&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                chunks&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;append&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;line&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;decode&lt;span style=&#34;color:#1f2328&#34;&gt;())&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[streamed &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;len&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;chunks&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; chunks]&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;status_code&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; resp&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;json&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;while&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    r &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;random&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Error request: non-existent model triggers 404&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;non-existent-model&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;hello&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[error] model=non-existent-model status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;elif&lt;/span&gt; r &lt;span style=&#34;color:#0550ae&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Streaming request — generates TTFT and TPOT metrics&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; info &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; stream&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;True&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[stream] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;info&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;else&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#57606a&#34;&gt;# Normal non-streaming request&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        q &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;choice&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;questions&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        status&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; body &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; chat&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;llama3.2:1b&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; q&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        answer &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;choices&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[{}])[&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;0&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;message&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)[:&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;80&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        tokens &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; body&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;usage&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{})&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;print&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;f&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[ok] status=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;status&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; tokens=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;tokens&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt; answer=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;{&lt;/span&gt;answer&lt;span style=&#34;color:#0a3069&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;...&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    time&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;sleep&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;random&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;randint&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;20&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;30&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Run it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pip install requests
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;python app.py
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The application talks to the AI Gateway on port 1975, which routes to Ollama.
Each request generates GenAI metrics (token usage, latency, TTFT, TPOT) and access logs
that the gateway pushes to SkyWalking via OTLP.&lt;/p&gt;
&lt;p&gt;The error requests (non-existent model → HTTP 404) are always captured by the access log
sampling policy, so you will see them in the SkyWalking log view.&lt;/p&gt;
&lt;h3 id=&#34;step-4-view-in-skywalking-ui&#34;&gt;Step 4: View in SkyWalking UI&lt;/h3&gt;
&lt;p&gt;Open &lt;a href=&#34;http://localhost:8080&#34;&gt;http://localhost:8080&lt;/a&gt; and select the &lt;strong&gt;GenAI &amp;gt; Envoy AI Gateway&lt;/strong&gt; menu.&lt;/p&gt;
&lt;p&gt;The service list shows &lt;code&gt;my-ai-gateway&lt;/code&gt; with CPM, latency, and token rates at a glance:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-1.png&#34; alt=&#34;Service list&#34;&gt;&lt;/p&gt;
&lt;p&gt;Click into the service to see the full dashboard — Request CPM, Latency (average + percentiles),
Input/Output Token Rates, TTFT, and TPOT:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-2.png&#34; alt=&#34;Service dashboard&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Providers&lt;/strong&gt; tab breaks down metrics by AI provider:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-3.png&#34; alt=&#34;Provider breakdown&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Models&lt;/strong&gt; tab shows per-model metrics including TTFT and TPOT (streaming only).
Note the &lt;code&gt;unknown&lt;/code&gt; model entries — these are the error requests with non-existent models:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-4.png&#34; alt=&#34;Model breakdown&#34;&gt;&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Log&lt;/strong&gt; tab shows access logs. The sampling policy drops normal successful responses
but always captures errors (HTTP 404) and high-token requests:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;screen-5.png&#34; alt=&#34;Access logs&#34;&gt;&lt;/p&gt;
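&lt;p&gt;The sampling behavior described above can be pictured as a simple predicate. The following is only a sketch in Python, not the LAL rule that ships with SkyWalking; the function name &lt;code&gt;keep_access_log&lt;/code&gt; and the token threshold are hypothetical.&lt;/p&gt;

```python
HIGH_TOKEN_THRESHOLD = 10_000  # hypothetical cutoff, not taken from the LAL rules


def keep_access_log(status_code, total_tokens):
    """Return True when an access log entry should survive sampling."""
    if status_code >= 400:
        # Errors, such as the HTTP 404 from the demo, are always kept.
        return True
    # Successful responses are kept only when they are unusually expensive.
    return total_tokens >= HIGH_TOKEN_THRESHOLD


print(keep_access_log(404, 12))      # error request from the demo app
print(keep_access_log(200, 150))     # normal successful response
print(keep_access_log(200, 25_000))  # successful but high-token request
```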
&lt;h3 id=&#34;cleanup&#34;&gt;Cleanup&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose down
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;deploying-on-kubernetes&#34;&gt;Deploying on Kubernetes&lt;/h2&gt;
&lt;p&gt;For production deployments, Envoy AI Gateway runs as a full Kubernetes controller with
Envoy Gateway as the control plane. See the
&lt;a href=&#34;https://aigateway.envoyproxy.io/docs/getting-started/&#34;&gt;Envoy AI Gateway getting started guide&lt;/a&gt;
for Kubernetes installation.&lt;/p&gt;
&lt;p&gt;The OTLP configuration is the same — set the &lt;code&gt;OTEL_*&lt;/code&gt; environment variables on the
AI Gateway&amp;rsquo;s external processor to point at SkyWalking OAP&amp;rsquo;s gRPC port (11800).
See the &lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway Monitoring&lt;/a&gt;
documentation for details.&lt;/p&gt;
&lt;h2 id=&#34;genai-observability-without-an-ai-gateway&#34;&gt;GenAI Observability Without an AI Gateway&lt;/h2&gt;
&lt;p&gt;Not every deployment uses an AI gateway. If your applications call LLM providers directly,
SkyWalking 10.4.0 also provides GenAI observability through the
&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;Virtual GenAI&lt;/a&gt; layer.&lt;/p&gt;
&lt;p&gt;This works with any SkyWalking-instrumented, OpenTelemetry-instrumented, or Zipkin-instrumented application.
When traces carry &lt;code&gt;gen_ai.*&lt;/code&gt; tags (following
&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt;),
SkyWalking derives per-provider and per-model metrics from the client side:
latency, token usage, success rate, and estimated cost.&lt;/p&gt;
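&lt;p&gt;To illustrate the idea, the sketch below groups a few hand-written span records by provider and model and derives average latency, success rate, and token totals. It is not how OAP implements the Virtual GenAI layer. The &lt;code&gt;gen_ai.*&lt;/code&gt; keys follow recent versions of the OpenTelemetry GenAI semantic conventions (older versions used &lt;code&gt;gen_ai.system&lt;/code&gt; for the provider); &lt;code&gt;duration_ms&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;, the model name &lt;code&gt;example-model&lt;/code&gt;, and all sample values are assumptions. Estimated cost is left out because it needs a price table that is not given here.&lt;/p&gt;

```python
from collections import defaultdict

# Hand-written sample spans. duration_ms and error are hypothetical field names;
# the gen_ai.* keys follow the OpenTelemetry GenAI semantic conventions.
spans = [
    {"gen_ai.provider.name": "openai", "gen_ai.request.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 40,
     "duration_ms": 800, "error": False},
    {"gen_ai.provider.name": "openai", "gen_ai.request.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 80, "gen_ai.usage.output_tokens": 0,
     "duration_ms": 200, "error": True},
    {"gen_ai.provider.name": "anthropic", "gen_ai.request.model": "example-model",
     "gen_ai.usage.input_tokens": 50, "gen_ai.usage.output_tokens": 30,
     "duration_ms": 600, "error": False},
]


def aggregate(spans):
    """Group spans by (provider, model) and derive client-side metrics."""
    totals = defaultdict(lambda: {"calls": 0, "errors": 0, "in": 0, "out": 0, "ms": 0})
    for span in spans:
        t = totals[(span["gen_ai.provider.name"], span["gen_ai.request.model"])]
        t["calls"] += 1
        t["errors"] += int(span["error"])
        t["in"] += span.get("gen_ai.usage.input_tokens", 0)
        t["out"] += span.get("gen_ai.usage.output_tokens", 0)
        t["ms"] += span["duration_ms"]
    return {
        key: {
            "avg_latency_ms": t["ms"] / t["calls"],
            "success_rate": 1 - t["errors"] / t["calls"],
            "input_tokens": t["in"],
            "output_tokens": t["out"],
        }
        for key, t in totals.items()
    }


metrics = aggregate(spans)
for (provider, model), m in metrics.items():
    print(provider, model, m)
```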
&lt;p&gt;For Java applications, the SkyWalking Java Agent (9.7+) includes a Spring AI plugin that automatically
instruments calls to 13+ providers (OpenAI, Anthropic, AWS Bedrock, Google GenAI, DeepSeek, Mistral, etc.)
with the correct &lt;code&gt;gen_ai.*&lt;/code&gt; span tags — no code changes needed.&lt;/p&gt;
&lt;p&gt;This is a different use case from the Envoy AI Gateway monitoring covered above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Envoy AI Gateway layer&lt;/strong&gt;: infrastructure-level observability — what the gateway sees across all traffic.
Best for platform teams managing centralized AI routing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Virtual GenAI layer&lt;/strong&gt;: application-level observability — what each instrumented app sees for its own LLM calls.
Best for teams without a centralized gateway, or for per-application cost tracking.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;references&#34;&gt;References&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://aigateway.envoyproxy.io/&#34;&gt;Envoy AI Gateway&lt;/a&gt; — project site and documentation&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/envoyproxy/ai-gateway/tree/main/cmd/aigw&#34;&gt;Envoy AI Gateway CLI&lt;/a&gt; — standalone mode for local development&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/&#34;&gt;SkyWalking Envoy AI Gateway Monitoring&lt;/a&gt; — OAP setup doc&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/&#34;&gt;SkyWalking Virtual GenAI&lt;/a&gt; — client-side GenAI observability&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/gen-ai/&#34;&gt;OpenTelemetry GenAI Semantic Conventions&lt;/a&gt; — the metric/attribute standard both projects follow&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Zh: How AI Coding Is Reshaping the Way Software Architects Work</title>
      <link>/zh/2026-03-13-how-ai-changed-the-economics-of-architecture/</link>
      <pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/zh/2026-03-13-how-ai-changed-the-economics-of-architecture/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;Using the SkyWalking GraalVM Distro as a case study: how AI Coding turned a batch of exploratory PoCs into a repeatable migration pipeline.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./graph.jpg&#34; alt=&#34;graph.jpg&#34;&gt;&lt;/p&gt;
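&lt;p&gt;The biggest lesson I took from this project is not how much code AI can write, but that AI Coding changes the cost of trial and error in architecture design. When an idea can quickly become a PoC, be run and validated, and be thrown away and redone if it fails, an architect has a much better chance of reaching the design they actually want, instead of settling too early for a compromise that is merely “what the team can build right now.”&lt;/p&gt;
&lt;p&gt;This shift matters most in mature open source systems. Apache SkyWalking OAP has long been a powerful, production-proven observability backend, but it has every problem a large Java platform is expected to have: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These mechanisms make the platform easy to extend, but every one of them is an obstacle for GraalVM Native Image.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SkyWalking GraalVM Distro&lt;/strong&gt; exists because we treated this challenge as an architecture design problem rather than a one-off porting effort. The goal was not only to run OAP as a native binary, but to turn the GraalVM migration itself into a repeatable, automated pipeline that keeps pace with upstream changes.&lt;/p&gt;
&lt;p&gt;For the full technical design, benchmark data, and getting-started instructions, read the companion post: &lt;a href=&#34;/zh/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/&#34;&gt;SkyWalking GraalVM Distro: Design and Benchmarks&lt;/a&gt;.&lt;/p&gt;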
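&lt;h2 id=&#34;从停滞的想法到可运行的系统&#34;&gt;From a Stalled Idea to a Running System&lt;/h2&gt;
&lt;p&gt;This effort actually began years ago. Shortly after the repository was created, &lt;a href=&#34;https://github.com/yswdqz&#34;&gt;yswdqz&lt;/a&gt; spent several months exploring migration approaches. Doing the work revealed that the project was far more complex than the individual limitations listed in the GraalVM documentation, and the effort was eventually shelved for years.&lt;/p&gt;
&lt;p&gt;That stall matters. What was missing was not ideas. Experienced maintainers rarely lack ideas; what is scarce is the time, people, and energy to actually build them. Even when an architect can see several promising routes, limited development resources force an early trade-off: pick the option that is cheapest to implement rather than the one that is cleaner, more reusable, and more resilient to future change.&lt;/p&gt;
&lt;p&gt;This situation is common, not special. In open source communities, much of the work depends on volunteers or limited corporate sponsorship. In commercial products the constraints take a different form, but the essence is the same: roadmap commitments, team size, and delivery pressure keep engineering resources tight. In both settings, many good ideas are abandoned not because they are wrong, but because validating them properly and implementing them completely costs too much.&lt;/p&gt;
&lt;p&gt;There is another constraint that matters just as much: an architect is usually also a very senior engineer, not someone who can work on implementation details full time. Personal coding capacity is limited, time is highly fragmented, and the design intent has to be explained to other senior engineers again and again before any code exists. Traditionally that explanation happens through diagrams, documents, and conversation. It is slow, lossy, and full of uncertainty. We have all played the “telephone game”: even a simple message is easily misunderstood, and by the time the misunderstanding surfaces, a lot of time has already passed.&lt;/p&gt;
&lt;p&gt;By late 2025, AI Coding finally made it realistic to try several routes at once. We no longer had to accept a compromise early because implementation capacity was scarce. Instead we could move back and forth between designs, validate them in code, quickly discard the weak ones, and keep iterating until the architecture itself was solid, practical, and efficient enough.&lt;/p&gt;
&lt;p&gt;That design freedom is critical. The GraalVM documentation explains each individual limitation clearly, but a mature OSS platform faces a whole set of interlocking, systemic problems. Patching one dynamic mechanism is nowhere near enough. To make native image work in practice, we had to convert entire categories of runtime behavior into build-time artifacts and automatically generated metadata.&lt;/p&gt;
&lt;p&gt;Early on, there was also one very concrete mountain to climb. At the time, upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory this was just another case of “runtime dynamic behavior is not supported”; in practice, Groovy was the single biggest obstacle on the whole path. It meant not only script execution, but an entire dynamic model that is extremely convenient on the JVM and extremely hostile to native image.&lt;/p&gt;
&lt;p&gt;To get past it, we redesigned the core OAP engine around an AOT-first model. Early experiments had to confront Groovy-era runtime behavior directly and tried different script compilation approaches to work around it. The final solution goes further: it aligns with the upstream compiler pipeline, moves dynamic generation to build time, and adds automation so that the migration path stays under control as upstream keeps evolving. Concretely, the generation of OAL, MAL, LAL, and Hierarchy code becomes the output of build-time precompilers instead of remaining dynamic behavior at startup.&lt;/p&gt;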
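&lt;h2 id=&#34;ai-coding-如何改写架构迭代&#34;&gt;How AI Coding Rewrites Architecture Iteration&lt;/h2&gt;
&lt;p&gt;The key to this shift is not just “writing code faster.” What AI really changed is the speed of iterating between idea, prototype, validation, and redesign. For a single problem we can quickly build several runnable PoCs, rapidly discard the directions that do not hold up, and gradually distill the abstractions worth keeping into a coherent migration system.&lt;/p&gt;
&lt;p&gt;This does not diminish the architectural value of people; it amplifies it. Which behavior should move to build time, where configurability should be preserved, where same-FQCN replacement should be introduced, how upstream sync stays manageable, and which abstractions are worth keeping at any cost: these judgments can still only be made by humans. The difference is that the speed of AI finally gives us the chance to actually build these better designs, rather than retreating early to a simpler and worse compromise.&lt;/p&gt;
&lt;p&gt;This is where the way software architects work really changes. In the past, architects often already knew where the cleaner direction lay, but limited engineering capacity forced that vision back to a cheaper compromise. Now architects have, in a sense, become “people who can get hands-on quickly” again: they can sketch an idea directly in code, turn high-level abstractions into interfaces, and prove the design with an implementation that actually runs.&lt;/p&gt;
&lt;p&gt;This changes communication as well as implementation. In open source we often say: &lt;code&gt;talk is cheap, show me the code&lt;/code&gt;. In the age of AI Coding, showing the code has become much easier. Design depends far less on a slow, top-down translation process from idea to document to explanation to implementation. Code can appear earlier and run earlier.&lt;/p&gt;
&lt;p&gt;Other senior engineers benefit too. They no longer have to reconstruct the whole design from diagrams, meetings, or long explanations. They can review the abstractions directly, read real code, run it, challenge it, and refine a concrete implementation together. Architecture collaboration becomes faster and clearer, with far fewer communication errors.&lt;/p&gt;
&lt;p&gt;That is also why I feel that much of the current AI discussion is somewhat off target. Many projects are genuinely interesting and fun, and trying them out is fine, but advanced engineering work does not automatically improve because “an agent was hooked up to the codebase.” What matters is not which demo looks flashiest, but which engineering capabilities are actually amplified, and whether the discipline of software development itself is preserved.&lt;/p&gt;
&lt;p&gt;For architects and senior engineers, the capabilities that really matter here include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Rapid comparative prototyping&lt;/strong&gt;: instead of arguing for an idea with slides and documents alone, build several options as runnable code and compare them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code comprehension at scale&lt;/strong&gt;: read quickly across a large number of modules while keeping a global view of the whole system.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Systematic refactoring&lt;/strong&gt;: systematically convert paths that are reflection-based and depend on runtime dynamic behavior into designs that fit AOT constraints.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Building automation&lt;/strong&gt;: when a migration step has to be redone on every upstream sync, doing it by hand is time-consuming and laborious, and only gets more tiring over time. AI makes it feasible to invest in generators, manifests, consistency checks, and drift detection, turning repetitive manual labor into a repeatable automated process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review at scale&lt;/strong&gt;: check edge cases, compatibility constraints, and whether a solution holds up under repeated execution, across a very large code surface.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These capabilities all show up in the final design. Same-FQCN replacement establishes a clear, controlled boundary for GraalVM-specific behavior. Reflection metadata no longer depends on a hand-maintained list of guesses; it is generated directly from build artifacts. The manifest mechanisms and drift detection turn the previously vague “upstream sync risk” into an explicit engineering workflow.&lt;/p&gt;
&lt;p&gt;For junior engineers, I think the lesson is just as important. AI does not make fundamentals such as architecture design, system constraints, interface design, testing, and maintainability less important. On the contrary, they only become more important, because they determine whether “accelerated implementation” ultimately produces a system that can evolve sustainably or simply produces more code faster. The real leverage comes from engineering judgment, not novelty.&lt;/p&gt;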
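&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Gemini AI&lt;/strong&gt; acted as engineering accelerators throughout the process. In the GraalVM Distro project, they helped us with several specific things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Turning migration ideas directly into runnable code&lt;/strong&gt;: instead of debating which direction might work, we built several real prototypes, ran them, compared them, and eliminated the directions that did not hold up.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Refactoring reflection-heavy and dynamic code paths&lt;/strong&gt;: systematically replacing patterns that do not suit a native-image runtime with AOT-friendly implementations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Making upstream sync truly sustainable&lt;/strong&gt;: every time the distro pulls changes from upstream SkyWalking, metadata scanning, configuration regeneration, and recompilation all have to be repeated. AI helped us turn these steps into a pipeline, so that each sync becomes a controlled, mostly automated process rather than manual repetition that grows longer every time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reviewing logic and edge cases across a wide scope&lt;/strong&gt;: especially where feature parity matters more than raw implementation speed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The end result is not just one big rewrite but a repeatable system: precompilers, manifest-driven loading, reflection configuration generation, replacement boundaries, and drift detection that makes upstream migration reviewable and automatable.&lt;/p&gt;
&lt;p&gt;For the broader background behind this development approach, read &lt;a href=&#34;/zh/2026-03-08-agentic-vibe-coding/&#34;&gt;Practicing Agentic Vibe Coding in a Large, Mature Open Source Project: Software Engineering and Engineering Cybernetics Carry On&lt;/a&gt;. This post is the next step in that story: not just enhancing features in a mature codebase, but reviving a piece of work that had stalled and turning it into a running system.&lt;/p&gt;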
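&lt;h2 id=&#34;真正改变的到底是什么&#34;&gt;What Actually Changed&lt;/h2&gt;
&lt;p&gt;The most important result of this project is not a benchmark table. The benchmark data belongs to the distro itself, and it matters because it proves the system really runs. For this post, though, the deeper change is methodological: AI Coding changed how we explore, validate, and refine architectural options.&lt;/p&gt;
&lt;p&gt;In the past, architecture was often a document-centric activity trailed by a long and expensive implementation process. Now we can move much faster between idea, prototype, comparison, and redesign. That gives us a real chance to pursue solutions at a higher level of abstraction, keep cleaner boundaries, and build the automation that keeps the migration maintainable over time.&lt;/p&gt;
&lt;p&gt;The technical evidence for this work is the SkyWalking GraalVM Distro itself: not only a running system, but a migration pipeline made up of precompilers, automatically generated reflection metadata, controlled replacement boundaries, and drift checks. The benchmark data matters because it proves the system holds up in practice. From an architectural point of view, however, the real result is that the migration is no longer a one-off port; it has become a repeatable piece of systems engineering. For the full test methodology, raw data, and technical design, read the companion post: &lt;a href=&#34;/zh/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/&#34;&gt;SkyWalking GraalVM Distro: Design and Benchmarks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The project repository is at &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro&#34;&gt;apache/skywalking-graalvm-distro&lt;/a&gt;. We welcome community members to test the new distribution, file issues, and help it move toward production readiness.&lt;/p&gt;
&lt;p&gt;For me, the deeper lesson goes beyond this distribution. AI Coding does not make architecture less important; it makes architecture more worth pursuing seriously. Once implementation is fast enough, we finally get to validate more ideas in real code, keep the abstractions that are genuinely good, and actually build the systems that used to be compromised halfway because the investment was too large.&lt;/p&gt;
&lt;p&gt;For senior engineers, the bottleneck is shifting from raw implementation speed to taste, system judgment, and the ability to define stable boundaries. For junior engineers, the right path is not to chase every exciting-looking AI workflow, but to strengthen the fundamentals so that acceleration actually compounds: understanding requirements, reading unfamiliar systems, questioning assumptions, and identifying the parts that must remain correct while the system changes quickly. AI Coding lowers the cost of validating a good design, but it does not lower the bar for engineering judgment itself.&lt;/p&gt;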

      </description>
    </item>
    
    <item>
      <title>Zh: SkyWalking GraalVM Distro: Design and Benchmarks</title>
      <link>/zh/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/</link>
      <pubDate>Sun, 15 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/zh/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;This post walks through how we migrated Apache SkyWalking OAP to GraalVM Native Image. The goal was not a one-off port, but a process that can keep pace with upstream evolution.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./graph.jpg&#34; alt=&#34;graph.jpg&#34;&gt;&lt;/p&gt;
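&lt;p&gt;For the bigger picture behind this work, and how AI Coding made the project feasible, read: &lt;a href=&#34;/zh/2026-03-13-how-ai-changed-the-economics-of-architecture/&#34;&gt;How AI Coding Is Reshaping the Way Software Architects Work&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;为什么-graalvm-在这里是刚需&#34;&gt;Why GraalVM Is a Hard Requirement Here&lt;/h2&gt;
&lt;p&gt;GraalVM Native Image compiles Java applications ahead of time (AOT) into standalone executables. For an observability backend like SkyWalking OAP, this is not a nice-to-have performance optimization but a clear engineering requirement.&lt;/p&gt;
&lt;p&gt;An observability platform must be the most reliable part of the infrastructure. It has to survive the very failures it is supposed to observe. In cloud-native environments, workloads constantly scale, migrate, and restart, so the backend that observes everything cannot itself be a heavyweight process that starts slowly, has a large idle footprint, and recovers sluggishly.&lt;/p&gt;
&lt;p&gt;Our benchmark results make this concrete:&lt;/p&gt;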
&lt;ul&gt;
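&lt;li&gt;&lt;strong&gt;Startup time:&lt;/strong&gt; about 5 ms versus about 635 ms. In a Kubernetes cluster, when an OAP Pod is evicted or rescheduled, a 635 ms gap means telemetry sent during that window may be lost. At 5 ms, the new Pod is usually accepting data again before most clients notice the interruption.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Idle memory:&lt;/strong&gt; about 41 MiB versus about 1.2 GiB. An observability backend runs 24/7. In multi-tenant or edge deployments, a 97% lower baseline RSS lets it fit on smaller nodes instead of requiring a dedicated machine.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory under load:&lt;/strong&gt; about 629 MiB versus about 2.0 GiB at 20 RPS. A 70% memory reduction under production-grade load translates directly into fewer nodes, a lower cloud bill, and more headroom before the backend itself becomes the scaling bottleneck.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No warm-up penalty:&lt;/strong&gt; peak throughput is available sooner. The JVM&amp;#39;s JIT compiler often needs minutes of traffic to finish hot-spot optimization; during that time tail latency is worse and data processing lags. A native binary has no equivalent phase.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smaller attack surface:&lt;/strong&gt; a full JDK runtime is no longer required, so there are far fewer CVEs to track and patch. That matters for a component that receives data from every service in the cluster.&lt;/li&gt;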
&lt;/ul&gt;
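&lt;p&gt;None of these are minor tweaks. They directly change which deployment shapes become viable: serverless-style observability backends, sidecar-style collection models, and edge nodes with extremely tight memory budgets. These options only become practical when the backend is light enough and fast enough.&lt;/p&gt;
&lt;h2 id=&#34;挑战一个成熟动态特性很多的-java-平台&#34;&gt;The Challenge: A Mature Java Platform Full of Dynamic Behavior&lt;/h2&gt;
&lt;p&gt;SkyWalking OAP has all the typical problems of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These mechanisms make extension easy, but every one of them is an obstacle for GraalVM native image.&lt;/p&gt;
&lt;p&gt;The limitations listed in the GraalVM documentation are only where the problem starts. In a mature OSS platform, those limitations are deeply entangled with years of accumulated runtime design decisions. A conventional GraalVM native image struggles with runtime class generation, reflection, dynamic discovery, and script execution, and in SkyWalking OAP none of these are isolated occurrences; they are part of the system design.&lt;/p&gt;
&lt;p&gt;There was also a very concrete mountain in the early history of this distribution. At that time upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, Groovy was just one more unsupported, runtime-heavy component; in practice, it was the biggest obstacle on the whole path. It was not merely a script-execution problem; it represented an entire dynamic model that is extremely convenient on the JVM and extremely unfriendly to native image.&lt;/p&gt;
&lt;h2 id=&#34;设计目标让迁移这件事可以重复做&#34;&gt;Design Goal: Make the Migration Repeatable&lt;/h2&gt;
&lt;p&gt;The design goal was not to “get native-image working once and be done”, but to build a migration system that can be reused and maintained over the long term:&lt;/p&gt;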
&lt;ol&gt;
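&lt;li&gt;&lt;strong&gt;Move runtime-generated artifacts to build time.&lt;/strong&gt; OAL, MAL, LAL, and Hierarchy rules, along with meter-related generated classes, are compiled and packaged at build time instead of being generated dynamically at startup.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Replace dynamic discovery with deterministic loading.&lt;/strong&gt; Classpath scanning and runtime registration paths are converted to manifest-based loading.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reduce runtime reflection and generate native metadata at build time.&lt;/strong&gt; Reflection configuration no longer relies on a hand-maintained list of guesses; it is generated from the real manifests and scan results.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Keep the upstream sync boundary clear.&lt;/strong&gt; Same-FQCN replacements are explicitly packaged, inventoried, and guarded by staleness checks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Surface change immediately.&lt;/strong&gt; As soon as an upstream provider, rule file, or replaced source file changes, tests fail and force an explicit review.&lt;/li&gt;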
&lt;/ol&gt;
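&lt;p&gt;This is the most important architectural shift. Good abstractions and foresight have not become less important in the AI era; they matter more, because they determine whether the speed AI brings produces a maintainable system or just a pile of code that bloats faster.&lt;/p&gt;
&lt;h2 id=&#34;把运行时动态行为变成构建期产物&#34;&gt;Turning Runtime Dynamic Behavior into Build-Time Artifacts&lt;/h2&gt;
&lt;p&gt;SkyWalking OAP contains several dynamic subsystems that are natural on the JVM but troublesome for native image:&lt;/p&gt;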
&lt;ul&gt;
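&lt;li&gt;OAL generates classes at runtime.&lt;/li&gt;
&lt;li&gt;LAL, MAL, and Hierarchy were historically tied to a large amount of Groovy-based runtime behavior, which was one of the hardest blockers in the early distro work.&lt;/li&gt;
&lt;li&gt;MAL, LAL, and Hierarchy rules depend on runtime compilation.&lt;/li&gt;
&lt;li&gt;Guava-based classpath scanning discovers annotations, dispatchers, decorators, and meter functions.&lt;/li&gt;
&lt;li&gt;SPI-based module and provider discovery depends on a more dynamic runtime environment.&lt;/li&gt;
&lt;li&gt;YAML/config initialization and framework integrations depend on reflective access.&lt;/li&gt;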
&lt;/ul&gt;
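&lt;p&gt;In SkyWalking GraalVM Distro, these problems were not patched one by one; they were consolidated into a single build-time pipeline.&lt;/p&gt;
&lt;p&gt;During the build, the precompiler runs the DSL engines, exports the generated classes, writes manifests, serializes configuration data, and generates native-image metadata. As a result, startup only needs to load and register classes, with no runtime code generation. The runtime gets simpler because the complexity has been moved to build time.&lt;/p&gt;
&lt;p&gt;That is also why this project is more than a performance optimization. Our design goal was to move complexity to a place where it is easier to verify, easier to automate, and easier to run repeatedly.&lt;/p&gt;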
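&lt;p&gt;As a rough illustration of the idea (not the distro&amp;#39;s actual Java implementation), the sketch below shows manifest-driven loading in Python: a build step is assumed to have written the fully qualified names of generated classes into a manifest, and startup only resolves and registers those names. The manifest layout and the function name &lt;code&gt;load_from_manifest&lt;/code&gt; are hypothetical, and standard-library classes stand in for generated ones.&lt;/p&gt;

```python
import importlib
import json

# Hypothetical manifest that a build step might write. The entries are
# standard-library classes used as stand-ins for pre-generated classes.
MANIFEST_JSON = json.dumps(
    {"classes": ["collections.OrderedDict", "fractions.Fraction"]}
)


def load_from_manifest(manifest_text):
    """Resolve every class named in the manifest.

    No classpath scanning and no code generation happens here: startup
    work is limited to importing and registering what the build listed.
    """
    registry = {}
    for qualified_name in json.loads(manifest_text)["classes"]:
        module_name, _, class_name = qualified_name.rpartition(".")
        module = importlib.import_module(module_name)
        registry[qualified_name] = getattr(module, class_name)
    return registry


registry = load_from_manifest(MANIFEST_JSON)
print(sorted(registry))
```

&lt;p&gt;The point of the design is that the set of loadable classes is fixed and reviewable at build time, which is the property an ahead-of-time compiler needs.&lt;/p&gt;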
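&lt;h2 id=&#34;same-fqcn-替换一条可控的边界&#34;&gt;Same-FQCN Replacement: A Controlled Boundary&lt;/h2&gt;
&lt;p&gt;One of the most practical design choices in this distribution is the use of same-FQCN replacement classes. We did not rely on obscure startup tricks or undocumented class-loading order assumptions. Instead, we repackage the GraalVM-specific jars, exclude the original upstream classes, and let the replacement classes take exactly the same fully-qualified class names.&lt;/p&gt;
&lt;p&gt;This is critical for maintainability because it establishes a very clear boundary:&lt;/p&gt;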
&lt;ul&gt;
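&lt;li&gt;upstream classes still define the behavioral contract;&lt;/li&gt;
&lt;li&gt;the GraalVM-side replacement classes provide a compatible implementation strategy;&lt;/li&gt;
&lt;li&gt;the packaging process makes the replacement explicit and visible.&lt;/li&gt;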
&lt;/ul&gt;
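&lt;p&gt;For example, OAL loading changed from runtime compilation to manifest-based loading of precompiled classes. Similar replacements handle MAL and LAL DSL loading, module wiring, config initialization, and several reflection-sensitive paths. The goal is not to fork everything, but to replace only the parts whose runtime model is fundamentally unsuited to native image.&lt;/p&gt;
&lt;p&gt;This boundary is then guarded by tests that hash the upstream source files corresponding to each replacement. When upstream changes any of those files, the build fails and tells us exactly which replacement needs to be reviewed again. That turns “how do we keep up with upstream” from an anxious, abstract question into concrete, actionable engineering work.&lt;/p&gt;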
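&lt;p&gt;The hash-based guard can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the repository&amp;#39;s actual test code: the replacement names, the sample sources, and the function &lt;code&gt;find_stale_replacements&lt;/code&gt; are all hypothetical, and SHA-256 is assumed as the hash.&lt;/p&gt;

```python
import hashlib


def sha256_of(text):
    """Hex SHA-256 digest of a source file's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale_replacements(recorded_hashes, current_sources):
    """Return names whose upstream source no longer matches the recorded hash."""
    stale = []
    for name, recorded_hash in sorted(recorded_hashes.items()):
        if sha256_of(current_sources[name]) != recorded_hash:
            stale.append(name)
    return stale


# Hypothetical upstream sources as they looked when the replacements were reviewed.
reviewed_sources = {
    "ExampleRuleLoader": "class ExampleRuleLoader {}",
    "ExampleModuleWiring": "class ExampleModuleWiring {}",
}
recorded_hashes = {name: sha256_of(src) for name, src in reviewed_sources.items()}

# Later, upstream edits one of the files.
current_sources = dict(
    reviewed_sources, ExampleModuleWiring="class ExampleModuleWiring { int x; }"
)

stale = find_stale_replacements(recorded_hashes, current_sources)
print(stale)
```

&lt;p&gt;In a test suite, a non-empty result would fail the build and name the replacements that need review.&lt;/p&gt;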
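&lt;h2 id=&#34;反射配置不是猜出来的而是生成出来的&#34;&gt;Reflection Config Is Generated, Not Guessed&lt;/h2&gt;
&lt;p&gt;In many GraalVM migration projects, &lt;code&gt;reflect-config.json&lt;/code&gt; ends up as an artifact accumulated by trial and error. It keeps growing and going stale, until nobody really knows whether it is complete or why each entry exists. That pattern does not scale in a large, continuously evolving OSS platform.&lt;/p&gt;
&lt;p&gt;In this distribution, reflection metadata is generated directly from build artifacts and scan results, including:&lt;/p&gt;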
&lt;ul&gt;
&lt;li&gt;manifests for OAL, MAL, LAL, Hierarchy, and meter-generated classes;&lt;/li&gt;
&lt;li&gt;classes found by annotation scanning;&lt;/li&gt;
&lt;li&gt;Armeria HTTP handlers;&lt;/li&gt;
&lt;li&gt;GraphQL resolvers and schema-mapped types;&lt;/li&gt;
&lt;li&gt;accepted &lt;code&gt;ModuleConfig&lt;/code&gt; classes.&lt;/li&gt;
&lt;/ul&gt;
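&lt;p&gt;This is a much healthier model. Instead of relying on people to remember every path that might trigger reflective access, the system derives reflection metadata from the real migration pipeline. The build itself becomes the source of truth.&lt;/p&gt;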
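&lt;p&gt;To make the generated-not-guessed idea concrete, here is a small Python sketch that derives &lt;code&gt;reflect-config.json&lt;/code&gt; entries from lists of class names. The class names and the choice to enable all declared constructors, methods, and fields are assumptions for illustration; the distro&amp;#39;s real generator is part of its Java build and may emit different flags.&lt;/p&gt;

```python
import json


def build_reflect_config(class_names):
    """Emit one reflection entry per class, sorted and de-duplicated.

    Sorting keeps the generated file stable, so diffs only show real changes.
    """
    return [
        {
            "name": name,
            "allDeclaredConstructors": True,
            "allDeclaredMethods": True,
            "allDeclaredFields": True,
        }
        for name in sorted(set(class_names))
    ]


# Hypothetical inputs standing in for manifest contents and scan results.
manifest_classes = [
    "org.example.generated.MetricsA",
    "org.example.generated.MetricsB",
]
scanned_classes = [
    "org.example.config.SampleModuleConfig",
    "org.example.generated.MetricsA",  # duplicate, found by both sources
]

config = build_reflect_config(manifest_classes + scanned_classes)
reflect_config_json = json.dumps(config, indent=2)
print(reflect_config_json)
```

&lt;p&gt;Because every entry traces back to an input list, the question of why an entry exists is answered by the build inputs rather than by memory.&lt;/p&gt;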
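&lt;h2 id=&#34;让上游同步变得现实可行&#34;&gt;Making Upstream Sync Realistic&lt;/h2&gt;
&lt;p&gt;If this distribution were only a one-off engineering sprint, it would mean much less. The truly hard part is keeping it maintainable while upstream SkyWalking continues to evolve.&lt;/p&gt;
&lt;p&gt;That is why the repository carries a full set of explicit inventories and drift-detection mechanisms:&lt;/p&gt;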
&lt;ul&gt;
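&lt;li&gt;a provider inventory that forces new upstream providers to be classified;&lt;/li&gt;
&lt;li&gt;a rule-file inventory that forces new DSL inputs to be explicitly acknowledged;&lt;/li&gt;
&lt;li&gt;SHA watchers for precompiled YAML inputs;&lt;/li&gt;
&lt;li&gt;SHA watchers for upstream source files that have GraalVM-specific replacements.&lt;/li&gt;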
&lt;/ul&gt;
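&lt;p&gt;Good abstraction is not only about elegant code structure; it is about whether you chose a migration design that will still hold up as things change.&lt;/p&gt;
&lt;h2 id=&#34;基准测试结果&#34;&gt;Benchmark Results&lt;/h2&gt;
&lt;p&gt;We compared the standard JVM OAP and the GraalVM Distro on an Apple M3 Max (macOS, Docker Desktop, 10 CPUs / 62.7 GB), both connected to BanyanDB.&lt;/p&gt;
&lt;h3 id=&#34;启动测试docker-compose无流量3-次取中位数&#34;&gt;Startup Test (Docker Compose, no traffic, median of 3 runs)&lt;/h3&gt;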
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Metric&lt;/th&gt;
          &lt;th&gt;JVM OAP&lt;/th&gt;
          &lt;th&gt;GraalVM OAP&lt;/th&gt;
          &lt;th&gt;Difference&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Cold start time&lt;/td&gt;
          &lt;td&gt;635 ms&lt;/td&gt;
          &lt;td&gt;5 ms&lt;/td&gt;
          &lt;td&gt;about 127x faster&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Warm start time&lt;/td&gt;
          &lt;td&gt;630 ms&lt;/td&gt;
          &lt;td&gt;5 ms&lt;/td&gt;
          &lt;td&gt;about 126x faster&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Idle RSS&lt;/td&gt;
          &lt;td&gt;about 1.2 GiB&lt;/td&gt;
          &lt;td&gt;about 41 MiB&lt;/td&gt;
          &lt;td&gt;about 97% lower&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
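&lt;p&gt;Startup time is measured from the timestamp of OAP&amp;#39;s first application log line until the &lt;code&gt;listening on 11800&lt;/code&gt; log line appears (that is, the gRPC server is ready).&lt;/p&gt;
&lt;h3 id=&#34;持续负载下kind--istio-1252--bookinfo约-20-rps2-个-oap-副本&#34;&gt;Under Sustained Load (Kind + Istio 1.25.2 + Bookinfo, about 20 RPS, 2 OAP replicas)&lt;/h3&gt;
&lt;p&gt;After a 60-second warm-up, we sampled every 10 seconds, for 30 samples in total.&lt;/p&gt;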
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Metric&lt;/th&gt;
          &lt;th&gt;JVM OAP&lt;/th&gt;
          &lt;th&gt;GraalVM OAP&lt;/th&gt;
          &lt;th&gt;Difference&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;CPU median (millicores)&lt;/td&gt;
          &lt;td&gt;101&lt;/td&gt;
          &lt;td&gt;68&lt;/td&gt;
          &lt;td&gt;-33%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;CPU mean (millicores)&lt;/td&gt;
          &lt;td&gt;107&lt;/td&gt;
          &lt;td&gt;67&lt;/td&gt;
          &lt;td&gt;-37%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Memory median (MiB)&lt;/td&gt;
          &lt;td&gt;2068&lt;/td&gt;
          &lt;td&gt;629&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;-70%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Memory mean (MiB)&lt;/td&gt;
          &lt;td&gt;2082&lt;/td&gt;
          &lt;td&gt;624&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;-70%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Both variants reported the same entry-service CPM, which indicates that they handled the same traffic under this test load.&lt;/p&gt;
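&lt;p&gt;The difference columns in the two tables above can be re-derived from the raw values with simple arithmetic. The Python sketch below does that; the helper names are ours, and treating 1.2 GiB as 1.2 x 1024 MiB is an assumption about how the idle RSS figure was rounded.&lt;/p&gt;

```python
def speedup(baseline, candidate):
    """How many times faster the candidate is than the baseline."""
    return baseline / candidate


def reduction_percent(baseline, candidate):
    """Relative reduction from baseline to candidate, as a whole percent."""
    return round((baseline - candidate) / baseline * 100)


# Startup table (milliseconds).
cold_start_speedup = speedup(635, 5)
warm_start_speedup = speedup(630, 5)

# Idle RSS: about 1.2 GiB expressed in MiB, versus about 41 MiB.
idle_rss_reduction = reduction_percent(1.2 * 1024, 41)

# Sustained-load table: (JVM OAP, GraalVM OAP) pairs.
load_rows = {
    "cpu_median_millicores": (101, 68),
    "cpu_mean_millicores": (107, 67),
    "memory_median_mib": (2068, 629),
    "memory_mean_mib": (2082, 624),
}
load_reductions = {
    name: reduction_percent(jvm, graal) for name, (jvm, graal) in load_rows.items()
}

print(cold_start_speedup, warm_start_speedup, idle_rss_reduction)
print(load_reductions)
```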
&lt;p&gt;Every 30 seconds we collected the following metrics via swctl for all discovered services:
&lt;code&gt;service_cpm&lt;/code&gt;, &lt;code&gt;service_resp_time&lt;/code&gt;, &lt;code&gt;service_sla&lt;/code&gt;, &lt;code&gt;service_apdex&lt;/code&gt;, &lt;code&gt;service_percentile&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The full benchmark scripts and raw data are in the &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro/tree/main/benchmark&#34;&gt;benchmark/&lt;/a&gt; directory of the distro repository.&lt;/p&gt;
&lt;h2 id=&#34;当前状态&#34;&gt;Current Status&lt;/h2&gt;
&lt;p&gt;The project is already a runnable, experimental distribution, hosted in its own repository: &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro&#34;&gt;apache/skywalking-graalvm-distro&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The current distribution deliberately focuses on one modern, high-performance operating mode:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Storage:&lt;/strong&gt; BanyanDB&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cluster modes:&lt;/strong&gt; Standalone and Kubernetes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configuration:&lt;/strong&gt; none, or Kubernetes ConfigMap&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Runtime model:&lt;/strong&gt; a fixed module set, precompiled artifacts, and AOT-friendly wiring&lt;/li&gt;
&lt;/ul&gt;
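&lt;p&gt;This focus is deliberate. To turn the migration into a repeatable system, the first step is to tighten the boundary and produce a version that actually runs, and only then expand gradually without losing control.&lt;/p&gt;
&lt;h2 id=&#34;快速开始&#34;&gt;Quick Start&lt;/h2&gt;
&lt;p&gt;Because SkyWalking GraalVM Distro is designed for peak performance, it currently works best with the &lt;strong&gt;BanyanDB&lt;/strong&gt; storage backend. The released image is available on Docker Hub, and you can start the whole stack with the following &lt;code&gt;docker-compose.yml&lt;/code&gt;.&lt;/p&gt;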
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;version&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;3.8&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;services&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ghcr.io/apache/skywalking-banyandb:e1ba421bd624727760c7a69c84c6fe55878fb526&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17912:17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17913:17913&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data --measure-metadata-cache-wait-duration 1m --stream-metadata-cache-wait-duration 1m&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;nc -nz 127.0.0.1 17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;120&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-graalvm-distro:0.1.1&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;oap&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;11800:11800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;12800:12800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE_BANYANDB_TARGETS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb:17912&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_HEALTH_CHECKER&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;nc -nz 127.0.0.1 11800 || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;120&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ui&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ghcr.io/apache/skywalking/ui:10.3.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ui&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;8080:8080&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_OAP_ADDRESS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;http://oap:12800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;只需要执行：&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose up -d
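&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We welcome the community to test this new distribution, file issues, and help us move it toward production readiness.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Special thanks to the GraalVM team for the technical foundation.&lt;/em&gt;&lt;/p&gt;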

      </description>
    </item>
    
    <item>
      <title>Blog: How AI Changed the Economics of Architecture</title>
      <link>/blog/2026-03-13-how-ai-changed-the-economics-of-architecture/</link>
      <pubDate>Fri, 13 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/blog/2026-03-13-how-ai-changed-the-economics-of-architecture/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;SkyWalking GraalVM Distro: A case study in turning runnable PoCs into a repeatable migration pipeline.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./graph.jpg&#34; alt=&#34;graph.jpg&#34;&gt;&lt;/p&gt;
&lt;p&gt;The most important lesson from this project is not that AI can generate a large amount of code. It is that AI changes the economics of architecture. When runnable PoCs become cheap to build, compare, discard, and rebuild, architects can push further toward the design they actually want instead of stopping early at a compromise they can afford to implement.&lt;/p&gt;
&lt;p&gt;That shift matters a lot in mature open source systems. Apache SkyWalking OAP has long been a powerful and production-proven observability backend, but it also carries all the realities of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution, all of which are friendly to extensibility but hostile to GraalVM native image.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SkyWalking GraalVM Distro&lt;/strong&gt; is the result of treating that challenge as a design-system problem instead of a one-off porting exercise. The goal was not only to make OAP run as a native binary, but to turn GraalVM migration itself into a repeatable automation pipeline that can stay aligned with upstream evolution.&lt;/p&gt;
&lt;p&gt;For the full technical design, benchmark data, and getting-started guide, see the companion post: &lt;a href=&#34;../2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/index.md&#34;&gt;SkyWalking GraalVM Distro: Design and Benchmarks&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;from-paused-idea-to-runnable-system&#34;&gt;From Paused Idea to Runnable System&lt;/h2&gt;
&lt;p&gt;This journey actually began years ago. Shortly after this repository was created, &lt;a href=&#34;https://github.com/yswdqz&#34;&gt;yswdqz&lt;/a&gt; spent several months exploring the transition. The project proved much harder in practice than the individual GraalVM limitations sounded on paper, and the work eventually paused for years.&lt;/p&gt;
&lt;p&gt;That pause is important. The missing ingredient was not ideas. Mature maintainers usually have more ideas than time. The real constraint was implementation economics. Even when the architect can see several promising directions, limited developer resources force an earlier trade-off: choose the path that is cheapest to implement, not necessarily the path that is cleanest, most reusable, or most future-proof.&lt;/p&gt;
&lt;p&gt;This is a very common reality, not an exceptional one. In open source communities, much of the work depends on volunteers or limited company sponsorship. In commercial products, the pressure is different but the constraint is still real: roadmap commitments, staffing limits, and delivery deadlines keep engineering resources tight. In both worlds, good ideas are often abandoned not because they are wrong, but because they are too expensive to validate and implement thoroughly.&lt;/p&gt;
&lt;p&gt;There is another constraint that matters just as much: the architect is usually also a very senior engineer, not a full-time implementation machine. That means limited personal coding energy, fragmented time, and a constant need to explain ideas to other senior engineers before the code exists. Traditionally, that explanation happens through diagrams, documents, and conversations. It is slow, lossy, and unpredictable. We all know some version of the Telephone Game: even simple words are easy to misunderstand, and by the time the misunderstanding becomes visible, a lot of time has already passed.&lt;/p&gt;
&lt;p&gt;What changed in late 2025 was that AI engineering made multiple runnable ideas affordable. Instead of picking an early compromise because implementation capacity was scarce, we could switch repeatedly between designs, validate them with code, discard weak directions quickly, and keep iterating until the architecture became solid, practical, and efficient enough to hold.&lt;/p&gt;
&lt;p&gt;That design freedom was critical. GraalVM documentation gives clear guidance on isolated limitations, but a mature OSS platform hits them as a connected system. Fixing only one dynamic mechanism is not enough. To make native image practical, we had to turn whole categories of runtime behavior into build-time artifacts and automated metadata generation.&lt;/p&gt;
&lt;p&gt;There was also a very concrete mountain in front of us in the early history of this distro. In the first several commits of the repository, upstream SkyWalking still relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, that was just one more unsupported runtime-heavy component. In practice, Groovy was the biggest obstacle in the whole path. It represented not only script execution, but a whole dynamic model that was deeply convenient on the JVM side and deeply unfriendly to native image.&lt;/p&gt;
&lt;p&gt;To bridge the gap, we re-architected the core engines of OAP around an AOT-first model. Earlier experiments had to confront Groovy-era runtime behavior directly and explore alternative script-compilation approaches to get around it. The finalized direction went further: align with the upstream compiler pipeline, move dynamic generation to build time, and add automation so the migration stays controllable as upstream keeps moving. Concretely, that meant turning OAL, MAL, LAL, and Hierarchy generation into build-time precompiler outputs instead of leaving them as startup-time dynamic behavior.&lt;/p&gt;
&lt;h2 id=&#34;ai-speed-changed-the-design-loop&#34;&gt;AI Speed Changed the Design Loop&lt;/h2&gt;
&lt;p&gt;The scale of this transformation was not only about coding faster. AI changed the loop between idea, prototype, validation, and redesign. We could build runnable PoCs for different approaches, throw away weak ones quickly, and preserve the promising abstractions until they formed a coherent migration system.&lt;/p&gt;
&lt;p&gt;That does not reduce the role of human architecture. It raises the value of it. Human judgment was still required to decide what should become build-time, what should stay configurable, where to introduce same-FQCN replacements, how to keep upstream sync controllable, and which abstractions were worth preserving. But AI speed made it realistic to pursue those better designs instead of settling for a simpler compromise too early.&lt;/p&gt;
&lt;p&gt;This is the real change in the economics of architecture. In the past, an architect might already know the cleaner direction, but limited engineering capacity often forced that vision back toward a cheaper compromise. Now the architect can work much more like a fast developer again: writing code, shaping high-abstraction interfaces, and using design patterns to prove the vision directly in the real world.&lt;/p&gt;
&lt;p&gt;That changes communication as much as implementation. In open source, we often say, &lt;code&gt;talk is cheap, show me the code&lt;/code&gt;. With AI engineering, showing the code becomes much more straightforward. The design no longer depends so heavily on a slow top-down translation from idea to documents to interpretation to implementation. The code can appear earlier, and it can run earlier.&lt;/p&gt;
&lt;p&gt;Other senior engineers benefit from this too. They do not need to reconstruct the whole design only from diagrams, meetings, or long explanations. They can review the actual abstraction, see the behavior in code, run it, challenge it, and refine it from something concrete. That makes architectural collaboration faster, clearer, and less lossy.&lt;/p&gt;
&lt;p&gt;This is also where I think the current AI discussion is often noisy. Many projects are fun, surprising, and worth exploring, but advanced engineering work is not improved merely by attaching an agent to a codebase. The important question is not which demo looks most magical. The important question is which engineering capabilities are actually being accelerated without losing the discipline of software development itself.&lt;/p&gt;
&lt;p&gt;For architects and senior engineers, the capabilities that mattered most here were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Fast comparative prototyping:&lt;/strong&gt; Building several runnable approaches in code instead of defending one idea with slides and documents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Large-scale code comprehension:&lt;/strong&gt; Reading across many modules quickly enough to keep the whole system in view.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Systematic refactoring:&lt;/strong&gt; Converting reflection-heavy or runtime-dynamic paths into designs that fit AOT constraints.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automation construction:&lt;/strong&gt; When a migration step must be repeated every upstream sync, doing it manually once is already expensive. Doing it manually again next time is even more expensive. AI made it practical to invest in generators, inventories, consistency checks, and drift detectors that turn repeated manual work into repeatable automation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review at breadth:&lt;/strong&gt; Checking edge cases, compatibility boundaries, and repeatability across a large surface area.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Those capabilities were visible in the resulting design. Same-FQCN replacements created a controlled boundary for GraalVM-specific behavior. Reflection metadata was generated from build outputs instead of maintained as a hand-written guess list. Inventories and drift detectors turned upstream sync from a vague maintenance risk into an explicit engineering workflow.&lt;/p&gt;
&lt;p&gt;For junior engineers, I think the lesson is equally important. AI does not remove the need to learn architecture, invariants, interfaces, testing, or maintenance. It makes those skills more valuable, because they determine whether accelerated implementation produces a durable system or just more code faster. The leverage comes from engineering judgment, not from novelty.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; and &lt;strong&gt;Gemini AI&lt;/strong&gt; acted as engineering accelerators throughout this process. In the GraalVM Distro specifically, they helped us:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Explore migration strategies as running code:&lt;/strong&gt; Instead of debating which approach might work, we built and compared multiple real prototypes, discarded the weak ones, and kept what held up.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Refactor reflection-heavy and dynamic code paths:&lt;/strong&gt; Replace runtime-hostile patterns with AOT-friendly alternatives across the codebase.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Make upstream sync sustainable:&lt;/strong&gt; Every time the distro pulls from upstream SkyWalking, metadata scanning, config regeneration, and recompilation must happen again. AI helped build the pipeline so that each sync is a controlled, largely automated process rather than a fresh manual effort that grows longer each time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review logic and edge cases at scale:&lt;/strong&gt; Especially in places where feature parity mattered more than raw implementation speed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The result was not just a large rewrite. It was a repeatable system: precompilers, manifest-driven loading, reflection-config generation, replacement boundaries, and drift detectors that make upstream migration reviewable and automatable.&lt;/p&gt;
&lt;p&gt;For the broader methodology behind this style of development, see &lt;a href=&#34;https://builder.aws.com/content/3AgtzlikuD9bSUJrWDCjGW5Q5nW/agentic-vibe-coding-in-a-mature-oss-project-what-worked-what-didnt&#34;&gt;Agentic Vibe Coding in a Mature OSS Project&lt;/a&gt;. This post is the next step in that story: not only enhancing an active mature codebase, but reviving a paused effort and making it actually runnable.&lt;/p&gt;
&lt;h2 id=&#34;what-actually-changed&#34;&gt;What Actually Changed&lt;/h2&gt;
&lt;p&gt;The most important outcome of this project is not a benchmark table. The benchmark results belong to the distro itself, and they matter because they prove the system is real. But for this post, the deeper result is methodological: AI engineering changed how architecture could be explored, validated, and refined.&lt;/p&gt;
&lt;p&gt;Instead of treating architecture as a mostly document-driven activity followed by a long and expensive implementation phase, we were able to move much faster between idea, prototype, comparison, and redesign. That made it realistic to pursue higher-abstraction solutions, preserve cleaner boundaries, and build the automation needed to keep the migration maintainable over time.&lt;/p&gt;
&lt;p&gt;The technical evidence for that work is the SkyWalking GraalVM Distro itself: not only a runnable system, but a migration pipeline expressed as precompilers, generated reflection metadata, controlled replacement boundaries, and drift checks. The architectural result is that the migration became a repeatable system rather than a one-time port. For detailed benchmark methodology, per-pod data, and the full technical design, see &lt;a href=&#34;https://skywalking.apache.org/blog/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/&#34;&gt;SkyWalking GraalVM Distro: Design and Benchmarks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The project is hosted at &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro&#34;&gt;apache/skywalking-graalvm-distro&lt;/a&gt;. We invite the community to test it, report issues, and help move it toward production readiness.&lt;/p&gt;
&lt;p&gt;For me, the deeper takeaway is broader than this distro. AI engineering does not make architecture less important. It makes architecture more worth pursuing. When implementation speed rises enough, we can afford to test more ideas in code, keep the good abstractions, and build systems that would previously have been judged too expensive to finish well.&lt;/p&gt;
&lt;p&gt;For senior engineers, that means the bottleneck shifts away from raw typing speed and toward taste, system judgment, and the ability to define stable boundaries. For junior engineers, it means the path forward is not to chase every exciting AI workflow, but to become stronger at the fundamentals that let acceleration compound: understanding requirements, reading unfamiliar systems, questioning assumptions, and recognizing what must remain correct as everything around it changes. AI changed the economics of architecture because it lowered the cost of validating better designs without lowering the bar for engineering judgment.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: SkyWalking GraalVM Distro: Design and Benchmarks</title>
      <link>/blog/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/</link>
      <pubDate>Fri, 13 Mar 2026 00:00:00 +0000</pubDate>
      <guid>/blog/2026-03-13-skywalking-graalvm-distro-design-and-benchmarks/</guid>
      <description>
        
        
        &lt;p&gt;&lt;em&gt;A technical deep-dive into how we migrated Apache SkyWalking OAP to GraalVM Native Image — not as a one-off port, but as a repeatable pipeline that stays aligned with upstream.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;./graph.jpg&#34; alt=&#34;graph.jpg&#34;&gt;&lt;/p&gt;
&lt;p&gt;For the broader story of how AI engineering made this project economically viable, see &lt;a href=&#34;/blog/2026-03-13-how-ai-changed-the-economics-of-architecture/&#34;&gt;How AI Changed the Economics of Architecture&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;why-graalvm-is-not-optional&#34;&gt;Why GraalVM Is Not Optional&lt;/h2&gt;
&lt;p&gt;GraalVM Native Image compiles Java applications Ahead-of-Time (AOT) into standalone executables. For an observability backend like SkyWalking OAP, this is not a performance optimization — it is an operational necessity.&lt;/p&gt;
&lt;p&gt;An observability platform must be the most reliable component in the infrastructure. It has to survive the failures it is supposed to observe. In cloud-native environments where workloads scale, migrate, and restart constantly, the backend that watches everything cannot itself be the slow, heavy process that takes seconds to recover and gigabytes to idle.&lt;/p&gt;
&lt;p&gt;Our benchmarks make the case concrete:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Startup:&lt;/strong&gt; ~5 ms (GraalVM native) vs ~635 ms (JVM). In a Kubernetes cluster where an OAP pod gets evicted or rescheduled, a 635 ms gap means lost telemetry: traces, metrics, and logs that arrive during that window are simply dropped. At 5 ms, the new pod is receiving data before most clients even notice the disruption.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Idle memory:&lt;/strong&gt; ~41 MiB (GraalVM native) vs ~1.2 GiB (JVM). Observability backends run 24/7. In a multi-tenant or edge deployment, a 97% reduction in baseline RSS is the difference between fitting the observability stack on a small node and needing a dedicated one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory under load:&lt;/strong&gt; ~629 MiB (GraalVM native) vs ~2.0 GiB (JVM) at ~20 RPS. A 70% reduction at production-like traffic means fewer nodes, lower cloud bills, and more headroom before the backend itself becomes a scaling bottleneck.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No warm-up penalty:&lt;/strong&gt; Peak throughput is available from the first request. The JVM&amp;rsquo;s JIT compiler needs minutes of traffic before it optimizes hot paths — during that window, tail latency is worse and data processing lags behind. A native binary has no such phase.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smaller attack surface:&lt;/strong&gt; No JDK runtime means fewer CVEs to track and patch. For a component that ingests data from every service in the cluster, that matters.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are not incremental improvements. They change what deployment topologies are practical. Serverless observability backends, sidecar-model collectors, edge nodes with tight memory budgets — all become realistic when the backend is this light and this fast.&lt;/p&gt;
&lt;h2 id=&#34;the-challenge-a-mature-dynamic-java-platform&#34;&gt;The Challenge: A Mature, Dynamic Java Platform&lt;/h2&gt;
&lt;p&gt;SkyWalking OAP carries all the realities of a large Java platform: runtime bytecode generation, reflection-heavy initialization, classpath scanning, SPI-based module wiring, and dynamic DSL execution. These patterns are friendly to extensibility but hostile to GraalVM native image.&lt;/p&gt;
&lt;p&gt;The documented GraalVM limitations are only the beginning. In a mature OSS platform, those limitations are deeply entangled with years of runtime design decisions. Standard GraalVM native images struggle with runtime class generation, reflection, dynamic discovery, and script execution — all of which had deep roots in SkyWalking OAP.&lt;/p&gt;
&lt;p&gt;There was also a very concrete mountain in the early history of this distro. Upstream SkyWalking relied heavily on Groovy for LAL, MAL, and Hierarchy scripts. In theory, that was just one more unsupported runtime-heavy component. In practice, Groovy was the biggest obstacle in the whole path. It represented not only script execution, but a whole dynamic model that was deeply convenient on the JVM side and deeply unfriendly to native image.&lt;/p&gt;
&lt;h2 id=&#34;the-design-goal-make-migration-repeatable&#34;&gt;The Design Goal: Make Migration Repeatable&lt;/h2&gt;
&lt;p&gt;The final design is not just &amp;ldquo;run native-image successfully.&amp;rdquo; It is a system that keeps migration work repeatable:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Pre-compile runtime-generated assets at build time.&lt;/strong&gt; OAL, MAL, LAL, Hierarchy rules, and meter-related generated classes are compiled during the build and packaged as artifacts instead of being generated at startup.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Replace dynamic discovery with deterministic loading.&lt;/strong&gt; Classpath scanning and runtime registration paths are converted into manifest-driven loading.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reduce runtime reflection and generate native metadata from the build.&lt;/strong&gt; Reflection configuration is produced from actual manifests and scanned classes instead of being maintained as a hand-written guess list.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Keep the upstream sync boundary explicit.&lt;/strong&gt; Same-FQCN replacements are intentionally packaged, inventoried, and guarded with staleness checks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Make drift visible immediately.&lt;/strong&gt; If upstream providers, rule files, or replaced source files change, tests fail and force explicit review.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That is the architectural shift that matters most. Reusable abstraction and foresight did not become less important in the AI era. They became more important, because they determine whether AI speed produces a maintainable system or just a fast-growing pile of code.&lt;/p&gt;
&lt;h2 id=&#34;turning-runtime-dynamism-into-build-time-assets&#34;&gt;Turning Runtime Dynamism into Build-Time Assets&lt;/h2&gt;
&lt;p&gt;SkyWalking OAP has several dynamic subsystems that are natural in a JVM world but problematic for native image:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;OAL generates classes at runtime.&lt;/li&gt;
&lt;li&gt;LAL, MAL, and Hierarchy rules depend on runtime compilation and were historically tied to Groovy-heavy runtime behavior, which became one of the biggest practical blockers in the early distro work.&lt;/li&gt;
&lt;li&gt;Guava-based classpath scanning discovers annotations, dispatchers, decorators, and meter functions.&lt;/li&gt;
&lt;li&gt;SPI-based module/provider discovery expects a more dynamic runtime environment.&lt;/li&gt;
&lt;li&gt;YAML/config initialization and framework integrations depend on reflective access.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In SkyWalking GraalVM Distro, these are not solved one by one as isolated patches. They are pulled into a build-time pipeline.&lt;/p&gt;
&lt;p&gt;The precompiler runs the DSL engines during the build, exports generated classes, writes manifests, serializes config data, and generates native-image metadata. That means startup becomes class loading and registration, not runtime code generation. The runtime path is simpler because the build path became richer.&lt;/p&gt;
&lt;p&gt;This is also why the project is more than a performance exercise. The design goal was to move complexity into a place where it is easier to verify, easier to automate, and easier to repeat.&lt;/p&gt;
&lt;h2 id=&#34;same-fqcn-replacements-as-a-controlled-boundary&#34;&gt;Same-FQCN Replacements as a Controlled Boundary&lt;/h2&gt;
&lt;p&gt;One of the most practical design choices in this distro is the use of same-FQCN replacement classes. We do not rely on vague startup tricks or undocumented ordering assumptions. Instead, the GraalVM-specific jars are repackaged so the original upstream classes are excluded and the replacement classes occupy the exact same fully-qualified names.&lt;/p&gt;
&lt;p&gt;This matters for maintainability. It creates a very clear boundary:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the upstream class still defines the behavior contract,&lt;/li&gt;
&lt;li&gt;the GraalVM replacement provides a compatible implementation strategy,&lt;/li&gt;
&lt;li&gt;and the packaging makes that swap explicit.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example, OAL loading changes from runtime compilation into manifest-driven loading of precompiled classes. Similar replacements handle MAL and LAL DSL loading, module wiring, config initialization, and several reflection-sensitive paths. The goal is not to fork everything. The goal is to replace only the places where the runtime model is fundamentally unfriendly to native image.&lt;/p&gt;
&lt;p&gt;That boundary is then guarded by tests that hash the upstream source files corresponding to the replacements. When upstream changes one of those files, the build fails and tells us exactly which replacement needs review. This is what turns &amp;ldquo;keeping up with upstream&amp;rdquo; from an anxiety problem into a visible engineering task.&lt;/p&gt;
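&lt;p&gt;The staleness guard described above can be sketched in a few lines. This is a minimal illustration in Python, not the distro&amp;rsquo;s actual test code: the file names, the recorded-hash mapping, and the helper names are all hypothetical, and the only idea taken from the text is to hash each watched upstream file and flag any that differ from the hash recorded at the last review.&lt;/p&gt;

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_replacements(recorded: dict[str, str], upstream_root: Path) -> list[str]:
    """List watched upstream files whose current hash differs from the recorded one.

    `recorded` maps a path relative to `upstream_root` to the hash captured when
    the matching replacement class was last reviewed (hypothetical layout).
    A file that has disappeared also counts as stale.
    """
    stale = []
    for rel_path, expected in sorted(recorded.items()):
        source = upstream_root / rel_path
        if not source.is_file() or sha256_of(source) != expected:
            stale.append(rel_path)
    return stale


# Demo: a throwaway directory stands in for the upstream checkout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "LoaderA.java").write_text("class LoaderA {}")  # hypothetical file
    (root / "LoaderB.java").write_text("class LoaderB {}")  # hypothetical file
    recorded = {name: sha256_of(root / name) for name in ("LoaderA.java", "LoaderB.java")}

    unchanged = stale_replacements(recorded, root)
    # Simulate an upstream change to one watched file.
    (root / "LoaderB.java").write_text("class LoaderB { int x; }")
    changed = stale_replacements(recorded, root)

print(unchanged)
print(changed)
```

&lt;p&gt;In a real build, a non-empty result would fail the test and name the replacement that needs review, which is the behavior the paragraph above describes.&lt;/p&gt;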
&lt;h2 id=&#34;reflection-config-is-generated-not-guessed&#34;&gt;Reflection Config Is Generated, Not Guessed&lt;/h2&gt;
&lt;p&gt;In many GraalVM migrations, &lt;code&gt;reflect-config.json&lt;/code&gt; becomes a manually accumulated artifact. It grows over time, gets stale, and nobody is fully sure whether it is complete or why each entry exists. That approach does not scale well for a large, evolving OSS platform.&lt;/p&gt;
&lt;p&gt;In this distro, reflection metadata is generated from the build outputs and scanned classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;manifests for OAL, MAL, LAL, Hierarchy, and meter-generated classes,&lt;/li&gt;
&lt;li&gt;annotation-scanned classes,&lt;/li&gt;
&lt;li&gt;Armeria HTTP handlers,&lt;/li&gt;
&lt;li&gt;GraphQL resolvers and schema-mapped types,&lt;/li&gt;
&lt;li&gt;and accepted &lt;code&gt;ModuleConfig&lt;/code&gt; classes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is a much healthier model. Instead of asking people to remember every reflective access path, the system derives reflection metadata from the actual migration pipeline. The build becomes the source of truth.&lt;/p&gt;
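&lt;p&gt;To make the idea concrete, here is a small Python sketch of deriving reflection metadata from manifests. It is an illustration under stated assumptions, not the distro&amp;rsquo;s generator: the manifest names and class names are invented, and the entry fields (&lt;code&gt;name&lt;/code&gt;, &lt;code&gt;allDeclaredConstructors&lt;/code&gt;, &lt;code&gt;allDeclaredMethods&lt;/code&gt;, &lt;code&gt;allDeclaredFields&lt;/code&gt;) follow the GraalVM &lt;code&gt;reflect-config.json&lt;/code&gt; format. Which flags each class really needs is a per-project decision.&lt;/p&gt;

```python
import json


def build_reflect_config(manifests: dict[str, list[str]]) -> list[dict]:
    """Merge class names from several manifests into reflect-config entries.

    Names are de-duplicated and sorted so the generated file is stable
    from one build to the next.
    """
    class_names = sorted({name for names in manifests.values() for name in names})
    return [
        {
            "name": name,
            "allDeclaredConstructors": True,
            "allDeclaredMethods": True,
            "allDeclaredFields": True,
        }
        for name in class_names
    ]


# Hypothetical manifest contents; in a real pipeline the precompiler writes these.
manifests = {
    "oal": ["com.example.generated.ServiceRespTimeMetrics"],
    "mal": ["com.example.generated.MalExpression1"],
    "module-config": [
        "com.example.config.CoreModuleConfig",
        "com.example.generated.MalExpression1",  # listed twice on purpose
    ],
}

config = build_reflect_config(manifests)
print(json.dumps(config, indent=2))
```

&lt;p&gt;Because the output is a pure function of the manifests, a stale or missing entry points back to a build input rather than to someone&amp;rsquo;s memory.&lt;/p&gt;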
&lt;h2 id=&#34;keeping-upstream-sync-practical&#34;&gt;Keeping Upstream Sync Practical&lt;/h2&gt;
&lt;p&gt;If this distro were only a one-time engineering sprint, it would be much less interesting. The real challenge is keeping it alive while upstream SkyWalking continues to evolve.&lt;/p&gt;
&lt;p&gt;That is why the repo includes explicit inventories and drift detectors:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;provider inventories that force new upstream providers to be categorized,&lt;/li&gt;
&lt;li&gt;rule-file inventories that force new DSL inputs to be acknowledged,&lt;/li&gt;
&lt;li&gt;SHA watchers for precompiled YAML inputs,&lt;/li&gt;
&lt;li&gt;and SHA watchers for upstream source files with GraalVM-specific replacements.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Good abstraction is not only about elegant code structure. It is about choosing a migration design that can survive contact with future change.&lt;/p&gt;
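&lt;p&gt;An inventory check of the kind listed above can be as simple as a set comparison. The following Python sketch assumes a hypothetical inventory that maps each known provider name to a reviewed category. The provider names and categories are made up for illustration, and the real inventories in the repository may look quite different.&lt;/p&gt;

```python
def inventory_drift(discovered: set[str], inventory: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare providers found upstream with the reviewed inventory.

    Returns (new, gone): providers that nobody has categorized yet, and
    inventory entries that no longer exist upstream.
    """
    known = set(inventory)
    new = sorted(discovered - known)
    gone = sorted(known - discovered)
    return new, gone


# Hypothetical reviewed inventory: provider name mapped to a category.
inventory = {
    "storage-banyandb": "included",
    "cluster-kubernetes": "included",
    "storage-legacy": "excluded",
}

# Hypothetical result of scanning a fresh upstream checkout.
discovered = {"storage-banyandb", "cluster-kubernetes", "storage-legacy", "storage-x"}

new, gone = inventory_drift(discovered, inventory)
print("uncategorized:", new)
print("removed upstream:", gone)
```

&lt;p&gt;A test would fail whenever either list is non-empty, so a new upstream provider cannot slip into the distro without an explicit decision.&lt;/p&gt;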
&lt;h2 id=&#34;benchmark-results&#34;&gt;Benchmark Results&lt;/h2&gt;
&lt;p&gt;We benchmarked the standard JVM OAP against the GraalVM Distro on an Apple M3 Max (macOS, Docker Desktop, 10 CPUs / 62.7 GB), both connecting to BanyanDB.&lt;/p&gt;
&lt;h3 id=&#34;boot-test-docker-compose-no-traffic-median-of-3-runs&#34;&gt;Boot Test (Docker Compose, no traffic, median of 3 runs)&lt;/h3&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Metric&lt;/th&gt;
          &lt;th&gt;JVM OAP&lt;/th&gt;
          &lt;th&gt;GraalVM OAP&lt;/th&gt;
          &lt;th&gt;Delta&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Cold boot startup&lt;/td&gt;
          &lt;td&gt;635 ms&lt;/td&gt;
          &lt;td&gt;5 ms&lt;/td&gt;
          &lt;td&gt;~127x faster&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Warm boot startup&lt;/td&gt;
          &lt;td&gt;630 ms&lt;/td&gt;
          &lt;td&gt;5 ms&lt;/td&gt;
          &lt;td&gt;~126x faster&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Idle RSS&lt;/td&gt;
          &lt;td&gt;~1.2 GiB&lt;/td&gt;
          &lt;td&gt;~41 MiB&lt;/td&gt;
          &lt;td&gt;~97% reduction&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Boot time is measured from OAP&amp;rsquo;s first application log timestamp to the &lt;code&gt;listening on 11800&lt;/code&gt; log line (gRPC server ready).&lt;/p&gt;
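&lt;p&gt;The measurement rule above is easy to reproduce from a captured log. The Python sketch below assumes each log line starts with a timestamp of the form &lt;code&gt;2026-03-13 10:00:00,123&lt;/code&gt;. That prefix format and the sample log lines are assumptions for illustration, and only the ready marker comes from the text.&lt;/p&gt;

```python
from datetime import datetime, timedelta

READY_MARKER = "listening on 11800"        # ready marker quoted in the text
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"         # assumed timestamp prefix format
TS_LENGTH = len("2026-03-13 10:00:00,123")


def boot_time_ms(log_lines: list[str]) -> float:
    """Milliseconds from the first log line to the first line with the ready marker."""
    first = datetime.strptime(log_lines[0][:TS_LENGTH], TS_FORMAT)
    for line in log_lines:
        if READY_MARKER in line:
            ready = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
            return (ready - first) / timedelta(milliseconds=1)
    raise ValueError("ready marker not found in log")


# Invented sample log; real OAP output will differ.
sample = [
    "2026-03-13 10:00:00,100 - Starter - INFO - OAP starting",
    "2026-03-13 10:00:00,103 - Loader - INFO - modules registered",
    "2026-03-13 10:00:00,105 - Server - INFO - gRPC server listening on 11800",
]

print(boot_time_ms(sample))
```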
&lt;h3 id=&#34;under-sustained-load-kind--istio-1252--bookinfo-at-20-rps-2-oap-replicas&#34;&gt;Under Sustained Load (Kind + Istio 1.25.2 + Bookinfo at ~20 RPS, 2 OAP replicas)&lt;/h3&gt;
&lt;p&gt;30 samples at 10s intervals after 60s warmup.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Metric&lt;/th&gt;
          &lt;th&gt;JVM OAP&lt;/th&gt;
          &lt;th&gt;GraalVM OAP&lt;/th&gt;
          &lt;th&gt;Delta&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;CPU median (millicores)&lt;/td&gt;
          &lt;td&gt;101&lt;/td&gt;
          &lt;td&gt;68&lt;/td&gt;
          &lt;td&gt;-33%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;CPU avg (millicores)&lt;/td&gt;
          &lt;td&gt;107&lt;/td&gt;
          &lt;td&gt;67&lt;/td&gt;
          &lt;td&gt;-37%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Memory median (MiB)&lt;/td&gt;
          &lt;td&gt;2068&lt;/td&gt;
          &lt;td&gt;629&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;-70%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Memory avg (MiB)&lt;/td&gt;
          &lt;td&gt;2082&lt;/td&gt;
          &lt;td&gt;624&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;-70%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
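&lt;p&gt;The Delta columns in both tables follow directly from the raw numbers. As a quick sanity check, this Python snippet recomputes them from the values reported above. The helper names are ours, and 1.2 GiB is converted to MiB with a factor of 1024.&lt;/p&gt;

```python
def speedup(baseline: float, candidate: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline / candidate


def reduction_pct(baseline: float, candidate: float) -> float:
    """Percentage reduction of the candidate relative to the baseline."""
    return (baseline - candidate) / baseline * 100


# Boot test: JVM OAP vs GraalVM OAP.
cold_boot = speedup(635, 5)
warm_boot = speedup(630, 5)
idle_rss = reduction_pct(1.2 * 1024, 41)   # ~1.2 GiB expressed in MiB vs ~41 MiB

# Sustained load: JVM OAP vs GraalVM OAP.
cpu_median = reduction_pct(101, 68)
cpu_avg = reduction_pct(107, 67)
mem_median = reduction_pct(2068, 629)
mem_avg = reduction_pct(2082, 624)

print(round(cold_boot), round(warm_boot), round(idle_rss))
print(round(cpu_median), round(cpu_avg), round(mem_median), round(mem_avg))
```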
&lt;p&gt;Both variants reported identical entry-service CPM, which indicates that they processed the same traffic at this load.&lt;/p&gt;
&lt;p&gt;Service metrics were collected every 30s via swctl for all discovered services:
&lt;code&gt;service_cpm&lt;/code&gt;, &lt;code&gt;service_resp_time&lt;/code&gt;, &lt;code&gt;service_sla&lt;/code&gt;, &lt;code&gt;service_apdex&lt;/code&gt;, and &lt;code&gt;service_percentile&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Full benchmark scripts and raw data are in the &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro/tree/main/benchmark&#34;&gt;benchmark/&lt;/a&gt; directory of the distro repository.&lt;/p&gt;
&lt;h2 id=&#34;current-status&#34;&gt;Current Status&lt;/h2&gt;
&lt;p&gt;The project is a runnable experimental distribution, hosted in its own repository: &lt;a href=&#34;https://github.com/apache/skywalking-graalvm-distro&#34;&gt;apache/skywalking-graalvm-distro&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The current distro intentionally focuses on a modern, high-performance operating model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Storage:&lt;/strong&gt; BanyanDB&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cluster modes:&lt;/strong&gt; Standalone and Kubernetes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configuration:&lt;/strong&gt; none or Kubernetes ConfigMap&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Runtime model:&lt;/strong&gt; fixed module set, precompiled assets, and AOT-friendly wiring&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This focus is deliberate. A repeatable migration system starts by making a clear scope runnable, then expanding without losing control.&lt;/p&gt;
&lt;h2 id=&#34;getting-started&#34;&gt;Getting Started&lt;/h2&gt;
&lt;p&gt;Because the SkyWalking GraalVM Distro is designed for peak performance, it is optimized to work with &lt;strong&gt;BanyanDB&lt;/strong&gt; as its storage backend. The current published image is available on Docker Hub, and you can boot the stack using the following &lt;code&gt;docker-compose.yml&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;version&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;3.8&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;services&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ghcr.io/apache/skywalking-banyandb:e1ba421bd624727760c7a69c84c6fe55878fb526&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17912:17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;17913:17913&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;command&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;standalone --stream-root-path /tmp/stream-data --measure-root-path /tmp/measure-data --measure-metadata-cache-wait-duration 1m --stream-metadata-cache-wait-duration 1m&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;sh&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;-c&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;nc -nz 127.0.0.1 17912&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;120&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;apache/skywalking-graalvm-distro:0.1.1&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;oap&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;banyandb&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;11800:11800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;12800:12800&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_STORAGE_BANYANDB_TARGETS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;banyandb:17912&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_HEALTH_CHECKER&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;default&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;healthcheck&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;test&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;CMD-SHELL&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;nc -nz 127.0.0.1 11800 || exit 1&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;5s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;timeout&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;10s&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;retries&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;120&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;  &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ui&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;image&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ghcr.io/apache/skywalking/ui:10.3.0&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;container_name&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;ui&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;depends_on&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;oap&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;service_healthy&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;restart&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;always&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;ports&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;- &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;8080:8080&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;environment&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#fff&#34;&gt;      &lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;SW_OAP_ADDRESS&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#fff&#34;&gt; &lt;/span&gt;http://oap:12800&lt;span style=&#34;color:#fff&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Simply run:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;docker compose up -d
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We invite the community to test this new distribution, report issues, and help us move it toward a production-ready state.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Special thanks to the GraalVM team for the technology foundation.&lt;/em&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: SkyWalking 10 Release: Service Hierarchy, Kubernetes Network Monitoring by eBPF, BanyanDB, and More</title>
      <link>/blog/2024-05-13-skywalking-10-release/</link>
      <pubDate>Mon, 13 May 2024 00:00:00 +0000</pubDate>
      <guid>/blog/2024-05-13-skywalking-10-release/</guid>
      <description>
        
        
        &lt;p&gt;The Apache SkyWalking team today announced the SkyWalking 10 release. SkyWalking 10 provides a host of groundbreaking features and enhancements.
The introduction of Layer and Service Hierarchy streamlines monitoring by organizing services and metrics into distinct layers and providing seamless navigation across them.
Leveraging eBPF technology, Kubernetes Network Monitoring delivers granular insights into network traffic, topology, and TCP/HTTP metrics.
BanyanDB emerges as a high-performance native storage solution, while expanded monitoring support encompasses Apache RocketMQ, ClickHouse,
and Apache ActiveMQ Classic. Support for Multiple Labels Names enhances flexibility in metrics analysis,
while enhanced exporting and querying capabilities streamline data dissemination and processing.&lt;/p&gt;
&lt;p&gt;This release blog briefly introduces these new features and enhancements as well as some other notable changes.&lt;/p&gt;
&lt;h2 id=&#34;layer-and-service-hierarchy&#34;&gt;Layer and Service Hierarchy&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;Layer&lt;/code&gt; concept was introduced in SkyWalking 9.0.0. A layer represents an abstract framework in computer science,
such as Operating System (OS_LINUX layer) or Kubernetes (k8s layer). Layers organize services and metrics based on their roles
and responsibilities in the system. SkyWalking provides a suite of monitoring and diagnostic tools for each layer, but there was a gap between the layers:
data could not easily be bridged across them.&lt;/p&gt;
&lt;p&gt;SkyWalking 10 adds the ability to jump and connect across different layers, providing a seamless monitoring experience for users.&lt;/p&gt;
&lt;h3 id=&#34;layer-jump&#34;&gt;Layer Jump&lt;/h3&gt;
&lt;p&gt;In the topology graph, users can click on a service node to jump to the dashboard of the service in another layer.
The following figures show the jump from the &lt;code&gt;GENERAL&lt;/code&gt; layer service topology to the &lt;code&gt;VIRTUAL_DATABASE&lt;/code&gt; service layer dashboard by clicking the topology node.
&lt;img src=&#34;layer_jump.jpg&#34; alt=&#34;Figure 1: Layer Jump&#34;&gt;
Figure 1: Layer Jump&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;layer_jump2.jpg&#34; alt=&#34;Figure 2: Layer jump Dashboard&#34;&gt;
Figure 2: Layer jump Dashboard&lt;/p&gt;
&lt;h3 id=&#34;service-hierarchy&#34;&gt;Service Hierarchy&lt;/h3&gt;
&lt;p&gt;SkyWalking 10 introduces a new concept called &lt;code&gt;Service Hierarchy&lt;/code&gt;, which defines the relationships between logically identical services that exist in different layers.
OAP detects the services from different layers and tries to build the connections between them.
Users can click &lt;code&gt;Hierarchy Services&lt;/code&gt; on any layer&amp;rsquo;s service topology node or service dashboard to open the &lt;code&gt;Hierarchy Topology&lt;/code&gt;.
In this topology graph, users can see the relationships between the services in different layers along with a summary of their metrics, and can jump to the service dashboard in each layer.
When a service has a performance issue, users can easily analyze the metrics from different layers and track down the root cause.&lt;/p&gt;
&lt;p&gt;Examples of &lt;code&gt;Service Hierarchy&lt;/code&gt; relationships:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The application &lt;code&gt;song&lt;/code&gt; is deployed in the Kubernetes cluster with both the SkyWalking agent and Service Mesh.
The application &lt;code&gt;song&lt;/code&gt; therefore spans the &lt;code&gt;GENERAL&lt;/code&gt;, &lt;code&gt;MESH&lt;/code&gt;, &lt;code&gt;MESH_DP&lt;/code&gt; and &lt;code&gt;K8S_SERVICE&lt;/code&gt; layers, all of which can be monitored by SkyWalking.
The &lt;code&gt;Service Hierarchy&lt;/code&gt; topology is shown below:
&lt;img src=&#34;song.jpg&#34; alt=&#34;Figure 3: Service Hierarchy Agent With K8s Service And Mesh With K8s Service&#34;&gt;
Figure 3: Service Hierarchy Agent With K8s Service And Mesh With K8s Service.&lt;/br&gt;
&lt;/br&gt;
The &lt;code&gt;Service Instance Hierarchy&lt;/code&gt; topology is also available and shows the status of a single instance across the layers, as below:
&lt;img src=&#34;song_instance.jpg&#34; alt=&#34;Figure 4: Instance Hierarchy Agent With K8s Service(Pod)&#34;&gt;
Figure 4: Instance Hierarchy Agent With K8s Service(Pod)&lt;/br&gt;
&lt;/br&gt;&lt;/li&gt;
&lt;li&gt;The PostgreSQL database &lt;code&gt;psql&lt;/code&gt; is deployed in the Kubernetes cluster and used by the application &lt;code&gt;song&lt;/code&gt;.
The database &lt;code&gt;psql&lt;/code&gt; therefore spans the &lt;code&gt;VIRTUAL_DATABASE&lt;/code&gt;, &lt;code&gt;POSTGRESQL&lt;/code&gt; and &lt;code&gt;K8S_SERVICE&lt;/code&gt; layers, all of which can be monitored by SkyWalking.
The &lt;code&gt;Service Hierarchy&lt;/code&gt; topology is shown below:
&lt;img src=&#34;postgre.jpg&#34; alt=&#34;Figure 5: Service Hierarchy Agent(Virtual Database) With Real Database And K8s Service&#34;&gt;
Figure 5: Service Hierarchy Agent(Virtual Database) With Real Database And K8s Service&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For more supported layers and how the relationships between services in different layers are detected, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/concepts-and-designs/service-hierarchy/#service-hierarchy&#34;&gt;Service Hierarchy&lt;/a&gt; documentation.
For how to configure the &lt;code&gt;Service Hierarchy&lt;/code&gt; in SkyWalking, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/concepts-and-designs/service-hierarchy-configuration/&#34;&gt;Service Hierarchy Configuration&lt;/a&gt; section.&lt;/p&gt;
&lt;h2 id=&#34;monitoring-kubernetes-network-traffic-by-using-ebpf&#34;&gt;Monitoring Kubernetes Network Traffic by using eBPF&lt;/h2&gt;
&lt;p&gt;In previous versions, SkyWalking provided &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/setup/backend/backend-k8s-monitoring-metrics-cadvisor/&#34;&gt;Kubernetes (K8s) monitoring from kube-state-metrics and cAdvisor&lt;/a&gt;,
which can monitor the Kubernetes cluster status and the metrics of the Kubernetes resources.&lt;/p&gt;
&lt;p&gt;In SkyWalking 10, by leveraging &lt;a href=&#34;https://skywalking.apache.org/docs/skywalking-rover/latest/readme/&#34;&gt;Apache SkyWalking Rover&lt;/a&gt; 0.6+,
SkyWalking can monitor Kubernetes network traffic using eBPF, collecting and mapping access logs from applications in Kubernetes environments.
From this data, SkyWalking can analyze and provide service traffic, topology, and TCP/HTTP level metrics from the Kubernetes perspective.&lt;/p&gt;
&lt;p&gt;The following figures show the Topology and TCP Dashboard of the Kubernetes network traffic:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;k8s_topology.jpg&#34; alt=&#34;Figure 6: Kubernetes Network Traffic Topology&#34;&gt;
Figure 6: Kubernetes Network Traffic Topology&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;k8s_dashboard.jpg&#34; alt=&#34;Figure 7: Kubernetes Network Traffic TCP Dashboard&#34;&gt;
Figure 7: Kubernetes Network Traffic TCP Dashboard&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;For more details about how to monitor Kubernetes network traffic using eBPF, please refer to &lt;a href=&#34;https://skywalking.apache.org/blog/2024-03-18-monitor-kubernetes-network-by-ebpf/&#34;&gt;Monitoring Kubernetes Network Traffic by using eBPF&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;banyandb---native-apm-database&#34;&gt;BanyanDB - Native APM Database&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/skywalking-banyandb/latest/readme/&#34;&gt;BanyanDB&lt;/a&gt; 0.6.0 and &lt;a href=&#34;https://github.com/apache/skywalking-banyandb-java-client&#34;&gt;BanyanDB Java client&lt;/a&gt; 0.6.0 are released with SkyWalking 10.
As a native storage solution for SkyWalking, BanyanDB is going to be SkyWalking&amp;rsquo;s next-generation storage solution. From 0.6 until 1.0, it is recommended for medium-scale deployments.&lt;br&gt;
It has shown high potential for performance improvement: less than 50% CPU usage and 50% memory usage, with 40% disk volume, compared to Elasticsearch at the same scale.&lt;/p&gt;
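&lt;p&gt;As a rough sketch of how an OAP container might be pointed at BanyanDB in a Docker Compose file, the fragment below sets the storage selector and the BanyanDB target. The service name &lt;code&gt;oap&lt;/code&gt; and the host name &lt;code&gt;banyandb&lt;/code&gt; are assumptions for illustration, and the environment variable names should be checked against the configuration of the OAP version you deploy:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;services:
  oap:                                  # hypothetical service name
    environment:
      # Select BanyanDB as the storage provider.
      SW_STORAGE: banyandb
      # host:port of the BanyanDB gRPC endpoint; the host name is an assumption.
      SW_STORAGE_BANYANDB_TARGETS: banyandb:17912
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;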
&lt;h2 id=&#34;apache-rocketmq-server-monitoring&#34;&gt;Apache RocketMQ Server Monitoring&lt;/h2&gt;
&lt;p&gt;Apache RocketMQ is an open-source distributed messaging and streaming platform, which is widely used in various scenarios including Internet, big data, mobile Internet, IoT, and other fields.
SkyWalking provides a basic monitoring dashboard for RocketMQ, which includes the following metrics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cluster Metrics: including messages produced/consumed today, total producer/consumer TPS, producer/consumer message size, messages produced/consumed until yesterday, max consumer latency, max commitLog disk ratio, commitLog disk ratio, pull/send threadPool queue head wait time, topic count, and broker count.&lt;/li&gt;
&lt;li&gt;Broker Metrics: including produce/consume TPS, producer/consumer message size.&lt;/li&gt;
&lt;li&gt;Topic Metrics: including max producer/consumer message size, consumer latency, producer/consumer TPS, producer/consumer offset, producer/consumer message size, consumer group count, and broker count.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following figure shows the RocketMQ Cluster Metrics dashboard:
&lt;img src=&#34;rocket_mq.jpg&#34; alt=&#34;Figure 8: Apache RocketMQ Server Monitoring&#34;&gt;
Figure 8: Apache RocketMQ Server Monitoring&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;For more metrics and details about the RocketMQ monitoring, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/setup/backend/backend-rocketmq-monitoring/&#34;&gt;Apache RocketMQ Server Monitoring&lt;/a&gt; documentation.&lt;/p&gt;
&lt;h2 id=&#34;clickhouse-server-monitoring&#34;&gt;ClickHouse Server Monitoring&lt;/h2&gt;
&lt;p&gt;ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time. It is widely used for online analytical processing (OLAP).
ClickHouse monitoring covers the metrics, events, and asynchronous metrics of the ClickHouse server, grouped into the following parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Server Metrics&lt;/li&gt;
&lt;li&gt;Query Metrics&lt;/li&gt;
&lt;li&gt;Network Metrics&lt;/li&gt;
&lt;li&gt;Insert Metrics&lt;/li&gt;
&lt;li&gt;Replica Metrics&lt;/li&gt;
&lt;li&gt;MergeTree Metrics&lt;/li&gt;
&lt;li&gt;ZooKeeper Metrics&lt;/li&gt;
&lt;li&gt;Embedded ClickHouse Keeper Metrics&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following figure shows the ClickHouse Cluster Metrics dashboard:
&lt;img src=&#34;clickhouse.jpg&#34; alt=&#34;Figure 9: ClickHouse Server Monitoring&#34;&gt;
Figure 9: ClickHouse Server Monitoring&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;For more metrics and details about the ClickHouse monitoring, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/setup/backend/backend-clickhouse-monitoring/&#34;&gt;ClickHouse Server Monitoring&lt;/a&gt; documentation.
For a quick start, see the blog &lt;a href=&#34;https://skywalking.apache.org/blog/2024-03-12-monitoring-clickhouse-through-skywalking/&#34;&gt;Monitoring ClickHouse through SkyWalking&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;apache-activemq-server-monitoring&#34;&gt;Apache ActiveMQ Server Monitoring&lt;/h2&gt;
&lt;p&gt;Apache ActiveMQ Classic is a popular and powerful open-source messaging and integration pattern server.
SkyWalking provides a basic monitoring dashboard for ActiveMQ, which includes the following metrics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cluster Metrics: including memory usage, rates of write/read, and average/max duration of write.&lt;/li&gt;
&lt;li&gt;Broker Metrics: including node state, number of connections, number of producers/consumers, and rate of write/read under the broker. Depending on the cluster mode, one cluster may include one or more brokers.&lt;/li&gt;
&lt;li&gt;Destination Metrics: including number of producers/consumers, messages in different states, queues, and enqueue duration in a queue/topic.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following figure shows the ActiveMQ Cluster Metrics dashboard:
&lt;img src=&#34;active_mq.jpg&#34; alt=&#34;Figure 10: Apache ActiveMQ Server Monitoring&#34;&gt;
Figure 10: Apache ActiveMQ Server Monitoring&lt;/br&gt;&lt;/p&gt;
&lt;p&gt;For more metrics and details about the ActiveMQ monitoring, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/setup/backend/backend-activemq-monitoring/&#34;&gt;Apache ActiveMQ Server Monitoring&lt;/a&gt; documentation.
For a quick start, see the blog &lt;a href=&#34;https://skywalking.apache.org/blog/2024-04-19-monitoring-activemq-through-skywalking/&#34;&gt;Monitoring ActiveMQ through SkyWalking&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;support-multiple-labels-names&#34;&gt;Support Multiple Labels Names&lt;/h2&gt;
&lt;p&gt;Before SkyWalking 10, SkyWalking did not store label names in the metrics data, so MQE had to use &lt;code&gt;_&lt;/code&gt; as the generic label name
and could not query metrics data with multiple label names.&lt;/p&gt;
&lt;p&gt;SkyWalking 10 supports storing the label names in the metrics data, so MQE can query or calculate metrics data with multiple label names.
For example:
The &lt;code&gt;k8s_cluster_deployment_status&lt;/code&gt; metric has labels &lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;deployment&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt;.
If we want to query all deployment metric values with &lt;code&gt;namespace=skywalking-showcase&lt;/code&gt; and &lt;code&gt;status=true&lt;/code&gt;, we can use the following expression:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;k8s_cluster_deployment_status{namespace=&amp;#39;skywalking-showcase&amp;#39;, status=&amp;#39;true&amp;#39;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Related enhancements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Since alarm rule configuration was migrated to MQE in SkyWalking 9.6.0, alarm rules also support multiple label names.&lt;/li&gt;
&lt;li&gt;The PromQL service supports queries with multiple label names.&lt;/li&gt;
&lt;/ul&gt;
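&lt;p&gt;As a sketch of what this could look like in an alarm rule, the fragment below filters the same metric by two label names inside the rule&amp;rsquo;s MQE expression. The rule name, the threshold, the check counts, and the message are assumptions for illustration; it also assumes the metric value is positive when a deployment reports the given status:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;rules:
  # Hypothetical rule name; alarm rule names end with _rule.
  deployment_status_rule:
    # MQE expression filtered by two label names.
    # The comparison values (0 and 3) are assumptions, not recommended settings.
    expression: sum(k8s_cluster_deployment_status{namespace=&amp;#39;skywalking-showcase&amp;#39;, status=&amp;#39;false&amp;#39;} &amp;gt; 0) &amp;gt;= 3
    # Number of minutes of metrics to evaluate.
    period: 10
    message: Deployments in namespace skywalking-showcase reported an unhealthy status.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;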
&lt;h2 id=&#34;metrics-grpc-exporter&#34;&gt;Metrics gRPC exporter&lt;/h2&gt;
&lt;p&gt;SkyWalking 10 enhances the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/setup/backend/exporter/#grpc-exporter&#34;&gt;metrics gRPC exporter&lt;/a&gt;,
which now supports exporting all types of metrics data to the gRPC server.&lt;/p&gt;
&lt;h2 id=&#34;skywalking-native-ui-metrics-query-switch-to-v3-apis&#34;&gt;SkyWalking Native UI Metrics Query Switch to V3 APIs&lt;/h2&gt;
&lt;p&gt;The SkyWalking native UI metrics queries deprecate the V2 APIs and have all migrated to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/latest/en/api/query-protocol/#v3-apis&#34;&gt;V3 APIs&lt;/a&gt;
and &lt;a href=&#34;https://skywalking.apache.org/docs/main/next/en/api/metrics-query-expression/#metrics-query-expressionmqe-syntax&#34;&gt;MQE&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;other-notable-enhancements&#34;&gt;Other Notable Enhancements&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Support Java 21 runtime and oap-java21 image for Java 21 runtime.&lt;/li&gt;
&lt;li&gt;Remove CLI(&lt;code&gt;swctl&lt;/code&gt;) from the image.&lt;/li&gt;
&lt;li&gt;More MQE functions and operators supported.&lt;/li&gt;
&lt;li&gt;Enhance the native UI and improve the user experience.&lt;/li&gt;
&lt;li&gt;Several bugs and CVEs fixed.&lt;/li&gt;
&lt;/ol&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: How to Use SkyWalking for Distributed Tracing in Istio?</title>
      <link>/blog/how-to-use-skywalking-for-distributed-tracing-in-istio/</link>
      <pubDate>Wed, 14 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/blog/how-to-use-skywalking-for-distributed-tracing-in-istio/</guid>
      <description>
        
        
        &lt;p&gt;In cloud native applications, a request often needs to be processed through a series of APIs or backend services, some of which are parallel and some serial and located on different platforms or nodes. How do we determine the service paths and nodes a call goes through to help us troubleshoot the problem? This is where distributed tracing comes into play.&lt;/p&gt;
&lt;p&gt;This article covers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How distributed tracing works&lt;/li&gt;
&lt;li&gt;How to choose distributed tracing software&lt;/li&gt;
&lt;li&gt;How to use distributed tracing in Istio&lt;/li&gt;
&lt;li&gt;How to view distributed tracing data using Bookinfo and SkyWalking as examples&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;distributed-tracing-basics&#34;&gt;Distributed Tracing Basics&lt;/h2&gt;
&lt;p&gt;Distributed tracing is a method for tracing requests in a distributed system to help users better understand, control, and optimize distributed systems. There are two concepts used in distributed tracing: TraceID and SpanID. You can see them in Figure 1 below.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;TraceID&lt;/strong&gt; is a globally unique ID that identifies the trace information of a request. All traces of a request belong to the same TraceID, and the TraceID remains constant throughout the trace of the request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SpanID&lt;/strong&gt; is a locally unique ID that identifies a request’s trace information at a certain time. A request generates different SpanIDs at different periods, and SpanIDs are used to distinguish trace information for a request at different periods.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;TraceID and SpanID are the basis of distributed tracing. They provide a uniform identifier for request tracing in distributed systems and facilitate users’ ability to query, manage, and analyze the trace information of requests.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f1.svg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 1: Trace and span&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The following is the process of distributed tracing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When a system receives a request, the distributed tracing system assigns a TraceID to the request, which links together the entire chain of invocations.&lt;/li&gt;
&lt;li&gt;The distributed tracing system generates a SpanID and ParentID for each service call made within the system for the request, which record the parent-child relationship of the calls; a Span without a ParentID is the entry point of the call chain.&lt;/li&gt;
&lt;li&gt;The TraceID and SpanID are passed along with each service call.&lt;/li&gt;
&lt;li&gt;When viewing a distributed trace, query the full process of a particular request by TraceID.&lt;/li&gt;
&lt;/ul&gt;
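&lt;p&gt;The steps above can be sketched in a few lines of Python. This is a minimal illustration, not how Envoy or SkyWalking implement tracing: the &lt;em&gt;SpanContext&lt;/em&gt; class and the helper functions are hypothetical names, and the ID lengths (32 and 16 hexadecimal characters) are assumptions borrowed from common tracing formats.&lt;/p&gt;

```python
import secrets
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical container for the identifiers carried by one span."""
    trace_id: str             # shared by every span of the same request
    span_id: str              # identifies this unit of work
    parent_id: Optional[str]  # None marks the entry point of the call chain


def start_trace() -> SpanContext:
    """Create the entry span when a request first enters the system."""
    return SpanContext(trace_id=secrets.token_hex(16),
                       span_id=secrets.token_hex(8),
                       parent_id=None)


def start_child(parent: SpanContext) -> SpanContext:
    """Create a span for a downstream call: same TraceID, new SpanID."""
    return SpanContext(trace_id=parent.trace_id,
                       span_id=secrets.token_hex(8),
                       parent_id=parent.span_id)


def spans_for_trace(spans: List[SpanContext], trace_id: str) -> List[SpanContext]:
    """Query step: collect every recorded span that belongs to one TraceID."""
    return [span for span in spans if span.trace_id == trace_id]


entry = start_trace()         # request arrives
call_a = start_child(entry)   # first downstream call
call_b = start_child(call_a)  # call made by the downstream service
other = start_trace()         # an unrelated request
recorded = [entry, call_a, call_b, other]
```

&lt;p&gt;Here &lt;em&gt;entry&lt;/em&gt; has no ParentID, so it is the entry point of the call chain. &lt;em&gt;call_a&lt;/em&gt; and &lt;em&gt;call_b&lt;/em&gt; share its TraceID, so querying by that TraceID returns all three spans and excludes &lt;em&gt;other&lt;/em&gt;.&lt;/p&gt;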
&lt;h2 id=&#34;how-istio-implements-distributed-tracing&#34;&gt;How Istio Implements Distributed Tracing&lt;/h2&gt;
&lt;p&gt;Istio’s distributed tracing is based on information collected by the Envoy proxy in the data plane. After a service request is intercepted by Envoy, Envoy adds tracing information as headers to the request forwarded to the destination workload. The following headers are relevant for distributed tracing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As TraceID: x-request-id&lt;/li&gt;
&lt;li&gt;Used to establish parent-child relationships for Span in the LightStep trace: x-ot-span-context&lt;/li&gt;
&lt;li&gt;Used for Zipkin, also for Jaeger, SkyWalking, see &lt;a href=&#34;https://github.com/openzipkin/b3-propagation&#34;&gt;b3-propagation&lt;/a&gt;:
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;x-b3-traceid&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-b3-spanid&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-b3-parentspanid&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-b3-sampled&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-b3-flags&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;b3&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;For Datadog:
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;x-datadog-trace-id&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-datadog-parent-id&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;x-datadog-sampling-priority&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;For SkyWalking: &lt;em&gt;sw8&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;For AWS X-Ray: &lt;em&gt;x-amzn-trace-id&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For more information on how to use these headers, please see the &lt;a href=&#34;https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/headers&#34;&gt;Envoy documentation&lt;/a&gt;.&lt;/p&gt;
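&lt;p&gt;To make the relationship between the individual &lt;em&gt;x-b3-*&lt;/em&gt; headers and the single &lt;em&gt;b3&lt;/em&gt; header concrete, the sketch below folds the former into the latter, following the layout described in the b3-propagation document. The function name &lt;em&gt;b3_single_from_multi&lt;/em&gt; is hypothetical, and the sketch does no validation of the ID values.&lt;/p&gt;

```python
from typing import Mapping, Optional


def b3_single_from_multi(headers: Mapping[str, str]) -> Optional[str]:
    """Fold x-b3-* headers into the single `b3` header value.

    Single-header layout: {TraceId}-{SpanId}[-{SamplingState}[-{ParentSpanId}]]
    """
    # HTTP header names are case-insensitive, so normalise before matching.
    lowered = {name.lower(): value for name, value in headers.items()}
    trace_id = lowered.get("x-b3-traceid")
    span_id = lowered.get("x-b3-spanid")
    if not trace_id or not span_id:
        return None  # nothing to propagate without both identifiers

    parts = [trace_id, span_id]
    # In the single-header encoding, a debug flag of "1" becomes sampling state "d".
    sampling = "d" if lowered.get("x-b3-flags") == "1" else lowered.get("x-b3-sampled")
    parent = lowered.get("x-b3-parentspanid")
    if sampling is not None:
        parts.append(sampling)
        # The parent ID is positional, so it can only follow a sampling state.
        if parent:
            parts.append(parent)
    return "-".join(parts)


example = {
    "X-B3-TraceId": "80f198ee56343ba864fe8b2a57d3eff7",
    "X-B3-ParentSpanId": "05e3ac9a4f6e3b90",
    "X-B3-SpanId": "e457b5a2e4d86bd1",
    "X-B3-Sampled": "1",
}
print(b3_single_from_multi(example))
# 80f198ee56343ba864fe8b2a57d3eff7-e457b5a2e4d86bd1-1-05e3ac9a4f6e3b90
```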
&lt;p&gt;Regardless of the language of your application, Envoy will generate the appropriate tracing headers for you at the Ingress Gateway and forward these headers to the upstream cluster. However, in order to utilize the distributed tracing feature, you must modify your application code to attach the tracing headers to upstream requests. Since neither the service mesh nor the application can automatically propagate these headers, you can integrate the agent for distributed tracing into the application or manually propagate these headers in the application code itself. Once the tracing headers are propagated to all upstream requests, Envoy will send the tracing data to the tracer’s back-end processing, and then you can view the tracing data in the UI.&lt;/p&gt;
&lt;p&gt;For example, look at the code of the Productpage service in the &lt;a href=&#34;https://istio.io/latest/docs/examples/bookinfo/&#34;&gt;Bookinfo application&lt;/a&gt;. You can see that it integrates the Jaeger client library and, in the &lt;em&gt;getForwardHeaders(request)&lt;/em&gt; function, copies the headers generated by Envoy onto the HTTP requests it sends to the Details and Reviews services.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#6639ba&#34;&gt;getForwardHeaders&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;request&lt;span style=&#34;color:#1f2328&#34;&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    headers &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#57606a&#34;&gt;# Using Jaeger agent to get the x-b3-* headers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    span &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; get_current_span&lt;span style=&#34;color:#1f2328&#34;&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    carrier &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    tracer&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;inject&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        span_context&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;span&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;context&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#6639ba&#34;&gt;format&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;Format&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;HTTP_HEADERS&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        carrier&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;carrier&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    headers&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;update&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;carrier&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#57606a&#34;&gt;# Dealing with the non x-b3-* header manually&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;in&lt;/span&gt; session&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        headers&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;end-user&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; session&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;user&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    incoming_headers &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-request-id&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-ot-span-context&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-datadog-trace-id&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-datadog-parent-id&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-datadog-sampling-priority&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;traceparent&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;tracestate&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;x-cloud-trace-context&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;grpc-trace-bin&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;sw8&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;user-agent&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;cookie&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;authorization&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;jwt&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;for&lt;/span&gt; ihdr &lt;span style=&#34;color:#0550ae&#34;&gt;in&lt;/span&gt; incoming_headers&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        val &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; request&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;headers&lt;span style=&#34;color:#0550ae&#34;&gt;.&lt;/span&gt;get&lt;span style=&#34;color:#1f2328&#34;&gt;(&lt;/span&gt;ihdr&lt;span style=&#34;color:#1f2328&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#cf222e&#34;&gt;if&lt;/span&gt; val &lt;span style=&#34;color:#0550ae&#34;&gt;is&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;not&lt;/span&gt; &lt;span style=&#34;color:#cf222e&#34;&gt;None&lt;/span&gt;&lt;span style=&#34;color:#1f2328&#34;&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            headers&lt;span style=&#34;color:#1f2328&#34;&gt;[&lt;/span&gt;ihdr&lt;span style=&#34;color:#1f2328&#34;&gt;]&lt;/span&gt; &lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt; val
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#cf222e&#34;&gt;return&lt;/span&gt; headers
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For more information, the &lt;a href=&#34;https://istio.io/latest/about/faq/#distributed-tracing&#34;&gt;Istio documentation&lt;/a&gt; provides answers to frequently asked questions about distributed tracing in Istio.&lt;/p&gt;
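&lt;p&gt;If you choose manual propagation without a tracing client library, the core of the work is copying a known set of headers from the inbound request to each outbound request. The following is a minimal, framework-independent sketch under that assumption. The names &lt;em&gt;TRACE_HEADERS&lt;/em&gt; and &lt;em&gt;forward_trace_headers&lt;/em&gt; are hypothetical, and the header list is only a subset chosen for illustration; adjust it to the tracer you use.&lt;/p&gt;

```python
from typing import Dict, Mapping

# Headers to copy from the inbound request to every outbound request.
# Illustrative subset of the tracing headers discussed in this article.
TRACE_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
    "sw8",
)


def forward_trace_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Return only the tracing headers that are present on the inbound request."""
    # HTTP header names are case-insensitive, so normalise before matching.
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


inbound = {
    "X-Request-Id": "example-request-id",  # placeholder value
    "X-B3-Sampled": "1",
    "Accept": "text/html",                 # not a tracing header, so it is dropped
}
outbound_headers = forward_trace_headers(inbound)
```

&lt;p&gt;The resulting dictionary can be passed as the headers of whatever HTTP client the service uses for its upstream calls.&lt;/p&gt;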
&lt;h2 id=&#34;how-to-choose-a-distributed-tracing-system&#34;&gt;How to Choose A Distributed Tracing System&lt;/h2&gt;
&lt;p&gt;Distributed tracing systems are similar in principle. There are many such systems on the market, such as &lt;a href=&#34;https://github.com/apache/skywalking&#34;&gt;Apache SkyWalking&lt;/a&gt;, &lt;a href=&#34;https://github.com/jaegertracing/jaeger&#34;&gt;Jaeger&lt;/a&gt;, &lt;a href=&#34;https://github.com/openzipkin/zipkin/&#34;&gt;Zipkin&lt;/a&gt;, &lt;a href=&#34;https://lightstep.com/&#34;&gt;Lightstep&lt;/a&gt;, &lt;a href=&#34;https://github.com/pinpoint-apm/pinpoint&#34;&gt;Pinpoint&lt;/a&gt;, and so on. For our purposes here, we will choose three of them and compare them in several dimensions. Here are our inclusion criteria:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;They are currently the most popular open-source distributed tracing systems.&lt;/li&gt;
&lt;li&gt;All are based on the OpenTracing specification.&lt;/li&gt;
&lt;li&gt;They support integration with Istio and Envoy.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Items&lt;/th&gt;
          &lt;th&gt;Apache SkyWalking&lt;/th&gt;
          &lt;th&gt;Jaeger&lt;/th&gt;
          &lt;th&gt;Zipkin&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Implementations&lt;/td&gt;
          &lt;td&gt;Language-based probes, service mesh probes, eBPF agent, third-party instrumental libraries (Zipkin currently supported)&lt;/td&gt;
          &lt;td&gt;Language-based probes&lt;/td&gt;
          &lt;td&gt;Language-based probes&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Database&lt;/td&gt;
          &lt;td&gt;ES, H2, MySQL, TiDB, Sharding-sphere, BanyanDB&lt;/td&gt;
          &lt;td&gt;ES, MySQL, Cassandra, Memory&lt;/td&gt;
          &lt;td&gt;ES, MySQL, Cassandra, Memory&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Supported Languages&lt;/td&gt;
          &lt;td&gt;Java, Rust, PHP, NodeJS, Go, Python, C++, .Net, Lua&lt;/td&gt;
          &lt;td&gt;Java, Go, Python, NodeJS, C#, PHP, Ruby, C++&lt;/td&gt;
          &lt;td&gt;Java, Go, Python, NodeJS, C#, PHP, Ruby, C++&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Initiator&lt;/td&gt;
          &lt;td&gt;Personal&lt;/td&gt;
          &lt;td&gt;Uber&lt;/td&gt;
          &lt;td&gt;Twitter&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Governance&lt;/td&gt;
          &lt;td&gt;Apache Software Foundation&lt;/td&gt;
          &lt;td&gt;CNCF&lt;/td&gt;
          &lt;td&gt;OpenZipkin community&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Version&lt;/td&gt;
          &lt;td&gt;9.3.0&lt;/td&gt;
          &lt;td&gt;1.39.0&lt;/td&gt;
          &lt;td&gt;2.23.19&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Stars&lt;/td&gt;
          &lt;td&gt;20.9k&lt;/td&gt;
          &lt;td&gt;16.8k&lt;/td&gt;
          &lt;td&gt;15.8k&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Although Apache SkyWalking’s agents support fewer languages than those of Jaeger and Zipkin, SkyWalking offers more ways to collect data (language agents, service mesh probes, and eBPF), is compatible with Jaeger and Zipkin trace data, and is more actively developed, so it is one of the best choices for building a telemetry platform.&lt;/p&gt;
&lt;h2 id=&#34;demo&#34;&gt;Demo&lt;/h2&gt;
&lt;p&gt;Refer to the &lt;a href=&#34;https://istio.io/latest/docs/tasks/observability/distributed-tracing/skywalking/&#34;&gt;Istio documentation&lt;/a&gt; to install and configure Apache SkyWalking.&lt;/p&gt;
&lt;h3 id=&#34;environment-description&#34;&gt;Environment Description&lt;/h3&gt;
&lt;p&gt;The following is the environment for our demo:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Kubernetes 1.24.5&lt;/li&gt;
&lt;li&gt;Istio 1.16&lt;/li&gt;
&lt;li&gt;SkyWalking 9.1.0&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;install-istio&#34;&gt;Install Istio&lt;/h3&gt;
&lt;p&gt;Before installing Istio, you can check the environment for any problems:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;$ istioctl experimental precheck
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;✔ No issues found when checking the cluster. Istio is safe to install or upgrade!
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  To get started, check out https://istio.io/latest/docs/setup/getting-started/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then install Istio and configure SkyWalking as the destination for trace data:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Initialize the Istio Operator&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl operator init
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Configure tracing destination&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f - &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;apiVersion: install.istio.io/v1alpha1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;kind: IstioOperator
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;metadata:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  namespace: istio-system
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  name: istio-with-skywalking
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;spec:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  meshConfig:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    defaultProviders:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;      tracing:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;      - &amp;#34;skywalking&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    enableTracing: true
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    extensionProviders:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;    - name: &amp;#34;skywalking&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;      skywalking:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        service: tracing.istio-system.svc.cluster.local
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;        port: 11800
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;deploy-apache-skywalking&#34;&gt;Deploy Apache SkyWalking&lt;/h2&gt;
&lt;p&gt;Istio 1.16 supports distributed tracing using Apache SkyWalking. Install SkyWalking by executing the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.16/samples/addons/extras/skywalking.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It will install the following components under the &lt;em&gt;istio-system&lt;/em&gt; namespace:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/v9.3.0/en/concepts-and-designs/backend-overview/&#34;&gt;SkyWalking Observability Analysis Platform (OAP)&lt;/a&gt;: Receives trace data. It supports the SkyWalking native format as well as the Zipkin v1, Zipkin v2, and Jaeger formats.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/docs/main/v9.3.0/en/ui/readme/&#34;&gt;UI&lt;/a&gt;: Used to query distributed trace data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For more information about SkyWalking, please refer to the &lt;a href=&#34;https://skywalking.apache.org/docs/main/v9.3.0/readme/&#34;&gt;SkyWalking documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;deploy-the-bookinfo-application&#34;&gt;Deploy the Bookinfo Application&lt;/h2&gt;
&lt;p&gt;Execute the following command to install the bookinfo application:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl label namespace default istio-injection&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;enabled
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Launch the SkyWalking UI:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl dashboard skywalking
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Figure 2 shows all the services available in the bookinfo application:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f2.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 2: SkyWalking General Service page&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;You can also see information about instances, endpoints, topology, tracing, etc. For example, Figure 3 shows the service topology of the bookinfo application:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f3.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 3: Topology diagram of the Bookinfo application&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Tracing views in SkyWalking can be displayed in a variety of formats, including list, tree, table, and statistics. See Figure 4:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f4.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 4: SkyWalking General Service trace supports multiple display formats&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;To facilitate our examination, set the sampling rate of the trace to 100%:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f - &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;apiVersion: telemetry.istio.io/v1alpha1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;kind: Telemetry
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;metadata:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  name: mesh-default
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  namespace: istio-system
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;spec:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  tracing:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;  - randomSamplingPercentage: 100.00
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; &lt;em&gt;It’s generally not good practice to set the sampling rate to 100% in a production environment. To avoid the overhead of generating too many trace logs in production, please adjust the sampling strategy (sampling percentage).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
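&lt;p&gt;To see what a sampling percentage means in practice, the sketch below makes a random keep-or-drop decision for each new trace. It illustrates the general idea of percentage-based sampling only and is not Envoy’s actual implementation; the function name &lt;em&gt;should_sample&lt;/em&gt; and the fixed seed are assumptions made for the example.&lt;/p&gt;

```python
import random


def should_sample(percentage: float, rng: random.Random) -> bool:
    """Decide whether to record a new trace, given a percentage from 0 to 100."""
    if percentage > 100.0 or 0.0 > percentage:
        raise ValueError("percentage must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so 100 always samples and 0 never does.
    return percentage > rng.random() * 100.0


rng = random.Random(42)  # fixed seed so the run is repeatable
requests = 100_000
kept_at_1_percent = sum(should_sample(1.0, rng) for _ in range(requests))
kept_at_100_percent = sum(should_sample(100.0, rng) for _ in range(requests))
print(kept_at_1_percent, kept_at_100_percent)
```

&lt;p&gt;At 100% every one of the requests is traced, while at 1% only about one in a hundred is, which is why a lower percentage reduces tracing overhead in production.&lt;/p&gt;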
&lt;h2 id=&#34;uninstall&#34;&gt;Uninstall&lt;/h2&gt;
&lt;p&gt;After experimenting, uninstall the Bookinfo application, Istio, and SkyWalking by executing the following commands:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;samples/bookinfo/platform/kube/cleanup.sh
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl uninstall --purge
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl delete namespace istio-system
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;understanding-the-bookinfo-tracing-information&#34;&gt;Understanding the Bookinfo Tracing Information&lt;/h2&gt;
&lt;p&gt;Navigate to the General Service tab in the Apache SkyWalking UI, and you can see the trace information for the most recent &lt;em&gt;istio-ingressgateway&lt;/em&gt; service, as shown in Figure 5. Click on each span to see the details.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f5.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 5: The table view shows the basic information about each span.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Switching to the list view, you can see the execution order and duration of each span, as shown in Figure 6:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f6.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 6: List display&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;You might wonder why such a straightforward application generates so much span data. The reason is that, once the Envoy proxy is injected into a pod, every request between services is intercepted and processed by Envoy, as shown in Figure 7:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f7.svg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 7: Envoy intercepts requests to generate a span&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The tracing process is shown in Figure 8:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;f8.svg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Figure 8: Trace of the Bookinfo application&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;We give each span a label with a serial number, and the time taken is indicated in parentheses. For illustration purposes, we have summarized all spans in the table below.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;No.&lt;/th&gt;
          &lt;th&gt;Endpoint&lt;/th&gt;
          &lt;th&gt;Total Duration (ms)&lt;/th&gt;
          &lt;th&gt;Component Duration (ms)&lt;/th&gt;
          &lt;th&gt;Current Service&lt;/th&gt;
          &lt;th&gt;Description&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;/productpage&lt;/td&gt;
          &lt;td&gt;190&lt;/td&gt;
          &lt;td&gt;0&lt;/td&gt;
          &lt;td&gt;istio-ingressgateway&lt;/td&gt;
          &lt;td&gt;Envoy Outbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;/productpage&lt;/td&gt;
          &lt;td&gt;190&lt;/td&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;istio-ingressgateway&lt;/td&gt;
          &lt;td&gt;Ingress -&amp;gt; Productpage network transmission&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;3&lt;/td&gt;
          &lt;td&gt;/productpage&lt;/td&gt;
          &lt;td&gt;189&lt;/td&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Envoy Inbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;4&lt;/td&gt;
          &lt;td&gt;/productpage&lt;/td&gt;
          &lt;td&gt;188&lt;/td&gt;
          &lt;td&gt;21&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Application internal processing&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;5&lt;/td&gt;
          &lt;td&gt;/details/0&lt;/td&gt;
          &lt;td&gt;8&lt;/td&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Envoy Outbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;6&lt;/td&gt;
          &lt;td&gt;/details/0&lt;/td&gt;
          &lt;td&gt;7&lt;/td&gt;
          &lt;td&gt;3&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Productpage -&amp;gt; Details network transmission&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;7&lt;/td&gt;
          &lt;td&gt;/details/0&lt;/td&gt;
          &lt;td&gt;4&lt;/td&gt;
          &lt;td&gt;0&lt;/td&gt;
          &lt;td&gt;details&lt;/td&gt;
          &lt;td&gt;Envoy Inbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;8&lt;/td&gt;
          &lt;td&gt;/details/0&lt;/td&gt;
          &lt;td&gt;4&lt;/td&gt;
          &lt;td&gt;4&lt;/td&gt;
          &lt;td&gt;details&lt;/td&gt;
          &lt;td&gt;Application internal processing&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;9&lt;/td&gt;
          &lt;td&gt;/reviews/0&lt;/td&gt;
          &lt;td&gt;159&lt;/td&gt;
          &lt;td&gt;0&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Envoy Outbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;10&lt;/td&gt;
          &lt;td&gt;/reviews/0&lt;/td&gt;
          &lt;td&gt;159&lt;/td&gt;
          &lt;td&gt;14&lt;/td&gt;
          &lt;td&gt;productpage&lt;/td&gt;
          &lt;td&gt;Productpage -&amp;gt; Reviews network transmission&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;11&lt;/td&gt;
          &lt;td&gt;/reviews/0&lt;/td&gt;
          &lt;td&gt;145&lt;/td&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;reviews&lt;/td&gt;
          &lt;td&gt;Envoy Inbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;12&lt;/td&gt;
          &lt;td&gt;/reviews/0&lt;/td&gt;
          &lt;td&gt;144&lt;/td&gt;
          &lt;td&gt;109&lt;/td&gt;
          &lt;td&gt;reviews&lt;/td&gt;
          &lt;td&gt;Application internal processing&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;13&lt;/td&gt;
          &lt;td&gt;/ratings/0&lt;/td&gt;
          &lt;td&gt;35&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;reviews&lt;/td&gt;
          &lt;td&gt;Envoy Outbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;14&lt;/td&gt;
          &lt;td&gt;/ratings/0&lt;/td&gt;
          &lt;td&gt;33&lt;/td&gt;
          &lt;td&gt;16&lt;/td&gt;
          &lt;td&gt;reviews&lt;/td&gt;
          &lt;td&gt;Reviews -&amp;gt; Ratings network transmission&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;15&lt;/td&gt;
          &lt;td&gt;/ratings/0&lt;/td&gt;
          &lt;td&gt;17&lt;/td&gt;
          &lt;td&gt;1&lt;/td&gt;
          &lt;td&gt;ratings&lt;/td&gt;
          &lt;td&gt;Envoy Inbound&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;16&lt;/td&gt;
          &lt;td&gt;/ratings/0&lt;/td&gt;
          &lt;td&gt;16&lt;/td&gt;
          &lt;td&gt;16&lt;/td&gt;
          &lt;td&gt;ratings&lt;/td&gt;
          &lt;td&gt;Application internal processing&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;From the above information, it can be seen that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The total time consumed for this request is 190 ms.&lt;/li&gt;
&lt;li&gt;In Istio sidecar mode, each traffic flow in and out of the application container must pass through the Envoy proxy once, each time taking 0 to 2 ms.&lt;/li&gt;
&lt;li&gt;Network requests between Pods take between 1 and 16 ms. These figures are approximate: the data itself contains measurement errors, and the start time of a span is not necessarily equal to the end time of its parent span.&lt;/li&gt;
&lt;li&gt;The most time-consuming part is the internal processing of the Reviews application, which takes 109 ms, so optimization effort should focus on that application.&lt;/li&gt;
&lt;/ul&gt;
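&lt;p&gt;The Component Duration column can be reproduced from the span durations alone. The Python sketch below assumes that the time spent in a component is the total duration of its span minus the total durations of its direct child spans. The parent links in &lt;code&gt;SPANS&lt;/code&gt; are inferred from the order of the rows in the table, not exported from SkyWalking, and the function name is hypothetical.&lt;/p&gt;

```python
# Each entry: (row number, total duration in ms, parent row number or None).
# Parent links are an assumption inferred from the row order in the table.
SPANS = [
    (1, 190, None), (2, 190, 1), (3, 189, 2), (4, 188, 3),
    (5, 8, 4), (6, 7, 5), (7, 4, 6), (8, 4, 7),
    (9, 159, 4), (10, 159, 9), (11, 145, 10), (12, 144, 11),
    (13, 35, 12), (14, 33, 13), (15, 17, 14), (16, 16, 15),
]


def component_durations(spans):
    """Return {row: total duration minus the durations of its direct children}."""
    total = {row: duration for row, duration, _ in spans}
    children = {row: 0 for row in total}
    for _, duration, parent in spans:
        if parent is not None:
            children[parent] += duration
    return {row: total[row] - children[row] for row in total}


durations = component_durations(SPANS)
slowest = max(durations, key=durations.get)
print(f"slowest component: span {slowest} ({durations[slowest]} ms)")
```

&lt;p&gt;With the values from the table, the sketch reports span 12 (internal processing in the Reviews application) at 109 ms, and the component durations add up to the 190 ms total.&lt;/p&gt;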
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;Distributed tracing is an indispensable tool for analyzing performance and troubleshooting modern distributed applications. In this tutorial, we’ve seen how, with just a few minor changes to your application code to propagate tracing headers, Istio makes distributed tracing simple to use. We’ve also reviewed &lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;Apache SkyWalking&lt;/a&gt; as one of the best distributed tracing systems that Istio supports. It is a fully functional platform for cloud native application analytics, with features such as metrics and log collection, alerting, Kubernetes monitoring, &lt;a href=&#34;https://skywalking.apache.org/blog/diagnose-service-mesh-network-performance-with-ebpf/&#34;&gt;service mesh performance diagnosis using eBPF&lt;/a&gt;, and more.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;If you’re new to service mesh and Kubernetes security, we have a bunch of free online courses &lt;a href=&#34;https://tetr8.io/academy&#34;&gt;available at Tetrate Academy&lt;/a&gt; that will quickly get you up to speed with Istio and Envoy.&lt;/p&gt;
&lt;p&gt;If you’re looking for a fast way to get to production with Istio, check out &lt;a href=&#34;https://tetr8.io/tid&#34;&gt;Tetrate Istio Distribution (TID)&lt;/a&gt;. TID is Tetrate’s hardened, fully upstream Istio distribution, with FIPS-verified builds and support available. It’s a great way to get started with Istio knowing you have a trusted distribution to begin with, have an expert team supporting you, and also have the option to get to FIPS compliance quickly if you need to.&lt;/p&gt;
&lt;p&gt;Once you have Istio up and running, you will probably need simpler ways to manage and secure your services beyond what’s available in Istio. That’s where Tetrate Service Bridge comes in. You can learn more about how Tetrate Service Bridge makes service mesh more secure, manageable, and resilient &lt;a href=&#34;https://tetr8.io/tsb&#34;&gt;here&lt;/a&gt;, or &lt;a href=&#34;https://tetr8.io/contact&#34;&gt;contact us for a quick demo&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: How to run Apache SkyWalking on AWS EKS and RDS/Aurora</title>
      <link>/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</link>
      <pubDate>Tue, 13 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/</guid>
      <description>
        
        
        &lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Apache SkyWalking is an open source APM tool for monitoring and troubleshooting distributed systems,
designed especially for microservices, cloud native, and container-based (Docker, Kubernetes, Mesos)
architectures. It provides distributed tracing, service mesh observability, metric aggregation and
visualization, and alerting.&lt;/p&gt;
&lt;p&gt;In this article, I will show how to quickly set up Apache SkyWalking on AWS EKS and RDS/Aurora,
along with a couple of sample services and the monitoring services that observe SkyWalking itself.&lt;/p&gt;
&lt;h2 id=&#34;prerequisites&#34;&gt;Prerequisites&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;AWS account&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html&#34;&gt;AWS CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.terraform.io/downloads.html&#34;&gt;Terraform&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://kubernetes.io/docs/tasks/tools/#kubectl&#34;&gt;kubectl&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We could use the AWS web console or CLI to create all the resources needed in this tutorial, but doing so is
tedious and hard to debug when something goes wrong. So in this article I will use Terraform to
create all AWS resources and deploy SkyWalking, the sample services, and the load generator services (Locust).&lt;/p&gt;
&lt;h2 id=&#34;architecture&#34;&gt;Architecture&lt;/h2&gt;
&lt;p&gt;The demo architecture is as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-mermaid&#34; data-lang=&#34;mermaid&#34;&gt;graph LR
    subgraph AWS
        subgraph EKS
          subgraph istio-system namespace
              direction TB
              OAP[[SkyWalking OAP]]
              UI[[SkyWalking UI]]
            Istio[[istiod]]
          end
          subgraph sample namespace
              Service0[[Service0]]
              Service1[[Service1]]
              ServiceN[[Service ...]]
          end
          subgraph locust namespace
              LocustMaster[[Locust Master]]
              LocustWorkers0[[Locust Worker 0]]
              LocustWorkers1[[Locust Worker 1]]
              LocustWorkersN[[Locust Worker ...]]
          end
        end
        RDS[[RDS/Aurora]]
    end
    OAP --&amp;gt; RDS
    Service0 -. telemetry data -.-&amp;gt; OAP
    Service1 -. telemetry data -.-&amp;gt; OAP
    ServiceN -. telemetry data -.-&amp;gt; OAP
    UI --query--&amp;gt; OAP
    LocustWorkers0 -- traffic --&amp;gt; Service0
    LocustWorkers1 -- traffic --&amp;gt; Service0
    LocustWorkersN -- traffic --&amp;gt; Service0
    Service0 --&amp;gt; Service1 --&amp;gt; ServiceN
    LocustMaster --&amp;gt; LocustWorkers0
    LocustMaster --&amp;gt; LocustWorkers1
    LocustMaster --&amp;gt; LocustWorkersN
    User --&amp;gt; LocustMaster
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As shown in the architecture diagram, we need to create the following AWS resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;EKS cluster&lt;/li&gt;
&lt;li&gt;RDS instance or Aurora cluster&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This sounds simple, but there is a lot going on behind the scenes, such as the VPC, subnets, and security groups.
You have to configure them correctly so that the EKS cluster can connect to the RDS instance or Aurora cluster;
otherwise SkyWalking won&amp;rsquo;t work. Luckily, Terraform can create and destroy all these resources
automatically.&lt;/p&gt;
&lt;p&gt;I have created a Terraform module that creates all the AWS resources needed in this tutorial; you can find it in the
&lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/tree/main/aws&#34;&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;create-aws-resources&#34;&gt;Create AWS resources&lt;/h2&gt;
&lt;p&gt;First, we need to clone the GitHub repository and &lt;code&gt;cd&lt;/code&gt; into the folder:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/kezhenxu94/oap-load-test.git
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cd oap-load-test/aws
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then, we need to create a file named &lt;code&gt;terraform.tfvars&lt;/code&gt; to specify the AWS region and other variables:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cat &amp;gt; terraform.tfvars &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;lt;&amp;lt;EOF
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_access_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;aws_secret_key = &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;cluster_name   = &amp;#34;skywalking-on-aws&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;region         = &amp;#34;ap-east-1&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;db_type        = &amp;#34;rds-postgresql&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;EOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you have already configured the AWS CLI, you can skip the &lt;code&gt;aws_access_key&lt;/code&gt; and &lt;code&gt;aws_secret_key&lt;/code&gt; variables.
To install SkyWalking with RDS PostgreSQL, set &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;rds-postgresql&lt;/code&gt;; to install it with
Aurora PostgreSQL, set &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;aurora-postgresql&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There are many other variables you can configure, such as tags, the number of sample services, and replicas;
you can find them in &lt;a href=&#34;https://github.com/kezhenxu94/oap-load-test/blob/main/aws/variables.tf&#34;&gt;variables.tf&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then, we can run the following commands to initialize the Terraform module and download the required providers,
then create all AWS resources:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform init
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Type &lt;code&gt;yes&lt;/code&gt; to confirm the creation of all AWS resources, or add the &lt;code&gt;-auto-approve&lt;/code&gt; flag to the &lt;code&gt;terraform apply&lt;/code&gt;
command to skip the confirmation:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform apply -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now wait for the creation of all AWS resources to complete; it may take a few minutes.
You can check the progress in the AWS web console, and check the deployment progress of the services
inside the EKS cluster.&lt;/p&gt;
&lt;h2 id=&#34;generate-traffic&#34;&gt;Generate traffic&lt;/h2&gt;
&lt;p&gt;Besides creating necessary AWS resources, the Terraform module also deploys SkyWalking, sample services, and Locust
load generator services to the EKS cluster.&lt;/p&gt;
&lt;p&gt;You can access the Locust web UI to generate traffic to the sample services:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;open http://&lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl get svc -n locust -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;locust-master -o &lt;span style=&#34;color:#953800&#34;&gt;jsonpath&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;{.items[0].status.loadBalancer.ingress[0].hostname}&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt;:8089
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The command opens the Locust web UI in your browser, where you can configure the number of users and the hatch rate to generate
traffic.&lt;/p&gt;
&lt;h2 id=&#34;observe-skywalking&#34;&gt;Observe SkyWalking&lt;/h2&gt;
&lt;p&gt;You can access the SkyWalking web UI to observe the sample services.&lt;/p&gt;
&lt;p&gt;First, forward the SkyWalking UI port to your local machine:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl -n istio-system port-forward &lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl -n istio-system get pod -l &lt;span style=&#34;color:#953800&#34;&gt;app&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;skywalking,component&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;ui -o name&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt; 8080:8080
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then open http://localhost:8080 in your browser to access the SkyWalking web UI.&lt;/p&gt;
&lt;h2 id=&#34;observe-rdsaurora&#34;&gt;Observe RDS/Aurora&lt;/h2&gt;
&lt;p&gt;You can also access the RDS/Aurora web console to observe the performance of the RDS instance or Aurora cluster.&lt;/p&gt;
&lt;h2 id=&#34;test-results&#34;&gt;Test Results&lt;/h2&gt;
&lt;h3 id=&#34;test-1-skywalking-with-eks-and-rds-postgresql&#34;&gt;Test 1: SkyWalking with EKS and RDS PostgreSQL&lt;/h3&gt;
&lt;h4 id=&#34;service-traffic&#34;&gt;Service Traffic&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-cpm.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-performance&#34;&gt;RDS Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-performance&#34;&gt;SkyWalking Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/postgresql/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/postgresql/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;test-2-skywalking-with-eks-and-aurora-postgresql&#34;&gt;Test 2: SkyWalking with EKS and Aurora PostgreSQL&lt;/h3&gt;
&lt;h4 id=&#34;service-traffic-1&#34;&gt;Service Traffic&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-cpm-locust.png&#34; alt=&#34;Service Traffic Locust&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-cpm-skywalking.png&#34; alt=&#34;Service Traffic SW&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;rds-performance-1&#34;&gt;RDS Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-postgresql-1.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-2.png&#34; alt=&#34;RDS Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-postgresql-3.png&#34; alt=&#34;RDS Performance&#34;&gt;&lt;/p&gt;
&lt;h4 id=&#34;skywalking-performance-1&#34;&gt;SkyWalking Performance&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;./outputs/aurora/test1-so11y-1.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-2.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-3.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-4.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;
&lt;img src=&#34;./outputs/aurora/test1-so11y-5.png&#34; alt=&#34;SkyWalking Performance&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;clean-up&#34;&gt;Clean up&lt;/h2&gt;
&lt;p&gt;When you are done with the demo, you can run the following command to destroy all AWS resources:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;terraform destroy -var-file&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;terraform.tfvars -auto-approve
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
      </description>
    </item>
    
    <item>
      <title>Blog: Pinpoint Service Mesh Critical Performance Impact by using eBPF</title>
      <link>/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf/</link>
      <pubDate>Tue, 05 Jul 2022 00:00:00 +0000</pubDate>
      <guid>/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf/</guid>
      <description>
        
        
        &lt;h3 id=&#34;content&#34;&gt;Content&lt;/h3&gt;
&lt;h1 id=&#34;background&#34;&gt;Background&lt;/h1&gt;
&lt;p&gt;&lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;Apache SkyWalking&lt;/a&gt; observes metrics, logs, traces, and events for services deployed into the service mesh. When troubleshooting, SkyWalking error analysis can be an invaluable tool helping to pinpoint where an error occurred. However, performance problems are more difficult: It’s often impossible to locate the root cause of performance problems with pre-existing observation data. To move beyond the status quo, dynamic debugging and troubleshooting are essential service performance tools. In this article, we&amp;rsquo;ll discuss how to use eBPF technology to improve the profiling feature in SkyWalking and analyze the performance impact in the service mesh.&lt;/p&gt;
&lt;h1 id=&#34;trace-profiling-in-skywalking&#34;&gt;Trace Profiling in SkyWalking&lt;/h1&gt;
&lt;p&gt;Since SkyWalking 7.0.0, Trace Profiling has helped developers find performance problems by periodically sampling the thread stack to let developers know which lines of code take more time. However, Trace Profiling is not suitable for the following scenarios:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Thread Model&lt;/strong&gt;: Trace Profiling is most useful for profiling code that executes in a single thread. It is less useful for middleware that relies heavily on async execution models, such as Goroutines in Go or Kotlin Coroutines.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language&lt;/strong&gt;: Currently, Trace Profiling is only supported in Java and Python, since it’s not easy to obtain the thread stack in the runtimes of some languages such as Go and Node.js.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Binding&lt;/strong&gt;: Trace Profiling requires Agent installation, which can be tricky depending on the language (e.g., PHP has to rely on its C kernel; Rust and C/C++ require manual instrumentation to install).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trace Correlation&lt;/strong&gt;: Since Trace Profiling is only associated with a single request, it can be hard to determine which request is causing the problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Short Lifecycle Services&lt;/strong&gt;: Trace Profiling doesn&amp;rsquo;t support short-lived services for (at least) two reasons:
&lt;ol&gt;
&lt;li&gt;It&amp;rsquo;s hard to differentiate system performance from class code manipulation in the booting stage.&lt;/li&gt;
&lt;li&gt;Trace profiling is linked to an endpoint to identify performance impact, but there is no endpoint to match these short-lived services.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Fortunately, there are techniques that can go further than Trace Profiling in these situations.&lt;/p&gt;
&lt;h1 id=&#34;introduce-ebpf&#34;&gt;Introduce eBPF&lt;/h1&gt;
&lt;p&gt;We have found that eBPF — a technology that can run sandboxed programs in an operating system kernel and thus safely and efficiently extend the capabilities of the kernel without requiring kernel modifications or loading kernel modules — can help us fill gaps left by Trace Profiling. eBPF is a trending technology because it breaks the traditional barrier between user and kernel space. Programs can now inject bytecode that runs in the kernel, instead of having to recompile the kernel to customize it. This is naturally a good fit for observability.&lt;/p&gt;
&lt;p&gt;In the figure below, we can see that when the system executes the execve syscall, the eBPF program is triggered, and the current process runtime information is obtained by using function calls.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;eBPF-hook-points.png&#34; alt=&#34;eBPF Hook Point&#34;&gt;&lt;/p&gt;
&lt;p&gt;Using eBPF technology, we can expand the scope of SkyWalking&amp;rsquo;s profiling capabilities:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Global Performance Analysis&lt;/strong&gt;: Before eBPF, data collection was limited to what agents can observe. Since eBPF programs run in the kernel, they can observe all threads. This is especially useful when you are not sure whether a performance problem is caused by a particular request.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data Content&lt;/strong&gt;: eBPF can dump both user and kernel space thread stacks, so if a performance issue happens in kernel space, it’s easier to find.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Binding&lt;/strong&gt;: All modern Linux kernels support eBPF, so there is no need to install anything. This makes it an orchestration-free model rather than an agent model, which reduces friction caused by built-in software that may not have the correct agents installed, such as Envoy in a Service Mesh.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sampling Type&lt;/strong&gt;: Unlike Trace Profiling, eBPF is event-driven and, therefore, not constrained by interval polling. For example, eBPF can trigger events and collect more data depending on a transfer size threshold. This can allow the system to triage and prioritize data collection under extreme load.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;ebpf-limitations&#34;&gt;eBPF Limitations&lt;/h2&gt;
&lt;p&gt;While eBPF offers significant advantages for hunting performance bottlenecks, no technology is perfect. eBPF has a number of limitations described below. Fortunately, since SkyWalking does not require eBPF, the impact is limited.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Linux Version Requirement&lt;/strong&gt;: eBPF programs require a Linux kernel version above 4.4, with later kernel versions offering more data to be collected. The BCC has &lt;a href=&#34;https://github.com/iovisor/bcc/blob/13b5563c11f7722a61a17c6ca0a1a387d2fa7788/docs/kernel-versions.md#main-features&#34;&gt;documented the features supported by different Linux kernel versions&lt;/a&gt;, with the differences between versions usually being what data can be collected with eBPF.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privileges Required&lt;/strong&gt;: All processes that intend to load eBPF programs into the Linux kernel must be running in privileged mode. As such, bugs or other issues in such code may have a big impact.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weak Support for Dynamic Languages&lt;/strong&gt;: eBPF has weak support for JIT-based dynamic languages, such as Java, although how much this matters depends on what data you want to collect. For profiling, eBPF does not support parsing the symbols of the program, which is why most eBPF-based profiling technologies only support static languages like C, C++, Go, and Rust. Symbol mapping can sometimes be solved through tools provided by the language; for example, in Java, &lt;a href=&#34;https://github.com/jvm-profiling-tools/perf-map-agent#architecture&#34;&gt;perf-map-agent&lt;/a&gt; can be used to generate the symbol mapping. Even so, dynamic languages don&amp;rsquo;t support the attach (uprobe) functionality that would allow us to trace execution events through symbols.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;introducing-skywalking-rover&#34;&gt;Introducing SkyWalking Rover&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking-rover&#34;&gt;SkyWalking Rover&lt;/a&gt; introduces the eBPF profiling feature into the SkyWalking ecosystem. The figure below shows the overall architecture of SkyWalking Rover. SkyWalking Rover is currently supported in Kubernetes environments and must be deployed inside a Kubernetes cluster. After establishing a connection with the SkyWalking backend server, it saves information about the processes on the current machine to SkyWalking. When the user creates an eBPF profiling task via the user interface, SkyWalking Rover receives the task and executes it against the relevant programs written in C, C++, Go, or Rust.&lt;/p&gt;
&lt;p&gt;Other than an eBPF-capable kernel, there are no additional prerequisites for deploying SkyWalking Rover.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;architecture.png&#34; alt=&#34;architecture&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;cpu-profiling-with-rover&#34;&gt;CPU Profiling with Rover&lt;/h2&gt;
&lt;p&gt;CPU profiling is the most intuitive way to show service performance. Inspired by &lt;a href=&#34;https://www.brendangregg.com/offcpuanalysis.html&#34;&gt;Brendan Gregg’s blog post&lt;/a&gt;, we&amp;rsquo;ve divided CPU profiling into two types that we have implemented in Rover:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;On-CPU Profiling&lt;/strong&gt;: Where threads are spending time running on-CPU.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Off-CPU Profiling&lt;/strong&gt;: Where time is spent waiting while blocked on I/O, locks, timers, paging/swapping, etc.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&#34;profiling-envoy-with-ebpf&#34;&gt;Profiling Envoy with eBPF&lt;/h1&gt;
&lt;p&gt;Envoy is a popular proxy, used as the data plane by the Istio service mesh. In a Kubernetes cluster, Istio injects Envoy into each service’s pod as a sidecar where it transparently intercepts and processes incoming and outgoing traffic. As the data plane, any performance issues in Envoy can affect all service traffic in the mesh. In this scenario, it’s more powerful to use &lt;strong&gt;eBPF profiling&lt;/strong&gt; to analyze issues in production caused by service mesh configuration.&lt;/p&gt;
&lt;h2 id=&#34;demo-environment&#34;&gt;Demo Environment&lt;/h2&gt;
&lt;p&gt;If you want to see this scenario in action, we&amp;rsquo;ve built a demo environment where we deploy an Nginx service for stress testing. Traffic is intercepted by Envoy and forwarded to Nginx. The commands to install the whole environment can be accessed through &lt;a href=&#34;https://github.com/mrproliu/skywalking-rover-profiling-demo&#34;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&#34;on-cpu-profiling&#34;&gt;On-CPU Profiling&lt;/h1&gt;
&lt;p&gt;On-CPU profiling is suitable for analyzing thread stacks when service CPU usage is high. The more often a given stack shows up in the dumps, the more CPU resources that thread stack occupies.&lt;/p&gt;
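&lt;p&gt;To make the sampling idea concrete, here is a minimal Python sketch of how repeated stack dumps turn into CPU shares. The stack names and sample counts are invented for illustration and are not Rover output; the way Rover actually aggregates samples may differ.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical stack dumps, one tuple per sample (outermost frame first).
SAMPLES = (
    [("worker", "http_filter", "tracing")] * 2
    + [("worker", "http_filter", "access_log")] * 3
    + [("worker", "http_filter", "proxy_upstream")] * 5
)


def cpu_share(samples):
    """Return the percentage of samples in which each stack was seen."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {stack: count * 100 / total for stack, count in counts.items()}


shares = cpu_share(SAMPLES)
# Print the stacks from most to least frequently sampled.
for stack, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(";".join(stack), f"{share:.1f}%")
```

&lt;p&gt;A flame graph presents the same counts visually: the wider a frame, the more samples contained it.&lt;/p&gt;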
&lt;p&gt;When installing Istio using the demo configuration profile, we found there are two places where we can optimize performance:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Zipkin Tracing&lt;/strong&gt;: Different Zipkin sampling percentages have a direct impact on QPS.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Access Log Format&lt;/strong&gt;: Reducing the fields of the Envoy access log can improve QPS.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;zipkin-tracing&#34;&gt;Zipkin Tracing&lt;/h2&gt;
&lt;h3 id=&#34;zipkin-with-100-sampling&#34;&gt;Zipkin with 100% sampling&lt;/h3&gt;
&lt;p&gt;In the default demo configuration profile, Envoy uses 100% sampling as the default tracing policy. How does that impact performance?&lt;/p&gt;
&lt;p&gt;As shown in the figure below, &lt;strong&gt;on-CPU profiling&lt;/strong&gt; shows that Zipkin tracing takes about &lt;strong&gt;16%&lt;/strong&gt; of the CPU overhead. At a fixed consumption of &lt;strong&gt;2 CPUs&lt;/strong&gt;, the QPS can reach &lt;strong&gt;5.7K&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;zipkin-sampling-100.png&#34; alt=&#34;Zipkin with 100% sampling&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;disable-zipkin-tracing&#34;&gt;Disable Zipkin tracing&lt;/h3&gt;
&lt;p&gt;At this point, we found that if Zipkin is not necessary, the sampling percentage can be reduced or we can even disable tracing. Based on the &lt;a href=&#34;https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/#Tracing&#34;&gt;Istio documentation&lt;/a&gt;, we can disable tracing when installing the service mesh using the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;meshConfig.enableTracing=false&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;meshConfig.defaultConfig.tracing.sampling=0.0&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After disabling tracing, we performed on-CPU profiling again. According to the figure below, we found that Zipkin has disappeared from the flame graph. With the same &lt;strong&gt;2 CPU&lt;/strong&gt; consumption as in the previous example, the QPS reached &lt;strong&gt;9K&lt;/strong&gt;, which is an almost &lt;strong&gt;60%&lt;/strong&gt; increase.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;zipkin-disable-tracing.png&#34; alt=&#34;Disable Zipkin tracing&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;tracing-with-throughput&#34;&gt;Tracing with Throughput&lt;/h3&gt;
&lt;p&gt;With the same CPU usage, we&amp;rsquo;ve discovered that Envoy performance greatly improves when the tracing feature is disabled. Of course, this requires us to make trade-offs between the number of samples Zipkin collects and the desired performance of Envoy (QPS).&lt;/p&gt;
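&lt;p&gt;As a rough sketch of the middle ground, the install command shown earlier could keep tracing enabled while lowering the sampling rate. This assumes that &lt;code&gt;meshConfig.defaultConfig.tracing.sampling&lt;/code&gt; is read as a percentage between 0.0 and 100.0, so &lt;code&gt;1.0&lt;/code&gt; stands for 1%; please verify this against the Istio version you run.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# Sketch: keep tracing on, but sample roughly 1% of requests
# (assumes the sampling value is a percentage in the range 0.0-100.0)
istioctl install -y --set profile=demo \
   --set &amp;#39;meshConfig.defaultConfig.tracing.sampling=1.0&amp;#39;
&lt;/code&gt;&lt;/pre&gt;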
&lt;p&gt;The table below illustrates how different Zipkin sampling percentages under the same CPU usage affect QPS.&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Zipkin sampling %&lt;/th&gt;
          &lt;th&gt;QPS&lt;/th&gt;
          &lt;th&gt;CPUs&lt;/th&gt;
          &lt;th&gt;Note&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;100% &lt;strong&gt;(default)&lt;/strong&gt;&lt;/td&gt;
          &lt;td&gt;5.7K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;16% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;1%&lt;/td&gt;
          &lt;td&gt;8.1K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;0.3% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;disabled&lt;/td&gt;
          &lt;td&gt;9.2K&lt;/td&gt;
          &lt;td&gt;2&lt;/td&gt;
          &lt;td&gt;0% used by Zipkin&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
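&lt;p&gt;To put the table in relative terms, the following sketch computes the QPS gain over the 100% sampling baseline. The helper name &lt;code&gt;pct_gain&lt;/code&gt; is only illustrative, and the inputs are the table values written as plain request counts (5.7K as 5700). Shell integer arithmetic truncates, so the results are approximate.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# pct_gain BASELINE_QPS NEW_QPS: integer percentage gain over the baseline
pct_gain() {
  echo $(( ($2 - $1) * 100 / $1 ))
}

pct_gain 5700 8100   # 1% sampling vs. 100% sampling
pct_gain 5700 9200   # tracing disabled vs. 100% sampling
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the table&amp;rsquo;s numbers, this works out to roughly 42% and 61% more requests per second for the same 2 CPUs.&lt;/p&gt;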
&lt;h2 id=&#34;access-log-format&#34;&gt;Access Log Format&lt;/h2&gt;
&lt;h3 id=&#34;default-log-format&#34;&gt;Default Log Format&lt;/h3&gt;
&lt;p&gt;In the default demo configuration profile, &lt;a href=&#34;https://istio.io/latest/docs/tasks/observability/logs/access-log/#default-access-log-format&#34;&gt;the default Access Log format&lt;/a&gt; contains a lot of data. The flame graph below shows various functions involved in parsing the data such as request headers, response headers, and streaming the body.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;log-format-default.png&#34; alt=&#34;Default Log Format&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;simplifying-access-log-format&#34;&gt;Simplifying Access Log Format&lt;/h3&gt;
&lt;p&gt;Typically, we don’t need all the information in the access log, so we can often simplify it to get what we need. The following command simplifies the access log format to only display basic information:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   --set meshConfig.accessLogFormat&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;[%START_TIME%] \&amp;#34;%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\&amp;#34; %RESPONSE_CODE%\n&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After simplifying the access log format, we found that the QPS increased from &lt;strong&gt;5.7K&lt;/strong&gt; to &lt;strong&gt;5.9K&lt;/strong&gt;. When executing the on-CPU profiling again, the CPU usage of log formatting dropped from &lt;strong&gt;2.4%&lt;/strong&gt; to &lt;strong&gt;0.7%&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Simplifying the log format helped us to improve the performance.&lt;/p&gt;
&lt;h1 id=&#34;off-cpu-profiling&#34;&gt;Off-CPU Profiling&lt;/h1&gt;
&lt;p&gt;Off-CPU profiling is suitable for performance issues that are not caused by high CPU usage. For example, when there are too many threads in one service, using off-CPU profiling could reveal which threads spend more time context switching.&lt;/p&gt;
&lt;p&gt;We provide data aggregation in two dimensions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Switch count&lt;/strong&gt;: The number of times a thread switches context. When the thread returns to the CPU, it completes one context switch. A thread stack with a higher switch count spends more time context switching.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Switch duration&lt;/strong&gt;: The time a thread spends off-CPU during a context switch, from leaving the CPU until it returns. A thread stack with a higher switch duration spends more time off-CPU.&lt;/li&gt;
&lt;/ol&gt;
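&lt;p&gt;The two dimensions can be illustrated with a small, self-contained sketch. It is not how Rover collects the data; it only aggregates a hypothetical list of samples, one line per context switch in the made-up form &lt;code&gt;stack_name off_cpu_ms&lt;/code&gt;, into a switch count and a total switch duration per stack.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# aggregate_switches: reads &amp;#34;stack_name off_cpu_ms&amp;#34; lines on stdin (hypothetical format)
# and prints &amp;#34;stack_name switch_count total_off_cpu_ms&amp;#34;, sorted by stack name
aggregate_switches() {
  awk &amp;#39;{ count[$1]++; dur[$1] += $2 }
       END { for (s in count) print s, count[s], dur[s] }&amp;#39; | sort
}

# Made-up samples: two switches in write_log, one in epoll_wait
printf &amp;#39;write_log 4\nwrite_log 6\nepoll_wait 30\n&amp;#39; | aggregate_switches
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this made-up input, &lt;code&gt;write_log&lt;/code&gt; has the higher switch count (2 switches, 10 ms in total), while &lt;code&gt;epoll_wait&lt;/code&gt; has the higher switch duration (1 switch, 30 ms), which is why the two views can rank stacks differently.&lt;/p&gt;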
&lt;h2 id=&#34;write-access-log&#34;&gt;Write Access Log&lt;/h2&gt;
&lt;h3 id=&#34;enable-write&#34;&gt;Enable Write&lt;/h3&gt;
&lt;p&gt;Using the same environment and settings as in the on-CPU test, we performed off-CPU profiling. As shown below, we found that access log writes accounted for about &lt;strong&gt;28%&lt;/strong&gt; of the total context switches. The &amp;ldquo;__write&amp;rdquo; frame shown below also indicates that this method belongs to the Linux kernel.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;access-log-write-enable.png&#34; alt=&#34;Enable Write Access Log&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;disable-write&#34;&gt;Disable Write&lt;/h3&gt;
&lt;p&gt;SkyWalking implements Envoy&amp;rsquo;s Access Log Service (ALS) feature, which allows us to send access logs to the SkyWalking Observability Analysis Platform (OAP) using the gRPC protocol. Even with file-based access logging disabled, we can still use ALS to capture and aggregate the logs. We&amp;rsquo;ve disabled writing to the access log using the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install -y --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo --set meshConfig.accessLogFile&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After disabling the Access Log feature, we performed off-CPU profiling again. As shown in the figure below, the file-writing entries have disappeared. Envoy throughput also increased from &lt;strong&gt;5.7K&lt;/strong&gt; to &lt;strong&gt;5.9K&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;access-log-write-disable.png&#34; alt=&#34;Disable Write Access Log&#34;&gt;&lt;/p&gt;
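&lt;p&gt;For readers who want to keep access logs flowing to SkyWalking while file writes are off, a combined install might look like the sketch below. It assumes the mesh config fields &lt;code&gt;enableEnvoyAccessLogService&lt;/code&gt; and &lt;code&gt;defaultConfig.envoyAccessLogService.address&lt;/code&gt; are available in your Istio version, and the OAP address is a placeholder for wherever your SkyWalking OAP gRPC endpoint listens.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# Sketch: disable file access logs but send them to SkyWalking OAP through ALS
# (skywalking-oap.istio-system:11800 is a placeholder address; adjust to your deployment)
istioctl install -y --set profile=demo \
   --set meshConfig.accessLogFile=&amp;#34;&amp;#34; \
   --set meshConfig.enableEnvoyAccessLogService=true \
   --set meshConfig.defaultConfig.envoyAccessLogService.address=skywalking-oap.istio-system:11800
&lt;/code&gt;&lt;/pre&gt;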
&lt;h1 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;In this article, we&amp;rsquo;ve examined the insights Apache SkyWalking&amp;rsquo;s Trace Profiling can give us and how much more can be achieved with eBPF profiling. All of these features are implemented in &lt;a href=&#34;https://github.com/apache/skywalking-rover&#34;&gt;skywalking-rover&lt;/a&gt;. In addition to on- and off-CPU profiling, you will also find the following features:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Continuous profiling&lt;/strong&gt; helps you automatically profile without manual intervention. For example, when Rover detects that CPU usage exceeds a configurable threshold, it automatically executes the on-CPU profiling task.&lt;/li&gt;
&lt;li&gt;More profiling types to enrich usage scenarios, such as network and memory profiling.&lt;/li&gt;
&lt;/ol&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Chaos Mesh &#43; SkyWalking: Better Observability for Chaos Engineering</title>
      <link>/blog/2021-12-21-better-observability-for-chaos-engineering/</link>
      <pubDate>Tue, 21 Dec 2021 00:00:00 +0000</pubDate>
      <guid>/blog/2021-12-21-better-observability-for-chaos-engineering/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img src=&#34;chaos-mesh-skywalking-banner.png&#34; alt=&#34;Chaos Mesh + SkyWalking: Better Observability for Chaos Engineering&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/chaos-mesh/chaos-mesh&#34;&gt;Chaos Mesh&lt;/a&gt; is an open-source cloud-native &lt;a href=&#34;https://en.wikipedia.org/wiki/Chaos_engineering&#34;&gt;chaos engineering&lt;/a&gt; platform. You can use Chaos Mesh to conveniently inject failures and simulate abnormalities that might occur in reality, so you can identify potential problems in your system. Chaos Mesh also offers a Chaos Dashboard which allows you to monitor the status of a chaos experiment. However, this dashboard does not show how the failures in the experiment impact the service performance of applications, which makes it harder to test our systems further and find potential problems.&lt;/p&gt;
&lt;!--truncate--&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking&#34;&gt;Apache SkyWalking&lt;/a&gt; is an open-source application performance monitor (APM), specially designed to monitor, track, and diagnose cloud native, container-based distributed systems. It collects events that occur and then displays them on its dashboard, allowing you to observe directly the type and number of events that have occurred in your system and how different events impact the service performance.&lt;/p&gt;
&lt;p&gt;When you use SkyWalking and Chaos Mesh together during chaos experiments, you can observe how different failures impact the service performance.&lt;/p&gt;
&lt;p&gt;This tutorial will show you how to configure SkyWalking and Chaos Mesh. You’ll also learn how to leverage the two systems to monitor events and observe in real time how chaos experiments impact applications’ service performance.&lt;/p&gt;
&lt;h2 id=&#34;preparation&#34;&gt;Preparation&lt;/h2&gt;
&lt;p&gt;Before you start to use SkyWalking and Chaos Mesh, you have to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set up a SkyWalking cluster according to &lt;a href=&#34;https://github.com/apache/skywalking-kubernetes#install&#34;&gt;the SkyWalking configuration guide&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Deploy Chaos Mesh &lt;a href=&#34;https://chaos-mesh.org/docs/production-installation-using-helm/&#34;&gt;using Helm&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Install &lt;a href=&#34;https://jmeter.apache.org/index.html&#34;&gt;JMeter&lt;/a&gt; or other Java testing tools (to increase service loads).&lt;/li&gt;
&lt;li&gt;Configure SkyWalking and Chaos Mesh according to &lt;a href=&#34;https://github.com/chaos-mesh/chaos-mesh-on-skywalking&#34;&gt;this guide&lt;/a&gt; if you just want to run a demo.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now, you are fully prepared, and we can cut to the chase.&lt;/p&gt;
&lt;h2 id=&#34;step-1-access-the-skywalking-cluster&#34;&gt;Step 1: Access the SkyWalking cluster&lt;/h2&gt;
&lt;p&gt;After you install the SkyWalking cluster, you can access its user interface (UI). However, no service is running at this point, so before you start monitoring, you have to add one and set the agents.&lt;/p&gt;
&lt;p&gt;In this tutorial, we take Spring Boot, a lightweight microservice framework, as an example to build a simplified demo environment.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create a SkyWalking demo in Spring Boot by referring to &lt;a href=&#34;https://github.com/chaos-mesh/chaos-mesh-on-skywalking/blob/master/demo-deployment.yaml&#34;&gt;this document&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Execute the command &lt;code&gt;kubectl apply -f demo-deployment.yaml -n skywalking&lt;/code&gt; to deploy the demo.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After you finish deployment, you can observe the real-time monitoring results at the SkyWalking UI.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Spring Boot and SkyWalking have the same default port number: 8080. Be careful when you configure the port forwarding; otherwise, you may have port conflicts. For example, you can set Spring Boot’s port to 8079 by using a command like &lt;code&gt;kubectl port-forward svc/spring-boot-skywalking-demo 8079:8080 -n skywalking&lt;/code&gt; to avoid conflicts.&lt;/p&gt;
&lt;h2 id=&#34;step-2-deploy-skywalking-kubernetes-event-exporter&#34;&gt;Step 2: Deploy SkyWalking Kubernetes Event Exporter&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking-kubernetes-event-exporter&#34;&gt;SkyWalking Kubernetes Event Exporter&lt;/a&gt; is able to watch, filter, and send Kubernetes events into the SkyWalking backend. SkyWalking then associates the events with the system metrics and displays an overview about when and how the metrics are affected by the events.&lt;/p&gt;
&lt;p&gt;If you want to deploy SkyWalking Kubernetes Event Exporter with a single command, refer to &lt;a href=&#34;https://github.com/chaos-mesh/chaos-mesh-on-skywalking/blob/master/exporter-deployment.yaml&#34;&gt;this document&lt;/a&gt; to create configuration files in YAML format and then customize the parameters in the filters and exporters. Now, you can use the command &lt;code&gt;kubectl apply&lt;/code&gt; to deploy SkyWalking Kubernetes Event Exporter.&lt;/p&gt;
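&lt;p&gt;A minimal sketch of that step, assuming the customized file is saved locally as &lt;code&gt;exporter-deployment.yaml&lt;/code&gt; and that the &lt;code&gt;skywalking&lt;/code&gt; namespace used for the demo is reused here (both are assumptions):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# Deploy the exporter from the customized YAML file (file name and namespace are assumptions)
kubectl apply -f exporter-deployment.yaml -n skywalking

# Check that the exporter pod has started
kubectl get pods -n skywalking
&lt;/code&gt;&lt;/pre&gt;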
&lt;h2 id=&#34;step-3-use-jmeter-to-increase-service-loads&#34;&gt;Step 3: Use JMeter to increase service loads&lt;/h2&gt;
&lt;p&gt;To better observe the change in service performance, you need to increase the service loads on Spring Boot. In this tutorial, we use JMeter, a widely adopted Java testing tool, to increase the service loads.&lt;/p&gt;
&lt;p&gt;Perform a stress test on &lt;code&gt;localhost:8079&lt;/code&gt; using JMeter and add five threads to continuously increase the service loads.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;jmeter-1.png&#34; alt=&#34;JMeter Dashboard 1&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;jmeter-2.png&#34; alt=&#34;JMeter Dashboard 2&#34;&gt;&lt;/p&gt;
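&lt;p&gt;The same load can also be generated without the JMeter GUI. The sketch below assumes a test plan saved as &lt;code&gt;spring-boot-demo.jmx&lt;/code&gt; (a hypothetical file name) that targets &lt;code&gt;localhost:8079&lt;/code&gt; with five threads, as configured in the screenshots above.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# -n: non-GUI mode, -t: test plan to run, -l: file to write sample results to
jmeter -n -t spring-boot-demo.jmx -l results.jtl
&lt;/code&gt;&lt;/pre&gt;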
&lt;p&gt;Open the SkyWalking Dashboard. You can see that the access rate is 100%, and that the service loads reach about 5,300 calls per minute (CPM).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;skywalking-dashboard.png&#34; alt=&#34;SkyWalking Dashboard&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;step-4-inject-failures-via-chaos-mesh-and-observe-results&#34;&gt;Step 4: Inject failures via Chaos Mesh and observe results&lt;/h2&gt;
&lt;p&gt;After you finish the three steps above, you can use the Chaos Dashboard to simulate stress scenarios and observe the change in service performance during chaos experiments.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;chaos-dashboard-stresschaos.png&#34; alt=&#34;StressChaos on Chaos Dashboard&#34;&gt;&lt;/p&gt;
&lt;p&gt;The following sections describe how service performance varies under the stress of three chaos conditions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CPU load: 10%;  memory load: 128 MB&lt;/p&gt;
&lt;p&gt;The first chaos experiment simulates low CPU usage. To display when a chaos experiment starts and ends, click the switching button on the right side of the dashboard. To learn whether the experiment is Applied to the system or Recovered from the system, move your cursor onto the short, green line.&lt;/p&gt;
&lt;p&gt;During the time period between the two short, green lines, the service load decreases to 4,929 CPM, but returns to normal after the chaos experiment ends.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;cpuload-1.png&#34; alt=&#34;Test 1&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CPU load: 50%; memory load: 128 MB&lt;/p&gt;
&lt;p&gt;When the application’s CPU load increases to 50%,  the service load decreases to 4,307 CPM.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;cpuload-2.png&#34; alt=&#34;Test 2&#34;&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CPU load: 100%; memory load: 128 MB&lt;/p&gt;
&lt;p&gt;When the CPU usage is at 100%, the service load decreases to only 40% of what it would be if no chaos experiments were taking place.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;cpuload-3.png&#34; alt=&#34;Test 3&#34;&gt;&lt;/p&gt;
&lt;p&gt;Because the process scheduling under the Linux system does not allow a process to occupy the CPU all the time, the deployed Spring Boot Demo can still handle 40% of the access requests even in the extreme case of a full CPU load.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
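&lt;p&gt;The experiments above were created in the Chaos Dashboard. For reference, an equivalent experiment could likely be declared as a &lt;code&gt;StressChaos&lt;/code&gt; resource. The sketch below corresponds to the second case (50% CPU load, 128 MB memory); the experiment name, the label selector, the worker counts, and the duration are assumptions, not values taken from this demo.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: StressChaos
metadata:
  name: cpu-50-mem-128          # hypothetical experiment name
  namespace: skywalking
spec:
  mode: one                     # stress a single matching Pod
  selector:
    namespaces:
      - skywalking
    labelSelectors:
      app: spring-boot-skywalking-demo   # assumed label of the demo Pods
  stressors:
    cpu:
      workers: 1                # assumed worker count
      load: 50                  # target CPU load per worker, in percent
    memory:
      workers: 1                # assumed worker count
      size: 128MB
  duration: 5m                  # assumed experiment length
EOF
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Changing &lt;code&gt;load&lt;/code&gt; to 10 or 100 would give the first and third cases.&lt;/p&gt;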
&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;
&lt;p&gt;By combining SkyWalking and Chaos Mesh, you can clearly observe when and to what extent chaos experiments affect application service performance. This combination of tools lets you observe the service performance in various extreme conditions, thus boosting your confidence in your services.&lt;/p&gt;
&lt;p&gt;Chaos Mesh has grown a lot in 2021 thanks to the unremitting efforts of all PingCAP engineers and community contributors. In order to continue to upgrade our support for our wide variety of users and learn more about users’ experience in Chaos Engineering, we’d like to invite you to take &lt;a href=&#34;https://www.surveymonkey.com/r/X77BCNM&#34;&gt;this survey&lt;/a&gt; and give us your valuable feedback.&lt;/p&gt;
&lt;p&gt;If you want to know more about Chaos Mesh, you’re welcome to join &lt;a href=&#34;https://github.com/chaos-mesh&#34;&gt;the Chaos Mesh community on GitHub&lt;/a&gt; or our &lt;a href=&#34;https://slack.cncf.io/&#34;&gt;Slack discussions&lt;/a&gt; (#project-chaos-mesh). If you find any bugs or missing features when using Chaos Mesh, you can submit your pull requests or issues to our &lt;a href=&#34;https://github.com/chaos-mesh/chaos-mesh&#34;&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Observe VM Service Meshes with Apache SkyWalking and the Envoy Access Log Service</title>
      <link>/blog/obs-service-mesh-vm-with-sw-and-als/</link>
      <pubDate>Sun, 21 Feb 2021 00:00:00 +0000</pubDate>
      <guid>/blog/obs-service-mesh-vm-with-sw-and-als/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img src=&#34;stone-arch.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Origin: &lt;a href=&#34;https://thenewstack.io/observe-virtual-machine-service-meshes-with-apache-skywalking-and-the-envoy-access-log-service&#34;&gt;Observe VM Service Meshes with Apache SkyWalking and the Envoy Access Log Service - The New Stack&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking&#34;&gt;Apache SkyWalking&lt;/a&gt;: an APM (application performance monitor) system, especially
designed for microservices, cloud native, and container-based (Docker, Kubernetes, Mesos) architectures.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.envoyproxy.io/docs/envoy/latest/api-v2/service/accesslog/v2/als.proto&#34;&gt;Envoy Access Log Service&lt;/a&gt;: Access
Log Service (ALS) is an Envoy extension that emits detailed access logs of all requests going through Envoy.&lt;/p&gt;
&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;In the &lt;a href=&#34;/blog/obs-service-mesh-with-sw-and-als&#34;&gt;previous post&lt;/a&gt;, we talked about the observability of a service mesh in a
Kubernetes environment, and applied it to the bookinfo application in practice. We also mentioned that, in order to map
the IP addresses to services, SkyWalking needs access to the service metadata from a Kubernetes cluster, which is not
available for services deployed in virtual machines (VMs). In this post, we will introduce a new analyzer in SkyWalking
that leverages Envoy’s metadata exchange mechanism to decouple from Kubernetes. The analyzer is designed to work in
Kubernetes environments, VM environments, and hybrid environments. If there are virtual machines in your service mesh,
you might want to try out this new analyzer for better observability, which we will demonstrate in this tutorial.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;The mechanism of how the analyzer works is the same as what we discussed in
the &lt;a href=&#34;/blog/obs-service-mesh-with-sw-and-als&#34;&gt;previous post&lt;/a&gt;. What makes VMs different from Kubernetes is that, for VM
services, there are no places where we can fetch the metadata to map the IP addresses into services.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image1.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;The basic idea we present in this article is to carry the metadata along with Envoy’s access logs, which is called the
metadata-exchange mechanism in Envoy. When the Istio pilot-agent starts an Envoy proxy as a sidecar of a service, it
collects the metadata of that service from the Kubernetes platform, or a file on the VM where that service is deployed,
and injects the metadata into the bootstrap configuration of Envoy. Envoy will carry the metadata transparently when
emitting access logs to the SkyWalking receiver.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image2.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;But how does Envoy compose a complete access log that covers both the client side and the server side? When a
request goes out from Envoy, a plugin of istio-proxy named &amp;ldquo;metadata-exchange&amp;rdquo; injects the metadata into the HTTP
headers (with a prefix like &lt;code&gt;x-envoy-downstream-&lt;/code&gt;), and the metadata is propagated to the server side. The Envoy sidecar
of the server side receives the request, parses the headers into metadata, and puts the metadata into the access log,
keyed by &lt;code&gt;wasm.downstream_peer&lt;/code&gt;. The server-side Envoy also puts its own metadata into the access log, keyed
by &lt;code&gt;wasm.upstream_peer&lt;/code&gt;. Hence, both sides of a single request are captured.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image3.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;With the metadata-exchange mechanism, we can use the metadata directly without any extra query.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image4.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;p&gt;In this tutorial, we will use another demo
application &lt;a href=&#34;http://github.com/GoogleCloudPlatform/microservices-demo&#34;&gt;Online Boutique&lt;/a&gt; that consists of 10+ services so
that we can deploy some of them in VMs and make them communicate with other services deployed in Kubernetes.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image5.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Topology of Online Boutique&lt;/p&gt;
&lt;p&gt;In order to cover as many cases as possible, we will deploy &lt;code&gt;CheckoutService&lt;/code&gt;
and &lt;code&gt;PaymentService&lt;/code&gt; on VM and all the other services on Kubernetes, so that we can cover the cases like Kubernetes →
VM (e.g. &lt;code&gt;Frontend&lt;/code&gt; → &lt;code&gt;CheckoutService&lt;/code&gt;), VM → Kubernetes (e.g. &lt;code&gt;CheckoutService&lt;/code&gt; → &lt;code&gt;ShippingService&lt;/code&gt;), and VM → VM
(e.g. &lt;code&gt;CheckoutService&lt;/code&gt; → &lt;code&gt;PaymentService&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: All the commands used in this tutorial are accessible
on &lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts&#34;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/SkyAPMTest/sw-als-vm-demo-scripts
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;cd&lt;/span&gt; sw-als-vm-demo-scripts
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Make sure to initialize the &lt;code&gt;gcloud&lt;/code&gt; SDK properly before moving on. Modify the &lt;code&gt;GCP_PROJECT&lt;/code&gt; in file &lt;code&gt;env.sh&lt;/code&gt; to your own
project name. Most of the other variables should work if you leave them unchanged. If you would like to
use &lt;code&gt;ISTIO_VERSION&lt;/code&gt; &amp;gt;= 1.8.0, please make sure &lt;a href=&#34;https://github.com/istio/istio/pull/28956&#34;&gt;this patch&lt;/a&gt; is included.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Prepare Kubernetes cluster and VM instances&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/00-create-cluster-and-vms.sh&#34;&gt;&lt;code&gt;00-create-cluster-and-vms.sh&lt;/code&gt;&lt;/a&gt;
creates a new GKE cluster and 2 VM instances that will be used through the entire tutorial, and sets up some necessary
firewall rules for them to communicate with each other.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Install Istio and SkyWalking&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/01a-install-istio.sh&#34;&gt;&lt;code&gt;01a-install-istio.sh&lt;/code&gt;&lt;/a&gt;
installs Istio Operator with spec &lt;code&gt;resources/vmintegration.yaml&lt;/code&gt;. In the YAML file, we enable the &lt;code&gt;meshExpansion&lt;/code&gt; that
supports VM in mesh. We also enable the Envoy access log service and specify the
address &lt;code&gt;skywalking-oap.istio-system.svc.cluster.local:11800&lt;/code&gt; to which Envoy emits the access logs.
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/01b-install-skywalking.sh&#34;&gt;&lt;code&gt;01b-install-skywalking.sh&lt;/code&gt;&lt;/a&gt;
installs Apache SkyWalking and sets the analyzer to &lt;code&gt;mx-mesh&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create files to initialize the VM&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/02-create-files-to-transfer-to-vm.sh&#34;&gt;&lt;code&gt;02-create-files-to-transfer-to-vm.sh&lt;/code&gt;&lt;/a&gt;
creates necessary files that will be used to initialize the VMs.
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/03-copy-work-files-to-vm.sh&#34;&gt;&lt;code&gt;03-copy-work-files-to-vm.sh&lt;/code&gt;&lt;/a&gt;
securely transfers the generated files to the VMs with &lt;code&gt;gcloud scp&lt;/code&gt; command. Now use &lt;code&gt;./ssh.sh checkoutservice&lt;/code&gt;
and &lt;code&gt;./ssh.sh paymentservice&lt;/code&gt; to log into the two VMs respectively, and &lt;code&gt;cd&lt;/code&gt; to the &lt;code&gt;~/work&lt;/code&gt; directory,
execute &lt;code&gt;./prep-checkoutservice.sh&lt;/code&gt; on &lt;code&gt;checkoutservice&lt;/code&gt; VM instance and &lt;code&gt;./prep-paymentservice.sh&lt;/code&gt;
on &lt;code&gt;paymentservice&lt;/code&gt; VM instance. The Istio sidecar should be installed and started properly. To verify that,
use &lt;code&gt;tail -f /var/log/istio/istio.log&lt;/code&gt; to check the Istio logs. The output should be something like:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;2020-12-12T08:07:07.348329Z	info	sds	resource:default new connection
2020-12-12T08:07:07.348401Z	info	sds	Skipping waiting for gateway secret
2020-12-12T08:07:07.348401Z	info	sds	Skipping waiting for gateway secret
2020-12-12T08:07:07.568676Z	info	cache	Root cert has changed, start rotating root cert for SDS clients
2020-12-12T08:07:07.568718Z	info	cache	GenerateSecret default
2020-12-12T08:07:07.569398Z	info	sds	resource:default pushed key/cert pair to proxy
2020-12-12T08:07:07.949156Z	info	cache	Loaded root cert from certificate ROOTCA
2020-12-12T08:07:07.949348Z	info	sds	resource:ROOTCA pushed root cert to proxy
2020-12-12T20:12:07.384782Z	info	sds	resource:default pushed key/cert pair to proxy
2020-12-12T20:12:07.384832Z	info	sds	Dynamic push for secret default
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The dnsmasq configuration &lt;code&gt;address=/.svc.cluster.local/{ISTIO_SERVICE_IP_STUB}&lt;/code&gt; also resolves the domain names ending
with &lt;code&gt;.svc.cluster.local&lt;/code&gt; to the Istio service IP, so that you are able to access the Kubernetes services in the VM by
fully qualified domain name (FQDN) such as &lt;code&gt;httpbin.default.svc.cluster.local&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploy demo application&lt;/p&gt;
&lt;p&gt;Because we want to deploy &lt;code&gt;CheckoutService&lt;/code&gt; and &lt;code&gt;PaymentService&lt;/code&gt; manually on
VM, &lt;code&gt;resources/google-demo.yaml&lt;/code&gt; removes the two services
from &lt;a href=&#34;https://github.com/GoogleCloudPlatform/microservices-demo/blob/master/release/kubernetes-manifests.yaml&#34;&gt;the original YAML&lt;/a&gt;.
&lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/04a-deploy-demo-app.sh&#34;&gt;&lt;code&gt;04a-deploy-demo-app.sh&lt;/code&gt;&lt;/a&gt;
deploys the other services on Kubernetes. Then log into the two VMs and run &lt;code&gt;~/work/deploy-checkoutservice.sh&lt;/code&gt;
and &lt;code&gt;~/work/deploy-paymentservice.sh&lt;/code&gt; respectively to deploy &lt;code&gt;CheckoutService&lt;/code&gt; and &lt;code&gt;PaymentService&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Register VMs to Istio&lt;/p&gt;
&lt;p&gt;Services on VMs can access the services on Kubernetes by FQDN, but that’s not the case when the
Kubernetes services want to talk to the VM services. The mesh has no idea where to forward the requests such
as &lt;code&gt;checkoutservice.default.svc.cluster.local&lt;/code&gt; because &lt;code&gt;checkoutservice&lt;/code&gt; is isolated in the VM. Therefore, we need to
register the services to the
mesh. &lt;a href=&#34;https://github.com/SkyAPMTest/sw-als-vm-demo-scripts/blob/2179d04270c98b9f87cf3998f5af775870ed53a7/04b-register-vm-with-istio.sh&#34;&gt;&lt;code&gt;04b-register-vm-with-istio.sh&lt;/code&gt;&lt;/a&gt;
registers the VM services to the mesh by creating a &amp;ldquo;dummy&amp;rdquo; service without running Pods, and a &lt;code&gt;WorkloadEntry&lt;/code&gt; to
bridge the &amp;ldquo;dummy&amp;rdquo; service with the VM service.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
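&lt;p&gt;To make the last step more concrete, here is a sketch of what registering one VM service might look like. It is not copied from &lt;code&gt;04b-register-vm-with-istio.sh&lt;/code&gt;; the port &lt;code&gt;5050&lt;/code&gt;, the &lt;code&gt;app&lt;/code&gt; label, the VM address &lt;code&gt;10.128.0.10&lt;/code&gt;, and the API version are assumptions for illustration.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;kubectl apply -n default -f - &amp;lt;&amp;lt;EOF
# Dummy Service: gives checkoutservice a cluster DNS name, with no Pods behind it
apiVersion: v1
kind: Service
metadata:
  name: checkoutservice
  labels:
    app: checkoutservice
spec:
  ports:
    - name: grpc
      port: 5050                # assumed service port
  selector:
    app: checkoutservice
---
# WorkloadEntry: lets the Service selector match the workload running on the VM
apiVersion: networking.istio.io/v1alpha3
kind: WorkloadEntry
metadata:
  name: checkoutservice-vm      # hypothetical name
spec:
  address: 10.128.0.10          # hypothetical VM IP address
  labels:
    app: checkoutservice
EOF
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Service selector and the &lt;code&gt;WorkloadEntry&lt;/code&gt; labels have to match; that match is what bridges the &amp;ldquo;dummy&amp;rdquo; service to the VM workload.&lt;/p&gt;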
&lt;h2 id=&#34;done&#34;&gt;Done!&lt;/h2&gt;
&lt;p&gt;The demo application contains a &lt;code&gt;load generator&lt;/code&gt; service that performs requests repeatedly. We only need to wait a few
seconds, and then open the SkyWalking web UI to check the results.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;export POD_NAME=$(kubectl get pods --namespace istio-system -l &amp;#34;app=skywalking,release=skywalking,component=ui&amp;#34; -o jsonpath=&amp;#34;{.items[0].metadata.name}&amp;#34;)
echo &amp;#34;Visit http://127.0.0.1:8080 to use your application&amp;#34;
kubectl port-forward $POD_NAME 8080:8080 --namespace istio-system
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Open http://localhost:8080 in your browser. The metrics and topology should be there.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image6.png&#34; alt=&#34;Topology&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image7.png&#34; alt=&#34;Global metrics&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image8.png&#34; alt=&#34;Metrics of CheckoutService&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;image9.png&#34; alt=&#34;Metrics of PaymentService&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;troubleshooting&#34;&gt;Troubleshooting&lt;/h2&gt;
&lt;p&gt;If you face any trouble when walking through the steps, here are some common problems and possible solutions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;VM service cannot access Kubernetes services? It’s likely that the DNS on the VM doesn’t correctly resolve the fully
qualified domain names. Verify that with &lt;code&gt;nslookup istiod.istio-system.svc.cluster.local&lt;/code&gt;. If it doesn’t
resolve to an address in the Kubernetes CIDR range, recheck the steps in &lt;code&gt;prep-checkoutservice.sh&lt;/code&gt; and &lt;code&gt;prep-paymentservice.sh&lt;/code&gt;. If
the DNS works correctly, verify that Envoy has fetched the upstream clusters from the control plane
with &lt;code&gt;curl http://localhost:15000/clusters&lt;/code&gt;. If the output doesn’t contain the target service,
recheck &lt;code&gt;prep-checkoutservice.sh&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Services are normal but nothing on SkyWalking WebUI? Check the SkyWalking OAP logs
via &lt;code&gt;kubectl -n istio-system logs -f $(kubectl get pod -A -l &amp;quot;app=skywalking,release=skywalking,component=oap&amp;quot; -o name)&lt;/code&gt;
and WebUI logs
via &lt;code&gt;kubectl -n istio-system logs -f $(kubectl get pod -A -l &amp;quot;app=skywalking,release=skywalking,component=ui&amp;quot; -o name)&lt;/code&gt;
to see whether there are any error logs. Also, make sure the time zone at the bottom-right of the browser is set
to &lt;code&gt;UTC +0&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
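&lt;p&gt;The Envoy cluster check from the first item can be scripted. The following is a minimal sketch, not part of the demo scripts: &lt;code&gt;has_cluster&lt;/code&gt; is a hypothetical helper, and the sample lines (addresses and ports included) are made up to resemble the &lt;code&gt;/clusters&lt;/code&gt; output so the snippet runs without a cluster.&lt;/p&gt;

```shell
# has_cluster: hypothetical helper, not part of Envoy or Istio.
# Reads an Envoy admin /clusters dump on stdin and succeeds when a
# cluster whose name ends with the given service FQDN is present.
has_cluster() {
  grep -q -F "|$1::"
}

# Made-up sample lines in the "cluster::host::stat::value" shape.
sample='outbound|5050||checkoutservice.default.svc.cluster.local::10.0.0.7:5050::cx_active::0
outbound|50051||paymentservice.default.svc.cluster.local::10.0.0.8:50051::cx_active::0'

if printf '%s\n' "$sample" | has_cluster "checkoutservice.default.svc.cluster.local"; then
  echo "checkoutservice: present"
else
  echo "checkoutservice: missing, recheck the prep script"
fi
```

&lt;p&gt;On the VM itself, the sample would be replaced by the live output, for example &lt;code&gt;curl -s http://localhost:15000/clusters | has_cluster paymentservice.default.svc.cluster.local&lt;/code&gt;.&lt;/p&gt;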
&lt;h2 id=&#34;additional-resources&#34;&gt;Additional Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/blog/obs-service-mesh-with-sw-and-als&#34;&gt;Observe a Service Mesh with Envoy ALS&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: Observe Service Mesh with SkyWalking and Envoy Access Log Service</title>
      <link>/blog/2020-12-03-obs-service-mesh-with-sw-and-als/</link>
      <pubDate>Thu, 03 Dec 2020 00:00:00 +0000</pubDate>
      <guid>/blog/2020-12-03-obs-service-mesh-with-sw-and-als/</guid>
      <description>
        
        
        &lt;p&gt;&lt;img src=&#34;canyonhorseshoe.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Authors: Zhenxu Ke, Sheng Wu, and Tevah Platt, tetrate.io&lt;/li&gt;
&lt;li&gt;Original link, &lt;a href=&#34;https://www.tetrate.io/blog/observe-service-mesh-with-skywalking-and-envoy-access-log-service/&#34;&gt;Tetrate.io blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Dec. 3rd, 2020&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/apache/skywalking&#34;&gt;Apache SkyWalking&lt;/a&gt;: an APM (application performance monitor) system, especially designed for microservices, cloud native, and container-based (Docker, Kubernetes, Mesos) architectures.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://www.envoyproxy.io/docs/envoy/latest/api-v2/service/accesslog/v2/als.proto&#34;&gt;Envoy Access Log Service&lt;/a&gt;: Access Log Service (ALS) is an Envoy extension that emits detailed access logs of all requests going through Envoy.&lt;/p&gt;
&lt;h2 id=&#34;background&#34;&gt;Background&lt;/h2&gt;
&lt;p&gt;Apache SkyWalking has long supported observability in service mesh with the Istio Mixer adapter. But since v1.5, Istio has been deprecating Mixer due to its poor performance in large-scale clusters. Mixer’s functionality has been moved into the Envoy proxies, and Mixer is supported only through the Istio 1.7 release.
On the other hand, &lt;a href=&#34;https://github.com/wu-sheng&#34;&gt;Sheng Wu&lt;/a&gt; and &lt;a href=&#34;https://github.com/lizan&#34;&gt;Lizan Zhou&lt;/a&gt; presented a better solution based on Apache SkyWalking and Envoy ALS at &lt;a href=&#34;https://kccncosschn19eng.sched.com/event/NroB/observability-in-service-mesh-powered-by-envoy-and-apache-skywalking-sheng-wu-lizan-zhou-tetrate&#34;&gt;KubeCon China 2019&lt;/a&gt;, to reduce the performance impact brought by Mixer while retaining the same observability in service mesh. This solution was initially implemented by Sheng Wu, &lt;a href=&#34;https://github.com/hanahmily&#34;&gt;Hongtao Gao&lt;/a&gt;, Lizan Zhou, and &lt;a href=&#34;https://github.com/dio&#34;&gt;Dhi Aurrahman&lt;/a&gt; at Tetrate.io.
If you are looking for a more efficient way to observe your service mesh than a Mixer-based solution, this is exactly what you need. In this tutorial, we will briefly explain how the new solution works and apply it to the bookinfo application in practice.&lt;/p&gt;
&lt;h2 id=&#34;how-it-works&#34;&gt;How it works&lt;/h2&gt;
&lt;p&gt;From the perspective of observability, Envoy is typically deployed in two modes: sidecar and router. As a sidecar, Envoy mostly represents a single service to receive and send requests (2 and 3 in the picture below), while as a router (front proxy), Envoy may represent many services (1 in the picture below).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-2.25.17-PM.png&#34; alt=&#34;Example of Envoy deployment, as front proxy and sidecar&#34;&gt;&lt;/p&gt;
&lt;p&gt;In both modes, the logs emitted by ALS include a node identifier. The identifier starts with &lt;code&gt;router~&lt;/code&gt; (or &lt;code&gt;ingress~&lt;/code&gt;) in router mode and &lt;code&gt;sidecar~&lt;/code&gt; in sidecar proxy mode.&lt;/p&gt;
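&lt;p&gt;As a rough illustration of the prefix rule, the following shell sketch classifies a node identifier by its prefix. The function name &lt;code&gt;node_role&lt;/code&gt; and the sample identifiers are made up for this example; they are not part of SkyWalking or Envoy.&lt;/p&gt;

```shell
# node_role: hypothetical helper that maps an Envoy node identifier
# to the deployment mode implied by its prefix.
node_role() {
  case "$1" in
    router~*|ingress~*) echo "router" ;;
    sidecar~*)          echo "sidecar" ;;
    *)                  echo "unknown" ;;
  esac
}

# Made-up identifiers for illustration.
node_role "sidecar~10.0.0.12~reviews-v1-abc.default~default.svc.cluster.local"
node_role "router~10.0.0.5~istio-ingressgateway-xyz.istio-system~istio-system.svc.cluster.local"
```

&lt;p&gt;Anything that matches neither prefix is reported as &lt;code&gt;unknown&lt;/code&gt; rather than guessed.&lt;/p&gt;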
&lt;p&gt;Apart from the node identifier, there are &lt;a href=&#34;https://github.com/envoyproxy/envoy/blob/549164c42cae84b59154ca4c36009e408aa10b52/generated_api_shadow/envoy/data/accesslog/v2/accesslog.proto&#34;&gt;several noteworthy properties in the access logs&lt;/a&gt; that will be used in this solution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;downstream_direct_remote_address&lt;/code&gt;: This field is the downstream direct remote address on which the request from the user was received. Note: This is always the physical peer, even if the remote address is inferred from, for example, the &lt;code&gt;x-forwarded-for&lt;/code&gt; header or the proxy protocol.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;downstream_remote_address&lt;/code&gt;: The remote/origin address on which the request from the user was received.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;downstream_local_address&lt;/code&gt;: The local/destination address on which the request from the user was received.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;upstream_remote_address&lt;/code&gt;: The upstream remote/destination address that handles this exchange.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;upstream_local_address&lt;/code&gt;: The upstream local/origin address that handles this exchange.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;upstream_cluster&lt;/code&gt;: The upstream cluster that &lt;em&gt;upstream_remote_address&lt;/em&gt; belongs to.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We will discuss these properties further in the following sections.&lt;/p&gt;
&lt;h3 id=&#34;sidecar&#34;&gt;Sidecar&lt;/h3&gt;
&lt;p&gt;When serving as a sidecar, Envoy is deployed alongside a service, and delegates all the incoming/outgoing requests to/from the service.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Delegating incoming requests:&lt;/strong&gt; in this case, Envoy acts as a server-side sidecar, and sets the &lt;code&gt;upstream_cluster&lt;/code&gt; in the form of &lt;code&gt;inbound|portNumber|portName|Hostname[or]SidecarScopeID&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-2.37.49-PM.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;The SkyWalking analyzer checks whether &lt;code&gt;downstream_remote_address&lt;/code&gt; can be mapped to a Kubernetes service:&lt;/p&gt;
&lt;p&gt;a. If there is a service (say &lt;code&gt;Service B&lt;/code&gt;) whose implementation is running at this IP (and port), then we have a service-to-service relation, &lt;code&gt;Service B -&amp;gt; Service A&lt;/code&gt;, which can be used to build the topology. Together with the &lt;code&gt;start_time&lt;/code&gt; and &lt;code&gt;duration&lt;/code&gt; fields in the access log, we can now derive the latency metrics.&lt;/p&gt;
&lt;p&gt;b. If there is no service that can be mapped to &lt;code&gt;downstream_remote_address&lt;/code&gt;, then the request may come from a service outside the mesh. Since SkyWalking cannot identify the source service where the requests come from, it simply generates the metrics without a source service, according to the &lt;a href=&#34;https://wu-sheng.github.io/STAM/&#34;&gt;topology analysis method&lt;/a&gt;. The topology can be built as accurately as possible, and the metrics detected from the server side are still correct.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Delegating outgoing requests:&lt;/strong&gt; in this case, Envoy acts as a client-side sidecar, and sets the &lt;code&gt;upstream_cluster&lt;/code&gt; in the form of &lt;code&gt;outbound|&amp;lt;port&amp;gt;|&amp;lt;subset&amp;gt;|&amp;lt;serviceFQDN&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-2.43.16-PM.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Client-side detection is simpler than the server-side case (1. Delegating incoming requests). If &lt;code&gt;upstream_remote_address&lt;/code&gt; is another sidecar or proxy, we simply get the mapped service name and generate the topology and metrics. Otherwise, we have no idea what it is and consider it an &lt;code&gt;UNKNOWN&lt;/code&gt; service.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
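&lt;p&gt;The two sidecar cases can be sketched roughly as follows. This is a simplified illustration, not the SkyWalking analyzer itself: the helper names &lt;code&gt;service_of_ip&lt;/code&gt; and &lt;code&gt;relation&lt;/code&gt;, the IP addresses, and the hard-coded lookup table are all assumptions, whereas the real analyzer resolves addresses through Kubernetes metadata.&lt;/p&gt;

```shell
# Hypothetical IP-to-service table standing in for Kubernetes metadata.
service_of_ip() {
  case "$1" in
    10.0.0.11) echo "productpage" ;;
    10.0.0.12) echo "reviews" ;;
    10.0.0.13) echo "ratings" ;;
    *)         echo "" ;;
  esac
}

# relation UPSTREAM_CLUSTER LOCAL_SERVICE PEER_IP
# PEER_IP is downstream_remote_address for inbound logs and
# upstream_remote_address for outbound logs.
relation() {
  direction=${1%%|*}            # text before the first "|"
  peer=$(service_of_ip "$3")
  case "$direction" in
    inbound)
      # Server side: an unmapped peer yields metrics without a source service.
      if [ -n "$peer" ]; then echo "$peer -> $2"; else echo "(no source) -> $2"; fi ;;
    outbound)
      # Client side: an unmapped peer is treated as UNKNOWN.
      echo "$2 -> ${peer:-UNKNOWN}" ;;
    *)
      echo "unsupported upstream_cluster: $1" ;;
  esac
}

relation "inbound|9080|http|reviews.default.svc.cluster.local" reviews 10.0.0.11
relation "inbound|9080|http|reviews.default.svc.cluster.local" reviews 192.168.1.20
relation "outbound|9080||ratings.default.svc.cluster.local" reviews 10.0.0.13
relation "outbound|443||example.com" reviews 203.0.113.9
```

&lt;p&gt;Only the direction segment of &lt;code&gt;upstream_cluster&lt;/code&gt; is used here; the port, subset, and host segments are ignored to keep the sketch short.&lt;/p&gt;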
&lt;h3 id=&#34;proxy-role&#34;&gt;Proxy role&lt;/h3&gt;
&lt;p&gt;When Envoy is deployed as a proxy, it is an independent service itself and doesn&amp;rsquo;t represent any other service like a sidecar does. Therefore, we can build client-side metrics as well as server-side metrics.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-2.46.56-PM.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;p&gt;In this section, we will use the typical &lt;a href=&#34;https://istio.io/latest/docs/examples/bookinfo/&#34;&gt;bookinfo application&lt;/a&gt; to demonstrate how Apache SkyWalking 8.3.0+ (the latest version as of Nov. 30th, 2020) works together with Envoy ALS to observe a service mesh.&lt;/p&gt;
&lt;h3 id=&#34;installing-kubernetes&#34;&gt;Installing Kubernetes&lt;/h3&gt;
&lt;p&gt;SkyWalking 8.3.0 supports the Envoy ALS solution in both Kubernetes and virtual machine (VM) environments. This tutorial focuses only on the Kubernetes scenario, so we need to install Kubernetes before taking further steps. For the VM solution, please stay tuned for our next blog.&lt;/p&gt;
&lt;p&gt;In this tutorial, we will use the &lt;a href=&#34;https://minikube.sigs.k8s.io/docs/&#34;&gt;Minikube&lt;/a&gt; tool to quickly set up a local Kubernetes (v1.17) cluster for testing. To run all the needed components, including the bookinfo application, the SkyWalking OAP, and the WebUI, the cluster may need up to 4GB RAM and 2 CPU cores.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;minikube start --memory&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;4096&lt;/span&gt; --cpus&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Next, run &lt;code&gt;kubectl get pods --namespace=kube-system --watch&lt;/code&gt; to check whether all the Kubernetes components are ready. If not, wait until they are ready before continuing.&lt;/p&gt;
&lt;h3 id=&#34;installing-istio&#34;&gt;Installing Istio&lt;/h3&gt;
&lt;p&gt;Istio provides a very convenient way to configure the Envoy proxy and enable the access log service. The built-in configuration profiles free us from lots of manual operations. So, for demonstration purposes, we will use Istio throughout this tutorial.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;export&lt;/span&gt; &lt;span style=&#34;color:#953800&#34;&gt;ISTIO_VERSION&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;1.7.1
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -L https://istio.io/downloadIstio &lt;span style=&#34;color:#1f2328&#34;&gt;|&lt;/span&gt; sh - 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo mv &lt;span style=&#34;color:#953800&#34;&gt;$PWD&lt;/span&gt;/istio-&lt;span style=&#34;color:#953800&#34;&gt;$ISTIO_VERSION&lt;/span&gt;/bin/istioctl /usr/local/bin/
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl  install --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl label namespace default istio-injection&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;enabled
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Run &lt;code&gt;kubectl get pods --namespace=istio-system --watch&lt;/code&gt; to check whether all the Istio components are ready. If not, wait until they are ready before continuing.&lt;/p&gt;
&lt;h3 id=&#34;enabling-als&#34;&gt;Enabling ALS&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;demo&lt;/code&gt; profile doesn’t enable ALS by default. We need to reconfigure Istio to enable it.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl  manifest install &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set meshConfig.enableEnvoyAccessLogService&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;true&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set meshConfig.defaultConfig.envoyAccessLogService.address&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;skywalking-oap.istio-system:11800
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The option &lt;code&gt;--set meshConfig.enableEnvoyAccessLogService=true&lt;/code&gt; enables the Envoy access log service in the mesh. ALS is essentially a gRPC service that emits request logs. The option &lt;code&gt;meshConfig.defaultConfig.envoyAccessLogService.address=skywalking-oap.istio-system:11800&lt;/code&gt; tells this gRPC service where to send the logs, in this case &lt;code&gt;skywalking-oap.istio-system:11800&lt;/code&gt;, where we will deploy the SkyWalking ALS receiver later.&lt;/p&gt;
&lt;p&gt;NOTE:
You can also enable ALS when installing Istio, so that you don’t need to restart Istio after installation:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;istioctl install --set &lt;span style=&#34;color:#953800&#34;&gt;profile&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;demo &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set meshConfig.enableEnvoyAccessLogService&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;true&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set meshConfig.defaultConfig.envoyAccessLogService.address&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;skywalking-oap.istio-system:11800
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl label namespace default istio-injection&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;enabled
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;deploying-apache-skywalking&#34;&gt;Deploying Apache SkyWalking&lt;/h3&gt;
&lt;p&gt;The SkyWalking community provides a &lt;a href=&#34;https://helm.sh&#34;&gt;Helm&lt;/a&gt; Chart to make it easier to deploy SkyWalking and its dependent services in Kubernetes. The Helm Chart can be found at the &lt;a href=&#34;https://github.com/apache/skywalking-kubernetes&#34;&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Install Helm&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -sSLO https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo tar xz -C /usr/local/bin --strip-components&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;1&lt;/span&gt; linux-amd64/helm -f helm-v3.0.0-linux-amd64.tar.gz
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Clone SkyWalking Helm Chart&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/apache/skywalking-kubernetes
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;cd&lt;/span&gt; skywalking-kubernetes/chart
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git reset --hard dd749f25913830c47a97430618cefc4167612e75
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Update dependencies&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;helm dep up skywalking
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#57606a&#34;&gt;# Deploy SkyWalking&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;helm -n istio-system install skywalking skywalking &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.storageType&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;h2&amp;#39;&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set ui.image.tag&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;8.3.0 &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.image.tag&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;8.3.0-es7 &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.replicas&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.env.SW_ENVOY_METRIC_ALS_HTTP_ANALYSIS&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;k8s-mesh &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.env.JAVA_OPTS&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#39;-Dmode=&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set oap.envoy.als.enabled&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;true&lt;/span&gt; &lt;span style=&#34;color:#0a3069&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               --set elasticsearch.enabled&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;false&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We deploy SkyWalking to the namespace &lt;code&gt;istio-system&lt;/code&gt;, so that the SkyWalking OAP service can be reached at &lt;code&gt;skywalking-oap.istio-system:11800&lt;/code&gt;, the address to which we told ALS to send its logs in the previous step.&lt;/p&gt;
&lt;p&gt;We also enable the ALS analyzer in the SkyWalking OAP: &lt;code&gt;oap.env.SW_ENVOY_METRIC_ALS_HTTP_ANALYSIS=k8s-mesh&lt;/code&gt;. The analyzer parses the access logs and maps the IP addresses in the logs to the real service names in Kubernetes, to build a topology.&lt;/p&gt;
&lt;p&gt;In order to retrieve the metadata (such as Pod IP and service names) from a Kubernetes cluster for IP mappings, we also set &lt;code&gt;oap.envoy.als.enabled=true&lt;/code&gt;, to apply for a &lt;code&gt;ClusterRole&lt;/code&gt; that has access to the metadata.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;export&lt;/span&gt; &lt;span style=&#34;color:#953800&#34;&gt;POD_NAME&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#cf222e&#34;&gt;$(&lt;/span&gt;kubectl get pods -A -l &lt;span style=&#34;color:#0a3069&#34;&gt;&amp;#34;app=skywalking,release=skywalking,component=ui&amp;#34;&lt;/span&gt; -o name&lt;span style=&#34;color:#cf222e&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#953800&#34;&gt;$POD_NAME&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl -n istio-system port-forward &lt;span style=&#34;color:#953800&#34;&gt;$POD_NAME&lt;/span&gt; 8080:8080
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now open http://localhost:8080 in your browser. You should see the SkyWalking dashboard. The dashboard is empty for now, but it will fill up after we deploy the demo application and generate traffic.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-3.01.03-PM.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;deploying-bookinfo-application&#34;&gt;Deploying Bookinfo application&lt;/h3&gt;
&lt;p&gt;Run:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#f7f7f7;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#6639ba&#34;&gt;export&lt;/span&gt; &lt;span style=&#34;color:#953800&#34;&gt;ISTIO_VERSION&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;1.7.1
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f https://raw.githubusercontent.com/istio/istio/&lt;span style=&#34;color:#953800&#34;&gt;$ISTIO_VERSION&lt;/span&gt;/samples/bookinfo/platform/kube/bookinfo.yaml
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl apply -f https://raw.githubusercontent.com/istio/istio/&lt;span style=&#34;color:#953800&#34;&gt;$ISTIO_VERSION&lt;/span&gt;/samples/bookinfo/networking/bookinfo-gateway.yaml
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;kubectl &lt;span style=&#34;color:#6639ba&#34;&gt;wait&lt;/span&gt; --for&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#953800&#34;&gt;condition&lt;/span&gt;&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;Ready pods --all --timeout&lt;span style=&#34;color:#0550ae&#34;&gt;=&lt;/span&gt;1200s
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;minikube tunnel
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then navigate your browser to http://localhost/productpage. You should be able to see the typical bookinfo application. Refresh the webpage several times to generate enough access logs.&lt;/p&gt;
&lt;h3 id=&#34;done&#34;&gt;Done!&lt;/h3&gt;
&lt;p&gt;And you’re all done! Check out the SkyWalking WebUI again. You should see the topology of the bookinfo application, as well as the metrics of each individual service of the bookinfo application.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;Screen-Shot-2020-12-02-at-3.05.24-PM.png&#34; alt=&#34;&#34;&gt;
&lt;img src=&#34;Screen-Shot-2020-12-02-at-3.11.55-PM.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;troubleshooting&#34;&gt;Troubleshooting&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Check all pods status: &lt;code&gt;kubectl get pods -A&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;SkyWalking OAP logs: &lt;code&gt;kubectl -n istio-system logs -f $(kubectl get pod -A -l &amp;quot;app=skywalking,release=skywalking,component=oap&amp;quot; -o name)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;SkyWalking WebUI logs: &lt;code&gt;kubectl -n istio-system logs -f $(kubectl get pod -A -l &amp;quot;app=skywalking,release=skywalking,component=ui&amp;quot; -o name)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Make sure the time zone at the bottom-right of the WebUI is set to &lt;code&gt;UTC +0&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;customizing-service-names&#34;&gt;Customizing Service Names&lt;/h2&gt;
&lt;p&gt;The SkyWalking community brought more improvements to the ALS solution in the 8.3.0 version. You can decide how to compose the service names when mapping from the IP addresses, with the variables &lt;code&gt;service&lt;/code&gt; and &lt;code&gt;pod&lt;/code&gt;. For instance, configuring &lt;code&gt;K8S_SERVICE_NAME_RULE&lt;/code&gt; to the expression &lt;code&gt;${service.metadata.name}-${pod.metadata.labels.version}&lt;/code&gt; yields service names that include the version label, such as &lt;code&gt;reviews-v1&lt;/code&gt;, &lt;code&gt;reviews-v2&lt;/code&gt;, and &lt;code&gt;reviews-v3&lt;/code&gt;, instead of a single service &lt;code&gt;reviews&lt;/code&gt;. See &lt;a href=&#34;https://github.com/apache/skywalking/pull/5722&#34;&gt;the PR&lt;/a&gt; for details.&lt;/p&gt;
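&lt;p&gt;To make the effect of such a rule concrete, here is a small shell sketch that imitates the substitution with &lt;code&gt;sed&lt;/code&gt;. It only mimics the outcome described above: &lt;code&gt;render_service_name&lt;/code&gt; is a hypothetical helper, and SkyWalking evaluates the expression internally against Kubernetes metadata rather than with text replacement.&lt;/p&gt;

```shell
# The rule from the text; single quotes keep the shell from expanding ${...}.
rule='${service.metadata.name}-${pod.metadata.labels.version}'

# render_service_name: hypothetical helper that mimics the substitution.
# $1 = Service metadata.name, $2 = value of the Pod label "version".
render_service_name() {
  printf '%s\n' "$rule" | sed \
    -e "s/[$]{service[.]metadata[.]name}/$1/" \
    -e "s/[$]{pod[.]metadata[.]labels[.]version}/$2/"
}

# One line per version of the bookinfo "reviews" workload.
for v in v1 v2 v3; do
  render_service_name reviews "$v"
done
```

&lt;p&gt;With the rule set to &lt;code&gt;${service.metadata.name}&lt;/code&gt; alone, all three versions would collapse into the single name &lt;code&gt;reviews&lt;/code&gt;.&lt;/p&gt;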
&lt;h2 id=&#34;working-als-with-vm&#34;&gt;Using ALS with VMs&lt;/h2&gt;
&lt;p&gt;Kubernetes is popular, but what about VMs? As discussed above, in order to map the IPs to services, SkyWalking needs access to the Kubernetes cluster to fetch service metadata and Pod IPs. But in a VM environment, there is no source from which we can fetch that metadata.
In the next post, we will introduce another ALS analyzer based on the Envoy metadata exchange mechanism. With this analyzer, you are able to observe a service mesh in the VM environment. Stay tuned!
If you want commercial support for the ALS solution or hybrid mesh observability, &lt;a href=&#34;https://www.tetrate.io/tetrate-service-bridge/&#34;&gt;Tetrate Service Bridge, TSB&lt;/a&gt; is another good option out there.&lt;/p&gt;
&lt;h2 id=&#34;additional-resources&#34;&gt;Additional Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;KubeCon 2019 Recorded &lt;a href=&#34;https://www.youtube.com/watch?v=tERm39ju9ew&#34;&gt;Video&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Get more SkyWalking updates on &lt;a href=&#34;https://skywalking.apache.org&#34;&gt;the official website&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Apache SkyWalking founder Sheng Wu and SkyWalking core maintainer Zhenxu Ke are Tetrate engineers, and Tevah Platt is a content writer for Tetrate. Tetrate helps organizations adopt open source service mesh tools, including Istio, Envoy, and Apache SkyWalking, so they can manage microservices, run service mesh on any infrastructure, and modernize their applications.&lt;/em&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: The Apdex Score for Measuring Service Mesh Health</title>
      <link>/blog/2020-07-26-apdex-and-skywalking/</link>
      <pubDate>Sun, 26 Jul 2020 00:00:00 +0000</pubDate>
      <guid>/blog/2020-07-26-apdex-and-skywalking/</guid>
      <description>
        
        
        &lt;ul&gt;
&lt;li&gt;Author: Srinivasan Ramaswamy, tetrate&lt;/li&gt;
&lt;li&gt;Original link, &lt;a href=&#34;https://www.tetrate.io/blog/the-apdex-score-for-measuring-service-mesh-health/&#34;&gt;Tetrate.io blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;asking-how-are-you-is-more-profound-than-what-are-your-symptoms&#34;&gt;Asking &lt;code&gt;How are you&lt;/code&gt; is more profound than &lt;code&gt;What are your symptoms&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;&lt;img src=&#34;intro_image.png&#34; alt=&#34;alt_text&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;background&#34;&gt;&lt;strong&gt;Background&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Recently I visited my preferred doctor. Whenever I visit, the doctor greets me with a series of light questions: How’s your day? How about the week before? Any recent trips? Did I break my cycling record? How’s your workout regimen? &lt;em&gt;Finally&lt;/em&gt; he asks, “Do you have any problems?” On those visits when I didn&amp;rsquo;t feel ok, I would say something like, &amp;ldquo;&lt;em&gt;I&amp;rsquo;m feeling dull this week, and I&amp;rsquo;m feeling more tired towards noon….&amp;rdquo;&lt;/em&gt; It&amp;rsquo;s at this point that he takes out his stethoscope, his pulse oximeter, and blood pressure apparatus. Then, if he feels he needs a more in-depth insight, he starts listing out specific tests to be made.&lt;/p&gt;
&lt;p&gt;When I asked him if the first part of the discussion was just an ice-breaker, he said, &amp;ldquo;&lt;em&gt;That&amp;rsquo;s the essential part. It helps me find out how you feel, rather than what your symptoms are.&amp;rdquo;&lt;/em&gt; So, despite appearances, our opening chat about life helped him structure subsequent questions on symptoms, investigations and test results.&lt;/p&gt;
&lt;p&gt;On the way back, I couldn&amp;rsquo;t stop asking myself, &lt;em&gt;&amp;ldquo;Shouldn&amp;rsquo;t we be managing our mesh this way, too?&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If I strike parallels between my own health check and a mesh health check, “tests” would be log analysis, “investigations” would be tracing, and “symptoms” would be the traditional RED (Rate, Errors and Duration) metrics. That leaves the “essential part,” which is what we are talking about here: the &lt;em&gt;Wellness Factor&lt;/em&gt;, primarily the health of our mesh.&lt;/p&gt;
&lt;h3 id=&#34;health-in-the-context-of-service-mesh&#34;&gt;&lt;strong&gt;Health in the context of service mesh&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;We can measure the performance of any observed service through RED metrics.  RED metrics offer immense value in understanding the performance, reliability, and throughput of every service. Compelling visualizations of these metrics across the mesh make monitoring the entire mesh standardized and scalable. Also, setting alerts based on thresholds for each of these metrics helps to detect anomalies as and when they arise.&lt;/p&gt;
&lt;p&gt;To establish the context of any service and observe them, it&amp;rsquo;s ideal to visualize the mesh as a topology.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;mesh-1.png&#34; alt=&#34;alt_text&#34;&gt;&lt;/p&gt;
&lt;p&gt;A topology visualization of the mesh not only allows for picking any service and watching its metrics, but also gives vital information about service dependencies and the potential impact of a given service on the mesh.&lt;/p&gt;
&lt;p&gt;While RED metrics of each service offer tremendous insights, the user is more concerned with the overall responsiveness of the mesh rather than each of these services in isolation.&lt;/p&gt;
&lt;p&gt;To describe the performance of any service, right from submitting the request to receiving a completed HTTP response, we’d be measuring the user&amp;rsquo;s perception of responsiveness. This measure of response time compared with a set threshold is called Apdex. Apdex is an indicator of the health of a service in the mesh.&lt;/p&gt;
&lt;h3 id=&#34;apdex&#34;&gt;&lt;strong&gt;Apdex&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Apdex is a measure of response time considered against a set threshold. It compares the number of satisfactory and unsatisfactory response times against the total number of response times.&lt;/p&gt;
&lt;p&gt;Apdex is an industry standard to measure the satisfaction of users based on the response time of applications and services. It measures how satisfied your users are with your services, as traditional metrics such as average response time could get skewed quickly.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Satisfactory response time&lt;/em&gt; indicates the number of times when the roundtrip response time of a particular service was less than this threshold. &lt;em&gt;Unsatisfactory response time&lt;/em&gt; while meaning the opposite, is further categorized as &lt;em&gt;Tolerating&lt;/em&gt; and &lt;em&gt;Frustrating&lt;/em&gt;. &lt;em&gt;Tolerating&lt;/em&gt; accommodates any performance that is up to four times the threshold, and anything over that or any errors encountered is considered &lt;em&gt;Frustrating&lt;/em&gt;. The threshold mentioned here is an ideal roundtrip performance that we expect from any service. We could even start with an organization-wide limit of say, 500ms.&lt;/p&gt;
&lt;p&gt;The Apdex score is a ratio of satisfied and tolerating requests to the total requests made.&lt;/p&gt;
&lt;p&gt;Each &lt;em&gt;satisfied request&lt;/em&gt; counts as one request, while each &lt;em&gt;tolerating request&lt;/em&gt; counts as half a &lt;em&gt;satisfied request&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;An Apdex score takes values from 0 to 1, with 0 being the worst possible score, indicating that users were always frustrated, and 1 being the best possible score (100% of response times were Satisfactory).&lt;/p&gt;
&lt;p&gt;A percentage representation of this score also serves as the Health Indicator of the service.&lt;/p&gt;
&lt;h2 id=&#34;the-math&#34;&gt;&lt;strong&gt;The Math&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;The actual computation of this Apdex score is achieved through the following formula.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;		            SatisfiedCount +  ( ToleratingCount / 2 )

Apdex Score  =  ------------------------------------------------------

                                TotalSamples
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;A percentage representation of this score is known as the Health Indicator of a service.&lt;/p&gt;
&lt;h3 id=&#34;example-computation&#34;&gt;&lt;strong&gt;Example Computation&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;During a 2-minute period, a host handles 200 requests.&lt;/p&gt;
&lt;p&gt;The Apdex threshold T = 0.5 seconds (500ms).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;170 of the requests were handled within 500ms, so they are classified as Satisfied.&lt;/li&gt;
&lt;li&gt;20 of the requests were handled between 500ms and 2 seconds (2000 ms), so they are classified as Tolerating.&lt;/li&gt;
&lt;li&gt;The remaining 10 were not handled properly or took longer than 2 seconds, so they are classified as Frustrated.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The resulting Apdex score is 0.9:  (170 + (20/2))/200 = 0.9.&lt;/p&gt;
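&lt;p&gt;As a minimal sketch, the computation above can be reproduced in a few lines of Python. The names &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;apdex_score&lt;/code&gt; are illustrative assumptions, not part of SkyWalking or the Apdex specification. Treating a response time exactly equal to a zone boundary as belonging to the lower zone is also an assumption of this sketch.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from bisect import bisect_left

ZONES = ("Satisfied", "Tolerating", "Frustrated")


def classify(seconds, threshold, failed=False):
    """Place one request in an Apdex zone (hypothetical helper)."""
    if failed:
        # Errors count as Frustrated regardless of response time.
        return "Frustrated"
    # Zone boundaries are T and 4T; a time equal to a boundary
    # falls into the lower zone (an assumption of this sketch).
    return ZONES[bisect_left([threshold, 4 * threshold], seconds)]


def apdex_score(satisfied, tolerating, total):
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    return (satisfied + tolerating / 2) / total


# The example above: 170 satisfied, 20 tolerating, 200 requests.
print(apdex_score(170, 20, 200))   # 0.9
print(classify(0.3, 0.5))          # Satisfied
print(classify(1.2, 0.5))          # Tolerating
print(classify(2.5, 0.5))          # Frustrated
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Frustrated requests do not appear in the numerator at all; they lower the score only by counting toward the total.&lt;/p&gt;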
&lt;h3 id=&#34;the-next-level&#34;&gt;&lt;strong&gt;The next level&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;At the next level, we can attempt to improve our topology visualization by coloring nodes based on their health. Also, we can include health as a part of the information we show when the user taps on a service.&lt;/p&gt;
&lt;p&gt;Apdex specifications recommend the following Apdex Quality Ratings by classifying Apdex Score as Excellent (0.94 - 1.00), Good (0.85 - 0.93), Fair (0.70 - 0.84), Poor (0.50 - 0.69)  and Unacceptable (0.00 - 0.49).&lt;/p&gt;
&lt;p&gt;To visualize this, let’s look at our topology using traffic light colors, marking our nodes as Healthy, At-Risk and Unhealthy, where &lt;strong&gt;Unhealthy&lt;/strong&gt; indicates health that falls below 80%. A rate between 80% and 95% indicates &lt;strong&gt;At-Risk&lt;/strong&gt;, and health at 95% and above is termed &lt;strong&gt;Healthy&lt;/strong&gt;.&lt;/p&gt;
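&lt;p&gt;The traffic light mapping could be sketched roughly as follows. The function name &lt;code&gt;health_status&lt;/code&gt; is a hypothetical helper, and the cut points are simply the 80% and 95% values quoted above, expressed on the 0 to 1 Apdex scale.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from bisect import bisect_right

LABELS = ("Unhealthy", "At-Risk", "Healthy")


def health_status(apdex):
    """Map an Apdex score (0 to 1) to a traffic-light label."""
    # Cut points at 80% and 95%; a score equal to a cut point
    # belongs to the higher band, matching the text above.
    return LABELS[bisect_right([0.80, 0.95], apdex)]


for score in (0.97, 0.90, 0.62):
    print(score, health_status(score))
&lt;/code&gt;&lt;/pre&gt;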
&lt;p&gt;Let’s incorporate this coloring into our topology visualization and take its usability to the next level. If implemented, we will be looking at something like this.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;mesh-2.png&#34; alt=&#34;alt_text&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;moving-further&#34;&gt;&lt;strong&gt;Moving further&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Apdex provides tremendous visibility into customer satisfaction on the responsiveness of our services. Even more, by extending the implementation to the edges calling this service we get further insight into the health of the mesh itself.&lt;/p&gt;
&lt;p&gt;Two services with similar Apdex scores offer the same level of customer satisfaction. However, the volume of traffic that flows into each service can be of immense help in prioritizing which service to address first. A service with higher traffic flow means its experience affects a larger number of users on the mesh.&lt;/p&gt;
&lt;p&gt;While health relates to a service, we can also analyze the interactions between two services and calculate the health of the interaction. This health calculation of every interaction on the mesh helps us establish a critical path, based on the health of all interactions in the entire topology.&lt;/p&gt;
&lt;p&gt;In a big mesh, showing traffic as yet another number will make it more challenging to visualize and monitor. We can, with a bit of creativity, improve the entire visualization by rendering the edges that connect services with different thickness depending on the throughput of the service.&lt;/p&gt;
&lt;p&gt;An unhealthy service participating in a high throughput transaction could lead to excessive consumption of resources. On the other hand, this visualization also offers a great tip to maximize investment in tuning services.&lt;/p&gt;
&lt;p&gt;Tuning a service that is part of a high-throughput transaction offers far greater benefits than tuning an occasionally used service.&lt;/p&gt;
&lt;p&gt;If we look at implementing such a visualization, which includes the health of interactions and throughput of such interactions, we would be looking at something like the one below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;mesh-4.png&#34; alt=&#34;alt_text&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;the-day-is-not-far&#34;&gt;&lt;strong&gt;The day is not far&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;These capabilities are already available to users &lt;strong&gt;today&lt;/strong&gt; as one of the UI features of Tetrate’s service mesh platform. The platform uses Apache SkyWalking (&lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;https://skywalking.apache.org&lt;/a&gt;), a highly configurable and performant observability and performance management framework. SkyWalking monitors traffic across the mesh, aggregates RED metrics for both services and their interactions, continuously computes and monitors the health of the services, and lets users configure alerts and notifications when services cross specific thresholds. Together, these give comprehensive visibility into the health of the mesh.&lt;/p&gt;
&lt;p&gt;With such tremendous visibility into our mesh performance, the day is not far when we at our NOC (Network Operations Center) for the mesh have this topology as our HUD (Heads Up Display).&lt;/p&gt;
&lt;p&gt;This HUD, with the insights and patterns gathered over time, would predict situations and proactively prompt us on potential focus areas to improve customer satisfaction.&lt;/p&gt;
&lt;p&gt;The visualization with rich historical data can also empower the Network Engineers to go back in time and look at the performance of the mesh on a similar day in the past.&lt;/p&gt;
&lt;p&gt;An earnest implementation of such a visualization would look something like the one below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;mesh-5.png&#34; alt=&#34;alt_text&#34;&gt;&lt;/p&gt;
&lt;h3 id=&#34;to-conclude&#34;&gt;&lt;strong&gt;To conclude&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;With all the discussion so far, the health of a mesh is more about how our users feel, and what we can proactively do as service providers to sustain, if not enhance, the experience of our users.&lt;/p&gt;
&lt;p&gt;As the world advances toward personalized medicine, we&amp;rsquo;re not far from a day when my doctor will text me: &amp;ldquo;How about treating yourself to ice cream today and taking the Gray Butte Trail to Mount Shasta!&amp;rdquo; Likewise, we can do more for our customers by having better insight into their overall wellness.&lt;/p&gt;
&lt;p&gt;Tetrate’s approach to “service mesh health” is not only to offer management, monitoring and support but to make infrastructure healthy from the start to reduce the probability of incidents. Powered by Istio, Envoy, and SkyWalking, Tetrate&amp;rsquo;s solutions enable consistent end-to-end observability, runtime security, and traffic management for any workload in any environment.&lt;/p&gt;
&lt;p&gt;Our customers deserve healthy systems! Please do share your thoughts on making service mesh an exciting and robust experience for our customers.&lt;/p&gt;
&lt;h3 id=&#34;references&#34;&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Apdex&#34;&gt;https://en.wikipedia.org/wiki/Apdex&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.apdex.org/overview.html&#34;&gt;https://www.apdex.org/overview.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.apdex.org/index.php/specifications/&#34;&gt;https://www.apdex.org/index.php/specifications/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://skywalking.apache.org/&#34;&gt;https://skywalking.apache.org/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

      </description>
    </item>
    
    <item>
      <title>Blog: SkyWalking v6 is Service Mesh ready</title>
      <link>/blog/2018-12-12-skywalking-service-mesh-ready/</link>
      <pubDate>Wed, 05 Dec 2018 00:00:00 +0000</pubDate>
      <guid>/blog/2018-12-12-skywalking-service-mesh-ready/</guid>
      <description>
        
        
        &lt;p&gt;Original link, &lt;a href=&#34;https://www.tetrate.io/blog/apache-skywalking-v6/&#34;&gt;Tetrate.io blog&lt;/a&gt;&lt;/p&gt;
&lt;h1 id=&#34;context&#34;&gt;Context&lt;/h1&gt;
&lt;p&gt;The integration of SkyWalking and Istio Service Mesh yields an essential open-source tool for resolving the chaos created by the proliferation of siloed, cloud-based services.&lt;/p&gt;
&lt;p&gt;Apache SkyWalking is an open, modern performance management tool for distributed services, designed especially for microservices, cloud native and container-based (Docker, K8s, Mesos) architectures. We at Tetrate believe it is going to be an important project for understanding the performance of microservices. The recently released v6 integrates with Istio Service Mesh and focuses on metrics and tracing. It natively understands the most common language runtimes (Java, .NET, and NodeJS). With its new core code, SkyWalking v6 also supports Istio telemetry data formats, providing consistent analysis, persistence, and visualization.&lt;/p&gt;
&lt;p&gt;SkyWalking has evolved into an Observability Analysis Platform that enables observation and monitoring of hundreds of services all at once. It promises solutions for some of the trickiest problems faced by system administrators using complex arrays of abundant services: Identifying why and where a request is slow, distinguishing normal from deviant system performance, comparing apples-to-apples metrics across apps regardless of programming language, and attaining a complete and meaningful view of performance.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2ctge1g5j31pc0s8h04.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;skywalking-history&#34;&gt;SkyWalking History&lt;/h1&gt;
&lt;p&gt;Launched in China by Wu Sheng in 2015, SkyWalking started as just a distributed tracing system, like Zipkin, but with auto instrumentation from a Java agent. This enabled JVM users to see distributed traces without any change to their source code. In the last two years, it has been used for research and production by more than &lt;a href=&#34;https://github.com/apache/incubator-skywalking/blob/master/docs/powered-by.md&#34;&gt;50 companies&lt;/a&gt;. With its expanded capabilities, we expect to see it adopted more globally.&lt;/p&gt;
&lt;h1 id=&#34;whats-new&#34;&gt;What&amp;rsquo;s new&lt;/h1&gt;
&lt;h2 id=&#34;service-mesh-integration&#34;&gt;Service Mesh Integration&lt;/h2&gt;
&lt;p&gt;Istio has picked up a lot of steam as the framework of choice for distributed services. Based on all the interest in the Istio project, and community feedback, some SkyWalking (P)PMC members decided to integrate with Istio Service Mesh to move SkyWalking to a higher level.&lt;/p&gt;
&lt;p&gt;So now you can use SkyWalking to get metrics and understand the topology of your applications. This works not just for Java, .NET and Node using our language agents, but also for microservices running under the Istio service mesh. You can get a full topology of both kinds of applications.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2cjmhi3uj31h80m5jwn.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;h2 id=&#34;observability-analysis-platform&#34;&gt;Observability analysis platform&lt;/h2&gt;
&lt;p&gt;With its roots in tracing, SkyWalking is now transitioning into an open-standards based &lt;strong&gt;Observability Analysis Platform&lt;/strong&gt;, which means the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It can accept different kinds and formats of telemetry data from a mesh, such as Istio telemetry.&lt;/li&gt;
&lt;li&gt;Its agents support various popular software technologies and frameworks like Tomcat, Spring, Kafka. The whole supported framework list is &lt;a href=&#34;https://github.com/apache/incubator-skywalking/blob/master/docs/en/setup/service-agent/java-agent/Supported-list.md&#34;&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;It can accept data from other compliant sources like Zipkin-formatted traces reported from Zipkin, Jaeger, or OpenCensus clients.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2cqo4yctj31ok0s07hh.jpg&#34; alt=&#34;img&#34;&gt;&lt;/p&gt;
&lt;p&gt;SkyWalking is logically split into four parts: Probes, Platform Backend, Storage and UI.&lt;/p&gt;
&lt;p&gt;There are two kinds of &lt;strong&gt;probes&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Language agents or SDKs following SkyWalking across-thread propagation formats and trace formats, run in the user’s application process.&lt;/li&gt;
&lt;li&gt;The Istio mixer adaptor, which collects telemetry from the Service Mesh.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The platform &lt;strong&gt;backend&lt;/strong&gt; provides gRPC and RESTful HTTP endpoints for all SkyWalking-supported trace and metric telemetry data. For example, you can stream these metrics into an analysis system.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Storage&lt;/strong&gt; supports multiple implementations such as ElasticSearch, H2 (alpha), MySQL, and Apache ShardingSphere for MySQL Cluster. TiDB will be supported in the next release.&lt;/p&gt;
&lt;p&gt;SkyWalking’s built-in &lt;strong&gt;UI&lt;/strong&gt; with a GraphQL endpoint for data allows intuitive, customizable integration.&lt;/p&gt;
&lt;p&gt;Some examples of SkyWalking’s UI:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Observe a Spring app using the SkyWalking JVM-agent&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2ckeyyxlj31h70lvdjf.jpg&#34; alt=&#34;Topology&#34;&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Observe on Istio without any agent, no matter what language the service is written in&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2ckwr65mj31h80m5jwn.jpg&#34; alt=&#34;Topology&#34;&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;See fine-grained metrics like requests/calls per minute, P99/95/90/75/50 latency, average response time, and heatmap&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2cmxcrdqj31gz0qmdja.jpg&#34; alt=&#34;Dashboard&#34;&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Service dependencies and metrics&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;0081Kckwly1gl2cngbu84j31h00oxadw.jpg&#34; alt=&#34;Service&#34;&gt;&lt;/p&gt;
&lt;h1 id=&#34;service-focused&#34;&gt;Service Focused&lt;/h1&gt;
&lt;p&gt;At Tetrate, we are focused on discovery, reliability, and security of your running services.
This is why we are embracing SkyWalking, which makes service performance observable.&lt;/p&gt;
&lt;p&gt;Behind this admittedly cool UI, the aggregation logic is very easy to understand, making it easy to customize SkyWalking in its Observability Analysis Language (OAL) script.&lt;/p&gt;
&lt;p&gt;We’ll post more about OAL for developers looking to customize SkyWalking, and you can read the official &lt;a href=&#34;https://github.com/apache/incubator-skywalking/blob/master/docs/en/concepts-and-designs/oal.md&#34;&gt;OAL introduction&lt;/a&gt; document.&lt;/p&gt;
&lt;p&gt;Scripts are based on three core concepts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Service&lt;/strong&gt; represents a group of workloads that provide the same behaviours for incoming requests. You can define the service name whether you are using instrument agents or SDKs. Otherwise, SkyWalking uses the name you defined in the underlying platform, such as Istio.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Service Instance&lt;/strong&gt;: Each workload in the Service group is called an instance. Like &lt;em&gt;Pods&lt;/em&gt; in Kubernetes, it doesn&amp;rsquo;t need to be a single OS process. If you are using an instrument agent, an instance does map to one OS process.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt; is a path in a certain service that handles incoming requests, such as HTTP paths or a gRPC service + method. Mesh telemetry and trace data are formatted as source objects (aka scope). These are the input for the aggregation, with the script describing how to aggregate, including input, conditions, and the resulting metric name.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
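&lt;p&gt;To make the idea of aggregating source objects concrete, here is a rough Python model of it. This is not OAL syntax and not SkyWalking code: the record fields, the &lt;code&gt;aggregate&lt;/code&gt; helper, and the metric names are all hypothetical, chosen only to show how an input scope, a condition, and a resulting metric name fit together.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from collections import defaultdict

# Hypothetical source objects; the field names are illustrative only.
sources = [
    {"service": "cart", "endpoint": "/add", "latency": 120, "status": True},
    {"service": "cart", "endpoint": "/add", "latency": 480, "status": False},
    {"service": "cart", "endpoint": "/list", "latency": 60, "status": True},
    {"service": "pay", "endpoint": "/charge", "latency": 300, "status": True},
]


def aggregate(sources, scope, value, condition=lambda s: True):
    """Average `value` per `scope` entity over sources passing `condition`."""
    groups = defaultdict(list)
    for s in sources:
        if condition(s):
            groups[s[scope]].append(s[value])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


# Input: all sources, grouped by service. Result metric: average latency.
service_resp_time = aggregate(sources, "service", "latency")
# Input: only successful requests, grouped by endpoint.
endpoint_ok_resp_time = aggregate(
    sources, "endpoint", "latency", condition=lambda s: s["status"]
)
print(service_resp_time)
print(endpoint_ok_resp_time)
&lt;/code&gt;&lt;/pre&gt;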
&lt;h1 id=&#34;core-features&#34;&gt;Core Features&lt;/h1&gt;
&lt;p&gt;The other core features in SkyWalking v6 are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Service, service instance, endpoint metrics analysis.&lt;/li&gt;
&lt;li&gt;Consistent visualization with or without a Service Mesh.&lt;/li&gt;
&lt;li&gt;Topology discovery, Service dependency analysis.&lt;/li&gt;
&lt;li&gt;Distributed tracing.&lt;/li&gt;
&lt;li&gt;Slow service and endpoint detection.&lt;/li&gt;
&lt;li&gt;Alarms.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Of course, SkyWalking has some more upgrades from v5, such as:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;ElasticSearch 6 as storage is supported.&lt;/li&gt;
&lt;li&gt;H2 storage implementor is back.&lt;/li&gt;
&lt;li&gt;Kubernetes cluster management is provided. You don’t need Zookeeper to keep the backend running in cluster mode.&lt;/li&gt;
&lt;li&gt;Totally new alarm core. Easier configuration.&lt;/li&gt;
&lt;li&gt;More cloud native style.&lt;/li&gt;
&lt;li&gt;MySQL will be supported in the next release.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1 id=&#34;please-test-and-provide-feedback&#34;&gt;Please: Test and Provide Feedback!&lt;/h1&gt;
&lt;p&gt;We would love everyone to try out and test our new version. You can find everything you need in our &lt;a href=&#34;https://github.com/apache/incubator-skywalking&#34;&gt;Apache repository&lt;/a&gt;, and you can read the &lt;a href=&#34;https://github.com/apache/incubator-skywalking/blob/master/docs/README.md&#34;&gt;documentation&lt;/a&gt; for further details. You can contact the project team through the following channels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Submit an issue on &lt;a href=&#34;https://github.com/apache/incubator-skywalking/issues/new&#34;&gt;GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Mailing list: &lt;a href=&#34;mailto:dev@skywalking.apache.org&#34;&gt;dev@skywalking.apache.org&lt;/a&gt;. Send an email to &lt;a href=&#34;mailto:dev-subscribe@skywalking.apache.org&#34;&gt;dev-subscribe@skywalking.apache.org&lt;/a&gt; to subscribe to the mailing list.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://gitter.im/OpenSkywalking/Lobby&#34;&gt;Gitter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://twitter.com/ASFSkyWalking&#34;&gt;Project twitter&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Oh, and one last thing! If you like our project, don&amp;rsquo;t forget to &lt;a href=&#34;https://github.com/apache/incubator-skywalking&#34;&gt;give us a star on GitHub&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
