TLDR: Tracing has been added to the A1 Policy Management Service. Tracing is disabled by default. There are two ways to change the flag and enable it:

A) System Property

Set the flag otel.sdk.disabled to false in the application.yaml (New Delhi):

Code Block
languageyml
themeMidnight
otel:
  sdk:
    disabled: false

...

Code Block
languageyml
themeMidnight
    disabled: ${ONAP_SDK_DISABLED:false}
    south: ${ONAP_TRACING_SOUTHBOUND:true}
  instrumentation:
    spring-webflux:
      enabled: true

  1. otel.sdk.disabled: enables or disables tracing altogether
  2. otel.sdk.south: if ONAP_SDK_DISABLED is false, enables or disables southbound tracing
  3. otel.instrumentation.spring-webflux.enabled: if ONAP_SDK_DISABLED is false, enables or disables northbound tracing
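
Spring resolves placeholders of the form ${NAME:default} by looking up NAME and falling back to the default when it is not set. The following is a minimal sketch of that lookup, not Spring's actual resolver (which also handles nesting and multiple property sources). The class and method names are hypothetical and used only for illustration.

```java
import java.util.Map;

// Hypothetical helper: a simplified model of "${NAME:default}" resolution.
final class PlaceholderSketch {

    // Returns the value of NAME from env, or the default after ':' if NAME is absent.
    static String resolve(String placeholder, Map<String, String> env) {
        if (!placeholder.startsWith("${") || !placeholder.endsWith("}")) {
            return placeholder; // not a placeholder: return it unchanged
        }
        String body = placeholder.substring(2, placeholder.length() - 1);
        int colon = body.indexOf(':');
        String name = colon < 0 ? body : body.substring(0, colon);
        String fallback = colon < 0 ? null : body.substring(colon + 1);
        String value = env.get(name);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Uses the real process environment; prints the default when the variable is unset.
        System.out.println(resolve("${ONAP_TRACING_SOUTHBOUND:true}", System.getenv()));
    }
}
```

With no ONAP_TRACING_SOUTHBOUND variable set, the sketch falls back to the default after the colon; setting the variable overrides it without touching the application.yaml.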

B) Environment Variable

Set the following environment variables; this way you don't need to change the application.yaml and rebuild the Docker image.

Code Block
languageyml
themeMidnight
ONAP_SDK_DISABLED=false
ONAP_TRACING_SOUTHBOUND=true
OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED=true

  1. ONAP_SDK_DISABLED: enables or disables tracing altogether
  2. ONAP_TRACING_SOUTHBOUND: if ONAP_SDK_DISABLED is false, enables or disables southbound tracing
  3. OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED: if ONAP_SDK_DISABLED is false, enables or disables northbound tracing


Possible Combinations 

So we can have the following combinations:

Tracing | Northbound | Southbound | Flags
(error) | (error) | (error) | ONAP_SDK_DISABLED=true
(tick) | (tick) | (tick) | ONAP_SDK_DISABLED=false; ONAP_TRACING_SOUTHBOUND=true; OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED=true
(tick) | (tick) | (error) | ONAP_SDK_DISABLED=false; ONAP_TRACING_SOUTHBOUND=false; OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED=true
(tick) | (error) | (tick) | ONAP_SDK_DISABLED=false; ONAP_TRACING_SOUTHBOUND=true; OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED=false
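
The combinations above can be summarised as a small truth function: the SDK flag gates everything, and the other two flags only matter when it is false. The sketch below only illustrates that logic; the class, record and method names are hypothetical and are not taken from the a1-pms code base.

```java
// Hypothetical illustration of how the three flags combine.
final class TracingFlags {

    // Effective state: whether tracing is on at all, and in which direction.
    record Effective(boolean tracing, boolean northbound, boolean southbound) {}

    static Effective resolve(boolean sdkDisabled, boolean southbound, boolean webfluxEnabled) {
        boolean tracing = !sdkDisabled; // ONAP_SDK_DISABLED=true switches everything off
        return new Effective(tracing, tracing && webfluxEnabled, tracing && southbound);
    }

    public static void main(String[] args) {
        System.out.println(resolve(true, true, true));   // everything off
        System.out.println(resolve(false, false, true)); // northbound only
        System.out.println(resolve(false, true, false)); // southbound only
    }
}
```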


Tracing Test


View file: application_configuration.json.nosdnc
View file: docker-compose.yaml
View file: tracing_demo.mp4


a) A Docker Compose file with a1pms, a1-osc-simulator, and Jaeger, which acts as a collector and exporter. Note: onap/ccsdk-oran-a1policymanagementservice:1.7.0-SNAPSHOT is built locally by running "mvn clean install"; you can use the Nexus-hosted image instead by changing the prefix.

Code Block
languageyml
themeMidnight
version: '3.7'
services:
  a1_policy_management:
    container_name: a1-pms
    image: onap/ccsdk-oran-a1policymanagementservice:1.7.0-SNAPSHOT
    ports:
      - "8433:8433"
      - "8081:8081"
    volumes:
      - ./application_configuration.json.nosdnc:/opt/app/policy-agent/data/application_configuration.json:ro
    networks:
      - jaeger-example
    depends_on:
      - jaeger
    environment:
      - ONAP_SDK_DISABLED=false
      - ONAP_TRACING_SOUTHBOUND=true
      - OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED=true
      - ONAP_OTEL_SAMPLER_JAEGER_REMOTE_ENDPOINT=http://jaeger:14250
      - ONAP_OTEL_EXPORTER_ENDPOINT=http://jaeger:4317
      - ONAP_OTEL_EXPORTER_PROTOCOL=grpc
      - ONAP_OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc

  a1-sim-OSC:
    image: "nexus3.o-ran-sc.org:10002/o-ran-sc/a1-simulator:2.1.0"
    container_name: a1-sim-OSC
    ports:
      - "30001:8085"
      - "30002:8185"
    environment:
      - A1_VERSION=OSC_2.1.0
      - REMOTE_HOSTS_LOGGING=1
      - ALLOW_HTTP=true
    networks:
      - jaeger-example

  jaeger:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger
    ports:
      - "16686:16686"
      - "14250:14250"
      - "14268:14268"
      - "4317:4317"
      - "4318:4318"
    environment:
      - JAEGER_DISABLED=true
      - LOG_LEVEL=debug
      - COLLECTOR_OTLP_ENABLED=true
    networks:
      - jaeger-example

networks:
  jaeger-example:
    driver: bridge

b) The application_configuration.json.nosdnc in the same folder

Code Block
languageyml
themeMidnight
{
   "description":"Application configuration",
   "config":{
      "ric":[
         {
            "name":"ric1",
            "baseUrl":"https://a1-sim-OSC:8185/",
            "managedElementIds":[
               "kista_1",
               "kista_2"
            ]
         }
      ]
   }
}



...

c) Creating a PolicyType in the simulator

Code Block
languagebash
themeMidnight
curl -v -X 'PUT' \
   'http://localhost:30001/a1-p/policytypes/1' \
   -H 'accept: application/json' \
   -H 'Content-Type: application/json' \
   -d '{
    "name":"pt1",
    "description":"pt1 policy type",
    "policy_type_id":1,
    "create_schema":{
       "$schema":"http://json-schema.org/draft-07/schema#",
       "title":"OSC_Type1_1.0.0",
       "description":"Type 1 policy type",
       "type":"object",
       "properties":{
          "scope":{
             "type":"object",
             "properties":{
                "ueId":{
                   "type":"string"
                },
                "qosId":{
                   "type":"string"
                }
             },
             "additionalProperties":false,
             "required":[
                "ueId",
                "qosId"
             ]
          },
          "qosObjectives":{
             "type":"object",
             "properties":{
                "priorityLevel":{
                   "type":"number"
                }
             },
             "additionalProperties":false,
             "required":[
                "priorityLevel"
             ]
          }
       },
       "additionalProperties":false,
       "required":[
          "scope",
          "qosObjectives"
       ]
    }
}'

...

d) Creating a policy in a1-pms, after the policy type has been successfully registered (check with: curl http://localhost:8081/a1-policy/v2/policy-types)
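
For readers who prefer to run the registration check from Java rather than curl, the following is a minimal sketch using the JDK's built-in HTTP client. The URL is the one from the step above; the class and method names are hypothetical, and the sketch assumes a1-pms is reachable on localhost:8081 as in the Docker Compose setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: queries the policy types known to a1-pms.
final class PolicyTypeCheck {

    // URL taken from the step above; adjust host and port to your deployment.
    static final String POLICY_TYPES_URL = "http://localhost:8081/a1-policy/v2/policy-types";

    // Builds the GET request without sending it.
    static HttpRequest buildRequest() {
        return HttpRequest.newBuilder(URI.create(POLICY_TYPES_URL))
                .header("accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Sends the request and prints the status code and raw response body.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```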

...

e) Load the Jaeger UI at http://localhost:16686/ and look up the a1-pms traces; a sample of the last call would be:
[Image: sample trace of the last call in the Jaeger UI]


Steps Taken and Challenges:


Adding Telemetry to a1policymanagementservice: The application uses the WebClient from Spring WebFlux to forward calls from the northbound interface to a southbound interface (for the latter, an A1-OSC simulator has been used).

...

Then, in the non-Spring class, use var context = ApplicationContextProvider.getApplicationContext().getBean(OtelConfig.class); and, if tracing is enabled, add the tracing filters.

4. The ApplicationContextProvider class was removed because it can cause issues in different environments. In rare cases it was null at startup time (if the dependent classes were initialized first). So the approach changed to wrapping the AsyncWebClient build function in a @Service

...

with the SpringWebfluxTelemetry bean injected via @Autowired(required = false), in case telemetry is disabled and the bean is not created

Code Block
languagejava
themeMidnight
collapsetrue
@Service
@DependsOn({"otelConfig"})
public class WebClientUtil {

    private static OtelConfig otelConfig;
    private static SpringWebfluxTelemetry springWebfluxTelemetry;

    public WebClientUtil(OtelConfig otelConfig, @Autowired(required = false) SpringWebfluxTelemetry springWebfluxTelemetry) {
        WebClientUtil.otelConfig = otelConfig;
        if (otelConfig.isTracingEnabled()) {
            WebClientUtil.springWebfluxTelemetry = springWebfluxTelemetry;
        }
    }

5. We used opentelemetry-springboot-starter; we noticed that more information is traced automatically when this dependency is enabled. So we control this dependency in the application yaml under the otel properties.


NOTES:

1. Using the ObservationRegistryCustomizer would still trace manual /actuator calls, but it was kept in to keep the unit tests running


Code Block
languagejava
collapsetrue
    ObservationRegistryCustomizer<ObservationRegistry> skipActuatorEndpointsFromObservation() {
        PathMatcher pathMatcher = new AntPathMatcher("/");
        return registry -> registry.observationConfig().observationPredicate(observationPredicate(pathMatcher));
    }

    static ObservationPredicate observationPredicate(PathMatcher pathMatcher) {
        return (name, context) -> {
            if (context instanceof ServerRequestObservationContext observationContext) {
                return !pathMatcher.match("/actuator/**", observationContext.getCarrier().getRequestURI());
            } else {
                return !SCHEDULED_TASK_NAME.equals(name);
            }
        };
    }


It's worth mentioning that, if you use the Spring Boot auto-configuration:

  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
  </dependency>

you can follow the steps described at:
https://opentelemetry.io/docs/zero-code/java/spring-boot-starter/sdk-configuration/#exclude-actuator-endpoints-from-tracing


2. To retrieve multiple spans and enable automatic context propagation to the ThreadLocals used by Flux and Mono operators, we call Hooks.enableAutomaticContextPropagation(); but only if tracing is enabled.

https://docs.micrometer.io/context-propagation/reference/index.html

  <dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>context-propagation</artifactId>
  </dependency>

3. When telemetry was disabled, micrometer-tracing-bridge-otel would still try to export spans, so we decided to use one flag to control both Micrometer and OpenTelemetry.

The flag controlling it is:

management:
  tracing:
    enabled: true

Example of polluted logs when disabling only opentelemetry beans:

Code Block
languagebash
collapsetrue
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope attached event [ScopeAttached{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope changed event [ScopeAttached{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope closed event [io.micrometer.tracing.otel.bridge.EventPublishingContextWrapper$ScopeClosedEvent@56db4345]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope closed event [io.micrometer.tracing.otel.bridge.EventPublishingContextWrapper$ScopeClosedEvent@56db4345]
2024-06-16 18:55:19 2024-06-16 17:55:19.061 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope restored event [ScopeRestored{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.061 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope restored event [ScopeRestored{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.062 [ERROR] [OkHttp http://localhost:4318/...] i.o.e.i.h.HttpExporter - Failed to export spans. The request could not be executed. Full error message: Failed to connect to localhost/127.0.0.1:4318
2024-06-16 18:55:19 java.net.ConnectException: Failed to connect to localhost/127.0.0.1:4318
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:297)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
2024-06-16 18:55:19     at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
2024-06-16 18:55:19     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2024-06-16 18:55:19     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2024-06-16 18:55:19     at java.base/java.lang.Thread.run(Thread.java:833)
2024-06-16 18:55:19 Caused by: java.net.ConnectException: Connection refused
2024-06-16 18:55:19     at java.base/sun.nio.ch.Net.pollConnect(Native Method)
2024-06-16 18:55:19     at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
2024-06-16 18:55:19     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
2024-06-16 18:55:19     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
2024-06-16 18:55:19     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
2024-06-16 18:55:19     at java.base/java.net.Socket.connect(Socket.java:633)
2024-06-16 18:55:19     at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
2024-06-16 18:55:19       ... 18 common frames omitted
2024-06-16 18:55:19 2024-06-16 17:55:19.066 [DEBUG] [BatchSpanProcessor_WorkerThread-1] i.o.s.t.e.BatchSpanProcessor - Exporter failed

4. Flags to enable/disable northbound or southbound interfaces
Since we used the Java Spring Boot starter library from OpenTelemetry, we can use its flags to enable or disable instrumentation libraries.

https://opentelemetry.io/docs/zero-code/java/agent/configuration/#suppressing-specific-agent-instrumentation

According to the documentation, the flag OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED can be used to enable or disable the automatic Spring instrumentation. We keep a separate manual flag, ONAP_TRACING_SOUTHBOUND, for the AsyncRestClient requests made to the southbound.

System property: otel.instrumentation.[name].enabled
Environment variable: OTEL_INSTRUMENTATION_[NAME]_ENABLED
Note: When using OpenTelemetry environment variables (everything starting with OTEL), dashes (-) should be converted to underscores (_). For example, to suppress traces from the spring-webflux library, set OTEL_INSTRUMENTATION_SPRING_WEBFLUX_ENABLED to false.
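
The naming rule in the note can be sketched as a one-line conversion: upper-case the property name and replace dots and dashes with underscores. The class and method names below are hypothetical and serve only to illustrate the mapping; they are not part of a1-pms or of the OpenTelemetry API.

```java
import java.util.Locale;

// Hypothetical helper: maps an OpenTelemetry property name to its environment-variable form.
final class OtelEnvNames {

    // Upper-cases the name and replaces '.' and '-' with '_'.
    static String toEnvVar(String propertyName) {
        return propertyName.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(toEnvVar("otel.instrumentation.spring-webflux.enabled"));
    }
}
```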


Full Tracing:
[Image: full tracing output]

Only Southbound Tracing Output:
[Image: southbound-only tracing output]

Only Northbound Tracing Output:
[Image: northbound-only tracing output]