Jira issue: CCSDK-4010

TLDR: Tracing has been added to the A1 Policy Management Service. It is disabled by default. To enable it, change the following flag in application.yaml:

otel:
  sdk:
    disabled: false

or set the following environment variables:
ONAP_SDK_DISABLED=false
ONAP_TRACING_SOUTHBOUND=true
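
How the two variables are bound inside the service is not shown on this page. As a rough, dependency-free sketch of the intended semantics (tracing stays off unless ONAP_SDK_DISABLED is explicitly set to false), the logic could look like the following. The class TracingFlags, the default for ONAP_TRACING_SOUTHBOUND, and the rule that southbound tracing requires the SDK to be enabled are assumptions made for illustration only.

```java
import java.util.Map;

// Hypothetical helper, not taken from the A1 PMS code base.
final class TracingFlags {
    private final boolean sdkDisabled;
    private final boolean southboundTracing;

    private TracingFlags(boolean sdkDisabled, boolean southboundTracing) {
        this.sdkDisabled = sdkDisabled;
        this.southboundTracing = southboundTracing;
    }

    // Builds the flags from a map of environment variables (e.g. System.getenv()).
    static TracingFlags fromEnv(Map<String, String> env) {
        // The SDK counts as disabled unless the variable is explicitly "false".
        String disabledRaw = env.getOrDefault("ONAP_SDK_DISABLED", "true").trim();
        boolean disabled = !"false".equalsIgnoreCase(disabledRaw);
        // Assumed default (not confirmed by this page): southbound tracing is off when unset.
        String southRaw = env.getOrDefault("ONAP_TRACING_SOUTHBOUND", "false").trim();
        boolean south = "true".equalsIgnoreCase(southRaw);
        return new TracingFlags(disabled, south);
    }

    boolean tracingEnabled() {
        return !sdkDisabled;
    }

    // Assumption: southbound tracing only makes sense when the SDK itself is enabled.
    boolean southboundEnabled() {
        return !sdkDisabled && southboundTracing;
    }
}
```

Calling TracingFlags.fromEnv(System.getenv()) with neither variable set would report tracing as disabled, which matches the default described above.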



Tracing Test

a) A Docker Compose file with a1-pms, the a1-osc-simulator, and Jaeger, which acts as the collector and exporter:


version: '3.7'
services:
  a1_policy_management:
    container_name: a1-pms
    image: onap/ccsdk-oran-a1policymanagementservice:1.7.0-SNAPSHOT
    ports:
      - "8433:8433"
      - "8081:8081"
    volumes:
      - ./application_configuration.json.nosdnc:/opt/app/policy-agent/data/application_configuration.json:ro
    networks:
      - jaeger-example
    depends_on:
      - jaeger
    environment:
      - ONAP_SDK_DISABLED=false
      - ONAP_TRACING_SOUTHBOUND=true
      - ONAP_OTEL_SAMPLER_JAEGER_REMOTE_ENDPOINT=http://jaeger:14250
      - ONAP_OTEL_EXPORTER_ENDPOINT=http://jaeger:4317
      - ONAP_OTEL_EXPORTER_PROTOCOL=grpc
      - ONAP_OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
  a1-sim-OSC:
    image: "nexus3.o-ran-sc.org:10002/o-ran-sc/a1-simulator:2.1.0"
    container_name: a1-sim-OSC
    ports:
      - "30001:8085"
      - "30002:8185"
    environment:
      - A1_VERSION=OSC_2.1.0
      - REMOTE_HOSTS_LOGGING=1
      - ALLOW_HTTP=true
    networks:
      - jaeger-example
  jaeger:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger
    ports:
      - "16686:16686"
      - "14250:14250"
      - "14268:14268"
      - "4317:4317"
      - "4318:4318"
    environment:
      - LOG_LEVEL=debug
      - COLLECTOR_OTLP_ENABLED=true
    networks:
      - jaeger-example
networks:
  jaeger-example:
    driver: bridge

b) The application_configuration.json.nosdnc in the same folder

{
    "description":"Application configuration",
    "config":{
       "ric":[
          {
             "name":"ric1",
             "baseUrl":"https://a1-sim-OSC:8185/",
             "managedElementIds":[
                "kista_1",
                "kista_2"
             ]
          }
       ]
    }
 }

c) Creating a PolicyType in the simulator

curl -v -X 'PUT' \
   'http://localhost:30001/a1-p/policytypes/1' \
   -H 'accept: application/json' \
   -H 'Content-Type: application/json' \
   -d '{
    "name":"pt1",
    "description":"pt1 policy type",
    "policy_type_id":1,
    "create_schema":{
       "$schema":"http://json-schema.org/draft-07/schema#",
       "title":"OSC_Type1_1.0.0",
       "description":"Type 1 policy type",
       "type":"object",
       "properties":{
          "scope":{
             "type":"object",
             "properties":{
                "ueId":{
                   "type":"string"
                },
                "qosId":{
                   "type":"string"
                }
             },
             "additionalProperties":false,
             "required":[
                "ueId",
                "qosId"
             ]
          },
          "qosObjectives":{
             "type":"object",
             "properties":{
                "priorityLevel":{
                   "type":"number"
                }
             },
             "additionalProperties":false,
             "required":[
                "priorityLevel"
             ]
          }
       },
       "additionalProperties":false,
       "required":[
          "scope",
          "qosObjectives"
       ]
    }
}'

d) Creating a policy in a1-pms once the policy type has been successfully registered (check with curl http://localhost:8081/a1-policy/v2/policy-types)

curl -v -X 'PUT' \
  'http://localhost:8081/a1-policy/v2/policies' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "ric_id": "ric1",
  "policy_id": "aa8feaa88d944d919ef0e83f2172a51002",
  "transient": false,
  "service_id": "controlpanel",
    "policy_data": {
        "scope": {
            "ueId": "ue5100",
            "qosId": "qos5100"
        },
        "qosObjectives": {
            "priorityLevel": 5100.0
        }
    },
  "status_notification_uri": "http://callback-receiver:8090/callbacks/test",
  "policytype_id": "1"
}'

e) Load the Jaeger UI at http://localhost:16686/ and select the a1-pms traces. A sample trace of the last call would be:


Steps:


Adding Telemetry to a1policymanagementservice

The application uses the WebClient from Spring WebFlux to call the southbound interface from the northbound interface (an A1-OSC simulator was used for the southbound side).

https://opentelemetry.io/docs/zero-code/java/spring-boot-starter/out-of-the-box-instrumentation/#spring-webflux-autoconfiguration

The OpenTelemetry documentation describes a bean that customizes the default WebClient builder by adding tracing filters.

In our case the AsyncRestClient manually builds a WebClient for every asynchronous request.

The challenge was to add the tracing filters to this non-Spring class.

1. Adding an OpenTelemetry bean

    @Bean
    public OpenTelemetry openTelemetry() {
        return AutoConfiguredOpenTelemetrySdk.initialize().getOpenTelemetrySdk();
    }

This introduced a circular dependency: openTelemetryConfig defined in URL [jar:file:/opt/app/policy-agent/a1-policy-management-service.jar!/BOOT-INF/classes!/org/onap/ccsdk/oran/a1policymanagementservice/configuration/OpenTelemetryConfig.class

2. Adding the filters directly in AsyncRestClient rather than in a builder bean. However, AutoConfiguredOpenTelemetrySdk falls back to default parameters such as localhost:4317 for gRPC export, so we opted to build the exporter beans from the application.yaml parameters instead.

AsyncRestClient.java
		...
        OpenTelemetry openTelemetry = AutoConfiguredOpenTelemetrySdk.initialize().getOpenTelemetrySdk();
        var webfluxTelemetry = SpringWebfluxTelemetry.builder(openTelemetry).build();
        return WebClient.builder() //
				...
                .filters(webfluxTelemetry::addClientTracingFilter)
                .build();

3. A context provider class to make the ApplicationContext available in non-Spring components

import org.springframework.beans.BeansException;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.stereotype.Component;

@Component
public class ApplicationContextProvider implements ApplicationContextAware {
    private static ApplicationContext context;
 
    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        context = applicationContext;
    }
 
    public static ApplicationContext getApplicationContext() {
        return context;
    }
}

Then, in the non-Spring class, call var context = ApplicationContextProvider.getApplicationContext().getBean(OtelConfig.class); and add the tracing filters only if tracing is enabled.
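
The pattern of the provider class above can be sketched without Spring or OpenTelemetry on the classpath. In this simplified model, SettingsProvider plays the role of ApplicationContextProvider, OtelSettings stands in for the OtelConfig bean, and PlainClient stands in for AsyncRestClient. All three names, and the string-based "filter", are hypothetical and only illustrate the control flow; the real code adds the SpringWebfluxTelemetry client filter to a WebClient builder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Stand-in for the Spring-managed OtelConfig bean (accessor name is hypothetical).
final class OtelSettings {
    private final boolean tracingEnabled;

    OtelSettings(boolean tracingEnabled) {
        this.tracingEnabled = tracingEnabled;
    }

    boolean isTracingEnabled() {
        return tracingEnabled;
    }
}

// Stand-in for ApplicationContextProvider: a static holder populated once at startup.
final class SettingsProvider {
    private static volatile OtelSettings settings;

    static void set(OtelSettings newSettings) {
        settings = newSettings;
    }

    static OtelSettings get() {
        return settings;
    }
}

// Stand-in for AsyncRestClient: created with "new", so nothing can be injected into it.
final class PlainClient {
    private final List<UnaryOperator<String>> filters = new ArrayList<>();

    PlainClient() {
        OtelSettings current = SettingsProvider.get();
        // Add the tracing filter only if the holder is populated and tracing is enabled.
        if (current != null && current.isTracingEnabled()) {
            filters.add(request -> request + " [traced]");
        }
    }

    // Applies every registered filter to the outgoing request, in order.
    String send(String request) {
        String result = request;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }
}
```

The null check matters in this sketch because a static holder can be read before the container has populated it, for example in unit tests that construct the client directly.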



NOTES:

1. Even with the ObservationRegistryCustomizer in place, manual calls to /actuator are still traced, but the customizer was kept in order to keep the unit tests running.


    ObservationRegistryCustomizer<ObservationRegistry> skipActuatorEndpointsFromObservation() {
        PathMatcher pathMatcher = new AntPathMatcher("/");
        return registry -> registry.observationConfig().observationPredicate(observationPredicate(pathMatcher));
    }

    static ObservationPredicate observationPredicate(PathMatcher pathMatcher) {
        return (name, context) -> {
            if (context instanceof ServerRequestObservationContext observationContext) {
                return !pathMatcher.match("/actuator/**", observationContext.getCarrier().getRequestURI());
            } else {
                return !SCHEDULED_TASK_NAME.equals(name);
            }
        };
    }


It is worth mentioning that if you use the Spring Boot auto-configuration:

  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
  </dependency>

you can follow the steps below:
https://opentelemetry.io/docs/zero-code/java/spring-boot-starter/sdk-configuration/#exclude-actuator-endpoints-from-tracing


2. To retrieve multiple spans and enable automatic context propagation to the ThreadLocals used by Flux and Mono operators, we call Hooks.enableAutomaticContextPropagation(); only if tracing is enabled.

https://docs.micrometer.io/context-propagation/reference/index.html 

  <dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>context-propagation</artifactId>
  </dependency>
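
To illustrate why this propagation is needed, the following plain-JDK sketch shows a ThreadLocal value being lost when work moves to another thread, and then being carried over by an explicit capture-and-restore step. It does not use Reactor or Micrometer; the class ThreadLocalPropagationSketch and the CURRENT_TRACE variable are hypothetical stand-ins for the tracing library's thread-bound state. The Reactor hook is understood to automate a comparable capture and restore around operators.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

final class ThreadLocalPropagationSketch {
    // Stand-in for the "current span" that a tracing library keeps per thread.
    static final ThreadLocal<String> CURRENT_TRACE = new ThreadLocal<>();

    // Reads the ThreadLocal on a worker thread with no propagation at all.
    static String readWithoutPropagation(ExecutorService worker) {
        return CompletableFuture.supplyAsync(CURRENT_TRACE::get, worker).join();
    }

    // Captures the caller's value and restores it on the worker thread for the task.
    static String readWithPropagation(ExecutorService worker) {
        String captured = CURRENT_TRACE.get();
        return CompletableFuture.supplyAsync(() -> {
            String previous = CURRENT_TRACE.get();
            CURRENT_TRACE.set(captured);
            try {
                return CURRENT_TRACE.get();
            } finally {
                // Put the worker thread back into the state it had before the task.
                if (previous == null) {
                    CURRENT_TRACE.remove();
                } else {
                    CURRENT_TRACE.set(previous);
                }
            }
        }, worker).join();
    }
}
```

With CURRENT_TRACE set on the calling thread and a separate single-thread executor as the worker, the first method returns null while the second returns the caller's value. Without such propagation, spans created on different threads cannot be linked into one trace.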

3. When telemetry is disabled, micrometer-tracing-bridge-otel would still try to export spans, so we decided to use one flag to rule them both (Micrometer and OpenTelemetry).

The flag controlling it is:

management:
  tracing:
    enabled: true

Example of the polluted logs when only the OpenTelemetry beans are disabled:

2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope attached event [ScopeAttached{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope changed event [ScopeAttached{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope closed event [io.micrometer.tracing.otel.bridge.EventPublishingContextWrapper$ScopeClosedEvent@56db4345]
2024-06-16 18:55:19 2024-06-16 17:55:19.060 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope closed event [io.micrometer.tracing.otel.bridge.EventPublishingContextWrapper$ScopeClosedEvent@56db4345]
2024-06-16 18:55:19 2024-06-16 17:55:19.061 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JBaggageEventListener - Got scope restored event [ScopeRestored{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.061 [TRACE] [BatchSpanProcessor_WorkerThread-1] i.m.t.o.b.Slf4JEventListener - Got scope restored event [ScopeRestored{context: [span: null] [baggage: null]}]
2024-06-16 18:55:19 2024-06-16 17:55:19.062 [ERROR] [OkHttp http://localhost:4318/...] i.o.e.i.h.HttpExporter - Failed to export spans. The request could not be executed. Full error message: Failed to connect to localhost/127.0.0.1:4318
2024-06-16 18:55:19 java.net.ConnectException: Failed to connect to localhost/127.0.0.1:4318
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:297)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
2024-06-16 18:55:19     at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
2024-06-16 18:55:19     at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
2024-06-16 18:55:19     at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:517)
2024-06-16 18:55:19     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2024-06-16 18:55:19     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2024-06-16 18:55:19     at java.base/java.lang.Thread.run(Thread.java:833)
2024-06-16 18:55:19 Caused by: java.net.ConnectException: Connection refused
2024-06-16 18:55:19     at java.base/sun.nio.ch.Net.pollConnect(Native Method)
2024-06-16 18:55:19     at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
2024-06-16 18:55:19     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
2024-06-16 18:55:19     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
2024-06-16 18:55:19     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
2024-06-16 18:55:19     at java.base/java.net.Socket.connect(Socket.java:633)
2024-06-16 18:55:19     at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
2024-06-16 18:55:19     at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
2024-06-16 18:55:19     ... 18 common frames omitted
2024-06-16 18:55:19 2024-06-16 17:55:19.066 [DEBUG] [BatchSpanProcessor_WorkerThread-1] i.o.s.t.e.BatchSpanProcessor - Exporter failed