SDC - sdc-BE startup failed

While trying to start the SDC VM, I can't get the health check for sdc-BE to succeed. When I look in the jetty log files, I immediately see these lines:

2017-04-24 12:30:52.947:INFO:oejs.SetUIDListener:main: Setting umask=02
2017-04-24 12:30:55.021:INFO:oejs.SetUIDListener:main: Opened ServerConnector@643b1d11{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2017-04-24 12:30:55.115:INFO:oejs.SetUIDListener:main: Opened ServerConnector@2ef5e5e3{SSL,[ssl, http/1.1]}{0.0.0.0:8443}
2017-04-24 12:30:55.143:INFO:oejs.SetUIDListener:main: Setting GID=999
2017-04-24 12:30:55.187:INFO:oejs.SetUIDListener:main: Setting UID=999
2017-04-24 12:30:55.653:INFO:oejs.Server:main: jetty-9.3.15.v20161220
2017-04-24 12:30:59.327:INFO:oejdp.ScanningAppProvider:main: Deployment monitor [file:///var/lib/jetty/webapps/] at interval 1
2017-04-24 12:41:12.935:INFO:oeja.AnnotationConfiguration:main: Scanning elapsed time=435140ms
2017-04-24 12:41:13.023:WARN:oejw.WebAppContext:main: Failed startup of context o.e.j.w.WebAppContext@52a86356{/onboarding-be-1.1.0,file:///var/lib/jetty/temp/jetty-0.0.0.0-8080-onboarding-be-1.1.0.war-_onboarding-be-1.1.0-any-4636361256497583740.dir/webapp/,UNAVAILABLE}{/onboarding-be-1.1.0.war}
java.lang.Exception: Timeout scanning annotations
<stack trace deleted>
        at org.eclipse.jetty.start.Main.start(Main.java:457)
        at org.eclipse.jetty.start.Main.main(Main.java:75)
2017-04-24 12:59:37.324:INFO:oeja.AnnotationConfiguration:main: Scanning elapsed time=784044ms
2017-04-24 12:59:37.386:WARN:oejw.WebAppContext:main: Failed startup of context o.e.j.w.WebAppContext@5b0abc94{/catalog-be-1.1.0,file:///var/lib/jetty/temp/jetty-0.0.0.0-8080-catalog-be-1.1.0.war-_catalog-be-1.1.0-any-28835300833789324.dir/webapp/,UNAVAILABLE}{/catalog-be-1.1.0.war}
java.lang.Exception: Timeout scanning annotations
        at org.eclipse.jetty.annotations.AnnotationConfiguration.scanForAnnotations(AnnotationConfiguration.java:578)

Any idea what went wrong?

Btw, all

curl http://localhost:8080/sdc2/rest/healthCheck

curl http://localhost:8080/sdc2/rest/healthCheck

curl -s -X GET -H "Accept: application/json" -H "Content-Type: application/json" -H "USER_ID: jh0003" "http://localhost:8080/sdc2/rest/v1/user/demo"

return a 404


Josef Reisinger
- Apr 25, 2017
I re-installed the SDC manually from the structure in the RC heat template (i.e. deleting the docker containers, configuring /opt/config, etc.) and started the asdc_install.sh file manually. The results seem to be the same:
- same error messages above (Timeout scanning annotations)
- docker_health.sh returns 404 (not found) for FE/BE and "Error [12] while user existence check"
Michael Lando
- Apr 25, 2017
hi,
can you please describe the flow you are using to set up the vm and start us?
in addition from a mail you sent to the distribution list you said that you tried the 1.1 version and the 1.0 and both gave you the same behavior?
Josef Reisinger
- Apr 26, 2017
@Michael Lando Initially, I started the VMs out of cloud-init, later using asdc_vm_init.sh. Regarding 1.0 vs. 1.1: that was the artifacts_version, which I mixed up with the docker image version ... sorry

Josef Reisinger

Apr 26, 2017

Having set the increased timeout for the scanning (thanks to Kiran Kamineni), I see sdc-FE starting and responding to the health status curl commands.

After about 20 min of initialization, sdc-BE runs into another error, which I need help to understand.

com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily distributionstatusevent

com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily distributionnotificationevent

com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily distributiondeployevent

2017-04-26 11:20:48.509:WARN:oejw.WebAppContext:main: Failed startup of context o.e.j.w.WebAppContext@322e49ee{/,file:///var/lib/jetty/temp/jetty-0.0.0.0-8080-catalog-be-1.1.0.war-_catalog-be-1.1.0-any-2306941777608968246.dir/webapp/,UNAVAILABLE}{/catalog-be-1.1.0.war}
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'userBusinessLogic': Injection of resource dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'componentUtils': Injection of resource dependencies failed; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'auditingManager': Unsatisfied dependency expressed through field 'cassandraDao'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'audit-cassandra-dao': Invocation of init method failed; nested exception is java.lang.RuntimeException: Error preparing queries for accessor AuditAccessor

... and more

If I find a way to attach the logfile, I will do so.

Thanks in advance for your help.

Michael Lando
- Apr 30, 2017
this exception looks like the keyspace in Cassandra was not created correctly.
when you set up the environment you did not make modifications to the heat env file, correct?
my issue here is that by default you are using the certified version 1.0, which was tested and runs successfully. we did not see any issue with the annotations timeout or with the cassandra keyspace.
can you please run docker ps on the sdc vm so I can see which version you are using?
Josef Reisinger
- May 02, 2017
I initially had the same thought as you. Digging into cassandra with cqlsh has shown that the keyspace exists.
I separated the steps in docker_run.sh to start each container manually and look at the log files. It seemed to me as if sdc-BE requests the services from sdc-cs before sdc-cs is fully initialized. On my installation, the startup of sdc-cs took quite a bit longer than the time it takes sdc-BE to start up and request services from cassandra. This seems to be a real race condition.
To rule this out, I restarted the containers once again and waited for sdc-cs to settle before I created the sdc-BE container. Even inside the container I suspected some race conditions, as /docker-entrypoint.sh is started asynchronously, as are two python scripts, followed by a health check script which waits max 100 sec for the backend to come up. That doesn't happen; the backend needs >20 minutes to respond to the URL in the health script.
With further patches I was able to stop startup.sh after starting /docker-entrypoint.sh. Now I do not get any cassandra related errors, but while initializing the spring framework I get a connection refused error to mrouter:3904.
You might guess I am a bit out of ideas what to try next...
I also suspected that the compute node host is too small, but looking at top and/or iotop, I can see the CPU 75% idle and close to no disk activity.
I'll keep you posted with my newest findings and likewise I appreciate any help ...
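
The "wait for sdc-cs to settle before creating sdc-BE" step described above could be scripted roughly as follows. This is only a sketch under assumptions: the function names port_open and wait_for_port are made up for illustration, bash and the coreutils timeout command are assumed to be available on the VM, and port 9042 (Cassandra's default CQL port) is assumed to be reachable from where the script runs. An open port also does not prove that the keyspace and tables already exist.

```shell
# Sketch only: block until a TCP port accepts connections, then continue.
# port_open and wait_for_port are hypothetical helper names.

# port_open HOST PORT: succeeds if a TCP connection opens within 2 seconds.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# wait_for_port HOST PORT RETRIES DELAY_SECONDS
# Returns 0 as soon as the port is open, 1 if all attempts fail.
wait_for_port() {
  wfp_host=$1
  wfp_port=$2
  wfp_retries=$3
  wfp_delay=$4
  wfp_i=0
  while [ "$wfp_i" -lt "$wfp_retries" ]; do
    if port_open "$wfp_host" "$wfp_port"; then
      return 0
    fi
    wfp_i=$((wfp_i + 1))
    sleep "$wfp_delay"
  done
  return 1
}

# Example with illustrative values: allow Cassandra up to 30 minutes
# (180 attempts, 10 s apart) before the sdc-BE container is created.
# wait_for_port localhost 9042 180 10 || echo "sdc-cs did not come up" >&2
```

The same helper could be pointed at mrouter:3904 to see whether the connection refused error is also a matter of startup order.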


2 answers

Kiran Kamineni
Apr 25, 2017
Looks like the annotation scanning is timing out. Try setting the following option in your FE and BE docker containers.
-Dorg.eclipse.jetty.annotations.maxWait=<timeout in seconds>
Josef Reisinger
Apr 26, 2017
Thanks Kiran.. I tried this on FE and it seems to have helped there. On BE, it helps also to overcome the scanning timeout, but unfortunately I run into another error (see below);
Dave Chen
Jun 29, 2017
Hi Kiran and Josef, could you please tell me where the option can be set? I think I should modify jetty*.xml to add the option, but I haven't found the configuration file.
Josef Reisinger
Jun 29, 2017
From my notes when I had the issue (not sure they are 100% accurate, but should show the principle):
- To increase the timeout, a parameter -Dorg.eclipse.jetty.annotations.maxWait=<sec> needs to be added to the java command which starts jetty
- The file is in the image, which is pulled by the startup script
- Everything is kind-of non-changeable in automatic startup
- No local hooks to inject any code to analyze the startup of the containers
- Needs high manual intervention
- “Dirty” patch of a script file in the container immediately after start
- There are a few seconds (or even minutes) in the VM context to patch the start file of Jetty before it is once again patched (by a file in the container) in the sdc-BE context
In docker_run.sh, after the docker run command

docker run --detach --name sdc-FE --env HOST_IP=${IP} --env ENVNAME="${DEP_ENV}" --log-driver=json-file --log-opt max-size=100m --log-opt max-file=10 --ulimit memlock=-1:-1 --memory 2g --memory-swap=2g --ulimit nofile=4096:100000 --volume /etc/localtime:/etc/localtime:ro --volume /data/logs/FE/:/var/lib/jetty/logs --volume /data/environments:/root/chef-solo/environments --publish 9443:9443 --publish 8181:8181 ${NEXUS_DOCKER_REPO}/openecomp/sdc-frontend:${RELEASE}

add two other docker commands like below:

docker exec sdc-FE sed -i '/^set -e/aJAVA_OPTIONS=\"-Dorg.eclipse.jetty.annotations.maxWait=7200 $JAVA_OPTIONS\"' /docker-entrypoint.sh
docker exec sdc-FE head -20 /docker-entrypoint.sh

The second command is just to confirm that the sed command was successful
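
Since docker_run.sh may run more than once, it may be worth making the patch idempotent so the JAVA_OPTIONS line is not inserted twice. A minimal sketch, assuming GNU sed and a script file that contains a "set -e" line as in the sed command above; the function name patch_max_wait is made up for illustration, and it works on a file path, so inside the container it would have to be run via docker exec or on a copy of /docker-entrypoint.sh:

```shell
# Sketch only: insert the maxWait option after the "set -e" line, once.
# patch_max_wait is a hypothetical helper name; GNU sed is assumed.

# patch_max_wait FILE SECONDS
patch_max_wait() {
  pmw_file=$1
  pmw_seconds=$2
  # Skip the edit if an earlier run already added the option.
  if grep -q 'org.eclipse.jetty.annotations.maxWait' "$pmw_file"; then
    return 0
  fi
  # Append a JAVA_OPTIONS assignment right after the "set -e" line.
  sed -i "/^set -e/a JAVA_OPTIONS=\"-Dorg.eclipse.jetty.annotations.maxWait=${pmw_seconds} \$JAVA_OPTIONS\"" "$pmw_file"
}

# Example with illustrative values, run against a local copy of the script:
# patch_max_wait ./docker-entrypoint.sh 7200
```

Running it a second time leaves the file unchanged, because the grep check finds the option from the first run.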
Josef Reisinger
May 10, 2017
The observations above may be the result of having a lab infrastructure for which the ONAP stack is a bit too heavy and the run time assumptions too optimistic.
I saw a couple of occurrences where the run time makes a difference:
- the annotation timeout above: I saw times of 300..400 sec, which requires applying the parameter
- the waiting time in the /root/chef-solo/cookbooks/sdc-normatives/files/default/check_Backend_Health.py script: the script times out after 100 sec; I think I saw times of ~360 sec for the backend to be seen as online
- the time between the startup of containers in docker_run.sh: 25.45 sec vs. the same time in minutes to start up especially cassandra and sdc-BE
I have changed a few files in the SDC to cope with the longer run time. Changes include:
- changing JAVA_OPTIONS in /root/startup.sh
- changing the loop counter and sleep time in check_Backend_Health.py
- changing the docker run command to use the changed image for the sdc-BE container
- changing asdc_vm_init.sh to avoid overwriting the just changed docker_run.sh
With these changes in place, I am able to reboot the SDC VM and get SDC up and running.
If there is interest, I will post the entire procedure as part of this response. Feel free to reach out to me if you have questions.
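
To illustrate the "loop counter and sleep time" idea, here is a shell sketch of a health poll with a configurable number of attempts and delay. It is not the check_Backend_Health.py script itself; the function names http_status and wait_for_backend and the retry values are made up for illustration, and the URL is the health check URL from the question.

```shell
# Sketch only: poll the backend health check with a longer overall timeout.
# http_status and wait_for_backend are hypothetical helper names.

# http_status URL: print only the HTTP status code of a GET request
# (curl prints 000 if no connection could be made).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# wait_for_backend URL RETRIES DELAY_SECONDS
# Returns 0 on the first HTTP 200, 1 if every attempt fails.
wait_for_backend() {
  wfb_url=$1
  wfb_retries=$2
  wfb_delay=$3
  wfb_i=0
  while [ "$wfb_i" -lt "$wfb_retries" ]; do
    if [ "$(http_status "$wfb_url")" = "200" ]; then
      return 0
    fi
    wfb_i=$((wfb_i + 1))
    sleep "$wfb_delay"
  done
  return 1
}

# Example with illustrative values: 120 attempts, 10 s apart (about 20 minutes).
# wait_for_backend http://localhost:8080/sdc2/rest/healthCheck 120 10
```

The total wait is roughly retries times delay, so on a slow lab host the two values can be raised together until they cover the observed startup time.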