  1. Action point follow-up
    1. Morgan Richomme review Integration simu and release versions if possible + update doc accordingly
      1. not done, will be done after doc, stability and resiliency testing #reaction
    2. Lukasz Rajewski properly tag SO bugs to reference them as blocking in Integration blocking table
      1. SO bugs do not appear in Honolulu Integration Blocking Points => to be cross-checked with Lukasz Rajewski
    3. if SDNC-1515 merged and pnf-simu OK in honolulu, Morgan Richomme add pnf-macro to CI
      1. honolulu daily shows that the basic_vm_macro issue is not fully fixed => #action Morgan Richomme create a JIRA for basic_vm_macro
    4. Morgan Richomme grant access to Christophe Closset to check the status of SDC/DB
      1. done see section on SDC stability test
    5. Morgan Richomme retry and wait for the 5 minutes (Eviction timeout) then investigate on the networking issue
      1. done, Jira completed
    6. Morgan Richomme check ansible image availability for chained-ci
      1. done. Andreas Geissler gave it a new try; still an issue, but not related to the image. Open question: shall we also reference the image in the CI page of the release?
  2. Admin
    1. INFO.yaml update needed (NS repo has no yaml + status of the project)
    2. Committer Promotion Request for Bartosz Gardziejewski: please vote before the end of the week. BTW, if you have other committer proposals, feel free to suggest them (to replace Pawel, Marcin and partly Thierry)
  3. Honolulu
    1. status:board review https://jira.onap.org/secure/RapidBoard.jspa?rapidView=229
    2. release testsuite 1.8.0 being built
      1. in progress https://gerrit.onap.org/r/c/testsuite/+/120960, once merged, submit a patch to OOM
    3. Remaining work on tests
      1. macro issues Lukasz Rajewski
        1. pnf-macro test to be done on master before integration in CI (working fine in guilin)
        2. basic_vm_macro good results on guilin but not so good on master/honolulu => JIRA created: SDNC-1529
      2. tern: test run in CI but results not yet pushed to LF backend => WIP to properly manage the push to LF backend. On the last honolulu and master-weekly runs, artifacts were produced and pushed manually #action user-98b79 finalize push to LF backend
      3. stability/resiliency tests

        1. Resiliency test

          1. worker node restart (thanks Bartek Grzybowski for the full analysis) => how do we follow, shall we create Jiras?
          2. test done, eviction looks good. Some issues detected on some pods, but not always trivial to reproduce; they may depend on the associated evicted pods. Sometimes Init error, sometimes pod Running but with an exception, sometimes the first eviction looks good but the second seems to fail. The different cases are referenced in Honolulu Resiliency and Backup and Restore test
          3. for the SDC issue, the restart may be due to the fact that SDC is trying to recreate tables that already exist in cassandra
          4. #action Morgan Richomme create resiliency chapter in doc and initiate Jira for confirmed issue (appears at least 2 times), restart of a controller still to be done to complete the test.
        2. stability tests

          1. SDC tests: 72h, 5 parallel onboarding: INT-1912 - SDC Stability test: success rate drops after ~500 onboarding (Open)
          2. Christophe Closset reports that it was useful to detect new issues, especially the continuously increasing duration, which may be linked to the DB and/or a middleware layer (explicit exception seen)
          3. plan to integrate such test in weekly as part of benchmark test
          4. instantiation tests
            1. 48h, 5 parallel basic_vm (initial issue on SDNC (fixed by restart), then regular timeout value, then internal openstack issues): INT-1918 - Stability tests: // instantiation (Open)
            2. more timeouts than in guilin, to be confirmed with a single test (no parallelization); mariadb galera issue observed on some replicas
            3. 72h 1 test basic_vm => problem with mariadb galera, results will not be as good as in guilin run
            4. issue with mariadb galera after 24h (but the tests ran after the 48h tests). Once finished (Thursday morning) => reinstallation of honolulu weekly to reproduce the tests with the latest dockers
            5. mariadb galera workaround reported by Bell to OOM; it seems to improve the overall quality of the gate but triggers a new issue on CDS. Trade-off between redundancy and efficiency not clear
    4. documentation update started but not finished => https://gerrit.onap.org/r/c/integration/+/120236
    5. estimation: 2-3 days still needed to complete the doc / release note + testsuite integration
  4. Istanbul
    1. New Jira board created: https://jira.onap.org/secure/RapidBoard.jspa?rapidView=233
    2. robot refactoring: it seems we have a plan (Krzysztof Kuzmicki)
    3. discussion by mail, #action Krzysztof Kuzmicki create a wiki page in Istanbul page
    4. migration of DCAE from cloudify to helm chart..first analysis by Krzysztof Kuzmicki , impacts to be planned on several tests
    5. discussion by mail, #action Krzysztof Kuzmicki create a wiki page in Istanbul page
    6. New CSIT (AAI, CDS,..)
    7. AAI is following the SDC way (Maven based); for CDS it should be possible to reuse what Lasse Kaihlavirta suggested. AAI team contacted SDC
    8. brainstorm on new use cases for Istanbul open => https://etherpad.anuket.io/p/onap_integration_istanbul
    9. all: review the page. It is an etherpad, so do not hold back; put everything you have in mind
  5. AoB
    1. Lasse Kaihlavirta closed the CSIT refactoring EPIC
    2. Lasse Kaihlavirta also disabled useless SO CSIT tests (with very old images) - the Jira was closed without action by Seshu Kumar Mudiganti so we remove some jobs as they are meaningless (test with casablanca images)
    3. Lasse Kaihlavirta added a browser cleanup on robot healthcheck (vid) to save resources; it will be integrated in 1.8.0


  1. Action point follow-up
    1. Illia Halych review the official simu page (pythonsdk wrapper)
      1. done, no example for Honolulu, to be completed in Istanbul
    2. Morgan Richomme refine the information about the conditions of the resiliency tests of worker restart
      1. See Honolulu topic
    3. Lasse Kaihlavirta Offline discussions with Lasse about AAI CSIT
      1. Done
    4. Krzysztof Kuzmicki Next week proposal about what we can do with robot
      1. Postponed to next week
    5. Krzysztof Kuzmicki INT-1907 - browser_setup.robot does not provide proper teardown (Open): no teardown for browser-based checks, high consumption of resources at the end
      1. used by VID test, workaround to be confirmed
    6. Maybe we need more information about DMaaP simulator use and to write it better in the simulator doc → Reported by Lasse Kaihlavirta
      1. Globally we are not well structured on the simulators. In the release note of version X we should list all the versions of the simulators we used for the validation. It shall be done by the use case teams that developed a simu and/or the integration team. The difficulty is to get the full view of the simulators: some are hidden in repositories, some are unmaintained... #action Morgan Richomme review Integration simu and release versions if possible + update doc accordingly

    7. Need of consolidation of versioning of images of simulators used in CSIT  → Reported by Lasse Kaihlavirta
      1. sure, it is a bit messy and impossible to maintain. Integration can clean up from time to time, but the best way for functional tests => bring functional tests, including simu, into their own repo. And if the simu can be used more widely, create a dedicated repo under integration/simulators
  2. Admin
    1. INFO.yaml update needed (NS repo has no yaml + status of the project)
      1. strange, INFO.yaml is needed to create the repo => to be cross-checked. Morgan Richomme indicates to Thomas Kulik that it will be done with the next INFO update (needed as at least 2 committers announced they will stop after Honolulu)
    2. thanks to Pawel and Marcin for their contributions and all the best for the next challenges
  3. Honolulu
    1. status:board review https://jira.onap.org/secure/RapidBoard.jspa?rapidView=229
    2. Remaining work on tests
      1. pnf_macro => OK on daily guilin, what do we do regarding CI integration
        1. kudos to Michał Jagiełło - first pythonsdk tests with simulator - let's wait for the merge of Dan's fix for SDNC-1515, if OK #action Morgan Richomme add pnf-macro to CI

        2. still SO bugs preventing Lukasz Rajewski from completing his test... deja-vu?
        3. #action Lukasz Rajewski properly tag SO bugs to reference them as blocking in Integration blocking table
      2. tern: test run in CI
        1. test results available (launched manually) => https://logs.onap.org/onap-integration/weekly/onap_weekly_pod4_master/2021-04/19_18-36/security/tern/index.html
        2. next CI run shall be good..wait and see
      3. stability/resiliency tests

        1. I started the first resiliency tests: TEST-308 - [Resiliency] Evaluate ONAP behavior on k8s worker node restart (In Progress), replay done, mail sent to the community (network issue?)

          1. behavior different on Nokia (pod stuck forever in Terminating state when stopping a worker) and Samsung RKE2 clusters; discussion continues on the ticket. Bartek Grzybowski indicates that we shall be careful, especially with statefulsets; it could explain some issues
          2. #action morgan retry and wait for the 5 minutes (Eviction timeout) then investigate on the networking issue
        2. stability tests

          1. SDC tests: 72h, 5 parallel onboarding, started on the 20th of April. Wait and see...
            1. First tests were all OK but duration continuously increases. No error rate is high. Wait for the end of the tests to get the graphs.
            2. #action Morgan Richomme grant access to Christophe Closset to check the status of SDC/DB
          2. instantiation tests to be planned after onboarding tests
      4. consolidation of versions

        1. better exception catching + integration of the GUI => still action Morgan

      5. removal of the submodule: done

    3. documentation update started but not finished => https://gerrit.onap.org/r/c/integration/+/120236
    4. release testsuite 1.8.0 to be done?
      1. next week if no objection
    5. simulators: which dockers are available in nexus today? could we clarify the build chain?
      1. see previous topic
  4. Istanbul
    1. New Jira board created: https://jira.onap.org/secure/RapidBoard.jspa?rapidView=233
    2. discussion on future tests
      1. follow up integration AAI tests in CSIT
      2. CDS in CSIT?
  5. AoB
    1. Andreas Geissler mentioned that his CI chains were affected by the non availability of an ansible image used in the runner => could be a problem, need to release also such images somewhere to avoid such issues #action Morgan Richomme check ansible image availability for chained-ci (2.7.13)
      1. issue also on Orange chains last week (build chain of ansible image was broken, chain fixed)
    2. Christophe Closset raises a question on python2.7 in jenkins CI and in robot. Partly addressed by the robot refactoring initiative. Note: xtesting dockers are also still using python2.7 due to a dependency on python-utils, which has not evolved for a long time.

...