1. 10 May, 2021 8 commits
    • monitor: Implement edgetest regions · f9ff6660
      Łukasz Nowak authored
      Requests and publication are switched to a fully serialised approach, so that
      complex data structures like lists and objects can be safely transmitted.
      
      The requests are backward compatible, so for current and simple usage
      an automatic region is set up.
      
      Much more information is published to the main and shared instances, like
      lists of available and assigned regions. Regions can be added and removed,
      which is reflected automatically for the slaves.
      
      Information sent to the bot node is minimised to only what is needed.
      
      check-frontend-ip-list can be configured at the levels listed below, and each
      level overrides the previous one:
      
       * globally on cluster
       * default per region
       * globally per slave
       * specific per region on slave
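
      The override order above can be sketched as follows. This is only an
      illustration: the function name, its arguments and the assumption that a
      more specific level replaces the whole value (rather than merging lists)
      are hypothetical and not taken from the profile.

```python
# Hypothetical helper, not the profile's actual code.
def resolve_check_frontend_ip_list(cluster=None, region_default=None,
                                   slave=None, slave_region=None):
    """Return the most specific configured value, or None if nothing is set."""
    resolved = None
    # Ordered from least to most specific; a later level that is set
    # replaces whatever an earlier level provided (assumed semantics).
    for candidate in (cluster, region_default, slave, slave_region):
        if candidate is not None:
            resolved = candidate
    return resolved
```

      With these assumptions, a value set globally on the slave wins over the
      cluster-wide and per-region defaults, and is itself replaced only by a
      value set for a specific region on that slave.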
      
      Note: It's known that:
      
       {%-     set base_slave_dict = json_module.loads(slave.pop('_')) %} {#- XXX: Unsafe! #}
      
            is really unsafe, but for most usage of monitoring slaves it's
            considered good enough.
    • monitor: Review parameter names · 485c6939
      Łukasz Nowak authored
    • monitor: Indent and annotate profiles · 8b79e082
      Łukasz Nowak authored
    • monitor: Expose explicitly slave issue · e4d6d903
      Łukasz Nowak authored
    • monitor: Simplify edge configuration · 6982f750
      Łukasz Nowak authored
      Minimize the number of parameters passed from the master partition to the
      slaves, as slave configuration shall be as self-contained as possible.
      
      Note: This will make further development (like regionalization) much simpler.
    • monitor: Fix check-http-header-dict support · 3cafb865
      Łukasz Nowak authored
      Since json-in-xml is used, check-http-header-dict shall simply be a JSON
      object; there is no need to express it as a string.
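
      A minimal sketch of the difference, assuming the whole parameter set is
      carried as one JSON document under the `_` key (the key read by the
      profile snippet quoted in an earlier commit message). The header name and
      value, and the helper names, are made up for illustration.

```python
import json

# Illustrative header only; not a real monitoring configuration.
headers = {"X-Example-Header": "expected-value"}

# Before (assumed): the dict was itself encoded as a string inside the parameters.
old_params = {"check-http-header-dict": json.dumps(headers)}
# After: the dict is passed directly as a JSON object.
new_params = {"check-http-header-dict": headers}

def serialise(params):
    """Wrap all parameters into a single JSON document under '_'."""
    return {"_": json.dumps(params)}

def read_header_dict(serialised):
    """Decode the parameters and return check-http-header-dict as received."""
    return json.loads(serialised["_"])["check-http-header-dict"]
```

      With the new form, `read_header_dict(serialise(new_params))` yields a
      dict right away, while the old form yields a string that would need a
      second `json.loads` call.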
    • monitor: Switch to json-in-xml and update types · 64773d30
      Łukasz Nowak authored
      Note: This change is backward incompatible for requests.
    • XXX: Test only monitor · 4d2e52e0
      Łukasz Nowak authored
  2. 07 May, 2021 6 commits
    • NEO: save VM system log if the stress test fails · c9633501
      Julien Muchembled authored
      By using '/lib/systemd/systemd-journal-remote --output=...'
      the dump can be converted back to a file that is readable with
      'journalctl --file=...'
    • kvm: always enable discard for drives · bbb4d069
      Julien Muchembled authored
      Given that the default is to use raw for device drives and qcow2 for
      file drives, and that 'discard' was already enabled for raw, this
      commit is in practice mainly for file drives, for which QEMU processes
      TRIM by punching holes (i.e. the image file becomes sparse).
      
      This reduces disk usage not only by the process running the VM but
      also by backups and replicas (e.g. runner1).
      
      For qcow2 at least, the discarded space can be reused for other blocks
      and qemu compacts the image at dump (for backups) so images on replicas
      are not sparse.
      
      When a disk image format doesn't support it, QEMU ignores the option,
      so it's safe.
      
      Guest support of TRIM for virtio-blk is recent: Linux 5.0
    • kvm: do not try to correct disk-related parameters · 128a37e0
      Julien Muchembled authored
      The user must be aware of any mistake he made. Otherwise, he may lose
      time by not understanding why the VM does not behave as expected, or
      distort measurements in benchmarks.
      
      The only legitimate reason to automatically fix a parameter is backward
      compatibility, if a value is not valid anymore. But such a fallback
      should only be temporary. There has been no such case recently.
      
      Lastly, it increased maintenance by having to keep the lists of valid
      values up-to-date.
      
      About:
      
        -  if disk_info['io'] == 'native':
        -    additional_disk_options += ',cache.direct=on'
      
      These lines are redundant when cache is none.
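
      The redundancy can be checked against the way QEMU documents its cache
      modes for the -drive option: each mode is shorthand for three flags, and
      "none" already implies cache.direct=on. The table below is a sketch
      transcribed from that documentation; the dictionary and function names
      are hypothetical.

```python
# Flag values implied by each QEMU cache mode, as given in the QEMU
# documentation for -drive cache=...; treat as a sketch, not a reference.
CACHE_MODES = {
    "writeback":    {"cache.writeback": True,  "cache.direct": False, "cache.no-flush": False},
    "none":         {"cache.writeback": True,  "cache.direct": True,  "cache.no-flush": False},
    "writethrough": {"cache.writeback": False, "cache.direct": False, "cache.no-flush": False},
    "directsync":   {"cache.writeback": False, "cache.direct": True,  "cache.no-flush": False},
    "unsafe":       {"cache.writeback": True,  "cache.direct": False, "cache.no-flush": True},
}

def direct_already_implied(cache_mode):
    """True if the cache mode already sets cache.direct=on by itself."""
    return CACHE_MODES[cache_mode]["cache.direct"]
```

      Under this table, `direct_already_implied("none")` is true, so appending
      `,cache.direct=on` changes nothing in that case.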
    • Cloudooo cluster fixes · 7fc6aae8
      Jérome Perrin authored
       - make cloudooo services use the same name as in haproxy; this was off by one (service cloudooo-0 was cloudooo_0 in haproxy)
       - test cluster functionality (to catch the bug from nexedi/cloudooo!29)
       - update cloudooo with that fix
       - make sure we set an existing locale (by using `C.UTF-8`) and exercise conversions involving character encodings a bit
      
      See merge request nexedi/slapos!975
    • caddy-frontend: Fix profile issue · 2cb1cf57
      Łukasz Nowak authored
      By mistake, the "30" was not removed, which led to the activity timeout
      being 30 instead of the configured value.
  3. 06 May, 2021 1 commit
  4. 05 May, 2021 4 commits
  5. 04 May, 2021 1 commit
  6. 03 May, 2021 1 commit
  7. 30 Apr, 2021 4 commits
  8. 28 Apr, 2021 1 commit
  9. 27 Apr, 2021 5 commits
  10. 26 Apr, 2021 3 commits
  11. 21 Apr, 2021 6 commits