1. 14 Aug, 2020 1 commit
    • Use MsgPack instead of JSON, add command line arguments + bug fixes · 86c55efd
      Leo Le Bouter authored
      * Convert stat_result to proper dictionary so that field names are
        retained after serialization
      
      * Add ability to ignore directories through command line arguments,
        explicitly add "ignored" field on ignored directories
      
      It was decided that JSON was not a suitable format because its
      support for serializing bytes is lacking. MsgPack supports bytes and
      is more efficient; it is also the internal serialization format of
      Fluentd, which we will most probably use for ingesting data in a
      central place.
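The first bullet of the commit above (keeping `stat_result` field names) and the "ignored" field can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: `stat_to_dict`, `describe_dir`, and the record layout are hypothetical names chosen for the example.

```python
import os


def stat_to_dict(st):
    """Copy every st_* attribute of an os.stat_result into a plain dict.

    A stat_result behaves like a tuple, so a serializer that treats it as
    a sequence keeps the values but drops the field names.
    """
    return {name: getattr(st, name) for name in dir(st) if name.startswith("st_")}


def describe_dir(path, ignored_dirs):
    """Build one record per directory (hypothetical layout).

    Ignored directories are still reported, but carry an explicit
    "ignored" flag and no stat data.
    """
    if path in ignored_dirs:
        return {"path": path, "ignored": True}
    # lstat() describes the entry itself rather than a symlink target.
    return {"path": path, "ignored": False, "stat": stat_to_dict(os.lstat(path))}
```

A record built this way contains only plain dicts, strings, numbers, and booleans, so it can be handed to whichever serializer is in use without losing the field names.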
  2. 13 Aug, 2020 3 commits
    • do not follow symlinks in getxattr, close mp_pool first · 02a190aa
      Leo Le Bouter authored
      multiprocessing.Pool.close() ensures no new tasks can be submitted
      to the pool; the workers exit once the queued tasks finish. Even
      though AsyncResult.get() also waits for the tasks to finish, and our
      code structure shouldn't submit new tasks at that point, call
      close() first, then get(). Otherwise this could become error-prone
      in the future if mp_tasks is modified while results are being merged
      back: we would miss some results because the iterator won't take
      these new items into account *during* iteration.
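The close()-then-get() ordering described in the commit above can be sketched as follows. This is a minimal illustration under assumptions, not the project's code: `path_length` is a hypothetical stand-in for the real worker, and `gather` accepts any object with the `multiprocessing.Pool` interface (the thread-backed `ThreadPool` imported here exposes the same methods and is convenient for trying it out).

```python
from multiprocessing.pool import ThreadPool


def path_length(path):
    """Hypothetical stand-in worker; the real task is not shown in the log."""
    return path, len(path)


def gather(pool, paths):
    """Submit one task per path, then merge the results into a dict."""
    mp_tasks = [pool.apply_async(path_length, (p,)) for p in paths]

    # close() first: any later apply_async() on this pool raises ValueError,
    # so no new work can be submitted while results are merged below.
    pool.close()

    merged = {}
    for task in mp_tasks:
        path, length = task.get()  # blocks until this task has finished
        merged[path] = length

    pool.join()  # wait for the worker processes/threads to exit
    return merged
```

For example, `gather(ThreadPool(2), ["a", "bb"])` or, inside an `if __name__ == "__main__":` guard, `gather(multiprocessing.Pool(), paths)`.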
    • xattrs dict must be created first, decode xattrs as utf-8 · 001ed5c5
      Leo Le Bouter authored
      In Python, the JSON encoder cannot process bytes, and the JSON
      specification does not define a "bytes" type either. We are
      constrained by this in that we cannot serialize data of bytes type.
      
      xattrs can be either strings or bytes; in practice they're likely
      representable as strings, so decode them as utf-8 and raise an
      error otherwise. If real-world xattrs in true binary format
      arise, we will work out another solution.
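The xattr handling described in the two commits above (create the dict first, do not follow symlinks, decode values as UTF-8 and fail otherwise) might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, and `os.listxattr` / `os.getxattr` are only available on Linux, so `read_xattrs` can only be called there.

```python
import json
import os


def read_xattrs(path):
    """Read all extended attributes of path itself (Linux only)."""
    xattrs = {}  # create the dict first, then fill it
    for name in os.listxattr(path, follow_symlinks=False):
        # follow_symlinks=False: inspect the link itself, not its target.
        xattrs[name] = os.getxattr(path, name, follow_symlinks=False)
    return xattrs


def decode_xattrs(raw):
    """Decode bytes values as strict UTF-8.

    A value that is not valid UTF-8 raises UnicodeDecodeError instead of
    being silently altered.
    """
    return {name: value.decode("utf-8") for name, value in raw.items()}


def xattrs_to_json(raw):
    """json.dumps() rejects bytes with TypeError, so decode before encoding."""
    return json.dumps(decode_xattrs(raw))
```

Strict decoding is the design choice stated in the commit message: truly binary attribute values surface as an error rather than being serialized in a lossy form.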
  3. 12 Aug, 2020 1 commit