- 14 Aug, 2020 1 commit
Leo Le Bouter authored
* Convert stat_result to a proper dictionary so that field names are retained after serialization
* Add the ability to ignore directories through command line arguments; explicitly add an "ignored" field on ignored directories

It was decided that JSON was not a suitable format because it lacks support for serializing bytes. MsgPack supports bytes and is more efficient. It is also the internal serialization format of Fluentd, which we will most probably use to ingest data in a central place.
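As a rough illustration of the first point, the sketch below shows why a raw stat_result loses its field names when serialized and how a plain dictionary keeps them. The helper name `stat_to_dict` is hypothetical and not taken from the commit. The standard library `json` module is used only as a stand-in to show the effect; the commit itself moved to MsgPack, which is not imported here.

```python
import json
import os


def stat_to_dict(path):
    """Return the stat fields of `path` as a plain dict keyed by field name.

    Hypothetical helper: the real commit may select fields differently.
    """
    st = os.stat(path)
    # stat_result exposes its fields as attributes prefixed with "st_".
    return {name: getattr(st, name) for name in dir(st) if name.startswith("st_")}


st = os.stat(".")

# stat_result is a tuple subclass, so a generic encoder sees a bare sequence
# and the field names are lost.
as_list = json.loads(json.dumps(st))

# The dictionary form keeps every field name next to its value.
as_dict = stat_to_dict(".")

# The bytes limitation mentioned above: the JSON encoder rejects bytes values.
try:
    json.dumps({"value": b"\x00\x01"})
    bytes_supported = True
except TypeError:
    bytes_supported = False
```

A format with a native binary type avoids the `TypeError` branch entirely, which is the motivation given for choosing MsgPack.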
-
- 13 Aug, 2020 3 commits
Leo Le Bouter authored
multiprocessing.Pool.close() ensures no new tasks can be submitted to the pool; it does not itself wait for the submitted tasks to finish (Pool.join() does that after close()). Even though AsyncResult.get() waits for each task to finish, and our code structure shouldn't submit new tasks at that point, call close() first and get() afterwards. Otherwise this could become error-prone in the future: if mp_tasks is modified while results are being merged back, we would miss some results because the iterator won't take these new items into account *during* iteration.
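The ordering described here can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual code: `square` and `collect` are hypothetical names, and only `mp_tasks` comes from the commit message. It uses `multiprocessing.pool.ThreadPool`, a `Pool` subclass with the same `apply_async()`, `close()`, and `join()` methods, so the snippet runs without a `__main__` guard; a real `multiprocessing.Pool` would be used the same way.

```python
from multiprocessing.pool import ThreadPool


def square(n):
    """Stand-in for the real per-item work (hypothetical)."""
    return n * n


def collect(pool, items):
    """Submit all items, close the pool, then merge the results back."""
    mp_tasks = [pool.apply_async(square, (item,)) for item in items]

    # close() first: from here on, any further submission raises ValueError,
    # so mp_tasks cannot grow while we iterate over it below.
    pool.close()

    # get() then: each call blocks until that task has finished.
    results = [task.get() for task in mp_tasks]

    # join() waits for the worker processes (threads here) to exit.
    pool.join()
    return results


pool = ThreadPool(2)
results = collect(pool, [1, 2, 3])

# A late submission is rejected instead of being silently missed.
try:
    pool.apply_async(square, (4,))
    late_submission_rejected = False
except ValueError:
    late_submission_rejected = True
```

Closing before collecting turns a silent "missed result" bug into an immediate exception at the offending call site.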
-
Leo Le Bouter authored
-
Leo Le Bouter authored
In Python, the JSON encoder cannot process bytes, and the JSON specification does not define a "bytes" type, so we cannot serialize data of bytes type. xattrs can be either strings or bytes; in practice they are likely representable as strings. Therefore, decode them as UTF-8 and raise an error otherwise. If real-world cases of xattrs in true binary format arise, we will work out another solution.
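One way to picture the decoding rule is the sketch below. It assumes the xattrs have already been read into a dictionary whose keys and values may be `str` or `bytes`; the function names `to_text` and `decode_xattrs` and the sample attribute names are hypothetical and do not come from the commit.

```python
import json


def to_text(value):
    """Return `value` as str, decoding bytes strictly as UTF-8.

    Raises UnicodeDecodeError for bytes that are not valid UTF-8.
    """
    if isinstance(value, bytes):
        return value.decode("utf-8")  # errors="strict" is the default
    return value


def decode_xattrs(raw):
    """Convert an xattr mapping into one that the JSON encoder accepts."""
    return {to_text(name): to_text(value) for name, value in raw.items()}


# Sample data (made up for illustration).
decoded = decode_xattrs({b"user.comment": b"hello", "user.origin": "local"})
serialized = json.dumps(decoded, sort_keys=True)

# True binary content is rejected rather than silently mangled.
try:
    decode_xattrs({"user.blob": b"\xff\xfe"})
    binary_rejected = False
except UnicodeDecodeError:
    binary_rejected = True
```

Failing loudly on invalid UTF-8 matches the stated intent: the problem surfaces as soon as a genuinely binary xattr shows up.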
-
- 12 Aug, 2020 1 commit
Leo Le Bouter authored
-