1. 03 Jan, 2017 9 commits
  2. 29 Dec, 2016 8 commits
  3. 28 Dec, 2016 3 commits
  4. 27 Dec, 2016 5 commits
  5. 26 Dec, 2016 14 commits
  6. 23 Dec, 2016 1 commit
    • Kazuhiko Shiozaki's avatar
      Use a dedicated Login document, separate user id from Person's reference. · a4c67f4b
      Kazuhiko Shiozaki authored
      # Situation before
      
      * Person reference is a convenient property to look Persons up with, because
        it has an efficient secondary index in catalog table, and is part of fulltext
        representation of all ERP5 documents (including Persons).
      
      - Person reference is user's login, ie what users type to identify
        themselves.
      
      - Person reference is user's id, a widely unknown concept which is how local
        roles are granted (including ownership)
      
      User authentication and enumeration are handled by PAS plugin
      `ERP5 User Manager`.
      
      # Problems
      
      - We need more than one way to identify oneself: login/password, but also
        3rd-party services (facebook, google, persona, client x509 certificates,
        various types of tokens, ...) but the current mecanism is monovalued.
      
      - We cannot change a Person reference once its set without risking striping
        the user from all its local roles everywhere in the site, with no cheap way to
        identify what gave a role to a given user.
      
      - We may need to change a user's login (because of policy change, ...).
      
      # Situation after
      
      - Person reference remains a convenient property to look persons up with.No
        change here.
      
      - ERP5 Login is a new document type allowed inside Person documents. Each
        identifying mechanism can have its own document type with properties suiting
        its purpose, and any number of instance inside a single Person. Logins have a
        workflow state allowing to control their life span and keep some traceability
        (via workflow histories).
        ERP5 Login.getReference is user's login.
        Because a user can change his own login, there is no long term relation between login and user.
      
      - Person use_id is user's id, which is systematically assigned upon Person
        creation. Any document which is a Zope user must be bound to the new ERP5User
        property sheet.
      
      This new pattern is supported by a new PAS plugin for authentication and
      enumeration: `ERP5 Login User Manager`.
      
      # Minimal upgrade
      
      Change your test suite to create the old PAS plugin and disable the new one.
      
      After upgrading erp5_mysql_innodb_catalog, create the "user" table if it does
      not happen automatically.
      
      Nothing else to do. The code is exepcted to be backward-compatible with the
      historical data schema. Just make sure you keep the original PAS plugin enabled
      and you should be fine.
      
      Be advised though that the old data schema may not be tested much, which means
      compatibility may get broken.
      
      # Full upgrade
      
      Get your unit tests to pass with the new PAS plugin. This includes getting your
      Person-creating, -modifying and -querying code to use the proper APIs, and to migrate all your workflow `actor` variable declarations to use `getUserIdOrUserName`.
      
      After upgrading erp5_mysql_innodb_catalog, create the "user" table if it does
      not happen automatically.
      
      Once you are done upgrading, start automated Person migration. It will:
      - create the new PAS plugin
      
      - gradually create ERP5 Login documents, copy the existing Person reference to
        user_id to not break existing users
      
      - once done, deleted the old PAS plugin
      
      ## Caveats
      
      - for each migrated user there is a small time lapse (the duration of a few
        reindexations) where it is not a user
      
      - better migrate a few persons individually while checking roles_and_users table
        content, to ensure no new security uids get allocated in that process. It may
        happen if view-granting roles are granted to users on their own Person
        documents, for example, and may expose security configuration weaknesses to
        race-conditions on local role setting and indexation. If your security is only
        group-based, nothing to fear.
      
      - the migration is rather fast, but in the end all depends on how many Person
        documents you have. On a beefy instance (15 processing nodes on a single
        large server) 450k Persons were migrated in about 20 hours, or 6 persons a
        second, along with normal instance usage.
      
      # APIs
      
      The following APIs are available, independently of whether you migrated your
      users or not. Use them.
      
      - About the current user:
      
          *Do* use User API (as conveniently accessed through portal_membership), which
          is a convenient shortcut to the PAS API.
      
          ```python
          user = context.getPortalObject().portal_membership.getAuthenticatedMember()
          # The user document (typically, a Person)
          user.getUserValue()
          # The login document currently used (be careful, it may contain plain-text
          # secret values, like a token).
          user.getLoginValue()
          ```
      
          Do *not* peek in the request.
      
          Do *not* use the catalog.
      
          Do *not* search in person_module.
      
      - About another user:
      
          *Do* use PAS API.
      
          For reasons which should be obvious, you should generally *not* descend in a
          specific PAS plugin.
      
          ```python
          # Many users:
          user_list = context.acl_users.searchUser(
            id=user_id_list,
            exact_match=True,
          )
          # Many logins:
          user_list = context.acl_users.searchUser(
            login=user_login_list,
            exact_match=True,
          )
          # Be careful of cases where a single user may be enumerated by more than one
          # PAS plugin. Depending on what you want to access among user properties
          # the applicable pattern will change. Consider the following, which will not
          # tolerate homonym-but-different users (which is a good thing !):
          user_path, = {
            x['path'] for x in context.acl_users.searchUser(
              id=user_id,
              exact_match=True,
            ) if 'path' in x
          }
          # When searching for a single login, you should not tolerate receiving more
          # than one user, with more than one login:
          user_list = [
            x for x in context.acl_users.searchUser(
              id=user_id,
              exact_match=True,
            ) if 'login_list' in x
          ]
          if user_list:
            login, = user_list['login_list']
            # Do something on the login
          else:
            # Login does not exist, do something else
          ```
          Each item in `user_list` contains the following keys:
      
          - `id`: the user id (ex: `Person.getUserId()`'s return value)
      
          - `path`: user document's path, if you want the document
      
          - `uid`: user document's uid, if you want to search further using the catalog
      
          - `login_list`, itself containing, one per user's login matching the conditions
      
            - `reference`: the textual login value (or token, or..., whatever PAS should
              use to lookup a user from transmitted credentials)
      
            - `path`: login document's path, if you want the document
      
            - `uid`: login document's uid, if you want to search further using the catalog
      
          If you are on an instance with non-migrated users, `login_list` will have a single entry, which will point at the Person document itself (just as the `user` level).
      
      - About a Person:
      
          *Do* use catalog API (`portal_catalog`, `searchFolder`, `ZSQLMethod`s, ...).
      
          Do *not* use PAS API.
      
      - To know the user_id of a Person document:
      
          ```python
          person.Person_getUserId()
          ```
      
      - To know the login of a user which is not currently logged in:
      
          This need is weird. Are you sure you did not forget to ask the user for their login ? If not, you're going to have to assume things about how many login documents the user has, which will not be well compatible.
      
      # Cheat-sheets (but use your common sense !)
      
      | is it a good idea to [action on the right] on [value below] |  store hashed & salted | store plaintext | display | look document (user or login) up by |
      |---|---|---|---|---|
      | password | yes | no | no | not possible |
      | username | no | yes | no, a username may be a token, which is both a login and a password | only when really doing login-related stuff (ex: password recovery, logging a user in...), using PAS API |
      | token | no | yes | no |  only when really doing login-related stuff (ex: password recovery, logging a user in...), using PAS API |
      | user id | no | yes | yes (if user complains it not a nice value, use reference) | yes, using PAS API |
      | reference  | no | yes | yes | yes, using catalog API |
      
      | to refer to [notion below] can I use [user property on the right] | User's `id` | User's `user name` (aka login) | an ERP5 document property (`title`, `reference`, ...) |
      | --- | --- | --- | --- |
      | a workflow actor (as stored) | yes | **no** | **no** |
      | a workflow actor (as displayed) | yes (may be ugly) | **no** | yes |
      | the owner of a document (as stored) | yes | **no** | **no** |
      | granting a local role on a document (as stored) | yes | **no** | **no** |
      | the owner of a document (as displayed) | yes (may be ugly) | **no** | yes |
      | logged in user (displayed) | yes (may be ugly) | **no** | yes |
      | user having generated a document (displayed, even in the document) | yes (may be ugly) | **no** | yes |
      | the requester of a password reset ("I forgot my password") request (as stored) | **no** | **no** | yes, some relation to the document (`path`, `relative_url`...) |
      | the requester of a password reset ("I forgot my password") request (as input by user) | **no** | yes | yes, optionally  and always in addition to User's `user name`  (to prevent an attacker from making us spam users) |
      | the requester of an account recovery ("I forgot my login") request (as stored) | **no** | **no** | yes, some relation to the document (`path`, `relative_url`...) |
      | the requester of an account recovery ("I forgot my login") request (as input by user) | **no** | **no** | yes |
      | the user in another system | **no** | **no** | yes ({`source`,`destination`}`_reference` are your friends) |
      
      Put in another way:
      - If it's stored in ERP5 and used by security machinery, you want to use the `id`
      - if it's stored in ERP5 and used by the document machinery, you want to use a relation to the document (ex: `Person` for a User `id`, `ERP5 Login` for a User user name)
      - If it's stored in another system (ie, part of an interface), use a document property
      - If it's displayed to the user, use a document property (optionally falling back on the User `id`)
      
      And in tests ? Unless you are testing an authentication mechanism, use User `id` only: It's unique, it's generated for you by ERP5 anyway, it does not need creating & validating an ERP5 Login document, it does not need choosing a login, it does not need choosing a password. Existing tests use a mix of techniques, please ignore them (or better, update them to use the simpler scheme !).
      
      /reviewed-on nexedi/erp5!185
      a4c67f4b