Initial proofreading

Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>

commit: edbf0eb57248a4247e9922fcbd8cddd68612a7fa
author: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Wed Sep 16 14:13:17 2020 -0400
committer: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Wed Sep 16 14:13:17 2020 -0400
tree: c4245c78685a4d8981c4f664a8bb284b23477a62
parent: a2486b493d11408f12cdd1a4efb7fd70000f7503
diff --git a/README.rst b/README.rst
index babf047..e036693 100644
--- a/README.rst
+++ b/README.rst

@@ -142,9 +142,9 @@
 Instead of creating a single attestation hash, we create a separate hash
 for each meaningful part of the patch submission:
 
-  - **i**: patch metadata
-  - **m**: commit message
-  - **p**: diff content
+  - i: patch metadata
+  - m: commit message
+  - p: diff content
 
 This allows the person performing verification to identify which part of
 the submission has been altered since being signed. A change to a commit
@@ -154,10 +154,10 @@
 
 Similarly, a patch that goes through a chain of maintainers will
 necessarily have its commit message modified by the inclusion of various
-trailers. Having a separate hash for the patch content and patch
-metadata provides a way to track whether or not any of the
+provenance trailers. Having a separate hash for the patch content and
+patch metadata provides a way to track whether or not any of the
 submaintainers made changes to the patch code, or just to the commit
-message, as expected.
+message, as is generally expected.
 
 To generate the three parts, we rely on the ``git mailinfo`` command,
 that does most of what we need::
@@ -165,8 +165,8 @@
     git mailinfo m p > i < email.msg
 
 The above command will produce three files that closely match what we
-need, but require a bit of extra processing to remove content that is
-likely to be altered in transmission.
+are looking for, but require a bit of extra processing to remove content
+that is likely to be altered in SMTP transmission.
 
 To get the "m" hash, we take the "m" file as-is::
 
@@ -183,7 +183,8 @@
 cut" portion of the commit message (usually, diffstat and revision
 information), plus trailing content such as signatures or mailing list
 subscription info. All of this is stripped away to leave just the diff
-content.
+content. Unfortunately, there is no way to do it with git itself, so we
+use manual parsing of the diff structure to perform this operation.
 
 Why not use git patch-id?
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -192,12 +193,12 @@
 performs several canonicalization routines that make this hash
 unsuitable for attestation purposes:
 
-  - it collapses all whitespace together
+  - it collapses all repeating whitespace
   - it removes all line numbers from diff contents
 
 It is possible for a malicious actor to create two patches that generate
-identical patch-id hashes but have drastically different results in the
-code. For more info, see discussion here:
+identical patch-id hashes but have drastically different results when
+applied to the codebase. For more info, see discussion here:
 
   - https://lore.kernel.org/git/20200210164115.x4gciujyjisivfgi@chatter.i7.local/
 
@@ -229,9 +230,9 @@
 ------------------------
 Once the X-Patch-Hashes header is generated and inserted into the email,
 it will need to be signed in order to be useful for attestation
-purposes. Adding domain-level signatures is the simplest way to
-accomplish this, as it would allow entire companies to automatically
-attest all patches sent out via their infrastructure.
+purposes. Adding domain-level signatures during SMTP processing is the
+simplest way to accomplish this, as it would allow entire companies to
+automatically attest all patches sent out via their infrastructure.
 
 This can be easily done by introducing a patch-attestation milter that
 would automatically analyze body contents and generate the
@@ -244,7 +245,7 @@
 ~~~~~~~~~~~~~~~~~~
 Vanilla DKIM is well-suited for this purpose, as it was specifically
 created to sign email headers. The following changes will need to be
-made to the configuration:
+made to the configuration for it to be useful:
 
   - add "x-patch-hashes" to the list of signed headers
   - ensure that "sender" is not included
@@ -254,7 +255,7 @@
 
 Here's how it looks with the POC command, using the bundled rsa.key::
 
-    ./main.py sign-dkim
+    $ ./main.py sign-dkim
     Signing: plain DKIM
     Using emails/unsigned.eml as message source
     Using rsa.key to sign
@@ -283,7 +284,7 @@
 This POC also includes a few example emails signed by the kernel.org DKIM
 key. You can run the POC verification yourself::
 
-    ./main.py -m emails/korg-signed-dkim.eml verify
+    $ ./main.py -m emails/korg-signed-dkim.eml verify
     Using emails/korg-signed-dkim.eml as message source
     Verifying: Plain DKIM
     DNS-lookup: default._domainkey.kernel.org.
@@ -299,10 +300,10 @@
 
 As you can see, the verification steps will check several things:
 
-  - that the DKIM signature passes verification (this is done by
-    normalizing and concatenating all signed headers, plus the
-    DKIM-signature header itself, minus the signature content following
-    b=)
+  - that the DKIM signature passes verification (this is done as
+    dictated by the RFC -- by normalizing and concatenating all signed
+    headers, plus the DKIM-signature header itself, minus the signature
+    content following b=)
   - that the x-patch-hashes header is included in the content attested
     by DKIM
   - that the domain (d=) and identity (i=) values match what is in the
@@ -312,13 +313,13 @@
   - that all patch hashes that we generate match the hashes in the
     signed header
 
-Note, that this check specifically excludes checking the body hash (bh=)
-value, for the reasons described in the previous section concerning DKIM
-drawbacks. Also, since we excluded "subject" from the list of signed
-headers, the verification will succeed even with usual mailman-induced
-changes to the email content::
+Note, that this check specifically excludes verifying the body hash
+(bh=) value, for the reasons described in the previous section
+concerning DKIM drawbacks. Also, since we excluded "subject" from the
+list of signed headers, the verification will succeed even with usual
+mailman-induced changes to the email content::
 
-    ./main.py -m emails/korg-signed-dkim-with-ml-junk.eml verify
+    $ ./main.py -m emails/korg-signed-dkim-with-ml-junk.eml verify
     Using emails/korg-signed-dkim-with-ml-junk.eml as message source
     Verifying: Plain DKIM
     DNS-lookup: default._domainkey.kernel.org.
@@ -336,7 +337,7 @@
 into the "i" hash, any changes to the subject header that aren't extra
 prefixes like ``[topic]`` will result in verification failure::
 
-    ./main.py -m emails/korg-signed-dkim-changed-subject.eml verify
+    $ ./main.py -m emails/korg-signed-dkim-changed-subject.eml verify
     Using emails/korg-signed-dkim-changed-subject.eml as message source
     Verifying: Plain DKIM
     DNS-lookup: default._domainkey.kernel.org.
@@ -387,7 +388,7 @@
 
 Here's the result of running the POC code, using the bundled dk.key::
 
-    ./main.py sign-dk
+    $ ./main.py sign-dk
     Signing: X-Patch-Sig header using dk mode
     Using emails/unsigned.eml as message source
     --- MESSAGE STARTS ---
@@ -407,7 +408,7 @@
 the exact same DNS query to look up the public key for the selector
 specified::
 
-    ./main.py -m emails/korg-signed-dk.eml verify
+    $ ./main.py -m emails/korg-signed-dk.eml verify
     Using emails/korg-signed-dk.eml as message source
     Verifying: X-Patch-Sig (mode=dk)
     DNS-lookup: patches._domainkey.kernel.org.
@@ -434,10 +435,11 @@
 
     https://[domain]/.well-known/_domainkey/[selector].txt
 
-We have it set up for kernel.org and you can perform a verification
-lookup using the provided example::
+The contents of the txt file are the same as the contents of the TXT
+record. We have it configured for kernel.org and you can perform a
+verification lookup using the provided example::
 
-    ./main.py -m emails/korg-signed-wk.eml verify
+    $ ./main.py -m emails/korg-signed-wk.eml verify
     Using emails/korg-signed-wk.eml as message source
     Verifying: X-Patch-Sig (mode=wk)
     Retrieving: https://kernel.org/.well-known/_domainkey/patches.txt
@@ -460,7 +462,7 @@
    need for individual developers to make any changes to their usual
    routines
  - advantage: it piggybacks on the existing DKIM standard, which has
-   proven success record
+   a proven success record
  - disadvantage: it requires changes to the IT infrastructure, including
    adding a new milter daemon to the authenticated SMTP relay, which has
    security and stability implications
@@ -484,7 +486,7 @@
 signed git tags and git commits). We can easily use GnuPG to provide the
 signature content of the X-Patch-Sig header.
 
-Here are the headers from emails/mricon-signed-pgp.eml::
+Here is an example from the bundled emails/mricon-signed-pgp.eml::
 
     X-Patch-Hashes: v=1; h=sha256;
      i=pkD5Pg8+cndZAzQQzo3RBSOOUzZM3GYWxiFIKFGIKe0=;
@@ -496,8 +498,8 @@
      3WRdUllgM=
 
 Since a lot of the attesting information is already embedded into the
-PGP signature itself, the signature structure is different from the "dk"
-or "wk" mode:
+PGP signature itself, the header structure is different from the "dk" or
+"wk" mode:
 
   - we don't need to know the domain, since we won't be doing any
     lookups on our own (GnuPG can handle this, if configured)
@@ -505,7 +507,7 @@
     subkey, for ease of lookups
   - the identity field is informational only, but can be used by GnuPG
     to perform WKD lookups, if it matches the From header (not
-    implemented)
+    implemented in the POC)
   - the timestamp field is missing, since this data is embedded into the
     PGP signature itself
 
@@ -515,10 +517,10 @@
 TRUST_ULTIMATE.
 
 If the key is not present in the verifier's default keyring, the POC
-will check if there is a matching entry in
-.keys/openpgp/keys/[keyid].asc, and if so, will use
-.keys/openpgp/pubring.kbx for performing the verification. In this case,
-TRUST_* fields are not used, as they will always be "unknown".
+will check if there is a matching entry in .keys/openpgp/keys/[keyid].asc,
+and if so, will use .keys/openpgp/pubring.kbx for performing the
+verification. In this case, TRUST_* fields are not used, as they will
+always be "unknown".
 
 In-git key distribution is discussed further below.
 
@@ -531,7 +533,7 @@
 
 Here's the POC running with the bundled "ingit.key"::
 
-    ./main.py sign-wkd
+    $ ./main.py sign-wkd
     Signing: X-Patch-Sig header using wkd mode
     Using emails/unsigned.eml as message source
     --- MESSAGE STARTS ---
@@ -545,7 +547,7 @@
      g+wGNtQn3AmUsvnoX0Jppqc5ei6GDzr0yMQKzEbUt0DkPrd/Y000b
     [...]
 
-It is very similar to content created in the "DK" or "WK" mode, except
+It is very similar to content created in the "dk" or "wk" mode, except
 the identity field includes the entire email address of the developer.
 
 When we verify the attestation, we will do the following:
@@ -556,12 +558,13 @@
 
 The hashing and zbase32-encoding is taken to be compatible with
 openpgp's WKD implementation and is done to prevent someone from easily
-finding out full email addresses from the directory listing.
+finding out everyone's email addresses from unprotected directory
+listings.
 
-You can run the verification using the POC example. Here's without the
-in-git matching key::
+You can run the verification using the POC example. Here's the run
+without using the in-git matching key::
 
-    ./main.py -m emails/mricon-signed-wkd.eml verify
+    $ ./main.py -m emails/mricon-signed-wkd.eml verify
     Using emails/mricon-signed-wkd.eml as message source
     Verifying: X-Patch-Sig (mode=wkd)
     Retrieving: https://kernel.org/.well-known/devkey/sapsizz4qsj4zmmscbz9f7y8cunt496y/patches.txt
@@ -575,9 +578,10 @@
     ----- ---------------
     PASS : All hashes verified
 
-Here is with the public key provided in git repository itself::
+Here is the same, but using the public key provided in the git
+repository itself::
 
-    ./main.py -m emails/dev-signed-wkd-ingit.eml verify
+    $ ./main.py -m emails/dev-signed-wkd-ingit.eml verify
     Using emails/dev-signed-wkd-ingit.eml as message source
     Verifying: X-Patch-Sig (mode=wkd)
     Loading: WKD key from /var/home/user/work/git/patch-attestation-poc/.keys/devkey/kernel.org/dev/patches.txt
@@ -592,13 +596,13 @@
     PASS : All hashes verified
 
 The structure and nature of the WKD mechanism is entirely up for
-discussion, along with everything else in this README.
+discussion (along with everything else in this proposal).
 
 Automating developer attestation
 --------------------------------
 The easiest way to automate developer attestation is by providing a
 sendmail-compatible "attest-and-send" utility that can be a drop-in
-command settable via git's sendemail.smtpServer command. It would
+command settable via git's sendemail.smtpServer config setting. It would
 be automatically invoked whenever git-send-email runs and would inject
 the X-Patch-Hashes and X-Patch-Sig headers before sending the emails to
 the SMTP server specified via the rest of the sendemail configuration
@@ -607,18 +611,18 @@
 In addition to creating these headers, this tool can also automatically
 add all emails going through it to the developer's personal public-inbox
 archive that can act as a separate source of patch data in addition to
-mail delivered via the regular means.
+mail delivered via SMTP and mailing lists.
 
 Public keys bundled with git repos
 ----------------------------------
 Delegated trust is hard and securely bootstrapping your trusted
-identities is even harder. There are several proposals to include
+identities is even harder. There are existing proposals to include
 developer keys as part of the git repository itself in order to make it
 possible for someone to quickly bootstrap their keyring with trusted
 identities. Obviously, this introduces a chicken-and-egg problem of
 getting your source of trust from the thing you're trying to attest in
 the first place. However, no mechanism short of in-person meetings is
-able to provide this kind of assurance, so in-git key distribution
+able to provide perfect levels of assurance, so in-git key distribution
 remains as good a source of bootstrap trust as any.
 
 The implementation in this POC is naive and shouldn't be used for
@@ -629,8 +633,14 @@
 Where should verification be performed
 --------------------------------------
 Signature verification should be performed by the maintainer evaluating
-the code submission for inclusion into the git repository. The POC
+the patches they received for inclusion into the git repository. The POC
 already pulls in "b4" as a dependency for the patch hashing routines,
 and I intend to add the header-based verification mechanisms in the
 future release of b4, once this proposal is thoroughly discussed.
 
+Similarly, browser and other email client plugins can be written to
+indicate to the developer whether the patches they are viewing pass
+signature verification. If this proposal is adopted, we can come up with
+implementations for Gmail, Mutt and Emacs, which should cover a
+significant number of end-user tools.
+
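
Illustrative note (not part of the commit): the README text in this diff says that hashing the metadata, message and diff separately lets a verifier tell which part of a submission changed after signing. A minimal sketch of that comparison follows. It assumes SHA-256 digests encoded as base64, as the ``h=sha256`` example header suggests, and it assumes the header carries ``m=`` and ``p=`` fields in the same form as the ``i=`` field visible in the diff. The function names are hypothetical and are not taken from the POC or from b4, and the normalization steps the README describes for each part are left out.

```python
import base64
import hashlib


def part_hash(data: bytes) -> str:
    """Return the base64-encoded SHA-256 digest of one patch part."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def make_patch_hashes(i: bytes, m: bytes, p: bytes) -> str:
    """Build a header value; the m=/p= layout is assumed to mirror i=."""
    return "v=1; h=sha256; i=%s; m=%s; p=%s" % (
        part_hash(i), part_hash(m), part_hash(p))


def parse_patch_hashes(value: str) -> dict:
    """Split a (possibly folded) header value into its key=value fields."""
    fields = {}
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the first "=" only: base64 values may end in "=" padding.
        key, _, val = chunk.partition("=")
        fields[key.strip()] = val.strip()
    return fields


def altered_parts(header_value: str, i: bytes, m: bytes, p: bytes) -> list:
    """Return the names of the parts whose hashes differ from the header."""
    signed = parse_patch_hashes(header_value)
    current = {"i": part_hash(i), "m": part_hash(m), "p": part_hash(p)}
    return [name for name in ("i", "m", "p")
            if signed.get(name) != current[name]]


if __name__ == "__main__":
    header = make_patch_hashes(b"metadata", b"commit message", b"diff content")
    # A maintainer appends a trailer: only the "m" hash should stop matching.
    print(altered_parts(header, b"metadata",
                        b"commit message\nReviewed-by: X", b"diff content"))
```

In this sketch a changed commit message is reported as ``['m']`` while the ``i`` and ``p`` hashes still match, which is the situation the README describes for patches that pick up trailers on their way through a chain of maintainers.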