Initial proofreading
Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
diff --git a/README.rst b/README.rst
index babf047..e036693 100644
--- a/README.rst
+++ b/README.rst
@@ -142,9 +142,9 @@
Instead of creating a single attestation hash, we create a separate hash
for each meaningful part of the patch submission:
- - **i**: patch metadata
- - **m**: commit message
- - **p**: diff content
+ - i: patch metadata
+ - m: commit message
+ - p: diff content
This allows the person performing verification to identify which part of
the submission has been altered since being signed. A change to a commit
@@ -154,10 +154,10 @@
Similarly, a patch that goes through a chain of maintainers will
necessarily have its commit message modified by the inclusion of various
-trailers. Having a separate hash for the patch content and patch
-metadata provides a way to track whether or not any of the
+provenance trailers. Having a separate hash for the patch content and
+patch metadata provides a way to track whether or not any of the
submaintainers made changes to the patch code, or just to the commit
-message, as expected.
+message, as is generally expected.
To generate the three parts, we rely on the ``git mailinfo`` command,
that does most of what we need::
@@ -165,8 +165,8 @@
git mailinfo m p > i < email.msg
The above command will produce three files that closely match what we
-need, but require a bit of extra processing to remove content that is
-likely to be altered in transmission.
+are looking for, but require a bit of extra processing to remove content
+that is likely to be altered in SMTP transmission.
To get the "m" hash, we take the "m" file as-is::
@@ -183,7 +183,8 @@
cut" portion of the commit message (usually, diffstat and revision
information), plus trailing content such as signatures or mailing list
subscription info. All of this is stripped away to leave just the diff
-content.
+content. Unfortunately, there is no way to do this with git itself, so
+we parse the diff structure manually to perform this operation.
Why not use git patch-id?
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -192,12 +193,12 @@
performs several canonicalization routines that make this hash
unsuitable for attestation purposes:
- - it collapses all whitespace together
+ - it collapses all repeating whitespace
- it removes all line numbers from diff contents
It is possible for a malicious actor to create two patches that generate
-identical patch-id hashes but have drastically different results in the
-code. For more info, see discussion here:
+identical patch-id hashes but have drastically different results when
+applied to the codebase. For more info, see discussion here:
- https://lore.kernel.org/git/20200210164115.x4gciujyjisivfgi@chatter.i7.local/
@@ -229,9 +230,9 @@
------------------------
Once the X-Patch-Hashes header is generated and inserted into the email,
it will need to be signed in order to be useful for attestation
-purposes. Adding domain-level signatures is the simplest way to
-accomplish this, as it would allow entire companies to automatically
-attest all patches sent out via their infrastructure.
+purposes. Adding domain-level signatures during SMTP processing is the
+simplest way to accomplish this, as it would allow entire companies to
+automatically attest all patches sent out via their infrastructure.
This can be easily done by introducing a patch-attestation milter that
would automatically analyze body contents and generate the
@@ -244,7 +245,7 @@
~~~~~~~~~~~~~~~~~~
Vanilla DKIM is well-suited for this purpose, as it was specifically
created to sign email headers. The following changes will need to be
-made to the configuration:
+made to the configuration for it to be useful:
- add "x-patch-hashes" to the list of signed headers
- ensure that "sender" is not included
@@ -254,7 +255,7 @@
Here's how it looks with the POC command, using the bundled rsa.key::
- ./main.py sign-dkim
+ $ ./main.py sign-dkim
Signing: plain DKIM
Using emails/unsigned.eml as message source
Using rsa.key to sign
@@ -283,7 +284,7 @@
This POC also includes a few example emails signed by the kernel.org DKIM
key. You can run the POC verification yourself::
- ./main.py -m emails/korg-signed-dkim.eml verify
+ $ ./main.py -m emails/korg-signed-dkim.eml verify
Using emails/korg-signed-dkim.eml as message source
Verifying: Plain DKIM
DNS-lookup: default._domainkey.kernel.org.
@@ -299,10 +300,10 @@
As you can see, the verification steps will check several things:
- - that the DKIM signature passes verification (this is done by
- normalizing and concatenating all signed headers, plus the
- DKIM-signature header itself, minus the signature content following
- b=)
+ - that the DKIM signature passes verification (this is done as
+ dictated by the RFC -- by normalizing and concatenating all signed
+ headers, plus the DKIM-signature header itself, minus the signature
+ content following b=)
- that the x-patch-hashes header is included in the content attested
by DKIM
- that the domain (d=) and identity (i=) values match what is in the
@@ -312,13 +313,13 @@
- that all patch hashes that we generate match the hashes in the
signed header
-Note, that this check specifically excludes checking the body hash (bh=)
-value, for the reasons described in the previous section concerning DKIM
-drawbacks. Also, since we excluded "subject" from the list of signed
-headers, the verification will succeed even with usual mailman-induced
-changes to the email content::
+Note that this check specifically excludes verifying the body hash
+(bh=) value, for the reasons described in the previous section
+concerning DKIM drawbacks. Also, since we excluded "subject" from the
+list of signed headers, the verification will succeed even with the
+usual mailman-induced changes to the email content::
- ./main.py -m emails/korg-signed-dkim-with-ml-junk.eml verify
+ $ ./main.py -m emails/korg-signed-dkim-with-ml-junk.eml verify
Using emails/korg-signed-dkim-with-ml-junk.eml as message source
Verifying: Plain DKIM
DNS-lookup: default._domainkey.kernel.org.
@@ -336,7 +337,7 @@
into the "i" hash, any changes to the subject header that aren't extra
prefixes like ``[topic]`` will result in verification failure::
- ./main.py -m emails/korg-signed-dkim-changed-subject.eml verify
+ $ ./main.py -m emails/korg-signed-dkim-changed-subject.eml verify
Using emails/korg-signed-dkim-changed-subject.eml as message source
Verifying: Plain DKIM
DNS-lookup: default._domainkey.kernel.org.
@@ -387,7 +388,7 @@
Here's the result of running the POC code, using the bundled dk.key::
- ./main.py sign-dk
+ $ ./main.py sign-dk
Signing: X-Patch-Sig header using dk mode
Using emails/unsigned.eml as message source
--- MESSAGE STARTS ---
@@ -407,7 +408,7 @@
the exact same DNS query to look up the public key for the selector
specified::
- ./main.py -m emails/korg-signed-dk.eml verify
+ $ ./main.py -m emails/korg-signed-dk.eml verify
Using emails/korg-signed-dk.eml as message source
Verifying: X-Patch-Sig (mode=dk)
DNS-lookup: patches._domainkey.kernel.org.
@@ -434,10 +435,11 @@
https://[domain]/.well-known/_domainkey/[selector].txt
-We have it set up for kernel.org and you can perform a verification
-lookup using the provided example::
+The contents of the txt file are the same as the contents of the TXT
+record. We have it configured for kernel.org and you can perform a
+verification lookup using the provided example::
- ./main.py -m emails/korg-signed-wk.eml verify
+ $ ./main.py -m emails/korg-signed-wk.eml verify
Using emails/korg-signed-wk.eml as message source
Verifying: X-Patch-Sig (mode=wk)
Retrieving: https://kernel.org/.well-known/_domainkey/patches.txt
@@ -460,7 +462,7 @@
need for individual developers to make any changes to their usual
routines
- advantage: it piggybacks on the existing DKIM standard, which has
- proven success record
+ a proven success record
- disadvantage: it requires changes to the IT infrastructure, including
adding a new milter daemon to the authenticated SMTP relay, which has
security and stability implications
@@ -484,7 +486,7 @@
signed git tags and git commits). We can easily use GnuPG to provide the
signature content of the X-Patch-Sig header.
-Here are the headers from emails/mricon-signed-pgp.eml::
+Here is an example from the bundled emails/mricon-signed-pgp.eml::
X-Patch-Hashes: v=1; h=sha256;
i=pkD5Pg8+cndZAzQQzo3RBSOOUzZM3GYWxiFIKFGIKe0=;
@@ -496,8 +498,8 @@
3WRdUllgM=
Since a lot of the attesting information is already embedded into the
-PGP signature itself, the signature structure is different from the "dk"
-or "wk" mode:
+PGP signature itself, the header structure is different from the "dk" or
+"wk" mode:
- we don't need to know the domain, since we won't be doing any
lookups on our own (GnuPG can handle this, if configured)
@@ -505,7 +507,7 @@
subkey, for ease of lookups
- the identity field is informational only, but can be used by GnuPG
to perform WKD lookups, if it matches the From header (not
- implemented)
+ implemented in the POC)
- the timestamp field is missing, since this data is embedded into the
PGP signature itself
@@ -515,10 +517,10 @@
TRUST_ULTIMATE.
If the key is not present in the verifier's default keyring, the POC
-will check if there is a matching entry in
-.keys/openpgp/keys/[keyid].asc, and if so, will use
-.keys/openpgp/pubring.kbx for performing the verification. In this case,
-TRUST_* fields are not used, as they will always be "unknown".
+will check if there is a matching entry in .keys/openpgp/keys/[keyid].asc,
+and if so, will use .keys/openpgp/pubring.kbx for performing the
+verification. In this case, TRUST_* fields are not used, as they will
+always be "unknown".
In-git key distribution is discussed further below.
@@ -531,7 +533,7 @@
Here's the POC running with the bundled "ingit.key"::
- ./main.py sign-wkd
+ $ ./main.py sign-wkd
Signing: X-Patch-Sig header using wkd mode
Using emails/unsigned.eml as message source
--- MESSAGE STARTS ---
@@ -545,7 +547,7 @@
g+wGNtQn3AmUsvnoX0Jppqc5ei6GDzr0yMQKzEbUt0DkPrd/Y000b
[...]
-It is very similar to content created in the "DK" or "WK" mode, except
+It is very similar to content created in the "dk" or "wk" mode, except
the identity field includes the entire email address of the developer.
When we verify the attestation, we will do the following:
@@ -556,12 +558,13 @@
The hashing and zbase32-encoding is taken to be compatible with
openpgp's WKD implementation and is done to prevent someone from easily
-finding out full email addresses from the directory listing.
+finding out everyone's email addresses from unprotected directory
+listings.
-You can run the verification using the POC example. Here's without the
-in-git matching key::
+You can run the verification using the POC example. Here's the run
+without using the in-git matching key::
- ./main.py -m emails/mricon-signed-wkd.eml verify
+ $ ./main.py -m emails/mricon-signed-wkd.eml verify
Using emails/mricon-signed-wkd.eml as message source
Verifying: X-Patch-Sig (mode=wkd)
Retrieving: https://kernel.org/.well-known/devkey/sapsizz4qsj4zmmscbz9f7y8cunt496y/patches.txt
@@ -575,9 +578,10 @@
----- ---------------
PASS : All hashes verified
-Here is with the public key provided in git repository itself::
+Here is the same, but using the public key provided in the git
+repository itself::
- ./main.py -m emails/dev-signed-wkd-ingit.eml verify
+ $ ./main.py -m emails/dev-signed-wkd-ingit.eml verify
Using emails/dev-signed-wkd-ingit.eml as message source
Verifying: X-Patch-Sig (mode=wkd)
Loading: WKD key from /var/home/user/work/git/patch-attestation-poc/.keys/devkey/kernel.org/dev/patches.txt
@@ -592,13 +596,13 @@
PASS : All hashes verified
The structure and nature of the WKD mechanism is entirely up for
-discussion, along with everything else in this README.
+discussion (along with everything else in this proposal).
Automating developer attestation
--------------------------------
The easiest way to automate developer attestation is by providing a
sendmail-compatible "attest-and-send" utility that can be a drop-in
-command settable via git's sendemail.smtpServer command. It would
+command settable via git's sendemail.smtpServer config setting. It would
be automatically invoked whenever git-send-email runs and would inject
the X-Patch-Hashes and X-Patch-Sig headers before sending the emails to
the SMTP server specified via the rest of the sendemail configuration
@@ -607,18 +611,18 @@
In addition to creating these headers, this tool can also automatically
add all emails going through it to the developer's personal public-inbox
archive that can act as a separate source of patch data in addition to
-mail delivered via the regular means.
+mail delivered via SMTP and mailing lists.
Public keys bundled with git repos
----------------------------------
Delegated trust is hard and securely bootstrapping your trusted
-identities is even harder. There are several proposals to include
+identities is even harder. There are existing proposals to include
developer keys as part of the git repository itself in order to make it
possible for someone to quickly bootstrap their keyring with trusted
identities. Obviously, this introduces a chicken-and-egg problem of
getting your source of trust from the thing you're trying to attest in
the first place. However, no mechanism short of in-person meetings is
-able to provide this kind of assurance, so in-git key distribution
+able to provide perfect levels of assurance, so in-git key distribution
remains as good a source of bootstrap trust as any.
The implementation in this POC is naive and shouldn't be used for
@@ -629,8 +633,14 @@
Where should verification be performed
--------------------------------------
Signature verification should be performed by the maintainer evaluating
-the code submission for inclusion into the git repository. The POC
+the patches they received for inclusion into the git repository. The POC
already pulls in "b4" as a dependency for the patch hashing routines,
and I intend to add the header-based verification mechanisms in the
future release of b4, once this proposal is thoroughly discussed.
+Similarly, browser extensions and email client plugins can be written to
+indicate to the developer whether the patches they are viewing pass
+signature verification. If this proposal is adopted, we can come up with
+implementations for Gmail, Mutt and Emacs, which should cover a
+significant number of end-user tools.
+
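Note for readers of this patch: the README text touched above says that keeping separate "i", "m" and "p" hashes lets the verifier see which part of a submission changed after signing. The sketch below illustrates that comparison only. It assumes the hash values are base64-encoded SHA-256 digests (inferred from the ``h=sha256`` tag and the example header), it does not reproduce the POC's processing of the ``git mailinfo`` output, and the function names ``parse_patch_hashes``, ``hash_part`` and ``changed_parts`` are hypothetical, not part of main.py or b4.

```python
import base64
import hashlib


def parse_patch_hashes(value):
    """Split an X-Patch-Hashes header value into a {tag: value} dict."""
    fields = {}
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the first "=" only, since base64 padding also uses "="
        tag, _, val = chunk.partition("=")
        # Folded headers may leave whitespace inside the value; drop it
        fields[tag.strip()] = "".join(val.split())
    return fields


def hash_part(data):
    """Assumed encoding: base64 of the raw SHA-256 digest."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def changed_parts(header_value, parts):
    """Return the tags (i, m, p) whose recomputed hash differs from the header."""
    signed = parse_patch_hashes(header_value)
    return [tag for tag in ("i", "m", "p")
            if signed.get(tag) != hash_part(parts[tag])]


if __name__ == "__main__":
    # Placeholder contents standing in for the processed i, m and p files
    parts = {"i": b"metadata\n", "m": b"commit message\n", "p": b"diff\n"}
    header = "v=1; h=sha256; " + "; ".join(
        "{}={}".format(tag, hash_part(data)) for tag, data in parts.items())
    print(changed_parts(header, parts))
    # Simulate a maintainer appending a trailer to the commit message
    parts["m"] += b"Reviewed-by: Example Maintainer\n"
    print(changed_parts(header, parts))
```

With the placeholder data, the first call reports no changed parts, and the second reports only "m", which is the case the README describes as generally expected when a patch passes through a chain of maintainers.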