[GSoC] Progress Report

[GSoC] Progress Report

Zero King-2
Hi,

GSoC coding phase has begun and I'm implementing a CI bot that runs on
Travis CI and tests pull requests.

My project includes two bots, the CI bot testing pull requests and the
PR bot assigning labels to PRs and notifying maintainers.

The design docs are available at https://github.com/l2dy/mpbot-design,
but the code is not functional yet so I'm not sharing it for now.

My schedule next week is to finish `port install` testing in the CI bot
and publish its code.

--
Best regards,
Zero King

Don't trust the From address.

Re: [GSoC] Progress Report

Mojca Miklavec-2
Dear Zero King,

On 4 June 2017 at 14:49, Zero King wrote:

> Hi,
>
> GSoC coding phase has begun and I'm implementing a CI bot that runs on
> Travis CI and tests pull requests.
>
> My project includes two bots, the CI bot testing pull requests and the
> PR bot assigning labels to PRs and notifying maintainers.
>
> The design docs are available at https://github.com/l2dy/mpbot-design,
> but the code is not functional yet so I'm not sharing it for now.

Thank you very much for the update.

There's one thing I didn't fully understand:

https://github.com/l2dy/mpbot-design/blob/master/cibot.md#interaction-with-ci-bot

> "This design is aimed for traceability, we can find the exact GitHub user who submitted a malicious PR."

I understand that you can trust neither the author's nor the committer's
email from the git commit history, but doesn't GitHub provide reliable
information about who submitted the pull request? Of course one can
have a stolen identity (username/password or key), but I probably don't
understand at which point you wanted to identify the user submitting a
PR. Or did you want to identify users trying to chat with the bots?

You asked about extracting the list of ports, which is currently done by a combination of
    https://github.com/macports/macports-infrastructure/blob/f79cc559611e5f42dd26808f38cd0750beee12bf/buildbot/master.cfg#L32
and list-subports in mpbb. I guess the first function could be
implemented in mpbb instead. And maybe mpbb could get some more
branching (if-else statements) depending on whether it runs for
"production" (buildbot) or "testing" (Travis). Or maybe some
functionality from mpbb could even move to the MacPorts core.

> My schedule next week is to finish `port install` testing in the CI bot
> and publish its code.

Looking forward to it :)

Thanks again for the update,
    Mojca

Re: [GSoC] Progress Report

Rainer Müller-4
In reply to this post by Zero King-2
On 2017-06-04 14:49, Zero King wrote:
> GSoC coding phase has begun and I'm implementing a CI bot that runs on
> Travis CI and tests pull requests.
>
> My project includes two bots, the CI bot testing pull requests and the
> PR bot assigning labels to PRs and notifying maintainers.

As far as I understand it, the CI "bot" is just a set of scripts to be
executed on Travis CI, but the PR bot will be a daemon process running
on our own infrastructure?

> The design docs are available at https://github.com/l2dy/mpbot-design,
> but the code is not functional yet so I'm not sharing it for now.

Quoting from the linked document:

| 1. List subports
| 2. port lint test
| 3. port -d install test
| 4. Send data to CI bot
                  ^^
That is supposed to be PR bot, right?

| The CI bot generates an ECDSA key pair on start and prints the public
| key on Travis log. While testing ports, the bot attempts handshake
| with the PR bot by signing the salt PR bot provided (TCP or HTTP?).
| The PR bot would grab the public key from Travis logs and verify the
| signature.

This seems overly complex. In case the CI bot needs to communicate with
the PR bot directly, shouldn't a simple password/access token passed in
the environment [1] be secure enough for this? Or are we running into
these restrictions [2]?

As I see it, the status of the PR on GitHub needs to be updated. Travis
already has functionality to do so, so what role does the PR bot play at
that point? Couldn't it just pick up the notification from GitHub [3]?

Rainer

[1] https://docs.travis-ci.com/user/environment-variables/
[2]
https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
[3]
https://developer.github.com/v3/activity/events/types/#pullrequestreviewevent

Re: [GSoC] Progress Report

Zero King-2
In reply to this post by Mojca Miklavec-2
On Sun, Jun 04, 2017 at 10:23:52PM +0200, Mojca Miklavec wrote:

>Dear Zero King,
>
>Thank you very much for the update.
>
>There's one thing I didn't fully understand:
>
>https://github.com/l2dy/mpbot-design/blob/master/cibot.md#interaction-with-ci-bot
>
>> "This design is aimed for traceability, we can find the exact GitHub user who submitted a malicious PR."
>
>I understand that you can neither trust the author's nor committer's
>email from the git commit history, but doesn't GitHub provide a
>reliable information about who submitted the pull request? Of course
>one can have a stolen identity (username/password or key), but I
>probably don't understand at which point you wanted to identify the
>user submitting a PR. Or did you want to identify user trying to chat
>with the bots?

All the information the CI bot has access to is public, so I'm worried
that someone could send the PR bot data without submitting a PR at all.

>You asked about extraction of list of ports which is currently a combination of
>    https://github.com/macports/macports-infrastructure/blob/f79cc559611e5f42dd26808f38cd0750beee12bf/buildbot/master.cfg#L32
>and list-subports in mpbb. I guess the first function could be
>implemented in mpbb instead. And maybe mpbb could get some more
>branching (if-else statements) depending on whether it runs for
>"production" (buildbot) or "testing" (Travis). Or maybe some
>functionality from mpbb could even move to the MacPorts core.

mpbb has a dependency on getopt, so it's not ideal for Travis since
there's a time limit for each build and I'd like to save more time for
actually testing ports.

--
Best regards,
Zero King

Don't trust the From address.

Re: [GSoC] Progress Report

Zero King-2
In reply to this post by Rainer Müller-4
On Sun, Jun 04, 2017 at 11:13:54PM +0200, Rainer Müller wrote:
>As far as I understand it, the CI "bot" are just scripts to be executed
>on Travis CI, but the PR bot will be a daemon process running on our own
>infrastructure?

Yes, except that the CI bot is not just scripts.
The CI bot is written in Go to share code with the PR bot.

>> The design docs are available at https://github.com/l2dy/mpbot-design,
>> but the code is not functional yet so I'm not sharing it for now.
>
>Quoting from the linked document:
>
>| 1. List subports
>| 2. port lint test
>| 3. port -d install test
>| 4. Send data to CI bot
>                  ^^
>That is supposed to be PR bot, right?

Indeed, thanks. Step 4 should read "Send data to PR bot".

>| The CI bot generates an ECDSA key pair on start and prints the public
>| key on Travis log. While testing ports, the bot attempts handshake
>| with the PR bot by signing the salt PR bot provided (TCP or HTTP?).
>| The PR bot would grab the public key from Travis logs and verify the
>| signature.
>
>This seems overly complex. In case the CI bot needs to communicate with
>the PR bot directly, shouldn't a simple password/access token passed in
>the environment [1] be secure enough for this? Or are we running into
>these restrictions [2]?

Yes, those restrictions apply. We can't have secrets in Travis's
environment for PRs.
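
As a rough sketch of how a Go program running on Travis could detect this
situation: Travis sets TRAVIS_PULL_REQUEST to the pull request number (or
"false" for non-PR builds) and TRAVIS_SECURE_ENV_VARS to "true" or "false"
depending on whether encrypted variables are available. The helper names
below are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// isPullRequest reports whether the build was triggered by a pull request.
// getenv is injected so the logic can be exercised outside Travis.
func isPullRequest(getenv func(string) string) bool {
	pr := getenv("TRAVIS_PULL_REQUEST")
	return pr != "" && pr != "false"
}

// secretsAvailable reports whether encrypted environment variables are
// exposed to the current build.
func secretsAvailable(getenv func(string) string) bool {
	return getenv("TRAVIS_SECURE_ENV_VARS") == "true"
}

func main() {
	if isPullRequest(os.Getenv) && !secretsAvailable(os.Getenv) {
		fmt.Println("PR build without secrets: a shared token cannot be used")
		return
	}
	fmt.Println("not a restricted PR build")
}
```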

>As I see it, the status of the PR on GitHub needs to be updated. Travis
>already has functionality to do so, what role does the PR bot play at
>that point? Couldn't it just pick up the notification from GitHub [3]?

Its role is adding labels like "type:update" and notifying maintainers.
Foreign Tcl code can't be safely executed on our infra, and pulling
foreign git branches consumes bandwidth and disk space. So the plan is
to let Travis generate the needed data that is not available from the
GitHub API, and to have that data sent to and sanitized by the PR bot.
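
A minimal sketch of what sanitizing on the PR bot side might involve,
under assumptions: the Report type, its fields, the label allowlist, and
the port name pattern are all hypothetical and are not taken from the
design docs. Only the label "type:update" appears in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Report is a hypothetical payload sent from Travis to the PR bot.
type Report struct {
	PR    int
	Ports []string
	Label string
}

// allowedLabels is an assumed allowlist of labels the bot may apply.
var allowedLabels = map[string]bool{"type:update": true}

// portName is a deliberately conservative pattern chosen for this sketch;
// it is not MacPorts' actual naming rule.
var portName = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9_.+-]{0,63}$`)

// sanitize rejects any report containing values outside the expected shape,
// since everything coming from a PR build is untrusted input.
func sanitize(r Report) error {
	if r.PR <= 0 {
		return errors.New("invalid PR number")
	}
	if !allowedLabels[r.Label] {
		return fmt.Errorf("label %q is not allowed", r.Label)
	}
	for _, p := range r.Ports {
		if !portName.MatchString(p) {
			return fmt.Errorf("suspicious port name %q", p)
		}
	}
	return nil
}

func main() {
	ok := Report{PR: 1, Ports: []string{"foo"}, Label: "type:update"}
	bad := Report{PR: 1, Ports: []string{"foo; rm -rf /"}, Label: "type:update"}
	fmt.Println(sanitize(ok))
	fmt.Println(sanitize(bad))
}
```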

>Rainer
>
>[1] https://docs.travis-ci.com/user/environment-variables/
>[2]
>https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
>[3]
>https://developer.github.com/v3/activity/events/types/#pullrequestreviewevent

--
Best regards,
Zero King

Don't trust the From address.

Re: [GSoC] Progress Report

Rainer Müller-4
In reply to this post by Zero King-2
On 2017-06-05 03:17, Zero King wrote:
> All information CI bot have access to is public, so I'm worried that
> someone would send PR bot data without submitting a PR at all.

Make the PR bot pull data from Travis. The CI bot then only triggers the
PR bot, which then checks for new unprocessed builds on Travis.
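
The pull model suggested here could be sketched as follows. The trigger
carries no data; the PR bot fetches the build list from Travis itself and
only acts on builds it has not seen before. The Build and Tracker types
are hypothetical, and the actual fetch from Travis is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Build is a hypothetical minimal record of a Travis build.
type Build struct {
	ID int
	PR int
}

// Tracker remembers which build IDs the PR bot has already processed.
type Tracker struct {
	mu   sync.Mutex
	seen map[int]bool
}

func NewTracker() *Tracker {
	return &Tracker{seen: make(map[int]bool)}
}

// Unprocessed returns the builds not seen before and marks them as seen, so
// repeated or forged triggers cannot cause a build to be handled twice.
func (t *Tracker) Unprocessed(builds []Build) []Build {
	t.mu.Lock()
	defer t.mu.Unlock()
	var fresh []Build
	for _, b := range builds {
		if !t.seen[b.ID] {
			t.seen[b.ID] = true
			fresh = append(fresh, b)
		}
	}
	return fresh
}

func main() {
	t := NewTracker()
	// fetched stands in for whatever the PR bot retrieves from Travis.
	fetched := []Build{{ID: 1, PR: 10}, {ID: 2, PR: 11}}
	fmt.Println(len(t.Unprocessed(fetched)))

	fetched = append(fetched, Build{ID: 3, PR: 12})
	fmt.Println(len(t.Unprocessed(fetched)))
}
```

With this arrangement an attacker who triggers the PR bot without
submitting a PR gains nothing, because the bot only trusts what it reads
from Travis.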

>> You asked about extraction of list of ports which is currently a
>> combination of
>>  
>> https://github.com/macports/macports-infrastructure/blob/f79cc559611e5f42dd26808f38cd0750beee12bf/buildbot/master.cfg#L32
>>
>> and list-subports in mpbb. I guess the first function could be
>> implemented in mpbb instead. And maybe mpbb could get some more
>> branching (if-else statements) depending on whether it runs for
>> "production" (buildbot) or "testing" (Travis). Or maybe some
>> functionality from mpbb could even move to the MacPorts core.
>
> mpbb has a dependency on getopt, so it's not ideal for Travis since
> there's a time limit for each build and I'd like to save more time for
> actually testing ports.

How is getopt relevant for the timing? If the dependency on getopt is a
problem, let's find a portable solution for mpbb. Duplicating the exact
same functionality we already have does not make sense to me.

Rainer

Re: [GSoC] Progress Report

Zero King-2
On Mon, Jun 05, 2017 at 02:08:20PM +0200, Rainer Müller wrote:
>On 2017-06-05 03:17, Zero King wrote:
>> All information CI bot have access to is public, so I'm worried that
>> someone would send PR bot data without submitting a PR at all.
>
>Make the PR bot pull data from Travis. The CI bot then only triggers the
>PR bot, which then checks for new unprocessed builds on Travis.

That's planned as an alternative fetch method. Travis won't keep any
data except the full build log, and most of it would be useless for the
PR bot. Having the CI bot send only the needed data to the PR bot would
save resources on our server.

>>> You asked about extraction of list of ports which is currently a
>>> combination of
>>>
>>> https://github.com/macports/macports-infrastructure/blob/f79cc559611e5f42dd26808f38cd0750beee12bf/buildbot/master.cfg#L32
>>>
>>> and list-subports in mpbb. I guess the first function could be
>>> implemented in mpbb instead. And maybe mpbb could get some more
>>> branching (if-else statements) depending on whether it runs for
>>> "production" (buildbot) or "testing" (Travis). Or maybe some
>>> functionality from mpbb could even move to the MacPorts core.
>>
>> mpbb has a dependency on getopt, so it's not ideal for Travis since
>> there's a time limit for each build and I'd like to save more time for
>> actually testing ports.
>
>How is getopt relevant for the timing? If the dependency on getopt is a
>problem, let's find a portable solution for mpbb. Duplicating the exact
>same functionality we already have does not make sense to me.

getopt has to be installed in a different prefix to avoid conflicts with
tested ports. It's not ideal to install another MacPorts on Travis VMs
because it would be lost in the next build.

mpbb is designed for Buildbot and archive uploads. It's too powerful for
the CI bot, and I only need a few of the port commands in it. Travis
only provides a single log with stdout and stderr mixed, so I need to
make the output parsable. The two bots could share code, and
implementing this part in two languages would be unnecessary work.

>Rainer

--
Best regards,
Zero King

Don't trust the From address.

Re: [GSoC] Progress Report

Rainer Müller-4
On 2017-06-06 04:15, Zero King wrote:

> On Mon, Jun 05, 2017 at 02:08:20PM +0200, Rainer Müller wrote:
>> On 2017-06-05 03:17, Zero King wrote:
>>> All information CI bot have access to is public, so I'm worried that
>>> someone would send PR bot data without submitting a PR at all.
>>
>> Make the PR bot pull data from Travis. The CI bot then only triggers the
>> PR bot, which then checks for new unprocessed builds on Travis.
>
> That's planned as an alternative fetch method. Travis won't keep any
> data except full build log and most of it would be useless for the PR
> bot. The CI bot would send less data to the PR bot and save resource for
> our server.

I would reduce the complexity as much as possible. It should be easy to
get more computing power or disk space if needed.

> getopt has to be installed in a different prefix to avoid conflicts with
> tested ports. It's not ideal to install another MacPorts on Travis VMs
> because it would be lost in the next build.

There is no need for this getopt binary to be provided by another
MacPorts installation. As you are using Go, the other CI bot
functionality will also be binaries, right? Just ship an additional
getopt binary. Or modify mpbb not to need getopt, for example with
something like pure-getopt [1]. I really do not see that as a problem.

> mpbb is designed for Buildbot and archive uploads. It's too powerful for
> the CI bot and I only need several port commands in it. Travis only
> provides stdout & stderr mixed log so I need to make it parsable.
> The two bots could share code and implementing this part in two
> languages would be unnecessary work.

My idea would be that the CI bot would call the mpbb scripts, while
capturing stdout/stderr to add additional markers for parseable output.

Maybe I am still missing something about the whole setup and you are
probably a few steps ahead with planning. Therefore I will stop arguing
about this now. Hopefully I will see how this will work once you show
us the code. Looking forward to its release.

Happy hacking this week!

Rainer

[1] https://github.com/agriffis/pure-getopt

Re: [GSoC] Progress Report

Mojca Miklavec-2
In reply to this post by Zero King-2
On 6 June 2017 at 04:15, Zero King wrote:
> On Mon, Jun 05, 2017 at 02:08:20PM +0200, Rainer Müller wrote:
>>
>> How is getopt relevant for the timing? If the dependency on getopt is a
>> problem, let's find a portable solution for mpbb. Duplicating the exact
>> same functionality we already have does not make sense to me.
>
> getopt has to be installed in a different prefix to avoid conflicts with
> tested ports. It's not ideal to install another MacPorts on Travis VMs
> because it would be lost in the next build.

Don't worry about premature optimisations too much.

As Rainer said, we could avoid using getopt, or rewrite the mpbb scripts
in Perl, Python, or whatever else.

But even if you have to install some minor dependencies to the same
prefix, I would not worry about that at this point, in particular
because you don't need to do a cleanup at the end anyway. It's clearly
suboptimal, but I believe this is something that can get fixed later.
I don't know if you need any other tools which are not provided by the
machines they use.

That said, the biggest strength of mpbb is probably its Tcl functions,
something that's much more difficult to rewrite in a different
programming language. But many of them could probably be integrated into
MacPorts core. Like: listing all subports from a given list of ports,
installing dependencies of a particular port without installing the
port itself etc.

But then again, if using mpbb is limiting you more than helping you,
... then well ... use whatever works best.

Mojca