GSoC Proposal

Zero King
Hi,

I'd like to implement a bot in Go dealing with pull requests for
macports-ports on GitHub and utilize Travis CI to test PRs and commits
in GSoC 2017.

More details in my draft:
https://gist.github.com/l2dy/420533b821570e26dc7374898c3264fb

Any advice is welcome, and since this is a new idea, I'm looking for
potential mentors.

--
Best regards,
Zero King


Re: GSoC Proposal

Jackson Isaac-2
Hi,

On Fri, 31 Mar 2017 at 9:14 PM, Zero King <[hidden email]> wrote:
> Hi,
>
> I'd like to implement a bot in Go dealing with pull requests for
> macports-ports on GitHub and utilize Travis CI to test PRs and commits
> in GSoC 2017.
>
> More details in my draft:
> https://gist.github.com/l2dy/420533b821570e26dc7374898c3264fb
>
> Any advice is welcome and since this is a new idea I'm seeking for
> potential mentors.
>
> --
> Best regards,
> Zero King

Please upload the draft to the GSoC portal so that we can review it and add comments if required.

--
Jackson Isaac 

Re: GSoC Proposal

Rainer Müller-4
In reply to this post by Zero King
Hello,

On 2017-03-31 17:43, Zero King wrote:
> I'd like to implement a bot in Go dealing with pull requests for
> macports-ports on GitHub and utilize Travis CI to test PRs and commits
> in GSoC 2017.
>
> More details in my draft:
> https://gist.github.com/l2dy/420533b821570e26dc7374898c3264fb
>
> Any advice is welcome and since this is a new idea I'm seeking for
> potential mentors.

First of all, great to see a proposal coming from you as a project member!

We do not have anything else in Go yet, so this would be new to our
infrastructure. I am a bit hesitant with that, because even after this
GSoC, we will need someone to maintain this service. Personally, I also
know next to nothing about Go.

The other details given in the proposal are still quite sparse. I do not
yet see everything this project includes and how you will use the time
for the proposed task. Remember the GSoC program is meant to keep you
working for 12 weeks.

Could you give us a timeline? Do you have any milestones, especially for
the midterm evaluation? What will be the status at the end of GSoC this
summer? Is your plan to have it fully deployed already?

How will the bot work in detail? I guess listening to the webhooks?
Where will it get its database for the maintainers? Or do you intend to
parse the submitted/modified Portfile?

What will the scripts for Travis CI be written in? Can we reuse code
from the buildbot infrastructure, namely mpbb [1] which is written in
bash shell script? What steps will you take to implement this
functionality?

Rainer

[1] https://github.com/macports/mpbb

Re: GSoC Proposal

Zero King

On 3/31/17 6:23 PM, Rainer Müller wrote:
> First of all, great to see a proposal coming from you as a project
> member!
Thanks.
> We do not have anything else in Go yet, so this would be new to our
> infrastructure. I am a bit hesitant with that, because even after this
> GSoC, we will need someone to maintain this service. Personally, I also
> know next to nothing about Go.
I'll maintain it :)

Go produces static binaries, so the bot should be easy to deploy.
> The other details given in the proposal are still quite sparse. I do not
> yet see everything this project includes and how you will use the time
> for the proposed task. Remember the GSoC program is meant to keep you
> working for 12 weeks.
Updated the gist with more details.

I was wondering which one I should follow: "How to work with us" on the
GSoC website, or the template at
https://trac.macports.org/wiki/SummerOfCodeApplicationTemplate
> Could you give us a timeline? Do you have any milestones, especially for
> the midterm evaluation? What will be the status at the end of GSoC this
> summer? Is your plan to have it fully deployed already?
Updated the gist. Yes, I plan to have it fully deployed.
> How will the bot work in detail? I guess listening to the webhooks?
> Where will it get its database for the maintainers? Or do you intend to
> parse the submitted/modified Portfile?
Sure, it will listen to the webhooks and need a URL for that.
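
A minimal sketch of such a webhook endpoint, not the proposal's actual code. It assumes the delivery is checked against GitHub's `X-Hub-Signature-256` header (an HMAC-SHA256 of the request body keyed with the shared webhook secret). The `/webhook` path, the secret value, and the names `sign`, `validSignature` and `handleWebhook` are placeholders made up for this illustration.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sign returns the header value expected for body: "sha256=" followed by the
// hex-encoded HMAC-SHA256 of body keyed with secret.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// validSignature reports whether header matches the HMAC of body.
// hmac.Equal compares in constant time.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

// handleWebhook rejects unsigned or wrongly signed deliveries and
// acknowledges the rest. Real event handling would replace the log call.
func handleWebhook(secret []byte) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 25<<20))
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		if !validSignature(secret, body, r.Header.Get("X-Hub-Signature-256")) {
			http.Error(w, "bad signature", http.StatusForbidden)
			return
		}
		// X-GitHub-Event names the event type, e.g. "pull_request".
		log.Printf("accepted %q event (%d bytes)", r.Header.Get("X-GitHub-Event"), len(body))
		w.WriteHeader(http.StatusNoContent)
	}
}

func main() {
	// Exercise the handler in-process with a locally signed request.
	secret := []byte("example-secret") // placeholder value
	body := []byte(`{"action":"opened"}`)
	req := httptest.NewRequest(http.MethodPost, "/webhook", bytes.NewReader(body))
	req.Header.Set("X-Hub-Signature-256", sign(secret, body))
	req.Header.Set("X-GitHub-Event", "pull_request")
	rec := httptest.NewRecorder()
	handleWebhook(secret)(rec, req)
	fmt.Println("status:", rec.Code)
}
```

In a deployment the handler would be registered with `http.HandleFunc` and served with `http.ListenAndServe` behind the public URL mentioned above; the in-process request in `main` only demonstrates the signature check.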

It will read the maintainers from the unmodified Portfile, not the PR's
version, for security reasons.
However, if the migration to GitHub handles isn't finished in time, it will
have to get the handles from a database.
> What will the scripts for Travis CI be written in? Can we reuse code
> from the buildbot infrastructure, namely mpbb [1] which is written in
> bash shell script? What are steps you will take to implement this
> functionality?
YAML and Bash. mpbb code can be reused.

To implement this, first let Travis generate archives for base and then
use the archive and modified mpbb to test the PR.

--
Best regards,
Zero King


Re: GSoC Proposal

Bradley Giesbrecht-3
> On Mar 31, 2017, at 6:21 PM, Zero King <[hidden email]> wrote:
>
>
> On 3/31/17 6:23 PM, Rainer Müller wrote:
>> First of all, great to see a proposal coming from you as a project member!
> Thanks.
>> We do not have anything else in Go yet, so this would be new to our
>> infrastructure. I am a bit hesitant with that, because even after this
>> GSoC, we will need someone to maintain this service. Personally, I also
>> know next to nothing about Go.
> I'll maintain it :)
>
> Go produce static binaries and should be easy to deploy.
>> The other details given in the proposal are still quite sparse. I do not
>> yet see everything this project includes and how you will use the time
>> for the proposed task. Remember the GSoC program is meant to keep you
>> working for 12 weeks.
> Updated the gist with more details.
>
> I was wondering which should I follow " How to work with us" on GSoC website or
> https://trac.macports.org/wiki/SummerOfCodeApplicationTemplate.
>> Could you give us a timeline? Do you have any milestones, especially for
>> the midterm evaluation? What will be the status at the end of GSoC this
>> summer? Is your plan to have it fully deployed already?
> Updated the gist. Yes, I plan to have it fully deployed.
>> How will the bot work in detail? I guess listening to the webhooks?
>> Where will it get its database for the maintainers? Or do you intend to
>> parse the submitted/modified Portfile?
> Sure, it will listen to the webhooks and need a URL for that.
>
> It will get the maintainers from unmodified Portfile for security considerations.
> However if the migration to GitHub handles didn't make it will have to get that from a database.
>> What will the scripts for Travis CI be written in? Can we reuse code
>> from the buildbot infrastructure, namely mpbb [1] which is written in
>> bash shell script? What are steps you will take to implement this
>> functionality?
> YAML and Bash. mpbb code can be reused.
>
> To implement this, first let Travis generate archives for base and then
> use the archive and modified mpbb to test the PR.

I would be willing to help in some capacity: as mentor, backup mentor, or an interested party.

A while back I provided some VMs to the KDE project when they were working on adding macOS to their Jenkins CI platform. I could do the same for MacPorts if that would be helpful.



Brad


Re: GSoC Proposal

Mojca Miklavec-2
In reply to this post by Zero King
Can you please elaborate a bit more on
    Travis CI will do lint and install tests on macOS 10.10-12 for PRs and commits.

What does "install tests" refer to exactly?

Mojca

Re: GSoC Proposal

Kenneth F. Cunningham
Have you seen how Homebrew does this? I imagine he means something like that:

Every submission first has to go to the 10.10 - 10.12 bots, to see if it builds.

Every submission is suggested / required to have at least a minimal test, such as `myport --version`, to make sure something actually works.

Only after those checks pass are submitted PRs pulled in.

Ken



On 2017-04-01, at 2:43 PM, Mojca Miklavec wrote:

> Can you please elaborate a bit more on
>    Travis CI will do lint and install tests on macOS 10.10-12 for PRs
> and commits.
>
> What does "install tests" refer to exactly?
>
> Mojca


Re: GSoC Proposal

Zero King
In reply to this post by Mojca Miklavec-2
macOS VMs on Travis would do
`port lint <port list> | tee lint.txt; if grep "^Error: " lint.txt ; then ...`

and `port install ...` (`port test ...` if a test exists). If any of them
fails, Travis will report that back to GitHub.

Also Travis will keep logs so lint results will be available there.
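
The lint check above is a shell one-liner; the same filter can be sketched in Go, for example if the bot itself wanted to summarise lint results. This is only an illustration: `lintErrors` is a made-up helper, and the sample text in `main` is invented, not captured from a real `port lint` run. The only thing taken from the message above is the pattern `^Error: `.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lintErrors returns every line of output that starts with "Error: ",
// which is what `grep "^Error: "` would select.
func lintErrors(output string) []string {
	var errs []string
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		if line := sc.Text(); strings.HasPrefix(line, "Error: ") {
			errs = append(errs, line)
		}
	}
	return errs
}

func main() {
	// Invented sample text, for illustration only.
	sample := "Warning: example warning\nError: example error\n"
	errs := lintErrors(sample)
	for _, e := range errs {
		fmt.Println(e)
	}
	// A CI step would exit with a non-zero status here when errs is not
	// empty, so that the failure is reported back to GitHub.
	fmt.Printf("%d lint error(s)\n", len(errs))
}
```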


On 4/1/17 9:43 PM, Mojca Miklavec wrote:
> Can you please elaborate a bit more on
>      Travis CI will do lint and install tests on macOS 10.10-12 for PRs
> and commits.
>
> What does "install tests" refer to exactly?
>
> Mojca

--
Best regards,
Zero King


Re: GSoC Proposal

Mojca Miklavec-2
On 2 April 2017 at 04:23, Zero King wrote:
> macOS VMs on Travis would do `port lint <port list> | tee lint.txt; if grep
> "^Error: " lint.txt ; then ...`
>
> and `port install ...` (`port test ...` if test exists) and if one of them
> failed Travis will report that back to GitHub.
>
> Also Travis will keep logs so lint results will be available there.

Thank you. Please include that into the proposal as well. It wasn't
clear to me whether your project would be a full "replacement"
compared to the current functionality of the buildbot or just covering
some small load cycle (like doing very very basic checks). It seems
you aim for the former.

In that case there are a few things that potentially worry me.

1.) The Travis VMs have Homebrew installed. This is probably a bad thing and a
good thing at the same time. The official server should not have any
traces of Homebrew to avoid linking against /usr/local/whatever
instead of /opt/local (but it's not a dealbreaker for the testing
infrastructure). Then again, we might spot some problems and try to
avoid them when we find such cases.

2.) We sometimes consume almost 100% of server load on a multicore
server (sometimes for a couple of days). And we sometimes run out of disk
space. What happens if Travis bans us (or asks for a price that we
cannot afford to pay) right at the moment when the full solution gets
implemented? I believe we should contact Travis upfront to make sure
that they can meet our expectations. And in any case I would like to
see slightly more details about how the plan intends to keep our
footprint (disk resources and cpu cycles) low. Would you uninstall all
existing packages after each trial and install them again from
packages.macports.org for the next build? Would you keep official
packages installed & inactive between multiple runs and just clean up
whatever the pull request tried to build? (Less important: How would
you avoid rebuilding the whole of Qt after, say, minor edit of
comments in a pull request?)

The alternative can always be to use our own server for builds, but of
course that brings other problems to the table (mostly security) and
requires a completely different approach.

And yes, it would be great if mpbb sources could be shared (and
improved for both buildbot & travis at the same time). Not a
requirement, but if one part can benefit from the other, that would be
great.

Mojca

Re: GSoC Proposal

Zero King
On 4/2/17 6:03 AM, Mojca Miklavec wrote:

> Thank you. Please include that into the proposal as well. It wasn't
> clear to me whether your project would be a full "replacement"
> compared to the current functionality of the buildbot or just covering
> some small load cycle (like doing very very basic checks). It seems
> you aim for the former.
>
> In that case there are a few things that potentially worry me.
>
> 1.) They have HomeBrew installed. This is probably a bad thing and a
> good thing at the same time. The official server should not have any
> traces of HomeBrew to avoid linking against /usr/local/whatever
> instead of /opt/local (but it's not a dealbreaker for the testing
> infrastructure). Then again, we might spot some problems and try to
> avoid them when we find such cases.

I'll uninstall Homebrew from the VM if needed.

> 2.) We sometimes consume almost 100% of server load on a multicore
> server (sometimes for a couple of days). And we sometimes run out of disk
> space. What happens if Travis bans us (or asks for a price that we
> cannot afford to pay) right at the moment when the full solution gets
> implemented? I believe we should contact Travis upfront to make sure
> that they can meet our expectations. And in any case I would like to
> see slightly more details about how the plan intends to keep our
> footprint (disk resources and cpu cycles) low. Would you uninstall all
> existing packages after each trial and install them again from
> packages.macports.org for the next build? Would you keep official
> packages installed & inactive between multiple runs and just clean up
> whatever the pull request tried to build? (Less important: How would
> you avoid rebuilding the whole of Qt after, say, minor edit of
> comments in a pull request?)

Travis won't keep the VMs up. Every build gets a fresh VM (no need to
clean up) and can only run for 50 minutes, so if a build takes too long
it will fail anyway.

You're right about high server load. Since we already have the Buildbot
infrastructure I'll only do install tests in PRs. Lint tests will be
done for both commits and PRs. I'll limit the number of ports to test so
if a PR touches too many ports it'll only do lint tests. We can also use
tags like [ci-skip] in PR title or label to skip some tests.
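
The selection rules in this paragraph could be sketched as follows. The numbers and names are assumptions: the cap of 5 ports, the `plan` type, and the `choosePlan` function are invented for illustration, and the sketch assumes that `[ci-skip]` (in the title or as a `ci-skip` label) disables the install tests only, which the message above does not spell out.

```go
package main

import (
	"fmt"
	"strings"
)

// maxInstallPorts is a hypothetical cap; no number is fixed in the thread.
const maxInstallPorts = 5

// plan records which test stages a build should run.
type plan struct {
	Lint    bool
	Install bool
}

// choosePlan applies the rules described above:
//   - lint runs for both commits and PRs
//   - install tests run only for PRs that touch between 1 and
//     maxInstallPorts ports
//   - "[ci-skip]" in the PR title, or a "ci-skip" label, turns the
//     install tests off (assumed meaning of the tag)
func choosePlan(isPR bool, title string, labels []string, portsTouched int) plan {
	p := plan{Lint: true}
	if !isPR || portsTouched == 0 || portsTouched > maxInstallPorts {
		return p
	}
	if strings.Contains(strings.ToLower(title), "[ci-skip]") {
		return p
	}
	for _, l := range labels {
		if strings.EqualFold(l, "ci-skip") {
			return p
		}
	}
	p.Install = true
	return p
}

func main() {
	fmt.Printf("%+v\n", choosePlan(true, "foo: update to 1.2", nil, 1))
	fmt.Printf("%+v\n", choosePlan(true, "[ci-skip] mass revbump", nil, 3))
	fmt.Printf("%+v\n", choosePlan(false, "", nil, 1))
}
```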

Installing from packages.macports.org would avoid rebuilding
distributable ports after minor edits.

Travis is unlikely to ban us but if that happens we can keep only macOS
10.12 or only the lint tests (which can even be done on Linux) and
contact them to unban us.

> The alternative can always be to use our own server for builds, but of
> course that brings other problems to the table (mostly security) and
> requires a completely different approach.

Yes, mostly security considerations. We'll have to use VMs and revert to
a snapshot after every build on our own servers. Managing VMs and sending
data around securely would be painful to set up.

> And yes, it would be great if mpbb sources could be shared (and
> improved for both buildbot & travis at the same time). Not a
> requirement, but if one part can benefit from the other, that would be
> great.
>
> Mojca

--
Best regards,
Zero King


Re: GSoC Proposal

Mojca Miklavec-2
Hi,

Just to clarify one thing (also to any other GSoC applicant): this
discussion does not provide any indication about the ranking of the
submitted proposal or whether this or any other proposal might be or
not be accepted. There might be better or worse applications where
less clarification is needed. Independent of whether or not the
proposal would be accepted, I felt that this particular idea needs
some further discussion & clarifications.

Another disclaimer to Zero King: I'm clearly not an authority and my
thoughts might not fully match with the views of other developers, so
please don't take all of my words or ideas for granted. This is
supposed to trigger discussion, not suggestions that one should
blindly follow.

I'm also aware that it makes absolutely no sense to convince someone
to go into directions that don't sound like fun to him, so consider
this just brainstorming for now.

On 2 April 2017 at 14:21, Zero King wrote:

>
> Travis won't keep the VMs up. Every build gets a fresh VM (no need to
> cleanup) and could only run for 50 minutes so if a build takes too long it
> will fail anyway.
>
> You're right about high server load. Since we already have the Buildbot
> infrastructure I'll only do install tests in PRs. Lint tests will be done
> for both commits and PRs. I'll limit the number of ports to test so if a PR
> touches too many ports it'll only do lint tests. We can also use tags like
> [ci-skip] in PR title or label to skip some tests.
>
> Installing from packages.macports.org would avoid rebuilding distributable
> ports after minor edits.
>
> Travis is unlikely to ban us but if that happens we can keep only macOS
> 10.12 or only the lint tests (which can even be done on Linux) and contact
> them to unban us.

I was talking to someone from the HB community. Apparently they tried
very hard to make Travis CI work. One of the major problems they had
other than the time limit (the builder is relatively slow and jobs get
killed after 50 minutes, so any builds like Qt would fail of course)
was the fact that it was impossible to upload the binaries from
"untrusted pull requests" (not sure what that means, in any case, it
didn't work for them). They set up their own build infrastructure
almost from scratch. We clearly don't need to upload the result
anywhere, but the limitations will stay in any case. I would find it
valuable if we put our effort in something that can be expanded.

Another option being mentioned was CircleCI (no limitation on uploading
files), but that one has an additional limitation of only 500 free
minutes per month for open-source orgs, so most likely not anywhere
near enough.

I was thinking about two aspects:
(a) It would in fact be helpful to have some build jobs running
automatically for PRs
(b) But we also need something to test complex builds. Exactly those
where maintainers can hardly afford to build and test them on their
own machines.


>> The alternative can always be to use our own server for builds, but of
>> course that brings other problems to the table (mostly security) and
>> requires a completely different approach.
>
> Yes, mostly security considerations. We'll have to use VMs and revert to
> snapshot after every build in our own servers. Managing VMs and sending data
> around securely would be painful to setup.

From what I understand there are two different kinds of security issues:

(a) People being able to submit arbitrary code as pull requests

(b) Most committers are probably not really inspecting diffs or
full sources of packages. In theory one can write some originally
useful open-source tool that we package. Once people start using it,
some problematic code gets introduced, we happily upgrade the port,
and both our main builder and any user installing that port would be
affected.


We currently don't have a problem with (a) because we don't build pull
requests, and with Travis we would simply outsource the problem to
someone else. For (b) we currently don't have any reliable solution,
but using Travis would not actually help since a maintainer would
probably eventually merge the code (if it looks reasonable). And then
the code from official repository would be built on the main builder.


I'm thinking of an alternative approach that would cover another use
case (which cannot be solved by Travis CI due to limitations).

Use cases I have in mind:
- developer testing a new release of Qt; or a bunch of KDE apps
- developer trying to build all the 1000+ perl modules once perl5.26
gets released
- developer without access to older machines, trying to figure out why
the build fails on 10.6; having some clue why that could be, but
trying to avoid doing ten semi-random commits to the main repository
just to test the hypothesis

What we *could* do is to set up a few more build slaves on the
"existing" infrastructure (the build slaves could actually be
anywhere, people could even volunteer to provide access to their
machines; but I guess the idea would be to eventually set up a few new
ones next to the existing slaves). The build master could either run
in the same instance or separately.

The idea would be that people with commit access would have to
manually approve building of a particular pull request (and then that
pull request would get a green icon of course once the build is
complete). When a pull request gets updated/modified, the build could
be tested again, but it would have to be triggered manually again.

For each manually approved/triggered build the build slave would:
- clone the repository
- (?) most likely rebase that on top of master to make sure that the
latest packages of everything would be used at the time of build
- determine which ports have been modified
- maybe run lint as an initial quick test (?)
- build those ports in a very similar way to what the buildbot does at
this moment, with a few exceptions:

  (a) packages from other/existing ports would be fetched from the
main builders, perhaps even when they are not distributable (we could
have private copies of binary packages)
  (b) the resulting packages would be stored between individual builds
that belong to the same PR, but they would not be uploaded anywhere
and both sources and binaries (basically anything that has been
fetched or modified) would be deleted once PR testing is complete
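
The "determine which ports have been modified" step can be sketched like this, assuming the ports tree keeps each port under `category/port/...` and that the list of changed paths comes from something like `git diff --name-only`. `touchedPorts` is a made-up name, and subports defined inside a Portfile are not handled by this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// touchedPorts maps changed file paths (relative to the repository root) to
// the unique, sorted "category/port" directories they belong to. Paths with
// fewer than three components (top-level files, or files directly inside a
// category) are ignored, as are top-level directories starting with "_" or
// "." such as _resources.
func touchedPorts(paths []string) []string {
	seen := make(map[string]bool)
	for _, p := range paths {
		parts := strings.Split(p, "/")
		if len(parts) < 3 {
			continue
		}
		if strings.HasPrefix(parts[0], "_") || strings.HasPrefix(parts[0], ".") {
			continue
		}
		seen[parts[0]+"/"+parts[1]] = true
	}
	ports := make([]string, 0, len(seen))
	for p := range seen {
		ports = append(ports, p)
	}
	sort.Strings(ports)
	return ports
}

func main() {
	changed := []string{
		"devel/git/Portfile",
		"devel/git/files/example.diff",
		"lang/go/Portfile",
		"README.md",
	}
	fmt.Println(touchedPorts(changed))
}
```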


All we need to implement this is:
- The code (Bash/Tcl) that removes anything that has been downloaded
and built when testing the pull request. As a first approximation we
could actually remove *all* binaries and all distfiles provided we
have a nearby mirror from where we can fetch an up-to-date version of
everything.
- Some glue code that knows how to communicate between GitHub and
BuildBot: which jobs to submit, how to report success etc.

The glue code might need some time to be implemented, but it sounds
like something in the same direction that you already proposed with
that "go" bot. To be honest, I currently don't know yet what exactly
would be needed on that part.

One of the quick&dirty options would even be to add an additional
custom field (like "Port list" on https://build.macports.org/builders
that allows rebuilding an arbitrary port) saying "PR Number" where a
developer would enter the number of the pull request and that would
then be built. Another alternative would be to write a simple website
interface where developers could log in with OAuth and click on a
build button for open pull requests. Just brainstorming.



One of the things I really miss is a website with easy access to build
summary on per-port basis:
- https://trac.macports.org/ticket/51995#Statistics
(This might even be possible to implement as a
special view right on the buildbot.)


Please don't get me wrong: I'm actually all for the idea to get builds
running on Travis. It would be super helpful to have an "alternative
implementation". But I would find it even more intriguing if we would
keep the liberty of being able to run PRs on exotic setups like
10.6/libc++ and be able to do test builds of Qt, KDE and other huge
projects. (Or have both? :) But maybe other developers disagree?

Many of the building blocks (other than integration with GitHub) are
ready or nearly ready. There are also a number of buildbot-related
issues that are waiting for someone to pick up the work.

Working with buildbot requires Python. Fixing mpbb scripts requires
some Bash and Tcl in any case, but probably the same level as if you
work with Travis.

Independent: it would be helpful to hear a better
justification/defence for selecting Go as the language of the "helper
bot".

Also, we would still have to find the main mentor for this project in
case it gets selected (or rather the other way around: having a mentor
assigned before the end of evaluation period is a strict prerequisite
to even qualify). That's mainly a question for other developers on the
list. We do have some volunteers for backup mentors.

Mojca

Re: GSoC Proposal

Zero King
On 4/12/17 10:37 PM, Mojca Miklavec wrote:

> I was talking to someone from the HB community. Apparently they tried
> very hard to make Travis CI work. One of the major problems they had
> other than the time limit (the builder is relatively slow and jobs get
> killed after 50 minutes, so any builds like Qt would fail of course)
> was the fact that it was impossible to upload the binaries from
> "untrusted pull requests" (not sure what that means, in any case, it
> didn't work for them). They set up their own build infrastructure
> almost from scratch. We clearly don't need to upload the result
> anywhere, but the limitations will stay in any case. I would find it
> valuable if we put our effort in something that can be expanded.

Untrusted means I can't run the bot on Travis because any secrets
available to PRs would be public.
Travis CI is free and I believe this is an advantage for my proposal.
The bot for PRs is not related to Travis and takes more time in my schedule.

> Another option being mentioned was CircleCI (no limitation to upload
> files), but that one has an additional limitation of only 500 free
> minutes per month for OpenSource orgs, so most likely not anywhere
> near enough.

Their macOS builds aren't free. I looked into alternatives but Travis
seems to be the best.

> What we *could* do is to set up a few more build slaves on the
> "existing" infrastructure (the build slaves could actually be
> anywhere, people could even volunteer to provide access to their
> machines; but I guess the idea would be to eventually set up a few new
> ones next to the existing slaves). The build master could either run
> in the same instance or separately.
>
> The idea would be that people with commit access would have to
> manually approve building of a particular pull request (and then that
> pull request would get a green icon of course once the build is
> complete). When a pull request gets updated/modified, the build could
> be tested again, but it would have to be triggered manually again.

I had this idea in mind but chose the one in the proposal because I'd
like to finish deployment during GSoC, and changing the Buildbot
infrastructure remotely could cause problems. On Travis, whatever I do, I
won't break anything that is hard to recover or leak any secrets, but that
is not so safe on Buildbot. I consider virtualization and snapshots the last
defence against evil PRs and prefer to have them.

> The glue code might need some time to be implemented, but it sounds
> like something in the same direction that you already proposed with
> that "go" bot. To be honest, I currently don't know yet what exactly
> would be needed on that part.

Tentative plan:
0. Verify that webhook requests are indeed from GitHub
(https://developer.github.com/webhooks/securing/)
1. Add an update/submission/enhancement label based on smart logic (a new
Portfile means submission) or on a new "task list" in the PR template.
2. Determine the maintainers of the touched ports. If the submitter is a
maintainer of every touched port that is not nomaintainer, set the
maintainer label. If all touched ports are nomaintainer, set the
nomaintainer label. If a touched port has a maintainer (other than the PR
submitter) and is not openmaintainer, set the waitformaintainer label.
(All label names are tentative.)
3. Send mail to / mention with @ / request review from / assign PRs to
the related maintainers (single maintainer -> assign, etc.). Limit the
number of maintainers a PR can notify; a manual override via
commenting can be implemented.

Database:
1. Read the email to GitHub handle mapping from a database (if the
migration to GitHub handles in Portfiles doesn't finish in time)
2. Cache port to maintainer mapping (?)
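
Step 2 of the tentative plan (choosing the maintainer-related label) can be sketched as below. This is one reading of the rules, not settled behaviour: the `port` type and the `labelsFor` and `maintains` helpers are invented for illustration, maintainers are assumed to be available as GitHub handles, and the label names are the tentative ones from the plan.

```go
package main

import (
	"fmt"
	"strings"
)

// port holds the maintainer data the bot would have extracted for one
// touched port. An empty Maintainers slice stands for nomaintainer.
type port struct {
	Name           string
	Maintainers    []string // GitHub handles (assumed to be available)
	OpenMaintainer bool
}

// maintains reports whether handle is listed as a maintainer of p.
// GitHub handles are compared case-insensitively.
func maintains(handle string, p port) bool {
	for _, m := range p.Maintainers {
		if strings.EqualFold(m, handle) {
			return true
		}
	}
	return false
}

// labelsFor returns the tentative label for a PR by submitter touching ports:
//   - "nomaintainer" if no touched port has a maintainer
//   - "maintainer" if the submitter maintains every maintained touched port
//   - "waitformaintainer" if some touched port is maintained by someone else
//     and is not openmaintainer
//
// No label is returned when every port maintained by someone else is
// openmaintainer.
func labelsFor(submitter string, ports []port) []string {
	if len(ports) == 0 {
		return nil
	}
	maintained, own, mustWait := 0, 0, false
	for _, p := range ports {
		if len(p.Maintainers) == 0 {
			continue
		}
		maintained++
		if maintains(submitter, p) {
			own++
		} else if !p.OpenMaintainer {
			mustWait = true
		}
	}
	var labels []string
	switch {
	case maintained == 0:
		labels = append(labels, "nomaintainer")
	case own == maintained:
		labels = append(labels, "maintainer")
	case mustWait:
		labels = append(labels, "waitformaintainer")
	}
	return labels
}

func main() {
	ports := []port{
		{Name: "foo", Maintainers: []string{"alice"}},
		{Name: "bar"}, // nomaintainer
	}
	fmt.Println(labelsFor("alice", ports))
	fmt.Println(labelsFor("bob", ports))
}
```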

> One of the things I really miss is a website with easy access to build
> summary on per-port basis:
> -https://trac.macports.org/ticket/51995#Statistics
> (This might be something that might even be possible to implement as a
> special view right on the buildbot.)

I'm not familiar with front-end technologies. Maybe next time.

> Independent: it would be helpful to hear a better
> justification/defence for selecting Go as the language of the "helper
> bot".

Go has a great set of standard libraries and I can use
https://github.com/google/go-github. Go programs are easy to compile and
deploy.

--
Best regards,
Zero King


Re: GSoC Proposal

Clemens Lang-2
Hi,

On Thu, Apr 13, 2017 at 12:37:55AM +0200, Mojca Miklavec wrote:
> I was talking to someone from the HB community. Apparently they tried
> very hard to make Travis CI work. One of the major problems they had
> other than the time limit (the builder is relatively slow and jobs get
> killed after 50 minutes, so any builds like Qt would fail of course)
> was the fact that it was impossible to upload the binaries from
> "untrusted pull requests" (not sure what that means, in any case, it
> didn't work for them).

Yeah, that was also the impression I had last time I talked to somebody
from homebrew about their build setup.

> Another option being mentioned was CircleCI (no limitation to upload
> files), but that one has an additional limitation of only 500 free
> minutes per month for OpenSource orgs, so most likely not anywhere
> near enough.

I think in the long run we may not have any option other than providing
the build resources ourselves, but that comes with the responsibility of
containing any effects due to the untrusted input.

> From what I understand there are two different kinds of security
> issues:
>
> (a) People being able to submit arbitrary code as pull requests
>
> (b) Most of committers are probably not really inspecting diffs or
> full sources of packages. In theory one can write some originally
> useful opensource tool that we package. Once people start using it,
> some problematic code gets introduced, we happily upgrade the port,
> and both our main builder and any user installing that port would be
> affected.
>
> We currently don't have a problem with (a) because we don't build pull
> requests, and with Travis we would simply outsource the problem to
> someone else. For (b) we currently don't have any reliable solution,
> but using Travis would not actually help since a maintainer would
> probably eventually merge the code (if it looks reasonable). And then
> the code from official repository would be built on the main builder.

As for (b), we try to run things in sandboxes to limit the attack
surface, but sure, that's an issue we currently have. Ideally, we would
also run 'make install' as non-root user to further reduce the risks.
Realistically speaking, all distributions have the problem of trusting
their upstream code, though.

Our main concern with building PRs is attackers who could try to use
our CI infrastructure to run bots for a botnet or to send spam by
sending a pull request with a specially crafted Portfile.


> I'm thinking of an alternative approach that would cover another use
> case (which cannot be solved by Travis CI due to limitations).
>
> Use cases I have in mind:
> - developer testing a new release of Qt; or a bunch of KDE apps
> - developer trying to build all the 1000+ perl modules once perl5.26
> gets released
> - developer without access to older machines, trying to figure out why
> the build fails on 10.6; having some clue why that could be, but
> trying to avoid doing ten semi-random commits to the main repository
> just to test the hypothesis
>
> What we *could* do is to set up a few more build slaves on the
> "existing" infrastructure (the build slaves could actually be
> anywhere, people could even volunteer to provide access to their
> machines; but I guess the idea would be to eventually set up a few new
> ones next to the existing slaves). The build master could either run
> in the same instance or separately.

We must take care not to risk the security of the build machines that
build our binary archives. They are a valuable target, so we should
isolate machines that build unreviewed user input, ideally using VMs.

> The idea would be that people with commit access would have to
> manually approve building of a particular pull request (and then that
> pull request would get a green icon of course once the build is
> complete). When a pull request gets updated/modified, the build could
> be tested again, but it would have to be triggered manually again.
>
> For each manually approved/triggered build the build slave would:
> - clone the repository
> - (?) most likely rebase that on top of master to make sure that the
> latest packages of everything would be used at the time of build
> - determine which ports have been modified
> - maybe run lint as an initial quick test (?)
> - build those ports in a very similar way to what the buildbot does at
> this moment, with a few exceptions:
>
>   (a) packages from other/existing ports would be fetched from the
> main builders, perhaps even when they are not distributable (we could
> have private copies of binary packages)
>   (b) the resulting packages would be stored between individual builds
> that belong to the same PR, but they would not be uploaded anywhere
> and both sources and binaries (basically anything that has been
> fetched or modified) would be deleted once PR testing is complete

I'm not sure having to review each PR before a build is started is
worthwhile. The whole idea of a PR build bot is that it reduces the
review load on committers, because quick and easy-to-spot feedback is
returned to contributors earlier and faster, and reviewers can spend
their time where it matters most.
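
The "determine which ports have been modified" step from the quoted list
could look roughly like the sketch below. It assumes the macports-ports
layout of <category>/<port>/Portfile and that the list of changed paths
has already been obtained elsewhere (for example from something like
`git diff --name-only`). The function name modifiedPorts and the sample
paths are made up for illustration and are not part of any existing
tool.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// modifiedPorts maps changed file paths (relative to the repository root)
// to the ports they belong to. Assumption: ports live at
// <category>/<port>/, with the Portfile and auxiliary files below that.
func modifiedPorts(changed []string) []string {
	seen := make(map[string]bool)
	for _, p := range changed {
		parts := strings.Split(p, "/")
		// Need at least <category>/<port>/<file>; skip top-level files.
		if len(parts) < 3 {
			continue
		}
		// Skip support directories such as _resources and hidden dirs.
		if strings.HasPrefix(parts[0], "_") || strings.HasPrefix(parts[0], ".") {
			continue
		}
		seen[parts[0]+"/"+parts[1]] = true
	}
	ports := make([]string, 0, len(seen))
	for port := range seen {
		ports = append(ports, port)
	}
	sort.Strings(ports) // deterministic order for logs and job submission
	return ports
}

func main() {
	// Hypothetical paths touched by a pull request.
	changed := []string{
		"devel/cmake/Portfile",
		"devel/cmake/files/patch-foo.diff",
		"perl/p5-moose/Portfile",
		"_resources/port1.0/group/perl5-1.0.tcl",
		"README.md",
	}
	fmt.Println(modifiedPorts(changed)) // prints [devel/cmake perl/p5-moose]
}
```

A change under _resources (a PortGroup, say) can affect many ports; this
sketch deliberately ignores that case rather than guessing which ports
to rebuild.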

> - Some glue code that knows how to communicate between GitHub and
> BuildBot: which jobs to submit, how to report success etc.
>
> The glue code might need some time to be implemented, but it sounds
> like something in the same direction that you already proposed with
> that "go" bot. To be honest, I don't know yet what exactly would be
> needed for that part.

I was under the impression that this was the second part of the proposed
GSoC project: first the build under Travis, then the bot. Ideally, the
bot would be written in a way that it could also trigger other builds
(or we could easily integrate other builders with GitHub; I believe
there's a GitHub PR build option for buildbot as well).
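
For the "how to report success" part of the glue code, one possible
shape is sketched below: whatever builder ran the job, its outcome is
mapped to one of the four states GitHub accepts for a commit status
(error, failure, pending, success) and serialized as the JSON body for
the "create a commit status" endpoint. The buildResult type, the
statusFor function, the "macports/pr-build" context string and the
example URL are all hypothetical. Authentication and the HTTP request
itself are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// commitStatus mirrors the JSON body of GitHub's "create a commit status"
// endpoint (POST /repos/{owner}/{repo}/statuses/{sha}).
type commitStatus struct {
	State       string `json:"state"`
	TargetURL   string `json:"target_url,omitempty"`
	Description string `json:"description,omitempty"`
	Context     string `json:"context,omitempty"`
}

// buildResult is a hypothetical, simplified outcome reported by a builder
// (Travis, Buildbot, or anything else the bot can trigger).
type buildResult int

const (
	buildRunning buildResult = iota
	buildSucceeded
	buildFailed  // the port itself failed to build
	buildErrored // the builder could not run the job at all
)

// statusFor translates a builder outcome into a commit status payload.
func statusFor(r buildResult, logURL string) commitStatus {
	s := commitStatus{TargetURL: logURL, Context: "macports/pr-build"}
	switch r {
	case buildRunning:
		s.State, s.Description = "pending", "Build in progress"
	case buildSucceeded:
		s.State, s.Description = "success", "Build succeeded"
	case buildFailed:
		s.State, s.Description = "failure", "Build failed"
	default:
		s.State, s.Description = "error", "Builder error"
	}
	return s
}

func main() {
	body, err := json.Marshal(statusFor(buildSucceeded, "https://example.org/builds/1"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Keeping the mapping independent of any one builder is what would let the
same bot report results from Travis now and from other builders later.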


> One of the quick-and-dirty options would even be to add an additional
> custom field (like "Port list" on https://build.macports.org/builders
> that allows rebuilding an arbitrary port) saying "PR Number" where a
> developer would enter the number of the pull request and that would
> then be built. Another alternative would be to write a simple website
> interface where developers could log in with OAuth and click on a
> build button for open pull requests. Just brainstorming.

Please check whether there are pre-existing solutions for buildbot <->
GitHub integration.


> One of the things I really miss is a website with easy access to a
> build summary on a per-port basis:
> - https://trac.macports.org/ticket/51995#Statistics
> (This might even be possible to implement as a special view right on
> the buildbot.)

I always thought that was part of the MacPorts web app Ryan was working
on.


On Sun, Apr 16, 2017 at 03:02:20PM +0000, Zero King wrote:
> I had this idea in mind but chose the one in the proposal because I'd
> like to finish deployment in GSoC and changing the Buildbot
> infrastructure remotely could cause problems. On Travis, whatever I
> do, I won't break anything that is hard to recover or leak any
> secrets, but that is not as safe on Buildbot. I consider
> virtualization and snapshots the last defence against evil PRs and
> prefer to have them.

I agree with this. Quick iteration with limited results is more useful
than the best solution that never gets done.

--
Clemens