Problem/Motivation
The provided reposync plugin shells out to git (via exec calls) to retrieve information from the real repository.
That is inefficient, and shelling out is a common source of security concerns.
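To make the concern concrete, here is a minimal sketch of the shell-out-and-parse pattern, written in Python rather than the module's PHP. The function names and the field layout are assumptions for illustration, not the plugin's actual code. It passes arguments as a list so that no shell interprets them, and it uses git's `%x1f` and `%x1e` format placeholders as separators so the output is split on bytes that are unlikely to appear in commit data.

```python
FIELD_SEP = "\x1f"   # ASCII unit separator, emitted by git for %x1f
RECORD_SEP = "\x1e"  # ASCII record separator, emitted by git for %x1e


def build_log_argv(repo_dir, rev):
    """Build an argument vector for `git log` (hypothetical helper).

    The list is meant to be handed to a process API that does not go
    through a shell, so characters such as ';' in `rev` stay inert.
    """
    if rev.startswith("-"):
        # Refuse values that git would treat as an option.
        raise ValueError("revision must not look like an option")
    return ["git", "-C", repo_dir, "log",
            "--format=%H%x1f%an%x1f%s%x1e", rev]


def parse_log(output):
    """Parse the output produced by the format string above."""
    commits = []
    for record in output.split(RECORD_SEP):
        record = record.strip("\n")
        if not record:
            continue
        # Split into at most three fields: hash, author name, subject.
        sha, author, subject = record.split(FIELD_SEP, 2)
        commits.append({"sha": sha, "author": author, "subject": subject})
    return commits
```

Even with these precautions, every call still spawns a process and re-parses text, which is the overhead a libgit2 binding would avoid.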
Proposed resolution
Use the PHP binding for the libgit2 library to implement a new reposync plugin.
A description of libgit2 from its page:
libgit2 is a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API, allowing you to write native speed custom Git applications in any language which supports C bindings.
Remaining tasks
- Implement the reposync plugin, making sure it passes all the related tests.
User interface changes
None.
API changes
None.
Original report by @marvil07
We currently shell out to git (via exec calls) to retrieve information from the real repository.
I would like to use a real PHP module that interacts with git directly, but unfortunately no such module exists yet.
There is, however, an initiative started some time ago to create a true git library that can be used internally; it was also part of GSoC 2010 (Vicent Marti).
As far as I know, libgit2 is currently under development on GitHub and already has bindings for Ruby, Python and Erlang.
I would like to add SWIG support for libgit2 so we can get a PHP binding ready and then use it (and help with the maintainability of bindings)!
I will post news about that here.
Comments
Comment #1
marvil07 commented
I mailed the libgit2 developers to ask for suggestions about this. They suggested not using SWIG, and I will follow that advice, since I have no real project experience with it.
BTW sdboyer also thinks SWIG is a bad idea.
So the way forward seems to be writing a PHP binding directly.
Comment #2
marvil07 commented
BTW if we cannot get this done in time, there is a backup plan: PEAR VersionControl_Git. Anyway, we want to avoid parsing git output if possible :-p
Comment #3
sdboyer commented
hah, that pear library is interesting. some pieces of it look a looooot like the git equivalent to svnlib that i've argued for.
Comment #4
marvil07 commented
Comment #5
marvil07 commented
Tagging as planning meeting, and hopefully waiting for feedback by tizzo and/or sdboyer ;-)
Comment #6
tizzo commented
justinrandell and I spent some time with this over the weekend. We tried compiling a PHP extension with SWIG, and SWIG really doesn't want to compile libgit2. After looking into what it takes, it seems like starting with CodeGen_PECL and then fleshing out the details by hand is probably a better approach. There's some good documentation about it if you hunt (found by justin).
I built the Git Repository Browser against glip and I have to say that the interface is really nice and intuitive (in as far as something representing git data can be). What I would like to see is the team standardizing on glip and then, when a PHP language binding becomes available for libgit2, writing an alternative implementation of the glip interface that uses low-level C to actually do the lifting.
I have to wonder, though, whether it's the best use of our time to worry about it. The git repo browser seems to be reasonably performant (at least with projects as big as D7 core, anyway). The place this would really help is with log parsing, that's the only place we need to do revwalking which is the really slow bit. Is it worth writing our own language bindings in order to avoid shelling out for log parsing when we have that part done and when d.o and any other serious install don't even need to do that because they'll be using git hooks?
Also, if we are using glip from multiple projects that don't actually rely on one another (the repo browser and git_deploy for starters), what's the best way to manage the dependency and prevent ourselves from including the library twice (especially when the includes are probably handled by autoload)?
Comment #7
Anonymous (not verified) commented
subscribe.
Comment #8
marvil07 commented
Today I looked at libgit2 to actually try to do something about the PHP binding, and I found that someone has already started one :-)
The binding lives at https://github.com/chobie/php-git and is in an early alpha state, but the basic stuff works (I was playing with it a little), and I think it is not that far from being ready for read-only operations.
The only catch is that this binding assumes PHP 5.3, mainly because it uses namespaces AFAIK, but the point is that there is some work there that we can join \o/
Comment #9
tizzo commented
WooHoo!!! Nice find Marco! I'll be toying with this in the near future for sure.
I'm starting to forge on with the repo browser so that I can have something really cool to show by Drupalcon! It looks like I might have to dig further into git for some data so this should really help keep things snappy.
What would we need to do to get this tested enough to consider deployment on D.O? Would we need someone to do a thorough code review? I don't know C or the PHP extension API well enough to be qualified.
Comment #10
Josh The Geek commented
php-git's latest comment is 'Info: php-git moved to /libgit2/php-git'. So, the new location is http://github.com/libgit2/php-git
Comment #11
marvil07 commented
Hey, it would be smarter to first turn the repo parsing into a plugin before doing this, so that we can choose at configuration time whether we want to use it.
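The idea of choosing the repository-sync implementation at configuration time can be sketched as follows. This is a minimal illustration in Python, not the module's PHP or its ctools plugin API; the `SyncPlugin` type, the `select_plugin` function and the plugin names "shell" and "libgit2" are all hypothetical. The configured plugin is used only if it reports itself as available, and otherwise the selection falls back to a default.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SyncPlugin:
    """A hypothetical repository-sync plugin descriptor."""
    name: str
    is_available: Callable[[], bool]  # e.g. checks that an extension is loaded


def select_plugin(configured: str,
                  registry: Dict[str, SyncPlugin],
                  fallback: str = "shell") -> SyncPlugin:
    """Return the configured plugin if usable, else the fallback plugin."""
    plugin = registry.get(configured)
    if plugin is not None and plugin.is_available():
        return plugin
    # Unknown or unavailable plugin: fall back to the default one.
    return registry[fallback]


# Example registry: the shell plugin always works, while the libgit2
# plugin is assumed here to be missing its binding.
registry = {
    "shell": SyncPlugin("shell", lambda: True),
    "libgit2": SyncPlugin("libgit2", lambda: False),
}
chosen = select_plugin("libgit2", registry)
```

With a layout like this, a site without the binding installed keeps working on the shell-based plugin, and switching later is a configuration change rather than a code change.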
Comment #12
marvil07 commented
repo-sync plugins now exist!
BTW there is a Debian package in progress for libgit2, but not yet for the PHP binding.
Comment #13
marvil07 commented
First on D7, then maybe backport (there will probably be no change).
Comment #14
Alex Andrascu commented
What's the status of this? Really interested in seeing this done. Also offering help if needed.
Comment #15
marvil07 commented
As mentioned in comment 12, the module now has repository synchronization decoupled as a ctools plugin. That means we are in patches-welcome status ;-)
BTW I see there is now a real libgit2-0 package for Debian. There is still none for the PHP binding, but that should not stop us from writing the code here.
Comment #16
sdboyer commented
don't need this for stable release.
Comment #17
drumm
This would make a lot of improvements possible, and make us more confident in security.
The next step is to figure out which bindings work for us with libgit2, and to make sure they are installed properly on our infrastructure.
Comment #18
drumm
Comment #19
marvil07 commented
Updating the issue summary: after 4 years the original description is outdated, and we now actually know what to do.
Comment #20
marvil07 commented
Just a new update, a.k.a. hopefully another year does not pass between updates here again.
About the Debian package status: (a) libgit2 is already in the current Debian stable release, jessie; and (b) the PHP binding for libgit2 did not make it into jessie, but there is already an RFP. The binding pins the libgit2 version it uses, so it will need to be installed manually anyway; i.e. we should use whatever version upstream points to.
Even though D8 RC is ready, I would say we can still do the D7 version, since the code should not change much: this plugin is mainly interaction code with the PHP libgit2 binding.