| author | Franck Cuny <franck.cuny@gmail.com> | 2013-11-26 10:36:10 -0800 |
|---|---|---|
| committer | Franck Cuny <franck.cuny@gmail.com> | 2013-11-26 10:36:10 -0800 |
| commit | 8ddf2e94df70707b458528a437759b96046d3e01 (patch) | |
| tree | d442818d92d3c9c6f7fcdc92857a1228963849a1 /_posts/2010-03-25-github-explorer.textile | |
| parent | Don't need to use the IP in the makefile. (diff) | |
| download | lumberjaph-8ddf2e94df70707b458528a437759b96046d3e01.tar.gz | |
Huge update.
Moved all posts from textile to markdown. Updated all the CSS and
styles. Added a new page for the resume.
Diffstat (limited to '')
| -rw-r--r-- | _posts/2010-03-25-github-explorer.md (renamed from _posts/2010-03-25-github-explorer.textile) | 92 |
1 files changed, 45 insertions, 47 deletions
diff --git a/_posts/2010-03-25-github-explorer.textile b/_posts/2010-03-25-github-explorer.md
index 891e862..95a51d3 100644
--- a/_posts/2010-03-25-github-explorer.textile
+++ b/_posts/2010-03-25-github-explorer.md
@@ -1,22 +1,20 @@
 ---
 layout: post
-category: graph
-title: Github explorer
+summary: In which I write about GitHub Explorer.
+title: GitHub explorer
 ---
-bq. *More informations about the poster are available on "this post":http://lumberjaph.net/graph/2010/04/02/github-poster.html*
+> *More informations about the poster are available on [this post](http://lumberjaph.net/graph/2010/04/02/github-poster.html)*
-Last year, with help from my coworkers at "Linkfluence":http://linkfluence.net/, I created two sets of maps of the "Perl":http://perl.org and "CPAN":http://search.cpan.org/'s community. For this, I collected data from CPAN to create three maps:
+Last year, with help from my coworkers at [Linkfluence](http://linkfluence.net/), I created two sets of maps of the [Perl](http://perl.org) and [CPAN](http://search.cpan.org/)'s community. For this, I collected data from CPAN to create three maps:
- * "dependencies between distributions":http://cpan-explorer.org/2009/07/28/new-version-of-the-distributions-map-for-yapceu/
+ * [dependencies between distributions](http://cpan-explorer.org/2009/07/28/new-version-of-the-distributions-map-for-yapceu/)
+ * [which authors wre important in term of reliability](http://cpan-explorer.org/2009/07/28/version-of-the-authors-graph-for-yapceu/)
+ * [and how the websites theses authors are structured](http://cpan-explorer.org/2009/07/28/new-web-communities-map-for-yapceu/)
- * "which authors wre important in term of reliability":http://cpan-explorer.org/2009/07/28/version-of-the-authors-graph-for-yapceu/
+I wanted to do something similar again, but not with the same data. So I took a look at what could be a good subject.
One of the things that we saw from the map of the websites is the importance [GitHub](http://github.com/) is gaining inside the Perl community. GitHub provides a [really good API](http://develop.github.com/), so I started to play with it.
- * "and how the websites theses authors are structured":http://cpan-explorer.org/2009/07/28/new-web-communities-map-for-yapceu/
-
-I wanted to do something similar again, but not with the same data. So I took a look at what could be a good subject. One of the things that we saw from the map of the websites is the importance "github":http://github.com/ is gaining inside the Perl community. Github provides a "really good API":http://develop.github.com/, so I started to play with it.
-
-bq. This graph will be printed on a poster, size will be "A2":http://en.wikipedia.org/wiki/A2_paper_size and "A1":http://en.wikipedia.org/wiki/A1_paper_size". Please, contact me *(franck.cuny [at] linkfluence.net)* if you will be interested by one.
+> This graph will be printed on a poster, size will be [A2](http://en.wikipedia.org/wiki/A2_paper_size) and [A1](http://en.wikipedia.org/wiki/A1_paper_size). Please, contact me franck.cuny [at] linkfluence.net if you will be interested by one.
 <img class="img_center" src="/static/imgs/general.png" title="github explorer global" />
@@ -24,42 +22,42 @@ bq. This graph will be printed on a poster, size will be "A2":http://en.wikipedi
 This time, I didn't aim for the Perl community only, but the whole github communities. I've created several graphs:
-bq. 
all the graph are available "on my flickr account":http://www.flickr.com/photos/franck_/sets/72157623447857405/
+> all the graph are available "on my flickr account":http://www.flickr.com/photos/franck_/sets/72157623447857405/
- * "a graph of all languages":http://www.flickr.com/photos/franck_/4460144638/
- * "a graph of the Perl community":http://www.flickr.com/photos/franck_/4456072255/in/set-72157623447857405/
- * "a graph of the Ruby community":http://www.flickr.com/photos/franck_/4456914448/
- * "a graph of the Python community":http://www.flickr.com/photos/franck_/4456118597/in/set-72157623447857405/
- * "a graph of the PHP community":http://www.flickr.com/photos/franck_/4456830956/in/set-72157623447857405/
- * "a graph of the European community":http://www.flickr.com/photos/franck_/4456862434/in/set-72157623447857405/
- * "a graph of the Japan community":http://www.flickr.com/photos/franck_/4456129655/in/set-72157623447857405/
+* [a graph of all languages](http://www.flickr.com/photos/franck_/4460144638/)
+* [a graph of the Perl community](http://www.flickr.com/photos/franck_/4456072255/in/set-72157623447857405/)
+* [a graph of the Ruby community](http://www.flickr.com/photos/franck_/4456914448/)
+* [a graph of the Python community](http://www.flickr.com/photos/franck_/4456118597/in/set-72157623447857405/)
+* [a graph of the PHP community](http://www.flickr.com/photos/franck_/4456830956/in/set-72157623447857405/)
+* [a graph of the European community](http://www.flickr.com/photos/franck_/4456862434/in/set-72157623447857405/)
+* [a graph of the Japan community](http://www.flickr.com/photos/franck_/4456129655/in/set-72157623447857405/)
-I think a disclaimer is important at this point. I know that github doesn't represent the whole open source community. With these maps, I don't claim to represent what the open source world looks like right now. This is not a troll about which language is best, or used at large. It's *ONLY* about github.
+I think a disclaimer is important at this point. I know that github doesn't represent the whole open source community. With these maps, I don't claim to represent what the open source world looks like right now. This is not a troll about which language is best, or used at large. It's **ONLY** about GitHub.
-Also, I won't provide deep analysis for each of these graphs, as I lack insight about some of those communities. So feel free to "re-use the graphs":http://franck.lumberjaph.net/graphs.tgz and provide your own analyses.
+Also, I won't provide deep analysis for each of these graphs, as I lack insight about some of those communities. So feel free to [re-use the graphs](http://franck.lumberjaph.net/graphs.tgz) and provide your own analyses.
-h3. Methodology
+## Methodology
-I didn't collect all the profiles. We (with "Guilhem":http://twitter.com/gfouetil decided to limit to peoples who are followed by at least two other people. We did the same thing for repositories, limiting to repositories which are at least forked once. Using this technique, more than 17k profiles have been collected, and nearly as many repositories.
+I didn't collect all the profiles. We (with [Guilhem](http://twitter.com/gfouetil) decided to limit to peoples who are followed by at least two other people. We did the same thing for repositories, limiting to repositories which are at least forked once. Using this technique, more than 17k profiles have been collected, and nearly as many repositories.
-For each profile, using the github API, I've tried to determine what the main language for this person is. And with the help of the "geonames":http://www.geonames.org, find the right country to attach the profile to.
+For each profile, using the github API, I've tried to determine what the main language for this person is. And with the help of the [geonames](http://www.geonames.org), find the right country to attach the profile to.
 Each profile is represented by a node. 
For each node, the following attributes are set:
- * name of the profile
- * main language used by this profile, determined by github
- * name of the country
- * follower count
- * following count
- * repository count
+* name of the profile
+* main language used by this profile, determined by github
+* name of the country
+* follower count
+* following count
+* repository count
 An edge is a link between two profiles. Each time someone follows another profile, a link is created. By default, the weight of this link is 1. For each project this person forked from the target profile, the weight is incremented.
-As always, I've used "Gephi":http://gephi.org/ (now in version 0.7) to create the graphs. Feel free to download the various graph files and use them with Gephi.
+As always, I've used [Gephi](http://gephi.org/) (now in version 0.7) to create the graphs. Feel free to download the various graph files and use them with Gephi.
-h3. Github
+## Github
-bq. properties of the graph: 16443 nodes / 130650 edges
+> properties of the graph: 16443 nodes / 130650 edges
 <a href="http://www.flickr.com/photos/franck_/4460144638/" title="Github - All - by languages by franck.cuny, on Flickr"><img class="img_center" src="http://farm5.static.flickr.com/4027/4460144638_48e7d83e80.jpg" width="482" height="500" alt="Github - All - by languages" /></a>
@@ -67,11 +65,11 @@ The first map is about all the languages available on github. This one was real
 You can't miss Ruby on this map. As github uses Ruby on Rails, it's not really surprising that the Ruby community has a particular interest on this website. The main languages on github are what we can expect, with PHP, Python, Perl, Javascript.
-Some languages are not really well represented. We can assume that most Haskell projects might use darcs, and therefore are not on github. Some other languages may use other platforms, like launchpad, or sourceforge.
+Some languages are not really well represented. 
We can assume that most Haskell projects might use darcs, and therefore are not on github. Some other languages may use other platforms, like launchpad, or sourceforge.
-h3. Perl
+## Perl
-bq. properties of the graph: 365 nodes / 4440 edges
+> properties of the graph: 365 nodes / 4440 edges
 <a href="http://www.flickr.com/photos/franck_/4456842344/" title="Perl community on Github by franck.cuny, on Flickr"><img src="http://farm5.static.flickr.com/4002/4456842344_06f39127a8.jpg" class="img_center" width="500" height="437" alt="Perl community on Github" /></a>
@@ -83,17 +81,17 @@ One important project that is not (deliberately) represented on this graph is th
 To conclude about Perl, there are only 365 nodes on this graph, but no less than 4440 edges. That's nearly two times the number of edges compared to the Python community. Perl is a really well structured community, probably thanks to the CPAN, which already acted as hub for contributors.
-h3. Python
+## Python
-<blockquote>properties of the graph: 532 nodes / 2566 edges</blockquote>
+> properties of the graph: 532 nodes / 2566 edges
 <a href="http://www.flickr.com/photos/franck_/4456118597/" title="Python community, by country, on Github by franck.cuny, on Flickr"><img src="http://farm3.static.flickr.com/2676/4456118597_9d39f8d413.jpg" class="img_center" width="470" height="500" alt="Python community, by country, on Github" /></a>
 The Python community looks a lot like the Perl community, but only in the structure of the graph. If we look closely, <a href="http://www.djangoproject.com/">Django</a> is the main project that represent Python on Github, in contrast with Perl where there is no leader. Some small projects gather small community of developers.
-h3. 
PHP
+## PHP
-<blockquote>properties of the graph: 301 nodes / 1071 edges</blockquote>
+> properties of the graph: 301 nodes / 1071 edges
 <a href="http://www.flickr.com/photos/franck_/4456830956/" title="PHP community on Github by franck.cuny, on Flickr"><img src="http://farm5.static.flickr.com/4033/4456830956_ef0e8f3587.jpg" class="img_center" width="500" height="372" alt="PHP community on Github" /></a>
@@ -101,9 +99,9 @@ PHP is the only community that is structured this way on Github. We can clearly
 <a href="http://cakephp.org/">CakePHP</a> and <a href="http://www.symfony-project.org/">Symphony</a> are the two main projects. Nearly all the projects gather an international community, at the exception of a few japanese-only projects
-h3. Ruby
+## Ruby
-<blockquote>properties of the graph: 3742 nodes / 24571 edges</blockquote>
+> properties of the graph: 3742 nodes / 24571 edges
 <a href="http://www.flickr.com/photos/franck_/4456914448/" title="Ruby community, by country, on Github by franck.cuny, on Flickr"><img src="http://farm5.static.flickr.com/4012/4456914448_8089c3acca.jpg" class="img_center" width="500" height="469" alt="Ruby community, by country, on Github" /></a>
@@ -111,17 +109,17 @@ As for the Github graph, we can clearly see that some countries are isolated. On
 The main projects that gather most of the hackers are <a href="http://rubyonrails.org/">Rails</a> and <a href="http://sinatrarb.com/">Sinatra</a>, two famous web frameworks.
-h3. Europe
+## Europe
-<blockquote>properties of the graph: 2711 nodes / 11259 edges</blockquote>
+> properties of the graph: 2711 nodes / 11259 edges
 <a href="http://www.flickr.com/photos/franck_/4456862434/" title="Europe community on Github by franck.cuny, on Flickr"><img src="http://farm5.static.flickr.com/4062/4456862434_324e7b2c75.jpg" class="img_center" width="500" height="450" alt="Europe community on Github" /></a>
 This one shows interesting features. Some countries are really isolated. 
If we look at Spain, we can see a community of Ruby programmers, with an important connectivity between them, but no really strong connection with any foreign developers. We can clearly see the Perl community exists as only one community, and is not split by country. The same is true for Python.
-h3. Japanese hackers community
+## Japanese hackers community
-<blockquote>properties of the graph: 559 nodes / 5276 edges</blockquote>
+> properties of the graph: 559 nodes / 5276 edges
 <a href="http://www.flickr.com/photos/franck_/4456129655/" title="Japan community on github by franck.cuny, on Flickr"><img src="http://farm3.static.flickr.com/2800/4456129655_8c6f7f20a0.jpg" class="img_center" width="500" height="410" alt="Japan community on github" /></a>
@@ -133,7 +131,7 @@ We have seen in the previous graph that the Japanese hackers are always isolated
 This is a really well-connected graph too.
-h3. Conclusions and graphs
+## Conclusions and graphs
 I may have not provided a deep analysis of all the graph. I don't have knowledge of most of the community outside of Perl. Feel free to <a href="http://franck.lumberjaph.net/blog/graphs.tgz">download the graph</a>, to load them in <a href="http://gephi.org/">Gephi</a>, experiment, and provides your own thoughts.
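
The Methodology section of the post touched by this diff describes how edges are weighted: a follow creates a link of weight 1, and each project forked from the followed profile increments that weight. A minimal Python sketch of that rule follows. It is not the author's code: the function name `build_edges` and the `(source, target)` pair format are hypothetical, and it assumes that a fork without a matching follow does not create an edge, which the post does not state.

```python
def build_edges(follows, forks):
    """Return {(follower, followed): weight} for the follow graph.

    follows: iterable of (follower, followed) pairs.
    forks:   iterable of (forker, owner) pairs, one per forked project.
    Both input shapes are assumptions made for this sketch.
    """
    # Every follow relationship starts as an edge of weight 1.
    weights = {pair: 1 for pair in set(follows)}

    # Each project forked from the followed profile adds 1 to that edge.
    # Assumption: forks between profiles with no follow link are ignored.
    for pair in forks:
        if pair in weights:
            weights[pair] += 1
    return weights


if __name__ == "__main__":
    follows = [("alice", "bob"), ("carol", "bob")]
    forks = [("alice", "bob"), ("alice", "bob"), ("carol", "dave")]
    for (source, target), weight in sorted(build_edges(follows, forks).items()):
        print(source, target, weight)
```

The resulting dictionary could be written out as a weighted edge list for a tool such as Gephi. The node attributes listed in the post (language, country, follower counts) would need a separate table.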
