Diffstat (limited to '')
-rw-r--r-- posts/2010-03-25-github-explorer.org 236
1 files changed, 0 insertions, 236 deletions
diff --git a/posts/2010-03-25-github-explorer.org b/posts/2010-03-25-github-explorer.org
deleted file mode 100644
index efa4816..0000000
--- a/posts/2010-03-25-github-explorer.org
+++ /dev/null
@@ -1,236 +0,0 @@
-#+BEGIN_QUOTE
- /More information about the poster is available in
- [[http://lumberjaph.net/graph/2010/04/02/github-poster.html][this
- post]]/
-#+END_QUOTE
-
-Last year, with help from my coworkers at
-[[http://linkfluence.net/][Linkfluence]], I created two sets of maps of
-the [[http://perl.org][Perl]] and [[http://search.cpan.org/][CPAN]]
-communities. For this, I collected data from CPAN to create three maps:
-
-- [[http://cpan-explorer.org/2009/07/28/new-version-of-the-distributions-map-for-yapceu/][dependencies
- between distributions]]
-- [[http://cpan-explorer.org/2009/07/28/version-of-the-authors-graph-for-yapceu/][which
- authors were important in terms of reliability]]
-- [[http://cpan-explorer.org/2009/07/28/new-web-communities-map-for-yapceu/][and
- how the websites of these authors are structured]]
-
-I wanted to do something similar again, but not with the same data. So I
-took a look at what could be a good subject. One of the things that we
-saw from the map of the websites is the growing importance of
-[[http://github.com/][GitHub]] inside the Perl community.
-GitHub provides a [[http://develop.github.com/][really good API]], so I
-started to play with it.
-
-#+BEGIN_QUOTE
- This graph will be printed as a poster, in
- [[http://en.wikipedia.org/wiki/A2_paper_size][A2]] and
- [[http://en.wikipedia.org/wiki/A1_paper_size][A1]] sizes. Please contact me at
- franck.cuny [at] linkfluence.net if you are interested in one.
-#+END_QUOTE
-
-This time, I didn't aim for the Perl community only, but for the whole
-GitHub community. I've created several graphs:
-
-#+BEGIN_QUOTE
- All the graphs are available [[http://www.flickr.com/photos/franck_/sets/72157623447857405/][on my Flickr
- account]].
-#+END_QUOTE
-
-- [[http://www.flickr.com/photos/franck_/4460144638/][a graph of all
- languages]]
-- [[http://www.flickr.com/photos/franck_/4456072255/in/set-72157623447857405/][a
- graph of the Perl community]]
-- [[http://www.flickr.com/photos/franck_/4456914448/][a graph of the
- Ruby community]]
-- [[http://www.flickr.com/photos/franck_/4456118597/in/set-72157623447857405/][a
- graph of the Python community]]
-- [[http://www.flickr.com/photos/franck_/4456830956/in/set-72157623447857405/][a
- graph of the PHP community]]
-- [[http://www.flickr.com/photos/franck_/4456862434/in/set-72157623447857405/][a
- graph of the European community]]
-- [[http://www.flickr.com/photos/franck_/4456129655/in/set-72157623447857405/][a
- graph of the Japanese community]]
-
-I think a disclaimer is important at this point. I know that GitHub
-doesn't represent the whole open source community. With these maps, I
-don't claim to represent what the open source world looks like right
-now. This is not a troll about which language is best, or most widely used.
-It's *ONLY* about GitHub.
-
-Also, I won't provide deep analysis for each of these graphs, as I lack
-insight about some of those communities. So feel free to
-[[http://franck.lumberjaph.net/graphs.tgz][re-use the graphs]] and
-provide your own analyses.
-
-** Methodology
-
-I didn't collect all the profiles. We (with
-[[http://twitter.com/gfouetil][Guilhem]]) decided to limit ourselves to people who
-are followed by at least two other people. We did the same thing for
-repositories, limiting to repositories which have been forked at least once.
-Using this technique, more than 17k profiles have been collected, and
-nearly as many repositories.
-
-For each profile, using the GitHub API, I've tried to determine what the
-main language for this person is and, with the help of
-[[http://www.geonames.org][GeoNames]], to find the right country to attach
-the profile to.
-
-Each profile is represented by a node. For each node, the following
-attributes are set:
-
-- name of the profile
-- main language used by this profile, determined by GitHub
-- name of the country
-- follower count
-- following count
-- repository count
-
-An edge is a link between two profiles. Each time someone follows
-another profile, a link is created. By default, the weight of this link
-is 1. For each project this person forked from the target profile, the
-weight is incremented.
-
-As always, I've used [[http://gephi.org/][Gephi]] (now in version 0.7)
-to create the graphs. Feel free to download the various graph files and
-use them with Gephi.
-
-** GitHub
-
-#+BEGIN_QUOTE
- properties of the graph: 16443 nodes / 130650 edges
-#+END_QUOTE
-
-The first map is about all the languages available on GitHub. This one
-was really heavy, with more than 17k nodes, and 130k edges. The final
-version of the graph uses the 2270 most connected nodes.
-
-You can't miss Ruby on this map. As GitHub uses Ruby on Rails, it's not
-really surprising that the Ruby community has a particular interest in
-this website. The other main languages on GitHub are what we can expect:
-PHP, Python, Perl, JavaScript.
-
-Some languages are not really well represented. We can assume that most
-Haskell projects use darcs, and therefore are not on GitHub. Some
-other languages may use other platforms, like Launchpad or SourceForge.
-
-** Perl
-
-#+BEGIN_QUOTE
- properties of the graph: 365 nodes / 4440 edges
-#+END_QUOTE
-
-The Perl community is split into two parts. On the left side, there is
-the Western community, driven by people like
-[[http://github.com/rafl][Florian]], [[http://github.com/nothingmuch][Yuval]],
-[[http://github.com/rjbs][rjbs]], ... The second part is the Japanese Perl
-hackers, with Tokuhirom, Typester, Yappo, ... And in between them,
-Miyagawa acts as a glue. This map looks a lot like the previous map of
-the CPAN. We can see that this community is international, with the
-exception of Japan, which doesn't mix with the others.
-
-There is no main project on GitHub that gathers people, even though we
-can see a fair amount of MooseX:: projects. Most of the developers
-work on different modules, which may not have the same purpose. Lately we
-have seen a fair amount of work on various Plack stuff, mainly
-middleware, but also HTTP servers (Twiggy, Starman, ...) and web
-frameworks (Dancer).
-
-One important project that is not (deliberately) represented on this
-graph is the gitpan, Schwern's project. The gitpan is an import of all
-the CPAN modules, with complete history using the Backpan.
-
-To conclude about Perl, there are only 365 nodes on this graph, but no
-less than 4440 edges. That's nearly two times the number of edges
-compared to the Python community. Perl is a really well structured
-community, probably thanks to the CPAN, which already acted as a hub for
-contributors.
-
-** Python
-
-#+BEGIN_QUOTE
- properties of the graph: 532 nodes / 2566 edges
-#+END_QUOTE
-
-The Python community looks a lot like the Perl community, but only in
-the structure of the graph. If we look closely, Django is the main
-project that represents Python on GitHub, in contrast with Perl where
-there is no leader. Some small projects gather small communities of
-developers.
-
-** PHP
-
-#+BEGIN_QUOTE
- properties of the graph: 301 nodes / 1071 edges
-#+END_QUOTE
-
-PHP is the only community that is structured this way on GitHub. We can
-clearly see that people are grouped around the project to which they
-mainly contribute.
-
-CakePHP and Symphony are the two main projects. Nearly all the projects
-gather an international community, with the exception of a few
-Japanese-only projects.
-
-** Ruby
-
-#+BEGIN_QUOTE
- properties of the graph: 3742 nodes / 24571 edges
-#+END_QUOTE
-
-As in the GitHub graph, we can clearly see that some countries are
-isolated. On the right side, the Japanese community is at the
-bottom and the Spanish at the top. Australians are in the upper
-right corner, while on the left side we have the Brazilians.
-
-The main projects that gather most of the hackers are Rails and Sinatra,
-two famous web frameworks.
-
-** Europe
-
-#+BEGIN_QUOTE
- properties of the graph: 2711 nodes / 11259 edges
-#+END_QUOTE
-
-This one shows interesting features. Some countries are really isolated.
-If we look at Spain, we can see a community of Ruby programmers, with
-strong connectivity among them, but no really strong connection
-with any foreign developers. We can clearly see that the Perl community
-exists as only one community, and is not split by country. The same is
-true for Python.
-
-** Japanese hackers community
-
-#+BEGIN_QUOTE
- properties of the graph: 559 nodes / 5276 edges
-#+END_QUOTE
-
-This community is unique on GitHub. In 2007, Yappo created
-coderepos.org, a repository for open source developers in Japan. It was
-a Subversion repository, with Trac as an HTTP front-end. It gathered
-around 900 developers, with all kinds of projects (Perl, Python, Ruby,
-JavaScript, ...). Most of these users have switched to GitHub now.
-
-Three main communities are visible on this graph: Perl, Ruby and PHP. As
-always, the JavaScript community acts as a glue between them. And yes, we can
-confirm that Perl is big in Japan.
-
-We have seen in the previous graphs that the Japanese hackers are always
-isolated. We can assume that the language barrier is an obstacle.
-
-This is a really well-connected graph too.
-
-** Conclusions and graphs
-
-I may not have provided a deep analysis of all the graphs. I don't have
-knowledge of most of the communities outside of Perl. Feel free to
-download the graphs, load them in Gephi, experiment, and provide your
-own thoughts.
-
-I would like to thank everybody at Linkfluence (Guilhem for his
-advice, Camille for giving me time to work on this, and Antonin for the
-amazing poster), who have helped me and let me use time and resources to
-finish this work. Special thanks to blob for reviewing my prose and cdlm
-for the discussion :)