Waren Long, source{d}.
PyDays Vienna, 2018
Programming language history of GitHub user X
$$ (\mathcal{P})~~ \left\{ \begin{array}{lll} \min & \sum_{i=1}^S \sum_{j=1}^D ~ x_{i,j} c_{i,j} \\ \mbox{s.t.} & \sum_{j=1}^D x_{i,j} \leq s_i & ~~i = 1,...,S \\ & \sum_{i=1}^S x_{i,j} \geq d_j & ~~j = 1,...,D \\ & x_{i,j} \geq 0 & ~~i = 1,...,S,~ j = 1,...,D \end{array} \right. $$
Hypothesis: $$\sum_{i=1}^S s_i = \sum_{j=1}^D d_j ~~~~~\mbox{and}~~~~~ c_{i,j} = 1 ~~~~\forall i,j$$
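Under this hypothesis the objective is just the total amount shipped, so any feasible plan that ships exactly the demanded total is optimal. A minimal sketch of one way to build such a plan, the classic north-west corner rule, is given here. The function name `northwest_corner` and the toy supply and demand values are illustrative assumptions, not taken from the talk.

```python
def northwest_corner(supply, demand):
    """Return a feasible plan x[i][j] for a balanced transportation problem."""
    if abs(sum(supply) - sum(demand)) > 1e-9:
        raise ValueError("total supply must equal total demand")
    s = list(supply)   # remaining supply per source
    d = list(demand)   # remaining demand per destination
    x = [[0.0] * len(d) for _ in s]
    i = j = 0
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])   # ship as much as both sides allow
        x[i][j] = q
        s[i] -= q
        d[j] -= q
        if s[i] <= 1e-12:
            i += 1            # source i is exhausted
        else:
            j += 1            # destination j is satisfied
    return x


# Toy data (hypothetical): two sources, two destinations, balanced totals.
plan = northwest_corner([3, 2], [1, 4])
# With c[i][j] = 1 everywhere, the cost is simply the total shipped.
cost = sum(sum(row) for row in plan)
```

With unit costs the choice of plan does not change the objective value, which is why a simple constructive rule is enough for this illustration.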
Repeat until convergence:
$$x_{i+1} = P\cdot x_i$$
x: stationary distribution of the Markov chain associated with P
Convergence is guaranteed if P is stochastic, irreducible and aperiodic.
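The iteration can be sketched in a few lines of plain Python. This sketch assumes P is column-stochastic (each column sums to 1), so that multiplying on the left keeps x a probability distribution. The function name `power_iteration`, the tolerance, and the 2x2 example matrix are illustrative assumptions.

```python
def power_iteration(P, tol=1e-12, max_iter=10_000):
    """Iterate x <- P x from the uniform distribution until x stops changing."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # Matrix-vector product P @ x, written out with plain lists.
        x_next = [sum(P[r][c] * x[c] for c in range(n)) for r in range(n)]
        # Stop when the L1 distance between successive iterates is tiny.
        if sum(abs(a - b) for a, b in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x


# Hypothetical 2-state chain; every column sums to 1.
P = [[0.9, 0.5],
     [0.1, 0.5]]
x = power_iteration(P)
```

For this toy matrix the fixed point satisfies 0.1 x_0 = 0.5 x_1, so the result should be close to (5/6, 1/6).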
Larry Page and Sergey Brin had exactly the same objective with the link matrix of the WWW.
$C_{i,j}$ is 1 if web page i links to web page j and 0 otherwise.
They invented a trick to make C well conditioned.
They called it PageRank.
Update the transition matrix as follows:
$$ P = \beta P + \frac{1-\beta}{N}\left( \begin{array}{cccc} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \\ \end{array} \right) $$
N: number of languages
β: damping (random walk) factor, usually 0.85
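A small sketch of why the update helps, assuming the same column-stochastic convention as the iteration $x_{i+1} = P\cdot x_i$. The two-state matrix below is a hypothetical worst case: it is periodic, so the plain iteration oscillates forever, while the damped matrix has strictly positive entries and converges. The names `damp` and `step` are illustrative.

```python
def damp(P, beta=0.85):
    """Return beta * P + (1 - beta) / N * ones, with N = len(P)."""
    n = len(P)
    return [[beta * P[r][c] + (1.0 - beta) / n for c in range(n)]
            for r in range(n)]


def step(P, x):
    """One iteration x <- P x."""
    return [sum(P[r][c] * x[c] for c in range(len(x))) for r in range(len(P))]


# Hypothetical periodic chain: state 0 always goes to 1 and back.
P = [[0.0, 1.0],
     [1.0, 0.0]]
G = damp(P)

x_plain = [1.0, 0.0]
x_damped = [1.0, 0.0]
for _ in range(200):
    x_plain = step(P, x_plain)     # keeps flipping between (1, 0) and (0, 1)
    x_damped = step(G, x_damped)   # settles at the uniform distribution
```

The damped matrix stays stochastic because each column receives (1 - β) in total from the uniform term and β from the original column.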
| Rank | Language | Popularity, % | Source code, % |
|---|---|---|---|
| 1 | Python | 17.7 | 11.0 |
| 2 | Java | 15.5 | 16.2 |
| 3 | C | 10.0 | 16.8 |
| 4 | C++ | 9.9 | 12.3 |
| 5 | PHP | 8.8 | 23.8 |
| 6 | Ruby | 8.7 | 2.5 |
GitHub annual stats