Source: O’Grady, Stephen, The RedMonk Programming Language Rankings: January 2019, redmonk.com, March 20, 2019, https://redmonk.com/sogrady/2019/03/20/language-rankings-1-19/
Back in December, I published a couple of blog posts about Data Scientists (the links are here and here). The first post discussed what a Data Scientist is, and the second discussed the skills a Data Scientist needs to have. I thought this chart, created by Stephen O’Grady and Rachel Stephens of RedMonk.com, would be helpful for those considering which programming languages would be best to learn to advance their careers. In addition, IT departments will find it helpful as they evaluate programming languages for future adoption.
Mr. O’Grady noted that these charts are a continuation of the work originally performed by Drew Conway and John Myles White late in 2010. While the specific means of collection have changed, the basic process remains the same: they extracted language rankings from GitHub and Stack Overflow, and combined them for a ranking that attempts to reflect both code (GitHub) and discussion (Stack Overflow) traction. The idea is not to offer a statistically valid representation of current usage, but rather to correlate language discussion and usage in an effort to extract insights into potential future adoption trends.
RedMonk’s Current Process
The data source used for the GitHub portion of the analysis is the GitHub Archive. RedMonk queried languages by pull request in a manner similar to the one GitHub used to assemble the 2016 State of the Octoverse. Their query is designed to be as comparable as possible to the previous process.
- Language is based on the base repository language. While this continues to have the caveats outlined below, it does have the benefit of cohesion with their previous methodology.
- They excluded forked repos.
- They used the aggregated history to determine ranking (though, because of changes to the table structure, this can no longer be accomplished via a single query).
For Stack Overflow, RedMonk collected the required metrics using Stack Overflow’s data explorer tool.
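To make the idea of combining the two sources concrete, here is a minimal Python sketch. It is not RedMonk's actual code: the language names and counts are hypothetical, the function names are my own, and the choice to combine the two axes by summing rank positions is an assumption made purely for illustration, since the write-up does not spell out the exact formula.

```python
def rank_positions(counts):
    """Map each language to its 1-based position, largest count first.

    Equal counts are ordered alphabetically so the result is deterministic.
    """
    ordered = sorted(counts, key=lambda lang: (-counts[lang], lang))
    return {lang: position for position, lang in enumerate(ordered, start=1)}


def combined_ranking(github_counts, stackoverflow_counts):
    """Combine two per-language counts into one ordered list of (language, score)."""
    # Only languages observable in both sources are ranked.
    shared = set(github_counts) & set(stackoverflow_counts)
    gh = rank_positions({lang: github_counts[lang] for lang in shared})
    so = rank_positions({lang: stackoverflow_counts[lang] for lang in shared})
    # ASSUMPTION: the combined score is the sum of the two positions
    # (lower is better). The real combination method may differ.
    scores = {lang: gh[lang] + so[lang] for lang in shared}
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Hypothetical counts, for illustration only (not real GitHub or Stack Overflow data).
github_counts = {"LangA": 900, "LangB": 500, "LangC": 700, "LangD": 50}
stackoverflow_counts = {"LangA": 400, "LangB": 800, "LangC": 300, "LangE": 100}

result = combined_ranking(github_counts, stackoverflow_counts)
for lang, score in result:
    print(lang, score)
```

In this sketch, LangD and LangE drop out because each appears in only one of the two sources, which is why a language with a community based mostly outside one of these sites can be missing or under-represented.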
Caveats Noted by RedMonk
- To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations RedMonk believes to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. RedMonk encourages interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. RedMonk ranks by number here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data is available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
- Languages that have communities based outside of Stack Overflow, such as Mathematica, will be under-represented on that axis. It is not possible to scale a process that measures one hundred different community sites, both because many do not have public metrics available and because measuring different community sites against one another is not statistically valid.
With that, here is RedMonk’s first quarter plot of Programming Language Rankings for 2019.
Besides the above plot, which can be difficult to parse even at full size, RedMonk also offered the following numerical rankings of the top 20 programming languages. RedMonk noted that this run produced several ties, which are reflected below (they are listed out here alphabetically rather than consolidated as ties because the latter approach led to misunderstandings).
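For readers who want to produce a similar list from their own data, the tie handling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores and language names are hypothetical, the function name is my own, and I assume that tied languages share a rank number while the next language skips ahead (so two languages tied at 2 are followed by 4).

```python
def ranked_list_with_ties(scores):
    """Return (rank, language) rows; lower score is better.

    Tied languages share a rank and are listed alphabetically, one per row,
    rather than being consolidated into a single entry.
    """
    ordered = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    rows = []
    previous_score = None
    rank = 0
    for position, (lang, score) in enumerate(ordered, start=1):
        if score != previous_score:
            # ASSUMPTION: a new score takes its list position as its rank,
            # so numbers are skipped after a tie.
            rank = position
            previous_score = score
        rows.append((rank, lang))
    return rows


# Hypothetical combined scores, for illustration only.
scores = {"LangC": 4, "LangA": 2, "LangB": 4, "LangD": 7}

rows = ranked_list_with_ties(scores)
for rank, lang in rows:
    print(rank, lang)
```

Listing LangB and LangC on separate rows with the same number keeps every language visible in the list, which seems to be the point of RedMonk's choice not to consolidate ties.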
RedMonk noted that there was little movement within their Tier 1 languages compared to their previous ranking. Generally speaking, the top ten to twelve languages in these rankings tend to be relatively static, with changes both rare and minor in nature. While the landscape remains fantastically diverse in terms of technologies and approaches employed, including the variety of programming languages in common circulation, code written and discussion are counting metrics, and thus characterized by gradual growth or increase. This makes growth for new languages tougher to come by the higher they ascend the rankings – which makes any rapid growth that much more noticeable.
For additional details regarding these rankings, please visit RedMonk’s website. I provide the link here.