GitHub is wrong, and maybe it's your fault

Yes, you read that correctly. It could be your fault.

You have probably noticed that GitHub sometimes classifies your repository's language incorrectly. You publish a .NET application, and the repository is detected as a JavaScript repository. Has that ever happened to you?

In most cases this is not important information, and almost everybody ignores it. But this is how GitHub builds its statistics and measures language adoption among its users. Now imagine that a language is reported as widely adopted on the platform when that is not true.

Can you imagine how much this report is affected by those statistics?

https://octoverse.github.com/

But how does GitHub do that?

The platform uses a Ruby library called Linguist to determine the percentage of each language in the repository, based on multiple rules. That said, the tool sometimes reads the documentation or the "modules" folder and counts those files as if they were the project's own source code, which changes the statistics, and therefore the "repository language" and, consequently, the reports about code adoption and usage.
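
To see why a bundled library can dominate the result, here is a minimal Python sketch of a size-based language breakdown. It is only an illustration, not Linguist's actual code: the extension table, the function name language_shares, and the file sizes are all made up for this example, and the real tool applies far richer rules.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical extension-to-language table (assumption for this sketch only).
EXTENSION_LANGUAGE = {".cs": "C#", ".js": "JavaScript", ".py": "Python"}


def language_shares(files):
    """Take (path, size_in_bytes) pairs and return {language: percent}."""
    totals = defaultdict(int)
    for path, size in files:
        language = EXTENSION_LANGUAGE.get(PurePosixPath(path).suffix)
        if language is not None:  # unknown extensions are simply skipped
            totals[language] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {
        language: round(100 * size / grand_total, 1)
        for language, size in totals.items()
    }


# Made-up sizes: a small C# API next to a large bundled JavaScript library.
repo = [
    ("src/Api/Program.cs", 4_000),
    ("src/Api/Controllers/OrdersController.cs", 6_000),
    ("wwwroot/swagger/swagger-ui-bundle.js", 90_000),
]
shares = language_shares(repo)
print(shares)
```

With these invented numbers, a single bundled JavaScript file outweighs all of the C# code, so a size-based breakdown would label the repository as JavaScript.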

Impressive, isn't it?

OK, but how do you fix that?

This is pretty simple. According to the Linguist documentation, the rules used to calculate the statistics can be overridden in a .gitattributes file. There you can override existing rules or create new ones, and configure Linguist to ignore certain files or folders.

And this is the most important thing for me: ignoring files that are in the project but are not relevant to the statistics.

An example? The Swagger files. Although they are part of the project and live inside the repository, that doesn't make the project or the repository JavaScript-based, right?

In my scenario, the application is entirely .NET and has a Swagger front end. Because of the number of JavaScript files in this library, which I use but do not maintain, Linguist mistakenly inferred that my repository is JavaScript.

So, what did I do? I simply created a .gitattributes file with the following content, and a few seconds after I pushed it to the repository, the statistics were updated correctly.

# Example of a `.gitattributes` file that classifies these files as C#
*.cs linguist-language=csharp
*.csproj linguist-language=csharp
*.sln linguist-language=csharp

# And exclude these folders from the statistics by marking them as documentation
src/*/wwwroot/* linguist-documentation
wwwroot/** linguist-documentation
*/bin/* linguist-documentation
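
To make the rules above more concrete, here is a minimal Python sketch that reads them as pattern/attribute pairs and looks up what applies to a given path. It is not how Git or Linguist implement this: parse_rules and attributes_for are hypothetical helper names, the file paths are invented, and fnmatchcase is only a rough stand-in for Git's pattern matching, which treats "/" more strictly.

```python
from fnmatch import fnmatchcase

# The rules from the .gitattributes file above, copied as plain text.
GITATTRIBUTES = """\
*.cs linguist-language=csharp
*.csproj linguist-language=csharp
*.sln linguist-language=csharp
src/*/wwwroot/* linguist-documentation
wwwroot/** linguist-documentation
*/bin/* linguist-documentation
"""


def parse_rules(text):
    """Return (pattern, attributes) pairs, skipping blank lines and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, *attrs = line.split()
        parsed = {}
        for attr in attrs:
            key, _, value = attr.partition("=")
            parsed[key] = value or True  # a bare attribute means "set"
        rules.append((pattern, parsed))
    return rules


def attributes_for(path, rules):
    """Merge the attributes of every matching rule; later rules win."""
    merged = {}
    for pattern, attrs in rules:
        # Assumption: fnmatchcase approximates Git's matching for this demo.
        if fnmatchcase(path, pattern):
            merged.update(attrs)
    return merged


rules = parse_rules(GITATTRIBUTES)
cs_attrs = attributes_for("src/Api/Program.cs", rules)
js_attrs = attributes_for("wwwroot/swagger/swagger-ui.js", rules)
print(cs_attrs)
print(js_attrs)
```

In this sketch the C# file picks up the language override, while the bundled Swagger script is marked as documentation, which is the effect the rules are meant to have on the statistics.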

And this is the report after I pushed the .gitattributes file.

[Image: the repository's language statistics after the .gitattributes file was added]

Much better, right?

And you? Have you noticed this issue in your repositories too?

Please comment below and share this article with your colleagues.


Thanks,

Lucas Massena

Fantastic! My repo for my biggest webdev project so far is roughly 50/50 Python and JavaScript. Imagine my frustration upon seeing that it's telling me 99.3% Python! The whole point of it is to be a demonstration of not only my Django skills, but my full-stack skills, which include JS and React. So thanks for this, I'll try this out when I get home today.

Thank you! I ran into this scenario last week.

This is very important! Thanks!
