
With the rise of multi-language projects and open-source contributions, it is important to identify the programming language of a code snippet to ensure the right tools and libraries are used for analysis, compilation, and execution. This project focuses on building a machine learning model that can automatically detect the programming language used in a given code sample.
The project involves collecting a diverse dataset of code snippets from various programming languages, preprocessing the data to remove noise, and training a classification model to predict the language. The model will consider syntax, keywords, and structure to make accurate predictions. Such a system can be used in code editors, online coding platforms, and code review tools to enhance productivity and automate language detection.
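The collect, preprocess, and train steps described above can be sketched in a few lines. This is only a minimal illustration, not the project's actual model: it assumes a multinomial naive Bayes classifier over keyword and punctuation tokens, written with the Python standard library alone. The names `tokenize` and `NaiveBayesLanguageDetector`, and the six toy training snippets, are hypothetical and exist purely for demonstration.

```python
import math
import re
from collections import Counter, defaultdict

# A token is either an identifier/keyword or a single punctuation character.
# Numeric literals and whitespace are dropped as noise (an assumption of this sketch).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|[^\w\s]")


def tokenize(code):
    """Split a code snippet into keyword/identifier and punctuation tokens."""
    return TOKEN_RE.findall(code)


class NaiveBayesLanguageDetector:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, snippets, labels):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.label_counts = Counter(labels)       # label -> number of snippets
        for snippet, label in zip(snippets, labels):
            self.token_counts[label].update(tokenize(snippet))
        # Vocabulary = every token seen in any language.
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, snippet):
        tokens = tokenize(snippet)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.token_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny hand-written toy dataset (illustrative only, not real training data).
train_snippets = [
    "def add(a, b):\n    return a + b",
    "import os\nfor name in os.listdir('.'):\n    print(name)",
    "#include <stdio.h>\nint main(void) { printf(\"hi\\n\"); return 0; }",
    "int add(int a, int b) { return a + b; }",
    "function add(a, b) { return a + b; }",
    "const xs = [1, 2, 3];\nconsole.log(xs.length);",
]
train_labels = ["python", "python", "c", "c", "javascript", "javascript"]

detector = NaiveBayesLanguageDetector().fit(train_snippets, train_labels)
print(detector.predict("def square(x):\n    return x * x"))
```

With only six training snippets this toy detector is easy to fool. A realistic version would train on a large labelled corpus and would likely use richer features, such as character n-grams, alongside or instead of whole tokens.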
Basic knowledge of programming languages such as Python or R, and of NLP libraries such as NLTK or spaCy, is expected.
Before commencing the project, review the following links.
https://www.kaggle.com/
https://towardsai.net/
https://codeforces.com/
https://github.com/