E-mail: aniket.diat@gmail.com
https://in.linkedin.com/in/aniket-gurav-a1a9a129
Data Scientist
Analyst
Projects
At Digitially Insight:
Aspect Based Sentiment Analysis:
The aim of this project was to provide aspect-based sentiment to the end user. The data source was product reviews
on Twitter. The work involved fetching review tweets about a product using an API designed in Python, then cleaning
these tweets to reduce noise and keep only relevant data. Aspects were extracted from the cleaned data using an
unsupervised approach. Sentiment was then determined for aspect-level sentences using supervised classifiers such as
SVM, Random Forest, and Naive Bayes. Other tasks performed included named entity extraction using the NLTK /
Stanford CoreNLP parser, POS tagging and stemming of tweets, text clustering, sentence similarity using word2vec,
and sentiment analysis using word2vec.
At CredR:
This task involved statistical analysis of bike sales data over a period of time. The data was processed using machine
learning techniques such as clustering and PCA. The data was then visualised, and meaningful insights were found
that help in making business decisions.
Another task involved capturing accelerometer data gathered by a mobile app. The data consisted of accelerometer and
gyroscope readings in the x, y, and z directions. This data was then processed to estimate the state of the bike, such
as normal ride, bumpy ride, or continuous braking. By applying ML techniques, the approximate times at which brakes
and bumps occurred were found. The results were used to give recommendations to the rider.
At Tiger Analytics:
Sales Analytics:
Inspecting opportunity loss reasons using NLP, finding the most frequent loss reasons, and analysing them on the basis
of quantity and revenue. Weighting loss reasons. Analysing the association between product sales. Reporting
valuable insights gained through data analysis.
Completed training in the Pega Decision enablement process, which includes Next-Best-Action decision strategies,
real-time interactions, and simulation activities.
Exploratory Analysis:
This was an internal project aimed at reducing data exploration and cleaning time. Designed data exploration tools in
R that help to graphically inspect the features/attributes in a dataset. The tools summarize data at the feature level,
enabling study of statistical properties such as mean, mode, median, and most frequent observations. The package also
uses different techniques for outlier detection and missing value imputation.
Business Intelligence:
This project involved analysing sales and purchase data across different merchants and clients using Teradata SQL.
The main tasks in this project included categorising merchants on the basis of product sales, volume of sales,
geography of sales, and quantity of products sold; searching for major changes in sales patterns; inspecting
cross-border sales/purchases and product categories; understanding products and their associated buyers; and
inspecting buyer behavior across all segments.
Cost Optimisation:
The aim of this POC was to identify cost-saving opportunities in ordering parts/components through data analytics.
The dataset consisted of orders for the different manufacturing parts required for a networking product. These parts
were ordered from several manufacturers across different countries. The main tasks in this project involved building
a forecasting model to predict the future cost of parts based on historical data. It also involved correlation
analysis and cost saving through distributing orders among different manufacturers.
At CDAC, Pune:
Knowledge based elucidation of tertiary structure of protein on their function algorithm development and
web hosting of prediction server.
This project involved designing, developing, and testing data mining and machine learning algorithms for various
classification and regression datasets. The algorithms developed were Genetic Algorithm, Simulated Annealing, Ant
Colony Optimisation, and GSO, used for feature selection and dimensionality reduction of high-dimensional
datasets. The algorithmic development for this project was done using Java, Matlab, Octave, and R. The major
statistical and data mining tools used in this project were Weka, Random Forest, and SVM.
M.Tech project
EDUCATION:
Publications:
2. Aniket A. Gurav, Manisha J. Nene, "Optimal Path Identification using Ant Colony Optimisation in Wireless
Sensor Network", presented at the international conference WIMOA 2013 held in Delhi and published in the conference
proceedings.
3. Aniket A. Gurav, Manisha J. Nene, "Multiple Optimal Path Identification using Ant Colony Optimisation in
Wireless Sensor Network", published in the IJWMN Journal.
Awards/ Scholarship
Top 10% in Kaggle Africa Soil Property Prediction Challenge
DST (Department of Science and Technology) sponsored Fellowship, CDAC Pune, India. 2013-2014
DRDO (Defence Research & Development Organization, Ministry of Defence) sponsored Scholarship. 2011-2013
Skills: Python (Pandas, Scikit-learn), R, Core Java, C, C++, Teradata SQL
Statistical Environments: Octave/Matlab
Popular Libraries/Tools: Word2Vec, Stanford CoreNLP parser, ark-tweet-nlp, Weka, LibSVM, Random Forest,
Linear Regression, Logistic Regression, Machine Learning, Probabilistic Modelling
Academic Courses: Data Structures, Computer Organization, OOP, Advanced Database System, Core Java,
Mathematical Modeling & System Analysis, Discrete Event Simulations, Statistical Modeling, Linear Algebra,
Operational Research, HPC
Online Courses: Machine Learning, Probability and Statistics, Regression Model, Learning from Data, Data
Scientist Toolbox, Practical Machine Learning, Statistical Reasoning for Public Health 2: Regression Methods
Aniket Gurav