Summer Internship at VSC DigiTech - 2020

Baipureddy Neeraj
Aug 2, 2020


During my 9-week internship at VSC DigiTech, I worked on two real-world problems in the medical field. One concerns heart disease, the leading cause of death worldwide, and the other concerns the novel coronavirus.

Heart disease can often be prevented by taking proper care of one's health, and it can be treated effectively with medication and, in some cases, surgery. Predicting a patient's risk of heart disease early can reduce the cost of treatment and many related expenses. It is difficult, however, for doctors to tell at an early stage whether a person is going to develop heart disease. Most of the time, doctors only find out once much of the damage has already been done.

Hospitals maintain a lot of information that can be used for prediction, yet the health care industry remains rich in information but poor in knowledge. There is a lot of potential to improve patient care. If we can use past data to develop a model that performs better than an average doctor, that would be a significant improvement in accuracy. And because diagnosis is a tedious, time-consuming task, a model that does it much faster than a human doctor would free doctors to focus on higher-level intellectual tasks.

To address this problem, VSC DigiTech planned to develop a model that predicts a patient's risk of heart disease at an early stage. My teammate and I built a machine learning model from three classification algorithms trained on existing data, and it made accurate predictions when tested on different samples.

The coronavirus has affected every country in the world, and there is a need for rapid testing and fast evaluation of results. Right now, hospitals detect COVID-19 with blood tests, but these are costly and slow: a single blood test takes around 5 hours. Detecting COVID-19 from a chest X-ray takes less time, and the X-ray image also shows the extent of the spread. However, there is a huge population of infected people and very few radiologists. Many chest X-ray images of COVID-19 patients have been made available online for research purposes, and many deep learning methods can be applied to medical images. To address this problem, we developed a model using a CNN (convolutional neural network) that classifies chest X-ray images as COVID-19 or normal, and we were able to achieve a good level of accuracy.

About the company

VSC DigiTech (Vishwam Software Consultancy and Digital Technologies Ltd) is a startup based in Hyderabad. It is an international provider of solutions in SAP, digital transformation, and product development, covering business transformation and innovation, business processes, business strategy, P&L, and competitiveness. VSC has more than 400 person-years of experience among its specialists, spread across 4 countries.

Heart disease prediction model

We used three models: KNN (k-nearest neighbors), random forest, and logistic regression. All three were trained on the same training data.

After training all three models, we combined them with ensemble learning, because the model with the highest accuracy can still fail on some cases, and in those cases the predictions of the other two models are taken into account. The ensemble technique used here is a voting classifier with hard voting. It takes the output of all three models and picks the final output by majority vote, that is, the mode of the three predictions.
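The hard-voting setup can be sketched with scikit-learn's `VotingClassifier`. This is only an illustration, not the code used in the project: the data is a synthetic stand-in generated with `make_classification`, and the feature count, hyperparameters, and feature scaling step are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the patient records (assumption: the real data is not public here).
X, y = make_classification(n_samples=300, n_features=13, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Three base models; KNN and logistic regression get scaled features (assumed choice).
ensemble = VotingClassifier(
    estimators=[
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ],
    voting="hard",  # final label = majority vote of the three predicted labels
)
ensemble.fit(X_train, y_train)

predictions = ensemble.predict(X_test)
accuracy = ensemble.score(X_test, y_test)

# Individual votes for the first test sample, to compare with the ensemble's answer.
individual = {
    name: int(model.predict(X_test[:1])[0])
    for name, model in ensemble.named_estimators_.items()
}
print(individual, "->", int(predictions[0]), f"(test accuracy: {accuracy:.2f})")
```

With three voters and two classes there can be no tie, so the ensemble's label is always the one that at least two of the models agree on.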

COVID-19 detection model

We built a CNN that classifies the images based on fine-grained visual features such as texture and sharpness.

We used 224 chest X-ray images (112 COVID-19, 112 normal) to train the model and 56 images (28 COVID-19, 28 normal) to test it.
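These counts correspond to an 80/20 split of 280 images with both classes kept balanced. The article does not say how the split was made; one way to get exactly these numbers, assuming a pool of 140 images per class, is a stratified split with scikit-learn. The image IDs and label names below are placeholders.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Placeholder labels: 140 COVID-19 and 140 normal images (assumed pool size).
labels = ["covid"] * 140 + ["normal"] * 140
image_ids = list(range(len(labels)))

# stratify=labels keeps the 50/50 class balance in both parts of the split.
train_ids, test_ids, train_labels, test_labels = train_test_split(
    image_ids, labels, test_size=0.2, stratify=labels, random_state=42
)

train_counts = Counter(train_labels)
test_counts = Counter(test_labels)
print(train_counts, test_counts)
```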

How the model works

First, the chest X-ray image is pre-processed. Pixel values in the image matrix range from 0 to 255, so we normalize them by dividing each value by 255, and then we resize the image to (224, 224). The image is then passed to the model for prediction.
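The pre-processing steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the input is a single-channel 8-bit image, the resize uses plain nearest-neighbour sampling (a real pipeline would more likely use an image library's resize with interpolation), and the function name `preprocess_xray` and the batch layout are hypothetical.

```python
import numpy as np


def preprocess_xray(image, size=(224, 224)):
    """Scale 8-bit pixel values to [0, 1], then resize to `size` by nearest neighbour."""
    scaled = image.astype(np.float32) / 255.0  # 0..255 -> 0.0..1.0
    height, width = scaled.shape[:2]
    # For each output row/column, pick the source row/column it maps to.
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    return scaled[rows][:, cols]


# Random pixels standing in for a real chest X-ray (assumption: grayscale, 512x400).
rng = np.random.default_rng(0)
fake_xray = rng.integers(0, 256, size=(512, 400), dtype=np.uint8)

model_input = preprocess_xray(fake_xray)
# Many CNN frameworks expect a batch axis and a channel axis (assumed layout).
batch = model_input[np.newaxis, ..., np.newaxis]
print(model_input.shape, batch.shape)
```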

Experience and learning outcomes

This was my first experience working with a company. I learned how corporate companies work, how they plan projects, and how to present my work to others in the company. It has been a great experience: I improved my communication skills and learned how to work in a team.

On the project side, I learned how machine learning is used in health care and why ensemble learning matters for accurate predictions. I learned how to work with images and how to make use of medical imaging. I also came to know which machine learning algorithms and deep learning techniques researchers most often use to solve problems in health care. I gained a lot of knowledge along the way, because I studied 6 research papers for these projects. The main thing I learned was how to create a working demo model to satisfy clients and get the actual data from them.

Acknowledgement

I would like to thank Mr. Venu Gopal, CEO of VSC DigiTech, for giving me the opportunity to serve as a machine learning intern at VSC DigiTech in these uncertain times. I would also like to thank Padma Neeraj Kumar for guiding me through these projects and giving me regular, valuable feedback on my work.
