Q&A: Given the barrage of information around us, how do you handle information overload?

A member of the Awesome Artificial Intelligence & Deep Learning Facebook group posted this question, and Arthur Chan replied. I am sharing the Q&A here so that we can refer to it later.

Question: [summarized] How do you keep up with the barrage of information in Machine Learning and Deep Learning?

Answer: [Arthur Chan] I would go with one basic tutorial first, then, depending on my needs, keep absorbing new material. E.g. I first started from Ng’s class; since I needed to learn more about DL on both CV and NLP, I listened to Karpathy’s and Socher’s. Sooner or later you will feel those classes are not as in-depth; that’s when I took Hinton’s class. For interest, I audited Silver’s class as well.
But narrow it down, one thing at a time. Choose quality material first rather than following sensational, hyped news. As you learn more, your judgement on a topic will improve. Then you can start to come up with a list of resources you want to go through.

For a detailed discussion and answers from other members refer to the original post

I highly recommend you join this Facebook group

How I achieved classification accuracy of 78.78% on PIMA Indian Diabetes Dataset

I picked up my first Machine Learning dataset from this list and, after spending a few days on exploratory analysis and massaging the data, arrived at an accuracy of 78.78%

The code for this can be downloaded from GitHub or you can run it directly on Kaggle

Here’s how I did it

After carefully observing the data, I categorized the Insulin and Diabetes Pedigree Function features. I then did a train/test split and standardized the features using StandardScaler() from sklearn
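The preprocessing steps described above could look roughly like the sketch below. It is not the code from the GitHub repository: the data is synthetic, the bin edges used to categorize the two features are hypothetical (the post does not say which cut points were used), and the 80/20 split ratio is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the dataset: 768 rows, 8 features, binary label.
n_rows = 768
other_features = rng.normal(size=(n_rows, 6))
insulin = rng.uniform(0, 600, size=n_rows)       # stand-in for Insulin
pedigree = rng.uniform(0.08, 2.4, size=n_rows)   # stand-in for Diabetes Pedigree Function
y = rng.integers(0, 2, size=n_rows)

# Categorize the two continuous features into ordinal bins (0, 1, 2).
# These cut points are hypothetical, chosen only for illustration.
insulin_cat = np.digitize(insulin, bins=[100, 200])
pedigree_cat = np.digitize(pedigree, bins=[0.5, 1.0])

features = np.column_stack([other_features, insulin_cat, pedigree_cat])

# Split first, then fit the scaler on the training part only, so that
# no statistics from the test set leak into the preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    features, y, test_size=0.2, random_state=0, stratify=y
)
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
```

Fitting the scaler on the training split and only applying it to the test split keeps the evaluation honest, which is one reason to split before standardizing.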

After trying various algorithms (Logistic Regression, Random Forest and XGBoost), I tried a Support Vector Machine with a linear kernel and got an accuracy of 78.78% on this dataset. This is by far the highest consistent accuracy that I got.
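As a minimal sketch of the linear-kernel SVM step, assuming sklearn's SVC is the classifier in question: the data below is synthetic (generated with make_classification), so the accuracy it prints says nothing about the 78.78% reported here, and the split ratio and random seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data with the same shape as the
# diabetes dataset (768 rows, 8 features); not the real data.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Standardize using statistics from the training split only.
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

# Support Vector Machine with a linear kernel.
model = SVC(kernel="linear")
model.fit(X_train_std, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test_std))
print(f"Test accuracy: {accuracy:.4f}")
```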

I also noticed that the regularization parameter “C” didn’t have any impact on the final accuracy of the SVM
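One way to check an observation like this is to refit the model for several values of C and compare the test accuracy. The sketch below does that on synthetic data; the list of C values is an arbitrary choice for illustration, and the results on the real dataset may differ.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data standing in for the diabetes dataset.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid of regularization strengths to compare.
c_values = [0.01, 0.1, 1.0, 10.0]
accuracy_by_c = {}
for c in c_values:
    # The pipeline refits the scaler on the training data for each model.
    model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=c))
    model.fit(X_train, y_train)
    accuracy_by_c[c] = model.score(X_test, y_test)

for c, acc in accuracy_by_c.items():
    print(f"C={c:<5} test accuracy={acc:.4f}")
```

If the accuracies are nearly identical across the grid, that would be consistent with C having little effect for this model and split.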

Happy “Machine” Learning