At least as a wannabe. In my last post I lamented not blogging and not being a wannabe data scientist. But that time is over!

In previous posts, I’ve talked about participating in machine learning competitions, mostly on Kaggle. I recently joined three competitions and I’m really excited about them. Here they are, in no order of importance:

  • Google Cloud & NCAA® ML Competition 2019-Men’s - Of course I’m doing this one. I entered this last year and did okay but I really had no clue what I was doing. This year I’m hoping to apply some of the concepts I’ve learned since then and move up in the world of machine learning. (Also it would be nice to win my office bracket pool!)
  • Microsoft Malware Prediction - This one sounds interesting. Predict the probability that a Windows machine will get infected by some sort of malware based on different properties of that machine. I’m not sure how this one’s going to go but it has a real-world feel to it. Companies (like Microsoft) have projects and teams dedicated to this type of research.
  • First TextWorld Problems: A Reinforcement and Language Learning Challenge - Design and train an AI agent to play and win simple text-based games.

That last one intrigues me the most. First, it involves reinforcement learning, a branch of machine learning I know almost nothing about. I mean, I get the main concept (get an AI agent to do a specific task by using a reward system), but I have yet to try to develop anything myself. The other thing about this that excites me is that it involves games from my childhood. Very specifically, I remember a game called Zork. I can recall firing up my Macintosh computer and playing this game for hours. I can play it again, since small companies and vintage game enthusiasts have made it available to play online. Another way to play it is by using TextWorld. TextWorld is open-source software from Microsoft written in Python. It is available on GitHub here. It provides a framework for building and playing text-based games, making it well suited to training an AI agent to navigate through a text game. I may be new to this, but I sure am excited to learn and see how it all works!
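To make the reward-system idea a little more concrete, here is a minimal sketch of an agent learning a tiny text game. It does not use TextWorld at all: the rooms, the two commands, and the function names `step`, `train`, and `play` are all made up for illustration. It uses tabular Q-learning, which is just one simple reinforcement learning technique and almost certainly not enough for the actual competition.

```python
import random

# Hypothetical toy game: four rooms in a row, treasure in the last one.
ROOMS = ["cellar", "kitchen", "hallway", "treasure room"]
# "go west" is listed first so an untrained agent defaults to the wrong move.
COMMANDS = ["go west", "go east"]


def step(room, command):
    """Apply a text command; return (next_room, reward, done)."""
    move = 1 if command == "go east" else -1
    next_room = min(max(room + move, 0), len(ROOMS) - 1)
    done = next_room == len(ROOMS) - 1
    return next_room, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning: learn a value for every (room, command) pair."""
    rng = random.Random(seed)
    q = {(r, c): 0.0 for r in range(len(ROOMS)) for c in COMMANDS}
    for _episode in range(episodes):
        room = 0
        for _turn in range(50):  # cap the length of each episode
            if rng.random() < epsilon:
                command = rng.choice(COMMANDS)  # explore
            else:
                command = max(COMMANDS, key=lambda c: q[(room, c)])  # exploit
            next_room, reward, done = step(room, command)
            best_next = 0.0 if done else max(q[(next_room, c)] for c in COMMANDS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(room, command)] += alpha * (
                reward + gamma * best_next - q[(room, command)]
            )
            room = next_room
            if done:
                break
    return q


def play(q, max_steps=10):
    """Follow the learned greedy policy and return the commands issued."""
    room, transcript = 0, []
    for _turn in range(max_steps):
        command = max(COMMANDS, key=lambda c: q[(room, c)])
        transcript.append(command)
        room, _reward, done = step(room, command)
        if done:
            break
    return transcript


if __name__ == "__main__":
    print(play(train()))
```

The only feedback the agent ever gets is a reward of 1 for reaching the treasure room. Everything else, including the fact that "go east" is the useful command, has to be worked out from trial and error. A real text game has a far larger set of possible commands and states, so a lookup table like `q` would not scale.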

In terms of deadlines, I’d better get started. There is still plenty of time on the March Madness and TextWorld competitions, but the Malware Prediction competition closes in 18 days. That’s probably enough time to take a couple of passes through the data, make my best guess at what the relevant features are, and run them through a model. It won’t be pretty, but I’ll be able to say that I did it and I may learn something. More likely, though, I will instead spend all my time on March Madness. At this point I’m the most comfortable with it and I’m getting some momentum. So I’ll do that and miss out on the malware prediction altogether.

No. I’ll at least make an attempt on predicting malware. I promise.

So you see, I am back. My job as a data science wannabe is secure. You were never in any doubt, were you?

Don’t mind me, I’m just rambling.