Since the beginning of last month, I have been producing video content about LangChain. Video makes a complex topic like ML easier to engage with. Also, because I consider the AiEdge Newsletter an educational platform for Machine Learning, I think learning works better when each week's content builds on the previous week's, which gives the learning experience more continuity.
I have been enjoying creating those videos, and I would like to make more of them! But before doing so, I wanted to get your input on the next subject I should focus on:
The subjects in this poll are the ones I feel comfortable educating people about, but please don’t hesitate to suggest others in the comments. Here is my reasoning about those subjects:
Introduction to Data Science - Machine Learning: the videos would be spread across three subjects: data wrangling with Pandas, statistical data analysis, and Machine Learning. I have taught this course at the university, and it was an excellent way to get students ready to become Data Scientists.
Introduction to Machine Learning System Design: to me, that is the most essential skill if you want to become a senior engineer. The ability to design ML solutions end-to-end while communicating with different teams and people with different skill sets is fundamental to building successful products.
Introduction to Recommender Systems: Recommender systems are everywhere and are the ML applications generating the most revenue across industries. I want to provide material for people to understand how the latest rec systems are designed.
How to train and fine-tune LLMs: this type of learning material can be hard to find these days, yet there is a gold rush to hire people with these skills. I want to focus on how to train and fine-tune these models in practice.
Let me know your thoughts on these subjects. Thank you for subscribing!
Second.
Being new to AI/ML, the first thing I have noticed is that there is little focus on working with data, but when I talk to seasoned data scientists, it's where they spend the bulk of their time.
All the training I have seen and worked through has used pre-formatted datasets, so of course, when I started my first real-world project and had to work with raw data, I struggled to get it formatted correctly and optimally.
Most software engineers I show these datasets to are completely unfamiliar with their structure and how to use them.