Table-waiting dogs would be a real draw for me to visit a restaurant at least once, just to see it done. That's about all I've got for LLM AutoML too. I'm eagerly awaiting the OpenClaw Data Science headline that puts someone out of business. The best thing about human data scientists is that, while on average not very skilled or insightful, they can't work around the clock or while you sleep. Their stuff usually doesn't fuck up your business until you put the model in production, and that still happens so rarely as to be a nominal risk. For safety's sake, hire human data scientists but don't let them use OpenClaw. ;-)
Businesses will trust LLMs to perform low-value, low-impact tasks, but they will not trust LLMs to perform high-value, high-impact tasks.
Ergo, to the extent that data science is in the latter category, those jobs are safe.
As a DS whose first job was at SparkBeyond (2015-2019), which did do feature engineering, thankyouverymuch: neat summary!
I hadn't thought about the MLOps part; our issues were more about clients actually managing to access their data and targets.
It usually goes like this:
(1) Decide on model needs
(2) Hunt for data
(3) Wrangle data
(4) Build models
(5) Explain models
(6) Secure approval to implement models
(7) Figure out how to implement models
(8) Implement models
(9) Monitor models
(loop)
Steps 2-3 are more of a Möbius loop.
Thank you for a fantastic article!
I’ve been working on AutoML tools since 2016. I’m the author of mljar-supervised, an open-source AutoML tool with automated documentation and support for ML fairness [1].
Current AutoML solutions focus mainly on algorithm selection and hyperparameter tuning. They are still not very strong in feature construction.
In mljar-supervised, we introduced a feature called Golden Features, which searches for combinations of existing features using simple mathematical operations to improve predictive power.
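The idea of searching pairwise combinations can be sketched roughly as follows. This is only an illustration, not mljar-supervised's actual implementation: the function name `golden_feature_candidates`, the four operations, and scoring each candidate by absolute Pearson correlation with the target are all assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Candidate operations (a hypothetical choice for this sketch).
OPS = {
    "sum": lambda a, b: a + b,
    "diff": lambda a, b: a - b,
    "prod": lambda a, b: a * b,
    "ratio": lambda a, b: a / b if b != 0 else 0.0,
}


def golden_feature_candidates(columns, target, top_k=3):
    """Score every pairwise combination and return the top_k by |correlation|.

    columns: dict mapping column name -> list of floats
    target:  list of floats, same length as each column
    """
    scored = []
    # Pairs are taken in sorted name order, so "ratio" is only tried one way.
    for left, right in combinations(sorted(columns), 2):
        for op_name, op in OPS.items():
            values = [op(a, b) for a, b in zip(columns[left], columns[right])]
            score = abs(pearson(values, target))
            scored.append((score, f"{left}_{op_name}_{right}"))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Toy data (made up): the target is exactly price * area.
columns = {
    "price": [10.0, 20.0, 30.0, 40.0, 50.0],
    "area": [2.0, 5.0, 3.0, 10.0, 4.0],
}
target = [p * a for p, a in zip(columns["price"], columns["area"])]
best = golden_feature_candidates(columns, target, top_k=1)
print(best)
```

On this toy data the product of the two columns should come out on top, since the target was built from it. A real search would score candidates on held-out data rather than by raw correlation, to avoid picking up features that only fit the training set.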
From my recent experiments with LLMs, I see that they are surprisingly good at feature engineering. They can generate new features that are often very useful.
Recently, I built a tool called AutoLab Experiments (available in MLJAR Studio), which uses AI to optimize ML pipelines. It is conceptually similar to Andrej Karpathy’s AutoResearch. I also wrote a short comparison here [2].
Overall, I believe that combining these two approaches, classic AutoML and LLM-generated code, can lead to very powerful systems.
[1] https://github.com/mljar/mljar-supervised
[2] https://mljar.com/blog/autoresearch-karpathy-autonomous-ai-research/
Yeah, that's not feature engineering, sorry.
(That's feature selection, plus or minus interaction terms.)
Check PerpetualBooster also.
Autogluon for the win.