So, as an example, suppose you have words in an NLP, or natural language processing, system. To make those words numeric, you would typically run something like word2vec, or word-to-vector.
Statistics, on the other hand, is about keeping the data that you have and getting the best results out of it. That difference in philosophy affects how you treat outliers. In ML, you go out and find enough outliers that they become something you can actually train with. Remember that five-sample rule that we had? With statistics, you say, "I've got all the data I'll ever be able to collect." So, you throw out the outliers.
Statistics is often used in a limited-data regime, whereas ML operates with lots of data.
So having an extra column to flag whether or not you're missing data is what you would normally do in ML.
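To make the flag-column idea concrete, here is a minimal sketch in plain Python. The toy rows, the `age` column, the `add_missing_flag` helper, and the placeholder value of 0 are all assumptions made up for illustration; they are not from the course material.

```python
# Hypothetical toy data: None marks a value that was never collected.
rows = [
    {"age": 34},
    {"age": None},
    {"age": 51},
]


def add_missing_flag(rows, column, placeholder=0):
    """Return copies of the rows with an extra '<column>_missing' flag column.

    The placeholder (an assumed choice) keeps the original column numeric;
    the flag lets a model learn that the placeholder is not a real reading.
    """
    flagged = []
    for row in rows:
        is_missing = row[column] is None
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row[column + "_missing"] = 1 if is_missing else 0
        new_row[column] = placeholder if is_missing else row[column]
        flagged.append(new_row)
    return flagged


flagged_rows = add_missing_flag(rows, "age")
for row in flagged_rows:
    print(row)
```

The point of the design is that no row gets dropped and no value gets invented without being marked: the model sees both the number and the fact that the number was missing.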
When you don't have enough data, you impute: you replace the missing value with an average.
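Replacing a missing value with an average might look something like the sketch below. The `impute_with_mean` helper and the sample `ages` list are hypothetical, and the sketch assumes the average is taken over only the values that were actually observed.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace each None with the mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values to average")
    average = mean(observed)
    return [average if v is None else v for v in values]


# Hypothetical sample: one of the four ages was never collected.
ages = [34, None, 51, 41]
imputed = impute_with_mean(ages)
print(imputed)
```

Note that after imputation nothing in the data records that the value was filled in, which is the trade-off you accept when you cannot go out and collect more data.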